opendp.trans module

opendp.trans.make_bounded_mean(lower, upper, n, T=None)

Make a Transformation that computes the mean of bounded data. Use make_clamp to bound data.

Parameters
  • lower – Lower bound of input data.

  • upper – Upper bound of input data.

  • n (int) – Number of records in input data.

  • T (RuntimeTypeDescriptor) – atomic data type

Returns

A bounded_mean step.

Return type

Transformation

Raises
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library

opendp.trans.make_bounded_sum(lower, upper, T=None)

Make a Transformation that computes the sum of bounded data. Use make_clamp to bound data.

Parameters
  • lower – Lower bound of input data.

  • upper – Upper bound of input data.

  • T (RuntimeTypeDescriptor) – atomic type of data

Returns

A bounded_sum step.

Return type

Transformation

Raises
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library

opendp.trans.make_bounded_sum_n(lower, upper, n, T=None)

Make a Transformation that computes the sum of bounded data with known length. This uses a restricted-sensitivity proof that takes advantage of known N for better utility. Use make_clamp to bound data.

Parameters
  • lower – Lower bound of input data.

  • upper – Upper bound of input data.

  • n (int) – Number of records in input data.

  • T (RuntimeTypeDescriptor) – atomic type of data

Returns

A bounded_sum_n step.

Return type

Transformation

Raises
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library
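The utility gain from a known n can be illustrated with the standard sensitivity argument for sums. The sketch below is an assumption about the general reasoning, not a transcription of the library's proof. The helper names are hypothetical, and d_in is taken to be a symmetric distance (the number of records added or removed).

```python
def sum_sensitivity_unknown_n(lower, upper, d_in=1):
    # Adding or removing one record shifts the sum by at most max(|lower|, |upper|).
    return d_in * max(abs(lower), abs(upper))


def sum_sensitivity_known_n(lower, upper, d_in=2):
    # With n fixed, neighboring datasets differ by substitutions. One substitution
    # is one removal plus one addition (symmetric distance 2) and shifts the sum
    # by at most upper - lower.
    return (d_in // 2) * (upper - lower)


# Data clamped to [90, 100]; compare the effect of changing one record (d_in = 2).
unknown_n = sum_sensitivity_unknown_n(90, 100, d_in=2)
known_n = sum_sensitivity_known_n(90, 100, d_in=2)
print(unknown_n, known_n)  # 200 10
```

When the bounds sit far from zero, the known-n bound is much tighter, so less noise is needed downstream.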

opendp.trans.make_bounded_variance(lower, upper, n, ddof=1, T=None)

Make a Transformation that computes the variance of bounded data. Use make_clamp to bound data.

Parameters
  • lower – Lower bound of input data.

  • upper – Upper bound of input data.

  • n (int) – Number of records in input data.

  • ddof (int) – Delta degrees of freedom. Set to 0 for the population variance, 1 for a sample estimate.

  • T (RuntimeTypeDescriptor) – atomic data type

Returns

A bounded_variance step.

Return type

Transformation

Raises
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library
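The effect of ddof can be shown with a plain-Python sketch. This is not the library's implementation: the function below is hypothetical and assumes the data has already been clamped and contains exactly n records.

```python
import statistics


def bounded_variance(data, n, ddof=1):
    # Assumes data is already clamped and holds exactly n records.
    assert len(data) == n
    mean = sum(data) / n
    # Divide the sum of squared deviations by n - ddof.
    return sum((x - mean) ** 2 for x in data) / (n - ddof)


data = [1.0, 2.0, 3.0, 4.0]
population = bounded_variance(data, n=4, ddof=0)  # 1.25
sample = bounded_variance(data, n=4, ddof=1)      # 5/3
print(population, sample)
```

With ddof=0 the result matches statistics.pvariance, and with ddof=1 it matches statistics.variance.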

opendp.trans.make_cast(TI, TO)

Make a Transformation that casts a vector of data from type TI to type TO. An element that fails to cast becomes None; a successful cast becomes Some<TO>.

Parameters
  • TI (RuntimeTypeDescriptor) – input data type to cast from

  • TO (RuntimeTypeDescriptor) – data type to cast into

Returns

A cast step.

Return type

Transformation

Raises
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library

opendp.trans.make_cast_default(TI, TO)

Make a Transformation that casts a vector of data from type TI to type TO. If a cast fails, fill with TO’s default value.

Parameters
  • TI (RuntimeTypeDescriptor) – input data type to cast from

  • TO (RuntimeTypeDescriptor) – data type to cast into

Returns

A cast_default step.

Return type

Transformation

Raises
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library

opendp.trans.make_cast_inherent(TI, TO)

Make a Transformation that casts a vector of data from type TI to a type TO that can represent nullity. If a cast fails, fill with TO’s null value.

Parameters
  • TI (RuntimeTypeDescriptor) – input data type to cast from

  • TO (RuntimeTypeDescriptor) – data type to cast into

Returns

A cast_inherent step.

Return type

Transformation

Raises
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library
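The three cast constructors differ only in what a failed cast produces. The following plain-Python sketch illustrates that difference with hypothetical helper functions. It assumes that "default" means the zero-like value of the target type and that NaN is the null value for floats. Python's own conversion rules stand in for the library's.

```python
import math


def cast(values, to_type):
    # make_cast-style: a failed cast yields None.
    out = []
    for v in values:
        try:
            out.append(to_type(v))
        except (TypeError, ValueError):
            out.append(None)
    return out


def cast_default(values, to_type):
    # make_cast_default-style: a failed cast yields the type's default value.
    return [to_type() if v is None else v for v in cast(values, to_type)]


def cast_inherent(values):
    # make_cast_inherent-style, for floats: a failed cast yields NaN.
    return [math.nan if v is None else v for v in cast(values, float)]


raw = ["1", "x", "3"]
print(cast(raw, int))          # [1, None, 3]
print(cast_default(raw, int))  # [1, 0, 3]
print(cast_inherent(raw))      # [1.0, nan, 3.0]
```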

opendp.trans.make_cast_metric(MI, MO, T)

Make a Transformation that converts the dataset metric from type MI to type MO.

Parameters
  • MI (DatasetMetric) – input dataset metric

  • MO (DatasetMetric) – output dataset metric

  • T (RuntimeTypeDescriptor) – atomic type of data

Returns

A cast_metric step.

Return type

Transformation

Raises
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library

opendp.trans.make_clamp(lower, upper, T=None)

Make a Transformation that clamps numeric data in Vec<T> between lower and upper.

Parameters
  • lower – If datum is less than lower, let datum be lower.

  • upper – If datum is greater than upper, let datum be upper.

  • T (RuntimeTypeDescriptor) – atomic data type

Returns

A clamp step.

Return type

Transformation

Raises
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library
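The clamping rule described by the lower and upper parameters can be sketched in plain Python. The function below is a hypothetical stand-in that shows the semantics only. It does not call the library.

```python
def clamp(data, lower, upper):
    if lower > upper:
        raise ValueError("lower must not exceed upper")
    # A datum below lower becomes lower; a datum above upper becomes upper.
    return [min(max(x, lower), upper) for x in data]


clamped = clamp([-5, 3, 12], lower=0, upper=10)
print(clamped)  # [0, 3, 10]
```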

opendp.trans.make_count(TIA, TO='i32')

Make a Transformation that computes a count of the number of records in data.

Parameters
  • TIA (RuntimeTypeDescriptor) – Atomic Input Type. Input data is expected to be of the form Vec<TIA>.

  • TO (RuntimeTypeDescriptor) – type of output integer

Returns

A count step.

Return type

Transformation

Raises
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library

opendp.trans.make_count_by(n, MO, TI, TO='i32')

Make a Transformation that computes the count of each unique value in data. This assumes that the category set is unknown. Use make_base_stability to release this query.

Parameters
  • n (int) – Number of records in input data.

  • MO (SensitivityMetric) – Output Metric.

  • TI (RuntimeTypeDescriptor) – Input Type. Categorical/hashable input data type. Input data must be Vec<TI>.

  • TO (RuntimeTypeDescriptor) – Output Type. Express counts in terms of this integral type.

Returns

A count_by step whose carrier type is HashMap<TI, TO>: the count of each unique value in the input data.

Return type

Transformation

Raises
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library

opendp.trans.make_count_by_categories(categories, MO, TI=None, TO='i32')

Make a Transformation that computes the number of times each category appears in the data. This assumes that the category set is known.

Parameters
  • categories (Any) – The set of categories to compute counts for.

  • MO (SensitivityMetric) – output sensitivity metric

  • TI (RuntimeTypeDescriptor) – categorical/hashable input data type. Input data must be Vec<TI>.

  • TO (RuntimeTypeDescriptor) – express counts in terms of this integral type

Returns

A count_by_categories step.

Return type

Transformation

Raises
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library
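The difference between counting with an unknown category set (make_count_by) and a known one (make_count_by_categories) can be sketched in plain Python. The helper names are hypothetical. This page does not say how the library treats values outside the category set, so the sketch simply ignores them.

```python
from collections import Counter


def count_by(data):
    # Unknown category set: one entry per value that actually occurs.
    return dict(Counter(data))


def count_by_categories(data, categories):
    counts = Counter(data)
    # Known category set: one count per category, in the order given.
    # A category that never occurs gets 0.
    return [counts[c] for c in categories]


data = ["a", "b", "a", "z"]
print(count_by(data))                              # {'a': 2, 'b': 1, 'z': 1}
print(count_by_categories(data, ["a", "b", "c"]))  # [2, 1, 0]
```

With a known category set, the shape of the output does not depend on the data, which is why the keys themselves do not need protecting.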

opendp.trans.make_count_distinct(TIA, TO='i32')

Make a Transformation that computes a count of the number of distinct records in data.

Parameters
  • TIA (RuntimeTypeDescriptor) – Atomic Input Type. Input data is expected to be of the form Vec<TIA>.

  • TO (RuntimeTypeDescriptor) – Output Type. An integer type.

Returns

A count_distinct step.

Return type

Transformation

Raises
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library

opendp.trans.make_create_dataframe(col_names, K=None)

Make a Transformation that constructs a dataframe from a Vec<Vec<String>>.

Parameters
  • col_names (Any) – Column names for each record entry.

  • K (RuntimeTypeDescriptor) – categorical/hashable data type of column names

Returns

A create_dataframe step.

Return type

Transformation

Raises
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library

opendp.trans.make_identity(M, T)

Make a Transformation that simply passes the data through.

Parameters
  • M (DatasetMetric) – dataset metric

  • T (RuntimeTypeDescriptor) – Type of data passed to the identity function.

Returns

An identity step.

Return type

Transformation

Raises
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library

opendp.trans.make_impute_constant(constant, DA='OptionNullDomain<AllDomain<T>>')

Make a Transformation that replaces null/None data with constant. By default, the input type is Vec<Option<T>>, as emitted by make_cast. Set DA to InherentNullDomain<AllDomain<T>> for imputing on types that have an inherent representation of nullity, like floats.

Parameters
  • constant (Any) – Value to replace nulls with.

  • DA (RuntimeTypeDescriptor) – domain of data being imputed. This is OptionNullDomain<AllDomain<T>> or InherentNullDomain<AllDomain<T>>

Returns

An impute_constant step.

Return type

Transformation

Raises
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library
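The two domains accepted by DA correspond to two ways of representing a missing value. The plain-Python sketch below uses hypothetical helpers, with None standing in for an Option-style null and NaN standing in for a float's inherent null.

```python
import math


def impute_constant_option(data, constant):
    # OptionNullDomain-style input: nulls are represented as None.
    return [constant if x is None else x for x in data]


def impute_constant_inherent(data, constant):
    # InherentNullDomain-style input: nulls are represented as NaN.
    return [constant if math.isnan(x) else x for x in data]


print(impute_constant_option([1, None, 3], 0))              # [1, 0, 3]
print(impute_constant_inherent([1.0, math.nan, 3.0], 0.0))  # [1.0, 0.0, 3.0]
```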

opendp.trans.make_impute_uniform_float(lower, upper, T=None)

Make a Transformation that replaces null data in Vec<T> with values sampled from a uniform distribution between lower and upper.

Parameters
  • lower – Lower bound of uniform distribution to sample from.

  • upper – Upper bound of uniform distribution to sample from.

  • T (RuntimeTypeDescriptor) – type of data being imputed

Returns

An impute_uniform_float step.

Return type

Transformation

Raises
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library
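As the parameter descriptions indicate, imputed values are drawn from a uniform distribution on [lower, upper]. The sketch below is a hypothetical plain-Python illustration that treats NaN as the null value. It uses Python's random module for brevity, which is an assumption of the sketch and says nothing about how the library generates randomness.

```python
import math
import random


def impute_uniform_float(data, lower, upper, rng=random):
    # Replace each NaN with an independent draw from Uniform[lower, upper].
    return [rng.uniform(lower, upper) if math.isnan(x) else x for x in data]


filled = impute_uniform_float([0.5, math.nan, 0.9], 0.0, 1.0, rng=random.Random(0))
print(filled)  # non-null entries are unchanged; the NaN is replaced by a draw in [0, 1]
```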

opendp.trans.make_is_equal(value, TI=None)

Make a Transformation that checks if each element is equal to value.

Parameters
  • value (Any) – value to check against

  • TI (RuntimeTypeDescriptor) – input data type

Returns

An is_equal step.

Return type

Transformation

Raises
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library

opendp.trans.make_is_null(DIA)

Make a Transformation that checks if each element in a vector is null.

Parameters

DIA (RuntimeTypeDescriptor) – atomic input domain

Returns

An is_null step.

Return type

Transformation

Raises
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library

opendp.trans.make_parse_column(key, T, impute=True, K=None)

Make a Transformation that parses the key column of a dataframe as T.

Parameters
  • key (Any) – name of column to select from dataframe and parse

  • impute (bool) – Enable to impute values that fail to parse. If false, raise an error if parsing fails.

  • K (RuntimeTypeDescriptor) – categorical/hashable data type of the key/column name

  • T (RuntimeTypeDescriptor) – data type to parse into

Returns

A parse_column step.

Return type

Transformation

Raises
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library
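The role of the impute flag can be sketched in plain Python, with a dict of string columns standing in for the dataframe. The function is hypothetical, and the imputed value (the target type's zero-like default) is an assumption, since this page does not state which value is used.

```python
def parse_column(frame, key, parse, impute=True):
    parsed = []
    for raw in frame[key]:
        try:
            parsed.append(parse(raw))
        except ValueError:
            if not impute:
                # impute=False: a value that fails to parse is an error.
                raise
            # impute=True: substitute a default (assumed here: parse(), e.g. 0 for int).
            parsed.append(parse())
    return parsed


frame = {"age": ["34", "n/a"]}
print(parse_column(frame, "age", int))  # [34, 0]
```

Calling parse_column(frame, "age", int, impute=False) on the same data raises ValueError instead.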

opendp.trans.make_resize(constant, length, TA=None)

Make a Transformation that either truncates or imputes records with constant in a Vec<TA> to match a provided length. WARNING: This function is temporary. It will be replaced by a more general make_resize that accepts domains.

Parameters
  • constant (Any) – Value to impute with.

  • length (int) – Number of records in output data.

  • TA (RuntimeTypeDescriptor) – Atomic type.

Returns

A vector of the same type TA, but with the provided length.

Return type

Transformation

Raises
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library
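The truncate-or-impute behavior can be sketched in plain Python. The function is hypothetical. In particular, keeping the leading records when truncating is an assumption of the sketch, since this page does not say which records are dropped.

```python
def resize(data, constant, length):
    if len(data) >= length:
        # Too long (or exact): keep the first `length` records.
        return data[:length]
    # Too short: pad with the constant up to `length` records.
    return data + [constant] * (length - len(data))


print(resize([1, 2, 3, 4], constant=0, length=2))  # [1, 2]
print(resize([1, 2], constant=0, length=4))        # [1, 2, 0, 0]
```

Fixing the length this way is what makes the known-n constructors, such as make_bounded_sum_n, applicable afterwards.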

opendp.trans.make_resize_bounded(constant, length, lower, upper, TA=None)

Make a Transformation that either truncates or imputes records with constant in a Vec<TA> to match a provided length. WARNING: This function is temporary. It will be replaced by a more general make_resize_constant that accepts domains.

Parameters
  • constant – Value to impute with.

  • length (int) – Number of records in output data.

  • lower – Lower bound of data in input domain.

  • upper – Upper bound of data in input domain.

  • TA (RuntimeTypeDescriptor) – Atomic type.

Returns

A vector of the same type TA, but with the provided length.

Return type

Transformation

Raises
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library

opendp.trans.make_select_column(key, T, K=None)

Make a Transformation that retrieves the column key from a dataframe as Vec<T>.

Parameters
  • key (Any) – name of the column to select from the dataframe

  • K (RuntimeTypeDescriptor) – categorical/hashable data type of the key/column name

  • T (RuntimeTypeDescriptor) – data type to downcast to

Returns

A select_column step.

Return type

Transformation

Raises
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library

opendp.trans.make_split_dataframe(separator, col_names, K=None)

Make a Transformation that splits each record in a Vec<String> into a Vec<Vec<String>>, and loads the resulting table into a dataframe keyed by col_names.

Parameters
  • separator (str) – The token(s) that separate entries in each record.

  • col_names (Any) – Column names for each record entry.

  • K (RuntimeTypeDescriptor) – categorical/hashable data type of column names

Returns

A split_dataframe step.

Return type

Transformation

Raises
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library
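The data flow through make_split_lines, make_split_records and make_split_dataframe can be sketched in plain Python. The helpers below are hypothetical stand-ins. A dict of column lists plays the role of the dataframe, and every record is assumed to have one entry per column name.

```python
def split_lines(text):
    # One string in, a list of its lines out.
    return text.splitlines()


def split_records(lines, separator):
    # Each line becomes a list of string entries.
    return [line.split(separator) for line in lines]


def create_dataframe(records, col_names):
    # Column-oriented table keyed by column name.
    return {name: [rec[i] for rec in records] for i, name in enumerate(col_names)}


def split_dataframe(lines, separator, col_names):
    # Composition of the two steps above.
    return create_dataframe(split_records(lines, separator), col_names)


text = "alice,34\nbob,27"
frame = split_dataframe(split_lines(text), ",", ["name", "age"])
print(frame)  # {'name': ['alice', 'bob'], 'age': ['34', '27']}
```

All entries are still strings at this point. A column must be selected and parsed before numeric transformations apply.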

opendp.trans.make_split_lines()

Make a Transformation that takes a string and splits it into a Vec<String> of its lines.

Returns

A split_lines step.

Return type

Transformation

Raises
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library

opendp.trans.make_split_records(separator)

Make a Transformation that splits each record in a Vec<String> into a Vec<Vec<String>>.

Parameters

separator (str) – The token(s) that separate entries in each record.

Returns

A split_records step.

Return type

Transformation

Raises
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library

opendp.trans.make_unclamp(lower, upper, T=None)

Make a Transformation that unclamps a VectorDomain<IntervalDomain<T>> to a VectorDomain<AllDomain<T>>.

Parameters
  • lower – Lower bound of the input data.

  • upper – Upper bound of the input data.

  • T (RuntimeTypeDescriptor) – atomic data type

Returns

An unclamp step.

Return type

Transformation

Raises
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library