opendp.trans module

opendp.trans.make_bounded_mean(lower, upper, n, T=None)

Make a Transformation that computes the mean of bounded data. Use make_clamp to bound data.

Parameters:
  • lower – Lower bound of input data.

  • upper – Upper bound of input data.

  • n (int) – Number of records in input data.

  • T (RuntimeTypeDescriptor) – atomic data type

Returns:

A bounded_mean step.

Return type:

Transformation

Raises:
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library
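
Example: a minimal pure-Python sketch of what a bounded mean computes, assuming the input has already been clamped and really contains n records. The helper name bounded_mean and its validation checks are illustrative only; this is not the library's implementation, and how the library reacts to a mismatched n is not stated here.

```python
def bounded_mean(data, lower, upper, n):
    """Mean of data that is already bounded and has exactly n records."""
    if len(data) != n:
        raise ValueError(f"expected {n} records, got {len(data)}")
    if any(x < lower or x > upper for x in data):
        raise ValueError("data must lie within [lower, upper]")
    return sum(data) / n


data = [1.0, 10.0, 0.0, 7.0]  # already clamped to [0, 10]
mean = bounded_mean(data, lower=0.0, upper=10.0, n=4)

# Because every record lies in [lower, upper], so does the mean.
assert 0.0 <= mean <= 10.0
```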

opendp.trans.make_bounded_sum(lower, upper, T=None)

Make a Transformation that computes the sum of bounded data. Use make_clamp to bound data.

Parameters:
  • lower – Lower bound of input data.

  • upper – Upper bound of input data.

  • T (RuntimeTypeDescriptor) – atomic type of data

Returns:

A bounded_sum step.

Return type:

Transformation

Raises:
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library

opendp.trans.make_bounded_sum_n(lower, upper, n, T=None)

Make a Transformation that computes the sum of bounded data with known length. This uses a restricted-sensitivity proof that takes advantage of the known dataset size n for better utility. Use make_clamp to bound data.

Parameters:
  • lower – Lower bound of input data.

  • upper – Upper bound of input data.

  • n (int) – Number of records in input data.

  • T (RuntimeTypeDescriptor) – atomic type of data

Returns:

A bounded_sum_n step.

Return type:

Transformation

Raises:
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library
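
Example: a sketch of why a known n can help, using the textbook sensitivity argument under add/remove-one-record neighbors. The constants below are an assumption for illustration; this section does not state the exact stability relation the library uses, and both function names are hypothetical.

```python
def sum_sensitivity_unknown_n(lower, upper):
    """Adding or removing one bounded record shifts the sum by at most this."""
    return max(abs(lower), abs(upper))


def sum_sensitivity_known_n(lower, upper):
    """With n fixed, neighboring datasets differ by swapping a record
    (one removal plus one addition), which shifts the sum by at most
    upper - lower. Each add/remove step therefore accounts for half of it."""
    return (upper - lower) / 2


lower, upper = 90.0, 100.0
loose = sum_sensitivity_unknown_n(lower, upper)  # scales with the magnitude
tight = sum_sensitivity_known_n(lower, upper)    # scales with the range
```

When the bounds sit far from zero, as here, the range is much smaller than the magnitude, so less noise is needed for the same privacy guarantee.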

opendp.trans.make_bounded_variance(lower, upper, n, ddof=1, T=None)

Make a Transformation that computes the variance of bounded data. Use make_clamp to bound data.

Parameters:
  • lower – Lower bound of input data.

  • upper – Upper bound of input data.

  • n (int) – Number of records in input data.

  • ddof (int) – Delta degrees of freedom. Use 0 when the data is the whole population, 1 for a sample estimate.

  • T (RuntimeTypeDescriptor) – atomic data type

Returns:

A bounded_variance step.

Return type:

Transformation

Raises:
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library
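
Example: a pure-Python sketch of how ddof changes the divisor of the variance. The function below is illustrative and is not the library's implementation; it assumes the usual convention of dividing the sum of squared deviations by n - ddof.

```python
def variance(data, n, ddof=1):
    """Variance of n records, dividing by n - ddof."""
    if len(data) != n:
        raise ValueError(f"expected {n} records, got {len(data)}")
    mean = sum(data) / n
    squared_deviations = sum((x - mean) ** 2 for x in data)
    return squared_deviations / (n - ddof)


data = [2.0, 4.0, 4.0, 6.0]
sample_var = variance(data, n=4, ddof=1)      # divides by n - 1
population_var = variance(data, n=4, ddof=0)  # divides by n
```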

opendp.trans.make_cast(TI, TO)

Make a Transformation that casts a vector of data from type TI to type TO. Each element that fails to parse becomes None; each successful cast becomes Some<TO>.

Parameters:
  • TI (RuntimeTypeDescriptor) – input data type to cast from

  • TO (RuntimeTypeDescriptor) – data type to cast into

Returns:

A cast step.

Return type:

Transformation

Raises:
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library

opendp.trans.make_cast_default(TI, TO)

Make a Transformation that casts a vector of data from type TI to type TO. If cast fails, fill with default.

Parameters:
  • TI (RuntimeTypeDescriptor) – input data type to cast from

  • TO (RuntimeTypeDescriptor) – data type to cast into

Returns:

A cast_default step.

Return type:

Transformation

Raises:
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library

opendp.trans.make_cast_inherent(TI, TO)

Make a Transformation that casts a vector of data from type TI to a type TO that can represent nullity. If cast fails, fill with TO’s null value.

Parameters:
  • TI (RuntimeTypeDescriptor) – input data type to cast from

  • TO (RuntimeTypeDescriptor) – data type to cast into

Returns:

A cast_inherent step.

Return type:

Transformation

Raises:
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library
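
Example: the three cast constructors above differ in how a failed cast is represented. The sketch below mimics that in plain Python; the function names are hypothetical stand-ins, not library calls. Two assumptions are made that this section does not confirm: the "default" is taken to be the target type's zero value (what calling the type with no arguments returns in Python), and NaN is used as the inherent null of a float.

```python
import math


def cast_option(data, to):
    """Failed casts become None, standing in for Rust's Option::None."""
    out = []
    for x in data:
        try:
            out.append(to(x))
        except ValueError:
            out.append(None)
    return out


def cast_default(data, to):
    """Failed casts become the target type's default (assumed: to())."""
    return [to() if v is None else v for v in cast_option(data, to)]


def cast_inherent(data):
    """Failed casts become NaN, the null value a float can hold itself."""
    return [math.nan if v is None else v for v in cast_option(data, float)]


raw = ["1", "x", "3"]
as_option = cast_option(raw, int)    # middle element is None
as_default = cast_default(raw, int)  # middle element is int() == 0
as_inherent = cast_inherent(raw)     # middle element is NaN
```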

opendp.trans.make_cast_metric(MI, MO, T)

Make a Transformation that converts the dataset metric from type MI to type MO.

Parameters:
  • MI (DatasetMetric) – input dataset metric

  • MO (DatasetMetric) – output dataset metric

  • T (RuntimeTypeDescriptor) – atomic type of data

Returns:

A cast_metric step.

Return type:

Transformation

Raises:
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library

opendp.trans.make_clamp(lower, upper, DI='VectorDomain<AllDomain<T>>', M='SymmetricDistance')

Make a Transformation that clamps numeric data in Vec<T> between lower and upper. Set DI to AllDomain<T> for clamping aggregated values.

Parameters:
  • lower – If datum is less than lower, let datum be lower.

  • upper – If datum is greater than upper, let datum be upper.

  • DI (RuntimeTypeDescriptor) – input domain. One of VectorDomain<AllDomain<_>> or AllDomain<_>.

  • M (RuntimeTypeDescriptor) – metric. Set to SymmetricDistance when clamping datasets, or AbsoluteDistance<_> when clamping aggregated scalars

Returns:

A clamp step.

Return type:

Transformation

Raises:
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library
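
Example: a plain-Python sketch of the clamping rule described by the lower and upper parameters, applied once to a vector of records and once to a single aggregated value. The helper names are illustrative and are not library functions.

```python
def clamp_scalar(x, lower, upper):
    """If x is below lower use lower; if above upper use upper."""
    return min(max(x, lower), upper)


def clamp_vector(data, lower, upper):
    """Apply the same rule to every record in a dataset."""
    return [clamp_scalar(x, lower, upper) for x in data]


clamped_records = clamp_vector([-5, 3, 42], lower=0, upper=10)
clamped_total = clamp_scalar(137, lower=0, upper=100)
```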

opendp.trans.make_count(TIA, TO='i32')

Make a Transformation that computes a count of the number of records in data.

Parameters:
  • TIA (RuntimeTypeDescriptor) – Atomic Input Type. Input data is expected to be of the form Vec<TIA>.

  • TO (RuntimeTypeDescriptor) – type of output integer

Returns:

A count step.

Return type:

Transformation

Raises:
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library

opendp.trans.make_count_by(n, MO, TI, TO='i32')

Make a Transformation that computes the count of each unique value in data. This assumes that the category set is unknown. Use make_base_stability to release this query.

Parameters:
  • n (int) – Number of records in input data.

  • MO (SensitivityMetric) – Output Metric.

  • TI (RuntimeTypeDescriptor) – Input Type. Categorical/hashable input data type. Input data must be Vec<TI>.

  • TO (RuntimeTypeDescriptor) – Output Type. Express counts in terms of this integral type.

Returns:

A count_by step whose carrier type is HashMap<TI, TO>, holding the count for each unique data input.

Return type:

Transformation

Raises:
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library

opendp.trans.make_count_by_categories(categories, MO, TI=None, TO='i32')

Make a Transformation that computes the number of times each category appears in the data. This assumes that the category set is known.

Parameters:
  • categories (Any) – The set of categories to compute counts for.

  • MO (SensitivityMetric) – output sensitivity metric

  • TI (RuntimeTypeDescriptor) – categorical/hashable input data type. Input data must be Vec<TI>.

  • TO (RuntimeTypeDescriptor) – express counts in terms of this integral type

Returns:

A count_by_categories step.

Return type:

Transformation

Raises:
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library
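
Example: a plain-Python sketch contrasting counting with an unknown category set (make_count_by) and counting against a known category set (make_count_by_categories). The helper names are illustrative. How the library treats values that fall outside the given categories is not stated in this section; the sketch simply leaves them uncounted.

```python
from collections import Counter


def count_by(data):
    """Unknown category set: one count per value actually observed."""
    return dict(Counter(data))


def count_by_categories(data, categories):
    """Known category set: one count per listed category, zeros included."""
    counts = Counter(data)
    return [counts[c] for c in categories]


data = ["a", "b", "a", "c", "a"]
observed = count_by(data)
fixed = count_by_categories(data, ["a", "b", "d"])
```

Note that the keys of the first result depend on the data itself, while the shape of the second is fixed in advance by the categories argument.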

opendp.trans.make_count_distinct(TIA, TO='i32')

Make a Transformation that computes a count of the number of distinct records in data.

Parameters:
  • TIA (RuntimeTypeDescriptor) – Atomic Input Type. Input data is expected to be of the form Vec<TIA>.

  • TO (RuntimeTypeDescriptor) – Output Type. Integer type of the resulting count.

Returns:

A count_distinct step.

Return type:

Transformation

Raises:
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library

opendp.trans.make_create_dataframe(col_names, K=None)

Make a Transformation that constructs a dataframe from a Vec<Vec<String>>.

Parameters:
  • col_names (Any) – Column names for each record entry.

  • K (RuntimeTypeDescriptor) – categorical/hashable data type of column names

Returns:

A create_dataframe step.

Return type:

Transformation

Raises:
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library

opendp.trans.make_identity(M, T)

Make a Transformation that simply passes the data through.

Parameters:
  • M (DatasetMetric) – dataset metric

  • T (RuntimeTypeDescriptor) – Type of data passed to the identity function.

Returns:

An identity step.

Return type:

Transformation

Raises:
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library

opendp.trans.make_impute_constant(constant, DA='OptionNullDomain<AllDomain<T>>')

Make a Transformation that replaces null/None data with constant. By default, the input type is Vec<Option<T>>, as emitted by make_cast. Set DA to InherentNullDomain<AllDomain<T>> for imputing on types that have an inherent representation of nullity, like floats.

Parameters:
  • constant – Value to replace nulls with.

  • DA (RuntimeTypeDescriptor) – domain of data being imputed. This is OptionNullDomain<AllDomain<T>> or InherentNullDomain<AllDomain<T>>

Returns:

An impute_constant step.

Return type:

Transformation

Raises:
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library
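
Example: a plain-Python sketch of the two null representations the DA parameter distinguishes. None stands in for an Option-style null, and NaN stands in for the inherent null of a float. The helper names are illustrative and are not library functions.

```python
import math


def impute_constant_option(data, constant):
    """Option-style nulls: replace None with constant."""
    return [constant if x is None else x for x in data]


def impute_constant_inherent(data, constant):
    """Inherent nulls: replace NaN with constant."""
    return [constant if math.isnan(x) else x for x in data]


filled_option = impute_constant_option([1, None, 3], constant=0)
filled_float = impute_constant_inherent([1.5, math.nan, 3.0], constant=0.0)
```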

opendp.trans.make_impute_uniform_float(lower, upper, T=None)

Make a Transformation that replaces null (NaN) data in Vec<T> with floats sampled uniformly between lower and upper.

Parameters:
  • lower – Lower bound of uniform distribution to sample from.

  • upper – Upper bound of uniform distribution to sample from.

  • T (RuntimeTypeDescriptor) – type of data being imputed

Returns:

An impute_uniform_float step.

Return type:

Transformation

Raises:
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library
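
Example: the lower and upper parameters describe a uniform distribution to sample from. The sketch below shows that idea in plain Python, assuming nulls are represented as NaN. The helper name is illustrative, and Python's random module is used only for demonstration; nothing here describes the random source the library uses.

```python
import math
import random


def impute_uniform(data, lower, upper):
    """Replace each NaN with an independent draw from [lower, upper]."""
    return [random.uniform(lower, upper) if math.isnan(x) else x for x in data]


imputed = impute_uniform([0.2, math.nan, 0.9, math.nan], lower=0.0, upper=1.0)

# Non-null values pass through unchanged; imputed values respect the bounds.
assert all(0.0 <= x <= 1.0 for x in imputed)
```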

opendp.trans.make_is_equal(value, TI=None)

Make a Transformation that checks if each element is equal to value.

Parameters:
  • value – value to check against

  • TI (RuntimeTypeDescriptor) – input data type

Returns:

An is_equal step.

Return type:

Transformation

Raises:
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library

opendp.trans.make_is_null(DIA)

Make a Transformation that checks if each element in a vector is null.

Parameters:
  • DIA (RuntimeTypeDescriptor) – atomic input domain

Returns:

An is_null step.

Return type:

Transformation

Raises:
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library

opendp.trans.make_parse_column(key, T, impute=True, K=None)

Make a Transformation that parses the key column of a dataframe as T.

Parameters:
  • key – name of column to select from dataframe and parse

  • impute (bool) – If true, impute values that fail to parse. If false, raise an error when parsing fails.

  • K (RuntimeTypeDescriptor) – categorical/hashable data type of the key/column name

  • T (RuntimeTypeDescriptor) – data type to parse into

Returns:

A parse_column step.

Return type:

Transformation

Raises:
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library

opendp.trans.make_select_column(key, T, K=None)

Make a Transformation that retrieves the column key from a dataframe as Vec<T>.

Parameters:
  • key – name of column to select from dataframe

  • K (RuntimeTypeDescriptor) – data type of the key

  • T (RuntimeTypeDescriptor) – data type to downcast to

Returns:

A select_column step.

Return type:

Transformation

Raises:
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library

opendp.trans.make_split_dataframe(separator, col_names, K=None)

Make a Transformation that splits each record in a Vec<String> into a Vec<Vec<String>>, and loads the resulting table into a dataframe keyed by col_names.

Parameters:
  • separator (str) – The token(s) that separate entries in each record.

  • col_names (Any) – Column names for each record entry.

  • K (RuntimeTypeDescriptor) – categorical/hashable data type of column names

Returns:

A split_dataframe step.

Return type:

Transformation

Raises:
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library

opendp.trans.make_split_lines()

Make a Transformation that takes a string and splits it into a Vec<String> of its lines.

Returns:

A split_lines step.

Return type:

Transformation

Raises:
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library

opendp.trans.make_split_records(separator)

Make a Transformation that splits each record in a Vec<String> into a Vec<Vec<String>>.

Parameters:
  • separator (str) – The token(s) that separate entries in each record.

Returns:

A split_records step.

Return type:

Transformation

Raises:
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library
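
Example: a plain-Python sketch of how the text-loading steps in this module fit together: split a string into lines, split each line into fields, key the fields by column name, then select one column. Every function below is an illustrative stand-in, not a library call. The dataframe is modeled as a dict of columns, which is an assumption, and each record is assumed to have exactly as many fields as col_names.

```python
def split_lines(text):
    """One string per line of the input."""
    return text.splitlines()


def split_records(lines, separator):
    """One list of string fields per line."""
    return [line.split(separator) for line in lines]


def create_dataframe(records, col_names):
    """Column-keyed table built from row-oriented records."""
    return {name: [rec[i] for rec in records] for i, name in enumerate(col_names)}


def split_dataframe(lines, separator, col_names):
    """split_records followed by create_dataframe."""
    return create_dataframe(split_records(lines, separator), col_names)


text = "alice,34\nbob,29"
frame = split_dataframe(split_lines(text), ",", ["name", "age"])
ages = [int(a) for a in frame["age"]]  # select the column, then parse it
```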

opendp.trans.make_unclamp(lower, upper, M, T='VectorDomain<IntervalDomain<T>>')

Make a Transformation that unclamps a VectorDomain<IntervalDomain<T>> to a VectorDomain<AllDomain<T>>. Set the domain argument (T in the signature above) to IntervalDomain<T> to work on scalars.

Parameters:
  • lower – Lower bound of the input data.

  • upper – Upper bound of the input data.

  • T (RuntimeTypeDescriptor) – domain of data being unclamped

  • M (RuntimeTypeDescriptor) – metric to use on the input and output spaces

Returns:

An unclamp step.

Return type:

Transformation

Raises:
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library