opendp.trans module

opendp.trans.make_bounded_resize(size, bounds, constant, TA=None)

Make a Transformation that either truncates or imputes records with constant in a Vec<TA> to match a provided size. WARNING: This function is temporary. It will be replaced by a more general make_resize that accepts domains.

Parameters
  • size (int) – Number of records in output data.

  • bounds (Tuple[Any, Any]) – Tuple of lower and upper bounds for data in the input domain

  • constant – Value to impute with.

  • TA (RuntimeTypeDescriptor) – Atomic type. If not passed, TA is inferred from the lower bound.

Returns

A vector of the same type TA, but with the provided size.

Return type

Transformation

Raises
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library
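
The truncate-or-impute behavior described above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: resize_sketch is a hypothetical helper, and keeping the first size records when truncating is an assumption.

```python
from typing import List, TypeVar

T = TypeVar("T")


def resize_sketch(data: List[T], size: int, constant: T) -> List[T]:
    """Return a list with exactly `size` records (hypothetical helper)."""
    if len(data) >= size:
        # Too many records: truncate. Keeping the first `size` is an assumption.
        return data[:size]
    # Too few records: impute by padding with `constant`.
    return data + [constant] * (size - len(data))


print(resize_sketch([1, 2, 3, 4], 3, 0))  # [1, 2, 3]
print(resize_sketch([1, 2], 4, 0))        # [1, 2, 0, 0]
```

Either way the output length is fixed, which is what lets later steps that require a known size be applied.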

opendp.trans.make_bounded_sum(bounds, T=None)

Make a Transformation that computes the sum of bounded data. Use make_clamp to bound data.

Parameters
  • bounds (Tuple[Any, Any]) – Tuple of lower and upper bounds for data in the input domain

  • T (RuntimeTypeDescriptor) – atomic type of data

Returns

A bounded_sum step.

Return type

Transformation

Raises
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library

opendp.trans.make_cast(TIA, TOA)

Make a Transformation that casts a vector of data from type TIA to type TOA. Failure to parse results in None, else Some<TOA>.

Parameters
  • TIA (RuntimeTypeDescriptor) – atomic input data type to cast from

  • TOA (RuntimeTypeDescriptor) – atomic data type to cast into

Returns

A cast step.

Return type

Transformation

Raises
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library
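
The "None on failure, else Some<TOA>" behavior can be modeled in plain Python with None standing in for the empty option. This is only a sketch: cast_sketch is a hypothetical helper, and Python's conversion rules are assumed here in place of the library's own parsing.

```python
from typing import Any, Callable, List, Optional


def cast_sketch(data: List[Any], to_type: Callable[[Any], Any]) -> List[Optional[Any]]:
    """Cast each element with `to_type`; failed casts become None (hypothetical helper)."""
    out = []
    for value in data:
        try:
            out.append(to_type(value))
        except (TypeError, ValueError):
            # Failure to parse is recorded as a null, not raised.
            out.append(None)
    return out


print(cast_sketch(["1", "x", "3"], int))  # [1, None, 3]
```

The nulls produced this way are what an imputation step is then expected to fill.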

opendp.trans.make_cast_default(TIA, TOA)

Make a Transformation that casts a vector of data from type TIA to type TOA. If cast fails, fill with default.

Parameters
  • TIA (RuntimeTypeDescriptor) – atomic input data type to cast from

  • TOA (RuntimeTypeDescriptor) – atomic data type to cast into

Returns

A cast_default step.

Return type

Transformation

Raises
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library

opendp.trans.make_cast_inherent(TIA, TOA)

Make a Transformation that casts a vector of data from type TIA to a type TOA that can represent nullity. If cast fails, fill with TOA’s null value.

Parameters
  • TIA (RuntimeTypeDescriptor) – input data type to cast from

  • TOA (RuntimeTypeDescriptor) – data type to cast into

Returns

A cast_inherent step.

Return type

Transformation

Raises
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library
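
For a float output type, the inherent null value is NaN. The sketch below models that case in plain Python; cast_inherent_sketch is a hypothetical helper and Python's float() parsing is an assumption standing in for the library's.

```python
import math
from typing import Any, List


def cast_inherent_sketch(data: List[Any]) -> List[float]:
    """Cast each element to float; failed casts become NaN (hypothetical helper)."""
    out = []
    for value in data:
        try:
            out.append(float(value))
        except (TypeError, ValueError):
            # The output type itself carries the null: NaN for floats.
            out.append(math.nan)
    return out


result = cast_inherent_sketch(["1.5", "abc", "2"])
print(result)  # [1.5, nan, 2.0]
```

Unlike an option-typed result, the output stays a plain vector of floats, so nullity has to be detected with a NaN check.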

opendp.trans.make_cast_metric(MI, MO, TA)

Make a Transformation that converts the dataset metric from type MI to type MO.

Parameters
  • MI (DatasetMetric) – input dataset metric

  • MO (DatasetMetric) – output dataset metric

  • TA (RuntimeTypeDescriptor) – atomic type of data

Returns

A cast_metric step.

Return type

Transformation

Raises
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library

opendp.trans.make_clamp(bounds, TA=None)

Make a Transformation that clamps numeric data in Vec<TA> to bounds. If datum is less than lower, let datum be lower. If datum is greater than upper, let datum be upper.

Parameters
  • bounds (Tuple[Any, Any]) – Tuple of inclusive lower and upper bounds.

  • TA (RuntimeTypeDescriptor) – atomic data type

Returns

A clamp step.

Return type

Transformation

Raises
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library
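
The two clamping rules above amount to the following plain-Python sketch. clamp_sketch is a hypothetical helper for illustration, and rejecting inverted bounds is an assumption about sensible behavior, not something stated here.

```python
from typing import List, Tuple


def clamp_sketch(data: List[float], bounds: Tuple[float, float]) -> List[float]:
    """Clamp every datum into the inclusive range [lower, upper] (hypothetical helper)."""
    lower, upper = bounds
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    # Below lower -> lower; above upper -> upper; otherwise unchanged.
    return [min(max(x, lower), upper) for x in data]


print(clamp_sketch([-5, 3, 12], (0, 10)))  # [0, 3, 10]
```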

opendp.trans.make_count(TIA, TO='i32')

Make a Transformation that computes a count of the number of records in data.

Parameters
  • TIA (RuntimeTypeDescriptor) – Atomic Input Type. Input data is expected to be of the form Vec<TIA>.

  • TO (RuntimeTypeDescriptor) – Output Type. Must be an integer.

Returns

A count step.

Return type

Transformation

Raises
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library

opendp.trans.make_count_by(size, MO, TIA, TOA='i32')

Make a Transformation that computes the count of each unique value in data. This assumes that the category set is unknown. Use make_base_stability to release this query.

Parameters
  • size (int) – Number of records in input data.

  • MO (SensitivityMetric) – Output Metric.

  • TIA (RuntimeTypeDescriptor) – Atomic Input Type. Categorical/hashable input data type. Input data must be Vec<TIA>.

  • TOA (RuntimeTypeDescriptor) – Atomic Output Type. Express counts in terms of this integral type.

Returns

The carrier type is HashMap<TIA, TOA>: the counts for each unique data input.

Return type

Transformation

Raises
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library
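
When the category set is unknown, the output keys are whatever values happen to occur in the data. A plain-Python sketch of that mapping, with count_by_sketch as a hypothetical helper and a dict standing in for the HashMap:

```python
from collections import Counter
from typing import Dict, Hashable, List


def count_by_sketch(data: List[Hashable]) -> Dict[Hashable, int]:
    """Count occurrences of each distinct value seen in the data (hypothetical helper)."""
    return dict(Counter(data))


print(count_by_sketch(["a", "b", "a"]))  # {'a': 2, 'b': 1}
```

Because the set of keys depends on the data, the keys themselves can reveal information, which is why the release is expected to go through a stability-based mechanism.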

opendp.trans.make_count_by_categories(categories, MO, TIA=None, TOA='i32')

Make a Transformation that computes the number of times each category appears in the data. This assumes that the category set is known.

Parameters
  • categories (Any) – The set of categories to compute counts for.

  • MO (SensitivityMetric) – output sensitivity metric

  • TIA (RuntimeTypeDescriptor) – categorical/hashable input type. Input data must be Vec<TIA>.

  • TOA (RuntimeTypeDescriptor) – express counts in terms of this integral type

Returns

A count_by_categories step.

Return type

Transformation

Raises
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library
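
With a known category set, the output has one count per listed category regardless of the data. The sketch below illustrates this in plain Python. count_by_categories_sketch is a hypothetical helper, and ignoring values outside the category set is an assumption made for simplicity; the library may treat them differently.

```python
from typing import Hashable, List


def count_by_categories_sketch(data: List[Hashable], categories: List[Hashable]) -> List[int]:
    """Return one count per category, in the order given (hypothetical helper)."""
    counts = {category: 0 for category in categories}
    for value in data:
        if value in counts:
            counts[value] += 1
        # Assumption: values outside `categories` are ignored in this sketch.
    return [counts[category] for category in categories]


print(count_by_categories_sketch(["a", "c", "a", "z"], ["a", "b", "c"]))  # [2, 0, 1]
```

Categories that never occur still appear with a count of zero, so the shape of the output does not depend on the data.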

opendp.trans.make_count_distinct(TIA, TO='i32')

Make a Transformation that computes a count of the number of distinct records in data.

Parameters
  • TIA (RuntimeTypeDescriptor) – Atomic Input Type. Input data is expected to be of the form Vec<TIA>.

  • TO (RuntimeTypeDescriptor) – Output Type. Must be an integer.

Returns

A count_distinct step.

Return type

Transformation

Raises
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library

opendp.trans.make_create_dataframe(col_names, K=None)

Make a Transformation that constructs a dataframe from a Vec<Vec<String>>.

Parameters
  • col_names (Any) – Column names for each record entry.

  • K (RuntimeTypeDescriptor) – categorical/hashable data type of column names

Returns

A create_dataframe step.

Return type

Transformation

Raises
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library

opendp.trans.make_identity(M, TA)

Make a Transformation that simply passes the data through.

Parameters
  • M (DatasetMetric) – dataset metric

  • TA (RuntimeTypeDescriptor) – Type of data passed to the identity function.

Returns

An identity step.

Return type

Transformation

Raises
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library

opendp.trans.make_impute_constant(constant, DA='OptionNullDomain<AllDomain<TA>>')

Make a Transformation that replaces null/None data with constant. By default, the input type is Vec<Option<TA>>, as emitted by make_cast. Set DA to InherentNullDomain<AllDomain<TA>> for imputing on types that have an inherent representation of nullity, like floats.

Parameters
  • constant (Any) – Value to replace nulls with.

  • DA (RuntimeTypeDescriptor) – domain of data being imputed. This is OptionNullDomain<AllDomain<TA>> or InherentNullDomain<AllDomain<TA>>

Returns

An impute_constant step.

Return type

Transformation

Raises
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library
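
The two domains differ only in how a null is represented. The plain-Python sketch below models both, using None for option-style nulls and NaN for inherent float nulls; both helper names are hypothetical.

```python
import math
from typing import List, Optional


def impute_constant_option(data: List[Optional[float]], constant: float) -> List[float]:
    """Option-style nullity: None marks a missing value (hypothetical helper)."""
    return [constant if x is None else x for x in data]


def impute_constant_inherent(data: List[float], constant: float) -> List[float]:
    """Inherent nullity: NaN marks a missing float (hypothetical helper)."""
    return [constant if math.isnan(x) else x for x in data]


print(impute_constant_option([1.0, None, 3.0], 0.0))   # [1.0, 0.0, 3.0]
print(impute_constant_inherent([1.0, math.nan], 0.0))  # [1.0, 0.0]
```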

opendp.trans.make_impute_uniform_float(bounds, TA=None)

Make a Transformation that replaces null/None data in Vec<TA> with uniformly distributed floats within bounds.

Parameters
  • bounds (Tuple[Any, Any]) – Tuple of inclusive lower and upper bounds.

  • TA (RuntimeTypeDescriptor) – type of data being imputed

Returns

An impute_uniform_float step.

Return type

Transformation

Raises
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library
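
A plain-Python sketch of uniform imputation follows. impute_uniform_float_sketch is a hypothetical helper, and Python's random module is an assumption standing in for whatever sampler the library uses; it is not suitable for privacy-critical use.

```python
import math
import random
from typing import List, Optional, Tuple


def impute_uniform_float_sketch(
    data: List[Optional[float]],
    bounds: Tuple[float, float],
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Replace None/NaN entries with a uniform draw from [lower, upper] (hypothetical helper)."""
    lower, upper = bounds
    rng = rng or random.Random()
    return [
        rng.uniform(lower, upper) if (x is None or math.isnan(x)) else x
        for x in data
    ]


filled = impute_uniform_float_sketch([1.0, math.nan, 3.0], (0.0, 10.0), random.Random(0))
print(filled)  # non-null values unchanged; the NaN is replaced by a value in [0, 10]
```

Each null receives its own independent draw, so imputed values differ from record to record.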

opendp.trans.make_is_equal(value, TIA=None)

Make a Transformation that checks if each element is equal to value.

Parameters
  • value (Any) – value to check against

  • TIA (RuntimeTypeDescriptor) – atomic input data type

Returns

An is_equal step.

Return type

Transformation

Raises
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library

opendp.trans.make_is_null(DIA)

Make a Transformation that checks if each element in a vector is null.

Parameters

DIA (RuntimeTypeDescriptor) – atomic input domain

Returns

An is_null step.

Return type

Transformation

Raises
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library

opendp.trans.make_resize(size, constant, TA=None)

Make a Transformation that either truncates or imputes records with constant in a Vec<TA> to match a provided size. WARNING: This function is pending change. It will accept an additional domain argument.

Parameters
  • size (int) – Number of records in output data.

  • constant (Any) – Value to impute with.

  • TA (RuntimeTypeDescriptor) – Atomic type.

Returns

A vector of the same type TA, but with the provided size.

Return type

Transformation

Raises
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library

opendp.trans.make_select_column(key, TOA, K=None)

Make a Transformation that retrieves the column key from a dataframe as Vec<TOA>.

Parameters
  • key (Any) – key/column name to select; must be of a categorical/hashable data type

  • K (RuntimeTypeDescriptor) – data type of the key

  • TOA (RuntimeTypeDescriptor) – atomic data type to downcast to

Returns

A select_column step.

Return type

Transformation

Raises
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library

opendp.trans.make_sized_bounded_mean(size, bounds, T=None)

Make a Transformation that computes the mean of bounded data. Use make_clamp to bound data.

Parameters
  • size (int) – Number of records in input data.

  • bounds (Tuple[Any, Any]) – Tuple of inclusive lower and upper bounds of the input data.

  • T (RuntimeTypeDescriptor) – atomic data type

Returns

A sized_bounded_mean step.

Return type

Transformation

Raises
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library

opendp.trans.make_sized_bounded_sum(size, bounds, T=None)

Make a Transformation that computes the sum of bounded data with known length. This uses a restricted-sensitivity proof that takes advantage of the known dataset size (size) for better utility. Use make_clamp to bound data.

Parameters
  • size (int) – Number of records in input data.

  • bounds (Tuple[Any, Any]) – Tuple of lower and upper bounds for input data

  • T (RuntimeTypeDescriptor) – atomic type of data

Returns

A sized_bounded_sum step.

Return type

Transformation

Raises
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library
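
One way to see why a known size can help is to compare how much a sum of data in [lower, upper] can change in two situations. With the size fixed, neighboring datasets differ by replacing a record, which moves the sum by at most upper - lower. With the size free, a record can be added or removed, which moves the sum by up to max(|lower|, |upper|). The sketch below only illustrates this arithmetic under those assumptions; it is not the proof the library uses, and both helper names are hypothetical.

```python
from typing import Tuple


def max_change_substitute(bounds: Tuple[float, float]) -> float:
    """Largest change in the sum when one record is replaced (size fixed)."""
    lower, upper = bounds
    return upper - lower


def max_change_add_remove(bounds: Tuple[float, float]) -> float:
    """Largest change in the sum when one record is added or removed."""
    lower, upper = bounds
    return max(abs(lower), abs(upper))


bounds = (10.0, 12.0)
print(max_change_substitute(bounds))  # 2.0
print(max_change_add_remove(bounds))  # 12.0
```

For bounds far from zero, such as (10, 12), the fixed-size figure is much smaller, so less noise would be needed for the same privacy guarantee.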

opendp.trans.make_sized_bounded_variance(size, bounds, ddof=1, T=None)

Make a Transformation that computes the variance of bounded data. Use make_clamp to bound data.

Parameters
  • size (int) – Number of records in input data.

  • bounds (Tuple[Any, Any]) – Tuple of lower and upper bounds for input data

  • ddof (int) – Delta degrees of freedom. Set to 0 if not a sample, 1 for sample estimate.

  • T (RuntimeTypeDescriptor) – atomic data type

Returns

A sized_bounded_variance step.

Return type

Transformation

Raises
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library
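
The effect of ddof can be seen in a plain-Python variance, where the divisor is the number of records minus ddof. variance_sketch is a hypothetical helper for illustration and performs no bounds checking.

```python
from typing import List


def variance_sketch(data: List[float], ddof: int = 1) -> float:
    """Variance with divisor len(data) - ddof (hypothetical helper)."""
    n = len(data)
    if n - ddof <= 0:
        raise ValueError("need more records than ddof")
    mean = sum(data) / n
    squared_deviations = sum((x - mean) ** 2 for x in data)
    # ddof=0 divides by n (population); ddof=1 divides by n - 1 (sample estimate).
    return squared_deviations / (n - ddof)


data = [1.0, 2.0, 3.0, 4.0]
print(variance_sketch(data, ddof=0))  # 1.25
print(variance_sketch(data, ddof=1))  # slightly larger: 5.0 / 3
```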

opendp.trans.make_split_dataframe(separator, col_names, K=None)

Make a Transformation that splits each record in a Vec<String> into a Vec<Vec<String>>, and loads the resulting table into a dataframe keyed by col_names.

Parameters
  • separator (str) – The token(s) that separate entries in each record.

  • col_names (Any) – Column names for each record entry.

  • K (RuntimeTypeDescriptor) – categorical/hashable data type of column names

Returns

A split_dataframe step.

Return type

Transformation

Raises
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library
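
The split-then-key behavior can be sketched in plain Python, with a dict of column lists standing in for the dataframe. split_dataframe_sketch is a hypothetical helper, and padding short records with empty strings while dropping extra fields is an assumption made so that every column has the same length.

```python
from typing import Dict, Hashable, List


def split_dataframe_sketch(
    records: List[str], separator: str, col_names: List[Hashable]
) -> Dict[Hashable, List[str]]:
    """Split each record on `separator` and key the columns by `col_names` (hypothetical helper)."""
    rows = [record.split(separator) for record in records]
    return {
        name: [row[i] if i < len(row) else "" for row in rows]  # assumption: pad with ""
        for i, name in enumerate(col_names)
    }


records = "1,a\n2,b".splitlines()  # one string per line
print(split_dataframe_sketch(records, ",", ["id", "label"]))
# {'id': ['1', '2'], 'label': ['a', 'b']}
```

Every value is still a string at this point; a column would need to be selected and cast before numeric steps apply.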

opendp.trans.make_split_lines()

Make a Transformation that takes a string and splits it into a Vec<String> of its lines.

Returns

A split_lines step.

Return type

Transformation

Raises
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library

opendp.trans.make_split_records(separator)

Make a Transformation that splits each record in a Vec<String> into a Vec<Vec<String>>.

Parameters

separator (str) – The token(s) that separate entries in each record.

Returns

A split_records step.

Return type

Transformation

Raises
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library

opendp.trans.make_unclamp(bounds, TA=None)

Make a Transformation that unclamps a VectorDomain<BoundedDomain<TA>> to a VectorDomain<AllDomain<TA>>.

Parameters
  • bounds (Tuple[Any, Any]) – Tuple of inclusive lower and upper bounds.

  • TA (RuntimeTypeDescriptor) – atomic data type

Returns

An unclamp step.

Return type

Transformation

Raises
  • AssertionError – if an argument’s type differs from the expected type

  • UnknownTypeError – if a type-argument fails to parse

  • OpenDPException – packaged error from the core OpenDP library