opendp.trans module
- opendp.trans.make_bounded_float_checked_sum(size_limit, bounds, S='Pairwise<T>')
Make a Transformation that computes the sum of bounded floats with unknown dataset size. This computes the sum on up to size_limit rows randomly selected from the input.
- Parameters:
size_limit (int) – Limit on number of records in input data.
bounds (Tuple[Any, Any]) – Tuple of lower and upper bounds for input data
S (Type Arguments) – summation algorithm to use on data type T. One of Sequential<T> or Pairwise<T>.
- Returns:
A bounded_float_checked_sum step.
- Return type:
- Raises:
AssertionError – if an argument’s type differs from the expected type
UnknownTypeError – if a type-argument fails to parse
OpenDPException – packaged error from the core OpenDP library
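The S argument chooses between two summation strategies. The following is a minimal pure-Python sketch of the difference, not the library's implementation; the function names sequential_sum and pairwise_sum are illustrative only.

```python
from typing import Sequence


def sequential_sum(values: Sequence[float]) -> float:
    """Add values left to right, carrying one running total."""
    total = 0.0
    for v in values:
        total += v
    return total


def pairwise_sum(values: Sequence[float]) -> float:
    """Recursively sum each half of the input, then add the two partial sums."""
    if len(values) == 0:
        return 0.0
    if len(values) == 1:
        return float(values[0])
    mid = len(values) // 2
    return pairwise_sum(values[:mid]) + pairwise_sum(values[mid:])


data = [0.1] * 1000
print(sequential_sum(data), pairwise_sum(data))
```

Both functions return the same value in exact arithmetic. With floats, the worst-case rounding error of the pairwise scheme grows more slowly with input length than that of the sequential scheme.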
- opendp.trans.make_bounded_float_ordered_sum(size_limit, bounds, S='Pairwise<T>')
Make a Transformation that computes the sum of bounded floats. You may need to use make_ordered_random to impose an ordering on the data.
- Parameters:
size_limit (int) – Upper bound on the number of records in input data. Used to bound sensitivity. Can be overestimated.
bounds (Tuple[Any, Any]) – Tuple of lower and upper bounds for data in the input domain
S (Type Arguments) – summation algorithm to use on data type T. One of Sequential<T> or Pairwise<T>.
- Returns:
A bounded_float_ordered_sum step.
- Return type:
- Raises:
AssertionError – if an argument’s type differs from the expected type
UnknownTypeError – if a type-argument fails to parse
OpenDPException – packaged error from the core OpenDP library
- opendp.trans.make_bounded_int_monotonic_sum(bounds, T=None)
Make a Transformation that computes the sum of bounded ints, where all values share the same sign.
- Parameters:
bounds (Tuple[Any, Any]) – Tuple of lower and upper bounds for input data
T (Type Arguments) – atomic type of data
- Returns:
A bounded_int_monotonic_sum step.
- Return type:
- Raises:
AssertionError – if an argument’s type differs from the expected type
UnknownTypeError – if a type-argument fails to parse
OpenDPException – packaged error from the core OpenDP library
- opendp.trans.make_bounded_int_ordered_sum(bounds, T=None)
Make a Transformation that computes the sum of bounded ints. You may need to use make_ordered_random to impose an ordering on the data.
- Parameters:
bounds (Tuple[Any, Any]) – Tuple of lower and upper bounds for input data
T (Type Arguments) – atomic type of data
- Returns:
A bounded_int_ordered_sum step.
- Return type:
- Raises:
AssertionError – if an argument’s type differs from the expected type
UnknownTypeError – if a type-argument fails to parse
OpenDPException – packaged error from the core OpenDP library
- opendp.trans.make_bounded_int_split_sum(bounds, T=None)
Make a Transformation that computes the sum of bounded ints. Adds the saturating sum of the positives to the saturating sum of the negatives.
- Parameters:
bounds (Tuple[Any, Any]) – Tuple of lower and upper bounds for input data
T (Type Arguments) – atomic type of data
- Returns:
A bounded_int_split_sum step.
- Return type:
- Raises:
AssertionError – if an argument’s type differs from the expected type
UnknownTypeError – if a type-argument fails to parse
OpenDPException – packaged error from the core OpenDP library
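The split strategy described for make_bounded_int_split_sum can be sketched in plain Python as follows. This is an illustration under the assumption of a 32-bit signed integer type; saturating_add and split_sum are hypothetical helper names, not library functions.

```python
I32_MIN, I32_MAX = -2**31, 2**31 - 1  # assumed atomic type: 32-bit signed int


def saturating_add(a: int, b: int, lo: int = I32_MIN, hi: int = I32_MAX) -> int:
    """Add two ints, pinning the result to [lo, hi] instead of overflowing."""
    return max(lo, min(hi, a + b))


def split_sum(values: list) -> int:
    """Saturating sum of the non-negative values plus saturating sum of the negatives."""
    positives = 0
    negatives = 0
    for v in values:
        if v >= 0:
            positives = saturating_add(positives, v)
        else:
            negatives = saturating_add(negatives, v)
    # positives is in [0, I32_MAX] and negatives is in [I32_MIN, 0],
    # so this final addition always stays within the 32-bit range.
    return positives + negatives


print(split_sum([I32_MAX, I32_MAX, -1]))
```

Because each partial sum only moves in one direction, the result does not depend on the order of the input records.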
- opendp.trans.make_bounded_resize(size, bounds, constant, MI='SymmetricDistance', MO='SymmetricDistance', TA=None)
Make a Transformation that either truncates or imputes records with constant in a Vec<TA> to match a provided size.
- Parameters:
size (int) – Number of records in output data.
bounds (Tuple[Any, Any]) – Tuple of lower and upper bounds for data in the input domain
constant – Value to impute with.
MI (Type Arguments) – Input Metric. One of InsertDeleteDistance or SymmetricDistance
MO (Type Arguments) – Output Metric. One of InsertDeleteDistance or SymmetricDistance
TA (Type Arguments) – Atomic type. If not passed, TA is inferred from the lower bound.
- Returns:
A vector of the same type TA, but with the provided size.
- Return type:
- Raises:
AssertionError – if an argument’s type differs from the expected type
UnknownTypeError – if a type-argument fails to parse
OpenDPException – packaged error from the core OpenDP library
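The truncate-or-impute behavior of a resize can be sketched in plain Python. This is a sketch only: the helper name resize is illustrative, and keeping the first size records when truncating is an assumption, since the text above does not say which records are dropped.

```python
def resize(data: list, size: int, constant) -> list:
    """Return a list of exactly `size` records."""
    if len(data) >= size:
        # Too many records: truncate (assumption: keep the leading records).
        return list(data[:size])
    # Too few records: pad with the imputation constant.
    return list(data) + [constant] * (size - len(data))


print(resize([1, 2, 3], 2, 0))
print(resize([1], 3, 0))
```

For the bounded variant, constant would also need to lie within bounds so that the output stays in the bounded domain.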
- opendp.trans.make_bounded_sum(bounds, MI='SymmetricDistance', T=None)
Make a Transformation that computes the sum of bounded data. Use make_clamp to bound data.
- Parameters:
bounds (Tuple[Any, Any]) – Tuple of lower and upper bounds for data in the input domain
MI (Type Arguments) – Input Metric. One of SymmetricDistance or InsertDeleteDistance.
T (Type Arguments) – atomic type of data
- Returns:
A bounded_sum step.
- Return type:
- Raises:
AssertionError – if an argument’s type differs from the expected type
UnknownTypeError – if a type-argument fails to parse
OpenDPException – packaged error from the core OpenDP library
- opendp.trans.make_cast(TIA, TOA)
Make a Transformation that casts a vector of data from type TIA to type TOA. Failure to parse results in None, else Some<TOA>.
- Parameters:
TIA (Type Arguments) – atomic input data type to cast from
TOA (Type Arguments) – atomic data type to cast into
- Returns:
A cast step.
- Return type:
- Raises:
AssertionError – if an argument’s type differs from the expected type
UnknownTypeError – if a type-argument fails to parse
OpenDPException – packaged error from the core OpenDP library
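The "None on failure" behavior of a cast can be illustrated in plain Python. This is a sketch of the semantics, with Python's None standing in for the null case; the cast helper and its parse argument are hypothetical, not part of the library.

```python
from typing import Callable, List, Optional, TypeVar

T = TypeVar("T")


def cast(data: List[str], parse: Callable[[str], T]) -> List[Optional[T]]:
    """Parse every element; elements that fail to parse become None."""
    out: List[Optional[T]] = []
    for item in data:
        try:
            out.append(parse(item))
        except ValueError:
            out.append(None)
    return out


print(cast(["1", "x", "3"], int))
```

The output keeps one entry per input record, so the dataset size is unchanged by the cast.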
- opendp.trans.make_cast_default(TIA, TOA)
Make a Transformation that casts a vector of data from type TIA to type TOA. If cast fails, fill with default.
- Parameters:
TIA (Type Arguments) – atomic input data type to cast from
TOA (Type Arguments) – atomic data type to cast into
- Returns:
A cast_default step.
- Return type:
- Raises:
AssertionError – if an argument’s type differs from the expected type
UnknownTypeError – if a type-argument fails to parse
OpenDPException – packaged error from the core OpenDP library
- opendp.trans.make_cast_inherent(TIA, TOA)
Make a Transformation that casts a vector of data from type TIA to a type TOA that can represent nullity. If a cast fails, fill with TOA’s null value.
- Parameters:
TIA (Type Arguments) – input data type to cast from
TOA (Type Arguments) – data type to cast into
- Returns:
A cast_inherent step.
- Return type:
- Raises:
AssertionError – if an argument’s type differs from the expected type
UnknownTypeError – if a type-argument fails to parse
OpenDPException – packaged error from the core OpenDP library
- opendp.trans.make_clamp(bounds, TA=None)
Make a Transformation that clamps numeric data in Vec<TA> to bounds. Any datum less than lower is set to lower, and any datum greater than upper is set to upper.
- Parameters:
bounds (Tuple[Any, Any]) – Tuple of inclusive lower and upper bounds.
TA (Type Arguments) – atomic data type
- Returns:
A clamp step.
- Return type:
- Raises:
AssertionError – if an argument’s type differs from the expected type
UnknownTypeError – if a type-argument fails to parse
OpenDPException – packaged error from the core OpenDP library
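The clamping rule can be written out in a few lines of plain Python. This is a sketch of the semantics only; the clamp helper is illustrative and is not the library call.

```python
def clamp(data: list, bounds: tuple) -> list:
    """Pin every value to the inclusive interval [lower, upper]."""
    lower, upper = bounds
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    return [min(max(x, lower), upper) for x in data]


print(clamp([-5, 3, 12], (0, 10)))
```

Values already inside the bounds pass through unchanged, and the number of records is preserved.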
- opendp.trans.make_count(TIA, TO='int')
Make a Transformation that computes a count of the number of records in data.
- Parameters:
TIA (Type Arguments) – Atomic Input Type. Input data is expected to be of the form Vec<TIA>.
TO (Type Arguments) – Output Type. Must be numeric.
- Returns:
A count step.
- Return type:
- Raises:
AssertionError – if an argument’s type differs from the expected type
UnknownTypeError – if a type-argument fails to parse
OpenDPException – packaged error from the core OpenDP library
- opendp.trans.make_count_by(MO, TK, TV='int')
Make a Transformation that computes the count of each unique value in data. This assumes that the category set is unknown.
- Parameters:
MO (SensitivityMetric) – Output Metric.
TK (Type Arguments) – Type of Key. Categorical/hashable input data type. Input data must be Vec<TK>.
TV (Type Arguments) – Type of Value. Express counts in terms of this integral type.
- Returns:
The carrier type is HashMap<TK, TV>, a hashmap of the count (TV) for each unique data input (TK).
- Return type:
- Raises:
AssertionError – if an argument’s type differs from the expected type
UnknownTypeError – if a type-argument fails to parse
OpenDPException – packaged error from the core OpenDP library
- opendp.trans.make_count_by_categories(categories, MO='L1Distance<int>', TIA=None, TOA='int')
Make a Transformation that computes the number of times each category appears in the data. This assumes that the category set is known.
- Parameters:
categories (Any) – The set of categories to compute counts for.
MO (SensitivityMetric) – output sensitivity metric
TIA (Type Arguments) – categorical/hashable input type. Input data must be Vec<TIA>.
TOA (Type Arguments) – express counts in terms of this numeric type
- Returns:
A count_by_categories step.
- Return type:
- Raises:
AssertionError – if an argument’s type differs from the expected type
UnknownTypeError – if a type-argument fails to parse
OpenDPException – packaged error from the core OpenDP library
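The difference between counting with an unknown category set (make_count_by) and a known one (make_count_by_categories) can be sketched in plain Python. The helper names below are illustrative. How the library treats values outside the known categories is not stated above, so this sketch simply ignores them, which is an assumption.

```python
from collections import Counter


def count_by(data: list) -> dict:
    """Unknown category set: the keys of the result are whatever appears in the data."""
    return dict(Counter(data))


def count_by_categories(data: list, categories: list) -> list:
    """Known category set: one count per category, in the order given."""
    tally = Counter(data)
    # Assumption: values not listed in `categories` are dropped from the output.
    return [tally[c] for c in categories]


records = ["a", "b", "a", "z"]
print(count_by(records))
print(count_by_categories(records, ["a", "b", "c"]))
```

With a known category set the output has a fixed length and includes zero counts, whereas the keys of the unknown-set result themselves depend on the data.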
- opendp.trans.make_count_distinct(TIA, TO='int')
Make a Transformation that computes a count of the number of unique, distinct records in data.
- Parameters:
TIA (Type Arguments) – Atomic Input Type. Input data is expected to be of the form Vec<TIA>.
TO (Type Arguments) – Output Type. Must be numeric.
- Returns:
A count_distinct step.
- Return type:
- Raises:
AssertionError – if an argument’s type differs from the expected type
UnknownTypeError – if a type-argument fails to parse
OpenDPException – packaged error from the core OpenDP library
- opendp.trans.make_create_dataframe(col_names, K=None)
Make a Transformation that constructs a dataframe from a Vec<Vec<String>>.
- Parameters:
col_names (Any) – Column names for each record entry.
K (Type Arguments) – categorical/hashable data type of column names
- Returns:
A create_dataframe step.
- Return type:
- Raises:
AssertionError – if an argument’s type differs from the expected type
UnknownTypeError – if a type-argument fails to parse
OpenDPException – packaged error from the core OpenDP library
- opendp.trans.make_drop_null(DA)
Make a Transformation that drops null values.
- Parameters:
DA (Type Arguments) – atomic domain of input data that contains nulls. This is OptionNullDomain<AllDomain<TA>> or InherentNullDomain<AllDomain<TA>>
- Returns:
A drop_null step.
- Return type:
- Raises:
AssertionError – if an argument’s type differs from the expected type
UnknownTypeError – if a type-argument fails to parse
OpenDPException – packaged error from the core OpenDP library
- opendp.trans.make_find(categories, TIA=None)
Find the index of a data value in a set of categories.
- Parameters:
categories (Any) – The set of categories to find indexes from.
TIA (Type Arguments) – categorical/hashable input type. Input data must be Vec<TIA>.
- Returns:
A find step.
- Return type:
- Raises:
AssertionError – if an argument’s type differs from the expected type
UnknownTypeError – if a type-argument fails to parse
OpenDPException – packaged error from the core OpenDP library
- opendp.trans.make_find_bin(edges, TIA=None)
Find the bin index from a monotonically increasing vector of edges.
- Parameters:
edges (Any) – The set of edges to split bins by.
TIA (Type Arguments) – numerical input type. Input data must be Vec<TIA>.
- Returns:
A find_bin step.
- Return type:
- Raises:
AssertionError – if an argument’s type differs from the expected type
UnknownTypeError – if a type-argument fails to parse
OpenDPException – packaged error from the core OpenDP library
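Bin lookup against sorted edges can be sketched with the standard-library bisect module. This is an illustration, not the library's implementation: the find_bin helper is hypothetical, and treating each edge as the inclusive left boundary of its bin is an assumption.

```python
from bisect import bisect_right


def find_bin(data: list, edges: list) -> list:
    """Map each value to the index of the bin it falls into.

    With k edges there are k + 1 bins: bin 0 lies below the first edge
    and bin k lies at or above the last edge.
    """
    if any(a >= b for a, b in zip(edges, edges[1:])):
        raise ValueError("edges must be strictly increasing")
    # Assumption: a value equal to an edge belongs to the bin on its right.
    return [bisect_right(edges, x) for x in data]


print(find_bin([-1, 0, 5, 10, 25], [0, 10, 20]))
```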
- opendp.trans.make_identity(D, M)
Make a Transformation that simply passes the data through.
- Parameters:
D (Type Arguments) – Domain of the identity function. Must be VectorDomain<AllDomain<_>> or AllDomain<_>
M (Type Arguments) – metric. Must be a dataset metric if D is a VectorDomain or a sensitivity metric if D is an AllDomain
- Returns:
A transformation where the input and output domain are D and the input and output metric are M
- Return type:
- Raises:
AssertionError – if an argument’s type differs from the expected type
UnknownTypeError – if a type-argument fails to parse
OpenDPException – packaged error from the core OpenDP library
- opendp.trans.make_impute_constant(constant, DA='OptionNullDomain<AllDomain<TA>>')
Make a Transformation that replaces null/None data with constant. By default, the input type is Vec<Option<TA>>, as emitted by make_cast. Set DA to InherentNullDomain<AllDomain<TA>> for imputing on types that have an inherent representation of nullity, like floats.
- Parameters:
constant (Any) – Value to replace nulls with.
DA (Type Arguments) – domain of data being imputed. This is OptionNullDomain<AllDomain<TA>> or InherentNullDomain<AllDomain<TA>>
- Returns:
An impute_constant step.
- Return type:
- Raises:
AssertionError – if an argument’s type differs from the expected type
UnknownTypeError – if a type-argument fails to parse
OpenDPException – packaged error from the core OpenDP library
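Constant imputation over both kinds of null can be sketched in plain Python. Here None stands in for the Option-style null and NaN for the inherent float null; the helper names are illustrative and this is not the library's code.

```python
import math


def is_null(x) -> bool:
    """True for None (Option-style null) or a float NaN (inherent null)."""
    return x is None or (isinstance(x, float) and math.isnan(x))


def impute_constant(data: list, constant) -> list:
    """Replace every null entry with `constant`, leaving other entries as they are."""
    return [constant if is_null(x) else x for x in data]


print(impute_constant([1.0, None, float("nan")], 0.0))
```

The constant itself should not be null, otherwise the output would still contain nulls.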
- opendp.trans.make_impute_uniform_float(bounds, TA=None)
Make a Transformation that replaces null/None data in Vec<TA> with uniformly distributed floats within bounds.
- Parameters:
bounds (Tuple[Any, Any]) – Tuple of inclusive lower and upper bounds.
TA (Type Arguments) – type of data being imputed
- Returns:
An impute_uniform_float step.
- Return type:
- Raises:
AssertionError – if an argument’s type differs from the expected type
UnknownTypeError – if a type-argument fails to parse
OpenDPException – packaged error from the core OpenDP library
- opendp.trans.make_index(categories, null, TOA=None)
Index into a vector of categories.
- Parameters:
categories (Any) – The set of categories to index into.
null (Any) – Category to return if the index is out-of-range of the category set.
TOA (Type Arguments) – atomic output type. Output data will be Vec<TOA>.
- Returns:
An index step.
- Return type:
- Raises:
AssertionError – if an argument’s type differs from the expected type
UnknownTypeError – if a type-argument fails to parse
OpenDPException – packaged error from the core OpenDP library
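make_find and make_index are roughly inverse lookups: one maps values to positions in the category set, the other maps positions back to categories. A plain-Python sketch of that round trip follows. The helper names are illustrative, and representing "not found" as None is an assumption.

```python
from typing import List, Optional, Sequence


def find(data: Sequence, categories: Sequence) -> List[Optional[int]]:
    """Position of each value in `categories`, or None when it is absent."""
    positions = {cat: i for i, cat in enumerate(categories)}
    return [positions.get(x) for x in data]


def index(indices: Sequence[Optional[int]], categories: Sequence, null) -> list:
    """Category at each position; out-of-range (or None) positions yield `null`."""
    out = []
    for i in indices:
        if i is not None and 0 <= i < len(categories):
            out.append(categories[i])
        else:
            out.append(null)
    return out


categories = ["a", "b", "c"]
found = find(["b", "z", "a"], categories)
restored = index(found, categories, null="?")
print(found, restored)
```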
- opendp.trans.make_is_equal(value, TIA=None)
Make a Transformation that checks if each element is equal to value.
- Parameters:
value (Any) – value to check against
TIA (Type Arguments) – atomic input data type
- Returns:
An is_equal step.
- Return type:
- Raises:
AssertionError – if an argument’s type differs from the expected type
UnknownTypeError – if a type-argument fails to parse
OpenDPException – packaged error from the core OpenDP library
- opendp.trans.make_is_null(DIA)
Make a Transformation that checks if each element in a vector is null.
- Parameters:
DIA (Type Arguments) – atomic input domain
- Returns:
An is_null step.
- Return type:
- Raises:
AssertionError – if an argument’s type differs from the expected type
UnknownTypeError – if a type-argument fails to parse
OpenDPException – packaged error from the core OpenDP library
- opendp.trans.make_lipschitz_float_mul(constant, bounds, D='AllDomain<T>', M='AbsoluteDistance<T>')
Multiply an aggregate by a constant.
- Parameters:
constant – The constant to multiply aggregates by.
bounds (Tuple[Any, Any]) – Tuple of inclusive lower and upper bounds of the input data.
D (Type Arguments) – Domain of the function. Must be AllDomain<T> or VectorDomain<AllDomain<T>>
M (Type Arguments) – Metric. Must be AbsoluteDistance<T>, L1Distance<T> or L2Distance<T>
- Returns:
A lipschitz_float_mul step.
- Return type:
- Raises:
AssertionError – if an argument’s type differs from the expected type
UnknownTypeError – if a type-argument fails to parse
OpenDPException – packaged error from the core OpenDP library
- opendp.trans.make_metric_bounded(size, TA, MI='SymmetricDistance')
Make a Transformation that converts the unbounded dataset metric MI to the respective bounded dataset metric with a no-op. If MI is “SymmetricDistance”, the output metric is “ChangeOneDistance”; if MI is “InsertDeleteDistance”, the output metric is “HammingDistance”.
- Parameters:
size (int) – Number of records in input data.
MI (DatasetMetric) – input dataset metric.
TA (Type Arguments) – atomic type of data
- Returns:
A metric_bounded step.
- Return type:
- Raises:
AssertionError – if an argument’s type differs from the expected type
UnknownTypeError – if a type-argument fails to parse
OpenDPException – packaged error from the core OpenDP library
- opendp.trans.make_metric_unbounded(size, TA, MI='ChangeOneDistance')
Make a Transformation that converts the bounded dataset metric MI to the respective unbounded dataset metric with a no-op. If MI is “ChangeOneDistance”, the output metric is “SymmetricDistance”; if MI is “HammingDistance”, the output metric is “InsertDeleteDistance”.
- Parameters:
size (int) – Number of records in input data.
MI (DatasetMetric) – input dataset metric.
TA (Type Arguments) – atomic type of data
- Returns:
A metric_unbounded step.
- Return type:
- Raises:
AssertionError – if an argument’s type differs from the expected type
UnknownTypeError – if a type-argument fails to parse
OpenDPException – packaged error from the core OpenDP library
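The metric correspondences used by make_metric_bounded and make_metric_unbounded can be summarized as a pair of lookup tables. The dictionaries and helper functions below are a reading aid built from the descriptions above, not library objects.

```python
# Unbounded dataset metric -> bounded counterpart, as described above.
TO_BOUNDED = {
    "SymmetricDistance": "ChangeOneDistance",
    "InsertDeleteDistance": "HammingDistance",
}

# The reverse direction is the inverse mapping.
TO_UNBOUNDED = {bounded: unbounded for unbounded, bounded in TO_BOUNDED.items()}


def metric_bounded(mi: str) -> str:
    """Output metric name for a given unbounded input metric name."""
    if mi not in TO_BOUNDED:
        raise ValueError(f"{mi} is not an unbounded dataset metric")
    return TO_BOUNDED[mi]


def metric_unbounded(mi: str) -> str:
    """Output metric name for a given bounded input metric name."""
    if mi not in TO_UNBOUNDED:
        raise ValueError(f"{mi} is not a bounded dataset metric")
    return TO_UNBOUNDED[mi]


print(metric_bounded("SymmetricDistance"))
print(metric_unbounded("HammingDistance"))
```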
- opendp.trans.make_ordered_random(TA)
Make a Transformation that converts the unordered dataset metric SymmetricDistance to the respective ordered dataset metric InsertDeleteDistance by assigning a random permutation. Operates exclusively on VectorDomain<AllDomain<TA>>.
- Parameters:
TA (Type Arguments) – atomic type of data
- Returns:
An ordered_random step.
- Return type:
- Raises:
AssertionError – if an argument’s type differs from the expected type
UnknownTypeError – if a type-argument fails to parse
OpenDPException – packaged error from the core OpenDP library
- opendp.trans.make_resize(size, constant, MI='SymmetricDistance', MO='SymmetricDistance', TA=None)
Make a Transformation that either truncates or imputes records with constant in a Vec<TA> to match a provided size.
- Parameters:
size (int) – Number of records in output data.
constant (Any) – Value to impute with.
MI (Type Arguments) – Input Metric. One of InsertDeleteDistance or SymmetricDistance
MO (Type Arguments) – Output Metric. One of InsertDeleteDistance or SymmetricDistance
TA (Type Arguments) – Atomic type.
- Returns:
A vector of the same type TA, but with the provided size.
- Return type:
- Raises:
AssertionError – if an argument’s type differs from the expected type
UnknownTypeError – if a type-argument fails to parse
OpenDPException – packaged error from the core OpenDP library
- opendp.trans.make_select_column(key, TOA, K=None)
Make a Transformation that retrieves the column key from a dataframe as Vec<TOA>.
- Parameters:
key (Any) – key/name of the column to retrieve. Must be of a categorical/hashable data type.
K (Type Arguments) – data type of the key
TOA (Type Arguments) – atomic data type to downcast to
- Returns:
A select_column step.
- Return type:
- Raises:
AssertionError – if an argument’s type differs from the expected type
UnknownTypeError – if a type-argument fails to parse
OpenDPException – packaged error from the core OpenDP library
- opendp.trans.make_sized_bounded_float_checked_sum(size, bounds, S='Pairwise<T>')
Make a Transformation that computes the sum of bounded floats with known dataset size. This uses a restricted-sensitivity proof that takes advantage of known dataset size for better utility.
- Parameters:
size (int) – Number of records in input data.
bounds (Tuple[Any, Any]) – Tuple of lower and upper bounds for input data
S (Type Arguments) – summation algorithm to use on data type T. One of Sequential<T> or Pairwise<T>.
- Returns:
A sized_bounded_float_checked_sum step.
- Return type:
- Raises:
AssertionError – if an argument’s type differs from the expected type
UnknownTypeError – if a type-argument fails to parse
OpenDPException – packaged error from the core OpenDP library
- opendp.trans.make_sized_bounded_float_ordered_sum(size, bounds, S='Pairwise<T>')
Make a Transformation that computes the sum of bounded floats with known dataset size. This uses a restricted-sensitivity proof that takes advantage of known dataset size for better utility. You may need to use make_ordered_random to impose an ordering on the data.
- Parameters:
size (int) – Number of records in input data.
bounds (Tuple[Any, Any]) – Tuple of lower and upper bounds for input data
S (Type Arguments) – summation algorithm to use on data type T. One of Sequential<T> or Pairwise<T>.
- Returns:
A sized_bounded_float_ordered_sum step.
- Return type:
- Raises:
AssertionError – if an argument’s type differs from the expected type
UnknownTypeError – if a type-argument fails to parse
OpenDPException – packaged error from the core OpenDP library
- opendp.trans.make_sized_bounded_int_checked_sum(size, bounds, T=None)
Make a Transformation that computes the sum of bounded ints. The effective range is reduced, as (bounds * size) must not overflow.
- Parameters:
size (int) – Number of records in input data.
bounds (Tuple[Any, Any]) – Tuple of lower and upper bounds for input data
T (Type Arguments) – atomic type of data
- Returns:
A sized_bounded_int_checked_sum step.
- Return type:
- Raises:
AssertionError – if an argument’s type differs from the expected type
UnknownTypeError – if a type-argument fails to parse
OpenDPException – packaged error from the core OpenDP library
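The requirement that bounds * size must not overflow can be checked ahead of time. The following plain-Python sketch assumes a signed integer type of a given bit width; the helper name sized_bounds_fit is hypothetical and the check is an illustration, not the library's validation code.

```python
def sized_bounds_fit(size: int, bounds: tuple, bits: int = 32) -> bool:
    """True if the largest and smallest possible sums fit in a signed `bits`-bit int."""
    lower, upper = bounds
    type_min, type_max = -2 ** (bits - 1), 2 ** (bits - 1) - 1
    # With exactly `size` records in [lower, upper], the sum lies in
    # [lower * size, upper * size]; both ends must be representable.
    return type_min <= lower * size and upper * size <= type_max


print(sized_bounds_fit(1000, (0, 100)))
print(sized_bounds_fit(2 ** 20, (0, 2 ** 12)))
```

The second call fails the check because 2^20 * 2^12 = 2^32 exceeds the 32-bit signed maximum, which is the sense in which the effective range of the bounds is reduced as size grows.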
- opendp.trans.make_sized_bounded_int_monotonic_sum(size, bounds, T=None)
Make a Transformation that computes the sum of bounded ints with known dataset size, where all values share the same sign. This uses a restricted-sensitivity proof that takes advantage of known dataset size for better utility.
- Parameters:
size (int) – Number of records in input data.
bounds (Tuple[Any, Any]) – Tuple of lower and upper bounds for input data
T (Type Arguments) – atomic type of data
- Returns:
A sized_bounded_int_monotonic_sum step.
- Return type:
- Raises:
AssertionError – if an argument’s type differs from the expected type
UnknownTypeError – if a type-argument fails to parse
OpenDPException – packaged error from the core OpenDP library
- opendp.trans.make_sized_bounded_int_ordered_sum(size, bounds, T=None)
Make a Transformation that computes the sum of bounded ints with known dataset size. This uses a restricted-sensitivity proof that takes advantage of known dataset size for better utility. You may need to use make_ordered_random to impose an ordering on the data.
- Parameters:
size (int) – Number of records in input data.
bounds (Tuple[Any, Any]) – Tuple of lower and upper bounds for input data
T (Type Arguments) – atomic type of data
- Returns:
A sized_bounded_int_ordered_sum step.
- Return type:
- Raises:
AssertionError – if an argument’s type differs from the expected type
UnknownTypeError – if a type-argument fails to parse
OpenDPException – packaged error from the core OpenDP library
- opendp.trans.make_sized_bounded_int_split_sum(size, bounds, T=None)
Make a Transformation that computes the sum of bounded ints with known dataset size. This uses a restricted-sensitivity proof that takes advantage of known dataset size for better utility. Adds the saturating sum of the positives to the saturating sum of the negatives.
- Parameters:
size (int) – Number of records in input data.
bounds (Tuple[Any, Any]) – Tuple of lower and upper bounds for input data
T (Type Arguments) – atomic type of data
- Returns:
A sized_bounded_int_split_sum step.
- Return type:
- Raises:
AssertionError – if an argument’s type differs from the expected type
UnknownTypeError – if a type-argument fails to parse
OpenDPException – packaged error from the core OpenDP library
- opendp.trans.make_sized_bounded_mean(size, bounds, MI='SymmetricDistance', T=None)
Make a Transformation that computes the mean of bounded data. This uses a restricted-sensitivity proof that takes advantage of known dataset size. Use make_clamp to bound data and make_bounded_resize to establish dataset size.
- Parameters:
size (int) – Number of records in input data.
bounds (Tuple[Any, Any]) – Tuple of inclusive lower and upper bounds of the input data.
MI (Type Arguments) – Input Metric. One of SymmetricDistance or InsertDeleteDistance
T (Type Arguments) – atomic data type
- Returns:
A sized_bounded_mean step.
- Return type:
- Raises:
AssertionError – if an argument’s type differs from the expected type
UnknownTypeError – if a type-argument fails to parse
OpenDPException – packaged error from the core OpenDP library
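The preprocessing that the mean relies on (clamp to bound the data, then resize to fix the dataset size) can be traced in plain Python. This sketch mirrors the order of steps only; the helper names are illustrative, the truncation rule is an assumption, and no library calls are used.

```python
def clamp(data: list, bounds: tuple) -> list:
    """Pin every value to the inclusive interval [lower, upper]."""
    lower, upper = bounds
    return [min(max(x, lower), upper) for x in data]


def resize(data: list, size: int, constant) -> list:
    """Truncate or pad with `constant` so the result has exactly `size` records."""
    # Assumption: truncation keeps the leading records.
    return list(data[:size]) + [constant] * max(0, size - len(data))


def sized_bounded_mean(data: list, size: int, bounds: tuple, constant) -> float:
    """Clamp, resize, then divide the sum by the known size."""
    prepared = resize(clamp(data, bounds), size, constant)
    return sum(prepared) / size


# Clamped to [0, 3, 10], padded to [0, 3, 10, 5], mean = 18 / 4
print(sized_bounded_mean([-5, 3, 12], size=4, bounds=(0, 10), constant=5))
```

Dividing by the fixed size rather than by the observed length is what lets the analysis treat the denominator as public.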
- opendp.trans.make_sized_bounded_ordered_random(size, bounds, TA, MI='SymmetricDistance')
Make a Transformation that converts the unordered dataset metric MI to the respective ordered dataset metric by assigning a random permutation. Operates exclusively on SizedDomain<VectorDomain<BoundedDomain<TA>>>. If MI is “SymmetricDistance”, the output metric is “InsertDeleteDistance”; if MI is “ChangeOneDistance”, the output metric is “HammingDistance”.
- Parameters:
size (int) – Number of records in input data.
bounds (Tuple[Any, Any]) – Tuple of inclusive lower and upper bounds.
MI (DatasetMetric) – input dataset metric
TA (Type Arguments) – atomic type of data
- Returns:
A sized_bounded_ordered_random step.
- Return type:
- Raises:
AssertionError – if an argument’s type differs from the expected type
UnknownTypeError – if a type-argument fails to parse
OpenDPException – packaged error from the core OpenDP library
- opendp.trans.make_sized_bounded_sum(size, bounds, MI='SymmetricDistance', T=None)
Make a Transformation that computes the sum of bounded data with known dataset size. This uses a restricted-sensitivity proof that takes advantage of known dataset size for better utility. Use make_clamp to bound data and make_bounded_resize to establish dataset size.
- Parameters:
size (int) – Number of records in input data.
bounds (Tuple[Any, Any]) – Tuple of lower and upper bounds for input data
MI (Type Arguments) – Input Metric. One of SymmetricDistance or InsertDeleteDistance.
T (Type Arguments) – atomic type of data
- Returns:
A sized_bounded_sum step.
- Return type:
- Raises:
AssertionError – if an argument’s type differs from the expected type
UnknownTypeError – if a type-argument fails to parse
OpenDPException – packaged error from the core OpenDP library
- opendp.trans.make_sized_bounded_sum_of_squared_deviations(size, bounds, S='Pairwise<T>')
Make a Transformation that computes the sum of squared deviations of bounded data. This uses a restricted-sensitivity proof that takes advantage of known dataset size. Use make_clamp to bound data and make_bounded_resize to establish dataset size.
- Parameters:
size (int) – Number of records in input data.
bounds (Tuple[Any, Any]) – Tuple of lower and upper bounds for input data
S (Type Arguments) – summation algorithm to use on data type T. One of Sequential<T> or Pairwise<T>.
- Returns:
A sized_bounded_sum_of_squared_deviations step.
- Return type:
- Raises:
AssertionError – if an argument’s type differs from the expected type
UnknownTypeError – if a type-argument fails to parse
OpenDPException – packaged error from the core OpenDP library
- opendp.trans.make_sized_bounded_unordered(size, bounds, TA, MI='InsertDeleteDistance')
Make a Transformation that converts the ordered dataset metric MI to the respective unordered dataset metric with a no-op. Operates exclusively on SizedDomain<VectorDomain<BoundedDomain<TA>>>. If MI is “InsertDeleteDistance”, the output metric is “SymmetricDistance”; if MI is “HammingDistance”, the output metric is “ChangeOneDistance”.
- Parameters:
size (int) – Number of records in input data.
bounds (Tuple[Any, Any]) – Tuple of inclusive lower and upper bounds.
MI (DatasetMetric) – input dataset metric
TA (Type Arguments) – atomic type of data
- Returns:
A sized_bounded_unordered step.
- Return type:
- Raises:
AssertionError – if an argument’s type differs from the expected type
UnknownTypeError – if a type-argument fails to parse
OpenDPException – packaged error from the core OpenDP library
- opendp.trans.make_sized_bounded_variance(size, bounds, ddof=1, S='Pairwise<T>')
Make a Transformation that computes the variance of bounded data. This uses a restricted-sensitivity proof that takes advantage of known dataset size. Use make_clamp to bound data and make_bounded_resize to establish dataset size.
- Parameters:
size (int) – Number of records in input data.
bounds (Tuple[Any, Any]) – Tuple of lower and upper bounds for input data
ddof (int) – Delta degrees of freedom. Set to 0 if not a sample, 1 for sample estimate.
S (Type Arguments) – summation algorithm to use on data type T. One of Sequential<T> or Pairwise<T>.
- Returns:
A sized_bounded_variance step.
- Return type:
- Raises:
AssertionError – if an argument’s type differs from the expected type
UnknownTypeError – if a type-argument fails to parse
OpenDPException – packaged error from the core OpenDP library
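The role of ddof, and the relation between the variance and the sum of squared deviations, can be shown with a short plain-Python sketch. The helper names are illustrative and the code is not the library's implementation.

```python
def sum_of_squared_deviations(data: list) -> float:
    """Sum of (x - mean)^2 over all records."""
    mean = sum(data) / len(data)
    return sum((x - mean) ** 2 for x in data)


def variance(data: list, ddof: int = 1) -> float:
    """Sum of squared deviations divided by (n - ddof)."""
    n = len(data)
    if n - ddof <= 0:
        raise ValueError("need more records than ddof")
    return sum_of_squared_deviations(data) / (n - ddof)


data = [1.0, 2.0, 3.0, 4.0]
print(variance(data, ddof=0))  # population variance: divide by n
print(variance(data, ddof=1))  # sample estimate: divide by n - 1
```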
- opendp.trans.make_sized_ordered_random(size, TA, MI='SymmetricDistance')
Make a Transformation that converts the unordered dataset metric MI to the respective ordered dataset metric by assigning a random permutation. Operates exclusively on SizedDomain<VectorDomain<AllDomain<TA>>>. If MI is “SymmetricDistance”, the output metric is “InsertDeleteDistance”; if MI is “ChangeOneDistance”, the output metric is “HammingDistance”.
- Parameters:
size (int) – Number of records in input data.
MI (DatasetMetric) – input dataset metric
TA (Type Arguments) – atomic type of data
- Returns:
A sized_ordered_random step.
- Return type:
- Raises:
AssertionError – if an argument’s type differs from the expected type
UnknownTypeError – if a type-argument fails to parse
OpenDPException – packaged error from the core OpenDP library
- opendp.trans.make_sized_unordered(size, TA, MI='InsertDeleteDistance')
Make a Transformation that converts the ordered dataset metric MI to the respective unordered dataset metric with a no-op. Operates exclusively on SizedDomain<VectorDomain<AllDomain<TA>>>. If MI is “InsertDeleteDistance”, the output metric is “SymmetricDistance”; if MI is “HammingDistance”, the output metric is “ChangeOneDistance”.
- Parameters:
size (int) – Number of records in input data.
MI (DatasetMetric) – input dataset metric.
TA (Type Arguments) – atomic type of data
- Returns:
A sized_unordered step.
- Return type:
- Raises:
AssertionError – if an argument’s type differs from the expected type
UnknownTypeError – if a type-argument fails to parse
OpenDPException – packaged error from the core OpenDP library
- opendp.trans.make_split_dataframe(separator, col_names, K=None)
Make a Transformation that splits each record in a String into a Vec<Vec<String>>, and loads the resulting table into a dataframe keyed by col_names.
- Parameters:
separator (str) – The token(s) that separate entries in each record.
col_names (Any) – Column names for each record entry.
K (Type Arguments) – categorical/hashable data type of column names
- Returns:
A split_dataframe step.
- Return type:
- Raises:
AssertionError – if an argument’s type differs from the expected type
UnknownTypeError – if a type-argument fails to parse
OpenDPException – packaged error from the core OpenDP library
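The shape of the data at each stage (string, lines, records, keyed columns) can be traced with a plain-Python sketch. The helper names mirror the steps described above but are illustrative only. Padding short records with empty strings is an assumption, as is representing the dataframe as a dict of string columns.

```python
def split_lines(text: str) -> list:
    """One string -> list of lines."""
    return text.splitlines()


def split_records(lines: list, separator: str) -> list:
    """List of lines -> list of records, each a list of string entries."""
    return [line.split(separator) for line in lines]


def create_dataframe(records: list, col_names: list) -> dict:
    """List of records -> dict mapping each column name to its column of strings."""
    return {
        # Assumption: a record with too few entries contributes an empty string.
        name: [rec[i] if i < len(rec) else "" for rec in records]
        for i, name in enumerate(col_names)
    }


def split_dataframe(text: str, separator: str, col_names: list) -> dict:
    """Compose the three steps above."""
    return create_dataframe(split_records(split_lines(text), separator), col_names)


df = split_dataframe("1,a\n2,b", ",", ["id", "tag"])
print(df)
```

Every column still holds strings at this point; converting a selected column to numbers is a separate casting step.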
- opendp.trans.make_split_lines()
Make a Transformation that takes a string and splits it into a Vec<String> of its lines.
- Returns:
A split_lines step.
- Return type:
- Raises:
AssertionError – if an argument’s type differs from the expected type
UnknownTypeError – if a type-argument fails to parse
OpenDPException – packaged error from the core OpenDP library
- opendp.trans.make_split_records(separator)
Make a Transformation that splits each record in a Vec<String> into a Vec<Vec<String>>.
- Parameters:
separator (str) – The token(s) that separate entries in each record.
- Returns:
A split_records step.
- Return type:
- Raises:
AssertionError – if an argument’s type differs from the expected type
UnknownTypeError – if a type-argument fails to parse
OpenDPException – packaged error from the core OpenDP library
- opendp.trans.make_unclamp(bounds, TA=None)
Make a Transformation that unclamps a VectorDomain<BoundedDomain<TA>> to a VectorDomain<AllDomain<TA>>.
- Parameters:
bounds (Tuple[Any, Any]) – Tuple of inclusive lower and upper bounds.
TA (Type Arguments) – atomic data type
- Returns:
An unclamp step.
- Return type:
- Raises:
AssertionError – if an argument’s type differs from the expected type
UnknownTypeError – if a type-argument fails to parse
OpenDPException – packaged error from the core OpenDP library
- opendp.trans.make_unordered(TA)
Make a Transformation that converts the ordered dataset metric InsertDeleteDistance to the respective unordered dataset metric SymmetricDistance with a no-op. Operates exclusively on VectorDomain<AllDomain<TA>>.
- Parameters:
TA (Type Arguments) – atomic type of data
- Returns:
An unordered step.
- Return type:
- Raises:
AssertionError – if an argument’s type differs from the expected type
UnknownTypeError – if a type-argument fails to parse
OpenDPException – packaged error from the core OpenDP library