This documentation is for a development version of OpenDP.

The current release of OpenDP is v0.11.1.

opendp.context module

The context module provides opendp.context.Context and supporting utilities.

For more context, see context in the User Guide.

For convenience, all the functions of this module are also available from opendp.prelude. We suggest importing under the conventional name dp:

>>> import opendp.prelude as dp

class opendp.context.Context(accountant, queryable, d_in, d_mids=None, d_out=None, space_override=None)

A Context coordinates queries to an instance of a privacy accountant.

It is recommended to use the Context.compositor() static method instead of calling this constructor directly.

Parameters:
  • accountant (Measurement) – The measurement used to spawn the queryable.

  • queryable (Queryable) – Executes the queries and tracks the privacy expenditure.

  • d_in (float) – An upper bound on the distance between adjacent datasets.

  • d_mids (list[float] | None) – A sequence of privacy losses for each query to be sent to the queryable. Used for compositors.

  • d_out (float | None) – An upper bound on the overall privacy loss. Used for filters.

  • space_override (tuple[Domain, Metric] | None) –

accountant: Measurement

The accountant is the measurement used to spawn the queryable. It contains information about the queryable, such as the input domain, input metric, and output measure expected of measurement queries sent to the queryable.

static compositor(data, privacy_unit, privacy_loss, split_evenly_over=None, split_by_weights=None, domain=None, margins=None)

Constructs a new context containing a sequential compositor with the given weights.

If the domain is not specified, it will be inferred from the data. This makes the assumption that the structure of the data is public information.

split_evenly_over and split_by_weights are mutually exclusive.

Parameters:
  • data (Any) – The data to be analyzed.

  • privacy_unit (tuple[Metric, float]) – The privacy unit of the compositor.

  • privacy_loss (tuple[Measure, Any]) – The privacy loss of the compositor.

  • split_evenly_over (int | None) – The number of parts over which to evenly distribute the privacy loss.

  • split_by_weights (list[float] | None) – A list of weights for each intermediate privacy loss.

  • domain (Domain | None) – The domain of the data.

  • margins (dict[tuple[str, ...], Margin] | None) – A dictionary where the keys are grouping columns and values describe known properties of the respective margins.

Return type:

Context
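
To make the two splitting options concrete, here is a minimal sketch of how a total privacy loss could be divided into per-query losses. It is not OpenDP's implementation: split_loss is a hypothetical helper, and it assumes pure ε-DP under basic composition, where the per-query epsilons simply add up to the total. Returning None when neither option is given is also an assumption, standing in for a context where each query states its own loss.

```python
def split_loss(total, split_evenly_over=None, split_by_weights=None):
    """Hypothetical helper: divide a total epsilon into per-query epsilons."""
    # The two options are mutually exclusive, mirroring the documented constraint.
    if split_evenly_over is not None and split_by_weights is not None:
        raise ValueError("split_evenly_over and split_by_weights are mutually exclusive")
    if split_evenly_over is None and split_by_weights is None:
        return None  # assumption: no fixed schedule, losses are chosen per query
    if split_evenly_over is not None:
        weights = [1.0] * split_evenly_over
    else:
        weights = list(split_by_weights)
    norm = sum(weights)
    # Each query receives a share of the total proportional to its weight.
    return [total * w / norm for w in weights]


even = split_loss(1.0, split_evenly_over=4)
weighted = split_loss(1.0, split_by_weights=[1.0, 3.0])
print(even)
print(weighted)
```

Under these assumptions the shares always sum to the total, so a schedule built this way cannot exceed the overall privacy loss.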

query(**kwargs)

Starts a new Query to be executed in this context.

If the context has been constructed with a sequence of privacy losses, the next loss will be used. Otherwise, the loss will be computed from the kwargs.

Parameters:

kwargs – The privacy loss to use for the query. Passed directly into loss_of().

Return type:

Query | LazyFrameQuery
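
The rule for choosing a query's privacy loss can be modeled in a few lines. The sketch below is a toy model, not the library's code: ToyContext and next_loss are hypothetical names, the loss is reduced to a single epsilon keyword, and the scheduled loss is consumed as soon as it is selected, which may differ from when OpenDP actually deducts it.

```python
class ToyContext:
    """Hypothetical stand-in for a context; models only privacy-loss selection."""

    def __init__(self, d_mids=None):
        # d_mids: optional pre-planned sequence of per-query losses.
        self.d_mids = list(d_mids) if d_mids is not None else None

    def next_loss(self, **kwargs):
        if self.d_mids is not None:
            # A schedule exists: use the next planned loss.
            if not self.d_mids:
                raise ValueError("all planned privacy losses have been used")
            return self.d_mids.pop(0)
        # No schedule: the caller states the loss for this query.
        return kwargs["epsilon"]


planned = ToyContext(d_mids=[0.25, 0.75])
first = planned.next_loss()
second = planned.next_loss()
adhoc = ToyContext().next_loss(epsilon=0.5)
print(first, second, adhoc)
```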

queryable: Queryable

The queryable executes the queries and tracks the privacy expenditure.

class opendp.context.PartialChain(f, *args, **kwargs)

A partial chain is a transformation or measurement that is missing one numeric parameter.

The parameter can be solved for by calling the fix method, which returns the closest transformation or measurement that satisfies the given stability or privacy constraint.

fix(d_in, d_out, output_measure=None, T=None)

Returns the closest transformation or measurement that satisfies the given stability or privacy constraint.

The discovered parameter is assigned to the param attribute of the returned transformation or measurement.

Parameters:
  • d_in (float) –

  • d_out (float) –

  • output_measure (Measure | None) –
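
One way to picture how a missing numeric parameter can be solved for is a bisection search. The sketch below is illustrative only: privacy_map assumes a Laplace-style relationship where the loss equals d_in / scale, solve_scale is a hypothetical helper, and the search bounds and iteration count are arbitrary choices. OpenDP's own search routine may work differently.

```python
def privacy_map(scale, d_in):
    """Assumed Laplace-style map: the loss shrinks as the noise scale grows."""
    return d_in / scale


def solve_scale(d_in, d_out, lo=1e-9, hi=1e9, iterations=200):
    """Hypothetical bisection for the smallest scale whose loss is <= d_out."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if privacy_map(mid, d_in) <= d_out:
            hi = mid  # constraint met: try less noise
        else:
            lo = mid  # constraint violated: more noise is needed
    # hi is only ever assigned a value that satisfies the constraint.
    return hi


scale = solve_scale(d_in=1.0, d_out=0.5)
print(scale)
```

Here "closest" is read as the least noisy parameter that still meets the bound, which is why the search returns the upper end of the bracket.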

partial: Callable[[float], Transformation | Measurement]

The partial transformation or measurement.

classmethod wrap(f)

Wraps a constructor for a transformation or measurement to return a partial chain instead.

class opendp.context.Query(chain, output_measure=None, d_in=None, d_out=None, context=None, _wrap_release=None)

A helper API to build a measurement.

compositor(split_evenly_over=None, split_by_weights=None, d_out=None, output_measure=None)

Constructs a new context containing a sequential compositor with the given weights.

split_evenly_over and split_by_weights are mutually exclusive.

Parameters:
  • split_evenly_over (int | None) – The number of parts over which to evenly distribute the privacy loss.

  • split_by_weights (list[float] | None) – A list of weights for each intermediate privacy loss.

  • d_out (float | None) –

  • output_measure (Measure | None) –

Return type:

Query

new_with(*, chain, wrap_release=None)

Convenience constructor that creates a new query with a different chain.

Parameters:

chain (tuple[Domain, Metric] | Transformation | Measurement | PartialChain) –

Return type:

Query

param()

Returns the discovered parameter, if there is one.

release()

Release the query. The query must be part of a context.

Return type:

Any

resolve(allow_transformations=False)

Resolve the query into a measurement.

Parameters:

allow_transformations (bool) – If true, allow the response to be a transformation instead of a measurement.

opendp.context.domain_of(T, infer=False)

Constructs an instance of a domain from carrier type T, or from an example.

Accepts a limited set of Python type expressions:

>>> import opendp.prelude as dp
>>> dp.domain_of(list[int])
VectorDomain(AtomDomain(T=i32))

As well as strings representing types in the underlying Rust syntax:

>>> dp.domain_of('Vec<int>')
VectorDomain(AtomDomain(T=i32))

Dictionaries, optional types, and a range of primitive types are supported:

>>> dp.domain_of(dict[str, int])
MapDomain { key_domain: AtomDomain(T=String), value_domain: AtomDomain(T=i32) }
>>> dp.domain_of('Option<int>')  # Python's `Optional` is not supported.
OptionDomain(AtomDomain(T=i32))
>>> dp.domain_of(dp.i32)
AtomDomain(T=i32)

More complex types are not supported:

>>> dp.domain_of(list[list[int]]) 
Traceback (most recent call last):
...
opendp.mod.OpenDPException:
  FFI("VectorDomain constructor only supports AtomDomain or UserDomain inner domains")

Alternatively, an example of the data can be provided, but note that passing sensitive data may result in a privacy violation:

>>> dp.domain_of([1, 2, 3], infer=True)
VectorDomain(AtomDomain(T=i32))

Parameters:
  • T – Carrier type.

  • infer (bool) – If True, T is an example of the sensitive dataset. Passing sensitive data may result in a privacy violation.

Return type:

Domain

opendp.context.loss_of(epsilon=None, delta=None, rho=None)

Constructs a privacy loss, consisting of a privacy measure and a privacy loss parameter.

>>> import opendp.prelude as dp
>>> dp.loss_of(epsilon=1.0)
(MaxDivergence(f64), 1.0)
>>> dp.loss_of(epsilon=1.0, delta=1e-9)
(FixedSmoothedMaxDivergence(f64), (1.0, 1e-09))
>>> dp.loss_of(rho=1.0)
(ZeroConcentratedDivergence(f64), 1.0)

Parameters:
  • epsilon (float | None) – Parameter for pure ε-DP.

  • delta (float | None) – Parameter for approximate (ε,δ)-DP.

  • rho (float | None) – Parameter for zero-concentrated ρ-DP.

Return type:

tuple[Measure, float | tuple[float, float]]

opendp.context.metric_of(M)

Constructs an instance of a metric from metric type M.

Return type:

Metric

opendp.context.space_of(T, M=None, infer=False)

A shorthand for building a metric space.

A metric space consists of a domain and a metric.

>>> import opendp.prelude as dp
>>> dp.space_of(list[int])
(VectorDomain(AtomDomain(T=i32)), SymmetricDistance())
>>> # the verbose form allows greater control:
>>> (dp.vector_domain(dp.atom_domain(T=dp.i32)), dp.symmetric_distance())
(VectorDomain(AtomDomain(T=i32)), SymmetricDistance())

Parameters:
  • T – Carrier type (the type of members in the domain).

  • M – Metric type.

  • infer (bool) – If True, T is an example of the sensitive dataset. Passing sensitive data may result in a privacy violation.

Return type:

tuple[Domain, Metric]

opendp.context.unit_of(*, contributions=None, changes=None, absolute=None, l1=None, l2=None, ordered=False, U=None)

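Constructs a unit of privacy, consisting of a metric and a dataset distance. The distance parameters (contributions, changes, absolute, l1, l2) are mutually exclusive; ordered and U modify the result rather than select a distance.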

>>> import opendp.prelude as dp
>>> dp.unit_of(contributions=3)
(SymmetricDistance(), 3)
>>> dp.unit_of(l1=2.0)
(L1Distance(f64), 2.0)

Parameters:
  • contributions (int | None) – Greatest number of records a privacy unit may contribute to microdata.

  • changes (int | None) – Greatest number of records a privacy unit may change in microdata.

  • absolute (float | None) – Greatest absolute distance by which a privacy unit can influence a scalar aggregate.

  • l1 (float | None) – Greatest l1 distance by which a privacy unit can influence a vector aggregate.

  • l2 (float | None) – Greatest l2 distance by which a privacy unit can influence a vector aggregate.

  • ordered (bool) – Set to True to use InsertDeleteDistance instead of SymmetricDistance, or HammingDistance instead of ChangeOneDistance.

  • U – The type of the dataset distance.

Return type:

tuple[Metric, float]
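
The relationship between the keyword arguments and the resulting metric can be summarized with a small stand-in function. This is a sketch, not the library's code: toy_unit_of is a hypothetical name, it returns metric names as plain strings, it ignores U, and the AbsoluteDistance and L2Distance names are assumed from OpenDP's metric naming rather than taken from the examples above.

```python
def toy_unit_of(*, contributions=None, changes=None, absolute=None,
                l1=None, l2=None, ordered=False):
    """Hypothetical mirror of unit_of that returns metric names as strings."""
    candidates = [
        ("contributions", contributions), ("changes", changes),
        ("absolute", absolute), ("l1", l1), ("l2", l2),
    ]
    given = {name: value for name, value in candidates if value is not None}
    # The distance parameters are mutually exclusive.
    if len(given) != 1:
        raise ValueError("specify exactly one distance parameter")
    name, distance = next(iter(given.items()))
    # ordered only changes the metric for record-level units.
    if ordered and name not in ("contributions", "changes"):
        raise ValueError("ordered only applies to contributions or changes")
    metric = {
        "contributions": "InsertDeleteDistance" if ordered else "SymmetricDistance",
        "changes": "HammingDistance" if ordered else "ChangeOneDistance",
        "absolute": "AbsoluteDistance",
        "l1": "L1Distance",
        "l2": "L2Distance",
    }[name]
    return metric, distance


unit_a = toy_unit_of(contributions=3)
unit_b = toy_unit_of(changes=1, ordered=True)
print(unit_a)
print(unit_b)
```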