
opendp.context module

The context module provides opendp.context.Context and supporting utilities.

For more background, see context in the User Guide.

For convenience, all the functions of this module are also available from opendp.prelude. We suggest importing under the conventional name dp:

>>> import opendp.prelude as dp

class opendp.context.Context(accountant, queryable, d_in, d_mids=None, d_out=None, space_override=None)

A Context coordinates queries to an instance of a privacy accountant.

It is recommended to use the make_sequential_composition constructor instead of this one.

Parameters:

accountant (Measurement) – The measurement used to spawn the queryable.
queryable (Queryable) – Executes the queries and tracks the privacy expenditure.
d_in (float) – An upper bound on the distance between adjacent datasets.
d_mids (Sequence[float] | None) – A sequence of privacy losses for each query to be sent to the queryable. Used for compositors.
d_out (float | None) – An upper bound on the overall privacy loss. Used for filters.
space_override (tuple[Domain, Metric] | None) –

accountant: Measurement

The accountant is the measurement used to spawn the queryable. It contains information about the queryable, such as the input domain, input metric, and output measure expected of measurement queries sent to the queryable.

static compositor(data, privacy_unit, privacy_loss, split_evenly_over=None, split_by_weights=None, domain=None, margins=None)

Constructs a new context containing a sequential compositor with the given weights.

If the domain is not specified, it will be inferred from the data. This makes the assumption that the structure of the data is public information.

split_evenly_over and split_by_weights are mutually exclusive.

When data is a Polars LazyFrame, queries are specified as a Polars compute plan. In addition, margins may be specified, which contain descriptors for the data under grouping.

Parameters:

data (Any) – The data to be analyzed.
privacy_unit (tuple[Metric, float]) – The privacy unit of the compositor.
privacy_loss (tuple[Measure, Any]) – The privacy loss of the compositor.
split_evenly_over (int | None) – The number of parts to evenly distribute the privacy loss.
split_by_weights (Sequence[float] | None) – A list of weights for each intermediate privacy loss.
domain (Domain | None) – The domain of the data.
margins (Sequence[Margin] | None) – Descriptors for grouped data.

Return type:

Context
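The two splitting options can be illustrated with plain arithmetic. The following is a minimal sketch, assuming pure ε-DP under basic sequential composition, where the per-query losses add up to the total. `split_loss` is a hypothetical helper and not part of OpenDP; the library's own allocation may differ, particularly for other privacy measures.

```python
from typing import Optional, Sequence


def split_loss(
    total: float,
    split_evenly_over: Optional[int] = None,
    split_by_weights: Optional[Sequence[float]] = None,
) -> list[float]:
    """Hypothetical helper: divide a total epsilon into per-query budgets."""
    # This sketch only covers the two split options, which are mutually
    # exclusive, so exactly one of them must be given.
    if (split_evenly_over is None) == (split_by_weights is None):
        raise ValueError(
            "specify exactly one of split_evenly_over or split_by_weights"
        )

    if split_evenly_over is not None:
        # An even split is a weighted split with equal weights.
        weights = [1.0] * split_evenly_over
    else:
        weights = list(split_by_weights)

    # Normalize so that the per-query budgets sum to the total.
    weight_sum = sum(weights)
    return [total * w / weight_sum for w in weights]


even = split_loss(1.0, split_evenly_over=4)
weighted = split_loss(1.0, split_by_weights=[1, 3])
print(even)      # four equal shares of the total
print(weighted)  # shares proportional to the weights 1 and 3
```

Under these assumptions, a larger weight gives a query a larger share of the budget, which typically means less noise for that query.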

deserialize_polars_plan(serialized_plan)

Given a serialized Polars plan, wraps it with a LazyFrameQuery. See the serialization documentation for context and a full example.

Parameters:

serialized_plan (bytes) – A plan like that returned by query.polars_plan.serialize()

Return type:

LazyFrameQuery

query(**kwargs)

Starts a new Query to be executed in this context.

If the context has been constructed with a sequence of privacy losses, the next loss will be used. Otherwise, the loss will be computed from the kwargs.

Parameters:

kwargs – The privacy loss to use for the query. Passed directly into loss_of().

Return type:

Query | LazyFrameQuery
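The choice between a pre-allocated loss and a loss taken from the keyword arguments can be sketched as follows. This is only an illustration of the rule described above: `LossSchedule` and `next_loss` are hypothetical names, the error cases are choices made for this sketch, and only an `epsilon` keyword is handled.

```python
from collections import deque
from typing import Any, Optional, Sequence


class LossSchedule:
    """Hypothetical stand-in for the per-query loss bookkeeping."""

    def __init__(self, d_mids: Optional[Sequence[float]] = None):
        # A pre-allocated sequence of per-query losses, if one was given.
        self._d_mids = deque(d_mids) if d_mids is not None else None

    def next_loss(self, **kwargs: Any) -> float:
        if self._d_mids is not None:
            # Sketch assumption: a fixed schedule and an explicit loss
            # cannot be combined.
            if kwargs:
                raise ValueError("losses were fixed up front")
            if not self._d_mids:
                raise ValueError("all pre-allocated losses have been used")
            # Consume the next pre-allocated loss.
            return self._d_mids.popleft()
        # No schedule: the loss comes from the keyword arguments.
        return kwargs["epsilon"]


fixed = LossSchedule(d_mids=[0.5, 0.25])
first = fixed.next_loss()
second = fixed.next_loss()

adaptive = LossSchedule()
custom = adaptive.next_loss(epsilon=0.1)
print(first, second, custom)
```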

queryable: Queryable

The queryable executes the queries and tracks the privacy expenditure.

class opendp.context.PartialChain(f, *args, **kwargs)

A partial chain is a transformation or measurement that is missing one numeric parameter.

The parameter can be solved for by calling the fix method, which returns the closest transformation or measurement that satisfies the given stability or privacy constraint.

fix(d_in, d_out, output_measure=None, T=None)

Returns the closest transformation or measurement that satisfies the given stability or privacy constraint.

The discovered parameter is assigned to the param attribute of the returned transformation or measurement.

Parameters:

d_in (float) –
d_out (float) –
output_measure (Measure | None) –
T –
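One way to picture solving for the missing parameter is a bisection over a monotone privacy map. The sketch below assumes a Laplace mechanism whose privacy map is ε = d_in / scale; `laplace_privacy_map` and `fix_scale` are hypothetical helpers written for illustration and do not reproduce OpenDP's own search.

```python
from typing import Callable


def laplace_privacy_map(scale: float) -> Callable[[float], float]:
    """Assumed privacy map of a Laplace mechanism: epsilon = d_in / scale."""
    return lambda d_in: d_in / scale


def fix_scale(
    d_in: float,
    d_out: float,
    lo: float = 1e-6,
    hi: float = 1e6,
    iterations: int = 100,
) -> float:
    """Hypothetical helper: bisect for the smallest scale that meets d_out."""
    # Larger scales give a smaller privacy loss, so the constraint is
    # monotone in the scale and bisection applies.
    if laplace_privacy_map(hi)(d_in) > d_out:
        raise ValueError("no scale in the search range satisfies d_out")
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if laplace_privacy_map(mid)(d_in) <= d_out:
            hi = mid  # mid satisfies the constraint; try a smaller scale
        else:
            lo = mid  # mid gives too little noise; raise the lower bound
    # hi always satisfies the constraint, so it is safe to return.
    return hi


scale = fix_scale(d_in=1.0, d_out=0.5)
print(scale)  # close to 2.0, since 1.0 / 2.0 == 0.5
```

Returning the upper end of the bracket keeps the result on the side that satisfies the privacy constraint, at the cost of a negligible amount of extra noise.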

partial: Callable[[float], Transformation | Measurement]

The partial transformation or measurement.

classmethod wrap(f)

Wraps a constructor for a transformation or measurement to return a partial chain instead.

Parameters:

f – function to wrap

class opendp.context.Query(chain, output_measure=None, d_in=None, d_out=None, context=None, _wrap_release=None)

Initializes the query with the given chain and output measure.

It is more convenient to use the context.query() constructor than this one. However, this can be used stand-alone to help build a transformation/measurement that is not part of a context.

Parameters:

chain (tuple[Domain, Metric] | Transformation | Measurement | PartialChain) – an initial metric space (tuple of domain and metric) or transformation
output_measure (Measure) – how privacy will be measured on the output of the query
d_in (float | None) – an upper bound on the distance between adjacent datasets
d_out (float | None) – an upper bound on the overall privacy loss
context (Context) – if specified, then when the query is released, the chain will be submitted to this context
_wrap_release (Callable[[Any], Any] | None) – for internal use only

compositor(split_evenly_over=None, split_by_weights=None, d_out=None, output_measure=None, alpha=None)

Constructs a new context containing a sequential compositor with the given weights.

split_evenly_over and split_by_weights are mutually exclusive.

Parameters:

split_evenly_over (int | None) – The number of parts to evenly distribute the privacy loss
split_by_weights (Sequence[float] | None) – A list of weights for each intermediate privacy loss
d_out (float | None) – Optional upper bound on privacy loss.
output_measure (Measure | None) – Optional privacy measure used for accounting by this compositor. Defaults to the same measure as the current query.
alpha (float | None) – Optional parameter that splits delta between the zCDP conversion and the δ-approximation when using approximate zCDP.

Return type:

Query

new_with(*, chain, wrap_release=None)

Convenience constructor that creates a new query with a different chain.

Parameters:

chain (tuple[Domain, Metric] | Transformation | Measurement | PartialChain) – the prior query. Either a metric space or transformation
wrap_release – a function to apply to releases

Return type:

Query

param()

Returns the discovered parameter, if there is one.

release()

Release the query. The query must be part of a context.

Return type:

Any

resolve(allow_transformations=False)

Resolve the query into a measurement.

Parameters:

allow_transformations (bool) – If true, allow the response to be a transformation instead of a measurement.

opendp.context.domain_of(T, infer=False)

Constructs an instance of a domain from carrier type T, or from an example.

Accepts a limited set of Python type expressions:

>>> import opendp.prelude as dp
>>> dp.domain_of(list[int])
VectorDomain(AtomDomain(T=i32))

As well as strings representing types in the underlying Rust syntax:

>>> dp.domain_of('Vec<int>')
VectorDomain(AtomDomain(T=i32))

Dictionaries, optional types, and a range of primitive types are supported:

>>> dp.domain_of(dict[str, int])
MapDomain { key_domain: AtomDomain(T=String), value_domain: AtomDomain(T=i32) }

>>> dp.domain_of('Option<int>')  # Python's `Optional` is not supported.
OptionDomain(AtomDomain(T=i32))
>>> dp.domain_of(dp.i32)
AtomDomain(T=i32)

More complex types are not supported:

>>> dp.domain_of(list[list[int]]) 
Traceback (most recent call last):
...
opendp.mod.OpenDPException:
  FFI("Inner domain of VectorDomain must be AtomDomain or ExtrinsicDomain (created via user_domain)")

Alternatively, an example of the data can be provided, but note that passing sensitive data may result in a privacy violation:

>>> dp.domain_of([1, 2, 3], infer=True)
VectorDomain(AtomDomain(T=i32))

Parameters:

T – carrier type
infer (bool) – if True, T is an example of the sensitive dataset. Passing sensitive data may result in a privacy violation.

Return type:

Domain

opendp.context.loss_of(epsilon=None, delta=None, rho=None)

Constructs a privacy loss, consisting of a privacy measure and a privacy loss parameter.

>>> import opendp.prelude as dp
>>> dp.loss_of(epsilon=1.0)
(MaxDivergence, 1.0)
>>> dp.loss_of(epsilon=1.0, delta=1e-9)
(Approximate(MaxDivergence), (1.0, 1e-09))
>>> dp.loss_of(rho=1.0)
(ZeroConcentratedDivergence, 1.0)

Parameters:

epsilon (float | None) – Parameter for pure ε-DP.
delta (float | None) – Parameter for δ-approximate DP.
rho (float | None) – Parameter for zero-concentrated ρ-DP.

Return type:

tuple[Measure, float | tuple[float, float]]

opendp.context.metric_of(M)

Constructs an instance of a metric from metric type M.

Parameters:

M – Metric type

Return type:

Metric

opendp.context.register(constructor, name=None)

Register a constructor function to be used in the Context API.

If the constructor supports partial application (first two arguments are input_domain and input_metric), then the input domain and input metric are omitted when called via the Context API.

Constructor requirements:

The constructor must return a transformation or measurement.
If name is None, the constructor’s name must start with make_.

Parameters:

constructor (Callable[[...], Transformation | Measurement]) – The constructor function to register.
name (str | None) – The name to register the constructor under in the Context API. If None, the name will be derived from the constructor’s name.

opendp.context.space_of(T, M=None, infer=False)

A shorthand for building a metric space.

A metric space consists of a domain and a metric.

>>> import opendp.prelude as dp
...
>>> dp.space_of(list[int])
(VectorDomain(AtomDomain(T=i32)), SymmetricDistance())
>>> # the verbose form allows greater control:
>>> (dp.vector_domain(dp.atom_domain(T=dp.i32)), dp.symmetric_distance())
(VectorDomain(AtomDomain(T=i32)), SymmetricDistance())

Parameters:

T – carrier type (the type of members in the domain)
M – metric type
infer (bool) – if True, T is an example of the sensitive dataset. Passing sensitive data may result in a privacy violation.

Return type:

tuple[Domain, Metric]

opendp.context.unit_of(*, contributions=None, changes=None, absolute=None, l1=None, l2=None, ordered=False, U=None)

Constructs a unit of privacy, consisting of a metric and a dataset distance. The parameters are mutually exclusive.

>>> import opendp.prelude as dp
>>> dp.unit_of(contributions=3)
(SymmetricDistance(), 3)
>>> dp.unit_of(l1=2.0)
(L1Distance(f64), 2.0)

Parameters:

contributions (int | None) – Greatest number of records a privacy unit may contribute to microdata
changes (int | None) – Greatest number of records a privacy unit may change in microdata
absolute (float | None) – Greatest absolute distance a privacy unit can influence a scalar aggregate data set
l1 (float | None) – Greatest l1 distance a privacy unit can influence a vector aggregate data set
l2 (float | None) – Greatest l2 distance a privacy unit can influence a vector aggregate data set
ordered (bool) – Set to True to use InsertDeleteDistance instead of SymmetricDistance, or HammingDistance instead of ChangeOneDistance.
U – The type of the dataset distance.

Return type:

tuple[Metric, float]
