(Non-Adaptive) Composition
Functions that have not completed the proof-writing and vetting process can still be accessed if you opt in to “contrib”. Please contact us if you are interested in proof-writing. Thank you!
>>> import opendp.prelude as dp
>>> dp.enable_features("contrib")
library(opendp)
enable_features("contrib")
Define a few queries you might want to run up-front:
>>> # define the dataset space and how distances are measured
>>> input_space = (
...     dp.vector_domain(dp.atom_domain(T=int)),
...     dp.symmetric_distance(),
... )
>>> meas_count = (
...     input_space
...     >> dp.t.then_count()
...     >> dp.m.then_laplace(scale=1.0)
... )
>>> meas_sum = (
...     input_space
...     >> dp.t.then_clamp((0, 10))
...     >> dp.t.then_sum()
...     >> dp.m.then_laplace(scale=5.0)
... )
# define the dataset space and how distances are measured
input_space <- c(vector_domain(atom_domain(.T = i32)), symmetric_distance())
meas_count <- input_space |> then_count() |> then_laplace(scale = 1.0)
meas_sum <- input_space |>
  then_clamp(c(0L, 10L)) |>
  then_sum() |>
  then_laplace(scale = 5.0)
Notice that both of these measurements share the same input domain, input metric, and output measure:
>>> print("count:", meas_count)
count: Measurement(
input_domain = VectorDomain(AtomDomain(T=i32)),
input_metric = SymmetricDistance(),
output_measure = MaxDivergence)
>>> print("sum:", meas_sum)
sum: Measurement(
input_domain = VectorDomain(AtomDomain(T=i32)),
input_metric = SymmetricDistance(),
output_measure = MaxDivergence)
meas_count
# Measurement(
# input_domain=VectorDomain(AtomDomain(T=i32)),
# input_metric=SymmetricDistance(),
# output_measure=MaxDivergence
# )
meas_sum
# Measurement(
# input_domain=VectorDomain(AtomDomain(T=i32)),
# input_metric=SymmetricDistance(),
# output_measure=MaxDivergence
# )
This is important because compositors require all three of these supporting elements to match across every query.
The non-adaptive compositor takes a collection of queries to execute on the dataset simultaneously. When the data is passed in, all queries are evaluated together, in a single batch.
>>> meas_mean_fraction = dp.c.make_composition(
...     [meas_sum, meas_count]
... )
>>> int_dataset = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>>> dp_sum, dp_count = meas_mean_fraction(int_dataset)
>>> print("dp sum:", dp_sum)
dp sum: ...
>>> print("dp count:", dp_count)
dp count: ...
meas_mean_fraction <- make_composition(c(meas_sum, meas_count))
int_dataset <- c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L)
unlist(meas_mean_fraction(arg = int_dataset))
# [1] 52 10
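The batch behavior described above can be sketched in plain Python. This is only an illustration of the idea, not how OpenDP implements its compositor: `compose_non_adaptive`, `exact_sum`, and `exact_count` are hypothetical names, and the stand-in queries add no noise, so they provide no privacy on their own.

```python
def compose_non_adaptive(queries):
    """Bundle a fixed list of queries into one function.

    The list is fixed up-front: no query can depend on the
    output of another, which is what "non-adaptive" means here.
    """
    def composed(data):
        # every query sees the same dataset, and all results come back together
        return [query(data) for query in queries]
    return composed


# Hypothetical, noise-free stand-ins for meas_sum and meas_count
def exact_sum(data):
    # clamp each value to [0, 10] before summing, mirroring then_clamp((0, 10))
    return sum(min(max(x, 0), 10) for x in data)


def exact_count(data):
    return len(data)


batch = compose_non_adaptive([exact_sum, exact_count])
int_dataset = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
total, count = batch(int_dataset)
print("sum:", total, "count:", count)
```

The results come back in the same order as the queries were supplied, which is why the example above unpacks the sum first and the count second.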
The privacy map of the composed measurement sums the output distances (here, the ε values) of its constituent measurements.
>>> meas_mean_fraction.map(1)
3.0
meas_mean_fraction(d_in = 1L)
# 3.0
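The value 3.0 can be reproduced by hand. The sketch below assumes the usual Laplace relationship ε = d_in · sensitivity / scale, a sensitivity of 10 for the sum (one added or removed record changes a sum clamped to [0, 10] by at most 10), and a sensitivity of 1 for the count. The helper `laplace_epsilon` is a hypothetical name used only for this illustration and is not part of OpenDP.

```python
def laplace_epsilon(sensitivity, scale, d_in=1):
    """Epsilon spent by a Laplace release, under the assumptions stated above."""
    return d_in * sensitivity / scale


# sum: values clamped to [0, 10], Laplace scale 5.0
eps_sum = laplace_epsilon(sensitivity=10, scale=5.0)

# count: one record changes the count by at most 1, Laplace scale 1.0
eps_count = laplace_epsilon(sensitivity=1, scale=1.0)

# non-adaptive composition under pure DP adds the individual epsilons
eps_total = eps_sum + eps_count
print(eps_sum, eps_count, eps_total)
```

Under these assumptions the sum costs ε = 2.0 and the count costs ε = 1.0, which together match the 3.0 reported by the composed privacy map.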