This documentation is for a development version of OpenDP.

The current release of OpenDP is v0.11.1.

Programming Framework#

OpenDP is based on a conceptual model that defines the characteristics of privacy-preserving operations and provides a way for components to be assembled into programs with desired behavior. This model, known as the OpenDP Programming Framework, is described in the paper A Programming Framework for OpenDP. The framework is designed with a precise and verifiable means of capturing the privacy-relevant aspects of an algorithm, while remaining highly flexible and extensible. OpenDP (the software library) is intended to be a faithful implementation of that approach. Because OpenDP is based on a well-defined model, users can create applications with rigorous privacy properties.

Summary#

The OpenDP Programming Framework consists of a set of high-level conceptual elements. We’ll cover the highlights here, which should be enough for you to get acquainted with OpenDP programming. If you’re interested in more of the details and motivations behind the framework, you’re encouraged to read the paper. There is also an illustrative notebook A Framework to Understand DP.

  • Measurements are randomized mappings from a private, potentially sensitive dataset or value to an arbitrary output value that is safe to release. They are a controlled means of introducing privacy protection (e.g. noise) to a computation. An example of a measurement is one that adds Laplace noise to a value.

  • Transformations are deterministic mappings from a private dataset to another private dataset or value. They are used to summarize or transform values in some way. An example of a transformation is one which calculates the mean of a set of values.

  • Domains are sets which identify the possible values that some object can take. They are used to constrain the input or output of measurements and transformations. Examples of domains are the integers between 1 and 10, or vectors of length 5 containing floating point numbers.

  • Measures and metrics are things that specify distances between two mathematical objects.

    • Measures characterize a distance between two probability distributions. An example measure is the “max-divergence” of pure differential privacy.

    • Metrics capture a distance between two private datasets or values. An example metric is “symmetric distance” (counting the number of additions or removals).

  • Privacy maps and stability maps are functions that characterize the relationship between “closeness” of operation inputs and operation outputs. They are the glue that binds everything together.

    • A privacy map is a statement about a measurement. It’s a function that takes an input distance (in a specific metric) and emits the smallest upper bound on the output distance (in a specific measure). A privacy map lets you make assertions about a measurement when the measurement is evaluated on any pair of neighboring datasets. It’s guaranteed that any pair of measurement inputs within the input distance will always produce a pair of measurement outputs within the output distance.

    • A stability relation is a statement about a transformation. It’s also a function that takes an input distance (in a specific metric) and emits the smallest upper bound on the output distance (in a specific metric, possibly different from the input metric). A stability map lets you make assertions about the behavior of a transformation when that transformation is evaluated on any pair of neighboring datasets. It’s guaranteed that any pair of transformation inputs within the input distance will always produce transformation outputs within the output distance.

    Maps capture the notion of closeness in a very general way, allowing the extension of OpenDP to different definitions of privacy.

As you can see, these elements are interdependent and support each other. The interaction of these elements is what gives the OpenDP Programming Framework its flexibility and expressiveness.

Key Points#

You don’t need to know all the details of the Programming Framework to write OpenDP applications, but it helps understand some of the key points:

  • OpenDP calculations are built by assembling a measurement from a number of constituent transformations and measurements, typically through chaining or composition.

  • Measurements don’t have a static privacy loss specified when constructing the measurement. Instead, measurements are typically constructed by specifying the scale of noise, and the loss is bounded by the resulting privacy relation. This requires some extra work compared to specifying the loss directly, but OpenDP provides some utilities to make this easier on the programmer, and the benefit is greatly increased flexibility of the framework as a whole.