opendp.extras.sklearn.linear_model package

Module contents

This module requires extra installs: pip install 'opendp[scikit-learn]'

For convenience, all the members of this module are also available from opendp.prelude. We suggest importing under the conventional name dp:

>>> import opendp.prelude as dp

The members of this module will then be accessible at dp.sklearn.linear_model.

If you’re interested in the underlying algorithm, we’ve also implemented Theil-Sen Regression as a demonstration of OpenDP plugins.
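For orientation, the classical (non-private) Theil-Sen estimator can be sketched in a few lines. This is a textbook illustration only, not OpenDP's plugin code: it adds no noise and offers no privacy guarantee, and the function name `theil_sen` is hypothetical.

```python
from itertools import combinations
from statistics import median


def theil_sen(xs, ys):
    """Classical (non-private) Theil-Sen fit: the median of pairwise slopes."""
    slopes = [
        (ys[j] - ys[i]) / (xs[j] - xs[i])
        for i, j in combinations(range(len(xs)), 2)
        if xs[j] != xs[i]  # skip vertical pairs, whose slope is undefined
    ]
    if not slopes:
        raise ValueError("need at least two points with distinct x values")
    slope = median(slopes)
    # One common choice of intercept: the median of the residuals y - slope * x.
    intercept = median(y - slope * x for x, y in zip(xs, ys))
    return slope, intercept


# The last point is an outlier; the median-based fit still recovers y = x.
slope, intercept = theil_sen([1, 2, 3, 4, 5], [1, 2, 3, 4, 50])
```

Because the estimate is built from medians rather than means, a single extreme point has limited influence on the fitted line.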

class opendp.extras.sklearn.linear_model.LinearRegression(output_measure, x_bounds, y_bounds, scale, runs=1, candidates_count=100, fraction_bounds=(0.25, 0.75))

DP Linear Regression

The interface is parallel to that offered by sklearn’s LinearRegression. The fit method returns an sklearn LinearRegression object.

Parameters:
  • x_bounds (Iterable[tuple[float, float]]) – Bounds for training data. For the moment, only lists containing a single tuple are supported

  • y_bounds (tuple[float, float]) – Bounds for target data

  • scale (float) – The scale of the noise to be added

  • runs (int) – Number of times randomized pairwise predictions are computed. Increasing this value can improve the robustness and accuracy of the results, but it can also increase the computational cost and the amount of noise needed later in the algorithm.

  • candidates_count (int) – How many evenly spaced candidates to generate

  • fraction_bounds (tuple[float, float]) – Predict y values at these cut points, given as fractions of x_bounds

  • output_measure (Measure) – Privacy measure for the output, for example dp.max_divergence()
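To make the roles of fraction_bounds, candidates_count and runs more concrete, here is a non-private sketch of how such parameters could be used. It is an assumption about the general shape of the computation, not OpenDP's implementation: the helper names are hypothetical, no noise is added, and a plain median stands in for whatever private selection the library performs.

```python
import random
from statistics import median


def cut_points(x_bounds, fraction_bounds=(0.25, 0.75)):
    """Map fractions of the x range to concrete x values."""
    lower, upper = x_bounds
    return tuple(lower + f * (upper - lower) for f in fraction_bounds)


def candidate_grid(y_bounds, candidates_count=100):
    """Evenly spaced candidate values spanning y_bounds, endpoints included."""
    lower, upper = y_bounds
    step = (upper - lower) / (candidates_count - 1)
    return [lower + i * step for i in range(candidates_count)]


def pairwise_predictions(xs, ys, x_cuts, runs=1, seed=0):
    """Each run randomly pairs up points and extends each pair's line to the cuts."""
    rng = random.Random(seed)
    predictions = []
    for _ in range(runs):
        order = list(range(len(xs)))
        rng.shuffle(order)
        # Consecutive indices of the shuffled order form disjoint pairs.
        for i, j in zip(order[::2], order[1::2]):
            if xs[i] == xs[j]:
                continue  # a vertical pair defines no line
            slope = (ys[j] - ys[i]) / (xs[j] - xs[i])
            predictions.append(tuple(ys[i] + slope * (cut - xs[i]) for cut in x_cuts))
    return predictions


x_cuts = cut_points((0, 10))
grid = candidate_grid((0, 10), candidates_count=5)
preds = pairwise_predictions([1, 2, 3, 4, 5], [1, 2, 3, 4, 5], x_cuts, runs=2)

# Non-private stand-in for the final step: take the median prediction at each
# cut, then draw the line through those two points.
y_lo = median(p[0] for p in preds)
y_hi = median(p[1] for p in preds)
slope = (y_hi - y_lo) / (x_cuts[1] - x_cuts[0])
```

In this sketch every extra run lets each data point contribute to one more prediction, which is consistent with the note above that more runs can require more noise.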

fit(X, y)

Fit DP linear model.

Parameters:
  • X – Training data. Array-like of shape (n_samples, 1)

  • y – Target values. Array-like of shape (n_samples,)

Returns:

A fitted sklearn LinearRegression

Example:

>>> import opendp.prelude as dp
>>> try:
...     import sklearn
... except ModuleNotFoundError:
...     import pytest
...     pytest.skip('Requires extra install')
>>> dp.enable_features("floating-point")
>>> lin_reg = dp.sklearn.linear_model.LinearRegression(
...     dp.max_divergence(),
...     x_bounds=[(0, 10)],
...     y_bounds=(0, 10),
...     scale=1,
... ).fit(
...     X=[[1], [2], [3], [4], [5]],
...     y=[1, 2, 3, 4, 5],
... )
>>> lin_reg.predict([[10]])
array([...])

predict()

The fit() method returns a new sklearn object, so this method is never actually used. The sklearn documentation of the method with the same name is copied here only for reference.

Predict using the linear model.

Parameters:
  • X – Samples. Array-like or sparse matrix of shape (n_samples, n_features)

Returns:

Predicted values. Array of shape (n_samples,)

Raises:

NotImplementedError – This method is included only for documentation.
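The calling pattern this implies is to keep the return value of fit() and call predict() on that object. The sketch below mimics the pattern with hypothetical stand-in classes (`LineFitter` and `FittedLine` are not part of OpenDP or sklearn, and the toy fit is not a DP algorithm).

```python
class FittedLine:
    """Hypothetical stand-in for the fitted estimator returned by fit()."""

    def __init__(self, slope, intercept):
        self.slope = slope
        self.intercept = intercept

    def predict(self, X):
        return [self.slope * row[0] + self.intercept for row in X]


class LineFitter:
    """Hypothetical stand-in for an estimator whose fit() returns a new object."""

    def fit(self, X, y):
        # Toy fit through the first and last points, for illustration only.
        slope = (y[-1] - y[0]) / (X[-1][0] - X[0][0])
        return FittedLine(slope, y[0] - slope * X[0][0])

    def predict(self, X):
        raise NotImplementedError("call predict() on the object returned by fit()")


fitter = LineFitter()
model = fitter.fit([[1], [2], [3]], [1, 2, 3])  # keep the returned object
prediction = model.predict([[10]])              # predict with the fitted object
```

Calling `fitter.predict(...)` directly would raise NotImplementedError, mirroring the behavior documented above.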