
The Randomness Complexity of Differential Privacy

Clément L. Canonne, University of Sydney, Australia; Francis E. Su, Harvey Mudd College, Claremont, CA, USA; Salil P. Vadhan, Harvard University, Cambridge, MA, USA
Abstract

We initiate the study of the randomness complexity of differential privacy, i.e., how many random bits an algorithm needs in order to generate accurate differentially private releases. As a test case, we focus on the task of releasing the results of d counting queries, or equivalently all one-way marginals on a d-dimensional dataset with boolean attributes. While standard differentially private mechanisms for this task have randomness complexity that grows linearly with d, we show that, surprisingly, only log₂ d + O(1) random bits (in expectation) suffice to achieve an error that depends polynomially on d (and is independent of the size n of the dataset), and furthermore this is possible with pure, unbounded differential privacy and privacy-loss parameter ε = 1/poly(d). Conversely, we show that at least log₂ d − O(1) random bits are also necessary for nontrivial accuracy, even with approximate, bounded DP, provided the privacy-loss parameters satisfy ε, δ ≤ 1/poly(d). We obtain our results by establishing a close connection between the randomness complexity of differentially private mechanisms and the geometric notion of “deterministic rounding schemes” recently introduced and studied by Vander Woude et al. (2022, 2023).

Keywords and phrases:
differential privacy, randomness, geometry
Funding:
Clément L. Canonne: Supported by an ARC DECRA (DE230101329).
Francis E. Su: Work done in part while visiting and supported by the University of Sydney Mathematical Research Institute.
Salil P. Vadhan: Work done in part while visiting and supported by the University of Sydney Department of Computer Science and the Sydney Mathematical Research Institute. Also supported in part by Cooperative Agreement CB20ADR0160001 with the U.S. Census Bureau and a Simons Investigator Award.
Copyright and License:
© Clément L. Canonne, Francis E. Su, and Salil P. Vadhan; licensed under Creative Commons License CC-BY 4.0
2012 ACM Subject Classification:
Theory of computation → Pseudorandomness and derandomization; Theory of computation → Computational complexity and cryptography; Security and privacy; Mathematics of computing
Related Version:
Full Version: https://eccc.weizmann.ac.il/report/2024/169/
Acknowledgements:
We thank Mark Bun for pointing out the connections to recent work on replicability, and anonymous referees for helpful corrections and comments.
Editors:
Raghu Meka

1 Introduction

Differential privacy [5] is a widely accepted theoretical framework for protecting the privacy of individuals in a database while analysts query the database for statistical information. Differentially private (DP) mechanisms provide quantitative guarantees and tradeoffs on the level of privacy afforded to individuals and the accuracy of answers to queries. In order to provide these guarantees, DP mechanisms rely on the use of carefully calibrated “random noise” to protect privacy. Thus, large-scale deployments of differential privacy can require a massive amount of high-quality random (or cryptographically pseudorandom) bits. [Footnote 1: In this work, we focus on using truly random bits to achieve the standard definition of DP with information-theoretic security. Later in the introduction, we discuss how this relates to the use of pseudorandom bits to implement DP with computational security, as typically done in practice.] Indeed, Garfinkel and LeClerc [7] estimate that the U.S. Census Bureau’s differentially private TopDown Algorithm as used for the 2020 Decennial Census required at least 90 terabytes of random bits, and described major engineering challenges in generating those bits with sufficient efficiency and security. As they write in their conclusion,

“The need to generate a large number of high-quality random numbers is a largely unrecognized requirement of a production differential privacy system.”

Motivated by these challenges, and following the long line of inquiry in theoretical computer science on the role and necessity of randomness in computation, we initiate the study of the randomness complexity of differential privacy:

What is the minimum amount of randomness required by differentially private mechanisms to achieve a specific level of accuracy?

We will quantify this “minimal amount of randomness” using either the maximum or expected number of random bits used by a differentially private algorithm, as a function of the dataset dimensions, privacy-loss parameters, and accuracy. For the sake of concreteness, we focus in this work on a specific task, summation (or one-way marginal) queries, which asks to provide an estimate of the sum of d-dimensional (binary) vectors, each corresponding to a different individual in the dataset.

In more detail, the task of summation is as follows: the dataset x consists of data from n individuals, each contributing a d-dimensional binary vector x_i. We can think of x as an n×d matrix with rows x₁,…,x_n. The mechanism M(x) must output an estimate x̂ ∈ ℝ^d of sum(x) := Σ_{i=1}^n x_i, such that ‖x̂ − sum(x)‖∞ ≤ α with probability at least 1 − β. We call such a mechanism (α,β)-accurate. (See Definition 2.3 for a formal definition.)

We require that the mechanism M also satisfies (ε,δ)-differential privacy ((ε,δ)-DP) for some parameters ε ≥ 0, δ ∈ [0,1], which means that for every pair of datasets x, x′ that differ on one row, we have

Pr[M(x) ∈ S] ≤ e^ε · Pr[M(x′) ∈ S] + δ   for every measurable S.

When δ=0, this is called pure differential privacy, and we say that M is ε-DP. Our algorithm will work in the so-called unbounded DP setting, where the size n of the dataset is private information and thus needs to be protected by differential privacy like any other statistic. In contrast, our lower bound applies to the (easier) bounded DP setting, where n is public and fixed. (See Definitions 2.3 and 2.2 for formal definitions.)

We define the maximum randomness complexity R₀(M) of a mechanism M : ({0,1}^d)^n → ℝ^d as the maximum over databases x ∈ ({0,1}^d)^n of the maximum number of random bits used by M on input x. We define the expected randomness complexity R(M) of a mechanism M : ({0,1}^d)^n → ℝ^d as the maximum over databases x ∈ ({0,1}^d)^n of the expected number of random bits used by M on input x.

Differentially private summation is a well-studied question, for which many approximately optimal (in terms of accuracy) DP mechanisms have been proposed and analyzed. One of the simplest is the Laplace mechanism, which consists in adding d-dimensional Laplace noise with scale parameter d/ε to the true value sum(x). This can be shown to have error Õ(d)/ε with high probability, which is optimal up to log factors. (Here Õ(f) is shorthand for a function that is O(f·log^c f) for an unspecified constant c > 0.) However, this comes at a significant cost, as adding continuous Laplace noise would technically require an infinite number of random bits. Instead, one can choose to add (independent) geometric noise to each coordinate of the true count, using a symmetrized Geometric random variable. This similarly will achieve nearly optimal error, but now with a finite expected randomness complexity. Unfortunately, the expected number of random bits required, Õ_ε(d), still scales at least linearly with the dimension. (Here the Õ_ε means that ε is treated as a constant, on which the hidden constants in the Õ can depend.)
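To make the baseline concrete, here is a small illustrative sketch (ours, not from the paper) of coordinate-wise symmetrized geometric noise driven by an explicit coin-tossing oracle. The naive bit-by-bit sampler below is deliberately simple and wasteful (a careful sampler would use only O(log d) bits per coordinate, giving the Õ_ε(d) total mentioned above); the point is only that the coordinate-wise approach consumes a number of random bits that grows at least linearly with d. All helper names are our own.

```python
import math
import random

class CoinOracle:
    """Coin-tossing oracle: returns independent fair bits and counts how many were used."""
    def __init__(self, seed=0):
        self._rng = random.Random(seed)
        self.used = 0

    def bit(self):
        self.used += 1
        return self._rng.getrandbits(1)

    def bernoulli(self, p):
        """Sample Bernoulli(p) by comparing fair bits with the binary expansion of p (2 bits expected)."""
        while True:
            b = self.bit()
            p *= 2
            if p >= 1.0:
                p -= 1.0          # current bit of p is 1
                if b == 0:
                    return 1      # the uniform value is below p
            else:
                if b == 1:
                    return 0      # the uniform value is above p

def symmetric_geometric(coins, scale):
    """Two-sided geometric (discrete Laplace) noise: Pr[k] proportional to exp(-|k|/scale)."""
    q = math.exp(-1.0 / scale)
    def geom():                   # successes before the first failure in Bernoulli(q) trials
        k = 0
        while coins.bernoulli(q):
            k += 1
        return k
    return geom() - geom()        # difference of two iid geometrics is symmetric

def noisy_marginals(column_sums, eps, coins):
    """Release all d one-way marginals with independent noise of scale d/eps per coordinate."""
    d = len(column_sums)
    return [s + symmetric_geometric(coins, scale=d / eps) for s in column_sums]

if __name__ == "__main__":
    for d in (8, 64, 256):
        coins = CoinOracle(seed=1)
        noisy_marginals([0] * d, eps=1.0, coins=coins)
        print(d, coins.used)      # the bit count grows (at least) linearly with d
```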

At first, this linear scaling seems inherent: For d=1, to achieve non-trivial accuracy every DP mechanism must have entropy Ω(1):

Lemma 1.1.

For every α < n/2 and β < 1/2 and every (α,β)-accurate ε-DP mechanism M : {0,1}^n → ℝ for binary counts, there is some database x ∈ {0,1}^n such that H(M(x)) ≥ 1 − ε/(2 ln 2).

Here H(M(x)) denotes the Shannon entropy of the output distribution M(x), which is a lower bound on the expected number of random bits used by M. (We provide a simple proof of Lemma 1.1 in the full version.) Given that, in the d-dimensional summation task, the d dimensions are totally independent, one might expect the entropy required to be additive across dimensions, leading to an Ω(d) lower bound on expected randomness complexity.

Our results

Surprisingly, the above intuition turns out to be false: by leveraging recent work on deterministic rounding schemes, we show that the expected randomness complexity can be made as low as log₂ d + O(1), while still achieving good accuracy (i.e., an error that is polynomial in the dimension d, independent of the size n of the database):

Theorem 1.2 (Informal; see Corollary 3.2).

For every ℓ > 0 and ε > 0, there is an ε-DP mechanism M : ({0,1}^d)^* → ℝ^d (in the unbounded DP setting) such that the expected randomness complexity of M satisfies

R(M) ≤ (d/ℓ)·log₂(ℓ+1) + O(1)

and for every β ≥ 1/poly(d), M is (α,β)-accurate for summation with

α = Õ(dℓ)/ε.

Taking ℓ = d gives expected randomness complexity log₂ d + O(1) as claimed. Furthermore, with this setting of ℓ and ε ≥ 1/poly(d), the accuracy is α = poly(d), independent of n. If we instead take ℓ = 1, we get near-optimal accuracy α = Õ(d)/ε, but then the expected randomness complexity becomes d + O(1). Choosing 1 < ℓ < d provides a tradeoff between these two extremes.

For bounding the maximum randomness complexity, we necessarily relax to approximate DP: [Footnote 2: If M uses at most r random bits, then for every database x, the support of the distribution M(x) is of size at most 2^r. However, a pure DP algorithm must have the same support on every input database. A summation mechanism with nontrivial accuracy on datasets of size n should at least distinguish the 2^d datasets where all n rows are the same, and thus must have r ≥ d.]

Theorem 1.3 (Informal; see Corollary 3.4).

For every ℓ > 0, ε > 0, and δ ∈ (0, 1/d), there is an (ε,δ)-DP mechanism M : ({0,1}^d)^* → ℝ^d (in the unbounded DP setting) such that the maximum randomness complexity of M satisfies

R₀(M) ≤ (d/ℓ)·log₂(ℓ+1) + log₂(1/δ) + O(1)

and for every β ≥ 1/poly(d), M is (α,β)-accurate for summation with

α = O( ℓ·√d·log(1/δ) / ε ).

Again, the randomness complexity is minimized at ℓ = d, which gives R₀(M) ≤ log₂ d + log₂(1/δ) + O(1) and α = O(d^{3/2})·log(1/δ)/ε, and the latter is again a factor of Θ(d·√(log(1/δ))) larger than the error achievable with unlimited randomness. Typically, δ is taken to be cryptographically negligible, so δ = d^{−ω(1)} and our bound on the maximum randomness complexity becomes R₀(M) ≤ (1+o(1))·log₂(1/δ).

Next, we prove lower bounds showing that the randomness complexity we achieve is nearly optimal (in certain parameter regimes):

Theorem 1.4 (Informal; see Corollary 4.2).

Suppose that M : ({0,1}^d)^n → ℝ^d is an (ε,δ)-DP mechanism (in the bounded DP setting) that is (α,β)-accurate for summation, where α ≤ n/2 − 1. Then the maximum randomness complexity of M satisfies

R₀(M) ≥ min{ d, log₂(1/δ) }.

If, in addition, β ≤ 1/d, ε ≤ 1/d, and δ ≤ 1/O(d²), then both the maximum and expected randomness complexities of M satisfy:

R₀(M) ≥ R(M) ≥ log₂ d − O(1).

Let’s examine the constraints on the parameters. The constraint on α essentially just requires that M has nontrivial accuracy. Indeed, accuracy n/2 on datasets of size n can trivially be achieved by the deterministic mechanism that always outputs (n/2, n/2, …, n/2) ∈ ℝ^d; this mechanism is (n/2, 0)-accurate, 0-DP, and has randomness complexity 0. The constraint on δ is quite mild, since δ is intended to be cryptographically small, and in particular subpolynomial in input size parameters like d and n. The constraints on β and ε are more substantive. They match our upper bound, in the sense that Theorem 1.2 achieves expected randomness complexity log₂ d + O(1) even when β, ε = 1/d. However, it would be interesting to know whether even smaller randomness complexity is possible when ε and β are constants.

Connection to geometry

The key ingredient in our results is a two-way connection to the notion of deterministic rounding schemes recently introduced and studied by Vander Woude, Dixon, Pavan, Radcliffe, and Vinodchandran [11, 12]. Deterministic rounding schemes provide methods for rounding data in ℝ^d to nearby points so that any ball of small enough radius rounds to only a small number of points. As discussed in [11, 12], such rounding schemes are equivalent to the geometric notion of a secluded partition, which is a partition of ℝ^d into sets of bounded radius such that balls of a sufficiently small radius do not intersect too many sets of the partition. We tie these ideas to the randomness complexity of accurate DP algorithms. In particular, our log₂ d ± Θ(1) = log₂(d+1) ± Θ(1) upper and lower bounds on randomness complexity are intimately tied to the fact that, in d dimensions, any cover by bounded-radius sets must have points where at least d+1 of the sets meet, while it is possible to find covers in which no more than d+1 sets ever jointly intersect. In the theory of set intersections, these ideas are closely related to KKM covers of bounded polyhedra and the Polytopal Sperner lemma [3], and in fact our initial proofs of our results (not included here) came through these connections.

Related work on replicability

Recent work [4] has studied the randomness complexity of replicable algorithms using similar geometric tools. There are bidirectional conversions between replicable algorithms and approximate differentially private algorithms [8, 1] for problems about “population statistics,” where the dataset consists of iid samples from an unknown distribution and the goal is to estimate statistics about the distribution. In contrast, our focus is on “empirical statistics,” where there is no iid assumption on the dataset and the goal is to estimate statistics of the dataset itself (as in the motivating use case of the 2020 U.S. Census). One can convert differentially private algorithms for population statistics into ones for empirical statistics (see, e.g., [2]), but this conversion incurs a high cost in randomness complexity (running the DP algorithm for population statistics on a dataset formed by sampling n rows from the input dataset independently with replacement). In the reverse direction, differentially private algorithms for empirical statistics are also automatically differentially private algorithms for population statistics (provided the dataset size n is large enough so that the empirical statistics approximate the population statistics with high probability), but converting a DP algorithm for population statistics into a replicable algorithm also appears to incur a large cost in randomness complexity. Thus, neither our upper bound nor our lower bound appears to follow as a black box from the existing results on replicability, but it will be interesting to explore whether the techniques in either setting can be used to improve or extend any of the results in the other.

On using pseudorandom bits

In practice, implementations of differential privacy (including for the 2020 Decennial U.S. Census [7]) do not use truly random bits, but use pseudorandom bits generated by a cryptographically strong pseudorandom generator from a short seed, which is assumed to be truly random (or at least have sufficiently high entropy). If the differentially private algorithms satisfy the standard information-theoretic definition of DP (Definition 2.2) when run with truly random bits, then when executed using cryptographically strong pseudorandom bits, they satisfy a natural and convincing relaxation of differential privacy to computationally bounded adversaries [9]. Although the use of a pseudorandom generator reduces the number of truly random bits needed to equal the seed length of the generator, there is still a substantial cost if the differentially private algorithm is designed to use a huge number of random bits, since we must run the pseudorandom generator enough times to produce all of those bits. This is the cost referred to by Garfinkel and LeClerc [7], and it motivates our focus on the number of truly random bits needed to achieve information-theoretic DP. Furthermore, our Theorem 1.2 shows that we can reduce the expected number of truly random bits to be even smaller than the seed length of a cryptographic pseudorandom generator. Specifically, a cryptographic pseudorandom generator requires seed length at least log₂(1/δ), where δ is a bound on the probability with which any computationally bounded adversary can distinguish the pseudorandom bits from uniform. This δ plays an analogous role in computational differential privacy to the δ in information-theoretic (ε,δ)-DP. Typically, δ is taken to be cryptographically negligible, so δ = d^{−ω(1)} and the randomness complexity log₂(d+1) + O(1) of Theorem 1.2 is asymptotically smaller than log₂(1/δ).

Overview of the proofs

The high-level idea of our upper bound (algorithmic result) starts with the following observation: if we could post-process the output of a DP mechanism M to only allow a small number of outcomes on every database without hurting its accuracy too much, then we would be in good shape: for every database x, having |supp(M′(x))| ≤ k implies that H(M′(x)) is at most log₂ k, and so we obtain a new mechanism M′ with low entropy but similar accuracy. Now, M′ could still have very high randomness complexity R(M′) (since M itself could have been using a lot of random bits), but it is a standard fact in information theory (based on Huffman coding) that any discrete random variable Y can be generated using at most H(Y) + O(1) random bits on average. Thus, there is yet another mechanism M″ with randomness complexity at most log₂ k + O(1).
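To illustrate the sampling fact invoked here, the following sketch (ours) generates a discrete random variable from fair coin flips via the interval algorithm of Han and Hoshi, a stand-in for the Huffman-coding argument; a standard analysis bounds its expected number of flips by H(Y) + 3, which suffices for the H(Y) + O(1) bound used in the text.

```python
import random
from fractions import Fraction

def sample_with_few_bits(probs, flip=lambda: random.getrandbits(1)):
    """Interval-algorithm sampler: returns (outcome_index, number_of_fair_bits_used).

    Refines the dyadic interval [lo, hi) one fair bit at a time until it is
    contained in a single cumulative-probability interval [c_y, c_{y+1}).
    """
    probs = [Fraction(p) for p in probs]
    assert sum(probs) == 1
    cum = [Fraction(0)]
    for p in probs:
        cum.append(cum[-1] + p)            # cumulative boundaries c_0 <= ... <= c_m = 1
    lo, hi, used = Fraction(0), Fraction(1), 0
    while True:
        for y in range(len(probs)):
            if cum[y] <= lo and hi <= cum[y + 1]:
                return y, used
        mid = (lo + hi) / 2
        used += 1
        if flip():
            lo = mid                       # descend into the right half
        else:
            hi = mid                       # descend into the left half

# A distribution on k = 4 outcomes with entropy 1.75 bits: a handful of flips suffice in expectation.
print(sample_with_few_bits([Fraction(1, 2), Fraction(1, 4), Fraction(1, 8), Fraction(1, 8)]))
```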

Unfortunately, if we want to achieve pure DP as in Theorem 1.2, it is too much to hope for bounding the randomness complexity via support size. (See Footnote 2.) Instead, we will ensure that there is a set S_x of size at most k such that M(x) ∈ S_x with probability at least 1 − γ, for a small γ > 0; we’ll take γ = 1/d. If we can also ensure that the entropy of M(x) conditioned on M(x) ∉ S_x is bounded by O(1/γ), then this suffices to achieve an overall entropy of at most log₂ k + O(1).

Figure 1: Using a deterministic rounding scheme (the square cells round to their center points) to construct a DP mechanism M for sum(x) with low randomness complexity. Given a dataset x with sum(x) at the annular dot, we add Laplace noise (depicted here by the diamond cloud of points). With high probability this lands somewhere nearby (the triangular dot). Next, we round that to the nearest point in a suitably chosen grid (the square dot) – the limited grid options help keep the resulting entropy low. We then use the given deterministic rounding scheme to jump to the center of the cell containing the square dot. This defines a DP mechanism with low entropy, because there are only a small number k (here, k=3) of cell centers that could be in the high probability support of this mechanism applied to x.

To implement the above plan, we start with any accurate ε-DP mechanism M, say one based on the Laplace mechanism. The crucial step is then how to achieve the post-processing step. This we do with a deterministic rounding scheme f. Such a scheme will take the output M(x) and round it to a nearby point f(M(x)), say at distance at most r from M(x). We choose r to be large enough so that the noise in M(x) will be of magnitude (in ℓ∞ norm) at most rτ with high probability, for a sufficiently small τ. f is called a (k,τ)-deterministic rounding scheme of radius r if on every ball of radius rτ, f takes on at most k distinct values. With such a scheme, we get that f(M(x)) lies in a set of size k with high probability.

So then the question becomes what parameters (k,τ) are possible, and this is exactly what is addressed in the work of Vander Woude et al. [12]. In particular, they construct a scheme with k = d+1 and τ = 1/(2d), which is what we use in Theorem 1.2. As noted above, completing the proof of the theorem requires controlling the entropy of M(x) also conditioned on the event that the noise is large. This we achieve by an additional coarsening of the output of M(x), via a standard rounding of every coordinate to a multiple of rτ/2, before applying the deterministic rounding scheme f.

For our upper bound on maximum randomness complexity (Theorem 1.3), we apply the same strategy as above to the Gaussian mechanism for differential privacy. In this case, we replace our use of Huffman coding with the fact that if we have a random variable Y that lies in a set S of size at most k with probability at least 1 − γ, then for every η > 0, using at most log₂ k + log₂(1/η) + O(1) random bits, we can generate a random variable Y′ that is at total variation distance at most η + γ from Y.

To prove our lower bound of log₂ d − O(1) (second part of Theorem 1.4) on expected randomness complexity, we show how randomness-efficient DP mechanisms for summation with good accuracy imply the existence of deterministic rounding schemes with good parameters, where the parameters (k,τ) of the rounding scheme are directly related to the randomness complexity of the DP mechanism M. This allows us to invoke impossibility results from [11, 12] on the existence of “too-good-to-be-true” deterministic rounding schemes to rule out DP mechanisms which are simultaneously accurate and randomness-efficient. In more detail, the main steps of the lower bound are as follows: given a purported ε-DP mechanism M for summation with low randomness complexity, (1) we embed the hypergrid [n]^d into the space of d-dimensional datasets of size n, ({0,1}^d)^n, such that ℓ1 distance in the former maps to Hamming distance in the latter and computing summation on a given dataset x = x(v) allows one to retrieve the original hypergrid vector v ∈ [n]^d. (2) We use the randomness guarantees of M to extract, from its output distribution on a given database x(v), the highest-probability representative output y(v), which defines a deterministic rounding scheme f:

f :  v ∈ [n]^d  ⟼  x(v)  ⟼ (apply M)  y(v) ∈ ℝ^d.

(3) We leverage the accuracy guarantees of M to argue that this rounding y(v) is indeed close to v; and, finally, (4) we invoke the (group) privacy guarantee of M to show that the image of any given ball under our newly-defined rounding scheme cannot contain too many “representatives” y(v). There is one last step required, as the rounding scheme f outlined above is only defined on [n]^d: to conclude, we need to “lift” this rounding scheme to the whole of ℝ^d while preserving its properties, which we do by a careful tiling of the space ℝ^d with translations and reflections of the hypergrid.

The lower bound of min{d, log₂(1/δ)} on maximum randomness complexity follows from observing that any (ε,δ)-DP mechanism that uses fewer than log₂(1/δ) random bits is ε′-DP for some finite ε′, and then we can apply the argument of Footnote 2 to deduce R₀(M) ≥ d.
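To spell out this observation in slightly more detail (our phrasing of a standard argument): if M uses at most r < log₂(1/δ) random bits, then every atom in the support of M(x) has probability at least 2^{−r} > δ, so for adjacent databases x, x′ and any such atom y,

\[
\Pr[M(x')=y] \;\ge\; e^{-\varepsilon}\bigl(\Pr[M(x)=y]-\delta\bigr) \;\ge\; e^{-\varepsilon}\bigl(2^{-r}-\delta\bigr) \;>\; 0 .
\]

Hence all databases share the same (finite) support, every likelihood ratio on an atom is at most 1/(e^{−ε}(2^{−r} − δ)), and M is ε′-DP for a finite ε′; Footnote 2 then forces r ≥ d, giving R₀(M) ≥ min{d, log₂(1/δ)}.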

Directions for Future Work

We conclude this introduction by mentioning some open questions, and directions for future work.

The first question is whether one can close the gap between our upper and lower bounds. Our upper bounds show that randomness complexity log₂ d + O(1) can be achieved with accuracy α = poly(d)/ε, but to achieve near-optimal accuracy α = Õ(d)/ε, we only achieve randomness complexity d + O(1). Can our algorithm be improved to achieve logarithmic randomness complexity with near-optimal accuracy? Or can our lower bound be improved to show that linear randomness complexity is necessary for near-optimal accuracy? As discussed earlier, another question is to remove some of the constraints on parameters, like ε and β, in our lower bound, or else show that sub-logarithmic randomness complexity is possible when these parameters are constant.

A second question is whether one can obtain efficient low-randomness DP mechanisms, e.g., running in time poly(n,d). Recall that our mechanisms from Theorem 1.2 and Theorem 1.3 rely on computationally inefficient procedures (e.g., Huffman coding) to convert a mechanism with low output entropy into one with low randomness complexity. For this reason, as well as the aforementioned accuracy loss, we would not recommend using our mechanisms in practice (even though our work was inspired by the practical concerns raised in [7]). Still, it raises the hope for practical DP mechanisms that are much more randomness-efficient than the ones currently used.

As a test case we focused in this paper on the task of differentially private summation: extending our study of the randomness requirement of DP mechanisms to other statistical releases, or attempting to provide a general treatment of the randomness complexity of differential privacy, would be an interesting direction.

Finally, as mentioned earlier, exploring connections to recent work on replicable algorithms may also prove to be fruitful.

2 Preliminaries

2.1 Differential Privacy

Consider a database x consisting of n entries chosen from some domain 𝒳: that is, x is an element of 𝒳^n. If each entry of the database consists of d numerical attributes, then 𝒳 might be ℝ^d or ℤ^d (for discrete data) or {0,1}^d (for binary data). It is convenient to think of x as an n×d matrix, with each row of the matrix being the vector x_i = (x_{ij})_{1 ≤ j ≤ d}. In general, when the size of the database n is unknown, or allowed to grow, we will denote it by |x|, and accordingly consider databases in 𝒳* = ⋃_{n=0}^∞ 𝒳^n.

Definition 2.1 (Database metrics and adjacency).

For two databases x, x′ ∈ 𝒳*, their insert-delete distance (aka LCS distance) D_ID(x,x′) is the minimum number of insertions and deletions of elements of 𝒳 needed to transform x into x′. For two databases x, x′ ∈ 𝒳* such that |x| = |x′|, their Hamming distance D_Ham(x,x′) is the number of rows i such that x_i ≠ x′_i. If |x| ≠ |x′|, we define D_Ham(x,x′) = ∞. For a database metric D, we say that x and x′ are adjacent with respect to D, denoted x ∼_D x′, if D(x,x′) ≤ 1.

We use D_ID to capture unbounded differential privacy, where the size n of the dataset is unknown and considered private information that needs to be protected. D_Ham captures bounded differential privacy, where the size n is public information, known to a potential adversary. Since D_ID(x,x′) ≤ 2·D_Ham(x,x′) for databases x, x′ of the same size, unbounded-DP algorithms are also bounded-DP, up to a factor of 2 in the privacy-loss parameters. We state our positive result in the unbounded-DP setting and our negative result in the bounded-DP setting, making both results stronger.

Definition 2.2 (Differential privacy).

Fix ε > 0 and δ ∈ [0,1]. A randomized algorithm M : 𝒳* → 𝒴 is (ε,δ)-differentially private (or (ε,δ)-DP) with respect to database metric D if for every pair of adjacent databases x ∼_D x′ in 𝒳* and every measurable S ⊆ 𝒴, we have

Pr[M(x) ∈ S] ≤ e^ε · Pr[M(x′) ∈ S] + δ.

If δ=0, we simply say M is ε-DP.

We restrict our attention in this article to databases where each record has d binary attributes, so 𝒳={0,1}d, and to mechanisms that output an estimate of the sum of the attributes. This motivates the following definition:

Definition 2.3 (Accuracy).

Given α ≥ 0 and β ∈ [0,1], a randomized algorithm M : ({0,1}^d)^n → ℝ^d is said to be (α,β)-accurate for summation if for every x ∈ ({0,1}^d)^n, we have

Pr[ ‖M(x) − sum(x)‖∞ > α ] ≤ β,

where

sum(x) = Σ_{i=1}^{|x|} x_i ∈ ℝ^d.

Note that we used ({0,1}^d)^n rather than ({0,1}^d)^* as the domain of M. This allows for the possibility that, even in the unbounded DP setting, the accuracy of the mechanism depends on the size n of the dataset, as well as the dimension d and the privacy parameters ε and δ.

Two key features of differential privacy are its immunity to post-processing and its group privacy property:

Lemma 2.4 (Postprocessing).

Let M : 𝒳* → 𝒴 be an (ε,δ)-DP mechanism, and f : 𝒴 → 𝒵 be any (possibly randomized) function. Then f∘M is (ε,δ)-DP.

Lemma 2.5 (Group privacy).

Let M : 𝒳* → 𝒴 be an (ε,δ)-DP mechanism with respect to database metric D, and x, x′ ∈ 𝒳* be two databases at distance D(x,x′) ≤ k. Then, for every measurable S ⊆ 𝒴, we have

Pr[M(x) ∈ S] ≤ e^{kε} · Pr[M(x′) ∈ S] + k·e^{(k−1)ε}·δ.

We refer the reader to, e.g., [6, 10] for more background on differential privacy and the proof of these properties. We also briefly recall the definition and guarantees of two of the standard noise mechanisms used in differential privacy, the Laplace and Gaussian mechanisms:

Lemma 2.6 (Laplace mechanism; see e.g., [6, Theorems 3.6 and 3.8]).

Suppose f : 𝒳* → ℝ^d has ℓ1 sensitivity Δ₁(f), that is, Δ₁(f) = max_{x ∼_D x′} ‖f(x) − f(x′)‖₁. Then, for every ε > 0, the mechanism M : 𝒳* → ℝ^d defined by

M(x) = f(x) + Lap(Δ₁(f)/ε)^d

is ε-DP, where Lap(b)^d denotes the product distribution over ℝ^d with iid marginals distributed as a Laplace with scale parameter b > 0 (i.e., with probability density function (1/(2b))·e^{−|x|/b}). Moreover, its accuracy satisfies the following: for every β ∈ (0,1],

Pr[ ‖M(x) − f(x)‖∞ ≥ (Δ₁(f)/ε)·ln(d/β) ] ≤ β.
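For the summation function studied in this paper, a brief instantiation (our computation, consistent with the scale d/ε used in the introduction): under D_ID, adding or removing a single row v ∈ {0,1}^d changes the sum by at most

\[
\Delta_1(\mathrm{sum}) \;\le\; \max_{v \in \{0,1\}^d}\|v\|_1 \;=\; d ,
\]

so M(x) = sum(x) + Lap(d/ε)^d is ε-DP with respect to D_ID and satisfies ‖M(x) − sum(x)‖∞ ≤ (d/ε)·ln(d/β) with probability at least 1 − β.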
Lemma 2.7 (Gaussian mechanism; see e.g., [6, Theorems 3.22 and A.1]).

Suppose f : 𝒳* → ℝ^d has ℓ2 sensitivity Δ₂(f), that is, Δ₂(f) = max_{x ∼_D x′} ‖f(x) − f(x′)‖₂. Then, for every ε ∈ (0,1] and δ ∈ (0,1], the mechanism M : 𝒳* → ℝ^d defined by

M(x) = f(x) + 𝒩(0, ((Δ₂(f)/ε)·√(2 ln(1.25/δ)))²)^d

is (ε,δ)-DP, where 𝒩(0,σ²)^d denotes the product distribution over ℝ^d with iid marginals distributed as a Normal distribution with mean 0 and variance σ². Moreover, its accuracy satisfies the following: for every β ∈ (0,1],

Pr[ ‖M(x) − f(x)‖∞ ≥ (Δ₂(f)/ε)·√(2 ln(1.25/δ))·√(ln(2d/β)) ] ≤ β.
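Likewise, for summation under D_ID (our computation),

\[
\Delta_2(\mathrm{sum}) \;\le\; \max_{v \in \{0,1\}^d}\|v\|_2 \;=\; \sqrt{d} ,
\]

so per-coordinate Gaussian noise with σ = (√d/ε)·√(2 ln(1.25/δ)) is (ε,δ)-DP and gives ℓ∞ error O(√(d·log(1/δ)·log(d/β)))/ε with probability at least 1 − β, which is the source of the first term in Theorem 3.3 below.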

2.2 Randomness Complexity

We work with a model of randomized algorithms where the algorithm has access to a coin-tossing oracle that returns an independent, unbiased random bit on each invocation. The number of random bits used by M on a particular execution is defined to be the number of calls to the coin-tossing oracle (which is a random variable, as the algorithm could adaptively decide whether or not to call the oracle again depending on the results of prior coin tosses). We require that, on all inputs, the algorithm halts with probability 1 over the coin tosses received.

We define below two natural measures of randomness complexity. As with accuracy, we will use 𝒳^n as the domain rather than 𝒳* to allow the possibility that the randomness complexity depends on the size n of the dataset.

Definition 2.8 (randomness complexity).

For a randomized algorithm M : 𝒳^n → 𝒴 and γ ∈ [0,1], we define:

  • R(M) = the maximum over x ∈ 𝒳^n of the expected number of random bits used by M on input x, where the expectation is taken over the coin tosses of M.

  • R₀(M) = the maximum over x ∈ 𝒳^n of the maximum number of random bits used by M on input x.

Rather than work directly with randomness complexity, it is more convenient for us to work with measures of output entropy:

Definition 2.9.

For a randomized algorithm M : 𝒳^n → 𝒴, we define

  • H(M) = max_x H(M(x)), where H(Y) denotes the Shannon entropy of random variable Y (in bits), defined as E_{y∼Y}[log₂(1/Pr[Y=y])].

  • H∞(M) = max_x H∞(M(x)), where H∞(Y) denotes the min-entropy of random variable Y, defined as log₂[1/(max_y Pr[Y=y])].

  • H₀(M) = max_x H₀(M(x)), where H₀(Y) denotes the max-entropy of random variable Y, defined as log₂|supp(Y)|.

  • H₀^γ(M) = max_x H₀^γ(M(x)), where H₀^γ(Y) denotes the γ-smoothed max-entropy of random variable Y, which is defined to be the minimum of H₀(Y | E) over (probabilistic) events E = E(Y) of probability at least 1 − γ.

Some basic relations between the randomness complexity measures and the output entropy measures are as follows:

Lemma 2.10.

For any randomized algorithm M : 𝒳^n → 𝒴, the following hold:

  1. H∞(M) ≤ H(M) ≤ H₀(M),  H₀^γ(M) ≤ H₀(M),  R(M) ≤ R₀(M).

  2. H(M) ≤ R(M) and H₀(M) ≤ R₀(M).

  3. For every M, there is an M′ such that R(M′) ≤ H(M) + O(1) and, on every input x, M′(x) is identically distributed to M(x).

  4. For every M and γ, η > 0, there is an M′ such that R₀(M′) ≤ H₀^γ(M) + log₂(1/η) and, for every x, M′(x) is at total variation distance at most γ + η from M(x).

In particular, Item 2 says that a lower bound on output entropy is also a lower bound on randomness complexity. Items 3 and 4 say that we can go in the other direction as well, by modifying the mechanism.

2.3 Deterministic Rounding Schemes and Secluded Partitions

A crucial building block for our results is the notion of a deterministic rounding scheme, recently introduced by Vander Woude, Dixon, Pavan, Radcliffe, and Vinodchandran [12]. Although they defined such a scheme as a function f : ℝ^d → ℝ^d, we generalize their definition slightly to include functions f : S → ℝ^d on an arbitrary domain S ⊆ ℝ^d, together with a radius parameter ρ that becomes necessary when S is not all of ℝ^d:

Definition 2.11.

For k ∈ ℕ, ρ, τ ≥ 0, and S ⊆ ℝ^d, a function f : S → ℝ^d is said to be a (k,τ)-deterministic rounding scheme of radius ρ on S if the following two conditions hold:

  1. for all z ∈ S, ‖f(z) − z‖∞ ≤ ρ;

  2. for all z ∈ ℝ^d, |{ f(y) : y ∈ B(z, 2τρ) ∩ S }| ≤ k,

where B(z,r) denotes the closed ℓ∞ ball of radius r centered at z. That is, each f “rounds” inputs to nearby points (within ρ), and inputs that are close (in a 2τρ-ball) can only be rounded to a small number k of representatives.

Figure 2: A (k,τ)-deterministic rounding scheme of radius ρ ≈ 0.14 on S = [0,1]², where k = 3 and τ = 1/4. Points in each cell round to one point inside that cell, never moving more than ρ in ℓ∞-distance. The shaded box is a ball of radius 2τρ, and all balls of this size will round to at most k = 3 points.

When S = ℝ^d, the parameter ρ is not important; if f is a (k,τ)-deterministic rounding scheme of radius ρ, then for every c > 0, the rescaled function x ↦ c·f(x/c) is a (k,τ)-deterministic rounding scheme of radius cρ. When ρ is not specified, its default value is ρ = 1/2, matching the radius of “round to the nearest integer.” However, we will also consider deterministic rounding schemes on S = [0,1]^d; then the choice of ρ is more important. Indeed, [0,1]^d has a trivial (1,∞)-deterministic rounding scheme of radius 1/2, where all points get rounded to the center of [0,1]^d, but it has no (1,∞)-deterministic rounding scheme of radius 1/4.

As pointed out by Vander Woude et al. [12, Observation 1.3], deterministic rounding schemes have a nice geometric interpretation as (k,τ)-secluded partitions: partitions of S by sets of radius at most ρ such that every ball of radius 2τρ intersects at most k sets in the partition. Using this geometric perspective, Vander Woude et al. prove the following upper and lower bounds:

Theorem 2.12 ([11, 12]).

For every d ∈ ℕ, there exists a (d+1, 1/(2d))-deterministic rounding scheme f : ℝ^d → ℝ^d. More generally, for every ℓ ∈ ℕ, there exists a ((ℓ+1)^{d/ℓ}, 1/(2ℓ))-deterministic rounding scheme f : ℝ^d → ℝ^d.
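For intuition, plain coordinate-wise rounding to the nearest integer already achieves the ℓ = 1 parameters above: it is a (2^d, 1/2)-deterministic rounding scheme of radius 1/2. The short sketch below (ours; it does not reproduce the more involved constructions of [11, 12] for ℓ > 1) checks this numerically.

```python
import itertools
import random

def rounded_images_in_ball(z, radius):
    """All values taken by coordinate-wise nearest-integer rounding on the closed l_inf ball B(z, radius)."""
    per_coordinate = [range(round(c - radius), round(c + radius) + 1) for c in z]
    return set(itertools.product(*per_coordinate))

d, rho, tau = 3, 0.5, 0.5          # l = 1 parameters: k = (1+1)^d, tau = 1/(2*1), radius rho = 1/2
rng = random.Random(0)
worst = 0
for _ in range(10_000):
    z = [rng.uniform(-10, 10) for _ in range(d)]
    worst = max(worst, len(rounded_images_in_ball(z, 2 * tau * rho)))
print(worst, "<=", 2 ** d)         # every ball of radius 2*tau*rho meets at most 2^d rounding cells
```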

Theorem 2.13 ([11, 12]).

If f : ℝ^d → ℝ^d is a (k,τ)-deterministic rounding scheme and τ > 0, then

k ≥ max{ d+1, (1+2τ)^d }.

3 Upper bounds for summation

In this section, we establish our upper bound results, which state that good deterministic rounding schemes imply differentially private mechanisms for summation with low randomness complexity (Theorem 3.1).

We begin with our upper bound on expected randomness complexity.

Theorem 3.1.

Suppose there exists a (k,τ)-deterministic rounding scheme f : ℝ^d → ℝ^d. Then for every ε > 0, there is a mechanism M : ({0,1}^d)^* → ℝ^d that is ε-DP with respect to D_ID satisfying

H₀^{1/d}(M) ≤ log₂ k   and   H(M) ≤ log₂ k + O(1),

such that, for every β ∈ (0,1], M is (α,β)-accurate for summation for

α = O( d·log(d/β)/ε + d·log(d)/(τ·ε) ).

Combining this with Theorem 2.12 and the conversion from output entropy to randomness complexity (Lemma 2.10, Item 3), we get:

Corollary 3.2.

For every d, ℓ, ε > 0, there is a mechanism M : ({0,1}^d)^* → ℝ^d that is ε-DP with respect to D_ID satisfying

R(M) ≤ (d/ℓ)·log₂(ℓ+1) + O(1),

such that, for every β ∈ (0,1], M is (α,β)-accurate for summation for

α = O( d·log(d/β)/ε + d·ℓ·log(d)/ε ).

In particular, taking ℓ = d, we obtain H(M) ≤ log₂ d + O(1) and

α = O( d·log(d/β)/ε + d²·log(d)/ε ).
Proof of Theorem 3.1.

Let f : ℝ^d → ℝ^d be a (k,τ)-deterministic rounding scheme. Without loss of generality, by scaling, we can assume that f has radius r/2, for a parameter r > 0 to be determined later in the proof. Then we have that for every z ∈ ℝ^d,

  1. ‖f(z) − z‖∞ ≤ r/2, and

  2. |{ f(y) : y ∈ B(z, rτ) }| ≤ k.

Now define this mechanism M on a database x:

Mechanism M(x).
  1. Draw η ← Lap(d/ε)^d, and let y = round_{rτ}(sum(x) + η), where round_{rτ} rounds each coordinate to the nearest integer multiple of rτ. [Footnote 3: Note that the distribution of y is not the same as applying the Geometric Mechanism (i.e., Discrete Laplace Mechanism) supported on integer multiples of rτ. To apply the Geometric Mechanism, we would need to first round sum(x) to a multiple of rτ, which would substantially increase the sensitivity and increase the amount of noise we need to add. We could apply the Geometric Mechanism to sum(x) instead of the Laplace Mechanism prior to the rounding step, but it does not simplify anything in the analysis below.]

  2. Output f(y).
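For concreteness, here is a minimal executable sketch of Mechanism M (ours): it treats the rounding scheme f as a black box and uses floating-point Laplace samples purely for illustration, so it does not perform the finite-randomness bookkeeping carried out in the analysis below.

```python
import numpy as np

def mechanism_M(x, f, eps, r, tau, rng=None):
    """Sketch of Mechanism M(x): Laplace noise, rounding to the r*tau grid, then the scheme f.

    x : n-by-d 0/1 array; f : R^d -> R^d, a (k, tau)-rounding scheme of radius r/2;
    eps : privacy-loss parameter; r, tau : as in the surrounding proof.
    """
    if rng is None:
        rng = np.random.default_rng()
    d = x.shape[1]
    s = x.sum(axis=0)                                  # sum(x): the exact one-way marginals
    noisy = s + rng.laplace(scale=d / eps, size=d)     # step 1: add Lap(d/eps)^d noise ...
    y = r * tau * np.round(noisy / (r * tau))          # ... and round each coordinate to a multiple of r*tau
    return f(y)                                        # step 2: output f(y)

# Toy run, with coordinate-wise rounding (radius 1/2, i.e. r = 1) standing in for f.
rng = np.random.default_rng(0)
x = rng.integers(0, 2, size=(100, 5))
print(mechanism_M(x, f=np.round, eps=1.0, r=1.0, tau=0.5, rng=rng))
```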

By the post-processing property of differential privacy and the ε-DP guarantees of the Laplace mechanism, M is itself ε-DP: it remains to argue about its accuracy and randomness complexity. By the accuracy of the Laplace mechanism (Lemma 2.6), we have that, with probability at least 1 − β, the noise infusion introduces an error of at most

‖η‖∞ ≤ ln(d/β)·d/ε.

Rounding the coordinates of sum(x)+η to the nearest multiple of rτ and replacing y with f(y) increase the error by at most rτ/2 + r/2, so we have (α,β)-accuracy for

α = (1+τ)·r/2 + ln(d/β)·d/ε.

For the randomness complexity, by the same analysis as in the accuracy argument above (but replacing β with 1/d), we have that with probability at least 1 − 1/d, the following event E holds:

event E:  ‖η‖∞ ≤ r₀,  for r₀ := 2d·ln(d)/ε.

Conditioned on E, the point y lies in an ℓ∞ ball of radius r₀ + rτ/2 around sum(x). By setting r = 2r₀/τ, y lies in a ball of radius 2r₀ = rτ, and the fact that f is a (k,τ)-deterministic rounding scheme of radius r/2 tells us that, conditioned on E, the support size of M(x) is at most k, and in particular M(x) then has entropy at most log₂ k. Thus, we have H₀^{1/d}(M(x)) ≤ log₂ k.

To bound H(M(x)) overall, we also need to bound the entropy of M(x) conditioned on ¬E, which is upper bounded by the entropy of y = round_{rτ}(sum(x)+η) conditioned on ¬E. ¬E is the event that at least one coordinate of η has magnitude larger than r₀; the remaining coordinates may or may not have magnitude larger than r₀. Conditioned on ¬E, coordinate i of y is distributed as round_{rτ}(sum(x)_i + η_i), where η_i is a mixture of Lap(d/ε) conditioned on having magnitude at most r₀ and Lap(d/ε) conditioned on having magnitude greater than r₀. Provided that rτ/2 = Ω(d/ε), such a distribution has entropy O(1), similarly to how a geometric distribution with parameter p = Ω(1) has entropy H(p)/p = O(1). [Footnote 4: For a bit more detail: observe that, before conditioning, the distribution of round_{rτ}(sum(x)_i + η_i) is a mixture of a point mass and two geometric distributions: a point mass on m = round_{rτ}(sum(x)_i), a geometric distribution on the multiples of rτ/2 smaller than m (assuming wlog that m < sum(x)_i), and a geometric distribution supported on the multiples of rτ/2 greater than or equal to sum(x)_i. Under the assumption that rτ = Ω(d/ε), the parameter p of these geometric distributions is bounded away from 1, i.e., the probability mass of each point in the support is a constant factor smaller than that of the adjacent point in the support closer to m. This latter property is preserved under conditioning on |η_i| ≤ r₀ or on |η_i| > r₀, and suffices to ensure entropy O(1).] By our setting of r above, rτ/2 is larger than d/ε by a factor of 2·ln(d) = Ω(1). With O(1) entropy contributed by each coordinate, we get O(d) entropy overall conditioned on ¬E. (The coordinates of y are not independent conditioned on ¬E, but entropy is subadditive even for dependent random variables.)

Thus, letting I be the indicator random variable for event E, we have

H(M(x)) ≤ H(M(x), I)
 = H(M(x) | I) + H(I)
 ≤ Pr[E]·H(M(x) | E) + Pr[¬E]·H(M(x) | ¬E) + H(1/d)
 ≤ 1·log₂ k + (1/d)·O(d) + 1
 = log₂ k + O(1).

The accuracy bound follows by plugging our setting for r0 and r=2r0/τ into the expression for α above.

Now we turn to our upper bound on maximum randomness complexity.

Theorem 3.3.

Suppose there exists a (k,τ)-deterministic rounding scheme f : ℝ^d → ℝ^d. Then for every ε > 0 and δ, γ > 0, there is a mechanism M : ({0,1}^d)^* → ℝ^d that is (ε,δ)-DP with respect to D_ID satisfying H₀^γ(M) ≤ log₂ k and such that, for every β ∈ (0,1], M is (α,β)-accurate for summation for

α = O( √(d·log(1/δ)·log(d/β)) / ε + √(d·log(1/δ)·log(d/γ)) / (τ·ε) ).

Combining this with Theorem 2.12 and the conversion from output entropy to randomness complexity (Lemma 2.10, Item 4), we get:

Corollary 3.4.

For every d, ℓ, ε > 0 and δ ∈ (0, 1/d), there is a mechanism M : ({0,1}^d)^* → ℝ^d that is (ε,δ)-DP with respect to D_ID satisfying

R₀(M) ≤ (d/ℓ)·log₂(ℓ+1) + log₂(1/δ) + O(1),

such that, for every β ∈ (0,1], M is (α,β)-accurate for summation for

α = O( √(d·log(1/δ)·log(d/β)) / ε + ℓ·√d·log(1/δ) / ε ).

In particular, taking ℓ = d, we obtain R₀(M) ≤ log₂ d + log₂(1/δ) + O(1) and

α = O( √(d·log(1/δ)·log(d/β)) / ε + d^{3/2}·log(1/δ) / ε ).
Proof Sketch of Theorem 3.3.

Follow the proof of Theorem 3.1, but replace the use of the Laplace mechanism (Lemma 2.6) with the Gaussian mechanism (Lemma 2.7), and define the event E using the accuracy bound r₀ that holds with probability at least 1 − γ (rather than 1 − 1/d). In this proof, there is no need to analyze H(M(x) | ¬E).

4 Lower bound for summation

We now turn to establishing a lower bound on the randomness complexity of any accurate DP mechanism for summation. To strengthen the result, we will consider the bounded DP setting and allow approximate DP mechanisms; and our conclusion will yield a lower bound on the min-entropy H∞(M) of any such DP mechanism.

Theorem 4.1.

Suppose that M : ({0,1}^d)^n → ℝ^d is an (ε,δ)-DP mechanism with respect to D_Ham with H∞(M) ≤ log₂ K that is (α,β)-accurate for summation for β < 1/K and α ≤ n/2 − 1. Then, for every τ > 0, there exists a (k,τ)-deterministic rounding scheme with

k ≤ K·e^{εh} / (1 − h·K·e^{εh}·δ),

where

h = min( d·( 8(α+1)τ/(1−2τ) + 1 ), n ).

Observe that when we take τ → 0, we have h → min{d, n} ≤ d, so, at least when δ = 0, our upper bound on k becomes k ≤ K·e^{εd}. By the lower bound on deterministic rounding schemes (Theorem 2.13), we know that k ≥ d+1, so we obtain an entropy lower bound of H∞(M) = log₂ K ≥ log₂(d+1) − O(εd). When ε = O(1/d), this matches our best upper bound on randomness complexity up to an additive constant (the case ℓ = d in Corollary 3.2). For positive δ, we only lose an additive constant in the lower bound provided that δ ≤ 1/(2dK·e^{εd}). Using the fact that H∞(M) ≤ H(M) ≤ R(M), we obtain:

Corollary 4.2.

Suppose that M : ({0,1}^d)^n → ℝ^d is an (ε,δ)-DP mechanism with respect to D_Ham that is (α,β)-accurate for summation, with α ≤ n/2 − 1, ε ≤ 1/d, β < 1/d, and δ ≤ 1/(6d²). Then

R(M) ≥ log₂ d − O(1).

(The conditions involving K in Theorem 4.1 disappear by doing a case analysis. If H∞(M) > log₂ d, then we are done. Otherwise, we can set K = d in Theorem 4.1.)

We will prove Theorem 4.1 by first constructing a deterministic rounding scheme for the cube [0,1]^d and then lifting it to a deterministic rounding scheme for all of ℝ^d by the following lemma (whose proof we defer to later).

Lemma 4.3.

Suppose there exists a (k,τ)-deterministic rounding scheme f : [0,1]^d → ℝ^d of radius ρ < 1/2, where τ ∈ (0,1). Then there exists a (k,ζ)-deterministic rounding scheme f̃ : ℝ^d → ℝ^d, for ζ = max{ τ/(4−2τ), τ/(2+4ρ) }.

Proof of Theorem 4.1.

By Lemma 4.3, it suffices to construct a (k,τ′)-deterministic rounding scheme f : [0,1]^d → ℝ^d of radius ρ < 1/2 with τ′ = 4τ/(1+2τ) or τ′ = τ·(2+4ρ).

Figure 3: The hypergrid G for d = 2 and n = 3. Each integer point v ∈ G = {0,1,2,3}² corresponds to a dataset x(v) (a 3×2 matrix) with the property that sum(x(v)) = v (the column sums of the matrix at v are just v). The figure shows how a point z ∈ [0,1]², scaled by 2(α+1), is rounded to the nearest v, here (2,1). The given summation mechanism M takes x(v) to a nearby point y(v). Not shown: this grid is then scaled back down to yield the desired deterministic rounding scheme on [0,1]².

Let G = {0,1,2,…,n}^d be the d-dimensional hypergrid: we identify each point v ∈ G of the grid with a dataset x(v) ∈ ({0,1}^d)^n in the following fashion. For row i ∈ [n] and column j ∈ [d], we have

x_{ij}(v) = 𝟙{ v_j ≥ i },

where 𝟙 denotes the indicator function. That is, if v = (v₁,…,v_d), then x(v) is an n×d matrix where, in each column j, the first v_j entries are ones and any remaining entries are zeroes. It follows that sum(x(v)) = v, and if u, v are at ℓ1 distance at most a, then x(u) and x(v) are at distance at most a from each other as datasets: D_Ham(x(u), x(v)) ≤ ‖u − v‖₁.
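A small numerical check of this embedding (ours), confirming that sum(x(v)) = v and that the Hamming distance between embedded datasets is bounded by the ℓ1 distance between grid points:

```python
import numpy as np

def dataset_from_gridpoint(v, n):
    """x(v): an n-by-d 0/1 matrix whose column j has its first v_j entries equal to 1."""
    v = np.asarray(v)
    rows = np.arange(1, n + 1).reshape(-1, 1)       # row indices i = 1, ..., n
    return (rows <= v).astype(int)                  # x_{ij}(v) = 1{ v_j >= i }

def hamming_rows(x, xp):
    """Number of rows on which the two datasets differ."""
    return int((x != xp).any(axis=1).sum())

n = 5
u, v = np.array([2, 0, 5]), np.array([3, 1, 5])
xu, xv = dataset_from_gridpoint(u, n), dataset_from_gridpoint(v, n)
assert (xu.sum(axis=0) == u).all()                          # sum(x(u)) = u
assert hamming_rows(xu, xv) <= int(np.abs(u - v).sum())     # D_Ham(x(u), x(v)) <= ||u - v||_1
```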

Since H∞(M) ≤ log₂ K, for every v ∈ G, there is some output y(v) ∈ ℝ^d such that Pr[M(x(v)) = y(v)] ≥ 1/K. (If there are several such outputs, choose the lexicographically largest.)

Now we construct our function f : [0,1]^d → ℝ^d as follows. For every z ∈ [0,1]^d, let v = v(z) be a point in G obtained by rounding all coordinates of 2(α+1)z to the nearest integer (and rounding up if halfway between integers). The resulting v is in G because 2(α+1) ≤ n, and by construction v is at ℓ∞ distance at most 1/2 from 2(α+1)z. Then y(v) exists as noted above. See Figure 3. Set

f(z) = y(v)/(2α+2).

To show that f is a deterministic rounding scheme, we analyze the two properties separately.

  • To show that ‖f(z) − z‖∞ ≤ ρ < 1/2, we use the accuracy of the mechanism. Fix any z ∈ [0,1]^d: by construction, 2(α+1)z and v = v(z) are at ℓ∞ distance at most 1/2. Recall that we have

    Pr[ ‖M(x(v)) − sum(x(v))‖∞ > α ] ≤ β,

    and

    Pr[ M(x(v)) = y(v) ] ≥ 1/K.

    Since β < 1/K, the outcome of M(x(v)) has probability less than 1/K of being farther than α from sum(x(v)). Since the particular outcome y(v) has probability at least 1/K, y(v) must be at ℓ∞ distance at most α from sum(x(v)) = v, and hence at distance at most α + 1/2 from 2(α+1)z. Hence, f(z) = y(v)/(2α+2) is at distance at most ρ = (α + 1/2)/(2α+2) = (2α+1)/(4α+4) < 1/2 from z.

  • For the second, we use the privacy guarantee of the mechanism: consider a closed ball B(w,τ) and points z₁,…,z_k ∈ B(w,τ) such that there are k distinct outputs f(z₁),…,f(z_k). Let v₁,…,v_k be the corresponding gridpoints, i.e., v_i = v(z_i). Recall that f(z_i) = y(v_i)/(2α+2), so the y(v_i)’s are all distinct.

    Since z₁,…,z_k are all in B(w,τ), the gridpoints v₁,…,v_k are all at ℓ∞ distance at most 2(α+1)τ + 1/2 from 2(α+1)w, which is itself at ℓ∞ distance at most 1/2 from some gridpoint v₀. In particular, each v_i is at ℓ1 distance at most d·(2(α+1)τ + 1) from v₀, so the dataset x(v_i) is at distance at most h from x(v₀) in Hamming distance, for

    h = min( d·(2(α+1)τ + 1), n ).

    By (group) differential privacy, for each 1 ≤ i ≤ k, we have

    Pr[M(x(v₀)) = y(v_i)] ≥ e^{−εh}·( Pr[M(x(v_i)) = y(v_i)] − h·e^{(h−1)ε}·δ ) ≥ e^{−εh}/K − h·e^{−ε}·δ.

    Summing over 1 ≤ i ≤ k, we get

    1 ≥ Σ_{i=1}^{k} Pr[M(x(v₀)) = y(v_i)] ≥ e^{−εh}·k/K − k·h·e^{−ε}·δ.

    Reorganizing and upper bounding e^{−ε} ≤ 1, we have

    k ≤ K·e^{εh} / (1 − h·K·e^{εh}·δ)

    as desired.

As mentioned earlier, all that remains is to invoke Lemma 4.3 to lift our deterministic rounding scheme from [0,1]^d to the whole of ℝ^d.

It only remains to establish Lemma 4.3.

Proof of Lemma 4.3.

Fix a (k,τ)-deterministic rounding scheme f : [0,1]^d → ℝ^d of radius ρ < 1/2, with τ < 1. Although the image of f might extend to points outside [0,1]^d by ℓ∞-distance ρ < 1/2, we may assume without loss of generality that the image of f is contained in [0,1]^d, because projecting the outputs of f to the nearest point in [0,1]^d preserves both properties of a deterministic rounding scheme.

We first modify f so that it behaves nicely near the faces of the cube, obtaining a rounding scheme f′; then we show how to extend that to all of ℝ^d to get the desired rounding scheme f″.

We define f′ : [0,1]^d → [0,1]^d by the following process, which is also illustrated in Figure 4.

Figure 4: The trajectory of a point in [0,1]² as it moves from x to x′ to f(x′) to y. This composition defines the rounding scheme f′ on [0,1]², given any rounding scheme f on [0,1]². In the case shown, f rounds points in each cell to a point inside. The shaded area is a “moat” of thickness τ/2 where points get projected onto the boundary (and are otherwise not moved) before applying f. The final step is rounding to a corner of [0,1]².
  1. Given x ∈ [0,1]^d, for each i ∈ [d], set

     x′_i = 0 if x_i ≤ τ/2,   x′_i = 1 if x_i ≥ 1 − τ/2,   and x′_i = x_i otherwise.

     That is, if x is in a “moat” of width τ/2, then it is τ/2-close to one (or possibly several) face(s) of the cube and x′ is the projection of x onto (the intersection of) the face(s). Otherwise, x′ = x. Note that x′ is well-defined by our assumption that τ < 1.

  2. For each i ∈ [d], set

     y_i = 0 if 0 ≤ f(x′)_i < 1/2,   and y_i = 1 if 1/2 ≤ f(x′)_i ≤ 1.

     (Recall that we can assume without loss of generality that the image of f is in [0,1]^d.) That is, we round f(x′) to the nearest corner of the cube.

  3. Set f′(x) := (y₁,…,y_d) ∈ ℝ^d.

To understand the radius of f′, consider Figure 4. The first step from x to x′ is bounded (in ℓ∞ distance) by τ/2 (and is zero if x is not in the moat). The second step from x′ to f(x′) is bounded by ρ (the radius of f). The third step from f(x′) to y is bounded by 1/2 (since it is rounding to a cube corner). The triangle inequality shows the total distance from x to y is bounded by 1/2 + ρ + τ/2, but Figure 4 suggests we can improve this bound: some cancellation occurs in the axis directions where x is close to the boundary. To make this intuition precise, we do casework.

  • If x_i ≤ τ/2 or x_i ≥ 1 − τ/2, then since |f(x′)_i − x′_i| ≤ ρ < 1/2, we see from y_i’s definition that y_i = x′_i, so that

    |y_i − x_i| = |x′_i − x_i| ≤ τ/2.

  • If τ/2 < x_i < 1 − τ/2, then since y_i ∈ {0,1}, we have

    |y_i − x_i| < 1 − τ/2.

    But when τ/2 < x_i < 1 − τ/2 we also have x′_i = x_i, which implies that

    |y_i − x_i| ≤ |y_i − f(x′)_i| + |f(x′)_i − x_i| ≤ 1/2 + ρ.

Since τ < 1, the bound τ/2 never exceeds either 1 − τ/2 or 1/2 + ρ. So, for all x,

‖f′(x) − x‖∞ ≤ min{ 1 − τ/2, 1/2 + ρ },

establishing the bound on the radius.

Observe that for every ball B(z, τ/2) of radius τ/2, we have

|f′( B(z, τ/2) ∩ [0,1]^d )| ≤ |f( B(z, τ) ∩ [0,1]^d )| ≤ k.

This is because if ‖x − z‖∞ ≤ τ/2, then ‖x′ − z‖∞ ≤ τ, and we obtain f′(x) by applying f to x′ and then projecting.

Figure 5: The construction of a rounding scheme f″ on ℝ² from a rounding scheme f′ on the box [0,1]² that was illustrated in Figure 4. The definition of f′ is reflected across the sides of the box to fill out ℝ². Nearby points in the moats of each box have similar fates as their reflections.

From f′, we construct a deterministic rounding scheme f″ : ℝ^d → ℝ^d by using f′ in each unit cube, but reflected so that points on either side of a unit-cube boundary behave similarly. See Figure 5. We define f″ as follows:

Given x ∈ ℝ^d:
  1. For each i ∈ [d], write x_i = r_i + s_i·e_i, where r_i ∈ 2ℤ, s_i ∈ {±1}, and e_i ∈ [0,1]. Note that this is not unique when x_i ∈ ℤ, as two decompositions are then possible, one with s_i = 1 and one with s_i = −1 (but both with the same value of e_i ∈ {0,1}). [Footnote 5: Specifically: if x_i is an even integer, then (r_i, s_i, e_i) could be either (x_i, 1, 0) or (x_i, −1, 0). If x_i is an odd integer, then the two options are (x_i − 1, 1, 1) and (x_i + 1, −1, 1). We will show that the output f″(x) is independent of the choice we make.]

  2. Let e = (e₁,…,e_d) ∈ [0,1]^d.

  3. For each i ∈ [d], set y_i = r_i + s_i·f′(e)_i.

  4. Set f″(x) = (y₁,…,y_d).

The fact that f″(x) is well-defined, regardless of the non-unique decompositions when x_i is an integer, follows from the facts that (a) the vector e is independent of how those choices are made (already noted above), and (b) when e_i ∈ {0,1}, we have e′_i = e_i and f′(e)_i = e_i (shown earlier for points x where x_i is close to 0 or 1), so that y_i = r_i + s_i·f′(e)_i = r_i + s_i·e_i = x_i, regardless of whether we chose to use s_i = 1 or s_i = −1.

We can now analyze the guarantees this f″ provides. For every x ∈ ℝ^d, since |s_i| = 1 for all i, we get

‖f″(x) − x‖∞ = ‖f′(e) − e‖∞ ≤ min{ (2−τ)/2, (1+2ρ)/2 },

recalling the radius of f′ established earlier. Next, consider any ball B of radius at most τ/2. We claim that |f″(B)| = |f″(B′)| for an ℓ∞ ball B′ ⊆ B that is entirely contained within a single unit hypercube whose corners are in ℤ^d, and thus |f″(B)| ≤ k by the guarantees of f′. The reason is that if x ∈ ℝ^d is within distance τ/2 from a face F of a unit hypercube, then f″(x) = f″(x′), where x′ is the projection of x to F. Thus we can remove portions of B that are on the opposite side of a face from its center without changing f″(B). (More formally, if we write B = [a₁,b₁]×[a₂,b₂]×⋯×[a_d,b_d], then for any dimension i where there is an integer c_i strictly between a_i and b_i, we can replace the interval [a_i,b_i] with the shorter of [a_i,c_i] and [c_i,b_i] without changing f″(B).)

Therefore, f″ is a (k,ζ)-deterministic rounding scheme of radius r = min{ (2−τ)/2, (1+2ρ)/2 }, for

ζ = (τ/2)/(2r) = max{ τ/(2(2−τ)), τ/(2(1+2ρ)) }.
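For concreteness, a compact sketch (ours) of the reflect-and-tile construction of f″, taking f′ as a black box; the even/odd bookkeeping mirrors steps 1-4 above, and the toy f′ used at the end is the trivial center-of-the-cube scheme mentioned in Section 2.3.

```python
import math

def extend_by_reflection(f_prime):
    """Build f'' on R^d from a rounding scheme f' on [0,1]^d by reflecting across unit-cube faces."""
    def f_double_prime(x):
        rs, ss, es = [], [], []
        for xi in x:
            ri = 2 * math.floor(xi / 2)              # an even integer with xi - ri in [0, 2)
            ei, si = xi - ri, 1                      # tentative decomposition xi = ri + ei
            if ei > 1:                               # fold back: xi = (ri + 2) - (2 - ei)
                ri, si, ei = ri + 2, -1, 2 - ei
            rs.append(ri); ss.append(si); es.append(ei)
        fe = f_prime(tuple(es))                      # apply f' inside the folded unit cube
        return tuple(ri + si * fi for ri, si, fi in zip(rs, ss, fe))
    return f_double_prime

# Toy usage: f' sends every point of [0,1]^d to the cube's center.
f2 = extend_by_reflection(lambda e: tuple(0.5 for _ in e))
print(f2((3.2, -0.7)))   # each output coordinate stays within 1/2 of the input for this trivial f'
```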

References

  • [1] Mark Bun, Marco Gaboardi, Max Hopkins, Russell Impagliazzo, Rex Lei, Toniann Pitassi, Satchit Sivakumar, and Jessica Sorrell. Stability is stable: Connections between replicability, privacy, and adaptive generalization. In Proceedings of the 55th Annual ACM Symposium on Theory of Computing, STOC 2023, pages 520–527, New York, NY, USA, 2023. Association for Computing Machinery. doi:10.1145/3564246.3585246.
  • [2] Mark Bun, Kobbi Nissim, Uri Stemmer, and Salil Vadhan. Differentially private release and learning of threshold functions. In Proceedings of the 56th Annual IEEE Symposium on Foundations of Computer Science (FOCS ‘15), pages 634–649. IEEE, 18–20 october 2015. Full version posted as arXiv:1504.07553. arXiv:1504.07553.
  • [3] Jesus A De Loera, Elisha Peterson, and Francis Edward Su. A polytopal generalization of Sperner’s lemma. Journal of Combinatorial Theory, Series A, 100(1):1–26, 2002. doi:10.1006/JCTA.2002.3274.
  • [4] Peter Dixon, A. Pavan, Jason Vander Woude, and N. V. Vinodchandran. List and certificate complexities in replicable learning. In Proceedings of the 37th International Conference on Neural Information Processing Systems (NeurIPS ‘23), NeurIPS ’23, Red Hook, NY, USA, 2023. Curran Associates Inc.
  • [5] Cynthia Dwork, Frank McSherry, Kobbi Nissim, and Adam Smith. Calibrating noise to sensitivity in private data analysis. In Theory of cryptography, volume 3876 of Lecture Notes in Comput. Sci., pages 265–284. Springer, Berlin, 2006. doi:10.1007/11681878_14.
  • [6] Cynthia Dwork and Aaron Roth. The algorithmic foundations of differential privacy. Found. Trends Theor. Comput. Sci., 9(3-4):211–407, 2014. doi:10.1561/0400000042.
  • [7] Simson L. Garfinkel and Philip Leclerc. Randomness concerns when deploying differential privacy. In WPES@CCS, pages 73–86. ACM, 2020. doi:10.1145/3411497.3420211.
  • [8] Badih Ghazi, Ravi Kumar, and Pasin Manurangsi. User-level differentially private learning via correlated sampling. In Marc’Aurelio Ranzato, Alina Beygelzimer, Yann N. Dauphin, Percy Liang, and Jennifer Wortman Vaughan, editors, Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, NeurIPS 2021, December 6-14, 2021, virtual, pages 20172–20184, 2021. URL: https://proceedings.neurips.cc/paper/2021/hash/a89cf525e1d9f04d16ce31165e139a4b-Abstract.html.
  • [9] Ilya Mironov, Omkant Pandey, Omer Reingold, and Salil Vadhan. Computational differential privacy. In S. Halevi, editor, Advances in Cryptology—CRYPTO ‘09, volume 5677 of Lecture Notes in Computer Science, pages 126–142. Springer-Verlag, 16–20 august 2009. doi:10.1007/978-3-642-03356-8_8.
  • [10] Salil P. Vadhan. The complexity of differential privacy. In Tutorials on the Foundations of Cryptography, pages 347–450. Springer International Publishing, 2017. doi:10.1007/978-3-319-57048-8_7.
  • [11] Jason Vander Woude, Peter Dixon, Aduri Pavan, Jamie Radcliffe, and N. V. Vinodchandran. Geometry of rounding. CoRR, abs/2211.02694, 2022. doi:10.48550/arXiv.2211.02694.
  • [12] Jason Vander Woude, Peter Dixon, Aduri Pavan, Jamie Radcliffe, and N. V. Vinodchandran. Geometry of rounding: Near optimal bounds and a new neighborhood Sperner’s lemma. CoRR, abs/2304.04837, 2023. doi:10.48550/arXiv.2304.04837.