
Smooth Sensitivity Revisited: Towards Optimality

Richard Hladík (ETH Zurich, Switzerland) and Jakub Tětek (INSAIT, Sofia University “St. Kliment Ohridski”, Bulgaria)
Abstract

Smooth sensitivity is one of the most commonly used techniques for designing practical differentially private mechanisms. In this approach, one computes the smooth sensitivity of a given query q on the given input D and releases q(D) with noise added proportional to this smooth sensitivity. One question remains: what distribution should we pick the noise from?

In this paper, we give a new class of distributions suitable for use with smooth sensitivity, which we name the PolyPlace distribution. This distribution improves upon the state-of-the-art Student’s T distribution in terms of standard deviation by arbitrarily large factors, depending on a “smoothness parameter” γ, which one has to set in the smooth sensitivity framework. Moreover, our distribution is defined for a wider range of the parameter γ, which can lead to significantly better performance.

Furthermore, we prove that the PolyPlace distribution converges for γ → 0 to the Laplace distribution and so does its variance. This means that the Laplace mechanism is a limit special case of the PolyPlace mechanism. This implies that our mechanism is in a certain sense optimal for γ → 0.

Keywords and phrases:
differential privacy, smooth sensitivity
Funding:
Richard Hladík: Supported in part by the European Research Council (ERC) under the European Union’s Horizon 2020 Research and Innovation Programme (grant agreement No. 949272)
Copyright and License:
© Richard Hladík and Jakub Tětek; licensed under Creative Commons License CC-BY 4.0
2012 ACM Subject Classification:
Security and privacy → Privacy-preserving protocols
Related Version:
Full Version: https://arxiv.org/abs/2407.05067
Acknowledgements:
We would like to thank Rasmus Pagh and Adam Sealfon for helpful discussions. We would like to thank Rasmus Pagh for hosting Richard at the University of Copenhagen. Jakub worked on this paper while affiliated with BARC, University of Copenhagen. Richard partially worked on this paper while visiting BARC, University of Copenhagen.
Funding:
Partially funded by the Ministry of Education and Science of Bulgaria’s support for INSAIT as part of the Bulgarian National Roadmap for Research Infrastructure. Partially supported by the VILLUM Foundation grants 54451 and 16582.
Editors:
Mark Bun

1 Introduction

Differential privacy has in recent years become the gold standard for data privacy. Possibly the most common way of achieving differential privacy for numerical queries is the Laplace mechanism, which adds noise from the Laplace distribution scaled with the global sensitivity of the query (the global sensitivity is defined as the greatest difference in the query value between some two neighboring datasets). However, global sensitivity has one significant shortcoming: it is not adaptive to the dataset at hand. Namely, the error on every input is the same as that on the most difficult input, which determines the global sensitivity.

One of the most common ways to get practical mechanisms with instance-adaptive performance is to use smooth sensitivity. In contrast with global sensitivity, the smooth sensitivity can adapt to a specific dataset, meaning that easier inputs can have significantly lower errors. The approach works by computing the γ-smooth sensitivity (Definition 2) of the given query q on the input D at hand for a smoothness parameter γ, and releasing q(D) with added noise scaled with the γ-smooth sensitivity, picked from an appropriately chosen distribution. The parameter γ here controls a tradeoff between how large the γ-smooth sensitivity is on one hand, and how concentrated a distribution we can use for the noise on the other. This leaves one important question unanswered: what distribution should the noise be sampled from?

In this paper, we give a new distribution class suitable for use with smooth sensitivity, which we name the PolyPlace distribution (Definition 6). The specific distribution from the class that we use depends on the smoothness parameter γ and the privacy parameter ε. Our distribution has two advantages over the state-of-the-art Student’s T distribution: (1) it achieves significantly smaller error, having standard deviation smaller by arbitrarily large factors (depending on γ), as shown in Figure 1, and (2) it is defined for any γ<ε, whereas the previously used distributions all required γ<ε/2. While this may seem like an improvement by only a constant factor, it may in fact bring an asymptotic decrease in the error, as the smooth sensitivity can change by an arbitrarily large amount even with a small change in γ. Our method also significantly improves upon the Laplace distribution for all but tiny values of γ (the Laplace distribution moreover provides only approximate DP when used with smooth sensitivity). One of the main strengths of our mechanism is that it is completely general and provides an improvement in any application relying on smooth sensitivity.

We also prove that for γ → 0, our PolyPlace distributions converge in distribution to the Laplace distribution and the variance of PolyPlace also converges to that of the Laplace distribution. At the same time, with γ → 0, the γ-smooth sensitivity SS^γ converges to the global sensitivity. This means that in some sense, our proposed mechanism generalizes the Laplace mechanism, which is a limit special case of our mechanism for γ = 0. Since the Laplace mechanism is optimal for pure differential privacy [8], this means that our method is also optimal for γ → 0 in the sense that any quantile of PolyPlace converges to that of the optimal distribution. Our result can be summarized as follows:

Theorem 1 (Informal version of Theorem 8).

Let us have a query q and two parameters 0 < γ < ε. Let D be a dataset. There exists a class of distributions PolyPlace(s,α) such that releasing

q(D) + PolyPlace(SS_q^γ(D)/γ, ε/γ)

is ε-DP. Moreover, for γ/ε → 0, the variance is, up to a factor (1+o(1)), the same as when scaling the Laplace distribution with the smooth sensitivity. Also, for any value a > 0 it holds that PolyPlace(a/γ, ε/γ) →_d Lap(a/ε).

Figure 1: Comparison of the standard deviation of our distribution and the other distributions that can be used with the smooth sensitivity framework. The Laplace distribution, when used with smooth sensitivity, only provides approximate differential privacy; we assume here δ = 10^{−5}. The shape parameters for the two other distributions are numerically selected so as to minimize their standard deviations. The standard deviations are calculated in Appendix B. The dashed line represents the standard deviation of the Laplace distribution which, however, needs to be used with the larger global sensitivity and not smooth sensitivity.

Note that this convergence result is meaningful despite the fact that the smooth sensitivity converges to the global sensitivity for γ → 0: it implies that for fixed small values of γ, the distribution is close to optimal, while the smooth sensitivity will, in general, still be asymptotically smaller than the global sensitivity. This refutes an argument that one might naturally make against our result: namely, that since the smooth sensitivity is, for γ → 0, equal to the global sensitivity, the result does not show anything that would not already hold for the Laplace mechanism.

Smooth Sensitivity vs. Inverse Sensitivity Mechanism

Recently, an alternative to smooth sensitivity appeared in the form of the inverse sensitivity mechanism [1, 2]. This approach has nice theoretical guarantees, and mechanisms that are optimal up to logarithmic factors have been designed by relying on inverse sensitivity. Admittedly, this reduces the overall impact of our approach. However, we are still convinced that smooth sensitivity has an important place in the repertoire of differential privacy techniques, and that it will continue to be used in practice.

The inverse sensitivity mechanism comes with its own problems. First, inverse sensitivity may be difficult to compute [1, 7], even in cases when smooth sensitivity is easy to compute, such as for counting triangles in a graph. While it is often possible to efficiently approximately implement the inverse sensitivity mechanism [1, 7], this already leads to worsened performance – it remains to be seen whether this approach will be competitive in practice for a wide range of problems.

Moreover, there are practical considerations that often favor the smooth sensitivity approach. First, it is possible to have an efficient generic implementation just based on an oracle that returns the smooth sensitivity. This is clearly not possible for the inverse sensitivity mechanism: it instantiates the exponential mechanism, which in general cannot be implemented in time sublinear in the size of the universe. It is also due to this ease of use that there is extensive literature on the uses of smooth sensitivity on the applied side, as we discuss in Section 1.2. We think that even just the existence of this large amount of literature on smooth sensitivity would make improving smooth sensitivity an important undertaking.

Another issue with the inverse sensitivity mechanism is that it may in some cases have worse constants than smooth sensitivity. Specifically, it relies on the exponential mechanism which loses a factor of 2 in some instances. There are other mechanisms that could be used in place of the exponential mechanism, most notably the report-noisy-max mechanism [5], that may have better constants, but even those lose a factor of 2 in the standard deviation in some cases. For example, the Laplace mechanism can be seen as an instantiation of the exponential mechanism or report noisy max, but in doing so, one would lose a factor of 2. Our mechanism for smooth sensitivity does not have this issue (unlike the previously known smooth sensitivity mechanisms that achieved pure differential privacy).

1.1 Technical Overview

The principle that motivated the PolyPlace distribution is that we wanted to make sure that for ε → 0, the privacy loss will be the same regardless of what value we released (recall that the privacy loss random variable is a function of the value that we release). Indeed, this is precisely the property that the Laplace distribution has in the context of global sensitivity. We formalize this property as a claim about some derivatives involving the density of the PolyPlace distribution (Lemma 12). Intuitively speaking, these derivatives capture an “infinitesimal privacy loss” as if we had two neighboring datasets whose query values as well as smooth sensitivities differ infinitesimally. In Lemma 9, we essentially integrate this infinitesimal privacy loss in order to express the true privacy loss. Doing so takes bounds on the infinitesimal privacy loss and gives bounds on the actual privacy loss, proving differential privacy.

Intuitively speaking, our improvement comes from two sources. The first is unsurprisingly that our distribution simply has a smaller variance than the Student’s T distribution. The second source of our improvement is more subtle: we do not split the privacy budget the way previous work did. Intuitively speaking, there are two sources of privacy loss when we use smooth sensitivity: from the change in the query value between adjacent inputs (corresponding to shifting of noise) and from the change in the smooth sensitivity between neighboring datasets (corresponding to scaling noise). The standard approach is to split the privacy budget between the privacy loss caused by the shifting and the privacy loss caused by scaling. Surprisingly, if the scaling is not too large, it does not contribute anything to the worst-case privacy loss with the PolyPlace distribution. This allows us in some sense to use all of the privacy budget to “pay” for the change in query value between neighboring datasets, leading to a more efficient use of the privacy budget.

1.2 Related Work

Smooth sensitivity was proposed by Nissim, Raskhodnikova, and Smith [12] as an alternative to global sensitivity, rectifying the global sensitivity’s drawback that it is determined by the most difficult input. They showed how it can be used to privately and approximately release the median, the minimum, the minimum spanning tree, and the number of triangles in a graph. Bun and Steinke [3] later gave new distributions suitable for use with smooth sensitivity. Notably, they propose the use of the Student’s T distribution. They also propose other distributions, but those only work with the weaker concentrated differential privacy [6], which we do not focus on in this paper.

Since its inception, smooth sensitivity has been used in many applications. An incomplete list includes: principal component analysis [9], deep learning [13], statistics from genome-wide association studies [17], generating synthetic datasets using Bayesian models [11], generating synthetic graphs [15], random forests [16], and some graph centrality measures [10].

Other frameworks for beyond-worst-case differentially private mechanisms include propose-test-release [4], or privately bounding local sensitivity [14, Section 3.4]. However, these techniques only give approximate differential privacy and, generally speaking, they tend to need larger datasets to work. Namely, to get non-trivial guarantees, they usually need a dataset of size Ω(log(δ^{−1})/ε). (This is the case for propose-test-release as it otherwise always fails. For privately bounding local sensitivity, this is the case assuming the local sensitivity has global sensitivity 1; it often has bigger sensitivity, in which case the situation is even worse.) For these reasons as well as for the sake of brevity, we choose not to focus on these techniques.

2 Preliminaries

2.1 Differential Privacy

Let us have a symmetric notion of dataset adjacency ≃. A randomized algorithm 𝒜 is said to be ε-differentially private if, for any two adjacent datasets D ≃ D′, and for any measurable set S, the following condition holds:

Pr[𝒜(D) ∈ S] ≤ e^ε · Pr[𝒜(D′) ∈ S],

where ε controls the trade-off between privacy and data utility. Smaller values of ε provide stronger privacy guarantees, but may limit the accuracy of the analysis. One of the most notable ways of achieving differential privacy is using the notion of smooth sensitivity:

Definition 2 (Smooth sensitivity).

Let us have a query q : 𝒟 → ℝ. We define the local sensitivity of q at D ∈ 𝒟 as

LS_q(D) = max{ |q(D) − q(D′)| : D′ ≃ D }.

In turn, we define the γ-smooth sensitivity of q at D as

SS_q^γ(D) = max{ LS_q(D′) · e^{−γ·d(D,D′)} : D′ ∈ 𝒟 },

where d(D,D′) denotes the distance induced by ≃.
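For intuition, when one can bound the worst local sensitivity within every distance k of D, the γ-smooth sensitivity can be evaluated by maximizing over k. The following Python sketch is our own illustration of Definition 2 (the names smooth_sensitivity, local_sens_at_dist, global_sens_upper, and k_max are hypothetical, not from the paper); it assumes a problem-specific oracle for the local sensitivity within distance k and truncates the maximization once the e^{−γk} factor rules out any further improvement.

import math

def smooth_sensitivity(D, gamma, local_sens_at_dist, global_sens_upper, k_max=10_000):
    # gamma-smooth sensitivity of q at D, following Definition 2:
    #   SS_q^gamma(D) = max_k e^{-gamma*k} * (max local sensitivity within distance k of D).
    # local_sens_at_dist(D, k): caller-supplied oracle returning max{ LS_q(D') : d(D, D') <= k }.
    # global_sens_upper: any upper bound on LS_q over all datasets.
    best = 0.0
    for k in range(k_max + 1):
        decay = math.exp(-gamma * k)
        if decay * global_sens_upper <= best:
            break  # no larger distance can improve the maximum anymore
        best = max(best, decay * local_sens_at_dist(D, k))
    return best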

2.2 Calculus

We now give several statements from calculus. The following is a standard result.

Fact 3.

Let us have f : ℝⁿ → ℝ. If at a point 𝐱 ∈ ℝⁿ, all partial derivatives exist and are continuous, then the function is differentiable at 𝐱.

For the following two statements, we choose to give statements of special cases that suffice for our needs, instead of giving the more general cases. We defer their proofs to Appendix A.

Fact 4.

Let us have a function f : ℝ² → ℝ. At any x at which f is differentiable at (x,x), it holds

d/dx f(x,x) = [ d/dλ f(x+λ, x) + d/dλ f(x, x+λ) ]_{λ=0}.
Fact 5.

Let us have a real function f differentiable at 0. It holds that

[ d/dx f(cx) ]_{x=0} = c · [ d/dx f(x) ]_{x=0}.

3 Improved Smooth Sensitivity Mechanism

Definition 6.

We define the PolyPlace(s,α) distribution with scale s > 0 and shape α > 1 by its probability density function

f_{s,α}(x) = N_{s,α}·(α−1)·(1 − |x|/s)^{α−1}   if |x|/s < 1/α,
f_{s,α}(x) = N_{s,α}·(α+1)·(1 − 1/α²)^α·(1 + |x|/s)^{−α−1}   otherwise,

for N_{s,α} = α / (2·(2((α−1)/α)^α + α − 1)·s).

Figure 2: A log-scale plot of selected members of the Laplace and PolyPlace families of distributions. The values of α and s correspond (for SS(D)=1 and ε=1) to γ ≈ 0.91, γ = 2/3, and γ = 0.2, respectively. Note that the distributions get closer to Laplace, which is explained by the convergence from Theorem 8.

We plot a few example members of this family of distributions in Figure 2, where we also compare it with the Laplace distribution.
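To make Definition 6 concrete, the density can be transcribed directly into code. The following Python sketch is our own illustration (the function name polyplace_pdf is ours); it simply evaluates f_{s,α} branch by branch.

import math

def polyplace_pdf(x, s, alpha):
    # Density of PolyPlace(s, alpha) from Definition 6; requires s > 0 and alpha > 1.
    beta = ((alpha - 1) / alpha) ** alpha
    n = alpha / (2 * (2 * beta + alpha - 1) * s)              # N_{s,alpha}
    r = abs(x) / s
    if r < 1 / alpha:                                         # central, polynomial branch
        return n * (alpha - 1) * (1 - r) ** (alpha - 1)
    # polynomially decaying tail branch
    return n * (alpha + 1) * (1 - 1 / alpha ** 2) ** alpha * (1 + r) ** (-alpha - 1)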

We start by claiming that this distribution is correctly defined. We defer the proof to Appendix A.

Lemma 7.

The distribution PolyPlace(s,α) is correctly defined for s>0,α>1. That is, the density is non-negative and integrates to 1.
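As a quick numerical sanity check of Lemma 7 (our illustration, reusing polyplace_pdf and the math import from the sketch above; SciPy is assumed to be available), one can integrate the density numerically. Integrating each branch separately avoids the kink at |x| = s/α.

from scipy.integrate import quad

def total_mass(s, alpha):
    central, _ = quad(polyplace_pdf, 0, s / alpha, args=(s, alpha))
    tail, _ = quad(polyplace_pdf, s / alpha, math.inf, args=(s, alpha))
    return 2 * (central + tail)   # should be ~1.0 by symmetry

print(total_mass(1.0, 1.5), total_mass(3.0, 7.0))   # both approximately 1.0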

We now state our main result.

Theorem 8.

Let us have a query q : 𝒟 → ℝ. Let us have two parameters 0 < γ < ε. Then the mechanism

M(D) = q(D) + PolyPlace(SS_q^γ(D)/γ, ε/γ)

is ε-DP. Moreover, for γ/ε → 0, it holds that

Var(M(D)) = (1 + O(γ/ε)) · 2SS_q^γ(D)²/ε²

and for any value a > 0 it holds that PolyPlace(a/γ, ε/γ) →_d Lap(a/ε).

Proof.

We prove the asymptotic expansion of the variance in Corollary 14. We show that the distribution converges to Laplace in Lemma 15.

It remains to prove privacy. Let D ≃ D′ be two neighboring datasets and let p_{M(D)} denote the density function of M(D). We then have

p_{M(D)}(x) / p_{M(D′)}(x) = f_{SS_q^γ(D)/γ, ε/γ}(x − q(D)) / f_{SS_q^γ(D′)/γ, ε/γ}(x − q(D′)) ≤ e^ε,

where the final inequality holds by Lemma 9. This implies ε-differential privacy.
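To illustrate how the mechanism of Theorem 8 would be used, here is a Python sketch that draws PolyPlace noise by inverting the CDF of the magnitude and adds it to the query value. The sampler is our own derivation from Definition 6 (the paper does not prescribe a sampling procedure), so treat it as an illustrative sketch; the function names are ours.

import math, random

def polyplace_sample(s, alpha, rng=random):
    # One draw from PolyPlace(s, alpha) via the inverse CDF of |X| plus a random sign.
    beta = ((alpha - 1) / alpha) ** alpha
    denom = 2 * beta + alpha - 1
    a_c = (alpha - 1) / denom                      # CDF constant of the central branch
    b_t = (alpha + 1) / denom                      # CDF constant of the tail branch
    c_t = (1 - 1 / alpha ** 2) ** alpha
    p_central = a_c * (1 - beta)                   # Pr[|X| < s/alpha]
    v = rng.random()
    if v <= p_central:
        mag = s * (1 - (1 - v / a_c) ** (1 / alpha))
    else:
        mag = s * ((c_t / (beta - (v - p_central) / b_t)) ** (1 / alpha) - 1)
    return mag if rng.random() < 0.5 else -mag

def polyplace_mechanism(q_value, smooth_sens, eps, gamma):
    # Releases q(D) + PolyPlace(SS_q^gamma(D)/gamma, eps/gamma); requires 0 < gamma < eps.
    # e.g. polyplace_mechanism(q_value=42.0, smooth_sens=1.3, eps=1.0, gamma=0.2)
    return q_value + polyplace_sample(smooth_sens / gamma, eps / gamma)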

Proving Privacy

In this section, we prove Lemma 9, which is the result we used above to prove that our mechanism is differentially private. The technical parts of the proof of the lemma appear in Lemmas 12, 10, and 11. The casual reader may wish to skip the proof of Lemma 9 as well as the other lemmas in this subsection.

Lemma 9.

Let q : 𝒟 → ℝ, for 𝒟 being the set of all possible datasets, be a query function. Let us have two neighboring datasets D ≃ D′ and x ∈ ℝ. Then

f_{SS_q^γ(D)/γ, ε/γ}(x − q(D)) / f_{SS_q^γ(D′)/γ, ε/γ}(x − q(D′)) ≤ e^ε,

where f is defined in Definition 6.

Proof.

Let x′ = (x − q(D′))/SS_q^γ(D′) and define λ_r such that e^{λ_r γ} = SS_q^γ(D)/SS_q^γ(D′). Let also λ_s = (q(D′) − q(D))/SS_q^γ(D′) and y(t) = e^{−λ_r γ t}(x′ + λ_s t). Note that for any s > 0, α > 1 and c > 0, it holds

f_{s,α}(x) = c·f_{cs,α}(cx). (1)

We now give a bound and we explain the individual steps below.

f_{SS_q^γ(D)/γ, ε/γ}(x − q(D)) / f_{SS_q^γ(D′)/γ, ε/γ}(x − q(D′))
= [SS_q^γ(D′)^{−1}·f_{e^{λ_r γ}/γ, ε/γ}((x − q(D))/SS_q^γ(D′))] / [SS_q^γ(D′)^{−1}·f_{1/γ, ε/γ}((x − q(D′))/SS_q^γ(D′))] (2)
= f_{e^{λ_r γ}/γ, ε/γ}(x′ + λ_s) / f_{1/γ, ε/γ}(x′) (3)
= exp( log f_{e^{λ_r γ}/γ, ε/γ}(x′ + λ_s) − log f_{1/γ, ε/γ}(x′) )
= exp( ∫₀¹ d/dt [ log f_{e^{λ_r t γ}/γ, ε/γ}(x′ + λ_s t) ] dt ) (4)
= exp( ∫₀¹ [ d/dλ log f_{e^{λ_r (t+λ) γ}/γ, ε/γ}(x′ + λ_s t) + d/dλ log f_{e^{λ_r t γ}/γ, ε/γ}(x′ + λ_s (t+λ)) ]_{λ=0} dt ) (5)
= exp( ∫₀¹ [ d/dλ log f_{e^{λ_r λ γ}/γ, ε/γ}(y(t)) + d/dλ log f_{1/γ, ε/γ}(y(t) + e^{−λ_r t γ} λ_s λ) ]_{λ=0} dt )
= exp( ∫₀¹ [ λ_r·d/dλ log f_{e^{λ γ}/γ, ε/γ}(y(t)) + e^{−λ_r t γ} λ_s·d/dλ log f_{1/γ, ε/γ}(y(t) + λ) ]_{λ=0} dt ) (6)
≤ exp( ∫₀¹ [ |λ_r|·|d/dλ log f_{e^{λ γ}/γ, ε/γ}(y(t))| + e^{−λ_r t γ} |λ_s|·|d/dλ log f_{1/γ, ε/γ}(y(t) + λ)| ]_{λ=0} dt )
≤ exp( ∫₀¹ [ |d/dλ log f_{e^{λ γ}/γ, ε/γ}(y(t))| + |d/dλ log f_{1/γ, ε/γ}(y(t) + λ)| ]_{λ=0} dt ) (7)
= exp( ∫₀¹ ε dt ) = e^ε. (8)

We now justify the manipulations that may not be clear. Equality (2) holds by Equation 1. Equality (3) holds by the definition of x′ and λ_s. The equality (4) holds because of the fundamental theorem of calculus; we use here the fact that by Lemma 11, the function

ℓ(y,z) = log( f_{e^{λ_r y γ}/γ, ε/γ}(x′ + λ_s z) )

is differentiable except at points where x′ + λ_s z = 0, at which points it is continuous; furthermore, the function ϕ(t) = ℓ(t,t) is continuous everywhere and differentiable everywhere except for at most one point where x′ + λ_s t = 0. The equality (5) holds by applying Fact 4 to the integrand of (4); we again used that ℓ is differentiable everywhere except for the at most one t satisfying x′ + λ_s t = 0. The equality (6) holds by viewing each summand as a univariate function in λ and applying Fact 5. The bound (7) holds because, as we argue below, the removed constants satisfy |λ_r|, e^{−λ_r t γ}|λ_s| ≤ 1. The equality (8) holds by Lemma 12 below.

It remains to prove |λ_r|, e^{−λ_r t γ}|λ_s| ≤ 1. The definition of smooth sensitivity ensures that SS_q^γ(D) ≤ e^γ·SS_q^γ(D′) and SS_q^γ(D′) ≤ e^γ·SS_q^γ(D), and thus it holds that |λ_r| ≤ 1. It also holds

|q(D) − q(D′)| ≤ min( SS_q^γ(D), SS_q^γ(D′) )

and thus

|λ_s| ≤ min( 1, SS_q^γ(D)/SS_q^γ(D′) ).

It thus holds e^{−λ_r t γ}|λ_s| ≤ 1 (since for t ∈ [0,1], e^{−λ_r t γ} ≤ max(1, SS_q^γ(D′)/SS_q^γ(D))), as we needed to prove.

In the following lemma, we explicitly calculate the derivatives of log f_{s,α}(x) with respect to s and x, which is useful for Lemmas 11 and 12.

Lemma 10.

For α > 1 and h : ℝ_{>0} × ℝ → ℝ defined as h(s,x) = log f_{s,α}(x), it holds for any x ≠ 0:

[d/dλ h(s, x+λ)]_{λ=0} = sgn(x)·(α−1)/(|x|−s)   if |x|/s < 1/α,   and   sgn(x)·(−α−1)/(|x|+s)   otherwise;
[d/dλ h(e^{λ/s}·s, x)]_{λ=0} = −1/s + (α−1)|x|/(s(s−|x|))   if |x|/s < 1/α,   and   −1/s + (α+1)|x|/(s(s+|x|))   otherwise.
Proof.

The statement follows from direct computation; the full proof is deferred to Appendix A.

We proceed with a technical lemma that justifies the invocation of the fundamental theorem of calculus in the proof of Lemma 9.

Lemma 11.

Let 0 < γ < ε and λ_r, λ_s, x′ ∈ ℝ. The function

ℓ(y,z) = log( f_{e^{λ_r y γ}/γ, ε/γ}(x′ + λ_s z) )

is continuous everywhere. It is differentiable everywhere, except when x′ + λ_s z = 0.

Proof.

Set α = ε/γ and take h from Lemma 10. Define y′ = e^{λ_r y γ}/γ and z′ = x′ + λ_s z. Then ℓ(y,z) = h(y′, z′) and we have

∂/∂y ℓ(y,z) = [d/dλ ℓ(y+λ, z)]_{λ=0} = [d/dλ h(y′·e^{λ_r λ γ}, z′)]_{λ=0},
∂/∂z ℓ(y,z) = [d/dλ ℓ(y, z+λ)]_{λ=0} = [d/dλ h(y′, z′ + λ_s λ)]_{λ=0}.

Now we can apply Lemma 10 on the right-hand sides, together with the chain rule, to obtain that the partial derivatives on the left exist and are continuous for all (y,z) such that z′ = x′ + λ_s z ≠ 0. By Fact 3 and the fact that around every such (y,z), there is a sufficiently small neighborhood where x′ + λ_s z ≠ 0, we get that ℓ is differentiable (and thus continuous) in every (y,z) with x′ + λ_s z ≠ 0.

It remains to prove that ℓ is also continuous for x′ + λ_s z = 0. It suffices to show that h(s,x) is continuous at (s,0) for all choices of s > 0. The first branch h₁ of h is continuous everywhere, and for any (s,0), there exists a sufficiently small neighborhood of (s,0) where |x|/s < 1/α, that is, where h(s,x) = h₁(s,x). Therefore, h is continuous at (s,0).

Lemma 12.

Let x ∈ ℝ∖{0} and let 0 < γ < ε. It holds

[ |d/dλ log( f_{1/γ, ε/γ}(x+λ) )| + |d/dλ log( f_{e^{λγ}/γ, ε/γ}(x) )| ]_{λ=0} = ε. (9)
Proof sketch.

For the full proof, see Appendix A. The main idea of the proof is to calculate both expressions in the sum using Lemma 10, with s = 1/γ and α = ε/γ. Then, we distinguish two cases: either |x|/s < 1/α, or |x|/s ≥ 1/α. For both cases separately, and for each absolute value, we can argue that the expression inside this absolute value always has the same sign, and can therefore be written without the absolute value. This simplifies the analysis, and we can then verify that the left-hand side of Equation 9 always sums up to α/s. Finally, we substitute s and α back to get α/s = ε.
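As an informal numerical check of Equation 9 (our own illustration, reusing polyplace_pdf from the sketch after Definition 6), both derivatives can be approximated by central finite differences; away from x = 0 their absolute values should sum to ε.

import math

def loss_rate(x, gamma, eps, h=1e-6):
    # Finite-difference estimate of the left-hand side of Equation 9 at a point x != 0.
    s, alpha = 1 / gamma, eps / gamma
    shift = (math.log(polyplace_pdf(x + h, s, alpha))
             - math.log(polyplace_pdf(x - h, s, alpha))) / (2 * h)
    scale = (math.log(polyplace_pdf(x, s * math.exp(h * gamma), alpha))
             - math.log(polyplace_pdf(x, s * math.exp(-h * gamma), alpha))) / (2 * h)
    return abs(shift) + abs(scale)

for x in (0.3, 1.0, 5.0, -2.0):
    print(loss_rate(x, gamma=0.1, eps=1.0))   # each value is approximately eps = 1.0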

3.1 Properties of the PolyPlace(s,α) Distribution

We now give a lemma that among others states the variance of PolyPlace. To make sure we have not made a mistake in the very technical manipulations, we have checked that we get the same result if we approximate the variance by numerical integration. (Specifically, by scaling, we can assume w.l.o.g. that s = 1. For this value of s, we have plotted the variance as calculated by numerical integration and the expression below, and checked that they are indeed the same.) We defer the proof to Appendix A.

Lemma 13.

PolyPlace(s,α) has mean 0 and variance

σ²_{s,α} = 2s²·(1 + 1/α)^{−α}·( (19α² + 5)(1 − 1/α²)^α + (α−2)(α−1)²(1 + 1/α)^α ) / ( (2((α−1)/α)^α + α − 1)(α⁴ − 5α² + 4) ).
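Mirroring the numerical verification mentioned above, the following sketch (ours; it reuses polyplace_pdf and assumes SciPy) compares the closed-form variance with the value obtained by numerical integration.

import math
from scipy.integrate import quad

def polyplace_var_formula(s, alpha):
    # Closed-form variance from Lemma 13; finite only for alpha > 2 (i.e. gamma < eps/2).
    beta = ((alpha - 1) / alpha) ** alpha
    num = ((19 * alpha ** 2 + 5) * (1 - 1 / alpha ** 2) ** alpha
           + (alpha - 2) * (alpha - 1) ** 2 * (1 + 1 / alpha) ** alpha)
    den = (2 * beta + alpha - 1) * (alpha ** 4 - 5 * alpha ** 2 + 4)
    return 2 * s ** 2 * (1 + 1 / alpha) ** (-alpha) * num / den

def polyplace_var_numeric(s, alpha):
    second_moment = lambda x: x ** 2 * polyplace_pdf(x, s, alpha)
    central, _ = quad(second_moment, 0, s / alpha)
    tail, _ = quad(second_moment, s / alpha, math.inf)
    return 2 * (central + tail)

print(polyplace_var_formula(1.0, 3.0), polyplace_var_numeric(1.0, 3.0))   # both ~1.083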
Corollary 14.

Assume γ < ε/2. It holds

Var(M(D)) = (1 + O(γ/ε)) · 2SS_q^γ(D)²/ε².
Proof.

We substitute into the above lemma with s = SS_q^γ(D)/γ, α = ε/γ, and take the Taylor expansion of order one w.r.t. γ around γ = 0. We skip routine steps in the computation and just state that the resulting first-order expansion is

2SS_q^γ(D)²/ε² − 2(3e − 17)·γ·SS_q^γ(D)² / (e·ε³) + O(γ²).

This is exactly as we claim.

Note that the variance of the Laplace distribution scaled to the smooth sensitivity is 2SS_q^γ(D)²/ε², meaning that for γ/ε → 0, the variance of our mechanism approaches that of the Laplace mechanism scaled to the smooth sensitivity. In fact, the distribution converges to the Laplace distribution for γ → 0.

Lemma 15.

For γ → 0 and any a, ε > 0, it holds that f_{a/γ, ε/γ} converges pointwise to the probability density function (pdf) of Lap(a/ε).

Proof.

Let us denote the pdf of Lap(a/ε) by f_{Lap(a/ε)}. We prove pointwise convergence, that is, that for any x ∈ ℝ, f_{a/γ, ε/γ}(x) → f_{Lap(a/ε)}(x). This in turn implies the desired convergence in distribution. (This holds because, by Scheffé’s lemma, pointwise convergence of the densities means that the TV distance of the two distributions goes to zero, which in turn implies convergence in distribution.) Since f_{Lap(a/ε)} is bounded, the claim follows if we prove that for γ/ε → 0 and any fixed a > 0 and x, it holds that f_{a/γ, ε/γ}(x)/f_{Lap(a/ε)}(x) → 1, denoted f_{a/γ, ε/γ}(x) ∼ f_{Lap(a/ε)}(x). We prove this claim in the rest of this proof.

Set s = a/γ, α = ε/γ. It holds that

f̄₁ := (1 − |x|/s)^{α−1} = (1 − |x|γ/a)^{ε/γ − 1} ∼ (1 − |x|γ/a)^{ε/γ}
= ((1 − |x|γ/a)^{a/(γ|x|)})^{ε|x|/a} → e^{−ε|x|/a},

where the first relation uses the fact that the base of the exponential converges to 1 and the second uses the standard limit (1 − z)^{1/z} → e^{−1}. At the same time, it holds that

c := α / (2(2((α−1)/α)^α + α − 1)s) = (ε/γ) / (2(2((ε/γ − 1)/(ε/γ))^{ε/γ} + ε/γ − 1)(a/γ))
∼ (ε/γ) / (2(2e^{−1} + ε/γ − 1)(a/γ))
∼ (ε/γ) / (2(ε/γ)(a/γ)) = γ/(2a),

where the first relation again uses the limit (1 − z)^{1/z} → e^{−1}. Overall, we can thus for |x|/s < 1/α write

f_{a/γ, ε/γ}(x) = c(ε/γ − 1)·f̄₁ ∼ (γ/(2a))(ε/γ − 1)e^{−ε|x|/a} ∼ (γ/(2a))(ε/γ)e^{−ε|x|/a}
= (ε/(2a))e^{−ε|x|/a} = f_{Lap(a/ε)}(x),

as we wanted to show. The argument for |x|/s ≥ 1/α is very similar:

f̄₂ := (1 + |x|/s)^{−α−1} = (1 + |x|γ/a)^{−ε/γ − 1}
∼ (1 + |x|γ/a)^{−ε/γ} = ((1 + |x|γ/a)^{a/(γ|x|)})^{−ε|x|/a} → e^{−ε|x|/a}

and

f_{a/γ, ε/γ}(x) = c(ε/γ + 1)(1 − γ²/ε²)^{ε/γ}·f̄₂
∼ (γ/(2a))(ε/γ + 1)(1 − γ²/ε²)^{ε/γ}·e^{−ε|x|/a}
∼ (γ/(2a))(ε/γ)·e^{−ε|x|/a}
= (ε/(2a))e^{−ε|x|/a} = f_{Lap(a/ε)}(x),

where the second relation uses the standard limit (1 − z²)^{1/z} → 1 for z → 0.
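A small numerical illustration of this convergence (ours, reusing polyplace_pdf): as γ decreases, the PolyPlace density with parameters a/γ and ε/γ approaches the Lap(a/ε) density at any fixed point.

import math

def laplace_pdf(x, b):
    return math.exp(-abs(x) / b) / (2 * b)

a, eps, x = 1.0, 1.0, 2.0
for gamma in (0.5, 0.1, 0.01, 0.001):
    print(gamma, polyplace_pdf(x, a / gamma, eps / gamma), laplace_pdf(x, a / eps))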

References

  • [1] Hilal Asi and John C. Duchi. Instance-optimality in differential privacy via approximate inverse sensitivity mechanisms. In Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, NeurIPS 2020, December 6-12, 2020, virtual, volume 33, pages 14106–14117, 2020. URL: https://proceedings.neurips.cc/paper/2020/hash/a267f936e54d7c10a2bb70dbe6ad7a89-Abstract.html.
  • [2] Hilal Asi and John C. Duchi. Near instance-optimality in differential privacy. CoRR, abs/2005.10630, 2020. doi:10.48550/arXiv.2005.10630.
  • [3] Mark Bun and Thomas Steinke. Average-case averages: Private algorithms for smooth sensitivity and mean estimation. In Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, NeurIPS 2019, December 8-14, 2019, Vancouver, BC, Canada, volume 32, pages 181–191, 2019. URL: https://proceedings.neurips.cc/paper/2019/hash/3ef815416f775098fe977004015c6193-Abstract.html.
  • [4] Cynthia Dwork and Jing Lei. Differential privacy and robust statistics. In Proceedings of the 41st Annual ACM Symposium on Theory of Computing, STOC 2009, Bethesda, MD, USA, May 31 – June 2, 2009, pages 371–380. ACM, 2009. doi:10.1145/1536414.1536466.
  • [5] Cynthia Dwork and Aaron Roth. The algorithmic foundations of differential privacy. Foundations and Trends® in Theoretical Computer Science, 9(3-4):211–407, 2014. doi:10.1561/0400000042.
  • [6] Cynthia Dwork and Guy N. Rothblum. Concentrated differential privacy. CoRR, abs/1603.01887, 2016. doi:10.48550/arXiv.1603.01887.
  • [7] Juanru Fang, Wei Dong, and Ke Yi. Shifted inverse: A general mechanism for monotonic functions under user differential privacy. In Proceedings of the 2022 ACM SIGSAC Conference on Computer and Communications Security, CCS 2022, Los Angeles, CA, USA, November 7-11, 2022, pages 1009–1022. ACM, 2022. doi:10.1145/3548606.3560567.
  • [8] Natasha Fernandes, Annabelle McIver, and Carroll Morgan. The Laplace mechanism has optimal utility for differential privacy over continuous queries. In 36th Annual ACM/IEEE Symposium on Logic in Computer Science, LICS 2021, Rome, Italy, June 29 – July 2, 2021, pages 1–12. IEEE, 2021. doi:10.1109/LICS52264.2021.9470718.
  • [9] Alon Gonem and Ran Gilad-Bachrach. Smooth sensitivity based approach for differentially private PCA. In Algorithmic Learning Theory, ALT 2018, 7-9 April 2018, Lanzarote, Canary Islands, Spain, volume 83 of Proceedings of Machine Learning Research, pages 438–450. PMLR, 2018. URL: http://proceedings.mlr.press/v83/gonem18a.html.
  • [10] Jesse Laeuchli, Yunior Ramírez-Cruz, and Rolando Trujillo-Rasua. Analysis of centrality measures under differential privacy models. Applied Mathematics and Computation, 412:126546, 2022. doi:10.1016/j.amc.2021.126546.
  • [11] Mingzhu Li and Xuebin Ma. Bayesian networks-based data publishing method using smooth sensitivity. In IEEE International Conference on Parallel & Distributed Processing with Applications, Ubiquitous Computing & Communications, Big Data & Cloud Computing, Social Computing & Networking, Sustainable Computing & Communications, ISPA/IUCC/BDCloud/SocialCom/SustainCom 2018, Melbourne, Australia, December 11-13, 2018, pages 795–800. IEEE, 2018. doi:10.1109/BDCloud.2018.00119.
  • [12] Kobbi Nissim, Sofya Raskhodnikova, and Adam D. Smith. Smooth sensitivity and sampling in private data analysis. In Proceedings of the 39th Annual ACM Symposium on Theory of Computing, San Diego, California, USA, June 11-13, 2007, pages 75–84. ACM, 2007. doi:10.1145/1250790.1250803.
  • [13] Lichao Sun, Yingbo Zhou, Philip S. Yu, and Caiming Xiong. Differentially private deep learning with smooth sensitivity. CoRR, abs/2003.00505, 2020. doi:10.48550/arXiv.2003.00505.
  • [14] Salil P. Vadhan. The complexity of differential privacy. In Tutorials on the Foundations of Cryptography, pages 347–450. Springer International Publishing, 2017. doi:10.1007/978-3-319-57048-8_7.
  • [15] Yue Wang and Xintao Wu. Preserving differential privacy in degree-correlation based graph generation. Trans. Data Priv., 6(2):127–145, 2013. URL: http://www.tdp.cat/issues11/abs.a113a12.php.
  • [16] Bangzhou Xin, Wei Yang, Shaowei Wang, and Liusheng Huang. Differentially private greedy decision forest. In IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2019, Brighton, United Kingdom, May 12-17, 2019, pages 2672–2676. IEEE, 2019. doi:10.1109/ICASSP.2019.8682219.
  • [17] Akito Yamamoto and Tetsuo Shibuya. Privacy-preserving publication of GWAS statistics using smooth sensitivity. In 20th Annual International Conference on Privacy, Security and Trust, PST 2023, Copenhagen, Denmark, August 21-23, 2023, pages 1–12. IEEE, 2023. doi:10.1109/PST58708.2023.10320160.

Appendix A Deferred Proofs

We start by proving the calculus facts from Section 2.2.

Proof of Fact 4.

The statement follows from the chain rule. Its special case states that for f : ℝ² → ℝ and differentiable g : ℝ → ℝ and h : ℝ → ℝ, and all x such that y = g(x), z = h(x) and f is differentiable at (y,z), it holds

df(y,z)/dx = (∂f(y,z)/∂y)·(dy/dx) + (∂f(y,z)/∂z)·(dz/dx)
= [ (d f(y+λ, z)/dλ)·(dy/dx) + (d f(y, z+λ)/dλ)·(dz/dx) ]_{λ=0},

where the notation on the right-hand side is an equivalent way of writing the partial derivatives. In our case, g = h = id, x = y = z and dy/dx = dz/dx = 1, and substituting these facts into the equation above concludes the proof.

Proof of Fact 5.

The statement follows from the chain rule. Its special case states that if h(x) = f(g(x)), g is differentiable everywhere and f is differentiable at g(x), then h′(x) = f′(g(x))·g′(x). If we apply the chain rule to g(x) = cx, then h(x) = f(cx) and we get that d/dx f(cx) = h′(x) = f′(g(x))·g′(x) = f′(cx)·c. Evaluated at x = 0, this becomes c·f′(0), as needed.

Next, we prove Lemma 7, which states that the PolyPlace distribution is correctly defined.

Proof of Lemma 7.

By scaling, we may assume without loss of generality that s = 1. By symmetry, it then suffices to integrate f_{1,α} from 0 to +∞ and verify that the result is 1/2. We do not show the detailed steps of how to compute the antiderivatives below; however, it is easy to check their correctness by differentiating and verifying that this yields the respective expressions being integrated. The first branch in the definition of f_{1,α} integrates to

I₁(α,x) = −(α−1)(1−x)^α / (2(2((α−1)/α)^α + α − 1)),

and the second integrates to

I₂(α,x) = −(1 − 1/α²)^α (α+1)(x+1)^{−α} / (2(2((α−1)/α)^α + α − 1)).

We now write the integral as

∫₀^∞ f_{1,α}(x) dx = I₁(α, 1/α) − I₁(α, 0) + lim_{x→+∞} I₂(α, x) − I₂(α, 1/α) = 1/2,

where in the last step we skipped some elementary manipulations.

Next, we prove Lemma 10, which provides a formula for the derivatives of h(s,x)=logfs,α(x).

Proof of Lemma 10.

Define

g(s,x,m) = N_{s,α}(α − m)(1 − m·x/s)^{mα − 1}.

Then one can verify that

h(s,x) = log g(s, |x|, +1)   if |x|/s < 1/α,
h(s,x) = log g(s, |x|, −1) + α·log(1 − 1/α²)   otherwise.

When taking derivatives, we can ignore the additive α·log(1 − 1/α²) in the second branch as α is constant.

We start with analyzing the derivatives of g with respect to x. For m = ±1, x ≠ 0:

[d/dλ log(g(s, |x+λ|, m))]_{λ=0}
= [ d/dλ log(N_{s,α}(α − m)) + d/dλ (mα − 1)·log(1 − m|x+λ|/s) ]_{λ=0}   (the first summand is 0)
= [ (mα − 1)·sgn(x+λ)·m/(m|x+λ| − s) ]_{λ=0} = sgn(x)·(α−1)/(|x|−s) if m = 1, and sgn(x)·(−α−1)/(|x|+s) otherwise.

Now, for x ≠ 0, |x|/s ≠ 1/α, it holds:

[d/dλ h(s, x+λ)]_{λ=0} = sgn(x)·(α−1)/(|x|−s) if |x|/s < 1/α, and sgn(x)·(−α−1)/(|x|+s) otherwise. (10)

Finally, for |x|/s = 1/α, each branch gives us a one-sided derivative at x = ±s/α, and one can verify that both derivatives are equal and therefore Equation 10 holds for all x ≠ 0.

Similarly, for m = ±1, x ≠ 0:

[d/dλ log(g(e^{λ/s}·s, |x|, m))]_{λ=0}
= [ d/dλ log N_{e^{λ/s}·s, α} + d/dλ log(α − m) + d/dλ (mα − 1)·log(1 − m|x|e^{−λ/s}/s) ]_{λ=0}   (the middle summand is 0)
= [ d/dλ log e^{−λ/s} + d/dλ (mα − 1)·log(1 − m|x|e^{−λ/s}/s) ]_{λ=0}
= [ −1/s + (mα − 1)·m|x|/(s(e^{λ/s}·s − m|x|)) ]_{λ=0} = −1/s + (α−1)|x|/(s(s−|x|)) if m = 1, and −1/s + (α+1)|x|/(s(s+|x|)) otherwise.

Thus, for x ≠ 0, |x|/s ≠ 1/α:

[d/dλ h(e^{λ/s}·s, x)]_{λ=0} = −1/s + (α−1)|x|/(s(s−|x|)) if |x|/s < 1/α, and −1/s + (α+1)|x|/(s(s+|x|)) otherwise. (11)

Again, the two branches are equal for |x|/s = 1/α and Equation 11 hence holds for all x ≠ 0.

Proof of Lemma 12.

In this whole proof, fix s>0,α>1. We will carry through the proof with s and α, and in the very end substitute s=1/γ and α=ε/γ.

Per Lemma 10, it holds:

[ |d/dλ log(f_{s,α}(x+λ))| ]_{λ=0} = |(α−1)/(|x|−s)| = (α−1)/(s−|x|) if |x|/s < 1/α, and |(−α−1)/(|x|+s)| = (α+1)/(s+|x|) otherwise. (12)

To justify the removal of the absolute value on the right-hand side, note that α > 1 and thus α−1 > 0 and −α−1 < 0. Then consider two cases: if |x|/s < 1/α < 1, then |x|−s < 0 and thus (α−1)/(|x|−s) < 0, and if |x|/s ≥ 1/α > 0, then |x|+s > 0 and thus (−α−1)/(|x|+s) < 0.

Similarly,

[ |d/dλ log(f_{e^{λ/s}·s, α}(x))| ]_{λ=0} = |−1/s + (α−1)|x|/(s(s−|x|))| = 1/s − (α−1)|x|/(s(s−|x|)) if |x|/s < 1/α, and |−1/s + (α+1)|x|/(s(s+|x|))| = −1/s + (α+1)|x|/(s(s+|x|)) otherwise.

To again justify the removal of absolute values, we distinguish two cases. If |x|/s < 1/α, then

s/|x| > α ⟹ (s−|x|)/|x| > α−1 ⟹ (s−|x|)/((α−1)|x|) > 1 ⟹ (α−1)|x|/(s−|x|) < 1 ⟹ (α−1)|x|/(s(s−|x|)) < 1/s,

and thus, 1/s − (α−1)|x|/(s(s−|x|)) > 0. If |x|/s ≥ 1/α, then, similarly,

s/|x| ≤ α ⟹ (s+|x|)/|x| ≤ α+1 ⟹ (s+|x|)/((α+1)|x|) ≤ 1 ⟹ (α+1)|x|/(s+|x|) ≥ 1 ⟹ (α+1)|x|/(s(s+|x|)) ≥ 1/s,

and thus, −1/s + (α+1)|x|/(s(s+|x|)) ≥ 0.

Now we can combine both results to conclude that

[ |d/dλ log(f_{s,α}(x+λ))| + |d/dλ log(f_{e^{λ/s}·s, α}(x))| ]_{λ=0}
= (α−1)/(s−|x|) + 1/s − (α−1)|x|/(s(s−|x|)) = (s−|x|)(1 + α − 1)/(s(s−|x|)) if |x|/s < 1/α,
= (α+1)/(s+|x|) − 1/s + (α+1)|x|/(s(s+|x|)) = (s+|x|)(α + 1 − 1)/(s(s+|x|)) otherwise;
in both cases this equals α/s.

To conclude the proof, set s = 1/γ, α = ε/γ and observe that s > 0 and α > 1 as needed. Thus, all previous claims hold, and we have

[ |d/dλ log(f_{1/γ, ε/γ}(x+λ))| + |d/dλ log(f_{e^{λγ}/γ, ε/γ}(x))| ]_{λ=0} = (ε/γ)/(1/γ) = ε.

Now we prove Lemma 13, which states that the PolyPlace distribution has mean 0 and calculates its variance.

Proof of Lemma 13.

Assuming the expectation is defined, it has to be 0 as the distribution is symmetric around this point. We now focus on the variance; by proving that the variance is finite, we also get that the expectation is defined and therefore 0. Similarly to the proof of Lemma 7, we first separately give the indefinite integrals of the two branches, this time of x²f_{1,α}(x):

I₁^{σ²} = −(α−1)(1−x)^α (α(α+1)x² + 2αx + 2) / (2(α+1)(α+2)(2((α−1)/α)^α + α − 1)),
I₂^{σ²} = −(1 − 1/α²)^α (α+1)(x+1)^{−α} (α(α−1)x² + 2αx + 2) / (2(α−2)(α−1)(2((α−1)/α)^α + α − 1)).

We may now use this to write the integral as

σ²_{1,α} = 2∫₀^{+∞} x²f_{1,α}(x) dx = 2( I₁^{σ²}(α, 1/α) − I₁^{σ²}(α, 0) + lim_{x→+∞} I₂^{σ²}(α, x) − I₂^{σ²}(α, 1/α) )
= 2(1 + 1/α)^{−α} ( (19α² + 5)(1 − 1/α²)^α + (α−2)(α−1)²(1 + 1/α)^α ) / ( (2((α−1)/α)^α + α − 1)(α⁴ − 5α² + 4) ).

Appendix B Other Mechanisms for Smooth Sensitivity

In this section, we calculate the variances of the generalized Cauchy distribution of [12], the Student’s T distribution [3], and the Laplace distribution, when used with the smooth sensitivity framework. We use these variances in Figure 1 in order to compare our distribution with these distributions.

Claim 16.

When scaled according to Nissim, Raskhodnikova, and Smith [12], the generalized Cauchy distribution has, for c > 3 and 0 < γ < ε/(c+1), a standard deviation of

(c+1) / ( (ε − γ(c+1)) · √(2cos(2π/c) + 1) ).

For other values of c>0,γ>0, the standard deviation is infinite.

Proof.

The generalized Cauchy distribution is defined in Nissim, Raskhodnikova, and Smith [12] as having density proportional to 1/(1 + |y|^c). Using a Computer Algebra System (such as Mathematica), it is easy to verify that the density is in fact equal to

c / ( 2π csc(π/c) (|y|^c + 1) ).

Writing the variance as an integral and again using a CAS to evaluate it, one gets that the variance is

1 / ( 2cos(2π/c) + 1 )

for c > 3 and infinite otherwise. For two neighboring datasets D ≃ D′, let us have λ_r such that SS_q^γ(D′) = e^{λ_r}·SS_q^γ(D) and let λ_s = (q(D′) − q(D))/SS_q^γ(D). It is shown in Nissim, Raskhodnikova, and Smith [12] that the privacy loss is at most (|λ_r| + |λ_s|)(c+1). If we substitute the bound |λ_r| ≤ γ, the remaining privacy budget is ε − γ(c+1), meaning that we need to scale the distribution with (c+1)/(ε − γ(c+1)) in order to ensure the privacy loss is at most ε, assuming γ < ε/(c+1). This results in a standard deviation of

(c+1) / ( (ε − γ(c+1)) · √(2cos(2π/c) + 1) ).

Claim 17.

When scaled according to Bun and Steinke [3], the Student’s T distribution has, for d > 2 and 0 < γ < ε/(d+1), a standard deviation of

√(d/(d−2)) · (d+1) / ( 2√d·(ε − γ(d+1)) ).

For other values of d>0,γ>0, the standard deviation is infinite.

Proof.

The variance of the T distribution with d degrees of freedom is d/(d−2). At the same time, for λ_r, λ_s defined as in the proof above, the privacy loss is at most

|λ_r|·(d+1) + |λ_s|·(d+1)/(2√d).

We have |λ_r| ≤ γ. The remaining budget for the privacy loss caused by the distribution shift is ε − γ(d+1), meaning that we need

|λ_s| ≤ ν := 2√d·(ε − γ(d+1)) / (d+1),

assuming γ < ε/(d+1). To achieve ε-differential privacy, for a fixed value of γ, we have to scale the distribution with ν^{−1}. This results in a standard deviation of

√(d/(d−2)) · (d+1) / ( 2√d·(ε − γ(d+1)) ).

Claim 18.

When scaled according to Nissim, Raskhodnikova, and Smith [12], the Laplace distribution has, for 0 < γ < ε/log(2δ^{−1}), a standard deviation of

√2 / ( ε − γ·log(2δ^{−1}) ).

Proof.

The proof is essentially the same as the proofs above. It is shown in [12] that the privacy loss is at most |λ_r|·log(2δ^{−1}) + |λ_s|. This means we have to scale the noise to be inversely proportional to ε − γ·log(2δ^{−1}). This leads to the claimed standard deviation.
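For completeness, the standard deviations from Claims 16–18 and Lemma 13 can be compared numerically along the lines of Figure 1. The following sketch is our own illustration (it reuses polyplace_var_formula from Section 3.1, normalizes the smooth sensitivity to 1, and picks the shape parameters d and c by a simple grid search, mirroring the numeric selection mentioned in the Figure 1 caption).

import math

def std_polyplace(eps, gamma):
    s, alpha = 1 / gamma, eps / gamma          # scale SS_q^gamma(D)/gamma with SS = 1
    return math.sqrt(polyplace_var_formula(s, alpha)) if alpha > 2 else math.inf

def std_student_t(eps, gamma):
    best = math.inf                            # Claim 17, minimized over d by grid search
    for i in range(1, 40_000):
        d = 2 + i / 100
        if gamma >= eps / (d + 1):
            continue
        best = min(best, math.sqrt(d / (d - 2)) * (d + 1)
                   / (2 * math.sqrt(d) * (eps - gamma * (d + 1))))
    return best

def std_gen_cauchy(eps, gamma):
    best = math.inf                            # Claim 16, minimized over c by grid search
    for i in range(1, 40_000):
        c = 3 + i / 100
        if gamma >= eps / (c + 1):
            continue
        best = min(best, (c + 1) / ((eps - gamma * (c + 1))
                   * math.sqrt(2 * math.cos(2 * math.pi / c) + 1)))
    return best

def std_laplace_smooth(eps, gamma, delta=1e-5):
    budget = eps - gamma * math.log(2 / delta)  # Claim 18 (approximate DP only)
    return math.sqrt(2) / budget if budget > 0 else math.inf

eps = 1.0
for gamma in (0.05, 0.1, 0.2, 0.4):
    print(gamma, std_polyplace(eps, gamma), std_student_t(eps, gamma),
          std_gen_cauchy(eps, gamma), std_laplace_smooth(eps, gamma))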