Differential Privacy from Axioms
Abstract
Differential privacy (DP) is the de facto notion of privacy both in theory and in practice. However, despite its popularity, DP imposes strict requirements which guard against strong worst-case scenarios. For example, it guards against seemingly unrealistic scenarios where an attacker has full information about all but one point in the data set, and still nothing can be learned about the remaining point. While preventing such a strong attack is desirable, many works have explored whether average-case relaxations of DP are easier to satisfy [17, 28, 5, 22].
In this work, we are motivated by the question of whether alternate, weaker notions of privacy are possible: can a weakened privacy notion still guarantee some basic level of privacy, and on the other hand, achieve privacy more efficiently and/or for a substantially broader set of tasks? Our main result shows the answer is no: even in the statistical setting, any reasonable measure of privacy satisfying nontrivial composition is equivalent to DP. To prove this, we identify a core set of four axioms or desiderata: pre-processing invariance, prohibition of blatant non-privacy, strong composition, and linear scalability. Our main theorem shows that any privacy measure satisfying our axioms is equivalent to DP, up to polynomial factors in sample complexity. We complement this result by showing our axioms are minimal: removing any one of our axioms enables ill-behaved measures of privacy.
Keywords and phrases:
Differential Privacy, Privacy Amplification, CompositionFunding:
Guy Blanc: Supported by NSF awards 1942123, 2211237, and 2224246 and a Jane Street Graduate Research Fellowship.Copyright and License:
2012 ACM Subject Classification:
Theory of computation Theory of database privacy and security ; Theory of computation Machine learning theoryAcknowledgements:
The authors thank the anonymous ITCS reviewers for their helpful feedback.Editor:
Shubhangi SarafSeries and Publisher:
Leibniz International Proceedings in Informatics, Schloss Dagstuhl – Leibniz-Zentrum für Informatik
1 Introduction
Differential privacy has won. It is the de facto formalization of privacy both in theory (see, e.g., the textbooks [13, 27, 24]) and in practice (see, e.g., its use in the U.S. Census [3] and by various technology companies [1, 29, 9]).
Definition 1 (-Differential Privacy, [12, 11]).
A randomized algorithm is -DP if, for every differing in only one of the coordinates and ,
A large part of the reason that differential privacy (DP) has been so successful is the extensive toolkit of DP algorithms for a variety of basic primitives [13]. This toolkit can then be combined with strong composition: The sequential combination of -many of these primitives has a privacy loss ( in Definition 1) that scales sublinearly in [15, 19]. This allows for efficient and simple construction of DP algorithms for a variety of tasks (see e.g. [2] for how strong composition enables differentially private deep learning). This work is motivated by the following question.
Question 1.
How inevitable was Definition 1? Is it possible to construct a materially different formulation of privacy that still satisfies strong composition?
A natural reason to suspect alternative definitions of privacy may be useful is that Definition 1 guards against an incredibly strong, and in some cases unrealistic, attack. Even if the attacker is able to freely manipulate all but one point in the dataset, corresponding to the points and agree on, they must still learn almost nothing about the one unknown point. In statistical settings, we model the entire dataset as being drawn from some unknown distribution , in which case the attacker is not nearly as strong as Definition 1 suggests. That observation has motivated a number of relaxations of DP in which privacy must only be preserved on more “typical” datasets [17, 28, 5, 22].
Our main result shows that we may as well use the worst-case definition of differential privacy.
Even in the statistical setting, any reasonable measure of privacy that satisfies strong composition is equivalent to Definition 1 up to polynomial factors in the sample complexity.
To formalize this, we define the following four privacy axioms that we posit should be satisfied by any measure of privacy that is both reasonable and useful.111By a privacy measure, we mean a scalar quantity associated with an algorithm , and we say that is -private if is at most 1.
-
1.
Preprocessing: Privacy is preserved under preprocessing. Specifically, privacy should hold regardless of the ordering of the dataset, and regardless of the ordering of the domain.
-
2.
Prohibits blatant non-privacy:
a private algorithm should not reveal almost all of the dataset.
-
3.
Strong composition: the privacy measure should grow sublinearly under composition. I.e., the composition of -many -private algorithms should be -private, for some .
-
4.
Linear scalability: the privacy measure should decrease linearly with the number of samples.
See Section 2 for a more detailed description of these axioms, and justification for why we view these axioms as both reasonable and usable.
With these axioms in place, our main results are captured by the following three theorems. The first and most important theorem states that any algorithm satisfying our axioms is also differentially private:
Theorem 2 (Our axioms imply differential privacy).
Let be any privacy measure that satisfies Axioms 1, 2, 3, and 4 and be any algorithm that is -private. For any and there is an (,)-DP algorithm that is equivalent (as in Definition 6) to .
The exact polynomial in Theorem 2 depends on the constant in our our strong composition axiom (Axiom 3). The best known constant for strong-composition is , in which case the sample-size in Theorem 2 would be , provided the domain is not too small.222Our actual analysis case splits on the size of the domain, and gets a worse polynomial on very small domains. We refer the reader to the full version of this work [6] for the formal version of Theorem 2. We note that given the known equivalences between DP, replicability, and various notions of stability, Theorem 2 shows that these other notions are also implied by our axioms.
Second, we show that differential privacy satisfies our axioms.
Theorem 3 (Informal).
Approximate differential privacy (Definition 1) satisfies Axioms 1, 2, 3, and 4.
Lastly, we show that removal of any one of our axioms would allow for measures of privacy that do not intuitively align with any reasonable notion of privacy. The reader can find a brief overview of what those nonsensical privacy measures are in Section 3.3. For an in-depth discussion we refer the reader to the full version of this work [6]. We do have the following simple implication of those results.
Theorem 4 (Minimality of our axioms).
Organization of Paper.
In Section 2, we present our framework and our axiomatic formulation of privacy; in Section 3, we give high level overviews of the proofs of our main theorems, and in Section 4 we discuss related work and several open problems raised by our framework. The formal proofs of Theorems 2, 3, and 4 can be found in the full version of this work [6].
2 Our Framework
All of our equivalences will hold with respect to algorithms that solve statistical tasks.
Definition 5 (Statistical task, [16, 7]).
A statistical task is defined by a set of distributions over data domain , an output space and a mapping from distributions to valid responses . An algorithm solves with failure probability if, for all ,
Statistical tasks capture essentially any setting where the algorithm is learning from i.i.d. data. We note that in many such tasks, there is an error parameter . This parameter is implicit in Definition 5 as we can define to only consist of outputs that are “-good.” For example, if we aim to capture realizable PAC learning of a concept class to error , then would consist of all distributions over labeled pairs where for some single with probability . The valid responses would be any hypothesis satisfying . Our notion of equivalence will be agnostic to the particular statistical task an algorithm wishes to solve, and hence, automatically applies to all goals and error parameters.
Definition 6 (Equivalent algorithm).
We say an algorithm is -equivalent to if, any statistical task that solves with failure probability , solves with failure probability .
2.1 Privacy measures and our axioms
To formalize Theorems 2, 3, and 4 we define a series of axioms that any reasonable and useful privacy measure should satisfy.
Definition 7 (Privacy measure).
A privacy measure is a mapping from (possibly randomized) algorithms to their level of privacy, parametrized as a number on . We adopt the convention that a lower values for indicate that is more private. It will often be useful to succinctly say that is -private if .
We remark upon a few basic properties about Definition 7. First, as is typical of previous definitions of privacy, a single privacy measure must provide privacy levels for algorithms taking in samples of all sizes . Later, our scaling axiom (Axiom 4) will enforce some amount of consistency between how behaves on different sample sizes.
Second, Definition 7 is a single parameter definition of privacy, in contrast to the two-parameters of DP (Definition 1). This single parameter was a deliberate choice. A guiding philosophy in the development of our axioms was to not directly enforce specific meaning to the privacy value , as we did not want our axioms to be biased by the meaning of and in DP. If we had a two (or more) parameter definition of privacy, we would need our axioms to somehow encode the distinction between those parameters, contradicting that guiding philosophy.
Furthermore, despite DP having two parameters, they are not of equal importance. Typical applications of DP simply set small enough to ignore and focus on . Indeed, following the intuition that one only needs “small enough,” we show in Section 3.2 how to collapse Definition 1 into a single parameter in a way that respects all of our axioms.
2.1.1 Axioms any reasonable definition of privacy should satisfy
We now proceed to define our axioms, beginning with those that any “reasonable” definition of privacy should satisfy. The first axioms encodes some basic operations that should maintain privacy.
Axiom 1 (Preprocessing maintains privacy).
We say a privacy measure satisfies the preprocessing axiom if the following is true.
-
1.
Reordering the input maintains privacy: For any algorithm and permutation , defining
we have that .
-
2.
Remapping the domain maintains privacy: For any mapping and algorithm , defining
we have that .
The first criteria, that reordering the input maintains privacy, says that under it is equally bad to leak information about the point and point for any . The second criteria similarly says that it is equally bad to leak information about users and for any .
While both of these criteria are intuitively reasonable, we also provide more formal justification for their inclusion as axioms. In the full version, we show show that removing any one of our four axioms would allow for ill-behaved privacy measures, illustrating why these axioms are necessary (see Section 3.3 for a briefer overview). Since Axiom 1 has two criteria, we will furthermore show that removing either of them would similarly result in ill-behaved privacy measures, helping to justify why both are necessary.
Our second axiom requires private mechanisms to not reveal (essentially) the entire dataset. This is the only axiom that directly enforces that measures some notion of privacy.
Definition 8 (Blatantly non-private).
A mechanism is blatantly non-private if there is a “high-entropy” distribution (formally for all ) and adversary mapping mechanism outputs to datasets for which333This constant of could be replaced with any .
The “high-entropy” requirement of Definition 8 is designed to ensure the adversary’s task is not too easy. In particular, it means that if the adversary were not able to see the ’s output, it would not even be able to guess a single point in . This stands in sharp contrast to the adversary being able to guess nearly all of upon seeing ’s output.
Axiom 2 (Prohibits blatant non-privacy).
We say a privacy measure satisfies the prohibits blatant non-privacy axiom if any -private algorithm is not blatantly non-private.
2.1.2 Strong-composition axioms
While the first two axioms were meant to be minimal requirements of any privacy definition to capture some reasonable notion of privacy, our next two axioms together formalize the notion of strong composition. As discussed earlier, the fact that the privacy costs of Definition 1 scale sublinearly with composition is crucial to the widespread adoption of differential privacy. Our next axiom encodes that the composition of many algorithms each of which have privacy level results in an algorithm with privacy level . We will state this in the minimal form we need: In particular, we only need that the composed algorithm is -private whenever .
Axiom 3 (Strong composition).
For , we say a privacy measure satisfies -strong composition if for any algorithms all satisfying and
if , then the composed algorithm that takes in a sample and outputs the responses is -private.
Interestingly, we are able to define Axiom 3 to be qualitatively weaker than the strong composition DP satisfies. DP satisfies adaptive strong composition, where the choice of may depend adaptively on the outputs of . In contrast, Axiom 3 only requires strong composition to hold when are fixed in advance. Yet, we are still able to show that our axioms imply DP. This shows, in some sense, that non-adaptive strong composition is enough to derive adaptive strong composition.
Axiom 3 on its own is not enough to enforce any reasonable notion of strong composition because it does not enforce any notion of scaling. For example, suppose we had some privacy measure that only satisfied linear composition444As in the case for the average-case variants of DP defined in [17, 28, 22] (Axiom 3 with ). Then, we could simply define a new privacy measure as . This new measure would satisfy Axiom 3 with . Our last axiom rectifies this.
Axiom 4 (Linear scalability).
We say a privacy measure satisfies linear scaling, if for some polynomial , any -private algorithm , any failure probability , and any large enough , there exists a -equivalent algorithm taking in samples that satisfies .
Roughly speaking, linear scalability says that the privacy level can be improved by a factor of by increasing the sample size by a factor of . For example, one common way to amplify privacy is subsampling, meaning is the randomized algorithm which runs on a uniform size- subsample of its size- input dataset. Indeed, for Definition 1, subsampling an -DP algorithm by a factor of leads to an -DP algorithm, though we will need a slightly more complicated amplification algorithm after we collapse and to a single parameter (see Lemma 7.1 of the full version [6]).
Axioms 3 and 4 are best viewed as together enforcing the following notion of strong composition. If the goal is to do a sequence of operations that each require a sample of size to perform privately, then only need a single sample size of . That is, we require some non-trivial improvement over a strategy that, for example, uses separate samples for each of the operations. We prefer this definition of strong composition in terms of the sample size required for many operations over explicit definitions that enforce a particular meaning to the value of in lieu of Axiom 4.
3 Technical Overview
3.1 Overview of Theorem 2: Our axioms imply DP
Given any privacy measure satisfying our axioms and -private algorithm , we wish to construct an equivalent that is -DP. To do so, we use the following intermediate notion of stability.
Definition 9 (TV-Stability, also called TV-indistinguishability by [20])).
The TV-stability of an algorithm under distribution is defined as
We simply say is -TV-stable if for all distributions over .
This definition is useful because (slight modifications) of the results of [7] allow us to convert any TV-stable algorithm into an equivalent DP algorithm (see Lemma 6.1 of the full version [6] for a formal statement of that conversion). Most of our effort goes into converting a -private algorithm into a TV-stable algorithm.
Theorem 10 (Our privacy axioms imply TV-stability).
To prove Theorem 10, we show, roughly speaking, that for any non-TV-stable algorithm , there exists algorithms for satisfying,
-
1.
Each can be formed by preprocessing , and therefore, by the Axiom 1 (preprocessing), should have the same privacy.
-
2.
The composed algorithm that takes as input and outputs the tuple
is blatantly non-private.
By Axiom 2 (prohibition of blatant non-privacy), we can conclude that is not -private. Then, Axiom 3 (strong composition) says that at least one must satisfy . By Axiom 1 (preprocessing) this in fact means that .
By contrapositive, this allows us to prove something just short of our goal: Any satisfying sufficiently strong privacy, , then itself must be TV-stable.555We note that there are some caveats to this statement: Briefly, it only holds for symmetric algorithms, those whose output does not depend on the order of its input, and assumes the domain is not too small. Both details are handled in the body. In contrast, Theorem 10 only assume that is -private. Here, we can exploit linear scalability: Using Axiom 4, we can convert any that is -private to an satisfying with only a polynomial increase in the sample size. This is the step where we crucially utilize the combined power of linear scalability and strong composition: Ultimately, we want to convert any -stable algorithm using samples into one using samples with the additional property that it can be composed times and still be -stable. Axioms 4 and 3 together allow us to do this.
3.1.1 Exploiting TV-unstable algorithms
The key step in proving Theorem 10 is to show that if we compose many preprocessed copies of a non-TV-stable algorithm , we will obtain a blatantly non-private algorithm. To prove this, we show a single random preprocessing reveals much information about the sample. It will be most convenient to state this lemma in terms of algorithms that take as input an unordered size- set as input, and we will use to denote all such sets.
Lemma 11 (Key lemma, uniform permutations distinguish far samples).
For any where , define
| (1) |
Then, for any and a uniform permutation,
where is the number of points and differ on.
Since we start with a that is not TV-stable, the quantity in Equation 1 is promised to be somewhat large. Lemma 11 says that, if we draw just one , the algorithm provides roughly “ bit” of useful information in distinguishing any and that are somewhat far, satisfying . Since the number of possible datasets is , it is possible to determine a dataset close to by observing for . We furthermore show in the body of the paper how to reduce to the case where , in which case suffices.
The key step in proving Lemma 11 is constructing the following random walk.
Lemma 12 (Random walk to disjoint samples).
For any , setting and , there exists random variables with the following properties:
-
1.
For any the marginal distribution of is equal to the distribution of when is a uniform permutation.
-
2.
The marginal distribution of is equal to the distribution of conditioned on .
The intuition behind Lemma 12 is that can be formed by “rerandomizing” exactly many of the elements in . As long as we have at least steps, we can ensure all elements get rerandomized. The actual proof of Lemma 12 is a bit precise. In particular we need to use a non-Markovian walk (in that the distribution of is not independent of conditioned on ) for the following reasons:
-
1.
In order to ensure all elements get rerandomized, the steps of the random walk cannot be independent. Instead, we enforce that the elements rerandomized in each step are different, while still ensuring that all the pairwise marginal have the right distribution.
-
2.
When is not exactly an integer some elements will be rerandomized twice. In this case, we need to ensure that no element accidentally gets rerandomized back into an element appearing in as that would cause .
Nonetheless, we show with a careful construction that Lemma 12 holds.
3.2 Overview of Theorem 3: DP satisfies the axioms
Since Definition 1 has two parameters, and , we must collapse them to one parameter for our framework. We do this by defining,
| (2) |
There is just one of many ways to collapse and into a single parameter in a way that respects our axioms. The exponents and could be replaced with and for any . Furthermore, the factor could be replaced with any for . With this privacy measure, we can state the formal version of Theorem 3. We first state the formal version of Theorem 3.
Theorem 2 (DP implies our axioms, formal version).
The reason we want to be much smaller than is because that’s the regime in which differential privacy satisfies strong composition. The following well-known theorem shows that DP has strong composition.
Theorem (DP-Strong-Composition, Theorem 3.20 in [13]).
For all , if are -differentially private, then the composed algorithm is -differentially private where and
Given that we have forced to be small, we cannot simply use subsampling [4] to ensure that satisfies linear scalability, as subsampling’s effect on is too mild. Instead, we use (a small modification of) a recent result of [7] to prove satisfies linear scalability. See Lemma 7.1 of the full version [6] for this amplification procedure and the surrounding discussion for comparison to [7]’s result. Given the well-known strong-composition theorem for DP and this amplification procedure, showing that satisfies all our axioms is straightforward.
3.3 Overview of Theorem 4: Necessity of our axioms
Here, we explain why all four of our axioms are necessary. For each axiom, we exhibit ill-behaved notions of privacy that would be allowed if we removed the axiom. In the case of Axiom 1, we even show this is true if only one of the two parts of it are removed, and in the case of Axiom 3, we will show it is true even if we replace strong composition with linear composition (i.e. setting ). The proof of Theorem 4, given in the full version, will build on these ill-behaved privacy measures by showing that they allow algorithms solving statistical tasks that no differentially private algorithm can solve..
If we remove just the first part of Axiom 1, that reordering the input maintains privacy, then there is a privacy measure satisfying the remaining axioms, that we call , which deems the algorithm that outputs the first half of its dataset perfectly private, satisfying .
If we remove the second half of Axiom 1, that remapping the domain maintains privacy, then there is a privacy measure satisfying the remaining axioms, that we call , that deems the following algorithm perfectly private: Let be the algorithm with the following behavior:
Essentially, is allowed to leak the entire dataset if there is any element appearing frequently enough. Despite this leakage, , indicating that should have “perfect” privacy. We further observe that still satisfies that permuting the domain maintains privacy. This shows that we could not have replaced the arbitrary mappings in Axiom 1 with arbitrary permutations without allowing this ill-behaved notion of privacy.
If we remove Axiom 2 (prohibition of blatant non-privacy), then a privacy measure that deems all algorithms perfectly private, i.e. for all , satisfies the remaining axioms.
If we relax strong composition to linear composition, i.e. allow in Axiom 3, then there is a privacy measure, that we call with the following behavior: The algorithm which outputs the first points in its dataset satisfies . For example, an algorithm which outputs the first half of its dataset is still -private.
If we remove Axiom 4 (linear scaling), then there is a rescaling, , of the above privacy measure that satisfies the remaining axioms. The algorithm which outputs the first points in its dataset is satisfies . This still has essentially the same consequences as if we weakened Axiom 3. For example, we still have that the algorithm which outputs the first half of its dataset is -private.
4 Discussion and Related Work
Computational efficiency.
In Theorem 2, we guarantee that any sample efficient -private algorithm can be transformed into an equivalent DP algorithm with approximately the same sample complexity. While our transformation is constructive, it does not necessarily preserve computational efficiency. Part of the reason is that Axiom 4 does not require the scaling to preserve computational efficiency, and we utilize a scaled version of to construct . This choice to allow for non-computationally efficient amplification is crucial to Theorem 3 as we utilize the following (computationally inefficient) procedure to prove that DP fits our axioms:
Theorem (DP-Amplification, [7]).
For any -DP algorithm , there exists an equivalent -DP algorithm using samples.
We remark that there is a computationally efficient way to amplify an -DP algorithm to -DP at the cost of a increase in the sample size, via subsampling [4]. While subsampling’s linear amplification of is as good as DP-Amplification, the linear amplification of is not sufficient for our purposes, and so we need to utilize the computationally inefficient amplification of DP-Amplification.
As far as we are aware, despite it being of independent interest, it is unknown whether a computationally efficient analogue of DP-Amplification exists. More broadly, we leave open the possibility that it is possible to obtain a computationally efficient analogue of our results, possibly by adjusting the axioms appropriately.
Other formalizations of differentially privacy.
We focused on the well-studied -DP formulation of Definition 1 (often called approximate DP). One popular alternative, pure DP, is equivalent to Definition 1 where is fixed to be . We did not focus on pure-DP because it does not satisfy strong composition, which makes it more difficult to utilize in practice and also that it does not fit our axioms. That said, it would be interesting to come up with an alternative set of axioms that characterize pure DP in the same sense as our axioms characterize approximate DP. One tempting solution is to simply remove our strong composition axiom (Axiom 3). However, as we show, removing Axiom 3 allows for a degenerate privacy measure which is much weaker than pure DP, so a different approach is needed (see Section 8 in the full version [6]).
A second popular generalization of approximate DP follows from the simple observation that algorithms are not -DP for a single fixed choice of and . Rather, for any algorithm , there is an entire “curve” for which is -DP for all choices of . There are a variety of formulations of DP that bound the behavior of this curve (e.g. through bounding appropriately defined “moments”) such as Réyni DP and concentrated DP [23, 14, 8]. These variations are popular precisely because they allow for easy (and often strong) composition, and in appropriate parameter regimes, also are amplified by subsampling. We refer the reader to [25] for an excellent overview.
Given this, it’s natural to expect these variants would play nicely with our axioms. Indeed, we show in that the privacy measure that assigns to the smallest privacy value s.t. is -Réyni DP respects all our axioms with an even more straightforward analysis than the proof of Theorem 3 (For more details we refer the reader to appendix of the full version). In the other direction, we show a variant of Theorem 2, that our axioms imply Réyni DP. One distinction between that statement (This corresponds to Theorem 7 in the full version) and Theorem 2 is that the Réyni DP algorithm has a sample size that depends on , which we also show is necessary in the appendix of the full version.
Related Work.
Perhaps most in the spirit of our results is recent work on reproducibility [18], and in particular the followup paper of Bun et al. [7] (see also [20]). That work examines the broader context of algorithmic stability, which are various ways of formalizing that an algorithms output does not depend too much on its input. They show that some of these measures of stability, replicability, max-information, and perfect generalization, are equivalent to differential privacy using the same formalization of equivalence as us. Measures of algorithmic stability and privacy share many of the same basic properties. In some sense, the only distinction between algorithmic stability and privacy is simply that measures of algorithmic stability were designed for applications other than privacy. Indeed, one could just as easily view our axioms as desirable properties for any measure of algorithmic stability. From this perspective, our work is a natural evolution of [7] as we show all measures of stability satisfying our axioms are equivalent to privacy. We also utilize some of their techniques to prove our results.
More broadly, there have been several works formalizing axioms that any “reasonable” definition of privacy should satisfy. Often this includes an axiom or assumption that privacy should be some measure of distance between the distributions and for worst-case and (as in Definition 1). This includes [21, 26], which both investigate what measures of distance satisfy other reasonable axioms. Also in this spirit is the central limit theorem of [10]. Roughly speaking, it says that if we consider only privacy definitions based on some distance between and , in the limit of many compositions, we may as well define “Gaussian differential privacy.” The key distinction between all of these works and ours is that we aim to justify why the most successful privacy definitions are measures of distance between and for worst-case and , whereas previous works take that as an assumption or axiom.
References
- [1] Differential privacy: Technical overview. Apple Inc. White paper; documents Apple’s local DP deployment and budgets. Accessed Aug 19, 2025. URL: https://www.apple.com/privacy/docs/Differential_Privacy_Overview.pdf.
- [2] Martin Abadi, Andy Chu, Ian Goodfellow, H Brendan McMahan, Ilya Mironov, Kunal Talwar, and Li Zhang. Deep learning with differential privacy. In Proceedings of the 2016 ACM SIGSAC conference on computer and communications security, pages 308–318, 2016. doi:10.1145/2976749.2978318.
- [3] John M Abowd, Robert Ashmead, Ryan Cumings-Menon, Simson Garfinkel, Micah Heineck, Christine Heiss, Robert Johns, Daniel Kifer, Philip Leclerc, Ashwin Machanavajjhala, et al. The 2020 census disclosure avoidance system topdown algorithm. Harvard Data Science Review, 2, 2022.
- [4] Borja Balle, Gilles Barthe, and Marco Gaboardi. Privacy amplification by subsampling: Tight analyses via couplings and divergences. Advances in neural information processing systems, 31, 2018.
- [5] Raef Bassily and Yoav Freund. Typical stability. arXiv preprint arXiv:1604.03336, 2016.
- [6] Guy Blanc, William Pires, and Toniann Pitassi. Differential privacy from axioms, 2025. arXiv:2511.21876.
- [7] Mark Bun, Marco Gaboardi, Max Hopkins, Russell Impagliazzo, Rex Lei, Toniann Pitassi, Satchit Sivakumar, and Jessica Sorrell. Stability is stable: Connections between replicability, privacy, and adaptive generalization. In Proceedings of the 55th Annual ACM Symposium on Theory of Computing, STOC 2023, pages 520–527, New York, NY, USA, 2023. Association for Computing Machinery. doi:10.1145/3564246.3585246.
- [8] Mark Bun and Thomas Steinke. Concentrated differential privacy: Simplifications, extensions, and lower bounds. In Theory of cryptography conference, pages 635–658. Springer, 2016. doi:10.1007/978-3-662-53641-4_24.
- [9] Bolin Ding, Janardhan Kulkarni, and Sergey Yekhanin. Collecting telemetry data privately. Advances in Neural Information Processing Systems, 30, 2017.
- [10] Jinshuo Dong, Aaron Roth, and Weijie J Su. Gaussian differential privacy. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 84(1):3–37, 2022.
- [11] Cynthia Dwork, Krishnaram Kenthapadi, Frank McSherry, Ilya Mironov, and Moni Naor. Our data, ourselves: Privacy via distributed noise generation. In Annual international conference on the theory and applications of cryptographic techniques, pages 486–503. Springer, 2006. doi:10.1007/11761679_29.
- [12] Cynthia Dwork, Frank McSherry, Kobbi Nissim, and Adam Smith. Calibrating noise to sensitivity in private data analysis. In Theory of cryptography conference, pages 265–284. Springer, 2006. doi:10.1007/11681878_14.
- [13] Cynthia Dwork and Aaron Roth. The algorithmic foundations of differential privacy. Found. Trends Theor. Comput. Sci., 9(3–4):211–407, 2014. doi:10.1561/0400000042.
- [14] Cynthia Dwork and Guy N Rothblum. Concentrated differential privacy. arXiv preprint arXiv:1603.01887, 2016. arXiv:1603.01887.
- [15] Cynthia Dwork, Guy N Rothblum, and Salil Vadhan. Boosting and differential privacy. In 2010 IEEE 51st annual symposium on foundations of computer science, pages 51–60. IEEE, 2010. doi:10.1109/FOCS.2010.12.
- [16] Vitaly Feldman. A general characterization of the statistical query complexity. In Conference on learning theory, pages 785–830. PMLR, 2017. URL: http://proceedings.mlr.press/v65/feldman17c.html.
- [17] Robert Hall, Larry Wasserman, and Alessandro Rinaldo. Random differential privacy. Journal of Privacy and Confidentiality, 4(2), 2013. doi:10.29012/JPC.V4I2.621.
- [18] Russell Impagliazzo, Rex Lei, Toniann Pitassi, and Jessica Sorrell. Reproducibility in learning. In Proceedings of the 54th annual ACM SIGACT symposium on theory of computing, pages 818–831, 2022. doi:10.1145/3519935.3519973.
- [19] Peter Kairouz, Sewoong Oh, and Pramod Viswanath. The composition theorem for differential privacy. In International conference on machine learning, pages 1376–1385. PMLR, 2015. URL: http://proceedings.mlr.press/v37/kairouz15.html.
- [20] Alkis Kalavasis, Amin Karbasi, Shay Moran, and Grigoris Velegkas. Statistical indistinguishability of learning algorithms. In International Conference on Machine Learning, pages 15586–15622. PMLR, 2023. URL: https://proceedings.mlr.press/v202/kalavasis23a.html.
- [21] Daniel Kifer and Bing-Rong Lin. Towards an axiomatization of statistical privacy and utility. In Proceedings of the twenty-ninth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, pages 147–158, 2010. doi:10.1145/1807085.1807106.
- [22] Ao Liu, Yu-Xiang Wang, and Lirong Xia. Smoothed differential privacy. Transactions on Machine Learning Research, 2023.
- [23] Ilya Mironov. Rényi differential privacy. In 2017 IEEE 30th computer security foundations symposium (CSF), pages 263–275. IEEE, 2017. doi:10.1109/CSF.2017.11.
- [24] Joseph P Near, Xi He, et al. Differential privacy for databases. Foundations and Trends® in Databases, 11(2):109–225, 2021. doi:10.1561/1900000066.
- [25] Thomas Steinke. Composition of differential privacy & privacy amplification by subsampling. arXiv preprint arXiv:2210.00597, 2022. doi:10.48550/arXiv.2210.00597.
- [26] Weijie J Su. A statistical viewpoint on differential privacy: Hypothesis testing, representation, and blackwell’s theorem. Annual Review of Statistics and Its Application, 12, 2024.
- [27] Salil Vadhan. The complexity of differential privacy. In Tutorials on the Foundations of Cryptography: Dedicated to Oded Goldreich, pages 347–450. Springer, 2017. doi:10.1007/978-3-319-57048-8_7.
- [28] Yu-Xiang Wang, Jing Lei, and Stephen E Fienberg. On-average kl-privacy and its equivalence to generalization for max-entropy mechanisms. In International Conference on Privacy in Statistical Databases, pages 121–134. Springer, 2016. doi:10.1007/978-3-319-45381-1_10.
- [29] Zheng Xu, Yanxiang Zhang, Galen Andrew, Christopher A Choquette-Choo, Peter Kairouz, H Brendan McMahan, Jesse Rosenstock, and Yuanbo Zhang. Federated learning of gboard language models with differential privacy. arXiv preprint arXiv:2305.18465, 2023. doi:10.48550/arXiv.2305.18465.
