General Gaussian Noise Mechanisms and Their Optimality for Unbiased Mean Estimation

Authors: Aleksandar Nikolov, Haohua Tang



Author Details

Aleksandar Nikolov
  • University of Toronto, Canada
Haohua Tang
  • University of Toronto, Canada

Cite As

Aleksandar Nikolov and Haohua Tang. General Gaussian Noise Mechanisms and Their Optimality for Unbiased Mean Estimation. In 15th Innovations in Theoretical Computer Science Conference (ITCS 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 287, pp. 85:1-85:23, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)
https://doi.org/10.4230/LIPIcs.ITCS.2024.85

Abstract

We investigate unbiased high-dimensional mean estimators in differential privacy. We consider differentially private mechanisms whose expected output equals the mean of the input dataset, for every dataset drawn from a fixed bounded domain K in ℝ^d. A classical approach to private mean estimation is to compute the true mean and add unbiased, but possibly correlated, Gaussian noise to it. In the first part of this paper, we study the optimal error achievable by a Gaussian noise mechanism for a given domain K, when the error is measured in the ℓ_p norm for some p ≥ 2. We give algorithms that compute the optimal covariance for the Gaussian noise for a given K under suitable assumptions, and prove a number of geometric properties of the optimal error. These results generalize the theory of factorization mechanisms from domains K that are symmetric and finite (or, equivalently, symmetric polytopes) to arbitrary bounded domains. In the second part of the paper, we show that Gaussian noise mechanisms achieve nearly optimal error among all private unbiased mean estimation mechanisms in a very strong sense. In particular, for every input dataset, an unbiased mean estimator satisfying concentrated differential privacy introduces approximately at least as much error as the best Gaussian noise mechanism. We extend this result to local differential privacy and to approximate differential privacy; for the latter, the error lower bound holds either for a dataset or for a neighboring dataset, and this relaxation is necessary.
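
To make the "classical approach" from the abstract concrete, the following is a minimal Python sketch of a Gaussian noise mechanism with a correlated covariance. It is an illustration under stated assumptions, not the paper's algorithm: it assumes K is finite (or is given by the vertices of a polytope), that neighboring datasets differ by replacing one of n points, and that privacy is measured in zero-concentrated differential privacy with parameter rho. The function name `gaussian_noise_mean`, the parameter `shape_cov`, and the rescaling rule are hypothetical; in particular, the sketch only rescales a user-supplied covariance shape and does not compute the optimal covariance studied in the paper.

```python
import numpy as np

def gaussian_noise_mean(data, domain_points, shape_cov, rho, rng):
    """Return (mean(data) + N(0, c * shape_cov), c * shape_cov).

    Assumptions (not from the paper): domain_points lists the points
    (or vertices) of K, neighbors differ by replacing one data point,
    and c is the smallest scale giving rho-zCDP for this shape.
    """
    n, d = data.shape
    # All pairwise differences x - x' between domain points.
    diffs = (domain_points[:, None, :] - domain_points[None, :, :]).reshape(-1, d)
    # Squared Mahalanobis lengths (x - x')^T shape_cov^{-1} (x - x').
    solved = np.linalg.solve(shape_cov, diffs.T).T
    width_sq = np.max(np.sum(diffs * solved, axis=1))
    # Replacing one point moves the mean by (x - x') / n, so rho-zCDP
    # needs width_sq / (c * n^2) <= 2 * rho.
    c = width_sq / (2.0 * rho * n**2)
    cov = c * shape_cov
    # Zero-mean noise keeps the estimator unbiased.
    noise = rng.multivariate_normal(np.zeros(d), cov)
    return data.mean(axis=0) + noise, cov

# Usage: K is the set of corners of the square [-1, 1]^2.
rng = np.random.default_rng(0)
K = np.array([[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]])
data = K[rng.integers(0, len(K), size=10)]
estimate, cov = gaussian_noise_mean(data, K, np.eye(2), rho=0.5, rng=rng)
```

Because the noise has mean zero regardless of the covariance, the output is unbiased for every dataset; the choice of `shape_cov` only affects the error, which is the quantity the first part of the paper optimizes.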

Subject Classification

ACM Subject Classification
  • Theory of computation → Theory and algorithms for application domains
Keywords
  • differential privacy
  • mean estimation
  • unbiased estimator
  • instance optimality
