On Distributed Differential Privacy and Counting Distinct Elements

Chen, Lijie; Ghazi, Badih; Kumar, Ravi; Manurangsi, Pasin

doi:10.4230/LIPIcs.ITCS.2021.56

Abstract

We study the setup where each of n users holds an element from a discrete set, and the goal is to count the number of distinct elements across all users, under the constraint of (ε,δ)-differentially privacy:  
- In the non-interactive local setting, we prove that the additive error of any protocol is Ω(n) for any constant ε and for any δ inverse polynomial in n. 
- In the single-message shuffle setting, we prove a lower bound of Ω̃(n) on the error for any constant ε and for some δ inverse quasi-polynomial in n. We do so by building on the moment-matching method from the literature on distribution estimation. 
- In the multi-message shuffle setting, we give a protocol with at most one message per user in expectation and with an error of Õ(√n) for any constant ε and for any δ inverse polynomial in n. Our protocol is also robustly shuffle private, and our error of √n matches a known lower bound for such protocols.  Our proof technique relies on a new notion, that we call dominated protocols, and which can also be used to obtain the first non-trivial lower bounds against multi-message shuffle protocols for the well-studied problems of selection and learning parity.
Our first lower bound for estimating the number of distinct elements provides the first ω(√n) separation between global sensitivity and error in local differential privacy, thus answering an open question of Vadhan (2017). We also provide a simple construction that gives Ω̃(n) separation between global sensitivity and error in two-party differential privacy, thereby answering an open question of McGregor et al. (2011).

John M Abowd. The US Census Bureau adopts differential privacy. In KDD, pages 2867-2867, 2018.
Kareem Amin, Matthew Joseph, and Jieming Mao. Pan-private uniformity testing. In COLT, pages 183-218, 2020.
Apple Differential Privacy Team. Learning with privacy at scale. Apple Machine Learning Journal, 2017.
Victor Balcer and Albert Cheu. Separating local & shuffled differential privacy via histograms. In ITC, pages 1:1-1:14, 2020.
Victor Balcer, Albert Cheu, Matthew Joseph, and Jieming Mao. Connecting robust shuffle privacy and pan-privacy. In SODA, 2021. URL: http://arxiv.org/abs/2004.09481.
Borja Balle, James Bell, Adrià Gascón, and Kobbi Nissim. The privacy blanket of the shuffle model. In CRYPTO, pages 638-667, 2019.
Amos Beimel, Iftach Haitner, Kobbi Nissim, and Uri Stemmer. On the round complexity of the shuffle model. In TCC, 2020.
Andrea Bittau, Úlfar Erlingsson, Petros Maniatis, Ilya Mironov, Ananth Raghunathan, David Lie, Mitch Rudominer, Ushasree Kode, Julien Tinnes, and Bernhard Seefeld. Prochlo: Strong privacy for analytics in the crowd. In SOSP, pages 441-459, 2017.
Joshua Brody, Amit Chakrabarti, Ranganath Kondapally, David P Woodruff, and Grigory Yaroslavtsev. Beyond set disjointness: the communication complexity of finding the intersection. In PODC, pages 106-113, 2014.
Mark Bun, Jelani Nelson, and Uri Stemmer. Heavy hitters and the structure of local privacy. TALG, 15(4):1-40, 2019.
TH Hubert Chan, Elaine Shi, and Dawn Song. Optimal lower bound for differentially private multi-party aggregation. In ESA, pages 277-288, 2012.
Albert Cheu, Adam D. Smith, Jonathan Ullman, David Zeber, and Maxim Zhilyaev. Distributed differential privacy via shuffling. In EUROCRYPT, pages 375-403, 2019. URL: http://arxiv.org/abs/1808.01394.
Albert Cheu and Jonathan Ullman. The limits of pan privacy and shuffle privacy for learning and estimation. arXiv, 2020. URL: http://arxiv.org/abs/2009.08000.
Seung Geol Choi, Dana Dachman-Soled, Mukul Kulkarni, and Arkady Yerukhimovich. Differentially-private multi-party sketching for large-scale statistics. PoPETs, 3:153-174, 2020.
Damien Desfontaines, Andreas Lochbihler, and David Basin. Cardinality estimators do not preserve privacy. PoPETs, 2019(2):26-46, 2019.
Bolin Ding, Janardhan Kulkarni, and Sergey Yekhanin. Collecting telemetry data privately. In NIPS, pages 3571-3580, 2017.
John C Duchi, Michael I Jordan, and Martin J Wainwright. Local privacy and statistical minimax rates. In FOCS, pages 429-438, 2013.
John C. Duchi and Feng Ruan. The right complexity measure in locally private estimation: It is not the Fisher information. arXiv, 2018. URL: http://arxiv.org/abs/1806.05756.
Cynthia Dwork, Krishnaram Kenthapadi, Frank McSherry, Ilya Mironov, and Moni Naor. Our data, ourselves: Privacy via distributed noise generation. In EUROCRYPT, pages 486-503, 2006. URL: https://doi.org/10.1007/11761679_29.
Cynthia Dwork, Frank McSherry, Kobbi Nissim, and Adam D. Smith. Calibrating noise to sensitivity in private data analysis. In TCC, pages 265-284, 2006. URL: https://doi.org/10.1007/11681878_14.
Cynthia Dwork, Moni Naor, Toniann Pitassi, Guy N Rothblum, and Sergey Yekhanin. Pan-private streaming algorithms. In ICS, pages 66-80, 2010.
Cynthia Dwork and Aaron Roth. The algorithmic foundations of differential privacy. Found. Trends Theor. Comput. Sci., 9(3-4):211-407, 2014. URL: https://doi.org/10.1561/0400000042.
Alexander Edmonds, Aleksandar Nikolov, and Jonathan Ullman. The power of factorization mechanisms in local and central differential privacy. In STOC, pages 425-438, 2020. URL: https://doi.org/10.1145/3357713.3384297.
Úlfar Erlingsson, Vitaly Feldman, Ilya Mironov, Ananth Raghunathan, Kunal Talwar, and Abhradeep Thakurta. Amplification by shuffling: From local to central differential privacy via anonymity. In SODA, pages 2468-2479, 2019.
Úlfar Erlingsson, Vasyl Pihur, and Aleksandra Korolova. RAPPOR: Randomized aggregatable privacy-preserving ordinal response. In CCS, pages 1054-1067, 2014.
Badih Ghazi, Noah Golowich, Ravi Kumar, Pasin Manurangsi, Rasmus Pagh, and Ameya Velingker. Pure differentially private summation from anonymous messages. In ITC, pages 15:1-15:23, 2020.
Badih Ghazi, Noah Golowich, Ravi Kumar, Rasmus Pagh, and Ameya Velingker. On the power of multiple anonymous messages. IACR Cryptol. ePrint Arch., 2019:1382, 2019. URL: https://eprint.iacr.org/2019/1382.
Andy Greenberg. Apple’s "differential privacy" is about collecting your data - but not your data. Wired, June, 13, 2016.
Yuval Ishai, Eyal Kushilevitz, Rafail Ostrovsky, and Amit Sahai. Cryptography from anonymity. In FOCS, pages 239-248, 2006.
Daniel M Kane, Jelani Nelson, and David P Woodruff. An optimal algorithm for the distinct elements problem. In PODS, pages 41-52, 2010.
Shiva Prasad Kasiviswanathan, Homin K Lee, Kobbi Nissim, Sofya Raskhodnikova, and Adam Smith. What can we learn privately? SICOMP, 40(3):793-826, 2011.
Michael Kearns. Efficient noise-tolerant learning from statistical queries. JACM, 45(6):983-1006, 1998.
Andrew McGregor, Ilya Mironov, Toniann Pitassi, Omer Reingold, Kunal Talwar, and Salil Vadhan. The limits of two-party differential privacy. In FOCS, pages 81-90, 2010.
Andrew McGregor, Ilya Mironov, Toniann Pitassi, Omer Reingold, Kunal Talwar, and Salil P. Vadhan. The limits of two-party differential privacy. ECCC, 18:106, 2011. URL: http://eccc.hpi-web.de/report/2011/106.
Frank McSherry and Kunal Talwar. Mechanism design via differential privacy. In FOCS, pages 94-103, 2007.
Darakhshan Mir, Shan Muthukrishnan, Aleksandar Nikolov, and Rebecca N Wright. Pan-private algorithms via statistics on sketches. In PODS, pages 37-48, 2011.
Ryan O'Donnell. Analysis of Boolean Functions. Cambridge University Press, 2014. URL: http://www.cambridge.org/de/academic/subjects/computer-science/algorithmics-complexity-computer-algebra-and-computational-g/analysis-boolean-functions.
Rasmus Pagh and Nina Mesing Stausholm. Efficient differentially private f₀ linear sketching. arXiv, 2020. URL: http://arxiv.org/abs/2001.11932.
Stephen Shankland. How Google tricks itself to protect Chrome user privacy. CNET, October, 2014.
Thomas Steinke and Jonathan Ullman. Tight lower bounds for differentially private selection. In FOCS, pages 552-563, 2017.
Jonathan Ullman. Tight lower bounds for locally differentially private selection. arXiv, 2018. URL: http://arxiv.org/abs/1802.02638.
Salil Vadhan. The complexity of differential privacy. In Tutorials on the Foundations of Cryptography, pages 347-450. Springer, 2017.
Gregory Valiant and Paul Valiant. Estimating the unseen: Improved estimators for entropy and other properties. JACM, 64(6):37:1-37:41, 2017. URL: https://doi.org/10.1145/3125643.
Stanley L Warner. Randomized response: A survey technique for eliminating evasive answer bias. JASA, 60(309):63-69, 1965.
Yihong Wu and Pengkun Yang. Chebyshev polynomials, moment matching, and optimal estimation of the unseen. The Annals of Statistics, 47(2):857-883, 2019.

On Distributed Differential Privacy and Counting Distinct Elements

Authors Lijie Chen, Badih Ghazi, Ravi Kumar, Pasin Manurangsi

File

Document Identifiers

Subject Classification

ACM Subject Classification

Keywords

Metrics

Abstract

Cite As Get BibTex

Author Details

Acknowledgements

References

Thanks for your feedback!

Could not send message

On Distributed Differential Privacy and Counting Distinct Elements

Authors Lijie Chen, Badih Ghazi, Ravi Kumar, Pasin Manurangsi

File

Document Identifiers

Related Versions

Subject Classification

ACM Subject Classification

Keywords

Metrics

Abstract

Cite As Get BibTex

Author Details

Acknowledgements

References

Thanks for your feedback!

Could not send message