Two Party Distribution Testing: Communication and Security

Andoni, Alexandr; Malkin, Tal; Nosatzki, Negev Shekel

doi:10.4230/LIPIcs.ICALP.2019.15

Abstract

We study the problem of discrete distribution testing in the two-party setting. For example, in the standard closeness testing problem, Alice and Bob each have t samples from, respectively, distributions a and b over [n], and they need to test whether a=b or a,b are epsilon-far (in the l_1 distance). This is in contrast to the well-studied one-party case, where the tester has unrestricted access to samples of both distributions. Despite being a natural constraint in applications, the two-party setting has previously evaded attention. We address two fundamental aspects of the two-party setting: 1) what is the communication complexity, and 2) can it be accomplished securely, without Alice and Bob learning extra information about each other’s input. Besides closeness testing, we also study the independence testing problem, where Alice and Bob have t samples from distributions a and b respectively, which may be correlated; the question is whether a,b are independent or epsilon-far from being independent. Our contribution is three-fold: 1) We show how to gain communication efficiency given more samples, beyond the information-theoretic bound on t. The gain is polynomially better than what one would obtain via adapting one-party algorithms. 2) We prove tightness of our trade-off for the closeness testing, as well as that the independence testing requires tight Omega(sqrt{m}) communication for unbounded number of samples. These lower bounds are of independent interest as, to the best of our knowledge, these are the first 2-party communication lower bounds for testing problems, where the inputs are a set of i.i.d. samples. 3) We define the concept of secure distribution testing, and provide secure versions of the above protocols with an overhead that is only polynomial in the security parameter.

Jayadev Acharya, Clément L. Canonne, Cody Freitag, and Himanshu Tyagi. Test without Trust: Optimal Locally Private Distribution Testing. CoRR, abs/1808.02174, 2018.
Jayadev Acharya, Clément L. Canonne, and Himanshu Tyagi. Distributed Simulation and Distributed Inference. CoRR, abs/1804.06952, 2018. URL: http://arxiv.org/abs/1804.06952.
Jayadev Acharya, Hirakendu Das, Ashkan Jafarpour, Alon Orlitsky, and Shengjun Pan. Competitive closeness testing. In Proceedings of the 24th Annual Conference on Learning Theory, pages 47-68, 2011.
Jayadev Acharya, Hirakendu Das, Ashkan Jafarpour, Alon Orlitsky, Shengjun Pan, and Ananda Suresh. Competitive classification and closeness testing. In Conference on Learning Theory, pages 22-1, 2012.
Jayadev Acharya, Ashkan Jafarpour, Alon Orlitsky, and Ananda Theertha Suresh. Sublinear algorithms for outlier detection and generalized closeness testing. In Information Theory (ISIT), 2014 IEEE International Symposium on, pages 3200-3204. IEEE, 2014.
Jayadev Acharya, Ziteng Sun, and Huanyu Zhang. Differentially Private Testing of Identity and Closeness of Discrete Distributions. CoRR, abs/1707.05128, 2017. URL: http://arxiv.org/abs/1707.05128.
Rudolf Ahlswede and Imre Csiszár. Hypothesis testing with communication constraints. IEEE transactions on information theory, 32(4), 1986.
Maryam Aliakbarpour, Ilias Diakonikolas, and Ronitt Rubinfeld. Differentially Private Identity and Closeness Testing of Discrete Distributions. CoRR, abs/1707.05497, 2017. URL: http://arxiv.org/abs/1707.05497.
Noga Alon, Yossi Matias, and Mario Szegedy. The space complexity of approximating the frequency moments. J. Comp. Sys. Sci., 58:137-147, 1999. Previously appeared in STOC'96.
S Amari et al. Statistical inference under multiterminal data compression. IEEE Transactions on Information Theory, 44(6):2300-2324, 1998.
Andrew D Barbour. Stein’s method and Poisson process convergence. Journal of Applied Probability, 25(A):175-184, 1988.
Tuğkan Batu, Lance Fortnow, Ronitt Rubinfeld, Warren D Smith, and Patrick White. Testing closeness of discrete distributions. Journal of the ACM (JACM), 60(1):4, 2013.
Tuğkan Batu, Lance Fortnow, Ronitt Rubinfeld, Warren D. Smith, and Patrick White. Testing Closeness of Discrete Distributions. J. ACM, 60(1):4:1-4:25, February 2013. URL: http://dx.doi.org/10.1145/2432622.2432626.
Amos Beimel, Paz Carmi, Kobbi Nissim, and Enav Weinreb. Private Approximation of Search Problems. SIAM J. Comput., 38(5):1728-1760, 2008.
Amos Beimel, Renen Hallak, and Kobbi Nissim. Private Approximation of Clustering and Vertex Cover. Computational Complexity, 18(3):435-494, 2009.
Bhaswar Bhattacharya and Gregory Valiant. Testing closeness with unequal sized samples. In Advances in Neural Information Processing Systems, pages 2611-2619, 2015.
Elette Boyle, Niv Gilboa, and Yuval Ishai. Breaking the Circuit Size Barrier for Secure Computation Under DDH. In CRYPTO (1), volume 9814 of Lecture Notes in Computer Science, pages 509-539. Springer, 2016.
Elette Boyle, Niv Gilboa, and Yuval Ishai. Group-Based Secure Computation: Optimizing Rounds, Communication, and Computation. In EUROCRYPT (2), volume 10211 of Lecture Notes in Computer Science, pages 163-193, 2017.
Mark Braverman, Ankit Garg, Tengyu Ma, Huy L Nguyen, and David P Woodruff. Communication lower bounds for statistical estimation problems via a distributed data processing inequality. In Proceedings of the forty-eighth annual ACM symposium on Theory of Computing, pages 1011-1020. ACM, 2016.
Sergey Bravyi, Aram W Harrow, and Avinatan Hassidim. Quantum algorithms for testing properties of distributions. IEEE Transactions on Information Theory, 57(6):3971-3981, 2011.
Bryan Cai, Constantinos Daskalakis, and Gautam Kamath. Priv'IT: Private and Sample Efficient Identity Testing. arXiv preprint, 2017. URL: http://arxiv.org/abs/1703.10127.
Clément L Canonne. A survey on distribution testing: Your data is big. but is it blue? In Electronic Colloquium on Computational Complexity (ECCC), volume 22(63), pages 1-9, 2015.
Amit Chakrabarti and Oded Regev. An optimal lower bound on the communication complexity of gap-hamming-distance. SIAM Journal on Computing, 41(5):1299-1317, 2012.
Siu-On Chan, Ilias Diakonikolas, Gregory Valiant, and Paul Valiant. Optimal algorithms for testing closeness of discrete distributions. In Proceedings of the twenty-fifth annual ACM-SIAM symposium on Discrete algorithms, pages 1193-1203. Society for Industrial and Applied Mathematics, 2014.
Thomas M Cover. Hypothesis testing with finite statistics. The Annals of Mathematical Statistics, 40(3):828-835, 1969.
Michael Crouch, Andrew McGregor, Gregory Valiant, and David P Woodruff. Stochastic streams: Sample complexity vs. space complexity. In LIPIcs-Leibniz International Proceedings in Informatics, volume 57. Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, 2016.
Yuval Dagan and Ohad Shamir. Detecting Correlations with Little Memory and Communication. CoRR, abs/1803.01420, 2018. URL: http://arxiv.org/abs/1803.01420.
I. Diakonikolas, E. Grigorescu, J. Li, A. Natarajan, K. Onak, and L. Schmidt. Communication-Efficient Distributed Learning of Discrete Distributions. In Advances in Neural Information Processing Systems, 2017. To appear.
Ilias Diakonikolas, Themis Gouleakis, John Peebles, and Eric Price. Collision-based testers are optimal for uniformity and closeness. arXiv preprint, 2016. URL: http://arxiv.org/abs/1611.03579.
Ilias Diakonikolas, Moritz Hardt, and Ludwig Schmidt. Differentially private learning of structured discrete distributions. In Advances in Neural Information Processing Systems, pages 2566-2574, 2015.
Ilias Diakonikolas and Daniel M Kane. A new approach for testing properties of discrete distributions. In Foundations of Computer Science (FOCS), 2016 IEEE 57th Annual Symposium on, pages 685-694. IEEE, 2016.
Joan Feigenbaum, Yuval Ishai, Tal Malkin, Kobbi Nissim, Martin J. Strauss, and Rebecca N. Wright. Secure Multiparty Computation of Approximations. ACM Trans. Algorithms, 2(3):435-472, July 2006. URL: http://dx.doi.org/10.1145/1159892.1159900.
Sumegha Garg, Ran Raz, and Avishay Tal. Extractor-based time-space lower bounds for learning. In Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing, pages 990-1002. ACM, 2018.
Craig Gentry. Fully homomorphic encryption using ideal lattices. In STOC, pages 169-178. ACM, 2009.
Oded Goldreich. Introduction to Property Testing (working draft), 2017. URL: https://www.wisdom.weizmann.ac.il/~oded/PDF/pt-v3.pdf.
Oded Goldreich and Dana Ron. On Testing Expansion in Bounded-Degree Graphs. ECCC 7(20), 2000.
Michael Gutman. Asymptotically optimal classification for multiple tests with empirically observed statistics. IEEE Transactions on Information Theory, 35(2):401-408, 1989.
U. Hadar, J. Liu, Y. Polyanskiy, and O. Shayevitz. Communication Complexity of Estimating Correlations. In Proceedings of the Symposium on Theory of Computing (STOC), 2019.
Shai Halevi, Robert Krauthgamer, Eyal Kushilevitz, and Kobbi Nissim. Private approximation of NP-hard functions. In STOC, pages 550-559. ACM, 2001.
Te Han. Hypothesis testing with multiterminal data compression. IEEE transactions on information theory, 33(6):759-772, 1987.
Martin E Hellman and Thomas M Cover. Learning with finite memory. The Annals of Mathematical Statistics, pages 765-782, 1970.
Piotr Indyk. Stable Distributions, Pseudorandom Generators, Embeddings and Data Stream Computation. J. ACM, 53(3):307-323, 2006. Previously appeared in FOCS'00.
Piotr Indyk and David Woodruff. Polylogarithmic Private Approximations and Efficient Matching. In Proceedings of the Third Conference on Theory of Cryptography, TCC'06, pages 245-264, Berlin, Heidelberg, 2006. Springer-Verlag. URL: http://dx.doi.org/10.1007/11681878_13.
Yuval Ishai, Tal Malkin, Martin J. Strauss, and Rebecca N. Wright. Private multiparty sampling and approximation of vector combinations. Theor. Comput. Sci., 410(18):1730-1745, 2009.
Joe Kilian, André Madeira, Martin J. Strauss, and Xuan Zheng. Fast Private Norm Estimation and Heavy Hitters. In TCC, volume 4948 of Lecture Notes in Computer Science, pages 176-193. Springer, 2008.
Gillat Kol, Ran Raz, and Avishay Tal. Time-space hardness of learning sparse parities. In Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, pages 1067-1080. ACM, 2017.
Eyal Kushilevitz, Rafail Ostrovsky, and Yuval Rabani. Efficient search for approximate nearest neighbor in high dimensional spaces. SIAM J. Comput., 30(2):457-474, 2000. Preliminary version appeared in STOC'98.
Dana Moshkovitz and Michal Moshkovitz. Mixing implies lower bounds for space bounded learning. In Conference on Learning Theory, pages 1516-1566, 2017.
Moni Naor and Kobbi Nissim. Communication Complexity and Secure Function Evaluation. CoRR, cs.CR/0109011, 2001. URL: http://arxiv.org/abs/cs.CR/0109011.
Ran Raz. Fast learning requires good memory: A time-space lower bound for parity learning. In Foundations of Computer Science (FOCS), 2016 IEEE 57th Annual Symposium on, pages 266-275. IEEE, 2016.
Ran Raz. A time-space lower bound for a large class of learning problems. In Foundations of Computer Science (FOCS), 2017 IEEE 58th Annual Symposium on, pages 732-742. IEEE, 2017.
Ronitt Rubinfeld. Sublinear time algorithms. In International Congress of Mathematicians, volume 3, pages 1095-1110, 2006.
Ronitt Rubinfeld. Taming big probability distributions. XRDS: Crossroads, The ACM Magazine for Students, 19(1):24-28, 2012.
KR Sahasranand and Himanshu Tyagi. Extra Samples can Reduce the Communication for Independence Testing. In 2018 IEEE International Symposium on Information Theory (ISIT), pages 2316-2320. IEEE, 2018.
Ofer Shayevitz. On Rényi measures and hypothesis testing. In Information Theory Proceedings (ISIT), 2011 IEEE International Symposium on, pages 894-898. IEEE, 2011.
Or Sheffet. Locally Private Hypothesis Testing. In ICML, 2018.
Alexander A. Sherstov. The Communication Complexity of Gap Hamming Distance. Theory of Computing, 8(1):197-208, 2012. URL: http://dx.doi.org/10.4086/toc.2012.v008a008.
Jayakrishnan Unnikrishnan. On optimal two sample homogeneity tests for finite alphabets. In Information Theory Proceedings (ISIT), 2012 IEEE International Symposium on, pages 2027-2031. Ieee, 2012.
Paul Valiant. Testing symmetric properties of distributions. SIAM Journal on Computing, 40(6):1927-1968, 2011. Previously in STOC'08.
Elad Verbin and Wei Yu. The streaming complexity of cycle counting, sorting by reversals, and other problems. In Proceedings of the twenty-second annual ACM-SIAM symposium on Discrete Algorithms, pages 11-25. SIAM, 2011.
Andrew Chi-Chih Yao. Protocols for Secure Computations (Extended Abstract). In Proceedings of the 23rd IEEE Symposium on Foundations of Computer Science, FOCS '82, pages 160-164, 1982.

Two Party Distribution Testing: Communication and Security

Authors Alexandr Andoni, Tal Malkin, Negev Shekel Nosatzki

File

Document Identifiers

Author Details

Acknowledgements

Cite AsGet BibTex

Abstract

Subject Classification

ACM Subject Classification

Keywords

Metrics

References

Thanks for your feedback!

Could not send message

Two Party Distribution Testing: Communication and Security

Authors Alexandr Andoni, Tal Malkin, Negev Shekel Nosatzki

File

Document Identifiers

Author Details

Acknowledgements

Cite AsGet BibTex

Abstract

Subject Classification

ACM Subject Classification

Keywords

Metrics

Related Versions

References

Thanks for your feedback!

Could not send message