Comparison Graphs: A Unified Method for Uniformity Testing

Meir, Uri

doi:10.4230/LIPIcs.ITCS.2021.17

File

LIPIcs.ITCS.2021.17.pdf

Filesize: 486 kB
20 pages

Document Identifiers

DOI: 10.4230/LIPIcs.ITCS.2021.17
URN: urn:nbn:de:0030-drops-135560

Author Details

Uri Meir

Tel Aviv University, Israel

Acknowledgements

The author would like to thank Rotem Oshman for valuable counsel, Noam Mazor for very helpful discussions, and Ami Paz as well as the anonymous referees of ITCS 2021 for their useful comments.

Cite AsGet BibTex

Uri Meir. Comparison Graphs: A Unified Method for Uniformity Testing. In 12th Innovations in Theoretical Computer Science Conference (ITCS 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 185, pp. 17:1-17:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)
https://doi.org/10.4230/LIPIcs.ITCS.2021.17

Abstract

Distribution testing can be described as follows: q samples are being drawn from some unknown distribution P over a known domain [n]. After the sampling process, a decision must be made about whether P holds some property, or is far from it. The most studied problem in the field is arguably uniformity testing, where one needs to distinguish the case that P is uniform over [n] from the case that P is ε-far from being uniform (in 𝓁₁). It is known that for this task Θ(√n/ε²) samples are necessary and sufficient. This problem was recently considered in various restricted models that pose, for example, communication or memory constraints. In more than one occasion, the known optimal solution boils down to counting collisions among the drawn samples (each two samples that have the same value add one to the count). This idea dates back to the first uniformity tester, and was coined the name "collision-based tester". In this paper, we introduce the notion of comparison graphs and use it to formally define a generalized collision-based tester. Roughly speaking, the edges of the graph indicate the tester which pairs of samples should be compared (that is, the original tester is induced by a clique, where all pairs are being compared). We prove a structural theorem that gives a sufficient condition for a comparison graph to induce a good uniformity tester. As an application, we develop a generic method to test uniformity, and devise nearly-optimal uniformity testers under various computational constraints. We improve and simplify a few known results, and introduce a new constrained model in which the method also produces an efficient tester. The idea behind our method is to translate computational constraints of a certain model to ones on the comparison graph, which paves the way to finding a good graph: a set of comparisons allowed by the model that suffice to test for uniformity. We believe that in future consideration of uniformity testing in new models, our method can be used to obtain efficient testers with minimal effort.

Subject Classification

ACM Subject Classification

Theory of computation → Distributed algorithms
Theory of computation → Streaming, sublinear and near linear time algorithms

Keywords

Distribution Testing
Uniformity Testing
Distributed Algorithms
Streaming Algorithms
Comparison Graphs

Metrics

Access Statistics
Total Accesses (updated on a weekly basis)

0

PDF Downloads

0

Metadata Views

References

Jayadev Acharya, Sourbh Bhadane, Piotr Indyk, and Ziteng Sun. Estimating entropy of distributions in constant space. In Advances in Neural Information Processing Systems, 2019.
Jayadev Acharya, Clément Canonne, Cody Freitag, and Himanshu Tyagi. Test without trust: Optimal locally private distribution testing. In The 22nd International Conference on Artificial Intelligence and Statistics, pages 2067-2076, 2019.
Jayadev Acharya, Clément L Canonne, Yuhan Liu, Ziteng Sun, and Himanshu Tyagi. Interactive inference under information constraints. arXiv preprint, 2020. URL: http://arxiv.org/abs/2007.10976.
Jayadev Acharya, Clément L Canonne, and Himanshu Tyagi. Distributed signal detection under communication constraints. In Conference on Learning Theory. PMLR, 2020.
Jayadev Acharya, Clément L Canonney, and Himanshu Tyagiz. Inference under information constraints i: Lower bounds from chi-square contraction. IEEE Transactions on Information Theory, 2020.
Jayadev Acharya, Clément L Canonney, and Himanshu Tyagiz. Inference under information constraints ii: Communication constraints and shared randomness. IEEE Transactions on Information Theory, 2020.
Kareem Amin, Matthew Joseph, and Jieming Mao. Pan-private uniformity testing. In Conference on Learning Theory, pages 183-218. PMLR, 2020.
Alexandr Andoni, Tal Malkin, and Negev Shekel Nosatzki. Two party distribution testing: Communication and security. In 46th International Colloquium on Automata, Languages, and Programming (ICALP 2019). Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, 2019.
Tugkan Batu, Lance Fortnow, Ronitt Rubinfeld, Warren D. Smith, and Patrick White. Testing that distributions are close. In 41st Annual Symposium on Foundations of Computer Science, FOCS 2000, 12-14 November 2000, Redondo Beach, California, USA, pages 259-269, 2000.
Manuel Blum, Michael Luby, and Ronitt Rubinfeld. Self-testing/correcting with applications to numerical problems. Journal of computer and system sciences, 47(3):549-595, 1993.
Clément L. Canonne. A survey on distribution testing: Your data is big. but is it blue? Electronic Colloquium on Computational Complexity (ECCC), 22:63, 2015.
Ilias Diakonikolas, Themis Gouleakis, Daniel M Kane, and Sankeerth Rao. Communication and memory efficient testing of discrete distributions. arXiv preprint, 2019. URL: http://arxiv.org/abs/1906.04709.
Ilias Diakonikolas, Themis Gouleakis, John Peebles, and Eric Price. Sample-optimal identity testing with high probability. CoRR, abs/1708.02728, 2017. URL: http://arxiv.org/abs/1708.02728.
Ilias Diakonikolas, Themis Gouleakis, John Peebles, and Eric Price. Collision-based testers are optimal for uniformity and closeness. Chicago Journal OF Theoretical Computer Science, 1:1-21, 2019.
Ilias Diakonikolas and Daniel M. Kane. A new approach for testing properties of discrete distributions. In IEEE 57th Annual Symposium on Foundations of Computer Science, FOCS 2016, 9-11 October 2016, Hyatt Regency, New Brunswick, New Jersey, USA, 2016.
Orr Fischer, Uri Meir, and Rotem Oshman. Distributed uniformity testing. In Proceedings of the 2018 ACM Symposium on Principles of Distributed Computing, PODC '18. ACM, 2018.
Sumegha Garg, Pravesh K. Kothari, and Ran Raz. Time-Space Tradeoffs for Distinguishing Distributions and Applications to Security of Goldreich’s PRG. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2020), pages 21:1-21:18, 2020.
Oded Goldreich. Introduction to Property Testing. Cambridge University Press, 2017.
Oded Goldreich. The uniform distribution is complete with respect to testing identity to a fixed distribution. In Computational Complexity and Property Testing. Springer, 2020.
Oded Goldreich, Shari Goldwasser, and Dana Ron. Property testing and its connection to learning and approximation. Journal of the ACM (JACM), 45(4):653-750, 1998.
Oded Goldreich and Dana Ron. On testing expansion in bounded-degree graphs. Electronic Colloquium on Computational Complexity (ECCC), 7(20), 2000.
Uri Meir, Dor Minzer, and Rotem Oshman. Can distributed uniformity testing be local? In Proceedings of the 2019 ACM Symposium on Principles of Distributed Computing, pages 228-237. ACM, 2019.
Varun Narayanan, Manoj Mishra, and Vinod M Prabhakaran. Private two-terminal hypothesis testing. arXiv preprint, 2020. URL: http://arxiv.org/abs/2005.05961.
L. Paninski. A coincidence-based test for uniformity given very sparsely sampled discrete data. IEEE Transactions on Information Theory, 54(10):4750-4755, 2008.
Ronitt Rubinfeld and Madhu Sudan. Robust characterizations of polynomials with applications to program testing. SIAM Journal on Computing, 25(2):252-271, 1996.
Gregory Valiant and Paul Valiant. An automatic inequality prover and instance optimal identity testing. SIAM Journal on Computing, 46(1):429-455, 2017.