Document Open Access Logo

Comparison Graphs: A Unified Method for Uniformity Testing

Author Uri Meir



PDF
Thumbnail PDF

File

LIPIcs.ITCS.2021.17.pdf
  • Filesize: 486 kB
  • 20 pages

Document Identifiers

Author Details

Uri Meir
  • Tel Aviv University, Israel

Acknowledgements

The author would like to thank Rotem Oshman for valuable counsel, Noam Mazor for very helpful discussions, and Ami Paz as well as the anonymous referees of ITCS 2021 for their useful comments.

Cite AsGet BibTex

Uri Meir. Comparison Graphs: A Unified Method for Uniformity Testing. In 12th Innovations in Theoretical Computer Science Conference (ITCS 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 185, pp. 17:1-17:20, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2021)
https://doi.org/10.4230/LIPIcs.ITCS.2021.17

Abstract

Distribution testing can be described as follows: q samples are being drawn from some unknown distribution P over a known domain [n]. After the sampling process, a decision must be made about whether P holds some property, or is far from it. The most studied problem in the field is arguably uniformity testing, where one needs to distinguish the case that P is uniform over [n] from the case that P is ε-far from being uniform (in 𝓁₁). It is known that for this task Θ(√n/ε²) samples are necessary and sufficient. This problem was recently considered in various restricted models that pose, for example, communication or memory constraints. In more than one occasion, the known optimal solution boils down to counting collisions among the drawn samples (each two samples that have the same value add one to the count). This idea dates back to the first uniformity tester, and was coined the name "collision-based tester". In this paper, we introduce the notion of comparison graphs and use it to formally define a generalized collision-based tester. Roughly speaking, the edges of the graph indicate the tester which pairs of samples should be compared (that is, the original tester is induced by a clique, where all pairs are being compared). We prove a structural theorem that gives a sufficient condition for a comparison graph to induce a good uniformity tester. As an application, we develop a generic method to test uniformity, and devise nearly-optimal uniformity testers under various computational constraints. We improve and simplify a few known results, and introduce a new constrained model in which the method also produces an efficient tester. The idea behind our method is to translate computational constraints of a certain model to ones on the comparison graph, which paves the way to finding a good graph: a set of comparisons allowed by the model that suffice to test for uniformity. We believe that in future consideration of uniformity testing in new models, our method can be used to obtain efficient testers with minimal effort.

Subject Classification

ACM Subject Classification
  • Theory of computation → Distributed algorithms
  • Theory of computation → Streaming, sublinear and near linear time algorithms
Keywords
  • Distribution Testing
  • Uniformity Testing
  • Distributed Algorithms
  • Streaming Algorithms
  • Comparison Graphs

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Jayadev Acharya, Sourbh Bhadane, Piotr Indyk, and Ziteng Sun. Estimating entropy of distributions in constant space. In Advances in Neural Information Processing Systems, 2019. Google Scholar
  2. Jayadev Acharya, Clément Canonne, Cody Freitag, and Himanshu Tyagi. Test without trust: Optimal locally private distribution testing. In The 22nd International Conference on Artificial Intelligence and Statistics, pages 2067-2076, 2019. Google Scholar
  3. Jayadev Acharya, Clément L Canonne, Yuhan Liu, Ziteng Sun, and Himanshu Tyagi. Interactive inference under information constraints. arXiv preprint, 2020. URL: http://arxiv.org/abs/2007.10976.
  4. Jayadev Acharya, Clément L Canonne, and Himanshu Tyagi. Distributed signal detection under communication constraints. In Conference on Learning Theory. PMLR, 2020. Google Scholar
  5. Jayadev Acharya, Clément L Canonney, and Himanshu Tyagiz. Inference under information constraints i: Lower bounds from chi-square contraction. IEEE Transactions on Information Theory, 2020. Google Scholar
  6. Jayadev Acharya, Clément L Canonney, and Himanshu Tyagiz. Inference under information constraints ii: Communication constraints and shared randomness. IEEE Transactions on Information Theory, 2020. Google Scholar
  7. Kareem Amin, Matthew Joseph, and Jieming Mao. Pan-private uniformity testing. In Conference on Learning Theory, pages 183-218. PMLR, 2020. Google Scholar
  8. Alexandr Andoni, Tal Malkin, and Negev Shekel Nosatzki. Two party distribution testing: Communication and security. In 46th International Colloquium on Automata, Languages, and Programming (ICALP 2019). Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, 2019. Google Scholar
  9. Tugkan Batu, Lance Fortnow, Ronitt Rubinfeld, Warren D. Smith, and Patrick White. Testing that distributions are close. In 41st Annual Symposium on Foundations of Computer Science, FOCS 2000, 12-14 November 2000, Redondo Beach, California, USA, pages 259-269, 2000. Google Scholar
  10. Manuel Blum, Michael Luby, and Ronitt Rubinfeld. Self-testing/correcting with applications to numerical problems. Journal of computer and system sciences, 47(3):549-595, 1993. Google Scholar
  11. Clément L. Canonne. A survey on distribution testing: Your data is big. but is it blue? Electronic Colloquium on Computational Complexity (ECCC), 22:63, 2015. Google Scholar
  12. Ilias Diakonikolas, Themis Gouleakis, Daniel M Kane, and Sankeerth Rao. Communication and memory efficient testing of discrete distributions. arXiv preprint, 2019. URL: http://arxiv.org/abs/1906.04709.
  13. Ilias Diakonikolas, Themis Gouleakis, John Peebles, and Eric Price. Sample-optimal identity testing with high probability. CoRR, abs/1708.02728, 2017. URL: http://arxiv.org/abs/1708.02728.
  14. Ilias Diakonikolas, Themis Gouleakis, John Peebles, and Eric Price. Collision-based testers are optimal for uniformity and closeness. Chicago Journal OF Theoretical Computer Science, 1:1-21, 2019. Google Scholar
  15. Ilias Diakonikolas and Daniel M. Kane. A new approach for testing properties of discrete distributions. In IEEE 57th Annual Symposium on Foundations of Computer Science, FOCS 2016, 9-11 October 2016, Hyatt Regency, New Brunswick, New Jersey, USA, 2016. Google Scholar
  16. Orr Fischer, Uri Meir, and Rotem Oshman. Distributed uniformity testing. In Proceedings of the 2018 ACM Symposium on Principles of Distributed Computing, PODC '18. ACM, 2018. Google Scholar
  17. Sumegha Garg, Pravesh K. Kothari, and Ran Raz. Time-Space Tradeoffs for Distinguishing Distributions and Applications to Security of Goldreich’s PRG. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2020), pages 21:1-21:18, 2020. Google Scholar
  18. Oded Goldreich. Introduction to Property Testing. Cambridge University Press, 2017. Google Scholar
  19. Oded Goldreich. The uniform distribution is complete with respect to testing identity to a fixed distribution. In Computational Complexity and Property Testing. Springer, 2020. Google Scholar
  20. Oded Goldreich, Shari Goldwasser, and Dana Ron. Property testing and its connection to learning and approximation. Journal of the ACM (JACM), 45(4):653-750, 1998. Google Scholar
  21. Oded Goldreich and Dana Ron. On testing expansion in bounded-degree graphs. Electronic Colloquium on Computational Complexity (ECCC), 7(20), 2000. Google Scholar
  22. Uri Meir, Dor Minzer, and Rotem Oshman. Can distributed uniformity testing be local? In Proceedings of the 2019 ACM Symposium on Principles of Distributed Computing, pages 228-237. ACM, 2019. Google Scholar
  23. Varun Narayanan, Manoj Mishra, and Vinod M Prabhakaran. Private two-terminal hypothesis testing. arXiv preprint, 2020. URL: http://arxiv.org/abs/2005.05961.
  24. L. Paninski. A coincidence-based test for uniformity given very sparsely sampled discrete data. IEEE Transactions on Information Theory, 54(10):4750-4755, 2008. Google Scholar
  25. Ronitt Rubinfeld and Madhu Sudan. Robust characterizations of polynomials with applications to program testing. SIAM Journal on Computing, 25(2):252-271, 1996. Google Scholar
  26. Gregory Valiant and Paul Valiant. An automatic inequality prover and instance optimal identity testing. SIAM Journal on Computing, 46(1):429-455, 2017. Google Scholar
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail