Testing Properties of Distributions in the Streaming Model

Authors: Sampriti Roy, Yadu Vasudev



File

LIPIcs.ISAAC.2023.56.pdf
  • Filesize: 0.78 MB
  • 17 pages

Author Details

Sampriti Roy
  • Department of Computer Science and Engineering, IIT Madras, Chennai, India
Yadu Vasudev
  • Department of Computer Science and Engineering, IIT Madras, Chennai, India

Cite As

Sampriti Roy and Yadu Vasudev. Testing Properties of Distributions in the Streaming Model. In 34th International Symposium on Algorithms and Computation (ISAAC 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 283, pp. 56:1-56:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)
https://doi.org/10.4230/LIPIcs.ISAAC.2023.56

Abstract

We study distribution testing in the standard access model and the conditional access model when the memory available to the testing algorithm is bounded. In both scenarios, we assume that the samples arrive in an online fashion. The goal is to test properties of a distribution using an optimal number of samples, subject to a memory constraint on how many samples can be stored at any given time. First, we provide a trade-off between the sample complexity and the space complexity of testing identity when the samples are drawn from a conditional access oracle. We then show that a succinct representation of a monotone distribution can be learned efficiently under an almost optimal memory constraint on the number of samples stored. We also show that the algorithm for monotone distributions can be extended to a larger class of decomposable distributions.

Subject Classification

ACM Subject Classification
  • Theory of computation
Keywords
  • Property testing
  • distribution testing
  • streaming
