Testing Properties of Multiple Distributions with Few Samples

Authors Maryam Aliakbarpour, Sandeep Silwal



PDF
Thumbnail PDF

File

LIPIcs.ITCS.2020.69.pdf
  • Filesize: 0.68 MB
  • 41 pages

Document Identifiers

Author Details

Maryam Aliakbarpour
  • Massachusetts Institute of Technology, Cambridge, MA 02139, USA
Sandeep Silwal
  • Massachusetts Institute of Technology, Cambridge, MA 02139, USA

Acknowledgements

The authors would like to thank Sushruth Reddy, Rikhav Shah, and Greg Valiant for helpful discussions about de Finetti’s theorem. The authors would also like to thank Ronitt Rubinfeld for valuable feedback.

Cite As Get BibTex

Maryam Aliakbarpour and Sandeep Silwal. Testing Properties of Multiple Distributions with Few Samples. In 11th Innovations in Theoretical Computer Science Conference (ITCS 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 151, pp. 69:1-69:41, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020) https://doi.org/10.4230/LIPIcs.ITCS.2020.69

Abstract

We propose a new setting for testing properties of distributions while receiving samples from several distributions, but few samples per distribution. Given samples from s distributions, p_1, p_2, …, p_s, we design testers for the following problems: (1) Uniformity Testing: Testing whether all the p_i’s are uniform or ε-far from being uniform in ℓ_1-distance (2) Identity Testing: Testing whether all the p_i’s are equal to an explicitly given distribution q or ε-far from q in ℓ_1-distance, and (3) Closeness Testing: Testing whether all the p_i’s are equal to a distribution q which we have sample access to, or ε-far from q in ℓ_1-distance. By assuming an additional natural condition about the source distributions, we provide sample optimal testers for all of these problems.

Subject Classification

ACM Subject Classification
  • Mathematics of computing → Hypothesis testing and confidence interval computation
Keywords
  • Hypothesis Testing
  • Property Testing
  • Distribution Testing
  • Identity Testing
  • Closeness Testing
  • Multiple Sources

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Jayadev Acharya, Constantinos Daskalakis, and Gautam Kamath. Optimal Testing for Properties of Distributions. In NIPS, pages 3591-3599, 2015. Google Scholar
  2. Maryam Aliakbarpour, Eric Blais, and Ronitt Rubinfeld. Learning and Testing Junta Distributions. In COLT, pages 19-46, 2016. Google Scholar
  3. Maryam Aliakbarpour, Ilias Diakonikolas, Daniel Kane, and Ronitt Rubinfeld. Private Testing of Distributions via Sample Permutations. To appear in NeurIPS, 2019. URL: http://papers.nips.cc/paper/9270-private-testing-of-distributions-via-sample-permutations.pdf.
  4. Tugkan Batu and Clément L. Canonne. Generalized Uniformity Testing. 2017 IEEE 58th Annual Symposium on Foundations of Computer Science (FOCS), pages 880-889, 2017. Google Scholar
  5. Tugkan Batu, Eldar Fischer, Lance Fortnow, Ravi Kumar, Ronitt Rubinfeld, and Patrick White. Testing random variables for independence and identity. In FOCS, pages 442-451, 2001. Google Scholar
  6. Tugkan Batu, Lance Fortnow, Ronitt Rubinfeld, Warren D. Smith, and Patrick White. Testing That Distributions Are Close. In Proceedings of the 41st Annual Symposium on Foundations of Computer Science, FOCS '00, pages 259-, Washington, DC, USA, 2000. IEEE Computer Society. URL: http://dl.acm.org/citation.cfm?id=795666.796548.
  7. Tugkan Batu, Lance Fortnow, Ronitt Rubinfeld, Warren D. Smith, and Patrick White. Testing Closeness of Discrete Distributions. JACM, 60(1):4:1-4:25, 2013. Google Scholar
  8. Eric Blais, Clément L. Canonne, and Tom Gur. Distribution Testing Lower Bounds via Reductions from Communication Complexity. ACM Trans. Comput. Theory, 11(2):6:1-6:37, February 2019. URL: https://doi.org/10.1145/3305270.
  9. Clément L Canonne. A survey on distribution testing: Your data is big. but is it blue? ECCC, 22:63, 2015. Google Scholar
  10. Siu-on Chan, Ilias Diakonikolas, Paul Valiant, and Gregory Valiant. Optimal Algorithms for Testing Closeness of Discrete Distributions. In SODA, pages 1193-1203, 2014. Google Scholar
  11. Zdravko Cvetkovski. Inequalities theorems, techniques and selected problems. Springer, 2012. Google Scholar
  12. Persi Diaconis. Finite forms of de Finetti’s theorem on exchangeability. Synthese, 36(2):271-281, October 1977. URL: https://doi.org/10.1007/BF00486116.
  13. Persi Diaconis and David Freedman. Finite Exchangeable Sequences. The Annals of Probability, 8(4):745-764, 1980. URL: http://www.jstor.org/stable/2242823.
  14. Ilias Diakonikolas, Themis Gouleakis, J. Peebles, and Eric Price. Collision-based Testers are Optimal for Uniformity and Closeness. ECCC, 23:178, 2016. Google Scholar
  15. Ilias Diakonikolas, Themis Gouleakis, John Peebles, and Eric Price. Sample-Optimal Identity Testing with High Probability. In 45th International Colloquium on Automata, Languages, and Programming, ICALP 2018, July 9-13, 2018, Prague, Czech Republic, pages 41:1-41:14, 2018. Google Scholar
  16. Ilias Diakonikolas and Daniel M. Kane. A New Approach for Testing Properties of Discrete Distributions. In FOCS, pages 685-694, 2016. Google Scholar
  17. Ilias Diakonikolas, Daniel M. Kane, and Vladimir Nikishkin. Testing Identity of Structured Distributions. In SODA, pages 1841-1854, 2015. Google Scholar
  18. Oded Goldreich. The uniform distribution is complete with respect to testing identity to a fixed distribution. Electronic Colloquium on Computational Complexity (ECCC), 23:15, 2016. Google Scholar
  19. Oded Goldreich. The uniform distribution is complete with respect to testing identity to a fixed distribution. ECCC, 23, 2016. URL: http://www.wisdom.weizmann.ac.il/~oded/R2/dr.pdf.
  20. Oded Goldreich. Introduction to Property Testing. Cambridge University Press, 2017. URL: http://www.wisdom.weizmann.ac.il/~oded/pt-intro.html.
  21. Oded Goldreich, Shafi Goldwasser, and Dana Ron. Property testing and its connection to learning and approximation. JACM, 45:653-750, 1998. Google Scholar
  22. Oded Goldreich and Dana Ron. On testing expansion in bounded-degree graphs. In Studies in Complexity and Cryptography. Miscellanea on the Interplay between Randomness and Computation, pages 68-75. Springer, 2011. Google Scholar
  23. Oded Goldreich and Dana Ron. On Testing Expansion in Bounded-Degree Graphs, pages 68-75. Springer Berlin Heidelberg, Berlin, Heidelberg, 2011. URL: https://doi.org/10.1007/978-3-642-22670-0_9.
  24. Olav Kallenberg. Probabilistic Symmetries and Invariance Principles. Springer New York, 2005. Google Scholar
  25. Erich L. Lehmann and Joseph P. Romano. Testing statistical hypotheses. Springer Texts in Statistics. Springer, 2005. Google Scholar
  26. Reut Levi, Dana Ron, and Ronitt Rubinfeld. Testing Properties of Collections of Distributions. Theory of Computing, 9(8):295-347, 2013. Google Scholar
  27. Jerzy Neyman and Egon S. Pearson. On the Problem of the Most Efficient Tests of Statistical Hypotheses. Philosophical Transactions of the Royal Society of London. Series A, Containing Papers of a Mathematical or Physical Character, 231(694-706):289-337, 1933. URL: https://doi.org/10.1098/rsta.1933.0009.
  28. Liam Paninski. A coincidence-based test for uniformity given very sparsely-sampled discrete data. IEEE TOIT, 54:4750-4755, 2008. Google Scholar
  29. Sofya Raskhodnikova, Dana Ron, Amir Shpilka, and Adam Smith. Strong Lower Bounds for Approximating Distribution Support Size and the Distinct Elements Problem. In 48th Annual IEEE Symposium on Foundations of Computer Science (FOCS'07), pages 559-569, October 2007. URL: https://doi.org/10.1109/FOCS.2007.47.
  30. Ronitt Rubinfeld. Taming big probability distributions. XRDS, 19(1):24-28, 2012. Google Scholar
  31. Kevin Tian, Weihao Kong, and Gregory Valiant. Learning Populations of Parameters. In I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, editors, Advances in Neural Information Processing Systems 30, pages 5778-5787. Curran Associates, Inc., 2017. URL: http://papers.nips.cc/paper/7160-learning-populations-of-parameters.pdf.
  32. Gregory Valiant and Paul Valiant. An Automatic Inequality Prover and Instance Optimal Identity Testing. SICOMP, 46(1):429-455, 2017. Google Scholar
  33. Gregory Valiant and Paul Valiant. Estimating the Unseen: Improved Estimators for Entropy and Other Properties. JACM, 64(6):37:1-37:41, 2017. Google Scholar
  34. Paul Valiant. Testing Symmetric Properties of Distributions. In Proceedings of the Fortieth Annual ACM Symposium on Theory of Computing, STOC '08, pages 383-392. ACM, 2008. Google Scholar
  35. Paul Valiant. Testing symmetric properties of distributions. In STOC, pages 383-392, 2008. Google Scholar
  36. Ramya Korlakai Vinayak, Weihao Kong, Gregory Valiant, and Sham Kakade. Maximum Likelihood Estimation for Learning Populations of Parameters. In Proceedings of the 36th International Conference on Machine Learning (ICML), pages 6448-6457. PMLR, 2019. Google Scholar
  37. Yihong Wu and Pengkun Yang. Chebyshev polynomials, moment matching, and optimal estimation of the unseen. arXiv preprint, 2016. URL: http://arxiv.org/abs/1504.01227v2.
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail