Proofs of Proximity for Distribution Testing

Authors Alessandro Chiesa, Tom Gur



PDF
Thumbnail PDF

File

LIPIcs.ITCS.2018.53.pdf
  • Filesize: 0.53 MB
  • 14 pages

Document Identifiers

Author Details

Alessandro Chiesa
Tom Gur

Cite AsGet BibTex

Alessandro Chiesa and Tom Gur. Proofs of Proximity for Distribution Testing. In 9th Innovations in Theoretical Computer Science Conference (ITCS 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 94, pp. 53:1-53:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)
https://doi.org/10.4230/LIPIcs.ITCS.2018.53

Abstract

Distribution testing is an area of property testing that studies algorithms that receive few samples from a probability distribution D and decide whether D has a certain property or is far (in total variation distance) from all distributions with that property. Most natural properties of distributions, however, require a large number of samples to test, which motivates the question of whether there are natural settings wherein fewer samples suffice. We initiate a study of proofs of proximity for properties of distributions. In their basic form, these proof systems consist of a tester that not only has sample access to a distribution but also explicit access to a proof string that depends on the distribution. We refer to these as NP distribution testers, or MA distribution testers if the tester is a probabilistic algorithm. We also study the more general notion of IP distribution testers, in which the tester interacts with an all-powerful untrusted prover. We investigate the power and limitations of proofs of proximity for distributions and chart a landscape that, surprisingly, is significantly different from that of proofs of proximity for functions. Our main results include showing that MA distribution testers can be quadratically stronger than standard distribution testers, but no stronger than that; in contrast, IP distribution testers can be exponentially stronger than standard distribution testers, but when restricted to public coins they can be at best quadratically stronger.
Keywords
  • distribution testing
  • proofs of proximity
  • property testing

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Scott Aaronson and Avi Wigderson. Algebrization: A new barrier in complexity theory. ACM Transactions on Computation Theory, 1:2:1-2:54, 2009. Google Scholar
  2. Jayadev Acharya, Clément L. Canonne, and Gautam Kamath. A chasm between identity and equivalence testing with conditional queries. In Proceedings of the 19th International Workshop on Randomization and Computation, RANDOM '15, pages 449-466, 2015. Google Scholar
  3. Jayadev Acharya, Hirakendu Das, Ashkan Jafarpour, Alon Orlitsky, and Shengjun Pan. Competitive closeness testing. In Proceedings of the 24th Annual Conference on Learning Theory, COLT 2011, pages 47-68, 2011. Google Scholar
  4. László Babai, Peter Frankl, and Janos Simon. Complexity classes in communication complexity theory. In Proceedings of the 27th Annual IEEE Symposium on Foundations of Computer Science, FOCS 1986, pages 337-347, 1986. Google Scholar
  5. László Babai and Shlomo Moran. Arthur-merlin games: a randomized proof system, and a hierarchy of complexity classes. Journal of Computer and System Sciences, 36:254-276, 1988. Google Scholar
  6. Tuğkan Batu, Sanjoy Dasgupta, Ravi Kumar, and Ronitt Rubinfeld. The complexity of approximating the entropy. SIAM Journal on Computing, 35(1):132-150, 2005. Google Scholar
  7. Tuğkan Batu, Eldar Fischer, Lance Fortnow, Ravi Kumar, Ronitt Rubinfeld, and Patrick White. Testing random variables for independence and identity. In Proceedings of the 42nd Annual Symposium on Foundations of Computer Science, FOCS 2001, pages 442-451, 2001. Google Scholar
  8. Tuğkan Batu, Lance Fortnow, Ronitt Rubinfeld, Warren D. Smith, and Patrick White. Testing that distributions are close. In Proceedings of the 41st Annual Symposium on Foundations of Computer Science, FOCS 2000, pages 259-269, 2000. Google Scholar
  9. Eli Ben-Sasson, Oded Goldreich, Prahladh Harsha, Madhu Sudan, and Salil P. Vadhan. Robust PCPs of proximity, shorter PCPs, and applications to coding. SIAM Journal on Computing, 36(4):889-974, 2006. Google Scholar
  10. Itay Berman, Ron D. Rothblum, and Vinod Vaikuntanathan. Zero-knowledge proofs of proximity. In Proceedings of the 9th Innovations in Theoretical Computer Science Conference, ITCS 2018, page To appear, 2018. Google Scholar
  11. Bhaswar B. Bhattacharya and Gregory Valiant. Testing closeness with unequal sized samples. In Proceedings of the 2015 Conference on Neural Information Processing Systems, NIPS 2015, pages 2611-2619, 2015. Google Scholar
  12. Arnab Bhattacharyya, Eldar Fischer, Ronitt Rubinfeld, and Paul Valiant. Testing monotonicity of distributions over general partial orders. In Proceedings of the 2nd Innovations in Theoretical Computer Science Conference, ITCS 2011, pages 239-252, 2011. Google Scholar
  13. Eric Blais, Clément L. Canonne, and Tom Gur. Distribution testing lower bounds via reductions from communication complexity (Alice and Bob don't talk to each other anymore.). In Proceedings of the 32th Conference on Computational Complexity, CCC 2017, pages 1-42, 2017. Google Scholar
  14. Kenneth P Burnham and David R Anderson. Model selection and multimodel inference: a practical information-theoretic approach. Springer Science &Business Media, 2003. Google Scholar
  15. Clément Canonne and Ronitt Rubinfeld. Testing probability distributions underlying aggregated data. In International Colloquium on Automata, Languages, and Programming, ICALP '14, pages 283-295, 2014. Google Scholar
  16. Clément L. Canonne. A survey on distribution testing. your data is big. But is it blue?, 2017. Google Scholar
  17. Clément L. Canonne, Elena Grigorescu, Siyao Guo, Akash Kumar, and Karl Wimmer. Testing k-monotonicity. Electronic Colloquium on Computational Complexity (ECCC), 23:136, 2016. URL: http://eccc.hpi-web.de/report/2016/136.
  18. Clément L. Canonne, Dana Ron, and Rocco A. Servedio. Testing probability distributions using conditional samples. SIAM Journal on Computing, 44(3):540-616, 2015. Google Scholar
  19. Amit Chakrabarti, Graham Cormode, Andrew McGregor, and Justin Thaler. Annotations in data streams. ACM Transactions on Algorithms, 11, 2014. Google Scholar
  20. Amit Chakrabarti, Graham Cormode, Andrew McGregor, Justin Thaler, and Suresh Venkatasubramanian. Verifiable stream computation and Arthur-Merlin communication. In Proceedings of the 30th Conference on Computational Complexity, CCC 2015, pages 217-243, 2015. Google Scholar
  21. Sourav Chakraborty, Eldar Fischer, Yonatan Goldhirsh, and Arie Matsliah. On the power of conditional samples in distribution testing. SIAM Journal on Computing, 45(4):1261-1296, 2016. Google Scholar
  22. Siu-On Chan, Ilias Diakonikolas, Gregory Valiant, and Paul Valiant. Optimal algorithms for testing closeness of discrete distributions. In Proceedings of the 25th Symposium on Discrete Algorithms, SODA 2014, pages 1193-1203, 2014. Google Scholar
  23. Alessandro Chiesa and Tom Gur. Proofs of proximity for distribution testing. Electronic Colloquium on Computational Complexity (ECCC), 24:155, 2017. URL: https://eccc.weizmann.ac.il/report/2017/155.
  24. Ilias Diakonikolas and Daniel M. Kane. A new approach for testing properties of discrete distributions. In Proceedings of the 57th Annual Symposium on Foundations of Computer Science, FOCS 2016, pages 685-694, 2016. Google Scholar
  25. Funda Ergün, Ravi Kumar, and Ronitt Rubinfeld. Fast approximate probabilistically checkable proofs. Information and Computation, 189(2):135-159, 2004. Google Scholar
  26. Moein Falahatgar, Ashkan Jafarpour, Alon Orlitsky, Venkatadheeraj Pichapati, and Ananda Theertha Suresh. Faster algorithms for testing under conditional sampling. In Conference on Learning Theory, COLT '15, pages 607-636, 2015. Google Scholar
  27. Eldar Fischer, Yonatan Goldhirsh, and Oded Lachish. Partial tests, universal tests and decomposability. In Proceedings of the 5th Innovations in Theoretical Computer Science Conference, ITCS 2014, pages 483-500, 2014. Google Scholar
  28. Eldar Fischer, Oded Lachish, and Yadu Vasudev. Trading query complexity for sample-based testing and multi-testing scalability. In Proceedings of the 56th Symposium on Foundations of Computer Science, FOCS 2015, pages 1163-1182, 2015. Google Scholar
  29. Oded Goldreich. Introduction to property testing, 2017. Google Scholar
  30. Oded Goldreich, Shafi Goldwasser, and Dana Ron. Property testing and its connection to learning and approximation. Journal of the ACM, 45(4):653-750, 1998. Google Scholar
  31. Oded Goldreich and Tom Gur. Universal locally verifiable codes and 3-round interactive proofs of proximity for CSP. Electronic Colloquium on Computational Complexity (ECCC), 23:192, 2016. URL: http://eccc.hpi-web.de/report/2016/192.
  32. Oded Goldreich, Tom Gur, and Ilan Komargodski. Strong locally testable codes with relaxed local decoders. In Proceedings of the 30th Conference on Computational Complexity, CCC 2015, pages 1-41, 2015. Google Scholar
  33. Oded Goldreich, Tom Gur, and Ron D. Rothblum. Proofs of proximity for context-free languages and read-once branching programs. In Proceedings of the 42nd International Colloquium on Automata, Languages, and Programming, ICALP 2015, pages 666-677, 2015. Google Scholar
  34. Oded Goldreich and Dana Ron. On testing expansion in bounded-degree graphs. In Studies in Complexity and Cryptography, pages 68-75. Springer, 2011. Google Scholar
  35. Oded Goldreich and Or Sheffet. On the randomness complexity of property testing. Computational Complexity, 19(1):99-133, 2010. Google Scholar
  36. Shafi Goldwasser, Silvio Micali, and Charles Rackoff. The knowledge complexity of interactive proof systems. SIAM Journal on computing, 18(1):186-208, 1989. Google Scholar
  37. Shafi Goldwasser and Michael Sipser. Private coins versus public coins in interactive proof systems. In Proceedings of the 18th a Annual ACM Symposium on Theory of Computing, STOC 1986, pages 59-68, 1986. Google Scholar
  38. Tom Gur and Ran Raz. Arthur-Merlin streaming complexity. Information and Computing, 243:145-165, 2015. Google Scholar
  39. Tom Gur and Ron Rothblum. Non-interactive proofs of proximity. Computational Complexity, To appear, 2017. Google Scholar
  40. Tom Gur and Ron D. Rothblum. A hierarchy theorem for interactive proofs of proximity. In Proceedings of the 8th Innovations in Theoretical Computer Science Conference, ITCS 2017, 2017. Google Scholar
  41. Yael Tauman Kalai and Ron D. Rothblum. Arguments of proximity. In Proceedings of the 35th Annual International Cryptology Conference, CRYPTO 2015, pages 422-442, 2015. Google Scholar
  42. Hartmut Klauck. On Arthur Merlin games in communication complexity. In Proceedings of the 26th Conference on Computational Complexity, CCC 2017, pages 189-199, 2011. Google Scholar
  43. Erich L Lehmann and Joseph P Romano. Testing statistical hypotheses. Springer Science &Business Media, 2006. Google Scholar
  44. Reut Levi, Dana Ron, and Ronitt Rubinfeld. Testing properties of collections of distributions. Theory of Computing, 9:295-347, 2013. Google Scholar
  45. Ilan Newman and Mario Szegedy. Public vs. private coin flips in one round communication games. In Proceedings of the 28th a Annual ACM Symposium on Theory of Computing, STOC 1996, pages 561-570, 1996. Google Scholar
  46. Liam Paninski. Estimating entropy on m bins given fewer than m samples. IEEE Transactions on Information Theory, 50(9):2200-2203, 2004. Google Scholar
  47. Liam Paninski. A coincidence-based test for uniformity given very sparsely sampled discrete data. IEEE Transactions on Information Theory, 54(10):4750-4755, 2008. Google Scholar
  48. Aditi Raghunathan, Gregory Valiant, and James Zou. Estimating the unseen from multiple populations. CoRR, abs/1707.03854, 2017. URL: http://arxiv.org/abs/1707.03854.
  49. Sofya Raskhodnikova, Dana Ron, Amir Shpilka, and Adam Smith. Strong lower bounds for approximating distribution support size and the distinct elements problem. SIAM Journal on Computing, 39:813-842, 2009. Google Scholar
  50. Ran Raz and Amir Shpilka. On the power of quantum proofs. In Proceedings of the 19th Conference on Computational Complexity, CCC 2004, pages 260-274, 2004. Google Scholar
  51. Omer Reingold, Ron Rothblum, and Guy Rothblum. Constant-round interactive proofs for delegating computation. In Proceedings of the 48th ACM Symposium on the Theory of Computing, STOC 2016, pages 49-62, 2016. Google Scholar
  52. Guy N. Rothblum, Salil P. Vadhan, and Avi Wigderson. Interactive proofs of proximity: delegating computation in sublinear time. In Proceedings of the 45th Symposium on Theory of Computing, STOC 2013, pages 793-802, 2013. Google Scholar
  53. Ronitt Rubinfeld. Taming big probability distributions. ACM Crossroads, 19(1):24-28, 2012. Google Scholar
  54. Ronitt Rubinfeld and Rocco Servedio. Testing monotone high-dimensional distributions. Random Structures &Algorithms, 34:24-44, 2009. Google Scholar
  55. Ronitt Rubinfeld and Madhu Sudan. Robust characterization of polynomials with applications to program testing. SIAM Journal on Computing, 25(2):252-271, 1996. Google Scholar
  56. Gregory Valiant and Paul Valiant. Estimating the unseen: an n/log(n)-sample estimator for entropy and support size, shown optimal via new CLTs. In Proceedings of the 43rd Symposium on Theory of Computing, STOC 2011, pages 685-694, 2011. Google Scholar
  57. Gregory Valiant and Paul Valiant. An automatic inequality prover and instance optimal identity testing. SIAM Journal on Computing, 46(1):429-455, 2017. Google Scholar
  58. Paul Valiant. Testing symmetric properties of distributions. SIAM Journal on Computing, 40:1927-1968, 2011. Google Scholar
  59. Thomas Vidick and John Watrous. Quantum proofs. Foundations and Trends in Theoretical Computer Science, 11:1-215, 2016. Google Scholar
  60. John Von Neumann. Various techniques used in connection with random digits. National Bureau of Standards Applied Math Series, 12:36-38, 1951. Google Scholar
  61. John Watrous. Succinct quantum proofs for properties of finite groups. In Proceedings of the 41st Annual IEEE Symposium on Foundations of Computer Science, FOCS 2000, pages 537-546, 2000. Google Scholar
  62. James Zou, Gregory Valiant, Paul Valiant, Konrad Karczewski, Siu On Chan, Kaitlin Samocha, Monkol Lek, Shamil Sunyaev, Mark Daly, and Daniel G MacArthur. Quantifying unobserved protein-coding variants in human populations provides a roadmap for large-scale sequencing projects. Nature Communications, 7, 2016. Google Scholar
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail