Downsampling for Testing and Learning in Product Distributions

Authors: Nathaniel Harms, Yuichi Yoshida




Author Details

Nathaniel Harms
  • University of Waterloo, Canada
Yuichi Yoshida
  • National Institute of Informatics, Tokyo, Japan

Cite As

Nathaniel Harms and Yuichi Yoshida. Downsampling for Testing and Learning in Product Distributions. In 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 229, pp. 71:1-71:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)
https://doi.org/10.4230/LIPIcs.ICALP.2022.71

Abstract

We study distribution-free property testing and learning problems where the unknown probability distribution is a product distribution over ℝ^d. For many important classes of functions, such as intersections of halfspaces, polynomial threshold functions, convex sets, and k-alternating functions, the known algorithms either have complexity that depends on the support size of the distribution, or are proven to work only for specific examples of product distributions. We introduce a general method, which we call downsampling, that resolves these issues. Downsampling uses a notion of "rectilinear isoperimetry" for product distributions, which further strengthens the connection between isoperimetry, testing, and learning. Using this technique, we attain new efficient distribution-free algorithms under product distributions on ℝ^d:
  1. A simpler proof for non-adaptive, one-sided monotonicity testing of functions [n]^d → {0,1}, and improved sample complexity for testing monotonicity over unknown product distributions, from O(d⁷) [Black, Chakrabarty, & Seshadhri, SODA 2020] to O(d³).
  2. Polynomial-time agnostic learning algorithms for functions of a constant number of halfspaces, and for constant-degree polynomial threshold functions.
  3. An exp{O(d log(dk))}-time agnostic learning algorithm and an exp{O(d log(dk))}-sample tolerant tester for functions of k convex sets, and a 2^O(d)-sample one-sided tester for convex sets.
  4. An exp{O(k√d)}-time agnostic learning algorithm for k-alternating functions, and a sample-based tolerant tester with the same complexity.
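Because the unknown distribution is a product distribution, each coordinate can be coarsened independently: estimate r equal-probability cut points per coordinate from samples, reducing the continuous domain ℝ^d to the finite grid [r]^d. The sketch below illustrates this quantile-binning idea only; the function name and the bucket-count parameter r are illustrative choices, not taken from the paper.

```python
import numpy as np

def downsample(samples, r):
    """Map samples from a product distribution over R^d onto the grid [r]^d.

    samples: (n, d) array of draws from an unknown product distribution.
    r: number of equal-probability buckets per coordinate.
    Returns an (n, d) integer array with entries in {0, ..., r-1}.
    """
    n, d = samples.shape
    # Per-coordinate quantile cut points, estimated from the sample itself;
    # shape (r-1, d), with each column sorted in increasing order.
    cuts = np.quantile(samples, [i / r for i in range(1, r)], axis=0)
    coarse = np.empty((n, d), dtype=int)
    for j in range(d):
        # searchsorted assigns each value to the bucket between its cut points.
        coarse[:, j] = np.searchsorted(cuts[:, j], samples[:, j])
    return coarse
```

Since the cut points are empirical quantiles, each of the r buckets in every coordinate receives roughly a 1/r fraction of the sample, which is what makes the induced grid distribution close to uniform over [r]^d.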

Subject Classification

ACM Subject Classification
  • Mathematics of computing → Probabilistic algorithms
  • Theory of computation → Machine learning theory
  • Theory of computation → Streaming, sublinear and near linear time algorithms
Keywords
  • property testing
  • learning
  • monotonicity
  • halfspaces
  • intersections of halfspaces
  • polynomial threshold functions


References

  1. Nir Ailon and Bernard Chazelle. Information theory in property testing and monotonicity testing in higher dimension. Information and Computation, 204(11):1704-1717, 2006.
  2. Maria-Florina Balcan, Eric Blais, Avrim Blum, and Liu Yang. Active property testing. In Proceedings of the IEEE 53rd Annual Symposium on Foundations of Computer Science (FOCS), pages 21-30, 2012.
  3. Piotr Berman, Meiram Murzabulatov, and Sofya Raskhodnikova. Tolerant testers of image properties. In 43rd International Colloquium on Automata, Languages, and Programming (ICALP 2016). Schloss Dagstuhl-Leibniz-Zentrum für Informatik, 2016.
  4. Piotr Berman, Meiram Murzabulatov, and Sofya Raskhodnikova. Testing convexity of figures under the uniform distribution. Random Structures & Algorithms, 54(3):413-443, 2019.
  5. Hadley Black, Deeparnab Chakrabarty, and C. Seshadhri. A o(d) ⋅ polylog n monotonicity tester for Boolean functions over the hypergrid [n]^d. In Proceedings of the 29th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 2133-2151, 2018.
  6. Hadley Black, Deeparnab Chakrabarty, and C. Seshadhri. Domain reduction for monotonicity testing: A o(d) tester for Boolean functions in d-dimensions. In Proceedings of the 31st Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1975-1994, 2020.
  7. Eric Blais and Abhinav Bommireddi. On testing and robust characterizations of convexity. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2020). Schloss Dagstuhl-Leibniz-Zentrum für Informatik, 2020.
  8. Eric Blais, Clément Canonne, Igor Oliveira, Rocco Servedio, and Li-Yang Tan. Learning circuits with few negations. Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, page 512, 2015.
  9. Eric Blais, Renato Ferreira Pinto Jr., and Nathaniel Harms. VC dimension and distribution-free sample-based testing. In Proceedings of the 53rd Annual ACM SIGACT Symposium on Theory of Computing (STOC), pages 504-517, 2021.
  10. Eric Blais, Ryan O'Donnell, and Karl Wimmer. Polynomial regression under arbitrary product distributions. Machine Learning, 80(2-3):273-294, 2010.
  11. Eric Blais, Sofya Raskhodnikova, and Grigory Yaroslavtsev. Lower bounds for testing properties of functions over hypergrid domains. In Proceedings of the IEEE 29th Conference on Computational Complexity (CCC), pages 309-320, 2014.
  12. Eric Blais and Yuichi Yoshida. A characterization of constant-sample testable properties. Random Structures & Algorithms, 55(1):73-88, 2019.
  13. Avrim L. Blum and Ronald L. Rivest. Training a 3-node neural network is NP-complete. Neural Networks, 5(1):117-127, 1992.
  14. Anselm Blumer, Andrzej Ehrenfeucht, David Haussler, and Manfred K. Warmuth. Learnability and the Vapnik-Chervonenkis dimension. Journal of the ACM (JACM), 36(4):929-965, 1989.
  15. Clément L. Canonne, Elena Grigorescu, Siyao Guo, Akash Kumar, and Karl Wimmer. Testing k-monotonicity: The rise and fall of Boolean functions. Theory of Computing, 15(1):1-55, 2019.
  16. Deeparnab Chakrabarty, Kashyap Dixit, Madhav Jha, and C. Seshadhri. Property testing on product distributions: Optimal testers for bounded derivative properties. ACM Transactions on Algorithms (TALG), 13(2):1-30, 2017.
  17. Deeparnab Chakrabarty and Comandur Seshadhri. An o(n) monotonicity tester for Boolean functions over the hypercube. SIAM Journal on Computing, 45(2):461-472, 2016.
  18. Xi Chen, Adam Freilich, Rocco A. Servedio, and Timothy Sun. Sample-based high-dimensional convexity testing. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2017). Schloss Dagstuhl-Leibniz-Zentrum für Informatik, 2017.
  19. Xue Chen, Anindya De, and Rocco A. Servedio. Testing noisy linear functions for sparsity. In Proceedings of the 52nd Annual ACM SIGACT Symposium on Theory of Computing (STOC), pages 610-623, 2020.
  20. Mónika Csikós, Nabil H. Mustafa, and Andrey Kupavskii. Tight lower bounds on the VC-dimension of geometric set systems. Journal of Machine Learning Research, 20(81):1-8, 2019.
  21. Anindya De, Elchanan Mossel, and Joe Neeman. Is your function low-dimensional? In Conference on Learning Theory (COLT), pages 979-993, 2019.
  22. Ilias Diakonikolas, Prahladh Harsha, Adam Klivans, Raghu Meka, Prasad Raghavendra, Rocco A. Servedio, and Li-Yang Tan. Bounding the average sensitivity and noise sensitivity of polynomial threshold functions. In Proceedings of the 42nd ACM Symposium on Theory of Computing (STOC), pages 533-542, 2010.
  23. Ilias Diakonikolas, Daniel M. Kane, and Alistair Stewart. Learning geometric concepts with nasty noise. In Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing (STOC), pages 1061-1073, 2018.
  24. Shahar Fattal and Dana Ron. Approximating the distance to monotonicity in high dimensions. ACM Transactions on Algorithms (TALG), 6(3):1-37, 2010.
  25. Eldar Fischer, Eric Lehman, Ilan Newman, Sofya Raskhodnikova, Ronitt Rubinfeld, and Alex Samorodnitsky. Monotonicity testing over general poset domains. In Proceedings of the 34th Annual ACM Symposium on Theory of Computing (STOC), pages 474-483, 2002.
  26. Noah Fleming and Yuichi Yoshida. Distribution-free testing of linear functions on ℝⁿ. In Proceedings of the 11th Innovations in Theoretical Computer Science Conference (ITCS). Schloss Dagstuhl-Leibniz-Zentrum für Informatik, 2020.
  27. Oded Goldreich and Dana Ron. On sample-based testers. ACM Transactions on Computation Theory (TOCT), 8(2):1-54, 2016.
  28. Shirley Halevy and Eyal Kushilevitz. Distribution-free property-testing. SIAM Journal on Computing, 37(4):1107-1138, 2007.
  29. Nathaniel Harms. Testing halfspaces over rotation-invariant distributions. In Proceedings of the 30th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 694-713, 2019.
  30. Adam T. Kalai, Adam R. Klivans, Yishay Mansour, and Rocco A. Servedio. Agnostically learning halfspaces. SIAM Journal on Computing, 37(6):1777-1805, 2008.
  31. Michael Kearns and Dana Ron. Testing problems with sublearning sample complexity. Journal of Computer and System Sciences, 61(3):428-456, 2000.
  32. Subhash Khot, Dor Minzer, and Muli Safra. On monotonicity testing and Boolean isoperimetric-type theorems. SIAM Journal on Computing, 47(6):2238-2276, 2018.
  33. Adam R. Klivans, Ryan O'Donnell, and Rocco A. Servedio. Learning intersections and thresholds of halfspaces. Journal of Computer and System Sciences, 68(4):808-840, 2004.
  34. Adam R. Klivans, Ryan O'Donnell, and Rocco A. Servedio. Learning geometric concepts via Gaussian surface area. In Proceedings of the 49th Annual IEEE Symposium on Foundations of Computer Science (FOCS), pages 541-550, 2008.
  35. Luis Rademacher and Santosh Vempala. Testing geometric convexity. In International Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS), pages 469-480. Springer, 2004.
  36. Sofya Raskhodnikova. Approximate testing of visual properties. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, pages 370-381. Springer, 2003.
  37. Dana Ron and Asaf Rosin. Optimal distribution-free sample-based testing of subsequence-freeness. In Proceedings of the 2021 ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 337-256. SIAM, 2021.
  38. Santosh Vempala. Learning convex concepts from Gaussian distributions with PCA. In Proceedings of the IEEE 51st Annual Symposium on Foundations of Computer Science (FOCS), pages 124-130, 2010.
  39. Santosh Vempala. A random-sampling-based algorithm for learning intersections of halfspaces. Journal of the ACM, 57(6):1-14, 2010.
  40. Hugh E. Warren. Lower bounds for approximation by nonlinear manifolds. Transactions of the American Mathematical Society, 133(1):167-178, 1968.