Downsampling for Testing and Learning in Product Distributions

Authors: Nathaniel Harms, Yuichi Yoshida




Author Details

Nathaniel Harms
  • University of Waterloo, Canada
Yuichi Yoshida
  • National Institute of Informatics, Tokyo, Japan

Cite As

Nathaniel Harms and Yuichi Yoshida. Downsampling for Testing and Learning in Product Distributions. In 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 229, pp. 71:1-71:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)
https://doi.org/10.4230/LIPIcs.ICALP.2022.71

Abstract

We study distribution-free property testing and learning problems where the unknown probability distribution is a product distribution over ℝ^d. For many important classes of functions, such as intersections of halfspaces, polynomial threshold functions, convex sets, and k-alternating functions, the known algorithms either have complexity that depends on the support size of the distribution, or are proven to work only for specific examples of product distributions. We introduce a general method, which we call downsampling, that resolves these issues. Downsampling uses a notion of "rectilinear isoperimetry" for product distributions, which further strengthens the connection between isoperimetry, testing, and learning. Using this technique, we attain new efficient distribution-free algorithms under product distributions on ℝ^d:
  1. A simpler proof for non-adaptive, one-sided monotonicity testing of functions [n]^d → {0,1}, and improved sample complexity for testing monotonicity over unknown product distributions, from O(d⁷) [Black, Chakrabarty, & Seshadhri, SODA 2020] to O(d³).
  2. Polynomial-time agnostic learning algorithms for functions of a constant number of halfspaces, and for constant-degree polynomial threshold functions.
  3. An exp{O(d log(dk))}-time agnostic learning algorithm and an exp{O(d log(dk))}-sample tolerant tester for functions of k convex sets, and a 2^O(d)-sample one-sided tester for convex sets.
  4. An exp{O(k√d)}-time agnostic learning algorithm for k-alternating functions, and a sample-based tolerant tester with the same complexity.
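Because the unknown distribution is a product distribution, each coordinate can be coarsened independently: estimate r equal-probability cut points per coordinate from samples, reducing the continuous domain ℝ^d to the finite grid [r]^d. The sketch below illustrates this quantile-binning idea only; the function name and the bucket-count parameter r are illustrative choices, not taken from the paper.

```python
import numpy as np

def downsample(samples, r):
    """Map samples from a product distribution over R^d onto the grid [r]^d.

    samples: (n, d) array of draws from an unknown product distribution.
    r: number of equal-probability buckets per coordinate.
    Returns an (n, d) integer array with entries in {0, ..., r-1}.
    """
    n, d = samples.shape
    # Per-coordinate quantile cut points, estimated from the sample itself;
    # shape (r-1, d), with each column sorted in increasing order.
    cuts = np.quantile(samples, [i / r for i in range(1, r)], axis=0)
    coarse = np.empty((n, d), dtype=int)
    for j in range(d):
        # searchsorted assigns each value to the bucket between its cut points.
        coarse[:, j] = np.searchsorted(cuts[:, j], samples[:, j])
    return coarse
```

Since the cut points are empirical quantiles, each of the r buckets in every coordinate receives roughly a 1/r fraction of the sample, which is what makes the induced grid distribution close to uniform over [r]^d.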

Subject Classification

ACM Subject Classification
  • Mathematics of computing → Probabilistic algorithms
  • Theory of computation → Machine learning theory
  • Theory of computation → Streaming, sublinear and near linear time algorithms
Keywords
  • property testing
  • learning
  • monotonicity
  • halfspaces
  • intersections of halfspaces
  • polynomial threshold functions


References

  1. Nir Ailon and Bernard Chazelle. Information theory in property testing and monotonicity testing in higher dimension. Information and Computation, 204(11):1704-1717, 2006.
  2. Maria-Florina Balcan, Eric Blais, Avrim Blum, and Liu Yang. Active property testing. In Proceedings of the IEEE 53rd Annual Symposium on Foundations of Computer Science (FOCS), pages 21-30, 2012.
  3. Piotr Berman, Meiram Murzabulatov, and Sofya Raskhodnikova. Tolerant testers of image properties. In 43rd International Colloquium on Automata, Languages, and Programming (ICALP 2016). Schloss Dagstuhl-Leibniz-Zentrum für Informatik, 2016.
  4. Piotr Berman, Meiram Murzabulatov, and Sofya Raskhodnikova. Testing convexity of figures under the uniform distribution. Random Structures & Algorithms, 54(3):413-443, 2019.
  5. Hadley Black, Deeparnab Chakrabarty, and C. Seshadhri. A o(d) ⋅ polylog n monotonicity tester for Boolean functions over the hypergrid [n]^d. In Proceedings of the 29th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 2133-2151, 2018.
  6. Hadley Black, Deeparnab Chakrabarty, and C. Seshadhri. Domain reduction for monotonicity testing: A o(d) tester for Boolean functions in d-dimensions. In Proceedings of the 31st Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1975-1994, 2020.
  7. Eric Blais and Abhinav Bommireddi. On testing and robust characterizations of convexity. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2020). Schloss Dagstuhl-Leibniz-Zentrum für Informatik, 2020.
  8. Eric Blais, Clément Canonne, Igor Oliveira, Rocco Servedio, and Li-Yang Tan. Learning circuits with few negations. Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, page 512, 2015.
  9. Eric Blais, Renato Ferreira Pinto Jr., and Nathaniel Harms. VC dimension and distribution-free sample-based testing. In Proceedings of the 53rd Annual ACM SIGACT Symposium on Theory of Computing (STOC), pages 504-517, 2021.
  10. Eric Blais, Ryan O'Donnell, and Karl Wimmer. Polynomial regression under arbitrary product distributions. Machine Learning, 80(2-3):273-294, 2010.
  11. Eric Blais, Sofya Raskhodnikova, and Grigory Yaroslavtsev. Lower bounds for testing properties of functions over hypergrid domains. In Proceedings of the IEEE 29th Conference on Computational Complexity (CCC), pages 309-320, 2014.
  12. Eric Blais and Yuichi Yoshida. A characterization of constant-sample testable properties. Random Structures & Algorithms, 55(1):73-88, 2019.
  13. Avrim L. Blum and Ronald L. Rivest. Training a 3-node neural network is NP-complete. Neural Networks, 5(1):117-127, 1992.
  14. Anselm Blumer, Andrzej Ehrenfeucht, David Haussler, and Manfred K. Warmuth. Learnability and the Vapnik-Chervonenkis dimension. Journal of the ACM (JACM), 36(4):929-965, 1989.
  15. Clément L. Canonne, Elena Grigorescu, Siyao Guo, Akash Kumar, and Karl Wimmer. Testing k-monotonicity: The rise and fall of Boolean functions. Theory of Computing, 15(1):1-55, 2019.
  16. Deeparnab Chakrabarty, Kashyap Dixit, Madhav Jha, and C. Seshadhri. Property testing on product distributions: Optimal testers for bounded derivative properties. ACM Transactions on Algorithms (TALG), 13(2):1-30, 2017.
  17. Deeparnab Chakrabarty and Comandur Seshadhri. An o(n) monotonicity tester for Boolean functions over the hypercube. SIAM Journal on Computing, 45(2):461-472, 2016.
  18. Xi Chen, Adam Freilich, Rocco A. Servedio, and Timothy Sun. Sample-based high-dimensional convexity testing. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2017). Schloss Dagstuhl-Leibniz-Zentrum für Informatik, 2017.
  19. Xue Chen, Anindya De, and Rocco A. Servedio. Testing noisy linear functions for sparsity. In Proceedings of the 52nd Annual ACM SIGACT Symposium on Theory of Computing (STOC), pages 610-623, 2020.
  20. Mónika Csikós, Nabil H. Mustafa, and Andrey Kupavskii. Tight lower bounds on the VC-dimension of geometric set systems. Journal of Machine Learning Research, 20(81):1-8, 2019.
  21. Anindya De, Elchanan Mossel, and Joe Neeman. Is your function low-dimensional? In Conference on Learning Theory (COLT), pages 979-993, 2019.
  22. Ilias Diakonikolas, Prahladh Harsha, Adam Klivans, Raghu Meka, Prasad Raghavendra, Rocco A. Servedio, and Li-Yang Tan. Bounding the average sensitivity and noise sensitivity of polynomial threshold functions. In Proceedings of the 42nd ACM Symposium on Theory of Computing (STOC), pages 533-542, 2010.
  23. Ilias Diakonikolas, Daniel M. Kane, and Alistair Stewart. Learning geometric concepts with nasty noise. In Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing (STOC), pages 1061-1073, 2018.
  24. Shahar Fattal and Dana Ron. Approximating the distance to monotonicity in high dimensions. ACM Transactions on Algorithms (TALG), 6(3):1-37, 2010.
  25. Eldar Fischer, Eric Lehman, Ilan Newman, Sofya Raskhodnikova, Ronitt Rubinfeld, and Alex Samorodnitsky. Monotonicity testing over general poset domains. In Proceedings of the 34th Annual ACM Symposium on Theory of Computing (STOC), pages 474-483, 2002.
  26. Noah Fleming and Yuichi Yoshida. Distribution-free testing of linear functions on ℝⁿ. In Proceedings of the 11th Innovations in Theoretical Computer Science Conference (ITCS). Schloss Dagstuhl-Leibniz-Zentrum für Informatik, 2020.
  27. Oded Goldreich and Dana Ron. On sample-based testers. ACM Transactions on Computation Theory (TOCT), 8(2):1-54, 2016.
  28. Shirley Halevy and Eyal Kushilevitz. Distribution-free property-testing. SIAM Journal on Computing, 37(4):1107-1138, 2007.
  29. Nathaniel Harms. Testing halfspaces over rotation-invariant distributions. In Proceedings of the 30th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 694-713, 2019.
  30. Adam T. Kalai, Adam R. Klivans, Yishay Mansour, and Rocco A. Servedio. Agnostically learning halfspaces. SIAM Journal on Computing, 37(6):1777-1805, 2008.
  31. Michael Kearns and Dana Ron. Testing problems with sublearning sample complexity. Journal of Computer and System Sciences, 61(3):428-456, 2000.
  32. Subhash Khot, Dor Minzer, and Muli Safra. On monotonicity testing and Boolean isoperimetric-type theorems. SIAM Journal on Computing, 47(6):2238-2276, 2018.
  33. Adam R. Klivans, Ryan O'Donnell, and Rocco A. Servedio. Learning intersections and thresholds of halfspaces. Journal of Computer and System Sciences, 68(4):808-840, 2004.
  34. Adam R. Klivans, Ryan O'Donnell, and Rocco A. Servedio. Learning geometric concepts via Gaussian surface area. In Proceedings of the 49th Annual IEEE Symposium on Foundations of Computer Science (FOCS), pages 541-550, 2008.
  35. Luis Rademacher and Santosh Vempala. Testing geometric convexity. In International Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS), pages 469-480. Springer, 2004.
  36. Sofya Raskhodnikova. Approximate testing of visual properties. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, pages 370-381. Springer, 2003.
  37. Dana Ron and Asaf Rosin. Optimal distribution-free sample-based testing of subsequence-freeness. In Proceedings of the 2021 ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 337-256. SIAM, 2021.
  38. Santosh Vempala. Learning convex concepts from Gaussian distributions with PCA. In Proceedings of the IEEE 51st Annual Symposium on Foundations of Computer Science (FOCS), pages 124-130, 2010.
  39. Santosh Vempala. A random-sampling-based algorithm for learning intersections of halfspaces. Journal of the ACM, 57(6):1-14, 2010.
  40. Hugh E. Warren. Lower bounds for approximation by nonlinear manifolds. Transactions of the American Mathematical Society, 133(1):167-178, 1968.