Learning Arithmetic Formulas in the Presence of Noise: A General Framework and Applications to Unsupervised Learning

Authors
Pritam Chandra, Ankit Garg, Neeraj Kayal, Kunal Mittal, Tanmay Sinha




Author Details

Pritam Chandra
  • Microsoft Research, Bangalore, India
Ankit Garg
  • Microsoft Research, Bangalore, India
Neeraj Kayal
  • Microsoft Research, Bangalore, India
Kunal Mittal
  • Princeton University, NJ, USA
Tanmay Sinha
  • Microsoft Research, Bangalore, India

Cite As

Pritam Chandra, Ankit Garg, Neeraj Kayal, Kunal Mittal, and Tanmay Sinha. Learning Arithmetic Formulas in the Presence of Noise: A General Framework and Applications to Unsupervised Learning. In 15th Innovations in Theoretical Computer Science Conference (ITCS 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 287, pp. 25:1-25:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)
https://doi.org/10.4230/LIPIcs.ITCS.2024.25

Abstract

We present a general framework for designing efficient algorithms for unsupervised learning problems, such as mixtures of Gaussians and subspace clustering. Our framework is based on a meta algorithm that learns arithmetic formulas in the presence of noise, using lower bounds. This builds upon the recent work of Garg, Kayal and Saha (FOCS '20), who designed such a framework for learning arithmetic formulas without any noise. A key ingredient of our meta algorithm is an efficient algorithm for a novel problem called Robust Vector Space Decomposition. We show that our meta algorithm works well when certain matrices have sufficiently large smallest non-zero singular values. We conjecture that this condition holds for smoothed instances of our problems, and thus our framework would yield efficient algorithms for these problems in the smoothed setting.
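As a rough numerical illustration (not taken from the paper), the non-degeneracy condition above — that certain matrices have a sufficiently large smallest non-zero singular value — can be checked directly via an SVD. The sketch below is a minimal stand-in: the matrix `M`, the tolerance `tol`, and the helper `smallest_nonzero_singular_value` are all hypothetical names introduced here for illustration, not part of the authors' framework.

```python
import numpy as np

def smallest_nonzero_singular_value(M, tol=1e-10):
    """Return the smallest singular value of M above tol (0.0 if none).

    Singular values below tol are treated as numerically zero, so this
    measures how well-conditioned M is on its (numerical) row space.
    """
    s = np.linalg.svd(M, compute_uv=False)
    nz = s[s > tol]
    return nz.min() if nz.size else 0.0

# A rank-2 example: the third row is the sum of the first two, so one
# singular value is (numerically) zero and is ignored by the helper.
M = np.array([[1.0, 0.0, 0.0],
              [0.0, 2.0, 0.0],
              [1.0, 2.0, 0.0]])
sigma_min = smallest_nonzero_singular_value(M)
```

In the spirit of the paper's condition, a meta algorithm of this kind would require `sigma_min` to be bounded away from zero (e.g. inverse-polynomially in the input size) for the relevant matrices; for smoothed instances, random perturbations are conjectured to guarantee this.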

Subject Classification

ACM Subject Classification
  • Theory of computation → Approximation algorithms analysis
Keywords
  • Arithmetic Circuits
  • Robust Vector Space Decomposition
  • Subspace Clustering
  • Mixtures of Gaussians

References

  1. Animashree Anandkumar, Rong Ge, Daniel Hsu, Sham M. Kakade, and Matus Telgarsky. Tensor decompositions for learning latent variable models. Journal of Machine Learning Research, 15(1):2773-2832, January 2014. Google Scholar
  2. Nima Anari, Kuikui Liu, Shayan Oveis Gharan, and Cynthia Vinzant. Log-concave polynomials II: High-dimensional walks and an FPRAS for counting bases of a matroid. In STOC'19 - Proceedings of the 51st Annual ACM SIGACT Symposium on Theory of Computing, pages 1-12, 2019. Google Scholar
  3. Mitali Bafna, Jun-Ting Hsieh, Pravesh K Kothari, and Jeff Xu. Polynomial-time power-sum decomposition of polynomials. In 2022 IEEE 63rd Annual Symposium on Foundations of Computer Science (FOCS), pages 956-967. IEEE, 2022. Google Scholar
  4. Ainesh Bakshi, Ilias Diakonikolas, He Jia, Daniel M. Kane, Pravesh K. Kothari, and Santosh S. Vempala. Robustly learning mixtures of k arbitrary Gaussians. In STOC '22 - Proceedings of the 54th Annual ACM SIGACT Symposium on Theory of Computing, pages 1234-1247. ACM, New York, 2022. Google Scholar
  5. Aditya Bhaskara, Aidao Chen, Aidan Perreault, and Aravindan Vijayaraghavan. Smoothed analysis in unsupervised learning via decoupling. In 2019 IEEE 60th Annual Symposium on Foundations of Computer Science (FOCS), pages 582-610. IEEE, 2019. Google Scholar
  6. Pritam Chandra, Ankit Garg, Neeraj Kayal, Kunal Mittal, and Tanmay Sinha. Learning arithmetic formulas in the presence of noise: A general framework and applications to unsupervised learning, 2023. URL: https://arxiv.org/abs/2311.07284.
  7. Sitan Chen, Jerry Li, Yuanzhi Li, and Anru R. Zhang. Learning polynomial transformations via generalized tensor decompositions. In Barna Saha and Rocco A. Servedio, editors, Proceedings of the 55th Annual ACM Symposium on Theory of Computing, STOC 2023, Orlando, FL, USA, June 20-23, 2023, pages 1671-1684. ACM, 2023. URL: https://doi.org/10.1145/3564246.3585209.
  8. Alexander L. Chistov, Gábor Ivanyos, and Marek Karpinski. Polynomial time algorithms for modules over finite dimensional algebras. In Proceedings of the 1997 International Symposium on Symbolic and Algebraic Computation, ISSAC '97, Maui, Hawaii, USA, July 21-23, 1997, pages 68-74, 1997. Google Scholar
  9. Ankit Garg, Leonid Gurvits, Rafael Mendes de Oliveira, and Avi Wigderson. Operator scaling: Theory and applications. Found. Comput. Math., 20(2):223-290, 2020. URL: https://doi.org/10.1007/s10208-019-09417-z.
  10. Ankit Garg, Neeraj Kayal, and Chandan Saha. Learning sums of powers of low-degree polynomials in the non-degenerate case. In 61st IEEE Annual Symposium on Foundations of Computer Science, FOCS 2020, Durham, NC, USA, November 16-19, 2020, pages 889-899. IEEE, 2020. arXiv version at URL: https://arxiv.org/abs/2004.06898.
  11. Rong Ge, Qingqing Huang, and Sham M. Kakade. Learning mixtures of Gaussians in high dimensions. In Proceedings of the Forty-Seventh Annual ACM Symposium on Theory of Computing, STOC 2015, pages 761-770, 2015. arXiv version at URL: https://arxiv.org/abs/1503.00424.
  12. Erich Kaltofen, John P. May, Zhengfeng Yang, and Lihong Zhi. Approximate factorization of multivariate polynomials using singular value decomposition. Journal of Symbolic Computation, 43(5):359-376, 2008. URL: https://doi.org/10.1016/J.JSC.2007.11.005.
  13. Erich Kaltofen and Barry M. Trager. Computing with polynomials given by black boxes for their evaluations: Greatest common divisors, factorization, separation of numerators and denominators. Journal of Symbolic Computation, 9(3):301-320, 1990. Google Scholar
  14. Tamara G. Kolda and Brett W. Bader. Tensor decompositions and applications. SIAM Review, 51(3):455-500, 2009. Google Scholar
  15. Nimrod Megiddo and Arie Tamir. On the complexity of locating linear facilities in the plane. Operations Research Letters, 1(5):194-197, 1982. URL: https://doi.org/10.1016/0167-6377(82)90039-6.
  16. Lance Parsons, Ehtesham Haque, and Huan Liu. Subspace clustering for high dimensional data: A review. SIGKDD Explor. Newsl., 6(1):90-105, June 2004. Google Scholar
  17. Youming Qiao. Block diagonalization for adjoint action. Private communication, 2018. Google Scholar
  18. Wentao Qu, Xianchao Xiu, Huangyue Chen, and Lingchen Kong. A survey on high-dimensional subspace clustering. Mathematics, 11(2), 2023. URL: https://doi.org/10.3390/math11020436.
  19. Aravindan Vijayaraghavan. Efficient tensor decompositions. In Tim Roughgarden, editor, Beyond the Worst-Case Analysis of Algorithms, pages 424-444. Cambridge University Press, 2020. URL: https://arxiv.org/abs/2007.15589.
  20. Hanna Wallach. Topic modeling: Beyond bag-of-words. In ICML 2006 - Proceedings of the 23rd International Conference on Machine Learning, volume 2006, pages 977-984, January 2006. URL: https://doi.org/10.1145/1143844.1143967.