Random Separating Hyperplane Theorem and Learning Polytopes

Authors: Chiranjib Bhattacharyya, Ravindran Kannan, Amit Kumar

  • Filesize: 0.83 MB
  • 20 pages

Author Details

Chiranjib Bhattacharyya
  • Department of Computer Science and Automation, Indian Institute of Science, Bangalore, India
Ravindran Kannan
  • Department of Operations Research, Carnegie Mellon University, Pittsburgh, USA
Amit Kumar
  • Department of Computer Science and Engineering, Indian Institute of Technology Delhi, India

Cite As

Chiranjib Bhattacharyya, Ravindran Kannan, and Amit Kumar. Random Separating Hyperplane Theorem and Learning Polytopes. In 51st International Colloquium on Automata, Languages, and Programming (ICALP 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 297, pp. 25:1-25:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


The Separating Hyperplane Theorem is a fundamental result in convex geometry with myriad applications. It asserts that for a point a not in a closed convex set K, there is a hyperplane with K on one side and a strictly on the other. Our first result, the Random Separating Hyperplane Theorem (RSH), strengthens this for polytopes: if the distance between a and a polytope K with k vertices and unit diameter in ℜ^d is at least δ, where δ is a fixed constant in (0,1), then a randomly chosen hyperplane separates a and K with probability at least 1/poly(k) and margin at least Ω(δ/√d).

RSH has algorithmic applications in learning polytopes. We consider a fundamental problem, denoted the "Hausdorff problem", of learning a unit-diameter polytope K within Hausdorff distance δ, given an optimization oracle for K. Using RSH, we show that with polynomially many random queries to the optimization oracle, K can be approximated within error O(δ). To our knowledge, this is the first provable algorithm for the Hausdorff problem in this setting. Building on this result, we show that if the vertices of K are well separated, then an optimization oracle can be used to generate a list of points, each within distance O(δ) of K, such that the list contains a point close to each vertex of K. Further, we show how to prune this list to obtain a (unique) approximation to each vertex of the polytope.

We prove that in many latent variable settings, e.g., topic modeling and LDA, optimization oracles do exist, provided we project to a suitable SVD subspace. Thus, our work yields the first efficient algorithm for finding approximations to the vertices of the latent polytope under the well-separatedness assumption. This assumption states that each vertex of K is far from the convex hull of the remaining vertices of K; it is much weaker than the assumptions behind other algorithms in the literature for finding vertices of the latent polytope.
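The RSH guarantee can be checked empirically by Monte Carlo: sample random unit directions and count how often the resulting hyperplane places the query point a strictly beyond the farthest vertex of K by the stated margin. The sketch below (Python/NumPy) does this for an illustrative toy polytope; the specific instance and the constant in front of δ/√d are our own choices for illustration, not parameters from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_unit_vectors(n, d, rng):
    """Sample n directions uniformly from the unit sphere in R^d."""
    g = rng.standard_normal((n, d))
    return g / np.linalg.norm(g, axis=1, keepdims=True)

def separation_rate(a, vertices, margin, trials=100_000, rng=rng):
    """Fraction of random directions r for which the hyperplane normal to r
    separates a from every vertex of K with at least the given margin."""
    r = random_unit_vectors(trials, a.shape[0], rng)
    side_a = r @ a                         # projection of the query point
    side_K = (r @ vertices.T).max(axis=1)  # farthest vertex along each direction
    return float(np.mean(side_a - side_K >= margin))

# Toy instance: a unit-diameter segment in R^3 and a point at distance delta = 0.3.
V = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
a = np.array([0.5, 0.3, 0.0])
delta, d = 0.3, 3
rate = separation_rate(a, V, margin=0.1 * delta / np.sqrt(d))
print(rate)
```

On this instance a constant fraction of random directions separates with the required margin, consistent with the 1/poly(k) lower bound for small k; since separation only requires the direction to tilt sufficiently toward a and away from K, the separating set of directions is a spherical cap-like region of non-negligible measure.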

Subject Classification

ACM Subject Classification
  • Theory of computation → Unsupervised learning and clustering

Keywords and Phrases
  • Separating Hyperplane Theorem
  • Learning Polytopes
  • Optimization Oracles

