Sparse Regression via Range Counting

Cardinal, Jean; Ooms, Aurélien

doi:10.4230/LIPIcs.SWAT.2020.20

Subject Classification

ACM Subject Classification

Theory of computation → Design and analysis of algorithms
Theory of computation → Computational geometry
Information systems → Nearest-neighbor search

Keywords

Sparse Linear Regression
Orthogonal Range Searching
Affine Degeneracy Testing
Nearest Neighbors
Hyperplane Arrangements

Abstract

The sparse regression problem, also known as the best subset selection problem, can be cast as follows: given a set S of n points in ℝ^d, a point y ∈ ℝ^d, and an integer 2 ≤ k ≤ d, find an affine combination of at most k points of S that is nearest to y. We describe an O(n^{k-1} log^{d-k+2} n)-time randomized (1+ε)-approximation algorithm for this problem, for constant d and ε. This is the first algorithm for this problem running in time o(n^k). Its running time is similar to the query time of a data structure recently proposed by Har-Peled, Indyk, and Mahabadi (ICALP'18), while not requiring any preprocessing. Up to polylogarithmic factors, it matches a conditional lower bound relying on a conjecture about affine degeneracy testing. In the special case where k = d = O(1), we provide a simple O_δ(n^{d-1+δ})-time deterministic exact algorithm, for any δ > 0. Finally, we show how to adapt the approximation algorithm to the sparse linear regression and sparse convex regression problems with the same running time, up to polylogarithmic factors.
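To make the problem statement concrete, here is a minimal sketch of the naive exhaustive baseline that tries every subset of k points, which is the roughly n^k behaviour the abstract's o(n^k) bound improves on. This is not the range-counting algorithm of the paper. The function names `dist_to_affine_hull` and `brute_force_sparse_affine_regression` are hypothetical, chosen for illustration, and the sketch assumes points are given as plain tuples of floats.

```python
from itertools import combinations
from math import sqrt


def dist_to_affine_hull(points, y, tol=1e-12):
    """Euclidean distance from y to the affine hull of `points`.

    Uses Gram-Schmidt on the directions p_i - p_0, then removes from
    y - p_0 its component inside the spanned direction space.
    """
    base = points[0]
    residual = [yi - bi for yi, bi in zip(y, base)]
    basis = []  # orthonormal basis of the hull's direction space
    for p in points[1:]:
        v = [pi - bi for pi, bi in zip(p, base)]
        for b in basis:
            c = sum(vi * bi for vi, bi in zip(v, b))
            v = [vi - c * bi for vi, bi in zip(v, b)]
        norm = sqrt(sum(vi * vi for vi in v))
        if norm > tol:  # skip affinely dependent points
            basis.append([vi / norm for vi in v])
    for b in basis:
        c = sum(ri * bi for ri, bi in zip(residual, b))
        residual = [ri - c * bi for ri, bi in zip(residual, b)]
    return sqrt(sum(ri * ri for ri in residual))


def brute_force_sparse_affine_regression(S, y, k):
    """Return (indices, distance) of the k-subset of S whose affine hull
    is nearest to y. Smaller subsets never do better than a superset of
    size k, so only subsets of size min(k, len(S)) are enumerated."""
    size = min(k, len(S))
    best_idx, best_dist = None, float("inf")
    for idx in combinations(range(len(S)), size):
        d = dist_to_affine_hull([S[i] for i in idx], y)
        if d < best_dist:
            best_idx, best_dist = idx, d
    return best_idx, best_dist


if __name__ == "__main__":
    S = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
    y = (0.5, 0.5, 0.0)
    print(brute_force_sparse_affine_regression(S, y, k=2))
```

In the small example, y is the midpoint of the second and third points, so the line through them is the nearest affine hull of two points. The loop evaluates all C(n, k) subsets, which is what makes this baseline impractical beyond small n and k.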

Cite As

Jean Cardinal and Aurélien Ooms. Sparse Regression via Range Counting. In 17th Scandinavian Symposium and Workshops on Algorithm Theory (SWAT 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 162, pp. 20:1-20:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020) https://doi.org/10.4230/LIPIcs.SWAT.2020.20

Author Details

Jean Cardinal

Université libre de Bruxelles (ULB), Brussels, Belgium

Aurélien Ooms

BARC, University of Copenhagen, Denmark

References

Pankaj K. Agarwal, Natan Rubin, and Micha Sharir. Approximate nearest neighbor search amid higher-dimensional flats. In 25th Annual European Symposium on Algorithms, ESA 2017, September 4-6, 2017, Vienna, Austria, pages 4:1-4:13, 2017. URL: https://doi.org/10.4230/LIPIcs.ESA.2017.4.
Michal Aharon, Michael Elad, and Alfred Bruckstein. K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation. IEEE Transactions on Signal Processing, 54(11):4311-4322, November 2006. URL: https://doi.org/10.1109/TSP.2006.881199.
Nir Ailon and Bernard Chazelle. Lower bounds for linear degeneracy testing. J. ACM, 52(2):157-171, 2005. URL: https://doi.org/10.1145/1059513.1059515.
Rahul Arya, Sunil Arya, Guilherme D. da Fonseca, and David M. Mount. Optimal bound on the combinatorial complexity of approximating polytopes. In SODA, pages 786-805. SIAM, 2020.
Sunil Arya, Guilherme D. da Fonseca, and David M. Mount. On the combinatorial complexity of approximating polytopes, April 2016. URL: http://arxiv.org/abs/1604.01175v4.
Sunil Arya, Guilherme D. da Fonseca, and David M. Mount. Approximate convex intersection detection with applications to width and Minkowski sums, July 2018. URL: http://arxiv.org/abs/1807.00484v1.
Sunil Arya, Guilherme Dias da Fonseca, and David M. Mount. Near-optimal epsilon-kernel construction and related problems. In Symposium on Computational Geometry, volume 77 of LIPIcs, pages 10:1-10:15. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2017.
Sunil Arya, David M. Mount, Nathan S. Netanyahu, Ruth Silverman, and Angela Y. Wu. An optimal algorithm for approximate nearest neighbor searching in fixed dimensions. J. ACM, 45(6):891-923, 1998. URL: https://doi.org/10.1145/293347.293348.
Luis Barba, Jean Cardinal, John Iacono, Stefan Langerman, Aurélien Ooms, and Noam Solomon. Subquadratic algorithms for algebraic 3SUM. Discrete & Computational Geometry, 61(4):698-734, 2019. URL: https://doi.org/10.1007/s00454-018-0040-y.
Ronen Basri, Tal Hassner, and Lihi Zelnik-Manor. Approximate nearest subspace search. IEEE Trans. Pattern Anal. Mach. Intell., 33(2):266-278, 2011. URL: https://doi.org/10.1109/TPAMI.2010.110.
Dimitris Bertsimas, Angela King, and Rahul Mazumder. Best subset selection via a modern optimization lens. Ann. Statist., 44(2):813-852, April 2016. URL: https://doi.org/10.1214/15-AOS1388.
Dimitris Bertsimas, Jean Pauphilet, and Bart Van Parys. Sparse Regression: Scalable algorithms and empirical performance. arXiv e-prints, February 2019. URL: http://arxiv.org/abs/1902.06547.
Emmanuel J. Candès, Justin K. Romberg, and Terence Tao. Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information. IEEE Trans. Information Theory, 52(2):489-509, 2006. URL: https://doi.org/10.1109/TIT.2005.862083.
Bernard Chazelle. A functional approach to data structures and its use in multidimensional searching. SIAM J. Comput., 17(3):427-462, 1988. URL: https://doi.org/10.1137/0217026.
Bernard Chazelle. Cutting hyperplanes for divide-and-conquer. Discrete & Computational Geometry, 9:145-158, 1993. URL: https://doi.org/10.1007/BF02189314.
Bernard Chazelle and Joel Friedman. A deterministic view of random sampling and its use in geometry. Combinatorica, 10(3):229-249, 1990. URL: https://doi.org/10.1007/BF02122778.
G. Davis, S. Mallat, and M. Avellaneda. Adaptive greedy approximations. Constructive Approximation, 13(1):57-98, March 1997. URL: https://doi.org/10.1007/BF02678430.
David L. Donoho. Compressed sensing. IEEE Trans. Information Theory, 52(4):1289-1306, 2006. URL: https://doi.org/10.1109/TIT.2006.871582.
Richard M. Dudley. Metric entropy of some classes of sets with differentiable boundaries. Journal of Approximation Theory, 10(3):227-236, 1974. URL: https://doi.org/10.1016/0021-9045(74)90120-8.
Herbert Edelsbrunner, Raimund Seidel, and Micha Sharir. On the zone theorem for hyperplane arrangements. SIAM J. Comput., 22(2):418-429, 1993. URL: https://doi.org/10.1137/0222031.
Jeff Erickson and Raimund Seidel. Better lower bounds on detecting affine and spherical degeneracies. Discrete & Computational Geometry, 13:41-57, 1995. URL: https://doi.org/10.1007/BF02574027.
Dean P. Foster, Satyen Kale, and Howard J. Karloff. Online sparse linear regression. In Proceedings of the 29th Conference on Learning Theory, COLT 2016, New York, USA, June 23-26, 2016, pages 960-970, 2016. URL: http://proceedings.mlr.press/v49/foster16.html.
Dean P. Foster, Howard J. Karloff, and Justin Thaler. Variable selection is hard. In Proceedings of The 28th Conference on Learning Theory, COLT 2015, Paris, France, July 3-6, 2015, pages 696-709, 2015. URL: http://proceedings.mlr.press/v40/Foster15.html.
Anka Gajentaan and Mark H. Overmars. On a class of O(n²) problems in computational geometry. Comput. Geom., 5:165-185, 1995. URL: https://doi.org/10.1016/0925-7721(95)00022-2.
Sariel Har-Peled, Piotr Indyk, and Sepideh Mahabadi. Approximate sparse linear regression. In 45th International Colloquium on Automata, Languages, and Programming, ICALP 2018, July 9-13, 2018, Prague, Czech Republic, pages 77:1-77:14, 2018. URL: https://doi.org/10.4230/LIPIcs.ICALP.2018.77.
Trevor Hastie, Robert Tibshirani, and Jerome H. Friedman. The elements of statistical learning: data mining, inference, and prediction, 2nd Edition. Springer series in statistics. Springer, 2009. URL: http://www.worldcat.org/oclc/300478243.
Avner Magen. Dimensionality reductions in ℓ₂ that preserve volumes and distance to affine spaces. Discrete & Computational Geometry, 38(1):139-153, 2007. URL: https://doi.org/10.1007/s00454-007-1329-4.
Sepideh Mahabadi. Approximate nearest line search in high dimensions. In Proceedings of the Twenty-Sixth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2015, San Diego, CA, USA, January 4-6, 2015, pages 337-354, 2015. URL: https://doi.org/10.1137/1.9781611973730.25.
Stéphane Mallat and Zhifeng Zhang. Matching pursuits with time-frequency dictionaries. IEEE Trans. Signal Processing, 41(12):3397-3415, 1993. URL: https://doi.org/10.1109/78.258082.
Alan Miller. Subset Selection in Regression. Chapman and Hall/CRC, 2002.
Balas K. Natarajan. Sparse approximate solutions to linear systems. SIAM J. Comput., 24(2):227-234, 1995. URL: https://doi.org/10.1137/S0097539792240406.
Robert Tibshirani. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society. Series B (Methodological), 58(1):267-288, 1996. URL: http://www.jstor.org/stable/2346178.
Csaba D. Tóth, Joseph O'Rourke, and Jacob E. Goodman, editors. Handbook of Discrete and Computational Geometry. Chapman and Hall/CRC, 3rd edition, 2017. URL: https://doi.org/10.1201/9781315119601.
Dan E. Willard. New data structures for orthogonal range queries. SIAM J. Comput., 14(1):232-253, 1985.
