Approximate Sparse Linear Regression
In the Sparse Linear Regression (SLR) problem, given a d x n matrix M and a d-dimensional query q, the goal is to compute a k-sparse n-dimensional vector tau such that the error ||M tau - q|| is minimized. This problem is equivalent to the following geometric problem: given a set P of n points and a query point q in d dimensions, find the k-dimensional subspace closest to q among those spanned by a subset of k points of P. In this paper, we present data structures, algorithms, and conditional lower bounds for several variants of this problem (such as finding the closest induced k-dimensional flat or simplex instead of a subspace).
In particular, we present approximation algorithms for the online variants of the above problems with query time O~(n^{k-1}), which are of interest in the "low sparsity regime" where k is small, e.g., 2 or 3. For k=d, this matches, up to polylogarithmic factors, the lower bound that relies on the affinely degenerate conjecture (i.e., that deciding whether n points in R^d contain d+1 points lying on a common hyperplane takes Omega(n^d) time). Moreover, our algorithms involve formulating and solving several geometric subproblems, which we believe to be of independent interest.
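To make the geometric formulation concrete, the following is a minimal sketch of the naive exhaustive baseline: it tries every subset of k points and measures the distance from q to the subspace they span. It is not the data structure described in the paper; it only illustrates the problem statement and the roughly n^k cost that the O~(n^{k-1}) query time improves on. The function names `dist_to_span` and `brute_force_slr` and the tolerance `eps` are assumptions introduced here for illustration.

```python
from itertools import combinations
from math import sqrt


def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def dist_to_span(vectors, q, eps=1e-12):
    """Distance from q to the linear span of `vectors`.

    Builds an orthonormal basis with modified Gram-Schmidt and
    returns the norm of the residual of q. `eps` is an assumed
    tolerance for discarding numerically dependent vectors.
    """
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            c = dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = sqrt(dot(w, w))
        if norm > eps:
            basis.append([wi / norm for wi in w])
    r = list(q)
    for b in basis:
        c = dot(r, b)
        r = [ri - c * bi for ri, bi in zip(r, b)]
    return sqrt(dot(r, r))


def brute_force_slr(points, q, k):
    """Return (indices, error) of the best k-subset of `points`.

    Enumerates all k-subsets, so the number of candidates grows
    as n choose k; this is the trivial baseline only.
    """
    best = None
    for idx in combinations(range(len(points)), k):
        err = dist_to_span([points[i] for i in idx], q)
        if best is None or err < best[1]:
            best = (idx, err)
    return best


if __name__ == "__main__":
    P = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 0)]
    q = (1, 1, 0.1)
    print(brute_force_slr(P, q, 1))
    print(brute_force_slr(P, q, 2))
```

The indices returned correspond to the support of the k-sparse vector tau when the points are read as the columns of M; the coefficients themselves could be recovered by solving the least-squares problem restricted to those columns.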
Sparse Linear Regression
Approximate Nearest Neighbor
Sparse Recovery
Nearest Induced Flat
Nearest Subspace Search
Theory of computation~Computational geometry
Theory of computation~Data structures design and analysis
77:1-77:14
Regular Paper
This research was supported by NSF and Simons Foundation.
https://arxiv.org/abs/1609.08739
Sariel Har-Peled
Department of Computer Science, University of Illinois, Urbana, IL, USA
Work on this paper was partially supported by NSF AF awards CCF-1421231 and CCF-1217462.
Piotr Indyk
Department of Computer Science, MIT, Cambridge, MA, USA
Sepideh Mahabadi
Data Science Institute, Columbia University, New York, NY, USA
This work was done while this author was at MIT.
10.4230/LIPIcs.ICALP.2018.77
Sariel Har-Peled, Piotr Indyk, and Sepideh Mahabadi
Creative Commons Attribution 3.0 Unported license
https://creativecommons.org/licenses/by/3.0/legalcode