eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2018-07-04
77:1
77:14
10.4230/LIPIcs.ICALP.2018.77
article
Approximate Sparse Linear Regression
Har-Peled, Sariel
1
Indyk, Piotr
2
Mahabadi, Sepideh
3
Department of Computer Science, University of Illinois, Urbana, IL, USA
Department of Computer Science, MIT, Cambridge, MA, USA
Data Science Institute, Columbia University, New York, NY, USA
In the Sparse Linear Regression (SLR) problem, given a d × n matrix M and a d-dimensional query q, the goal is to compute a k-sparse n-dimensional vector tau such that the error ||M tau - q|| is minimized. This problem is equivalent to the following geometric problem: given a set P of n points and a query point q in d dimensions, find the k-dimensional subspace closest to q among those spanned by a subset of k points in P. In this paper, we present data structures, algorithms, and conditional lower bounds for several variants of this problem (such as finding the closest induced k-dimensional flat or simplex instead of a subspace).
In particular, we present approximation algorithms for the online variants of the above problems with query time O~(n^{k-1}), which are of interest in the "low sparsity regime" where k is small, e.g., 2 or 3. For k=d, this matches, up to polylogarithmic factors, the lower bound that relies on the affinely degenerate conjecture (i.e., deciding whether n points in R^d contain d+1 points that lie on a common hyperplane takes Omega(n^d) time). Moreover, our algorithms involve formulating and solving several geometric subproblems, which we believe to be of independent interest.
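To make the geometric formulation concrete, the following is a minimal brute-force sketch of the exact problem, not the approximation algorithms of the paper. It enumerates every subset of k points of P and measures the distance from q to their linear span by Gram-Schmidt orthogonalization. The function names (`residual_norm`, `brute_force_slr`) and the tolerance `eps` are assumptions made for illustration only.

```python
from itertools import combinations
import math


def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def residual_norm(points, q, eps=1e-12):
    """Distance from q to the linear span of `points`.

    Builds an orthonormal basis with (modified) Gram-Schmidt, then
    removes from q its component along each basis vector.
    `eps` is an illustrative tolerance for detecting dependent points.
    """
    basis = []
    for p in points:
        w = list(p)
        for b in basis:
            c = dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(dot(w, w))
        if norm > eps:  # skip points already (nearly) in the span
            basis.append([wi / norm for wi in w])
    r = list(q)
    for b in basis:
        c = dot(r, b)
        r = [ri - c * bi for ri, bi in zip(r, b)]
    return math.sqrt(dot(r, r))


def brute_force_slr(P, q, k):
    """Try all k-subsets of P; return (smallest error, index tuple)."""
    best_err, best_idx = math.inf, None
    for idx in combinations(range(len(P)), k):
        err = residual_norm([P[i] for i in idx], q)
        if err < best_err:
            best_err, best_idx = err, idx
    return best_err, best_idx


# Hypothetical toy instance: four points in three dimensions.
P = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 0)]
q = (2, 2, 0.1)
print(brute_force_slr(P, q, 1))
print(brute_force_slr(P, q, 2))
```

This baseline examines all C(n, k) subsets for each query, so it serves only as a reference point for the O~(n^{k-1}) query time stated in the abstract.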
https://drops.dagstuhl.de/storage/00lipics/lipics-vol107-icalp2018/LIPIcs.ICALP.2018.77/LIPIcs.ICALP.2018.77.pdf
Sparse Linear Regression
Approximate Nearest Neighbor
Sparse Recovery
Nearest Induced Flat
Nearest Subspace Search