Non-Negative Sparse Regression and Column Subset Selection with L1 Error

Bhaskara, Aditya; Lattanzi, Silvio

doi:10.4230/LIPIcs.ITCS.2018.7

Abstract

We consider the problems of sparse regression and column subset selection under L1 error. For both problems, we show that in the non-negative setting it is possible to obtain tight and efficient approximations, without any additional structural assumptions (such as restricted isometry, incoherence, expansion, etc.). For sparse regression, given a matrix A and a vector b with non-negative entries, we give an efficient algorithm to output a vector x of sparsity O(k), for which |Ax - b|_1 is comparable to the smallest error possible using non-negative k-sparse x. We then use this technique to obtain our main result: an efficient algorithm for column subset selection under L1 error for non-negative matrices.

Jason Altschuler, Aditya Bhaskara, Gang Fu, Vahab S. Mirrokni, Afshin Rostamizadeh, and Morteza Zadimoghaddam. Greedy column subset selection: New bounds and distributed algorithms. In Maria-Florina Balcan and Kilian Q. Weinberger, editors, Proceedings of the 33nd International Conference on Machine Learning, ICML 2016, New York City, NY, USA, June 19-24, 2016, volume 48 of JMLR Workshop and Conference Proceedings, pages 2539-2548. JMLR.org, 2016. URL: http://jmlr.org/proceedings/papers/v48/altschuler16.html.
Edoardo Amaldi and Viggo Kann. On the approximability of minimizing nonzero variables or unsatisfied relations in linear systems. Theoretical Computer Science, 209(1Ð2):237-260, 1998.
Sanjeev Arora, Rong Ge, Yonatan Halpern, David M. Mimno, Ankur Moitra, David Sontag, Yichen Wu, and Michael Zhu. A practical algorithm for topic modeling with provable guarantees. In Proceedings of the 30th International Conference on Machine Learning, ICML 2013, Atlanta, GA, USA, 16-21 June 2013, volume 28 of JMLR Workshop and Conference Proceedings, pages 280-288. JMLR.org, 2013. URL: http://jmlr.org/proceedings/papers/v28/arora13.html.
Sanjeev Arora, Rong Ge, Ravindran Kannan, and Ankur Moitra. Computing a nonnegative matrix factorization - provably. In Howard J. Karloff and Toniann Pitassi, editors, Proceedings of the 44th Symposium on Theory of Computing Conference, STOC 2012, New York, NY, USA, May 19 - 22, 2012, pages 145-162. ACM, 2012. URL: http://dx.doi.org/10.1145/2213977.2213994.
Arturs Backurs, Piotr Indyk, Ilya P. Razenshteyn, and David P. Woodruff. Nearly-optimal bounds for sparse recovery in generic norms, with applications to k-median sketching. In Robert Krauthgamer, editor, Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2016, Arlington, VA, USA, January 10-12, 2016, pages 318-337. SIAM, 2016. URL: http://dx.doi.org/10.1137/1.9781611974331.ch24.
Afonso S. Bandeira, Edgar Dobriban, Dustin G. Mixon, and William F. Sawin. Certifying the restricted isometry property is hard, 2012. URL: http://arxiv.org/abs/arXiv:1204.1580.
Siddharth Barman. Approximating nash equilibria and dense bipartite subgraphs via an approximate version of caratheodory’s theorem. In Rocco A. Servedio and Ronitt Rubinfeld, editors, Proceedings of the Forty-Seventh Annual ACM on Symposium on Theory of Computing, STOC 2015, Portland, OR, USA, June 14-17, 2015, pages 361-369. ACM, 2015. URL: http://dx.doi.org/10.1145/2746539.2746566.
Aditya Bhaskara, Ananda Theertha Suresh, and Morteza Zadimoghaddam. Sparse solutions to nonnegative linear systems and applications. In Guy Lebanon and S. V. N. Vishwanathan, editors, Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics, AISTATS 2015, San Diego, California, USA, May 9-12, 2015, volume 38 of JMLR Workshop and Conference Proceedings. JMLR.org, 2015. URL: http://jmlr.org/proceedings/papers/v38/bhaskara15.html.
Avrim Blum, Sariel Har-Peled, and Benjamin Raichel. Sparse approximation via generating point sets. In Robert Krauthgamer, editor, Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2016, Arlington, VA, USA, January 10-12, 2016, pages 548-557. SIAM, 2016. URL: http://dx.doi.org/10.1137/1.9781611974331.ch40.
Christos Boutsidis and David P. Woodruff. Optimal CUR matrix decompositions. SIAM J. Comput., 46(2):543-589, 2017. URL: http://dx.doi.org/10.1137/140977898.
J.P. Brooks, J.H. Dulá, and E.L. Boone. A pure 𝓁₁-norm principal component analysis. Computational Statistics &Data Analysis, 61:83-98, 2013.
Joe M Butler, D Timothy Bishop, and Jennifer H Barrett. Strategies for selecting subsets of single-nucleotide polymorphisms to genotype in association studies. BMC genetics, 6(1):S72, 2005.
Emmanuel J. Candès, Xiaodong Li, Yi Ma, and John Wright. Robust principal component analysis? JACM, 58(3):11:1-11:37, 2011.
Emmanuel J Candes, Justin K Romberg, and Terence Tao. Stable signal recovery from incomplete and inaccurate measurements. Communications on pure and applied mathematics, 59(8):1207-1223, 2006.
Flavio Chierichetti, Sreenivas Gollapudi, Ravi Kumar, Silvio Lattanzi, Rina Panigrahy, and David P. Woodruff. Algorithms for dollarbackslashell_pdollar low-rank approximation. In Doina Precup and Yee Whye Teh, editors, Proceedings of the 34th International Conference on Machine Learning, ICML 2017, Sydney, NSW, Australia, 6-11 August 2017, volume 70 of Proceedings of Machine Learning Research, pages 806-814. PMLR, 2017. URL: http://proceedings.mlr.press/v70/chierichetti17a.html.
Hugh A Chipman and Hong Gu. Interpretable dimension reduction. Journal of applied statistics, 32(9):969-987, 2005.
Kenneth L. Clarkson. Coresets, sparse greedy approximation, and the frank-wolfe algorithm. ACM Trans. Algorithms, 6(4):63:1-63:30, 2010. URL: http://dx.doi.org/10.1145/1824777.1824783.
M. J. Donahue, C. Darken, L. Gurvits, and E. Sontag. Rates of convex approximation in non-hilbert spaces. Constructive Approximation, 13(2):187-220, 1997. URL: http://dx.doi.org/10.1007/BF02678464.
David L Donoho and Michael Elad. Optimally sparse representation in general (nonorthogonal) dictionaries via ?1 minimization. Proceedings of the National Academy of Sciences, 100(5):2197-2202, 2003.
A. Eriksson and A. van den Hengel. Efficient computation of robust low-rank matrix approximations using the L₁ norm. PAMI, 34(9):1681-1690, 2012.
Dan Feldman, Mikhail Volkov, and Daniela Rus. Dimensionality reduction of massive sparse datasets using coresets. In D. D. Lee, M. Sugiyama, U. V. Luxburg, I. Guyon, and R. Garnett, editors, Advances in Neural Information Processing Systems 29, pages 2766-2774. Curran Associates, Inc., 2016. URL: http://papers.nips.cc/paper/6596-dimensionality-reduction-of-massive-sparse-datasets-using-coresets.pdf.
Dean P. Foster, Howard J. Karloff, and Justin Thaler. Variable selection is hard. In Peter Grünwald, Elad Hazan, and Satyen Kale, editors, Proceedings of The 28th Conference on Learning Theory, COLT 2015, Paris, France, July 3-6, 2015, volume 40 of JMLR Workshop and Conference Proceedings, pages 696-709. JMLR.org, 2015. URL: http://jmlr.org/proceedings/papers/v40/Foster15.html.
Alan M. Frieze, Ravi Kannan, and Santosh Vempala. Fast monte-carlo algorithms for finding low-rank approximations. J. ACM, 51(6):1025-1041, 2004. URL: http://dx.doi.org/10.1145/1039488.1039494.
Rahul Garg and Rohit Khandekar. Gradient descent with sparsification: an iterative algorithm for sparse recovery with restricted isometry property. In Proceedings of the 26th Annual International Conference on Machine Learning, pages 337-344. ACM, 2009.
Venkatesan Guruswami and Ali Kemal Sinop. Optimal column-based low-rank matrix reconstruction. In Yuval Rabani, editor, Proceedings of the Twenty-Third Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2012, Kyoto, Japan, January 17-19, 2012, pages 1207-1214. SIAM, 2012. URL: http://portal.acm.org/citation.cfm?id=2095211&CFID=63838676&CFTOKEN=79617016.
Peter J. Huber. Robust Statistics. John Wiley &Sons, New York,, 1981.
Qifa Ke and Takeo Kanade. Robust L₁ norm factorization in the presence of outliers and missing data by alternative convex programming. In CVPR, pages 739-746, 2005.
Pascal Koiran and Anastasios Zouzias. Hidden cliques and the certification of the restricted isometry property. IEEE Trans. Information Theory, 60(8):4999-5006, 2014. URL: http://dx.doi.org/10.1109/TIT.2014.2331341.
Robert Krauthgamer, editor. Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2016, Arlington, VA, USA, January 10-12, 2016. SIAM, 2016. URL: http://dx.doi.org/10.1137/1.9781611974331.
Anastasios Kyrillidis, Stephen Becker, Volkan Cevher, and Christoph Koch. Sparse projections onto the simplex. In Proceedings of the 30th International Conference on Machine Learning, ICML 2013, Atlanta, GA, USA, 16-21 June 2013, volume 28 of JMLR Workshop and Conference Proceedings, pages 235-243. JMLR.org, 2013. URL: http://jmlr.org/proceedings/papers/v28/kyrillidis13.html.
Cewu Lu, Jiaping Shi, and Jiaya Jia. Scalable adaptive robust dictionary learning. TIP, 23(2):837-847, 2014.
Deyu Meng and Fernando. D. L. Torre. Robust matrix factorization with unknown noise. In ICCV, pages 1337-1344, 2013.
Bruno A. Olshausen and David J. Field. Sparse coding with an overcomplete basis set: A strategy employed by v1? Vision Research, 37(23):3311-3325, 1997. URL: http://dx.doi.org/10.1016/S0042-6989(97)00169-7.
Saurabh Paul, Malik Magdon-Ismail, and Petros Drineas. Column selection via adaptive sampling. In Corinna Cortes, Neil D. Lawrence, Daniel D. Lee, Masashi Sugiyama, and Roman Garnett, editors, Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, December 7-12, 2015, Montreal, Quebec, Canada, pages 406-414, 2015. URL: http://papers.nips.cc/paper/6011-column-selection-via-adaptive-sampling.
Shai Shalev-Shwartz, Nathan Srebro, and Tong Zhang. Trading accuracy for sparsity in optimization problems with sparsity constraints. SIAM J. on Optimization, 20(6):2807-2832, 2010.
Zhao Song, David P. Woodruff, and Pelin Zhong. Low rank approximation with entrywise 𝓁₁-norm error. In STOC, 2017.
Naiyan Wang, Tiansheng Yao, Jingdong Wang, and Dit-Yan Yeung. A probabilistic approach to robust matrix factorization. In ECCV, pages 126-139, 2012.
Naiyan Wang and Dit-Yan Yeung. Bayesian robust matrix factorization for image and video processing. In ICCV, pages 1785-1792, 2013.
Tengyao Wang, Quentin Berthet, and Yaniv Plan. Average-case hardness of RIP certification. CoRR, abs/1605.09646, 2016. URL: http://arxiv.org/abs/1605.09646.
Kai Wei, Yuzong Liu, Katrin Kirchhoff, Chris Bartels, and Jeff Bilmes. Submodular subset selection for large-scale speech training data. In Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on, pages 3311-3315. IEEE, 2014.
L. Xiong, X. Chen, and J. Schneider. Direct robust matrix factorization for anomaly detection. In ICDM, pages 844-853, 2011.
L. Xu and A. L. Yuille. Robust principal component analysis by self-organizing rules based on statistical physics approach. IEEE Transactions on Neural Networks, 6(1):131-143, 1995.
Y. Zheng, G. Liu, S. Sugimoto, S. Yan, and M. Okutomi. Practical low-rank matrix approximation under robust L₁-norm. In CVPR, pages 1410-1417, 2012.

Non-Negative Sparse Regression and Column Subset Selection with L1 Error

Authors Aditya Bhaskara, Silvio Lattanzi

File

Document Identifiers

Author Details

Cite AsGet BibTex

Abstract

Keywords

Metrics

References