Optimal Approximate Matrix Product in Terms of Stable Rank
We prove, using the subspace embedding guarantee in a black-box way, that one can achieve the spectral norm guarantee for approximate matrix multiplication with a dimensionality-reducing map having m = O(r̃/ε²) rows. Here r̃ is the maximum stable rank, i.e., the squared ratio of Frobenius and operator norms, of the two matrices being multiplied. This is a quantitative improvement over previous work of [Magen and Zouzias, SODA, 2011] and [Kyrillidis et al., arXiv, 2014] and is also optimal for any oblivious dimensionality-reducing map. Furthermore, because our proofs rely on the subspace embedding property only as a black box, our theorem applies to a much more general class of sketching matrices than was previously known, in addition to achieving better bounds. For example, one can apply our theorem to efficient subspace embeddings such as the Subsampled Randomized Hadamard Transform or sparse subspace embeddings, or even to subspace embedding constructions that may be developed in the future.
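As a rough illustration of the quantities above, the following NumPy sketch computes the stable rank of two matrices and forms a sketched product with m proportional to r̃/ε² rows. It is a minimal sketch under stated assumptions, not the construction analyzed in the paper: the dense Gaussian map, the constant `c`, the helper names, and the form of the error measure (spectral error relative to the product of operator norms) are all chosen here for illustration.

```python
import numpy as np


def stable_rank(M):
    """Squared ratio of the Frobenius norm to the operator (spectral) norm."""
    fro = np.linalg.norm(M, "fro")
    op = np.linalg.norm(M, 2)
    return (fro / op) ** 2


def sketched_product(A, B, eps, c=4.0, seed=0):
    """Approximate A.T @ B using a sketch with about c * r / eps^2 rows.

    Assumptions (not from the source): Pi is a dense Gaussian map scaled
    by 1/sqrt(m), and c is an arbitrary illustrative constant.
    """
    rng = np.random.default_rng(seed)
    r = max(stable_rank(A), stable_rank(B))  # maximum stable rank
    m = int(np.ceil(c * r / eps**2))         # number of sketch rows
    Pi = rng.standard_normal((m, A.shape[0])) / np.sqrt(m)
    return (Pi @ A).T @ (Pi @ B), m


def relative_spectral_error(A, B, approx):
    """Spectral-norm error of approx, relative to ||A|| * ||B||."""
    err = np.linalg.norm(approx - A.T @ B, 2)
    return err / (np.linalg.norm(A, 2) * np.linalg.norm(B, 2))


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Tall matrices with fast-decaying column scales, so stable rank is small.
    scales = 1.0 / np.arange(1, 21) ** 2
    A = rng.standard_normal((2000, 20)) * scales
    B = rng.standard_normal((2000, 20)) * scales
    approx, m = sketched_product(A, B, eps=0.25)
    print("stable ranks:", stable_rank(A), stable_rank(B))
    print("sketch rows m:", m)
    print("relative spectral error:", relative_spectral_error(A, B, approx))
```

The point of the example is that the number of rows `m` depends on the stable rank rather than the rank: the matrices above have rank 20, but their stable rank is close to 1, so the sketch is correspondingly small.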
Our main theorem, via connections with spectral error matrix multiplication proven in previous work, implies quantitative improvements for approximate least squares regression and low rank approximation, and implies faster low rank approximation for popular kernels in machine learning such as the Gaussian and Sobolev kernels. Our main result has already been applied to improve dimensionality reduction guarantees for k-means clustering, and it also implies new results for nonparametric regression.
Lastly, we point out that the proof of the "BSS" deterministic row-sampling result of [Batson et al., SICOMP, 2012] can be modified to obtain deterministic row-sampling for approximate matrix product in terms of the stable rank of the matrices. The original "BSS" proof was in terms of the rank rather than the stable rank.
subspace embeddings
approximate matrix multiplication
stable rank
regression
low rank approximation
11:1-11:14
Regular Paper
Michael B. Cohen
Jelani Nelson
David P. Woodruff
10.4230/LIPIcs.ICALP.2016.11
Nir Ailon and Bernard Chazelle. The fast Johnson-Lindenstrauss transform and approximate nearest neighbors. SIAM J. Comput., 39(1):302-322, 2009.
Nir Ailon and Edo Liberty. An almost optimal unrestricted fast Johnson-Lindenstrauss transform. ACM Transactions on Algorithms, 9(3):21, 2013.
Joshua D. Batson, Daniel A. Spielman, and Nikhil Srivastava. Twice-Ramanujan sparsifiers. SIAM J. Comput., 41(6):1704-1721, 2012.
Jean Bourgain. An improved estimate in the restricted isometry problem. Geometric Aspects of Functional Analysis, 2116:65-70, 2014.
Christos Boutsidis, Anastasios Zouzias, Michael W. Mahoney, and Petros Drineas. Randomized dimensionality reduction for k-means clustering. IEEE Transactions on Information Theory, 61(2):1045-1062, 2015.
Moses Charikar, Kevin C. Chen, and Martin Farach-Colton. Finding frequent items in data streams. Theor. Comput. Sci., 312(1):3-15, 2004.
Pei-Chun Chen, Kuang-Yao Lee, Tsung-Ju Lee, Yuh-Jye Lee, and Su-Yun Huang. Multiclass support vector classification via coding and regression. Neurocomputing, 73(7-9):1501-1512, 2010.
Kenneth L. Clarkson and David P. Woodruff. Numerical linear algebra in the streaming model. In Proceedings of the 41st Annual ACM Symposium on Theory of Computing (STOC), pages 205-214, 2009.
Kenneth L. Clarkson and David P. Woodruff. Low rank approximation and regression in input sparsity time. In Proceedings of the 45th ACM Symposium on Theory of Computing (STOC), pages 81-90, 2013. Full version at URL: http://arxiv.org/abs/1207.6365v4.
Michael B. Cohen. Nearly tight oblivious subspace embeddings by trace inequalities. In Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 278-287, 2016.
Michael B. Cohen, Sam Elder, Cameron Musco, Christopher Musco, and Mădălina Persu. Dimensionality reduction for k-means clustering and low rank approximation. In Proceedings of the 47th ACM Symposium on Theory of Computing (STOC), 2015. Full version at URL: http://arxiv.org/abs/1410.6801v3.
Michael B. Cohen, Yin Tat Lee, Cameron Musco, Christopher Musco, Richard Peng, and Aaron Sidford. Uniform sampling for matrix approximation. In Proc. of the 6th Annual Conference on Innovations in Theoretical Computer Science (ITCS), pages 181-190, 2015.
Michael B. Cohen, Jelani Nelson, and David P. Woodruff. Optimal approximate matrix product in terms of stable rank. CoRR, abs/1507.02268, 2015.
Anirban Dasgupta, Ravi Kumar, and Tamás Sarlós. A sparse Johnson-Lindenstrauss transform. In Proceedings of the 42nd ACM Symposium on Theory of Computing (STOC), 2010.
Petros Drineas, Ravi Kannan, and Michael W. Mahoney. Fast Monte Carlo algorithms for matrices I: approximating matrix multiplication. SIAM J. Comput., 36(1):132-157, 2006.
Petros Drineas, Malik Magdon-Ismail, Michael W. Mahoney, and David P. Woodruff. Fast approximation of matrix coherence and statistical leverage. Journal of Machine Learning Research, 13:3475-3506, 2012.
Petros Drineas, Michael W. Mahoney, and S. Muthukrishnan. Sampling algorithms for ℓ₂ regression and applications. In Proceedings of the Seventeenth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1127-1136, 2006.
Alex Gittens and Michael W. Mahoney. Revisiting the Nyström method for improved large-scale machine learning. In Proceedings of the 30th International Conference on Machine Learning (ICML), pages 567-575, 2013.
Nathan Halko, Per-Gunnar Martinsson, and Joel A. Tropp. Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions. SIAM Review, 53(2):217-288, 2011.
Ishay Haviv and Oded Regev. The restricted isometry property of subsampled Fourier matrices. In Proceedings of the 27th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), to appear, 2016.
William B. Johnson and Joram Lindenstrauss. Extensions of Lipschitz mappings into a Hilbert space. Contemporary Mathematics, 26:189-206, 1984.
Daniel M. Kane and Jelani Nelson. Sparser Johnson-Lindenstrauss transforms. J. ACM, 61(1):4, 2014.
Alexandra Kolla, Yury Makarychev, Amin Saberi, and Shang-Hua Teng. Subgraph sparsification and nearly optimal ultrasparsifiers. In Proceedings of the 42nd ACM Symposium on Theory of Computing (STOC), pages 57-66, 2010.
Felix Krahmer and Rachel Ward. New and improved Johnson-Lindenstrauss embeddings via the Restricted Isometry Property. SIAM J. Math. Anal., 43(3):1269-1281, 2011.
Anastasios T. Kyrillidis, Michail Vlachos, and Anastasios Zouzias. Approximate matrix multiplication with application to linear embeddings. CoRR, abs/1403.7683, 2014.
Yin Tat Lee and He Sun. Constructing linear sized spectral sparsification in almost linear time. In Proceedings of the 56th Annual IEEE Symposium on Foundations of Computer Science (FOCS), pages 250-269, 2015.
Mu Li, Gary L. Miller, and Richard Peng. Iterative row sampling. In Proceedings of the 54th Annual IEEE Symposium on Foundations of Computer Science (FOCS), 2013.
Yingyu Liang, Maria-Florina Balcan, Vandana Kanchanapally, and David P. Woodruff. Improved distributed principal component analysis. In Proceedings of the 27th Annual Conference on Advances in Neural Information Processing Systems (NIPS), 2014.
Edo Liberty, Franco Woolfe, Per-Gunnar Martinsson, Vladimir Rokhlin, and Mark Tygert. Randomized algorithms for the low-rank approximation of matrices. Proceedings of the National Academy of Sciences, 104(51):20167-20172, 2007.
Yichao Lu, Paramveer Dhillon, Dean Foster, and Lyle Ungar. Faster ridge regression via the subsampled randomized Hadamard transform. In Proceedings of the 26th Annual Conference on Advances in Neural Information Processing Systems (NIPS), 2013.
Avner Magen and Anastasios Zouzias. Low rank matrix-valued Chernoff bounds and approximate matrix multiplication. In Proceedings of the 22nd Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1422-1436, 2011.
Michael W. Mahoney. Randomized algorithms for matrices and data. Foundations and Trends in Machine Learning, 3(2):123-224, 2011.
Xiangrui Meng and Michael W. Mahoney. Low-distortion subspace embeddings in input-sparsity time and applications to robust linear regression. In Proceedings of the 45th ACM Symposium on Theory of Computing (STOC), pages 91-100, 2013.
Jelani Nelson and Huy L. Nguyễn. OSNAP: Faster numerical linear algebra algorithms via sparser subspace embeddings. In Proceedings of the 54th Annual IEEE Symposium on Foundations of Computer Science (FOCS), pages 117-126, 2013.
Jelani Nelson and Huy L. Nguyễn. Lower bounds for oblivious subspace embeddings. In Proceedings of the 41st International Colloquium on Automata, Languages, and Programming (ICALP), pages 883-894, 2014.
Jelani Nelson, Eric Price, and Mary Wootters. New constructions of RIP matrices with fast multiplication and fewer rows. In Proceedings of the 25th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), 2014.
Nima Reyhani, Hideitsu Hino, and Ricardo Vigário. New probabilistic bounds on eigenvalues and eigenvectors of random kernel matrices. In Proceedings of the Twenty-Seventh Conference on Uncertainty in Artificial Intelligence (UAI), pages 627-634, 2011.
Tamás Sarlós. Improved approximation algorithms for large matrices via random projections. In Proceedings of the 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS), pages 143-152, 2006.
Daniel A. Spielman and Nikhil Srivastava. Graph sparsification by effective resistances. SIAM J. Comput., 40(6):1913-1926, 2011.
Mikkel Thorup and Yin Zhang. Tabulation-based 5-independent hashing with applications to linear probing and second moment estimation. SIAM J. Comput., 41(2):293-331, 2012.
Joel A. Tropp. Improved analysis of the subsampled randomized Hadamard transform. Adv. Adapt. Data Anal., 3(1-2):115-126, 2011.
David P. Woodruff. Sketching as a tool for numerical linear algebra. Foundations and Trends in Theoretical Computer Science, 10(1-2):1-157, 2014.
Yun Yang, Mert Pilanci, and Martin J. Wainwright. Randomized sketches for kernels: Fast and optimal non-parametric regression. CoRR, abs/1501.06195, 2015.
Creative Commons Attribution 3.0 Unported license
https://creativecommons.org/licenses/by/3.0/legalcode