Optimal Approximate Matrix Product in Terms of Stable Rank

Authors: Michael B. Cohen, Jelani Nelson, David P. Woodruff



Cite As

Michael B. Cohen, Jelani Nelson, and David P. Woodruff. Optimal Approximate Matrix Product in Terms of Stable Rank. In 43rd International Colloquium on Automata, Languages, and Programming (ICALP 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 55, pp. 11:1-11:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016)
https://doi.org/10.4230/LIPIcs.ICALP.2016.11

Abstract

We prove, using the subspace embedding guarantee in a black-box way, that one can achieve the spectral norm guarantee for approximate matrix multiplication with a dimensionality-reducing map having m = O(r̃/ε^2) rows. Here r̃ is the maximum stable rank, i.e., the squared ratio of Frobenius and operator norms, of the two matrices being multiplied. This is a quantitative improvement over previous work of [Magen and Zouzias, SODA, 2011] and [Kyrillidis et al., arXiv, 2014], and is also optimal for any oblivious dimensionality-reducing map. Furthermore, due to the black-box reliance on the subspace embedding property in our proofs, our theorem can be applied to a much more general class of sketching matrices than was known before, in addition to achieving better bounds. For example, one can apply our theorem to efficient subspace embeddings such as the Subsampled Randomized Hadamard Transform or sparse subspace embeddings, or even to subspace embedding constructions that may be developed in the future. Our main theorem, via connections with spectral-error matrix multiplication proven in previous work, implies quantitative improvements for approximate least squares regression and low rank approximation, and implies faster low rank approximation for popular kernels in machine learning such as the Gaussian and Sobolev kernels. Our main result has also already been applied to improve dimensionality reduction guarantees for k-means clustering, and it implies new results for nonparametric regression. Lastly, we point out that the proof of the "BSS" deterministic row-sampling result of [Batson et al., SICOMP, 2012] can be modified to obtain deterministic row-sampling for approximate matrix product in terms of the stable rank of the matrices; the original "BSS" proof was in terms of the rank rather than the stable rank.
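
To make the guarantee concrete: for matrices A and B with n rows, the spectral norm guarantee states that a sketching matrix S with m = O(r̃/ε^2) rows satisfies ||(SA)^T (SB) - A^T B||_2 ≤ ε ||A||_2 ||B||_2 with good probability, where r̃ is the larger of the two stable ranks ||A||_F^2/||A||_2^2 and ||B||_F^2/||B||_2^2. The short numpy sketch below is not from the paper; it uses a plain Gaussian sketch (one example of an oblivious subspace embedding covered by the theorem, alongside the SRHT and sparse embeddings) and an arbitrary constant factor in m, purely to illustrate the quantities involved.

    # Minimal numerical check of the spectral-norm approximate matrix product
    # guarantee, using a Gaussian sketching matrix as an illustrative choice.
    import numpy as np

    def stable_rank(M):
        # r~ = ||M||_F^2 / ||M||_2^2, the squared ratio of Frobenius and operator norms
        return np.linalg.norm(M, 'fro') ** 2 / np.linalg.norm(M, 2) ** 2

    rng = np.random.default_rng(0)
    n, d = 5000, 40
    A = rng.standard_normal((n, d)) @ np.diag(np.linspace(1.0, 0.05, d))  # decaying spectrum
    B = rng.standard_normal((n, d))

    eps = 0.5
    r_tilde = max(stable_rank(A), stable_rank(B))
    m = int(np.ceil(4 * r_tilde / eps ** 2))   # m = O(r~ / eps^2); the constant 4 is arbitrary

    S = rng.standard_normal((m, n)) / np.sqrt(m)   # Gaussian sketch with m rows
    err = np.linalg.norm((S @ A).T @ (S @ B) - A.T @ B, 2)
    bound = eps * np.linalg.norm(A, 2) * np.linalg.norm(B, 2)
    print(f"m = {m}, spectral error = {err:.3f}, eps*||A||*||B|| = {bound:.3f}, ok = {err <= bound}")
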
Keywords
  • subspace embeddings
  • approximate matrix multiplication
  • stable rank
  • regression
  • low rank approximation


References

  1. Nir Ailon and Bernard Chazelle. The fast Johnson-Lindenstrauss transform and approximate nearest neighbors. SIAM J. Comput., 39(1):302-322, 2009.
  2. Nir Ailon and Edo Liberty. An almost optimal unrestricted fast Johnson-Lindenstrauss transform. ACM Transactions on Algorithms, 9(3):21, 2013.
  3. Joshua D. Batson, Daniel A. Spielman, and Nikhil Srivastava. Twice-Ramanujan sparsifiers. SIAM J. Comput., 41(6):1704-1721, 2012.
  4. Jean Bourgain. An improved estimate in the restricted isometry problem. Geometric Aspects of Functional Analysis, 2116:65-70, 2014.
  5. Christos Boutsidis, Anastasios Zouzias, Michael W. Mahoney, and Petros Drineas. Randomized dimensionality reduction for k-means clustering. IEEE Transactions on Information Theory, 61(2):1045-1062, 2015.
  6. Moses Charikar, Kevin C. Chen, and Martin Farach-Colton. Finding frequent items in data streams. Theor. Comput. Sci., 312(1):3-15, 2004.
  7. Pei-Chun Chen, Kuang-Yao Lee, Tsung-Ju Lee, Yuh-Jye Lee, and Su-Yun Huang. Multiclass support vector classification via coding and regression. Neurocomputing, 73(7-9):1501-1512, 2010.
  8. Kenneth L. Clarkson and David P. Woodruff. Numerical linear algebra in the streaming model. In Proceedings of the 41st Annual ACM Symposium on Theory of Computing (STOC), pages 205-214, 2009.
  9. Kenneth L. Clarkson and David P. Woodruff. Low rank approximation and regression in input sparsity time. In Proceedings of the 45th ACM Symposium on Theory of Computing (STOC), pages 81-90, 2013. Full version at URL: http://arxiv.org/abs/1207.6365v4.
  10. Michael B. Cohen. Nearly tight oblivious subspace embeddings by trace inequalities. In Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 278-287, 2016.
  11. Michael B. Cohen, Sam Elder, Cameron Musco, Christopher Musco, and Mădălina Persu. Dimensionality reduction for k-means clustering and low rank approximation. In Proceedings of the 47th ACM Symposium on Theory of Computing (STOC), 2015. Full version at URL: http://arxiv.org/abs/1410.6801v3.
  12. Michael B. Cohen, Yin Tat Lee, Cameron Musco, Christopher Musco, Richard Peng, and Aaron Sidford. Uniform sampling for matrix approximation. In Proceedings of the 6th Annual Conference on Innovations in Theoretical Computer Science (ITCS), pages 181-190, 2015.
  13. Michael B. Cohen, Jelani Nelson, and David P. Woodruff. Optimal approximate matrix product in terms of stable rank. CoRR, abs/1507.02268, 2015.
  14. Anirban Dasgupta, Ravi Kumar, and Tamás Sarlós. A sparse Johnson-Lindenstrauss transform. In Proceedings of the 42nd ACM Symposium on Theory of Computing (STOC), 2010.
  15. Petros Drineas, Ravi Kannan, and Michael W. Mahoney. Fast Monte Carlo algorithms for matrices I: approximating matrix multiplication. SIAM J. Comput., 36(1):132-157, 2006.
  16. Petros Drineas, Malik Magdon-Ismail, Michael W. Mahoney, and David P. Woodruff. Fast approximation of matrix coherence and statistical leverage. Journal of Machine Learning Research, 13:3475-3506, 2012.
  17. Petros Drineas, Michael W. Mahoney, and S. Muthukrishnan. Sampling algorithms for ℓ₂ regression and applications. In Proceedings of the Seventeenth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1127-1136, 2006.
  18. Alex Gittens and Michael W. Mahoney. Revisiting the Nyström method for improved large-scale machine learning. In Proceedings of the 30th International Conference on Machine Learning (ICML), pages 567-575, 2013.
  19. Nathan Halko, Per-Gunnar Martinsson, and Joel A. Tropp. Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions. SIAM Review, 53(2):217-288, 2011.
  20. Ishay Haviv and Oded Regev. The restricted isometry property of subsampled Fourier matrices. In Proceedings of the 27th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), to appear, 2016.
  21. William B. Johnson and Joram Lindenstrauss. Extensions of Lipschitz mappings into a Hilbert space. Contemporary Mathematics, 26:189-206, 1984.
  22. Daniel M. Kane and Jelani Nelson. Sparser Johnson-Lindenstrauss transforms. J. ACM, 61(1):4, 2014.
  23. Alexandra Kolla, Yury Makarychev, Amin Saberi, and Shang-Hua Teng. Subgraph sparsification and nearly optimal ultrasparsifiers. In Proceedings of the 42nd ACM Symposium on Theory of Computing (STOC), pages 57-66, 2010.
  24. Felix Krahmer and Rachel Ward. New and improved Johnson-Lindenstrauss embeddings via the Restricted Isometry Property. SIAM J. Math. Anal., 43(3):1269-1281, 2011.
  25. Anastasios T. Kyrillidis, Michail Vlachos, and Anastasios Zouzias. Approximate matrix multiplication with application to linear embeddings. CoRR, abs/1403.7683, 2014.
  26. Yin Tat Lee and He Sun. Constructing linear sized spectral sparsification in almost linear time. In Proceedings of the 56th Annual IEEE Symposium on Foundations of Computer Science (FOCS), pages 250-269, 2015.
  27. Mu Li, Gary L. Miller, and Richard Peng. Iterative row sampling. In Proceedings of the 54th Annual IEEE Symposium on Foundations of Computer Science (FOCS), 2013.
  28. Yingyu Liang, Maria-Florina Balcan, Vandana Kanchanapally, and David P. Woodruff. Improved distributed principal component analysis. In Proceedings of the 27th Annual Conference on Advances in Neural Information Processing Systems (NIPS), 2014.
  29. Edo Liberty, Franco Woolfe, Per-Gunnar Martinsson, Vladimir Rokhlin, and Mark Tygert. Randomized algorithms for the low-rank approximation of matrices. Proceedings of the National Academy of Sciences, 104(51):20167-20172, 2007.
  30. Yichao Lu, Paramveer Dhillon, Dean Foster, and Lyle Ungar. Faster ridge regression via the subsampled randomized Hadamard transform. In Proceedings of the 26th Annual Conference on Advances in Neural Information Processing Systems (NIPS), 2013.
  31. Avner Magen and Anastasios Zouzias. Low rank matrix-valued Chernoff bounds and approximate matrix multiplication. In Proceedings of the 22nd Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1422-1436, 2011.
  32. Michael W. Mahoney. Randomized algorithms for matrices and data. Foundations and Trends in Machine Learning, 3(2):123-224, 2011.
  33. Xiangrui Meng and Michael W. Mahoney. Low-distortion subspace embeddings in input-sparsity time and applications to robust linear regression. In Proceedings of the 45th ACM Symposium on Theory of Computing (STOC), pages 91-100, 2013.
  34. Jelani Nelson and Huy L. Nguyễn. OSNAP: Faster numerical linear algebra algorithms via sparser subspace embeddings. In Proceedings of the 54th Annual IEEE Symposium on Foundations of Computer Science (FOCS), pages 117-126, 2013.
  35. Jelani Nelson and Huy L. Nguyễn. Lower bounds for oblivious subspace embeddings. In Proceedings of the 41st International Colloquium on Automata, Languages, and Programming (ICALP), pages 883-894, 2014.
  36. Jelani Nelson, Eric Price, and Mary Wootters. New constructions of RIP matrices with fast multiplication and fewer rows. In Proceedings of the 25th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), 2014.
  37. Nima Reyhani, Hideitsu Hino, and Ricardo Vigário. New probabilistic bounds on eigenvalues and eigenvectors of random kernel matrices. In Proceedings of the Twenty-Seventh Conference on Uncertainty in Artificial Intelligence (UAI), pages 627-634, 2011.
  38. Tamás Sarlós. Improved approximation algorithms for large matrices via random projections. In Proceedings of the 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS), pages 143-152, 2006.
  39. Daniel A. Spielman and Nikhil Srivastava. Graph sparsification by effective resistances. SIAM J. Comput., 40(6):1913-1926, 2011.
  40. Mikkel Thorup and Yin Zhang. Tabulation-based 5-independent hashing with applications to linear probing and second moment estimation. SIAM J. Comput., 41(2):293-331, 2012.
  41. Joel A. Tropp. Improved analysis of the subsampled randomized Hadamard transform. Adv. Adapt. Data Anal., 3(1-2):115-126, 2011.
  42. David P. Woodruff. Sketching as a tool for numerical linear algebra. Foundations and Trends in Theoretical Computer Science, 10(1-2):1-157, 2014.
  43. Yun Yang, Mert Pilanci, and Martin J. Wainwright. Randomized sketches for kernels: Fast and optimal non-parametric regression. CoRR, abs/1501.06195, 2015.