Online Row Sampling

Authors: Michael B. Cohen, Cameron Musco, Jakub Pachocki




File

LIPIcs.APPROX-RANDOM.2016.7.pdf
  • Filesize: 0.53 MB
  • 18 pages


Cite As

Michael B. Cohen, Cameron Musco, and Jakub Pachocki. Online Row Sampling. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 60, pp. 7:1-7:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016)
https://doi.org/10.4230/LIPIcs.APPROX-RANDOM.2016.7

Abstract

Finding a small spectral approximation for a tall n x d matrix A is a fundamental numerical primitive. For a number of reasons, one often seeks an approximation whose rows are sampled from those of A. Row sampling improves interpretability, saves space when A is sparse, and preserves row structure, which is especially important, for example, when A represents a graph. However, correctly sampling rows from A can be costly when the matrix is large and cannot be stored and processed in memory. Hence, a number of recent publications focus on row sampling in the streaming setting, using little more space than is required to store the output approximation [Kelner Levin 2013] [Kapralov et al. 2014]. Inspired by a growing body of work on online algorithms for machine learning and data analysis, we extend this work to a more restrictive online setting: we read the rows of A one by one and immediately decide whether each row should be kept in the spectral approximation or discarded, without ever retracting these decisions. We present an extremely simple algorithm that approximates A up to multiplicative error epsilon and additive error delta using O(d log d log (epsilon ||A||_2^2/delta) / epsilon^2) online samples, with memory overhead proportional to the cost of storing the spectral approximation. We also present an algorithm that uses O(d^2) memory but requires only O(d log (epsilon ||A||_2^2/delta) / epsilon^2) samples, which we show is optimal. Our methods are clean and intuitive, allow for lower memory usage than prior work, and expose new theoretical properties of leverage-score-based matrix approximation.
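
The online setting described above can be illustrated with a minimal Python sketch. It is not taken from the abstract: the function name online_row_sample, the ridge parameter lam = delta / epsilon, the oversampling factor c = 8 log d / epsilon^2, and the 1/sqrt(p) rescaling of kept rows are assumptions chosen for illustration, following the usual pattern of leverage score sampling. Each row is scored against the rows kept so far, kept or discarded immediately, and never revisited.

```python
import numpy as np

def online_row_sample(rows, d, eps=0.5, delta=1e-3, c=None, seed=0):
    """Illustrative online row sampling via ridge leverage score estimates.

    Hypothetical sketch: lam and the default c are assumed constants,
    not values stated in the abstract.
    """
    rng = np.random.default_rng(seed)
    lam = delta / eps                            # assumed ridge parameter
    if c is None:
        c = 8.0 * np.log(max(d, 2)) / eps**2     # assumed oversampling factor
    # M = lam * I + (kept rows)^T (kept rows); d x d, always positive definite
    M = lam * np.eye(d)
    kept = []
    for a in rows:
        a = np.asarray(a, dtype=float)
        # Score the new row against the rows kept so far, capped at 1
        score = min((1.0 + eps) * float(a @ np.linalg.solve(M, a)), 1.0)
        p = min(c * score, 1.0)                  # sampling probability
        if rng.random() < p:                     # irrevocable keep/discard decision
            w = a / np.sqrt(p)                   # rescale so the estimate is unbiased
            kept.append(w)
            M += np.outer(w, w)
    return np.array(kept).reshape(-1, d)

# Usage: stream the rows of a tall random matrix through the sampler
rng = np.random.default_rng(1)
A = rng.standard_normal((2000, 5))
A_tilde = online_row_sample(A, d=5)
print(A_tilde.shape)
```

This sketch stores a dense d x d matrix and solves a linear system for every row, so it is meant to show the control flow of the online decision rather than to match the memory bounds claimed in the abstract.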

Keywords
  • spectral sparsification
  • leverage score sampling
  • online sparsification


References

  1. Ahmed Alaoui and Michael W Mahoney. Fast randomized kernel ridge regression with statistical guarantees. In Advances in Neural Information Processing Systems 28 (NIPS), pages 775-783, 2015.
  2. Joshua Batson, Daniel A Spielman, and Nikhil Srivastava. Twice-Ramanujan sparsifiers. SIAM Journal on Computing, 41(6):1704-1721, 2012.
  3. Antoine Bordes and Léon Bottou. The huller: a simple and efficient online SVM. In Machine Learning: ECML 2005, pages 505-512. Springer, 2005.
  4. Christos Boutsidis, Dan Garber, Zohar Karnin, and Edo Liberty. Online principal components analysis. In Proceedings of the 26th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 887-901, 2015.
  5. Christos Boutsidis and David P Woodruff. Optimal CUR matrix decompositions. In Proceedings of the 46th Annual ACM Symposium on Theory of Computing (STOC), pages 353-362, 2014.
  6. Kenneth L. Clarkson and David P. Woodruff. Low rank approximation and regression in input sparsity time. In Proceedings of the 45th Annual ACM Symposium on Theory of Computing (STOC), pages 81-90, 2013.
  7. Michael B Cohen, Sam Elder, Cameron Musco, Christopher Musco, and Madalina Persu. Dimensionality reduction for k-means clustering and low rank approximation. In Proceedings of the 47th Annual ACM Symposium on Theory of Computing (STOC), pages 163-172, 2015.
  8. Michael B Cohen, Yin Tat Lee, Cameron Musco, Christopher Musco, Richard Peng, and Aaron Sidford. Uniform sampling for matrix approximation. In Proceedings of the 6th Conference on Innovations in Theoretical Computer Science (ITCS), pages 181-190, 2015.
  9. Michael B Cohen, Cameron Musco, and Christopher Musco. Ridge leverage scores for low-rank approximation. http://arxiv.org/abs/1511.07263, 2015.
  10. Koby Crammer, Ofer Dekel, Joseph Keshet, Shai Shalev-Shwartz, and Yoram Singer. Online passive-aggressive algorithms. The Journal of Machine Learning Research, 7:551-585, 2006.
  11. Michael Kapralov, Yin Tat Lee, Cameron Musco, Christopher Musco, and Aaron Sidford. Single pass spectral sparsification in dynamic streams. In Proceedings of the 55th Annual IEEE Symposium on Foundations of Computer Science (FOCS), pages 561-570, 2014.
  12. Jonathan A Kelner and Alex Levin. Spectral sparsification in the semi-streaming setting. Theory of Computing Systems, 53(2):243-262, 2013.
  13. Ioannis Koutis, Gary L Miller, and Richard Peng. Approaching optimality for solving SDD linear systems. In Proceedings of the 51st Annual IEEE Symposium on Foundations of Computer Science (FOCS), pages 235-244, 2010.
  14. Yin Tat Lee and He Sun. Constructing linear-sized spectral sparsification in almost-linear time. In Proceedings of the 56th Annual IEEE Symposium on Foundations of Computer Science (FOCS), pages 250-269, 2015.
  15. Mu Li, Gary L Miller, and Richard Peng. Iterative row sampling. In Proceedings of the 54th Annual IEEE Symposium on Foundations of Computer Science (FOCS), pages 127-136, 2013.
  16. Edo Liberty, Ram Sriharsha, and Maxim Sviridenko. An algorithm for online k-means clustering. In Proceedings of the Eighteenth Workshop on Algorithm Engineering and Experiments (ALENEX), pages 81-89, 2016.
  17. Michael W. Mahoney and Xiangrui Meng. Low-distortion subspace embeddings in input-sparsity time and applications to robust linear regression. In Proceedings of the 45th Annual ACM Symposium on Theory of Computing (STOC), pages 91-100, 2013.
  18. Jelani Nelson and Huy L. Nguyen. OSNAP: Faster numerical linear algebra algorithms via sparser subspace embeddings. In Proceedings of the 54th Annual IEEE Symposium on Foundations of Computer Science (FOCS), pages 117-126, 2013.
  19. Daniel A Spielman and Nikhil Srivastava. Graph sparsification by effective resistances. SIAM Journal on Computing, 40(6):1913-1926, 2011.
  20. Daniel A Spielman and Shang-Hua Teng. Nearly-linear time algorithms for graph partitioning, graph sparsification, and solving linear systems. In Proceedings of the 36th Annual ACM Symposium on Theory of Computing (STOC), pages 81-90, 2004.
  21. Daniel A Spielman and Shang-Hua Teng. Nearly linear time algorithms for preconditioning and solving symmetric, diagonally dominant linear systems. SIAM Journal on Matrix Analysis and Applications, 35(3):835-885, 2014.
  22. Joel Tropp. Freedman’s inequality for matrix martingales. Electronic Communications in Probability, 16:262-270, 2011.