
Online Row Sampling

Authors: Michael B. Cohen, Cameron Musco, Jakub Pachocki



File

LIPIcs.APPROX-RANDOM.2016.7.pdf
  • Filesize: 0.53 MB
  • 18 pages

Cite As

Michael B. Cohen, Cameron Musco, and Jakub Pachocki. Online Row Sampling. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 60, pp. 7:1-7:18, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2016)
https://doi.org/10.4230/LIPIcs.APPROX-RANDOM.2016.7

Abstract

Finding a small spectral approximation for a tall n x d matrix A is a fundamental numerical primitive. For a number of reasons, one often seeks an approximation whose rows are sampled from those of A. Row sampling improves interpretability, saves space when A is sparse, and preserves row structure, which is especially important, for example, when A represents a graph. However, correctly sampling rows from A can be costly when the matrix is large and cannot be stored and processed in memory. Hence, a number of recent publications focus on row sampling in the streaming setting, using little more space than what is required to store the outputted approximation [Kelner Levin 2013] [Kapralov et al. 2014]. Inspired by a growing body of work on online algorithms for machine learning and data analysis, we extend this work to a more restrictive online setting: we read rows of A one by one and immediately decide whether each row should be kept in the spectral approximation or discarded, without ever retracting these decisions. We present an extremely simple algorithm that approximates A up to multiplicative error epsilon and additive error delta using O(d log d log (epsilon ||A||_2^2/delta) / epsilon^2) online samples, with memory overhead proportional to the cost of storing the spectral approximation. We also present an algorithm that uses O(d^2) memory but only requires O(d log (epsilon ||A||_2^2/delta) / epsilon^2) samples, which we show is optimal. Our methods are clean and intuitive, allow for lower memory usage than prior work, and expose new theoretical properties of leverage score based matrix approximation.
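The sampling scheme the abstract describes can be illustrated with a short sketch: read each row once, estimate its (ridge) leverage score against the matrix built from rows kept so far, keep it with probability proportional to that score, and never revisit the decision. This is only an illustrative Python sketch, not the paper's exact algorithm; the ridge term `delta/eps`, the oversampling constant `c`, and the probability formula are assumptions chosen for readability rather than the constants from the analysis.

```python
import numpy as np

def online_row_sample(rows, eps=0.5, delta=1e-3, c=8.0, seed=0):
    """Illustrative sketch of online leverage-score row sampling.

    Rows are read one at a time; each is kept (rescaled) or discarded
    immediately, and decisions are never retracted.
    """
    rng = np.random.default_rng(seed)
    n, d = rows.shape
    reg = delta / eps              # ridge regularization (assumed choice)
    M = reg * np.eye(d)            # running estimate of A_i^T A_i + reg*I
    kept = []
    for a in rows:
        # approximate online ridge leverage score of row a, capped at 1
        tau = min(a @ np.linalg.solve(M, a), 1.0)
        # keep with probability ~ c * tau * log(d) / eps^2 (assumed form)
        p = min(c * tau * np.log(d + 1) / eps**2, 1.0)
        if rng.random() < p:
            kept.append(a / np.sqrt(p))    # rescale so E[B^T B] tracks A^T A
            M += np.outer(a, a) / p        # update estimate with kept row only
    return np.array(kept)
```

Note that `M` is updated only from the rows actually kept, which is what keeps the memory footprint proportional to the size of the output approximation rather than to the full input.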
Keywords
  • spectral sparsification
  • leverage score sampling
  • online sparsification


References

  1. Ahmed Alaoui and Michael W. Mahoney. Fast randomized kernel ridge regression with statistical guarantees. In Advances in Neural Information Processing Systems 28 (NIPS), pages 775-783, 2015.
  2. Joshua Batson, Daniel A. Spielman, and Nikhil Srivastava. Twice-Ramanujan sparsifiers. SIAM Journal on Computing, 41(6):1704-1721, 2012.
  3. Antoine Bordes and Léon Bottou. The Huller: a simple and efficient online SVM. In Machine Learning: ECML 2005, pages 505-512. Springer, 2005.
  4. Christos Boutsidis, Dan Garber, Zohar Karnin, and Edo Liberty. Online principal components analysis. In Proceedings of the 26th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 887-901, 2015.
  5. Christos Boutsidis and David P. Woodruff. Optimal CUR matrix decompositions. In Proceedings of the 46th Annual ACM Symposium on Theory of Computing (STOC), pages 353-362, 2014.
  6. Kenneth L. Clarkson and David P. Woodruff. Low rank approximation and regression in input sparsity time. In Proceedings of the 45th Annual ACM Symposium on Theory of Computing (STOC), pages 81-90, 2013.
  7. Michael B. Cohen, Sam Elder, Cameron Musco, Christopher Musco, and Madalina Persu. Dimensionality reduction for k-means clustering and low rank approximation. In Proceedings of the 47th Annual ACM Symposium on Theory of Computing (STOC), pages 163-172, 2015.
  8. Michael B. Cohen, Yin Tat Lee, Cameron Musco, Christopher Musco, Richard Peng, and Aaron Sidford. Uniform sampling for matrix approximation. In Proceedings of the 6th Conference on Innovations in Theoretical Computer Science (ITCS), pages 181-190, 2015.
  9. Michael B. Cohen, Cameron Musco, and Christopher Musco. Ridge leverage scores for low-rank approximation. http://arxiv.org/abs/1511.07263, 2015.
  10. Koby Crammer, Ofer Dekel, Joseph Keshet, Shai Shalev-Shwartz, and Yoram Singer. Online passive-aggressive algorithms. The Journal of Machine Learning Research, 7:551-585, 2006.
  11. Michael Kapralov, Yin Tat Lee, Cameron Musco, Christopher Musco, and Aaron Sidford. Single pass spectral sparsification in dynamic streams. In Proceedings of the 55th Annual IEEE Symposium on Foundations of Computer Science (FOCS), pages 561-570, 2014.
  12. Jonathan A. Kelner and Alex Levin. Spectral sparsification in the semi-streaming setting. Theory of Computing Systems, 53(2):243-262, 2013.
  13. Ioannis Koutis, Gary L. Miller, and Richard Peng. Approaching optimality for solving SDD linear systems. In Proceedings of the 51st Annual IEEE Symposium on Foundations of Computer Science (FOCS), pages 235-244, 2010.
  14. Yin Tat Lee and He Sun. Constructing linear-sized spectral sparsification in almost-linear time. In Proceedings of the 56th Annual IEEE Symposium on Foundations of Computer Science (FOCS), pages 250-269, 2015.
  15. Mu Li, Gary L. Miller, and Richard Peng. Iterative row sampling. In Proceedings of the 54th Annual IEEE Symposium on Foundations of Computer Science (FOCS), pages 127-136, 2013.
  16. Edo Liberty, Ram Sriharsha, and Maxim Sviridenko. An algorithm for online k-means clustering. In Proceedings of the 18th Workshop on Algorithm Engineering and Experiments (ALENEX), pages 81-89, 2016.
  17. Michael W. Mahoney and Xiangrui Meng. Low-distortion subspace embeddings in input-sparsity time and applications to robust linear regression. In Proceedings of the 45th Annual ACM Symposium on Theory of Computing (STOC), pages 91-100, 2013.
  18. Jelani Nelson and Huy L. Nguyen. OSNAP: Faster numerical linear algebra algorithms via sparser subspace embeddings. In Proceedings of the 54th Annual IEEE Symposium on Foundations of Computer Science (FOCS), pages 117-126, 2013.
  19. Daniel A. Spielman and Nikhil Srivastava. Graph sparsification by effective resistances. SIAM Journal on Computing, 40(6):1913-1926, 2011.
  20. Daniel A. Spielman and Shang-Hua Teng. Nearly-linear time algorithms for graph partitioning, graph sparsification, and solving linear systems. In Proceedings of the 36th Annual ACM Symposium on Theory of Computing (STOC), pages 81-90, 2004.
  21. Daniel A. Spielman and Shang-Hua Teng. Nearly linear time algorithms for preconditioning and solving symmetric, diagonally dominant linear systems. SIAM Journal on Matrix Analysis and Applications, 35(3):835-885, 2014.
  22. Joel Tropp. Freedman’s inequality for matrix martingales. Electronic Communications in Probability, 16:262-270, 2011.