Document

**Published in:** LIPIcs, Volume 274, 31st Annual European Symposium on Algorithms (ESA 2023)

Randomized matrix algorithms have had significant recent impact on numerical linear algebra. One especially powerful class of methods consists of sampling-based algorithms for approximate matrix multiplication. Such methods typically sample individual matrix rows and columns using carefully chosen importance sampling probabilities. However, due to practical considerations like memory locality and the preservation of matrix structure, it is often preferable to sample contiguous blocks of rows and columns all together. Recently, (Wu, 2018) addressed this setting by developing an approximate matrix multiplication method based on block sampling. However, the method is inefficient, as it requires knowledge of optimal importance sampling probabilities that are expensive to compute.
We address this issue by showing that the method of Wu can be accelerated through the use of a randomized implicit trace estimation method. Doing so allows us to provably reduce the cost of sampling to near-linear in the size of the matrices being multiplied, without impacting the accuracy of the final approximate matrix multiplication. Overall, this yields a fast practical algorithm, which we test on a number of synthetic and real-world data sets. We complement our algorithmic contribution with the first extensive empirical comparison of block algorithms for randomized matrix multiplication. Our method offers a significant runtime advantage over the method of (Wu, 2018) and also outperforms basic uniform sampling of blocks. However, we find that another recent method of (Charalambides, 2021), which uses sub-optimal but efficiently computable sampling probabilities, often (but not always) offers the best trade-off between speed and accuracy.
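The block sampling idea in this abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. It assumes that the sampling probability of block k is proportional to ‖A_k B_k‖_F, and that this norm is estimated with a Hutchinson-style probe using the identity ‖M‖_F² = E‖Mg‖² for a random sign vector g. It also assumes the block size `b` divides the inner dimension. All function names are hypothetical.

```python
import random

def matmul(X, Y):
    """Dense product of two matrices stored as lists of rows."""
    Yt = list(zip(*Y))
    return [[sum(a * b for a, b in zip(row, col)) for col in Yt] for row in X]

def block_pair(A, B, k, b):
    """k-th block of b contiguous columns of A and the matching rows of B."""
    lo, hi = k * b, (k + 1) * b
    return [row[lo:hi] for row in A], B[lo:hi]

def estimate_block_norms(A, B, b, num_probes, rng):
    """Estimate ||A_k B_k||_F for every block k without forming A_k B_k.

    Each probe evaluates A_k (B_k g) for a random sign vector g and
    averages the squared lengths of the results.
    """
    num_blocks = len(B) // b  # assumes b divides the inner dimension
    p = len(B[0])
    norms = []
    for k in range(num_blocks):
        Ak, Bk = block_pair(A, B, k, b)
        total = 0.0
        for _ in range(num_probes):
            g = [[rng.choice((-1.0, 1.0))] for _ in range(p)]
            y = matmul(Ak, matmul(Bk, g))
            total += sum(v[0] ** 2 for v in y)
        norms.append((total / num_probes) ** 0.5)
    return norms

def block_amm(A, B, b, num_samples, num_probes=4, seed=0):
    """Approximate A @ B by sampling blocks of the inner dimension."""
    rng = random.Random(seed)
    norms = estimate_block_norms(A, B, b, num_probes, rng)
    num_blocks = len(norms)
    total = sum(norms)
    if total > 0:
        probs = [x / total for x in norms]
    else:
        probs = [1.0 / num_blocks] * num_blocks  # fall back to uniform
    C = [[0.0] * len(B[0]) for _ in range(len(A))]
    for k in rng.choices(range(num_blocks), weights=probs, k=num_samples):
        Ak, Bk = block_pair(A, B, k, b)
        scale = 1.0 / (num_samples * probs[k])  # importance-sampling weight
        for i, row in enumerate(matmul(Ak, Bk)):
            for j, v in enumerate(row):
                C[i][j] += scale * v
    return C
```

With a single block covering the whole inner dimension, the sampling probability is 1 and the estimate reduces to the exact product, which is a convenient sanity check.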

Chuhan Yang and Christopher Musco. Efficient Block Approximate Matrix Multiplication. In 31st Annual European Symposium on Algorithms (ESA 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 274, pp. 103:1-103:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

@InProceedings{yang_et_al:LIPIcs.ESA.2023.103,
  author    = {Yang, Chuhan and Musco, Christopher},
  title     = {{Efficient Block Approximate Matrix Multiplication}},
  booktitle = {31st Annual European Symposium on Algorithms (ESA 2023)},
  pages     = {103:1--103:15},
  series    = {Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN      = {978-3-95977-295-2},
  ISSN      = {1868-8969},
  year      = {2023},
  volume    = {274},
  editor    = {G{\o}rtz, Inge Li and Farach-Colton, Martin and Puglisi, Simon J. and Herman, Grzegorz},
  publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address   = {Dagstuhl, Germany},
  URL       = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2023.103},
  URN       = {urn:nbn:de:0030-drops-187562},
  doi       = {10.4230/LIPIcs.ESA.2023.103},
  annote    = {Keywords: Approximate matrix multiplication, randomized numerical linear algebra, trace estimation}
}

**Published in:** LIPIcs, Volume 204, 29th Annual European Symposium on Algorithms (ESA 2021)

Given points P = {p₁,...,p_n} ⊂ ℝ^d, how do we find a point x which approximately maximizes the function (1/n) ∑_{p_i ∈ P} e^{-‖p_i-x‖²}? In other words, how do we find an approximate mode of a Gaussian kernel density estimate (KDE) of P? Given the power of KDEs in representing probability distributions and other continuous functions, the basic mode finding problem is widely applicable. However, it is poorly understood algorithmically. We provide fast and provably accurate approximation algorithms for mode finding in both the low and high dimensional settings. For low (constant) dimension, our main contribution is a reduction to solving systems of polynomial inequalities. For high dimension, we prove the first dimensionality reduction result for KDE mode finding. The latter result leverages Johnson-Lindenstrauss projection, Kirszbraun’s classic extension theorem, and perhaps surprisingly, the mean-shift heuristic for mode finding. For constant approximation factor these algorithms run in O(n (log n)^{O(d)}) and O(nd + (log n)^{O(log³ n)}), respectively; these are proven more precisely as a (1+ε)-approximation guarantee. Furthermore, for the special case of d = 2, we give a combinatorial algorithm running in O(n log² n) time. We empirically demonstrate that the random projection approach and the 2-dimensional algorithm improve over the state-of-the-art mode-finding heuristics.
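For orientation, the mean-shift heuristic mentioned in this abstract can be sketched in a few lines of Python. This is the classical heuristic only, not the paper's approximation algorithms, and it carries no approximation guarantee. The update rule follows from setting the gradient of the KDE objective to zero: x ← ∑ w_i p_i / ∑ w_i with w_i = e^{-‖p_i-x‖²}. Restarting from every data point and keeping the best endpoint is an illustrative choice, and the function names are hypothetical.

```python
import math

def kde(points, x):
    """Gaussian KDE value (1/n) * sum_i exp(-||p_i - x||^2)."""
    return sum(math.exp(-sum((pi - xi) ** 2 for pi, xi in zip(p, x)))
               for p in points) / len(points)

def mean_shift(points, x0, num_iters=100, tol=1e-12):
    """Iterate the mean-shift update from x0 and return the final point."""
    x = list(x0)
    for _ in range(num_iters):
        weights = [math.exp(-sum((pi - xi) ** 2 for pi, xi in zip(p, x)))
                   for p in points]
        total = sum(weights)
        if total == 0.0:  # x is too far from every point for float precision
            break
        # Weighted average of the data points, coordinate by coordinate.
        new_x = [sum(w * p[j] for w, p in zip(weights, points)) / total
                 for j in range(len(x))]
        moved = sum((a - b) ** 2 for a, b in zip(new_x, x))
        x = new_x
        if moved < tol:  # squared step length below tolerance: converged
            break
    return x

def approximate_mode(points, **kwargs):
    """Run mean-shift from every data point and keep the best endpoint."""
    candidates = [mean_shift(points, p, **kwargs) for p in points]
    return max(candidates, key=lambda x: kde(points, x))
```

Each start costs O(nd) time per iteration, so restarting from all n points is quadratic in n; it is used here only to keep the sketch simple.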

Jasper C.H. Lee, Jerry Li, Christopher Musco, Jeff M. Phillips, and Wai Ming Tai. Finding an Approximate Mode of a Kernel Density Estimate. In 29th Annual European Symposium on Algorithms (ESA 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 204, pp. 61:1-61:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)

@InProceedings{lee_et_al:LIPIcs.ESA.2021.61,
  author    = {Lee, Jasper C.H. and Li, Jerry and Musco, Christopher and Phillips, Jeff M. and Tai, Wai Ming},
  title     = {{Finding an Approximate Mode of a Kernel Density Estimate}},
  booktitle = {29th Annual European Symposium on Algorithms (ESA 2021)},
  pages     = {61:1--61:19},
  series    = {Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN      = {978-3-95977-204-4},
  ISSN      = {1868-8969},
  year      = {2021},
  volume    = {204},
  editor    = {Mutzel, Petra and Pagh, Rasmus and Herman, Grzegorz},
  publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address   = {Dagstuhl, Germany},
  URL       = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2021.61},
  URN       = {urn:nbn:de:0030-drops-146428},
  doi       = {10.4230/LIPIcs.ESA.2021.61},
  annote    = {Keywords: Kernel density estimation, Dimensionality reduction, Coresets, Means-shift}
}

**Published in:** LIPIcs, Volume 185, 12th Innovations in Theoretical Computer Science Conference (ITCS 2021)

In the masked low-rank approximation problem, one is given data matrix A ∈ ℝ^{n × n} and binary mask matrix W ∈ {0,1}^{n × n}. The goal is to find a rank-k matrix L for which:
cost(L) := ∑_{i=1}^n ∑_{j=1}^n W_{i,j} ⋅ (A_{i,j} - L_{i,j})² ≤ OPT + ε ‖A‖_F²,
where OPT = min_{rank-k L̂} cost(L̂) and ε is a given error parameter. Depending on the choice of W, the above problem captures factor analysis, low-rank plus diagonal decomposition, robust PCA, low-rank matrix completion, low-rank plus block matrix approximation, low-rank recovery from monotone missing data, and a number of other important problems. Many of these problems are NP-hard, and while algorithms with provable guarantees are known in some cases, they either 1) run in time n^{Ω(k²/ε)} or 2) make strong assumptions, for example, that A is incoherent or that the entries in W are chosen independently and uniformly at random.
In this work, we show that a common polynomial time heuristic, which simply sets A to 0 where W is 0, and then finds a standard low-rank approximation, yields bicriteria approximation guarantees for this problem. In particular, for rank k' > k depending on the public coin partition number of W, the heuristic outputs rank-k' L with cost(L) ≤ OPT + ε ‖A‖_F². This partition number is in turn bounded by the randomized communication complexity of W, when interpreted as a two-player communication matrix. For many important cases, including all those listed above, this yields bicriteria approximation guarantees with rank k' = k ⋅ poly(log n/ε).
Beyond this result, we show that different notions of communication complexity yield bicriteria algorithms for natural variants of masked low-rank approximation. For example, multi-player number-in-hand communication complexity connects to masked tensor decomposition and non-deterministic communication complexity to masked Boolean low-rank factorization.
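The zero-filling heuristic described in this abstract can be sketched as follows. The abstract only says that a "standard low-rank approximation" is computed after zeroing. This sketch substitutes power iteration with deflation as a dependency-free stand-in for a truncated SVD, which is an assumption made for illustration. The function names, iteration count, and the absolute threshold `1e-12` are hypothetical choices.

```python
import random

def masked_cost(A, W, L):
    """cost(L) = sum_{i,j} W[i][j] * (A[i][j] - L[i][j])^2."""
    n = len(A)
    return sum(W[i][j] * (A[i][j] - L[i][j]) ** 2
               for i in range(n) for j in range(n))

def top_right_singular_vector(M, num_iters, rng):
    """Power iteration on M^T M; returns a unit vector, or None if M is ~0."""
    n = len(M[0])
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    for _ in range(num_iters):
        Mv = [sum(m * x for m, x in zip(row, v)) for row in M]
        v = [sum(M[i][j] * Mv[i] for i in range(len(M))) for j in range(n)]
        norm = sum(x * x for x in v) ** 0.5
        if norm < 1e-12:  # nothing left to approximate
            return None
        v = [x / norm for x in v]
    return v

def zero_fill_low_rank(A, W, rank, num_iters=200, seed=0):
    """Set A to 0 where W is 0, then build an approximation of rank <= `rank`."""
    rng = random.Random(seed)
    n = len(A)
    # R starts as the zero-filled matrix and holds the residual afterwards.
    R = [[A[i][j] * W[i][j] for j in range(n)] for i in range(n)]
    L = [[0.0] * n for _ in range(n)]
    for _ in range(rank):
        v = top_right_singular_vector(R, num_iters, rng)
        if v is None:
            break
        u = [sum(r * x for r, x in zip(row, v)) for row in R]  # u = R v
        for i in range(n):
            for j in range(n):
                L[i][j] += u[i] * v[j]  # add the rank-one term u v^T
                R[i][j] -= u[i] * v[j]  # deflate it from the residual
    return L
```

In the bicriteria setting of the abstract, `rank` plays the role of k' > k, and `masked_cost` evaluates the objective only on entries where W is 1.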

Cameron Musco, Christopher Musco, and David P. Woodruff. Simple Heuristics Yield Provable Algorithms for Masked Low-Rank Approximation. In 12th Innovations in Theoretical Computer Science Conference (ITCS 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 185, pp. 6:1-6:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)

@InProceedings{musco_et_al:LIPIcs.ITCS.2021.6,
  author    = {Musco, Cameron and Musco, Christopher and Woodruff, David P.},
  title     = {{Simple Heuristics Yield Provable Algorithms for Masked Low-Rank Approximation}},
  booktitle = {12th Innovations in Theoretical Computer Science Conference (ITCS 2021)},
  pages     = {6:1--6:20},
  series    = {Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN      = {978-3-95977-177-1},
  ISSN      = {1868-8969},
  year      = {2021},
  volume    = {185},
  editor    = {Lee, James R.},
  publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address   = {Dagstuhl, Germany},
  URL       = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2021.6},
  URN       = {urn:nbn:de:0030-drops-135452},
  doi       = {10.4230/LIPIcs.ITCS.2021.6},
  annote    = {Keywords: low-rank approximation, communication complexity, weighted low-rank approximation, bicriteria approximation algorithms}
}

**Published in:** LIPIcs, Volume 107, 45th International Colloquium on Automata, Languages, and Programming (ICALP 2018)

We give a simple distributed algorithm for computing adjacency matrix eigenvectors for the communication graph in an asynchronous gossip model. We show how to use this algorithm to give state-of-the-art asynchronous community detection algorithms when the communication graph is drawn from the well-studied stochastic block model. Our methods also apply to a natural alternative model of randomized communication, where nodes within a community communicate more frequently than nodes in different communities.
Our analysis simplifies and generalizes prior work by forging a connection between asynchronous eigenvector computation and Oja's algorithm for streaming principal component analysis. We hope that our work serves as a starting point for building further connections between the analysis of stochastic iterative methods, like Oja's algorithm, and work on asynchronous and gossip-type algorithms for distributed computation.
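As background for the connection drawn in this abstract, Oja's algorithm for streaming principal component analysis can be sketched as below. This is the classical centralized update, w ← normalize(w + η x (x·w)), and not the paper's asynchronous gossip protocol. The fixed step size, the random initialization, and the function name are illustrative assumptions.

```python
import random

def oja_top_eigenvector(samples, dim, step_size=0.01, seed=0):
    """One pass of Oja's rule over a stream of sample vectors.

    Returns a unit vector that tends to align with the top eigenvector
    of the samples' second-moment matrix.
    """
    rng = random.Random(seed)
    w = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = sum(v * v for v in w) ** 0.5
    w = [v / norm for v in w]
    for x in samples:
        proj = sum(xi * wi for xi, wi in zip(x, w))  # x . w
        w = [wi + step_size * proj * xi for wi, xi in zip(w, x)]
        norm = sum(v * v for v in w) ** 0.5
        w = [v / norm for v in w]  # renormalize to unit length
    return w
```

Each sample is touched once and only the current vector `w` is stored, which is what makes the rule a natural fit for streaming and, by analogy, for settings where updates arrive one interaction at a time.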

Frederik Mallmann-Trenn, Cameron Musco, and Christopher Musco. Eigenvector Computation and Community Detection in Asynchronous Gossip Models. In 45th International Colloquium on Automata, Languages, and Programming (ICALP 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 107, pp. 159:1-159:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)

@InProceedings{mallmanntrenn_et_al:LIPIcs.ICALP.2018.159,
  author    = {Mallmann-Trenn, Frederik and Musco, Cameron and Musco, Christopher},
  title     = {{Eigenvector Computation and Community Detection in Asynchronous Gossip Models}},
  booktitle = {45th International Colloquium on Automata, Languages, and Programming (ICALP 2018)},
  pages     = {159:1--159:14},
  series    = {Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN      = {978-3-95977-076-7},
  ISSN      = {1868-8969},
  year      = {2018},
  volume    = {107},
  editor    = {Chatzigiannakis, Ioannis and Kaklamanis, Christos and Marx, D\'{a}niel and Sannella, Donald},
  publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address   = {Dagstuhl, Germany},
  URL       = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2018.159},
  URN       = {urn:nbn:de:0030-drops-91639},
  doi       = {10.4230/LIPIcs.ICALP.2018.159},
  annote    = {Keywords: block model, community detection, distributed clustering, eigenvector computation, gossip algorithms, population protocols}
}
