DROPS

Document

DOI: 10.4230/LIPIcs.ITCS.2025.43

Space Complexity of Minimum Cut Problems in Single-Pass Streams

Authors: Matthew Ding, Alexandro Garces, Jason Li, Honghao Lin, Jelani Nelson, Vihan Shah, and David P. Woodruff

Published in: LIPIcs, Volume 325, 16th Innovations in Theoretical Computer Science Conference (ITCS 2025)

Abstract

We consider the problem of finding a minimum cut of a weighted graph presented as a single-pass stream. While graph sparsification in streams has been intensively studied, the specific application of finding minimum cuts in streams is less well-studied. To this end, we show upper and lower bounds on minimum cut problems in insertion-only streams for a variety of settings, including for both randomized and deterministic algorithms, for both arbitrary and random order streams, and for both approximate and exact algorithms. One of our main results is an Õ(n/ε) space algorithm with fast update time for approximating a spectral cut query with high probability on a stream given in an arbitrary order. Our result breaks the Ω(n/ε²) space lower bound required of a sparsifier that approximates all cuts simultaneously. Using this result, we provide streaming algorithms with near optimal space of Õ(n/ε) for minimum cut and approximate all-pairs effective resistances, with matching space lower-bounds. The amortized update time of our algorithms is Õ(1), provided that the number of edges in the input graph is at least (n/ε²)^{1+o(1)}. We also give a generic way of incorporating sketching into a recursive contraction algorithm to improve the post-processing time of our algorithms. In addition to these results, we give a random-order streaming algorithm that computes the exact minimum cut on a simple, unweighted graph using Õ(n) space. Finally, we give an Ω(n/ε²) space lower bound for deterministic minimum cut algorithms which matches the best-known upper bound up to polylogarithmic factors.

Cite as

Matthew Ding, Alexandro Garces, Jason Li, Honghao Lin, Jelani Nelson, Vihan Shah, and David P. Woodruff. Space Complexity of Minimum Cut Problems in Single-Pass Streams. In 16th Innovations in Theoretical Computer Science Conference (ITCS 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 325, pp. 43:1-43:23, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{ding_et_al:LIPIcs.ITCS.2025.43,
  author =	{Ding, Matthew and Garces, Alexandro and Li, Jason and Lin, Honghao and Nelson, Jelani and Shah, Vihan and Woodruff, David P.},
  title =	{{Space Complexity of Minimum Cut Problems in Single-Pass Streams}},
  booktitle =	{16th Innovations in Theoretical Computer Science Conference (ITCS 2025)},
  pages =	{43:1--43:23},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-361-4},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{325},
  editor =	{Meka, Raghu},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2025.43},
  URN =		{urn:nbn:de:0030-drops-226714},
  doi =		{10.4230/LIPIcs.ITCS.2025.43},
  annote =	{Keywords: minimum cut, approximate, random order, lower bound}
}

Document

RANDOM

DOI: 10.4230/LIPIcs.APPROX/RANDOM.2024.33

On the Amortized Complexity of Approximate Counting

Authors: Ishaq Aden-Ali, Yanjun Han, Jelani Nelson, and Huacheng Yu

Published in: LIPIcs, Volume 317, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2024)

Abstract

Naively storing a counter up to value n would require Ω(log n) bits of memory. Nelson and Yu [Jelani Nelson and Huacheng Yu, 2022], following work of Morris [Robert H. Morris, 1978], showed that if the query answers need only be (1+ε)-approximate with probability at least 1 - δ, then O(log log n + log log(1/δ) + log(1/ε)) bits suffice, and in fact this bound is tight. Morris' original motivation for studying this problem though, as well as modern applications, require not only maintaining one counter, but rather k counters for k large. This motivates the following question: for k large, can k counters be simultaneously maintained using asymptotically less memory than k times the cost of an individual counter? That is to say, does this problem benefit from an improved amortized space complexity bound? We answer this question in the negative. Specifically, we prove a lower bound for nearly the full range of parameters showing that, in terms of memory usage, there is no asymptotic benefit possible via amortization when storing multiple counters. Our main proof utilizes a certain notion of "information cost" recently introduced by Braverman, Garg and Woodruff [Mark Braverman et al., 2020] to prove lower bounds for streaming algorithms.

Cite as

Ishaq Aden-Ali, Yanjun Han, Jelani Nelson, and Huacheng Yu. On the Amortized Complexity of Approximate Counting. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 317, pp. 33:1-33:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Copy BibTex To Clipboard

@InProceedings{adenali_et_al:LIPIcs.APPROX/RANDOM.2024.33,
  author =	{Aden-Ali, Ishaq and Han, Yanjun and Nelson, Jelani and Yu, Huacheng},
  title =	{{On the Amortized Complexity of Approximate Counting}},
  booktitle =	{Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2024)},
  pages =	{33:1--33:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-348-5},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{317},
  editor =	{Kumar, Amit and Ron-Zewi, Noga},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX/RANDOM.2024.33},
  URN =		{urn:nbn:de:0030-drops-210264},
  doi =		{10.4230/LIPIcs.APPROX/RANDOM.2024.33},
  annote =	{Keywords: streaming, approximate counting, information complexity, lower bounds}
}

Document

DOI: 10.4230/LIPIcs.ITC.2023.17

Differentially Private Aggregation via Imperfect Shuffling

Authors: Badih Ghazi, Ravi Kumar, Pasin Manurangsi, Jelani Nelson, and Samson Zhou

Published in: LIPIcs, Volume 267, 4th Conference on Information-Theoretic Cryptography (ITC 2023)

Abstract

In this paper, we introduce the imperfect shuffle differential privacy model, where messages sent from users are shuffled in an almost uniform manner before being observed by a curator for private aggregation. We then consider the private summation problem. We show that the standard split-and-mix protocol by Ishai et. al. [FOCS 2006] can be adapted to achieve near-optimal utility bounds in the imperfect shuffle model. Specifically, we show that surprisingly, there is no additional error overhead necessary in the imperfect shuffle model.

Cite as

Badih Ghazi, Ravi Kumar, Pasin Manurangsi, Jelani Nelson, and Samson Zhou. Differentially Private Aggregation via Imperfect Shuffling. In 4th Conference on Information-Theoretic Cryptography (ITC 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 267, pp. 17:1-17:22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

Copy BibTex To Clipboard

@InProceedings{ghazi_et_al:LIPIcs.ITC.2023.17,
  author =	{Ghazi, Badih and Kumar, Ravi and Manurangsi, Pasin and Nelson, Jelani and Zhou, Samson},
  title =	{{Differentially Private Aggregation via Imperfect Shuffling}},
  booktitle =	{4th Conference on Information-Theoretic Cryptography (ITC 2023)},
  pages =	{17:1--17:22},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-271-6},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{267},
  editor =	{Chung, Kai-Min},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITC.2023.17},
  URN =		{urn:nbn:de:0030-drops-183453},
  doi =		{10.4230/LIPIcs.ITC.2023.17},
  annote =	{Keywords: Differential privacy, private summation, shuffle model}
}

Document

DOI: 10.4230/LIPIcs.ITCS.2023.39

Generalized Private Selection and Testing with High Confidence

Authors: Edith Cohen, Xin Lyu, Jelani Nelson, Tamás Sarlós, and Uri Stemmer

Published in: LIPIcs, Volume 251, 14th Innovations in Theoretical Computer Science Conference (ITCS 2023)

Abstract

Composition theorems are general and powerful tools that facilitate privacy accounting across multiple data accesses from per-access privacy bounds. However they often result in weaker bounds compared with end-to-end analysis. Two popular tools that mitigate that are the exponential mechanism (or report noisy max) and the sparse vector technique, generalized in a recent private selection framework by Liu and Talwar (STOC 2019). In this work, we propose a flexible framework of private selection and testing that generalizes the one proposed by Liu and Talwar, supporting a wide range of applications. We apply our framework to solve several fundamental tasks, including query releasing, top-k selection, and stable selection, with improved confidence-accuracy tradeoffs. Additionally, for online settings, we apply our private testing to design a mechanism for adaptive query releasing, which improves the sample complexity dependence on the confidence parameter for the celebrated private multiplicative weights algorithm of Hardt and Rothblum (FOCS 2010).

Cite as

Edith Cohen, Xin Lyu, Jelani Nelson, Tamás Sarlós, and Uri Stemmer. Generalized Private Selection and Testing with High Confidence. In 14th Innovations in Theoretical Computer Science Conference (ITCS 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 251, pp. 39:1-39:23, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

Copy BibTex To Clipboard

@InProceedings{cohen_et_al:LIPIcs.ITCS.2023.39,
  author =	{Cohen, Edith and Lyu, Xin and Nelson, Jelani and Sarl\'{o}s, Tam\'{a}s and Stemmer, Uri},
  title =	{{Generalized Private Selection and Testing with High Confidence}},
  booktitle =	{14th Innovations in Theoretical Computer Science Conference (ITCS 2023)},
  pages =	{39:1--39:23},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-263-1},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{251},
  editor =	{Tauman Kalai, Yael},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2023.39},
  URN =		{urn:nbn:de:0030-drops-175426},
  doi =		{10.4230/LIPIcs.ITCS.2023.39},
  annote =	{Keywords: differential privacy, sparse vector technique, adaptive data analysis}
}

Document

DOI: 10.4230/LIPIcs.ITCS.2023.55

Private Counting of Distinct and k-Occurring Items in Time Windows

Authors: Badih Ghazi, Ravi Kumar, Jelani Nelson, and Pasin Manurangsi

Published in: LIPIcs, Volume 251, 14th Innovations in Theoretical Computer Science Conference (ITCS 2023)

Abstract

In this work, we study the task of estimating the numbers of distinct and k-occurring items in a time window under the constraint of differential privacy (DP). We consider several variants depending on whether the queries are on general time windows (between times t₁ and t₂), or are restricted to being cumulative (between times 1 and t₂), and depending on whether the DP neighboring relation is event-level or the more stringent item-level. We obtain nearly tight upper and lower bounds on the errors of DP algorithms for these problems. En route, we obtain an event-level DP algorithm for estimating, at each time step, the number of distinct items seen over the last W updates with error polylogarithmic in W; this answers an open question of Bolot et al. (ICDT 2013).

Cite as

Badih Ghazi, Ravi Kumar, Jelani Nelson, and Pasin Manurangsi. Private Counting of Distinct and k-Occurring Items in Time Windows. In 14th Innovations in Theoretical Computer Science Conference (ITCS 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 251, pp. 55:1-55:24, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

Copy BibTex To Clipboard

@InProceedings{ghazi_et_al:LIPIcs.ITCS.2023.55,
  author =	{Ghazi, Badih and Kumar, Ravi and Nelson, Jelani and Manurangsi, Pasin},
  title =	{{Private Counting of Distinct and k-Occurring Items in Time Windows}},
  booktitle =	{14th Innovations in Theoretical Computer Science Conference (ITCS 2023)},
  pages =	{55:1--55:24},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-263-1},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{251},
  editor =	{Tauman Kalai, Yael},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2023.55},
  URN =		{urn:nbn:de:0030-drops-175580},
  doi =		{10.4230/LIPIcs.ITCS.2023.55},
  annote =	{Keywords: Differential Privacy, Algorithms, Distinct Elements, Time Windows}
}

Document

DOI: 10.4230/LIPIcs.STACS.2021.45

An Improved Sketching Algorithm for Edit Distance

Authors: Ce Jin, Jelani Nelson, and Kewen Wu

Published in: LIPIcs, Volume 187, 38th International Symposium on Theoretical Aspects of Computer Science (STACS 2021)

Abstract

We provide improved upper bounds for the simultaneous sketching complexity of edit distance. Consider two parties, Alice with input x ∈ Σⁿ and Bob with input y ∈ Σⁿ, that share public randomness and are given a promise that the edit distance ed(x,y) between their two strings is at most some given value k. Alice must send a message sx and Bob must send sy to a third party Charlie, who does not know the inputs but shares the same public randomness and also knows k. Charlie must output ed(x,y) precisely as well as a sequence of ed(x,y) edits required to transform x into y. The goal is to minimize the lengths |sx|, |sy| of the messages sent. The protocol of Belazzougui and Zhang (FOCS 2016), building upon the random walk method of Chakraborty, Goldenberg, and Koucký (STOC 2016), achieves a maximum message length of Õ(k⁸) bits, where Õ(⋅) hides poly(log n) factors. In this work we build upon Belazzougui and Zhang’s protocol and provide an improved analysis demonstrating that a slight modification of their construction achieves a bound of Õ(k³).

Cite as

Ce Jin, Jelani Nelson, and Kewen Wu. An Improved Sketching Algorithm for Edit Distance. In 38th International Symposium on Theoretical Aspects of Computer Science (STACS 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 187, pp. 45:1-45:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)

Copy BibTex To Clipboard

@InProceedings{jin_et_al:LIPIcs.STACS.2021.45,
  author =	{Jin, Ce and Nelson, Jelani and Wu, Kewen},
  title =	{{An Improved Sketching Algorithm for Edit Distance}},
  booktitle =	{38th International Symposium on Theoretical Aspects of Computer Science (STACS 2021)},
  pages =	{45:1--45:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-180-1},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{187},
  editor =	{Bl\"{a}ser, Markus and Monmege, Benjamin},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.STACS.2021.45},
  URN =		{urn:nbn:de:0030-drops-136905},
  doi =		{10.4230/LIPIcs.STACS.2021.45},
  annote =	{Keywords: edit distance, sketching}
}

Document

DOI: 10.4230/OASIcs.SOSA.2018.15

Simple Analyses of the Sparse Johnson-Lindenstrauss Transform

Authors: Michael B. Cohen, T.S. Jayram, and Jelani Nelson

Published in: OASIcs, Volume 61, 1st Symposium on Simplicity in Algorithms (SOSA 2018)

Abstract

For every n-point subset X of Euclidean space and target distortion 1+eps for 0<eps<1, the Sparse Johnson Lindenstrauss Transform (SJLT) of (Kane, Nelson, J. ACM 2014) provides a linear dimensionality-reducing map f:X-->l_2^m where f(x) = Ax for A a matrix with m rows where (1) m = O((log n)/eps^2), and (2) each column of A is sparse, having only O(eps m) non-zero entries. Though the constructions given for such A in (Kane, Nelson, J. ACM 2014) are simple, the analyses are not, employing intricate combinatorial arguments. We here give two simple alternative proofs of their main result, involving no delicate combinatorics. One of these proofs has already been tested pedagogically, requiring slightly under forty minutes by the third author at a casual pace to cover all details in a blackboard course lecture.

Cite as

Michael B. Cohen, T.S. Jayram, and Jelani Nelson. Simple Analyses of the Sparse Johnson-Lindenstrauss Transform. In 1st Symposium on Simplicity in Algorithms (SOSA 2018). Open Access Series in Informatics (OASIcs), Volume 61, pp. 15:1-15:9, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)

Copy BibTex To Clipboard

@InProceedings{cohen_et_al:OASIcs.SOSA.2018.15,
  author =	{Cohen, Michael B. and Jayram, T.S. and Nelson, Jelani},
  title =	{{Simple Analyses of the Sparse Johnson-Lindenstrauss Transform}},
  booktitle =	{1st Symposium on Simplicity in Algorithms (SOSA 2018)},
  pages =	{15:1--15:9},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-064-4},
  ISSN =	{2190-6807},
  year =	{2018},
  volume =	{61},
  editor =	{Seidel, Raimund},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.SOSA.2018.15},
  URN =		{urn:nbn:de:0030-drops-83056},
  doi =		{10.4230/OASIcs.SOSA.2018.15},
  annote =	{Keywords: dimensionality reduction, Johnson-Lindenstrauss, Sparse Johnson-Lindenstrauss Transform}
}

Document

DOI: 10.4230/LIPIcs.APPROX-RANDOM.2017.32

Continuous Monitoring of l_p Norms in Data Streams

Authors: Jaroslaw Blasiok, Jian Ding, and Jelani Nelson

Published in: LIPIcs, Volume 81, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2017)

Abstract

In insertion-only streaming, one sees a sequence of indices a_1, a_2, ..., a_m in [n]. The stream defines a sequence of m frequency vectors x(1), ..., x(m) each in R^n, where x(t) is the frequency vector of items after seeing the first t indices in the stream. Much work in the streaming literature focuses on estimating some function f(x(m)). Many applications though require obtaining estimates at time t of f(x(t)), for every t in [m]. Naively this guarantee is obtained by devising an algorithm with failure probability less than 1/m, then performing a union bound over all stream updates to guarantee that all m estimates are simultaneously accurate with good probability. When f(x) is some l_p norm of x, recent works have shown that this union bound is wasteful and better space complexity is possible for the continuous monitoring problem, with the strongest known results being for p=2. In this work, we improve the state of the art for all 0<p<2, which we obtain via a novel analysis of Indyk's p-stable sketch.

Cite as

Jaroslaw Blasiok, Jian Ding, and Jelani Nelson. Continuous Monitoring of l_p Norms in Data Streams. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 81, pp. 32:1-32:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)

Copy BibTex To Clipboard

@InProceedings{blasiok_et_al:LIPIcs.APPROX-RANDOM.2017.32,
  author =	{Blasiok, Jaroslaw and Ding, Jian and Nelson, Jelani},
  title =	{{Continuous Monitoring of l\underlinep Norms in Data Streams}},
  booktitle =	{Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2017)},
  pages =	{32:1--32:13},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-044-6},
  ISSN =	{1868-8969},
  year =	{2017},
  volume =	{81},
  editor =	{Jansen, Klaus and Rolim, Jos\'{e} D. P. and Williamson, David P. and Vempala, Santosh S.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX-RANDOM.2017.32},
  URN =		{urn:nbn:de:0030-drops-75816},
  doi =		{10.4230/LIPIcs.APPROX-RANDOM.2017.32},
  annote =	{Keywords: data streams, continuous monitoring, moment estimation}
}

Document

DOI: 10.4230/LIPIcs.ICALP.2016.11

Optimal Approximate Matrix Product in Terms of Stable Rank

Authors: Michael B. Cohen, Jelani Nelson, and David P. Woodruff

Published in: LIPIcs, Volume 55, 43rd International Colloquium on Automata, Languages, and Programming (ICALP 2016)

Abstract

We prove, using the subspace embedding guarantee in a black box way, that one can achieve the spectral norm guarantee for approximate matrix multiplication with a dimensionality-reducing map having m = O(˜r/epsilon^2) rows. Here r˜ is the maximum stable rank, i.e., the squared ratio of Frobenius and operator norms, of the two matrices being multiplied. This is a quantitative improvement over previous work of [Magen and Zouzias, SODA, 2011] and [Kyrillidis et al., arXiv, 2014] and is also optimal for any oblivious dimensionality-reducing map. Furthermore, due to the black box reliance on the subspace embedding property in our proofs, our theorem can be applied to a much more general class of sketching matrices than what was known before, in addition to achieving better bounds. For example, one can apply our theorem to efficient subspace embeddings such as the Subsampled Randomized Hadamard Transform or sparse subspace embeddings, or even with subspace embedding constructions that may be developed in the future. Our main theorem, via connections with spectral error matrix multiplication proven in previous work, implies quantitative improvements for approximate least squares regression and low rank approximation, and implies faster low rank approximation for popular kernels in machine learning such as the gaussian and Sobolev kernels. Our main result has also already been applied to improve dimensionality reduction guarantees for k-means clustering, and also implies new results for nonparametric regression. Lastly, we point out that the proof of the "BSS" deterministic row-sampling result of [Batson et al., SICOMP, 2012] can be modified to obtain deterministic row-sampling for approximate matrix product in terms of the stable rank of the matrices. The original "BSS" proof was in terms of the rank rather than the stable rank.

Cite as

Michael B. Cohen, Jelani Nelson, and David P. Woodruff. Optimal Approximate Matrix Product in Terms of Stable Rank. In 43rd International Colloquium on Automata, Languages, and Programming (ICALP 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 55, pp. 11:1-11:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016)

Copy BibTex To Clipboard

@InProceedings{cohen_et_al:LIPIcs.ICALP.2016.11,
  author =	{Cohen, Michael B. and Nelson, Jelani and Woodruff, David P.},
  title =	{{Optimal Approximate Matrix Product in Terms of Stable Rank}},
  booktitle =	{43rd International Colloquium on Automata, Languages, and Programming (ICALP 2016)},
  pages =	{11:1--11:14},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-013-2},
  ISSN =	{1868-8969},
  year =	{2016},
  volume =	{55},
  editor =	{Chatzigiannakis, Ioannis and Mitzenmacher, Michael and Rabani, Yuval and Sangiorgi, Davide},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2016.11},
  URN =		{urn:nbn:de:0030-drops-62788},
  doi =		{10.4230/LIPIcs.ICALP.2016.11},
  annote =	{Keywords: subspace embeddings, approximate matrix multiplication, stable rank, regression, low rank approximation}
}

Document

DOI: 10.4230/LIPIcs.ICALP.2016.44

An Improved Analysis of the ER-SpUD Dictionary Learning Algorithm

Authors: Jaroslaw Blasiok and Jelani Nelson

Published in: LIPIcs, Volume 55, 43rd International Colloquium on Automata, Languages, and Programming (ICALP 2016)

Abstract

In dictionary learning we observe Y = AX + E for some Y in R^{n*p}, A in R^{m*n}, and X in R^{m*p}, where p >= max{n, m}, and typically m >=n. The matrix Y is observed, and A, X, E are unknown. Here E is a "noise" matrix of small norm, and X is column-wise sparse. The matrix A is referred to as a dictionary, and its columns as atoms. Then, given some small number p of samples, i.e. columns of Y , the goal is to learn the dictionary A up to small error, as well as the coefficient matrix X. In applications one could for example think of each column of Y as a distinct image in a database. The motivation is that in many applications data is expected to sparse when represented by atoms in the "right" dictionary A (e.g. images in the Haar wavelet basis), and the goal is to learn A from the data to then use it for other applications. Recently, the work of [Spielman/Wang/Wright, COLT'12] proposed the dictionary learning algorithm ER-SpUD with provable guarantees when E = 0 and m = n. That work showed that if X has independent entries with an expected Theta n non-zeroes per column for 1/n <~ Theta <~ 1/sqrt(n), and with non-zero entries being subgaussian, then for p >~ n^2 log^2 n with high probability ER-SpUD outputs matrices A', X' which equal A, X up to permuting and scaling columns (resp. rows) of A (resp. X). They conjectured that p >~ n log n suffices, which they showed was information theoretically necessary for any algorithm to succeed when Theta =~ 1/n. Significant progress toward showing that p >~ n log^4 n might suffice was later obtained in [Luh/Vu, FOCS'15]. In this work, we show that for a slight variant of ER-SpUD, p >~ n log(n/delta) samples suffice for successful recovery with probability 1 - delta. We also show that without our slight variation made to ER-SpUD, p >~ n^{1.99} samples are required even to learn A, X with a small success probability of 1/ poly(n). This resolves the main conjecture of [Spielman/Wang/Wright, COLT'12], and contradicts a result of [Luh/Vu, FOCS'15], which claimed that p >~ n log^4 n guarantees high probability of success for the original ER-SpUD algorithm.

Cite as

Jaroslaw Blasiok and Jelani Nelson. An Improved Analysis of the ER-SpUD Dictionary Learning Algorithm. In 43rd International Colloquium on Automata, Languages, and Programming (ICALP 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 55, pp. 44:1-44:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016)

Copy BibTex To Clipboard

@InProceedings{blasiok_et_al:LIPIcs.ICALP.2016.44,
  author =	{Blasiok, Jaroslaw and Nelson, Jelani},
  title =	{{An Improved Analysis of the ER-SpUD Dictionary Learning Algorithm}},
  booktitle =	{43rd International Colloquium on Automata, Languages, and Programming (ICALP 2016)},
  pages =	{44:1--44:14},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-013-2},
  ISSN =	{1868-8969},
  year =	{2016},
  volume =	{55},
  editor =	{Chatzigiannakis, Ioannis and Mitzenmacher, Michael and Rabani, Yuval and Sangiorgi, Davide},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2016.44},
  URN =		{urn:nbn:de:0030-drops-63246},
  doi =		{10.4230/LIPIcs.ICALP.2016.44},
  annote =	{Keywords: dictionary learning, stochastic processes, generic chaining}
}

Document

DOI: 10.4230/LIPIcs.ICALP.2016.82

The Johnson-Lindenstrauss Lemma Is Optimal for Linear Dimensionality Reduction

Authors: Kasper Green Larsen and Jelani Nelson

Published in: LIPIcs, Volume 55, 43rd International Colloquium on Automata, Languages, and Programming (ICALP 2016)

Abstract

For any n > 1, 0 < epsilon < 1/2, and N > n^C for some constant C > 0, we show the existence of an N-point subset X of l_2^n such that any linear map from X to l_2^m with distortion at most 1 + epsilon must have m = Omega(min{n, epsilon^{-2}*lg(N)). This improves a lower bound of Alon [Alon, Discre. Mathem., 1999], in the linear setting, by a lg(1/epsilon) factor. Our lower bound matches the upper bounds provided by the identity matrix and the Johnson-Lindenstrauss lemma [Johnson and Lindenstrauss, Contem. Mathem., 1984].

Cite as

Kasper Green Larsen and Jelani Nelson. The Johnson-Lindenstrauss Lemma Is Optimal for Linear Dimensionality Reduction. In 43rd International Colloquium on Automata, Languages, and Programming (ICALP 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 55, pp. 82:1-82:11, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016)

Copy BibTex To Clipboard

@InProceedings{larsen_et_al:LIPIcs.ICALP.2016.82,
  author =	{Larsen, Kasper Green and Nelson, Jelani},
  title =	{{The Johnson-Lindenstrauss Lemma Is Optimal for Linear Dimensionality Reduction}},
  booktitle =	{43rd International Colloquium on Automata, Languages, and Programming (ICALP 2016)},
  pages =	{82:1--82:11},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-013-2},
  ISSN =	{1868-8969},
  year =	{2016},
  volume =	{55},
  editor =	{Chatzigiannakis, Ioannis and Mitzenmacher, Michael and Rabani, Yuval and Sangiorgi, Davide},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2016.82},
  URN =		{urn:nbn:de:0030-drops-62032},
  doi =		{10.4230/LIPIcs.ICALP.2016.82},
  annote =	{Keywords: dimensionality reduction, lower bounds, Johnson-Lindenstrauss}
}

Search Results

Documents authored by Nelson, Jelani

Space Complexity of Minimum Cut Problems in Single-Pass Streams

Abstract

Cite as

On the Amortized Complexity of Approximate Counting

Abstract

Cite as

Differentially Private Aggregation via Imperfect Shuffling

Abstract

Cite as

Generalized Private Selection and Testing with High Confidence

Abstract

Cite as

Private Counting of Distinct and k-Occurring Items in Time Windows

Abstract

Cite as

An Improved Sketching Algorithm for Edit Distance

Abstract

Cite as

Simple Analyses of the Sparse Johnson-Lindenstrauss Transform

Abstract

Cite as

Continuous Monitoring of l_p Norms in Data Streams

Abstract

Cite as

Optimal Approximate Matrix Product in Terms of Stable Rank

Abstract

Cite as

An Improved Analysis of the ER-SpUD Dictionary Learning Algorithm

Abstract

Cite as

The Johnson-Lindenstrauss Lemma Is Optimal for Linear Dimensionality Reduction

Abstract

Cite as

Thanks for your feedback!

Could not send message