Document

**Published in:** LIPIcs, Volume 267, 4th Conference on Information-Theoretic Cryptography (ITC 2023)

In this paper, we introduce the imperfect shuffle differential privacy model, where messages sent from users are shuffled in an almost uniform manner before being observed by a curator for private aggregation. We then consider the private summation problem. We show that the standard split-and-mix protocol by Ishai et al. [FOCS 2006] can be adapted to achieve near-optimal utility bounds in the imperfect shuffle model. Specifically, we show that, surprisingly, no additional error overhead is necessary in the imperfect shuffle model.
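
For orientation, the basic split-and-mix idea behind private summation can be sketched as follows. This is only a minimal illustration under stated assumptions, not the adapted protocol from the paper: it uses an idealized, perfectly uniform shuffle, Python's non-cryptographic `random` module, and adds no noise. The function names, `modulus`, `num_shares`, and `seed` are hypothetical.

```python
import random


def split_into_shares(value, num_shares, modulus, rng):
    """Split `value` into additive shares that sum to `value` modulo `modulus`."""
    # The first num_shares - 1 shares are uniform; the last one fixes the sum.
    shares = [rng.randrange(modulus) for _ in range(num_shares - 1)]
    shares.append((value - sum(shares)) % modulus)
    return shares


def split_and_mix_sum(inputs, num_shares, modulus, seed=0):
    """Aggregate the inputs modulo `modulus` from a shuffled pool of shares."""
    rng = random.Random(seed)  # assumption: stand-in for secure randomness
    messages = []
    for x in inputs:
        messages.extend(split_into_shares(x % modulus, num_shares, modulus, rng))
    # Assumption: a perfectly uniform shuffle; the paper studies the case
    # where the shuffle is only almost uniform.
    rng.shuffle(messages)
    # The curator sees only the shuffled shares and adds them up.
    return sum(messages) % modulus


print(split_and_mix_sum([3, 5, 7], num_shares=4, modulus=101))
```

Because the shares of each input sum to that input modulo `modulus`, the curator recovers the exact sum whenever the true sum is smaller than `modulus`, regardless of the order in which the shares arrive.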

Badih Ghazi, Ravi Kumar, Pasin Manurangsi, Jelani Nelson, and Samson Zhou. Differentially Private Aggregation via Imperfect Shuffling. In 4th Conference on Information-Theoretic Cryptography (ITC 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 267, pp. 17:1-17:22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

@InProceedings{ghazi_et_al:LIPIcs.ITC.2023.17, author = {Ghazi, Badih and Kumar, Ravi and Manurangsi, Pasin and Nelson, Jelani and Zhou, Samson}, title = {{Differentially Private Aggregation via Imperfect Shuffling}}, booktitle = {4th Conference on Information-Theoretic Cryptography (ITC 2023)}, pages = {17:1--17:22}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-271-6}, ISSN = {1868-8969}, year = {2023}, volume = {267}, editor = {Chung, Kai-Min}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ITC.2023.17}, URN = {urn:nbn:de:0030-drops-183453}, doi = {10.4230/LIPIcs.ITC.2023.17}, annote = {Keywords: Differential privacy, private summation, shuffle model} }

**Published in:** LIPIcs, Volume 251, 14th Innovations in Theoretical Computer Science Conference (ITCS 2023)

Composition theorems are general and powerful tools that facilitate privacy accounting across multiple data accesses from per-access privacy bounds. However, they often result in weaker bounds compared with end-to-end analysis. Two popular tools that mitigate this weakness are the exponential mechanism (or report noisy max) and the sparse vector technique, generalized in a recent private selection framework by Liu and Talwar (STOC 2019). In this work, we propose a flexible framework of private selection and testing that generalizes the one proposed by Liu and Talwar, supporting a wide range of applications. We apply our framework to solve several fundamental tasks, including query releasing, top-k selection, and stable selection, with improved confidence-accuracy tradeoffs. Additionally, for online settings, we apply our private testing to design a mechanism for adaptive query releasing, which improves the sample complexity dependence on the confidence parameter for the celebrated private multiplicative weights algorithm of Hardt and Rothblum (FOCS 2010).

Edith Cohen, Xin Lyu, Jelani Nelson, Tamás Sarlós, and Uri Stemmer. Generalized Private Selection and Testing with High Confidence. In 14th Innovations in Theoretical Computer Science Conference (ITCS 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 251, pp. 39:1-39:23, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

@InProceedings{cohen_et_al:LIPIcs.ITCS.2023.39, author = {Cohen, Edith and Lyu, Xin and Nelson, Jelani and Sarl\'{o}s, Tam\'{a}s and Stemmer, Uri}, title = {{Generalized Private Selection and Testing with High Confidence}}, booktitle = {14th Innovations in Theoretical Computer Science Conference (ITCS 2023)}, pages = {39:1--39:23}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-263-1}, ISSN = {1868-8969}, year = {2023}, volume = {251}, editor = {Tauman Kalai, Yael}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2023.39}, URN = {urn:nbn:de:0030-drops-175426}, doi = {10.4230/LIPIcs.ITCS.2023.39}, annote = {Keywords: differential privacy, sparse vector technique, adaptive data analysis} }

**Published in:** LIPIcs, Volume 251, 14th Innovations in Theoretical Computer Science Conference (ITCS 2023)

In this work, we study the task of estimating the numbers of distinct and k-occurring items in a time window under the constraint of differential privacy (DP). We consider several variants depending on whether the queries are on general time windows (between times t₁ and t₂), or are restricted to being cumulative (between times 1 and t₂), and depending on whether the DP neighboring relation is event-level or the more stringent item-level. We obtain nearly tight upper and lower bounds on the errors of DP algorithms for these problems. En route, we obtain an event-level DP algorithm for estimating, at each time step, the number of distinct items seen over the last W updates with error polylogarithmic in W; this answers an open question of Bolot et al. (ICDT 2013).

Badih Ghazi, Ravi Kumar, Jelani Nelson, and Pasin Manurangsi. Private Counting of Distinct and k-Occurring Items in Time Windows. In 14th Innovations in Theoretical Computer Science Conference (ITCS 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 251, pp. 55:1-55:24, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

@InProceedings{ghazi_et_al:LIPIcs.ITCS.2023.55, author = {Ghazi, Badih and Kumar, Ravi and Nelson, Jelani and Manurangsi, Pasin}, title = {{Private Counting of Distinct and k-Occurring Items in Time Windows}}, booktitle = {14th Innovations in Theoretical Computer Science Conference (ITCS 2023)}, pages = {55:1--55:24}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-263-1}, ISSN = {1868-8969}, year = {2023}, volume = {251}, editor = {Tauman Kalai, Yael}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2023.55}, URN = {urn:nbn:de:0030-drops-175580}, doi = {10.4230/LIPIcs.ITCS.2023.55}, annote = {Keywords: Differential Privacy, Algorithms, Distinct Elements, Time Windows} }

**Published in:** LIPIcs, Volume 187, 38th International Symposium on Theoretical Aspects of Computer Science (STACS 2021)

We provide improved upper bounds for the simultaneous sketching complexity of edit distance. Consider two parties, Alice with input x ∈ Σⁿ and Bob with input y ∈ Σⁿ, that share public randomness and are given a promise that the edit distance ed(x,y) between their two strings is at most some given value k. Alice must send a message s_x and Bob must send s_y to a third party Charlie, who does not know the inputs but shares the same public randomness and also knows k. Charlie must output ed(x,y) precisely as well as a sequence of ed(x,y) edits required to transform x into y. The goal is to minimize the lengths |s_x|, |s_y| of the messages sent.
The protocol of Belazzougui and Zhang (FOCS 2016), building upon the random walk method of Chakraborty, Goldenberg, and Koucký (STOC 2016), achieves a maximum message length of Õ(k⁸) bits, where Õ(⋅) hides poly(log n) factors. In this work we build upon Belazzougui and Zhang’s protocol and provide an improved analysis demonstrating that a slight modification of their construction achieves a bound of Õ(k³).
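
To make the target quantity concrete, the value ed(x,y) that Charlie must output can be computed by the textbook dynamic program when both strings are available in one place. The sketch below is that centralized baseline only, assuming the usual insert/delete/substitute edit operations; it is not the sketching protocol, and the function name is hypothetical.

```python
def edit_distance(x, y):
    """Return the minimum number of insertions, deletions, and substitutions
    needed to transform string x into string y."""
    # prev[j] holds the distance between the processed prefix of x and y[:j].
    prev = list(range(len(y) + 1))
    for i, cx in enumerate(x, start=1):
        curr = [i]  # turning x[:i] into the empty string takes i deletions
        for j, cy in enumerate(y, start=1):
            curr.append(min(
                prev[j] + 1,               # delete cx
                curr[j - 1] + 1,           # insert cy
                prev[j - 1] + (cx != cy),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


print(edit_distance("kitten", "sitting"))
```

This baseline needs both inputs in full, whereas the protocols discussed above aim to send messages whose length depends polynomially on k and only polylogarithmically on n.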

Ce Jin, Jelani Nelson, and Kewen Wu. An Improved Sketching Algorithm for Edit Distance. In 38th International Symposium on Theoretical Aspects of Computer Science (STACS 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 187, pp. 45:1-45:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)

@InProceedings{jin_et_al:LIPIcs.STACS.2021.45, author = {Jin, Ce and Nelson, Jelani and Wu, Kewen}, title = {{An Improved Sketching Algorithm for Edit Distance}}, booktitle = {38th International Symposium on Theoretical Aspects of Computer Science (STACS 2021)}, pages = {45:1--45:16}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-180-1}, ISSN = {1868-8969}, year = {2021}, volume = {187}, editor = {Bl\"{a}ser, Markus and Monmege, Benjamin}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.STACS.2021.45}, URN = {urn:nbn:de:0030-drops-136905}, doi = {10.4230/LIPIcs.STACS.2021.45}, annote = {Keywords: edit distance, sketching} }

**Published in:** OASIcs, Volume 61, 1st Symposium on Simplicity in Algorithms (SOSA 2018)

For every n-point subset X of Euclidean space and target distortion 1+eps for 0<eps<1, the Sparse Johnson-Lindenstrauss Transform (SJLT) of (Kane, Nelson, J. ACM 2014) provides a linear dimensionality-reducing map f: X -> l_2^m where f(x) = Ax for A a matrix with m rows where (1) m = O((log n)/eps^2), and (2) each column of A is sparse, having only O(eps m) non-zero entries. Though the constructions given for such A in (Kane, Nelson, J. ACM 2014) are simple, the analyses are not, employing intricate combinatorial arguments. We here give two simple alternative proofs of their main result, involving no delicate combinatorics. One of these proofs has already been tested pedagogically, requiring slightly under forty minutes by the third author at a casual pace to cover all details in a blackboard course lecture.
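
As a rough illustration of what such a sparse matrix A can look like, the sketch below builds a block-style sparse sign matrix: the m rows are cut into s equal blocks, and every column receives one entry of magnitude 1/sqrt(s) with a random sign in each block. The assumptions are loud ones: fully random choices stand in for the hash functions used in the literature, the parameters `m`, `s`, and `seed` are picked by hand rather than from the bounds above, and all names are hypothetical.

```python
import math
import random


def sparse_jl_columns(m, n, s, seed=0):
    """Return n sparse columns, each a dict {row: value} with exactly s non-zeros."""
    if s <= 0 or m % s != 0:
        raise ValueError("m must be a positive multiple of s")
    rng = random.Random(seed)  # assumption: full randomness, not limited independence
    block = m // s
    columns = []
    for _ in range(n):
        col = {}
        for b in range(s):
            row = b * block + rng.randrange(block)  # one non-zero per block
            col[row] = rng.choice((-1.0, 1.0)) / math.sqrt(s)
        columns.append(col)
    return columns


def apply_map(columns, m, x):
    """Compute y = A x, touching only the non-zero entries of A."""
    y = [0.0] * m
    for j, xj in enumerate(x):
        if xj != 0:
            for row, val in columns[j].items():
                y[row] += val * xj
    return y


cols = sparse_jl_columns(m=12, n=5, s=3)
print(apply_map(cols, 12, [1.0, 0.0, 2.0, 0.0, 0.0]))
```

Each column has unit Euclidean norm by construction, so standard basis vectors keep their norm exactly; how well general vectors are preserved depends on choosing m and s as in the stated bounds.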

Michael B. Cohen, T.S. Jayram, and Jelani Nelson. Simple Analyses of the Sparse Johnson-Lindenstrauss Transform. In 1st Symposium on Simplicity in Algorithms (SOSA 2018). Open Access Series in Informatics (OASIcs), Volume 61, pp. 15:1-15:9, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)

@InProceedings{cohen_et_al:OASIcs.SOSA.2018.15, author = {Cohen, Michael B. and Jayram, T.S. and Nelson, Jelani}, title = {{Simple Analyses of the Sparse Johnson-Lindenstrauss Transform}}, booktitle = {1st Symposium on Simplicity in Algorithms (SOSA 2018)}, pages = {15:1--15:9}, series = {Open Access Series in Informatics (OASIcs)}, ISBN = {978-3-95977-064-4}, ISSN = {2190-6807}, year = {2018}, volume = {61}, editor = {Seidel, Raimund}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops-dev.dagstuhl.de/entities/document/10.4230/OASIcs.SOSA.2018.15}, URN = {urn:nbn:de:0030-drops-83056}, doi = {10.4230/OASIcs.SOSA.2018.15}, annote = {Keywords: dimensionality reduction, Johnson-Lindenstrauss, Sparse Johnson-Lindenstrauss Transform} }

**Published in:** LIPIcs, Volume 81, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2017)

In insertion-only streaming, one sees a sequence of indices a_1, a_2, ..., a_m in [n]. The stream defines a sequence of m frequency vectors x(1), ..., x(m) each in R^n, where x(t) is the frequency vector of items after seeing the first t indices in the stream. Much work in the streaming literature focuses on estimating some function f(x(m)). Many applications, though, require obtaining estimates at time t of f(x(t)), for every t in [m]. Naively, this guarantee is obtained by devising an algorithm with failure probability less than 1/m, then performing a union bound over all stream updates to guarantee that all m estimates are simultaneously accurate with good probability. When f(x) is some l_p norm of x, recent works have shown that this union bound is wasteful and better space complexity is possible for the continuous monitoring problem, with the strongest known results being for p=2. In this work, we improve the state of the art for all 0<p<2, which we obtain via a novel analysis of Indyk's p-stable sketch.

Jaroslaw Blasiok, Jian Ding, and Jelani Nelson. Continuous Monitoring of l_p Norms in Data Streams. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 81, pp. 32:1-32:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)

@InProceedings{blasiok_et_al:LIPIcs.APPROX-RANDOM.2017.32, author = {Blasiok, Jaroslaw and Ding, Jian and Nelson, Jelani}, title = {{Continuous Monitoring of $l_p$ Norms in Data Streams}}, booktitle = {Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2017)}, pages = {32:1--32:13}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-044-6}, ISSN = {1868-8969}, year = {2017}, volume = {81}, editor = {Jansen, Klaus and Rolim, Jos\'{e} D. P. and Williamson, David P. and Vempala, Santosh S.}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX-RANDOM.2017.32}, URN = {urn:nbn:de:0030-drops-75816}, doi = {10.4230/LIPIcs.APPROX-RANDOM.2017.32}, annote = {Keywords: data streams, continuous monitoring, moment estimation} }

**Published in:** LIPIcs, Volume 55, 43rd International Colloquium on Automata, Languages, and Programming (ICALP 2016)

We prove, using the subspace embedding guarantee in a black box way, that one can achieve the spectral norm guarantee for approximate matrix multiplication with a dimensionality-reducing map having m = O(r̃/epsilon^2) rows. Here r̃ is the maximum stable rank, i.e., the squared ratio of Frobenius and operator norms, of the two matrices being multiplied. This is a quantitative improvement over previous work of [Magen and Zouzias, SODA, 2011] and [Kyrillidis et al., arXiv, 2014] and is also optimal for any oblivious dimensionality-reducing map. Furthermore, due to the black box reliance on the subspace embedding property in our proofs, our theorem can be applied to a much more general class of sketching matrices than what was known before, in addition to achieving better bounds. For example, one can apply our theorem to efficient subspace embeddings such as the Subsampled Randomized Hadamard Transform or sparse subspace embeddings, or even with subspace embedding constructions that may be developed in the future.
Our main theorem, via connections with spectral error matrix multiplication proven in previous work, implies quantitative improvements for approximate least squares regression and low rank approximation, and implies faster low rank approximation for popular kernels in machine learning such as the Gaussian and Sobolev kernels. Our main result has also already been applied to improve dimensionality reduction guarantees for k-means clustering, and also implies new results for nonparametric regression.
Lastly, we point out that the proof of the "BSS" deterministic row-sampling result of [Batson et al., SICOMP, 2012] can be modified to obtain deterministic row-sampling for approximate matrix product in terms of the stable rank of the matrices. The original "BSS" proof was in terms of the rank rather than the stable rank.
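
The stable rank used in this abstract, the squared ratio of the Frobenius norm to the operator norm, can be estimated numerically. The sketch below is one simple way to do so under stated assumptions: plain power iteration on AᵀA from an all-ones start vector with a fixed iteration count, which may converge slowly or fail for some matrices. The function name and the `iterations` parameter are hypothetical.

```python
import math


def stable_rank(matrix, iterations=200):
    """Estimate ||A||_F^2 / ||A||_2^2 for a non-zero matrix given as a list of rows."""
    cols = len(matrix[0])
    frob_sq = sum(v * v for row in matrix for v in row)
    if frob_sq == 0:
        raise ValueError("stable rank is undefined for the zero matrix")
    # Assumption: the all-ones start vector is not orthogonal to the top
    # right singular vector of A.
    v = [1.0 / math.sqrt(cols)] * cols
    top_sq = 0.0
    for _ in range(iterations):
        av = [sum(a * b for a, b in zip(row, v)) for row in matrix]  # A v
        w = [sum(row[j] * t for row, t in zip(matrix, av)) for j in range(cols)]  # A^T (A v)
        norm = math.sqrt(sum(t * t for t in w))
        if norm == 0:
            raise ValueError("start vector lies in the null space of the matrix")
        top_sq = norm  # for unit v, ||A^T A v|| tends to the squared operator norm
        v = [t / norm for t in w]
    return frob_sq / top_sq


print(stable_rank([[2.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]))
```

The stable rank is at least 1 and never exceeds the rank; for the two matrices being multiplied, the quantity in the bound above is the larger of their two stable ranks.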

Michael B. Cohen, Jelani Nelson, and David P. Woodruff. Optimal Approximate Matrix Product in Terms of Stable Rank. In 43rd International Colloquium on Automata, Languages, and Programming (ICALP 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 55, pp. 11:1-11:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016)

@InProceedings{cohen_et_al:LIPIcs.ICALP.2016.11, author = {Cohen, Michael B. and Nelson, Jelani and Woodruff, David P.}, title = {{Optimal Approximate Matrix Product in Terms of Stable Rank}}, booktitle = {43rd International Colloquium on Automata, Languages, and Programming (ICALP 2016)}, pages = {11:1--11:14}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-013-2}, ISSN = {1868-8969}, year = {2016}, volume = {55}, editor = {Chatzigiannakis, Ioannis and Mitzenmacher, Michael and Rabani, Yuval and Sangiorgi, Davide}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2016.11}, URN = {urn:nbn:de:0030-drops-62788}, doi = {10.4230/LIPIcs.ICALP.2016.11}, annote = {Keywords: subspace embeddings, approximate matrix multiplication, stable rank, regression, low rank approximation} }

**Published in:** LIPIcs, Volume 55, 43rd International Colloquium on Automata, Languages, and Programming (ICALP 2016)

In dictionary learning we observe Y = AX + E for some Y in R^{n*p}, A in R^{n*m}, and X in R^{m*p}, where p >= max{n, m}, and typically m >= n. The matrix Y is observed, and A, X, E are unknown. Here E is a "noise" matrix of small norm, and X is column-wise sparse. The matrix A is referred to as a dictionary, and its columns as atoms. Then, given some small number p of samples, i.e. columns of Y, the goal is to learn the dictionary A up to small error, as well as the coefficient matrix X. In applications one could for example think of each column of Y as a distinct image in a database. The motivation is that in many applications data is expected to be sparse when represented by atoms in the "right" dictionary A (e.g. images in the Haar wavelet basis), and the goal is to learn A from the data to then use it for other applications.
Recently, the work of [Spielman/Wang/Wright, COLT'12] proposed the dictionary learning algorithm ER-SpUD with provable guarantees when E = 0 and m = n. That work showed that if X has independent entries with an expected Theta n non-zeroes per column for 1/n <~ Theta <~ 1/sqrt(n), and with non-zero entries being subgaussian, then for p >~ n^2 log^2 n with high probability ER-SpUD outputs matrices A', X' which equal A, X up to permuting and scaling columns (resp. rows) of A (resp. X). They conjectured that p >~ n log n suffices, which they showed was information theoretically necessary for any algorithm to succeed when Theta =~ 1/n. Significant progress toward showing that p >~ n log^4 n might suffice was later obtained in [Luh/Vu, FOCS'15].
In this work, we show that for a slight variant of ER-SpUD, p >~ n log(n/delta) samples suffice for successful recovery with probability 1 - delta. We also show that without our slight variation made to ER-SpUD, p >~ n^{1.99} samples are required even to learn A, X with a small success probability of 1/poly(n). This resolves the main conjecture of [Spielman/Wang/Wright, COLT'12], and contradicts a result of [Luh/Vu, FOCS'15], which claimed that p >~ n log^4 n guarantees high probability of success for the original ER-SpUD algorithm.

Jaroslaw Blasiok and Jelani Nelson. An Improved Analysis of the ER-SpUD Dictionary Learning Algorithm. In 43rd International Colloquium on Automata, Languages, and Programming (ICALP 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 55, pp. 44:1-44:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016)

@InProceedings{blasiok_et_al:LIPIcs.ICALP.2016.44, author = {Blasiok, Jaroslaw and Nelson, Jelani}, title = {{An Improved Analysis of the ER-SpUD Dictionary Learning Algorithm}}, booktitle = {43rd International Colloquium on Automata, Languages, and Programming (ICALP 2016)}, pages = {44:1--44:14}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-013-2}, ISSN = {1868-8969}, year = {2016}, volume = {55}, editor = {Chatzigiannakis, Ioannis and Mitzenmacher, Michael and Rabani, Yuval and Sangiorgi, Davide}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2016.44}, URN = {urn:nbn:de:0030-drops-63246}, doi = {10.4230/LIPIcs.ICALP.2016.44}, annote = {Keywords: dictionary learning, stochastic processes, generic chaining} }

**Published in:** LIPIcs, Volume 55, 43rd International Colloquium on Automata, Languages, and Programming (ICALP 2016)

For any n > 1, 0 < epsilon < 1/2, and N > n^C for some constant C > 0, we show the existence of an N-point subset X of l_2^n such that any linear map from X to l_2^m with distortion at most 1 + epsilon must have m = Omega(min{n, epsilon^{-2}*lg(N)}). This improves a lower bound of Alon [Alon, Discre. Mathem., 1999], in the linear setting, by a lg(1/epsilon) factor. Our lower bound matches the upper bounds provided by the identity matrix and the Johnson-Lindenstrauss lemma [Johnson and Lindenstrauss, Contem. Mathem., 1984].

Kasper Green Larsen and Jelani Nelson. The Johnson-Lindenstrauss Lemma Is Optimal for Linear Dimensionality Reduction. In 43rd International Colloquium on Automata, Languages, and Programming (ICALP 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 55, pp. 82:1-82:11, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016)

@InProceedings{larsen_et_al:LIPIcs.ICALP.2016.82, author = {Larsen, Kasper Green and Nelson, Jelani}, title = {{The Johnson-Lindenstrauss Lemma Is Optimal for Linear Dimensionality Reduction}}, booktitle = {43rd International Colloquium on Automata, Languages, and Programming (ICALP 2016)}, pages = {82:1--82:11}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-013-2}, ISSN = {1868-8969}, year = {2016}, volume = {55}, editor = {Chatzigiannakis, Ioannis and Mitzenmacher, Michael and Rabani, Yuval and Sangiorgi, Davide}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2016.82}, URN = {urn:nbn:de:0030-drops-62032}, doi = {10.4230/LIPIcs.ICALP.2016.82}, annote = {Keywords: dimensionality reduction, lower bounds, Johnson-Lindenstrauss} }
