Document

**Published in:** LIPIcs, Volume 287, 15th Innovations in Theoretical Computer Science Conference (ITCS 2024)

We describe two algorithms for multiplying n × n matrices using time and energy Õ(n²) under basic models of classical physics. The first algorithm is for multiplying integer-valued matrices, and the second, quite different algorithm, is for Boolean matrix multiplication. We hope this work inspires a deeper consideration of physically plausible/realizable models of computing that might allow for algorithms which improve upon the runtimes and energy usages suggested by the parallel RAM model in which each operation requires one unit of time and one unit of energy.

Gregory Valiant. Matrix Multiplication in Quadratic Time and Energy? Towards a Fine-Grained Energy-Centric Church-Turing Thesis. In 15th Innovations in Theoretical Computer Science Conference (ITCS 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 287, pp. 96:1-96:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Copy BibTex To Clipboard

@InProceedings{valiant:LIPIcs.ITCS.2024.96, author = {Valiant, Gregory}, title = {{Matrix Multiplication in Quadratic Time and Energy? Towards a Fine-Grained Energy-Centric Church-Turing Thesis}}, booktitle = {15th Innovations in Theoretical Computer Science Conference (ITCS 2024)}, pages = {96:1--96:13}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-309-6}, ISSN = {1868-8969}, year = {2024}, volume = {287}, editor = {Guruswami, Venkatesan}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2024.96}, URN = {urn:nbn:de:0030-drops-196248}, doi = {10.4230/LIPIcs.ITCS.2024.96}, annote = {Keywords: Physics based computing, matrix multiplication, low-energy computing} }

Document

**Published in:** LIPIcs, Volume 251, 14th Innovations in Theoretical Computer Science Conference (ITCS 2023)

We study a "pen testing" problem, in which we are given n pens with unknown amounts of ink X₁, X₂, …, X_n, and we want to choose a pen with the maximum amount of remaining ink in it. The challenge is that we cannot access each X_i directly; we only get to write with the i-th pen until either a certain amount of ink is used, or the pen runs out of ink. In both cases, this testing reduces the remaining ink in the pen and thus the utility of selecting it.
Despite this significant lack of information, we show that it is possible to approximately maximize our utility up to an O(log n) factor. Formally, we consider two different setups: the "prophet" setting, in which each X_i is independently drawn from some distribution 𝒟_i, and the "secretary" setting, in which (X_i)_{i=1}^n is a random permutation of arbitrary a₁, a₂, …, a_n. We derive the optimal competitive ratios in both settings up to constant factors. Our algorithms are surprisingly robust: (1) In the prophet setting, we only require one sample from each 𝒟_i, rather than a full description of the distribution; (2) In the secretary setting, the algorithm also succeeds under an arbitrary permutation, if an estimate of the maximum a_i is given.
Our techniques include a non-trivial online sampling scheme from a sequence with an unknown length, as well as the construction of a hard, non-uniform distribution over permutations. Both might be of independent interest. We also highlight some immediate open problems and discuss several directions for future research.

Mingda Qiao and Gregory Valiant. Online Pen Testing. In 14th Innovations in Theoretical Computer Science Conference (ITCS 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 251, pp. 91:1-91:26, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

Copy BibTex To Clipboard

@InProceedings{qiao_et_al:LIPIcs.ITCS.2023.91, author = {Qiao, Mingda and Valiant, Gregory}, title = {{Online Pen Testing}}, booktitle = {14th Innovations in Theoretical Computer Science Conference (ITCS 2023)}, pages = {91:1--91:26}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-263-1}, ISSN = {1868-8969}, year = {2023}, volume = {251}, editor = {Tauman Kalai, Yael}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2023.91}, URN = {urn:nbn:de:0030-drops-175940}, doi = {10.4230/LIPIcs.ITCS.2023.91}, annote = {Keywords: Optimal stopping, online algorithm} }

Document

**Published in:** LIPIcs, Volume 94, 9th Innovations in Theoretical Computer Science Conference (ITCS 2018)

We introduce a criterion, resilience, which allows properties of a dataset (such as its mean or best low rank approximation) to be robustly computed, even in the presence of a large fraction of arbitrary additional data. Resilience is a weaker condition than most other properties considered so far in the literature, and yet enables robust estimation in a broader variety of settings. We provide new information-theoretic results on robust distribution learning, robust estimation of stochastic block models, and robust mean estimation under bounded kth moments. We also provide new algorithmic results on robust distribution learning, as well as robust mean estimation in p-norms. Among our proof techniques is a method for pruning a high-dimensional distribution with bounded 1st moments to a stable "core" with bounded 2nd moments, which may be of independent interest.

Jacob Steinhardt, Moses Charikar, and Gregory Valiant. Resilience: A Criterion for Learning in the Presence of Arbitrary Outliers. In 9th Innovations in Theoretical Computer Science Conference (ITCS 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 94, pp. 45:1-45:21, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)

Copy BibTex To Clipboard

@InProceedings{steinhardt_et_al:LIPIcs.ITCS.2018.45, author = {Steinhardt, Jacob and Charikar, Moses and Valiant, Gregory}, title = {{Resilience: A Criterion for Learning in the Presence of Arbitrary Outliers}}, booktitle = {9th Innovations in Theoretical Computer Science Conference (ITCS 2018)}, pages = {45:1--45:21}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-060-6}, ISSN = {1868-8969}, year = {2018}, volume = {94}, editor = {Karlin, Anna R.}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2018.45}, URN = {urn:nbn:de:0030-drops-83687}, doi = {10.4230/LIPIcs.ITCS.2018.45}, annote = {Keywords: robust learning, outliers, stochastic block models, p-norm estimation} }

Document

**Published in:** LIPIcs, Volume 94, 9th Innovations in Theoretical Computer Science Conference (ITCS 2018)

We consider the problem of accurately recovering a matrix B of size M by M, which represents a probability distribution over M^2 outcomes, given access to an observed matrix of "counts" generated by taking independent samples from the distribution B. How can structural properties of the underlying matrix B be leveraged to yield computationally efficient and information theoretically optimal reconstruction algorithms? When can accurate reconstruction be accomplished in the sparse data regime? This basic problem lies at the core of a number of questions that are currently being considered by different communities, including building recommendation systems and collaborative filtering in the sparse data regime, community detection in sparse random graphs, learning structured models such as topic models or hidden Markov models, and the efforts from the natural language processing community to compute "word embeddings". Many aspects of this problem---both in terms of learning and property testing/estimation and on both the algorithmic and information theoretic sides---remain open.
Our results apply to the setting where B has a low rank structure. For this setting, we propose an efficient (and practically viable) algorithm that accurately recovers the underlying M by M matrix using O(M) samples} (where we assume the rank is a constant). This linear sample complexity is optimal, up to constant factors, in an extremely strong sense: even testing basic properties of the underlying matrix (such as whether it has rank 1 or 2) requires Omega(M) samples. Additionally, we provide an even stronger lower bound showing that distinguishing whether a sequence of observations were drawn from the uniform distribution over M observations versus being generated by a well-conditioned Hidden Markov Model with two hidden states requires Omega(M) observations, while our positive results for recovering B immediately imply that Omega(M) observations suffice to learn such an HMM. This lower bound precludes sublinear-sample hypothesis tests for basic properties, such as identity or uniformity, as well as sublinear sample estimators for quantities such as the entropy rate of HMMs.

Qingqing Huang, Sham M. Kakade, Weihao Kong, and Gregory Valiant. Recovering Structured Probability Matrices. In 9th Innovations in Theoretical Computer Science Conference (ITCS 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 94, pp. 46:1-46:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)

Copy BibTex To Clipboard

@InProceedings{huang_et_al:LIPIcs.ITCS.2018.46, author = {Huang, Qingqing and Kakade, Sham M. and Kong, Weihao and Valiant, Gregory}, title = {{Recovering Structured Probability Matrices}}, booktitle = {9th Innovations in Theoretical Computer Science Conference (ITCS 2018)}, pages = {46:1--46:14}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-060-6}, ISSN = {1868-8969}, year = {2018}, volume = {94}, editor = {Karlin, Anna R.}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2018.46}, URN = {urn:nbn:de:0030-drops-83625}, doi = {10.4230/LIPIcs.ITCS.2018.46}, annote = {Keywords: Random matrices, matrix recovery, stochastic block model, Hidden Markov Models} }

Document

**Published in:** LIPIcs, Volume 94, 9th Innovations in Theoretical Computer Science Conference (ITCS 2018)

We consider the problem of learning a discrete distribution in the presence of an epsilon fraction of malicious data sources. Specifically, we consider the setting where there is some underlying distribution, p, and each data source provides a batch of >= k samples, with the guarantee that at least a (1 - epsilon) fraction of the sources draw their samples from a distribution with total variation distance at most \eta from p. We make no assumptions on the data provided by the remaining epsilon fraction of sources--this data can even be chosen as an adversarial function of the (1 - epsilon) fraction of "good" batches. We provide two algorithms: one with runtime exponential in the support size, n, but polynomial in k, 1/epsilon and 1/eta that takes O((n + k)/epsilon^2) batches and recovers p to error O(eta + epsilon/sqrt(k)). This recovery accuracy is information theoretically optimal, to constant factors, even given an infinite number of data sources. Our second algorithm applies to the eta = 0 setting and also achieves an O(epsilon/sqrt(k)) recover guarantee, though it runs in poly((nk)^k) time. This second algorithm, which approximates a certain tensor via a rank-1 tensor minimizing l_1 distance, is surprising in light of the hardness of many low-rank tensor approximation problems, and may be of independent interest.

Mingda Qiao and Gregory Valiant. Learning Discrete Distributions from Untrusted Batches. In 9th Innovations in Theoretical Computer Science Conference (ITCS 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 94, pp. 47:1-47:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)

Copy BibTex To Clipboard

@InProceedings{qiao_et_al:LIPIcs.ITCS.2018.47, author = {Qiao, Mingda and Valiant, Gregory}, title = {{Learning Discrete Distributions from Untrusted Batches}}, booktitle = {9th Innovations in Theoretical Computer Science Conference (ITCS 2018)}, pages = {47:1--47:20}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-060-6}, ISSN = {1868-8969}, year = {2018}, volume = {94}, editor = {Karlin, Anna R.}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2018.47}, URN = {urn:nbn:de:0030-drops-83215}, doi = {10.4230/LIPIcs.ITCS.2018.47}, annote = {Keywords: robust statistics, information-theoretic optimality} }

Document

**Published in:** LIPIcs, Volume 57, 24th Annual European Symposium on Algorithms (ESA 2016)

We address the trade-off between the computational resources needed to process a large data set and the number of samples available from the data set. Specifically, we consider the following abstraction: we receive a potentially infinite stream of IID samples from some unknown distribution D, and are tasked with computing some function f(D). If the stream is observed for time t, how much memory, s, is required to estimate f(D)? We refer to t as the sample complexity and s as the space complexity. The main focus of this paper is investigating the trade-offs between the space and sample complexity. We study these trade-offs for several canonical problems studied in the data stream model: estimating the collision probability, i.e., the second moment of a distribution, deciding if a graph is connected, and approximating the dimension of an unknown subspace. Our results are based on techniques for simulating different classical sampling procedures in this model, emulating random walks given a sequence of IID samples, as well as leveraging a characterization between communication bounded protocols and statistical query algorithms.

Michael Crouch, Andrew McGregor, Gregory Valiant, and David P. Woodruff. Stochastic Streams: Sample Complexity vs. Space Complexity. In 24th Annual European Symposium on Algorithms (ESA 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 57, pp. 32:1-32:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016)

Copy BibTex To Clipboard

@InProceedings{crouch_et_al:LIPIcs.ESA.2016.32, author = {Crouch, Michael and McGregor, Andrew and Valiant, Gregory and Woodruff, David P.}, title = {{Stochastic Streams: Sample Complexity vs. Space Complexity}}, booktitle = {24th Annual European Symposium on Algorithms (ESA 2016)}, pages = {32:1--32:15}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-015-6}, ISSN = {1868-8969}, year = {2016}, volume = {57}, editor = {Sankowski, Piotr and Zaroliagis, Christos}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2016.32}, URN = {urn:nbn:de:0030-drops-63838}, doi = {10.4230/LIPIcs.ESA.2016.32}, annote = {Keywords: data streams, sample complexity, frequency moments, graph connectivity, subspace approximation} }

X

Feedback for Dagstuhl Publishing

Feedback submitted

Please try again later or send an E-mail