New Characterizations in Turnstile Streams with Applications

Ai, Yuqing; Hu, Wei; Li, Yi; Woodruff, David P.

doi:10.4230/LIPIcs.CCC.2016.20

Document Identifiers

DOI: 10.4230/LIPIcs.CCC.2016.20
URN: urn:nbn:de:0030-drops-58337

Cite As

Yuqing Ai, Wei Hu, Yi Li, and David P. Woodruff. New Characterizations in Turnstile Streams with Applications. In 31st Conference on Computational Complexity (CCC 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 50, pp. 20:1-20:22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016)
https://doi.org/10.4230/LIPIcs.CCC.2016.20

Abstract

Recently, [Li, Nguyen, Woodruff, STOC 2014] showed that any 1-pass constant-probability streaming algorithm for computing a relation f on a vector x in {-m, -(m-1), ..., m}^n presented in the turnstile data stream model can be implemented by maintaining a linear sketch Ax mod q, where A is an r x n integer matrix and q = (q_1, ..., q_r) is a vector of positive integers. The space complexity of maintaining Ax mod q, not including the random bits used for sampling A and q, matches the space of the optimal algorithm. We give multiple strengthenings of this reduction, together with new applications. In particular, we show how to remove the following shortcomings of their reduction:

1. The Box Constraint. Their reduction applies only to algorithms that must be correct even if ||x||_infinity = max_{i in [n]} |x_i| is allowed to be much larger than m at intermediate points in the stream, provided that x is in {-m, -(m-1), ..., m}^n at the end of the stream. We give a condition under which the optimal algorithm is a linear sketch even if it works only when promised that x is in {-m, -(m-1), ..., m}^n at all points in the stream. Using this, we show the first super-constant Omega(log m) bits lower bound for the problem of maintaining a counter up to an additive epsilon*m error in a turnstile stream, where epsilon is any constant in (0, 1/2). Previous lower bounds are based on communication complexity and are only for relative error approximation; interestingly, we do not know how to prove our result using communication complexity. More generally, we show the first super-constant Omega(log m) lower bound for additive approximation of l_p-norms; this bound is tight for p in [1, 2].

2. Negative Coordinates. Their reduction allows x_i to be negative while processing the stream. We show an equivalence between 1-pass algorithms and linear sketches Ax mod q in dynamic graph streams, or more generally, the strict turnstile model, in which for all i in [n], x_i is nonnegative at all points in the stream. Combined with [Assadi, Khanna, Li, Yaroslavtsev, SODA 2016], this resolves the 1-pass space complexity of approximating the maximum matching in a dynamic graph stream, answering a question in that work.

3. 1-Pass Restriction. Their reduction applies only to 1-pass data stream algorithms in the turnstile model, while there exist algorithms for heavy hitters and for low rank approximation which provably do better with multiple passes. We extend the reduction to algorithms which make any number of passes, showing that the optimal algorithm is to choose a new linear sketch at the beginning of each pass, based on the output of previous passes.
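
To make the object in the abstract concrete, the following is a minimal sketch of how a state of the form Ax mod q can be maintained under turnstile updates. It is an illustration only, not the construction from the paper: the class name `ModularLinearSketch`, the helper `direct_sketch`, and the small fixed matrix and moduli are all hypothetical choices, and the random sampling of A and q that the abstract mentions is not modelled here.

```python
class ModularLinearSketch:
    """Keeps s = A x mod q (coordinate-wise) while x changes by turnstile updates.

    Hypothetical illustration: A is a fixed r x n integer matrix given as a
    list of rows, and q is a list of r positive integer moduli.
    """

    def __init__(self, A, q):
        if len(A) != len(q):
            raise ValueError("A needs one row per modulus in q")
        if any(qj <= 0 for qj in q):
            raise ValueError("all moduli must be positive")
        self.A = A
        self.q = q
        self.state = [0] * len(q)  # sketch of the all-zero vector

    def update(self, i, delta):
        """Process the stream update x_i <- x_i + delta (delta may be negative)."""
        for j, (row, qj) in enumerate(zip(self.A, self.q)):
            # By linearity, only column i of A contributes to the change.
            self.state[j] = (self.state[j] + row[i] * delta) % qj


def direct_sketch(A, q, x):
    """Compute A x mod q from the full vector x, for comparison."""
    return [sum(a * xi for a, xi in zip(row, x)) % qj for row, qj in zip(A, q)]


if __name__ == "__main__":
    A = [[1, 2, 3], [4, 0, -1]]   # assumed example matrix (r = 2, n = 3)
    q = [7, 5]                    # assumed example moduli
    stream = [(0, 3), (2, -1), (1, 2), (0, -3), (2, 4)]

    sketch = ModularLinearSketch(A, q)
    x = [0, 0, 0]
    for i, delta in stream:
        sketch.update(i, delta)
        x[i] += delta             # full vector kept only to check the sketch

    print(sketch.state, direct_sketch(A, q, x))
```

The sketch stores only r residues, one per row of A, and never the vector x itself; the explicit copy of x in the demo exists only to check that the incrementally maintained state agrees with Ax mod q computed from scratch.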

Keywords

communication complexity
data streams
dynamic graph streams
norm estimation

References

Noga Alon, Yossi Matias, and Mario Szegedy. The Space Complexity of Approximating the Frequency Moments. JCSS, 58(1):137-147, 1999.
Sepehr Assadi, Sanjeev Khanna, Yang Li, and Grigory Yaroslavtsev. Maximum matchings in dynamic graph streams and the simultaneous communication model. In SODA, pages 1345-1364, 2016.
Christos Boutsidis, David P. Woodruff, and Peilin Zhong. Optimal principal component analysis in distributed and streaming models. In STOC, 2016.
Sumit Ganguly. Lower bounds on frequency estimation of data streams. In Proceedings of the 3rd International Conference on Computer Science: theory and applications, CSR'08, pages 204-215, 2008.
Piotr Indyk. Stable distributions, pseudorandom generators, embeddings, and data stream computation. J. ACM, 53(3):307-323, 2006.
Piotr Indyk. Sketching, streaming and sublinear-space algorithms, 2007. Graduate course notes available at URL: http://stellar.mit.edu/S/course/6/fa07/6.895/.
Piotr Indyk, Eric Price, and David P. Woodruff. On the power of adaptivity in sparse recovery. In FOCS, pages 285-294, 2011.
Daniel M. Kane, Jelani Nelson, and David P. Woodruff. On the exact space complexity of sketching and streaming small norms. In SODA, pages 1161-1178, 2010.
Yi Li, Huy L. Nguyen, and David P. Woodruff. Turnstile streaming algorithms might as well be linear sketches. In STOC, pages 174-183, 2014.
S. Muthukrishnan. Data Streams: Algorithms and Applications. Foundations and Trends in Theoretical Computer Science, 1(2):117-236, 2005.
Ilan Newman. Private vs. common random bits in communication complexity. Information Processing Letters, pages 67-71, 1991.
Joachim von zur Gathen and Malte Sieveking. A bound on solutions of linear integer equalities and inequalities. Proceedings of the American Mathematical Society, pages 155-158, 1978.
Omri Weinstein and David P. Woodruff. The simultaneous communication of disjointness with applications to data streams. In ICALP, pages 1082-1093, 2015.
David P. Woodruff. Low rank approximation lower bounds in row-update streams. In NIPS, pages 1781-1789, 2014.
Andrew Chi-Chih Yao. Some complexity questions related to distributive computing. In STOC, pages 209-213, 1979.