eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2016-05-19
20:1
20:22
10.4230/LIPIcs.CCC.2016.20
article
New Characterizations in Turnstile Streams with Applications
Ai, Yuqing
Hu, Wei
Li, Yi
Woodruff, David P.
Recently, [Li, Nguyen, Woodruff, STOC 2014] showed any 1-pass constant probability streaming algorithm for computing a relation f on a vector x in {-m, -(m-1), ..., m}^n presented in the turnstile data stream model can be implemented by maintaining a linear sketch Ax mod q, where A is an r times n integer matrix and q = (q_1, ..., q_r) is a vector of positive integers. The space complexity of maintaining Ax mod q, not including the random bits used for sampling A and q, matches the space of the optimal algorithm.
We give multiple strengthenings of this reduction, together with new applications. In particular, we show how to remove the following shortcomings of their reduction:
1. The Box Constraint. Their reduction applies only to algorithms that must be correct even if x_{infinity} = max_{i in [n]} |x_i| is allowed to be much larger than m at intermediate points in the stream, provided that x is in {-m, -(m-1), ..., m}^n at the end of the stream. We give a condition under which the optimal algorithm is a linear sketch even if it works only when promised that x is in {-m, -(m-1), ..., m}^n at all points in the stream. Using this, we show the first super-constant Omega(log m) bits lower bound for the problem of maintaining a counter up to an additive epsilon*m error in a turnstile stream, where epsilon is any constant in (0, 1/2). Previous lower bounds are based on communication complexity and are only for relative error approximation; interestingly, we do not know how to prove our result using communication complexity. More generally, we show the first super-constant Omega(log(m)) lower bound for additive approximation of l_p-norms; this bound is tight for p in [1, 2].
2. Negative Coordinates. Their reduction allows x_i to be negative while processing the stream. We show an equivalence between 1-pass algorithms and linear sketches Ax mod q in dynamic graph streams, or more generally, the strict turnstile model, in which for all i in [n], x_i is nonnegative at all points in the stream. Combined with [Assadi, Khanna, Li, Yaroslavtsev, SODA 2016], this resolves the 1-pass space complexity of approximating the maximum matching in a dynamic graph stream, answering a question in that work.
3. 1-Pass Restriction. Their reduction only applies to 1-pass data stream algorithms in the turnstile model, while there exist algorithms for heavy hitters and for low rank approximation which provably do better with multiple passes. We extend the reduction to algorithms which make any number of passes, showing the optimal algorithm is to choose a new linear sketch at the beginning of each pass, based on the output of previous passes.
https://drops.dagstuhl.de/storage/00lipics/lipics-vol050-ccc2016/LIPIcs.CCC.2016.20/LIPIcs.CCC.2016.20.pdf
communication complexity
data streams
dynamic graph streams
norm estimation