Jayaram, Rajesh ;
Woodruff, David P.
Towards Optimal Moment Estimation in Streaming and Distributed Models
Abstract
One of the oldest problems in the data stream model is to approximate the pth moment X_p^p = sum_{i=1}^n X_i^p of an underlying nonnegative vector X in R^n, which is presented as a sequence of poly(n) updates to its coordinates. Of particular interest is when p in (0,2]. Although a tight space bound of Theta(epsilon^2 log n) bits is known for this problem when both positive and negative updates are allowed, surprisingly there is still a gap in the space complexity of this problem when all updates are positive. Specifically, the upper bound is O(epsilon^2 log n) bits, while the lower bound is only Omega(epsilon^2 + log n) bits. Recently, an upper bound of O~(epsilon^2 + log n) bits was obtained under the assumption that the updates arrive in a random order.
We show that for p in (0, 1], the random order assumption is not needed. Namely, we give an upper bound for worstcase streams of O~(epsilon^2 + log n) bits for estimating X _p^p. Our techniques also give new upper bounds for estimating the empirical entropy in a stream. On the other hand, we show that for p in (1,2], in the natural coordinator and blackboard distributed communication topologies, there is an O~(epsilon^2) bit maxcommunication upper bound based on a randomized rounding scheme. Our protocols also give rise to protocols for heavy hitters and approximate matrix product. We generalize our results to arbitrary communication topologies G, obtaining an O~(epsilon^2 log d) maxcommunication upper bound, where d is the diameter of G. Interestingly, our upper bound rules out natural communication complexitybased approaches for proving an Omega(epsilon^2 log n) bit lower bound for p in (1,2] for streaming algorithms. In particular, any such lower bound must come from a topology with large diameter.
BibTeX  Entry
@InProceedings{jayaram_et_al:LIPIcs:2019:11244,
author = {Rajesh Jayaram and David P. Woodruff},
title = {{Towards Optimal Moment Estimation in Streaming and Distributed Models}},
booktitle = {Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2019)},
pages = {29:129:21},
series = {Leibniz International Proceedings in Informatics (LIPIcs)},
ISBN = {9783959771252},
ISSN = {18688969},
year = {2019},
volume = {145},
editor = {Dimitris Achlioptas and L{\'a}szl{\'o} A. V{\'e}gh},
publisher = {Schloss DagstuhlLeibnizZentrum fuer Informatik},
address = {Dagstuhl, Germany},
URL = {http://drops.dagstuhl.de/opus/volltexte/2019/11244},
URN = {urn:nbn:de:0030drops112443},
doi = {10.4230/LIPIcs.APPROXRANDOM.2019.29},
annote = {Keywords: Streaming, Sketching, Message Passing, Moment Estimation}
}
17.09.2019
Keywords: 

Streaming, Sketching, Message Passing, Moment Estimation 
Seminar: 

Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2019)

Issue date: 

2019 
Date of publication: 

17.09.2019 