Braverman, Vladimir ;
Chung, Kai-Min ;
Liu, Zhenming ;
Mitzenmacher, Michael ;
Ostrovsky, Rafail
AMS Without 4-Wise Independence on Product Domains
Abstract
In their seminal work, Alon, Matias, and Szegedy introduced several sketching techniques, including showing that $4$-wise independence is sufficient to obtain good approximations of the second frequency moment. In this work, we show that their sketching technique can be extended to product domains $[n]^k$ by using the product of $4$-wise independent functions on $[n]$.
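As a brief illustration of the construction, the estimator can be written out as follows. The symbols $v$, $h_j$, $H$, and $Z$ are labels chosen here for exposition and are not taken from the paper. Let $v$ be a frequency vector indexed by $[n]^k$, and draw $h_1, \dots, h_k : [n] \to \{-1,+1\}$ independently, each from a $4$-wise independent family:

$$H(i_1, \dots, i_k) = \prod_{j=1}^{k} h_j(i_j), \qquad Z = \sum_{i \in [n]^k} v_i \, H(i).$$

Because $E[H(i)H(i')] = \prod_{j=1}^{k} E[h_j(i_j)\,h_j(i'_j)]$, and each factor is $0$ when $i_j \neq i'_j$ and $1$ otherwise, the cross terms vanish:

$$E[Z^2] = \sum_{i, i' \in [n]^k} v_i v_{i'} \, E[H(i)H(i')] = \sum_{i \in [n]^k} v_i^2 .$$

So $Z^2$ is an unbiased estimate of the second frequency moment. The function $H$ is not itself $4$-wise independent on $[n]^k$ (for $k = 2$, the product $H(a,c)H(a,d)H(b,c)H(b,d)$ is always $1$), so the variance of $Z^2$ needs a separate argument, which is the subject of the paper.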
Our work extends that of Indyk and McGregor, who showed the result for $k = 2$. Their primary motivation was the problem of identifying correlations in data streams. In their model, a stream of pairs $(i,j) \in [n]^2$ arrives, giving a joint distribution $(X,Y)$, and they find approximation algorithms for how close the joint distribution is to the product of the marginal distributions under various metrics, which naturally corresponds to how close $X$ and $Y$ are to being independent. By using our technique, we obtain a new result for the problem of approximating the $\ell_2$ distance between the joint distribution and the product of the marginal distributions for $k$-ary vectors, instead of just pairs, in a single pass. Our analysis gives a randomized algorithm that is a $(1\pm \epsilon)$-approximation (with probability $1-\delta$) and that requires space logarithmic in $n$ and $m$ and proportional to $3^k$.
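The single-pass idea in the abstract can be sketched in Python as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `make_sign_hash`, `IndependenceSketch`, and `estimate_l2_squared` are hypothetical, the $4$-wise independent family is approximated by a random degree-3 polynomial over a prime field with the low bit used as the sign, and the group sizes are arbitrary defaults rather than the values the paper's analysis would prescribe. Each sketch keeps one counter for the joint distribution and one per coordinate; the product of the per-coordinate counters is the sketch of the product of the marginals, because the sign function factorizes across coordinates.

```python
import random
from statistics import median

P = (1 << 61) - 1  # Mersenne prime used as the field size (an assumption)


def make_sign_hash(rng):
    """Random degree-3 polynomial over Z_P, reduced to a +/-1 sign.

    The polynomial values are 4-wise independent; taking the low bit
    introduces a bias of order 1/P, which this sketch ignores.
    """
    coeffs = [rng.randrange(P) for _ in range(4)]

    def h(x):
        acc = 0
        for c in coeffs:  # Horner evaluation modulo P
            acc = (acc * x + c) % P
        return 1 if acc & 1 else -1

    return h


class IndependenceSketch:
    """One basic estimator for ||joint - product of marginals||_2^2."""

    def __init__(self, k, rng):
        self.hashes = [make_sign_hash(rng) for _ in range(k)]
        self.joint = 0          # sum over stream items of prod_j h_j(i_j)
        self.marginal = [0] * k  # sum over stream items of h_j(i_j), per j
        self.m = 0               # stream length seen so far

    def update(self, item):
        signs = [h(x) for h, x in zip(self.hashes, item)]
        prod = 1
        for s in signs:
            prod *= s
        self.joint += prod
        for j, s in enumerate(signs):
            self.marginal[j] += s
        self.m += 1

    def estimate(self):
        z_joint = self.joint / self.m
        z_product = 1.0
        for c in self.marginal:
            z_product *= c / self.m
        return (z_joint - z_product) ** 2


def estimate_l2_squared(stream, k, groups=5, per_group=200, seed=0):
    """Median of group means over independent basic estimators."""
    rng = random.Random(seed)
    sketches = [[IndependenceSketch(k, rng) for _ in range(per_group)]
                for _ in range(groups)]
    for item in stream:  # a single pass over the data
        for group in sketches:
            for sk in group:
                sk.update(item)
    means = [sum(sk.estimate() for sk in group) / per_group
             for group in sketches]
    return median(means)
```

For example, on a stream over $\{0,1\}^3$ made of equal numbers of $(0,0,0)$ and $(1,1,1)$, the true squared $\ell_2$ distance is $0.375$, and `estimate_l2_squared(stream, k=3)` should land near that value; on a stream that visits every point of $\{0,1\}^3$ equally often, the distance is $0$.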
BibTeX Entry
@InProceedings{braverman_et_al:LIPIcs:2010:2449,
author = {Vladimir Braverman and Kai-Min Chung and Zhenming Liu and Michael Mitzenmacher and Rafail Ostrovsky},
title = {{AMS Without 4-Wise Independence on Product Domains}},
booktitle = {27th International Symposium on Theoretical Aspects of Computer Science},
pages = {119--130},
series = {Leibniz International Proceedings in Informatics (LIPIcs)},
ISBN = {978-3-939897-16-3},
ISSN = {1868-8969},
year = {2010},
volume = {5},
editor = {Jean-Yves Marion and Thomas Schwentick},
publisher = {Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik},
address = {Dagstuhl, Germany},
URL = {http://drops.dagstuhl.de/opus/volltexte/2010/2449},
URN = {urn:nbn:de:0030-drops-24496},
doi = {http://dx.doi.org/10.4230/LIPIcs.STACS.2010.2449},
annote = {Keywords: Data Streams, Randomized Algorithms, Streaming Algorithms, Independence, Sketches}
}
Keywords: Data Streams, Randomized Algorithms, Streaming Algorithms, Independence, Sketches
Seminar: 27th International Symposium on Theoretical Aspects of Computer Science
Issue date: 2010
Date of publication: 2010