We prove a lower estimate on the increase in entropy when two copies of a conditional random variable X | Y, with X supported on Z_q = {0, 1, ..., q-1} for prime q, are summed modulo q. Specifically, given two i.i.d. copies (X_1, Y_1) and (X_2, Y_2) of a pair of random variables (X, Y), with X taking values in Z_q, we show

H(X_1 + X_2 | Y_1, Y_2) - H(X | Y) >= alpha(q) * H(X | Y) * (1 - H(X | Y))

for some alpha(q) > 0, where H(·) denotes entropy normalized by the factor log_2(q). In particular, if X | Y is not close to being fully random or fully deterministic, i.e., H(X | Y) lies in (gamma, 1 - gamma), then the entropy of the sum increases by Omega_q(gamma). Our motivation is an effective analysis of the finite-length behavior of polar codes, for which the linear dependence on gamma is quantitatively important. The assumption that q is prime is necessary: for X supported uniformly on a proper subgroup of Z_q, we have H(X_1 + X_2) = H(X). For X supported on infinite groups without nontrivial finite subgroups (the torsion-free case), and without conditioning, a sumset inequality for the absolute increase in (unnormalized) entropy was shown by Tao [Tao, CP&C 2010].

We use our sumset inequality to analyze Arıkan's construction of polar codes and prove that for any q-ary source X, where q is any fixed prime, and any epsilon > 0, polar codes allow efficient data compression of N i.i.d. copies of X into (H(X) + epsilon)N q-ary symbols, as soon as N is polynomially large in 1/epsilon. We obtain capacity-achieving source codes with similar guarantees for composite alphabet sizes by factoring the alphabet size into primes and combining polar codes for each prime factor. A consequence of our result for noisy channel coding is that for all discrete memoryless channels, there are explicit codes enabling reliable communication within epsilon > 0 of the symmetric Shannon capacity, with block length and decoding complexity bounded by a polynomial in 1/epsilon. This result was previously known only for the special case of binary-input channels [Guruswami/Xia, FOCS'13; Hassani/Alishahi/Urbanke, CoRR 2013]; our work extends it to channels over any alphabet.
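As a quick illustration of the inequality and of why primality of q matters, here is a minimal numerical sketch. It is not from the paper: it treats the unconditional case (no side information Y), and the function names entropy and mod_convolve are our own. For a biased distribution over Z_5 the normalized entropy of X_1 + X_2 (mod 5) strictly exceeds that of X, while for X uniform on the proper subgroup {0, 2, 4} of Z_6 the sum gains no entropy.

import math

def entropy(p, q):
    # Normalized entropy: H(p) / log2(q), so values lie in [0, 1].
    h = -sum(x * math.log2(x) for x in p if x > 0)
    return h / math.log2(q)

def mod_convolve(p, q):
    # Distribution of X1 + X2 (mod q) for i.i.d. X1, X2 ~ p.
    out = [0.0] * q
    for a in range(q):
        for b in range(q):
            out[(a + b) % q] += p[a] * p[b]
    return out

# Prime q = 5: a biased source, neither deterministic nor uniform.
q = 5
p = [0.6, 0.1, 0.1, 0.1, 0.1]
print(entropy(p, q), entropy(mod_convolve(p, q), q))
# ~0.763 vs ~0.935: a strictly positive gap, consistent with the
# lower bound alpha(q) * H(X) * (1 - H(X)).

# Composite q = 6: X uniform on the proper subgroup {0, 2, 4},
# which is closed under addition mod 6, so the sum gains nothing.
q = 6
p = [1/3, 0.0, 1/3, 0.0, 1/3, 0.0]
print(entropy(p, q), entropy(mod_convolve(p, q), q))
# Both equal log2(3)/log2(6) ~ 0.613: H(X1 + X2) = H(X).

The strict gap in the prime case is what the finite-length analysis of Arıkan's transform exploits: entropies bounded away from 0 and 1 must move by Omega_q(gamma) at each polarization step.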
@InProceedings{guruswami_et_al:LIPIcs.CCC.2015.42,
  author =    {Guruswami, Venkatesan and Velingker, Ameya},
  title =     {{An Entropy Sumset Inequality and Polynomially Fast Convergence to Shannon Capacity Over All Alphabets}},
  booktitle = {30th Conference on Computational Complexity (CCC 2015)},
  pages =     {42--57},
  series =    {Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =      {978-3-939897-81-1},
  ISSN =      {1868-8969},
  year =      {2015},
  volume =    {33},
  editor =    {Zuckerman, David},
  publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =   {Dagstuhl, Germany},
  URL =       {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CCC.2015.42},
  URN =       {urn:nbn:de:0030-drops-50755},
  doi =       {10.4230/LIPIcs.CCC.2015.42},
  annote =    {Keywords: Polar codes, polynomial gap to capacity, entropy sumset inequality, arbitrary alphabets}
}