when quoting this document, please refer to the following
DOI:
URN: urn:nbn:de:0030-drops-63628
URL:

; ; ;

### On the Hardness of Learning Sparse Parities

 pdf-format:

### Abstract

This work investigates the hardness of computing sparse solutions to systems of linear equations over F_2. Consider the k-EventSet problem: given a homogeneous system of linear equations over \$\F_2\$ on \$n\$ variables, decide if there exists a nonzero solution of Hamming weight at most k (i.e. a k-sparse solution). While there is a simple O(n^{k/2})-time algorithm for it, establishing fixed parameter intractability for k-EventSet has been a notorious open problem. Towards this goal, we show that unless \kclq can be solved in n^{o(k)} time, k-EventSet has no polynomial time algorithm when k = omega(log^2(n)).

Our work also shows that the non-homogeneous generalization of the problem - which we call k-VectorSum - is W[1]-hard on instances where the number of equations is O(k*log(n)), improving on previous reductions which produced Omega(n) equations. We use the hardness of k-VectorSum as a starting point to prove the result for k-EventSet, and additionally strengthen the former to show the hardness of approximately learning k-juntas. In particular, we prove that given a system of O(exp(O(k))*log(n)) linear equations, it is W[1]-hard to decide if there is a k-sparse linear form satisfying all the equations or any function on at most k-variables (a k-junta) satisfies at most (1/2 + epsilon)-fraction of the equations, for any constant epsilon > 0. In the setting of computational learning, this shows hardness of approximate non-proper learning of k-parities.

In a similar vein, we use the hardness of k-EventSet to show that that for any constant d, unless k-Clique can be solved in n^{o(k)} time, there is no poly(m,n)*2^{o(sqrt{k})} time algorithm to decide whether a given set of \$m\$ points in F_2^n satisfies: (i) there exists a non-trivial k-sparse homogeneous linear form evaluating to 0 on all the points, or (ii) any non-trivial degree d polynomial P supported on at most k variables evaluates to zero on approx Pr_{F_2^n}[P({z}) = 0] fraction of the points i.e., P is fooled by the set of points.

Lastly, we study the approximation in the sparsity of the solution. Let the Gap-k-VectorSum problem be: given an instance of k-VectorSum of size n, decide if there exist a k-sparse solution, or every solution is of sparsity at least k' = (1+delta_0)k. Assuming the Exponential Time Hypothesis, we show that for some constants c_0, delta_0 > 0 there is no poly(n) time algorithm for Gap-k-VectorSum when k = omega((log(log( n)))^{c_0}).

### BibTeX - Entry

```@InProceedings{bhattacharyya_et_al:LIPIcs:2016:6362,
author =	{Arnab Bhattacharyya and Ameet Gadekar and Suprovat Ghoshal and Rishi Saket},
title =	{{On the Hardness of Learning Sparse Parities}},
booktitle =	{24th Annual European Symposium on Algorithms (ESA 2016)},
pages =	{11:1--11:17},
series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
ISBN =	{978-3-95977-015-6},
ISSN =	{1868-8969},
year =	{2016},
volume =	{57},
editor =	{Piotr Sankowski and Christos Zaroliagis},
publisher =	{Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik},