**Published in:** LIPIcs, Volume 185, 12th Innovations in Theoretical Computer Science Conference (ITCS 2021)

We prove several hardness results for training depth-2 neural networks with the ReLU activation function; these networks are simply weighted sums (that may include negative coefficients) of ReLUs. Our goal is to output a depth-2 neural network that minimizes the square loss with respect to a given training set. We prove that this problem is NP-hard already for a network with a single ReLU. We also prove NP-hardness for outputting a weighted sum of k ReLUs minimizing the squared error (for k > 1) even in the realizable setting (i.e., when the labels are consistent with an unknown depth-2 ReLU network). We are also able to obtain lower bounds on the running time in terms of the desired additive error ε. To obtain our lower bounds, we use the Gap Exponential Time Hypothesis (Gap-ETH) as well as a new hypothesis regarding the hardness of approximating the well-known Densest k-Subgraph problem in subexponential time (these hypotheses are used separately in proving different lower bounds). For example, we prove that under reasonable hardness assumptions, any proper learning algorithm for finding the best-fitting ReLU must run in time exponential in 1/ε². Together with previous work on improperly learning a ReLU [Surbhi Goel et al., 2017], this implies the first separation between proper and improper algorithms for learning a ReLU. We also study the problem of properly learning a depth-2 network of ReLUs with bounded weights, giving new (worst-case) upper bounds on the running time needed to learn such networks in both the realizable and agnostic settings. Our upper bounds on the running time essentially match our lower bounds in terms of the dependency on ε.
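
To make the training objective concrete, here is a minimal sketch (ours, not from the paper) of the square loss being minimized for a depth-2 network, i.e., a weighted sum of k ReLUs; the function and variable names are illustrative only:

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def depth2_relu_loss(X, y, W, a):
    """Average square loss of f(x) = sum_i a_i * ReLU(<w_i, x>).

    X: (m, n) training inputs, y: (m,) labels,
    W: (k, n) hidden-unit weight vectors,
    a: (k,) output weights (possibly negative, as in the paper's weighted sums of ReLUs).
    """
    preds = relu(X @ W.T) @ a            # (m,) network predictions
    return np.mean((preds - y) ** 2)     # square loss over the training set

# Toy instance of the k = 1 case, which the paper shows is already NP-hard
# to minimize in general (random data here is purely for illustration).
rng = np.random.default_rng(0)
X, y = rng.standard_normal((100, 5)), rng.standard_normal(100)
W, a = rng.standard_normal((1, 5)), np.ones(1)
print(depth2_relu_loss(X, y, W, a))
```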

Surbhi Goel, Adam Klivans, Pasin Manurangsi, and Daniel Reichman. Tight Hardness Results for Training Depth-2 ReLU Networks. In 12th Innovations in Theoretical Computer Science Conference (ITCS 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 185, pp. 22:1-22:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)

@InProceedings{goel_et_al:LIPIcs.ITCS.2021.22,
  author =    {Goel, Surbhi and Klivans, Adam and Manurangsi, Pasin and Reichman, Daniel},
  title =     {{Tight Hardness Results for Training Depth-2 ReLU Networks}},
  booktitle = {12th Innovations in Theoretical Computer Science Conference (ITCS 2021)},
  pages =     {22:1--22:14},
  series =    {Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =      {978-3-95977-177-1},
  ISSN =      {1868-8969},
  year =      {2021},
  volume =    {185},
  editor =    {Lee, James R.},
  publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =   {Dagstuhl, Germany},
  URL =       {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2021.22},
  URN =       {urn:nbn:de:0030-drops-135611},
  doi =       {10.4230/LIPIcs.ITCS.2021.22},
  annote =    {Keywords: ReLU, Learning Algorithm, Running Time Lower Bound}
}

**Published in:** LIPIcs, Volume 28, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2014)

We give the first representation-independent hardness result for agnostically learning halfspaces with respect to the Gaussian distribution. We reduce from the problem of learning sparse parities with noise with respect to the uniform distribution on the hypercube (sparse LPN), a notoriously hard problem in theoretical computer science, and show that any algorithm for agnostically learning halfspaces requires n^Ω(log(1/ε)) time under the assumption that k-sparse LPN requires n^Ω(k) time, ruling out a polynomial-time algorithm for the problem. As far as we are aware, this is the first representation-independent hardness result for supervised learning when the underlying distribution is restricted to be a Gaussian.
We also show that the problem of agnostically learning sparse polynomials with respect to the Gaussian distribution in polynomial time is as hard as PAC learning DNFs on the uniform distribution in polynomial time. This complements the surprising result of Andoni et al. [2013], who show that sparse polynomials are learnable under random Gaussian noise in polynomial time.
Taken together, these results show the inherent difficulty of designing supervised learning algorithms in Euclidean space even in the presence of strong distributional assumptions. Our results use a novel embedding of random labeled examples from the uniform distribution on the Boolean hypercube into random labeled examples from the Gaussian distribution that allows us to relate the hardness of learning problems on two different domains and distributions.
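
For reference, here is a brief formal sketch (standard definitions, not quoted from the paper) of the agnostic-learning guarantee the lower bound applies to; the symbols D, H, opt, f, and ε below are our notation:

```latex
% D is a distribution on R^n x {-1,1} whose marginal on R^n is the Gaussian,
% and H is the class of halfspaces x -> sign(<w, x> - t).
\[
  \mathrm{opt} \;=\; \min_{h \in \mathcal{H}} \Pr_{(x,y)\sim D}\bigl[h(x)\neq y\bigr],
  \qquad
  \Pr_{(x,y)\sim D}\bigl[f(x)\neq y\bigr] \;\le\; \mathrm{opt} + \varepsilon .
\]
% An agnostic learner may output f in any representation (hence
% "representation-independent"); the abstract's result says achieving this
% guarantee takes n^{Omega(log(1/eps))} time if k-sparse LPN needs n^{Omega(k)} time.
```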

Adam Klivans and Pravesh Kothari. Embedding Hard Learning Problems Into Gaussian Space. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2014). Leibniz International Proceedings in Informatics (LIPIcs), Volume 28, pp. 793-809, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2014)

@InProceedings{klivans_et_al:LIPIcs.APPROX-RANDOM.2014.793,
  author =    {Klivans, Adam and Kothari, Pravesh},
  title =     {{Embedding Hard Learning Problems Into Gaussian Space}},
  booktitle = {Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2014)},
  pages =     {793--809},
  series =    {Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =      {978-3-939897-74-3},
  ISSN =      {1868-8969},
  year =      {2014},
  volume =    {28},
  editor =    {Jansen, Klaus and Rolim, Jos\'{e} and Devanur, Nikhil R. and Moore, Cristopher},
  publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =   {Dagstuhl, Germany},
  URL =       {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX-RANDOM.2014.793},
  URN =       {urn:nbn:de:0030-drops-47391},
  doi =       {10.4230/LIPIcs.APPROX-RANDOM.2014.793},
  annote =    {Keywords: distribution-specific hardness of learning, gaussian space, halfspace-learning, agnostic learning}
}

**Published in:** LIPIcs, Volume 116, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2018)

Suppose Est is a randomized estimation algorithm that uses n random bits and outputs values in R^d. We show how to execute Est on k adaptively chosen inputs using only n + O(k log(d + 1)) random bits instead of the trivial nk (at the cost of mild increases in the error and failure probability). Our algorithm combines a variant of the INW pseudorandom generator [Impagliazzo et al., 1994] with a new scheme for shifting and rounding the outputs of Est. We prove that modifying the outputs of Est is necessary in this setting, and furthermore, our algorithm's randomness complexity is near-optimal in the case d <= O(1). As an application, we give a randomness-efficient version of the Goldreich-Levin algorithm; our algorithm finds all Fourier coefficients with absolute value at least theta of a function F: {0, 1}^n -> {-1, 1} using O(n log n) * poly(1/theta) queries to F and O(n) random bits (independent of theta), improving previous work [Bshouty et al., 2004].
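
As context for the Goldreich-Levin application, the quantity estimated for each candidate set S is the Fourier coefficient F_hat(S) = E_x[F(x) * chi_S(x)]. The sketch below (our own illustration, not the paper's algorithm) shows the naive Monte Carlo estimator, which spends n fresh random bits per sample; the paper's contribution is running many such adaptively chosen estimations while recycling most of that randomness:

```python
import random

def chi(S, x):
    """Character chi_S(x) = (-1)^{sum_{i in S} x_i} over {0,1}^n."""
    return -1 if sum(x[i] for i in S) % 2 else 1

def estimate_fourier_coefficient(F, n, S, num_samples=10_000, rng=random):
    """Naive estimate of F_hat(S) = E_x[F(x) * chi_S(x)] by uniform sampling.

    Each sample draws n fresh random bits; a randomness-efficient version would
    reuse bits across the (adaptively chosen) sets S examined by Goldreich-Levin.
    """
    total = 0
    for _ in range(num_samples):
        x = [rng.getrandbits(1) for _ in range(n)]   # uniform point of {0,1}^n
        total += F(x) * chi(S, x)
    return total / num_samples

# Example: F is the parity (-1)^(x0 XOR x1), so F_hat({0, 1}) = 1.
F = lambda x: -1 if x[0] ^ x[1] else 1
print(estimate_fourier_coefficient(F, n=8, S={0, 1}))
```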

William M. Hoza and Adam R. Klivans. Preserving Randomness for Adaptive Algorithms. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 116, pp. 43:1-43:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)

@InProceedings{hoza_et_al:LIPIcs.APPROX-RANDOM.2018.43,
  author =    {Hoza, William M. and Klivans, Adam R.},
  title =     {{Preserving Randomness for Adaptive Algorithms}},
  booktitle = {Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2018)},
  pages =     {43:1--43:19},
  series =    {Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =      {978-3-95977-085-9},
  ISSN =      {1868-8969},
  year =      {2018},
  volume =    {116},
  editor =    {Blais, Eric and Jansen, Klaus and D. P. Rolim, Jos\'{e} and Steurer, David},
  publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =   {Dagstuhl, Germany},
  URL =       {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX-RANDOM.2018.43},
  URN =       {urn:nbn:de:0030-drops-94477},
  doi =       {10.4230/LIPIcs.APPROX-RANDOM.2018.43},
  annote =    {Keywords: pseudorandomness, adaptivity, estimation}
}
