9 Search Results for "Klivans, Adam"


Document
Loss Minimization Through the Lens Of Outcome Indistinguishability

Authors: Parikshit Gopalan, Lunjia Hu, Michael P. Kim, Omer Reingold, and Udi Wieder

Published in: LIPIcs, Volume 251, 14th Innovations in Theoretical Computer Science Conference (ITCS 2023)


Abstract
We present a new perspective on loss minimization and the recent notion of omniprediction through the lens of Outcome Indistinguishability. For a collection of losses and a hypothesis class, omniprediction requires that a predictor provide a loss-minimization guarantee simultaneously for every loss in the collection, compared to the best (loss-specific) hypothesis in the class. We present a generic template to learn predictors satisfying a guarantee we call Loss Outcome Indistinguishability. For a set of statistical tests - based on a collection of losses and a hypothesis class - a predictor is Loss OI if it is indistinguishable (according to the tests) from Nature’s true probabilities over outcomes. By design, Loss OI implies omniprediction in a direct and intuitive manner. We simplify Loss OI further, decomposing it into a calibration condition plus multiaccuracy for a class of functions derived from the loss and hypothesis classes. By careful analysis of this class, we give efficient constructions of omnipredictors for interesting classes of loss functions, including non-convex losses. This decomposition highlights the utility of a new multi-group fairness notion that we call calibrated multiaccuracy, which lies in between multiaccuracy and multicalibration. We show that calibrated multiaccuracy implies Loss OI for the important set of convex losses arising from Generalized Linear Models, without requiring full multicalibration. For such losses, we show an equivalence between our computational notion of Loss OI and a geometric notion of indistinguishability, formulated as Pythagorean theorems in the associated Bregman divergence. We give an efficient algorithm for calibrated multiaccuracy with computational complexity comparable to that of multiaccuracy. In all, calibrated multiaccuracy offers an interesting tradeoff point between efficiency and generality in the omniprediction landscape.
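
As a schematic restatement of the omniprediction guarantee (our notation; the symbols are not fixed by the abstract): a predictor p̃ is an (𝓛, 𝓒, ε)-omnipredictor if, for every loss ℓ ∈ 𝓛, post-processing its predictions with the ℓ-optimal link function k_ℓ competes with the best hypothesis in 𝓒:

\forall \ell \in \mathcal{L}: \quad \mathbb{E}_{(x,y)}\bigl[\ell\bigl(y,\, k_\ell(\tilde{p}(x))\bigr)\bigr] \;\le\; \min_{c \in \mathcal{C}} \mathbb{E}_{(x,y)}\bigl[\ell(y, c(x))\bigr] + \varepsilon, \qquad k_\ell(p) = \arg\min_{t} \mathbb{E}_{\tilde{y} \sim \mathrm{Ber}(p)}\bigl[\ell(\tilde{y}, t)\bigr].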

Cite as

Parikshit Gopalan, Lunjia Hu, Michael P. Kim, Omer Reingold, and Udi Wieder. Loss Minimization Through the Lens Of Outcome Indistinguishability. In 14th Innovations in Theoretical Computer Science Conference (ITCS 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 251, pp. 60:1-60:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)


BibTeX

@InProceedings{gopalan_et_al:LIPIcs.ITCS.2023.60,
  author =	{Gopalan, Parikshit and Hu, Lunjia and Kim, Michael P. and Reingold, Omer and Wieder, Udi},
  title =	{{Loss Minimization Through the Lens Of Outcome Indistinguishability}},
  booktitle =	{14th Innovations in Theoretical Computer Science Conference (ITCS 2023)},
  pages =	{60:1--60:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-263-1},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{251},
  editor =	{Tauman Kalai, Yael},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2023.60},
  URN =		{urn:nbn:de:0030-drops-175635},
  doi =		{10.4230/LIPIcs.ITCS.2023.60},
  annote =	{Keywords: Loss Minimization, Indistinguishability}
}
Document
Polynomial Threshold Functions for Decision Lists

Authors: Vladimir Podolskii and Nikolay V. Proskurin

Published in: LIPIcs, Volume 248, 33rd International Symposium on Algorithms and Computation (ISAAC 2022)


Abstract
For S ⊆ {0,1}ⁿ, a Boolean function f : S → {-1,1} is a polynomial threshold function (PTF) of degree d and weight W if there is a polynomial p with integer coefficients, of degree d and with sum of absolute coefficients W, such that f(x) = sign p(x) for all x ∈ S. We study representations of decision lists as PTFs over the Boolean cube {0,1}ⁿ and over Hamming balls {0,1}ⁿ_{≤ k}. As our first result, we show that for all d = O((n/(log n))^{1/3}), any decision list over {0,1}ⁿ can be represented by a PTF of degree d and weight 2^O(n/d²). This improves the result of Klivans and Servedio [Adam R. Klivans and Rocco A. Servedio, 2006] by a log² d factor in the exponent of the weight. Our bound is tight for all d = O((n/(log n))^{1/3}) due to the matching lower bound of Beigel [Richard Beigel, 1994]. For decision lists over a Hamming ball {0,1}ⁿ_{≤ k}, we show that the upper bound on the weight above can be drastically improved to n^O(√k) for d = Θ(√k). We also show that a similar improvement is not possible for smaller degrees by proving the lower bound W = 2^Ω(n/d²) for all d = O(√k).
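
To make the definition concrete, here is a small Python sketch (our own illustration; the helper names are hypothetical) that evaluates a polynomial with integer coefficients over {0,1}ⁿ and checks whether it sign-represents a given function on a subset S, reporting its degree and weight:

from itertools import product

def evaluate(poly, x):
    # poly maps a tuple of variable indices (a monomial) to an integer coefficient;
    # over {0,1}^n a monomial evaluates to 1 iff all of its variables are set
    return sum(c for mono, c in poly.items() if all(x[i] for i in mono))

def degree(poly):
    return max((len(mono) for mono in poly), default=0)

def weight(poly):
    return sum(abs(c) for c in poly.values())

def is_ptf(poly, f, S):
    # sign p(x) must be well-defined (p(x) != 0) and agree with f on all of S
    for x in S:
        v = evaluate(poly, x)
        if v == 0 or (v > 0) != (f(x) == 1):
            return False
    return True

# Example: the one-rule decision list "if x_0 then +1 else -1" as p(x) = 2*x_0 - 1
f = lambda x: 1 if x[0] else -1
p = {(0,): 2, (): -1}
S = list(product((0, 1), repeat=2))
assert is_ptf(p, f, S) and degree(p) == 1 and weight(p) == 3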

Cite as

Vladimir Podolskii and Nikolay V. Proskurin. Polynomial Threshold Functions for Decision Lists. In 33rd International Symposium on Algorithms and Computation (ISAAC 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 248, pp. 52:1-52:12, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)


BibTeX

@InProceedings{podolskii_et_al:LIPIcs.ISAAC.2022.52,
  author =	{Podolskii, Vladimir and Proskurin, Nikolay V.},
  title =	{{Polynomial Threshold Functions for Decision Lists}},
  booktitle =	{33rd International Symposium on Algorithms and Computation (ISAAC 2022)},
  pages =	{52:1--52:12},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-258-7},
  ISSN =	{1868-8969},
  year =	{2022},
  volume =	{248},
  editor =	{Bae, Sang Won and Park, Heejin},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ISAAC.2022.52},
  URN =		{urn:nbn:de:0030-drops-173372},
  doi =		{10.4230/LIPIcs.ISAAC.2022.52},
  annote =	{Keywords: Threshold function, decision list, Hamming ball}
}
Document
RANDOM
On the Structure of Learnability Beyond P/Poly

Authors: Ninad Rajgopal and Rahul Santhanam

Published in: LIPIcs, Volume 207, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2021)


Abstract
Motivated by the goal of showing stronger structural results about the complexity of learning, we study the learnability of strong concept classes beyond P/poly, such as PSPACE/poly and EXP/poly. We show the following: 1) (Unconditional Lower Bounds for Learning) Building on [Adam R. Klivans et al., 2013], we prove unconditionally that BPE/poly cannot be weakly learned in polynomial time over the uniform distribution, even with membership and equivalence queries. 2) (Robustness of Learning) For the concept classes EXP/poly and PSPACE/poly, we show unconditionally that worst-case and average-case learning are equivalent, that PAC-learnability and learnability over the uniform distribution are equivalent, and that membership queries do not help in either case. 3) (Reducing Succinct Search to Decision for Learning) For the decision problems R_{Kt} and R_{KS}, capturing the complexity of learning EXP/poly and PSPACE/poly respectively, we show a succinct search-to-decision reduction: for each of these problems, the problem is in BPP iff there is a probabilistic polynomial-time algorithm computing circuits encoding proofs for positive instances of the problem. This is shown via a more general result giving succinct search-to-decision reductions for PSPACE, EXP, and NEXP, which might be of independent interest. 4) (Implausibility of Oblivious Strongly Black-Box Reductions Showing NP-Hardness of Learning NP/poly) We define a natural notion of hardness of learning with respect to oblivious strongly black-box reductions. We show that learning PSPACE/poly is PSPACE-hard with respect to oblivious strongly black-box reductions. On the other hand, if learning NP/poly is NP-hard with respect to oblivious strongly black-box reductions, the Polynomial Hierarchy collapses.

Cite as

Ninad Rajgopal and Rahul Santhanam. On the Structure of Learnability Beyond P/Poly. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 207, pp. 46:1-46:23, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


BibTeX

@InProceedings{rajgopal_et_al:LIPIcs.APPROX/RANDOM.2021.46,
  author =	{Rajgopal, Ninad and Santhanam, Rahul},
  title =	{{On the Structure of Learnability Beyond P/Poly}},
  booktitle =	{Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2021)},
  pages =	{46:1--46:23},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-207-5},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{207},
  editor =	{Wootters, Mary and Sanit\`{a}, Laura},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX/RANDOM.2021.46},
  URN =		{urn:nbn:de:0030-drops-147395},
  doi =		{10.4230/LIPIcs.APPROX/RANDOM.2021.46},
  annote =	{Keywords: Hardness of Learning, Oracle Circuit Classes, Succinct Search, Black-Box Reductions}
}
Document
Tight Hardness Results for Training Depth-2 ReLU Networks

Authors: Surbhi Goel, Adam Klivans, Pasin Manurangsi, and Daniel Reichman

Published in: LIPIcs, Volume 185, 12th Innovations in Theoretical Computer Science Conference (ITCS 2021)


Abstract
We prove several hardness results for training depth-2 neural networks with the ReLU activation function; these networks are simply weighted sums (that may include negative coefficients) of ReLUs. Our goal is to output a depth-2 neural network that minimizes the square loss with respect to a given training set. We prove that this problem is NP-hard already for a network with a single ReLU. We also prove NP-hardness for outputting a weighted sum of k ReLUs minimizing the squared error (for k > 1) even in the realizable setting (i.e., when the labels are consistent with an unknown depth-2 ReLU network). We are also able to obtain lower bounds on the running time in terms of the desired additive error ε. To obtain our lower bounds, we use the Gap Exponential Time Hypothesis (Gap-ETH) as well as a new hypothesis regarding the hardness of approximating the well-known Densest k-Subgraph problem in subexponential time (these hypotheses are used separately in proving different lower bounds). For example, we prove that under reasonable hardness assumptions, any proper learning algorithm for finding the best fitting ReLU must run in time exponential in 1/ε². Together with previous work on improperly learning a ReLU [Surbhi Goel et al., 2017], this implies the first separation between proper and improper algorithms for learning a ReLU. We also study the problem of properly learning a depth-2 network of ReLUs with bounded weights, giving new (worst-case) upper bounds on the running time needed to learn such networks both in the realizable and agnostic settings. Our upper bounds on the running time essentially match our lower bounds in terms of the dependency on ε.
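
For concreteness, a minimal NumPy sketch (our own; all names are assumptions) of the objective whose minimization the paper proves hard: the square loss of a weighted sum of k ReLUs, where the output weights may be negative:

import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def square_loss(W, a, X, y):
    # Depth-2 network x -> sum_j a_j * relu(<w_j, x>).
    # W: (k, n) hidden-layer weights, a: (k,) output weights (possibly negative),
    # X: (m, n) training inputs, y: (m,) labels.
    preds = relu(X @ W.T) @ a
    return float(np.sum((preds - y) ** 2))

rng = np.random.default_rng(0)
k, n, m = 2, 3, 5
W, a = rng.normal(size=(k, n)), rng.normal(size=k)
X, y = rng.normal(size=(m, n)), rng.normal(size=m)
print(square_loss(W, a, X, y))  # the quantity a trainer must minimize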

Cite as

Surbhi Goel, Adam Klivans, Pasin Manurangsi, and Daniel Reichman. Tight Hardness Results for Training Depth-2 ReLU Networks. In 12th Innovations in Theoretical Computer Science Conference (ITCS 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 185, pp. 22:1-22:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


BibTeX

@InProceedings{goel_et_al:LIPIcs.ITCS.2021.22,
  author =	{Goel, Surbhi and Klivans, Adam and Manurangsi, Pasin and Reichman, Daniel},
  title =	{{Tight Hardness Results for Training Depth-2 ReLU Networks}},
  booktitle =	{12th Innovations in Theoretical Computer Science Conference (ITCS 2021)},
  pages =	{22:1--22:14},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-177-1},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{185},
  editor =	{Lee, James R.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2021.22},
  URN =		{urn:nbn:de:0030-drops-135611},
  doi =		{10.4230/LIPIcs.ITCS.2021.22},
  annote =	{Keywords: ReLU, Learning Algorithm, Running Time Lower Bound}
}
Document
Hitting Sets Give Two-Sided Derandomization of Small Space

Authors: Kuan Cheng and William M. Hoza

Published in: LIPIcs, Volume 169, 35th Computational Complexity Conference (CCC 2020)


Abstract
A hitting set is a "one-sided" variant of a pseudorandom generator (PRG), naturally suited to derandomizing algorithms that have one-sided error. We study the problem of using a given hitting set to derandomize algorithms that have two-sided error, focusing on space-bounded algorithms. For our first result, we show that if there is a log-space hitting set for polynomial-width read-once branching programs (ROBPs), then not only does 𝐋 = 𝐑𝐋, but 𝐋 = 𝐁𝐏𝐋 as well. This answers a question raised by Hoza and Zuckerman [William M. Hoza and David Zuckerman, 2018]. Next, we consider constant-width ROBPs. We show that if there are log-space hitting sets for constant-width ROBPs, then given black-box access to a constant-width ROBP f, it is possible to deterministically estimate 𝔼[f] to within ± ε in space O(log(n/ε)). Unconditionally, we give a deterministic algorithm for this problem with space complexity O(log² n + log(1/ε)), slightly improving over previous work. Finally, we investigate the limits of this line of work. Perhaps the strongest reduction along these lines one could hope for would say that for every explicit hitting set, there is an explicit PRG with similar parameters. In the setting of constant-width ROBPs over a large alphabet, we prove that establishing such a strong reduction is at least as difficult as constructing a good PRG outright. Quantitatively, we prove that if the strong reduction holds, then for every constant α > 0, there is an explicit PRG for constant-width ROBPs with seed length O(log^{1 + α} n). Along the way, unconditionally, we construct an improved hitting set for ROBPs over a large alphabet.
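
For reference, the one-sided versus two-sided distinction in our notation (standard definitions, not spelled out in the abstract): a set H ⊆ {0,1}ⁿ is an ε-hitting set for a class 𝓕 if

\forall f \in \mathcal{F}: \quad \Pr_{x \sim U_n}[f(x) = 1] \ge \varepsilon \;\Longrightarrow\; \exists x \in H \text{ with } f(x) = 1,

whereas a PRG G : {0,1}^s → {0,1}ⁿ must satisfy the two-sided condition

\forall f \in \mathcal{F}: \quad \bigl|\, \mathbb{E}[f(U_n)] - \mathbb{E}[f(G(U_s))] \,\bigr| \le \varepsilon.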

Cite as

Kuan Cheng and William M. Hoza. Hitting Sets Give Two-Sided Derandomization of Small Space. In 35th Computational Complexity Conference (CCC 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 169, pp. 10:1-10:25, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)


BibTeX

@InProceedings{cheng_et_al:LIPIcs.CCC.2020.10,
  author =	{Cheng, Kuan and Hoza, William M.},
  title =	{{Hitting Sets Give Two-Sided Derandomization of Small Space}},
  booktitle =	{35th Computational Complexity Conference (CCC 2020)},
  pages =	{10:1--10:25},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-156-6},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{169},
  editor =	{Saraf, Shubhangi},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.CCC.2020.10},
  URN =		{urn:nbn:de:0030-drops-125625},
  doi =		{10.4230/LIPIcs.CCC.2020.10},
  annote =	{Keywords: hitting sets, derandomization, read-once branching programs}
}
Document
Top-Down Induction of Decision Trees: Rigorous Guarantees and Inherent Limitations

Authors: Guy Blanc, Jane Lange, and Li-Yang Tan

Published in: LIPIcs, Volume 151, 11th Innovations in Theoretical Computer Science Conference (ITCS 2020)


Abstract
Consider the following heuristic for building a decision tree for a function f : {0,1}^n → {± 1}. Place the most influential variable x_i of f at the root, and recurse on the subfunctions f_{x_i=0} and f_{x_i=1} on the left and right subtrees respectively; terminate once the tree is an ε-approximation of f. We analyze the quality of this heuristic, obtaining near-matching upper and lower bounds: - Upper bound: For every f with decision tree size s and every ε ∈ (0,1/2), this heuristic builds a decision tree of size at most s^O(log(s/ε)log(1/ε)). - Lower bound: For every ε ∈ (0,1/2) and s ≤ 2^Õ(√n), there is an f with decision tree size s such that this heuristic builds a decision tree of size s^Ω̃(log s). We also obtain upper and lower bounds for monotone functions: s^O(√(log s)/ε) and s^Ω(∜(log s)) respectively. The lower bound disproves conjectures of Fiat and Pechyony (2004) and Lee (2009). Our upper bounds yield new algorithms for properly learning decision trees under the uniform distribution. We show that these algorithms - which are motivated by widely employed and empirically successful top-down decision tree learning heuristics such as ID3, C4.5, and CART - achieve provable guarantees that compare favorably with those of the current fastest algorithm (Ehrenfeucht and Haussler, 1989), and even have certain qualitative advantages. Our lower bounds shed new light on the limitations of these heuristics. Finally, we revisit the classic work of Ehrenfeucht and Haussler. We extend it to give the first uniform-distribution proper learning algorithm that achieves polynomial sample and memory complexity, while matching its state-of-the-art quasipolynomial runtime.
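
A brute-force Python sketch of the heuristic being analyzed (exponential-time and purely illustrative; the helper names are ours). Influence is computed by flipping each bit, and each branch terminates once its subfunction is ε-close to a constant, which makes the whole tree an ε-approximation of f:

import itertools

def influence(f, n, i):
    # Inf_i(f) = Pr_x[f(x) != f(x with bit i flipped)] under the uniform distribution
    flips = 0
    for x in itertools.product((0, 1), repeat=n):
        y = list(x); y[i] ^= 1
        flips += f(x) != f(tuple(y))
    return flips / 2 ** n

def const_error(f, n):
    # distance from f : {0,1}^n -> {+1,-1} to its nearest constant
    avg = sum(f(x) for x in itertools.product((0, 1), repeat=n)) / 2 ** n
    return (1 - abs(avg)) / 2

def restrict(f, i, b):
    # the subfunction f_{x_i = b} on the remaining n - 1 variables
    return lambda x: f(x[:i] + (b,) + x[i:])

def build_tree(f, n, eps):
    if n == 0 or const_error(f, n) <= eps:
        avg = sum(f(x) for x in itertools.product((0, 1), repeat=n))
        return ('leaf', 1 if avg >= 0 else -1)
    i = max(range(n), key=lambda j: influence(f, n, j))  # most influential variable
    return ('node', i,                                   # i indexes the *remaining* variables
            build_tree(restrict(f, i, 0), n - 1, eps),
            build_tree(restrict(f, i, 1), n - 1, eps))

# Example: exact tree (eps = 0) for the AND of two bits
print(build_tree(lambda x: 1 if x[0] and x[1] else -1, 2, 0.0))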

Cite as

Guy Blanc, Jane Lange, and Li-Yang Tan. Top-Down Induction of Decision Trees: Rigorous Guarantees and Inherent Limitations. In 11th Innovations in Theoretical Computer Science Conference (ITCS 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 151, pp. 44:1-44:44, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)


BibTeX

@InProceedings{blanc_et_al:LIPIcs.ITCS.2020.44,
  author =	{Blanc, Guy and Lange, Jane and Tan, Li-Yang},
  title =	{{Top-Down Induction of Decision Trees: Rigorous Guarantees and Inherent Limitations}},
  booktitle =	{11th Innovations in Theoretical Computer Science Conference (ITCS 2020)},
  pages =	{44:1--44:44},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-134-4},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{151},
  editor =	{Vidick, Thomas},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2020.44},
  URN =		{urn:nbn:de:0030-drops-117295},
  doi =		{10.4230/LIPIcs.ITCS.2020.44},
  annote =	{Keywords: Decision trees, Influence of variables, Analysis of boolean functions, Learning theory, Top-down decision tree heuristics}
}
Document
Typically-Correct Derandomization for Small Time and Space

Authors: William M. Hoza

Published in: LIPIcs, Volume 137, 34th Computational Complexity Conference (CCC 2019)


Abstract
Suppose a language L can be decided by a bounded-error randomized algorithm that runs in space S and time n * poly(S). We give a randomized algorithm for L that still runs in space O(S) and time n * poly(S) that uses only O(S) random bits; our algorithm has a low failure probability on all but a negligible fraction of inputs of each length. As an immediate corollary, there is a deterministic algorithm for L that runs in space O(S) and succeeds on all but a negligible fraction of inputs of each length. We also give several other complexity-theoretic applications of our technique.

Cite as

William M. Hoza. Typically-Correct Derandomization for Small Time and Space. In 34th Computational Complexity Conference (CCC 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 137, pp. 9:1-9:39, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)


BibTeX

@InProceedings{hoza:LIPIcs.CCC.2019.9,
  author =	{Hoza, William M.},
  title =	{{Typically-Correct Derandomization for Small Time and Space}},
  booktitle =	{34th Computational Complexity Conference (CCC 2019)},
  pages =	{9:1--9:39},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-116-0},
  ISSN =	{1868-8969},
  year =	{2019},
  volume =	{137},
  editor =	{Shpilka, Amir},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.CCC.2019.9},
  URN =		{urn:nbn:de:0030-drops-108317},
  doi =		{10.4230/LIPIcs.CCC.2019.9},
  annote =	{Keywords: Derandomization, pseudorandomness, space complexity}
}
Document
Preserving Randomness for Adaptive Algorithms

Authors: William M. Hoza and Adam R. Klivans

Published in: LIPIcs, Volume 116, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2018)


Abstract
Suppose Est is a randomized estimation algorithm that uses n random bits and outputs values in ℝ^d. We show how to execute Est on k adaptively chosen inputs using only n + O(k log(d + 1)) random bits instead of the trivial nk (at the cost of mild increases in the error and failure probability). Our algorithm combines a variant of the INW pseudorandom generator [Impagliazzo et al., 1994] with a new scheme for shifting and rounding the outputs of Est. We prove that modifying the outputs of Est is necessary in this setting, and furthermore, our algorithm's randomness complexity is near-optimal in the case d ≤ O(1). As an application, we give a randomness-efficient version of the Goldreich-Levin algorithm; our algorithm finds all Fourier coefficients with absolute value at least θ of a function F : {0,1}^n → {-1,1} using O(n log n) * poly(1/θ) queries to F and O(n) random bits (independent of θ), improving previous work by Bshouty et al. [Bshouty et al., 2004].
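
A toy sketch of the shifting-and-rounding idea mentioned in the abstract (our own simplification, not the paper's exact scheme): snapping outputs to a randomly shifted grid makes the reported value insensitive to small per-run estimation error, so the same randomness can be reused across adaptive calls:

import random

def shifted_round(v, delta, shift):
    # round each coordinate of v to the grid delta*Z + shift
    return tuple(round((t - shift) / delta) * delta + shift for t in v)

delta = 0.1
shift = random.uniform(0.0, delta)   # drawn once, reused for all k adaptive calls
est_run_a = (0.4999, 1.2003)         # two runs of Est on the same input,
est_run_b = (0.5001, 1.1998)         # differing by a tiny estimation error
# Over the choice of shift, a coordinate rounds differently only if a grid
# boundary falls between the two values (probability |a - b| / delta).
print(shifted_round(est_run_a, delta, shift) == shifted_round(est_run_b, delta, shift))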

Cite as

William M. Hoza and Adam R. Klivans. Preserving Randomness for Adaptive Algorithms. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 116, pp. 43:1-43:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)


BibTeX

@InProceedings{hoza_et_al:LIPIcs.APPROX-RANDOM.2018.43,
  author =	{Hoza, William M. and Klivans, Adam R.},
  title =	{{Preserving Randomness for Adaptive Algorithms}},
  booktitle =	{Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2018)},
  pages =	{43:1--43:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-085-9},
  ISSN =	{1868-8969},
  year =	{2018},
  volume =	{116},
  editor =	{Blais, Eric and Jansen, Klaus and D. P. Rolim, Jos\'{e} and Steurer, David},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX-RANDOM.2018.43},
  URN =		{urn:nbn:de:0030-drops-94477},
  doi =		{10.4230/LIPIcs.APPROX-RANDOM.2018.43},
  annote =	{Keywords: pseudorandomness, adaptivity, estimation}
}
Document
Embedding Hard Learning Problems Into Gaussian Space

Authors: Adam Klivans and Pravesh Kothari

Published in: LIPIcs, Volume 28, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2014)


Abstract
We give the first representation-independent hardness result for agnostically learning halfspaces with respect to the Gaussian distribution. We reduce from the problem of learning sparse parities with noise with respect to the uniform distribution on the hypercube (sparse LPN), a notoriously hard problem in theoretical computer science, and show that any algorithm for agnostically learning halfspaces requires n^Ω(log(1/ε)) time under the assumption that k-sparse LPN requires n^Ω(k) time, ruling out a polynomial-time algorithm for the problem. As far as we are aware, this is the first representation-independent hardness result for supervised learning when the underlying distribution is restricted to be a Gaussian. We also show that the problem of agnostically learning sparse polynomials with respect to the Gaussian distribution in polynomial time is as hard as PAC learning DNFs over the uniform distribution in polynomial time. This complements the surprising result of [Andoni et al., 2013], who show that sparse polynomials are learnable under random Gaussian noise in polynomial time. Taken together, these results show the inherent difficulty of designing supervised learning algorithms in Euclidean space, even in the presence of strong distributional assumptions. Our results use a novel embedding of random labeled examples from the uniform distribution on the Boolean hypercube into random labeled examples from the Gaussian distribution, which allows us to relate the hardness of learning problems on two different domains and distributions.

Cite as

Adam Klivans and Pravesh Kothari. Embedding Hard Learning Problems Into Gaussian Space. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2014). Leibniz International Proceedings in Informatics (LIPIcs), Volume 28, pp. 793-809, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2014)


BibTeX

@InProceedings{klivans_et_al:LIPIcs.APPROX-RANDOM.2014.793,
  author =	{Klivans, Adam and Kothari, Pravesh},
  title =	{{Embedding Hard Learning Problems Into Gaussian Space}},
  booktitle =	{Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2014)},
  pages =	{793--809},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-939897-74-3},
  ISSN =	{1868-8969},
  year =	{2014},
  volume =	{28},
  editor =	{Jansen, Klaus and Rolim, Jos\'{e} and Devanur, Nikhil R. and Moore, Cristopher},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX-RANDOM.2014.793},
  URN =		{urn:nbn:de:0030-drops-47391},
  doi =		{10.4230/LIPIcs.APPROX-RANDOM.2014.793},
  annote =	{Keywords: distribution-specific hardness of learning, gaussian space, halfspace-learning, agnostic learning}
}
  • Refine by Author
  • 3 Hoza, William M.
  • 2 Klivans, Adam
  • 1 Blanc, Guy
  • 1 Cheng, Kuan
  • 1 Goel, Surbhi

  • Refine by Classification
  • 3 Theory of computation → Pseudorandomness and derandomization
  • 2 Theory of computation → Complexity classes
  • 1 Theory of computation → Boolean function learning
  • 1 Theory of computation → Computational complexity and cryptography
  • 1 Theory of computation → Machine learning theory

  • Refine by Keyword
  • 2 pseudorandomness
  • 1 Analysis of boolean functions
  • 1 Black-Box Reductions
  • 1 Decision trees
  • 1 Derandomization

  • Refine by Type
  • 9 document

  • Refine by Publication Year
  • 2 2020
  • 2 2021
  • 1 2014
  • 1 2018
  • 1 2019
