DROPS

Document

DOI: 10.4230/LIPIcs.ITCS.2023.18

Certification with an NP Oracle

Authors: Guy Blanc, Caleb Koch, Jane Lange, Carmen Strassle, and Li-Yang Tan

Published in: LIPIcs, Volume 251, 14th Innovations in Theoretical Computer Science Conference (ITCS 2023)

Abstract

In the certification problem, the algorithm is given a function f with certificate complexity k and an input x^⋆, and the goal is to find a certificate of size ≤ poly(k) for f’s value at x^⋆. This problem is in NP^NP, and assuming P ≠ NP, is not in P. Prior works, dating back to Valiant in 1984, have therefore sought to design efficient algorithms by imposing assumptions on f such as monotonicity. Our first result is a BPP^NP algorithm for the general problem. The key ingredient is a new notion of the balanced influence of variables, a natural variant of influence that corrects for the bias of the function. Balanced influences can be accurately estimated via uniform generation, and classic BPP^NP algorithms are known for the latter task. We then consider certification with stricter instance-wise guarantees: for each x^⋆, find a certificate whose size scales with that of the smallest certificate for x^⋆. In sharp contrast with our first result, we show that this problem is NP^NP-hard even to approximate. We obtain an optimal inapproximability ratio, adding to a small handful of problems in the higher levels of the polynomial hierarchy for which optimal inapproximability is known. Our proof involves the novel use of bit-fixing dispersers for gap amplification.
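To make the notion of a certificate concrete, here is a minimal brute-force sketch in Python. It is not the BPP^NP algorithm from the paper: it enumerates all inputs and all coordinate subsets, so it takes exponential time and only suits tiny n. The function names `is_certificate` and `smallest_certificate` are illustrative assumptions, as is the convention that f takes a tuple of 0/1 bits.

```python
from itertools import combinations, product

def is_certificate(f, n, x_star, S):
    """Return True if every input agreeing with x_star on the coordinates in S
    has the same value under f as x_star does."""
    target = f(tuple(x_star))
    free = [j for j in range(n) if j not in S]
    for bits in product((0, 1), repeat=len(free)):
        x = list(x_star)
        for j, b in zip(free, bits):
            x[j] = b
        if f(tuple(x)) != target:
            return False
    return True

def smallest_certificate(f, n, x_star):
    """Exhaustively search subsets by increasing size (exponential time)."""
    for size in range(n + 1):
        for S in combinations(range(n), size):
            if is_certificate(f, n, x_star, set(S)):
                return set(S)

# Hypothetical example: majority of three bits.
majority = lambda x: int(sum(x) >= 2)
cert = smallest_certificate(majority, 3, (1, 1, 0))
```

For the majority example, fixing the first two coordinates of (1, 1, 0) already forces the output, so the search stops at a certificate of size two.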

Cite as

Guy Blanc, Caleb Koch, Jane Lange, Carmen Strassle, and Li-Yang Tan. Certification with an NP Oracle. In 14th Innovations in Theoretical Computer Science Conference (ITCS 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 251, pp. 18:1-18:22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

@InProceedings{blanc_et_al:LIPIcs.ITCS.2023.18,
  author =	{Blanc, Guy and Koch, Caleb and Lange, Jane and Strassle, Carmen and Tan, Li-Yang},
  title =	{{Certification with an NP Oracle}},
  booktitle =	{14th Innovations in Theoretical Computer Science Conference (ITCS 2023)},
  pages =	{18:1--18:22},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-263-1},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{251},
  editor =	{Tauman Kalai, Yael},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2023.18},
  URN =		{urn:nbn:de:0030-drops-175217},
  doi =		{10.4230/LIPIcs.ITCS.2023.18},
  annote =	{Keywords: Certificate complexity, Boolean functions, polynomial hierarchy, hardness of approximation}
}

Document

Track A: Algorithms, Complexity and Games

DOI: 10.4230/LIPIcs.ICALP.2022.24

Reconstructing Decision Trees

Authors: Guy Blanc, Jane Lange, and Li-Yang Tan

Published in: LIPIcs, Volume 229, 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022)

Abstract

We give the first reconstruction algorithm for decision trees: given queries to a function f that is opt-close to a size-s decision tree, our algorithm provides query access to a decision tree T where:
- T has size S := s^{O((log s)²/ε³)};
- dist(f,T) ≤ O(opt)+ε;
- Every query to T is answered with poly((log s)/ε)⋅log n queries to f and in poly((log s)/ε)⋅n log n time.
This yields a tolerant tester that distinguishes functions that are close to size-s decision trees from those that are far from size-S decision trees. The polylogarithmic dependence on s in the efficiency of our tester is exponentially smaller than that of existing testers. Since decision tree complexity is well known to be related to numerous other boolean function properties, our results also provide a new algorithm for reconstructing and testing these properties.

Cite as

Guy Blanc, Jane Lange, and Li-Yang Tan. Reconstructing Decision Trees. In 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 229, pp. 24:1-24:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)

@InProceedings{blanc_et_al:LIPIcs.ICALP.2022.24,
  author =	{Blanc, Guy and Lange, Jane and Tan, Li-Yang},
  title =	{{Reconstructing Decision Trees}},
  booktitle =	{49th International Colloquium on Automata, Languages, and Programming (ICALP 2022)},
  pages =	{24:1--24:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-235-8},
  ISSN =	{1868-8969},
  year =	{2022},
  volume =	{229},
  editor =	{Boja\'{n}czyk, Miko{\l}aj and Merelli, Emanuela and Woodruff, David P.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2022.24},
  URN =		{urn:nbn:de:0030-drops-163653},
  doi =		{10.4230/LIPIcs.ICALP.2022.24},
  annote =	{Keywords: Property reconstruction, property testing, tolerant testing, decision trees}
}

Document

RANDOM

DOI: 10.4230/LIPIcs.APPROX/RANDOM.2021.45

Decision Tree Heuristics Can Fail, Even in the Smoothed Setting

Authors: Guy Blanc, Jane Lange, Mingda Qiao, and Li-Yang Tan

Published in: LIPIcs, Volume 207, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2021)

Abstract

Greedy decision tree learning heuristics are mainstays of machine learning practice, but theoretical justification for their empirical success remains elusive. In fact, it has long been known that there are simple target functions for which they fail badly (Kearns and Mansour, STOC 1996). Recent work of Brutzkus, Daniely, and Malach (COLT 2020) considered the smoothed analysis model as a possible avenue towards resolving this disconnect. Within the smoothed setting and for targets f that are k-juntas, they showed that these heuristics successfully learn f with depth-k decision tree hypotheses. They conjectured that the same guarantee holds more generally for targets that are depth-k decision trees. We provide a counterexample to this conjecture: we construct targets that are depth-k decision trees and show that even in the smoothed setting, these heuristics build trees of depth 2^{Ω(k)} before achieving high accuracy. We also show that the guarantees of Brutzkus et al. cannot extend to the agnostic setting: there are targets that are very close to k-juntas, for which these heuristics build trees of depth 2^{Ω(k)} before achieving high accuracy.

Cite as

Guy Blanc, Jane Lange, Mingda Qiao, and Li-Yang Tan. Decision Tree Heuristics Can Fail, Even in the Smoothed Setting. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 207, pp. 45:1-45:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)

@InProceedings{blanc_et_al:LIPIcs.APPROX/RANDOM.2021.45,
  author =	{Blanc, Guy and Lange, Jane and Qiao, Mingda and Tan, Li-Yang},
  title =	{{Decision Tree Heuristics Can Fail, Even in the Smoothed Setting}},
  booktitle =	{Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2021)},
  pages =	{45:1--45:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-207-5},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{207},
  editor =	{Wootters, Mary and Sanit\`{a}, Laura},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX/RANDOM.2021.45},
  URN =		{urn:nbn:de:0030-drops-147386},
  doi =		{10.4230/LIPIcs.APPROX/RANDOM.2021.45},
  annote =	{Keywords: decision trees, learning theory, smoothed analysis}
}

Document

Track A: Algorithms, Complexity and Games

DOI: 10.4230/LIPIcs.ICALP.2021.30

Learning Stochastic Decision Trees

Authors: Guy Blanc, Jane Lange, and Li-Yang Tan

Published in: LIPIcs, Volume 198, 48th International Colloquium on Automata, Languages, and Programming (ICALP 2021)

Abstract

We give a quasipolynomial-time algorithm for learning stochastic decision trees that is optimally resilient to adversarial noise. Given an η-corrupted set of uniform random samples labeled by a size-s stochastic decision tree, our algorithm runs in time n^{O(log(s/ε)/ε²)} and returns a hypothesis with error within an additive 2η + ε of the Bayes optimal. An additive 2η is the information-theoretic minimum. Previously no non-trivial algorithm with a guarantee of O(η) + ε was known, even for weaker noise models. Our algorithm is furthermore proper, returning a hypothesis that is itself a decision tree; previously no such algorithm was known even in the noiseless setting.

Cite as

Guy Blanc, Jane Lange, and Li-Yang Tan. Learning Stochastic Decision Trees. In 48th International Colloquium on Automata, Languages, and Programming (ICALP 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 198, pp. 30:1-30:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)

@InProceedings{blanc_et_al:LIPIcs.ICALP.2021.30,
  author =	{Blanc, Guy and Lange, Jane and Tan, Li-Yang},
  title =	{{Learning Stochastic Decision Trees}},
  booktitle =	{48th International Colloquium on Automata, Languages, and Programming (ICALP 2021)},
  pages =	{30:1--30:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-195-5},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{198},
  editor =	{Bansal, Nikhil and Merelli, Emanuela and Worrell, James},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2021.30},
  URN =		{urn:nbn:de:0030-drops-140994},
  doi =		{10.4230/LIPIcs.ICALP.2021.30},
  annote =	{Keywords: Learning theory, decision trees, proper learning algorithms, adversarial noise}
}

Document

DOI: 10.4230/LIPIcs.ITCS.2020.44

Top-Down Induction of Decision Trees: Rigorous Guarantees and Inherent Limitations

Authors: Guy Blanc, Jane Lange, and Li-Yang Tan

Published in: LIPIcs, Volume 151, 11th Innovations in Theoretical Computer Science Conference (ITCS 2020)

Abstract

Consider the following heuristic for building a decision tree for a function f : {0,1}^n → {±1}. Place the most influential variable x_i of f at the root, and recurse on the subfunctions f_{x_i=0} and f_{x_i=1} on the left and right subtrees respectively; terminate once the tree is an ε-approximation of f. We analyze the quality of this heuristic, obtaining near-matching upper and lower bounds:
- Upper bound: For every f with decision tree size s and every ε ∈ (0,1/2), this heuristic builds a decision tree of size at most s^{O(log(s/ε) log(1/ε))}.
- Lower bound: For every ε ∈ (0,1/2) and s ≤ 2^{Õ(√n)}, there is an f with decision tree size s such that this heuristic builds a decision tree of size s^{Ω̃(log s)}.
We also obtain upper and lower bounds for monotone functions: s^{O(√(log s)/ε)} and s^{Ω(∜(log s))} respectively. The lower bound disproves conjectures of Fiat and Pechyony (2004) and Lee (2009). Our upper bounds yield new algorithms for properly learning decision trees under the uniform distribution. We show that these algorithms, which are motivated by widely employed and empirically successful top-down decision tree learning heuristics such as ID3, C4.5, and CART, achieve provable guarantees that compare favorably with those of the current fastest algorithm (Ehrenfeucht and Haussler, 1989), and even have certain qualitative advantages. Our lower bounds shed new light on the limitations of these heuristics. Finally, we revisit the classic work of Ehrenfeucht and Haussler. We extend it to give the first uniform-distribution proper learning algorithm that achieves polynomial sample and memory complexity, while matching its state-of-the-art quasipolynomial runtime.
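The heuristic described at the start of this abstract can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the paper's algorithm: influences are computed exactly by enumerating all inputs (exponential in n), and a branch stops only when the restricted function is constant, which corresponds to ε = 0 rather than a global ε-approximation check. The names `restrictions`, `influence`, `build_tree`, and `evaluate` are hypothetical.

```python
from itertools import product

def restrictions(n, fixed):
    """Yield every input in {0,1}^n that agrees with the partial assignment `fixed`."""
    free = [j for j in range(n) if j not in fixed]
    for bits in product((0, 1), repeat=len(free)):
        x = [0] * n
        for j, b in fixed.items():
            x[j] = b
        for j, b in zip(free, bits):
            x[j] = b
        yield tuple(x)

def influence(f, n, i, fixed):
    """Fraction of inputs consistent with `fixed` on which flipping bit i changes f."""
    inputs = list(restrictions(n, fixed))
    flips = sum(f(x) != f(x[:i] + (1 - x[i],) + x[i + 1:]) for x in inputs)
    return flips / len(inputs)

def build_tree(f, n, fixed=None):
    """Put the most influential free variable at the root and recurse on both
    subfunctions; stop when the restricted function is constant (the eps = 0 case)."""
    fixed = dict(fixed or {})
    values = {f(x) for x in restrictions(n, fixed)}
    if len(values) == 1:
        return values.pop()  # leaf labelled with the constant value
    free = [j for j in range(n) if j not in fixed]
    i = max(free, key=lambda j: influence(f, n, j, fixed))
    return (i, build_tree(f, n, {**fixed, i: 0}), build_tree(f, n, {**fixed, i: 1}))

def evaluate(tree, x):
    """Follow the path determined by x down to a leaf."""
    while isinstance(tree, tuple):
        i, left, right = tree
        tree = right if x[i] else left
    return tree

# Hypothetical example: AND of the first two of three bits, with values in {+1, -1}.
f_and = lambda x: 1 if (x[0] and x[1]) else -1
tree = build_tree(f_and, 3)
```

In the example the irrelevant third variable has influence zero and is never queried, so the resulting tree has three leaves.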

Cite as

Guy Blanc, Jane Lange, and Li-Yang Tan. Top-Down Induction of Decision Trees: Rigorous Guarantees and Inherent Limitations. In 11th Innovations in Theoretical Computer Science Conference (ITCS 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 151, pp. 44:1-44:44, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)

@InProceedings{blanc_et_al:LIPIcs.ITCS.2020.44,
  author =	{Blanc, Guy and Lange, Jane and Tan, Li-Yang},
  title =	{{Top-Down Induction of Decision Trees: Rigorous Guarantees and Inherent Limitations}},
  booktitle =	{11th Innovations in Theoretical Computer Science Conference (ITCS 2020)},
  pages =	{44:1--44:44},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-134-4},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{151},
  editor =	{Vidick, Thomas},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2020.44},
  URN =		{urn:nbn:de:0030-drops-117295},
  doi =		{10.4230/LIPIcs.ITCS.2020.44},
  annote =	{Keywords: Decision trees, Influence of variables, Analysis of boolean functions, Learning theory, Top-down decision tree heuristics}
}

Search Results

Documents authored by Lange, Jane
