Search Results

Documents authored by Lange, Jane


Document
Samplability Makes Learning Easier

Authors: Guy Blanc, Caleb Koch, Jane Lange, Carmen Strassle, and Li-Yang Tan

Published in: LIPIcs, Volume 362, 17th Innovations in Theoretical Computer Science Conference (ITCS 2026)


Abstract
The standard definition of PAC learning (Valiant 1984) requires learners to succeed under all distributions - even ones that are intractable to sample from. This stands in contrast to samplable PAC learning (Blum, Furst, Kearns, and Lipton 1993), where learners only have to succeed under samplable distributions. We study this distinction and show that samplable PAC substantially expands the power of efficient learners. We first construct a concept class that requires exponential sample complexity in standard PAC but is learnable with polynomial sample complexity in samplable PAC. We then lift this statistical separation to the computational setting and obtain a separation relative to a random oracle. Our proofs center around a new complexity primitive, explicit evasive sets, that we introduce and study. These are sets for which membership is easy to determine but are extremely hard to sample from. Our results extend to the online setting to similarly show that its landscape changes when the adversary is assumed to be efficient instead of computationally unbounded.

Cite as

Guy Blanc, Caleb Koch, Jane Lange, Carmen Strassle, and Li-Yang Tan. Samplability Makes Learning Easier. In 17th Innovations in Theoretical Computer Science Conference (ITCS 2026). Leibniz International Proceedings in Informatics (LIPIcs), Volume 362, pp. 20:1-20:12, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2026)


Copy BibTex To Clipboard

@InProceedings{blanc_et_al:LIPIcs.ITCS.2026.20,
  author =	{Blanc, Guy and Koch, Caleb and Lange, Jane and Strassle, Carmen and Tan, Li-Yang},
  title =	{{Samplability Makes Learning Easier}},
  booktitle =	{17th Innovations in Theoretical Computer Science Conference (ITCS 2026)},
  pages =	{20:1--20:12},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-410-9},
  ISSN =	{1868-8969},
  year =	{2026},
  volume =	{362},
  editor =	{Saraf, Shubhangi},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2026.20},
  URN =		{urn:nbn:de:0030-drops-253071},
  doi =		{10.4230/LIPIcs.ITCS.2026.20},
  annote =	{Keywords: PAC learning, Samplable distributions}
}
Document
Limitations of Membership Queries in Testable Learning

Authors: Jane Lange and Mingda Qiao

Published in: LIPIcs, Volume 362, 17th Innovations in Theoretical Computer Science Conference (ITCS 2026)


Abstract
Membership queries (MQ) often yield speedups for learning tasks, particularly in the distribution-specific setting. We show that in the testable learning model of Rubinfeld and Vasilyan [Rubinfeld and Vasilyan, 2023], membership queries cannot decrease the time complexity of testable learning algorithms beyond the complexity of sample-only distribution-specific learning. In the testable learning model, the learner must output a hypothesis whenever the data distribution satisfies a desired property, and if it outputs a hypothesis, the hypothesis must be near-optimal. We give a general reduction from sample-based refutation of boolean concept classes, as presented in [Vadhan, 2017; Kothari and Livni, 2018], to testable learning with queries (TL-Q). This yields lower bounds for TL-Q via the reduction from learning to refutation given in [Kothari and Livni, 2018]. The result is that, relative to a concept class and a distribution family, no m-sample TL-Q algorithm can be super-polynomially more time-efficient than the best m-sample PAC learner. Finally, we define a class of "statistical" MQ algorithms that encompasses many known distribution-specific MQ learners, such as those based on influence estimation or subcube-conditional statistical queries. We show that TL-Q algorithms in this class imply efficient statistical-query refutation and learning algorithms. Thus, combined with known SQ dimension lower bounds, our results imply that these efficient membership query learners cannot be made testable.

Cite as

Jane Lange and Mingda Qiao. Limitations of Membership Queries in Testable Learning. In 17th Innovations in Theoretical Computer Science Conference (ITCS 2026). Leibniz International Proceedings in Informatics (LIPIcs), Volume 362, pp. 91:1-91:23, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2026)


Copy BibTex To Clipboard

@InProceedings{lange_et_al:LIPIcs.ITCS.2026.91,
  author =	{Lange, Jane and Qiao, Mingda},
  title =	{{Limitations of Membership Queries in Testable Learning}},
  booktitle =	{17th Innovations in Theoretical Computer Science Conference (ITCS 2026)},
  pages =	{91:1--91:23},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-410-9},
  ISSN =	{1868-8969},
  year =	{2026},
  volume =	{362},
  editor =	{Saraf, Shubhangi},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2026.91},
  URN =		{urn:nbn:de:0030-drops-253785},
  doi =		{10.4230/LIPIcs.ITCS.2026.91},
  annote =	{Keywords: Testable learning, PAC learning}
}
Document
Certification with an NP Oracle

Authors: Guy Blanc, Caleb Koch, Jane Lange, Carmen Strassle, and Li-Yang Tan

Published in: LIPIcs, Volume 251, 14th Innovations in Theoretical Computer Science Conference (ITCS 2023)


Abstract
In the certification problem, the algorithm is given a function f with certificate complexity k and an input x^⋆, and the goal is to find a certificate of size ≤ poly(k) for f’s value at x^⋆. This problem is in NP^NP, and assuming 𝖯 ≠ NP, is not in 𝖯. Prior works, dating back to Valiant in 1984, have therefore sought to design efficient algorithms by imposing assumptions on f such as monotonicity. Our first result is a BPP^NP algorithm for the general problem. The key ingredient is a new notion of the balanced influence of variables, a natural variant of influence that corrects for the bias of the function. Balanced influences can be accurately estimated via uniform generation, and classic BPP^NP algorithms are known for the latter task. We then consider certification with stricter instance-wise guarantees: for each x^⋆, find a certificate whose size scales with that of the smallest certificate for x^⋆. In sharp contrast with our first result, we show that this problem is NP^NP-hard even to approximate. We obtain an optimal inapproximability ratio, adding to a small handful of problems in the higher levels of the polynomial hierarchy for which optimal inapproximability is known. Our proof involves the novel use of bit-fixing dispersers for gap amplification.

Cite as

Guy Blanc, Caleb Koch, Jane Lange, Carmen Strassle, and Li-Yang Tan. Certification with an NP Oracle. In 14th Innovations in Theoretical Computer Science Conference (ITCS 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 251, pp. 18:1-18:22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)


Copy BibTex To Clipboard

@InProceedings{blanc_et_al:LIPIcs.ITCS.2023.18,
  author =	{Blanc, Guy and Koch, Caleb and Lange, Jane and Strassle, Carmen and Tan, Li-Yang},
  title =	{{Certification with an NP Oracle}},
  booktitle =	{14th Innovations in Theoretical Computer Science Conference (ITCS 2023)},
  pages =	{18:1--18:22},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-263-1},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{251},
  editor =	{Tauman Kalai, Yael},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2023.18},
  URN =		{urn:nbn:de:0030-drops-175217},
  doi =		{10.4230/LIPIcs.ITCS.2023.18},
  annote =	{Keywords: Certificate complexity, Boolean functions, polynomial hierarchy, hardness of approximation}
}
Document
Track A: Algorithms, Complexity and Games
Reconstructing Decision Trees

Authors: Guy Blanc, Jane Lange, and Li-Yang Tan

Published in: LIPIcs, Volume 229, 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022)


Abstract
We give the first reconstruction algorithm for decision trees: given queries to a function f that is opt-close to a size-s decision tree, our algorithm provides query access to a decision tree T where: - T has size S := s^O((log s)²/ε³); - dist(f,T) ≤ O(opt)+ε; - Every query to T is answered with poly((log s)/ε)⋅ log n queries to f and in poly((log s)/ε)⋅ n log n time. This yields a tolerant tester that distinguishes functions that are close to size-s decision trees from those that are far from size-S decision trees. The polylogarithmic dependence on s in the efficiency of our tester is exponentially smaller than that of existing testers. Since decision tree complexity is well known to be related to numerous other boolean function properties, our results also provide a new algorithm for reconstructing and testing these properties.

Cite as

Guy Blanc, Jane Lange, and Li-Yang Tan. Reconstructing Decision Trees. In 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 229, pp. 24:1-24:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)


Copy BibTex To Clipboard

@InProceedings{blanc_et_al:LIPIcs.ICALP.2022.24,
  author =	{Blanc, Guy and Lange, Jane and Tan, Li-Yang},
  title =	{{Reconstructing Decision Trees}},
  booktitle =	{49th International Colloquium on Automata, Languages, and Programming (ICALP 2022)},
  pages =	{24:1--24:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-235-8},
  ISSN =	{1868-8969},
  year =	{2022},
  volume =	{229},
  editor =	{Boja\'{n}czyk, Miko{\l}aj and Merelli, Emanuela and Woodruff, David P.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2022.24},
  URN =		{urn:nbn:de:0030-drops-163653},
  doi =		{10.4230/LIPIcs.ICALP.2022.24},
  annote =	{Keywords: Property reconstruction, property testing, tolerant testing, decision trees}
}
Document
RANDOM
Decision Tree Heuristics Can Fail, Even in the Smoothed Setting

Authors: Guy Blanc, Jane Lange, Mingda Qiao, and Li-Yang Tan

Published in: LIPIcs, Volume 207, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2021)


Abstract
Greedy decision tree learning heuristics are mainstays of machine learning practice, but theoretical justification for their empirical success remains elusive. In fact, it has long been known that there are simple target functions for which they fail badly (Kearns and Mansour, STOC 1996). Recent work of Brutzkus, Daniely, and Malach (COLT 2020) considered the smoothed analysis model as a possible avenue towards resolving this disconnect. Within the smoothed setting and for targets f that are k-juntas, they showed that these heuristics successfully learn f with depth-k decision tree hypotheses. They conjectured that the same guarantee holds more generally for targets that are depth-k decision trees. We provide a counterexample to this conjecture: we construct targets that are depth-k decision trees and show that even in the smoothed setting, these heuristics build trees of depth 2^{Ω(k)} before achieving high accuracy. We also show that the guarantees of Brutzkus et al. cannot extend to the agnostic setting: there are targets that are very close to k-juntas, for which these heuristics build trees of depth 2^{Ω(k)} before achieving high accuracy.

Cite as

Guy Blanc, Jane Lange, Mingda Qiao, and Li-Yang Tan. Decision Tree Heuristics Can Fail, Even in the Smoothed Setting. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 207, pp. 45:1-45:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


Copy BibTex To Clipboard

@InProceedings{blanc_et_al:LIPIcs.APPROX/RANDOM.2021.45,
  author =	{Blanc, Guy and Lange, Jane and Qiao, Mingda and Tan, Li-Yang},
  title =	{{Decision Tree Heuristics Can Fail, Even in the Smoothed Setting}},
  booktitle =	{Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2021)},
  pages =	{45:1--45:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-207-5},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{207},
  editor =	{Wootters, Mary and Sanit\`{a}, Laura},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX/RANDOM.2021.45},
  URN =		{urn:nbn:de:0030-drops-147386},
  doi =		{10.4230/LIPIcs.APPROX/RANDOM.2021.45},
  annote =	{Keywords: decision trees, learning theory, smoothed analysis}
}
Document
Track A: Algorithms, Complexity and Games
Learning Stochastic Decision Trees

Authors: Guy Blanc, Jane Lange, and Li-Yang Tan

Published in: LIPIcs, Volume 198, 48th International Colloquium on Automata, Languages, and Programming (ICALP 2021)


Abstract
We give a quasipolynomial-time algorithm for learning stochastic decision trees that is optimally resilient to adversarial noise. Given an η-corrupted set of uniform random samples labeled by a size-s stochastic decision tree, our algorithm runs in time n^{O(log(s/ε)/ε²)} and returns a hypothesis with error within an additive 2η + ε of the Bayes optimal. An additive 2η is the information-theoretic minimum. Previously no non-trivial algorithm with a guarantee of O(η) + ε was known, even for weaker noise models. Our algorithm is furthermore proper, returning a hypothesis that is itself a decision tree; previously no such algorithm was known even in the noiseless setting.

Cite as

Guy Blanc, Jane Lange, and Li-Yang Tan. Learning Stochastic Decision Trees. In 48th International Colloquium on Automata, Languages, and Programming (ICALP 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 198, pp. 30:1-30:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


Copy BibTex To Clipboard

@InProceedings{blanc_et_al:LIPIcs.ICALP.2021.30,
  author =	{Blanc, Guy and Lange, Jane and Tan, Li-Yang},
  title =	{{Learning Stochastic Decision Trees}},
  booktitle =	{48th International Colloquium on Automata, Languages, and Programming (ICALP 2021)},
  pages =	{30:1--30:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-195-5},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{198},
  editor =	{Bansal, Nikhil and Merelli, Emanuela and Worrell, James},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2021.30},
  URN =		{urn:nbn:de:0030-drops-140994},
  doi =		{10.4230/LIPIcs.ICALP.2021.30},
  annote =	{Keywords: Learning theory, decision trees, proper learning algorithms, adversarial noise}
}
Document
Top-Down Induction of Decision Trees: Rigorous Guarantees and Inherent Limitations

Authors: Guy Blanc, Jane Lange, and Li-Yang Tan

Published in: LIPIcs, Volume 151, 11th Innovations in Theoretical Computer Science Conference (ITCS 2020)


Abstract
Consider the following heuristic for building a decision tree for a function f : {0,1}^n → {± 1}. Place the most influential variable x_i of f at the root, and recurse on the subfunctions f_{x_i=0} and f_{x_i=1} on the left and right subtrees respectively; terminate once the tree is an ε-approximation of f. We analyze the quality of this heuristic, obtaining near-matching upper and lower bounds: - Upper bound: For every f with decision tree size s and every ε ∈ (0,1/2), this heuristic builds a decision tree of size at most s^O(log(s/ε)log(1/ε)). - Lower bound: For every ε ∈ (0,1/2) and s ≤ 2^Õ(√n), there is an f with decision tree size s such that this heuristic builds a decision tree of size s^Ω~(log s). We also obtain upper and lower bounds for monotone functions: s^O(√{log s}/ε) and s^Ω(∜{log s}) respectively. The lower bound disproves conjectures of Fiat and Pechyony (2004) and Lee (2009). Our upper bounds yield new algorithms for properly learning decision trees under the uniform distribution. We show that these algorithms - which are motivated by widely employed and empirically successful top-down decision tree learning heuristics such as ID3, C4.5, and CART - achieve provable guarantees that compare favorably with those of the current fastest algorithm (Ehrenfeucht and Haussler, 1989), and even have certain qualitative advantages. Our lower bounds shed new light on the limitations of these heuristics. Finally, we revisit the classic work of Ehrenfeucht and Haussler. We extend it to give the first uniform-distribution proper learning algorithm that achieves polynomial sample and memory complexity, while matching its state-of-the-art quasipolynomial runtime.

Cite as

Guy Blanc, Jane Lange, and Li-Yang Tan. Top-Down Induction of Decision Trees: Rigorous Guarantees and Inherent Limitations. In 11th Innovations in Theoretical Computer Science Conference (ITCS 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 151, pp. 44:1-44:44, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)


Copy BibTex To Clipboard

@InProceedings{blanc_et_al:LIPIcs.ITCS.2020.44,
  author =	{Blanc, Guy and Lange, Jane and Tan, Li-Yang},
  title =	{{Top-Down Induction of Decision Trees: Rigorous Guarantees and Inherent Limitations}},
  booktitle =	{11th Innovations in Theoretical Computer Science Conference (ITCS 2020)},
  pages =	{44:1--44:44},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-134-4},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{151},
  editor =	{Vidick, Thomas},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2020.44},
  URN =		{urn:nbn:de:0030-drops-117295},
  doi =		{10.4230/LIPIcs.ITCS.2020.44},
  annote =	{Keywords: Decision trees, Influence of variables, Analysis of boolean functions, Learning theory, Top-down decision tree heuristics}
}
Any Issues?
X

Feedback on the Current Page

CAPTCHA

Thanks for your feedback!

Feedback submitted to Dagstuhl Publishing

Could not send message

Please try again later or send an E-mail