Document

**Published in:** LIPIcs, Volume 300, 39th Computational Complexity Conference (CCC 2024)

Consider the expected query complexity of computing the k-fold direct product f^{⊗ k} of a function f to error ε with respect to a distribution μ^k. One strategy is to sequentially compute each of the k copies to error ε/k with respect to μ and apply the union bound. We prove a strong direct sum theorem showing that this naive strategy is essentially optimal. In particular, computing a direct product necessitates a blowup in both query complexity and error.
Strong direct sum theorems contrast with results that only show a blowup in query complexity or error but not both. There has been a long line of such results for distributional query complexity, dating back to (Impagliazzo, Raz, Wigderson 1994) and (Nisan, Rudich, Saks 1994), but a strong direct sum theorem that holds for all functions in the standard query model had been elusive.
A key idea in our work is the first use of the Hardcore Theorem (Impagliazzo 1995) in the context of query complexity. We prove a new resilience lemma that accompanies it, showing that the hardcore of f^{⊗k} is likely to remain dense under arbitrary partitions of the input space.

Guy Blanc, Caleb Koch, Carmen Strassle, and Li-Yang Tan. A Strong Direct Sum Theorem for Distributional Query Complexity. In 39th Computational Complexity Conference (CCC 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 300, pp. 16:1-16:30, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Copy BibTex To Clipboard

@InProceedings{blanc_et_al:LIPIcs.CCC.2024.16, author = {Blanc, Guy and Koch, Caleb and Strassle, Carmen and Tan, Li-Yang}, title = {{A Strong Direct Sum Theorem for Distributional Query Complexity}}, booktitle = {39th Computational Complexity Conference (CCC 2024)}, pages = {16:1--16:30}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-331-7}, ISSN = {1868-8969}, year = {2024}, volume = {300}, editor = {Santhanam, Rahul}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CCC.2024.16}, URN = {urn:nbn:de:0030-drops-204123}, doi = {10.4230/LIPIcs.CCC.2024.16}, annote = {Keywords: Query complexity, direct product theorem, hardcore theorem} }

Document

**Published in:** LIPIcs, Volume 251, 14th Innovations in Theoretical Computer Science Conference (ITCS 2023)

In the certification problem, the algorithm is given a function f with certificate complexity k and an input x^⋆, and the goal is to find a certificate of size ≤ poly(k) for f’s value at x^⋆. This problem is in NP^NP, and assuming 𝖯 ≠ NP, is not in 𝖯. Prior works, dating back to Valiant in 1984, have therefore sought to design efficient algorithms by imposing assumptions on f such as monotonicity.
Our first result is a BPP^NP algorithm for the general problem. The key ingredient is a new notion of the balanced influence of variables, a natural variant of influence that corrects for the bias of the function. Balanced influences can be accurately estimated via uniform generation, and classic BPP^NP algorithms are known for the latter task.
We then consider certification with stricter instance-wise guarantees: for each x^⋆, find a certificate whose size scales with that of the smallest certificate for x^⋆. In sharp contrast with our first result, we show that this problem is NP^NP-hard even to approximate. We obtain an optimal inapproximability ratio, adding to a small handful of problems in the higher levels of the polynomial hierarchy for which optimal inapproximability is known. Our proof involves the novel use of bit-fixing dispersers for gap amplification.

Guy Blanc, Caleb Koch, Jane Lange, Carmen Strassle, and Li-Yang Tan. Certification with an NP Oracle. In 14th Innovations in Theoretical Computer Science Conference (ITCS 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 251, pp. 18:1-18:22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

Copy BibTex To Clipboard

@InProceedings{blanc_et_al:LIPIcs.ITCS.2023.18, author = {Blanc, Guy and Koch, Caleb and Lange, Jane and Strassle, Carmen and Tan, Li-Yang}, title = {{Certification with an NP Oracle}}, booktitle = {14th Innovations in Theoretical Computer Science Conference (ITCS 2023)}, pages = {18:1--18:22}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-263-1}, ISSN = {1868-8969}, year = {2023}, volume = {251}, editor = {Tauman Kalai, Yael}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2023.18}, URN = {urn:nbn:de:0030-drops-175217}, doi = {10.4230/LIPIcs.ITCS.2023.18}, annote = {Keywords: Certificate complexity, Boolean functions, polynomial hierarchy, hardness of approximation} }

Document

**Published in:** LIPIcs, Volume 236, 25th International Conference on Theory and Applications of Satisfiability Testing (SAT 2022)

The seminal Satisfiability Coding Lemma of Paturi, Pudlák, and Zane is a coding scheme for satisfying assignments of k-CNF formulas. We generalize it to give a coding scheme for implicants and use this generalized scheme to establish new structural and algorithmic properties of prime implicants of k-CNF formulas.
Our first application is a near-optimal bound of n⋅ 3^{n(1-Ω(1/k))} on the number of prime implicants of any n-variable k-CNF formula. This resolves an open problem from the Ph.D. thesis of Talebanfard, who proved such a bound for the special case of constant-read k-CNF formulas. Our proof is algorithmic in nature, yielding an algorithm for computing the set of all prime implicants - the Blake Canonical Form - of a given k-CNF formula. The problem of computing the Blake Canonical Form of a given function is a classic one, dating back to Quine, and our work gives the first non-trivial algorithm for k-CNF formulas.

Milan Mossé, Harry Sha, and Li-Yang Tan. A Generalization of the Satisfiability Coding Lemma and Its Applications. In 25th International Conference on Theory and Applications of Satisfiability Testing (SAT 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 236, pp. 9:1-9:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)

Copy BibTex To Clipboard

@InProceedings{mosse_et_al:LIPIcs.SAT.2022.9, author = {Moss\'{e}, Milan and Sha, Harry and Tan, Li-Yang}, title = {{A Generalization of the Satisfiability Coding Lemma and Its Applications}}, booktitle = {25th International Conference on Theory and Applications of Satisfiability Testing (SAT 2022)}, pages = {9:1--9:18}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-242-6}, ISSN = {1868-8969}, year = {2022}, volume = {236}, editor = {Meel, Kuldeep S. and Strichman, Ofer}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SAT.2022.9}, URN = {urn:nbn:de:0030-drops-166837}, doi = {10.4230/LIPIcs.SAT.2022.9}, annote = {Keywords: Prime Implicants, Satisfiability Coding Lemma, Blake Canonical Form, k-SAT} }

Document

**Published in:** LIPIcs, Volume 234, 37th Computational Complexity Conference (CCC 2022)

We study the complexity of computing majority as a composition of local functions: Maj_n = h(g_1,…,g_m), where each g_j: {0,1}ⁿ → {0,1} is an arbitrary function that queries only k ≪ n variables and h: {0,1}^m → {0,1} is an arbitrary combining function. We prove an optimal lower bound of m ≥ Ω(n/k log k) on the number of functions needed, which is a factor Ω(log k) larger than the ideal m = n/k. We call this factor the composition overhead; previously, no superconstant lower bounds on it were known for majority.
Our lower bound recovers, as a corollary and via an entirely different proof, the best known lower bound for bounded-width branching programs for majority (Alon and Maass '86, Babai et al. '90). It is also the first step in a plan that we propose for breaking a longstanding barrier in lower bounds for small-depth boolean circuits.
Novel aspects of our proof include sharp bounds on the information lost as computation flows through the inner functions g_j, and the bootstrapping of lower bounds for a multi-output function (Hamming weight) into lower bounds for a single-output one (majority).

Victor Lecomte, Prasanna Ramakrishnan, and Li-Yang Tan. The Composition Complexity of Majority. In 37th Computational Complexity Conference (CCC 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 234, pp. 19:1-19:26, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)

Copy BibTex To Clipboard

@InProceedings{lecomte_et_al:LIPIcs.CCC.2022.19, author = {Lecomte, Victor and Ramakrishnan, Prasanna and Tan, Li-Yang}, title = {{The Composition Complexity of Majority}}, booktitle = {37th Computational Complexity Conference (CCC 2022)}, pages = {19:1--19:26}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-241-9}, ISSN = {1868-8969}, year = {2022}, volume = {234}, editor = {Lovett, Shachar}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CCC.2022.19}, URN = {urn:nbn:de:0030-drops-165818}, doi = {10.4230/LIPIcs.CCC.2022.19}, annote = {Keywords: computational complexity, circuit lower bounds} }

Document

Track A: Algorithms, Complexity and Games

**Published in:** LIPIcs, Volume 229, 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022)

We give the first reconstruction algorithm for decision trees: given queries to a function f that is opt-close to a size-s decision tree, our algorithm provides query access to a decision tree T where:
- T has size S := s^O((log s)²/ε³);
- dist(f,T) ≤ O(opt)+ε;
- Every query to T is answered with poly((log s)/ε)⋅ log n queries to f and in poly((log s)/ε)⋅ n log n time.
This yields a tolerant tester that distinguishes functions that are close to size-s decision trees from those that are far from size-S decision trees. The polylogarithmic dependence on s in the efficiency of our tester is exponentially smaller than that of existing testers.
Since decision tree complexity is well known to be related to numerous other boolean function properties, our results also provide a new algorithm for reconstructing and testing these properties.

Guy Blanc, Jane Lange, and Li-Yang Tan. Reconstructing Decision Trees. In 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 229, pp. 24:1-24:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)

Copy BibTex To Clipboard

@InProceedings{blanc_et_al:LIPIcs.ICALP.2022.24, author = {Blanc, Guy and Lange, Jane and Tan, Li-Yang}, title = {{Reconstructing Decision Trees}}, booktitle = {49th International Colloquium on Automata, Languages, and Programming (ICALP 2022)}, pages = {24:1--24:17}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-235-8}, ISSN = {1868-8969}, year = {2022}, volume = {229}, editor = {Boja\'{n}czyk, Miko{\l}aj and Merelli, Emanuela and Woodruff, David P.}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2022.24}, URN = {urn:nbn:de:0030-drops-163653}, doi = {10.4230/LIPIcs.ICALP.2022.24}, annote = {Keywords: Property reconstruction, property testing, tolerant testing, decision trees} }

Document

RANDOM

**Published in:** LIPIcs, Volume 207, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2021)

We study the problem of deterministically approximating the number of satisfying assignments of a polynomial threshold function (PTF) over Boolean space. We present and analyze a scheme for transforming such algorithms for PTFs over Gaussian space into algorithms for the more challenging and more standard setting of Boolean space. Applying this transformation to existing algorithms for Gaussian space leads to new algorithms for Boolean space that improve on prior state-of-the-art results due to Meka and Zuckerman [Meka and Zuckerman, 2013] and Kane [Kane, 2012]. Our approach is based on a bias-preserving derandomization of Meka and Zuckerman’s regularity lemma for polynomials [Meka and Zuckerman, 2013] using the [Rocco A. Servedio and Li-Yang Tan, 2018] pseudorandom generator for PTFs.

Rocco A. Servedio and Li-Yang Tan. Deterministic Approximate Counting of Polynomial Threshold Functions via a Derandomized Regularity Lemma. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 207, pp. 37:1-37:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)

Copy BibTex To Clipboard

@InProceedings{servedio_et_al:LIPIcs.APPROX/RANDOM.2021.37, author = {Servedio, Rocco A. and Tan, Li-Yang}, title = {{Deterministic Approximate Counting of Polynomial Threshold Functions via a Derandomized Regularity Lemma}}, booktitle = {Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2021)}, pages = {37:1--37:18}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-207-5}, ISSN = {1868-8969}, year = {2021}, volume = {207}, editor = {Wootters, Mary and Sanit\`{a}, Laura}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX/RANDOM.2021.37}, URN = {urn:nbn:de:0030-drops-147304}, doi = {10.4230/LIPIcs.APPROX/RANDOM.2021.37}, annote = {Keywords: Derandomization, Polynomial threshold functions, deterministic approximate counting} }

Document

RANDOM

**Published in:** LIPIcs, Volume 207, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2021)

Greedy decision tree learning heuristics are mainstays of machine learning practice, but theoretical justification for their empirical success remains elusive. In fact, it has long been known that there are simple target functions for which they fail badly (Kearns and Mansour, STOC 1996).
Recent work of Brutzkus, Daniely, and Malach (COLT 2020) considered the smoothed analysis model as a possible avenue towards resolving this disconnect. Within the smoothed setting and for targets f that are k-juntas, they showed that these heuristics successfully learn f with depth-k decision tree hypotheses. They conjectured that the same guarantee holds more generally for targets that are depth-k decision trees.
We provide a counterexample to this conjecture: we construct targets that are depth-k decision trees and show that even in the smoothed setting, these heuristics build trees of depth 2^{Ω(k)} before achieving high accuracy. We also show that the guarantees of Brutzkus et al. cannot extend to the agnostic setting: there are targets that are very close to k-juntas, for which these heuristics build trees of depth 2^{Ω(k)} before achieving high accuracy.

Guy Blanc, Jane Lange, Mingda Qiao, and Li-Yang Tan. Decision Tree Heuristics Can Fail, Even in the Smoothed Setting. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 207, pp. 45:1-45:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)

Copy BibTex To Clipboard

@InProceedings{blanc_et_al:LIPIcs.APPROX/RANDOM.2021.45, author = {Blanc, Guy and Lange, Jane and Qiao, Mingda and Tan, Li-Yang}, title = {{Decision Tree Heuristics Can Fail, Even in the Smoothed Setting}}, booktitle = {Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2021)}, pages = {45:1--45:16}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-207-5}, ISSN = {1868-8969}, year = {2021}, volume = {207}, editor = {Wootters, Mary and Sanit\`{a}, Laura}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX/RANDOM.2021.45}, URN = {urn:nbn:de:0030-drops-147386}, doi = {10.4230/LIPIcs.APPROX/RANDOM.2021.45}, annote = {Keywords: decision trees, learning theory, smoothed analysis} }

Document

**Published in:** LIPIcs, Volume 200, 36th Computational Complexity Conference (CCC 2021)

The Stabbing Planes proof system [Paul Beame et al., 2018] was introduced to model the reasoning carried out in practical mixed integer programming solvers. As a proof system, it is powerful enough to simulate Cutting Planes and to refute the Tseitin formulas - certain unsatisfiable systems of linear equations od 2 - which are canonical hard examples for many algebraic proof systems. In a recent (and surprising) result, Dadush and Tiwari [Daniel Dadush and Samarth Tiwari, 2020] showed that these short refutations of the Tseitin formulas could be translated into quasi-polynomial size and depth Cutting Planes proofs, refuting a long-standing conjecture. This translation raises several interesting questions. First, whether all Stabbing Planes proofs can be efficiently simulated by Cutting Planes. This would allow for the substantial analysis done on the Cutting Planes system to be lifted to practical mixed integer programming solvers. Second, whether the quasi-polynomial depth of these proofs is inherent to Cutting Planes.
In this paper we make progress towards answering both of these questions. First, we show that any Stabbing Planes proof with bounded coefficients (SP*) can be translated into Cutting Planes. As a consequence of the known lower bounds for Cutting Planes, this establishes the first exponential lower bounds on SP*. Using this translation, we extend the result of Dadush and Tiwari to show that Cutting Planes has short refutations of any unsatisfiable system of linear equations over a finite field. Like the Cutting Planes proofs of Dadush and Tiwari, our refutations also incur a quasi-polynomial blow-up in depth, and we conjecture that this is inherent. As a step towards this conjecture, we develop a new geometric technique for proving lower bounds on the depth of Cutting Planes proofs. This allows us to establish the first lower bounds on the depth of Semantic Cutting Planes proofs of the Tseitin formulas.

Noah Fleming, Mika Göös, Russell Impagliazzo, Toniann Pitassi, Robert Robere, Li-Yang Tan, and Avi Wigderson. On the Power and Limitations of Branch and Cut. In 36th Computational Complexity Conference (CCC 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 200, pp. 6:1-6:30, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)

Copy BibTex To Clipboard

@InProceedings{fleming_et_al:LIPIcs.CCC.2021.6, author = {Fleming, Noah and G\"{o}\"{o}s, Mika and Impagliazzo, Russell and Pitassi, Toniann and Robere, Robert and Tan, Li-Yang and Wigderson, Avi}, title = {{On the Power and Limitations of Branch and Cut}}, booktitle = {36th Computational Complexity Conference (CCC 2021)}, pages = {6:1--6:30}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-193-1}, ISSN = {1868-8969}, year = {2021}, volume = {200}, editor = {Kabanets, Valentine}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CCC.2021.6}, URN = {urn:nbn:de:0030-drops-142809}, doi = {10.4230/LIPIcs.CCC.2021.6}, annote = {Keywords: Proof Complexity, Integer Programming, Cutting Planes, Branch and Cut, Stabbing Planes} }

Document

Track A: Algorithms, Complexity and Games

**Published in:** LIPIcs, Volume 198, 48th International Colloquium on Automata, Languages, and Programming (ICALP 2021)

We give a quasipolynomial-time algorithm for learning stochastic decision trees that is optimally resilient to adversarial noise. Given an η-corrupted set of uniform random samples labeled by a size-s stochastic decision tree, our algorithm runs in time n^{O(log(s/ε)/ε²)} and returns a hypothesis with error within an additive 2η + ε of the Bayes optimal. An additive 2η is the information-theoretic minimum.
Previously no non-trivial algorithm with a guarantee of O(η) + ε was known, even for weaker noise models. Our algorithm is furthermore proper, returning a hypothesis that is itself a decision tree; previously no such algorithm was known even in the noiseless setting.

Guy Blanc, Jane Lange, and Li-Yang Tan. Learning Stochastic Decision Trees. In 48th International Colloquium on Automata, Languages, and Programming (ICALP 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 198, pp. 30:1-30:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)

Copy BibTex To Clipboard

@InProceedings{blanc_et_al:LIPIcs.ICALP.2021.30, author = {Blanc, Guy and Lange, Jane and Tan, Li-Yang}, title = {{Learning Stochastic Decision Trees}}, booktitle = {48th International Colloquium on Automata, Languages, and Programming (ICALP 2021)}, pages = {30:1--30:16}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-195-5}, ISSN = {1868-8969}, year = {2021}, volume = {198}, editor = {Bansal, Nikhil and Merelli, Emanuela and Worrell, James}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2021.30}, URN = {urn:nbn:de:0030-drops-140994}, doi = {10.4230/LIPIcs.ICALP.2021.30}, annote = {Keywords: Learning theory, decision trees, proper learning algorithms, adversarial noise} }

Document

Track A: Algorithms, Complexity and Games

**Published in:** LIPIcs, Volume 168, 47th International Colloquium on Automata, Languages, and Programming (ICALP 2020)

The randomized query complexity 𝖱(f) of a boolean function f: {0,1}ⁿ → {0,1} is famously characterized (via Yao’s minimax) by the least number of queries needed to distinguish a distribution 𝒟₀ over 0-inputs from a distribution 𝒟₁ over 1-inputs, maximized over all pairs (𝒟₀,𝒟₁). We ask: Does this task become easier if we allow query access to infinitely many samples from either 𝒟₀ or 𝒟₁? We show the answer is no: There exists a hard pair (𝒟₀,𝒟₁) such that distinguishing 𝒟₀^∞ from 𝒟₁^∞ requires Θ(𝖱(f)) many queries. As an application, we show that for any composed function f∘g we have 𝖱(f∘g) ≥ Ω(fbs(f)𝖱(g)) where fbs denotes fractional block sensitivity.

Andrew Bassilakis, Andrew Drucker, Mika Göös, Lunjia Hu, Weiyun Ma, and Li-Yang Tan. The Power of Many Samples in Query Complexity. In 47th International Colloquium on Automata, Languages, and Programming (ICALP 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 168, pp. 9:1-9:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)

Copy BibTex To Clipboard

@InProceedings{bassilakis_et_al:LIPIcs.ICALP.2020.9, author = {Bassilakis, Andrew and Drucker, Andrew and G\"{o}\"{o}s, Mika and Hu, Lunjia and Ma, Weiyun and Tan, Li-Yang}, title = {{The Power of Many Samples in Query Complexity}}, booktitle = {47th International Colloquium on Automata, Languages, and Programming (ICALP 2020)}, pages = {9:1--9:18}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-138-2}, ISSN = {1868-8969}, year = {2020}, volume = {168}, editor = {Czumaj, Artur and Dawar, Anuj and Merelli, Emanuela}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2020.9}, URN = {urn:nbn:de:0030-drops-124163}, doi = {10.4230/LIPIcs.ICALP.2020.9}, annote = {Keywords: Query complexity, Composition theorems} }

Document

**Published in:** LIPIcs, Volume 151, 11th Innovations in Theoretical Computer Science Conference (ITCS 2020)

Consider the following heuristic for building a decision tree for a function f : {0,1}^n → {± 1}. Place the most influential variable x_i of f at the root, and recurse on the subfunctions f_{x_i=0} and f_{x_i=1} on the left and right subtrees respectively; terminate once the tree is an ε-approximation of f. We analyze the quality of this heuristic, obtaining near-matching upper and lower bounds:
- Upper bound: For every f with decision tree size s and every ε ∈ (0,1/2), this heuristic builds a decision tree of size at most s^O(log(s/ε)log(1/ε)).
- Lower bound: For every ε ∈ (0,1/2) and s ≤ 2^Õ(√n), there is an f with decision tree size s such that this heuristic builds a decision tree of size s^Ω~(log s).
We also obtain upper and lower bounds for monotone functions: s^O(√{log s}/ε) and s^Ω(∜{log s}) respectively. The lower bound disproves conjectures of Fiat and Pechyony (2004) and Lee (2009).
Our upper bounds yield new algorithms for properly learning decision trees under the uniform distribution. We show that these algorithms - which are motivated by widely employed and empirically successful top-down decision tree learning heuristics such as ID3, C4.5, and CART - achieve provable guarantees that compare favorably with those of the current fastest algorithm (Ehrenfeucht and Haussler, 1989), and even have certain qualitative advantages. Our lower bounds shed new light on the limitations of these heuristics.
Finally, we revisit the classic work of Ehrenfeucht and Haussler. We extend it to give the first uniform-distribution proper learning algorithm that achieves polynomial sample and memory complexity, while matching its state-of-the-art quasipolynomial runtime.

Guy Blanc, Jane Lange, and Li-Yang Tan. Top-Down Induction of Decision Trees: Rigorous Guarantees and Inherent Limitations. In 11th Innovations in Theoretical Computer Science Conference (ITCS 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 151, pp. 44:1-44:44, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)

Copy BibTex To Clipboard

@InProceedings{blanc_et_al:LIPIcs.ITCS.2020.44, author = {Blanc, Guy and Lange, Jane and Tan, Li-Yang}, title = {{Top-Down Induction of Decision Trees: Rigorous Guarantees and Inherent Limitations}}, booktitle = {11th Innovations in Theoretical Computer Science Conference (ITCS 2020)}, pages = {44:1--44:44}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-134-4}, ISSN = {1868-8969}, year = {2020}, volume = {151}, editor = {Vidick, Thomas}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2020.44}, URN = {urn:nbn:de:0030-drops-117295}, doi = {10.4230/LIPIcs.ITCS.2020.44}, annote = {Keywords: Decision trees, Influence of variables, Analysis of boolean functions, Learning theory, Top-down decision tree heuristics} }

Document

RANDOM

**Published in:** LIPIcs, Volume 145, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2019)

We give the best known pseudorandom generators for two touchstone classes in unconditional derandomization: small-depth circuits and sparse F_2 polynomials. Our main results are an epsilon-PRG for the class of size-M depth-d AC^0 circuits with seed length log(M)^{d+O(1)}* log(1/epsilon), and an epsilon-PRG for the class of S-sparse F_2 polynomials with seed length 2^{O(sqrt{log S})}* log(1/epsilon). These results bring the state of the art for unconditional derandomization of these classes into sharp alignment with the state of the art for computational hardness for all parameter settings: improving on the seed lengths of either PRG would require breakthrough progress on longstanding and notorious circuit lower bounds.
The key enabling ingredient in our approach is a new pseudorandom multi-switching lemma. We derandomize recently-developed multi-switching lemmas, which are powerful generalizations of Håstad’s switching lemma that deal with families of depth-two circuits. Our pseudorandom multi-switching lemma - a randomness-efficient algorithm for sampling restrictions that simultaneously simplify all circuits in a family - achieves the parameters obtained by the (full randomness) multi-switching lemmas of Impagliazzo, Matthews, and Paturi [Impagliazzo et al., 2012] and Håstad [Johan Håstad, 2014]. This optimality of our derandomization translates into the optimality (given current circuit lower bounds) of our PRGs for AC^0 and sparse F_2 polynomials.

Rocco A. Servedio and Li-Yang Tan. Improved Pseudorandom Generators from Pseudorandom Multi-Switching Lemmas. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 145, pp. 45:1-45:23, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)

Copy BibTex To Clipboard

@InProceedings{servedio_et_al:LIPIcs.APPROX-RANDOM.2019.45, author = {Servedio, Rocco A. and Tan, Li-Yang}, title = {{Improved Pseudorandom Generators from Pseudorandom Multi-Switching Lemmas}}, booktitle = {Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2019)}, pages = {45:1--45:23}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-125-2}, ISSN = {1868-8969}, year = {2019}, volume = {145}, editor = {Achlioptas, Dimitris and V\'{e}gh, L\'{a}szl\'{o} A.}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX-RANDOM.2019.45}, URN = {urn:nbn:de:0030-drops-112605}, doi = {10.4230/LIPIcs.APPROX-RANDOM.2019.45}, annote = {Keywords: pseudorandom generators, switching lemmas, circuit complexity, unconditional derandomization} }

Document

**Published in:** LIPIcs, Volume 116, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2018)

We study correlation bounds and pseudorandom generators for depth-two circuits that consist of a SYM-gate (computing an arbitrary symmetric function) or THR-gate (computing an arbitrary linear threshold function) that is fed by S {AND} gates. Such circuits were considered in early influential work on unconditional derandomization of Luby, Velickovi{c}, and Wigderson [Michael Luby et al., 1993], who gave the first non-trivial PRG with seed length 2^{O(sqrt{log(S/epsilon)})} that epsilon-fools these circuits.
In this work we obtain the first strict improvement of [Michael Luby et al., 1993]'s seed length: we construct a PRG that epsilon-fools size-S {SYM,THR} oAND circuits over {0,1}^n with seed length 2^{O(sqrt{log S})} + polylog(1/epsilon), an exponential (and near-optimal) improvement of the epsilon-dependence of [Michael Luby et al., 1993]. The above PRG is actually a special case of a more general PRG which we establish for constant-depth circuits containing multiple SYM or THR gates, including as a special case {SYM,THR} o AC^0 circuits. These more general results strengthen previous results of Viola [Viola, 2006] and essentially strengthen more recent results of Lovett and Srinivasan [Lovett and Srinivasan, 2011].
Our improved PRGs follow from improved correlation bounds, which are transformed into PRGs via the Nisan-Wigderson "hardness versus randomness" paradigm [Nisan and Wigderson, 1994]. The key to our improved correlation bounds is the use of a recent powerful multi-switching lemma due to Håstad [Johan Håstad, 2014].

Rocco A. Servedio and Li-Yang Tan. Luby-Velickovic-Wigderson Revisited: Improved Correlation Bounds and Pseudorandom Generators for Depth-Two Circuits. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 116, pp. 56:1-56:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)

Copy BibTex To Clipboard

@InProceedings{servedio_et_al:LIPIcs.APPROX-RANDOM.2018.56, author = {Servedio, Rocco A. and Tan, Li-Yang}, title = {{Luby-Velickovic-Wigderson Revisited: Improved Correlation Bounds and Pseudorandom Generators for Depth-Two Circuits}}, booktitle = {Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2018)}, pages = {56:1--56:20}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-085-9}, ISSN = {1868-8969}, year = {2018}, volume = {116}, editor = {Blais, Eric and Jansen, Klaus and D. P. Rolim, Jos\'{e} and Steurer, David}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX-RANDOM.2018.56}, URN = {urn:nbn:de:0030-drops-94601}, doi = {10.4230/LIPIcs.APPROX-RANDOM.2018.56}, annote = {Keywords: Pseudorandom generators, correlation bounds, constant-depth circuits} }

Document

**Published in:** LIPIcs, Volume 67, 8th Innovations in Theoretical Computer Science Conference (ITCS 2017)

Despite decades of intensive research, efficient - or even sub-exponential time - distribution-free PAC learning algorithms are not known for many important Boolean function classes. In this work we suggest a new perspective on these learning problems, inspired by a surge of recent research in complexity theory, in which the goal is to determine whether and how much of a savings over a naive 2^n runtime can be achieved.
We establish a range of exploratory results towards this end. In more detail,
(1) We first observe that a simple approach building on known uniform-distribution learning results gives non-trivial distribution-free learning algorithms for several well-studied classes including AC0, arbitrary functions of a few linear threshold functions (LTFs), and AC0 augmented with mod_p gates.
(2) Next we present an approach, based on the method of random restrictions from circuit complexity, which can be used to obtain several distribution-free learning algorithms that do not appear to be achievable by approach (1) above. The results achieved in this way include learning algorithms with non-trivial savings for LTF-of-AC0 circuits and improved savings for learning parity-of-AC0 circuits.
(3) Finally, our third contribution is a generic technique for converting lower bounds proved using Neciporuk's method to learning algorithms with non-trivial savings. This technique, which is the most involved of our three approaches, yields distribution-free learning algorithms for a range of classes where previously even non-trivial uniform-distribution learning algorithms were not known; these classes include full-basis formulas, branching programs, span programs, etc. up to some fixed polynomial size.

Rocco A. Servedio and Li-Yang Tan. What Circuit Classes Can Be Learned with Non-Trivial Savings?. In 8th Innovations in Theoretical Computer Science Conference (ITCS 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 67, pp. 30:1-30:21, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)

Copy BibTex To Clipboard

@InProceedings{servedio_et_al:LIPIcs.ITCS.2017.30, author = {Servedio, Rocco A. and Tan, Li-Yang}, title = {{What Circuit Classes Can Be Learned with Non-Trivial Savings?}}, booktitle = {8th Innovations in Theoretical Computer Science Conference (ITCS 2017)}, pages = {30:1--30:21}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-029-3}, ISSN = {1868-8969}, year = {2017}, volume = {67}, editor = {Papadimitriou, Christos H.}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2017.30}, URN = {urn:nbn:de:0030-drops-81722}, doi = {10.4230/LIPIcs.ITCS.2017.30}, annote = {Keywords: computational learning theory, circuit complexity, non-trivial savings} }

Document

**Published in:** LIPIcs, Volume 81, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2017)

We give a poly(log(n),1/epsilon)-query adaptive algorithm for testing whether an unknown Boolean function f:{-1, 1}^n -> {-1, 1}, which is promised to be a halfspace, is monotone versus epsilon-far from monotone. Since non-adaptive algorithms are known to require almost Omega(n^{1/2}) queries to test whether an unknown halfspace is monotone versus far from monotone, this shows that adaptivity enables an exponential improvement in the query complexity of monotonicity testing for halfspaces.

Xi Chen, Rocco A. Servedio, Li-Yang Tan, and Erik Waingarten. Adaptivity Is Exponentially Powerful for Testing Monotonicity of Halfspaces. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 81, pp. 38:1-38:21, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)

Copy BibTex To Clipboard

@InProceedings{chen_et_al:LIPIcs.APPROX-RANDOM.2017.38, author = {Chen, Xi and Servedio, Rocco A. and Tan, Li-Yang and Waingarten, Erik}, title = {{Adaptivity Is Exponentially Powerful for Testing Monotonicity of Halfspaces}}, booktitle = {Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2017)}, pages = {38:1--38:21}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-044-6}, ISSN = {1868-8969}, year = {2017}, volume = {81}, editor = {Jansen, Klaus and Rolim, Jos\'{e} D. P. and Williamson, David P. and Vempala, Santosh S.}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX-RANDOM.2017.38}, URN = {urn:nbn:de:0030-drops-75877}, doi = {10.4230/LIPIcs.APPROX-RANDOM.2017.38}, annote = {Keywords: property testing, linear threshold functions, monotonicity, adaptivity} }

Document

**Published in:** LIPIcs, Volume 79, 32nd Computational Complexity Conference (CCC 2017)

We prove that any non-adaptive algorithm that tests whether an unknown Boolean function f is a k-junta or epsilon-far from every k-junta must make ~Omega(k^{3/2}/ epsilon) many queries for a wide range of parameters k and epsilon. Our result dramatically improves previous lower bounds from [BGSMdW13,STW15], and is essentially optimal given Blais's non-adaptive junta tester from [Blais08], which makes ~O(k^{3/2})/epsilon queries. Combined with the adaptive tester of [Blais09] which makes O(k log k + k / epsilon) queries, our result shows that adaptivity enables polynomial savings in query complexity for junta testing.

Xi Chen, Rocco A. Servedio, Li-Yang Tan, Erik Waingarten, and Jinyu Xie. Settling the Query Complexity of Non-Adaptive Junta Testing. In 32nd Computational Complexity Conference (CCC 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 79, pp. 26:1-26:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)

Copy BibTex To Clipboard

@InProceedings{chen_et_al:LIPIcs.CCC.2017.26, author = {Chen, Xi and Servedio, Rocco A. and Tan, Li-Yang and Waingarten, Erik and Xie, Jinyu}, title = {{Settling the Query Complexity of Non-Adaptive Junta Testing}}, booktitle = {32nd Computational Complexity Conference (CCC 2017)}, pages = {26:1--26:19}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-040-8}, ISSN = {1868-8969}, year = {2017}, volume = {79}, editor = {O'Donnell, Ryan}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CCC.2017.26}, URN = {urn:nbn:de:0030-drops-75283}, doi = {10.4230/LIPIcs.CCC.2017.26}, annote = {Keywords: property testing, juntas, query complexity} }

Document

**Published in:** LIPIcs, Volume 40, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2015)

Monotone Boolean functions, and the monotone Boolean circuits that compute them, have been intensively studied in complexity theory. In this paper we study the structure of Boolean functions in terms of the minimum number of negations in any circuit computing them, a complexity measure that interpolates between monotone functions and the class of all functions. We study this generalization of monotonicity from the vantage point of learning theory, establishing nearly matching upper and lower bounds on the uniform-distribution learnability of circuits in terms of the number of negations they contain. Our upper bounds are based on a new structural characterization of negation-limited circuits that extends a classical result of A.A. Markov. Our lower bounds, which employ Fourier-analytic tools from hardness amplification, give new results even for circuits with no negations (i.e. monotone functions).

Eric Blais, Clément L. Canonne, Igor C. Oliveira, Rocco A. Servedio, and Li-Yang Tan. Learning Circuits with few Negations. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2015). Leibniz International Proceedings in Informatics (LIPIcs), Volume 40, pp. 512-527, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2015)

Copy BibTex To Clipboard

@InProceedings{blais_et_al:LIPIcs.APPROX-RANDOM.2015.512, author = {Blais, Eric and Canonne, Cl\'{e}ment L. and Oliveira, Igor C. and Servedio, Rocco A. and Tan, Li-Yang}, title = {{Learning Circuits with few Negations}}, booktitle = {Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2015)}, pages = {512--527}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-939897-89-7}, ISSN = {1868-8969}, year = {2015}, volume = {40}, editor = {Garg, Naveen and Jansen, Klaus and Rao, Anup and Rolim, Jos\'{e} D. P.}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX-RANDOM.2015.512}, URN = {urn:nbn:de:0030-drops-53214}, doi = {10.4230/LIPIcs.APPROX-RANDOM.2015.512}, annote = {Keywords: Boolean functions, monotonicity, negations, PAC learning} }

Document

**Published in:** LIPIcs, Volume 33, 30th Conference on Computational Complexity (CCC 2015)

We give a new lower bound on the query complexity of any non-adaptive algorithm for testing whether an unknown Boolean function is a k-junta versus epsilon-far from every k-junta. Our lower bound is that any non-adaptive algorithm must make Omega(( k * log*(k)) / ( epsilon^c * log(log(k)/epsilon^c))) queries for this testing problem, where c is any absolute constant <1. For suitable values of epsilon this is asymptotically larger than the O(k * log(k) + k/epsilon) query complexity of the best known adaptive algorithm [Blais,STOC'09] for testing juntas, and thus the new lower bound shows that adaptive algorithms are more powerful than non-adaptive algorithms for the junta testing problem.

Rocco A. Servedio, Li-Yang Tan, and John Wright. Adaptivity Helps for Testing Juntas. In 30th Conference on Computational Complexity (CCC 2015). Leibniz International Proceedings in Informatics (LIPIcs), Volume 33, pp. 264-279, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2015)

Copy BibTex To Clipboard

@InProceedings{servedio_et_al:LIPIcs.CCC.2015.264, author = {Servedio, Rocco A. and Tan, Li-Yang and Wright, John}, title = {{Adaptivity Helps for Testing Juntas}}, booktitle = {30th Conference on Computational Complexity (CCC 2015)}, pages = {264--279}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-939897-81-1}, ISSN = {1868-8969}, year = {2015}, volume = {33}, editor = {Zuckerman, David}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CCC.2015.264}, URN = {urn:nbn:de:0030-drops-50663}, doi = {10.4230/LIPIcs.CCC.2015.264}, annote = {Keywords: Property testing, juntas, adaptivity} }

X

Feedback for Dagstuhl Publishing

Feedback submitted

Please try again later or send an E-mail