Document

**Published in:** LIPIcs, Volume 287, 15th Innovations in Theoretical Computer Science Conference (ITCS 2024)

Multicalibration is a notion of fairness for predictors that requires them to provide calibrated predictions across a large set of protected groups. Multicalibration is known to be a distinct goal than loss minimization, even for simple predictors such as linear functions.
In this work, we consider the setting where the protected groups can be represented by neural networks of size k, and the predictors are neural networks of size n > k. We show that minimizing the squared loss over all neural nets of size n implies multicalibration for all but a bounded number of unlucky values of n. We also give evidence that our bound on the number of unlucky values is tight, given our proof technique. Previously, results of the flavor that loss minimization yields multicalibration were known only for predictors that were near the ground truth, hence were rather limited in applicability. Unlike these, our results rely on the expressivity of neural nets and utilize the representation of the predictor.

Jarosław Błasiok, Parikshit Gopalan, Lunjia Hu, Adam Tauman Kalai, and Preetum Nakkiran. Loss Minimization Yields Multicalibration for Large Neural Networks. In 15th Innovations in Theoretical Computer Science Conference (ITCS 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 287, pp. 17:1-17:21, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Copy BibTex To Clipboard

@InProceedings{blasiok_et_al:LIPIcs.ITCS.2024.17, author = {B{\l}asiok, Jaros{\l}aw and Gopalan, Parikshit and Hu, Lunjia and Kalai, Adam Tauman and Nakkiran, Preetum}, title = {{Loss Minimization Yields Multicalibration for Large Neural Networks}}, booktitle = {15th Innovations in Theoretical Computer Science Conference (ITCS 2024)}, pages = {17:1--17:21}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-309-6}, ISSN = {1868-8969}, year = {2024}, volume = {287}, editor = {Guruswami, Venkatesan}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2024.17}, URN = {urn:nbn:de:0030-drops-195452}, doi = {10.4230/LIPIcs.ITCS.2024.17}, annote = {Keywords: Multi-group fairness, loss minimization, neural networks} }

Document

**Published in:** LIPIcs, Volume 264, 38th Computational Complexity Conference (CCC 2023)

This work initiates the systematic study of explicit distributions that are indistinguishable from a single exponential-size combinatorial object. In this we extend the work of Goldreich, Goldwasser and Nussboim (SICOMP 2010) that focused on the implementation of huge objects that are indistinguishable from the uniform distribution, satisfying some global properties (which they coined truthfulness). Indistinguishability from a single object is motivated by the study of generative models in learning theory and regularity lemmas in graph theory. Problems that are well understood in the setting of pseudorandomness present significant challenges and at times are impossible when considering generative models of huge objects.
We demonstrate the versatility of this study by providing a learning algorithm for huge indistinguishable objects in several natural settings including: dense functions and graphs with a truthfulness requirement on the number of ones in the function or edges in the graphs, and a version of the weak regularity lemma for sparse graphs that satisfy some global properties. These and other results generalize basic pseudorandom objects as well as notions introduced in algorithmic fairness. The results rely on notions and techniques from a variety of areas including learning theory, complexity theory, cryptography, and game theory.

Lunjia Hu, Inbal Rachel Livni Navon, and Omer Reingold. Generative Models of Huge Objects. In 38th Computational Complexity Conference (CCC 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 264, pp. 5:1-5:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

Copy BibTex To Clipboard

@InProceedings{hu_et_al:LIPIcs.CCC.2023.5, author = {Hu, Lunjia and Livni Navon, Inbal Rachel and Reingold, Omer}, title = {{Generative Models of Huge Objects}}, booktitle = {38th Computational Complexity Conference (CCC 2023)}, pages = {5:1--5:20}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-282-2}, ISSN = {1868-8969}, year = {2023}, volume = {264}, editor = {Ta-Shma, Amnon}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.CCC.2023.5}, URN = {urn:nbn:de:0030-drops-182758}, doi = {10.4230/LIPIcs.CCC.2023.5}, annote = {Keywords: pseudorandomness, generative models, regularity lemma} }

Document

**Published in:** LIPIcs, Volume 251, 14th Innovations in Theoretical Computer Science Conference (ITCS 2023)

We present a new perspective on loss minimization and the recent notion of Omniprediction through the lens of Outcome Indistingusihability. For a collection of losses and hypothesis class, omniprediction requires that a predictor provide a loss-minimization guarantee simultaneously for every loss in the collection compared to the best (loss-specific) hypothesis in the class. We present a generic template to learn predictors satisfying a guarantee we call Loss Outcome Indistinguishability. For a set of statistical tests - based on a collection of losses and hypothesis class - a predictor is Loss OI if it is indistinguishable (according to the tests) from Nature’s true probabilities over outcomes. By design, Loss OI implies omniprediction in a direct and intuitive manner. We simplify Loss OI further, decomposing it into a calibration condition plus multiaccuracy for a class of functions derived from the loss and hypothesis classes. By careful analysis of this class, we give efficient constructions of omnipredictors for interesting classes of loss functions, including non-convex losses.
This decomposition highlights the utility of a new multi-group fairness notion that we call calibrated multiaccuracy, which lies in between multiaccuracy and multicalibration. We show that calibrated multiaccuracy implies Loss OI for the important set of convex losses arising from Generalized Linear Models, without requiring full multicalibration. For such losses, we show an equivalence between our computational notion of Loss OI and a geometric notion of indistinguishability, formulated as Pythagorean theorems in the associated Bregman divergence. We give an efficient algorithm for calibrated multiaccuracy with computational complexity comparable to that of multiaccuracy. In all, calibrated multiaccuracy offers an interesting tradeoff point between efficiency and generality in the omniprediction landscape.

Parikshit Gopalan, Lunjia Hu, Michael P. Kim, Omer Reingold, and Udi Wieder. Loss Minimization Through the Lens Of Outcome Indistinguishability. In 14th Innovations in Theoretical Computer Science Conference (ITCS 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 251, pp. 60:1-60:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

Copy BibTex To Clipboard

@InProceedings{gopalan_et_al:LIPIcs.ITCS.2023.60, author = {Gopalan, Parikshit and Hu, Lunjia and Kim, Michael P. and Reingold, Omer and Wieder, Udi}, title = {{Loss Minimization Through the Lens Of Outcome Indistinguishability}}, booktitle = {14th Innovations in Theoretical Computer Science Conference (ITCS 2023)}, pages = {60:1--60:20}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-263-1}, ISSN = {1868-8969}, year = {2023}, volume = {251}, editor = {Tauman Kalai, Yael}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2023.60}, URN = {urn:nbn:de:0030-drops-175635}, doi = {10.4230/LIPIcs.ITCS.2023.60}, annote = {Keywords: Loss Minimization, Indistinguishability} }

Document

**Published in:** LIPIcs, Volume 251, 14th Innovations in Theoretical Computer Science Conference (ITCS 2023)

In many learning theory problems, a central role is played by a hypothesis class: we might assume that the data is labeled according to a hypothesis in the class (usually referred to as the realizable setting), or we might evaluate the learned model by comparing it with the best hypothesis in the class (the agnostic setting). Taking a step beyond these classic setups that involve only a single hypothesis class, we study a variety of problems that involve two hypothesis classes simultaneously.
We introduce comparative learning as a combination of the realizable and agnostic settings in PAC learning: given two binary hypothesis classes S and B, we assume that the data is labeled according to a hypothesis in the source class S and require the learned model to achieve an accuracy comparable to the best hypothesis in the benchmark class B. Even when both S and B have infinite VC dimensions, comparative learning can still have a small sample complexity. We show that the sample complexity of comparative learning is characterized by the mutual VC dimension VC(S,B) which we define to be the maximum size of a subset shattered by both S and B. We also show a similar result in the online setting, where we give a regret characterization in terms of the analogous mutual Littlestone dimension Ldim(S,B). These results also hold for partial hypotheses.
We additionally show that the insights necessary to characterize the sample complexity of comparative learning can be applied to other tasks involving two hypothesis classes. In particular, we characterize the sample complexity of realizable multiaccuracy and multicalibration using the mutual fat-shattering dimension, an analogue of the mutual VC dimension for real-valued hypotheses. This not only solves an open problem proposed by Hu, Peale, Reingold (2022), but also leads to independently interesting results extending classic ones about regression, boosting, and covering number to our two-hypothesis-class setting.

Lunjia Hu and Charlotte Peale. Comparative Learning: A Sample Complexity Theory for Two Hypothesis Classes. In 14th Innovations in Theoretical Computer Science Conference (ITCS 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 251, pp. 72:1-72:30, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

Copy BibTex To Clipboard

@InProceedings{hu_et_al:LIPIcs.ITCS.2023.72, author = {Hu, Lunjia and Peale, Charlotte}, title = {{Comparative Learning: A Sample Complexity Theory for Two Hypothesis Classes}}, booktitle = {14th Innovations in Theoretical Computer Science Conference (ITCS 2023)}, pages = {72:1--72:30}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-263-1}, ISSN = {1868-8969}, year = {2023}, volume = {251}, editor = {Tauman Kalai, Yael}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2023.72}, URN = {urn:nbn:de:0030-drops-175752}, doi = {10.4230/LIPIcs.ITCS.2023.72}, annote = {Keywords: Comparative learning, mutual VC dimension, realizable multiaccuracy and multicalibration, sample complexity} }

Document

Track A: Algorithms, Complexity and Games

**Published in:** LIPIcs, Volume 168, 47th International Colloquium on Automata, Languages, and Programming (ICALP 2020)

The randomized query complexity 𝖱(f) of a boolean function f: {0,1}ⁿ → {0,1} is famously characterized (via Yao’s minimax) by the least number of queries needed to distinguish a distribution 𝒟₀ over 0-inputs from a distribution 𝒟₁ over 1-inputs, maximized over all pairs (𝒟₀,𝒟₁). We ask: Does this task become easier if we allow query access to infinitely many samples from either 𝒟₀ or 𝒟₁? We show the answer is no: There exists a hard pair (𝒟₀,𝒟₁) such that distinguishing 𝒟₀^∞ from 𝒟₁^∞ requires Θ(𝖱(f)) many queries. As an application, we show that for any composed function f∘g we have 𝖱(f∘g) ≥ Ω(fbs(f)𝖱(g)) where fbs denotes fractional block sensitivity.

Andrew Bassilakis, Andrew Drucker, Mika Göös, Lunjia Hu, Weiyun Ma, and Li-Yang Tan. The Power of Many Samples in Query Complexity. In 47th International Colloquium on Automata, Languages, and Programming (ICALP 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 168, pp. 9:1-9:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)

Copy BibTex To Clipboard

@InProceedings{bassilakis_et_al:LIPIcs.ICALP.2020.9, author = {Bassilakis, Andrew and Drucker, Andrew and G\"{o}\"{o}s, Mika and Hu, Lunjia and Ma, Weiyun and Tan, Li-Yang}, title = {{The Power of Many Samples in Query Complexity}}, booktitle = {47th International Colloquium on Automata, Languages, and Programming (ICALP 2020)}, pages = {9:1--9:18}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-138-2}, ISSN = {1868-8969}, year = {2020}, volume = {168}, editor = {Czumaj, Artur and Dawar, Anuj and Merelli, Emanuela}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2020.9}, URN = {urn:nbn:de:0030-drops-124163}, doi = {10.4230/LIPIcs.ICALP.2020.9}, annote = {Keywords: Query complexity, Composition theorems} }

X

Feedback for Dagstuhl Publishing

Feedback submitted

Please try again later or send an E-mail