7 Search Results for "Gopalan, Parikshit"


Document
Loss Minimization Yields Multicalibration for Large Neural Networks

Authors: Jarosław Błasiok, Parikshit Gopalan, Lunjia Hu, Adam Tauman Kalai, and Preetum Nakkiran

Published in: LIPIcs, Volume 287, 15th Innovations in Theoretical Computer Science Conference (ITCS 2024)


Abstract
Multicalibration is a notion of fairness for predictors that requires them to provide calibrated predictions across a large set of protected groups. Multicalibration is known to be a distinct goal than loss minimization, even for simple predictors such as linear functions. In this work, we consider the setting where the protected groups can be represented by neural networks of size k, and the predictors are neural networks of size n > k. We show that minimizing the squared loss over all neural nets of size n implies multicalibration for all but a bounded number of unlucky values of n. We also give evidence that our bound on the number of unlucky values is tight, given our proof technique. Previously, results of the flavor that loss minimization yields multicalibration were known only for predictors that were near the ground truth, hence were rather limited in applicability. Unlike these, our results rely on the expressivity of neural nets and utilize the representation of the predictor.

Cite as

Jarosław Błasiok, Parikshit Gopalan, Lunjia Hu, Adam Tauman Kalai, and Preetum Nakkiran. Loss Minimization Yields Multicalibration for Large Neural Networks. In 15th Innovations in Theoretical Computer Science Conference (ITCS 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 287, pp. 17:1-17:21, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


Copy BibTex To Clipboard

@InProceedings{blasiok_et_al:LIPIcs.ITCS.2024.17,
  author =	{B{\l}asiok, Jaros{\l}aw and Gopalan, Parikshit and Hu, Lunjia and Kalai, Adam Tauman and Nakkiran, Preetum},
  title =	{{Loss Minimization Yields Multicalibration for Large Neural Networks}},
  booktitle =	{15th Innovations in Theoretical Computer Science Conference (ITCS 2024)},
  pages =	{17:1--17:21},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-309-6},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{287},
  editor =	{Guruswami, Venkatesan},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2024.17},
  URN =		{urn:nbn:de:0030-drops-195452},
  doi =		{10.4230/LIPIcs.ITCS.2024.17},
  annote =	{Keywords: Multi-group fairness, loss minimization, neural networks}
}
Document
Loss Minimization Through the Lens Of Outcome Indistinguishability

Authors: Parikshit Gopalan, Lunjia Hu, Michael P. Kim, Omer Reingold, and Udi Wieder

Published in: LIPIcs, Volume 251, 14th Innovations in Theoretical Computer Science Conference (ITCS 2023)


Abstract
We present a new perspective on loss minimization and the recent notion of Omniprediction through the lens of Outcome Indistingusihability. For a collection of losses and hypothesis class, omniprediction requires that a predictor provide a loss-minimization guarantee simultaneously for every loss in the collection compared to the best (loss-specific) hypothesis in the class. We present a generic template to learn predictors satisfying a guarantee we call Loss Outcome Indistinguishability. For a set of statistical tests - based on a collection of losses and hypothesis class - a predictor is Loss OI if it is indistinguishable (according to the tests) from Nature’s true probabilities over outcomes. By design, Loss OI implies omniprediction in a direct and intuitive manner. We simplify Loss OI further, decomposing it into a calibration condition plus multiaccuracy for a class of functions derived from the loss and hypothesis classes. By careful analysis of this class, we give efficient constructions of omnipredictors for interesting classes of loss functions, including non-convex losses. This decomposition highlights the utility of a new multi-group fairness notion that we call calibrated multiaccuracy, which lies in between multiaccuracy and multicalibration. We show that calibrated multiaccuracy implies Loss OI for the important set of convex losses arising from Generalized Linear Models, without requiring full multicalibration. For such losses, we show an equivalence between our computational notion of Loss OI and a geometric notion of indistinguishability, formulated as Pythagorean theorems in the associated Bregman divergence. We give an efficient algorithm for calibrated multiaccuracy with computational complexity comparable to that of multiaccuracy. In all, calibrated multiaccuracy offers an interesting tradeoff point between efficiency and generality in the omniprediction landscape.

Cite as

Parikshit Gopalan, Lunjia Hu, Michael P. Kim, Omer Reingold, and Udi Wieder. Loss Minimization Through the Lens Of Outcome Indistinguishability. In 14th Innovations in Theoretical Computer Science Conference (ITCS 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 251, pp. 60:1-60:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)


Copy BibTex To Clipboard

@InProceedings{gopalan_et_al:LIPIcs.ITCS.2023.60,
  author =	{Gopalan, Parikshit and Hu, Lunjia and Kim, Michael P. and Reingold, Omer and Wieder, Udi},
  title =	{{Loss Minimization Through the Lens Of Outcome Indistinguishability}},
  booktitle =	{14th Innovations in Theoretical Computer Science Conference (ITCS 2023)},
  pages =	{60:1--60:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-263-1},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{251},
  editor =	{Tauman Kalai, Yael},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2023.60},
  URN =		{urn:nbn:de:0030-drops-175635},
  doi =		{10.4230/LIPIcs.ITCS.2023.60},
  annote =	{Keywords: Loss Minimization, Indistinguishability}
}
Document
Making Decisions Under Outcome Performativity

Authors: Michael P. Kim and Juan C. Perdomo

Published in: LIPIcs, Volume 251, 14th Innovations in Theoretical Computer Science Conference (ITCS 2023)


Abstract
Decision-makers often act in response to data-driven predictions, with the goal of achieving favorable outcomes. In such settings, predictions don’t passively forecast the future; instead, predictions actively shape the distribution of outcomes they are meant to predict. This performative prediction setting [Brown et al., 2022] raises new challenges for learning "optimal" decision rules. In particular, existing solution concepts do not address the apparent tension between the goals of forecasting outcomes accurately and steering individuals to achieve desirable outcomes. To contend with this concern, we introduce a new optimality concept - performative omniprediction - adapted from the supervised (non-performative) learning setting [Gopalan et al., 2022]. A performative omnipredictor is a single predictor that simultaneously encodes the optimal decision rule with respect to many possibly-competing objectives. Our main result demonstrates that efficient performative omnipredictors exist, under a natural restriction of performative prediction, which we call outcome performativity. On a technical level, our results follow by carefully generalizing the notion of outcome indistinguishability [Cynthia Dwork et al., 2021] to the outcome performative setting. From an appropriate notion of Performative OI, we recover many consequences known to hold in the supervised setting, such as omniprediction and universal adaptability [Kim et al., 2022].

Cite as

Michael P. Kim and Juan C. Perdomo. Making Decisions Under Outcome Performativity. In 14th Innovations in Theoretical Computer Science Conference (ITCS 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 251, pp. 79:1-79:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)


Copy BibTex To Clipboard

@InProceedings{kim_et_al:LIPIcs.ITCS.2023.79,
  author =	{Kim, Michael P. and Perdomo, Juan C.},
  title =	{{Making Decisions Under Outcome Performativity}},
  booktitle =	{14th Innovations in Theoretical Computer Science Conference (ITCS 2023)},
  pages =	{79:1--79:15},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-263-1},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{251},
  editor =	{Tauman Kalai, Yael},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2023.79},
  URN =		{urn:nbn:de:0030-drops-175824},
  doi =		{10.4230/LIPIcs.ITCS.2023.79},
  annote =	{Keywords: performative prediction, outcome indistinguishability}
}
Document
Omnipredictors

Authors: Parikshit Gopalan, Adam Tauman Kalai, Omer Reingold, Vatsal Sharan, and Udi Wieder

Published in: LIPIcs, Volume 215, 13th Innovations in Theoretical Computer Science Conference (ITCS 2022)


Abstract
Loss minimization is a dominant paradigm in machine learning, where a predictor is trained to minimize some loss function that depends on an uncertain event (e.g., "will it rain tomorrow?"). Different loss functions imply different learning algorithms and, at times, very different predictors. While widespread and appealing, a clear drawback of this approach is that the loss function may not be known at the time of learning, requiring the algorithm to use a best-guess loss function. Alternatively, the same classifier may be used to inform multiple decisions, which correspond to multiple loss functions, requiring multiple learning algorithms to be run on the same data. We suggest a rigorous new paradigm for loss minimization in machine learning where the loss function can be ignored at the time of learning and only be taken into account when deciding an action. We introduce the notion of an (L,𝒞)-omnipredictor, which could be used to optimize any loss in a family L. Once the loss function is set, the outputs of the predictor can be post-processed (a simple univariate data-independent transformation of individual predictions) to do well compared with any hypothesis from the class C. The post processing is essentially what one would perform if the outputs of the predictor were true probabilities of the uncertain events. In a sense, omnipredictors extract all the predictive power from the class 𝒞, irrespective of the loss function in L. We show that such "loss-oblivious" learning is feasible through a connection to multicalibration, a notion introduced in the context of algorithmic fairness. A multicalibrated predictor doesn’t aim to minimize some loss function, but rather to make calibrated predictions, even when conditioned on inputs lying in certain sets c belonging to a family 𝒞 which is weakly learnable. We show that a 𝒞-multicalibrated predictor is also an (L,𝒞)-omnipredictor, where L contains all convex loss functions with some mild Lipschitz conditions. The predictors are even omnipredictors with respect to sparse linear combinations of functions in 𝒞. As a corollary, we deduce that distribution-specific weak agnostic learning is complete for a large class of loss minimization tasks. In addition, we show how multicalibration can be viewed as a solution concept for agnostic boosting, shedding new light on past results. Finally, we transfer our insights back to the context of algorithmic fairness by providing omnipredictors for multi-group loss minimization.

Cite as

Parikshit Gopalan, Adam Tauman Kalai, Omer Reingold, Vatsal Sharan, and Udi Wieder. Omnipredictors. In 13th Innovations in Theoretical Computer Science Conference (ITCS 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 215, pp. 79:1-79:21, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)


Copy BibTex To Clipboard

@InProceedings{gopalan_et_al:LIPIcs.ITCS.2022.79,
  author =	{Gopalan, Parikshit and Kalai, Adam Tauman and Reingold, Omer and Sharan, Vatsal and Wieder, Udi},
  title =	{{Omnipredictors}},
  booktitle =	{13th Innovations in Theoretical Computer Science Conference (ITCS 2022)},
  pages =	{79:1--79:21},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-217-4},
  ISSN =	{1868-8969},
  year =	{2022},
  volume =	{215},
  editor =	{Braverman, Mark},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2022.79},
  URN =		{urn:nbn:de:0030-drops-156755},
  doi =		{10.4230/LIPIcs.ITCS.2022.79},
  annote =	{Keywords: Loss-minimzation, multi-group fairness, agnostic learning, boosting}
}
Document
Hitting Sets Give Two-Sided Derandomization of Small Space

Authors: Kuan Cheng and William M. Hoza

Published in: LIPIcs, Volume 169, 35th Computational Complexity Conference (CCC 2020)


Abstract
A hitting set is a "one-sided" variant of a pseudorandom generator (PRG), naturally suited to derandomizing algorithms that have one-sided error. We study the problem of using a given hitting set to derandomize algorithms that have two-sided error, focusing on space-bounded algorithms. For our first result, we show that if there is a log-space hitting set for polynomial-width read-once branching programs (ROBPs), then not only does 𝐋 = 𝐑𝐋, but 𝐋 = 𝐁𝐏𝐋 as well. This answers a question raised by Hoza and Zuckerman [William M. Hoza and David Zuckerman, 2018]. Next, we consider constant-width ROBPs. We show that if there are log-space hitting sets for constant-width ROBPs, then given black-box access to a constant-width ROBP f, it is possible to deterministically estimate 𝔼[f] to within ± ε in space O(log(n/ε)). Unconditionally, we give a deterministic algorithm for this problem with space complexity O(log² n + log(1/ε)), slightly improving over previous work. Finally, we investigate the limits of this line of work. Perhaps the strongest reduction along these lines one could hope for would say that for every explicit hitting set, there is an explicit PRG with similar parameters. In the setting of constant-width ROBPs over a large alphabet, we prove that establishing such a strong reduction is at least as difficult as constructing a good PRG outright. Quantitatively, we prove that if the strong reduction holds, then for every constant α > 0, there is an explicit PRG for constant-width ROBPs with seed length O(log^{1 + α} n). Along the way, unconditionally, we construct an improved hitting set for ROBPs over a large alphabet.

Cite as

Kuan Cheng and William M. Hoza. Hitting Sets Give Two-Sided Derandomization of Small Space. In 35th Computational Complexity Conference (CCC 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 169, pp. 10:1-10:25, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)


Copy BibTex To Clipboard

@InProceedings{cheng_et_al:LIPIcs.CCC.2020.10,
  author =	{Cheng, Kuan and Hoza, William M.},
  title =	{{Hitting Sets Give Two-Sided Derandomization of Small Space}},
  booktitle =	{35th Computational Complexity Conference (CCC 2020)},
  pages =	{10:1--10:25},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-156-6},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{169},
  editor =	{Saraf, Shubhangi},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.CCC.2020.10},
  URN =		{urn:nbn:de:0030-drops-125625},
  doi =		{10.4230/LIPIcs.CCC.2020.10},
  annote =	{Keywords: hitting sets, derandomization, read-once branching programs}
}
Document
Finding Skewed Subcubes Under a Distribution

Authors: Parikshit Gopalan, Roie Levin, and Udi Wieder

Published in: LIPIcs, Volume 151, 11th Innovations in Theoretical Computer Science Conference (ITCS 2020)


Abstract
Say that we are given samples from a distribution ψ over an n-dimensional space. We expect or desire ψ to behave like a product distribution (or a k-wise independent distribution over its marginals for small k). We propose the problem of enumerating/list-decoding all large subcubes where the distribution ψ deviates markedly from what we expect; we refer to such subcubes as skewed subcubes. Skewed subcubes are certificates of dependencies between small subsets of variables in ψ. We motivate this problem by showing that it arises naturally in the context of algorithmic fairness and anomaly detection. In this work we focus on the special but important case where the space is the Boolean hypercube, and the expected marginals are uniform. We show that the obvious definition of skewed subcubes can lead to intractable list sizes, and propose a better definition of a minimal skewed subcube, which are subcubes whose skew cannot be attributed to a larger subcube that contains it. Our main technical contribution is a list-size bound for this definition and an algorithm to efficiently find all such subcubes. Both the bound and the algorithm rely on Fourier-analytic techniques, especially the powerful hypercontractive inequality. On the lower bounds side, we show that finding skewed subcubes is as hard as the sparse noisy parity problem, and hence our algorithms cannot be improved on substantially without a breakthrough on this problem which is believed to be intractable. Motivated by this, we study alternate models allowing query access to ψ where finding skewed subcubes might be easier.

Cite as

Parikshit Gopalan, Roie Levin, and Udi Wieder. Finding Skewed Subcubes Under a Distribution. In 11th Innovations in Theoretical Computer Science Conference (ITCS 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 151, pp. 84:1-84:30, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)


Copy BibTex To Clipboard

@InProceedings{gopalan_et_al:LIPIcs.ITCS.2020.84,
  author =	{Gopalan, Parikshit and Levin, Roie and Wieder, Udi},
  title =	{{Finding Skewed Subcubes Under a Distribution}},
  booktitle =	{11th Innovations in Theoretical Computer Science Conference (ITCS 2020)},
  pages =	{84:1--84:30},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-134-4},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{151},
  editor =	{Vidick, Thomas},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2020.84},
  URN =		{urn:nbn:de:0030-drops-117691},
  doi =		{10.4230/LIPIcs.ITCS.2020.84},
  annote =	{Keywords: Fourier Analysis, Anomaly Detection, Algorithmic Fairness, Probability, Unsupervised Learning}
}
Document
Degree and Sensitivity: Tails of Two Distributions

Authors: Parikshit Gopalan, Rocco A. Servedio, and Avi Wigderson

Published in: LIPIcs, Volume 50, 31st Conference on Computational Complexity (CCC 2016)


Abstract
The sensitivity of a Boolean function f is the maximum, over all inputs x, of the number of sensitive coordinates of x (namely the number of Hamming neighbors of x with different f-value). The well-known sensitivity conjecture of Nisan (see also Nisan and Szegedy) states that every sensitivity-s Boolean function can be computed by a polynomial over the reals of degree s^{O(1)}. The best known upper bounds on degree, however, are exponential rather than polynomial in s. Our main result is an approximate version of the conjecture: every Boolean function with sensitivity s can be eps-approximated (in l_2) by a polynomial whose degree is s * polylog(1/eps). This is the first improvement on the folklore bound of s/eps. We prove this via a new "switching lemma for low-sensitivity functions" which establishes that a random restriction of a low-sensitivity function is very likely to have low decision tree depth. This is analogous to the well-known switching lemma for AC^0 circuits. Our proof analyzes the combinatorial structure of the graph G_f of sensitive edges of a Boolean function f. Understanding the structure of this graph is of independent interest as a means of understanding Boolean functions. We propose several new complexity measures for Boolean functions based on this graph, including tree sensitivity and component dimension, which may be viewed as relaxations of worst-case sensitivity, and we introduce some new techniques, such as proper walks and shifting, to analyze these measures. We use these notions to show that the graph of a function of full degree must be sufficiently complex, and that random restrictions of low-sensitivity functions are unlikely to lead to such complex graphs. We postulate a robust analogue of the sensitivity conjecture: if most inputs to a Boolean function f have low sensitivity, then most of the Fourier mass of f is concentrated on small subsets. We prove a lower bound on tree sensitivity in terms of decision tree depth, and show that a polynomial strengthening of this lower bound implies the robust conjecture. We feel that studying the graph G_f is interesting in its own right, and we hope that some of the notions and techniques we introduce in this work will be of use in its further study.

Cite as

Parikshit Gopalan, Rocco A. Servedio, and Avi Wigderson. Degree and Sensitivity: Tails of Two Distributions. In 31st Conference on Computational Complexity (CCC 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 50, pp. 13:1-13:23, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016)


Copy BibTex To Clipboard

@InProceedings{gopalan_et_al:LIPIcs.CCC.2016.13,
  author =	{Gopalan, Parikshit and Servedio, Rocco A. and Wigderson, Avi},
  title =	{{Degree and Sensitivity: Tails of Two Distributions}},
  booktitle =	{31st Conference on Computational Complexity (CCC 2016)},
  pages =	{13:1--13:23},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-008-8},
  ISSN =	{1868-8969},
  year =	{2016},
  volume =	{50},
  editor =	{Raz, Ran},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.CCC.2016.13},
  URN =		{urn:nbn:de:0030-drops-58488},
  doi =		{10.4230/LIPIcs.CCC.2016.13},
  annote =	{Keywords: Boolean functions, random restrictions, Fourier analysis}
}
  • Refine by Author
  • 5 Gopalan, Parikshit
  • 3 Wieder, Udi
  • 2 Hu, Lunjia
  • 2 Kalai, Adam Tauman
  • 2 Kim, Michael P.
  • Show More...

  • Refine by Classification
  • 2 Theory of computation → Theory and algorithms for application domains
  • 1 Computing methodologies → Machine learning approaches
  • 1 Computing methodologies → Neural networks
  • 1 Mathematics of computing → Probabilistic algorithms
  • 1 Theory of computation → Complexity classes
  • Show More...

  • Refine by Keyword
  • 1 Algorithmic Fairness
  • 1 Anomaly Detection
  • 1 Boolean functions
  • 1 Fourier Analysis
  • 1 Fourier analysis
  • Show More...

  • Refine by Type
  • 7 document

  • Refine by Publication Year
  • 2 2020
  • 2 2023
  • 1 2016
  • 1 2022
  • 1 2024

Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail