Search Results

Documents authored by Steinhardt, Jacob


Document
Agnostic Learning with Unknown Utilities

Authors: Kush Bhatia, Peter L. Bartlett, Anca D. Dragan, and Jacob Steinhardt

Published in: LIPIcs, Volume 185, 12th Innovations in Theoretical Computer Science Conference (ITCS 2021)


Abstract
Traditional learning approaches for classification implicitly assume that each mistake has the same cost. In many real-world problems though, the utility of a decision depends on the underlying context x and decision y; for instance, misclassifying a stop sign is worse than misclassifying a road-side postbox. However, directly incorporating these utilities into the learning objective is often infeasible since these can be quite complex and difficult for humans to specify. We formally study this as agnostic learning with unknown utilities: given a dataset S = {x_1, …, x_n} where each data point x_i ∼ 𝒟_x from some unknown distribution 𝒟_x, the objective of the learner is to output a function f in some class of decision functions ℱ with small excess risk. This risk measures the performance of the output predictor f with respect to the best predictor in the class ℱ on the unknown underlying utility u^*:𝒳×𝒴↦ [0,1]. This utility u^* is not assumed to have any specific structure and is allowed to be any bounded function. This raises an interesting question whether learning is even possible in our setup, given that obtaining a generalizable estimate of utility u^* might not be possible from finitely many samples. Surprisingly, we show that estimating the utilities of only the sampled points S suffices to learn a decision function which generalizes well. With this insight, we study mechanisms for eliciting information from human experts which allow a learner to estimate the utilities u^* on the set S. While humans find it difficult to directly provide utility values reliably, it is often easier for them to provide comparison feedback based on these utilities. We show that, unlike in the realizable setup, the vanilla comparison queries where humans compare a pair of decisions for a single input x are insufficient. We introduce a family of elicitation mechanisms by generalizing comparisons, called the k-comparison oracle, which enables the learner to ask for comparisons across k different inputs x at once. We show that the excess risk in our agnostic learning framework decreases at a rate of O (1/k) with such queries. This result brings out an interesting accuracy-elicitation trade-off - as the order k of the oracle increases, the comparative queries become harder to elicit from humans but allow for more accurate learning.

Cite as

Kush Bhatia, Peter L. Bartlett, Anca D. Dragan, and Jacob Steinhardt. Agnostic Learning with Unknown Utilities. In 12th Innovations in Theoretical Computer Science Conference (ITCS 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 185, pp. 55:1-55:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


Copy BibTex To Clipboard

@InProceedings{bhatia_et_al:LIPIcs.ITCS.2021.55,
  author =	{Bhatia, Kush and Bartlett, Peter L. and Dragan, Anca D. and Steinhardt, Jacob},
  title =	{{Agnostic Learning with Unknown Utilities}},
  booktitle =	{12th Innovations in Theoretical Computer Science Conference (ITCS 2021)},
  pages =	{55:1--55:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-177-1},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{185},
  editor =	{Lee, James R.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2021.55},
  URN =		{urn:nbn:de:0030-drops-135947},
  doi =		{10.4230/LIPIcs.ITCS.2021.55},
  annote =	{Keywords: agnostic learning, learning by comparisons, utility estimation, active learning}
}
Document
Resilience: A Criterion for Learning in the Presence of Arbitrary Outliers

Authors: Jacob Steinhardt, Moses Charikar, and Gregory Valiant

Published in: LIPIcs, Volume 94, 9th Innovations in Theoretical Computer Science Conference (ITCS 2018)


Abstract
We introduce a criterion, resilience, which allows properties of a dataset (such as its mean or best low rank approximation) to be robustly computed, even in the presence of a large fraction of arbitrary additional data. Resilience is a weaker condition than most other properties considered so far in the literature, and yet enables robust estimation in a broader variety of settings. We provide new information-theoretic results on robust distribution learning, robust estimation of stochastic block models, and robust mean estimation under bounded kth moments. We also provide new algorithmic results on robust distribution learning, as well as robust mean estimation in p-norms. Among our proof techniques is a method for pruning a high-dimensional distribution with bounded 1st moments to a stable "core" with bounded 2nd moments, which may be of independent interest.

Cite as

Jacob Steinhardt, Moses Charikar, and Gregory Valiant. Resilience: A Criterion for Learning in the Presence of Arbitrary Outliers. In 9th Innovations in Theoretical Computer Science Conference (ITCS 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 94, pp. 45:1-45:21, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)


Copy BibTex To Clipboard

@InProceedings{steinhardt_et_al:LIPIcs.ITCS.2018.45,
  author =	{Steinhardt, Jacob and Charikar, Moses and Valiant, Gregory},
  title =	{{Resilience: A Criterion for Learning in the Presence of Arbitrary Outliers}},
  booktitle =	{9th Innovations in Theoretical Computer Science Conference (ITCS 2018)},
  pages =	{45:1--45:21},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-060-6},
  ISSN =	{1868-8969},
  year =	{2018},
  volume =	{94},
  editor =	{Karlin, Anna R.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2018.45},
  URN =		{urn:nbn:de:0030-drops-83687},
  doi =		{10.4230/LIPIcs.ITCS.2018.45},
  annote =	{Keywords: robust learning, outliers, stochastic block models, p-norm estimation}
}
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail