DROPS

Document

DOI: 10.4230/LIPIcs.FORC.2024.5

Distribution-Specific Auditing for Subgroup Fairness

Authors: Daniel Hsu, Jizhou Huang, and Brendan Juba

Published in: LIPIcs, Volume 295, 5th Symposium on Foundations of Responsible Computing (FORC 2024)

Abstract

We study the problem of auditing classifiers for statistical subgroup fairness. Kearns et al. [Kearns et al., 2018] showed that the problem of auditing combinatorial subgroups fairness is as hard as agnostic learning. Essentially all work on remedying statistical measures of discrimination against subgroups assumes access to an oracle for this problem, despite the fact that no efficient algorithms are known for it. If we assume the data distribution is Gaussian, or even merely log-concave, then a recent line of work has discovered efficient agnostic learning algorithms for halfspaces. Unfortunately, the reduction of Kearns et al. was formulated in terms of weak, "distribution-free" learning, and thus did not establish a connection for families such as log-concave distributions. In this work, we give positive and negative results on auditing for Gaussian distributions: On the positive side, we present an alternative approach to leverage these advances in agnostic learning and thereby obtain the first polynomial-time approximation scheme (PTAS) for auditing nontrivial combinatorial subgroup fairness: we show how to audit statistical notions of fairness over homogeneous halfspace subgroups when the features are Gaussian. On the negative side, we find that under cryptographic assumptions, no polynomial-time algorithm can guarantee any nontrivial auditing, even under Gaussian feature distributions, for general halfspace subgroups.

Cite as

Daniel Hsu, Jizhou Huang, and Brendan Juba. Distribution-Specific Auditing for Subgroup Fairness. In 5th Symposium on Foundations of Responsible Computing (FORC 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 295, pp. 5:1-5:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Copy BibTex To Clipboard

@InProceedings{hsu_et_al:LIPIcs.FORC.2024.5,
  author =	{Hsu, Daniel and Huang, Jizhou and Juba, Brendan},
  title =	{{Distribution-Specific Auditing for Subgroup Fairness}},
  booktitle =	{5th Symposium on Foundations of Responsible Computing (FORC 2024)},
  pages =	{5:1--5:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-319-5},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{295},
  editor =	{Rothblum, Guy N.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.FORC.2024.5},
  URN =		{urn:nbn:de:0030-drops-200882},
  doi =		{10.4230/LIPIcs.FORC.2024.5},
  annote =	{Keywords: Fairness auditing, agnostic learning, intractability}
}

Document

DOI: 10.4230/LIPIcs.ITCS.2017.45

Conditional Sparse Linear Regression

Authors: Brendan Juba

Published in: LIPIcs, Volume 67, 8th Innovations in Theoretical Computer Science Conference (ITCS 2017)

Abstract

Machine learning and statistics typically focus on building models that capture the vast majority of the data, possibly ignoring a small subset of data as "noise" or "outliers." By contrast, here we consider the problem of jointly identifying a significant (but perhaps small) segment of a population in which there is a highly sparse linear regression fit, together with the coefficients for the linear fit. We contend that such tasks are of interest both because the models themselves may be able to achieve better predictions in such special cases, but also because they may aid our understanding of the data. We give algorithms for such problems under the sup norm, when this unknown segment of the population is described by a k-DNF condition and the regression fit is s-sparse for constant k and s. For the variants of this problem when the regression fit is not so sparse or using expected error, we also give a preliminary algorithm and highlight the question as a challenge for future work.

Cite as

Brendan Juba. Conditional Sparse Linear Regression. In 8th Innovations in Theoretical Computer Science Conference (ITCS 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 67, pp. 45:1-45:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)

Copy BibTex To Clipboard

@InProceedings{juba:LIPIcs.ITCS.2017.45,
  author =	{Juba, Brendan},
  title =	{{Conditional Sparse Linear Regression}},
  booktitle =	{8th Innovations in Theoretical Computer Science Conference (ITCS 2017)},
  pages =	{45:1--45:14},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-029-3},
  ISSN =	{1868-8969},
  year =	{2017},
  volume =	{67},
  editor =	{Papadimitriou, Christos H.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2017.45},
  URN =		{urn:nbn:de:0030-drops-81518},
  doi =		{10.4230/LIPIcs.ITCS.2017.45},
  annote =	{Keywords: linear regression, conditional regression, conditional distribution search}
}

Document

DOI: 10.4230/LIPIcs.ICALP.2016.35

AC^0 o MOD_2 Lower Bounds for the Boolean Inner Product

Authors: Mahdi Cheraghchi, Elena Grigorescu, Brendan Juba, Karl Wimmer, and Ning Xie

Published in: LIPIcs, Volume 55, 43rd International Colloquium on Automata, Languages, and Programming (ICALP 2016)

Abstract

AC^0 o MOD_2 circuits are AC^0 circuits augmented with a layer of parity gates just above the input layer. We study AC^0 o MOD2 circuit lower bounds for computing the Boolean Inner Product functions. Recent works by Servedio and Viola (ECCC TR12-144) and Akavia et al. (ITCS 2014) have highlighted this problem as a frontier problem in circuit complexity that arose both as a first step towards solving natural special cases of the matrix rigidity problem and as a candidate for constructing pseudorandom generators of minimal complexity. We give the first superlinear lower bound for the Boolean Inner Product function against AC^0 o MOD2 of depth four or greater. Specifically, we prove a superlinear lower bound for circuits of arbitrary constant depth, and an ~Omega(n^2) lower bound for the special case of depth-4 AC^0 o MOD_2. Our proof of the depth-4 lower bound employs a new "moment-matching" inequality for bounded, nonnegative integer-valued random variables that may be of independent interest: we prove an optimal bound on the maximum difference between two discrete distributions’ values at 0, given that their first d moments match.

Cite as

Mahdi Cheraghchi, Elena Grigorescu, Brendan Juba, Karl Wimmer, and Ning Xie. AC^0 o MOD_2 Lower Bounds for the Boolean Inner Product. In 43rd International Colloquium on Automata, Languages, and Programming (ICALP 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 55, pp. 35:1-35:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016)

Copy BibTex To Clipboard

@InProceedings{cheraghchi_et_al:LIPIcs.ICALP.2016.35,
  author =	{Cheraghchi, Mahdi and Grigorescu, Elena and Juba, Brendan and Wimmer, Karl and Xie, Ning},
  title =	{{AC^0 o MOD\underline2 Lower Bounds for the Boolean Inner Product}},
  booktitle =	{43rd International Colloquium on Automata, Languages, and Programming (ICALP 2016)},
  pages =	{35:1--35:14},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-013-2},
  ISSN =	{1868-8969},
  year =	{2016},
  volume =	{55},
  editor =	{Chatzigiannakis, Ioannis and Mitzenmacher, Michael and Rabani, Yuval and Sangiorgi, Davide},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2016.35},
  URN =		{urn:nbn:de:0030-drops-63150},
  doi =		{10.4230/LIPIcs.ICALP.2016.35},
  annote =	{Keywords: Boolean analysis, circuit complexity, lower bounds}
}

Search Results

Documents authored by Juba, Brendan

Distribution-Specific Auditing for Subgroup Fairness

Abstract

Cite as

Conditional Sparse Linear Regression

Abstract

Cite as

AC^0 o MOD_2 Lower Bounds for the Boolean Inner Product

Abstract

Cite as