We evaluate the folk wisdom that algorithmic decision rules trained on data produced by biased human decision-makers necessarily reflect this bias. We consider a setting where training labels are generated only if a biased decision-maker takes a particular action, so "biased" training data arise through discriminatory selection into the dataset. In our baseline model, the more biased the decision-maker is against a group, the more the algorithmic decision rule favors that group. We refer to this phenomenon as bias reversal. We then clarify the conditions that give rise to bias reversal: whether a prediction algorithm reverses or inherits bias depends critically on how the decision-maker affects the training data and on the label used in training. We illustrate our main theoretical results in a simulation study applied to the New York City Stop, Question and Frisk dataset.
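
To make the selective-labels mechanism concrete, the following is a minimal simulation sketch (not the authors' code) under stylized assumptions: the decision-maker observes a private signal v that the algorithm never sees, applies a lower stop threshold to group B (bias against B), and the outcome label is observed only for stopped individuals. All parameter values are hypothetical.

# Minimal sketch of bias reversal under selective labels (hypothetical parameters).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 200_000

# Group indicator: 0 = A, 1 = B; both groups have the same true risk distribution.
g = rng.integers(0, 2, size=n)
x = rng.normal(size=n)          # covariate recorded in the training data
v = rng.normal(size=n)          # private signal seen only by the decision-maker

# True risk and realized outcome (e.g., contraband found if stopped).
p = 1 / (1 + np.exp(-(x + v - 1.0)))
y = rng.binomial(1, p)

# Biased decision-maker: stops anyone whose perceived risk x + v exceeds a
# group-specific threshold, with a lower threshold for group B.
threshold = np.where(g == 1, 0.5, 1.5)
stopped = (x + v) > threshold

# Selective labels: the outcome is observed only for stopped individuals.
X_train = np.column_stack([x[stopped], g[stopped]])
y_train = y[stopped]

clf = LogisticRegression().fit(X_train, y_train)
print("coefficient on group B indicator:", clf.coef_[0][1])
# The coefficient is negative: at the same x, the stopped members of group B
# include more marginal (low-v) individuals, so the trained rule predicts
# lower risk for group B -- the group the decision-maker was biased against.

In this stylized setting, making the decision-maker more biased against group B (lowering its threshold further) makes the group-B coefficient more negative, i.e., the learned rule favors group B more, which is the bias-reversal phenomenon described in the abstract.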
@InProceedings{rambachan_et_al:LIPIcs.FORC.2020.6,
author = {Rambachan, Ashesh and Roth, Jonathan},
title = {{Bias In, Bias Out? Evaluating the Folk Wisdom}},
booktitle = {1st Symposium on Foundations of Responsible Computing (FORC 2020)},
pages = {6:1--6:15},
series = {Leibniz International Proceedings in Informatics (LIPIcs)},
ISBN = {978-3-95977-142-9},
ISSN = {1868-8969},
year = {2020},
volume = {156},
editor = {Roth, Aaron},
publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
address = {Dagstuhl, Germany},
URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.FORC.2020.6},
URN = {urn:nbn:de:0030-drops-120225},
doi = {10.4230/LIPIcs.FORC.2020.6},
annote = {Keywords: fairness, selective labels, discrimination, training data}
}