Abstract
Fairness is increasingly recognized as a critical component of machine learning systems. However, it is the underlying data on which these systems are trained that often reflects discrimination, suggesting a database repair problem. Existing treatments of fairness rely on statistical correlations that can be fooled by anomalies such as Simpson's paradox. Proposals for causality-based definitions of fairness can correctly model some of these situations, but they rely on background knowledge of the underlying causal model. In this paper, we formalize the situation as a database repair problem, proving sufficient conditions for fair classifiers in terms of admissible variables rather than a complete causal model. We show that these conditions correctly capture subtle fairness violations. We then use these conditions as the basis for database repair algorithms that provide provable fairness guarantees for classifiers trained on the repaired data. We demonstrate the effectiveness of our proposed techniques with experimental results.
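The repair algorithms in the paper are more involved, but the core idea of the sufficient condition, making the training label independent of the sensitive attribute conditioned on the admissible attributes, can be illustrated with a minimal sketch. The function name `repair` and the field names `a`, `s`, `y` are hypothetical; this toy version simply drops tuples until, within each admissible stratum, every sensitive group has the stratum-wide label rate:

```python
from collections import defaultdict

def repair(rows, admissible, sensitive, label):
    """Drop tuples so that, within each stratum of the admissible
    attributes, the positive-label rate is (approximately) equal
    across all values of the sensitive attribute, i.e. so that
    label and sensitive attribute are conditionally independent
    given the admissible attributes."""
    strata = defaultdict(list)
    for row in rows:
        strata[tuple(row[a] for a in admissible)].append(row)
    repaired = []
    for stratum_rows in strata.values():
        # Stratum-wide target rate P(label = 1 | admissible stratum).
        target = sum(r[label] for r in stratum_rows) / len(stratum_rows)
        groups = defaultdict(list)
        for r in stratum_rows:
            groups[r[sensitive]].append(r)
        for g in groups.values():
            pos = [r for r in g if r[label] == 1]
            neg = [r for r in g if r[label] == 0]
            if target in (0.0, 1.0):
                repaired.extend(pos if target == 1.0 else neg)
            elif len(pos) / len(g) > target:
                # Too many positives in this group: drop the excess positives.
                keep_p = int(round(target * len(neg) / (1 - target)))
                repaired.extend(pos[:keep_p] + neg)
            else:
                # Too many negatives: drop the excess negatives.
                keep_n = int(round(len(pos) * (1 - target) / target))
                repaired.extend(pos + neg[:keep_n])
    return repaired
```

This deletion-only sketch trades away data to equalize label rates per stratum; the paper's algorithms instead compute minimal repairs with formal guarantees, which this illustration does not provide.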