Database Repair Meets Algorithmic Fairness

Published: 04 September 2020

Abstract

Fairness is increasingly recognized as a critical component of machine learning systems. However, it is the underlying data on which these systems are trained that often reflect discrimination, suggesting a database repair problem. Existing treatments of fairness rely on statistical correlations that can be fooled by anomalies, such as Simpson's paradox. Proposals for causality-based definitions of fairness can correctly model some of these situations, but they rely on background knowledge of the underlying causal models. In this paper, we formalize the situation as a database repair problem, proving sufficient conditions for fair classifiers in terms of admissible variables as opposed to a complete causal model. We show that these conditions correctly capture subtle fairness violations. We then use these conditions as the basis for database repair algorithms that provide provable fairness guarantees about classifiers trained on their training labels. We demonstrate the effectiveness of our proposed techniques with experimental results.
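The repair idea in the abstract can be illustrated with a toy sketch. This is not the paper's algorithm; it is a minimal, assumption-laden illustration in which rows are dicts with a hypothetical protected attribute `S`, a single admissible variable `A`, and a binary label `Y`. Within each stratum of the admissible variable, it flips the minimum number of labels needed so that the positive rate P(Y=1 | S, A) is the same for every protected group, which is one crude way to make the label conditionally independent of the protected attribute given the admissible variable:

```python
from collections import defaultdict

def repair_labels(rows, protected="S", admissible="A", label="Y"):
    """Toy label repair (an illustration, NOT the paper's algorithm):
    within each stratum of the admissible variable, flip the minimum
    number of labels so the positive rate P(Y=1 | S, A) equals the
    stratum-wide rate for every protected group."""
    rows = [dict(r) for r in rows]  # work on a copy; input is left intact
    strata = defaultdict(list)
    for i, r in enumerate(rows):
        strata[r[admissible]].append(i)
    for idx in strata.values():
        # stratum-wide positive rate is the common target for all groups
        target = sum(rows[i][label] for i in idx) / len(idx)
        groups = defaultdict(list)
        for i in idx:
            groups[rows[i][protected]].append(i)
        for members in groups.values():
            want = round(target * len(members))
            pos = [i for i in members if rows[i][label] == 1]
            neg = [i for i in members if rows[i][label] == 0]
            for i in pos[: max(0, len(pos) - want)]:  # surplus positives -> 0
                rows[i][label] = 0
            for i in neg[: max(0, want - len(pos))]:  # deficit -> flip negatives to 1
                rows[i][label] = 1
    return rows
```

For example, on a stratum where every S=0 row is labeled 1 and every S=1 row is labeled 0, the sketch flips half the labels in each group so both groups end at the stratum rate of 0.5. The actual techniques in the paper operate on the database itself with provable guarantees; this sketch only conveys the shape of the repair objective.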



  • Published in

    ACM SIGMOD Record, Volume 49, Issue 1
    March 2020
    72 pages
    ISSN: 0163-5808
    DOI: 10.1145/3422648

    Copyright © 2020. Copyright is held by the owner/author(s).

    Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

    Publisher

    Association for Computing Machinery, New York, NY, United States

    Publication History

    • Published: 4 September 2020

    Qualifiers

    • research-article
