Abstract
Fairness is increasingly recognized as a critical component of machine learning systems. However, it is the underlying data on which these systems are trained that often reflects discrimination, suggesting a database repair problem. Existing treatments of fairness rely on statistical correlations that can be fooled by anomalies such as Simpson's paradox. Proposals for causality-based definitions of fairness can correctly model some of these situations, but they rely on background knowledge of the underlying causal model. In this paper, we formalize the situation as a database repair problem, proving sufficient conditions for fair classifiers in terms of admissible variables rather than a complete causal model. We show that these conditions correctly capture subtle fairness violations. We then use these conditions as the basis for database repair algorithms that provide provable fairness guarantees for classifiers trained on the repaired data. We demonstrate the effectiveness of our proposed techniques with experimental results.
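The repair algorithms in the paper are more involved, but the core idea of the sufficient condition, making the training label independent of the sensitive attribute conditioned on the admissible attributes, can be illustrated with a minimal sketch. The function name `repair` and the field names `a`, `s`, `y` are hypothetical; this toy version simply drops tuples until, within each admissible stratum, every sensitive group has the stratum-wide label rate:

```python
from collections import defaultdict

def repair(rows, admissible, sensitive, label):
    """Drop tuples so that, within each stratum of the admissible
    attributes, the positive-label rate is (approximately) equal
    across all values of the sensitive attribute, i.e. so that
    label and sensitive attribute are conditionally independent
    given the admissible attributes."""
    strata = defaultdict(list)
    for row in rows:
        strata[tuple(row[a] for a in admissible)].append(row)
    repaired = []
    for stratum_rows in strata.values():
        # Stratum-wide target rate P(label = 1 | admissible stratum).
        target = sum(r[label] for r in stratum_rows) / len(stratum_rows)
        groups = defaultdict(list)
        for r in stratum_rows:
            groups[r[sensitive]].append(r)
        for g in groups.values():
            pos = [r for r in g if r[label] == 1]
            neg = [r for r in g if r[label] == 0]
            if target in (0.0, 1.0):
                repaired.extend(pos if target == 1.0 else neg)
            elif len(pos) / len(g) > target:
                # Too many positives in this group: drop the excess positives.
                keep_p = int(round(target * len(neg) / (1 - target)))
                repaired.extend(pos[:keep_p] + neg)
            else:
                # Too many negatives: drop the excess negatives.
                keep_n = int(round(len(pos) * (1 - target) / target))
                repaired.extend(pos + neg[:keep_n])
    return repaired
```

This deletion-only sketch trades away data to equalize label rates per stratum; the paper's algorithms instead compute minimal repairs with formal guarantees, which this illustration does not provide.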