ABSTRACT
Machine learning bias and fairness have recently emerged as key issues due to the pervasive deployment of data-driven decision making in a variety of sectors and services. It has often been argued that unfair classifications can be attributed to bias in training data, but previous attempts to 'repair' training data have had limited success. To circumvent shortcomings prevalent in data-repairing approaches, such as those that weight training samples of the sensitive group (e.g. gender, race, financial status) based on their misclassification error, we present a process that iteratively adapts training sample weights with a theoretically grounded model. This model addresses different kinds of bias to better achieve fairness objectives, such as trade-offs between accuracy and disparate impact elimination or disparate mistreatment elimination. We show that, compared to previous fairness-aware approaches, our methodology achieves better or similar trade-offs between accuracy and unfairness mitigation on real-world and synthetic datasets.
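The two unfairness notions named in the abstract can be made concrete with a small sketch. The code below is not the authors' implementation: it assumes a binary sensitive attribute (1 = sensitive group, 0 = everyone else) and binary labels and predictions, and the function names are hypothetical. Disparate impact is measured here as the ratio of positive prediction rates between groups, and disparate mistreatment as the gaps in false positive and false negative rates between groups.

```python
from typing import Sequence, Tuple


def positive_rate(preds: Sequence[int], mask: Sequence[bool]) -> float:
    """Fraction of positive predictions among the samples selected by mask."""
    selected = [p for p, m in zip(preds, mask) if m]
    return sum(selected) / len(selected) if selected else 0.0


def disparate_impact_ratio(preds: Sequence[int], sensitive: Sequence[int]) -> float:
    """Ratio of group positive rates, in [0, 1]; 1.0 means equal rates."""
    r_s = positive_rate(preds, [s == 1 for s in sensitive])
    r_n = positive_rate(preds, [s == 0 for s in sensitive])
    if r_s == 0.0 or r_n == 0.0:
        # Both rates zero counts as parity; otherwise one group gets nothing.
        return 1.0 if r_s == r_n else 0.0
    return min(r_s / r_n, r_n / r_s)


def error_rate_gaps(
    labels: Sequence[int], preds: Sequence[int], sensitive: Sequence[int]
) -> Tuple[float, float]:
    """Absolute between-group gaps in (false positive rate, false negative rate)."""

    def rates(group: int) -> Tuple[float, float]:
        fp = fn = neg = pos = 0
        for y, p, s in zip(labels, preds, sensitive):
            if s != group:
                continue
            if y == 0:
                neg += 1
                fp += int(p == 1)
            else:
                pos += 1
                fn += int(p == 0)
        return (fp / neg if neg else 0.0, fn / pos if pos else 0.0)

    fpr_s, fnr_s = rates(1)
    fpr_n, fnr_n = rates(0)
    return abs(fpr_s - fpr_n), abs(fnr_s - fnr_n)


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    labels = [1, 1, 0, 0, 1, 1, 0, 0]
    preds = [1, 0, 0, 0, 1, 1, 1, 0]
    sensitive = [1, 1, 1, 1, 0, 0, 0, 0]
    print(disparate_impact_ratio(preds, sensitive))
    print(error_rate_gaps(labels, preds, sensitive))
```

A reweighting scheme of the kind the abstract describes would retrain a classifier under updated sample weights and re-evaluate measures like these alongside accuracy at each iteration; the weight update rule itself is not given in this section, so it is left out of the sketch.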
Index Terms
- Adaptive Sensitive Reweighting to Mitigate Bias in Fairness-aware Classification