ABSTRACT
Discrimination discovery and prevention (removal) are increasingly important tasks in data mining. Discrimination discovery aims to unveil discriminatory practices with respect to a protected attribute (e.g., gender) by analyzing a dataset of historical decision records, while discrimination prevention aims to remove discrimination by modifying the biased data before predictive analysis is conducted. In this paper, we show that the key to both tasks is to find the meaningful partitions that can provide quantitative evidence for judging discrimination. Using a causal graph, we present a graphical condition for identifying a meaningful partition. Based on that condition, we develop a simple criterion for claiming non-discrimination and propose discrimination removal algorithms that accurately remove discrimination while retaining good data utility. Experiments on real datasets show the effectiveness of our approaches.
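As a rough illustration of how a partition can supply quantitative evidence, the sketch below splits toy decision records into blocks and compares positive-decision rates between the two protected groups within each block. It is only one possible reading of the idea, not the paper's algorithm: the function names, the attribute names (`gender`, `major`, `admit`), the choice of partitioning attribute, and the threshold `tau` are all assumptions made for the example, and no causal graph is used to select the partition.

```python
from collections import defaultdict


def block_rate_gaps(records, protected, decision, partition_on):
    """Per block, return P(decision=1 | group 1) - P(decision=1 | group 0).

    Hypothetical helper: `protected` and `decision` name 0/1-valued fields,
    and `partition_on` lists the attributes whose values define a block.
    """
    # block -> group -> [positive decisions, total records]
    counts = defaultdict(lambda: {0: [0, 0], 1: [0, 0]})
    for row in records:
        block = tuple(row[attr] for attr in partition_on)
        cell = counts[block][row[protected]]
        cell[0] += row[decision]
        cell[1] += 1

    gaps = {}
    for block, groups in counts.items():
        (pos0, n0), (pos1, n1) = groups[0], groups[1]
        if n0 == 0 or n1 == 0:
            continue  # a block missing one group offers no comparison
        gaps[block] = pos1 / n1 - pos0 / n0
    return gaps


def is_non_discriminatory(gaps, tau=0.05):
    """Assumed criterion: every block's rate gap stays below tau in magnitude."""
    return all(abs(gap) < tau for gap in gaps.values())


# Toy records (fabricated purely for illustration).
records = [
    {"gender": 1, "major": "A", "admit": 1},
    {"gender": 1, "major": "A", "admit": 1},
    {"gender": 0, "major": "A", "admit": 1},
    {"gender": 0, "major": "A", "admit": 1},
    {"gender": 1, "major": "B", "admit": 0},
    {"gender": 1, "major": "B", "admit": 0},
    {"gender": 0, "major": "B", "admit": 0},
    {"gender": 0, "major": "B", "admit": 1},
]

gaps = block_rate_gaps(records, "gender", "admit", ["major"])
print(gaps)
print(is_non_discriminatory(gaps))
```

Which attributes belong in `partition_on` is exactly the question the abstract says the graphical condition answers; here the choice is hard-coded.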
Index Terms
- Achieving Non-Discrimination in Data Release