Achieving Non-Discrimination in Data Release

Authors:
Lu Zhang, University of Arkansas, Fayetteville, AR, USA
Yongkai Wu, University of Arkansas, Fayetteville, AR, USA
Xintao Wu, University of Arkansas, Fayetteville, AR, USA

KDD '17: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, August 2017, Pages 1335–1344
https://doi.org/10.1145/3097983.3098167

Published: 13 August 2017

ABSTRACT

Discrimination discovery and discrimination prevention (or removal) are increasingly important tasks in data mining. Discrimination discovery aims to unveil discriminatory practices with respect to a protected attribute (e.g., gender) by analyzing a dataset of historical decision records, while discrimination prevention aims to remove discrimination by modifying the biased data before predictive analysis is conducted. In this paper, we show that the key to both discovery and prevention is finding the meaningful partitions that provide quantitative evidence for judging discrimination. With the support of a causal graph, we present a graphical condition for identifying a meaningful partition. Based on this condition, we develop a simple criterion for claiming non-discrimination and propose discrimination removal algorithms that accurately remove discrimination while retaining good data utility. Experiments on real datasets show the effectiveness of our approaches.
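
To make the partition idea concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a binary protected attribute and a binary decision, assumes the partitioning attributes are supplied by the caller (the paper identifies them through a graphical condition on the causal graph, which is not reproduced here), and uses the difference in positive-decision rates within each partition as the quantitative evidence. The function names, the toy records, and the threshold `tau` are all hypothetical.

```python
from collections import defaultdict


def partition_rate_gaps(records, protected, decision, partition_attrs):
    """For each partition, return P(decision=1 | protected=1) - P(decision=1 | protected=0).

    Assumes `protected` and `decision` columns hold 0/1 values.
    """
    # counts[partition_key][protected_value] = [positive decisions, total records]
    counts = defaultdict(lambda: {0: [0, 0], 1: [0, 0]})
    for row in records:
        key = tuple(row[a] for a in partition_attrs)
        cell = counts[key][row[protected]]
        cell[0] += row[decision]
        cell[1] += 1

    gaps = {}
    for key, groups in counts.items():
        (pos0, n0), (pos1, n1) = groups[0], groups[1]
        if n0 == 0 or n1 == 0:
            continue  # the gap is undefined when one group is absent from a partition
        gaps[key] = pos1 / n1 - pos0 / n0
    return gaps


def is_non_discriminatory(gaps, tau=0.05):
    """Hypothetical criterion: every partition's absolute gap stays below tau."""
    return all(abs(gap) < tau for gap in gaps.values())


# Toy decision records (invented for illustration only).
records = [
    {"gender": 1, "major": "cs", "admit": 1},
    {"gender": 1, "major": "cs", "admit": 0},
    {"gender": 0, "major": "cs", "admit": 1},
    {"gender": 0, "major": "cs", "admit": 0},
    {"gender": 1, "major": "art", "admit": 1},
    {"gender": 1, "major": "art", "admit": 0},
    {"gender": 0, "major": "art", "admit": 1},
    {"gender": 0, "major": "art", "admit": 1},
]

gaps = partition_rate_gaps(records, "gender", "admit", ["major"])
print(gaps)
print(is_non_discriminatory(gaps))
```

In this toy data the admission rates match within the "cs" partition but differ within the "art" partition, so the hypothetical check fails. Which attributes form a meaningful partition is exactly the question the paper addresses with its graphical condition; the sketch simply takes that choice as given.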

Published in
KDD '17: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
August 2017
2240 pages
ISBN:9781450348874
DOI:10.1145/3097983
General Chairs: Stan Matwin (Dalhousie University), Shipeng Yu (LinkedIn), Faisal Farooq (IBM)
Copyright © 2017 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
causal graph
discrimination discovery and removal
Qualifiers
- research-article
Conference
Acceptance Rates
KDD '17 Paper Acceptance Rate: 64 of 748 submissions, 9%
Overall Acceptance Rate: 1,133 of 8,635 submissions, 13%