DOI: 10.1145/2939672.2945386
tutorial

Algorithmic Bias: From Discrimination Discovery to Fairness-aware Data Mining

Published: 13 August 2016

ABSTRACT

Algorithms and decision making based on Big Data have become pervasive in all aspects of our daily lives (offline and online), as they have become essential tools in personal finance, health care, hiring, housing, education, and policies. It is therefore of societal and ethical importance to ask whether these algorithms can be discriminatory on grounds such as gender, ethnicity, or health status. It turns out that the answer is positive: for instance, recent studies in the context of online advertising show that ads for high-income jobs are presented to men much more often than to women [Datta et al., 2015]; and ads for arrest records are significantly more likely to show up on searches for distinctively black names [Sweeney, 2013]. This algorithmic bias exists even when the developer of the algorithm has no intention to discriminate. Sometimes it may be inherent to the data sources used (software making decisions based on data can reflect, or even amplify, the results of historical discrimination), but even when the sensitive attributes have been suppressed from the input, a well-trained machine learning algorithm may still discriminate on the basis of such sensitive attributes because of correlations existing in the data. These considerations call for the development of data mining systems which are discrimination-conscious by design. This is a novel and challenging research area for the data mining community.
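The point about suppressed sensitive attributes can be illustrated with a small sketch. It is not taken from the tutorial: the data are made up, and the names `build_records`, `train_on_proxy_only`, and `positive_rate_by_group`, as well as the "zone" feature, are hypothetical. The toy model never sees the sensitive attribute, only a correlated proxy, yet its favorable predictions still split sharply along group lines.

```python
from collections import Counter, defaultdict


def build_records():
    """Return made-up (group, zone, label) records.

    group: sensitive attribute ("A" or "B"), never shown to the model.
    zone:  a proxy feature correlated with group (hypothetical postal zone).
    label: historical decision (1 = favorable), skewed against group B.
    """
    return (
        [("A", 0, 1)] * 80 + [("A", 0, 0)] * 10
        + [("A", 1, 1)] * 3 + [("A", 1, 0)] * 7
        + [("B", 0, 1)] * 8 + [("B", 0, 0)] * 2
        + [("B", 1, 1)] * 20 + [("B", 1, 0)] * 70
    )


def train_on_proxy_only(records):
    """Learn the majority label per zone; the group column is ignored."""
    votes = defaultdict(Counter)
    for _group, zone, label in records:
        votes[zone][label] += 1
    return {zone: counts.most_common(1)[0][0] for zone, counts in votes.items()}


def positive_rate_by_group(records, model):
    """Fraction of each group that receives a favorable prediction."""
    totals, positives = Counter(), Counter()
    for group, zone, _label in records:
        totals[group] += 1
        positives[group] += model[zone]
    return {group: positives[group] / totals[group] for group in totals}


if __name__ == "__main__":
    records = build_records()
    model = train_on_proxy_only(records)
    for group, rate in sorted(positive_rate_by_group(records, model).items()):
        print(f"group {group}: favorable prediction rate {rate:.2f}")
    # group A: favorable prediction rate 0.90
    # group B: favorable prediction rate 0.10
```

In this assumed setup, removing the group column changes nothing, because the zone feature carries almost the same information.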

The aim of this tutorial is to survey algorithmic bias, presenting its most common variants, with an emphasis on the algorithmic techniques and key ideas developed to derive efficient solutions. The tutorial covers two main complementary approaches: algorithms for discrimination discovery and discrimination prevention by means of fairness-aware data mining. We conclude by summarizing promising paths for future research.


Supplemental Material

  • kdd2016_tutorial_algorithmic_bias_01-acm.mp4 (mp4, 565.9 MB)
  • kdd2016_tutorial_algorithmic_bias_02-acm.mp4 (mp4, 1.2 GB)
  • kdd2016_tutorial_algorithmic_bias_03-acm.mp4 (mp4, 1.3 GB)

References

  1. S. Barocas and A. D. Selbst. Big data's disparate impact. SSRN Pre-Print 2477899, 2014.
  2. F. Bonchi, S. Hajian, B. Mishra, and D. Ramazzotti. Exposing the probabilistic causal structure of discrimination. arXiv:1510.00552, 2015.
  3. T. Calders and S. Verwer. Three naive bayes approaches for discrimination-free classification. DMKD, 21(2):277--292, 2010.
  4. B. Custers, T. Calders, B. Schermer, and T. Zarsky, editors. Discrimination and Privacy in the Information Society. Springer, 2013.
  5. A. Datta, M. C. Tschantz, and A. Datta. Automated experiments on ad privacy settings. Proc. Privacy Enhancing Technologies, 2015(1):92--112, 2015.
  6. C. Dwork, M. Hardt, T. Pitassi, O. Reingold, and R. Zemel. Fairness through awareness. In ITCS, 2012.
  7. M. Feldman, S. A. Friedler, J. Moeller, C. Scheidegger, and S. Venkatasubramanian. Certifying and removing disparate impact. In KDD, 2015.
  8. S. Hajian and J. Domingo-Ferrer. A methodology for direct and indirect discrimination prevention in data mining. TKDE, 25(7):1445--1459, 2013.
  9. S. Hajian, J. Domingo-Ferrer, and O. Farras. Generalization-based privacy preservation and discrimination prevention in data publishing and mining. DMKD, 28(5-6):1158--1188, 2014.
  10. S. Hajian, J. Domingo-Ferrer, A. Monreale, D. Pedreschi, and F. Giannotti. Discrimination- and privacy-aware patterns. DMKD, 29(6):1733--1782, 2015.
  11. F. Kamiran and T. Calders. Data preprocessing techniques for classification without discrimination. KAIS, 33(1):1--33, 2012.
  12. B. T. Luong, S. Ruggieri, and F. Turini. k-nn as an implementation of situation testing for discrimination discovery and prevention. In KDD, 2011.
  13. D. Pedreshi, S. Ruggieri, and F. Turini. Discrimination-aware data mining. In KDD, 2008.
  14. S. Ruggieri, S. Hajian, F. Kamiran, and X. Zhang. Anti-discrimination analysis using privacy attack strategies. In ECML-PKDD, 2014.
  15. N. Savage. When computers stand in the schoolhouse door. Commun. ACM, 59(3):19--21, Feb. 2016.
  16. L. Sweeney. Discrimination in online ad delivery. Queue, 11(3):10, 2013.

    • Published in

      KDD '16: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
      August 2016
      2176 pages
      ISBN: 9781450342322
      DOI: 10.1145/2939672

      Copyright © 2016 Owner/Author

      Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Acceptance Rates

      KDD '16 Paper Acceptance Rate: 66 of 1,115 submissions, 6%
      Overall Acceptance Rate: 1,133 of 8,635 submissions, 13%
