Published in: Data Mining and Knowledge Discovery 4/2017

31.03.2017

Measuring discrimination in algorithmic decision making

Written by: Indrė Žliobaitė


Abstract

Society increasingly relies on data-driven predictive models for automated decision making. Such models may systematically disadvantage people belonging to certain categories or groups instead of relying solely on individual merits, not by design but because of the nature and noisiness of observational data. This may happen even if the computing process is fair and well-intentioned. Discrimination-aware data mining studies how to make predictive models free from discrimination when the historical data on which they are built may be biased, incomplete, or even contain past discriminatory decisions. Discrimination-aware data mining is an emerging research discipline, and there is no firm consensus yet on how to measure the performance of algorithms. The goal of this survey is to review the various discrimination measures that have been used, analyze their performance analytically and computationally, and highlight the implications of using one measure or another. We also describe measures from other disciplines that have not yet been used for measuring discrimination but could potentially be suitable for this purpose. This survey is primarily intended for researchers in data mining and machine learning, as a step towards a unifying view of performance criteria for developing new algorithms for non-discriminatory predictive modeling. In addition, practitioners and policy makers could use this study when diagnosing potential discrimination by predictive models.


Footnotes
1
The code for our experiments is available at https://github.com/zliobaite/paper-fairml-survey.
 
Metadata
Title
Measuring discrimination in algorithmic decision making
Written by
Indrė Žliobaitė
Publication date
31.03.2017
Publisher
Springer US
Published in
Data Mining and Knowledge Discovery / Issue 4/2017
Print ISSN: 1384-5810
Electronic ISSN: 1573-756X
DOI
https://doi.org/10.1007/s10618-017-0506-1
