Published in: Data Mining and Knowledge Discovery 4/2017

31-03-2017

Measuring discrimination in algorithmic decision making

Author: Indrė Žliobaitė

Abstract

Society is increasingly relying on data-driven predictive models for automated decision making. Not by design, but because of the nature and noisiness of observational data, such models may systematically disadvantage people belonging to certain categories or groups instead of relying solely on individual merits. This may happen even if the computing process is fair and well-intentioned. Discrimination-aware data mining studies how to make predictive models free from discrimination when the historical data on which they are built may be biased, incomplete, or even contain past discriminatory decisions. Discrimination-aware data mining is an emerging research discipline, and there is no firm consensus yet on how to measure the performance of algorithms. The goal of this survey is to review the various discrimination measures that have been used, to analyze their performance analytically and computationally, and to highlight the implications of using one measure or another. We also describe measures from other disciplines that have not yet been used for measuring discrimination but could potentially be suitable for this purpose. This survey is primarily intended for researchers in data mining and machine learning, as a step towards a unifying view of performance criteria when developing new algorithms for non-discriminatory predictive modeling. In addition, practitioners and policy makers could use this study when diagnosing potential discrimination by predictive models.

Footnotes
1
The code for our experiments is available at https://github.com/zliobaite/paper-fairml-survey.
 
Metadata
Title: Measuring discrimination in algorithmic decision making
Author: Indrė Žliobaitė
Publication date: 31-03-2017
Publisher: Springer US
Published in: Data Mining and Knowledge Discovery, Issue 4/2017
Print ISSN: 1384-5810
Electronic ISSN: 1573-756X
DOI: https://doi.org/10.1007/s10618-017-0506-1
