Skip to main content
Erschienen in: Artificial Intelligence and Law 2/2016

01.06.2016

Using sensitive personal data may be necessary for avoiding discrimination in data-driven decision models

verfasst von: Indrė Žliobaitė, Bart Custers

Erschienen in: Artificial Intelligence and Law | Ausgabe 2/2016

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Increasing numbers of decisions about everyday life are made using algorithms. By algorithms we mean predictive models (decision rules) captured from historical data using data mining. Such models often decide prices we pay, select ads we see and news we read online, match job descriptions and candidate CVs, decide who gets a loan, who goes through an extra airport security check, or who gets released on parole. Yet growing evidence suggests that decision making by algorithms may discriminate people, even if the computing process is fair and well-intentioned. This happens due to biased or non-representative learning data in combination with inadvertent modeling procedures. From the regulatory perspective there are two tendencies in relation to this issue: (1) to ensure that data-driven decision making is not discriminatory, and (2) to restrict overall collecting and storing of private data to a necessary minimum. This paper shows that from the computing perspective these two goals are contradictory. We demonstrate empirically and theoretically with standard regression models that in order to make sure that decision models are non-discriminatory, for instance, with respect to race, the sensitive racial information needs to be used in the model building process. Of course, after the model is ready, race should not be required as an input variable for decision making. From the regulatory perspective this has an important implication: collecting sensitive personal data is necessary in order to guarantee fairness of algorithms, and law making needs to find sensible ways to allow using such data in the modeling process.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Anhänge
Nur mit Berechtigung zugänglich
Fußnoten
1
European directive 95/46/EG of the European Parliament and the Council of 24th October 1995, [1995] OJ L281/31. See also http://​europa.​eu.​int/​eur-lex/​en/​lif/​dat/​1995/​en_​395L0046.​html.
 
2
This principle is sometimes referred to as the principle of minimality, see Bygrave (2002, p. 341).
 
3
Note that, in the European Data Protection Directive and the WBP, this principle applies only to incomplete or inaccurate data, or data that are irrelevant or processed illegitimately.
 
4
Proposal for a Regulation of the European Parliament and of the Council on the protection of individuals with regard to the processing of personal data and on the free movement of such data (General Data Protection Regulation), Brussels, 25.1.2012 COM(2012) 11 final 2012/0011 (COD). Available at http://​eur-lex.​europa.​eu/​LexUriServ/​LexUriServ.​do?​uri=​COM:​2012:​0011:​FIN:​EN:​PDF.
 
5
Art. 15 of the EU directive on the protection of personal data.
 
6
ECJ, C-127/07, 16 December 2008.
 
Literatur
Zurück zum Zitat Ajunwa I, Friedler S, Scheidegger C, Venkatasubramanian S (2016) Hiring by algorithm: predicting and preventing disparate impact. SSRN Ajunwa I, Friedler S, Scheidegger C, Venkatasubramanian S (2016) Hiring by algorithm: predicting and preventing disparate impact. SSRN
Zurück zum Zitat Bygrave L (2002). Data protection law; approaching its rationale, logic and limits, vol. 10 of information law series. Kluwer Law International, The Hague Bygrave L (2002). Data protection law; approaching its rationale, logic and limits, vol. 10 of information law series. Kluwer Law International, The Hague
Zurück zum Zitat Calders T, Karim A, Kamiran F, Ali, W, Zhang X (2013) Controlling attribute effect in linear regression. In: Proceedings of 13th IEEE ICDM, pp 71–80 Calders T, Karim A, Kamiran F, Ali, W, Zhang X (2013) Controlling attribute effect in linear regression. In: Proceedings of 13th IEEE ICDM, pp 71–80
Zurück zum Zitat Calders T, Zliobaite I (2013) Why unbiased computational processes can lead to discriminative decision procedures. In: Discrimination and Privacy in the Information Society, pp 43–57 Calders T, Zliobaite I (2013) Why unbiased computational processes can lead to discriminative decision procedures. In: Discrimination and Privacy in the Information Society, pp 43–57
Zurück zum Zitat Citron DK, Pasquale FA (2014) The scored society: due process for automated predictions. Wash Law Rev, Vol. 89, p. 1, U of Maryland Legal Studies Research Paper No. 2014-8 Citron DK, Pasquale FA (2014) The scored society: due process for automated predictions. Wash Law Rev, Vol. 89, p. 1, U of Maryland Legal Studies Research Paper No. 2014-8
Zurück zum Zitat Custers B, Calders T, Schermer B, Zarsky T (eds) (2013a) Discrimination and privacy in the information society: data mining and profiling in large databases. Springer, Heidelberg Custers B, Calders T, Schermer B, Zarsky T (eds) (2013a) Discrimination and privacy in the information society: data mining and profiling in large databases. Springer, Heidelberg
Zurück zum Zitat Custers B, Van der Hof S, Schermer B, Appleby-Arnold S, Brockdorff N (2013b) Informed consent in social media use. the gap between user expectations and eu personal data protection law. SCRIPTed J Law Technol Soc 10:435–457 Custers B, Van der Hof S, Schermer B, Appleby-Arnold S, Brockdorff N (2013b) Informed consent in social media use. the gap between user expectations and eu personal data protection law. SCRIPTed J Law Technol Soc 10:435–457
Zurück zum Zitat Edelman BG, Luca M (2014) Digital discrimination: the case of Airbnb.com. Working Paper 14-054, Harvard Business School Edelman BG, Luca M (2014) Digital discrimination: the case of Airbnb.com. Working Paper 14-054, Harvard Business School
Zurück zum Zitat Feldman M, Friedler SA, Moeller J, Scheidegger C, Venkatasubramanian S (2015) Certifying and removing disparate impact. In: Proceedings of 21st ACM KDD, pp 259–268 Feldman M, Friedler SA, Moeller J, Scheidegger C, Venkatasubramanian S (2015) Certifying and removing disparate impact. In: Proceedings of 21st ACM KDD, pp 259–268
Zurück zum Zitat Gellert R, De Vries K, De Hert P, Gutwirth S (2013) A comparative analysis of anti-discrimination and data protection legislations. In: Discrimination and privacy in the information society: data mining and profiling in large databases. Springer, Heidelberg Gellert R, De Vries K, De Hert P, Gutwirth S (2013) A comparative analysis of anti-discrimination and data protection legislations. In: Discrimination and privacy in the information society: data mining and profiling in large databases. Springer, Heidelberg
Zurück zum Zitat Hajian S, Domingo-Ferrer J (2013) A methodology for direct and indirect discrimination prevention in data mining. IEEE Trans Knowl Data Eng 25(7):1445–1459CrossRef Hajian S, Domingo-Ferrer J (2013) A methodology for direct and indirect discrimination prevention in data mining. IEEE Trans Knowl Data Eng 25(7):1445–1459CrossRef
Zurück zum Zitat Hillier A (2003) Spatial analysis of historical redlining: a methodological explanation. J Hous Res 14(1):137–168 Hillier A (2003) Spatial analysis of historical redlining: a methodological explanation. J Hous Res 14(1):137–168
Zurück zum Zitat Hornung G (2012) A general data protection regulation for europe? Light and shade. The Commissions Draft of 25 January 2012, 9 SCRIPTed, pp 64–81 Hornung G (2012) A general data protection regulation for europe? Light and shade. The Commissions Draft of 25 January 2012, 9 SCRIPTed, pp 64–81
Zurück zum Zitat House TW (2014) Big data: seizing opportunities, preserving values House TW (2014) Big data: seizing opportunities, preserving values
Zurück zum Zitat Kamiran F, Calders T (2009) Classification without discrimination. In IEEE international conference on computer, control & communication, IEEE-IC4. IEEE press Kamiran F, Calders T (2009) Classification without discrimination. In IEEE international conference on computer, control & communication, IEEE-IC4. IEEE press
Zurück zum Zitat Kamiran F, Calders T, Pechenizkiy M (2010) Discrimination aware decision tree learning. In: Proceedings of 10th IEEE ICDM, pp 869–874 Kamiran F, Calders T, Pechenizkiy M (2010) Discrimination aware decision tree learning. In: Proceedings of 10th IEEE ICDM, pp 869–874
Zurück zum Zitat Kamiran F, Zliobaite I, Calders T (2013) Quantifying explainable discrimination and removing illegal discrimination in automated decision making. Knowl Inf Syst 35(3):613–644CrossRef Kamiran F, Zliobaite I, Calders T (2013) Quantifying explainable discrimination and removing illegal discrimination in automated decision making. Knowl Inf Syst 35(3):613–644CrossRef
Zurück zum Zitat Kamishima T, Akaho S, Asoh H, Sakuma J (2012) Fairness-aware classifier with prejudice remover regularizer. In: Proceedings of ECMLPKDD, pp 35–50 Kamishima T, Akaho S, Asoh H, Sakuma J (2012) Fairness-aware classifier with prejudice remover regularizer. In: Proceedings of ECMLPKDD, pp 35–50
Zurück zum Zitat Kay M, Matuszek C, Munson S (2015) Unequal representation and gender stereotypes in image search results for occupations. In: Proceedings of 33rd ACM CHI, pp 3819–3828 Kay M, Matuszek C, Munson S (2015) Unequal representation and gender stereotypes in image search results for occupations. In: Proceedings of 33rd ACM CHI, pp 3819–3828
Zurück zum Zitat Kosinski M, Stillwell D, Graepel T (2013) Private traits and attributes are predictable from digital records of human behaviour. Proc Natl Acad Sci 110(15):5802–5805CrossRef Kosinski M, Stillwell D, Graepel T (2013) Private traits and attributes are predictable from digital records of human behaviour. Proc Natl Acad Sci 110(15):5802–5805CrossRef
Zurück zum Zitat Kuner C (2012) The european commission’s proposed data protection regulation: a copernican revolution in european data protection law. Privacy and Security Law Report Kuner C (2012) The european commission’s proposed data protection regulation: a copernican revolution in european data protection law. Privacy and Security Law Report
Zurück zum Zitat Luong BT, Ruggieri S, Turini F (2011) k-NN as an implementation of situation testing for discrimination discovery and prevention. In: Proceedings of 17th KDD, pp 502–510 Luong BT, Ruggieri S, Turini F (2011) k-NN as an implementation of situation testing for discrimination discovery and prevention. In: Proceedings of 17th KDD, pp 502–510
Zurück zum Zitat Mancuhan K, Clifton C (2014) Combating discrimination using bayesian networks. Artif Intell Law 22(2):211–238CrossRef Mancuhan K, Clifton C (2014) Combating discrimination using bayesian networks. Artif Intell Law 22(2):211–238CrossRef
Zurück zum Zitat McCrudden C, Prechal S (2009) The concepts of equality and non-discrimination in europe. European commission. DG Employment, Social Affairs and Equal Opportunities McCrudden C, Prechal S (2009) The concepts of equality and non-discrimination in europe. European commission. DG Employment, Social Affairs and Equal Opportunities
Zurück zum Zitat Ohm P (2010) Broken promises of privacy: responding to the surprising failure of anonymization. UCLA Law Rev 57:1701–1765 Ohm P (2010) Broken promises of privacy: responding to the surprising failure of anonymization. UCLA Law Rev 57:1701–1765
Zurück zum Zitat Pearl J (2009) Causality: models, reasoning and inference, 2nd edn. Cambridge University Press, CambridgeCrossRefMATH Pearl J (2009) Causality: models, reasoning and inference, 2nd edn. Cambridge University Press, CambridgeCrossRefMATH
Zurück zum Zitat Pedreschi D, Ruggieri S, Turini F (2008) Discrimination-aware data mining. In: Proceedings of 14th ACM KDD, pp 560–568 Pedreschi D, Ruggieri S, Turini F (2008) Discrimination-aware data mining. In: Proceedings of 14th ACM KDD, pp 560–568
Zurück zum Zitat Pope DG, Sydnor JR (2011) Implementing anti-discrimination policies in statistical profiling models. Am Econ J Econ Policy 3(3):206–231CrossRef Pope DG, Sydnor JR (2011) Implementing anti-discrimination policies in statistical profiling models. Am Econ J Econ Policy 3(3):206–231CrossRef
Zurück zum Zitat Romei A, Ruggieri S (2014) A multidisciplinary survey on discrimination analysis. Knowl Eng Rev 29(5):582–638CrossRef Romei A, Ruggieri S (2014) A multidisciplinary survey on discrimination analysis. Knowl Eng Rev 29(5):582–638CrossRef
Zurück zum Zitat Schermer B, Custers B, Van der Hof S (2014) The crisis of consent: how stronger legal protection may lead to weaker consent in data protection. Ethics Inf Technol 16(2):171–182 Schermer B, Custers B, Van der Hof S (2014) The crisis of consent: how stronger legal protection may lead to weaker consent in data protection. Ethics Inf Technol 16(2):171–182
Zurück zum Zitat Squires G (2003) Racial profiling, insurance style: insurance redlining and the uneven development of metropolitan areas. J Urban Aff 25(4):391–410MathSciNetCrossRef Squires G (2003) Racial profiling, insurance style: insurance redlining and the uneven development of metropolitan areas. J Urban Aff 25(4):391–410MathSciNetCrossRef
Zurück zum Zitat Weisberg S (1985) Applied linear regression, second edition Weisberg S (1985) Applied linear regression, second edition
Zurück zum Zitat Zemel RS, Wu Y, Swersky K, Pitassi T, Dwork C (2013) Learning fair representations. In: Proceedings of 30th ICML, pp 325–333 Zemel RS, Wu Y, Swersky K, Pitassi T, Dwork C (2013) Learning fair representations. In: Proceedings of 30th ICML, pp 325–333
Zurück zum Zitat Zliobaite I (2015) A survey on measuring indirect discrimination in machine learning. CoRR, abs/1511.00148 Zliobaite I (2015) A survey on measuring indirect discrimination in machine learning. CoRR, abs/1511.00148
Metadaten
Titel
Using sensitive personal data may be necessary for avoiding discrimination in data-driven decision models
verfasst von
Indrė Žliobaitė
Bart Custers
Publikationsdatum
01.06.2016
Verlag
Springer Netherlands
Erschienen in
Artificial Intelligence and Law / Ausgabe 2/2016
Print ISSN: 0924-8463
Elektronische ISSN: 1572-8382
DOI
https://doi.org/10.1007/s10506-016-9182-5