Published in: International Journal of Data Science and Analytics 1/2017

18 May 2017 | Trends of Data Science

Anti-discrimination learning: a causal modeling-based framework

Written by: Lu Zhang, Xintao Wu

Abstract

Anti-discrimination learning is an increasingly important task in data mining. Discrimination discovery is the problem of uncovering discriminatory practices by analyzing a dataset of historical decision records, while discrimination prevention aims to remove discrimination by modifying the biased data and/or the predictive algorithms. Discrimination is causal: to prove discrimination, one must establish a causal relationship rather than a mere association. Although it is well known that association does not imply causation, many researchers pay too little attention to the gap between the two. In this paper, we introduce a causal modeling-based framework for anti-discrimination learning. Discrimination is categorized along two dimensions: direct versus indirect, and system, group, or individual level. Within the causal framework, we present one line of work on discovering and preventing both direct and indirect system-level discrimination in the training data, and another on extending the non-discrimination result from the training data to prediction. We then present two further works, on group-level direct discrimination and on individual-level direct discrimination, respectively. The aim of this paper is to deepen the understanding of discrimination in data mining from the causal modeling perspective and to suggest several potential directions for future research.
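
The gap between association and causation mentioned in the abstract can be illustrated with a small sketch. The counts below are invented for illustration, and treating department as the only other relevant variable is an assumption; this is not the method of the paper, only a stratified comparison showing that a raw difference in outcome rates between two groups can vanish once a third variable is held fixed.

```python
from fractions import Fraction

# Hypothetical counts, invented for illustration only:
# (group, department) -> (applicants, admitted)
COUNTS = {
    ("female", "A"): (20, 16),
    ("male", "A"): (80, 64),
    ("female", "B"): (80, 16),
    ("male", "B"): (20, 4),
}


def rate(group, dept=None):
    """Admission rate of a group, overall or within one department."""
    rows = [
        cell for (g, d), cell in COUNTS.items()
        if g == group and (dept is None or d == dept)
    ]
    applicants = sum(n for n, _ in rows)
    admitted = sum(k for _, k in rows)
    return Fraction(admitted, applicants)


def marginal_gap():
    """Raw association: difference in overall admission rates."""
    return rate("male") - rate("female")


def adjusted_gap():
    """Department-stratified difference, weighted by department size."""
    total = sum(n for n, _ in COUNTS.values())
    gap = Fraction(0)
    for dept in sorted({d for _, d in COUNTS}):
        dept_size = sum(n for (_, d), (n, _k) in COUNTS.items() if d == dept)
        within = rate("male", dept) - rate("female", dept)
        gap += Fraction(dept_size, total) * within
    return gap


if __name__ == "__main__":
    print("marginal gap:", marginal_gap())
    print("adjusted gap:", adjusted_gap())
```

With these made-up counts, the overall admission rates differ by 9/25, yet within each department the two groups are admitted at the same rate, so the stratified gap is 0. Whether such a stratified quantity has a causal reading depends on the assumed causal structure, which is the kind of question a causal modeling framework is meant to make explicit.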

Metadata
Title
Anti-discrimination learning: a causal modeling-based framework
Authors
Lu Zhang
Xintao Wu
Publication date
18 May 2017
Publisher
Springer International Publishing
Published in
International Journal of Data Science and Analytics / Issue 1/2017
Print ISSN: 2364-415X
Electronic ISSN: 2364-4168
DOI
https://doi.org/10.1007/s41060-017-0058-x