2020 | Original Paper | Book Chapter

Unjustified Classification Regions and Counterfactual Explanations in Machine Learning

Authors: Thibault Laugel, Marie-Jeanne Lesot, Christophe Marsala, Xavier Renard, Marcin Detyniecki

Published in: Machine Learning and Knowledge Discovery in Databases

Publisher: Springer International Publishing

Abstract

Post-hoc interpretability approaches, although powerful tools for generating explanations for the predictions of a trained black-box model, have been shown to be vulnerable to issues caused by a lack of robustness in the classifier. In particular, this paper focuses on the notion of explanation justification, defined as connectedness to ground-truth data, in the context of counterfactuals. In this work, we explore the extent of the risk of generating unjustified explanations. We propose an empirical study to assess the vulnerability of classifiers and show that the chosen learning algorithm heavily impacts the vulnerability of the model. Additionally, we show that state-of-the-art post-hoc counterfactual approaches can minimize the impact of this risk by generating less local explanations (source code available at: https://github.com/thibaultlaugel/truce).
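The idea of "justification as connectedness to ground-truth data" can be illustrated with a minimal sketch. This is a hypothetical approximation, not the authors' implementation: a counterfactual counts as justified here when it can be linked to a correctly classified training instance of the same predicted class through a chain of same-class points, each within `eps` of the next. All names and parameters (`is_justified`, `eps`, `n_samples`) are illustrative assumptions.

```python
import numpy as np

def is_justified(x_cf, X_train, y_train, predict, eps, n_samples=500, seed=0):
    """Sketch of 'justification as connectedness' (assumed formulation):
    x_cf is justified if an eps-chain of points, all receiving the same
    prediction as x_cf, links it to a correctly classified ground-truth
    instance. The chain is approximated by uniform sampling in the box
    spanned by x_cf and the candidate anchors."""
    rng = np.random.default_rng(seed)
    c = predict(x_cf[None, :])[0]
    # Ground-truth anchors: training points of class c that the model
    # also predicts as c (correctly classified instances).
    anchors = X_train[(y_train == c) & (predict(X_train) == c)]
    if len(anchors) == 0:
        return False
    lo = np.minimum(anchors.min(axis=0), x_cf)
    hi = np.maximum(anchors.max(axis=0), x_cf)
    pts = rng.uniform(lo, hi, size=(n_samples, x_cf.size))
    pts = pts[predict(pts) == c]              # keep only same-class points
    nodes = np.vstack([x_cf[None, :], pts])
    # Breadth-first search over the eps-neighbourhood graph, from x_cf.
    reached = np.zeros(len(nodes), dtype=bool)
    reached[0] = True
    frontier = [0]
    while frontier:
        i = frontier.pop()
        if np.any(np.linalg.norm(anchors - nodes[i], axis=1) <= eps):
            return True                        # chain touches ground-truth data
        near = np.linalg.norm(nodes - nodes[i], axis=1) <= eps
        new = near & ~reached
        reached |= near
        frontier.extend(np.flatnonzero(new).tolist())
    return False
```

With a large `eps` a counterfactual sitting near a dense, correctly classified region is reported as justified; with a very small `eps` an isolated counterfactual in a spurious classification pocket is not, which is the failure mode the paper studies.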

Metadata
Title
Unjustified Classification Regions and Counterfactual Explanations in Machine Learning
Authors
Thibault Laugel
Marie-Jeanne Lesot
Christophe Marsala
Xavier Renard
Marcin Detyniecki
Copyright Year
2020
DOI
https://doi.org/10.1007/978-3-030-46147-8_3