2020 | Original Paper | Book Chapter

Self-explaining AI as an Alternative to Interpretable AI

Written by: Daniel C. Elton

Published in: Artificial General Intelligence

Publisher: Springer International Publishing

Abstract

The ability to explain decisions made by AI systems is highly sought after, especially in domains where human lives are at stake, such as medicine or autonomous vehicles. While it is often possible to approximate the input-output relations of deep neural networks with a few human-understandable rules, the discovery of the double descent phenomenon suggests that such approximations do not accurately capture the mechanism by which deep neural networks work. Double descent indicates that deep neural networks typically operate by smoothly interpolating between data points rather than by extracting a few high-level rules. As a result, neural networks trained on complex real-world data are inherently hard to interpret and prone to failure if asked to extrapolate. To show how we might be able to trust AI despite these problems, we introduce the concept of self-explaining AI. Self-explaining AIs are capable of providing a human-understandable explanation of each decision along with confidence levels for both the decision and the explanation. Some difficulties with this approach, along with possible solutions, are sketched. Finally, we argue that it is important for deep-learning-based systems to include a “warning light”, based on techniques from applicability domain analysis, that warns the user if a model is asked to extrapolate outside its training distribution.
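To make the “warning light” idea concrete, the following is a minimal sketch of one simple distance-based applicability domain check. It is not taken from the chapter: the k-nearest-neighbour criterion, the 95% quantile threshold, the toy data, and all function names (`knn_mean_distance`, `fit_threshold`, `warning_light`) are assumptions made for illustration only.

```python
import math

def knn_mean_distance(point, reference, k):
    """Mean Euclidean distance from `point` to its k nearest neighbours in `reference`."""
    distances = sorted(math.dist(point, other) for other in reference)
    return sum(distances[:k]) / k

def fit_threshold(train, k=3, quantile=0.95):
    """Threshold = chosen quantile of leave-one-out kNN distances on the training set.

    The quantile and k are illustrative assumptions, not values from the chapter.
    """
    scores = []
    for i, point in enumerate(train):
        others = train[:i] + train[i + 1:]  # leave the point itself out
        scores.append(knn_mean_distance(point, others, k))
    scores.sort()
    index = min(len(scores) - 1, int(quantile * len(scores)))
    return scores[index]

def warning_light(query, train, threshold, k=3):
    """Return True when the query sits farther from the training data than the threshold."""
    return knn_mean_distance(query, train, k) > threshold

# Toy training set: a 5 x 5 grid of points in the unit square.
train = [(x / 4, y / 4) for x in range(5) for y in range(5)]
threshold = fit_threshold(train)

inside = warning_light((0.5, 0.4), train, threshold)   # close to the training data
outside = warning_light((3.0, 3.0), train, threshold)  # far from the training data
```

In this toy setting the first query lies among the training points and the second lies well outside them, so only the second should trigger the warning. In practice the distance would more plausibly be computed in a learned feature space than on raw inputs, which is a further design choice this sketch leaves open.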

Note that this sort of approach should not be taken as quantifying “information flow” in the network. In fact, since the output of units is continuous, the amount of information that can flow through the network is infinite (for a discussion of this point, and of how to recover the concept of “information flow” in neural networks, see [22]). What we propose to measure is the mutual information over the data distribution used.
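As a rough illustration of measuring mutual information over a data distribution, the sketch below computes a plug-in (histogram) estimate from paired samples. The estimator, the equal-width binning, the toy activations and labels, and the names `bin_values` and `mutual_information` are assumptions for illustration; the chapter page does not specify which estimator is intended.

```python
import math
from collections import Counter

def bin_values(values, n_bins):
    """Map continuous values to integer indices of equal-width bins."""
    low, high = min(values), max(values)
    width = (high - low) / n_bins or 1.0  # fall back to 1.0 if all values are equal
    return [min(int((v - low) / width), n_bins - 1) for v in values]

def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in bits from paired discrete samples."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        # p(x, y) * log2(p(x, y) / (p(x) * p(y))), written with raw counts
        mi += (count / n) * math.log2(count * n / (px[x] * py[y]))
    return mi

# Hypothetical activations of one unit and a binary label for eight data points.
activations = [0.1, 0.2, 0.15, 0.9, 0.8, 0.95, 0.05, 0.85]
labels = [0, 0, 0, 1, 1, 1, 0, 1]

mi_dependent = mutual_information(bin_values(activations, 2), labels)
mi_independent = mutual_information([0, 0, 1, 1], [0, 1, 0, 1])
```

Because the estimate is taken over a finite sample with discretized activations, it stays finite even though the units themselves are continuous. Plug-in estimates of this kind are known to be biased for small samples and sensitive to the number of bins, so the result should be read as a coarse indicator only.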

1. Adebayo, J., Gilmer, J., Muelly, M., Goodfellow, I., Hardt, M., Kim, B.: Sanity checks for saliency maps. In: Proceedings of the 32nd International Conference on Neural Information Processing Systems, NIPS 2018, pp. 9525–9536. Curran Associates Inc., Red Hook (2018)

2. Ahmad, M.A., Eckert, C., Teredesai, A.: Interpretable machine learning in healthcare. In: Proceedings of the 2018 ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics - BCB 2018. ACM Press (2018)

3. Aliman, N.-M., Kester, L.: Hybrid strategies towards safe “Self-Aware” superintelligent systems. In: Iklé, M., Franz, A., Rzepka, R., Goertzel, B. (eds.) AGI 2018. LNCS (LNAI), vol. 10999, pp. 1–11. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-97676-1_1

4. Alvarez-Melis, D., Jaakkola, T.S.: Towards robust interpretability with self-explaining neural networks. In: Proceedings of the 32nd International Conference on Neural Information Processing Systems, NIPS 2018, pp. 7786–7795. Curran Associates Inc., Red Hook (2018)

5. Arya, V., et al.: One explanation does not fit all: a toolkit and taxonomy of AI explainability techniques. arXiv eprints: 1909.03012 (2019)

6. Ashby, W.R.: An Introduction to Cybernetics. Chapman & Hall, London (1956)

7. Bach, S., Binder, A., Montavon, G., Klauschen, F., Müller, K.R., Samek, W.: On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation. PLoS ONE 10(7), e0130140 (2015)

8. Barnes, B.C., et al.: Machine learning of energetic material properties. arXiv eprints: 1807.06156 (2018)

9. Beede, E., et al.: A human-centered evaluation of a deep learning system deployed in clinics for the detection of diabetic retinopathy. In: Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, CHI 2020, pp. 1–12. Association for Computing Machinery, New York (2020)

10. Belkin, M., Hsu, D., Ma, S., Mandal, S.: Reconciling modern machine-learning practice and the classical bias–variance trade-off. Proc. Natl. Acad. Sci. 116(32), 15849–15854 (2019)

11. Belkin, M., Hsu, D., Xu, J.: Two models of double descent for weak features. arXiv eprints: 1903.07571 (2019)

12. Bordes, F., Berthier, T., Jorio, L.D., Vincent, P., Bengio, Y.: Iteratively unveiling new regions of interest in deep learning models. In: Medical Imaging with Deep Learning (MIDL) (2018)

13. Bostrom, N.: Superintelligence: Paths, Dangers, Strategies, 1st edn. Oxford University Press Inc., Oxford (2014)

14. Breiman, L.: Statistical modeling: the two cultures (with comments and a rejoinder by the author). Stat. Sci. 16(3), 199–231 (2001)

15. Chen, C., Li, O., Tao, D., Barnett, A., Rudin, C., Su, J.: This looks like that: deep learning for interpretable image recognition. In: Wallach, H.M., Larochelle, H., Beygelzimer, A., d’Alché-Buc, F., Fox, E.B., Garnett, R. (eds.) Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, NeurIPS 2019, 8–14 December 2019, Vancouver, BC, Canada, pp. 8928–8939 (2019)

16. Dombrowski, A.K., Alber, M., Anders, C.J., Ackermann, M., Müller, K.R., Kessel, P.: Explanations can be manipulated and geometry is to blame (2019)

17. Elton, D., Sandfort, V., Pickhardt, P.J., Summers, R.M.: Accurately identifying vertebral levels in large datasets. In: Hahn, H.K., Mazurowski, M.A. (eds.) Medical Imaging 2020: Computer-Aided Diagnosis. SPIE, March 2020

18. Erhan, D., Bengio, Y., Courville, A., Vincent, P.: Visualizing higher-layer features of a deep network. Technical report 1341, University of Montreal; also presented at the ICML 2009 Workshop on Learning Feature Hierarchies, Montréal, Canada (2009)

19. Frosst, N., Hinton, G.: Distilling a neural network into a soft decision tree. arXiv eprints: 1711.09784 (2017)

20. Gal, Y., Ghahramani, Z.: Dropout as a Bayesian approximation: representing model uncertainty in deep learning. In: Balcan, M.F., Weinberger, K.Q. (eds.) Proceedings of The 33rd International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 48, pp. 1050–1059. PMLR, New York, 20–22 June 2016

21. Goertzel, B.: Are there deep reasons underlying the pathologies of today’s deep learning algorithms? In: Bieger, J., Goertzel, B., Potapov, A. (eds.) AGI 2015. LNCS (LNAI), vol. 9205, pp. 70–79. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-21365-1_8

22. Goldfeld, Z., et al.: Estimating information flow in deep neural networks. In: Chaudhuri, K., Salakhutdinov, R. (eds.) Proceedings of the 36th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 97, pp. 2299–2308. PMLR, Long Beach, 09–15 June 2019

23. Goodfellow, I.J., Shlens, J., Szegedy, C.: Explaining and harnessing adversarial examples. arXiv eprints: 1412.6572 (2014)

24. Hasson, U., Nastase, S.A., Goldstein, A.: Direct fit to nature: an evolutionary perspective on biological and artificial neural networks. Neuron 105(3), 416–434 (2020)

25. Koh, P.W., Liang, P.: Understanding black-box predictions via influence functions. In: Proceedings of the 34th International Conference on Machine Learning, ICML 2017, vol. 70, pp. 1885–1894. JMLR.org (2017)

26. Kulesza, T., Burnett, M., Wong, W.K., Stumpf, S.: Principles of explanatory debugging to personalize interactive machine learning. In: Proceedings of the 20th International Conference on Intelligent User Interfaces - IUI 2015. ACM Press (2015)

27. LaLonde, R., Torigian, D., Bagci, U.: Encoding visual attributes in capsules for explainable medical diagnoses. arXiv eprints: 1909.05926 (2019)

28. Lie, C.: Relevance in the eye of the beholder: diagnosing classifications based on visualised layerwise relevance propagation. Master’s thesis, Lund University, Sweden (2019)

29. Lillicrap, T.P., Kording, K.P.: What does it mean to understand a neural network? arXiv eprints: 1907.06374 (2019)

30. Linfoot, E.: An informational measure of correlation. Inf. Control 1(1), 85–89 (1957)

31. Lipton, Z.C.: The mythos of model interpretability. arXiv eprints: 1606.03490 (2016)

32. Lundberg, S.M., Lee, S.I.: A unified approach to interpreting model predictions. In: Guyon, I., et al. (eds.) Advances in Neural Information Processing Systems, vol. 30, pp. 4765–4774. Curran Associates, Inc. (2017)

33. McClure, P., et al.: Knowing what you know in brain segmentation using Bayesian deep neural networks. Front. Neuroinform. 13, 67 (2019)

34. Murdoch, W.J., Singh, C., Kumbier, K., Abbasi-Asl, R., Yu, B.: Definitions, methods, and applications in interpretable machine learning. Proc. Natl. Acad. Sci. 116(44), 22071–22080 (2019)

35. Nakkiran, P., Kaplun, G., Bansal, Y., Yang, T., Barak, B., Sutskever, I.: Deep double descent: where bigger models and more data hurt. arXiv eprints: 1912.02292 (2019)

36. Ribeiro, M.T., Singh, S., Guestrin, C.: Why should I trust you? In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining - KDD. ACM Press (2016)

37. Richards, B.A., et al.: A deep learning framework for neuroscience. Nat. Neurosci. 22(11), 1761–1770 (2019)

38. Rolnick, D., Kording, K.P.: Identifying weights and architectures of unknown ReLU networks. arXiv eprints: 1910.00744 (2019)

39. Rudin, C.: Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nat. Mach. Intell. 1(5), 206–215 (2019)

40. Sahigara, F., Mansouri, K., Ballabio, D., Mauri, A., Consonni, V., Todeschini, R.: Comparison of different approaches to define the applicability domain of QSAR models. Molecules 17(5), 4791–4810 (2012)

41. Shen, S., Han, S.X., Aberle, D.R., Bui, A.A., Hsu, W.: An interpretable deep hierarchical semantic convolutional neural network for lung nodule malignancy classification. Expert Syst. Appl. 128, 84–95 (2019)

42. Shrikumar, A., Greenside, P., Kundaje, A.: Learning important features through propagating activation differences. arXiv eprints: 1704.02685 (2017)

43. Spigler, S., Geiger, M., d’Ascoli, S., Sagun, L., Biroli, G., Wyart, M.: A jamming transition from under- to over-parametrization affects generalization in deep learning. J. Phys. A: Math. Theor. 52(47), 474001 (2019)

44. Sutre, E.T., Colliot, O., Dormont, D., Burgos, N.: Visualization approach to assess the robustness of neural networks for medical image classification. In: Proceedings of the SPIE: Medical Imaging (2020)

45. Swartout, W.R.: XPLAIN: a system for creating and explaining expert consulting programs. Artif. Intell. 21(3), 285–325 (1983)

46. Torkkola, K.: Feature extraction by non-parametric mutual information maximization. J. Mach. Learn. Res. 3, 1415–1438 (2003)

47. Yeh, C.K., Hsieh, C.Y., Suggala, A.S., Inouye, D.I., Ravikumar, P.: On the (in)fidelity and sensitivity for explanations. arXiv eprints: 1901.09392 (2019)

48. Zeiler, M.D., Fergus, R.: Visualizing and understanding convolutional networks. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8689, pp. 818–833. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10590-1_53

49. Zhang, C., Bengio, S., Hardt, M., Recht, B., Vinyals, O.: Understanding deep learning requires rethinking generalization. arXiv eprints: 1611.03530 (2016)

50. Zhang, Q., Cao, R., Shi, F., Wu, Y.N., Zhu, S.C.: Interpreting CNN knowledge via an explanatory graph. In: McIlraith, S.A., Weinberger, K.Q. (eds.) AAAI, pp. 4454–4463. AAAI Press (2018)

Title: Self-explaining AI as an Alternative to Interpretable AI
Written by: Daniel C. Elton
Publisher: Springer International Publishing
Book: Artificial General Intelligence
Print ISBN: 978-3-030-52151-6
Electronic ISBN: 978-3-030-52152-3
Copyright year: 2020
DOI: https://doi.org/10.1007/978-3-030-52152-3_10
