Published in: Knowledge and Information Systems 3/2014

01.12.2014 | Regular Paper

Explaining prediction models and individual predictions with feature contributions

Authors: Erik Štrumbelj, Igor Kononenko


Abstract

We present a sensitivity analysis-based method for explaining prediction models that can be applied to any type of classification or regression model. Its advantage over existing general methods is that all subsets of input features are perturbed, so interactions and redundancies between features are taken into account. Furthermore, when explaining an additive model, the method is equivalent to commonly used additive model-specific methods. We illustrate the method’s usefulness with examples from artificial and real-world data sets and an empirical analysis of running times. Results from a controlled experiment with 122 participants suggest that the method’s explanations improved the participants’ understanding of the model.
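
The idea of perturbing all subsets of input features can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes exact enumeration of subsets (feasible only for a handful of features), and it assumes that "unknown" features are averaged over a small background sample. The names `subset_value`, `contributions`, `additive_model`, and `interaction_model` are hypothetical and chosen for illustration only.

```python
from itertools import combinations
from math import factorial


def subset_value(model, instance, background, subset):
    """Mean prediction when the features in `subset` are fixed to the
    instance's values and all other features come from background rows."""
    total = 0.0
    for row in background:
        mixed = [instance[i] if i in subset else row[i]
                 for i in range(len(instance))]
        total += model(mixed)
    return total / len(background)


def contributions(model, instance, background):
    """Shapley-style contribution of each feature, by exact enumeration
    of all subsets of the remaining features (exponential in n)."""
    n = len(instance)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for a subset of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for combo in combinations(others, size):
                s = set(combo)
                gain = (subset_value(model, instance, background, s | {i})
                        - subset_value(model, instance, background, s))
                phi[i] += weight * gain
    return phi


def additive_model(x):
    # Toy additive model (assumption for illustration).
    return 2.0 * x[0] + 3.0 * x[1]


def interaction_model(x):
    # Toy XOR-like model: output depends only on the feature interaction.
    return float(x[0] != x[1])


# Assumed background sample: all four combinations of two binary features.
background = [[0, 0], [0, 1], [1, 0], [1, 1]]
instance = [1, 1]

phi_add = contributions(additive_model, instance, background)
phi_xor = contributions(interaction_model, instance, background)
print(phi_add)  # [1.0, 1.5]
print(phi_xor)  # [-0.25, -0.25]
```

For the toy additive model, each contribution equals the coefficient times the feature's deviation from its background mean (2 × 0.5 and 3 × 0.5), which matches what an additive model-specific explanation would report. For the XOR-like model, changing one feature at a time leaves the mean prediction unchanged, so a one-at-a-time perturbation would assign both features zero. Enumerating all subsets gives each feature -0.25, and the contributions sum to the difference between the prediction for the instance (0) and the mean prediction (0.5).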


Metadata
Title
Explaining prediction models and individual predictions with feature contributions
Authors
Erik Štrumbelj
Igor Kononenko
Publication date
01.12.2014
Publisher
Springer London
Published in
Knowledge and Information Systems / Issue 3/2014
Print ISSN: 0219-1377
Electronic ISSN: 0219-3116
DOI
https://doi.org/10.1007/s10115-013-0679-x
