2015 | OriginalPaper | Book chapter

A Structural Pattern Mining Approach for Credit Risk Assessment

Written by: Bernardete Ribeiro, Ning Chen, Alexander Kovačec

Published in: Hybrid Artificial Intelligent Systems

Publisher: Springer International Publishing

Abstract

In recent years, graph mining has taken a valuable step towards the efficient discovery of substructures in complex input data that do not fit the usual data mining models. A graph is a general and powerful data representation formalism that has found widespread application in many scientific fields. Finding subgraphs that compress the data by abstracting instances of substructures and that identify interesting patterns is thus crucial. In financial settings, data are very complex, and ignoring the relationships between risk factors seriously degrades the quality of predictions. In this paper, we posit that risk analysis can be improved if structure is taken into account by discovering financial motifs in the input graphs. We use gBoost, which learns from graph data using a linear programming procedure combined with a substructure mining algorithm. We also propose an algorithm that has been shown to extract graph structure efficiently from feature vector data. Furthermore, we empirically show that the graph-mining model is competitive with state-of-the-art machine learning approaches in classification accuracy, without an increase in computational cost.
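
The gBoost classification step mentioned in the abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes the decision-stump form used in the gBoost literature, h(x; t, ω) = ω(2·I(t ⊆ x) − 1) with ω ∈ {−1, +1}, and it simplifies the subgraph test to edge-set inclusion, which is only valid when every node label is unique within a graph. The node names (`debt`, `liquidity`, `profit`), the weights, and the helpers `make_graph`, `stump`, and `predict` are hypothetical; in gBoost proper the patterns and weights would come from the substructure mining and linear programming steps, which are omitted here.

```python
from typing import FrozenSet, Iterable, List, Tuple

Edge = Tuple[str, str]
Graph = FrozenSet[Edge]
# One weak learner: (subgraph pattern t, sign omega, weight alpha)
Stump = Tuple[Graph, int, float]


def make_graph(edges: Iterable[Edge]) -> Graph:
    """Build an undirected graph as a set of edges with sorted endpoints."""
    return frozenset((min(a, b), max(a, b)) for a, b in edges)


def stump(graph: Graph, pattern: Graph, omega: int) -> int:
    """Decision stump h(x; t, omega) = omega * (2 * I(t in x) - 1).

    Assumption: node labels are unique, so 'pattern is a subgraph of graph'
    reduces to edge-set inclusion. General subgraph isomorphism is harder.
    """
    contains = pattern <= graph
    return omega * (1 if contains else -1)


def predict(graph: Graph, model: List[Stump]) -> int:
    """Sign of the weighted vote over all stumps (+1 or -1)."""
    score = sum(alpha * stump(graph, t, omega) for t, omega, alpha in model)
    return 1 if score >= 0 else -1


# Hypothetical model: two patterns with hand-picked weights that sum to 1.
model: List[Stump] = [
    (make_graph([("debt", "liquidity")]), +1, 0.6),
    (make_graph([("profit", "liquidity")]), -1, 0.4),
]

# Hypothetical firms, each described by relationships between risk factors.
firm_a = make_graph([("debt", "liquidity"), ("debt", "profit")])
firm_b = make_graph([("profit", "liquidity")])

label_a = predict(firm_a, model)
label_b = predict(firm_b, model)
```

Each stump votes according to whether its pattern occurs in the input graph, so the learned patterns double as the interpretable "motifs" the abstract refers to. Which label denotes the risky class is a convention this sketch leaves open.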

Metadata
Title
A Structural Pattern Mining Approach for Credit Risk Assessment
Written by
Bernardete Ribeiro
Ning Chen
Alexander Kovačec
Copyright year
2015
DOI
https://doi.org/10.1007/978-3-319-19644-2_7