2022 | OriginalPaper | Book chapter

2. How AI Models Are Built

Written by: Rossella Locatelli, Giovanni Pepe, Fabio Salis

Published in: Artificial Intelligence and Credit Risk

Publisher: Springer International Publishing

Abstract

This chapter describes the main kinds of data used in AI models today, distinguishing between “structured”, “semi-structured” and “unstructured” data. Text analysis and Natural Language Processing are presented as the main techniques for structuring unstructured data. Examples of alternative credit data are described, including transactional data, data from telephone and other utility providers, data extracted from social profiles, data extracted from the World Wide Web and data gathered through surveys and questionnaires. The chapter also discusses the option of estimating a model solely by means of machine learning techniques, detailing the characteristics of the most widely used ML algorithms: decision trees, random forests, gradient boosting and neural networks. Finally, it details the application of a special type of neural network, the autoencoder.

Footnotes
1. Structured data are based on a schema and can be represented as rows and columns stored in a central repository, typically a relational database, from which individual items can be retrieved separately or in a variety of combinations for processing and analysis.
 
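2. Stop words are very common words (such as articles, prepositions and conjunctions) that are filtered out during text processing because they carry little informative content.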
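
Stop-word filtering can be sketched as follows. This is a minimal illustration, not the chapter's method: the `STOP_WORDS` set is a deliberately tiny, hypothetical list and `remove_stop_words` is a name chosen here for the example; real applications use much longer, language-specific lists.

```python
import re

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch).
STOP_WORDS = {"a", "an", "the", "of", "to", "in", "on", "and", "is", "for"}

def remove_stop_words(text: str) -> list[str]:
    """Lower-case the text, split it into word tokens and drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [token for token in tokens if token not in STOP_WORDS]

tokens = remove_stop_words("Payment of the invoice is late for the second month")
print(tokens)  # ['payment', 'invoice', 'late', 'second', 'month']
```

The remaining tokens are the ones that would typically be passed on to later structuring steps such as counting or vectorisation.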
 
3. Liberati C. et al., “Personal values and credit scoring: new insights in the financial prediction”, Journal of the Operational Research Society, February 2018. The word “psychometrics” refers to the set of psychological analysis methods for quantitative behavioural assessments. The leader in this field, the Entrepreneurial Finance Lab (EFL), bases its scores on 10 years of research at Harvard.
 
4. For example, from Faire.ai, a B2B fintech (https://www.faire.ai/).
 
5. SSM-2020–0744, “Identification and measurement of credit risk in the context of the coronavirus (COVID-19) pandemic”, Frankfurt am Main, 4 December 2020.
 
6. Art. 174: “a) […] The input variables shall form a reasonable and effective basis for the resulting predictions […]; e) the institution shall complement the statistical model by human judgement and human oversight to review model-based assignments and to ensure that the models are used appropriately. Review procedures shall aim at finding and limiting errors associated with model weaknesses. Human judgements shall take into account all relevant information not considered by the model. The institution shall document how human judgement and model results are to be combined”.
 
7. Overfitting occurs when a statistical model has too many parameters relative to the number of observations: it achieves excellent performance on the training set but weak performance on the validation sets.
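
The gap between training and validation performance can be illustrated with a small sketch. Everything here is an assumption made for illustration: the synthetic data, the 25% label noise, and the two toy models. The nearest-neighbour model memorises every training point (in effect one parameter per observation), while the threshold model has a single cut-off, which is fixed rather than estimated to keep the sketch short.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable

def make_data(n, noise=0.25):
    """Synthetic data: the true label is 1 when x > 0.5, flipped with probability `noise`."""
    data = []
    for _ in range(n):
        x = rng.random()
        y = int(x > 0.5)
        if rng.random() < noise:
            y = 1 - y
        data.append((x, y))
    return data

def predict_1nn(train, x):
    """Memorising model: return the label of the closest training point."""
    return min(train, key=lambda point: abs(point[0] - x))[1]

def predict_threshold(x):
    """Simple model with a single cut-off (assumed, not estimated)."""
    return int(x > 0.5)

def accuracy(predict, data):
    """Share of observations whose predicted label matches the recorded label."""
    return sum(predict(x) == y for x, y in data) / len(data)

train, valid = make_data(200), make_data(500)

acc_1nn_train = accuracy(lambda x: predict_1nn(train, x), train)
acc_1nn_valid = accuracy(lambda x: predict_1nn(train, x), valid)
acc_thr_valid = accuracy(predict_threshold, valid)

print("1-NN train:", acc_1nn_train)
print("1-NN validation:", acc_1nn_valid)
print("threshold validation:", acc_thr_valid)
```

The memorising model reproduces the noisy training labels perfectly, so its training accuracy says little about how it behaves on data it has not seen.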
 
8. Incremental learning refers to statistical models that keep learning as they acquire new data.
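
As a minimal sketch of the idea, the model below is updated one observation at a time instead of being re-estimated on the full data set. The class `OnlineLogisticModel`, its `learn_one` method, the learning rate and the four-observation stream are all hypothetical choices made for this illustration; the update rule is plain stochastic gradient descent on a one-feature logistic model.

```python
import math

class OnlineLogisticModel:
    """One-feature logistic model updated one observation at a time (illustrative)."""

    def __init__(self, learning_rate=0.5):
        self.w = 0.0  # slope
        self.b = 0.0  # intercept
        self.learning_rate = learning_rate
        self.n_seen = 0  # number of observations processed so far

    def predict_proba(self, x):
        """Estimated probability that the label is 1."""
        return 1.0 / (1.0 + math.exp(-(self.w * x + self.b)))

    def learn_one(self, x, y):
        """Single gradient step on the log-loss for one new observation."""
        error = y - self.predict_proba(x)
        self.w += self.learning_rate * error * x
        self.b += self.learning_rate * error
        self.n_seen += 1

model = OnlineLogisticModel()

# Hypothetical data stream: positive x carries label 1, negative x label 0.
stream = [(1.0, 1), (-1.0, 0), (2.0, 1), (-2.0, 0)]
for x, y in stream:
    model.learn_one(x, y)  # the model changes as each observation arrives

print(model.n_seen, model.predict_proba(2.0), model.predict_proba(-2.0))
```

No past observation is stored: each new data point only nudges the current parameters, which is what distinguishes this from refitting on the whole history.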
 
References
Kamber et al., “Generalization and Decision Tree Induction: Efficient Classification in Data Mining”, 1997.
Liberati C. et al., “Personal values and credit scoring: new insights in the financial prediction”, Journal of the Operational Research Society, February 2018.
SSM-2020–0744, “Identification and measurement of credit risk in the context of the coronavirus (COVID-19) pandemic”, Frankfurt am Main, 4 December 2020.
Metadata
Title
How AI Models Are Built
Written by
Rossella Locatelli
Giovanni Pepe
Fabio Salis
Copyright year
2022
DOI
https://doi.org/10.1007/978-3-031-10236-3_2