2021 | OriginalPaper | Chapter

4. The Analysis of Big Financial Data Through Artificial Intelligence Methods

Authors: Erkan Ozhan, Erdinç Uzun

Published in: The Impact of Artificial Intelligence on Governance, Economics and Finance, Volume I

Publisher: Springer Nature Singapore

Abstract

With the evolution of technology, a new world of data has emerged: data that does not degrade, can be reached from anywhere, and streams and multiplies continuously. The data created by business firms, scientific research centers, and automation systems, in particular, has reached enormous volumes. Extracting meaningful, previously unexplored, and valuable information or inferences from these piles of data has become the main goal of many data analysts. In this chapter, the techniques of artificial intelligence and their capabilities were discussed first. Next, the techniques most widely used in the finance sector, their advantages and weaknesses, and the methods that can be used to process the data generated by the finance sector, which is one of the leading sources of big data, were compared. The current versions of the artificial intelligence methods most widely used in the finance sector were surveyed, and the new capabilities and contributions they bring to the sector were examined. In particular, what classification, clustering, association rules, and time series analysis methods cover, and which problems they can solve, were examined and explained to the reader. Sample studies were used to illustrate credit scoring and customer segmentation, where classification and clustering methods are especially employed. The aim was to present the principles on which up-to-date methods are based, together with their theoretical and practical applications, in a meaningful way. In addition, practical and useful software for data analysis in the finance sector was introduced and its capabilities were described. Finally, because financial data are classified as big data, the use of big data processing techniques was examined through examples.
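To make the customer segmentation task mentioned above concrete, the following minimal sketch groups customers with k-means clustering. It is not taken from the chapter: the choice of k-means, the `kmeans` helper, the two features, and the customer figures are all assumptions made for illustration.

```python
from math import dist  # Euclidean distance, Python 3.8+


def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm.

    Assumption for reproducibility: the first k points seed the centroids.
    Real data would normally be scaled first so that no feature dominates.
    """
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # empty cluster: keep as is
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return labels, centroids


# Hypothetical customers: (monthly spend, transactions per month).
customers = [(200, 3), (5000, 40), (250, 4), (5200, 45), (180, 2), (4800, 38)]
labels, centroids = kmeans(customers, k=2)
print(labels)     # segment index per customer
print(centroids)  # one "typical customer" per segment
```

Each centroid can be read as the profile of a typical customer in its segment. A credit scoring model would instead be a classification task trained on labeled outcomes.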
The difficulties encountered in analyzing big data, which is a natural product of this sector, were presented together with solutions to them. Current big data processing solutions, in particular Hadoop, Spark, MapReduce, distributed computing, and GPU (Graphics Processing Unit) computing, were explained comparatively. The main principles on which big data processing techniques are based were simplified so that readers can follow them and were supported by examples from the sector. Spark, Hadoop, and MapReduce, the leading methods for processing big data, were examined in particular. Finally, the contributions that artificial intelligence and big data processing techniques have made to the sector were summarized and the results were presented.
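The MapReduce principle named above can be sketched in a few lines of single-process Python. This is only an illustration of the map, shuffle, and reduce phases, not Hadoop or Spark code and not an example from the chapter. The function names, account identifiers, and amounts are hypothetical.

```python
from collections import defaultdict


def map_phase(record):
    """Map: turn one raw transaction into a (key, value) pair."""
    account, amount = record
    yield account, amount


def shuffle(pairs):
    """Shuffle: group all values that share the same key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse one key's values into a single result."""
    return key, sum(values)


def map_reduce(records):
    # In a real cluster the map and reduce calls run on many machines;
    # here they run sequentially to show the data flow only.
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


# Hypothetical transactions: (account id, amount).
transactions = [("acc-1", 120), ("acc-2", 75), ("acc-1", 30),
                ("acc-3", 10), ("acc-2", 25)]
totals = map_reduce(transactions)
print(totals)
```

Because each map call sees a single record and each reduce call sees a single key, both phases can be distributed across nodes without shared state. Frameworks such as Hadoop exploit this property.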

Footnotes
1
Identifying the license plate of a vehicle amounts to determining the class of each character on the plate, drawn from an alphabet of about 40 classes on average.
 
2
These companies promise the required infrastructure for businesses by providing monthly or annual subscriptions. Even though they are widely used today, some businesses are establishing their own data analysis departments for carrying out these analyses.
 
3
ENIAC (Electronic Numerical Integrator And Computer): “Built between 1943 and 1945 the first large-scale computer to run at electronic speed without being slowed by any mechanical parts” (CHM).
 
15
Visit https://kudu.apache.org/ for more detailed information.
 
Metadata
Title: The Analysis of Big Financial Data Through Artificial Intelligence Methods
Authors: Erkan Ozhan, Erdinç Uzun
Copyright Year: 2021
Publisher: Springer Nature Singapore
DOI: https://doi.org/10.1007/978-981-33-6811-8_4