2019 | OriginalPaper | Chapter

18. Hadoop: A Standard Framework for Computer Cluster

Authors: Eljar Akhgarnush, Lars Broeckers, Thorsten Jakoby

Published in: The Impact of Digital Transformation and FinTech on the Finance Professional

Publisher: Springer International Publishing

Abstract

Hadoop has become a standard for processing big data in a clustered environment. This chapter introduces Hadoop and HDFS along with other important Apache projects, including Spark, Hive, and HBase. It also introduces basic concepts such as worker nodes and the cluster manager.

Footnotes
1. RC: record columnar files.
2. ORC: optimized RC files.

Metadata
Title
Hadoop: A Standard Framework for Computer Cluster
Authors
Eljar Akhgarnush
Lars Broeckers
Thorsten Jakoby
Copyright Year
2019
DOI
https://doi.org/10.1007/978-3-030-23719-6_18