2019 | OriginalPaper | Book chapter

Processing Using Spark—A Potent of BD Technology

Written by: M. Venkatesh Saravanakumar, Sabibullah Mohamed Hanifa

Published in: Big Data Processing Using Spark in Cloud

Publisher: Springer Singapore

Abstract

Processing, accessing, analyzing, securing, and storing big data are the core activities of big data (BD) technology. Spark serves as its core processing layer: an open-source, in-memory cluster computing platform and unified data processing engine that offers fast and reliable analysis for all types of data. It can join datasets across multiple disparate data sources. Because it supports in-memory computing, it enables faster query access than disk-based engines such as Hadoop. This chapter covers the main processing capabilities behind Spark and its related components: Resilient Distributed Datasets (RDDs), the scalable machine learning library (MLlib), the incremental Spark Streaming pipeline, parallel graph computation through GraphX, SQL DataFrames and Spark SQL (a data processing paradigm that supports columnar storage), and recommendation systems built with MLlib. All libraries operate on RDDs as the data abstraction, which makes them easy to compose in any application. RDDs are the major abstraction of this fault-tolerant computing engine: they provide explicit support for sharing data across a user's computations, can capture a wide range of processing workloads, and can be manipulated in parallel across the cluster in a fault-tolerant manner. They are exposed through functional programming APIs in BD-supported languages such as Scala and Python. The chapter also discusses the core scalability of Spark for building high-level data processing libraries for future-generation applications. To help readers understand and simplify BD tasks as a whole, with a focus on processing for hindsight, insight, and foresight, Spark's core engine and the components of its ecosystem are explained in a clear, interpretable way, which is essential for those working in data science today.
The chapter also takes a deep dive into current big data tools in Spark and cloud storage, with the aim of removing bottlenecks in the development of efficient and comprehensive analytics applications.
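
The fault-tolerance idea behind RDDs mentioned in the abstract can be illustrated with a small sketch. This is a toy model in plain Python, not Spark's API: the class `ToyRDD` and its methods `lose_partition` and `compute_count` are hypothetical names introduced only for illustration. It assumes the lineage-based recovery approach described in the RDD literature, in which each dataset records the transformations that produced it, so a lost partition can be recomputed from its source instead of being restored from a replica.

```python
class ToyRDD:
    """Toy stand-in for an RDD: partitioned source data plus a recorded lineage.

    Hypothetical illustration only; this is not part of Spark.
    """

    def __init__(self, source_partitions, lineage=()):
        self._source = [list(p) for p in source_partitions]  # input partitions
        self._lineage = tuple(lineage)  # ordered (kind, function) transformations
        self._cache = {}                # partition index -> computed rows
        self.compute_count = 0          # how many partition computations ran

    def map(self, fn):
        # Transformations are lazy: they only extend the lineage.
        return ToyRDD(self._source, self._lineage + (("map", fn),))

    def filter(self, pred):
        return ToyRDD(self._source, self._lineage + (("filter", pred),))

    def _compute(self, index):
        # Replay the lineage over one source partition.
        self.compute_count += 1
        rows = self._source[index]
        for kind, fn in self._lineage:
            if kind == "map":
                rows = [fn(r) for r in rows]
            else:
                rows = [r for r in rows if fn(r)]
        return rows

    def partition(self, index):
        # Compute on first access, then serve from the in-memory cache.
        if index not in self._cache:
            self._cache[index] = self._compute(index)
        return self._cache[index]

    def lose_partition(self, index):
        # Simulate a worker failure that drops one cached partition.
        self._cache.pop(index, None)

    def collect(self):
        return [row for i in range(len(self._source)) for row in self.partition(i)]


numbers = ToyRDD([[1, 2, 3], [4, 5, 6]])
even_squares = numbers.map(lambda x: x * x).filter(lambda x: x % 2 == 0)

before = even_squares.collect()   # computes both partitions
even_squares.lose_partition(1)    # simulated failure
after = even_squares.collect()    # only partition 1 is recomputed from lineage
print(before, after, even_squares.compute_count)
```

In this sketch only the lost partition is recomputed after the simulated failure, while the surviving partition is served from memory. Real Spark applies the same idea across cluster nodes, with scheduling and shuffle dependencies that the toy model leaves out.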

Title: Processing Using Spark—A Potent of BD Technology
Written by: M. Venkatesh Saravanakumar, Sabibullah Mohamed Hanifa
Publisher: Springer Singapore
Book: Big Data Processing Using Spark in Cloud
Print ISBN: 978-981-13-0549-8
Electronic ISBN: 978-981-13-0550-4
Copyright year: 2019
DOI: https://doi.org/10.1007/978-981-13-0550-4_9
