2016 | OriginalPaper | Chapter

Design Issues of Big Data Parallelisms

Author: Koushik Mondal

Published in: Information Systems Design and Intelligent Applications

Publisher: Springer India

Abstract

Data-intensive computing for scientific research needs effective tools for capturing and curating data, designing appropriate algorithms, and performing multidimensional analysis that supports effective decision making for society. The computational environments used for different data-intensive problems, such as sentiment analysis and opinion mining of social media, Massive Open Online Courses (MOOCs), the Large Hadron Collider at CERN, and the Square Kilometer Array (SKA) radio telescope project, are usually capable of generating exabytes (EB) of data per day, but present conditions limit them to more manageable data collection rates. The range of disciplines involved and the differing data generation rates of lab experiments, online as well as offline, make creating effective tools a formidable problem. In this paper we discuss different data-intensive computing tools, trends in emerging technologies, how big data processing relies heavily on those tools, and how it helps in creating models and making decisions.

Title: Design Issues of Big Data Parallelisms
Author: Koushik Mondal
Publisher: Springer India
Book: Information Systems Design and Intelligent Applications
Print ISBN: 978-81-322-2750-2

Electronic ISBN: 978-81-322-2752-6

Copyright Year: 2016
DOI: https://doi.org/10.1007/978-81-322-2752-6_20
