2016 | OriginalPaper | Chapter

Design Issues of Big Data Parallelisms

Author: Koushik Mondal

Published in: Information Systems Design and Intelligent Applications

Publisher: Springer India

Abstract

Data-intensive computing for scientific research needs effective tools for capturing data, curating it, designing appropriate algorithms, and performing multidimensional analysis that supports effective decision making for society. The computational environments used for data-intensive problems, such as sentiment analysis and opinion mining of social media, Massive Open Online Courses (MOOCs), the Large Hadron Collider at CERN, and the Square Kilometer Array (SKA) radio telescope project, are usually capable of generating exabytes (EB) of data per day, but present conditions limit them to more manageable data collection rates. The range of disciplines involved and the differing data generation rates of lab experiments, online as well as offline, make creating effective tools a formidable problem. In this paper we discuss different data-intensive computing tools, trends in emerging technologies, how big data processing relies heavily on those tools, and how it helps in creating models and making decisions.

Metadata
Title
Design Issues of Big Data Parallelisms
Author
Koushik Mondal
Copyright Year
2016
Publisher
Springer India
DOI
https://doi.org/10.1007/978-81-322-2752-6_20
