Published in: The Journal of Supercomputing 5/2016

1 May 2016

J2M: a Java to MapReduce translator for cloud computing

Authors: Bing Li, Junbo Zhang, Ning Yu, Yi Pan

Abstract

Cloud computing has gradually become infrastructure for a wide range of scientific research and computing, and many institutions and organizations are migrating products from local servers to the cloud. One current challenge is running software efficiently on cloud platforms: much existing code cannot execute in parallel in a cloud environment, so the power of the cloud is not fully exploited. Redesigning and converting existing sequential code for cloud platforms is costly. Automatic translation from sequential code to cloud code is therefore one direction for resolving the problem of code migration to cloud infrastructure. In this paper, a new Java to MapReduce (J2M) translator is developed to automatically translate sequential Java into cloud code for specific data-parallel code with large loops. The paper details the design of the translator and evaluates its performance through experiments. The experimental results indicate that the translator can precisely translate sequential Java into cloud code and that the translated code achieves very good speedup; we expect that almost linear speedup is possible if sufficiently large data is processed. We believe that the J2M translator is an ideal model for code migration and will play an important role in the transition era of cloud computing.
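
To make the idea of translating "data-parallel code with large loops" concrete, the following is a minimal sketch in plain Java. It is not output of the J2M translator and does not use Hadoop; the class name `LoopToMapReduceSketch`, the sum-of-squares workload, and the helpers `mapSplit` and `reduce` are all hypothetical and chosen only for illustration. It assumes a loop whose iterations are independent apart from a final accumulation, and shows how such a loop could be restated as a map step over input splits followed by a reduce step over partial results.

```java
import java.util.ArrayList;
import java.util.List;

public class LoopToMapReduceSketch {

    // Sequential form: one large loop whose iterations are independent
    // except for the final accumulation into `sum`.
    public static long sequentialSumOfSquares(int[] data) {
        long sum = 0;
        for (int i = 0; i < data.length; i++) {
            sum += (long) data[i] * data[i];
        }
        return sum;
    }

    // "Map" step (hypothetical helper): process one contiguous split
    // [from, to) of the input and emit a single partial result.
    public static long mapSplit(int[] data, int from, int to) {
        long partial = 0;
        for (int i = from; i < to; i++) {
            partial += (long) data[i] * data[i];
        }
        return partial;
    }

    // "Reduce" step (hypothetical helper): combine the partial results.
    public static long reduce(List<Long> partials) {
        long total = 0;
        for (long p : partials) {
            total += p;
        }
        return total;
    }

    // Driver: cut the loop range into numSplits pieces (numSplits > 0),
    // map each piece, then reduce. The map calls run one after another
    // here; on a cluster they could be scheduled on different nodes.
    public static long mapReduceSumOfSquares(int[] data, int numSplits) {
        List<Long> partials = new ArrayList<>();
        int chunk = (data.length + numSplits - 1) / numSplits; // ceiling division
        for (int s = 0; s < numSplits; s++) {
            int from = Math.min(s * chunk, data.length);
            int to = Math.min(from + chunk, data.length);
            partials.add(mapSplit(data, from, to));
        }
        return reduce(partials);
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3, 4, 5};
        System.out.println(sequentialSumOfSquares(data));   // prints 55
        System.out.println(mapReduceSumOfSquares(data, 2)); // prints 55
    }
}
```

Because each call to `mapSplit` touches only its own split, the splits carry no dependencies on one another, which is the property that lets a loop of this kind be distributed. How J2M itself detects and rewrites such loops is described in the paper, not here.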

Metadata
Title
J2M: a Java to MapReduce translator for cloud computing
Authors
Bing Li
Junbo Zhang
Ning Yu
Yi Pan
Publication date
1 May 2016
Publisher
Springer US
Published in
The Journal of Supercomputing / Issue 5/2016
Print ISSN: 0920-8542
Electronic ISSN: 1573-0484
DOI
https://doi.org/10.1007/s11227-016-1695-x
