Published in: Cluster Computing 4/2014

1 December 2014

Adaptive Combiner for MapReduce on cloud computing

Authors: Tzu-Chi Huang, Kuo-Chih Chu, Wei-Tsong Lee, Yu-Sheng Ho

Abstract

MapReduce is a programming model for processing massive amounts of data in cloud computing environments. It processes data in two phases and must transfer intermediate data between computers from one phase to the next. MapReduce lets programmers aggregate intermediate data with a function called a combiner before transferring it. Because the choice of using a combiner is left to programmers, MapReduce risks performance degradation: aggregating intermediate data benefits some applications but harms others. We propose the Adaptive Combiner for MapReduce (ACMR), which lets MapReduce decide automatically and intelligently whether to use the combiner, improving performance without any intervention from programmers. In experiments on seven applications, MapReduce with ACMR achieves performance comparable to that of the system that is optimal for each application.
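To make the trade-off described in the abstract concrete, the following is a minimal single-process sketch of what a combiner does, using word counting as the example. It is not ACMR and not the authors' code; the function names `map_words`, `combine`, and `reduce_counts` are hypothetical and chosen only for illustration. The sketch shows that local aggregation shrinks the intermediate data when keys repeat, and yields no reduction (only extra work) when every key is distinct.

```python
from collections import defaultdict


def map_words(text):
    """Map phase: emit one (word, 1) pair per word in the input split."""
    return [(word, 1) for word in text.split()]


def combine(pairs):
    """Combiner: sum the values per key locally, before any transfer."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return sorted(totals.items())


def reduce_counts(pairs):
    """Reduce phase: sum the values per key over all received pairs."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


# Illustrative inputs (assumed, not taken from the paper's experiments).
repetitive = map_words("a b a a b a")  # few distinct keys
distinct = map_words("a b c d e f")    # every key is unique

# With repeated keys, the combiner shrinks 6 pairs to 2 before transfer.
print(len(repetitive), "->", len(combine(repetitive)))

# With all-distinct keys, 6 pairs stay 6: the combiner only adds cost.
print(len(distinct), "->", len(combine(distinct)))

# The final result is the same with or without the combiner.
assert reduce_counts(combine(repetitive)) == reduce_counts(repetitive)
```

In a real deployment the pairs would cross the network between the two phases, so the first case saves bandwidth while the second spends CPU time for nothing; this is the kind of per-application difference that motivates deciding adaptively.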

Metadata
Title
Adaptive Combiner for MapReduce on cloud computing
Authors
Tzu-Chi Huang
Kuo-Chih Chu
Wei-Tsong Lee
Yu-Sheng Ho
Publication date
1 December 2014
Publisher
Springer US
Published in
Cluster Computing, Issue 4/2014
Print ISSN: 1386-7857
Electronic ISSN: 1573-7543
DOI
https://doi.org/10.1007/s10586-014-0362-3