Published in: Cluster Computing 2/2018

23 November 2017

Performance prediction of parallel computing models to analyze cloud-based big data applications

Authors: Chao Shen, Weiqin Tong, Kim-Kwang Raymond Choo, Samina Kausar

Abstract

Performance evaluation of cloud centers is a necessary prerequisite for fulfilling contractual quality of service, particularly for big data applications. However, evaluating the performance of cloud services effectively is challenging because of the complexity of those services and the diversity of big data applications. In this paper, we propose a performance evaluation model for parallel computing models deployed in cloud centers to support big data applications. In this model, a big data application is divided into many parallel tasks, and task arrivals follow a general distribution. The model also accounts for resource heterogeneity, resource contention among cloud nodes, and the data storage strategy, all of which affect the performance of parallel computing models. It yields key performance indicators of a cloud center, such as the mean number of tasks in the system, the probability that a task obtains immediate service, and the task waiting time, and it can be used to predict the execution time of applications. We demonstrate the utility of the model through simulations and benchmarking with the WordCount and TeraSort applications.
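
To make the three performance indicators named in the abstract concrete, the following is a minimal sketch that computes them for a deliberately simplified queue. It is not the authors' model: the paper assumes a general arrival distribution and accounts for heterogeneity, contention, and storage strategy, whereas this sketch assumes Poisson arrivals, exponential service times, and identical servers (a textbook M/M/c queue evaluated with the Erlang C formula). The names `mmc_metrics` and `QueueMetrics` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from math import factorial


@dataclass
class QueueMetrics:
    p_immediate: float     # probability that an arriving task is served without waiting
    mean_in_system: float  # mean number of tasks in the system (queued + in service)
    mean_wait: float       # mean time a task spends waiting in the queue


def mmc_metrics(arrival_rate: float, service_rate: float, servers: int) -> QueueMetrics:
    """Steady-state indicators of an M/M/c queue (simplifying assumption,
    not the general-arrival model described in the abstract)."""
    if arrival_rate <= 0 or service_rate <= 0 or servers < 1:
        raise ValueError("rates must be positive and servers must be at least 1")

    offered_load = arrival_rate / service_rate  # a = lambda / mu
    utilization = offered_load / servers        # rho = a / c
    if utilization >= 1:
        raise ValueError("queue is unstable: utilization must be below 1")

    # Erlang C: probability that an arriving task finds all servers busy.
    head = sum(offered_load**k / factorial(k) for k in range(servers))
    tail = offered_load**servers / (factorial(servers) * (1 - utilization))
    p_wait = tail / (head + tail)

    mean_queue = p_wait * utilization / (1 - utilization)  # mean number waiting
    mean_wait = mean_queue / arrival_rate                  # Little's law
    mean_in_system = mean_queue + offered_load             # waiting + in service

    return QueueMetrics(1 - p_wait, mean_in_system, mean_wait)


if __name__ == "__main__":
    # Illustrative numbers only: 8 tasks per second, 10 nodes, 1 task per second per node.
    print(mmc_metrics(arrival_rate=8.0, service_rate=1.0, servers=10))
```

A model with general arrivals, as in the paper, would replace the Erlang C step with a different analysis, but the indicators it reports have the same meaning as the three fields above.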


Metadata
Title
Performance prediction of parallel computing models to analyze cloud-based big data applications
Authors
Chao Shen
Weiqin Tong
Kim-Kwang Raymond Choo
Samina Kausar
Publication date
23 November 2017
Publisher
Springer US
Published in
Cluster Computing, Issue 2/2018
Print ISSN: 1386-7857
Electronic ISSN: 1573-7543
DOI
https://doi.org/10.1007/s10586-017-1385-3
