Skip to main content
Top

2019 | OriginalPaper | Chapter

Scalability Evaluation of Big Data Processing Services in Clouds

Authors : Xin Zhou, Congfeng Jiang, Yeliang Qiu, Tiantian Fan, Yumei Wang, Liangbin Zhang, Jian Wan, Weisong Shi

Published in: Benchmarking, Measuring, and Optimizing

Publisher: Springer International Publishing

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

Currently, many cloud providers deploy their big data processing systems as cloud services, which helps users conveniently manage and process their data in clouds. Among different service providers’ big data processing services, how to evaluate and compare their scalability is an interesting and challenging work. Most traditional benchmark tools focus on performance evaluation of big data processing systems, such as aggregated throughput and IOPS, but fail to conduct a quantitative analysis of their scalability. In this paper, we propose a measurement methodology to quantify the scalability of big data processing services, which makes the cloud services scalability comparable. We conduct a group of comparative experiments on AliCloud E-MapReduce and Baidu MRS, and collect their respective scalability characteristics under Hadoop and Spark workloads. The scalability characteristics observed in our work could help cloud users choose the best cloud service platform to set up an optimized big data processing system to achieve their specific goals more successfully.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literature
6.
go back to reference Cooper, B.F., Silberstein, A., Tam, E., Ramakrishnan, R., Sears, R.: Benchmarking cloud serving systems with YCSB. In: SoCC, pp. 143–154 (2010) Cooper, B.F., Silberstein, A., Tam, E., Ramakrishnan, R., Sears, R.: Benchmarking cloud serving systems with YCSB. In: SoCC, pp. 143–154 (2010)
7.
go back to reference Lakshman, A., Malik, P.: Cassandra: a decentralized structured storage system. Spec. Interest Group Oper. Syst. Oper. Syst. Rev. 44(2), 35–40 (2010) Lakshman, A., Malik, P.: Cassandra: a decentralized structured storage system. Spec. Interest Group Oper. Syst. Oper. Syst. Rev. 44(2), 35–40 (2010)
8.
go back to reference George, L.: HBase - The Definitive Guide. O’Reilly, Newton (2011) George, L.: HBase - The Definitive Guide. O’Reilly, Newton (2011)
9.
go back to reference Cooper, B.F., et al.: PNUTS: Yahoo!’s hosted data serving platform. Proc. VLDB Endow. 1(2), 1277–1288 (2008)CrossRef Cooper, B.F., et al.: PNUTS: Yahoo!’s hosted data serving platform. Proc. VLDB Endow. 1(2), 1277–1288 (2008)CrossRef
10.
go back to reference Shi, Y., Meng, X., Zhao, J., Hu, X., Liu, B., Wang, H.: Benchmarking cloud-based data management systems. In: Proceedings of the Second International Workshop on Cloud Data Management, pp. 47–54. ACM (2010) Shi, Y., Meng, X., Zhao, J., Hu, X., Liu, B., Wang, H.: Benchmarking cloud-based data management systems. In: Proceedings of the Second International Workshop on Cloud Data Management, pp. 47–54. ACM (2010)
11.
go back to reference Ferdman, M., et al.: Clearing the clouds: a study of emerging scale-out workloads on modern hardware. In: ACM SIGARCH Computer Architecture News, vol. 40, pp. 37–48. ACM (2012) Ferdman, M., et al.: Clearing the clouds: a study of emerging scale-out workloads on modern hardware. In: ACM SIGARCH Computer Architecture News, vol. 40, pp. 37–48. ACM (2012)
12.
go back to reference Jia, Z., et al.: Understanding big data analytics workloads on modern processors. IEEE Trans. Parallel Distrib. Syst. 28(6), 1797–1810 (2017)CrossRef Jia, Z., et al.: Understanding big data analytics workloads on modern processors. IEEE Trans. Parallel Distrib. Syst. 28(6), 1797–1810 (2017)CrossRef
13.
go back to reference Jia, Z., Wang, L., Zhan, J., Zhang, L., Luo, C.: Characterizing data analysis workloads in data centers. In: IISWC, pp. 66–76. IEEE (2013) Jia, Z., Wang, L., Zhan, J., Zhang, L., Luo, C.: Characterizing data analysis workloads in data centers. In: IISWC, pp. 66–76. IEEE (2013)
14.
go back to reference Huang, S., Huang, J., Dai, J., Xie, T., Huang, B.: The HiBench benchmark suite: characterization of the MapReduce-based data analysis. In: 2010 IEEE 26th International Conference on Data Engineering Workshops (ICDEW), pp. 41–51. IEEE (2010) Huang, S., Huang, J., Dai, J., Xie, T., Huang, B.: The HiBench benchmark suite: characterization of the MapReduce-based data analysis. In: 2010 IEEE 26th International Conference on Data Engineering Workshops (ICDEW), pp. 41–51. IEEE (2010)
16.
go back to reference Luo, C., et al.: CloudRank-D: benchmarking and ranking cloud computing systems for data processing applications. Front. Comput. Sci. 6(4), 347–362 (2012)MathSciNetCrossRef Luo, C., et al.: CloudRank-D: benchmarking and ranking cloud computing systems for data processing applications. Front. Comput. Sci. 6(4), 347–362 (2012)MathSciNetCrossRef
18.
go back to reference Baru, C., Bhandarkar, M., Nambiar, R., Poess, M., Rabl, T.: Benchmarking big data systems and the bigdata top100 list. Big Data 1(1), 60–64 (2013)CrossRef Baru, C., Bhandarkar, M., Nambiar, R., Poess, M., Rabl, T.: Benchmarking big data systems and the bigdata top100 list. Big Data 1(1), 60–64 (2013)CrossRef
19.
go back to reference Dede, E., Fadika, Z., Govindaraju, M., Ramakrishnan, L.: Benchmarking MapReduce implementations under different application scenarios. Future Gener. Comput. Syst. 36, 389–399 (2014)CrossRef Dede, E., Fadika, Z., Govindaraju, M., Ramakrishnan, L.: Benchmarking MapReduce implementations under different application scenarios. Future Gener. Comput. Syst. 36, 389–399 (2014)CrossRef
20.
21.
go back to reference Pavlo, A., et al.: A comparison of approaches to large-scale data analysis. In: Special Interest Group on Management Of Data, pp. 165–178. ACM (2009) Pavlo, A., et al.: A comparison of approaches to large-scale data analysis. In: Special Interest Group on Management Of Data, pp. 165–178. ACM (2009)
22.
go back to reference Rizzelli, G., Maier, G., Quagliotti, M., Schiano, M., Pattavina, A.: Assessing the scalability of next-generation wavelength switched optical networks. J. Lightwave Technol. 32(12), 2263–2270 (2014)CrossRef Rizzelli, G., Maier, G., Quagliotti, M., Schiano, M., Pattavina, A.: Assessing the scalability of next-generation wavelength switched optical networks. J. Lightwave Technol. 32(12), 2263–2270 (2014)CrossRef
23.
go back to reference Badia, S., Martín, A.F., Principe, J.: Implementation and scalability analysis of balancing domain decomposition methods. Arch. Comput. Methods Eng. 20(3), 239–262 (2013)MathSciNetCrossRef Badia, S., Martín, A.F., Principe, J.: Implementation and scalability analysis of balancing domain decomposition methods. Arch. Comput. Methods Eng. 20(3), 239–262 (2013)MathSciNetCrossRef
24.
go back to reference Gunther, N., Puglia, P., Tomasette, K.: Hadoop superlinear scalability. Queue 13(5), 20 (2015)CrossRef Gunther, N., Puglia, P., Tomasette, K.: Hadoop superlinear scalability. Queue 13(5), 20 (2015)CrossRef
25.
go back to reference Gao, J., Pattabhiraman, P., Bai, X., Tsai, W.T.: Saas performance and scalability evaluation in clouds. In: 2011 IEEE 6th International Symposium on Service Oriented System Engineering (SOSE), pp. 61–71. IEEE (2011) Gao, J., Pattabhiraman, P., Bai, X., Tsai, W.T.: Saas performance and scalability evaluation in clouds. In: 2011 IEEE 6th International Symposium on Service Oriented System Engineering (SOSE), pp. 61–71. IEEE (2011)
26.
go back to reference Jiang, C., Han, G., Lin, J., Jia, G., Shi, W., Wan, J.: Characteristics of co-allocated online services and batch jobs in internet data centers: a case study from alibaba cloud. IEEE Access 7, 22495–22508 (2019)CrossRef Jiang, C., Han, G., Lin, J., Jia, G., Shi, W., Wan, J.: Characteristics of co-allocated online services and batch jobs in internet data centers: a case study from alibaba cloud. IEEE Access 7, 22495–22508 (2019)CrossRef
27.
go back to reference Jiang, C., et al.: Energy efficiency comparison of hypervisors. Sustain. Comput.: Inf. Syst. 22, 311–321 (2019) Jiang, C., et al.: Energy efficiency comparison of hypervisors. Sustain. Comput.: Inf. Syst. 22, 311–321 (2019)
28.
go back to reference Jiang, C., et al.: Interdomain I/O optimization in virtualized sensor networks. Sensors 18(12), 4395 (2018)CrossRef Jiang, C., et al.: Interdomain I/O optimization in virtualized sensor networks. Sensors 18(12), 4395 (2018)CrossRef
29.
go back to reference Qiu, Y., Jiang, C., Wang, Y., Ou, D., Li, Y., Wan, J.: Energy aware virtual machine scheduling in data centers. Energies 12(4), 646 (2019)CrossRef Qiu, Y., Jiang, C., Wang, Y., Ou, D., Li, Y., Wan, J.: Energy aware virtual machine scheduling in data centers. Energies 12(4), 646 (2019)CrossRef
Metadata
Title
Scalability Evaluation of Big Data Processing Services in Clouds
Authors
Xin Zhou
Congfeng Jiang
Yeliang Qiu
Tiantian Fan
Yumei Wang
Liangbin Zhang
Jian Wan
Weisong Shi
Copyright Year
2019
DOI
https://doi.org/10.1007/978-3-030-32813-9_8

Premium Partner