Published in: Cluster Computing 3/2019

15-01-2018

The bandwidth-aware backup task scheduling strategy using SDN in Hadoop

Authors: Fengjun Shang, Xuanling Chen, Chenyun Yan, Luzhong Li, Yuting Zhao

Abstract

In the era of big data, traditional computing and storage capacity can no longer meet growing demand, and cloud computing has emerged in response. Research on task scheduling is one way to improve the performance of a Hadoop system from the perspective of resource allocation and management. This paper improves a speculative task scheduling strategy by using SDN technology. Under the LATE mechanism, a speculative task can turn out to be slower than the slow task it backs up, which neither reduces task turnaround time nor avoids wasting system resources. We therefore add a comparison between the slow task and its speculative task to LATE's speculative task scheduling strategy. The run time of a speculative task includes the transfer time of its input data, which depends on the real-time bandwidth of the link. On this basis, we propose an SDN-based bandwidth-aware speculative task run time estimation model (BWRE) and use it to estimate the run time of the backup task accurately. We also use SDN to provide bandwidth guarantees for the speculative task. Finally, BWRE is verified by simulation experiments. Evaluation results show that BWRE shortens job turnaround time by an average of 9.85%.
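The comparison described in the abstract can be illustrated with a minimal sketch. It is not the paper's BWRE model, whose formulas are not given here. It assumes that the backup task's run time is the input transfer time (input size divided by the link bandwidth reported by the SDN controller) plus a compute time, and that the slow task's remaining time follows the LATE heuristic of remaining progress divided by observed progress rate. All function names, parameters, and numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SlowTask:
    progress: float   # fraction of the task completed, 0 < progress <= 1
    elapsed_s: float  # seconds the task has been running


def remaining_time_s(task: SlowTask) -> float:
    """LATE-style estimate: (1 - progress) / (progress / elapsed)."""
    if task.progress <= 0 or task.elapsed_s <= 0:
        raise ValueError("progress and elapsed time must be positive")
    rate = task.progress / task.elapsed_s
    return (1.0 - task.progress) / rate


def backup_run_time_s(input_bytes: int, bandwidth_bps: float,
                      compute_s: float) -> float:
    """Assumed model: input transfer time plus compute time.

    bandwidth_bps stands in for the real-time link bandwidth that an
    SDN controller would report (hypothetical input, in bits per second).
    """
    if bandwidth_bps <= 0:
        raise ValueError("bandwidth must be positive")
    transfer_s = input_bytes * 8 / bandwidth_bps
    return transfer_s + compute_s


def should_launch_backup(task: SlowTask, input_bytes: int,
                         bandwidth_bps: float, compute_s: float) -> bool:
    """Launch a backup only if it is expected to finish before the slow task."""
    backup_s = backup_run_time_s(input_bytes, bandwidth_bps, compute_s)
    return backup_s < remaining_time_s(task)


if __name__ == "__main__":
    slow = SlowTask(progress=0.25, elapsed_s=100.0)
    block = 128 * 10**6  # illustrative 128 MB input split
    # Fast link: the transfer is cheap, so the backup is worthwhile.
    print(should_launch_backup(slow, block, 1e9, compute_s=120.0))
    # Congested link: the transfer dominates, so the backup is skipped.
    print(should_launch_backup(slow, block, 4e6, compute_s=120.0))
```

With these illustrative numbers the same slow task gets a backup on the fast link but not on the congested one, which is the case where a bandwidth-unaware scheduler would waste a slot on a backup that finishes later than the original.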

Metadata
Title
The bandwidth-aware backup task scheduling strategy using SDN in Hadoop
Authors
Fengjun Shang
Xuanling Chen
Chenyun Yan
Luzhong Li
Yuting Zhao
Publication date
15-01-2018
Publisher
Springer US
Published in
Cluster Computing, Special Issue 3/2019
Print ISSN: 1386-7857
Electronic ISSN: 1573-7543
DOI
https://doi.org/10.1007/s10586-018-1736-8
