Published in: Earth Science Informatics 4/2019

11 August 2019 | Methodology Article

ACO-DPDGW: an ant colony optimization algorithm for data placement of data-intensive geospatial workflow

Authors: Xiaozhu Wu, Ying Liu, Chongcheng Chen

Abstract

Massive data transmission between distributed data centers is the major efficiency bottleneck of geospatial workflows. Although many data placement methods have been proposed to overcome this problem, few studies have considered the impact of the workflow structure. In this paper, we define the data placement problem for data-intensive geospatial workflows, with the aim of minimizing data transfer time. To handle this problem, we propose an algorithm called ant colony optimization based data placement of data-intensive geospatial workflow (ACO-DPDGW). The traditional workflow model is represented as a node vector, so that the ants can place datasets and tasks in appropriate data centers according to a combination of pheromone information and heuristic information as they visit the nodes at random. To prevent premature convergence, a variable neighborhood search operation is embedded into ACO-DPDGW. Experiments show that the algorithm reduces data transfer volume and data transfer time even as the numbers of datasets, tasks, and data centers increase.
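The abstract gives no formulas, so the following is only a minimal Python sketch of the general idea, not the ACO-DPDGW algorithm itself. It assumes a toy cost model (total size of datasets stored in a different data center from a task that reads them), a standard pheromone-times-heuristic selection rule, datasets pinned to fixed centers, and a simple reduced variable neighborhood search step. All names (`transfer_volume`, `construct`, `vns`, `aco_placement`), parameters, and the example instance are hypothetical.

```python
import random


def transfer_volume(place, tasks, sizes):
    """Total size of datasets stored away from a task that reads them."""
    return sum(sizes[d] for t, used in tasks.items()
               for d in used if place[d] != place[t])


def construct(nodes, n_dc, tau, tasks, sizes, pinned, alpha, beta, rng):
    """One ant visits the nodes in random order and picks a center for each."""
    order = nodes[:]
    rng.shuffle(order)
    place = dict(pinned)  # assumption: pinned datasets never move
    for node in order:
        if node in place:
            continue
        weights = []
        for dc in range(n_dc):
            place[node] = dc  # tentative choice, overwritten below
            # Transfer volume among the nodes placed so far.
            partial = sum(sizes[d] for t, used in tasks.items() if t in place
                          for d in used if d in place and place[d] != place[t])
            eta = 1.0 / (1.0 + partial)  # heuristic information (assumed form)
            weights.append(tau[node][dc] ** alpha * eta ** beta)
        place[node] = rng.choices(range(n_dc), weights=weights)[0]
    return place


def vns(place, free, n_dc, cost, k_max, rng):
    """Reduced VNS: reassign k random free nodes; widen k when no gain."""
    best, best_cost = dict(place), cost(place)
    k = 1
    while k <= k_max:
        cand = dict(best)
        for node in rng.sample(free, min(k, len(free))):
            cand[node] = rng.randrange(n_dc)
        c = cost(cand)
        if c < best_cost:
            best, best_cost, k = cand, c, 1  # improvement: restart at k = 1
        else:
            k += 1
    return best, best_cost


def aco_placement(tasks, sizes, pinned, n_dc, ants=10, iters=30,
                  alpha=1.0, beta=2.0, rho=0.1, k_max=3, seed=0):
    rng = random.Random(seed)
    nodes = list(sizes) + list(tasks)  # node vector: datasets, then tasks
    free = [n for n in nodes if n not in pinned]
    tau = {n: [1.0] * n_dc for n in nodes}  # pheromone per (node, center)

    def cost(p):
        return transfer_volume(p, tasks, sizes)

    best, best_cost = None, float("inf")
    for _ in range(iters):
        for _ in range(ants):
            p = construct(nodes, n_dc, tau, tasks, sizes, pinned,
                          alpha, beta, rng)
            p, c = vns(p, free, n_dc, cost, k_max, rng)
            if c < best_cost:
                best, best_cost = p, c
        # Evaporate everywhere, then reinforce the best placement so far.
        for n in nodes:
            tau[n] = [(1.0 - rho) * x for x in tau[n]]
            tau[n][best[n]] += 1.0 / (1.0 + best_cost)
    return best, best_cost


# Hypothetical instance: three datasets, two tasks, two data centers.
sizes = {"d1": 50, "d2": 20, "d3": 30}
tasks = {"t1": ["d1", "d2"], "t2": ["d2", "d3"]}
pinned = {"d1": 0, "d3": 1}
best, best_cost = aco_placement(tasks, sizes, pinned, n_dc=2)
print(best_cost, best)
```

In this toy instance, enumerating the eight possible placements of `d2`, `t1`, and `t2` gives a minimum transfer volume of 20 (run `t1` at center 0 and `t2` at center 1, so only `d2` crosses between centers). The sketch ignores storage capacities, bandwidth, and transfer time, all of which a realistic model would need.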

Metadata
Title
ACO-DPDGW: an ant colony optimization algorithm for data placement of data-intensive geospatial workflow
Authors
Xiaozhu Wu
Ying Liu
Chongcheng Chen
Publication date
11 August 2019
Publisher
Springer Berlin Heidelberg
Published in
Earth Science Informatics / Issue 4/2019
Print ISSN: 1865-0473
Electronic ISSN: 1865-0481
DOI
https://doi.org/10.1007/s12145-019-00401-3
