Published in: Earth Science Informatics 4/2019

11-08-2019 | Methodology Article

ACO-DPDGW: an ant colony optimization algorithm for data placement of data-intensive geospatial workflow

Authors: Xiaozhu Wu, Ying Liu, Chongcheng Chen


Abstract

Massive data transmission between distributed data centers is the major efficiency bottleneck of geospatial workflows. Although many data placement methods have been proposed to overcome this problem, few studies have considered the impact of the workflow's structure. In this paper, we define the problem of data placement for data-intensive geospatial workflows, with the aim of minimizing data transfer time. We propose an algorithm, ant colony optimization based data placement of data-intensive geospatial workflow (ACO-DPDGW), to solve this problem. A node vector represents the traditional workflow model; the ants visit its nodes in random order and place datasets and tasks in appropriate data centers according to a combination of pheromone information and heuristic information. To prevent premature convergence, a variable neighborhood search operation is embedded into ACO-DPDGW. The experiments show that our algorithm can reduce data transfer volume and data transfer time even as the numbers of datasets, tasks, and data centers increase.
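
The placement step described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: it only assumes the general ant colony pattern of choosing a data center with probability proportional to pheromone and heuristic values. The function names (`aco_place`, `build_placement`, `transfer_volume`), the heuristic (volume of linked data already placed in a center), the cost (total volume moved between centers), and the parameters `alpha`, `beta`, and `rho` are all assumptions made for illustration. Storage capacity limits and the variable neighborhood search step are omitted.

```python
import random


def transfer_volume(placement, uses, size):
    """Sum the sizes of datasets stored in a different center than a task reading them."""
    return sum(size[d] for t, ds in uses.items() for d in ds
               if placement[t] != placement[d])


def choose_center(tau_row, eta_row, alpha, beta, rng):
    """Roulette-wheel choice with weight pheromone**alpha * heuristic**beta."""
    weights = [(t ** alpha) * (e ** beta) for t, e in zip(tau_row, eta_row)]
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]


def build_placement(nodes, uses, size, tau, n_centers, alpha, beta, rng):
    """One ant visits the nodes (tasks and datasets) in random order."""
    order = nodes[:]
    rng.shuffle(order)
    placement = {}
    for node in order:
        eta = []
        for c in range(n_centers):
            # Assumed heuristic: data volume linked to this node that is
            # already placed in center c (plus 1 so every weight is positive).
            local = 0.0
            for t, ds in uses.items():
                for d in ds:
                    other = d if node == t else t if node == d else None
                    if other is not None and placement.get(other) == c:
                        local += size[d]
            eta.append(1.0 + local)
        placement[node] = choose_center(tau[node], eta, alpha, beta, rng)
    return placement


def aco_place(uses, size, n_centers, n_ants=10, n_iter=30,
              alpha=1.0, beta=2.0, rho=0.1, seed=0):
    """Return (best placement, its transfer volume) found by the colony."""
    rng = random.Random(seed)
    nodes = sorted(set(uses) | set(size))
    tau = {n: [1.0] * n_centers for n in nodes}
    best, best_cost = None, float("inf")
    for _ in range(n_iter):
        for _ in range(n_ants):
            p = build_placement(nodes, uses, size, tau, n_centers,
                                alpha, beta, rng)
            cost = transfer_volume(p, uses, size)
            if cost < best_cost:
                best, best_cost = p, cost
        # Evaporate all pheromone, then reinforce the best placement so far.
        for n in nodes:
            tau[n] = [(1.0 - rho) * t for t in tau[n]]
            tau[n][best[n]] += 1.0 / (1.0 + best_cost)
    return best, best_cost


if __name__ == "__main__":
    # Hypothetical toy workflow: two tasks sharing dataset d2.
    uses = {"t1": ["d1", "d2"], "t2": ["d2", "d3"]}
    size = {"d1": 5.0, "d2": 3.0, "d3": 2.0}
    print(aco_place(uses, size, n_centers=2))
```

Because this sketch has no capacity constraints, putting every node in a single center is trivially optimal; a realistic model would need per-center storage limits, and possibly datasets fixed to a location, for the search to be meaningful.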

Metadata
Title
ACO-DPDGW: an ant colony optimization algorithm for data placement of data-intensive geospatial workflow
Authors
Xiaozhu Wu
Ying Liu
Chongcheng Chen
Publication date
11-08-2019
Publisher
Springer Berlin Heidelberg
Published in
Earth Science Informatics / Issue 4/2019
Print ISSN: 1865-0473
Electronic ISSN: 1865-0481
DOI
https://doi.org/10.1007/s12145-019-00401-3