Skip to main content
Erschienen in: International Journal of Data Science and Analytics 2/2018

06.03.2018 | Review

The many faces of data-centric workflow optimization: a survey

verfasst von: Georgia Kougka, Anastasios Gounaris, Alkis Simitsis

Erschienen in: International Journal of Data Science and Analytics | Ausgabe 2/2018

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Workflow technology is rapidly evolving and, rather than being limited to modeling the control flow in business processes, is becoming a key mechanism to perform advanced data management, such as big data analytics. This survey focuses on data-centric workflows (or workflows for data analytics or data flows), where a key aspect is data passing through and getting manipulated by a sequence of steps. The large volume and variety of data, the complexity of operations performed, and the long time such workflows take to compute give rise to the need for optimization. In general, data-centric workflow optimization is a technology in evolution. This survey focuses on techniques applicable to workflows comprising arbitrary types of data manipulation steps and semantic inter-dependencies between such steps. Further, it serves a twofold purpose: firstly, to present the main dimensions of the relevant optimization problems and the types of optimizations that occur before flow execution and secondly, to provide a concise overview of the existing approaches with a view to highlighting key observations and areas deserving more attention from the community.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Fußnoten
1
Hereafter, these three terms will be used interchangeably; the terms workflow and flow will be used interchangeably, too.
 
2
The terms technique, proposal, and work will be used interchangeably.
 
3
Through considering optimizations starting from a valid initial flow, we exclude from our survey the big area of answering queries in the presence of limited access patterns, in which, the main aim is to construct such an initial plan [69, 78] through selecting an appropriate subset of tasks from a given task pool; however, we have considered works from data integration that optimize the plan after it has been devised, such as [111] or [34], which is subsumed by Kougka and Gounaris [60].
 
4
www.​myexperiment.​org/​ in bio-informatics.
 
Literatur
2.
Zurück zum Zitat Abadi, D.J., Agrawal, R., Ailamaki, A., Balazinska, M., Bernstein, P.A., Carey, M.J., Chaudhuri, S., Dean, J., Doan, A., Franklin, M.J., Gehrke, J., Haas, L.M., Halevy, A.Y., Hellerstein, J.M., Ioannidis, Y.E., Jagadish, H.V., Kossmann, D., Madden, S., Mehrotra, S., Milo, T., Naughton, J.F., Ramakrishnan, R., Markl, V., Olston, C., Ooi, B.C., Ré, C., Suciu, D., Stonebraker, M., Walter, T., Widom, J.: The beckman report on database research. SIGMOD Rec. 43(3), 61–70 (2014)CrossRef Abadi, D.J., Agrawal, R., Ailamaki, A., Balazinska, M., Bernstein, P.A., Carey, M.J., Chaudhuri, S., Dean, J., Doan, A., Franklin, M.J., Gehrke, J., Haas, L.M., Halevy, A.Y., Hellerstein, J.M., Ioannidis, Y.E., Jagadish, H.V., Kossmann, D., Madden, S., Mehrotra, S., Milo, T., Naughton, J.F., Ramakrishnan, R., Markl, V., Olston, C., Ooi, B.C., Ré, C., Suciu, D., Stonebraker, M., Walter, T., Widom, J.: The beckman report on database research. SIGMOD Rec. 43(3), 61–70 (2014)CrossRef
3.
Zurück zum Zitat Abrishami, S., Naghibzadeh, M., Epema, D.H.: Deadline-constrained workflow scheduling algorithms for infrastructure as a service clouds. Future Gener. Comput. Syst. 29(1), 158–169 (2013)CrossRef Abrishami, S., Naghibzadeh, M., Epema, D.H.: Deadline-constrained workflow scheduling algorithms for infrastructure as a service clouds. Future Gener. Comput. Syst. 29(1), 158–169 (2013)CrossRef
4.
Zurück zum Zitat Abrishami, S., Naghibzadeh, M., Epema, D.H.J.: Cost-driven scheduling of grid workflows using partial critical paths. IEEE Trans. Parallel Distrib. Syst. 23(8), 1400–1414 (2012)CrossRef Abrishami, S., Naghibzadeh, M., Epema, D.H.J.: Cost-driven scheduling of grid workflows using partial critical paths. IEEE Trans. Parallel Distrib. Syst. 23(8), 1400–1414 (2012)CrossRef
5.
Zurück zum Zitat Agrawal, K., Benoit, A., Dufossé, F., Robert, Y.: Mapping filtering streaming applications with communication costs. In: SPAA, pp. 19–28 (2009) Agrawal, K., Benoit, A., Dufossé, F., Robert, Y.: Mapping filtering streaming applications with communication costs. In: SPAA, pp. 19–28 (2009)
6.
Zurück zum Zitat Agrawal, K., Benoit, A., Dufossé, F., Robert, Y.: Mapping filtering streaming applications. Algorithmica 62(1–2), 258–308 (2012)MathSciNetCrossRefMATH Agrawal, K., Benoit, A., Dufossé, F., Robert, Y.: Mapping filtering streaming applications. Algorithmica 62(1–2), 258–308 (2012)MathSciNetCrossRefMATH
7.
Zurück zum Zitat Agrawal, K., Benoit, A., Magnan, L., Robert, Y.: Scheduling algorithms for linear workflow optimization. In: IPDPS, pp. 1–12 (2010) Agrawal, K., Benoit, A., Magnan, L., Robert, Y.: Scheduling algorithms for linear workflow optimization. In: IPDPS, pp. 1–12 (2010)
8.
Zurück zum Zitat Alexandrov, A., Bergmann, R., Ewen, S., Freytag, J., Hueske, F., Heise, A., Kao, O., Leich, M., Leser, U., Markl, V., Naumann, F., Peters, M., Rheinländer, A., Sax, M.J., Schelter, S., Höger, M., Tzoumas, K., Warneke, D.: The stratosphere platform for big data analytics. VLDB J. 23(6), 939–964 (2014)CrossRef Alexandrov, A., Bergmann, R., Ewen, S., Freytag, J., Hueske, F., Heise, A., Kao, O., Leich, M., Leser, U., Markl, V., Naumann, F., Peters, M., Rheinländer, A., Sax, M.J., Schelter, S., Höger, M., Tzoumas, K., Warneke, D.: The stratosphere platform for big data analytics. VLDB J. 23(6), 939–964 (2014)CrossRef
9.
Zurück zum Zitat Barker, A., van Hemert, J.I.: Scientific workflow: a survey and research directions. In: PPAM, Lecture Notes in Computer Science, vol. 4967, pp. 746–753 (2007) Barker, A., van Hemert, J.I.: Scientific workflow: a survey and research directions. In: PPAM, Lecture Notes in Computer Science, vol. 4967, pp. 746–753 (2007)
10.
Zurück zum Zitat Benoit, A., Çatalyürek, U.V., Robert, Y., Saule, E.: A survey of pipelined workflow scheduling: models and algorithms. ACM Comput. Surv. 45(4), 50:1–50:36 (2013)CrossRef Benoit, A., Çatalyürek, U.V., Robert, Y., Saule, E.: A survey of pipelined workflow scheduling: models and algorithms. ACM Comput. Surv. 45(4), 50:1–50:36 (2013)CrossRef
11.
Zurück zum Zitat Bhattacharya, K., Hull, R., Su, J.: A data-centric design methodology for business processes. In: Handbook of Research on Business Process Modeling, Chapter 23, 503–531 (2009) Bhattacharya, K., Hull, R., Su, J.: A data-centric design methodology for business processes. In: Handbook of Research on Business Process Modeling, Chapter 23, 503–531 (2009)
12.
Zurück zum Zitat Böhm, M.: Cost-based optimization of integration flows. Ph.D. thesis (2011) Böhm, M.: Cost-based optimization of integration flows. Ph.D. thesis (2011)
13.
Zurück zum Zitat Böhm, M., Habich, D., Lehner, W.: On-demand re-optimization of integration flows. Inf. Syst. 45, 1–17 (2014)CrossRef Böhm, M., Habich, D., Lehner, W.: On-demand re-optimization of integration flows. Inf. Syst. 45, 1–17 (2014)CrossRef
14.
Zurück zum Zitat Böhm, M., Tatikonda, S., Reinwald, B., Sen, P., Tian, Y., Burdick, D., Vaithyanathan, S.: Hybrid parallelization strategies for large-scale machine learning in systemml. PVLDB 7(7), 553–564 (2014) Böhm, M., Tatikonda, S., Reinwald, B., Sen, P., Tian, Y., Burdick, D., Vaithyanathan, S.: Hybrid parallelization strategies for large-scale machine learning in systemml. PVLDB 7(7), 553–564 (2014)
15.
Zurück zum Zitat Braga, D., Ceri, S., Daniel, F., Martinenghi, D.: Optimization of multi-domain queries on the web. PVLDB 1(1), 562–573 (2008) Braga, D., Ceri, S., Daniel, F., Martinenghi, D.: Optimization of multi-domain queries on the web. PVLDB 1(1), 562–573 (2008)
16.
Zurück zum Zitat Burge, J., Munagala, K., Srivastava, U.: Ordering pipelined query operators with precedence constraints. Technical Report 2005-40, Stanford InfoLab (2005) Burge, J., Munagala, K., Srivastava, U.: Ordering pipelined query operators with precedence constraints. Technical Report 2005-40, Stanford InfoLab (2005)
17.
Zurück zum Zitat Calheiros, R.N., Buyya, R.: Meeting deadlines of scientific workflows in public clouds with tasks replication. IEEE Trans. Parallel Distrib. Syst. 25(7), 1787–1796 (2014)CrossRef Calheiros, R.N., Buyya, R.: Meeting deadlines of scientific workflows in public clouds with tasks replication. IEEE Trans. Parallel Distrib. Syst. 25(7), 1787–1796 (2014)CrossRef
18.
Zurück zum Zitat Chaudhuri, S.: An overview of query optimization in relational systems. In: Proceedings of the Seventeenth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, June 1–3, 1998, Seattle, Washington, pp. 34–43 (1998) Chaudhuri, S.: An overview of query optimization in relational systems. In: Proceedings of the Seventeenth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, June 1–3, 1998, Seattle, Washington, pp. 34–43 (1998)
19.
Zurück zum Zitat Chaudhuri, S., Dayal, U., Narasayya, V.: An overview of business intelligence technology. Commun. ACM 54, 88–98 (2011)CrossRef Chaudhuri, S., Dayal, U., Narasayya, V.: An overview of business intelligence technology. Commun. ACM 54, 88–98 (2011)CrossRef
20.
Zurück zum Zitat Chaudhuri, S., Shim, K.: Optimization of queries with user-defined predicates. ACM Trans. Database Syst. 24(2), 177–228 (1999)CrossRef Chaudhuri, S., Shim, K.: Optimization of queries with user-defined predicates. ACM Trans. Database Syst. 24(2), 177–228 (1999)CrossRef
21.
Zurück zum Zitat Chen, W., Deelman, E.: Partitioning and scheduling workflows across multiple sites with storage constraints. In: Proceedings of the 9th International Conference on Parallel Processing and Applied Mathematics—Volume Part II, PPAM’11, pp. 11–20 (2012) Chen, W., Deelman, E.: Partitioning and scheduling workflows across multiple sites with storage constraints. In: Proceedings of the 9th International Conference on Parallel Processing and Applied Mathematics—Volume Part II, PPAM’11, pp. 11–20 (2012)
22.
Zurück zum Zitat Chen, W.N., Zhang, J.: An ant colony optimization approach to a grid workflow scheduling problem with various qos requirements. IEEE Trans. Syst. Man Cybern. Part C Appl. Rev. 39(1), 29–43 (2009)CrossRef Chen, W.N., Zhang, J.: An ant colony optimization approach to a grid workflow scheduling problem with various qos requirements. IEEE Trans. Syst. Man Cybern. Part C Appl. Rev. 39(1), 29–43 (2009)CrossRef
23.
Zurück zum Zitat Chirkin, A.M., Belloum, A., Kovalchuk, S.V., Makkes, M.X.: Execution time estimation for workflow scheduling. In: Proceedings of the 9th Workshop on Workflows in Support of Large-Scale Science, pp. 1–10. IEEE Press (2014) Chirkin, A.M., Belloum, A., Kovalchuk, S.V., Makkes, M.X.: Execution time estimation for workflow scheduling. In: Proceedings of the 9th Workshop on Workflows in Support of Large-Scale Science, pp. 1–10. IEEE Press (2014)
24.
Zurück zum Zitat Cohen-Boulakia, S., Chen, J., Goble, C., Missier, P., Williams, A., Froidevaux, C.: Distilling structure in taverna scientific workflows: a refactoring approach. BMC Bioinformatics 15(1), S12 (2014)CrossRef Cohen-Boulakia, S., Chen, J., Goble, C., Missier, P., Williams, A., Froidevaux, C.: Distilling structure in taverna scientific workflows: a refactoring approach. BMC Bioinformatics 15(1), S12 (2014)CrossRef
25.
Zurück zum Zitat Crotty, A., Galakatos, A., Dursun, K., Kraska, T., Binnig, C., Çetintemel, U., Zdonik, S.: An architecture for compiling udf-centric workflows. PVLDB 8(12), 1466–1477 (2015) Crotty, A., Galakatos, A., Dursun, K., Kraska, T., Binnig, C., Çetintemel, U., Zdonik, S.: An architecture for compiling udf-centric workflows. PVLDB 8(12), 1466–1477 (2015)
26.
Zurück zum Zitat Curcin, V., Ghanem, M.: Scientific workflow systems—can one size fit all? In: Biomedical Engineering Conference, 2008. CIBEC 2008. Cairo International, pp. 1–9 (2008) Curcin, V., Ghanem, M.: Scientific workflow systems—can one size fit all? In: Biomedical Engineering Conference, 2008. CIBEC 2008. Cairo International, pp. 1–9 (2008)
27.
Zurück zum Zitat Dayal, U., Castellanos, M., Simitsis, A., Wilkinson, K.: Data integration flows for business intelligence. In: Proceedings of EDBT, pp. 1–11 (2009) Dayal, U., Castellanos, M., Simitsis, A., Wilkinson, K.: Data integration flows for business intelligence. In: Proceedings of EDBT, pp. 1–11 (2009)
28.
Zurück zum Zitat de Oliveira, D., Ogasawara, E.S., Dias, J., Baio, F.A., Mattoso, M.: Ontology-based semi-automatic workflow composition. JIDM 3(1), 61–72 (2012) de Oliveira, D., Ogasawara, E.S., Dias, J., Baio, F.A., Mattoso, M.: Ontology-based semi-automatic workflow composition. JIDM 3(1), 61–72 (2012)
29.
Zurück zum Zitat Deelman, E., Gannon, D., Shields, M., Taylor, I.: Workflows and e-science: an overview of workflow system features and capabilities. Future Gener. Comput. Syst. 25(5), 528–540 (2009)CrossRef Deelman, E., Gannon, D., Shields, M., Taylor, I.: Workflows and e-science: an overview of workflow system features and capabilities. Future Gener. Comput. Syst. 25(5), 528–540 (2009)CrossRef
30.
Zurück zum Zitat Deelman, E., Singh, G., Su, M.H., Blythe, J., Gil, Y., Kesselman, C., Mehta, G., Vahi, K., Berriman, G.B., Good, J., Laity, A., Jacob, J.C., Katz, D.S.: Pegasus: a framework for mapping complex scientific workflows onto distributed systems. Sci. Program. 13(3), 219–237 (2005) Deelman, E., Singh, G., Su, M.H., Blythe, J., Gil, Y., Kesselman, C., Mehta, G., Vahi, K., Berriman, G.B., Good, J., Laity, A., Jacob, J.C., Katz, D.S.: Pegasus: a framework for mapping complex scientific workflows onto distributed systems. Sci. Program. 13(3), 219–237 (2005)
31.
Zurück zum Zitat Deshpande, A., Hellerstein, L.: Parallel pipelined filter ordering with precedence constraints. ACM Trans. Algorithms 8(4), 41:1–41:38 (2012)MathSciNetCrossRefMATH Deshpande, A., Hellerstein, L.: Parallel pipelined filter ordering with precedence constraints. ACM Trans. Algorithms 8(4), 41:1–41:38 (2012)MathSciNetCrossRefMATH
32.
Zurück zum Zitat Dong, F., Akl, S.G.: Scheduling algorithms for grid computing: state of the art and open problems. Technical report (2006) Dong, F., Akl, S.G.: Scheduling algorithms for grid computing: state of the art and open problems. Technical report (2006)
33.
Zurück zum Zitat Fard, H., Prodan, R., Fahringer, T.: A truthful dynamic workflow scheduling mechanism for commercial multicloud environments. IEEE Trans. Parallel Distrib. Syst. 24(6), 1203–1212 (2013)CrossRef Fard, H., Prodan, R., Fahringer, T.: A truthful dynamic workflow scheduling mechanism for commercial multicloud environments. IEEE Trans. Parallel Distrib. Syst. 24(6), 1203–1212 (2013)CrossRef
34.
Zurück zum Zitat Florescu, D., Levy, A., Manolescu, I., Suciu, D.: Query optimization in the presence of limited access patterns. In: ACM SIGMOD, pp. 311–322 (1999) Florescu, D., Levy, A., Manolescu, I., Suciu, D.: Query optimization in the presence of limited access patterns. In: ACM SIGMOD, pp. 311–322 (1999)
35.
Zurück zum Zitat Garcia-Molina, H., Ullman, J.D., Widom, J.D.: Database Systems: The Complete Book. Prentice Hall, Upper Saddle River (2001) Garcia-Molina, H., Ullman, J.D., Widom, J.D.: Database Systems: The Complete Book. Prentice Hall, Upper Saddle River (2001)
37.
Zurück zum Zitat Grehant, X., Demeure, I., Jarp, S.: A survey of task mapping on production grids. ACM Comput. Surv. 45(3), 37:1–37:25 (2013)CrossRefMATH Grehant, X., Demeure, I., Jarp, S.: A survey of task mapping on production grids. ACM Comput. Surv. 45(3), 37:1–37:25 (2013)CrossRefMATH
38.
Zurück zum Zitat Gu, Y., Wu, Q., Rao, N.S.V.: Analyzing execution dynamics of scientific workflows for latency minimization in resource sharing environments. In: Proceedings of the 2011 IEEE World Congress on Services, pp. 153–160 (2011) Gu, Y., Wu, Q., Rao, N.S.V.: Analyzing execution dynamics of scientific workflows for latency minimization in resource sharing environments. In: Proceedings of the 2011 IEEE World Congress on Services, pp. 153–160 (2011)
39.
Zurück zum Zitat Halasipuram, R., Deshpande, P.M., Padmanabhan, S.: Determining essential statistics for cost based optimization of an ETL workflow. In: EDBT, pp. 307–318 (2014) Halasipuram, R., Deshpande, P.M., Padmanabhan, S.: Determining essential statistics for cost based optimization of an ETL workflow. In: EDBT, pp. 307–318 (2014)
40.
Zurück zum Zitat Hellerstein, J.M.: Optimization techniques for queries with expensive methods. ACM Trans. Database Syst. 23(2), 113–157 (1998)CrossRef Hellerstein, J.M.: Optimization techniques for queries with expensive methods. ACM Trans. Database Syst. 23(2), 113–157 (1998)CrossRef
41.
Zurück zum Zitat Herodotou, H., Babu, S.: Profiling, what-if analysis, and cost-based optimization of mapreduce programs. PVLDB 4(11), 1111–1122 (2011) Herodotou, H., Babu, S.: Profiling, what-if analysis, and cost-based optimization of mapreduce programs. PVLDB 4(11), 1111–1122 (2011)
42.
Zurück zum Zitat Holl, S., Zimmermann, O., Hofmann-Apitius, M.: A new optimization phase for scientific workflow management systems. In: eScience, pp. 1–8 (2012) Holl, S., Zimmermann, O., Hofmann-Apitius, M.: A new optimization phase for scientific workflow management systems. In: eScience, pp. 1–8 (2012)
43.
Zurück zum Zitat Holzinger, A., Stocker, C., Ofner, B., Prohaska, G., Brabenetz, A., Hofmann-Wellenhof, R.: Combining HCI, natural language processing, and knowledge discovery—potential of IBM content analytics as an assistive technology in the biomedical field. In: Human–Computer Interaction and Knowledge Discovery in Complex, Unstructured, Big Data—Third International Workshop, HCI-KDD, pp. 13–24 (2013) Holzinger, A., Stocker, C., Ofner, B., Prohaska, G., Brabenetz, A., Hofmann-Wellenhof, R.: Combining HCI, natural language processing, and knowledge discovery—potential of IBM content analytics as an assistive technology in the biomedical field. In: Human–Computer Interaction and Knowledge Discovery in Complex, Unstructured, Big Data—Third International Workshop, HCI-KDD, pp. 13–24 (2013)
44.
Zurück zum Zitat Huang, B., Babu, S., Yang, J.: Cumulon: optimizing statistical data analysis in the cloud. In: Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data, pp. 1–12 (2013) Huang, B., Babu, S., Yang, J.: Cumulon: optimizing statistical data analysis in the cloud. In: Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data, pp. 1–12 (2013)
45.
Zurück zum Zitat Huang, B., Böhm, M., Tian, Y., Reinwald, B., Tatikonda, S., Reiss, F.R.: Resource elasticity for large-scale machine learning. In: SIGMOD’15, pp. 137–152 (2015) Huang, B., Böhm, M., Tian, Y., Reinwald, B., Tatikonda, S., Reiss, F.R.: Resource elasticity for large-scale machine learning. In: SIGMOD’15, pp. 137–152 (2015)
46.
Zurück zum Zitat Huang, B., Jarrett, N.W.D., Babu, S., Mukherjee, S., Yang, J.: Cümülön: Matrix-based data analytics in the cloud with spot instances. Proc. VLDB Endow. 9(3), 156–167 (2015)CrossRef Huang, B., Jarrett, N.W.D., Babu, S., Mukherjee, S., Yang, J.: Cümülön: Matrix-based data analytics in the cloud with spot instances. Proc. VLDB Endow. 9(3), 156–167 (2015)CrossRef
47.
Zurück zum Zitat Hueske, F., Peters, M., Sax, M., Rheinländer, A., Bergmann, R., Krettek, A., Tzoumas, K.: Opening the black boxes in data flow optimization. PVLDB 5(11), 1256–1267 (2012) Hueske, F., Peters, M., Sax, M., Rheinländer, A., Bergmann, R., Krettek, A., Tzoumas, K.: Opening the black boxes in data flow optimization. PVLDB 5(11), 1256–1267 (2012)
48.
Zurück zum Zitat Informatica: How to achieve flexible, cost-effective scalability and performance through pushdown processing. White Paper (2007) Informatica: How to achieve flexible, cost-effective scalability and performance through pushdown processing. White Paper (2007)
49.
Zurück zum Zitat Ioannidis, Y.E.: Query optimization. ACM Comput. Surv. 28(1), 121–123 (1996)CrossRef Ioannidis, Y.E.: Query optimization. ACM Comput. Surv. 28(1), 121–123 (1996)CrossRef
50.
Zurück zum Zitat Jin, T., Zhang, F., Sun, Q., Bui, H., Parashar, M., Yu, H., Klasky, S., Podhorszki, N., Abbasi, H.: Using cross-layer adaptations for dynamic data management in large scale coupled scientific workflows. In: International Conference for High Performance Computing, Networking, Storage and Analysis, SC’13, p. 74 (2013) Jin, T., Zhang, F., Sun, Q., Bui, H., Parashar, M., Yu, H., Klasky, S., Podhorszki, N., Abbasi, H.: Using cross-layer adaptations for dynamic data management in large scale coupled scientific workflows. In: International Conference for High Performance Computing, Networking, Storage and Analysis, SC’13, p. 74 (2013)
51.
Zurück zum Zitat Jovanovic, P., Romero, O., Abelló, A.: A unified view of data-intensive flows in business intelligence systems: a survey. In: Transactions on Large-Scale Data- and Knowledge-Centered Systems XXIX, pp. 66–107. Springer, Berlin (2016) Jovanovic, P., Romero, O., Abelló, A.: A unified view of data-intensive flows in business intelligence systems: a survey. In: Transactions on Large-Scale Data- and Knowledge-Centered Systems XXIX, pp. 66–107. Springer, Berlin (2016)
52.
Zurück zum Zitat Jovanovic, P., Romero, O., Simitsis, A., Abell, A.: Incremental consolidation of data-intensive multi-flows. IEEE Trans. Knowl. Data Eng. 28(5), 1203–1216 (2016)CrossRef Jovanovic, P., Romero, O., Simitsis, A., Abell, A.: Incremental consolidation of data-intensive multi-flows. IEEE Trans. Knowl. Data Eng. 28(5), 1203–1216 (2016)CrossRef
53.
Zurück zum Zitat Jovanovic, P., Simitsis, A., Wilkinson, K.: Babbleflow: a translator for analytic data flow programs. In: SIGMOD, pp. 713–716 (2014) Jovanovic, P., Simitsis, A., Wilkinson, K.: Babbleflow: a translator for analytic data flow programs. In: SIGMOD, pp. 713–716 (2014)
54.
Zurück zum Zitat Jovanovic, P., Simitsis, A., Wilkinson, K.: Engine independence for logical analytic flows. In: ICDE, pp. 1060–1071 (2014) Jovanovic, P., Simitsis, A., Wilkinson, K.: Engine independence for logical analytic flows. In: ICDE, pp. 1060–1071 (2014)
55.
Zurück zum Zitat Juve, G., Chervenak, A.L., Deelman, E., Bharathi, S., Mehta, G., Vahi, K.: Characterizing and profiling scientific workflows. Future Gener. Comput. Syst. 29(3), 682–692 (2013)CrossRef Juve, G., Chervenak, A.L., Deelman, E., Bharathi, S., Mehta, G., Vahi, K.: Characterizing and profiling scientific workflows. Future Gener. Comput. Syst. 29(3), 682–692 (2013)CrossRef
56.
Zurück zum Zitat Karagiannis, A., Vassiliadis, P., Simitsis, A.: Scheduling strategies for efficient ETL execution. Inf. Syst. 38(6), 927–945 (2013)CrossRef Karagiannis, A., Vassiliadis, P., Simitsis, A.: Scheduling strategies for efficient ETL execution. Inf. Syst. 38(6), 927–945 (2013)CrossRef
57.
Zurück zum Zitat Kllapi, H., Sitaridi, E., Tsangaris, M.M., Ioannidis, Y.: Schedule optimization for data processing flows on the cloud. In: Proceedings of the 2011 ACM SIGMOD International Conference on Management of Data, pp. 289–300 (2011) Kllapi, H., Sitaridi, E., Tsangaris, M.M., Ioannidis, Y.: Schedule optimization for data processing flows on the cloud. In: Proceedings of the 2011 ACM SIGMOD International Conference on Management of Data, pp. 289–300 (2011)
58.
Zurück zum Zitat Kougka, G., Gounaris, A.: Declarative expression and optimization of data-intensive flows. In: DaWaK, pp. 13–25 (2013) Kougka, G., Gounaris, A.: Declarative expression and optimization of data-intensive flows. In: DaWaK, pp. 13–25 (2013)
59.
Zurück zum Zitat Kougka, G., Gounaris, A.: Optimization of data-intensive flows: is it needed? is it solved? In: Proceedings of the 17th International Workshop on Data Warehousing and OLAP, DOLAP 2014, Shanghai, November 3–7, 2014, pp. 95–98 (2014) Kougka, G., Gounaris, A.: Optimization of data-intensive flows: is it needed? is it solved? In: Proceedings of the 17th International Workshop on Data Warehousing and OLAP, DOLAP 2014, Shanghai, November 3–7, 2014, pp. 95–98 (2014)
60.
Zurück zum Zitat Kougka, G., Gounaris, A.: Cost optimization of data flows based on task re-ordering. In: LNCS Transactions on Large-Scale Data- and Knowledge-Centered Systems (2017, to appear) Kougka, G., Gounaris, A.: Cost optimization of data flows based on task re-ordering. In: LNCS Transactions on Large-Scale Data- and Knowledge-Centered Systems (2017, to appear)
61.
Zurück zum Zitat Kougka, G., Gounaris, A.: Optimal task ordering in chain data flows: exploring the practicality of non-scalable solutions. In: DaWaK (2017) Kougka, G., Gounaris, A.: Optimal task ordering in chain data flows: exploring the practicality of non-scalable solutions. In: DaWaK (2017)
62.
Zurück zum Zitat Kougka, G., Gounaris, A., Leser, U.: Modeling data flow execution in a parallel environment. In: DaWaK (2017) Kougka, G., Gounaris, A., Leser, U.: Modeling data flow execution in a parallel environment. In: DaWaK (2017)
63.
Zurück zum Zitat Kougka, G., Gounaris, A., Tsichlas, K.: Practical algorithms for execution engine selection in data flows. Future Gener. Comput. Syst. 45, 133–148 (2015)CrossRef Kougka, G., Gounaris, A., Tsichlas, K.: Practical algorithms for execution engine selection in data flows. Future Gener. Comput. Syst. 45, 133–148 (2015)CrossRef
64.
Zurück zum Zitat Krishnamurthy, R., Boral, H., Zaniolo, C.: Optimization of nonrecursive queries. In: VLDB, pp. 128–137 (1986) Krishnamurthy, R., Boral, H., Zaniolo, C.: Optimization of nonrecursive queries. In: VLDB, pp. 128–137 (1986)
65.
Zurück zum Zitat Kumar, N., Kumar, P.S.: An efficient heuristic for logical optimization of ETL workflows. In: BIRTE, pp. 68–83 (2010) Kumar, N., Kumar, P.S.: An efficient heuristic for logical optimization of ETL workflows. In: BIRTE, pp. 68–83 (2010)
66.
Zurück zum Zitat Kumar, V.S., Sadayappan, P., Mehta, G., Vahi, K., Deelman, E., Ratnakar, V., Kim, J., Gil, Y., Hall, M., Kurc, T., Saltz, J.: An integrated framework for parameter-based optimization of scientific workflows. In: HPDC, pp. 177–186 (2009) Kumar, V.S., Sadayappan, P., Mehta, G., Vahi, K., Deelman, E., Ratnakar, V., Kim, J., Gil, Y., Hall, M., Kurc, T., Saltz, J.: An integrated framework for parameter-based optimization of scientific workflows. In: HPDC, pp. 177–186 (2009)
67.
Zurück zum Zitat Kumbhare, A.G., Simmhan, Y., Prasanna, V.K.: Exploiting application dynamism and cloud elasticity for continuous dataflows. In: SC, p. 57 (2013) Kumbhare, A.G., Simmhan, Y., Prasanna, V.K.: Exploiting application dynamism and cloud elasticity for continuous dataflows. In: SC, p. 57 (2013)
68.
Zurück zum Zitat Kyriazis, D., Tserpes, K., Menychtas, A., Litke, A., Varvarigou, T.A.: An innovative workflow mapping mechanism for grids in the frame of quality of service. Future Gener. Comput. Syst. 24(6), 498–511 (2008)CrossRef Kyriazis, D., Tserpes, K., Menychtas, A., Litke, A., Varvarigou, T.A.: An innovative workflow mapping mechanism for grids in the frame of quality of service. Future Gener. Comput. Syst. 24(6), 498–511 (2008)CrossRef
69.
Zurück zum Zitat Li, C.: Computing complete answers to queries in the presence of limited access patterns. VLDB J. 12(3), 211–227 (2003)CrossRef Li, C.: Computing complete answers to queries in the presence of limited access patterns. VLDB J. 12(3), 211–227 (2003)CrossRef
70.
Zurück zum Zitat Lim, H., Herodotou, H., Babu, S.: Stubby: a transformation-based optimizer for mapreduce workflows. Proc. VLDB Endow. 5(11), 1196–1207 (2012)CrossRef Lim, H., Herodotou, H., Babu, S.: Stubby: a transformation-based optimizer for mapreduce workflows. Proc. VLDB Endow. 5(11), 1196–1207 (2012)CrossRef
71.
Zurück zum Zitat Liu, J., Pacitti, E., Valduriez, P., Mattoso, M.: A survey of data-intensive scientific workflow management. J. Grid Comput. 13(4), 457–493 (2015)CrossRef Liu, J., Pacitti, E., Valduriez, P., Mattoso, M.: A survey of data-intensive scientific workflow management. J. Grid Comput. 13(4), 457–493 (2015)CrossRef
72.
Zurück zum Zitat Liu, X., Iftikhar, N.: An ETL optimization framework using partitioning and parallelization. In: SAC’15 (2015) Liu, X., Iftikhar, N.: An ETL optimization framework using partitioning and parallelization. In: SAC’15 (2015)
73.
Zurück zum Zitat Nguyen, P., Hilario, M., Kalousis, A.: Using meta-mining to support data mining workflow planning and optimization. J. Artif. Intell. Res. 51, 605–644 (2014)CrossRef Nguyen, P., Hilario, M., Kalousis, A.: Using meta-mining to support data mining workflow planning and optimization. J. Artif. Intell. Res. 51, 605–644 (2014)CrossRef
74.
Zurück zum Zitat Ogasawara, E.S., de Oliveira, D., Valduriez, P., Dias, J., Porto, F., Mattoso, M.: An algebraic approach for data-centric scientific workflows. PVLDB 4(12), 1328–1339 (2011) Ogasawara, E.S., de Oliveira, D., Valduriez, P., Dias, J., Porto, F., Mattoso, M.: An algebraic approach for data-centric scientific workflows. PVLDB 4(12), 1328–1339 (2011)
75.
Zurück zum Zitat Olston, C., Reed, B., Srivastava, U., Kumar, R., Tomkins, A.: Pig latin: a not-so-foreign language for data processing. In: SIGMOD Conference, pp. 1099–1110 (2008) Olston, C., Reed, B., Srivastava, U., Kumar, R., Tomkins, A.: Pig latin: a not-so-foreign language for data processing. In: SIGMOD Conference, pp. 1099–1110 (2008)
76.
Zurück zum Zitat Pietri, I., Juve, G., Deelman, E., Sakellariou, R.: A performance model to estimate execution time of scientific workflows on the cloud. In: Proceedings of the 9th Workshop on Workflows in Support of Large-Scale Science, pp. 11–19. IEEE Press (2014) Pietri, I., Juve, G., Deelman, E., Sakellariou, R.: A performance model to estimate execution time of scientific workflows on the cloud. In: Proceedings of the 9th Workshop on Workflows in Support of Large-Scale Science, pp. 11–19. IEEE Press (2014)
77.
Zurück zum Zitat Plankensteiner, K., Prodan, R.: Meeting soft deadlines in scientific workflows using resubmission impact. IEEE Trans. Parallel Distrib. Syst. 23(5), 890–901 (2012)CrossRef Plankensteiner, K., Prodan, R.: Meeting soft deadlines in scientific workflows using resubmission impact. IEEE Trans. Parallel Distrib. Syst. 23(5), 890–901 (2012)CrossRef
78.
Zurück zum Zitat Preda, N., Kasneci, G., Suchanek, F.M., Neumann, T., Yuan, W., Weikum, G.: Active knowledge: dynamically enriching RDF knowledge bases by web services. In: Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2010, Indianapolis, IN, June 6–10, 2010, pp. 399–410 (2010) Preda, N., Kasneci, G., Suchanek, F.M., Neumann, T., Yuan, W., Weikum, G.: Active knowledge: dynamically enriching RDF knowledge bases by web services. In: Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2010, Indianapolis, IN, June 6–10, 2010, pp. 399–410 (2010)
79.
Zurück zum Zitat Quiroz, A., Huang, E., Ceriani, L.: A robust and extensible tool for data integration using data type models. In: Proceedings of the Twenty-Ninth AAAI, pp. 3993–3998 (2015) Quiroz, A., Huang, E., Ceriani, L.: A robust and extensible tool for data integration using data type models. In: Proceedings of the Twenty-Ninth AAAI, pp. 3993–3998 (2015)
80.
Zurück zum Zitat Rahman, M., Hassan, M.R., Ranjan, R., Buyya, R.: Adaptive workflow scheduling for dynamic grid and cloud computing environment. Concurr. Comput. Pract. Exp. 25(13), 1816–1842 (2013)CrossRef Rahman, M., Hassan, M.R., Ranjan, R., Buyya, R.: Adaptive workflow scheduling for dynamic grid and cloud computing environment. Concurr. Comput. Pract. Exp. 25(13), 1816–1842 (2013)CrossRef
81.
Zurück zum Zitat Rheinländer, A., Heise, A., Hueske, F., Leser, U., Naumann, F.: SOFA: an extensible logical optimizer for udf-heavy data flows. Inf. Syst. 52, 96–125 (2015)CrossRef Rheinländer, A., Heise, A., Hueske, F., Leser, U., Naumann, F.: SOFA: an extensible logical optimizer for udf-heavy data flows. Inf. Syst. 52, 96–125 (2015)CrossRef
82.
Zurück zum Zitat Schikuta, E., Wanek, H., Ul Haq, I.: Grid workflow optimization regarding dynamically changing resources and conditions. Concurr. Comput. Pract. Exp. 20, 1837–1849 (2008)CrossRef Schikuta, E., Wanek, H., Ul Haq, I.: Grid workflow optimization regarding dynamically changing resources and conditions. Concurr. Comput. Pract. Exp. 20, 1837–1849 (2008)CrossRef
83.
Zurück zum Zitat Selinger, P.G., Astrahan, M.M., Chamberlin, D.D., Lorie, R.A., Price, T.G.: Access path selection in a relational database management system. In: Proceedings of the 1979 ACM SIGMOD International Conference on Management of Data, pp. 23–34 (1979) Selinger, P.G., Astrahan, M.M., Chamberlin, D.D., Lorie, R.A., Price, T.G.: Access path selection in a relational database management system. In: Proceedings of the 1979 ACM SIGMOD International Conference on Management of Data, pp. 23–34 (1979)
84.
Zurück zum Zitat Shi, J., Zou, J., Lu, J., Cao, Z., Li, S., Wang, C.: MRTuner: a toolkit to enable holistic optimization for mapreduce jobs. Proc. VLDB Endow. 7(13), 1319–1330 (2014)CrossRef Shi, J., Zou, J., Lu, J., Cao, Z., Li, S., Wang, C.: MRTuner: a toolkit to enable holistic optimization for mapreduce jobs. Proc. VLDB Endow. 7(13), 1319–1330 (2014)CrossRef
85.
Zurück zum Zitat Shivam, P., Babu, S., Chase, J.S.: Active and accelerated learning of cost models for optimizing scientific applications. In: VLDB, pp. 535–546 (2006) Shivam, P., Babu, S., Chase, J.S.: Active and accelerated learning of cost models for optimizing scientific applications. In: VLDB, pp. 535–546 (2006)
86.
Zurück zum Zitat Simitsis, A., Vassiliadis, P., Dayal, U., Karagiannis, A., Tziovara, V.: Benchmarking ETL workflows. In: TPCTC 2009, 199–220 (2009) Simitsis, A., Vassiliadis, P., Dayal, U., Karagiannis, A., Tziovara, V.: Benchmarking ETL workflows. In: TPCTC 2009, 199–220 (2009)
87.
Zurück zum Zitat Simitsis, A., Vassiliadis, P., Sellis, T.K.: State-space optimization of ETL workflows. IEEE Trans. Knowl. Data Eng. 17(10), 1404–1419 (2005)CrossRef Simitsis, A., Vassiliadis, P., Sellis, T.K.: State-space optimization of ETL workflows. IEEE Trans. Knowl. Data Eng. 17(10), 1404–1419 (2005)CrossRef
88.
Zurück zum Zitat Simitsis, A., Wilkinson, K.: Revisiting ETL benchmarking: the case for hybrid flows. In: TPCTC, pp. 75–91 (2012) Simitsis, A., Wilkinson, K.: Revisiting ETL benchmarking: the case for hybrid flows. In: TPCTC, pp. 75–91 (2012)
89.
Zurück zum Zitat Simitsis, A., Wilkinson, K., Castellanos, M., Dayal, U.: QoX-driven ETL design: reducing the cost of ETL consulting engagements. In: Proceedings of the SIGMOD, pp. 953–960 (2009) Simitsis, A., Wilkinson, K., Castellanos, M., Dayal, U.: QoX-driven ETL design: reducing the cost of ETL consulting engagements. In: Proceedings of the SIGMOD, pp. 953–960 (2009)
90.
Zurück zum Zitat Simitsis, A., Wilkinson, K., Castellanos, M., Dayal, U.: Optimizing analytic data flows for multiple execution engines. In: SIGMOD Conference, pp. 829–840 (2012) Simitsis, A., Wilkinson, K., Castellanos, M., Dayal, U.: Optimizing analytic data flows for multiple execution engines. In: SIGMOD Conference, pp. 829–840 (2012)
91.
Zurück zum Zitat Simitsis, A., Wilkinson, K., Dayal, U.: Hybrid analytic flows—the case for optimization. Fund. Inf. 128(3), 303–335 (2013) Simitsis, A., Wilkinson, K., Dayal, U.: Hybrid analytic flows—the case for optimization. Fund. Inf. 128(3), 303–335 (2013)
92.
Zurück zum Zitat Simitsis, A., Wilkinson, K., Dayal, U., Castellanos, M.: Optimizing ETL workflows for fault-tolerance. In: ICDE, pp. 385–396 (2010) Simitsis, A., Wilkinson, K., Dayal, U., Castellanos, M.: Optimizing ETL workflows for fault-tolerance. In: ICDE, pp. 385–396 (2010)
93.
Zurück zum Zitat Simitsis, A., Wilkinson, K., Dayal, U., Hsu, M.: HFMS: managing the lifecycle and complexity of hybrid analytic data flows. In: ICDE, pp. 1174–1185 (2013) Simitsis, A., Wilkinson, K., Dayal, U., Hsu, M.: HFMS: managing the lifecycle and complexity of hybrid analytic data flows. In: ICDE, pp. 1174–1185 (2013)
94.
Zurück zum Zitat Srivastava, U., Munagala, K., Widom, J., Motwani, R.: Query optimization over web services. In: Proceedings of VLDB, pp. 355–366 (2006) Srivastava, U., Munagala, K., Widom, J., Motwani, R.: Query optimization over web services. In: Proceedings of VLDB, pp. 355–366 (2006)
95.
Zurück zum Zitat Tan, W., Sun, Y., Lu, G., Tang, A., Cui, L.: Trust services-oriented multi-objects workflow scheduling model for cloud computing. In: ICPCA/SWS, pp. 617–630 (2012) Tan, W., Sun, Y., Lu, G., Tang, A., Cui, L.: Trust services-oriented multi-objects workflow scheduling model for cloud computing. In: ICPCA/SWS, pp. 617–630 (2012)
96.
Zurück zum Zitat Tao, F., Zhang, L., Laili, Y.: Configurable Intelligent Optimization Algorithm: Design and Practice in Manufacturing. Springer, New York, Incorporated (2014) Tao, F., Zhang, L., Laili, Y.: Configurable Intelligent Optimization Algorithm: Design and Practice in Manufacturing. Springer, New York, Incorporated (2014)
97.
Zurück zum Zitat Tsamoura, E., Gounaris, A., Manolopoulos, Y.: Brief announcement: on the quest of optimal service ordering in decentralized queries. In: Proceedings of the 29th Annual ACM Symposium on Principles of Distributed Computing, PODC 2010, Zurich, July 25–28, 2010, pp. 277–278 (2010) Tsamoura, E., Gounaris, A., Manolopoulos, Y.: Brief announcement: on the quest of optimal service ordering in decentralized queries. In: Proceedings of the 29th Annual ACM Symposium on Principles of Distributed Computing, PODC 2010, Zurich, July 25–28, 2010, pp. 277–278 (2010)
98.
Zurück zum Zitat Tsamoura, E., Gounaris, A., Manolopoulos, Y.: Decentralized execution of linear workflows over web services. Future Gener. Comput. Syst. 27(3), 341–347 (2011)CrossRef Tsamoura, E., Gounaris, A., Manolopoulos, Y.: Decentralized execution of linear workflows over web services. Future Gener. Comput. Syst. 27(3), 341–347 (2011)CrossRef
99.
Zurück zum Zitat Tsamoura, E., Gounaris, A., Manolopoulos, Y.: Optimal service ordering in decentralized queries over web services. IJKBO 1(2), 1–16 (2011) Tsamoura, E., Gounaris, A., Manolopoulos, Y.: Optimal service ordering in decentralized queries over web services. IJKBO 1(2), 1–16 (2011)
100.
Zurück zum Zitat Tsamoura, E., Gounaris, A., Manolopoulos, Y.: Queries over web services. In: New Directions in Web Data Management, vol. 1, pp. 139–169 (2011) Tsamoura, E., Gounaris, A., Manolopoulos, Y.: Queries over web services. In: New Directions in Web Data Management, vol. 1, pp. 139–169 (2011)
101.
Zurück zum Zitat Tziovara, V., Vassiliadis, P., Simitsis, A.: Deciding the physical implementation of ETL workflows. In: Proceedings of the ACM 10th International Workshop on Data Warehousing and OLAP DOLAP, pp. 49–56 (2007) Tziovara, V., Vassiliadis, P., Simitsis, A.: Deciding the physical implementation of ETL workflows. In: Proceedings of the ACM 10th International Workshop on Data Warehousing and OLAP DOLAP, pp. 49–56 (2007)
102.
Zurück zum Zitat Varol, Y.L., Rotem, D.: An algorithm to generate all topological sorting arrangements. Comput. J. 24(1), 83–84 (1981)CrossRefMATH Varol, Y.L., Rotem, D.: An algorithm to generate all topological sorting arrangements. Comput. J. 24(1), 83–84 (1981)CrossRefMATH
103.
Zurück zum Zitat Vassiliadis, P.: A survey of extract–transform–load technology. IJDWM 5(3), 1–27 (2009) Vassiliadis, P.: A survey of extract–transform–load technology. IJDWM 5(3), 1–27 (2009)
104.
Zurück zum Zitat Vassiliadis, P., Simitsis, A., Baikousi, E.: A taxonomy of ETL activities. In: DOLAP 2009, ACM 12th International Workshop on Data Warehousing and OLAP, Hong Kong, November 6, 2009, Proceedings, pp. 25–32 (2009) Vassiliadis, P., Simitsis, A., Baikousi, E.: A taxonomy of ETL activities. In: DOLAP 2009, ACM 12th International Workshop on Data Warehousing and OLAP, Hong Kong, November 6, 2009, Proceedings, pp. 25–32 (2009)
105.
Zurück zum Zitat vom Brocke, J., Sonnenberg, C.: Business process management and business process analysis. In: Information Systems and Information Technology. Computing Handbook, 3rd edn., pp. 26: 1–31 (2014) vom Brocke, J., Sonnenberg, C.: Business process management and business process analysis. In: Information Systems and Information Technology. Computing Handbook, 3rd edn., pp. 26: 1–31 (2014)
106.
Zurück zum Zitat Vrhovnik, M., Schwarz, H., Radeschütz, S., Mitschang, B.: An overview of SQL support in workflow products. In: Proceedings of ICDE, pp. 1287–1296 (2008) Vrhovnik, M., Schwarz, H., Radeschütz, S., Mitschang, B.: An overview of SQL support in workflow products. In: Proceedings of ICDE, pp. 1287–1296 (2008)
107.
Zurück zum Zitat Vrhovnik, M., Schwarz, H., Suhre, O., Mitschang, B., Markl, V., Maier, A., Kraft, T.: An approach to optimize data processing in business processes. In: VLDB, pp. 615–626 (2007) Vrhovnik, M., Schwarz, H., Suhre, O., Mitschang, B., Markl, V., Maier, A., Kraft, T.: An approach to optimize data processing in business processes. In: VLDB, pp. 615–626 (2007)
108.
Zurück zum Zitat Vu, L.H., Hauswirth, M., Aberer, K.: Qos-based service selection and ranking with trust and reputation management. In: Proceedings of the Cooperative Information System Conference (CoopIS05, pp. 466–483 (2005) Vu, L.H., Hauswirth, M., Aberer, K.: Qos-based service selection and ranking with trust and reputation management. In: Proceedings of the Cooperative Information System Conference (CoopIS05, pp. 466–483 (2005)
109.
Zurück zum Zitat Whrer, A., Brezany, P., Janciak, I., Mehofer, E.: Modeling and optimizing large-scale data flows. Future Gener. Comput. Syst. 31, 12–27 (2014)CrossRef Whrer, A., Brezany, P., Janciak, I., Mehofer, E.: Modeling and optimizing large-scale data flows. Future Gener. Comput. Syst. 31, 12–27 (2014)CrossRef
110.
Zurück zum Zitat Wohlin, C.: Guidelines for snowballing in systematic literature studies and a replication in software engineering. In: Proceedings of the 18th International Conference on Evaluation and Assessment in Software Engineering, EASE’14, pp. 38:1–38:10 (2014) Wohlin, C.: Guidelines for snowballing in systematic literature studies and a replication in software engineering. In: Proceedings of the 18th International Conference on Evaluation and Assessment in Software Engineering, EASE’14, pp. 38:1–38:10 (2014)
111.
Zurück zum Zitat Yerneni, R., Li, C., Ullman, J.D., Garcia-Molina, H.: Optimizing large join queries in mediation systems. In: ICDT, pp. 348–364 (1999) Yerneni, R., Li, C., Ullman, J.D., Garcia-Molina, H.: Optimizing large join queries in mediation systems. In: ICDT, pp. 348–364 (1999)
112.
Zurück zum Zitat Zeng, L., Veeravalli, B., Zomaya, A.Y.: An integrated task computation and data management scheduling strategy for workflow applications in cloud environments. J. Netw. Comput. Appl. 50, 39–48 (2015)CrossRef Zeng, L., Veeravalli, B., Zomaya, A.Y.: An integrated task computation and data management scheduling strategy for workflow applications in cloud environments. J. Netw. Comput. Appl. 50, 39–48 (2015)CrossRef
113.
Zurück zum Zitat Zhou, A.C., He, B., Liu, C.: Monetary cost optimizations for hosting workflow-as-a-service in IaaS clouds. IEEE Trans. Cloud Comput. 4(1), 34–48 (2016)CrossRef Zhou, A.C., He, B., Liu, C.: Monetary cost optimizations for hosting workflow-as-a-service in IaaS clouds. IEEE Trans. Cloud Comput. 4(1), 34–48 (2016)CrossRef
114.
Zurück zum Zitat Zinn, D., Bowers, S., McPhillips, T., Ludäscher, B.: Scientific workflow design with data assembly lines. In: Proceedings of the 4th Workshop on Workflows in Support of Large-Scale Science, pp. 14:1–14:10 (2009) Zinn, D., Bowers, S., McPhillips, T., Ludäscher, B.: Scientific workflow design with data assembly lines. In: Proceedings of the 4th Workshop on Workflows in Support of Large-Scale Science, pp. 14:1–14:10 (2009)
Metadaten
Titel
The many faces of data-centric workflow optimization: a survey
verfasst von
Georgia Kougka
Anastasios Gounaris
Alkis Simitsis
Publikationsdatum
06.03.2018
Verlag
Springer International Publishing
Erschienen in
International Journal of Data Science and Analytics / Ausgabe 2/2018
Print ISSN: 2364-415X
Elektronische ISSN: 2364-4168
DOI
https://doi.org/10.1007/s41060-018-0107-0