Published in: Information Systems and e-Business Management 2/2017

09-07-2016 | Original Article

Automating ETL processes using the domain-specific modeling approach

Authors: Marko Petrović, Milica Vučković, Nina Turajlić, Slađan Babarogić, Nenad Aničić, Zoran Marjanović


Abstract

The development of Extract–Transform–Load (ETL) processes is the most complex, time-consuming and expensive phase of data warehouse development. Yet the dynamics of modern business systems demand a more agile and flexible approach to their development. As a result, current research in this area focuses on the conceptualization of ETL processes and the automation of their development. This paper proposes a novel solution for automating ETL processes using the domain-specific modeling (DSM) approach. The solution is based on the formal specification of ETL processes and the implementation of such specifications. In accordance with the DSM approach, several new domain-specific languages (DSLs) are introduced, each defining the concepts relevant to a specific aspect of an ETL process. The focus of this paper is the implementation of the formal specification of an ETL process. To this end, a specific ETL platform (ETL-PL) is introduced to support both the modeling of ETL processes (i.e., the creation of models in accordance with the introduced DSLs) and the automated transformation of the created models into the executable code of a specific application framework (representing ETL-PL’s execution environment). ETL-PL executes ETL models dynamically: the executable code is generated at runtime. The execution environment therefore consists of code generator components and the components implementing the application framework. ETL-PL has been implemented as an extension of the .NET platform.
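
The idea of generating executable code from an ETL model at runtime can be illustrated with a minimal sketch. The abstract does not show the paper's DSLs or ETL-PL's .NET components, so nothing below reproduces them: the sketch is written in Python, the model is a plain dictionary standing in for a DSL model, and every name (`MODEL`, `generate_source`, `compile_model`, the field names) is hypothetical.

```python
from typing import Callable, Dict, List

# Hypothetical ETL model: plain data standing in for a model written in a DSL.
# The filter and mapping expressions are assumptions made for illustration only.
MODEL = {
    "name": "load_customers",
    "filter": "row['country'] == 'RS'",
    "mappings": {
        "customer_id": "int(row['id'])",
        "full_name": "row['first'].strip() + ' ' + row['last'].strip()",
    },
}


def generate_source(model: dict) -> str:
    """Code generator: turn the declarative model into Python source text."""
    lines = [
        f"def {model['name']}(rows):",
        "    out = []",
        "    for row in rows:",
        f"        if not ({model['filter']}):",
        "            continue",
        "        out.append({",
    ]
    for target, expr in model["mappings"].items():
        lines.append(f"            {target!r}: {expr},")
    lines += ["        })", "    return out"]
    return "\n".join(lines) + "\n"


def compile_model(model: dict) -> Callable[[List[Dict[str, str]]], List[dict]]:
    """Compile the generated source at runtime and return the resulting function."""
    namespace: dict = {}
    code = compile(generate_source(model), f"<etl:{model['name']}>", "exec")
    exec(code, namespace)  # only safe for trusted models
    return namespace[model["name"]]


if __name__ == "__main__":
    transform = compile_model(MODEL)
    source_rows = [
        {"id": "1", "first": " Ana ", "last": "Ilić", "country": "RS"},
        {"id": "2", "first": "Bob", "last": "Smith", "country": "US"},
    ]
    print(transform(source_rows))
```

In this sketch, changing the model changes the behavior of the next run without editing any hand-written transformation code, which is the property the abstract attributes to runtime code generation. Executing generated source with `exec` is acceptable only when the models come from a trusted source.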


Metadata
Title: Automating ETL processes using the domain-specific modeling approach
Authors: Marko Petrović, Milica Vučković, Nina Turajlić, Slađan Babarogić, Nenad Aničić, Zoran Marjanović
Publication date: 09-07-2016
Publisher: Springer Berlin Heidelberg
Published in: Information Systems and e-Business Management, Issue 2/2017
Print ISSN: 1617-9846
Electronic ISSN: 1617-9854
DOI: https://doi.org/10.1007/s10257-016-0325-8
