Published in: Cluster Computing 4/2013

1 December 2013

Towards a conceptualization of ETL and physical storage of semantic data warehouses as a service

Authors: Nabila Berkani, Ladjel Bellatreche, Selma Khouri

Abstract

Data warehouse technology has become the undisputed tool for businesses and organizations to make strategic decisions and ensure their competitiveness. The construction of a data warehouse (\(\mathcal{D}\mathcal{W}\)) proceeds by selecting relevant information sources, extracting the relevant data, and loading them into the \(\mathcal{D}\mathcal{W}\). These processes require precise expertise from designers regarding the logical and physical implementations of information sources, which is rarely an easy task. The diversity and heterogeneity of information sources make the construction of the \(\mathcal{D}\mathcal{W}\) complex and time-consuming. Domain ontologies have been proposed to reduce heterogeneity between sources, platforms, services, etc.; they resolve syntactic and semantic conflicts. The adoption of domain ontologies by organizations has created a new type of database, called semantic databases (\(\mathcal{S}\mathcal{D}\mathcal{B}\)). As a consequence, these databases become candidates for building the semantic \(\mathcal{D}\mathcal{W}\) (\(\mathcal{S}\mathcal{D}\mathcal{W}\)). To handle the diversity of information sources and hide their implementation aspects, a generic framework for constructing a \(\mathcal{D}\mathcal{W}\) becomes a necessity. In this paper, we first propose an ontology-based approach for designing \(\mathcal{S}\mathcal{D}\mathcal{B}\). Second, ETL phases are defined at the ontological level to hide implementation details. Third, a storage service for ontologies and their associated data is given. Finally, our proposal is validated through a case study and a tool.


Metadata
Title
Towards a conceptualization of ETL and physical storage of semantic data warehouses as a service
Authors
Nabila Berkani
Ladjel Bellatreche
Selma Khouri
Publication date
1 December 2013
Publisher
Springer US
Published in
Cluster Computing / Issue 4/2013
Print ISSN: 1386-7857
Electronic ISSN: 1573-7543
DOI
https://doi.org/10.1007/s10586-013-0266-7
