Published in: Journal of Intelligent Information Systems, Issue 3/2020

19 February 2020

The contribution of linked open data to augment a traditional data warehouse

Authors: Nabila Berkani, Ladjel Bellatreche, Selma Khouri, Carlos Ordonez


Abstract

The arrival of Big Data has contributed positively to the evolution of data warehouse (DW) technology. It has given rise to augmented DWs, which aim to maximize the effectiveness of existing ones. Various augmentation scenarios have been proposed and adopted by firms and industry, covering several aspects such as new data sources (e.g., Linked Open Data (LOD), social, stream and IoT data), data ingestion, advanced deployment infrastructures, programming paradigms, and data visualization. These scenarios allow companies to reach valuable decisions. By examining traditional DWs, we realized that they do not fulfill all decision-maker requirements, since the data sources feeding a target DW are not rich enough to capture Big Data. The arrival of the LOD era is an excellent opportunity to enrich traditional DWs with a new V dimension: Value. In this paper, we first conceptualize the variety of internal and external sources and study its effect on the ETL phase in order to ease value capture. Secondly, a Value-driven approach to DW design is discussed. Thirdly, three realistic scenarios for integrating LOD into the DW landscape are given. Finally, experiments are conducted that show the value added by augmenting the existing DW environment with LOD.
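As a rough illustration of what augmenting a DW with LOD can mean at the data level, the following minimal sketch extends the rows of a dimension table with attributes taken from external triples. It is not the authors' method: the function `augment_dimension`, the `lod_uri` key, the `ex:` identifiers, and all sample values are hypothetical and chosen only for this example.

```python
from collections import defaultdict


def augment_dimension(rows, triples, predicates, key="lod_uri"):
    """Return copies of `rows` extended with attributes found in `triples`.

    rows       -- list of dicts, one per dimension member (hypothetical layout)
    triples    -- iterable of (subject, predicate, object) tuples from a LOD source
    predicates -- the external attributes to add to the dimension
    key        -- name of the column holding the member's LOD identifier
    """
    # Index the triples by subject so each row lookup is a dictionary access.
    by_subject = defaultdict(dict)
    for subject, predicate, obj in triples:
        if predicate in predicates:
            by_subject[subject][predicate] = obj

    augmented = []
    for row in rows:
        extra = by_subject.get(row.get(key), {})
        new_row = dict(row)  # leave the original warehouse row untouched
        for predicate in predicates:
            # Missing external values stay None, as in a left outer join.
            new_row[predicate] = extra.get(predicate)
        augmented.append(new_row)
    return augmented


# Made-up sample data: an internal dimension and a few external triples.
city_dim = [
    {"city_id": 1, "name": "CityA", "lod_uri": "ex:CityA"},
    {"city_id": 2, "name": "CityB", "lod_uri": None},
]
lod_triples = [
    ("ex:CityA", "ex:country", "CountryX"),
    ("ex:CityA", "ex:label", "City A"),
]

enriched = augment_dimension(city_dim, lod_triples, ["ex:country"])
```

Members without a matching external resource keep a `None` value, so the internal data is never discarded when the LOD source is incomplete.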


Metadata
Title
The contribution of linked open data to augment a traditional data warehouse
Authors
Nabila Berkani
Ladjel Bellatreche
Selma Khouri
Carlos Ordonez
Publication date
19 February 2020
Publisher
Springer US
Published in
Journal of Intelligent Information Systems / Issue 3/2020
Print ISSN: 0925-9902
Electronic ISSN: 1573-7675
DOI
https://doi.org/10.1007/s10844-020-00594-w
