
2019 | OriginalPaper | Chapter

15. Challenges, Approaches and Solutions in Data Integration for Research and Innovation

Authors: Maurizio Lenzerini, Cinzia Daraio

Published in: Springer Handbook of Science and Technology Indicators

Publisher: Springer International Publishing


Abstract

To be implemented by policy makers, science, technology, and innovation (STI) policies and indicator building need data. Wherever data are needed, so is a method for managing them, and in the era of big data, data integration plays a crucial role. STI policies and indicator development therefore need data integration. Two main approaches to data integration exist: procedural and declarative. In this chapter, we follow the latter approach and focus on the ontology-based data integration (OBDI) paradigm. The main principles of OBDI are:
(i) Leave the data where they are.
(ii) Build a conceptual specification of the domain of interest (ontology), in terms of knowledge structures.
(iii) Map such knowledge structures to concrete data sources.
(iv) Express all services over the abstract representation.
(v) Automatically translate knowledge services to data services.

We introduce the main challenges of data integration for research and innovation (R&I) and show that reasoning over an ontology connected to data may be very helpful for the study of R&I. We also provide examples using Sapientia, an ontology specifically defined for multidimensional research assessment.
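
The five OBDI principles listed above can be illustrated with a small sketch. It is not the chapter's method and does not use Sapientia: the concept names (`Researcher`, `Professor`, `PrincipalInvestigator`), the two in-memory SQLite sources, their tables and the helper functions are all hypothetical, and the "ontology" is reduced to plain subclass axioms. Real OBDI systems use much richer ontology languages and query rewriting techniques.

```python
import sqlite3

# Principle (i): the data stay in their sources. Two in-memory SQLite
# databases stand in for independent, hypothetical source systems.
def make_sources():
    hr = sqlite3.connect(":memory:")
    hr.execute("CREATE TABLE staff (name TEXT, role TEXT)")
    hr.executemany("INSERT INTO staff VALUES (?, ?)",
                   [("Ada", "professor"), ("Bob", "technician")])
    grants = sqlite3.connect(":memory:")
    grants.execute("CREATE TABLE pi (full_name TEXT)")
    grants.executemany("INSERT INTO pi VALUES (?)", [("Carla",), ("Ada",)])
    return {"hr": hr, "grants": grants}

# Principle (ii): a toy ontology made only of subclass axioms
# (child concept -> parent concept). Names are illustrative assumptions.
SUBCLASS_OF = {
    "Professor": "Researcher",
    "PrincipalInvestigator": "Researcher",
}

# Principle (iii): mappings from ontology concepts to queries over sources.
MAPPINGS = {
    "Professor": [("hr", "SELECT name FROM staff WHERE role = 'professor'")],
    "PrincipalInvestigator": [("grants", "SELECT full_name FROM pi")],
}

def subconcepts(concept):
    """Return the concept plus every concept transitively subsumed by it."""
    found = {concept}
    changed = True
    while changed:
        changed = False
        for child, parent in SUBCLASS_OF.items():
            if parent in found and child not in found:
                found.add(child)
                changed = True
    return found

def rewrite(concept):
    """Principle (v): translate an ontology-level request into source queries."""
    queries = []
    for name in sorted(subconcepts(concept)):
        queries.extend(MAPPINGS.get(name, []))
    return queries

def instances_of(concept, sources):
    """Principle (iv): the service is expressed over the ontology only."""
    result = set()
    for source_name, sql in rewrite(concept):
        for (value,) in sources[source_name].execute(sql):
            result.add(value)
    return sorted(result)

if __name__ == "__main__":
    print(instances_of("Researcher", make_sources()))
```

Asking for `Researcher` returns people from both sources, although neither source stores that concept: the subclass axioms supply the link. This is the kind of reasoning over an ontology connected to data that the abstract refers to, reduced here to its simplest form.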


Metadata
Title: Challenges, Approaches and Solutions in Data Integration for Research and Innovation
Authors: Maurizio Lenzerini, Cinzia Daraio
Copyright Year: 2019
Publisher: Springer International Publishing
DOI: https://doi.org/10.1007/978-3-030-02511-3_15
