Published in: Knowledge and Information Systems 1/2019

15 December 2018 | Regular Paper

A graph-based meta-model for heterogeneous data management

Authors: Ernesto Damiani, Barbara Oliboni, Elisa Quintarelli, Letizia Tanca


Abstract

The wave of interest in data-centric applications has spawned a wide variety of data models, making it extremely difficult to evaluate, integrate or access them in a uniform way. Moreover, many recent models are too specific to allow immediate comparison with the others and do not easily support incremental model design. In this paper, we introduce GSMM, a meta-model based on a generic graph that can be instantiated to a concrete data model by simply providing values for a restricted set of parameters and some high-level constraints, themselves represented as graphs. In GSMM, the concept of data schema is replaced by that of constraint, which allows the designer to impose structural restrictions on data in a very flexible way. GSMM includes GSL, a graph-based language for expressing queries and constraints that, besides being applicable to data represented in GSMM, can in principle be specialised and used for existing models for which no language has been defined. We show some sample applications of GSMM for deriving and comparing classical data models such as the relational model, plain XML data, XML Schema, and time-varying semistructured data. We also show how GSMM can represent more recent modelling proposals: triple stores, the BigTable model and Neo4j, a graph-based model for NoSQL data. A prototype showing the potential of the approach is also described.
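
To make the idea of instantiating a generic graph more concrete, the following is a minimal illustrative sketch in Python. It is not GSMM's actual definition: the class names, the two parameters (`node_labels`, `edge_labels`) and the `single_root` constraint are hypothetical, and constraints are simplified to plain predicates, whereas in GSMM they are themselves represented as graphs.

```python
from dataclasses import dataclass, field


@dataclass
class Graph:
    """A generic labelled, directed graph (hypothetical simplification)."""
    nodes: dict = field(default_factory=dict)  # node id -> node label
    edges: set = field(default_factory=set)    # (source id, edge label, target id)


@dataclass
class ModelInstance:
    """Hypothetical 'concrete model': parameter values plus constraints."""
    node_labels: set   # node labels the model allows
    edge_labels: set   # edge labels the model allows
    constraints: list  # predicates Graph -> bool (an assumption, see lead-in)

    def admits(self, g: Graph) -> bool:
        """Check that g uses only allowed labels and satisfies every constraint."""
        labels_ok = set(g.nodes.values()) <= self.node_labels
        edges_ok = all(label in self.edge_labels for _, label, _ in g.edges)
        return labels_ok and edges_ok and all(c(g) for c in self.constraints)


def single_root(g: Graph) -> bool:
    """Illustrative constraint: exactly one node has no incoming edge."""
    targets = {target for _, _, target in g.edges}
    return len(set(g.nodes) - targets) == 1


# A tree-like model obtained by fixing the parameters and one constraint.
tree_like = ModelInstance({"element", "text"}, {"child"}, [single_root])

doc = Graph({1: "element", 2: "element", 3: "text"},
            {(1, "child", 2), (2, "child", 3)})
forest = Graph({1: "element", 2: "element"}, set())

doc_ok = tree_like.admits(doc)        # one root: node 1
forest_ok = tree_like.admits(forest)  # two nodes without incoming edges
```

In this sketch, no schema is declared; a data graph is accepted or rejected purely by checking it against the chosen parameters and constraints, which mirrors the abstract's point that constraints take the place of the data schema.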


Footnotes
1. We say that data are semi-structured when, although some structure is present, it is not as strict, regular, or complete as that required by traditional database management systems [1].
2. BigTable is the model shared by popular NoSQL databases like Apache HBase and Cassandra [13].
3. In the remainder of the paper we denote constants by means of lowercase words, whereas words denoting variables start with a capital letter.
4. The notation \(b_2\mid N'\) stands for the restriction of mapping \(b_2\) to the nodes in \(N'\).
5. Plain XML documents may also contain ENTITY nodes, not unlike macro calls that must be expanded before parsing. We do not consider ENTITY expansion in this paper.
6. For the sake of conciseness, Table 1 does not explicitly consider Base Types, because they may be very large.
7. An edge pointing to \(m_2\).
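
The restriction notation \(b_2\mid N'\) used in the footnotes can be illustrated with a short Python sketch. Representing a mapping as a dictionary is an assumption made here for illustration, and the node and value names are hypothetical.

```python
def restrict(mapping, nodes):
    """Return the restriction of `mapping` to the keys that belong to `nodes`."""
    return {n: value for n, value in mapping.items() if n in nodes}


# Hypothetical mapping b_2 and node subset N'.
b2 = {"n1": "m1", "n2": "m2", "n3": "m3"}
n_prime = {"n1", "n3"}

# b_2 | N' agrees with b_2 on N' and is undefined elsewhere.
b2_restricted = restrict(b2, n_prime)
```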
 
References
1. Abiteboul S (1997) Querying semi-structured data. In: Proceedings of the international conference on database theory, vol 1186. Lecture notes in computer science, pp 262–275
2. Angles R (2012) A comparison of current graph database models. In: Proceedings of the 2012 IEEE 28th international conference on data engineering workshops, ICDEW ’12. IEEE Computer Society, Washington, DC, pp 171–177
3. Atzeni P, Cappellari P, Torlone R, Bernstein PA, Gianforme G (2008) Model-independent schema translation. VLDB J 17(6):1347–1370
4. Atzeni P, Torlone R (2001) A unified framework for data translation over the web. In: Proceedings of the 2nd international conference on web information system engineering. IEEE Computer Society, pp 350–358
5. Bekiropoulos K, Keramopoulos E, Beza O, Mouratidis P (2010) A list of features that a graphical XML query language should support. Comput Syst Sci Eng 25(5):13–21
6. Benda S, Klímek J, Nečaský M (2013) Using schematron as schema language in conceptual modeling for XML. In: Proceedings of the ninth Asia-Pacific conference on conceptual modelling, vol 143, APCCM ’13. Australian Computer Society, Inc., Darlinghurst, pp 31–40
7. Bernstein PA, Halevy AY, Pottinger RA (2000) A vision for management of complex models. SIGMOD Rec 29(4):55–63
8. Bernstein PA, Pottinger R (2003) Merging models based on given correspondences. Technical report UW-CSE-03-02-03. University of Washington
9. Bowers S, Delcambre L (2000) Representing and transforming model-based information. In: Proceedings of International workshop on the semantic web at the 4th European conference on research and advanced technology for digital libraries (SemWeb)
10. Buneman P, Fan W, Siméon J, Weinstein S (2001) Constraints for semistructured data and XML. SIGMOD Rec 30:47–54
11. Buneman P, Fan W, Weinstein S (1998) Path constraints on semistructured and structured data. In: Proceedings of 17th symposium on principles of database systems. ACM Press, pp 129–138
12. Cattell R (2011) Scalable SQL and NoSQL data stores. SIGMOD Rec 39(4):12–27
13. Chang F, Dean J, Ghemawat S, Hsieh WC, Wallach DA, Burrows M, Chandra T, Fikes A, Gruber RE (2008) Bigtable: a distributed storage system for structured data. ACM Trans Comput Syst 26(2):4:1–4:26
14. Chawathe SS, Abiteboul S, Widom J (1998) Representing and querying changes in semistructured data. In: Proceedings of the fourteenth international conference on data engineering. IEEE Computer Society, pp 4–13
15. Chawathe SS, Abiteboul S, Widom J (1999) Managing historical semistructured data. Theory Pract Object Syst 5(3):143–162
16. Chen L, Oughtred R, Berman HM, Westbrook J (2004) Targetdb: a target registration database for structural genomics projects. Bioinform Appl Notes 20(16):2860–2862
17. Combi C, Oliboni B, Quintarelli E (2012) Modeling temporal dimensions of semistructured data. J Intell Inf Syst 38(3):601–644
18. Cortesi A, Dovier A, Quintarelli E, Tanca L (2002) Operational and abstract semantics of a query language for semi-structured information. Theor Comput Sci 275(1–2):521–560
19. Damiani E, Oliboni B, Quintarelli E, Tanca L (2003) Modeling semistructured data by using graph-based constraints. In: OTM workshops proceedings. Lecture notes in computer science. Springer, Berlin, pp 20–21
20. Damiani E, Tanca L (1997) Semantic approaches to structuring and querying web sites. In: Proceedings of 7th IFIP working conference on database semantics (DS-97)
21. Fan W, Lu P (2017) Dependencies for graphs. In: Proceedings of the 36th ACM SIGMOD-SIGACT-SIGAI symposium on principles of database systems, PODS ’17. ACM, pp 403–416
22. Indrawan-Santiago M (2012) Database research: Are we at a crossroad? reflection on NoSQL. In: Proceedings of the 2012 15th international conference on network-based information systems, NBIS ’12. IEEE Computer Society, Washington, DC, pp 45–51
23. Kaur K, Rani R (2013) Modeling and querying data in NoSQL databases. In: Proceedings of the IEEE international conference on Big Data, pp 1–7
24. Lee KK-Y, Tang W-C, Choi K-S (2013) Alternatives to relational database: comparison of NoSQL and XML approaches for clinical data storage. Comput Methods Progr Biomed 110(1):99–109
25. Levy AY, Rajaraman A, Ordille JJ (1996) Querying heterogeneous information sources using source descriptions. In: Proceedings of the twenty-second international conference on very large databases. VLDB Endowment, Saratoga, Calif., Bombay, India, pp 251–262
26. Makoto M, Lee D, Mani M, Kawaguchi K (2005) Taxonomy of XML schema languages using formal language theory. ACM Trans Internet Technol 5(4):660–704
27. McBrien P, Poulovassilis A (1999) A uniform approach to inter-model transformations. In: Conference on advanced information systems engineering, pp 333–348
28. Oliboni B, Quintarelli E, Tanca L (2001) Temporal aspects of semistructured data. In: Proceedings of the eighth international symposium on temporal representation and reasoning (TIME-01). IEEE Computer Society, pp 119–127
29. Papakonstantinou Y, Garcia-Molina H, Widom J (1995) Object exchange across heterogeneous information sources. In: Proceedings of the eleventh international conference on data engineering. IEEE Computer Society, pp 251–260
30. Paredaens J, Peelman P, Tanca L (1995) G-Log: a declarative graphical query language. IEEE Trans Knowl Data Eng 7(3):436–453
31. Vicknair C, Macias M, Zhao Z, Nan X, Chen Y, Wilkins D (2010) A comparison of a graph database and a relational database: a data provenance perspective. In: Proceedings of the 48th annual southeast regional conference, ACM SE ’10. ACM, New York, NY, pp 42:1–42:6
32. Virgilio RD, Maccioni A, Torlone R (2014) Graph-driven exploration of relational databases for efficient keyword search. In: Candan KS, Amer-Yahia S, Schweikardt N, Christophides V, Leroy V (eds) Proceedings of the workshops of the EDBT/ICDT 2014 joint conference (EDBT/ICDT 2014), Athens, Greece, March 28, 2014, Vol. 1133 of CEUR workshop proceedings, CEUR-WS.org, pp 208–215
35. Zang T, Calinescu R, Kwiatkowska MZ (2011) Metamodel-driven SOA for collaborative e-science application. Comput Syst Sci Eng 26(3):215–226
Metadata
Title: A graph-based meta-model for heterogeneous data management
Authors: Ernesto Damiani, Barbara Oliboni, Elisa Quintarelli, Letizia Tanca
Publication date: 15 December 2018
Publisher: Springer London
Published in: Knowledge and Information Systems, Issue 1/2019
Print ISSN: 0219-1377
Electronic ISSN: 0219-3116
DOI: https://doi.org/10.1007/s10115-018-1305-8
