Published in: Journal on Data Semantics 1/2018

27.10.2017 | Original Article

MongoDB-Based Modular Ontology Building for Big Data Integration

Authors: Hanen Abbes, Faiez Gargouri

Abstract

Big Data are collections of data sets too large and complex to process with classical database management tools. Their main characteristics are volume, variety and velocity. Although these characteristics accentuate heterogeneity problems, users still expect a unified view of the data. Consequently, Big Data integration is a new research area whose challenges stem from these characteristics. Ontologies are widely used in data integration since they represent knowledge as a formal description of a domain of interest. With the advent of Big Data, their implementation faces new challenges due to the volume, variety and velocity dimensions of these data. This paper presents an approach to building a modular ontology for Big Data integration that accounts for the large volume, high-speed generation and wide variety of the data. Our approach exploits a NoSQL database, namely MongoDB, and takes advantage of modular ontologies. It follows three main steps: wrapping data sources as MongoDB databases, generating local ontologies, and finally composing the local ontologies into a global one. We also focus on the implementation of the last two steps.
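
The three steps named in the abstract can be pictured with a minimal sketch. It is an illustration under stated assumptions, not the authors' implementation: plain Python dictionaries stand in for MongoDB databases, the rule "one class per collection, one property per field" is assumed, and class names are matched with `difflib.SequenceMatcher` as a stand-in for whatever similarity measures the paper uses. The names `source_a`, `source_b`, `local_ontology`, `similar` and `compose` are hypothetical.

```python
from difflib import SequenceMatcher

# Step 1 (stand-in): each data source wrapped as a MongoDB-style database,
# modelled here as a mapping of collection name -> list of documents.
source_a = {"customer": [{"name": "Ada", "email": "ada@example.org"}]}
source_b = {"customers": [{"name": "Bob", "phone": "555-0100"}]}


def local_ontology(database):
    """Step 2 (assumed rule): one class per collection, one property per field."""
    ontology = {}
    for collection, documents in database.items():
        fields = set()
        for document in documents:
            fields.update(document.keys())
        ontology[collection] = fields
    return ontology


def similar(a, b, threshold=0.8):
    """Case-insensitive name similarity; the threshold is an arbitrary assumption."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def compose(ontologies):
    """Step 3 (assumed rule): merge classes whose names are similar."""
    global_ontology = {}
    for ontology in ontologies:
        for cls, props in ontology.items():
            match = next((g for g in global_ontology if similar(g, cls)), None)
            if match is None:
                global_ontology[cls] = set(props)
            else:
                global_ontology[match] |= props
    return global_ontology


merged = compose([local_ontology(source_a), local_ontology(source_b)])
```

Here `customer` and `customers` are treated as the same class, so `merged` holds a single class carrying the properties of both sources. A real system would read from MongoDB and emit an ontology language such as OWL; both are left out to keep the sketch self-contained.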

Metadata
Title
MongoDB-Based Modular Ontology Building for Big Data Integration
Authors
Hanen Abbes
Faiez Gargouri
Publication date
27.10.2017
Publisher
Springer Berlin Heidelberg
Published in
Journal on Data Semantics / Issue 1/2018
Print ISSN: 1861-2032
Electronic ISSN: 1861-2040
DOI
https://doi.org/10.1007/s13740-017-0081-z