Abstract
Ontology matching is among the core techniques used for heterogeneity resolution by information and knowledge-based systems. However, due to the growing volume and ever-evolving nature of data, ontologies are becoming large-scale and complex, which leads to performance bottlenecks during ontology matching. In this paper, we present our performance-based ontology matching system. Today’s desktop and cloud platforms are equipped with parallelism-enabled multicore processors. Our system exploits this opportunity and provides effectiveness-independent, data-parallel ontology matching over parallelism-enabled platforms. It decomposes complex ontologies into smaller, simpler, and scalable subsets depending upon the needs of the matching algorithms. The matching process over these subsets is divided, from coarser to finer-level abstraction, into independent matching requests, matching jobs, and matching tasks, which run in parallel over parallelism-enabled platforms. Execution of the matching algorithms is aligned to minimize the matching space during the matching process. We comprehensively evaluated our system over OAEI’s dataset of fourteen real-world ontologies from diverse domains, with different sizes and complexities. We executed twenty different matching tasks over a parallelism-enabled desktop and the Microsoft Azure public cloud platform. In a single-node desktop environment, our system provides a performance speedup of 4.1, 5.0, and 4.9 times for medium, large, and very large-scale ontologies, respectively. In a single-node cloud environment, it provides a speedup of 5.9, 7.4, and 7.0 times for the same three categories. In a multi-node (3 nodes) environment, it provides a speedup of 15.16 and 21.51 times over desktop and cloud platforms, respectively.
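The decomposition into subsets and independent matching tasks described above can be illustrated with a minimal Python sketch. It is not the system's implementation: the function names (partition, match_task, parallel_match), the case-insensitive label comparison, and the contiguous partitioning scheme are all assumptions made for illustration. A thread pool keeps the sketch self-contained; a CPU-bound matcher in Python would more likely use a process pool to obtain real multicore speedup.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def partition(items, n_parts):
    """Split a list into up to n_parts near-equal contiguous subsets."""
    size, extra = divmod(len(items), n_parts)
    parts, start = [], 0
    for i in range(n_parts):
        end = start + size + (1 if i < extra else 0)
        parts.append(items[start:end])
        start = end
    # Drop empty subsets that appear when n_parts exceeds len(items).
    return [p for p in parts if p]


def match_task(pair):
    """One independent matching task: compare one source subset with one target subset.

    The matcher here is a placeholder (case-insensitive label equality),
    standing in for whatever matching algorithm is actually used.
    """
    source_subset, target_subset = pair
    return [
        (s, t)
        for s in source_subset
        for t in target_subset
        if s.lower() == t.lower()
    ]


def parallel_match(source, target, n_parts=2, workers=4):
    """Run every subset-pair task in a worker pool and merge the results."""
    tasks = list(product(partition(source, n_parts), partition(target, n_parts)))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(match_task, tasks)
    return sorted(m for chunk in results for m in chunk)


if __name__ == "__main__":
    source_labels = ["Heart", "Lung", "Liver", "Kidney"]
    target_labels = ["kidney", "heart", "Spleen"]
    print(parallel_match(source_labels, target_labels))
```

Because each task only reads its own pair of subsets, the tasks share no mutable state and can be scheduled in any order, which is the property that makes this kind of data-parallel decomposition straightforward to distribute.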
Notes
2 Utilization of communication by each core component is described in the components explanation.
Acknowledgments
This work was supported by a post-doctoral fellowship grant from Kyung Hee University, Korea, in 2011 (KHU-20110219).
Cite this article
Amin, M.B., Khan, W.A., Lee, S. et al. Performance-based ontology matching. Appl Intell 43, 356–385 (2015). https://doi.org/10.1007/s10489-015-0648-z
DOI: https://doi.org/10.1007/s10489-015-0648-z