Performance-based ontology matching

A data-parallel approach for an effectiveness-independent performance-gain in ontology matching

Applied Intelligence

Abstract

Ontology matching is among the core techniques used for heterogeneity resolution by information and knowledge-based systems. However, because data is abundant and constantly evolving, ontologies are becoming large-scale and complex, which leads to performance bottlenecks during ontology matching. In this paper, we present our performance-based ontology matching system. Today’s desktop and cloud platforms are equipped with parallelism-enabled multicore processors. Our system exploits this opportunity and provides effectiveness-independent, data-parallel ontology matching over parallelism-enabled platforms. It decomposes complex ontologies into smaller, simpler, and scalable subsets according to the needs of the matching algorithms. The matching process over these subsets is divided, from coarser to finer granularity, into independent matching requests, matching jobs, and matching tasks, which run in parallel over parallelism-enabled platforms. Execution of the matching algorithms is arranged to minimize the matching space during the matching process. We comprehensively evaluated our system on OAEI’s dataset of fourteen real-world ontologies from diverse domains, with different sizes and complexities. We executed twenty different matching tasks on a parallelism-enabled desktop and on the Microsoft Azure public cloud platform. In a single-node desktop environment, our system provides a performance speedup of 4.1, 5.0, and 4.9 times for medium, large, and very large-scale ontologies, respectively. In a single-node cloud environment, the speedup is 5.9, 7.4, and 7.0 times for medium, large, and very large-scale ontologies, respectively. In a multi-node (3 nodes) environment, the speedup is 15.16 and 21.51 times over desktop and cloud platforms, respectively.
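
The data-parallel idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the system described in the paper: the string-similarity matcher, the contiguous partitioning scheme, the threshold, and all function names (`name_similarity`, `match_subset`, `partition`, `match_parallel`) are assumptions chosen for illustration. The sketch splits the source concepts into subsets, runs one matching task per subset in parallel, and concatenates the results, so the parallel run returns the same mappings as the sequential run.

```python
from concurrent.futures import ThreadPoolExecutor
from difflib import SequenceMatcher


def name_similarity(a: str, b: str) -> float:
    """Toy string-based matcher (hypothetical stand-in for a real matching algorithm)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_subset(source_subset, target_concepts, threshold):
    """One matching task: compare a subset of source concepts with all target concepts."""
    return [
        (s, t)
        for s in source_subset
        for t in target_concepts
        if name_similarity(s, t) >= threshold
    ]


def partition(items, n_parts):
    """Split a list into n_parts contiguous chunks of near-equal size."""
    size, extra = divmod(len(items), n_parts)
    chunks, start = [], 0
    for i in range(n_parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def match_sequential(source, target, threshold=0.8):
    """Baseline: a single task over the whole source ontology."""
    return match_subset(source, target, threshold)


def match_parallel(source, target, threshold=0.8, workers=4):
    """Data-parallel variant: same matcher, applied to disjoint subsets concurrently."""
    chunks = partition(source, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves chunk order, so the merged result keeps source order.
        results = pool.map(lambda c: match_subset(c, target, threshold), chunks)
    return [pair for chunk_result in results for pair in chunk_result]


if __name__ == "__main__":
    source = ["Heart", "Lung", "Kidney", "Liver", "Brain"]
    target = ["heart", "lungs", "kidney", "skin"]
    # Parallelism changes only how the work is scheduled, not which mappings are found.
    assert match_parallel(source, target) == match_sequential(source, target)
    print(match_parallel(source, target))
```

Threads keep the sketch short and portable. For CPU-bound matchers in CPython, a process pool or a multi-node setup would be needed to observe real speedup.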

Notes

  1. http://www.w3.org/TR/REC-rdf-syntax/

  2. Utilization of communication by each core component is described in the components explanation.

Acknowledgments

This work was supported by a post-doctoral fellowship grant from Kyung Hee University, Korea, in 2011 (KHU-20110219).

Author information

Corresponding author

Correspondence to Sungyoung Lee.

About this article

Cite this article

Amin, M.B., Khan, W.A., Lee, S. et al. Performance-based ontology matching. Appl Intell 43, 356–385 (2015). https://doi.org/10.1007/s10489-015-0648-z

  • DOI: https://doi.org/10.1007/s10489-015-0648-z
