nach oben

Discover Computing

Erschienen in:

17.12.2018 | Knowledge Graphs and Semantics in Text Analysis and Retrieval

Automated assessment of knowledge hierarchy evolution: comparing directed acyclic graphs

verfasst von: Guruprasad Nayak, Sourav Dutta, Deepak Ajwani, Patrick Nicholson, Alessandra Sala

Erschienen in: Discover Computing | Ausgabe 3-4/2019

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config

KI-gestützte Suche

Aus

Abstract

Automated construction of knowledge hierarchies from huge data corpora is gaining increasing attention in recent years, in order to tackle the infeasibility of manually extracting and semantically linking millions of concepts. As a knowledge hierarchy evolves with these automated techniques, there is a need for measures to assess its temporal evolution, quantifying the similarities between different versions and identifying the relative growth of different subgraphs in the knowledge hierarchy. In this paper, we focus on measures that leverage structural properties of the knowledge hierarchy graph to assess the temporal changes. We propose a principled and scalable similarity measure, based on Katz similarity between concept nodes, for comparing different versions of a knowledge hierarchy, modeled as a generic directed acyclic graph. We present theoretical analysis to depict that the proposed measure accurately captures the salient properties of taxonomic hierarchies, assesses changes in the ordering of nodes, along with the logical subsumption of relationships among concepts. We also present a linear time variant of the measure, and show that our measures, unlike previous approaches, are tunable to cater to diverse application needs. We further show that our measure provides interpretability, thereby identifying the key structural and logical difference in the hierarchies. Experiments on a real DBpedia and biological knowledge hierarchy showcase that our measures accurately capture structural similarity, while providing enhanced scalability and tunability. Also, we demonstrate that the temporal evolution of different subgraphs in this knowledge hierarchy, as captured purely by our structural measure, corresponds well with the known disruptions in the related subject areas.

Vorheriger Artikel Neural variational entity set expansion for automatically populated knowledge graphs

Nächster Artikel Identifying and exploiting target entity type information for ad hoc entity retrieval

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

über 102.000 Bücher
über 537 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Finance + Banking
Management + Führung
Marketing + Vertrieb
Maschinenbau + Werkstoffe
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 390 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Maschinenbau + Werkstoffe

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 340 Zeitschriften

aus folgenden Fachgebieten:

Bauwesen + Immobilien
Business IT + Informatik
Finance + Banking
Management + Führung
Marketing + Vertrieb
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Hierarchical ordering of DBpedia concepts and Wikipedia categories based on Simple Knowledge Organization System (SKOS) : broader relationships between the corresponding Wikipedia pages is henceforth referred to as the DBpedia taxonomy.

For non-identical sets, nodes present in one but absent in other are added as singletons.

The Katz similarity vector of Eq. (1) is obtained by flattening this matrix into a vector.

DBpedia, derived from Wikipedia pages, contains over 6 million vertices and 25 million edges for a hierarchical ordering of Wikipedia concepts using the Simple Knowledge Organization System (SKOS) broader (i.e., skos:broader) relationships.

Adams, E, I. I. I. (1972). Consensus techniques and the comparison of taxonomic trees. Systematic Zoology, 21(4), 390–397.CrossRef

Bille, P. (2005). A survey on tree edit distance and related problems. Theoretical Computer Science, 337(1), 217–239.MathSciNetCrossRefMATH

Bollacker, K., Evans, C., Paritosh, P., Sturge, T., & Taylor, J. (2008). Freebase: A collaboratively created graph database for structuring human knowledge. In Proceedings of the 2008 ACM SIGMOD international conference on management of data, SIGMOD ’08 (pp. 1247–1250). ISBN 978-1-60558-102-6.

Bordea, G., Buitelaar, P., Feralli, S., & Navigli, R. (2015). Semeval-2015 task 17: Taxonomy extraction evaluation (texeval). In SemEval-2015.

Bordea, G., Lefever, E., & Buitelaar, P. (2016). Semeval-2016 task 13: Taxonomy extraction evaluation (texeval-2). In SemEval-2016 (pp. 1081–1091).

Brandenburg, F., Gleiner, A., & Hofmeier, A. (2012). Comparing and aggregating partial orders with Kendall tau distances. In WALCOM: Algorithms and computation (pp. 88–99).

Brank, J., Grobelnik, M., & Mladenić, D. (2005) A survey of ontology evaluation techniques. In SiKDD.

Brank, J., Mladenic, D., & Grobelnik, M. (2006). Gold standard based ontology evaluation using instance assignment. In Workshop on evaluation of ontologies for the web, EON.

Cai, S., & Knight, K. (2013). Smatch: An evaluation metric for semantic feature structures. In Annual Meeting of the Association for Computational Linguistics (ACL) (pp. 748–752). Association for Computational Linguistics.

Cross, V., Yu, X., & Hu, X. (2013). Unifying ontological similarity measures: A theoretical and empirical investigation. International Journal of Approximate Reasoning, 54(7), 861–875.CrossRef

d’Aquin, M. (2009). Formally measuring agreement and disagreement in ontologies. In K-CAP (pp. 145–152).

David, J., Euzenat, J., & Šváb-Zamazal, O. (2010). Ontology similarity in the alignment space. In ISWC (pp. 129–144).

de Knijff, J., Meijer, K., Frasincar, F., & Hogenboom, F. (2011). Word sense disambiguation for automatic taxonomy construction from text-based web corpora. In WISE.

Dellschaft, K., Staab, S. (2006). On how to perform a gold standard based evaluation of ontology learning. In International semantic web conference (pp. 228–241).

Elghawalby, H., & Hancock, E. R. (2008). Measuring graph similarity using spectral geometry. In ICIAR (pp. 517–526).

Fensel, D. (2005). Spinning the semantic web: Bringing the World Wide Web to its full potential. Cambridge: MIT Press.CrossRef

Foggia, P., Percannella, G., & Vento, M. (2014). Graph matching and learning in pattern recognition in the last 10 years. Pattern Recognition and AI, 28(01), 1450001.MathSciNetCrossRef

Fowlkes, E., & Mallows, C. (1983). A method for comparing two hierarchical clusterings. Journal of the American Statistical Association, 78, 553–569.CrossRefMATH

Gao, X., Xiao, B., Tao, D., & Li, X. (2010). A survey of graph edit distance. Pattern Analysis and Applications, 13(1), 113–129.MathSciNetCrossRef

Geffet, M., & Dagan, I. (2005). The distributional inclusion hypotheses and lexical entailment. In ACL (pp. 107–114).

Goyal, P., & Ferrara, E. (2017). Graph embedding techniques, applications, and performance: A survey. arXiv:1705.02801v4 [cs.SI].

Goyal, P., & Ferrara, E. (2018). Graph embedding techniques, applications, and performance: A survey. Knowledge-Based Systems, 151, 78–94.CrossRef

Guarino, N., & Welty, C. (2002). Evaluating ontological decisions with ontoclean. Communications of ACM, 45, 61–65.CrossRef

Harabagiu, S. M., Maiorano, S. J., & Paşca, M. A. (2003). Open-domain textual question answering techniques. Natural Language Engineering, 9(3), 231–267.CrossRef

Hulpuş, I., Prangnawarat, N., & Hayes, C. (2015). Path-based semantic relatedness on linked data and its use to word and entity disambiguation. In ISWC (pp. 442–457).

Jurgens, D., & Pilehvar, M. T. (2016). Semeval-2016: Semantic taxonomy enrichment. In SemEval (pp. 1092–1102).

Katz, L. (1953). A new status index derived from sociometric analysis. Psychometrika, 18(1), 39–43.CrossRefMATH

Koutra, D., Ke, T.-Y., Kang, U., Chau, D., Pao, H.-K., & Faloutsos, C. (2011). Unifying guilt-by-association approaches: Theorems and fast algorithms. In ECML-PKDD (pp. 245–260).

Koutra, D., Shah, N., Vogelstein, J. T., Gallagher, B., & Faloutsos, C. (2016). Deltacon: A principled massive-graph similarity function with attribution. TKDD, 10(3), 28.CrossRef

Kozareva, Z., & Hovy, E. (2010). A semi-supervised method to learn and construct taxonomies using the web. In Proceedings of the 2010 conference on empirical methods in natural language processing (pp. 1110–1118). Association for Computational Linguistics.

Lehmann, J., Isele, R., Jakob, M., Jentzsch, A., Kontokostas, D., Mendes, P. N., et al. (2015). Dbpedia: A large-scale, multilingual knowledge base extracted from wikipedia. Semantic Web, 6(2), 167–195.

Levin, R., Abassi, H., & Cohen, U. (2016). Guided walk: A scalable recommendation algorithm for complex heterogeneous social networks. In Recommender systems (pp. 293–300).

Liu, X., Song, Y., Liu, S., & Wang, H. (2012). Automatic taxonomy construction from keywords. In KDD (pp. 1433–1441).

Maedche, A., & Staab, S. (2002). Measuring similarity between ontologies. In EKAW (pp. 251–263).

Malmi, E., Tatti, N., & Gionis, A. (2015). Beyond rankings: Comparing directed acyclic graphs. Data Mining and Knowledge Discovery, 29(5), 1233–1257.MathSciNetCrossRefMATH

McVicar, M., Sach, B., Mesnage, C., Lijffijt, J., Spyropoulou, E., & De Bie, T. (2016). Sumoted: An intuitive edit distance between rooted unordered uniquely-labelled trees. Pattern Recognition Letters, 79, 52–59.CrossRef

Newman, M. (2010). Networks: An introduction. Oxford: Oxford University Press.CrossRefMATH

Ou, M., Cui, P., Pei, J., Zhang, Z., & Zhu, W. (2016). Asymmetric transitivity preserving graph embedding. In KDD (pp. 1105–1114).

Pesquita, C., Faria, D., Falco, A. O., Lord, P., & Couto, F. M. (2009). Semantic similarity in biomedical ontologies. PLOS Computational Biology, 5(7), 1–12.MathSciNetCrossRef

Rand, W. M. (1971). Objective criteria for the evaluation of clustering methods. JASA, 66(336), 846–850.CrossRef

Resnik, P. (1995). Using information content to evaluate semantic similarity in taxonomy. In IJCAI (pp. 448–453).

Shervashidze, N., Vishwanathan, S., Petri, T., Mehlhorn, K., & Borgwardt, K. (2009). Efficient graphlet kernels for large graph comparison. In AI and Statistics (pp. 488–495).

Suchanek, F. M., Kasneci, G., & Weikum, G. (2008). YAGO: A large ontology from wikipedia and wordnet. Journal of Web Semantics, 6(3), 203–217.CrossRef

Sun, J., Ajwani, D., Nicholson, P. K., Sala, A., & Parthasarathy, S. (2017). Breaking cycles in noisy hierarchies. In WebSci (pp. 151–160).

Suominen, O., & Hyvönen, E. (2012). Improving the quality of SKOS vocabularies with skosify. In EKAW (pp. 383–397).

Vedula, N., Nicholson, P. K., Ajwani, D., Dutta, S., Sala, A., & Parthasarathy, S. (2018). Enriching taxonomies with functional domain knowledge. In SIGIR.

Velardi, P., Navigli, R., Faralli, S., & María Ruiz-Martínez, J. (2012). A new method for evaluating automatically learned terminological taxonomies. In LREC (pp. 1498–1504).

Velardi, P., Faralli, S., & Navigli, R. (2013). Ontolearn reloaded: A graph-based algorithm for taxonomy induction. Computational Linguistics, 39(3), 665–707.CrossRef

Volker, J., Vrandecic, J., Sure, J., & Hotho, A. (2010). AEON: An approach to the automatic evaluation of ontologies. Journal of Applied Ontology, 3, 41–62.

Wagner, S., & Wagner, D. (2007). Comparing clusterings: An overview. Analysis, 4769, 1–19.

Wu, Z., & Palmer, M. (1994). Verbs semantics and lexical selection. In ACL (pp. 133–138).

Zavitsanos, E., Paliouras, G., & Vouros, G. A. (2011). Gold standard evaluation of ontology learning methods through ontology transformation and alignment. TKDE, 23, 1635–1648.MATH

Titel: Automated assessment of knowledge hierarchy evolution: comparing directed acyclic graphs
verfasst von: Guruprasad Nayak
Sourav Dutta
Deepak Ajwani
Patrick Nicholson
Alessandra Sala
Publikationsdatum: 17.12.2018
Verlag: Springer Netherlands
Erschienen in: Discover Computing / Ausgabe 3-4/2019
Print ISSN: 2948-2984
Elektronische ISSN: 2948-2992
DOI: https://doi.org/10.1007/s10791-018-9345-y

Springer Professional

Abstract

Bitte loggen Sie sich ein, um Zugang zu Ihrer Lizenz zu erhalten.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Springer Professional "Technik"

Springer Professional "Wirtschaft"

Weitere Artikel der Ausgabe 3-4/2019

Identifying and exploiting target entity type information for ad hoc entity retrieval

Neural architecture for question answering using a knowledge graph and web corpus

Neural variational entity set expansion for automatically populated knowledge graphs

Payoffs and pitfalls in using knowledge-bases for consumer health search

Overcoming low-utility facets for complex answer retrieval

Special issue on knowledge graphs and semantics in text analysis and retrieval