
2017 | OriginalPaper | Chapter

WebIsALOD: Providing Hypernymy Relations Extracted from the Web as Linked Open Data

Authors: Sven Hertling, Heiko Paulheim

Published in: The Semantic Web – ISWC 2017

Publisher: Springer International Publishing


Abstract

Hypernymy relations are an important asset in many applications, and a central ingredient to Semantic Web ontologies. The IsA database is a large collection of such hypernymy relations extracted from the Common Crawl. In this paper, we introduce WebIsALOD, a Linked Open Data release of the IsA database, containing 400M hypernymy relations, each provided with rich provenance information. As the original dataset contained more than 80% wrong, noisy extractions, we run a machine learning algorithm to assign confidence scores to the individual statements. Furthermore, 2.5M links to DBpedia and 23.7k links to the YAGO class hierarchy were created at a precision of 97%. In total, the dataset contains 5.4B triples.

Footnotes
2
NP stands for noun phrase.
 
4
We required workers to have a 95% approval rate and a minimum of 100 approved HITs (human intelligence tasks), following the recommendations of [3, 5], and restricted their location to the US to attract a large fraction of native speakers.
 
7
We did not use the gold standard of the full dataset for training because of its imbalance (cf. Sect. 2): with only 37 positive examples out of 500, there are too few to learn a meaningful model.
 
Literature
1.
Bryl, V., Bizer, C., Paulheim, H.: Gathering alternative surface forms for DBpedia entities. In: NLP-DBPEDIA@ISWC, pp. 13–24 (2015)
2.
Carroll, J.J., Bizer, C., Hayes, P., Stickler, P.: Named graphs, provenance and trust. In: International Conference on World Wide Web, pp. 613–622. ACM (2005)
3.
Hauser, D.J., Schwarz, N.: Attentive Turkers: MTurk participants perform better on online attention checks than do subject pool participants. Behav. Res. Methods 48(1), 400–407 (2016)
4.
Hsu, C.W., Chang, C.C., Lin, C.J.: A practical guide to support vector classification (2003)
5.
Kazai, G.: In search of quality in crowdsourcing for search engine evaluation. In: Clough, P., Foley, C., Gurrin, C., Jones, G.J.F., Kraaij, W., Lee, H., Murdock, V. (eds.) ECIR 2011. LNCS, vol. 6611, pp. 165–176. Springer, Heidelberg (2011). doi:10.1007/978-3-642-20161-5_17
6.
Kliegr, T., Zamazal, O.: LHD 2.0: a text mining approach to typing entities in knowledge graphs. Web Semant.: Sci. Serv. Agents World Wide Web 39, 47–61 (2016)
7.
Lehmann, J., Isele, R., Jakob, M., Jentzsch, A., Kontokostas, D., Mendes, P.N., Hellmann, S., Morsey, M., van Kleef, P., Auer, S., Bizer, C.: DBpedia - a large-scale, multilingual knowledge base extracted from wikipedia. Semant. Web J. 6(2), 167–195 (2013)
8.
Mendes, P.N., Jakob, M., García-Silva, A., Bizer, C.: DBpedia spotlight: shedding light on the web of documents. In: 7th International Conference on Semantic Systems, pp. 1–8. ACM (2011)
9.
Paulheim, H., Fürnkranz, J.: Unsupervised generation of data mining features from linked open data. In: 2nd International Conference on Web Intelligence, Mining and Semantics, p. 31. ACM (2012)
10.
Ringler, D., Paulheim, H.: One knowledge graph to rule them all? Analyzing the differences between DBpedia, Yago, Wikidata & Co. In: 40th German Conference on Artificial Intelligence (2017)
11.
Schmachtenberg, M., Bizer, C., Paulheim, H.: Adoption of the linked data best practices in different topical domains. In: Mika, P., Tudorache, T., Bernstein, A., Welty, C., Knoblock, C., Vrandečić, D., Groth, P., Noy, N., Janowicz, K., Goble, C. (eds.) ISWC 2014. LNCS, vol. 8796, pp. 245–260. Springer, Cham (2014). doi:10.1007/978-3-319-11964-9_16
12.
Seitner, J., Bizer, C., Eckert, K., Faralli, S., Meusel, R., Paulheim, H., Ponzetto, S.: A large database of hypernymy relations extracted from the web. In: Language Resources and Evaluation Conference, Portoroz, Slovenia (2016)
13.
Suchanek, F.M., Kasneci, G., Weikum, G.: YAGO: a core of semantic knowledge unifying wordnet and wikipedia. In: 16th International Conference on World Wide Web, pp. 697–706 (2007)
Metadata
Title
WebIsALOD: Providing Hypernymy Relations Extracted from the Web as Linked Open Data
Authors
Sven Hertling
Heiko Paulheim
Copyright Year
2017
DOI
https://doi.org/10.1007/978-3-319-68204-4_11
