
2018 | OriginalPaper | Book chapter

Enriching Knowledge Bases with Counting Quantifiers

Written by: Paramita Mirza, Simon Razniewski, Fariz Darari, Gerhard Weikum

Published in: The Semantic Web – ISWC 2018

Publisher: Springer International Publishing


Abstract

Information extraction traditionally focuses on extracting relations between identifiable entities, such as \(\langle\)Monterey, locatedIn, California\(\rangle\). Yet texts often also contain counting information, stating that a subject is in a specific relation with a number of objects without mentioning the objects themselves, for example, “California is divided into 58 counties”. Such counting quantifiers can help in a variety of tasks such as query answering or knowledge base curation, but have been neglected by prior work.
This paper develops the first full-fledged system for extracting counting information from text, called CINEX. We employ distant supervision using fact counts from a knowledge base as training seeds, and develop novel techniques for dealing with several challenges: (i) non-maximal training seeds due to the incompleteness of knowledge bases, (ii) sparse and skewed observations in text sources, and (iii) high diversity of linguistic patterns. Experiments with five human-evaluated relations show that CINEX can achieve 60% average precision for extracting counting information. In a large-scale experiment, we demonstrate the potential for knowledge base enrichment by applying CINEX to 2,474 frequent relations in Wikidata. CINEX can assert the existence of 2.5M facts for 110 distinct relations, which is 28% more than the existing Wikidata facts for these relations.
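
To make the idea of using knowledge-base fact counts as training seeds more concrete, the following is a minimal Python sketch. It is not the CINEX implementation: the toy triples, the relation name `hasCounty`, and the helper functions `fact_counts` and `label_numbers` are all hypothetical, and treating the knowledge-base count as a lower bound is just one simple way to account for incomplete seeds, assumed here for illustration.

```python
import re
from collections import Counter

# Toy knowledge-base triples (hypothetical data, for illustration only).
KB_TRIPLES = [
    ("California", "hasCounty", "Monterey County"),
    ("California", "hasCounty", "Orange County"),
    ("California", "hasCounty", "Kern County"),
]


def fact_counts(triples):
    """Count how many objects the KB records per (subject, relation) pair."""
    return Counter((subj, rel) for subj, rel, _ in triples)


def label_numbers(sentence, kb_count):
    """Pair each integer in the sentence with a seed label.

    Assumption of this sketch: because the KB may be incomplete, its count
    is treated as a lower bound, so any number >= kb_count is labeled True.
    """
    labels = []
    for match in re.finditer(r"\d+", sentence):
        value = int(match.group())
        labels.append((value, value >= kb_count))
    return labels


counts = fact_counts(KB_TRIPLES)
seed = counts[("California", "hasCounty")]
print(seed)
print(label_numbers("California is divided into 58 counties", seed))
```

In this toy setting the knowledge base lists only three counties, so an exact-match seed would wrongly reject the mention of 58; the lower-bound rule keeps it as a candidate for training.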


Metadata
Title
Enriching Knowledge Bases with Counting Quantifiers
Written by
Paramita Mirza
Simon Razniewski
Fariz Darari
Gerhard Weikum
Copyright year
2018
DOI
https://doi.org/10.1007/978-3-030-00671-6_11
