
2018 | Original Paper | Book Chapter

Assessing the Impact of Single and Pairwise Slot Constraints in a Factor Graph Model for Template-Based Information Extraction

Authors: Hendrik ter Horst, Matthias Hartung, Roman Klinger, Nicole Brazda, Hans Werner Müller, Philipp Cimiano

Published in: Natural Language Processing and Information Systems

Publisher: Springer International Publishing


Abstract

Template-based information extraction generalizes over standard token-level binary relation extraction in the sense that it attempts to fill a complex template comprising multiple slots on the basis of information given in a text. In the approach presented in this paper, templates and possible fillers are defined by a given ontology. The information extraction task consists of filling the slots of a template with previously recognized entities or literal values. We cast the task as a structure prediction problem and propose a joint probabilistic model based on factor graphs to account for the interdependence in slot assignments. Inference is implemented as a heuristic building on Markov chain Monte Carlo sampling. As our main contribution, we investigate the impact of soft constraints modeled as single slot factors, which measure preferences of individual slots for ranges of fillers, as well as pairwise slot factors, which model the compatibility between the fillers of two slots. Instead of relying on expert knowledge to acquire such soft constraints, our approach captures them directly in the model and learns them from training data. We show that both types of factors are effective in improving information extraction on a real-world data set of full-text papers from the biomedical domain. Pairwise factors in particular improve the performance of our extraction model by up to \(+0.43\) points in precision, leading to an F\(_1\) score of 0.90 for individual templates.
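To make the roles of the two factor types concrete, the following is a minimal Python sketch, not the authors' implementation. It scores a template assignment as the sum of single slot and pairwise slot factor weights and searches for a good assignment with a simple Metropolis-style sampler. The slot names, candidate fillers, hand-set weights, and function names are all hypothetical; in the approach described above the weights are learned from training data, and the exact proposal and acceptance scheme of the paper's inference heuristic is not given in this abstract.

```python
import itertools
import math
import random

# Hypothetical slots and candidate fillers, invented for illustration only.
CANDIDATES = {
    "organism": ["rat", "mouse"],
    "weight": ["25 g", "250 g"],
}

# Hand-set toy weights (assumption). Single slot factors express how well a
# filler suits one slot; pairwise factors express how compatible the fillers
# of two slots are with each other.
SINGLE = {
    ("organism", "rat"): 0.2,
    ("organism", "mouse"): 0.1,
    ("weight", "25 g"): 0.1,
    ("weight", "250 g"): 0.1,
}
PAIRWISE = {
    (("organism", "rat"), ("weight", "250 g")): 1.0,
    (("organism", "mouse"), ("weight", "25 g")): 0.8,
    (("organism", "rat"), ("weight", "25 g")): -1.0,
    (("organism", "mouse"), ("weight", "250 g")): -1.0,
}


def score(assignment):
    """Unnormalized log-score: sum of single and pairwise factor weights."""
    items = sorted(assignment.items())  # fixed slot order for pair lookup
    total = sum(SINGLE.get(item, 0.0) for item in items)
    for a, b in itertools.combinations(items, 2):
        total += PAIRWISE.get((a, b), 0.0)
    return total


def sample_best(steps=2000, seed=0):
    """Metropolis-style search: change one slot at a time, keep the best state seen."""
    rng = random.Random(seed)
    # Start from the first candidate of every slot (an incompatible pair here).
    state = {slot: fillers[0] for slot, fillers in CANDIDATES.items()}
    best = dict(state)
    for _ in range(steps):
        slot = rng.choice(sorted(CANDIDATES))
        proposal = dict(state)
        proposal[slot] = rng.choice(CANDIDATES[slot])
        delta = score(proposal) - score(state)
        # Always accept improvements; accept worse states with prob. exp(delta).
        if delta >= 0 or rng.random() < math.exp(delta):
            state = proposal
            if score(state) > score(best):
                best = dict(state)
    return best


def exhaustive_best():
    """Brute-force argmax, feasible only for tiny templates; used as a check."""
    slots = sorted(CANDIDATES)
    combos = itertools.product(*(CANDIDATES[s] for s in slots))
    return max((dict(zip(slots, c)) for c in combos), key=score)


if __name__ == "__main__":
    print(sample_best(), score(sample_best()))
```

In this toy setting the single slot factors alone barely separate the candidates, so the pairwise weights decide which joint assignment wins. This mirrors, in a very small way, the interdependence between slot assignments that the joint model is meant to capture.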


Metadata
Title
Assessing the Impact of Single and Pairwise Slot Constraints in a Factor Graph Model for Template-Based Information Extraction
Authors
Hendrik ter Horst
Matthias Hartung
Roman Klinger
Nicole Brazda
Hans Werner Müller
Philipp Cimiano
Copyright Year
2018
DOI
https://doi.org/10.1007/978-3-319-91947-8_18