
2020 | OriginalPaper | Book chapter

ChEMU: Named Entity Recognition and Event Extraction of Chemical Reactions from Patents

Authors: Dat Quoc Nguyen, Zenan Zhai, Hiyori Yoshikawa, Biaoyan Fang, Christian Druckenbrodt, Camilo Thorne, Ralph Hoessel, Saber A. Akhondi, Trevor Cohn, Timothy Baldwin, Karin Verspoor

Published in: Advances in Information Retrieval

Publisher: Springer International Publishing


Abstract

We introduce a new evaluation lab named ChEMU (Cheminformatics Elsevier Melbourne University), part of the 11th Conference and Labs of the Evaluation Forum (CLEF-2020). ChEMU involves two key information extraction tasks over chemical reactions from patents. Task 1—Named entity recognition—involves identifying chemical compounds as well as their types in context, i.e., to assign the label of a chemical compound according to the role which the compound plays within a chemical reaction. Task 2—Event extraction over chemical reactions—involves event trigger detection and argument recognition. We briefly present the motivations and goals of the ChEMU tasks, as well as resources and evaluation methodology.
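As a rough illustration of the two task outputs described above, the following minimal sketch represents role-labelled entity spans (Task 1) and an event made of a trigger plus arguments (Task 2). It is not the ChEMU data format: the class names, the label and role names (`SOLVENT`, `PRODUCT`, `REACTION_STEP`), the `find` helper, and the example sentence are all assumptions invented for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class Span:
    """A labelled character span in the source text."""
    start: int  # inclusive character offset
    end: int    # exclusive character offset
    label: str  # hypothetical label name, not taken from the ChEMU schema

    def text(self, source: str) -> str:
        return source[self.start:self.end]


@dataclass
class Event:
    """An event trigger together with its (role, entity) arguments."""
    trigger: Span
    arguments: List[Tuple[str, Span]] = field(default_factory=list)


# Invented sentence, used only for illustration.
sentence = "The mixture was stirred in ethanol to give compound 5."


def find(phrase: str, label: str) -> Span:
    """Hypothetical helper: locate the first occurrence of a phrase and label it."""
    start = sentence.index(phrase)
    return Span(start, start + len(phrase), label)


# Task 1 (sketch): each compound mention is labelled by its role in the reaction.
solvent = find("ethanol", "SOLVENT")
product = find("compound 5", "PRODUCT")

# Task 2 (sketch): a trigger word is detected and linked to its arguments.
trigger = find("stirred", "REACTION_STEP")
event = Event(trigger, [("solvent", solvent), ("product", product)])

for role, span in event.arguments:
    print(trigger.text(sentence), role, span.text(sentence))
```

Storing character offsets rather than token indices keeps the sketch independent of any particular tokenizer; an actual system would follow whatever annotation format the lab specifies.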

Footnotes
2
Note that the individual event steps are sequentially ordered; we therefore do not consider cases where an event is an argument of another event, i.e., we do not label the relationship between two event triggers.
 
Metadata
Copyright year: 2020
DOI: https://doi.org/10.1007/978-3-030-45442-5_74