
2021 | OriginalPaper | Book chapter

Overview of ChEMU 2021: Reaction Reference Resolution and Anaphora Resolution in Chemical Patents

Authors: Yuan Li, Biaoyan Fang, Jiayuan He, Hiyori Yoshikawa, Saber A. Akhondi, Christian Druckenbrodt, Camilo Thorne, Zubair Afzal, Zenan Zhai, Timothy Baldwin, Karin Verspoor

Published in: Experimental IR Meets Multilinguality, Multimodality, and Interaction

Publisher: Springer International Publishing


Abstract

In this paper, we provide an overview of the Cheminformatics Elsevier Melbourne University (ChEMU) evaluation lab 2021, part of the Conference and Labs of the Evaluation Forum 2021 (CLEF 2021). The ChEMU evaluation lab focuses on information extraction over chemical reactions from patent texts. As the second instance of our ChEMU lab series, we build upon the ChEMU corpus developed for ChEMU 2020, extending it for two distinct tasks related to reference resolution in chemical patents. Task 1—Chemical Reaction Reference Resolution—focuses on paragraph-level references and aims to identify the chemical reactions or general conditions specified in one reaction description referred to by another. Task 2—Anaphora Resolution—focuses on expression-level references and aims to identify the reference relationships between expressions in chemical reaction descriptions. Herein, we describe the resources created for these tasks and the evaluation methodology adopted. We also provide a brief summary of the results obtained in this lab, finding that one submission achieves substantially better results than our baseline models.


Footnotes
2
Reaxys® Copyright ©2021 Elsevier Life Sciences IP Limited except certain content provided by third parties. Reaxys is a trademark of Elsevier Life Sciences IP Limited, used under license. https://www.reaxys.com.
 
4
With the lowest agreement being \(\alpha =0.89\) for coreference mentions.
 
Metadata
Title
Overview of ChEMU 2021: Reaction Reference Resolution and Anaphora Resolution in Chemical Patents
Authors
Yuan Li
Biaoyan Fang
Jiayuan He
Hiyori Yoshikawa
Saber A. Akhondi
Christian Druckenbrodt
Camilo Thorne
Zubair Afzal
Zenan Zhai
Timothy Baldwin
Karin Verspoor
Copyright year
2021
DOI
https://doi.org/10.1007/978-3-030-85251-1_20
