2022 | OriginalPaper | Book chapter

DocTAG: A Customizable Annotation Tool for Ground Truth Creation

Written by: Fabio Giachelle, Ornella Irrera, Gianmaria Silvello

Published in: Advances in Information Retrieval

Publisher: Springer International Publishing

Abstract

Information Retrieval (IR) is a discipline deeply rooted in evaluation, which in many cases relies on annotated data as ground truth. Manual annotation is a demanding and time-consuming task that requires human assessors to judge topic-document pairs. To ease and possibly speed up the assessors' work, it is desirable to have easy-to-use, collaborative, and flexible annotation tools. Despite their importance, no open-source, fully customizable annotation tool has so far been proposed in the IR domain for topic-document annotation and assessment. In this demo paper, we present DocTAG, a portable and customizable annotation tool for ground-truth creation in a web-based collaborative setting.

Metadata
Title
DocTAG: A Customizable Annotation Tool for Ground Truth Creation
Written by
Fabio Giachelle
Ornella Irrera
Gianmaria Silvello
Copyright year
2022
DOI
https://doi.org/10.1007/978-3-030-99739-7_35