
2020 | OriginalPaper | Book chapter

2AIRTC: The Amharic Adhoc Information Retrieval Test Collection

Authors: Tilahun Yeshambel, Josiane Mothe, Yaregal Assabie

Published in: Experimental IR Meets Multilinguality, Multimodality, and Interaction

Publisher: Springer International Publishing


Abstract

Evaluation is essential for designing, developing, and maintaining information retrieval (IR) systems. The IR community has established shared tasks in which evaluation frameworks, evaluation measures, and test collections have been developed for different languages. Although Amharic is the official language of Ethiopia, a country with an estimated population of over 110 million, it remains an under-resourced language, and no Amharic adhoc IR test collection has been available to date. In this paper, we present the monolingual Amharic IR test collection that we built for the IR community. Following the framework of the Cranfield project and TREC, the collection, which we named 2AIRTC, consists of 12,583 documents, 240 topics, and the corresponding relevance judgments.
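To illustrate how a Cranfield/TREC-style collection (documents, topics, relevance judgments) is typically used, the following minimal sketch scores a ranked result list against relevance judgments with mean average precision. It assumes the judgments follow the common TREC qrels layout (`topic_id iteration doc_id relevance`); the abstract does not state the actual file format of 2AIRTC, and the topic and document identifiers below are hypothetical toy data, not taken from the collection.

```python
from collections import defaultdict


def parse_qrels(lines):
    """Parse qrels lines in the assumed TREC layout:
    'topic_id iteration doc_id relevance'."""
    qrels = defaultdict(dict)
    for line in lines:
        parts = line.split()
        if len(parts) != 4:
            continue  # skip blank or malformed lines
        topic_id, _iteration, doc_id, relevance = parts
        qrels[topic_id][doc_id] = int(relevance)
    return dict(qrels)


def average_precision(ranked_doc_ids, judged):
    """Average precision of one ranked list against one topic's judgments.
    Unjudged documents are treated as non-relevant."""
    num_relevant = sum(1 for rel in judged.values() if rel > 0)
    if num_relevant == 0:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_doc_ids, start=1):
        if judged.get(doc_id, 0) > 0:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / num_relevant


def mean_average_precision(run, qrels):
    """Mean of per-topic average precision over all judged topics."""
    scores = [average_precision(run.get(topic, []), judged)
              for topic, judged in qrels.items()]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical toy judgments and system output (not from 2AIRTC).
sample_qrels = [
    "001 0 doc_a 1",
    "001 0 doc_b 0",
    "001 0 doc_c 1",
    "002 0 doc_d 1",
]
sample_run = {
    "001": ["doc_a", "doc_b", "doc_c"],
    "002": ["doc_x", "doc_d"],
}

qrels = parse_qrels(sample_qrels)
map_score = mean_average_precision(sample_run, qrels)
print(f"MAP = {map_score:.4f}")
```

In practice, an established tool such as trec_eval would normally be used for this step; the sketch only shows how the three components of a test collection fit together in an evaluation.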

Footnotes
1 Text REtrieval Conference: http://trec.nist.gov
2 Cross Language Evaluation Forum: http://www.clef-initiative.eu
3 NII Test Collection for Information Retrieval: http://research.nii.ac.jp/ntcir
4 Initiative for the Evaluation of XML Retrieval: http://inex.mmci.uni-saarland.de
5 Forum for Information Retrieval Evaluation: http://fire.irsi.res.in
11 This collection is accessible by contacting the corresponding author at tilahun.yeshambel@uog.edu.et
Metadata
Title
2AIRTC: The Amharic Adhoc Information Retrieval Test Collection
Authors
Tilahun Yeshambel
Josiane Mothe
Yaregal Assabie
Copyright year
2020
DOI
https://doi.org/10.1007/978-3-030-58219-7_5