
2022 | Original Paper | Chapter

Streamlining Evaluation with ir-measures

Authors: Sean MacAvaney, Craig Macdonald, Iadh Ounis

Published in: Advances in Information Retrieval

Publisher: Springer International Publishing


Abstract

We present ir-measures, a new tool that makes it convenient to calculate a diverse set of evaluation measures used in information retrieval. Rather than implementing its own measure calculations, ir-measures provides a common interface to a handful of evaluation tools. The necessary tools are automatically invoked (potentially multiple times) to calculate all the desired metrics, simplifying the evaluation process for the user. The tool also makes it easier for researchers to use recently proposed measures (such as those from the C/W/L framework) alongside traditional measures, potentially encouraging their adoption.
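For illustration, here is a minimal sketch of the package's Python interface, based on its public documentation at the time of writing (the qrels and run file names are placeholders):

    import ir_measures
    from ir_measures import nDCG, AP, RR

    # Load relevance judgments and a system ranking in TREC format.
    qrels = ir_measures.read_trec_qrels('qrels.txt')
    run = ir_measures.read_trec_run('run.txt')

    # Request several measures at once; ir-measures dispatches each measure
    # to an appropriate backend tool and aggregates the per-query results.
    results = ir_measures.calc_aggregate([nDCG@10, AP, RR@10], qrels, run)
    print(results)  # a dict mapping each measure object to its mean value

Per-query values can be obtained analogously via ir_measures.iter_calc, which yields one value per query-measure pair.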


Literature
1. Azzopardi, L., Mackenzie, J., Moffat, A.: ERR is not C/W/L: exploring the relationship between expected reciprocal rank and other metrics. In: ICTIR (2021)
2. Azzopardi, L., Thomas, P., Moffat, A.: cwl_eval: an evaluation tool for information retrieval. In: SIGIR (2019)
3. Bajaj, P., et al.: MS MARCO: a human generated machine reading comprehension dataset. In: CoCo@NIPS (2016)
4. Buckley, C., Voorhees, E.M.: Retrieval evaluation with incomplete information. In: SIGIR (2004)
5. Buckley, C., Voorhees, E.M.: Retrieval System Evaluation. MIT Press, Cambridge (2005)
6. Chapelle, O., Metlzer, D., Zhang, Y., Grinspan, P.: Expected reciprocal rank for graded relevance. In: CIKM (2009)
7. Clarke, C.L.A., et al.: Novelty and diversity in information retrieval evaluation. In: SIGIR (2008)
8. Clarke, C.L.A., Kolla, M., Vechtomova, O.: An effectiveness measure for ambiguous and underspecified queries. In: ICTIR (2009)
9. Clarke, C.L.A., Vtyurina, A., Smucker, M.D.: Assessing top-k preferences. TOIS 39(3), 1–21 (2021)
10. Craswell, N., Mitra, B., Yilmaz, E., Campos, D., Voorhees, E.: Overview of the TREC 2019 deep learning track. In: TREC (2019)
11. Fuhr, N.: Some common mistakes in IR evaluation, and how they can be avoided. SIGIR Forum 51, 32–41 (2018)
12. Harman, D.: Evaluation issues in information retrieval. IPM 28(4), 439–440 (1992)
13. Järvelin, K., Kekäläinen, J.: Cumulated gain-based evaluation of IR techniques. TOIS 20(4), 422–446 (2002)
14. Jose, K.M., Nguyen, T., MacAvaney, S., Dalton, J., Yates, A.: DiffIR: exploring differences in ranking models' behavior. In: SIGIR (2021)
15. Kantor, P., Voorhees, E.: The TREC-5 confusion track. Inf. Retr. 2(2–3), 165–176 (2000)
16. Lin, J., et al.: Supporting interoperability between open-source search engines with the common index file format. In: SIGIR (2020)
17. Lucchese, C., Muntean, C.I., Nardini, F.M., Perego, R., Trani, S.: RankEval: an evaluation and analysis framework for learning-to-rank solutions. In: SIGIR (2017)
18. MacAvaney, S.: OpenNIR: a complete neural ad-hoc ranking pipeline. In: WSDM (2020)
19. MacAvaney, S., Yates, A., Feldman, S., Downey, D., Cohan, A., Goharian, N.: Simplified data wrangling with ir_datasets. In: SIGIR (2021)
20. Macdonald, C., Tonellotto, N.: Declarative experimentation in information retrieval using PyTerrier. In: ICTIR (2020)
21. Moffat, A., Bailey, P., Scholer, F., Thomas, P.: INST: an adaptive metric for information retrieval evaluation. In: Australasian Document Computing Symposium (2015)
22. Moffat, A., Bailey, P., Scholer, F., Thomas, P.: Incorporating user expectations and behavior into the measurement of search effectiveness. TOIS 35(3), 1–38 (2017)
23. Moffat, A., Scholer, F., Thomas, P.: Models and metrics: IR evaluation as a user process. In: Australasian Document Computing Symposium (2012)
25. Palotti, J., Scells, H., Zuccon, G.: TrecTools: an open-source Python library for information retrieval practitioners involved in TREC-like campaigns. In: SIGIR (2019)
26. Piwowarski, B.: Experimaestro and datamaestro: experiment and dataset managers (for IR). In: SIGIR (2020)
27. Sakai, T.: On Fuhr's guideline for IR evaluation. SIGIR Forum 54, 1–8 (2020)
28. Van Gysel, C., de Rijke, M.: pytrec_eval: an extremely fast Python interface to trec_eval. In: SIGIR (2018)
29. Van Rijsbergen, C.J.: Information Retrieval (1979)
30. Voorhees, E., et al.: TREC-COVID: constructing a pandemic information retrieval test collection. arXiv abs/2005.04474 (2020)
31. Yilmaz, E., Aslam, J.A.: Estimating average precision with incomplete and imperfect judgments. In: CIKM (2006)
32. Zhang, F., Liu, Y., Li, X., Zhang, M., Xu, Y., Ma, S.: Evaluating web search with a bejeweled player model. In: SIGIR (2017)
Metadata

Title: Streamlining Evaluation with ir-measures
Authors: Sean MacAvaney, Craig Macdonald, Iadh Ounis
Copyright Year: 2022
DOI: https://doi.org/10.1007/978-3-030-99739-7_38