
On rank-based effectiveness measures and optimization

Authors: Stephen Robertson, Hugo Zaragoza

Published in: Discover Computing | Issue 3/2007

Published: 1 June 2007


Abstract

Many current retrieval models and scoring functions contain free parameters which need to be set—ideally, optimized. The process of optimization normally involves some training corpus of the usual document-query-relevance judgement type, and some choice of measure that is to be optimized. The paper proposes a way to think about the process of exploring the space of parameter values, and how moving around in this space might be expected to affect different measures. One result, concerning local optima, is demonstrated for a range of rank-based evaluation measures.
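
To make the setting concrete, here is a minimal sketch. It is not the authors' method. It uses a toy scoring function with one free parameter `w` (score = feature_a + w * feature_b) and evaluates it with average precision over a sweep of `w`. The documents, feature values, relevance labels, and function names are all invented for illustration. Because a rank-based measure depends only on the ordering of documents, its value can change only when two documents swap positions, so it is piecewise constant in `w`.

```python
from typing import List, Tuple

# Hypothetical toy data: (feature_a, feature_b, is_relevant) per document.
DOCS: List[Tuple[float, float, bool]] = [
    (0.9, 0.1, False),
    (0.5, 0.6, True),
    (0.4, 0.9, True),
    (0.7, 0.2, False),
    (0.2, 0.3, True),
]


def rank(w: float) -> List[int]:
    """Return document indices sorted by descending score a + w * b."""
    scores = [a + w * b for a, b, _ in DOCS]
    return sorted(range(len(DOCS)), key=lambda i: -scores[i])


def average_precision(order: List[int]) -> float:
    """Average precision of a full ranking, given binary relevance labels."""
    hits, total = 0, 0.0
    for position, doc in enumerate(order, start=1):
        if DOCS[doc][2]:
            hits += 1
            total += hits / position
    return total / hits if hits else 0.0


def sweep(ws: List[float]) -> List[Tuple[float, float]]:
    """Evaluate the measure at each candidate parameter value."""
    return [(w, average_precision(rank(w))) for w in ws]


if __name__ == "__main__":
    # Coarse sweep over the (assumed) parameter range [0, 3].
    for w, ap in sweep([i / 4 for i in range(13)]):
        print(f"w={w:.2f}  AP={ap:.3f}")
```

Printing the sweep shows the measure staying flat over stretches of `w` and jumping where the ranking changes, which is why gradient-based optimizers cannot be applied to such measures directly.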


All the measures defined above are positive effectiveness measures; in this context, ‘M rewards’ means ‘M is increased by’, and ‘M penalizes’ means ‘M is decreased by’. However, obvious reversals occur if we consider a cost function where lower is better rather than an effectiveness measure.

Parameters may be the weights of features which are to be combined linearly. However, given n features, we would not normally have n independent weights, but n − 1, since (a) we would certainly want to exclude the possibility of setting all weights to zero, and (b) the ranking would be unaffected by a constant positive multiplier for all weights. So we consider here an (n + 1)-dimensional feature space with n independent linear parameters (for example, we might fix the weight of one feature to unity). An alternative would be to fix the parameter space to the surface of the unit hypersphere in (n + 1)-dimensional feature space; the theorem could be established just as strongly in this model.
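
The scale-invariance argument in (b) can be checked numerically. The following is a minimal sketch with made-up feature vectors and weights; all names and values are hypothetical. Multiplying every weight by the same positive constant leaves the ranking unchanged, which is what allows one weight to be fixed to unity. Note that a negative multiplier would reverse the order of the documents.

```python
from typing import List, Sequence

# Hypothetical feature vectors for four documents (three features each).
FEATURES: List[List[float]] = [
    [0.2, 1.5, 0.3],
    [0.9, 0.4, 0.8],
    [0.5, 0.5, 0.1],
    [0.1, 2.0, 0.7],
]


def linear_rank(weights: Sequence[float]) -> List[int]:
    """Rank documents by descending weighted sum of their features."""
    scores = [sum(w * f for w, f in zip(weights, doc)) for doc in FEATURES]
    return sorted(range(len(FEATURES)), key=lambda i: -scores[i])


def normalize(weights: Sequence[float]) -> List[float]:
    """Fix the first weight to 1 by dividing through (assumes it is positive)."""
    if weights[0] <= 0:
        raise ValueError("first weight must be positive to normalize this way")
    return [w / weights[0] for w in weights]


if __name__ == "__main__":
    w = [2.0, 0.5, 4.0]
    print(linear_rank(w))                     # original weights
    print(linear_rank([3.0 * x for x in w]))  # all weights scaled by 3
    print(linear_rank(normalize(w)))          # first weight fixed to unity
```

All three calls print the same ordering, so the three weight vectors are equivalent as far as any rank-based measure is concerned.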


Title: On rank-based effectiveness measures and optimization
Authors: Stephen Robertson, Hugo Zaragoza
Publication date: 1 June 2007
Publisher: Kluwer Academic Publishers
Published in: Discover Computing / Issue 3/2007
Print ISSN: 2948-2984
Electronic ISSN: 2948-2992
DOI: https://doi.org/10.1007/s10791-007-9025-9
