2003 | OriginalPaper | Book chapter

Language Models for Topic Tracking

The importance of score normalization

Authors: Wessel Kraaij, Martijn Spitters

Published in: Language Modeling for Information Retrieval

Publisher: Springer Netherlands

Included in: Professional Book Archive

Generative unigram language models have proven to be a simple yet effective model for information retrieval tasks. In contrast to ad hoc retrieval, topic tracking requires that matching scores be comparable across topics. Several ranking functions based on generative language models (straight likelihood, likelihood ratio, normalized likelihood ratio, and the related Kullback-Leibler divergence) are evaluated in two orientations. The best performance is achieved by the models based on a normalized log-likelihood ratio. A key component of these models is the a priori probability of a story with respect to a common reference distribution.
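
The abstract does not give the scoring formula, so the following is only a minimal sketch of one common way to write a normalized log-likelihood ratio between a story and a topic. It assumes maximum-likelihood unigram estimates, linear interpolation of the topic model with a background (reference) model, and a story-weighted sum of log ratios. The function names `unigram` and `nllr`, the interpolation weight `lam`, and the toy data are all illustrative assumptions, not taken from the chapter.

```python
import math
from collections import Counter


def unigram(tokens):
    """Maximum-likelihood unigram model: relative frequency of each token."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def nllr(story_tokens, topic_tokens, background, lam=0.5):
    """Story-weighted log ratio of a smoothed topic model to the background.

    Assumed form: sum_w P(w|S) * log(((1 - lam) * P(w|T) + lam * P(w|C)) / P(w|C)),
    where S is the story, T the topic, and C the background model.
    """
    p_story = unigram(story_tokens)
    p_topic = unigram(topic_tokens)
    score = 0.0
    for w, p_s in p_story.items():
        p_bg = background.get(w)
        if not p_bg:
            continue  # skip terms the background model has never seen
        p_smoothed = (1 - lam) * p_topic.get(w, 0.0) + lam * p_bg
        score += p_s * math.log(p_smoothed / p_bg)
    return score


# Toy data (hypothetical): a tiny background corpus and one topic.
background = unigram(
    "the cat sat on the mat the dog ran in the park "
    "stocks fell on the market".split()
)
topic = "stocks market fell".split()
on_topic = nllr("stocks fell market the".split(), topic, background)
off_topic = nllr("the cat sat on the mat".split(), topic, background)
print(on_topic > off_topic)
```

Dividing by the background probability means a term contributes only to the extent that the topic model explains it better than the reference distribution does. In this sketch that is what keeps scores for different topics on a shared scale, so a single threshold can in principle be applied across topics.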
