2003 | OriginalPaper | Book chapter
Language Models for Topic Tracking
The importance of score normalization
Authors: Wessel Kraaij, Martijn Spitters
Published in: Language Modeling for Information Retrieval
Publisher: Springer Netherlands
Included in: Professional Book Archive
Generative unigram language models have proven to be a simple yet effective approach to information retrieval tasks. In contrast to ad hoc retrieval, topic tracking requires matching scores that are comparable across topics. Several ranking functions based on generative language models are evaluated in two orientations: straight likelihood, likelihood ratio, normalized likelihood ratio, and the related Kullback-Leibler divergence. The best performance is achieved by the models based on a normalized log-likelihood ratio. A key component of these models is the a priori probability of a story with respect to a common reference distribution.
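
The abstract does not state the scoring formula, so the following is only a minimal sketch of one common way to write a normalized log-likelihood ratio, in a single orientation (a story scored against a topic model). The function names `unigram` and `nllr`, the linear interpolation smoothing, and the weight `lam` are assumptions made for illustration, not details taken from the chapter.

```python
import math
from collections import Counter


def unigram(tokens):
    """Maximum-likelihood unigram distribution over a token list."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def nllr(story_tokens, topic_tokens, background, lam=0.85):
    """Length-normalized log-likelihood ratio of a story under a topic model.

    Assumed form: sum over story words w of
        P(w|story) * log(P_smoothed(w|topic) / P(w|background)),
    where the topic model is linearly interpolated with the background
    (reference) distribution. Both the smoothing method and lam are
    illustrative assumptions.
    """
    story = unigram(story_tokens)
    topic = unigram(topic_tokens)
    score = 0.0
    for word, p_story in story.items():
        p_bg = background.get(word)
        if not p_bg:
            continue  # assumption: ignore words absent from the reference model
        p_topic = lam * topic.get(word, 0.0) + (1 - lam) * p_bg
        score += p_story * math.log(p_topic / p_bg)
    return score
```

Dividing by the background probability inside the logarithm is what ties the score to a common reference distribution, and weighting by P(w|story) removes the dependence on story length. Together these are intended to make scores comparable across topics, so that a single decision threshold can be applied.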