2003 | OriginalPaper | Book chapter

An Unbiased Generative Model for Setting Dissemination Thresholds

Written by: Yi Zhang, Jamie Callan

Published in: Language Modeling for Information Retrieval

Publisher: Springer Netherlands

Information filtering systems based on statistical retrieval models usually compute a numeric score that indicates how well each document matches each profile. Documents with scores above profile-specific dissemination thresholds are delivered. Optimal dissemination thresholds are usually difficult to determine a priori, so they are often learned during filtering, using relevance feedback about disseminated documents. However, the scores of disseminated documents are a biased sample of the complete distribution of document scores, which causes some algorithms to learn suboptimal thresholds.

This chapter presents a generative method of adjusting dissemination thresholds that explicitly models and compensates for this bias. The new algorithm, which is based on the Maximum Likelihood principle, jointly estimates the parameters of the density distributions for relevant and non-relevant documents and the ratio of relevant to non-relevant documents in the region around the dissemination threshold. Experiments demonstrate its effectiveness when its underlying assumptions about document scores are true, and illustrate its behavior when its assumptions don’t match the actual distribution of document scores.
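The bias correction described in the abstract can be illustrated with a minimal sketch of the kind of objective such a method might maximize. The abstract does not name the score distributions, so everything specific here is an assumption made for illustration: relevant scores are taken to be Gaussian, non-relevant scores exponential on non-negative values, and all function and parameter names (`conditional_log_likelihood`, `mu`, `sigma`, `lam`, `p`, `theta`) are hypothetical. The key point is the normalizing term: each feedback pair is scored conditionally on the document having been delivered, that is, on its score exceeding the threshold.

```python
import math


def normal_sf(x, mu, sigma):
    """P(S > x) for S ~ Normal(mu, sigma)."""
    return 0.5 * math.erfc((x - mu) / (sigma * math.sqrt(2.0)))


def delivered_mass(theta, mu, sigma, lam, p):
    """P(score > theta) under the assumed two-component mixture.

    Assumption: relevant scores ~ Normal(mu, sigma) with prior weight p,
    non-relevant scores ~ Exponential(rate=lam) on [0, inf), theta >= 0.
    """
    relevant = p * normal_sf(theta, mu, sigma)
    non_relevant = (1.0 - p) * math.exp(-lam * theta)
    return relevant + non_relevant


def conditional_log_likelihood(feedback, theta, mu, sigma, lam, p):
    """Log-likelihood of (score, is_relevant) pairs given score > theta.

    Dividing by delivered_mass is what compensates for only observing
    feedback on documents that passed the dissemination threshold.
    """
    log_z = math.log(delivered_mass(theta, mu, sigma, lam, p))
    total = 0.0
    for score, is_relevant in feedback:
        if score < theta:
            raise ValueError("feedback exists only for delivered documents")
        if is_relevant:
            # log of p * Normal(score; mu, sigma)
            joint = (math.log(p)
                     - math.log(sigma * math.sqrt(2.0 * math.pi))
                     - 0.5 * ((score - mu) / sigma) ** 2)
        else:
            # log of (1 - p) * lam * exp(-lam * score)
            joint = math.log(1.0 - p) + math.log(lam) - lam * score
        total += joint - log_z
    return total


if __name__ == "__main__":
    # Hypothetical feedback on delivered documents: (score, is_relevant).
    feedback = [(0.55, True), (0.42, False), (0.61, True), (0.45, False)]
    print(conditional_log_likelihood(feedback, theta=0.4,
                                     mu=0.5, sigma=0.1, lam=5.0, p=0.3))
```

Under these assumptions, parameter estimates would be obtained by maximizing this function over `mu`, `sigma`, `lam`, and `p`, for example by handing its negation to a general-purpose numerical optimizer. A fit that omits the `log_z` term treats the delivered scores as if they were an unbiased sample, which is the failure mode the abstract describes.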

Metadata
Title
An Unbiased Generative Model for Setting Dissemination Thresholds
Written by
Yi Zhang
Jamie Callan
Copyright year
2003
Publisher
Springer Netherlands
DOI
https://doi.org/10.1007/978-94-017-0171-6_9