
2003 | OriginalPaper | Book chapter

Applications of Score Distributions in Information Retrieval

Written by: R. Manmatha

Published in: Language Modeling for Information Retrieval

Publisher: Springer Netherlands


Researchers have recently shown that the document scores of a number of different text search engines can be fitted, on a per-query basis, with an exponential distribution for the set of non-relevant documents and a normal distribution for the set of relevant documents. This model fits a large number of different search engines, including probabilistic search engines like INQUERY, vector space search engines like SMART, LSI search engines, and a language model engine. The model also appears to hold for search engines operating on a number of different languages. This leads to the hypothesis that all ‘good’ text search engines operating on any language have similar characteristics.

We then show that, given a query for which relevance information is not available, a mixture model consisting of an exponential and a normal distribution can be fitted to the score distribution. These distributions can be used to map the scores of a search engine to probabilities.

This model has many possible applications. For example, the outputs of different search engines can be combined by averaging the probabilities (optimal if the search engines are independent) or by using the probabilities to select the best engine for each query. It has also been applied to filtering. We discuss these and other applications of score modeling in information retrieval.
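To make the mixture-model idea concrete, here is a minimal sketch in Python of how an exponential-plus-normal mixture might be fitted to the scores of a single query when no relevance judgments are available, and how the fit could then map scores to probabilities of relevance. The abstract does not specify the fitting procedure; expectation-maximization (EM) is used here as an assumption. The function names (`fit_score_mixture`, `prob_relevant`, `average_probabilities`), the top-20% initialization heuristic, and the requirement that scores are non-negative and not all equal are likewise illustrative assumptions, not details taken from the chapter.

```python
import numpy as np


def fit_score_mixture(scores, n_iter=500, tol=1e-9):
    """Fit p(s) = (1 - pi) * Exp(s; lam) + pi * Normal(s; mu, sigma) by EM.

    Assumes non-negative, non-constant scores from a single query.
    Returns a dict with the mixture parameters.
    """
    s = np.asarray(scores, dtype=float)
    eps = 1e-12

    # Heuristic initialisation (an assumption): treat the top 20% of scores
    # as a first guess at the relevant (normal) component.
    cut = np.quantile(s, 0.8)
    top, rest = s[s > cut], s[s <= cut]
    pi = 0.2
    mu, sigma = top.mean(), top.std() + 1e-3
    lam = 1.0 / (rest.mean() + eps)

    prev_ll = -np.inf
    for _ in range(n_iter):
        # E-step: responsibility of the normal (relevant) component.
        p_exp = (1.0 - pi) * lam * np.exp(-lam * s)
        p_norm = (pi * np.exp(-0.5 * ((s - mu) / sigma) ** 2)
                  / (sigma * np.sqrt(2.0 * np.pi)))
        total = p_exp + p_norm + 1e-300
        r = p_norm / total

        # M-step: weighted maximum-likelihood updates.
        pi = r.mean()
        mu = np.sum(r * s) / (np.sum(r) + eps)
        var = np.sum(r * (s - mu) ** 2) / (np.sum(r) + eps)
        sigma = max(np.sqrt(var), 1e-6)
        lam = np.sum(1.0 - r) / (np.sum((1.0 - r) * s) + eps)

        # Stop once the log-likelihood no longer changes noticeably.
        ll = np.sum(np.log(total))
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll

    return {"pi": pi, "lam": lam, "mu": mu, "sigma": sigma}


def prob_relevant(scores, params):
    """Posterior probability that each score came from the normal component."""
    s = np.asarray(scores, dtype=float)
    pi, lam = params["pi"], params["lam"]
    mu, sigma = params["mu"], params["sigma"]
    p_exp = (1.0 - pi) * lam * np.exp(-lam * s)
    p_norm = (pi * np.exp(-0.5 * ((s - mu) / sigma) ** 2)
              / (sigma * np.sqrt(2.0 * np.pi)))
    return p_norm / (p_exp + p_norm + 1e-300)


def average_probabilities(*per_engine_probs):
    """Combine engines by averaging; arrays must be aligned per document."""
    return np.mean(np.vstack(per_engine_probs), axis=0)
```

In this sketch, each engine's scores for a query would be fitted separately with `fit_score_mixture`, converted with `prob_relevant`, and the resulting per-document probabilities passed to `average_probabilities`. How documents retrieved by only one engine should be handled is left open here.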

Metadata
Title
Applications of Score Distributions in Information Retrieval
Written by
R. Manmatha
Copyright year
2003
Publisher
Springer Netherlands
DOI
https://doi.org/10.1007/978-94-017-0171-6_8