2003 | OriginalPaper | Book chapter

Applications of Score Distributions in Information Retrieval

Written by: R. Manmatha

Published in: Language Modeling for Information Retrieval

Publisher: Springer Netherlands

Included in: Professional Book Archive

Researchers have recently shown that document scores of a number of different text search engines may be fitted on a per-query basis using an exponential distribution for the set of non-relevant documents and a normal distribution for the set of relevant documents. This model fits a large number of different search engines, including probabilistic search engines like INQUERY, vector space search engines like SMART, and also LSI search engines and a language model engine. The model also appears to be true of search engines operating on a number of different languages. This leads to the hypothesis that all ‘good’ text search engines operating on any language have similar characteristics.

We then show that given a query for which relevance information is not available, a mixture model consisting of an exponential and a normal distribution can be fitted to the score distribution. These distributions can be used to map the scores of a search engine to probabilities.

This model has many possible applications. For example, the outputs of different search engines can be combined by averaging the probabilities (optimal if the search engines are independent) or by using the probabilities to select the best engine for each query. It has also been applied to filtering. We discuss these and other applications of score modeling in information retrieval.
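The mixture fit and the score-to-probability mapping described in the abstract can be illustrated with a minimal sketch. It assumes the mixture is fitted with expectation-maximization (the abstract does not name the fitting procedure), that scores are non-negative, and that the top 10% of scores give a reasonable starting guess for the relevant component. The function names `fit_mixture`, `prob_relevant` and `average_probabilities` are hypothetical, and the scores are synthetic, generated only so the example runs.

```python
import math
import random


def exp_pdf(s, lam):
    """Exponential density, used here for non-relevant scores (s >= 0)."""
    return lam * math.exp(-lam * s) if s >= 0 else 0.0


def norm_pdf(s, mu, sigma):
    """Normal density, used here for relevant scores."""
    z = (s - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def prob_relevant(s, params):
    """Posterior probability that a document with score s is relevant."""
    pi, lam, mu, sigma = params
    rel = pi * norm_pdf(s, mu, sigma)
    non = (1.0 - pi) * exp_pdf(s, lam)
    total = rel + non
    return rel / total if total > 0.0 else 0.0


def fit_mixture(scores, iters=100, top_frac=0.1):
    """EM fit of p(s) = (1 - pi) * Exp(s; lam) + pi * Normal(s; mu, sigma).

    Assumption: the top `top_frac` of scores initialise the normal component.
    """
    n = len(scores)
    top = sorted(scores)[int((1.0 - top_frac) * n):]
    pi = top_frac
    mu = sum(top) / len(top)
    sigma = max(math.sqrt(sum((s - mu) ** 2 for s in top) / len(top)), 1e-3)
    lam = 1.0 / max(sum(scores) / n, 1e-9)
    for _ in range(iters):
        # E-step: responsibility of the relevant component for each score.
        resp = [prob_relevant(s, (pi, lam, mu, sigma)) for s in scores]
        # M-step: weighted maximum-likelihood updates, guarded against zeros.
        w = max(sum(resp), 1e-12)
        pi = min(max(w / n, 1e-6), 1.0 - 1e-6)
        mu = sum(r * s for r, s in zip(resp, scores)) / w
        var = sum(r * (s - mu) ** 2 for r, s in zip(resp, scores)) / w
        sigma = max(math.sqrt(var), 1e-3)
        lam = (n - w) / max(sum((1.0 - r) * s for r, s in zip(resp, scores)), 1e-12)
    return pi, lam, mu, sigma


def average_probabilities(probs):
    """Combine per-engine relevance probabilities for one document by averaging."""
    return sum(probs) / len(probs)


# Synthetic scores for one query: 900 non-relevant, 100 relevant (illustrative only).
random.seed(0)
scores = [random.expovariate(5.0) for _ in range(900)]
scores += [random.gauss(2.0, 0.3) for _ in range(100)]

params = fit_mixture(scores)            # (pi, lam, mu, sigma), no relevance labels used
p_high = prob_relevant(2.0, params)     # a score near the relevant mode
p_low = prob_relevant(0.1, params)      # a score deep in the non-relevant mass
combined = average_probabilities([0.9, 0.5])  # two engines, same document
```

Because each engine's raw scores are mapped to probabilities on a common scale, the averaging step needs no further score normalization. Real score distributions are less cleanly separated than this synthetic data, so the initialization and the number of iterations would likely need tuning.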
