ABSTRACT
We consider the problem of information retrieval evaluation and the methods and metrics used for such evaluations. We propose a probabilistic framework for evaluation which we use to develop new information-theoretic evaluation metrics. We demonstrate that these new metrics are powerful and generalizable, enabling evaluations heretofore not possible.
We introduce four preliminary uses of our framework: (1) a measure of conditional rank correlation, information tau, a powerful meta-evaluation tool whose use we demonstrate in understanding novelty and diversity evaluation; (2) a new evaluation measure, relevance information correlation, which correlates with traditional evaluation measures; (3) the use of relevance information correlation to evaluate a collection of systems simultaneously, which provides a natural upper bound on metasearch performance; and (4) a measure of the similarity between rankers on judged documents, information difference, which allows us to determine whether systems with similar performance are in fact different.
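To give a concrete flavor of the information-theoretic approach, the sketch below computes the empirical mutual information between two hypothetical systems' assignments of the same judged documents to rank buckets. This is a minimal illustration of the general idea only, not the paper's actual definitions of information tau, relevance information correlation, or information difference; the bucket labels and example rankings are invented for demonstration.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Empirical mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    px = Counter(xs)           # marginal counts for xs
    py = Counter(ys)           # marginal counts for ys
    pxy = Counter(zip(xs, ys)) # joint counts
    mi = 0.0
    for (x, y), c in pxy.items():
        p_joint = c / n
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) )
        mi += p_joint * math.log2(p_joint * n * n / (px[x] * py[y]))
    return mi

# Two hypothetical systems bucket the same six judged documents into
# coarse rank bins. Identical bucketings yield maximal mutual information;
# statistically independent bucketings yield mutual information near zero.
sys_a = ["top", "top", "mid", "mid", "low", "low"]
sys_b = ["top", "top", "mid", "low", "low", "mid"]
score = mutual_information(sys_a, sys_b)
```

A high score indicates the two systems rank the judged documents similarly in information-theoretic terms, even when their document scores are on incomparable scales.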