skip to main content
10.1145/1277741.1277841acmconferencesArticle/Chapter ViewAbstractPublication PagesirConference Proceedingsconference-collections
Article

Performance prediction using spatial autocorrelation

Published:23 July 2007Publication History

ABSTRACT

Evaluation of information retrieval systems is one of the core tasks in information retrieval. Problems include the inability to exhaustively label all documents for a topic, generalizability from a small number of topics, and incorporating the variability of retrieval systems. Previous work addresses the evaluation of systems, the ranking of queries by difficulty, and the ranking of individual retrievals by performance. Approaches exist for the case of few and even no relevance judgments. Our focus is on zero-judgment performance prediction of individual retrievals. One common shortcoming of previous techniques is the assumption of uncorrelated document scores and judgments. If documents are embedded in a high-dimensional space (as they often are), we can apply techniques from spatial data analysis to detect correlations between document scores. We find that the low correlation between scores of topically close documents often implies a poor retrieval performance. When compared to a state of the art baseline, we demonstrate that the spatial analysis of retrieval scores provides significantly better prediction performance. These new predictors can also be incorporated with classic predictors to improve performance further. We also describe the first large-scale experiment to evaluate zero-judgment performance prediction for a massive number of retrieval systems over a variety collections in several languages.

References

  1. J. Aslam and V. Pavlu. Query hardness estimation using jensen-shannon divergence among multiple scoring functions. In ECIR 2007: Proceedings of the 29th European Conference on Information Retrieval, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. J. A. Aslam, V. Pavlu, and E. Yilmaz. A statistical method for system evaluation using incomplete judgments. In S. Dumais, E. N. Efthimiadis, D. Hawking, and K. Jarvelin, editors, Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 541--548. ACM Press, August 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. D. Carmel, E. Yom-Tov, A. Darlow, and D. Pelleg. What makes a query difficult? In SIGIR '06: Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval, pages 390--397, New York, NY, USA, 2006. ACM Press. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. B. Carterette, J. Allan, and R. Sitaraman. Minimal test collections for retrieval evaluation. In SIGIR '06: Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval, pages 268--275, New York, NY, USA, 2006. ACM Press. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. A. D. Cliff and J. K. Ord. Spatial Autocorrelation. Pion Ltd., 1973.Google ScholarGoogle Scholar
  6. M. Connell, A. Feng, G. Kumaran, H. Raghavan, C. Shah, and J. Allan. Umass at tdt 2004. Technical Report CIIR Technical Report IR -- 357, Department of Computer Science, University of Massachusetts, 2004.Google ScholarGoogle Scholar
  7. S. Cronen-Townsend, Y. Zhou, and W. B. Croft. Precision prediction based on ranked list coherence. Inf. Retr., 9(6):723--755, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. F. Diaz. Regularizing ad hoc retrieval scores. In CIKM '05: Proceedings of the 14th ACM international conference on Information and knowledge management, pages 672--679, New York, NY, USA, 2005. ACM Press. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. F. Diaz and R. Jones. Using temporal profiles of queries for precision prediction. In SIGIR '04: Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval, pages 18--24, New York, NY, USA, 2004. ACM Press. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. D. A. Griffith. Spatial Autocorrelation and Spatial Filtering. Springer Verlag, 2003.Google ScholarGoogle ScholarCross RefCross Ref
  11. B. He and I. Ounis. Inferring Query Performance Using Pre-retrieval Predictors. In The Eleventh Symposium on String Processing and Information Retrieval (SPIRE), 2004.Google ScholarGoogle Scholar
  12. N. Jardine and C. J. V. Rijsbergen. The use of hierarchic clustering in information retrieval. Information Storage and Retrieval, 7:217--240, 1971.Google ScholarGoogle ScholarCross RefCross Ref
  13. D. Jensen and J. Neville. Linkage and autocorrelation cause feature selection bias in relational learning. In ICML '02: Proceedings of the Nineteenth International Conference on Machine Learning, pages 259--266, San Francisco, CA, USA, 2002. Morgan Kaufmann Publishers Inc. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. O. Kurland and L. Lee. Corpus structure, language models, and ad hoc information retrieval. In SIGIR '04: Proceedings of the 27th annual international conference on Research and development in information retrieval, pages 194--201, New York, NY, USA, 2004. ACM Press. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. M. Montague and J. A. Aslam. Relevance score normalization for metasearch. In CIKM '01: Proceedings of the tenth international conference on Information and knowledge management, pages 427--433, New York, NY, USA, 2001. ACM Press. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. T. Qin, T.-Y. Liu, X.-D. Zhang, Z. Chen, and W.-Y. Ma. A study of relevance propagation for web search. In SIGIR '05: Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval, pages 408--415, New York, NY, USA, 2005. ACM Press. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. I. Soboroff, C. Nicholas, and P. Cahan. Ranking retrieval systems without relevance judgments. In SIGIR '01: Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval, pages 66--73, New York, NY, USA, 2001. ACM Press. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. V. Vinay, I. J. Cox, N. Milic-Frayling, and K. Wood. On ranking the effectiveness of searches. In SIGIR '06: Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval, pages 398--404, New York, NY, USA, 2006. ACM Press. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. Y. Zhou and W. B. Croft. Ranking robustness: a novel framework to predict query performance. In CIKM '06: Proceedings of the 15th ACM international conference on Information and knowledge management, pages 567--574, New York, NY, USA, 2006. ACM Press. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Performance prediction using spatial autocorrelation

      Recommendations

      Comments

      Login options

      Check if you have access through your login credentials or your institution to get full access on this article.

      Sign in
      • Published in

        cover image ACM Conferences
        SIGIR '07: Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
        July 2007
        946 pages
        ISBN:9781595935977
        DOI:10.1145/1277741

        Copyright © 2007 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 23 July 2007

        Permissions

        Request permissions about this article.

        Request Permissions

        Check for updates

        Qualifiers

        • Article

        Acceptance Rates

        Overall Acceptance Rate792of3,983submissions,20%

      PDF Format

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader