Published in: Discover Computing 6/2007

1 December 2007

Regularizing query-based retrieval scores

Author: Fernando Diaz


Abstract

We adapt the cluster hypothesis for score-based information retrieval by claiming that closely related documents should have similar scores. Given a retrieval from an arbitrary system, we describe an algorithm which directly optimizes this objective by adjusting retrieval scores so that topically related documents receive similar scores. We refer to this process as score regularization. Because score regularization operates on retrieval scores, regardless of their origin, we can apply the technique to arbitrary initial retrieval rankings. Document rankings derived from regularized scores, when compared to rankings derived from unregularized scores, yield consistent and significant performance improvements across a variety of baseline retrieval algorithms. We also present several proofs demonstrating that regularization generalizes methods such as pseudo-relevance feedback, document expansion, and cluster-based retrieval. Because of these strong empirical and theoretical results, we argue for the adoption of score regularization as a general design principle or post-processing step for information retrieval systems.
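The idea of adjusting scores so that related documents score alike can be illustrated with a minimal sketch. It is not the paper's exact algorithm: it assumes the iterative update used in manifold ranking (Zhou et al. 2004, listed in the references), in which each document repeatedly blends its initial score with the scores of its neighbours in a document-similarity graph. The function names `normalize_affinity` and `regularize_scores`, the parameter `alpha`, and the toy similarity matrix are all hypothetical choices made for illustration.

```python
import math


def normalize_affinity(W):
    """Symmetrically normalize an affinity matrix: S = D^(-1/2) W D^(-1/2).

    W is a square list of lists of non-negative document similarities.
    Rows with zero degree are left as all zeros.
    """
    n = len(W)
    degree = [sum(row) for row in W]
    S = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if degree[i] > 0 and degree[j] > 0:
                S[i][j] = W[i][j] / math.sqrt(degree[i] * degree[j])
    return S


def regularize_scores(y, W, alpha=0.5, iterations=100):
    """Assumed update rule: f <- alpha * S f + (1 - alpha) * y.

    y     : initial retrieval scores, one per document
    W     : symmetric document-document similarity matrix
    alpha : weight on neighbours' scores (0 keeps y unchanged)
    """
    n = len(y)
    S = normalize_affinity(W)
    f = list(y)
    for _ in range(iterations):
        f = [
            alpha * sum(S[i][j] * f[j] for j in range(n)) + (1 - alpha) * y[i]
            for i in range(n)
        ]
    return f


# Toy example: documents 0 and 1 are strongly related, document 2 only weakly.
W = [[0.0, 0.9, 0.1],
     [0.9, 0.0, 0.1],
     [0.1, 0.1, 0.0]]
y = [3.0, 1.0, 2.0]
f = regularize_scores(y, W)
ranking = sorted(range(len(f)), key=lambda i: -f[i])
print(ranking, f)
```

Because the sketch only reads the initial scores `y` and a similarity matrix, it can be placed behind any baseline retrieval system, which is the property the abstract emphasizes. In this toy case, document 1 is pulled toward the score of its close neighbour, document 0.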


Footnotes
1
Baliński and Daniłowicz (2005) recently proposed a similar score-based objective. Though a solution is presented, we are not aware of any experimental results or of the connections to previous models that we describe here.
 
2
These functions operate on the entire vector f as opposed to element-wise.
 
3
The local, discrete Lipschitz constant for a document, i, can be thought of as \(\max_j \, (W_{ij}\|f_i - f_j\|)\). Although similar, the local Lipschitz measure is much less forgiving to discontinuities in a function. Because our retrieval function can be thought of as a very peaked or spiky function due to the paucity of relevant documents, we adopt the Laplacian-based measure.
 
4
In practice, the document representations are only based on the cluster information (i.e., λ = 0). Our ranking function generalizes classic cluster-based retrieval functions.
 
5
In Sect. 3.2, we adopted the symmetric diffusion kernel to compare distributions. The cross-entropy measure here is asymmetric and therefore cannot be used in our closed form solution. Nevertheless, our iterative solution is not constrained by the symmetry requirement. Furthermore, theoretical results for Laplacians of directed graphs exist and can be applied in our framework (Chung 2004; Zhou et al. 2005).
 
6
We noticed that the cosine similarity in general outperformed the diffusion kernel.
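The local Lipschitz measure from footnote 3 can be sketched for scalar scores as follows. The function `local_lipschitz` implements the formula given in the footnote. The function `local_laplacian_penalty` is an assumption: the footnote does not spell out the Laplacian-based measure, so the sketch uses a common sum-of-squared-differences form for comparison. The matrix `W` and scores `f` are made-up toy values.

```python
def local_lipschitz(W, f, i):
    """Footnote 3: max over j of W[i][j] * |f[i] - f[j]| for scalar scores."""
    return max(W[i][j] * abs(f[i] - f[j]) for j in range(len(f)) if j != i)


def local_laplacian_penalty(W, f, i):
    """Assumed sum-based counterpart: sum over j of W[i][j] * (f[i] - f[j])**2."""
    return sum(W[i][j] * (f[i] - f[j]) ** 2 for j in range(len(f)) if j != i)


# Toy similarity matrix and scores (hypothetical values).
W = [[0.0, 1.0, 0.5],
     [1.0, 0.0, 0.5],
     [0.5, 0.5, 0.0]]
f = [1.0, 0.0, 0.5]

print(local_lipschitz(W, f, 0))
print(local_laplacian_penalty(W, f, 0))
```

The maximum-based measure is determined by the single largest weighted score difference, whereas the sum-based form aggregates over all neighbours.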
 
References
Baliński, J., & Daniłowicz, C. (2005). Re-ranking method based on inter-document distances. Information Processing and Management, 41(4), 759–775.
Belew, R. K. (1989). Adaptive information retrieval: Using a connectionist representation to retrieve and learn about documents. In SIGIR ’89: Proceedings of the 12th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 11–20). New York: ACM Press.
Belkin, M., Matveeva, I., & Niyogi, P. (2004). Regularization and semi-supervised learning on large graphs. In COLT (pp. 624–638).
Belkin, M., & Niyogi, P. (2003). Using manifold structure for partially labeled classification. In S. T. S. Becker & K. Obermayer (Eds.), Advances in neural information processing systems (Vol. 15, pp. 929–936). Cambridge, MA: MIT Press.
Belkin, M., & Niyogi, P. (2004). Semi-supervised learning on Riemannian manifolds. Machine Learning, 56(1–3), 209–239.
Belkin, M., & Niyogi, P. (2005). Towards a theoretical foundation for Laplacian-based manifold methods. In COLT (pp. 486–500).
Brin, S., & Page, L. (1998). The anatomy of a large-scale hypertextual Web search engine. In WWW7: Proceedings of the Seventh International Conference on World Wide Web 7 (pp. 107–117). Amsterdam: Elsevier Science Publishers B. V.
Buckley, C., & Voorhees, E. M. (2000). Evaluating evaluation measure stability. In SIGIR ’00: Proceedings of the 23rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 33–40). New York: ACM Press.
Carmel, D., Yom-Tov, E., Darlow, A., & Pelleg, D. (2006). What makes a query difficult? In SIGIR ’06: Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 390–397). New York: ACM Press.
Chen, Z., & Haykin, S. (2002). On different facets of regularization theory. Neural Computation, 14(12), 2791–2846.
Chung, F. R. K. (1997). Spectral graph theory. American Mathematical Society.
Chung, F. (2004). Laplacians and the Cheeger inequality for directed graphs. Annals of Combinatorics, 9, 1–19.
Cohn, D. A., & Hofmann, T. (2000). The missing link – a probabilistic model of document content and hypertext connectivity. In NIPS (pp. 430–436).
Connell, M., Feng, A., Kumaran, G., Raghavan, H., Shah, C., & Allan, J. (2004). UMass at TDT 2004. Technical Report CIIR Technical Report IR-357. Department of Computer Science, University of Massachusetts.
Croft, W. B., & Lafferty, J. (2003). Language modeling for information retrieval. Kluwer Academic Publishing.
Croft, W. B., Lucia, T. J., & Cohen, P. R. (1988). Retrieving documents by plausible inference: A preliminary study. In SIGIR ’88: Proceedings of the 11th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 481–494). New York: ACM Press.
Cronen-Townsend, S., Zhou, Y., & Croft, W. B. (2002). Predicting query performance. In SIGIR ’02: Proceedings of the 25th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 299–306). New York: ACM Press.
Deerwester, S. C., Dumais, S. T., Landauer, T. K., Furnas, G. W., & Harshman, R. A. (1990). Indexing by latent semantic analysis. Journal of the American Society of Information Science, 41(6), 391–407.
Fang, H., Tao, T., & Zhai, C. (2004). A formal study of information retrieval heuristics. In SIGIR ’04: Proceedings of the 27th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 49–56). New York: ACM Press.
Fang, H., & Zhai, C. (2005). An exploration of axiomatic approaches to information retrieval. In SIGIR ’05: Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 480–487). New York: ACM Press.
Harman, D. K. (1993). The first text retrieval conference (TREC-1) Rockville, MD, U.S.A., 4–6 November, 1992. Information Processing and Management, 29(4), 411–414.
Jardine, N., & van Rijsbergen, C. J. (1971). The use of hierarchic clustering in information retrieval. Information Storage and Retrieval, 7, 217–240.
Kleinberg, J. M. (1998). Authoritative sources in a hyperlinked environment. In SODA ’98: Proceedings of the Ninth Annual ACM-SIAM Symposium on Discrete Algorithms (pp. 668–677). Philadelphia: Society for Industrial and Applied Mathematics.
Krovetz, R. (1993). Viewing morphology as an inference process. In SIGIR ’93: Proceedings of the 16th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 191–202). New York: ACM Press.
Kurland, O., & Lee, L. (2004). Corpus structure, language models, and ad hoc information retrieval. In SIGIR ’04: Proceedings of the 27th Annual International Conference on Research and Development in Information Retrieval (pp. 194–201). New York: ACM Press.
Kurland, O., & Lee, L. (2005). PageRank without hyperlinks: Structural re-ranking using links induced by language models. In SIGIR ’05: Proceedings of the 28th Annual International Conference on Research and Development in Information Retrieval.
Kurland, O., Lee, L., & Domshlak, C. (2005). Better than the real thing? Iterative pseudo-query processing using cluster-based language models. In SIGIR ’05: Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 19–26). New York: ACM Press.
Kwok, K. L. (1989). A neural network for probabilistic information retrieval. In SIGIR ’89: Proceedings of the 12th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 21–30). New York: ACM Press.
Lafferty, J., & Lebanon, G. (2005). Diffusion kernels on statistical manifolds. Journal of Machine Learning Research, 6, 129–163.
Lafon, S. (2004). Diffusion maps and geometric harmonics. PhD Thesis, Yale University.
Lavrenko, V. (2004). A generative theory of relevance. Ph.D. Thesis, University of Massachusetts.
Lavrenko, V., & Allan, J. (2006). Real-time query expansion in relevance models. Technical Report IR-473. Amherst: University of Massachusetts.
Liu, X., & Croft, W. B. (2004). Cluster-based retrieval using language models. In SIGIR ’04: Proceedings of the 27th Annual International Conference on Research and Development in Information Retrieval (pp. 186–193). New York: ACM Press.
Metzler, D., & Croft, W. B. (2004). Combining the language model and inference network approaches to retrieval. Information Processing and Management, 40(5), 735–750.
Metzler, D., & Croft, W. B. (2005). A Markov random field model for term dependencies. In SIGIR ’05: Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 472–479). New York: ACM Press.
Montague, M., & Aslam, J. A. (2001). Relevance score normalization for metasearch. In CIKM ’01: Proceedings of the Tenth International Conference on Information and Knowledge Management (pp. 427–433). New York: ACM Press.
Ng, A. Y., Zheng, A. X., & Jordan, M. I. (2001). Stable algorithms for link analysis. In SIGIR ’01: Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 258–266). New York: ACM Press.
Petersen, K. B., & Pedersen, M. S. (2005). The matrix Cookbook. Version 20051003.
Platt, J. (2000). Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods. In P. J. Bartlett, B. Schölkopf, D. Schuurmans, & A. J. Smola (Eds.), Advances in large margin classifiers. MIT Press.
Qin, T., Liu, T.-Y., Zhang, X.-D., Chen, Z., & Ma, W.-Y. (2005). A study of relevance propagation for web search. In SIGIR ’05: Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 408–415). New York: ACM Press.
Richardson, M., Prakash, A., & Brill, E. (2006). Beyond PageRank: Machine learning for static ranking. In WWW ’06: Proceedings of the 15th International Conference on World Wide Web (pp. 707–715). New York: ACM Press.
Robertson, S. E., van Rijsbergen, C. J., & Porter, M. F. (1981). Probabilistic models of indexing and searching. In SIGIR ’80: Proceedings of the 3rd Annual ACM Conference on Research and Development in Information Retrieval (pp. 35–56). Kent: Butterworth & Co.
Robertson, S. E., & Walker, S. (1994). Some simple effective approximations to the 2-Poisson model for probabilistic weighted retrieval. In SIGIR ’94: Proceedings of the 17th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 232–241). New York: Springer-Verlag Inc.
Rocchio, J. J. (1971). Relevance feedback in information retrieval. In The SMART retrieval system: Experiments in automatic document processing (pp. 313–323). Prentice-Hall Inc.
Rölleke, T., Tsikrika, T., & Kazai, G. (2006). A general matrix framework for modelling information retrieval. Information Processing and Management, 42(1), 4–30.
Salton, G. (1968). Automatic information organization and retrieval. McGraw Hill Text.
Salton, G., & Buckley, C. (1988). On the use of spreading activation methods in automatic information. In SIGIR ’88: Proceedings of the 11th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 147–160). New York: ACM Press.
Salton, G., Wong, A., & Yang, C. S. (1975). A vector space model for automatic indexing. Communications of the ACM, 18(11), 613–620.
Savoy, J. (1997). Ranking schemes in hybrid Boolean systems: A new approach. Journal of the American Society for Information Science, 48(3), 235–253.
Singhal, A., & Pereira, F. (1999). Document expansion for speech retrieval. In SIGIR ’99: Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 34–41). New York: ACM Press.
Strohman, T., Metzler, D., Turtle, H., & Croft, W. B. (2004). Indri: A language model-based search engine for complex queries. In Proceedings of the International Conference on Intelligence Analysis.
Tao, T., Wang, X., Mei, Q., & Zhai, C. (2006). Language model information retrieval with document expansion. In HLT/NAACL 2006 (pp. 407–414).
Turtle, H., & Croft, W. B. (1990). Inference networks for document retrieval. In SIGIR ’90: Proceedings of the 13th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 1–24). New York: ACM Press.
Voorhees, E. (2004). Overview of the TREC 2004 Robust Track. In Proceedings of the 13th Text REtrieval Conference (TREC 2004).
Voorhees, E. M., & Harman, D. K. (Eds.). (2001). TREC: Experiment and evaluation in information retrieval. MIT Press.
Wei, X., & Croft, W. B. (2006). LDA-based document models for ad-hoc retrieval. In SIGIR ’06: Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 178–185). New York: ACM Press.
Wilkinson, R., & Hingston, P. (1991). Using the cosine measure in a neural network for document retrieval. In SIGIR ’91: Proceedings of the 14th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 202–210). New York: ACM Press.
Xu, J., & Croft, W. B. (1999). Cluster-based language models for distributed retrieval. In SIGIR ’99: Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 254–261). New York: ACM Press.
Yom-Tov, E., Fine, S., Carmel, D., & Darlow, A. (2005). Learning to estimate query difficulty: Including applications to missing content detection and distributed information retrieval. In SIGIR ’05: Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 512–519). New York: ACM Press.
Zhou, D., Schölkopf, B., & Hofmann, T. (2005). Semi-supervised learning on directed graphs. In L. K. Saul, Y. Weiss, & L. Bottou (Eds.), Advances in neural information processing systems (Vol. 17, pp. 1633–1640). Cambridge, MA: MIT Press.
Zhou, D., Weston, J., Gretton, A., Bousquet, O., & Schölkopf, B. (2004). Ranking on data manifolds. In L. S. Thrun & B. Schölkopf (Eds.), Advances in neural information processing systems (Vol. 16, pp. 169–176). Cambridge, MA: MIT Press.
Metadata
Title
Regularizing query-based retrieval scores
Author
Fernando Diaz
Publication date
1 December 2007
Publisher
Springer Netherlands
Published in
Discover Computing / Issue 6/2007
Print ISSN: 2948-2984
Electronic ISSN: 2948-2992
DOI
https://doi.org/10.1007/s10791-007-9034-8
