Published in: Discover Computing 2/2020

6 March 2020

A passage-based approach to learning to rank documents

Authors: Eilon Sheetrit, Anna Shtok, Oren Kurland

Abstract

According to common relevance-judgments regimes, such as TREC’s, a document can be deemed relevant to a query even if it contains a very short passage of text with pertinent information. This fact has motivated work on passage-based document retrieval: document ranking methods that induce information from the document’s passages. However, the main source of passage-based information utilized was passage-query similarities. In this paper, we address the challenge of utilizing richer sources of passage-based information to improve document retrieval effectiveness. Specifically, we devise a suite of learning-to-rank-based document retrieval methods that utilize an effective ranking of passages produced in response to the query. Some of the methods quantify the ranking of the passages of a document. Others utilize the feature-based representation of the document’s passages. Empirical evaluation attests to the clear merits of our methods with respect to highly effective baselines. Our best performing method is based on learning a document ranking function using document-query features and passage-query features of the document’s passage most highly ranked; the passage-query features are those used to learn a highly effective passage ranker.
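
The best performing configuration named in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the names `Passage` and `joint_features`, the toy feature values, and the choice to skip documents that have no passages are all assumptions made for the example. It shows only how a document-query feature vector might be concatenated with the passage-query feature vector of the document's most highly ranked passage before the result is passed to a learning-to-rank method.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass
class Passage:
    """A passage with the score given by some passage ranker (hypothetical type)."""
    doc_id: str
    score: float           # higher means ranked higher by the passage ranker
    features: List[float]  # passage-query feature vector


def joint_features(
    doc_features: Dict[str, Sequence[float]],
    ranked_passages: Sequence[Passage],
) -> Dict[str, List[float]]:
    """Concatenate each document's features with those of its top-ranked passage."""
    # Find the most highly ranked passage of every document.
    best: Dict[str, Passage] = {}
    for passage in ranked_passages:
        current = best.get(passage.doc_id)
        if current is None or passage.score > current.score:
            best[passage.doc_id] = passage

    # Build the joint vector; documents without passages are skipped
    # (an assumption of this sketch, not something the source specifies).
    joint: Dict[str, List[float]] = {}
    for doc_id, feats in doc_features.items():
        if doc_id in best:
            joint[doc_id] = list(feats) + list(best[doc_id].features)
    return joint


if __name__ == "__main__":
    docs = {"d1": [0.5, 1.0], "d2": [0.2, 0.3]}
    passages = [
        Passage("d1", 0.9, [7.0]),
        Passage("d2", 0.4, [3.0]),
        Passage("d1", 0.1, [1.0]),
    ]
    print(joint_features(docs, passages))
```

The joint vectors would then serve as training instances for a ranker such as LambdaMART or a linear SVM, both of which the footnotes mention; that training step is outside the scope of this sketch.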

Footnotes
1
Note that these passages are also the passages of documents in \(\mathcal {D}_{init}\) since \(\mathcal {D}_{LTR}\) is a re-rank of \(\mathcal {D}_{init}\).
 
2
Experiments (actual numbers are omitted as they convey no additional insight) showed that simply using the passage-based document ranking without the additional fusion often yields performance (substantially) inferior to that of FPD.
 
4
Unless otherwise stated, we used the jforests implementation of LambdaMART: https://code.google.com/p/jforests/. In Sect. 5.1.7 we also present the performance results of our best performing method when using the LightGBM implementation of LambdaMART (https://github.com/microsoft/LightGBM).
 
7
The only exception was that the passage LTR method applied on TREC corpora was learned using all queries in the INEX dataset.
 
8
Not smoothing these language models was shown to yield highly effective RM3 performance (Raiber and Kurland 2013).
 
9
The finding that init-LMart underperforms init-SVM can be attributed to the fact that LMart is a non-linear ranker while SVM is linear, and the number of queries used for training is not very large.
 
10
We note that the use of the lowest ranked passage did not result in a substantial performance decrease due to the length of passages used here: 300; that is, such passages can incorporate a decent amount of information from the entire document, especially in cases of relatively short documents.
 
11
To avoid having the same features used for the two passages, the following features were removed from the feature vector of the second ranked passage: DocQuerySim, MaxPDSim, AvgPDSim, StdPDSim and QueryLength.
 
12
We do not present the comparison for the JPDm approach as it is independent of the passage ranking.
 
13
JPDs-SVM uses 24 features and JPDs-LMart uses 25 features; the additional feature is the query length, which is not useful for a linear ranker.
 
14
In this analysis we set \(\nu\), the free parameter of SMPD, to a value which is effective across the train folds.
 
References
Abdul-Jaleel, N., Allan, J., Croft, W. B., Diaz, F., Larkey, L., Li, X., et al. (2004). UMASS at TREC 2004: Novelty and hard. In Proceedings of TREC.
Arvola, P., Geva, S., Kamps, J., Schenkel, R., Trotman, A., & Vainio, J. (2011). Overview of the INEX 2010 ad hoc track. In Comparative evaluation of focused retrieval (pp. 1–32).
Bendersky, M., & Kurland, O. (2008). Re-ranking search results using document-passage graphs. In Proceedings of SIGIR (pp. 853–854).
Bendersky, M., & Kurland, O. (2010). Utilizing passage-based language models for ad hoc document retrieval. Information Retrieval, 13(2), 157–187.
Bendersky, M., Croft, W. B., & Diao, Y. (2011). Quality-biased ranking of web documents. In Proceedings of WSDM (pp. 95–104).
Buffoni, D., Usunier, N., & Gallinari, P. (2010). Lip6 at INEX: OWPC for ad hoc track. In Focused retrieval and evaluation (pp. 59–69).
Burges, C. J. (2010). From RankNet to LambdaRank to LambdaMART: An overview. Technical report, Microsoft Research.
Callan, J. P. (1994). Passage-level evidence in document retrieval. In Proceedings of SIGIR (pp. 302–310).
Carmel, D., Shtok, A., & Kurland, O. (2013). Position-based contextualization for passage retrieval. In Proceedings of CIKM (pp. 1241–1244).
Chen, R., Spina, D., Croft, W. B., Sanderson, M., & Scholer, F. (2015). Harnessing semantics for answer sentence retrieval. In Proceedings of ESAIR (pp. 21–27).
Chen, R. C., Yulianti, E., Sanderson, M., & Croft, W. B. (2017). On the benefit of incorporating external features in a neural architecture for answer sentence selection. In Proceedings of SIGIR (pp. 1017–1020).
Cormack, G. V., Clarke, C. L., & Buettcher, S. (2009). Reciprocal rank fusion outperforms Condorcet and individual rank learning methods. In Proceedings of SIGIR (pp. 758–759).
Cormack, G. V., Smucker, M. D., & Clarke, C. L. (2011). Efficient and effective spam filtering and re-ranking for large web datasets. Information Retrieval, 14(5), 441–465.
Dehghani, M., Zamani, H., Severyn, A., Kamps, J., & Croft, W. B. (2017). Neural ranking models with weak supervision. In Proceedings of SIGIR (pp. 65–74).
Denoyer, L., Zaragoza, H., & Gallinari, P. (2001). HMM-based passage models for document classification and ranking. In Proceedings of ECIR.
Fan, Y., Guo, J., Lan, Y., Xu, J., Zhai, C., & Cheng, X. (2018). Modeling diverse relevance patterns in ad-hoc retrieval. In Proceedings of SIGIR (pp. 375–384).
Fernández, R. T., & Losada, D. E. (2012). Effective sentence retrieval based on query-independent evidence. Information Processing and Management, 48(6), 1203–1229.
Fernández, R. T., Losada, D. E., & Azzopardi, L. A. (2011). Extending the language modeling framework for sentence retrieval to include local context. Information Retrieval, 14(4), 355–389.
Ferragina, P., & Scaiella, U. (2012). Fast and accurate annotation of short texts with Wikipedia pages. IEEE Software, 29(1), 70–75.
Friedman, J. H. (2001). Greedy function approximation: A gradient boosting machine. Annals of Statistics, 29, 1189–1232.
Gabrilovich, E., & Markovitch, S. (2007). Computing semantic relatedness using Wikipedia-based explicit semantic analysis. In Proceedings of IJCAI (pp. 1606–1611).
Geva, S., Kamps, J., Lethonen, M., Schenkel, R., Thom, J. A., & Trotman, A. (2010). Overview of the INEX 2009 ad hoc track. In Focused retrieval and evaluation (pp. 4–25).
Hearst, M. A., & Plaunt, C. (1993). Subtopic structuring for full-length document access. In Proceedings of SIGIR (pp. 59–68).
Jiang, J., & Zhai, C. (2004). UIUC in HARD 2004: Passage retrieval using HMMs. In Proceedings of TREC.
Joachims, T. (2006). Training linear SVMs in linear time. In Proceedings of KDD (pp. 217–226).
Kaszkiel, M., & Zobel, J. (1997). Passage retrieval revisited. In Proceedings of SIGIR (pp. 178–185).
Kaszkiel, M., & Zobel, J. (2001). Effective ranking with arbitrary passages. Journal of the American Society for Information Science and Technology, 52(4), 344–364.
Keikha, M., Park, J. H., & Croft, W. B. (2014a). Evaluating answer passages using summarization measures. In Proceedings of SIGIR (pp. 963–966).
Keikha, M., Park, J. H., Croft, W. B., & Sanderson, M. (2014b). Retrieving passages and finding answers. In Proceedings of ADCS (p. 81).
Krikon, E., Kurland, O., & Bendersky, M. (2010). Utilizing inter-passage and inter-document similarities for reranking search results. ACM Transactions on Information Systems, 29(1), 3:1–3:28.
Kurland, O., & Domshlak, C. (2008). A rank-aggregation approach to searching for optimal query-specific clusters. In Proceedings of SIGIR (pp. 547–554).
Kurland, O., & Krikon, E. (2011). The opposite of smoothing: A language model approach to ranking query-specific document clusters. Journal of Artificial Intelligence Research, 41, 367–395.
Lang, H., Metzler, D., Wang, B., & Li, J. (2010). Improved latent concept expansion using hierarchical Markov random fields. In Proceedings of CIKM (pp. 249–258).
Lin, J. (2018). The neural hype and comparisons against weak baselines. SIGIR Forum, 52(2), 40–51.
Liu, T. Y. (2009). Learning to rank for information retrieval. Foundations and Trends in Information Retrieval, 3(3), 225–331.
Liu, X., & Croft, W. B. (2002). Passage retrieval based on language models. In Proceedings of CIKM (pp. 375–382).
Liu, X., & Croft, W. B. (2004). Cluster-based retrieval using language models. In Proceedings of SIGIR (pp. 186–193).
Lv, Y., & Zhai, C. (2009). Positional language models for information retrieval. In Proceedings of SIGIR (pp. 299–306).
Lv, Y., & Zhai, C. (2010). Positional relevance model for pseudo-relevance feedback. In Proceedings of SIGIR (pp. 579–586).
Macdonald, C., Santos, R. L., & Ounis, I. (2012). On the usefulness of query features for learning to rank. In Proceedings of CIKM (pp. 2559–2562).
Metzler, D., & Croft, W. B. (2005). A Markov random field model for term dependencies. In Proceedings of SIGIR (pp. 472–479).
Metzler, D., & Croft, W. B. (2007a). Latent concept expansion using Markov random fields. In Proceedings of SIGIR (pp. 311–318).
Metzler, D., & Croft, W. B. (2007b). Linear feature-based models for information retrieval. Information Retrieval, 10(3), 257–274.
Metzler, D., & Kanungo, T. (2008). Machine learned sentence selection strategies for query-biased summarization. In Proceedings of SIGIR (pp. 40–47).
Miao, J., Huang, J. X., & Ye, Z. (2012). Proximity-based Rocchio’s model for pseudo relevance. In Proceedings of SIGIR (pp. 535–544).
Mittendorf, E., & Schäuble, P. (1994). Document and passage retrieval based on hidden Markov models. In Proceedings of SIGIR (pp. 318–327). New York: Springer.
Murdock, V., & Croft, W. B. (2005). A translation model for sentence retrieval. In Proceedings of HLT/EMNLP (pp. 684–691). Association for Computational Linguistics.
Murdock, V. G. (2006). Aspects of sentence retrieval. PhD thesis, University of Massachusetts Amherst.
Na, S., Kang, I., Lee, Y., & Lee, J. (2008). Completely-arbitrary passage retrieval in language modeling approach. In Proceedings of AIRS (pp. 22–33).
Nguyen, T., Rosenberg, M., Song, X., Gao, J., Tiwary, S., Majumder, R., et al. (2016). MS MARCO: A human generated machine reading comprehension dataset. CoRR. arXiv:1611.09268.
Ntoulas, A., Najork, M., Manasse, M., & Fetterly, D. (2006). Detecting spam web pages through content analysis. In Proceedings of WWW (pp. 83–92).
Raiber, F., & Kurland, O. (2013). Ranking document clusters using Markov random fields. In Proceedings of SIGIR (pp. 333–342).
Raifer, N., Raiber, F., Tennenholtz, M., & Kurland, O. (2017). Information retrieval meets game theory: The ranking competition between documents’ authors. In Proceedings of SIGIR (pp. 465–474).
Robertson, S. E., Walker, S., Jones, S., Hancock-Beaulieu, M. M., & Gatford, M. (1995). Okapi at TREC-3. In Proceedings of TREC (Vol. 109, p. 109).
Salton, G., Allan, J., & Buckley, C. (1993). Approaches to passage retrieval in full text information systems. In Proceedings of SIGIR (pp. 49–58).
Sheetrit, E., & Kurland, O. (2019). Cluster-based focused retrieval. In Proceedings of CIKM (pp. 2305–2308).
Soboroff, I. (2004). Overview of the TREC 2004 novelty track. In Proceedings of TREC.
Soboroff, I., & Harman, D. (2003). Overview of the TREC 2003 novelty track. In Proceedings of TREC (pp. 38–53).
Tao, T., & Zhai, C. (2007). An exploration of proximity measures in information retrieval. In Proceedings of SIGIR (pp. 295–302).
Voorhees, E. M., & Harman, D. K. (2005). TREC: Experiments and evaluation in information retrieval. Cambridge: MIT Press.
Wan, X., Yang, J., & Xiao, J. (2008). Towards a unified approach to document similarity search using manifold-ranking of blocks. Information Processing and Management, 44(3), 1032–1048.
Wang, M., & Si, L. (2008). Discriminative probabilistic models for passage based retrieval. In Proceedings of SIGIR (pp. 419–426).
Wilkinson, R. (1994). Effective retrieval of structured documents. In Proceedings of SIGIR (pp. 311–317).
Yang, L., Ai, Q., Spina, D., Chen, R. C., Pang, L., Croft, W. B., Guo, J., & Scholer, F. (2016). Beyond factoid QA: Effective methods for non-factoid answer sentence retrieval. In Proceedings of ECIR (pp. 115–128). Berlin: Springer.
Yulianti, E., Chen, R., Scholer, F., & Sanderson, M. (2016). Using semantic and context features for answer summary extraction. In Proceedings of ADCS (pp. 81–84).
Yulianti, E., Chen, R., Scholer, F., Croft, W. B., & Sanderson, M. (2018). Ranking documents by answer-passage quality. In Proceedings of SIGIR (pp. 335–344).
Zhai, C., & Lafferty, J. (2001). A study of smoothing methods for language models applied to ad hoc information retrieval. In Proceedings of SIGIR (pp. 334–342).
Zhao, J., & Yun, Y. (2009). A proximity language model for information retrieval. In Proceedings of SIGIR (pp. 291–298).
Metadata
Title
A passage-based approach to learning to rank documents
Authors
Eilon Sheetrit
Anna Shtok
Oren Kurland
Publication date
6 March 2020
Publisher
Springer Netherlands
Published in
Discover Computing / Issue 2/2020
Print ISSN: 2948-2984
Electronic ISSN: 2948-2992
DOI
https://doi.org/10.1007/s10791-020-09369-x