
2023 | Original Paper | Book chapter

CoSPLADE: Contextualizing SPLADE for Conversational Information Retrieval

Authors: Nam Hai Le, Thomas Gerald, Thibault Formal, Jian-Yun Nie, Benjamin Piwowarski, Laure Soulier

Published in: Advances in Information Retrieval

Publisher: Springer Nature Switzerland


Abstract

Conversational search is a difficult task as it aims at retrieving documents based not only on the current user query but also on the full conversation history. Most of the previous methods have focused on a multi-stage ranking approach relying on query reformulation, a critical intermediate step that might lead to a sub-optimal retrieval. Other approaches have tried to use a fully neural IR first-stage, but are either zero-shot or rely on full learning-to-rank based on a dataset with pseudo-labels. In this work, leveraging the CANARD dataset, we propose an innovative lightweight learning technique to train a first-stage ranker based on SPLADE. By relying on SPLADE sparse representations, we show that, when combined with a second-stage ranker based on T5Mono, the results are competitive on the TREC CAsT 2020 and 2021 tracks. The source code is available at https://github.com/nam685/cosplade.git.
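
The two-stage setup mentioned in the abstract (a sparse first-stage ranker followed by a second-stage re-ranker) can be illustrated with a minimal sketch. It assumes that queries and documents are already encoded as sparse term-weight dictionaries, as a SPLADE-style encoder might produce, and that the first stage scores by dot product. The function names, the toy vectors, and the stand-in re-ranking scores are hypothetical; none of them is taken from the paper or its repository.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical toy format: term -> non-negative weight.
SparseVec = Dict[str, float]


def sparse_dot(query: SparseVec, doc: SparseVec) -> float:
    """Dot product restricted to the terms both sparse vectors share."""
    # Iterate over the smaller vector so the lookup count stays low.
    small, large = (query, doc) if len(query) <= len(doc) else (doc, query)
    return sum(w * large[t] for t, w in small.items() if t in large)


def first_stage(
    query: SparseVec, docs: Dict[str, SparseVec], k: int
) -> List[Tuple[str, float]]:
    """Return the k highest-scoring (doc_id, score) pairs."""
    scored = [(doc_id, sparse_dot(query, vec)) for doc_id, vec in docs.items()]
    # Sort by descending score; break ties by doc_id for determinism.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:k]


def rerank(candidates: List[str], scorer: Callable[[str], float]) -> List[str]:
    """Reorder candidate doc ids with a (stand-in) second-stage scorer."""
    return sorted(candidates, key=lambda doc_id: (-scorer(doc_id), doc_id))


# Toy data (assumed weights, for illustration only).
query = {"splade": 1.0, "conversational": 0.5}
docs = {
    "d1": {"splade": 2.0, "sparse": 1.0},
    "d2": {"conversational": 2.0, "search": 1.0},
    "d3": {"bm25": 3.0},
}

top = first_stage(query, docs, k=2)

# Stand-in for a neural re-ranker: fixed scores per document id.
toy_scores = {"d1": 0.2, "d2": 0.9}
final = rerank([doc_id for doc_id, _ in top], lambda d: toy_scores[d])
```

In this toy run the first stage keeps `d1` and `d2` and drops `d3`, which shares no term with the query; the stand-in second stage then places `d2` ahead of `d1`. A real system would replace the fixed scores with a learned re-ranker and the dictionaries with an inverted index.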


Footnotes
1
Note that for the second stage, we rely on weak labels since our model is similar to previous works. Given that the gap between first-stage and second-stage rankers continues to decrease, training a second-stage ranker might not be necessary in the future.
 
2
Selected by the organizer as the most relevant answer of a baseline system.
 
4
In the experiments, we also explore an alternative model where answers and queries are considered at once.
 
5
To improve coherence, we chose to make keywords follow their order of appearance in the context, but did not vary this experimental setting.
 
8
This might be due to the simple way to use past answers, i.e. Equation 4, but all the other variations we tried did not perform better.
 
Metadata
Title
CoSPLADE: Contextualizing SPLADE for Conversational Information Retrieval
Authors
Nam Hai Le
Thomas Gerald
Thibault Formal
Jian-Yun Nie
Benjamin Piwowarski
Laure Soulier
Copyright year
2023
DOI
https://doi.org/10.1007/978-3-031-28244-7_34
