
2021 | OriginalPaper | Book chapter

A Deep Analysis of an Explainable Retrieval Model for Precision Medicine Literature Search

Written by: Jiaming Qu, Jaime Arguello, Yue Wang

Published in: Advances in Information Retrieval

Publisher: Springer International Publishing

Abstract

Professional search queries are often formulated in a structured manner, where multiple aspects are combined in a logical form. The information need is often fulfilled by an initial retrieval stage followed by a complex reranking algorithm. In this paper, we analyze a simple, explainable reranking model that follows the structured search criterion. Different aspects of the criterion are predicted by machine learning classifiers, which are then combined through the logical form to predict document relevance. On three years of data from the TREC Precision Medicine literature search track (2017–2019), we show that the simple model consistently performs as well as LambdaMART rerankers. Furthermore, many black-box rerankers developed by top-ranked TREC teams can be replaced by this simple model without statistically significant performance change. Finally, we find that the model can achieve remarkably high performance even when manually labeled documents are very limited. Together, these findings suggest that leveraging the structure in professional search queries is a promising direction towards building explainable, label-efficient, and high-performance retrieval models for professional search tasks.
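The idea of combining per-aspect classifier outputs through a logical form can be illustrated with a minimal Python sketch. It is not the authors' implementation: the aspect names (`disease`, `gene`, `treatment`), the conjunctive criterion, the probability values, and the use of a product for AND and a noisy-OR for OR are all assumptions made for illustration, since the abstract does not specify how the logical form is evaluated.

```python
from math import prod
from typing import Dict, List, Tuple


def soft_and(probs: List[float]) -> float:
    """Probabilistic AND: product of aspect probabilities (assumes independence)."""
    return prod(probs)


def soft_or(probs: List[float]) -> float:
    """Probabilistic OR (noisy-OR): 1 minus the probability that every aspect fails."""
    return 1.0 - prod(1.0 - p for p in probs)


def relevance(aspects: Dict[str, float]) -> float:
    """Hypothetical criterion: disease AND gene AND treatment.

    Each value is assumed to be the output of a separate aspect classifier.
    """
    return soft_and([aspects["disease"], aspects["gene"], aspects["treatment"]])


def rerank(candidates: Dict[str, Dict[str, float]]) -> List[Tuple[str, float]]:
    """Sort first-stage candidates by combined relevance score, highest first."""
    scored = [(doc_id, relevance(a)) for doc_id, a in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up aspect probabilities for three candidate documents.
candidates = {
    "doc_a": {"disease": 0.9, "gene": 0.8, "treatment": 0.5},
    "doc_b": {"disease": 0.6, "gene": 0.9, "treatment": 0.9},
    "doc_c": {"disease": 0.95, "gene": 0.1, "treatment": 0.9},
}
ranking = rerank(candidates)
for doc_id, score in ranking:
    print(f"{doc_id}: {score:.3f}")
```

Because the final score is a fixed function of named aspect predictions, a low rank can be traced back to the aspect that failed (here, `doc_c` scores low because of its `gene` probability), which is the sense in which such a model is explainable. `soft_or` is included only to show how a disjunctive criterion could be handled under the same assumptions.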

Metadata
Title
A Deep Analysis of an Explainable Retrieval Model for Precision Medicine Literature Search
Authors
Jiaming Qu
Jaime Arguello
Yue Wang
Copyright year
2021
DOI
https://doi.org/10.1007/978-3-030-72113-8_36
