Published in: Discover Computing 3-4/2019

08.11.2018 | Knowledge Graphs and Semantics in Text Analysis and Retrieval

Payoffs and pitfalls in using knowledge-bases for consumer health search

Authors: Jimmy, Guido Zuccon, Bevan Koopman

Abstract

Consumer health search (CHS) is a challenging domain, with vocabulary mismatch and the need for considerable domain expertise hampering people's ability to formulate effective queries. We posit that using knowledge bases for query reformulation may help alleviate this problem. How to exploit knowledge bases for effective CHS is nontrivial, involving a swathe of key choices and design decisions (many of which are not explored in the literature). Here we rigorously and empirically evaluate the impact these different choices have on retrieval effectiveness. A state-of-the-art knowledge-base retrieval model, the Entity Query Feature Expansion model, was used to evaluate these choices, which include: which knowledge base to use (specialised vs. general purpose), how to construct the knowledge base, how to extract entities from queries and map them to entities in the knowledge base, what part of the knowledge base to use for query expansion, and whether to augment the knowledge base search process with relevance feedback. While knowledge base retrieval has been proposed as a solution for CHS, this paper delves into the finer details of doing this effectively, highlighting both payoffs and pitfalls. It aims to provide some lessons to others in advancing the state of the art in CHS.
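To make the pipeline choices listed in the abstract more concrete, the following is a minimal sketch of knowledge-base query expansion. It is not the Entity Query Feature Expansion model and not the authors' implementation: the toy knowledge base `TOY_KB`, the function names `link_entities` and `expand_query`, the exact n-gram matching, and the `expansion_weight` value are all assumptions chosen for illustration.

```python
# Hypothetical toy knowledge base: entity title -> fields usable for expansion.
TOY_KB = {
    "myocardial infarction": {
        "aliases": ["heart attack"],
        "categories": ["cardiovascular disease"],
    },
    "hypertension": {
        "aliases": ["high blood pressure"],
        "categories": ["cardiovascular disease"],
    },
}


def link_entities(query, kb):
    """Return entity titles whose title or alias exactly equals a query n-gram."""
    tokens = query.lower().split()
    # All contiguous word n-grams of the query.
    ngrams = {
        " ".join(tokens[i:j])
        for i in range(len(tokens))
        for j in range(i + 1, len(tokens) + 1)
    }
    linked = []
    for title, fields in kb.items():
        surface_forms = [title] + fields.get("aliases", [])
        if any(form in ngrams for form in surface_forms):
            linked.append(title)
    return linked


def expand_query(query, kb, sources=("title", "aliases"), expansion_weight=0.5):
    """Build a weighted bag of terms from the query and its linked entities.

    Original query terms get weight 1.0; terms drawn from the chosen
    knowledge-base fields get the (assumed) lower expansion_weight.
    """
    weights = {term: 1.0 for term in query.lower().split()}
    for title in link_entities(query, kb):
        texts = [title] if "title" in sources else []
        for source in sources:
            if source != "title":
                texts.extend(kb[title].get(source, []))
        for text in texts:
            for term in text.split():
                # Keep the higher weight if the term is already present.
                weights[term] = max(weights.get(term, 0.0), expansion_weight)
    return weights
```

In this sketch, the `sources` argument stands in for the choice of which part of the knowledge base feeds the expansion, and swapping `TOY_KB` for a different dictionary stands in for the choice between a specialised and a general-purpose knowledge base. Relevance feedback is deliberately left out.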

Footnotes
1. The Unified Medical Language System (UMLS) is a compendium of many controlled vocabularies in the biomedical sciences.
4. A Wikipedia Infobox is used to summarise important aspects of an entity and its relation with other articles.
6. A Wikipedia Infobox is used to summarise important aspects of an entity and its relation with other articles.
8. Only complete string matches were considered.
9. ECNU-2 had the highest effectiveness, but it used the Google query suggestion service to obtain expansions.

Metadata
Title
Payoffs and pitfalls in using knowledge-bases for consumer health search
Authors
Jimmy
Guido Zuccon
Bevan Koopman
Publication date
08.11.2018
Publisher
Springer Netherlands
Published in
Discover Computing / Issue 3-4/2019
Print ISSN: 2948-2984
Electronic ISSN: 2948-2992
DOI
https://doi.org/10.1007/s10791-018-9344-z