2016 | OriginalPaper | Chapter

Understandability Biased Evaluation for Information Retrieval

Author: Guido Zuccon

Published in: Advances in Information Retrieval

Publisher: Springer International Publishing

Abstract

Although relevance is known to be a multidimensional concept, information retrieval measures mainly consider one dimension of relevance: topicality. In this paper we propose a method to integrate multiple dimensions of relevance in the evaluation of information retrieval systems. This is done within the gain-discount evaluation framework, which underlies measures like rank-biased precision (RBP), cumulative gain, and expected reciprocal rank. Although the proposal is general and applicable to any dimension of relevance, we study specific instantiations of the approach for evaluating retrieval systems with respect to both the topicality and the understandability of retrieved documents. This leads to the formulation of understandability biased evaluation measures based on RBP. We study these measures using both simulated experiments and real human assessments. The findings show that considering both understandability and topicality in the evaluation of retrieval systems leads to claims about system effectiveness that differ from those obtained when considering topicality alone.
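
As a rough illustration of the idea in the abstract, the sketch below computes standard RBP and one possible understandability biased variant. It assumes that the two dimensions combine multiplicatively in the gain at each rank, which the abstract itself does not state; the function names, the persistence value `rho = 0.8`, and the example scores are hypothetical.

```python
from typing import Sequence


def rbp(relevance: Sequence[float], rho: float = 0.8) -> float:
    """Rank-biased precision: (1 - rho) * sum over ranks k of rho^(k-1) * r_k."""
    # enumerate() starts at 0, so rho ** i is the discount for rank i + 1.
    return (1.0 - rho) * sum(r * rho ** i for i, r in enumerate(relevance))


def understandability_biased_rbp(
    relevance: Sequence[float],
    understandability: Sequence[float],
    rho: float = 0.8,
) -> float:
    """RBP with the gain at each rank scaled by an understandability score.

    Assumption (not taken from the abstract): the gain at rank k is the
    product of the topical relevance and the understandability of the
    document at that rank, both expressed in [0, 1].
    """
    if len(relevance) != len(understandability):
        raise ValueError("both score lists must cover the same ranking")
    pairs = zip(relevance, understandability)
    return (1.0 - rho) * sum(r * u * rho ** i for i, (r, u) in enumerate(pairs))


# Hypothetical ranking of four documents.
topical = [1.0, 0.0, 1.0, 1.0]          # binary topical relevance
understandable = [1.0, 1.0, 0.0, 0.5]   # understandability scores in [0, 1]

rbp_score = rbp(topical)
urbp_score = understandability_biased_rbp(topical, understandable)
```

In this made-up ranking the relevant document at rank 3 contributes nothing to the biased score because its understandability is 0, so the biased score is lower than plain RBP. This is the kind of difference in system effectiveness claims that the abstract refers to.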

Footnotes
1
Although there is no requirement for this to be the case and RBP can be used for graded relevance [17].
 
2
Where P(R|d@k) captures either binary (P(R|d@k) either 0 or 1) or graded relevance and max(P(R|d)) is the highest relevance grade, e.g., 1 in case of binary relevance.
 
3
High values representing persistent users, low values representing impatient users.
 
4
Obtained from the CLEF eHealth repository, https://github.com/CLEFeHealth.
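
Footnotes 2 and 3 can be illustrated with a small sketch. It assumes that footnote 2 describes dividing a relevance grade by the highest grade to obtain a gain in [0, 1], and that footnote 3 refers to the RBP persistence parameter, for which the expected number of documents examined is 1 / (1 - rho) under the RBP user model. The function names are hypothetical.

```python
def normalised_gain(grade: float, max_grade: float) -> float:
    """Gain in [0, 1]: the relevance grade divided by the highest grade.

    With binary relevance max_grade is 1, so the gain is the grade itself.
    Assumption: normalisation is a plain division, as footnote 2 suggests.
    """
    if max_grade <= 0:
        raise ValueError("max_grade must be positive")
    return grade / max_grade


def expected_depth(rho: float) -> float:
    """Expected number of documents an RBP user examines: 1 / (1 - rho)."""
    if not 0.0 <= rho < 1.0:
        raise ValueError("rho must lie in [0, 1)")
    return 1.0 / (1.0 - rho)


# A persistent user (high rho) reads deeper than an impatient one (low rho).
impatient_depth = expected_depth(0.5)
persistent_depth = expected_depth(0.95)
half_gain = normalised_gain(1, 2)  # grade 1 on a 0..2 scale
```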
 
Literature
1.
Ahmed, O.H., Sullivan, S.J., Schneiders, A.G., McCrory, P.R.: Concussion information online: evaluation of information quality, content and readability of concussion-related websites. Br. J. Sports Med. 46(9), 675–683 (2012)
2.
Barry, C.L.: User-defined relevance criteria: an exploratory study. JASIS 45(3), 149–159 (1994)
3.
Bruza, P.D., Zuccon, G., Sitbon, L.: Modelling the information seeking user by the decision they make. In: Proceedings of MUBE, pp. 5–6 (2013)
4.
Carterette, B.: System effectiveness, user models, and user utility: a conceptual framework for investigation. In: Proceedings of SIGIR, pp. 903–912 (2011)
5.
Clarke, C.L., Craswell, N., Soboroff, I., Ashkan, A.: A comparative analysis of cascade measures for novelty and diversity. In: Proceedings of WSDM, pp. 75–84 (2011)
6.
Clarke, C.L., Kolla, M., Cormack, G.V., Vechtomova, O., Ashkan, A., Büttcher, S., MacKinnon, I.: Novelty and diversity in information retrieval evaluation. In: Proceedings of SIGIR, pp. 659–666 (2008)
7.
Collins-Thompson, K., Callan, J.: Predicting reading difficulty with statistical language models. JASIST 56(13), 1448–1462 (2005)
8.
Cosijn, E., Ingwersen, P.: Dimensions of relevance. IP&M 36(4), 533–550 (2000)
9.
Cuadra, C.A., Katter, R.V.: Opening the black box of ‘relevance’. J. Doc. 23(4), 291–303 (1967)
10.
Eisenberg, M., Barry, C.: Order effects: a study of the possible influence of presentation order on user judgments of document relevance. JASIS 39(5), 293–300 (1988)
11.
Friedman, D.B., Hoffman-Goetz, L., Arocha, J.F.: Health literacy and the world wide web: comparing the readability of leading incident cancers on the internet. Inf. Health Soc. Care 31(1), 67–87 (2006)
12.
Goeuriot, L., Jones, G., Kelly, L., Leveling, J., Hanbury, A., Müller, H., Salanterä, S., Suominen, H., Zuccon, G.: ShARe/CLEF eHealth Evaluation Lab 2013, Task 3: Information retrieval to address patients’ questions when reading clinical reports. In: Proceedings of CLEF (2013)
13.
Goeuriot, L., Kelly, L., Lee, W., Palotti, J., Pecina, P., Zuccon, G., Hanbury, A., Gareth, H.M., Jones, J.F.: ShARe, CLEF eHealth Evaluation Lab 2014, Task 3: User-centred health information retrieval. In: Proceedings of CLEF Sheffield, UK (2014)
14.
Larsson, P.: Classification into readability levels: implementation and evaluation. PhD thesis, Uppsala University (2006)
15.
McCallum, D.R., Peterson, J.L.: Computer-based readability indexes. In: Proceedings of the ACM Conference, pp. 44–48 (1982)
16.
17.
Moffat, A., Zobel, J.: Rank-biased precision for measurement of retrieval effectiveness. TOIS 27(1), 2 (2008)
18.
Palotti, J., Zuccon, G., Goeuriot, L., Kelly, L., Hanbury, A., Jones, G.J., Lupu, M., Pecina, P.: CLEF eHealth evaluation lab 2015, task 2: Retrieving information about medical symptoms. In: Proceedings of CLEF (2015)
19.
Palotti, J., Zuccon, G., Hanbury, A.: The influence of pre-processing on the estimation of readability of web documents. In: Proceedings of CIKM (2015)
20.
Rees, A.M., Schultz, D.G.: A field experimental approach to the study of relevance assessments in relation to document searching. Technical report, Case Western Reserve University (1967)
21.
Robertson, S.E.: The probability ranking principle in IR. J. Doc. 33(4), 294–304 (1977)
22.
Sakai, T., Song, R.: Evaluating diversified search results using per-intent graded relevance. In: Proceedings of SIGIR, pp. 1043–1052 (2011)
23.
Saracevic, T.: The stratified model of information retrieval interaction: extension and applications. Proceedings of ASIS, vol. 34, pp. 313–327 (1997)
24.
Schamber, L., Eisenberg, M.: Relevance: the search for a definition. In: Proceedings of ASIS (1988)
25.
Smucker, M.D., Clarke, C.L.: Time-based calibration of effectiveness measures. In: Proceedings of SIGIR, pp. 95–104 (2012)
26.
Walsh, T.M., Volsko, T.A.: Readability assessment of internet-based consumer health information. Respir. Care 53(10), 1310–1315 (2008)
27.
Wiener, R.C., Wiener-Pla, R.: Literacy, pregnancy and potential oral health changes: the internet and readability levels. Matern. Child Health J. 1–6 (2013)
28.
Xu, Y.C., Chen, Z.: Relevance judgment: what do information users consider beyond topicality? JASIST 57(7), 961–973 (2006)
29.
Yan, X., Song, D., Li, X.: Concept-based document readability in domain specific information retrieval. In: Proceedings of CIKM, pp. 540–549 (2006)
30.
Yilmaz, E., Aslam, J.A., Robertson, S.: A new rank correlation coefficient for information retrieval. In: Proceedings of SIGIR, pp. 587–594 (2008)
31.
Zhai, C.X., Cohen, W.W., Lafferty, J.: Beyond independent relevance: methods and evaluation metrics for subtopic retrieval. In: Proceedings of SIGIR, pp. 10–17 (2003)
32.
Zhang, Y., Park, L.A., Moffat, A.: Click-based evidence for decaying weight distributions in search effectiveness metrics. Inf. Retrieval 13(1), 46–69 (2010)
33.
Zhang, Y., Zhang, J., Lease, M., Gwizdka, J.: Multidimensional relevance modeling via psychometrics and crowdsourcing. In: Proceedings of SIGIR, pp. 435–444 (2014)
34.
Zuccon, G., Koopman, B.: Integrating understandability in the evaluation of consumer health search engines. Proceedings of MedIR, pp. 32–35 (2014)
35.
Zuccon, G., Koopman, B., Palotti, J.: Diagnose this if you can. In: Hanbury, A., Kazai, G., Rauber, A., Fuhr, N. (eds.) ECIR 2015. LNCS, vol. 9022, pp. 562–567. Springer, Heidelberg (2015)
Metadata
Title
Understandability Biased Evaluation for Information Retrieval
Author
Guido Zuccon
Copyright Year
2016
Publisher
Springer International Publishing
DOI
https://doi.org/10.1007/978-3-319-30671-1_21