2016 | Original Paper | Book Chapter

Understandability Biased Evaluation for Information Retrieval

Author: Guido Zuccon

Published in: Advances in Information Retrieval

Publisher: Springer International Publishing

Abstract

Although relevance is known to be a multidimensional concept, information retrieval measures mainly consider one dimension of relevance: topicality. In this paper, we propose a method to integrate multiple dimensions of relevance in the evaluation of information retrieval systems. This is done within the gain-discount evaluation framework, which underlies measures like rank-biased precision (RBP), cumulative gain, and expected reciprocal rank. Although the proposal is general and applicable to any dimension of relevance, we study specific instantiations of the approach in the context of evaluating retrieval systems with respect to both the topicality and the understandability of retrieved documents. This leads to the formulation of understandability biased evaluation measures based on RBP. We study these measures using both simulated experiments and real human assessments. The findings show that considering both understandability and topicality in the evaluation of retrieval systems leads to claims about system effectiveness that differ from those obtained when considering topicality alone.
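
The abstract names RBP and the gain-discount framework without giving formulas, so the following is only a minimal sketch of how an understandability biased variant of RBP might be computed. It uses the standard RBP definition from Moffat and Zobel [17], (1 - p) * sum over ranks k of p^(k-1) * gain at k. Treating the gain as the product of a topicality score and an understandability score is an assumption made here for illustration, as are the function names `rbp` and `understandability_biased_rbp`, the parameter name `persistence`, and the toy scores.

```python
from typing import Sequence


def rbp(gains: Sequence[float], persistence: float = 0.8) -> float:
    """Rank-biased precision: (1 - p) * sum_k p^(k-1) * gain_k, ranks start at 1."""
    if not 0.0 <= persistence < 1.0:
        raise ValueError("persistence must be in [0, 1)")
    # enumerate() starts at 0, which matches the exponent k - 1 for rank k.
    return (1.0 - persistence) * sum(
        gain * persistence ** i for i, gain in enumerate(gains)
    )


def understandability_biased_rbp(
    topicality: Sequence[float],
    understandability: Sequence[float],
    persistence: float = 0.8,
) -> float:
    """Hypothetical variant: gain at each rank is topicality * understandability.

    Both inputs are assumed to be per-rank scores in [0, 1].
    """
    if len(topicality) != len(understandability):
        raise ValueError("both score lists must cover the same ranks")
    gains = [t * u for t, u in zip(topicality, understandability)]
    return rbp(gains, persistence)


# Toy ranking of three documents (made-up scores, not data from the paper).
topical = [1.0, 0.0, 1.0]
understandable = [1.0, 1.0, 0.5]

print(rbp(topical, persistence=0.5))                                   # 0.625
print(understandability_biased_rbp(topical, understandable, 0.5))     # 0.5625
```

In this toy ranking the third document is topical but only half understandable, so the combined score falls below the topicality-only score. A ranking that places hard-to-read documents lower would be penalised less, which illustrates how the two evaluations can disagree about which system is better.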

Footnotes
1
Although there is no requirement for this to be the case and RBP can be used for graded relevance [17].
 
2
Where P(R|d@k) captures either binary (P(R|d@k) either 0 or 1) or graded relevance and max(P(R|d)) is the highest relevance grade, e.g., 1 in case of binary relevance.
 
3
High values representing persistent users, low values representing impatient users.
 
4
Obtained from the CLEF eHealth repository, https://github.com/CLEFeHealth.
 
References
1. Ahmed, O.H., Sullivan, S.J., Schneiders, A.G., McCrory, P.R.: Concussion information online: evaluation of information quality, content and readability of concussion-related websites. Br. J. Sports Med. 46(9), 675–683 (2012)
2. Barry, C.L.: User-defined relevance criteria: an exploratory study. JASIS 45(3), 149–159 (1994)
3. Bruza, P.D., Zuccon, G., Sitbon, L.: Modelling the information seeking user by the decision they make. In: Proceedings of MUBE, pp. 5–6 (2013)
4. Carterette, B.: System effectiveness, user models, and user utility: a conceptual framework for investigation. In: Proceedings of SIGIR, pp. 903–912 (2011)
5. Clarke, C.L., Craswell, N., Soboroff, I., Ashkan, A.: A comparative analysis of cascade measures for novelty and diversity. In: Proceedings of WSDM, pp. 75–84 (2011)
6. Clarke, C.L., Kolla, M., Cormack, G.V., Vechtomova, O., Ashkan, A., Büttcher, S., MacKinnon, I.: Novelty and diversity in information retrieval evaluation. In: Proceedings of SIGIR, pp. 659–666 (2008)
7. Collins-Thompson, K., Callan, J.: Predicting reading difficulty with statistical language models. JASIST 56(13), 1448–1462 (2005)
8. Cosijn, E., Ingwersen, P.: Dimensions of relevance. IP&M 36(4), 533–550 (2000)
9. Cuadra, C.A., Katter, R.V.: Opening the black box of ‘relevance’. J. Doc. 23(4), 291–303 (1967)
10. Eisenberg, M., Barry, C.: Order effects: a study of the possible influence of presentation order on user judgments of document relevance. JASIS 39(5), 293–300 (1988)
11. Friedman, D.B., Hoffman-Goetz, L., Arocha, J.F.: Health literacy and the world wide web: comparing the readability of leading incident cancers on the internet. Inf. Health Soc. Care 31(1), 67–87 (2006)
12. Goeuriot, L., Jones, G., Kelly, L., Leveling, J., Hanbury, A., Müller, H., Salanterä, S., Suominen, H., Zuccon, G.: ShARe/CLEF eHealth Evaluation Lab 2013, Task 3: Information retrieval to address patients’ questions when reading clinical reports. In: Proceedings of CLEF (2013)
13. Goeuriot, L., Kelly, L., Lee, W., Palotti, J., Pecina, P., Zuccon, G., Hanbury, A., Gareth, H.M., Jones, J.F.: ShARe/CLEF eHealth Evaluation Lab 2014, Task 3: User-centred health information retrieval. In: Proceedings of CLEF Sheffield, UK (2014)
14. Larsson, P.: Classification into readability levels: implementation and evaluation. PhD thesis, Uppsala University (2006)
15. McCallum, D.R., Peterson, J.L.: Computer-based readability indexes. In: Proceedings of the ACM Conference, pp. 44–48 (1982)
16. Mizzaro, S.: Relevance: the whole history. JASIS 48(9), 810–832 (1997)
17. Moffat, A., Zobel, J.: Rank-biased precision for measurement of retrieval effectiveness. TOIS 27(1), 2 (2008)
18. Palotti, J., Zuccon, G., Goeuriot, L., Kelly, L., Hanbury, A., Jones, G.J., Lupu, M., Pecina, P.: CLEF eHealth evaluation lab 2015, task 2: Retrieving information about medical symptoms. In: Proceedings of CLEF (2015)
19. Palotti, J., Zuccon, G., Hanbury, A.: The influence of pre-processing on the estimation of readability of web documents. In: Proceedings of CIKM (2015)
20. Rees, A.M., Schultz, D.G.: A field experimental approach to the study of relevance assessments in relation to document searching. Technical report, Case Western Reserve University (1967)
21. Robertson, S.E.: The probability ranking principle in IR. J. Doc. 33(4), 294–304 (1977)
22. Sakai, T., Song, R.: Evaluating diversified search results using per-intent graded relevance. In: Proceedings of SIGIR, pp. 1043–1052 (2011)
23. Saracevic, T.: The stratified model of information retrieval interaction: extension and applications. In: Proceedings of ASIS, vol. 34, pp. 313–327 (1997)
24. Schamber, L., Eisenberg, M.: Relevance: the search for a definition. In: Proceedings of ASIS (1988)
25. Smucker, M.D., Clarke, C.L.: Time-based calibration of effectiveness measures. In: Proceedings of SIGIR, pp. 95–104 (2012)
26. Walsh, T.M., Volsko, T.A.: Readability assessment of internet-based consumer health information. Respir. Care 53(10), 1310–1315 (2008)
27. Wiener, R.C., Wiener-Pla, R.: Literacy, pregnancy and potential oral health changes: the internet and readability levels. Matern. Child Health J. 1–6 (2013)
28. Xu, Y.C., Chen, Z.: Relevance judgment: what do information users consider beyond topicality? JASIST 57(7), 961–973 (2006)
29. Yan, X., Song, D., Li, X.: Concept-based document readability in domain specific information retrieval. In: Proceedings of CIKM, pp. 540–549 (2006)
30. Yilmaz, E., Aslam, J.A., Robertson, S.: A new rank correlation coefficient for information retrieval. In: Proceedings of SIGIR, pp. 587–594 (2008)
31. Zhai, C.X., Cohen, W.W., Lafferty, J.: Beyond independent relevance: methods and evaluation metrics for subtopic retrieval. In: Proceedings of SIGIR, pp. 10–17 (2003)
32. Zhang, Y., Park, L.A., Moffat, A.: Click-based evidence for decaying weight distributions in search effectiveness metrics. Inf. Retrieval 13(1), 46–69 (2010)
33. Zhang, Y., Zhang, J., Lease, M., Gwizdka, J.: Multidimensional relevance modeling via psychometrics and crowdsourcing. In: Proceedings of SIGIR, pp. 435–444 (2014)
34. Zuccon, G., Koopman, B.: Integrating understandability in the evaluation of consumer health search engines. In: Proceedings of MedIR, pp. 32–35 (2014)
35. Zuccon, G., Koopman, B., Palotti, J.: Diagnose this if you can. In: Hanbury, A., Kazai, G., Rauber, A., Fuhr, N. (eds.) ECIR 2015. LNCS, vol. 9022, pp. 562–567. Springer, Heidelberg (2015)
Metadata
Title
Understandability Biased Evaluation for Information Retrieval
Author
Guido Zuccon
Copyright year
2016
Publisher
Springer International Publishing
DOI
https://doi.org/10.1007/978-3-319-30671-1_21
