Published in: Discover Computing 5/2010

1 October 2010 | Focused Retrieval and Result Aggr.

Expected reading effort in focused retrieval evaluation

Authors: Paavo Arvola, Jaana Kekäläinen, Marko Junkkari

Abstract

This study introduces a novel framework for evaluating passage and XML retrieval. The framework focuses on the user’s effort to locate relevant content in a result document. Effort is measured against a system-guided reading order of the document and is calculated as the quantity of text the user is expected to browse through. More specifically, the study seeks evaluation metrics for retrieval methods that follow a fetch-and-browse approach. In the fetch phase, documents are ranked in decreasing order of their document score, as in document retrieval. In the browse phase, a set of non-overlapping passages representing the relevant text within each retrieved document is retrieved. In other words, the passages of the document are reorganized so that the best-matching passages are read first, in sequential order. We introduce an application scenario that motivates the framework and propose sample metrics based on it. These metrics provide a basis for comparing the effectiveness of traditional document retrieval with that of passage/XML retrieval, and they illuminate the benefit of passage/XML retrieval.
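
The idea of measuring effort as the amount of text browsed under a system-guided reading order can be illustrated with a small sketch. This is not the metric defined in the article: the function names, the character-level granularity, the rule that text outside the retrieved passages is read sequentially afterwards, and the choice to stop at the first relevant character are all assumptions made for illustration only.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # half-open character range [start, end); assumed representation


def reading_order(doc_len: int, passages: List[Span]) -> List[int]:
    """Character positions in the order the user is assumed to read them.

    Assumption: retrieved passages are read first, in the order given by the
    system; the rest of the document is then read from start to end.
    """
    order: List[int] = []
    seen = set()
    for start, end in passages:
        for pos in range(start, end):
            if pos not in seen:
                seen.add(pos)
                order.append(pos)
    # Remaining text is read sequentially after the retrieved passages.
    order.extend(pos for pos in range(doc_len) if pos not in seen)
    return order


def effort_to_first_relevant(doc_len: int, passages: List[Span],
                             relevant: List[Span]) -> int:
    """Number of characters read until the first relevant character is seen.

    Hypothetical effort measure; returns doc_len if nothing is relevant.
    """
    relevant_pos = {p for start, end in relevant for p in range(start, end)}
    for n_read, pos in enumerate(reading_order(doc_len, passages), start=1):
        if pos in relevant_pos:
            return n_read
    return doc_len


# A 1000-character document whose relevant text spans characters 600-699.
guided = effort_to_first_relevant(1000, [(580, 650)], [(600, 700)])
baseline = effort_to_first_relevant(1000, [], [(600, 700)])
print(guided, baseline)
```

With an empty passage list the reading order falls back to plain start-to-end reading, which stands in for traditional document retrieval; comparing the two values shows how a well-placed passage reduces the text a user must browse before reaching relevant content.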

Metadata
Title: Expected reading effort in focused retrieval evaluation
Authors: Paavo Arvola, Jaana Kekäläinen, Marko Junkkari
Publication date: 1 October 2010
Publisher: Springer Netherlands
Published in: Discover Computing, Issue 5/2010
Print ISSN: 2948-2984
Electronic ISSN: 2948-2992
DOI: https://doi.org/10.1007/s10791-010-9133-9
