
2010 | Book

ImageCLEF

Experimental Evaluation in Visual Information Retrieval

Edited by: Henning Müller, Paul Clough, Thomas Deselaers, Barbara Caputo

Publisher: Springer Berlin Heidelberg

Book series: The Information Retrieval Series


About this Book

The pervasive creation and consumption of content, especially visual content, is ingrained into our modern world. We’re constantly consuming visual media content, in printed form and in digital form, in work and in leisure pursuits. Like our cave-man forefathers, we use pictures to record things which are of importance to us as memory cues for the future, but nowadays we also use pictures and images to document processes; we use them in engineering, in art, in science, in medicine, in entertainment and we also use images in advertising. Moreover, when images are in digital format, either scanned from an analogue format or more often than not born digital, we can use the power of our computing and networking to exploit images to great effect. Most of the technical problems associated with creating, compressing, storing, transmitting, rendering and protecting image data are already solved. We use accepted standards and have tremendous infrastructure and the only outstanding challenges, apart from managing the scale issues associated with growth, are to do with locating images. That involves analysing them to determine their content, classifying them into related groupings, and searching for images. To overcome these challenges we currently rely on image metadata, the description of the images, either captured automatically at creation time or manually added afterwards.

Table of Contents

Frontmatter

Introduction

Frontmatter
Chapter 1. Seven Years of Image Retrieval Evaluation
Abstract
In this chapter we discuss evaluation of Information Retrieval (IR) systems and in particular ImageCLEF, a large–scale evaluation campaign that has produced several publicly–accessible resources required for evaluating visual information retrieval systems and is the focus of this book. This chapter sets the scene for the book by describing the purpose of system and user–centred evaluation, the purpose of test collections, the role of evaluation campaigns such as TREC and CLEF, our motivations for starting ImageCLEF and then a summary of the tracks run over the seven years (data, tasks and participants). The chapter will also provide an insight into lessons learned and experiences gained over the years spent organising ImageCLEF, and a summary of the main highlights.
Paul Clough, Henning Müller, Mark Sanderson
Chapter 2. Data Sets Created in ImageCLEF
Abstract
One of the main components of any Text REtrieval Conference (TREC)–style information retrieval benchmark is a collection of documents, such as images, texts, sounds or videos that is representative of a particular domain. Although many image collections exist both on–line and off–line, finding visual resources suitable for evaluation benchmarks such as ImageCLEF is challenging. For example, these resources are often expensive to purchase and subject to specific copyright licenses, restricting both the distribution and future access of such data for evaluation purposes. However, the various ImageCLEF evaluation tasks have managed to create and/or acquire almost a dozen document collections since 2003. This chapter begins by discussing the requirements and specifications for creating a suitable document collection for evaluating multi–modal and cross–lingual image retrieval systems. It then describes each of the eleven document collections created and used for ImageCLEF tasks between 2003 and 2009. The description includes the origins of each document collection, a summary of its content, as well as details regarding the distribution, benefits and limitations of each resource.
Michael Grubinger, Stefanie Nowak, Paul Clough
Chapter 3. Creating Realistic Topics for Image Retrieval Evaluation
Abstract
This chapter describes the various ways of creating realistic query topics in the context of image retrieval evaluation campaigns such as ImageCLEF. A short overview covers general approaches to creating topics, from purely laboratory–style evaluations based on the technical capabilities of systems to real–world applications with real end users. The chapter offers help to those planning to evaluate systems on how to develop challenging and realistic topics based on knowledge of the users and of the capabilities of systems. The information sources used to create topics are detailed. The main analysis concerns the ImageCLEF tasks, and especially the medical retrieval tasks, where many different ways of creating topics have been analyzed over the years.
Henning Müller
Chapter 4. Relevance Judgments for Image Retrieval Evaluation
Abstract
In this chapter, we review our experiences with the relevance judging process at ImageCLEF, using the medical retrieval task as a primary example. We begin with a historical perspective on the Cranfield paradigm, the precursor after which most modern system–based evaluation campaigns, including ImageCLEF, are modeled. We then briefly describe the stages in an evaluation campaign and provide details of the different aspects of the relevance judgment process. We summarize the recruitment process and describe the various systems used for judgment at ImageCLEF. We discuss the advantages and limitations of creating pools that are then judged by human experts. Finally, we discuss our experiences with the subjectivity of the relevance process and the relative robustness of the performance measures to variability in relevance judging.
Jayashree Kalpathy–Cramer, Steven Bedrick, William Hersh
Chapter 5. Performance Measures Used in Image Information Retrieval
Abstract
Although during the running of the ImageCLEF tracks there was no explicit co–ordination on the types of evaluation measures employed, the same statistics were often used across ImageCLEF. Therefore, in this chapter, the range of measures used in the evaluation exercise is described. The original research defining a measure, together with their formulations and the relative pros and cons of the measures, are also detailed. Research that both compares the measures and attempts to determine the best is also outlined. Finally, the use of measures in the different tracks and years of ImageCLEF is tabulated.
Mark Sanderson
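As an editorial aside, two measures that recur in such evaluations, precision at a fixed rank (P@k) and average precision (whose mean over topics gives MAP), can be illustrated with a minimal Python sketch. The function names and the toy run below are our own, not taken from the book.

def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top-k retrieved documents that are relevant."""
    top_k = ranked_ids[:k]
    return sum(1 for doc in top_k if doc in relevant_ids) / k

def average_precision(ranked_ids, relevant_ids):
    """Mean of P@k taken at every rank where a relevant document appears."""
    if not relevant_ids:
        return 0.0
    hits, precisions = 0, []
    for rank, doc in enumerate(ranked_ids, start=1):
        if doc in relevant_ids:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(relevant_ids)

# Example: a run ranking two of the three relevant images at positions 1 and 3.
run = ["img7", "img2", "img5", "img9"]
qrels = {"img7", "img5", "img1"}
print(precision_at_k(run, qrels, 3))   # 0.666...
print(average_precision(run, qrels))   # (1/1 + 2/3) / 3 = 0.555...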
Chapter 6. Fusion Techniques for Combining Textual and Visual Information Retrieval
Abstract
This chapter describes several approaches for information fusion that have been used in ImageCLEF over the past seven years. In this context, the fusion of information is mainly meant to combine textual and visual retrieval. Data fusion techniques from 116 papers (62% of ImageCLEF working notes) are categorized, described and discussed. It was observed that three general approaches were used for retrieval, which can be categorized based on the system level chosen for combining modalities: 1) at the input of the system with inter–media query expansion, 2) internally to the system with early fusion and 3) at the output of the system with late fusion, which is by far the most widely used fusion strategy.
Adrien Depeursinge, Henning Müller
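As an editorial aside, the late fusion strategy identified above as the most widely used can be illustrated with a minimal Python sketch: a weighted sum of min–max normalised scores from a text run and a visual run. The helper names, the weight alpha and the toy scores are assumptions of ours, not code from the chapter.

def minmax_normalise(scores):
    """Map raw scores to [0, 1] so that the two modalities are comparable."""
    lo, hi = min(scores.values()), max(scores.values())
    span = (hi - lo) or 1.0
    return {doc: (s - lo) / span for doc, s in scores.items()}

def late_fusion(text_scores, visual_scores, alpha=0.7):
    """Weighted sum of normalised text and visual scores per document."""
    t = minmax_normalise(text_scores)
    v = minmax_normalise(visual_scores)
    docs = set(t) | set(v)
    fused = {d: alpha * t.get(d, 0.0) + (1 - alpha) * v.get(d, 0.0) for d in docs}
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)

text = {"a": 12.1, "b": 9.4, "c": 3.0}      # scores from a text engine
visual = {"b": 0.82, "c": 0.95, "d": 0.40}  # scores from a visual engine
print(late_fusion(text, visual))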

Track Reports

Frontmatter
Chapter 7. Interactive Image Retrieval
Abstract
Information access research is based on evaluation as its main vehicle: benchmarking procedures are regularly pursued by all contributors to the field. But benchmarking is only one half of evaluation: to validate the results the evaluation must include the study of user behaviour while performing tasks for which the system under consideration is intended. Designing and performing such studies systematically on research systems is a challenge, breaking the mould on how benchmarking evaluation can be performed and how results can be perceived. This is the key research question of interactive information retrieval. The question of evaluation has also come to the fore through applications moving from exclusively treating topic–oriented text to including other media, most notably images. This development challenges many of the underlying assumptions of topical text retrieval, and requires new evaluation frameworks, not unrelated to the questions raised by interactive study. This chapter describes how the interactive track of the Cross–Language Evaluation Forum (iCLEF) has addressed some of those theoretical and practical challenges.
Jussi Karlgren, Julio Gonzalo
Chapter 8. Photographic Image Retrieval
Abstract
CLEF was the first benchmarking campaign that organized an evaluation event for image retrieval: the ImageCLEF photographic ad hoc retrieval task in 2003. Since then, this task has become one of the most popular tasks of ImageCLEF, providing both the resources and a framework necessary to carry out comparative laboratory–style evaluation of multi–lingual visual information retrieval from photographic collections. Over its seven years, the task has presented participants with several challenges, including: retrieval from a collection of historic photographs; retrieval from a more generic collection with multi–lingual annotations; and retrieval from a large news archive, promoting result diversity. This chapter summarizes each of these tasks, describes the individual test collections and evaluation scenarios, analyzes the retrieval results, and discusses potential findings for a number of research questions.
Monica Lestari Paramita, Michael Grubinger
Chapter 9. The Wikipedia Image Retrieval Task
Abstract
The Wikipedia image retrieval task at ImageCLEF provides a test–bed for the system–oriented evaluation of visual information retrieval from a collection of Wikipedia images. The aim is to investigate the effectiveness of retrieval approaches that exploit textual and visual evidence in the context of a large and heterogeneous collection of images that are searched for by users with diverse information needs. This chapter presents an overview of the available test collections, summarises the retrieval approaches employed by the groups that participated in the task during the 2008 and 2009 ImageCLEF campaigns, provides an analysis of the main evaluation results, identifies best practices for effective retrieval, and discusses open issues.
Theodora Tsikrika, Jana Kludas
Chapter 10. The Robot Vision Task
Abstract
In 2009, ImageCLEF expanded its tasks with the introduction of the first robot vision challenge. The overall focus of the challenge is semantic localization of a robot platform using visual place recognition. This is a key topic of research in the robotics community today. This chapter presents the goals and achievements of the first edition of the robot vision task. We describe the task, the method of data collection used and the evaluation procedure. We give an overview of the obtained results and briefly highlight the most promising approaches. We then outline how the task will evolve in the near and distant future.
Andrzej Pronobis, Barbara Caputo
Chapter 11. Object and Concept Recognition for Image Retrieval
Abstract
ImageCLEF introduced its first automatic annotation task for photos in 2006. The visual object and concept detection task evolved over the years to become an inherent part of the yearly ImageCLEF evaluation cycle with growing interest and participation from the research community. Although the task can be solved purely visually, the incorporation of multi–modal information such as EXIF (Exchangeable Image File Format) data, concept hierarchies or concept relations is supported. In this chapter, the development, goals and achievements of four cycles of object and concept recognition for image retrieval are presented. This includes the task definitions and the participation of the research community. In addition, the approaches applied to solve the tasks and the lessons learnt are outlined. The results of all years are illustrated, compared and the most promising approaches are highlighted. Finally, the interactions with the photo retrieval task are presented.
Stefanie Nowak, Allan Hanbury, Thomas Deselaers
Chapter 12. The Medical Image Classification Task
Abstract
We describe the medical image classification task in ImageCLEF 2005–2009. It evolved from a classification task with 57 classes on a total of 10,000 images into a hierarchical classification task with a very large number of potential classes. Here, we describe how the database and the objectives changed over the years and how state–of–the–art approaches from machine learning and computer vision were shown to outperform the nearest neighbor-based classification schemes working on full–image descriptors that were very successful in 2005. In particular the use of discriminative classification methods such as support vector machines and the use of local image descriptors were empirically shown to be important building blocks for medical image classification.
Tatiana Tommasi, Thomas Deselaers
Chapter 13. The Medical Image Retrieval Task
Abstract
This chapter describes the medical image retrieval task of ImageCLEF, the image retrieval track of CLEF. The medical task has been running for six consecutive years, beginning in 2004. Participation has increased over the years to over 45 registrations for 2010. The query topics have also evolved over the years from a starting point of using images only, via clear visual and textual information needs, and now towards case descriptions to find similar cases. The primary goal of the task is to provide challenging research questions to the scientific community to advance medical visual information retrieval on standard databases. Databases have increased significantly in size over the years to keep pace with the growing demand. The results show that textual information retrieval of images is now much further developed and produces much better results than in past years. However, visual retrieval components such as pre–classifying the images (i.e. modality detection) or improving early precision of the retrieval results can lead to an overall improvement in retrieval performance in specific domains.
Henning Müller, Jayashree Kalpathy–Cramer

Participant Reports

Frontmatter
Chapter 14. Expansion and Re–ranking Approaches for Multimodal Image Retrieval using Text–based Methods
Abstract
In this chapter, we present an approach to handle multi–modality in image retrieval using a Vector Space Model (VSM), which is extensively used in text retrieval. We simply extend the model with visual terms, aiming to close the semantic gap by helping to map low–level features into high–level textual semantic concepts. Moreover, this combination of the textual and visual modalities into one space also makes it possible to query a textual database with visual content, or a visual database with textual content. In addition, to improve the performance of text retrieval we propose a novel expansion and re–ranking method, applied both to the documents and the query. When textual annotations of images are acquired automatically, they may contain too much information, and document expansion adds more noise to retrieval results. We propose a re–ranking phase to discard such noisy terms. The approaches introduced in this chapter were evaluated in two sub–tasks of ImageCLEF 2009. First, we tested the multi–modality part in ImageCLEFmed and obtained the best rank in mixed retrieval, which includes textual and visual modalities. Secondly, we tested the expansion and re–ranking methods in ImageCLEFWiki, where our results were superior to those of other participants, taking the top four positions in text–only retrieval. The results showed that handling multi–modality in text retrieval using a VSM is promising, and that document expansion and re–ranking play an important role in text–based image retrieval.
Adil Alpkocak, Deniz Kilinc, Tolga Berber
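As an editorial aside, one minimal way to read "extending the VSM with visual terms" is to treat quantised visual words (e.g. cluster indices of local descriptors) as extra tokens in the same bag–of–terms representation, so that standard cosine similarity applies unchanged. The Python sketch below is ours, not the authors' system; the "vw_" prefix and the toy data are illustrative assumptions.

import math
from collections import Counter

def to_vector(text_tokens, visual_word_ids):
    """One sparse vector over a shared textual + visual vocabulary."""
    tokens = list(text_tokens) + [f"vw_{i}" for i in visual_word_ids]
    return Counter(tokens)

def cosine(u, v):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm = math.sqrt(sum(x * x for x in u.values())) * \
           math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

doc = to_vector(["chest", "xray", "nodule"], [17, 17, 203])
qry = to_vector(["nodule"], [17])          # a mixed text + visual query
print(cosine(qry, doc))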
Chapter 15. Revisiting Sub–topic Retrieval in the ImageCLEF 2009 Photo Retrieval Task
Abstract
Ranking documents according to the Probability Ranking Principle has been theoretically shown to guarantee optimal retrieval effectiveness in tasks such as ad hoc document retrieval. This ranking strategy assumes independence among document relevance assessments. This assumption, however, often does not hold, for example in scenarios where redundancy in retrieved documents is of major concern, as is the case in the sub–topic retrieval task. In this chapter, we propose a new ranking strategy for sub–topic retrieval that builds upon the interdependent document relevance and topic–oriented models. With respect to the topic–oriented model, we investigate both static and dynamic clustering techniques, aiming to group topically similar documents. Evidence from clusters is then combined with information about document dependencies to form a new document ranking. We compare and contrast the proposed method against state–of–the–art approaches, such as Maximal Marginal Relevance, Portfolio Theory for Information Retrieval, and standard cluster–based diversification strategies. The empirical investigation is performed on the ImageCLEF 2009 Photo Retrieval collection, where images are assessed with respect to sub–topics of a more general query topic. The experimental results show that our approaches outperform the state–of–the–art strategies with respect to a number of diversity measures.
Teerapong Leelanupab, Guido Zuccon, Joemon M. Jose
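As an editorial aside, the Maximal Marginal Relevance baseline that the chapter compares against can be stated compactly: at each step it greedily picks the candidate that best trades relevance to the query against similarity to the documents already selected. The Python sketch below is our own (not the authors' proposed strategy); the function signature, parameter names and the assumption that relevance and similarity functions are given are ours.

def mmr(candidates, relevance, similarity, lam=0.5, k=10):
    """Greedy Maximal Marginal Relevance re-ranking.

    candidates  -- list of document ids
    relevance   -- dict doc -> relevance score w.r.t. the query
    similarity  -- callable (doc_a, doc_b) -> similarity in [0, 1]
    lam         -- trade-off between relevance (1.0) and diversity (0.0)
    """
    selected, pool = [], list(candidates)
    while pool and len(selected) < k:
        def mmr_score(d):
            # Penalise candidates similar to anything already selected.
            redundancy = max((similarity(d, s) for s in selected), default=0.0)
            return lam * relevance[d] - (1 - lam) * redundancy
        best = max(pool, key=mmr_score)
        selected.append(best)
        pool.remove(best)
    return selected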
Chapter 16. Knowledge Integration using Textual Information for Improving ImageCLEF Collections
Abstract
In this chapter we explain our participation at ImageCLEF from 2005 to 2009. During these years we have mainly developed systems for the ad hoc and the medical retrieval tasks. Although the different proposed tasks include both visual and textual information, the diverse approaches applied by the participants also include the use of only one type of information. The SINAI group specializes in the management of textual collections. For this reason, our main goal has been to improve the general system by taking advantage of the textual information.
Manuel Carlos Díaz–Galiano, Miguel Ángel García–Cumbreras, María Teresa Martín–Valdivia, Arturo Montejo-Ráez
Chapter 17. Leveraging Image, Text and Cross–media Similarities for Diversity–focused Multimedia Retrieval
Abstract
This chapter summarizes the different cross–modal information retrieval techniques Xerox Research Centre implemented during three years of participation in ImageCLEF Photo tasks. The main challenge remained constant: how to optimally couple visual and textual similarities, when they capture things at different semantic levels and when one of the media (the textual one) gives, most of the time, much better retrieval performance. Some core components turned out to be very effective all over the years: the visual similarity metrics based on Fisher Vector representation of images and the cross–media similarity principle based on relevance models. However, other components were introduced to solve additional issues: We tried different query– and document–enrichment methods by exploiting auxiliary resources such as Flickr or open–source thesauri, or by doing some statistical ‘semantic smoothing’. We also implemented some clustering mechanisms in order to promote diversity in the top results and to provide faster access to relevant information. This chapter describes, analyses and assesses each of these components, namely: the monomodal similarity measures, the different cross–media similarities, the query and document enrichment, and finally the mechanisms to ensure diversity in what is proposed to the user. To conclude, we discuss the numerous lessons we have learnt over the years by trying to solve this very challenging task.
Julien Ah-Pine, Stephane Clinchant, Gabriela Csurka, Florent Perronnin, Jean-Michel Renders
Chapter 18. University of Amsterdam at the Visual Concept Detection and Annotation Tasks
Abstract
Visual concept detection is important to access visual information on the level of objects and scene types. The current state–of–the–art in visual concept detection and annotation tasks is based on the bag–of–words model. Within the bag–of–words model, points are first sampled according to some strategy, then the areas around these points are described using color descriptors. These descriptors are then vector–quantized against a codebook of prototypical descriptors, which results in a fixed–length representation of the image. Based on these representations, visual concept models are trained. In this chapter, we discuss the design choices within the bag–of–words model and their implications for concept detection accuracy.
Koen E. A. van de Sande, Theo Gevers
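As an editorial aside, the bag–of–words pipeline outlined in this abstract (sample points, describe the surrounding regions, vector–quantise against a codebook, build a fixed–length histogram) can be sketched in a few lines of Python with NumPy. Random vectors stand in for real colour descriptors and the codebook is assumed to be given; none of this is the authors' code.

import numpy as np

def assign_to_codebook(descriptors, codebook):
    """Vector-quantise each local descriptor to its nearest codeword."""
    # (n, 1, d) - (1, k, d) -> pairwise distances of shape (n, k)
    dists = np.linalg.norm(descriptors[:, None, :] - codebook[None, :, :], axis=2)
    return dists.argmin(axis=1)

def bow_histogram(descriptors, codebook):
    """Fixed-length, L1-normalised histogram of codeword assignments."""
    words = assign_to_codebook(descriptors, codebook)
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / hist.sum()

rng = np.random.default_rng(0)
codebook = rng.normal(size=(256, 128))       # 256 prototypical descriptors
descriptors = rng.normal(size=(500, 128))    # local descriptors of one image
print(bow_histogram(descriptors, codebook).shape)   # (256,) per-image feature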
Chapter 19. Intermedia Conceptual Indexing
Abstract
Indexing textual and visual documents at the same conceptual level enables the building of an effective intermedia conceptual indexing. In this chapter we present models and techniques used to achieve this conceptual indexing level. Our experiments were done on ImageCLEF medical data, using the UMLS (Unified Medical Language System) source of concepts for text, and a manually built visual ontology. We have used the UMLS meta–thesaurus as an external resource for indexing text at a conceptual level. At this level, the query can be structured using domain dimensions. The visual ontology has been built using the most frequent concepts from UMLS, and cropped images from the Web and from the corpus itself. The system is then able to index text and images using the same conceptual set.
Jean–Pierre Chevallet, Joo Hwee Lim
Chapter 20. Conceptual Indexing Contribution to ImageCLEF Medical Retrieval Tasks
Abstract
In this chapter, we study conceptual indexing using a language modeling approach to information retrieval. In particular, we propose a conceptual representation of documents that allows the use of both concepts and labelled relations when matching documents and queries. Such semantic indexing gives effective results when large ontologies are used. We first present a model derived from the language modeling approach to information retrieval based on a conceptual representation of documents. We then introduce an extension to take into account relations between concepts. Concept and relation detection methods are, however, error–prone. We thus develop an approach to limit such errors by combining different methods. In order to illustrate various aspects of the model proposed, we conducted a series of experiments on various medical ImageCLEF collections. Our experiments in ImageCLEFmed show that the conceptual model proposed here provides good results in medical information retrieval. Experiments furthermore show that combining concept extraction methods through fusion improves the standard language model by up to 17% MAP on the medical ImageCLEF collections.
Loïc Maisonasse, Jean–Pierre Chevallet, Eric Gaussier
Chapter 21. Improving Early Precision in the ImageCLEF Medical Retrieval Task
Abstract
Oregon Health and Science University has participated in the ImageCLEFmed medical image retrieval task since 2005. Over the years of our participation, our focus has been on exploring the needs of medical end users, and developing retrieval strategies that address those needs. Given that many users of search systems never look beyond the first few results, we have attempted to emphasize early precision in the performance of our system. This chapter describes several of the approaches we have used to achieve this goal, along with the results we have seen in doing so.
Steven Bedrick, Saïd Radhouani, Jayashree Kalpathy–Cramer
Chapter 22. Lung Nodule Detection
Abstract
The quantity of digital medical images that must be reviewed by radiologists as part of routine clinical practice has greatly increased in recent years. New acquisition devices generate images that have higher spatial resolution, both in 2–D as well as 3–D, requiring physicians to use more sophisticated visualization tools. In addition, advanced visualization systems, designed to assist the radiologist, are now part of a standard arsenal of tools which, together with workflow improvements, aid the physicians in their clinical tasks. Computer–Assisted Diagnosis (CAD) systems are one such class of sophisticated tools, supporting radiologists in tedious and time–consuming tasks such as the detection of lesions. Over the past ten years, CAD systems have evolved to reach sensitivity capabilities equivalent to or exceeding those of a radiologist, thus becoming clinically acceptable, but with limited specificity, which necessitates their use as a second reader tool. This chapter presents one such system (LungCAD) designed for the detection of nodules in the lung parenchyma. Its performance was evaluated as part of a detection challenge organized by ImageCLEF 2009.
Luca Bogoni, Jinbo Bi, Charles Florin, Anna K. Jerebko, Arun Krishnan, Sangmin Park, Vikas Raykar, Marcos Salganicoff
Chapter 23. Medical Image Classification at Tel Aviv and Bar Ilan Universities
Abstract
We present an efficient and accurate image categorization system, applied to medical image databases within the ImageCLEF medical annotation task. The methodology is based on local representation of the image content, using a bag–of–visual–words approach. We explore the effect of different parameters on system performance, and show best results using dense sampling of simple features with spatial content in multiple scales, combined with a nonlinear kernel–based Support Vector Machine classifier. The system was ranked first in the ImageCLEF 2009 medical annotation challenge, with a total error score of 852.8.
Uri Avni, Jacob Goldberger, Hayit Greenspan
Chapter 24. Idiap on Medical Image Classification
Abstract
The team from the Idiap Research Institute in Martigny, Switzerland, participated in three editions of the CLEF medical image annotation task, consistently placing among the highest positions in the rankings. Here, we present in detail the successful strategies we used in the different editions of the challenge to address the inter– vs. intra–class image variability, to exploit the hierarchical labeling, and to cope with the unbalanced distribution of the classes.
Tatiana Tommasi, Francesco Orabona

External Views

Frontmatter
Chapter 25. Press Association Images — Image Retrieval Challenges
Abstract
In order to maximise the potential benefits of large repositories of digital images available both publicly and in private collections, intelligent information retrieval systems are required. Unfortunately, most image search engines rely on free–text search that often returns non–relevant results based on the occurrence of search keywords in text accompanying the images being matched purely at a lexical, rather than a semantic, level. In this chapter we report on ongoing work at Press Association Images on building a semantically–enabled image annotation and retrieval engine that relies on methodically structured ontologies for image annotation, thus allowing for more intelligent reasoning about the image content and subsequently improving the end–user browsing experience.
Martin Stephens, Dhavalkumar Thakker
Chapter 26. Image Retrieval in a Commercial Setting
Abstract
This chapter provides an overview of image retrieval in a commercial setting. It details the types of resources available to commercial systems in conducting image retrieval research, and the challenges in using such resources. In particular the chapter discusses user generated content, click data, and how to evaluate commercial image search systems. It ends with a discussion of the role of benchmark efforts such as ImageCLEF in this type of research.
Vanessa Murdock, Roelof van Zwol, Lluis Garcia, Ximena Olivares
Chapter 27. An Overview of Evaluation Campaigns in Multimedia Retrieval
Abstract
This chapter presents an academic and research perspective on the impact and importance of ImageCLEF and similar evaluation workshops in multimedia information retrieval (MIR). Three main themes are examined: the position of ImageCLEF compared with other evaluation conferences; general views on the usefulness of evaluation conferences and possible alternatives, and the impact and real–world meaning of evaluation metrics used within ImageCLEF. We examine the value of ImageCLEF, and related evaluation conferences, for the multimedia IR researcher as providing not only a forum for assessing and comparing outcomes but also serving to promote research aims, provide practical guidance (e.g. standard data sets) and inspire research directions.
Suzanne Little, Ainhoa Llorente, Stefan Rüger
Backmatter
Metadata
Title
ImageCLEF
Edited by
Henning Müller
Paul Clough
Thomas Deselaers
Barbara Caputo
Copyright year
2010
Publisher
Springer Berlin Heidelberg
Electronic ISBN
978-3-642-15181-1
Print ISBN
978-3-642-15180-4
DOI
https://doi.org/10.1007/978-3-642-15181-1
