Visual Indexing and Retrieval

Written by: Jenny Benois-Pineau, Frédéric Precioso, Matthieu Cord

Publisher: Springer New York

Book series: SpringerBriefs in Computer Science

Included in: Springer Professional "Wirtschaft+Technik", Springer Professional "Technik", Springer Professional "Wirtschaft"

About this book

Research in content-based indexing and retrieval of visual information such as images and video has become one of the most active directions in the vast area of information technologies. Social networks such as YouTube, Facebook, FileMobile, and DailyMotion host and supply facilities for accessing a tremendous amount of professional and user-generated data. Areas of societal activity such as video protection and security also generate thousands of terabytes of visual content. This book presents the most recent results and important trends in visual information indexing and retrieval. It is intended for young researchers as well as professionals looking for an algorithmic solution to a problem.

Table of Contents

Frontmatter

Chapter 1. Introduction

Abstract

Research in visual information indexing and retrieval has become one of the most popular directions in the broad area of information technologies. The reason is the technological maturity of capture, storage and network infrastructures, which allows images and video to be captured in daily life with both professional equipment and personal mobile devices. According to Internet sources, the British Broadcasting Corporation set up a team dedicated to processing user-generated content as an experimental group in April 2005, with a staff of three. The team was then made permanent and expanded, signalling the integration of the “citizen journalist” into mainstream broadcast news. CNN put the same concept in place when it launched CNN iReport in 2006, a project meant to allow CNN to collect user-generated news. So did the American Fox News with its “uReport” project and the French broadcast channel BFM-TV. YouTube, Facebook, FileMobile and DailyMotion host and supply facilities for accessing a tremendous amount of professional and user-generated content for educational and entertainment purposes. Areas of societal activity such as video surveillance and security also generate thousands of terabytes of video content, with specific issues to be tackled. Finally, the digitization and storage of cultural heritage, be it Byzantine frescoes, medieval miniatures, old manuscripts, feature films, documentaries, broadcast programs or websites, lead to the production of a mass of visual data that has to be accessed and searched both by professionals, for remastering and producing new visual content, and by ordinary users, for various humanities research.

Jenny Benois-Pineau, Frédéric Precioso, Matthieu Cord

Chapter 2. Visual feature extraction and description

Abstract

Ever since its beginnings in the early 1970s, pattern recognition has remained a research challenge, and it has become of paramount importance today. Machine learning methods now complement expert knowledge in choosing optimal feature sets for the image categories to be searched and recognized. This chapter provides an overview of feature extraction and description approaches for still images as well as for spatio-temporal data analysis.

Khalifa Djemal, Jean-Pierre Cocquerez, Frédéric Precioso

Chapter 3. Machine learning approaches for visual information retrieval

Abstract

In this chapter, we first describe the main stages for deriving an image representation from the local visual descriptors described in Chapter 2. The coding and pooling steps are detailed. We then briefly recall some of the most common (dis-)similarity measures between histograms, paying particular attention to a class of similarity functions called kernels, which we investigate in depth. We present several strategies for building similarity measures. These similarities can then either form the basis of a similarity search system or be integrated into more powerful machine learning frameworks to address classification, retrieval or detection tasks.

Frédéric Precioso, Matthieu Cord
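
The coding, pooling and histogram-similarity stages named in the Chapter 3 abstract can be illustrated with a minimal sketch. It is not taken from the book: it assumes hard-assignment coding, sum pooling with L1 normalisation, and two commonly used histogram kernels (histogram intersection and an exponential chi-square kernel). All function names, the toy codebook and the toy descriptors are hypothetical.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def encode_and_pool(descriptors, codebook):
    """Hard-assignment coding followed by sum pooling (assumed variant).

    Each local descriptor votes for its nearest visual word; the votes are
    summed into a histogram, which is then L1-normalised.
    """
    hist = [0.0] * len(codebook)
    for d in descriptors:
        nearest = min(range(len(codebook)), key=lambda k: sq_dist(d, codebook[k]))
        hist[nearest] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist


def histogram_intersection(h1, h2):
    """Histogram intersection kernel: sum of bin-wise minima."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def chi2_kernel(h1, h2, gamma=1.0):
    """Exponential chi-square kernel, using the convention without a 1/2 factor."""
    chi2 = sum((a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0)
    return math.exp(-gamma * chi2)


# Toy data (hypothetical): three visual words, 2-D descriptors.
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
image_a = [(0.1, 0.0), (0.9, 1.1), (1.0, 0.8), (5.2, 4.9)]
image_b = [(0.0, 0.2), (4.8, 5.1), (5.1, 5.0), (0.2, 0.1)]

hist_a = encode_and_pool(image_a, codebook)  # [0.25, 0.5, 0.25]
hist_b = encode_and_pool(image_b, codebook)  # [0.5, 0.0, 0.5]
print(histogram_intersection(hist_a, hist_b))
print(chi2_kernel(hist_a, hist_b))
```

Either kernel value could serve directly as a similarity score for ranking, or be passed to a kernel-based classifier; which is preferable depends on the data and is not something this sketch settles.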

Chapter 4. Spatial and multi-resolution context in visual indexing

Abstract

Recent trends in visual indexing have produced a large family of methods that use a local image representation via descriptors associated with interest points (see Chapter 2). Such approaches mostly "forget" any structure in the image, taking unordered sets of descriptors or their histograms as the image model. More advanced approaches therefore try to overcome this drawback by adding spatial arrangements to the interest points. In this chapter we present two trends in incorporating spatial context into visual description: considering spatial context when matching signatures on the one hand, and designing structural descriptors that are then used in a global Bag-of-Visual-Words (BoVW) approach on the other. As images and video are mainly available in compressed form, we briefly review global descriptors that are extracted from the compressed stream and are hence less sensitive to compression artifacts. Furthermore, building on the scalable, multi-resolution/multi-scale representation of visual content in modern compression standards, we study how this multi-resolution context can be efficiently incorporated into a BoVW approach.

Jenny Benois-Pineau, Aurélie Bugeau, Svebor Karaman, Rémi Mégret

Chapter 5. Scalability issues in visual information retrieval

Abstract

Information retrieval applications increasingly have to deal with multimedia content. As image and video databases grow ever larger, scalability has become a critical requirement for visual information retrieval. This chapter first describes the types of processes that support either content-based retrieval or mining and that have to scale. The nature of the problems to be solved and the principles of the solutions are presented next. Emphasis is put on key ideas supporting recent progress, such as the use of approximation or of shared-neighbor similarity. To keep pace with the evolution of scalability requirements, driven by more complex visual descriptions and higher volumes of data, further advances are needed. Embeddings, filtering based on simplified descriptions, optimization of content representations and distributed processing are a few directions that deserve to be pursued.

Michel Crucianu, Jenny Benois-Pineau
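
As a rough illustration of the shared-neighbor similarity mentioned in the Chapter 5 abstract, the sketch below scores two items by the number of k-nearest neighbours they have in common. This is one common reading of the idea and not necessarily the formulation used in the chapter. The neighbour lists are computed by brute force purely for readability; a scalable system would presumably obtain them from an approximate index instead. Function names and data are hypothetical.

```python
def knn_lists(points, k):
    """Brute-force k-nearest-neighbour index sets, excluding the point itself.

    Quadratic in the number of points: fine for a toy example, not for scale.
    """
    neighbours = []
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: sum((a - b) ** 2 for a, b in zip(p, points[j])))
        neighbours.append(set(others[:k]))
    return neighbours


def shared_neighbour_similarity(neighbours, i, j):
    """Number of neighbours that items i and j have in common."""
    return len(neighbours[i] & neighbours[j])


# Toy data (hypothetical): two well-separated groups of 2-D descriptors.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10)]
nn = knn_lists(points, k=2)

print(shared_neighbour_similarity(nn, 0, 3))  # same group
print(shared_neighbour_similarity(nn, 0, 4))  # different groups
```

Because the score depends only on neighbour ranks and not on raw distances, it is less affected by differences in local density, which is one reason such secondary similarities are used for mining large collections.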

Chapter 6. Evaluation of visual information indexing and retrieval

Abstract

Comparative evaluation of methods and systems is of primary importance in the domain of visual indexing and retrieval. As in many other domains, it is generally organized by institutions such as NIST or by research networks such as Pascal or PetaMedia. Evaluations are carried out in the context of periodic campaigns, or benchmarks. In these, one or more visual indexing or retrieval tasks are defined, each with a data collection, relevance judgments, performance measures and an experimentation protocol. Participants submit results computed automatically and blindly, and the organizers return the measured performance. These evaluation campaigns generally conclude with a workshop in which the participants explain how they performed the tasks. The chapter gives an overview of the major evaluation campaigns in the domain and presents in detail the tasks, the data collections, the metrics and the protocols used. The state-of-the-art performance in recent campaigns and the lessons learned from these campaigns are also presented.

Georges Quénot, Philippe Joly, Jenny Benois-Pineau
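
The Chapter 6 abstract does not name specific performance measures. As an assumption, the sketch below uses non-interpolated average precision, a measure widely used in retrieval benchmarks, to show how a submitted ranked list could be scored against relevance judgments. The function name and the toy identifiers are hypothetical, and the ranked list is assumed to contain no duplicates.

```python
def average_precision(ranked_ids, relevant_ids):
    """Non-interpolated average precision of one ranked result list.

    Precision is measured at the rank of each relevant item retrieved and
    averaged over ALL relevant items, so relevant items that are never
    retrieved contribute zero.
    """
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranked_ids, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant)


# Toy submission (hypothetical): "a" and "c" are found, "f" is missed.
ranked = ["a", "b", "c", "d", "e"]
judged_relevant = {"a", "c", "f"}
print(average_precision(ranked, judged_relevant))  # (1/1 + 2/3) / 3
```

Averaging this value over all queries of a task gives mean average precision, which lets organizers compare systems with a single number per task.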

Backmatter

Title: Visual Indexing and Retrieval
Written by: Jenny Benois-Pineau, Frédéric Precioso, Matthieu Cord
Publisher: Springer New York
Electronic ISBN: 978-1-4614-3588-4
Print ISBN: 978-1-4614-3587-7
DOI: https://doi.org/10.1007/978-1-4614-3588-4