
2007 | Book


Multimedia Retrieval

Edited by: Henk M. Blanken, Henk Ernst Blok, Ling Feng, Arjen P. de Vries

Publisher: Springer Berlin Heidelberg

Book series: Data-Centric Systems and Applications

Included in: Springer Professional "Wirtschaft+Technik", Springer Professional "Technik", Springer Professional "Wirtschaft"


About this book

Retrieval of multimedia data is different from retrieval of structured data. A key problem in multimedia databases is search, and the proposed solutions to the problem of multimedia information retrieval span a rather wide spectrum of topics outside the traditional database area, ranging from information retrieval and human–computer interaction to computer vision and pattern recognition.

Based on more than 10 years of teaching experience, Blanken and his coeditors have assembled all the topics that should be covered in advanced undergraduate or graduate courses on multimedia retrieval and multimedia databases. The individual chapters of this textbook explain the general architecture of multimedia information retrieval systems; various metadata languages like Dublin Core, RDF, or MPEG; pattern recognition through Markov models, unsupervised learning, and pattern clustering; various indexing approaches to audio and video streams; interaction and control; the protection of content and user privacy; and search effectiveness and efficiency. The authors emphasize high-level features and show how these features are used in mathematical models to support the retrieval process. Each chapter includes pointers to further reading, and additional exercises and teaching material are available online.

Table of contents

Frontmatter

1. Introduction

Abstract

People interact with multimedia every day: reading books, watching television, listening to music, etc. For quite some time we have faced astonishing technological developments causing an explosion of digital multimedia information. Large amounts of text, images, speech, and video are converted to digital form. Think of catalog information of libraries, information about museums with nice pictures of paintings or famous speeches that are available on DVD. Moreover, much information is produced directly in digital form: TV programs, audio-visual data from surveillance cameras, photos. Major advantages of digitized data over analog data are easy storage, processing and sharing of data. Multimedia applications influence our daily life. Consider for example the following scenarios.

Henk Blanken, Ling Feng, Maurice van Keulen, Henk Ernst Blok

2. Languages for Metadata

Abstract

The term meta originates from the Greek word μετά, meaning after. The word Metaphysics is the title of Aristotle’s book coming after his book on nature called Physics. This has given meta the modern connotation of a nature of a higher order or of a more fundamental kind [1]. Literally, metadata is “data about data”. It can be any descriptive information about other data sources that is used to aid the organization, identification, representation, localization, interoperability, management, and use of the data [1, 14, 17].

Ling Feng, Rogier Brussee, Henk Blanken, Mettina Veenstra
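
To make "data about data" concrete, the following minimal sketch describes this very book with a handful of Dublin Core elements (Dublin Core is one of the metadata languages named in the book description) and renders them as HTML `<meta>` tags. The values are taken from this page; mapping the editors to the `contributor` element and the helper name `to_meta_tags` are assumptions made for illustration, not something the chapter prescribes.

```python
from html import escape

# Descriptive metadata for the book, keyed by Dublin Core element name.
# Mapping the editors to "contributor" is an illustrative assumption.
record = {
    "title": ["Multimedia Retrieval"],
    "contributor": [
        "Henk M. Blanken",
        "Henk Ernst Blok",
        "Ling Feng",
        "Arjen P. de Vries",
    ],
    "publisher": ["Springer Berlin Heidelberg"],
    "date": ["2007"],
    "identifier": ["https://doi.org/10.1007/978-3-540-72895-5"],
}


def to_meta_tags(record):
    """Render each (element, value) pair as one HTML meta tag (hypothetical helper)."""
    tags = []
    for element, values in record.items():
        for value in values:
            content = escape(value, quote=True)
            tags.append(f'<meta name="DC.{element}" content="{content}">')
    return tags


if __name__ == "__main__":
    for tag in to_meta_tags(record):
        print(tag)
```

Each element may repeat, which is why every value is stored as a list: the four editors become four separate `DC.contributor` tags.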

3. Pattern Recognition for Multimedia Content Analysis

Abstract

This chapter looks at the basics of recognizing patterns in multimedia content. Our aim is twofold: first, to give an introduction to some of the general principles behind the various methods of pattern recognition, and second, to show what role these methods play in multimedia content analysis.

Elena Ranguelova, Mark Huiskes

4. Searching for Text Documents

Abstract

Many documents contain, besides text, also images, tables, and so on. This chapter concentrates on the text part only. Traditionally, systems handling text documents are called information storage and retrieval systems. Before the World-Wide Web emerged, such systems were almost exclusively used by professional users, so-called indexers and searchers, e.g., for medical research, in libraries, by governmental organizations and archives. Typically, professional users act as “search intermediaries” for end users. They try to figure out in an interactive dialogue with the system and the end user what it is the end user needs, and how this information should be used in a successful search. Professionals know the collection, they know how documents in the collection are represented in the system, and they know how to use Boolean search operators to control the number of retrieved documents.

Henk Blanken, Djoerd Hiemstra
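
The way Boolean operators control the size of a result set can be sketched with a tiny inverted index. This is a minimal illustration, not the chapter's own system: the three sample documents, the whitespace tokenizer, and the function names are all assumptions.

```python
from collections import defaultdict


def build_index(docs):
    """Map each lowercased term to the set of ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def boolean_and(index, *terms):
    """Documents containing every term: AND narrows the result set."""
    postings = [index.get(term.lower(), set()) for term in terms]
    return set.intersection(*postings) if postings else set()


def boolean_or(index, *terms):
    """Documents containing at least one term: OR widens the result set."""
    return set().union(*(index.get(term.lower(), set()) for term in terms))


def boolean_not(index, all_ids, term):
    """Documents that do not contain the term."""
    return set(all_ids) - index.get(term.lower(), set())


# Hypothetical three-document collection.
docs = {
    1: "multimedia retrieval of video archives",
    2: "text retrieval with boolean operators",
    3: "video indexing for multimedia archives",
}
index = build_index(docs)

if __name__ == "__main__":
    print(boolean_and(index, "multimedia", "retrieval"))
    print(boolean_or(index, "video", "text"))
    print(boolean_not(index, docs, "video"))
```

A searcher who gets too many hits adds an AND term; one who gets too few replaces it with OR. The result is an unranked set, which is the main limitation of pure Boolean search.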

5. Image Processing

Abstract

The field of image processing addresses handling and analysis of images for many purposes using a large number of techniques and methods. The applications of image processing range from enhancement of the visibility of certain organs in medical images to object recognition for handling by industrial robots and face recognition for identification at airports, but also searching for images in image databases. The methods applied range from low-level approaches like boundary detection and color based segmentation to advanced object detection using statistical geometric models. Often several techniques must be combined to obtain a desired result, e.g., first low-level feature extraction, next clustering into regions, extraction of shape parameters and finally object recognition.

Ferdi van der Heijden, Luuk Spreeuwers
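
As a minimal sketch of the low-level end of that pipeline, the code below marks boundary pixels by thresholding a central-difference gradient magnitude. The technique is a deliberately simple stand-in for the boundary detectors the chapter covers; the toy image, the threshold, and the function names are assumptions.

```python
def gradient_magnitude(image):
    """Central-difference gradient magnitude; border pixels are left at 0."""
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = (image[y][x + 1] - image[y][x - 1]) / 2.0
            gy = (image[y + 1][x] - image[y - 1][x]) / 2.0
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out


def boundary_mask(image, threshold):
    """1 where the gradient magnitude reaches the threshold, else 0."""
    return [
        [1 if g >= threshold else 0 for g in row]
        for row in gradient_magnitude(image)
    ]


# Toy grayscale image: dark left half, bright right half.
image = [
    [0, 0, 0, 9, 9, 9],
    [0, 0, 0, 9, 9, 9],
    [0, 0, 0, 9, 9, 9],
    [0, 0, 0, 9, 9, 9],
]
mask = boundary_mask(image, threshold=2.0)

if __name__ == "__main__":
    for row in mask:
        print(row)
```

In a fuller pipeline this mask would feed the later stages the abstract lists: grouping pixels into regions, extracting shape parameters, and recognizing objects.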

6. Generative Probabilistic Models

Abstract

Many content-based multimedia retrieval tasks can be seen as decision theory problems. Clearly, this is the case for classification tasks, like face detection, face recognition, or indoor/outdoor classification. In all these cases a system has to decide whether an image (or video) belongs to one class or another (respectively face or no face; face A, B, or C; and indoor or outdoor). Even the ad hoc retrieval tasks, where the goal is to find relevant documents given a description of an information need, can be seen as a decision theory problem: documents can be classified into relevant and non-relevant classes, or we can treat each of the documents in the collection as a separate class, and classify a query as belonging to one of these. In all these settings, a probabilistic approach seems natural: an image is assigned to the class with the highest probability.

Thijs Westerveld, Arjen de Vries, Franciska de Jong
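
The rule "assign the image to the class with the highest probability" can be sketched with a one-dimensional generative model: each class has a prior and a Gaussian density over a single feature, and Bayes' rule turns these into posteriors. The feature (mean image brightness), the Gaussian assumption, and all parameter values are hypothetical and chosen only to illustrate the decision rule.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and std at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def posteriors(x, classes):
    """classes maps name -> (prior, mean, std); returns name -> P(class | x)."""
    joint = {
        name: prior * gaussian_pdf(x, mean, std)
        for name, (prior, mean, std) in classes.items()
    }
    evidence = sum(joint.values())
    return {name: p / evidence for name, p in joint.items()}


def classify(x, classes):
    """Pick the class with the highest posterior probability."""
    post = posteriors(x, classes)
    return max(post, key=post.get)


# Hypothetical model: (prior, mean brightness, std) per class.
classes = {
    "indoor": (0.5, 0.35, 0.10),
    "outdoor": (0.5, 0.65, 0.10),
}

if __name__ == "__main__":
    for brightness in (0.3, 0.7):
        print(brightness, classify(brightness, classes))
```

The same rule covers the retrieval view in the abstract: treat each document as a class with its own generative model and rank documents by how probable they make the query.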

7. Speech Indexing

Abstract

The amount of metadata attached to multimedia collections that can be used for searching is very much dependent on the available resources within the organizations that create or own the collections. Large national audiovisual institutions, such as Sound&Vision in The Netherlands, put a lot of effort into archiving their assets and they label collection items with at least titles, dates and short content descriptions (descriptive metadata, see Chapter 2). However, many organizations that create or own multimedia collections lack the resources to apply even the most basic form of archiving. Certain collections may become the stepchild of an archive: minimally managed, poorly preserved, and hardly accessible.

Roeland Ordelman, Franciska de Jong, David van Leeuwen

8. Semantic Video Indexing

Abstract

Query-by-keyword is the paradigm on which machine-based text search is still based. Building on the success of text-based search engines, query-by-keyword is also gaining momentum in multimedia retrieval. For multimedia archives, however, it is hard to achieve access based on text alone. Multimodal indexing is essential for effective access to video archives. For the automatic detection of specific concepts, the state of the art has produced sophisticated and specialized indexing methods. Unlike their textual counterparts, generic methods for semantic indexing in multimedia are neither generally available, nor scalable in their computational needs, nor robust in their performance. As a consequence, semantic access to multimedia archives is still limited. Therefore, there is a case to be made for a new approach to semantic video indexing.

Cees G. M. Snoek, Marcel Worring, Jan-Mark Geusebroek, Dennis C. Koelma, Frank J. Seinstra, Arnold W. M. Smeulders

9. A Spatio-Temporal and a Probabilistic Approach for Video Retrieval

Abstract

In this chapter we address two approaches to extract high-level concepts from video footage and show the integrated use of both. We also describe an experiment used for validation.

Milan Petković, Willem Jonker, Henk Blanken

10. Multimodal Content-based Video Retrieval

Abstract

This chapter is a case study showing how important events (highlights) can be automatically detected in video recordings of Formula 1 car racing. Numerous approaches presented in the literature have shown that it is becoming possible to extract interesting events from video. However, the majority of the approaches use individual visual or audio cues. According to the current understanding of human perception, using evidence obtained from different modalities is expected to result in a more robust and accurate perception of video. On the other hand, fusion of multimodal evidence is quite challenging, since it has to deal with indications which may contradict each other. In this chapter we deal with three topics, one being fusion of evidence from different modalities.

Vojkan Mihajlović, Milan Petković, Willem Jonker, Henk Blanken

11. Interaction

Abstract

The challenge of coping with the overload of multimedia data has been growing since the advent of digital cameras and broadband connections to the home. Techniques for storing, annotating, and searching are essential for helping the user to face this challenge, but a well-considered user interaction design and an intuitive user interface are equally important in helping the user to find interesting content.

Erik Boertjes, Anton Nijholt

12. Digital Rights Management

Abstract

Digital Rights Management, or DRM for short, is a much-discussed topic nowadays. The main reason for this is that DRM technology is often mentioned in the context of protection of digital audio and video content, for example to avoid large-scale copying of CDs and DVDs via peer-to-peer networks on the Internet. However, DRM technology is much more than a simple copy protection technology. It is one of the enabling technologies that open the way to secure distribution and exchange of digital content over open digital infrastructures such as the Internet.

Paul Koster, Willem Jonker

13. Evaluation of Multimedia Retrieval Systems

Abstract

In this chapter, we provide the tools and methodology for comparing the effectiveness of two or more multimedia retrieval systems in a meaningful way. Several aspects of multimedia retrieval systems can be evaluated without consulting the potential users or customers of the system, such as the query processing time (measured for instance in milliseconds per query) or the query throughput (measured for instance as the number of queries per second). In this chapter, however, we will focus on aspects of the system that influence the effectiveness of the retrieved results. In order to measure the effectiveness of search results, one must at some point consult the potential user of the system. For, what are the correct results for the query “black jaguar”? Cars, or cats? Ultimately, the user has to decide.

Djoerd Hiemstra, Wessel Kraaij
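
The "black jaguar" example can be made concrete with precision and recall, two standard effectiveness measures. The abstract does not say which measures the chapter uses, so treating these two as representative is an assumption, as are the document ids and the judgments below. The sketch shows that the same result list scores differently depending on what the user meant.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a retrieved set against user judgments."""
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical result list for the query "black jaguar".
retrieved = ["cat1", "car1", "car2", "cat2"]

# Hypothetical relevance judgments from two different users.
wants_cats = {"cat1", "cat2", "cat3"}
wants_cars = {"car1", "car2"}

cats_precision, cats_recall = precision_recall(retrieved, wants_cats)
cars_precision, cars_recall = precision_recall(retrieved, wants_cars)

if __name__ == "__main__":
    print("cats:", cats_precision, cats_recall)
    print("cars:", cars_precision, cars_recall)
```

Both users see half of the list as relevant, but the user looking for cars finds everything they wanted while the user looking for cats misses one document. This is why effectiveness cannot be measured without relevance judgments from users.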

Backmatter

Title: Multimedia Retrieval
Edited by: Henk M. Blanken, Henk Ernst Blok, Ling Feng, Arjen P. de Vries
Publisher: Springer Berlin Heidelberg
Electronic ISBN: 978-3-540-72895-5
Print ISBN: 978-3-540-72894-8
DOI: https://doi.org/10.1007/978-3-540-72895-5
