About this book

Exploration of Visual Data presents the latest research efforts in the area of content-based exploration of image and video data. The main objective is to bridge the semantic gap between high-level concepts in the human mind and the low-level features that machines can extract.

The two key issues emphasized are "content-awareness" and "user-in-the-loop". The authors provide a comprehensive review of algorithms for visual feature extraction based on color, texture, shape, and structure, and of techniques for incorporating such information to aid browsing, exploration, search, and streaming of image and video data. They also discuss issues related to the mixed use of textual and low-level visual features to facilitate more effective access to multimedia data.

Exploration of Visual Data provides state-of-the-art material on content-based description of visual data, content-based low-bitrate video streaming, and the latest asymmetric and nonlinear relevance feedback algorithms, which to date are unpublished.

Table of Contents

Frontmatter

1. Introduction

Abstract
With the ever-increasing amount of digital image and video data along with faster and easier means for information access and exchange, we are facing a pressing demand for intelligent image and video analysis, indexing, retrieval, and delivery systems.
Xiang Sean Zhou, Yong Rui, Thomas S. Huang

2. Overview of Visual Information Representation

Abstract
A machine “sees” every pixel in an image, but can it detect, say, happiness or excitement in a picture or video? Can it EVER “feel” the emotions or semantics behind it?
Xiang Sean Zhou, Yong Rui, Thomas S. Huang

3. Edge-Based Structural Features

Abstract
In this chapter we propose edge-based structural features for image representation, develop an algorithm for effective extraction of such features, and explore the effectiveness of these features in aiding visual retrieval tasks [229].
Xiang Sean Zhou, Yong Rui, Thomas S. Huang

4. Probabilistic Local Structure Models

Abstract
Small pieces of local evidence can reveal a great deal about an image.
Xiang Sean Zhou, Yong Rui, Thomas S. Huang

5. Constructing Table-of-Content for Videos

Abstract
A fundamental task in video analysis is to extract structures from the video to facilitate user access (browsing and retrieval). Motivated by the important role that a table of contents (ToC) plays in a book, in this chapter we introduce the concept of a ToC in the video domain. Some existing approaches implicitly use a ToC, but are mainly limited to low-level entities (e.g., shots and key frames). The drawbacks are that low-level structures (1) contain too many entries to be presented efficiently to the user; and (2) do not capture the underlying semantic structure of the video, which is what the user may wish to browse or retrieve by.
Xiang Sean Zhou, Yong Rui, Thomas S. Huang

6. Nonlinearly Sampled Video Streaming

Abstract
Over low-bit-rate channels, we adopt the streaming of nonlinearly sampled video frames (i.e., a key-frame slideshow) synchronized with the audio stream. Given the channel and buffer limits, we wish to obtain a set of sampled frames that is not only feasible (i.e., streamable) but also optimal in terms of maximal information flow (given that the semantic information content of each frame can be quantified, either automatically or manually). Different application scenarios are considered and modeled in a principled way, for which we propose computationally efficient algorithms for finding the globally optimal solution.
Xiang Sean Zhou, Yong Rui, Thomas S. Huang
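
The frame-selection problem described in the Chapter 6 abstract can be illustrated with a small sketch. This is not the authors' algorithm: it assumes a simplified model chosen only for illustration, in which the channel has a constant rate, the buffer limit is ignored, frame sizes are whole bits, frames are sent in display order, and each frame already has a non-negative information value. Under those assumptions a selection is streamable when every chosen frame has fully arrived by its display time, and a knapsack-style dynamic program finds the most valuable streamable selection. The names `select_frames`, `frames`, and `rate` are hypothetical.

```python
def select_frames(frames, rate):
    """Pick the most valuable streamable subset of frames.

    frames: list of (display_time_s, size_bits, value), sorted by display time.
    rate:   assumed constant channel rate in bits per second.
    Returns (best_value, chosen_indices).
    """
    # best maps "total bits sent so far" to the best (value, indices) seen
    # among selections that meet every deadline up to the current frame.
    best = {0: (0.0, [])}
    for idx, (t, size, value) in enumerate(frames):
        cap = int(rate * t)  # bits that can arrive before this frame is due
        updates = {}
        for used, (val, chosen) in best.items():
            total = used + size
            if total > cap:
                continue  # frame idx would arrive after its display time
            cand = (val + value, chosen + [idx])
            known = max(best.get(total, (-1.0, None))[0],
                        updates.get(total, (-1.0, None))[0])
            if cand[0] > known:
                updates[total] = cand
        best.update(updates)
    return max(best.values(), key=lambda entry: entry[0])


# Hypothetical data: (display time in s, size in bits, information value).
frames = [(1.0, 800, 3.0), (2.0, 1500, 5.0), (3.0, 900, 4.0), (4.0, 1200, 2.0)]
print(select_frames(frames, rate=1000))  # (11.0, [1, 2, 3])
```

In this toy example, sending the first frame would leave too little capacity for the more valuable second frame, so the optimal selection skips it. A realistic model would also need the buffer constraint and the application scenarios that the chapter discusses.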

7. Relevance Feedback for Visual Data Retrieval

Abstract
Machine-aided retrieval of multimedia information (image [44], video [170], audio [195], etc.) is achieved based on representations in the form of descriptors (or feature vectors). Two issues arise. The first is the effectiveness of the representation, i.e., to what extent can the meaningful contents of the media be represented in these vectors? The second is the selection of a similarity metric during the retrieval process. This is an important issue because the similarity metric depends dynamically on the user and the user-defined query class, which are unknown a priori. In the following, we focus our attention on the second issue, i.e., the on-line learning problem for content-based multimedia information retrieval.
Xiang Sean Zhou, Yong Rui, Thomas S. Huang
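
To make the idea of a user-dependent similarity metric concrete, here is a minimal sketch of one simple relevance feedback heuristic: features on which the user's positive examples agree receive more weight in the distance. It is an illustrative stand-in, not the asymmetric or nonlinear algorithms the book presents. The function names, the toy feature vectors, and the `eps` smoothing term are all assumptions.

```python
from statistics import pstdev


def feedback_weights(positives, eps=1e-6):
    """Weight each feature by the inverse of its spread over the positives."""
    dims = len(positives[0])
    raw = [1.0 / (pstdev([p[d] for p in positives]) + eps) for d in range(dims)]
    total = sum(raw)
    return [w / total for w in raw]  # normalized so the weights sum to 1


def weighted_distance(query, item, weights):
    """Weighted Euclidean distance between two feature vectors."""
    return sum(w * (q - x) ** 2 for w, q, x in zip(weights, query, item)) ** 0.5


def rank(query, database, weights):
    """Return database indices ordered from most to least similar."""
    return sorted(range(len(database)),
                  key=lambda i: weighted_distance(query, database[i], weights))


# Hypothetical two-dimensional feature vectors.
database = [[0.88, 0.10], [0.60, 0.50], [0.86, 0.95], [0.10, 0.50]]
query = [0.90, 0.50]

print(rank(query, database, [0.5, 0.5]))  # [1, 0, 2, 3]

# The user marks items 0 and 2 as relevant; they agree on feature 0 only.
learned = feedback_weights([database[0], database[2]])
print(rank(query, database, learned))     # [0, 2, 1, 3]
```

After one round of feedback the metric emphasizes the first feature, so both items the user marked as relevant move ahead of item 1, which the unweighted metric had ranked first.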

8. Toward Unification of Keywords and Low-Level Contents

Abstract
An image retrieval system is inherently constrained by its use of low-level features and cannot give satisfactory retrieval results in many cases, especially when the high-level concepts in the user’s mind are not easily expressible in terms of those features. Therefore, for real-world applications, textual annotations should be added, extracted, or processed whenever possible to improve retrieval performance. In this part we explore the unification of keywords and feature contents for image retrieval. We propose a seamless joint querying and relevance feedback scheme based on both keywords and low-level feature contents, incorporating keyword similarities. We propose a WARF (word association via relevance feedback) formula as a pseudoclassification algorithm for learning the term similarity matrix during user interaction. This learned similarity matrix, specific to the dataset as well as the users, can be applied to keyword semantic grouping, thesaurus construction, and soft query expansion during intelligent image retrieval with the user in the loop.
Xiang Sean Zhou, Yong Rui, Thomas S. Huang

9. Future Research Directions

Abstract
Content-based exploration of visual data is a relatively new research area, with many promising future research directions. To extend and continue the research carried out in this book, the emphasis may be put on the following challenging topics
Xiang Sean Zhou, Yong Rui, Thomas S. Huang

Backmatter
