Skip to main content

1996 | Buch

Searching Multimedia Databases by Content

verfasst von: Christos Faloutsos

Verlag: Springer US

Buchreihe : Advances in Database Systems

insite
SUCHEN

Über dieses Buch

Searching Multimedia Databases by Content bridges the gap between the database and signal processing communities by providing the necessary background information for the reader and presenting it along with the intuition and mechanics of the best existing tools in each area.
The first half of Searching Multimedia Databases by Content reviews the most successful database access methods, in increasing complexity, reaching up to spatial access methods and text retrieval. In all cases, the emphasis is on practical approaches that have been incorporated in commercial systems, or that seem very promising.
The second half of the book uses the above access methods to achieve fast searching in a database of signals. A general methodology is presented, which suggests extracting a few good features from each multimedia object, thus mapping objects into points in a metric space. Finally, the book concludes by presenting some recent successful applications of the methodology on time series and color images.
Searching Multimedia Databases by Content is targeted towards researchers and developers of multimedia systems. The book can also serve as a textbook for a graduate course on multimedia searching, covering both access methods as well as the basics of signal processing.

Inhaltsverzeichnis

Frontmatter

Introduction

1. Introduction
Abstract
As a working definition of a Multimedia Database System we shall consider a system that can store and retrieve multimedia objects, such as 2-dimensional color images, gray-scale medical images in 2-d or 3-d (eg., MRI brain scans), 1-dimensional time series, digitized voice or music, video clips, traditional data types, like ‘product-id’, ‘date’, ‘title’, and any other user-defined data types. For such a system, what this book focuses on is the design of fast searching methods by content. A typical query by content would be, eg., ‘in a collection of color photographs, find ones with a same color distribution like a sunset photograph’.
Christos Faloutsos

Database Indexing Methods

Frontmatter
2. Introduction to Relational DBMS
Abstract
This chapter presents the minimal necessary set of concepts from relational database management systems. Excellent textbooks include, eg., [KS91] [Dat86]. The chapter also describes how the proposed methods will fit in an extensible DBMS, customized to support multimedia datatypes.
Christos Faloutsos
3. Primary Key Access Methods
Abstract
Here we give a brief overview of the traditional methods to handle queries on primary (ie, unique) keys. Considering the running example of EMPLOYEE records, a typical query is, eg., ‘find the employee with emp# = 344’. Notice that the emp# is assumed to have no duplicates.
Christos Faloutsos
4. Secondary Key Access Methods
Abstract
Access methods for secondary key retrieval have attracted much interest. The problem is stated as follows: Given a file, say, EMPLOYEE( name, salary, age), organize the appropriate indices so that we can answer efficiently queries on any and all of the available attributes. Rivest [Riv76] classified the possible queries into the following classes, in increasing order of complexity:
  • exact match query, when the query specifies all the attribute values of the desired record, e.g.:
    name = ‘Smith’ and salary = 40,000 and age = 45
  • partial match query, when only some of the attribute values are specified, e.g.:
    salary = 40,000 and age =35
  • range queries, when ranges for some or all of the attributes are specified, e.g.:
    35,000 ≤ salary ≤ 45,000 and age = 45
  • Boolean queries:
    ( (not name = ‘Smith’) and salary ≥ 40,000 ) or age ≥ 50
Christos Faloutsos
5. Spatial Access Methods (SAMS)
Abstract
In the previous section we examined the so-called ‘secondary key’ access methods, which handle queries on keys that may have duplicates (eg., ‘salary’, or ‘age’, in an EMPLOYEE file). As mentioned, records with k numerical attributes can be envisioned as k-dimensional points. Here we examine spatial access methods, which are designed to handle multidimensional points, lines, rectangles and other geometric bodies.
Christos Faloutsos
6. Access Methods For Text
Abstract
In this chapter we present the main ideas for text retrieval methods. For more details, see the survey in [Fal85] and the book by Frakes and Baeza-Yates [FBY92]. Access methods for text are interesting for three reasons: (a) multimedia objects often have captions in free text; exploiting these captions may help retrieve some additional relevant objects [OS95] (b) research in text retrieval has led to some extremely useful ideas, like the relevance feedback and the vector space model that we discuss next and (c) text retrieval has several applications in itself.
Christos Faloutsos

Indexing Signals

Frontmatter
7. Problem - Intuition
Abstract
The problem we focus on is the design of fast searching methods that will search a database of multimedia objects, to locate objects that match a query object, exactly or approximately. Objects can be 2-dimensional color images, gray-scale medical images in 2-d or 3-d (eg., MRI brain scans), 1-dimensional time sequences, digitized voice or music, video clips etc. A typical query by content would be, eg., ‘in a collection of color photographs, find ones with a same color distribution as a sunset photograph’.
Christos Faloutsos
8. 1-D Time Sequences
Abstract
Here the goal is to search a collection of (equal-length) time sequences, to find the ones that are similar to a desirable sequence. For example, ‘in a collection of yearly stock-price movements, find the ones that are similar to IBM’.
Christos Faloutsos
9. 2-D Color Images
Abstract
Retrieving images by content attracts high and increasing interest [JN92, OS95, FSN+95]. Queries on content may focus on color, texture, shape, position, etc. Potential applications include medical image databases (‘Give me other images that contain a tumor with a texture like this one’), photo-journalism (‘Find images that have blue at the top and red at the bottom’), art, fashion, cataloging, retailing etc..
Christos Faloutsos
10. Sub-Pattern Matching
Abstract
Up to now, we have examined the ‘whole-match’ case. The goal in this chapter is to extend the ‘GEMINI’ approach of the ‘quick-and-dirty’ test, so that we can handle sub-pattern matching queries. We focus on 1-d time series, to illustrate the problem and the solution more clearly. Then, the problem is defined as follows:
  • We are given a collection of N sequences of real numbers S1, S2, S N , each one of potentially different length.
  • The user specifies query subsequence Q of length Len(Q) (which may vary) and the tolerance ∈, that is, the maximum acceptable dis-similarity (= distance).
  • We want to find quickly all the sequences S i (1 ≤ iN), along with the correct offsets p, such that the subsequence S i [p : p + Len(Q) − 1] matches the query sequence: D(Q, S i [p : p + Len(Q) − 1]) ≤ ∈.
Christos Faloutsos
11. Fastmap
Abstract
In the previous chapters we saw the ‘GEMINI’ approach, which suggests that we rely on domain experts to derive k feature-extraction functions, thus mapping each object into a point in k-dimensional space. Then the problem is reduced to storing, retrieving and displaying k-dimensional points, for which there is a plethora of algorithms available.
Christos Faloutsos
12. Conclusions
Abstract
We have presented a generic method (the ‘GEMINI’ approach) to accelerate queries by content on image databases and, more general, on multimedia databases. Target queries are, eg., ‘find images with a color distribution of a sunset photograph’ or, ‘find companies whose stock-price moves similarly to a given company’s stock’.
Christos Faloutsos

Mathematical Toolbox

Frontmatter
A. Preliminaries
Abstract
Before we start, we need some definitions from complex algebra and from linear algebra. Consider a complex number
$$ c = a + jb = A\;\exp \left( {j\Phi } \right) $$
where
$$ j = \sqrt { - 1} $$
is the imaginary unit.
Christos Faloutsos
B. Fourier Analysis
Abstract
The intuition behind the Fourier Transform (as well as the Discrete Time Fourier transform that we are examining) is based on Fourier’s theorem, that every continuous function can be considered as a sum of sinusoidal functions. For the discrete case, which is the one of interest to us, the n-point Discrete Fourier Transform [OS75] of a signal \( \overrightarrow x = \left[ {x_i } \right],\;i = 0, \ldots ,n - 1 \) is defined to be a sequence \( \overrightarrow X \) of n complex numbers \( X_f ,f = 0, \ldots ,n - 1 \), given by
$$ X_f = {1 \mathord{\left/ {\vphantom {1 {\sqrt n \sum\limits_{i = 0}^{n - 1} {x_i \;\exp \left( { - j2\pi {{fi} \mathord{\left/ {\vphantom {{fi} n}} \right. \kern-\nulldelimiterspace} n}} \right)\;f = 0,1, \ldots ,n - 1\;\;\;\;\;\;\;\left( {\text{B}.1} \right)} }}} \right. \kern-\nulldelimiterspace} {\sqrt n \sum\limits_{i = 0}^{n - 1} {x_i \;\exp \left( { - j2\pi {{fi} \mathord{\left/ {\vphantom {{fi} n}} \right. \kern-\nulldelimiterspace} n}} \right)\;f = 0,1, \ldots ,n - 1\;\;\;\;\;\;\;\left( {\text{B}.1} \right)} }} $$
(1)
where j is the imaginary unit \( \left( {j \equiv \sqrt { - 1} } \right) \).
Christos Faloutsos
C. Wavelets
Abstract
The wavelet transform is believed to avoid the ‘frequency leak’ problem even better. Consider the case of an impulse function (Example B.2): both in the DFT and the DCT transform, it has non-zero amplitudes in all frequencies. Thus, what would take a single number to describe in the time domain, will require several numbers in the frequency domain. The problem is that the DFT has no temporal locality: each of its coefficients provide information about all the time instants. A partial remedy would be the so-called ‘Short Window Fourier Transform’ (SWFT) [RV91): We can divide the time sequence into frames of, say, w consecutive (non-overlapping) samples, and do the w-point DFT in each of these windows. Thus, an impulse function in the time domain will have a restricted ‘frequency leak’. Figure C.1 shows intuitively what happens: In the time domain, each value gives the full information about that instant (but no information about frequencies). The DFT has coefficients that give full information about a given frequency, but it needs all the frequencies to recover the value at a given instant in time. The SWFT is somewhere in between.
Christos Faloutsos
D. K-L And SVD
Abstract
Before we examine the K-L transform, we should give the definition of eigen-values and eigenvectors of a square matrix S. (We use the letter ‘S’ to stress the fact that the matrix is square).
Christos Faloutsos
Backmatter
Metadaten
Titel
Searching Multimedia Databases by Content
verfasst von
Christos Faloutsos
Copyright-Jahr
1996
Verlag
Springer US
Electronic ISBN
978-1-4613-1445-5
Print ISBN
978-1-4612-8629-5
DOI
https://doi.org/10.1007/978-1-4613-1445-5