
About this Book

Traditionally, scientific fields have defined boundaries, and scientists work on research problems within those boundaries. However, from time to time those boundaries get shifted or blurred to evolve new fields. For instance, the original goal of computer vision was to understand a single image of a scene, by identifying objects, their structure, and spatial arrangements. This has been referred to as image understanding. Recently, computer vision has gradually been making the transition away from understanding single images to analyzing image sequences, or video understanding. Video understanding deals with understanding of video sequences, e.g., recognition of gestures, activities, facial expressions, etc. The main shift in the classic paradigm has been from the recognition of static objects in the scene to motion-based recognition of actions and events. Video understanding has overlapping research problems with other fields, therefore blurring the fixed boundaries.

Computer graphics, image processing, and video databases have obvious overlap with computer vision. The main goal of computer graphics is to generate and animate realistic-looking images and videos. Researchers in computer graphics are increasingly employing techniques from computer vision to generate the synthetic imagery. A good example of this is image-based rendering and modeling techniques, in which geometry, appearance, and lighting are derived from real images using computer vision techniques. Here the shift is from synthesis to analysis followed by synthesis.

Table of Contents

Frontmatter

Chapter 1. Bridging the Semantic Gap in Content Management Systems

Computational Media Aesthetics
Abstract
With the explosion of digital media and online services, a key challenge in the area of media management is automation of content annotation, indexing, and organization for efficient media access, search, retrieval, and browsing. A major failing of current media annotation systems is the semantic gap — the incompatibility between the low-level features that can be currently computed automatically to describe media content and the high-level meaning associated with the content by users in media search and retrieval. This inevitably leads to the problem of content management systems returning media clips that are similar to one another in terms of low-level descriptions, but are completely different in terms of semantics sought by the users in their search. This chapter introduces Computational Media Aesthetics as an approach to bridging the semantic gap, outlines its foundations in media production principles, presents a computational framework to deriving high-level semantic constructs from media, and describes the structure of this collection.
Chitra Dorai, Svetha Venkatesh

Chapter 2. Essentials of Applied Media Aesthetics

Abstract
This chapter describes the fundamental aesthetic elements of media aesthetics — light and color, two-dimensional space, three-dimensional space, time-motion, and sound — and how they can serve as basic criteria for analysis of video and film. It also explains how these elements can be structured and applied to produce maximally effective visual and sound images in video and film. By placing these essential image elements into their respective contextual fields, their interdependence and structural potential in aesthetic synthesis, the clarification, intensification, interpretation, and communication of significant experiences are made explicit. Ultimately, applied media aesthetics attempts to give some answers to why we compose or light a particular shot in a particular way, or how we match pictures and sound most effectively.
Herbert Zettl

Chapter 3. Space-Time Mappings as Database Browsing Tools

Abstract
One possible value of examining cinematic structures is that, once understood, they can serve as visual and conceptual mappings which can then function as an interface for accessing the work in question. Determining such structures is not purely a systematic technique, but some ideas about segmentation can perhaps be useful in developing approaches to accessing visual material.
Stephen Mamber

Chapter 4. Formulating Film Tempo

The Computational Media Aesthetics Methodology in Practice
Abstract
This chapter constitutes a detailed example of Computational Media Aesthetics at work. A short history of approaches to the problems posed by automatic content management in its broadest context is presented, cast in the light of their ability to obtain the much-needed semantic grid with which to interpret their object. Our consideration is then further narrowed to the video medium, where we identify two common categories of solution to the problem, the kitchen sink and the brittle mapping, note their relative weaknesses, and show them to be directly attributable to the nature of the semantic grid chosen, or lack thereof. Focusing on our domain of film, we argue that the best semantic grid for its interpretation is that within which its creators work; namely, film grammar. In order to demonstrate this, we develop a measure for the extraction of a fundamental aspect of film, tempo. From definition, to formulation, and even its exploitation resulting in the location of high-level filmic components such as dramatic occurrences, the process is guided by film grammar at every step. Example results are provided from the movie The Matrix.
Brett Adams, Chitra Dorai, Svetha Venkatesh

Chapter 5. Modeling Color Dynamics for the Semantics of Commercials

Abstract
Retrieval of video based on content semantics requires that models are developed to map low level perceptual features into high level semantic concepts. Commercials are a video category where the link between low level perceptual features and high level semantics is stressed, since the way colors are chosen and modified throughout a commercial creates a large part of its message. In this chapter, we propose a model for the representation and comparison of video content based on dynamics of color regions in the video. A model is presented to define an intermediate level representation of color dynamics in terms of spatial arrangement of color flows. The model for representation and comparison of spatial relationships between extended sets of pixels in a 3D space is introduced by developing on the concept of weighted walkthroughs. Results of preliminary experiments are reported for a library of video commercials.
Alberto Del Bimbo, Pietro Pala, Enrico Vicario

Chapter 6. Scene Determination Using Auditive Segmentation Models of Edited Video

Abstract
This chapter describes different approaches that use audio features for determination of scenes in edited video. It focuses on analyzing the sound track of videos for extraction of higher-level video structure. We define a scene in a video as a temporal interval which is semantically coherent. The semantic coherence of a scene is often constructed during cinematic editing of a video. An example is the use of music for concatenation of several shots into a scene which describes a lengthy passage of time, such as the journey of a character. Some semantic coherence is also inherent to the unedited video material, such as the sound ambience at a specific setting, or the change pattern of speakers in a dialog. Another kind of semantic coherence is constructed from the textual content of the sound track revealing, for example, different stories contained in a news broadcast or documentary. This chapter explains the types of scenes that can be constructed via audio cues from a film art perspective. It discusses the feasibility of automatic extraction of these scene types and finally presents a survey of existing approaches.
Silvia Pfeiffer, Uma Srinivasan

Chapter 7. Determining Affective Events Through Film Audio

Abstract
This chapter provides a brief introduction to the use of sound in film as motivation for the examination of affective aspects of film audio. A method is presented to search the audio track of film for sound energy events that are hypothesized to be associated with an affect or a thematic event within the film. The sound energy events are characterized by specific patterns in the dynamics of the sound energy of the film audio. Four sound energy events are examined, each associated with a specific affect or symbolic meaning, that occur predominantly, although not solely, in films of the horror genre. The occurrences of the sound energy patterns of interest are investigated in six full length film audio tracks and experimental results are presented.
Simon Moncrieff, Chitra Dorai, Svetha Venkatesh

Chapter 8. The Future of Media Computing

From Ontology-Based Semiosis to Communal Intelligence
Abstract
In this chapter we investigate the underlying structural requirements for media-aware knowledge spaces. We discuss the merging of media generation and annotation to facilitate the use of media-based information for diverse purposes. In particular, we pay attention to the description of various tools in a distributed digital production environment supporting many distinct phases of media production. We then outline the accessibility of annotated media material for the purpose of discourse in encyclopaedic spaces. The examples throughout this chapter are taken from two domains: Media production with an emphasis on news and film production, and encyclopaedic spaces as provided by domains such as theory, history, and anthropology of film.
Frank Nack

Backmatter
