
2005 | Book

Advances in Visual Computing

First International Symposium, ISVC 2005, Lake Tahoe, NV, USA, December 5-7, 2005. Proceedings

Edited by: George Bebis, Richard Boyle, Darko Koracin, Bahram Parvin

Publisher: Springer Berlin Heidelberg

Book series: Lecture Notes in Computer Science


About this book

It is with great pleasure that I welcome you to Lake Tahoe for the 2005 International Symposium on Visual Computing (ISVC). ISVC provides a common umbrella for the four main areas of visual computing: vision, graphics, visualization, and virtual reality. The goal of ISVC is to provide a common forum for researchers, scientists, engineers, and practitioners throughout the world to present their latest research findings, ideas, developments, and applications in the broader area of visual computing. The program consists of six oral sessions, two poster sessions, seven special tracks, four keynote presentations, and one invited presentation. The response to the call for papers for the general ISVC 2005 sessions was very good. We received over 110 submissions from which we accepted 33 papers for oral presentation and 26 papers for poster presentation. Special track papers were solicited separately through the organizing and program committees of each track. A total of 32 papers were accepted for inclusion in the special tracks. All papers were reviewed with an emphasis on their potential to contribute to the state of the art in the field. Selection criteria included accuracy and originality of ideas, clarity and significance of results, and presentation quality. The review process was quite rigorous, involving two or three independent double-blind reviews followed by a one-week discussion period. During the discussion period we tried to correct anomalies and errors that might have existed in the initial reviews. Despite our efforts, we recognize that some papers worthy of inclusion may not have been included in the program. We offer our sincere apologies to authors whose contributions might have been overlooked. I wish to thank everybody who submitted their work to ISVC 2005 for review.

Table of Contents

Frontmatter
An NPR Technique for Pointillistic and Mosaic Images with Impressionist Color Arrangement

A simple non-photorealistic rendering (NPR) technique is presented for creating pointillistic and mosaic images with a color arrangement resembling that of impressionist paintings. An input image is partitioned into rectangular tiles, which are grouped into blocks. The colors of the tiles are changed to ones maximally far apart from each other while their local average approximates the color of the input image. The resultant mosaic image with a tile size of only one pixel is used for creating pointillistic images like those by Georges Seurat. Blending with the original image produces a mosaic image resembling those by Chuck Close. Some modifications are also incorporated into the color transformation to improve the reproducibility of mosaic images blended with a collection of tiny images.

Linlin Jing, Kohei Inoue, Kiichi Urahama
Active View Optimization for Viewing Objects in Motion

This paper introduces a novel method to automatically generate camera animation for scenes with objects in motion. A means of bounding the scene is first discussed, in order to provide a restricted surface for the camera to move along. Methods for generating static views of a scene are then developed, based on prior literature in the field of optimal camera views. Methods for producing smooth animations for camera paths are developed. This is broken up into two constraints; the minimum distance constraint and the minimum angle constraint. Finally, a camera path is produced based on the two prior constraints. This framework is then applied to the problem of facial tracking and visualizing a large scene.

Matt Berger, Lijun Yin, Jason Moore
Adding Hand Motion to the Motion Capture Based Character Animation

Most character motion capture data does not contain secondary motions such as detailed hand motion, so the resultant animation looks unnatural due to the stiffness of the hands. In this paper, we analyzed the pose space distance from the character’s motion capture data and used a stepwise searching algorithm to find the key poses for hand motion synthesis. We adaptively changed the contrast of the local distance map to capture small motions. If appropriate hand motion data is available, temporal alignment with speed matching and spatial warping of the motion data can produce realistic hand motion. If no motion capture data is available, key frames with cubic or Gaussian-based interpolation can be used to generate the in-between frames.

Ge Jin, James Hahn
Oversimplified Euler Operators for a Non-oriented, Non-manifold B-Rep Data Structure

Traditionally, B-rep geometric kernels possess oriented data structures, i.e. they possess oriented cells (e.g. half-edges, co-edges, face uses, etc.). The use of explicit oriented cells makes these data structures quite verbose and expensive in terms of memory space. Although orientation is important for visualization and engineering analysis purposes, it gives rise to difficult issues at the representation level; for example, keeping inclusion relationships between incident surfaces at a non-manifold vertex. Instead, we propose a non-manifold B-rep data structure whose cells are not oriented. This facilitates the design and implementation of its associated Euler operators, each one of which then reduces itself to a sequence of insertion and removal operations of cells into or from a list. Besides, these Euler operators call a single query operator to retrieve all incidence and adjacency information through a minimal number of accesses. As a result, we obtain a simple, responsive, concise and general non-oriented, non-manifold B-rep geometric kernel.

Frutuoso G. M. Silva, Abel J. P. Gomes
The Number of Gaps in Binary Pictures

This paper identifies the total number of gaps of object pixels in a binary picture, which solves an open problem in 2D digital geometry (or combinatorial topology of binary pictures). We obtain a formula for the total number of gaps as a function of the number of object pixels (grid squares), vertices (corners of grid squares), holes, connected components, and 2 × 2 squares of pixels. It can be used to test a binary picture (or just one region: e.g., a digital curve) for gap-freeness.

Valentin E. Brimkov, Angelo Maimone, Giorgio Nordo, Reneta P. Barneva, Reinhard Klette
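The gap-freeness test mentioned in this abstract can be illustrated by brute force. The sketch below does not use the paper's closed-form formula, which the abstract does not state. It assumes a common definition of a gap in a 2D binary picture: a grid vertex whose four surrounding pixels contain exactly two object pixels that touch only diagonally. The names `count_zero_gaps` and `is_gap_free` and the list-of-rows input format are illustrative choices.

```python
def count_zero_gaps(picture):
    """Count 2x2 windows whose only object pixels form a diagonal pair.

    `picture` is a list of equal-length rows of 0/1 values (1 = object pixel).
    Assumption (not taken from the abstract): a gap is a vertex shared by two
    object pixels that touch only diagonally, while the other two pixels
    around that vertex are background.
    """
    rows = len(picture)
    cols = len(picture[0]) if rows else 0
    gaps = 0
    for r in range(rows - 1):
        for c in range(cols - 1):
            a, b = picture[r][c], picture[r][c + 1]
            d, e = picture[r + 1][c], picture[r + 1][c + 1]
            main_diagonal = a and e and not b and not d
            anti_diagonal = b and d and not a and not e
            if main_diagonal or anti_diagonal:
                gaps += 1
    return gaps


def is_gap_free(picture):
    """True when the brute-force count finds no diagonal-only contacts."""
    return count_zero_gaps(picture) == 0
```

A brute-force count like this is linear in the number of pixels and may serve as a reference when checking a closed-form count on small pictures.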
Video Codec for Classical Cartoon Animations with Hardware Accelerated Playback

We introduce a novel approach to video compression which is suitable for traditional outline-based cartoon animations. In this case the dynamic foreground consists of several homogeneous regions and the background is a static textural image. For this drawing style we show how to recover a hybrid representation where the background is stored as a single bitmap and the foreground as a sequence of vector images. This allows us to preserve compelling visual quality as well as spatial scalability even for low encoding bit-rates. We also introduce an efficient approach to play back compressed animations in real-time on commodity graphics hardware. Practical results confirm that for the same storage requirements our framework provides better visual quality than standard video compression techniques.

Daniel Sýkora, Jan Buriánek, Jiří Žára
Retinal Image Registration for NIH’s ETDRS

This paper presents a retinal image registration approach for the National Institutes of Health (NIH) Early Treatment Diabetic Retinopathy Study (ETDRS) standard. The ETDRS imaging protocol specifies seven fields of each retina and presents several major challenges for image registration. The proposed method effectively combines both area-based and feature-based methods in three steps. First, the vascular tree is extracted by using a local entropy thresholding technique. Next, zeroth-order translation is estimated by maximizing mutual information based on the binary image pair (area-based). Specifically, a local entropy-based peak selection scheme and a multi-resolution searching scheme are developed to improve the accuracy and efficiency of translation estimation. Third, we use two types of features (feature-based), landmark points and sampling points, for affine/quadratic model estimation. Simulation on 504 pairs of ETDRS retinal images shows the effectiveness of the proposed algorithm.

Thitiporn Chanwimaluang, Guoliang Fan
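The area-based step described in this abstract, estimating a translation by maximizing mutual information over a binary image pair, can be sketched as an exhaustive search over small integer shifts. This is only a minimal illustration: it omits the paper's local entropy-based peak selection and multi-resolution search, and the function names, the list-of-rows image format, and the `max_shift` parameter are assumptions made here.

```python
import math


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length 0/1 sequences."""
    n = len(xs)
    if n == 0:
        return 0.0
    joint = {}
    for x, y in zip(xs, ys):
        joint[(x, y)] = joint.get((x, y), 0) + 1
    px, py = {}, {}
    for (x, y), count in joint.items():
        px[x] = px.get(x, 0) + count
        py[y] = py.get(y, 0) + count
    mi = 0.0
    for (x, y), count in joint.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (count / n) * math.log2(count * n / (px[x] * py[y]))
    return mi


def best_translation(fixed, moving, max_shift):
    """Search integer shifts (dy, dx) so that moving[r + dy][c + dx] best
    matches fixed[r][c], scoring each shift by mutual information computed
    on the overlapping region only."""
    rows, cols = len(fixed), len(fixed[0])
    best, best_mi = (0, 0), -1.0
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            xs, ys = [], []
            for r in range(rows):
                for c in range(cols):
                    rr, cc = r + dy, c + dx
                    if 0 <= rr < rows and 0 <= cc < cols:
                        xs.append(fixed[r][c])
                        ys.append(moving[rr][cc])
            mi = mutual_information(xs, ys)
            if mi > best_mi:
                best, best_mi = (dy, dx), mi
    return best, best_mi
```

Keeping `max_shift` small relative to the image size matters in a sketch like this, because mutual information computed on a very small overlap can be spuriously high.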
Using Multimodal MR Data for Segmentation and Topology Recovery of the Cerebral Superficial Venous Tree

Magnetic resonance angiography (MRA) produces 3D data visualizing vascular structures by detecting the flowing blood signal. While segmentation methods generally detect vessels by only processing MRA, the proposed method uses both MRA and non-angiographic (MRI) images. It is based on the assumption that MRI provides anatomical information useful for vessel detection. This supplementary information can be used to correct the topology of the segmented vessels. Vessels are first segmented from MRA while the cortex is segmented from MRI. An algorithm, based on distance maps and topology-preserving thinning, then uses both segmented structures for recovery of the missing parts of the brain superficial venous tree and removal of other vessels. This method has been performed and validated on 9 MRA/MRI data sets of the brain. The results show that the venous tree is correctly segmented and topologically recovered with 84% accuracy.

N. Passat, C. Ronse, J. Baruthio, J. -P. Armspach, M. Bosc, J. Foucher
Loop Removal from Colon Central Path Through Skeleton Scale-Space Tracking

The central path of the colon is an important tool in computer-assisted diagnosis: it is an aid to navigation during a virtual colonoscopy and allows an easier follow-up of the patient pathologies. However the computation of this central path remains a difficult task: on both MR and CT medical images, the wall of the colon is too thin with respect to the resolution of the acquisition and thus does not appear on images. Hence, when two sections of the colon are in contact, a loop is created in the image, and causes a wrong central path. In this paper, we propose an algorithm to remove those loops, based on the tracking of the colon skeleton in a scale-space. An evaluation of our algorithm on 20 patients showed that every loop was successfully removed, yielding a centered, anatomically and topologically correct central path.

Julien Lamy, Christian Ronse, Luc Soler
Multiscale Segmentation of HRCT Images Using Bipolar Incoherent Filtering

In this paper, we present a new multiscale segmentation method based on an optical transfer function implemented in the frequency domain. With this technique, we demonstrate that it is possible to segment HRCT (high-resolution CT) images into their various components at multiple scales, thereby separating the information available in the HRCT image. In the literature, several image segmentation techniques have been proposed for medical images. However, there are few multiscale segmentation methods that can segment a medical image so that the various components within the image are separated at multiple resolutions or scales. We show that the HRCT image can be segmented such that we get separate images for bones, tissues, lungs, and anatomical structures within the lungs.

Aamir Saeed Malik, Tae-Sun Choi
Building Statistical Atlas of White Matter Fiber Tract Based on Vector/Tensor Field Reconstruction in Diffusion Tensor MRI

The diffusion tensor tractography has drawbacks such as low objectivity by interactive ROI setting and fiber-crossing. For coping with such problems, we are constructing a statistical atlas of white matter fiber tracts, in which probability density maps of tract structures are stored with diffusion tensor parameters on spatially normalized brain data. In building the atlas, our fiber tract modeling method plays a key role, which is based on a novel approach of vector/tensor field reconstruction avoiding fiber-crossings. In this abstract, we describe the modeling method, our statistical atlas, and the preliminary results.

Yoshitaka Masutani, Shigeki Aoki, Osamu Abe, Mariko Yoshida, Haruyasu Yamada, Harushi Mori, Kenji Ino, Kuni Ohtomo
Interactive 3D Heart Chamber Partitioning with a New Marker-Controlled Watershed Algorithm

The watershed transform has been widely used in medical image segmentation. One fundamental problem with it is over-segmentation. There are mainly two approaches to deal with this problem: hierarchical segmentation and segmentation with markers. The markers, either automatically extracted or interactively generated, are mostly used in the homotopy modification of morphological gradients prior to the watershed segmentation. Most of the current techniques do not incorporate domain knowledge of the data. In this paper, we propose a two-step marker-controlled watershed segmentation algorithm with simple domain knowledge incorporated: (1) a modified image foresting transform (IFT) algorithm is used to produce the initial segmentation; (2) domain knowledge is incorporated into the marker-controlled watershed region merging process. A min-cut criterion for region merging is proposed. This approach is effectively applied to interactive 3D heart chamber partitioning.

Xinwei Xue
Inferring Cause/Effect Relationships in Multi-sensor Ambient Intelligence Systems

In this work, a learning technique is presented that gives an Ambient Intelligence (smart space) system the capacity to predict variation events in its own internal state. The system and the interacting users are modeled through the instantaneous state vectors obtained as output of two trained Self Organizing Map-based classifiers. The information processed by the system is collected by two sensor sets monitoring several internal and external system variables. Starting from the hypothesis that user actions have a direct influence on internal system state variables (e.g. the work load on personal computers' computation or storage devices in a university laboratory, in our current test implementation), we developed a statistical voting algorithm for inferring cause/effect relationships in these instantaneous variations. Logical connections are obtained in unsupervised mode with no a priori information and lead to the definition of a knowledge base that the system can exploit to predict its own near-future internal state variations, given the observation of the lab users.

S. Piva, C. S. Regazzoni
Toward a Unified Probabilistic Framework for Object Recognition and Segmentation

This paper presents a novel and effective Bayesian belief network that integrates object segmentation and recognition. The network consists of three latent variables that represent the local features, the recognition hypothesis, and the segmentation hypothesis. The probabilities are the result of approximate inference based on stochastic simulations with Gibbs sampling, and can be calculated for large databases of objects. Experimental results demonstrate that this framework outperforms a feed-forward recognition system that ignores the segmentation problem.

Huei-Ju Chen, Kuang-Chih Lee, Erik Murphy-Chutorian, Jochen Triesch
Distributed Multi-camera Surveillance for Aircraft Servicing Operations

This paper presents the visual surveillance aspects of a distributed intelligent system that has been developed in the context of aircraft activity monitoring. The overall tracking system comprises three main modules — Motion Detection, Object Tracking and Data Fusion. In this paper we primarily focus on the object tracking and data fusion modules.

David Thirde, Mark Borg, James Ferryman, Josep Aguilera, Martin Kampel
Mining Paths of Complex Crowd Scenes

The Ambient Intelligence (AmI) paradigm requires a robust interpretation of people's actions and behaviour and a way of automatically generating persistent spatial-temporal models of recurring events. This paper describes a relatively inexpensive technique that does not require the use of conventional trackers to identify the main paths of highly cluttered scenes, approximating them with spline curves. An AmI system could easily make use of the generated model to identify people who do not follow prefixed paths and warn them. Security, safety, and rehabilitation are potential application areas. The model is evaluated against new data of the same scene.

B. Zhan, P. Remagnino, S. A. Velastin
Geometric and Photometric Analysis for Interactively Recognizing Multicolor or Partially Occluded Objects

Effective human-robot interaction is essential for wide penetration of service robots into the market. Such robots need vision systems to recognize objects. It is, however, difficult to realize vision systems that can work in various conditions. More robust techniques of object recognition and image segmentation are essential. Thus, we have proposed to use the human user’s assistance for object recognition through speech. Our previous system assumes that it can segment images without failure. However, if there are occluded objects and/or objects composed of multicolor parts, segmentation failures cannot be avoided. This paper presents an extended system that can recognize objects in occlusion and/or multicolor cases using geometric and photometric analysis of images. If the robot is not sure about the segmentation results, it asks the user questions, phrased according to its level of certainty, to remove the ambiguity.

Md. Altab Hossain, Rahmadi Kurnia, Yoshinori Kuno
A Three-Level Graph Based Interactive Volume Segmentation System

We present a new hierarchical graph representation for volume data as well as its associated operations to enable interactive feature segmentation for high-resolution volume data. Our method constructs a low-resolution graph which represents a coarser resolution of the data. This graph enables the user to interactively sample and edit a feature of interest by drawing strokes on data slices. A subgraph representing the feature is derived with a growing process, and is used to extract the high-resolution version of the feature from the original volume data by performing an automatic mapping and refinement procedure. Our three-level, graph-based approach overcomes partial volume effects that are introduced by downsampling the volume data, and enables interactive segmentation of fine features. We demonstrate the effectiveness of this approach with several challenging 3D segmentation applications.

Runzhen Huang, Kwan-Liu Ma
Self-organizing Deformable Model: A New Method for Fitting Mesh Model to Given Object Surface

This paper presents a new method for projecting a mesh model of a source object onto the surface of an arbitrary target object. A deformable model, called the Self-organizing Deformable Model (SDM), is deformed so that the shape of the model is fitted to the target object. We introduce the idea of combining competitive learning and energy minimization in the SDM deformation. Our method is a powerful tool in the areas of computer vision and computer graphics. For example, it can map mesh models onto various kinds of target surfaces, whereas other surface parameterization methods have focused on a specified target surface. The SDM can also reconstruct shapes of target objects, like general deformable models.

Ken’ichi Morooka, Hiroshi Nagahashi
Image-Based Deformation of Objects in Real Scenes

We present a new method for deforming an object in a static real scene, which interacts with animated synthetic characters. Unlike the existing method, which makes a new synthetic object to substitute for the interacting object in the real scene, we directly deform the object in image space using an image-warping technique with assistance from a simplified 3D sub-model. The deformed image sequence is processed further using the Expression Ratio Image (ERI) technique to apply the illumination changes generated by the deformation. Using this method, we can efficiently maintain the photo-realistic rendering quality of the scene by preserving the appearance of the object in the real scene as it is.

Han-Vit Chung, In-Kwon Lee
Comparing Sphere-Tree Generators and Hierarchy Updates for Deformable Objects Collision Detection

This paper presents a quantitative evaluation of the accuracy of different sphere-tree construction methods when they are used in deformable bodies. The methods evaluated are Grid (an extension of octrees), Hubbard, Adaptive Medial Axis and Spawn. We also present a new approach to update the sphere-tree hierarchy that ensures lower loss of accuracy than in traditional update techniques.

M. Garcia, S. Bayona, P. Toharia, C. Mendoza
Simulating Complex Organ Interactions: Evaluation of a Soft Tissue Discrete Model

Computer-assisted procedures play a key role in the improvement of surgical operations. The current techniques in simulation potentially lead to more accuracy, more safety and more predictability in the operating room. Despite the large number of algorithms proposed for interactively modelling deformable objects such as soft human tissues, very few methods have attempted to simulate complex anatomical configurations. In this paper, we present a new approach for soft tissue modelling whose novelty is to integrate the interactions between a given soft organ and its surrounding organs. The proposed discrete model is compared to the finite element method in order to quantify its performance and physical realism. The model is applied to the simulation of the prostate-bladder set.

Maud Marchal, Emmanuel Promayon, Jocelyne Troccaz
Face Verification in Polar Frequency Domain: A Biologically Motivated Approach

We present a novel local-based face verification system whose components are analogous to those of biological systems. In the proposed system, after global registration and normalization, three eye regions are converted from the spatial to the polar frequency domain by a Fourier-Bessel Transform. The resulting representations are embedded in a dissimilarity space, where each image is represented by its distance to all the other images. In this dissimilarity space a Pseudo-Fisher discriminator is built. ROC and equal error rate verification test results on the FERET database showed that the system performed at least as well as state-of-the-art methods and better than a system based on polar Fourier features. The local-based system is especially robust to facial expression and age variations, but sensitive to registration errors.

Yossi Zana, Roberto M. Cesar-Jr, Rogerio S. Feris, Matthew Turk
Face Alignment and Adaptive Weight Assignment for Robust Face Recognition

It is observed that only certain portions of face images, those affected by expressions, non-uniform lighting and partial occlusions, are responsible for the failure of face recognition. A methodology for identifying such regions and reducing their influence on the recognition process is proposed in this paper. Dense correspondence is established between the probe image and a template face model using an optical flow technique. The face image is divided into modules, and the sum of the magnitudes of the flow vectors in each module is used to determine the effectiveness of that module in the overall recognition. A low weight is assigned to modules whose summed flow magnitude is high, and vice versa. An eye center location algorithm based on adaptive thresholding is implemented to align the test image with the face model prior to establishing the correspondence. Recognition accuracy increased considerably for PCA-based linear subspace approaches when implemented along with the proposed technique.

Satyanadh Gundimada, Vijayan Asari
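The module weighting described in this abstract can be sketched as follows. The abstract states only that modules with a high summed flow magnitude receive a low weight; the specific rule used here, a weight proportional to 1 / (1 + sum) followed by normalization, is an assumption chosen for illustration, as are the function name `module_weights` and the grid-of-tuples flow format.

```python
import math


def module_weights(flow, module_size):
    """Assign one weight per square module of a dense flow field.

    `flow` is a list of rows, each row a list of (u, v) flow vectors.
    Returns {(module_row, module_col): weight} with weights summing to 1.
    Assumption: weight is proportional to 1 / (1 + summed flow magnitude),
    so strongly deformed modules contribute less. The paper's actual
    weighting function is not given in the abstract.
    """
    sums = {}
    for r, row in enumerate(flow):
        for c, (u, v) in enumerate(row):
            key = (r // module_size, c // module_size)
            sums[key] = sums.get(key, 0.0) + math.hypot(u, v)
    raw = {key: 1.0 / (1.0 + total) for key, total in sums.items()}
    norm = sum(raw.values())
    return {key: value / norm for key, value in raw.items()}
```

The resulting weights could then scale each module's contribution to a per-module distance score before the scores are combined.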
Face Detection in Low-Resolution Images

Face detection is a hot research topic in Computer Vision; the field has greatly progressed over the past decade. However, face detection in low-resolution images has not been studied. In this paper, we use a conventional AdaBoost-based face detector to show that the face detection rate falls from 88% to 39% as face resolution decreases from 24 × 24 pixels to 6 × 6 pixels. We propose a new face detection method comprising four techniques for low-resolution images. As a result, our method improved the face detection rate from 39% to 71% for 6 × 6 pixel faces of the MIT+CMU frontal face test set.

Shinji Hayashi, Osamu Hasegawa
Investigating the Impact of Face Categorization on Recognition Performance

Face recognition is a key biometric technology with a wide range of potential applications both in government and private sectors. Despite considerable progress in face recognition research over the past decade, today’s face recognition systems are not accurate or robust enough to be fully deployed in high security environments. In this paper, we investigate the impact of face categorization on recognition performance. In general, face categorization can be used as a filtering step to limit the search space during identification (e.g., a person categorized as a middle-aged, Asian male needs to be compared only to subjects having the same profile). Our experimental results demonstrate that face categorization based on important visual characteristics such as gender, ethnicity, and age offers significant improvements in recognition performance including higher recognition accuracy, lower time requirements, and graceful degradation. Additional performance improvements can be expected by implementing “category-specific” recognition subsystems that are optimized to discriminate more accurately between faces within the same face category rather than faces between other categories.

Konstantinos Veropoulos, George Bebis, Michael Webster
A Novel Approach on Silhouette Based Human Motion Analysis for Gait Recognition

This paper presents a novel view-independent approach to silhouette-based human motion analysis for gait recognition applications. Spatio-temporal 1-D signals based on the differences between the outer contour of the binarized silhouette of a moving object and a bounding box placed around the silhouette are chosen as the basic image features, called distance vectors. The distance vectors are extracted using four view directions onto the silhouette. Gait cycle estimation and motion analysis are then performed by using normalized correlation on the distance vectors. Initial experiments for human identification are finally presented. Experimental results on different test image sequences demonstrate that the proposed algorithm is relatively robust, has a low computational cost, and achieves an encouraging recognition rate for gait-based human identification.

Murat Ekinci, Eyup Gedikli
A Hybrid HMM/DPA Adaptive Gesture Recognition Method

We present a hybrid classification method applicable to gesture recognition. The method combines elements of Hidden Markov Models (HMM) and various Dynamic Programming Alignment (DPA) methods, such as edit distance, sequence alignment, and dynamic time warping. As opposed to existing approaches which treat HMM and DPA as either competing or complementing methods, we provide a common framework which allows us to combine ideas from both HMM and DPA research. The combined approach takes on the robustness and effectiveness of HMMs and the simplicity of DPA approaches. We have implemented and successfully tested the proposed algorithm on various gesture data.

Stjepan Rajko, Gang Qian
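Dynamic time warping, one of the DPA methods named in this abstract, can be sketched with the standard dynamic programming recurrence. This shows only plain DTW with nearest-template classification, not the hybrid HMM/DPA method the paper proposes. The function names, the one-dimensional numeric sequences, and the template dictionary are illustrative assumptions.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences.

    cost[i][j] holds the cheapest alignment cost of a[:i] with b[:j],
    using absolute difference as the local cost.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # repeat b[j - 1]
                cost[i][j - 1],      # repeat a[i - 1]
                cost[i - 1][j - 1],  # advance both sequences
            )
    return cost[n][m]


def classify(sample, templates):
    """Return the label of the template with the smallest DTW distance.

    `templates` maps a gesture label to one reference sequence
    (a hypothetical, minimal template store).
    """
    return min(templates, key=lambda label: dtw_distance(sample, templates[label]))
```

Because DTW tolerates local stretching in time, a gesture performed more slowly than its template can still align with zero or low cost.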
Hifocon: Object and Dimensional Coherence and Correlation in Multidimensional Visualization

In any multidimensional visualization, some information has to be compromised when projecting multidimensional data to two- or three-dimensional space. We introduce the concepts of dimensional and object coherence and correlation to analyze and classify multidimensional visualization techniques. These concepts are used as principles for our design of Hifocon, a new multidimensional data visualization system.

Soon Tee Teoh, Kwan-Liu Ma
Efficient Compression of Visibility Sets

Interactive network-based walkthroughs in large urban environments raise difficult problems due to the size and density of these scenes. Thanks to the strong occlusion complexity of such environments, visibility streaming is a particularly efficient technique for minimizing the network load. In this paper, we present a solution which relies on client-side processing of visibility information so as to minimize the server workload. To solve the problem of transmitting the visibility data to the client, we suggest a bi-level compression scheme for the visibility sets that performs significantly better than previous methods. As a result, the visibility sets can be efficiently transmitted on-demand to the client and then used for adaptive streaming and rendering. Finally, we present our experimental results for a virtual city walkthrough.

Christian Bouville, Isabelle Marchal, Loïc Bouget
Rendering Optimizations Guided by Head-Pose Estimates and Their Uncertainty

In virtual environments, head pose and/or eye-gaze estimation can be employed to improve the visual experience of the user by enabling adaptive level of detail during rendering. In this study, we present a real-time system for rendering complex scenes in an immersive virtual environment based on head pose estimation and perceptual level of detail. In our system, the position and orientation of the head are estimated using a stereo vision approach and markers placed on a pair of glasses used to view images projected on a stereo display device. The main innovation of our work is the incorporation of uncertainty estimates to improve the visual experience perceived by the user. The estimated pose and its uncertainty are used to determine the desired level of detail for different parts of the scene based on criteria originating from physiological and psychological aspects of human vision. Subject tests have been performed to evaluate our approach.

Javier E. Martínez, Ali Erol, George Bebis, Richard Boyle, Xander Twombly
Acceptance of Visual Search Interfaces for the Web – Design and Empirical Evaluation of a Book Search Interface

Theoretically, visual search interfaces are supposed to outperform list interfaces for such task types as nonspecific queries because they make use of additional semantic information (like price, date or review for a book). But why are web sites like Amazon or eBay still using classical textual list interfaces? Many visual interfaces performed well on objective measures (retrieval time, precision or recall). But subjective factors (ease, joy, usefulness) determining their acceptance in practice are often neglected. Therefore, we created a graphical interface for searching books and evaluated it in a 51 participant study. The study builds on the technology acceptance model which measures users’ subjective attitude towards using an interface. We found that the variable enjoyment is of higher relevance in both visual and textual search interfaces than previously stated. Finally, the novel interface yielded significantly better results for book searches than the textual one.

Olaf Thiele, Gunnar Mau
Distributed and Collaborative Biomedical Data Exploration

Imaging techniques such as MRI, fMRI, CT and PET have provided physicians and researchers with a means to acquire high-quality biomedical images as the foundation for the diagnosis and treatment of diseases. Unfortunately, access to domain experts at the same physical location is not always possible and new tools and techniques are required to facilitate simultaneous and collaborative exploration of data between spatially separated experts. This paper presents a framework for collaborative visualization of biomedical data-sets, supporting heterogeneous computational platforms and network configurations. The system provides the user with data visualization, annotation and the middleware to exchange the resulting visuals between all participants, in real-time. A resulting 2D visual provides a user specifiable high-resolution image slice, while a resulting 3D visual provides insight into the entire data set. To address the costly rendering of large-scale volumetric data, the visualization engine can distribute tasks over multiple render nodes.

Zhiyu He, Jason Kimball, Falko Kuester
Image Database Navigation: A Globe-Al Approach

Image database visualisation and navigation tools become increasingly important as image collections keep growing. What is needed are easily navigable and intuitive ways of displaying and browsing image databases that let the user view images from a collection in a manner that facilitates finding images of interest. In this paper we introduce a way of viewing a complete collection of images by projecting them onto a spherical globe for colour-based image database navigation. We take the median hue and brightness of each image, features that are also useful for image retrieval purposes, and use them as a set of co-ordinates that determine the location on the surface of the globe where the image is projected. Navigation is performed by rotation (e.g. choosing a different hue range) and zooming into areas of interest.
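
As a rough illustration of the co-ordinate idea described above, the sketch below maps an image's median hue to a longitude and its median brightness to a latitude. The linear mapping, the function names `globe_position` and `to_cartesian`, and the use of HSV value as "brightness" are assumptions made for this example, not details taken from the paper. It also treats hue as a plain number, so it ignores the fact that hue is circular.

```python
import colorsys
import math
from statistics import median


def globe_position(pixels):
    """Map an image to (longitude, latitude) in degrees.

    pixels: iterable of (r, g, b) tuples with components in 0..255.
    Assumed mapping: median hue -> longitude in [-180, 180),
    median brightness (HSV value) -> latitude in [-90, 90].
    """
    hues, values = [], []
    for r, g, b in pixels:
        h, _s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        hues.append(h)
        values.append(v)
    longitude = median(hues) * 360.0 - 180.0
    latitude = median(values) * 180.0 - 90.0
    return longitude, latitude


def to_cartesian(longitude_deg, latitude_deg, radius=1.0):
    """Convert globe co-ordinates to a 3D point on a sphere."""
    lon = math.radians(longitude_deg)
    lat = math.radians(latitude_deg)
    return (
        radius * math.cos(lat) * math.cos(lon),
        radius * math.cos(lat) * math.sin(lon),
        radius * math.sin(lat),
    )


if __name__ == "__main__":
    # A tiny, mostly green "image" given as a list of pixels.
    image = [(0, 255, 0), (0, 200, 0), (10, 255, 10)]
    lon, lat = globe_position(image)
    print(lon, lat, to_cartesian(lon, lat))
```

Under this mapping, rotating the globe about its axis sweeps through hue, while moving from pole to pole sweeps from dark to bright images.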

Gerald Schaefer, Simon Ruszala
Viewpoint Interpolation Using an Ellipsoid Head Model for Video Teleconferencing

To establish eye contact in video teleconferencing, it is necessary to synthesize a front view image by viewpoint interpolation. After finding the viewing direction of a user, we can interpolate an image seen from the viewpoint, which will result in a front view image. Previous research falls into two categories: image-based methods and model-based methods. The former are simple but show limited performance for complex objects. The latter are robust to noise but computationally expensive. We propose to model a face with an ellipsoid and show through various experiments that the new method is simple and robust.

Na-Ree Yoon, Byung-Uk Lee
Real-Time Video Annotations for Augmented Reality

Augmented reality (AR) provides an intuitive user interface to present information in the context of the real world. A common application is to overlay screen-aligned annotations for real world objects to create in-situ information displays for users. While the referenced object’s location is fixed in the view, the annotating labels should be placed in such a way as to not interfere with other content of interest such as other labels or objects in the real world. We present a new approach to determine and track areas with less visual interest based on feature density and to automatically compute label layout from this information. The algorithm works in under 5 ms per frame, which is fast enough that it can be used with existing AR systems. Moreover, it provides the application designer with flexible constraints for controlling label placement behaviour. The resulting overlays are demonstrated with a simple hand-held augmented reality system for information display in a lab environment.

Edward Rosten, Gerhard Reitmayr, Tom Drummond
A Tree-Structured Model of Visual Appearance Applied to Gaze Tracking

In some computer vision applications, we may need to analyze large numbers of similar frames depicting various aspects of an event. In this situation, the appearance may change significantly within the sequence, hampering efforts to track particular features. Active shape models [1] offer one approach to this problem, by “learning” the relationship between appearance and world-state from a small set of hand-labeled training examples. In this paper we propose a method for partitioning the input image set which addresses two problems: first, it provides an automatic method for selecting a set of training images for hand-labeling; second, it results in a partitioning of the image space into regions suitable for local model adaptation. Repeated application of the partitioning procedure results in a tree-structured representation of the image space. The resulting structure can be used to define corresponding neighborhoods in the shape model parameter space; a new image may be processed efficiently by first inserting it into the tree, and then solving for model parameters within the corresponding restricted domain. The ideas are illustrated with examples from an outdoor gaze-tracking application.

Jeffrey B. Mulligan
Emotional Expression in Virtual Agents Through Body Language

Virtual agents are used to interact with humans in a myriad of applications. However, the agents often lack the believability necessary to maximize their effectiveness. These agents, or characters, lack personality and emotions, and therefore the capacity to emotionally connect and interact with the human. This deficiency prevents the viewer from identifying with the characters on a personal level. This research explores the possibility of automating the expression of a character’s mental state through its body language. Using a system that animates a character procedurally, we provide tools to modify the character’s body movements in real-time, so that they reflect the character’s mood, personality, interest, bodily pain, and emotions, all of which make up the current mental state of the character.

Vishal Nayak, Matthew Turk
Visual Tracking for Seamless 3D Interactions in Augmented Reality

This paper presents a computer-vision-based approach for creating 3D tangible interfaces, which can facilitate real-time and flexible interactions with the augmented virtual world. This approach uses real-world objects and free-hand gestures as interaction handles. The identity of these objects/gestures as well as their 3D pose in the physical world can be tracked in real time. Once the objects and gestures are perceived and localized, the corresponding virtual objects can be manipulated dynamically by human operators who are operating on those real objects. Since the tracking algorithm is robust against background clutter and adaptable to illumination changes, it performs well in real-world scenarios, where both objects and cameras move rapidly in unconstrained environments.

C. Yuan
ARISupport – Interaction Support for Augmented Reality Systems

The communication between user and software is a basic stage in any Interaction System project. In interactive systems, this communication is established by means of a graphical interface, whose objective is to supply a visual representation of the main entities and functions present in the Virtual Environment. New ways of interacting in computational systems have been minimizing the gap in the relationship between man and computer, and therefore enhancing its usability. The objective of this paper, therefore, is to present a proposal for a non-conventional user interface library called ARISupport, which supplies ARToolKit application developers with an opportunity to create simple GUI interfaces, and provides some of the functionality used in Augmented Reality systems.

Luiz Fernando Braga Lopes, Antonio Carlos Sementille, José Remo Ferreira Brega, Fátima L. S. Nunes Marques, Ildeberto Aparecido Rodello
Background Updating for Visual Surveillance

Scene changes such as moved objects, parked vehicles, or opened/closed doors need to be carefully handled so that interesting foreground targets can be detected along with the short-term background layers created by those changes. A simple layered modeling technique is embedded into a codebook-based background subtraction algorithm to update a background model. In addition, important issues related to background updating for visual surveillance are discussed. Experimental results on surveillance examples, such as unloaded packages and unattended objects, are presented by showing those objects as short-term background layers.

Kyungnam Kim, David Harwood, Larry S. Davis
Pattern Discovery for Video Surveillance

There is a need in many surveillance applications to automatically detect certain events, such as activities and/or behaviors exhibited by people, vehicles, or other moving objects. Existing systems require that every event be predefined and custom coded into the computer system. We present a novel system that can automatically capture and define (learn) new events by pattern discovery, and further present the events to the operator for confirmation. The operator checks the validity of the newly detected events and adds them to the event library. We also propose a new feature selection procedure that can uniquely identify important events such as people falling. We present experimental results on a real dataset, which show the effectiveness of the proposed method.

Yunqian Ma, Pradeep Buddharaju, Mike Bazakos
Real-Time Crowd Density Estimation Using Images

This paper presents a technique for real-time crowd density estimation based on textures of crowd images. In this technique, the current image from a sequence of input images is classified into a crowd density class. Then, the classification is corrected by a low-pass filter based on the crowd density classification of the last n images of the input sequence. The technique achieved a correct classification rate of 73.89% in a real-time application on a sequence of 9892 crowd images. Distributed processing was used in order to obtain real-time performance.
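
The abstract above does not say which low-pass filter is used, so the following is only a minimal sketch of the general idea: the raw per-frame class is replaced by the rounded moving average of the last n raw classes. The class name `DensitySmoother`, the integer encoding of density classes, and the choice of a moving average are all assumptions made for illustration.

```python
from collections import deque


class DensitySmoother:
    """Smooth per-frame crowd density classes with a moving average.

    Density classes are assumed to be ordered integers
    (e.g. 0 = very low ... 4 = very high).
    """

    def __init__(self, n=5):
        # Keep only the last n raw classifications.
        self.history = deque(maxlen=n)

    def update(self, raw_class):
        """Add the newest raw class and return the smoothed class."""
        self.history.append(raw_class)
        mean = sum(self.history) / len(self.history)
        # Round half up to the nearest class index.
        return int(mean + 0.5)


if __name__ == "__main__":
    smoother = DensitySmoother(n=5)
    raw = [1, 1, 1, 1, 3, 1, 3, 3, 3]
    print([smoother.update(c) for c in raw])
```

With a window of five frames, a single misclassified frame (the isolated 3) does not change the reported class, while a sustained change is followed after a short delay.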

A. N. Marana, M. A. Cavenaghi, R. S. Ulson, F. L. Drumond
Automatic Robust Background Modeling Using Multivariate Non-parametric Kernel Density Estimation for Visual Surveillance

The final goal for many visual surveillance systems is automatic understanding of events in a site. Higher level processing on video data requires certain lower level vision tasks to be performed. One of these tasks is the segmentation of video data into regions that correspond to objects in the scene. Issues such as automation, noise robustness, adaptation, and accuracy of the model must be addressed. Current background modeling techniques use heuristics to build a representation of the background, while it would be desirable to obtain the background model automatically. To increase modeling accuracy, the model needs to adapt to different parts of the same scene, and it has to be robust to noise. The building block of the model representation used in this paper is multivariate non-parametric kernel density estimation, which builds a statistical model for the background of the video scene based on the probability density function of its pixels. A post-processing step is applied to the background model to achieve the spatial consistency of the foreground objects.
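
To make the kernel density idea concrete, here is a simplified, univariate sketch: the density of a new pixel intensity is estimated from recent background samples at the same pixel with a Gaussian kernel, and the pixel is flagged as foreground when that density is low. The paper's model is multivariate and includes a post-processing step; the Gaussian kernel, the fixed bandwidth, the threshold, and the function names here are assumptions chosen only for illustration.

```python
import math


def kde_density(x, samples, bandwidth):
    """Gaussian kernel density estimate of x given background samples."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))
    total = 0.0
    for s in samples:
        z = (x - s) / bandwidth
        total += math.exp(-0.5 * z * z)
    return norm * total


def is_foreground(x, samples, bandwidth=5.0, threshold=1e-3):
    """Flag a pixel value as foreground when its density is below threshold."""
    return kde_density(x, samples, bandwidth) < threshold


if __name__ == "__main__":
    # Recent grey-level samples observed at one background pixel.
    history = [100, 102, 98, 101, 99]
    print(is_foreground(100, history))  # value close to the background samples
    print(is_foreground(200, history))  # value far from the background samples
```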

Alireza Tavakkoli, Mircea Nicolescu, George Bebis
Recognition of Complex Human Behaviors in Pool Environment Using Foreground Silhouette

This paper presents a vision system which allows real-time recognition of temporal swimming activities and the detection of drowning incidents. Operating with a set of techniques, the developed system focuses on two fundamental issues: i) how to analyze temporal behavior and ii) how to incorporate expert knowledge. To perform the recognition of different behaviors, data fusion and Hidden Markov Model (HMM) techniques are implemented. A polynomial classifier is introduced to deal with noisy foreground descriptors caused by poor resolution and sensory noise. It addresses the nonlinear interactions among different dimensions of foreground descriptors while preserving the linear estimation property. The HMM is used to model the state transition process, which yields a simple and efficient probabilistic inference engine. This work reports the results of extensive on-site experiments. The results demonstrate reasonably good performance, specifically in terms of false alarm rates and detection of genuine water crises.

How-Lung Eng, Kar-Ann Toh, Wei-Yun Yau, Tuan-Kiang Chiew
Adaptive Background Subtraction with Multiple Feedbacks for Video Surveillance

Background subtraction is the first step for video surveillance. Almost all existing methods update their background models with a constant learning rate, which prevents them from adapting to some complex situations, e.g., crowded scenes or objects staying for a long time. In this paper, a novel framework which integrates both positive and negative feedbacks to control the learning rate is proposed. The negative feedback comes from background contextual analysis and the positive feedback comes from the foreground region analysis. Two descriptors of global contextual features are proposed and the visibility measures of background regions are derived based on contextual descriptors. Spatial-temporal features of the foreground regions are exploited. By fusing both positive and negative feedbacks, a suitable background-updating strategy for a specified surveillance task can be implemented. Three strategies for short-term, selective and long-term surveillance have been implemented and tested. Improved results compared with conventional background subtraction have been obtained.

Liyuan Li, Ruijiang Luo, Weimin Huang, Karianto Leman, Wei-Yun Yau
A Vectorial Self-dual Morphological Filter Based on Total Variation Minimization

We present a vectorial self-dual morphological filter. Contrary to many methods, our approach does not require the use of an ordering on vectors. It relies on the minimization of the total variation with an L1 norm as data fidelity on each channel. We further constrain this minimization in order not to create new values. It is shown that this minimization yields a self-dual and contrast invariant filter. Although the above minimization is not a convex problem, we propose an algorithm which computes a global minimizer. This algorithm relies on minimum cost cut-based optimizations.

Jérôme Darbon, Sylvain Peyronnet
Wavelet Transform Based Gaussian Point Spread Function Estimation

Point spread function (PSF) estimation, an essential part of image restoration, has no accurate estimation algorithm at present. Based on wavelet theory, a new accurate estimation algorithm for Gaussian PSFs is put forward in this paper. Firstly, the blurred images are smoothed, and their noise is reduced. Secondly, the wavelet transform is computed at varied scales, after which the local maxima of the modulus of the wavelet transform are computed at each scale. Thirdly, on the basis of the relation deduced in this paper among the local maxima of the modulus of the wavelet transform at different scales, the Lipschitz exponent and the variance, the variance of a Gaussian PSF is computed. The experimental result shows that the proposed algorithm has an accuracy rate as high as 95%, and is of great application value.

Qing-Chuan Tao, Xiao-Hai He, Hong-Bin Deng, Ying Liu, Jia Zhao
One-Point Hexagonal Inner Search for Fast Motion Estimation

In this paper, we propose a novel inner search algorithm for fast motion estimation to speed up coding in video compression. The proposed algorithm, called one-point hexagonal inner search (OPHIS), is based on the monotonic decrease of distortion within a local area. The most probable inner point is checked when an inner search starts. Compared with the enhanced hexagon-based search algorithm (EHEXBS), the OPHIS not only decreases the number of search points but also achieves better video quality. Experimental results show that the speed improvement is about 12.06% on average and the decrease in mean squared error is close to 2.29% on average. Therefore, the proposed OPHIS is more suitable for fast motion estimation applications than the EHEXBS.

Index Terms: one-point hexagonal inner search, fast motion estimation, hexagonal search, enhanced hexagon-based search algorithm.

Chorng-Yann Su, Cheng-Tao Chang
Self-Describing Context-Based Pixel Ordering

In this paper we introduce a novel self-describing context-based pixel ordering for digital images. Our method is inherently reversible and uses the pixel value to guide the exploration of the two-dimensional image space, in contrast to universal scans where the traversal is based solely on the pixel position. The outcome is a one-dimensional representation of the image with enhanced autocorrelation. When used as a front-end to a memoryless entropy coder, empirical results show that our method, on average, improves the compression rate by 11.56% and 5.23% compared to raster-scan and Hilbert space-filling curve, respectively.

Abdul Itani, Manohar Das
Lossless Compression of CCD Sensor Data

An evaluation of lossless image compression methods applied to CCD sensor data is provided. Ten different image compression algorithms were tested on an image set of more than 200 images organised according to the Bayer pattern configuration. The results show that Glicbawls is the best performing algorithm providing the highest compression ratio followed by CALIC, JPEG-LS, and JPEG2000.

Gerald Schaefer, Joanna Obstoj
Geometric Approach to Segmentation and Protein Localization in Cell Cultured Assays

Cell-based fluorescence imaging assays are heterogeneous, requiring collection of a large number of images for detailed quantitative analysis. Complexities arise as a result of variation in spatial nonuniformity, shape, overlapping compartments, and scale. A new technique and methodology have been developed and tested for delineating subcellular morphology and partitioning overlapping compartments at multiple scales. This system is packaged as an integrated software platform for quantifying images that are obtained through fluorescence microscopy. The proposed methods are model-based, leveraging geometric shape properties of subcellular compartments and corresponding protein localization. From the morphological perspective, a convexity constraint is imposed to delineate, partition, and group nuclear compartments. From the protein localization perspective, radial symmetry is imposed to localize punctate protein events at sub-micron resolution. The technique has been tested against 196 images that were generated to study centrosome abnormalities. Computed representations are evaluated against the ground truth annotation for comparative analysis.

S. Raman, B. Parvin, C. Maxwell, M. H. Barcellos-Hoff
Multi-level Thresholding Using Entropy-Based Weighted FCM Algorithm in Color Image

This paper proposes a multi-level thresholding method for color images based on a weighted FCM (Fuzzy C-Means) algorithm. The FCM algorithm can determine a better thresholding value than existing methods and can be extended to multi-level thresholding, yet it is sensitive to noise, as it does not include spatial information. To solve this problem, a weight based on the entropy obtained from neighboring pixels is applied to the FCM algorithm, and the optimal cluster number is determined using the within-class distance in the code image based on the clustered pixels for each color component. Experiments confirmed that the proposed method was more tolerant to noise and superior to existing methods.

Jun-Taek Oh, Hyun-Wook Kwak, Young-Ho Sohn, Wook-Hyun Kim
Adaptive Robust Structure Tensors for Orientation Estimation and Image Segmentation

Recently, Van Den Boomgaard and Van De Weijer have presented an algorithm for texture analysis using robust tensor-based estimation of orientation. Structure tensors are a useful tool for reliably estimating oriented structures within a neighborhood and in the presence of noise. In this paper, we extend their work by using the Geman-McClure robust error function and developing a novel iterative scheme that adaptively and simultaneously changes the size, orientation and weighting of the neighborhood used to estimate the local structure tensor. The iterative neighborhood adaptation is initialized using the total least-squares solution for the gradient using a relatively large isotropic neighborhood. Combining our novel region adaptation algorithm with a robust tensor formulation leads to better localization of low-level edge and junction image structures in the presence of noise. Preliminary results using synthetic and biological images are presented.

Sumit K. Nath, Kannappan Palaniappan
Structural and Textural Skeletons for Noisy Shapes

The extraction of consistent skeletons in the presence of boundary noise is still a problem for most skeletonization algorithms. Many suppress skeletons associated with boundary perturbation, either by preventing their formation or removing them subsequently using additional operations. A more appropriate approach is to view a shape as comprising structural and textural skeletons. The former describes the general structure of the shape and the latter its boundary characteristics. These two types of skeletons should be encouraged to remain disconnected to facilitate gross shape matching without the need for branch pruning. Such skeletons can be formed by means of a multi-resolution gradient vector field (MGVF), which can be generated efficiently using a pyramidal framework. The robust scale-invariant extraction of the skeletons from the MGVF is described. Experimental results show that the MGVF structural skeletons are less affected by boundary noise than skeletons extracted by other popular iterative and non-iterative techniques.

Wooi-Boon Goh, Kai-Yun Chan
Accurate and Efficient Computation of High Order Zernike Moments

Zernike moments are useful tools in pattern recognition and image analysis due to their orthogonality and rotation invariance properties. However, direct computation of these moments is very expensive, limiting their use especially at high orders. There have been some efforts to reduce the computational cost by employing quantized polar coordinate systems, which also reduce the accuracy of the moments. In this paper, we propose an efficient algorithm to accurately calculate Zernike moments at high orders. To preserve accuracy, we do not use any form of coordinate transformation and employ arbitrary precision arithmetic. The computational complexity is reduced by detecting the common terms in Zernike moments with different order and repetition. Experimental results show that our method is more accurate than other methods and has comparable computational complexity, especially for large images and high order moments.

Gholamreza Amayeh, Ali Erol, George Bebis, Mircea Nicolescu
3D Model Generation from Image Sequences Using Global Geometric Constraint

This paper describes a method for generating a three-dimensional model from an uncalibrated image sequence taken around an object. Our method is based on feature tracking and minimization of re-projection errors. To cope with mismatches in the result of feature tracking, we introduce two types of global geometric constraints. The first is the “affine constraint”, which imposes the positional relationship between pixels on the images. The second is the “depth constraint”, which imposes the three-dimensional structure of the object. First, we use the affine constraint to reconstruct the object roughly and then we refine the feature tracking and shape reconstruction using the depth constraint. Experimental results show that our method can automatically generate accurate three-dimensional models from real image sequences.

Masayuki Mukunoki, Kazutaka Yasuda, Naoki Asada
Efficient Shot Boundary Detection for Action Movies Using Blockwise Motion-Based Features

Conventional shot boundary detection (SBD) algorithms have limitations in handling video data that contain fast illumination changes or rapid motions of objects and background. This paper presents an SBD algorithm that is efficient for action movies/videos. A measure of discontinuity in camera and object/background motion is proposed for SBD based on the combination of two motion features: the modified displaced frame difference (DFD) and the blockwise motion similarity. A variant of the block matching algorithm (BMA) is developed for the former and we find that it is a robust SBD feature for fast motion video. The latter measures the similarity of motion directions of small local regions such as macroblocks (MBs). Experimental results with seven test action movies show the effectiveness of the proposed algorithm. The proposed features can be easily applied to compressed video.

Min-Ho Park, Rae-Hong Park, Sang Wook Lee
Text Localization and Extraction from Complex Color Images

The availability of mobile and hand-held imaging devices such as cell phones, PDAs, and still and video cameras has resulted in new applications in which the text present in the acquired images is extracted and interpreted for various purposes. In this paper, we present a new algorithm for automatic detection of text in color images. The proposed system applies Gabor-function-based multi-channel filtering to the intensity component of the image, along with graph-theoretical clustering applied to the color space of the same image, thereby utilizing the advantages of texture analysis as well as those of connected components for text detection. Our approach performs well on images with complex backgrounds.

S. Sabari Raju, P. B. Pati, A. G. Ramakrishnan
Using Linguistic Models for Image Retrieval

This research addresses the problem of image retrieval by exploring the semantic relationships that exist between image annotations. This is done by using linguistic relationships encoded in WordNet, a comprehensive lexical repository. Additionally, we propose the use of a reflective user interface where users can interactively query and explore semantically related images by varying a simple parameter that does not require knowledge about the underlying information structure. This facilitates query-retrieval in the context of the emergent nature of semantics that complex media, such as images, have. Experiments show the efficacy and promise of this approach, which can play a significant role in applications ranging from multimedia information management to web-based image search.

Brian Zambrano, Rahul Singh, Bibek Bhattarai
Content-Based Image Retrieval Via Vector Quantization

Image retrieval and image compression are each areas that have received considerable attention in the past. However there have been fewer advances that address both these problems simultaneously. In this work, we present a novel approach for content-based image retrieval (CBIR) using vector quantization (VQ). Using VQ allows us to retain the image database in compressed form without any need to store additional features for image retrieval. The VQ codebooks serve as generative image models and are used to represent images while computing their similarity. The hope is that encoding an image with a codebook of a similar image will yield a better representation than when a codebook of a dissimilar image is used. Experiments performed on a color image database over a range of codebook sizes support this hypothesis and retrieval based on this method compares well with previous work.
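
The hypothesis in the abstract above can be sketched as follows: encode the query image's block vectors with each database image's codebook and rank the database images by the resulting distortion, lowest first. This is only an illustrative sketch. The squared-error distortion, the function names, and the toy two-dimensional vectors are assumptions, and codebook training is taken as already done.

```python
def encoding_distortion(vectors, codebook):
    """Mean squared error when each vector is replaced by its nearest codeword."""
    total = 0.0
    for v in vectors:
        total += min(
            sum((a - b) ** 2 for a, b in zip(v, codeword))
            for codeword in codebook
        )
    return total / len(vectors)


def rank_by_codebook(query_vectors, codebooks):
    """Return database image names ordered from most to least similar.

    codebooks: dict mapping an image name to its list of codewords.
    """
    scores = {
        name: encoding_distortion(query_vectors, codebook)
        for name, codebook in codebooks.items()
    }
    return sorted(scores, key=scores.get)


if __name__ == "__main__":
    # Hypothetical block vectors from a query image.
    query = [(10, 10), (12, 11)]
    # Hypothetical pre-trained codebooks of two database images.
    database = {
        "dark_image": [(0, 0), (5, 5)],
        "mid_image": [(10, 10), (12, 12)],
    }
    print(rank_by_codebook(query, database))
```

Because only the codebooks are consulted, no separate feature vectors need to be stored next to the compressed images, which is the property the abstract highlights.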

Ajay H. Daptardar, James A. Storer
Multi-aspect Target Tracking in Image Sequences Using Particle Filters

This paper addresses the issue of multi-aspect target tracking where the target’s aspect is modeled by a continuous-valued affine model. The affine parameters are assumed to follow first-order Markov models and are augmented with the target’s kinematic parameters in the state vector. Three particle filtering algorithms, Sequential Importance Re-sampling (SIR), the Auxiliary Particle Filter (APF1), and a modified APF (APF2), are implemented and compared along with a new initialization technique. Specifically, APF2 involves two likelihood functions and a re-weighting scheme to balance the diversity and the focus of particles. Results on simulated infrared image sequences show that the proposed APF2 algorithm significantly outperforms the SIR and APF1 algorithms for multi-aspect target tracking in terms of robustness, accuracy and complexity.

Li Tang, Vijay Bhaskar Venkataraman, Guoliang Fan
Segmentation and Recognition of Traffic Signs Using Shape Information

This paper proposes a method for traffic sign recognition and segmentation using the shape information of traffic signs. First, a connected component algorithm is used to segment candidate traffic sign regions from a binary image obtained based on the RGB color ratio of each pixel in the image. Then actual traffic sign regions are segmented based on their X- and Y-axes symmetry. The recognition step utilizes shape information, including a moment, edge correlogram, and the number of times a concentric circular pattern from the region center intersects with the frequency information extracted by the wavelet transform. Finally, recognition is performed by measuring the similarity with templates in a database. Experimental results confirm the validity of the proposed method as regards geometric transformations and environmental factors.

Jun-Taek Oh, Hyun-Wook Kwak, Young-Ho Sohn, Wook-Hyun Kim
Detection and Tracking Multiple Pedestrians from a Moving Camera

This paper presents a method to detect and track multiple pedestrians from a moving camera. First, a block matching algorithm (BMA) is used to obtain a motion vector from two consecutive input frames. A frame difference image is then generated by motion compensation with the motion vector. Second, pedestrians are detected by converting the frame difference image into a binary image, removing noise, and processing a projection histogram. A color histogram is then applied to the obtained pedestrian region to separate adjacent pedestrians. Finally, color segmentation and the color mean value are used to track the detected pedestrians. The experimental results on our test sequences demonstrate the high efficiency of our method.

Jong Seok Lim, Wook Hyun Kim
Event Detection in Underground Stations Using Multiple Heterogeneous Surveillance Cameras

An automatic event detection system is presented that addresses the problem of safety in underground and train stations. The proposed system is based on video analysis from multiple heterogeneous cameras, including sensors in the visible and in the infrared spectrum. Video analysis on surveillance footage from underground stations is a challenging task because of poor image quality, low contrast between pedestrians and the platform, reflections and occlusions. To overcome these problems, statistical analysis, information fusion and domain knowledge are exploited. First, we perform robust object detection in each sensor using statistical colour change detection and a continuously updated background model. Then, we integrate the results using domain knowledge and a common ground plane for all cameras. Finally, a binary decision tree is defined to detect events of interest. The effectiveness of the method is demonstrated on the dataset of the Challenge for Real-time Events Detection Solutions (CREDS).

Andrea Cavallaro
Large-Scale Geospatial Indexing for Image-Based Retrieval and Analysis

We describe a method for indexing and retrieving high-resolution image regions in large geospatial data libraries. An automated feature extraction method is used that generates a unique and specific structural description of each segment of a tessellated input image file. These tessellated regions are then merged into similar groups and indexed to provide flexible and varied retrieval in a query-by-example environment.

Kenneth W. Tobin, Budhendra L. Bhaduri, Eddie A. Bright, Anil Cheriyadat, Thomas P. Karnowski, Paul J. Palathingal, Thomas E. Potok, Jeffery R. Price
An Interactive Visualization Method for Integrating Digital Elevation Models and Geographic Information Systems Vector Layers

Currently there are software packages that allow a user to analyze GIS data. The problem is that the software is limited in the interaction provided. This paper presents a method for merging digital elevation models (DEM) and GIS vector layers, and provides an interactive visualization tool for analyzing the output.

J. Stuart, J. Jaquish, S. Bassett, F. Harris, W. Sherman
Splines Interpolation in High Resolution Satellite Imagery

In this paper some insights into the behavior of interpolation functions for resampling high resolution satellite images are presented. Using spatial and frequency domain characteristics, splines interpolation performance is compared to nearest-neighbor, linear and cubic interpolation. It is shown that splines interpolation injects spatial information into the final resampled image better than the other three methods. Splines interpolation is also shown to be faster than cubic interpolation when the former is implemented with the LU decomposition algorithm for its tridiagonal system of linear equations. Therefore, if the main purpose of high resolution satellite resampling is to obtain an optimal smooth final image, intuitive and experimental justifications are provided for preferring splines interpolation to nearest-neighbor, linear and cubic interpolation.
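
The tridiagonal system mentioned above can be illustrated in one dimension. The sketch below builds a natural cubic spline by solving for the second derivatives at the knots with the Thomas algorithm, which is LU decomposition specialised to tridiagonal matrices. The natural boundary conditions, the one-dimensional setting, and the function names are assumptions for this example; the paper's actual resampling scheme for images may differ.

```python
import bisect


def solve_tridiagonal(lower, diag, upper, rhs):
    """Solve a tridiagonal system with the Thomas algorithm (no pivoting)."""
    n = len(rhs)
    c_prime = [0.0] * n
    d_prime = [0.0] * n
    c_prime[0] = upper[0] / diag[0]
    d_prime[0] = rhs[0] / diag[0]
    # Forward elimination.
    for i in range(1, n):
        pivot = diag[i] - lower[i] * c_prime[i - 1]
        c_prime[i] = upper[i] / pivot
        d_prime[i] = (rhs[i] - lower[i] * d_prime[i - 1]) / pivot
    # Back substitution.
    x = [0.0] * n
    x[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d_prime[i] - c_prime[i] * x[i + 1]
    return x


def natural_cubic_spline(xs, ys):
    """Return a function evaluating the natural cubic spline through (xs, ys).

    xs must be strictly increasing and contain at least two knots.
    """
    n = len(xs) - 1
    h = [xs[i + 1] - xs[i] for i in range(n)]
    lower = [0.0] * (n + 1)
    diag = [1.0] * (n + 1)   # first and last rows force zero end curvature
    upper = [0.0] * (n + 1)
    rhs = [0.0] * (n + 1)
    for i in range(1, n):
        lower[i] = h[i - 1]
        diag[i] = 2.0 * (h[i - 1] + h[i])
        upper[i] = h[i]
        rhs[i] = 6.0 * ((ys[i + 1] - ys[i]) / h[i] - (ys[i] - ys[i - 1]) / h[i - 1])
    m = solve_tridiagonal(lower, diag, upper, rhs)  # second derivatives at knots

    def evaluate(x):
        # Locate the interval [xs[i], xs[i + 1]] containing x (clamped at the ends).
        i = min(max(bisect.bisect_right(xs, x) - 1, 0), n - 1)
        left = xs[i + 1] - x
        right = x - xs[i]
        return (
            (m[i] * left ** 3 + m[i + 1] * right ** 3) / (6.0 * h[i])
            + (ys[i] / h[i] - m[i] * h[i] / 6.0) * left
            + (ys[i + 1] / h[i] - m[i + 1] * h[i] / 6.0) * right
        )

    return evaluate


if __name__ == "__main__":
    # Resample one row of (hypothetical) pixel values at a half-pixel position.
    row = natural_cubic_spline([0.0, 1.0, 2.0, 3.0], [10.0, 12.0, 9.0, 11.0])
    print(row(1.5))
```

The solve costs a number of operations proportional to the number of knots, which is consistent with the abstract's point that this implementation keeps splines interpolation fast.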

José A. Malpica
Tool for Storm Analysis Using Multiple Data Sets

This note describes a web-based tool for storm analysis using multiple data sets, developed for thunderstorm research and forecasting applications. The tool lets users monitor atmospheric changes along the path of storm systems. It demonstrates the use of the Thematic Realtime Environmental Distributed Data Services (THREDDS) and other data sets.

Robert M. Rabin, Tom Whittaker
3D Modeling and Adaptation for Virtual Heritage System

Recently, virtual heritage has emerged as a promising technology for conservation, preservation, and interpretation of our culture and natural history. We implemented a virtual heritage system with real data acquisition and an MPEG-21 based graphics adaptation architecture for transmitting adapted 3D content to multiple target devices. Our system consists of 3D modeling tools, 3D content authoring tools, a 3D database system and a 3D graphics adaptation system. In this paper, we provide an overview of our system, including our approach to 3D modeling, a web-based interactive presentation over the network, and a proposed graphics adaptation system within the MPEG-21 framework for transmitting adapted 3D content to multiple target devices with different user preferences.

Minh Tuan Le, Hae-Kwang Kim, Yong-Moo Kwon
Direct Point Rendering on GPU

In this paper, we propose a method for directly rendering point sets that carry only positional information, using recent graphics processors (GPUs). Almost all of the algorithms in our method run on the GPU. Our point-based rendering algorithms use an image buffer with a lower resolution than the frame buffer. Normal vectors are computed and various types of noise are reduced on this image buffer. Our approach therefore produces high-quality images even for noisy point clouds, especially those acquired by 3D scanning devices. Our approach also uses splats in the actual rendering process. However, the number of points rendered is in general smaller than the number of input points, because only selected points on the image buffer are used; this allows our approach to run faster than previous GPU-based point rendering approaches.

Hiroaki Kawata, Takashi Kanai
An Artistic Portrait Caricature Model

We present a new model for artistic portrait caricature modeling and rendering. It is composed of a caricature model and a rendering system. The computer portrait caricature is based on exaggerating facial features according to measurements taken on the head. A canon provides “golden proportions” that idealize the adult male or female head. From automatic measurements taken on a 3D head mesh, the model produces a new deformed mesh. Using the deformed mesh, the renderer produces images in real time in a hand-drawn line drawing style.

V. Boyer
Capturing and View-Dependent Rendering of Billboard Models

In this paper, we propose a method for obtaining a textured billboard representation of a static scene, given a sequence of calibrated video images. Each billboard is a textured and partially transparent plane onto which the input images are mapped using perspective projection. Binning with the Hough transform is used to find the positions of the billboards, and optic flow measures are used to determine their textures. Since these billboards are correct only from specific viewpoints, view-dependent rendering is used to choose and display appropriate billboards to reproduce the input.

Oliver Lee, Anusheel Bhushan, Pablo Diaz-Gutierrez, M. Gopi
Error-Bounded Solid Voxelization for Polygonal Model Based on Heuristic Seed Filling

Although surface voxelization is becoming a mature field, solid voxelization still lags behind because robust and efficient means of processing complex polygonal models are lacking. Our paper presents a fast and easy-to-implement error-bounded solid voxelization algorithm in three subtasks: (i) PVM (Primary Volume Model), in which most of the non-object voxels are eliminated; (ii) BVM (Boundary Volume Model), which provides a reliable discrete object boundary; (iii) RVM (Revised Volume Model), which is the result of Heuristic Seed Filling, the critical portion of our algorithm. In the third subtask, the non-object voxels from PVM and the object voxels from BVM form the boundaries, and the seeds are likewise computed from PVM and BVM. Compared with traditional seed filling, our approach achieves significantly improved performance and produces error-bounded experimental results. Furthermore, our method is more robust than 3D scan-filling.
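
For orientation, the traditional seed filling that the abstract above uses as its baseline can be sketched as a breadth-first flood fill. This is not the paper's Heuristic Seed Filling and does not build the PVM or BVM; it is a minimal illustration which assumes the boundary voxels already form a closed, 6-connected-separating shell. The function name and grid layout are hypothetical.

```python
from collections import deque


def solid_voxelize(boundary):
    """Return a solid grid from a boundary grid indexed as boundary[z][y][x].

    Non-boundary voxels on the faces of the grid act as exterior seeds.
    Everything the fill cannot reach (boundary plus interior) is marked solid.
    """
    nz, ny, nx = len(boundary), len(boundary[0]), len(boundary[0][0])
    outside = [[[False] * nx for _ in range(ny)] for _ in range(nz)]
    queue = deque()
    for z in range(nz):
        for y in range(ny):
            for x in range(nx):
                on_face = (z in (0, nz - 1) or y in (0, ny - 1)
                           or x in (0, nx - 1))
                if on_face and not boundary[z][y][x]:
                    outside[z][y][x] = True
                    queue.append((z, y, x))
    steps = ((1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1))
    while queue:
        z, y, x = queue.popleft()
        for dz, dy, dx in steps:
            a, b, c = z + dz, y + dy, x + dx
            if (0 <= a < nz and 0 <= b < ny and 0 <= c < nx
                    and not outside[a][b][c] and not boundary[a][b][c]):
                outside[a][b][c] = True
                queue.append((a, b, c))
    return [[[not outside[z][y][x] for x in range(nx)]
             for y in range(ny)] for z in range(nz)]
```

A plain fill like this visits every exterior voxel of the bounding grid, which is the cost the abstract's first subtask reduces by eliminating most non-object voxels beforehand.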

Jianguang Weng, Yueting Zhuang, Hui Zhang
Riemannian Mean Curvature Flow

In this paper we explicitly derive a level set formulation for mean curvature flow in a Riemannian metric space. This extends the traditional geodesic active contour framework which is based on conformal flows. Curve evolution for image segmentation can be posed as a Riemannian evolution process where the induced metric is related to the local structure tensor. Examples on both synthetic and real data are shown.

Raúl San José Estépar, Steve Haker, Carl-Fredrik Westin
3D Shape from Unorganized 3D Point Clouds

We present a framework to automatically infer topology and geometry from an unorganized 3D point cloud obtained from a 3D scene. If the cloud is not oriented, we use existing methods to orient it prior to recovering the topology. We develop a quality measure for scoring a chosen topology/orientation. The topology is used to segment the cloud into manifold components and later in the computation of shape descriptors.

George Kamberov, Gerda Kamberova, Amit Jain
3D Hand Pose Reconstruction with ISOSOM

We present an appearance-based 3D hand posture estimation method that determines a ranked set of possible hand posture candidates from an unmarked hand image, based on an analysis-by-synthesis method and an image retrieval algorithm. We formulate the posture estimation problem as a nonlinear, many-to-many mapping problem in a high-dimensional space. A general algorithm called ISOSOM is proposed for nonlinear dimension reduction and applied to 3D hand pose reconstruction to establish the mapping relationships between the hand poses and the image features. In order to interpolate the intermediate posture values given the sparse sampling of ground-truth training data, the geometric map structure of the samples’ manifold is generated. The experimental results show that the ISOSOM algorithm performs better than traditional image retrieval algorithms for hand pose estimation.

Haiying Guan, Matthew Turk
A Motion Capture System for Sign Language Synthesis: Overview and Related Issues

HANDY[1] is a project for the synthesis of articulatory sign-language as used by the deaf. This paper describes a sub-project named HANDYCap, which is used as a prior stage to the actual sign-language synthesis. HANDYCap is a low-cost hand-motion capture and data-processing system which we use to enable semi-automatic construction of the sign database. Additionally, the paper gives a broad survey of the technical and linguistic issues relating to the signing task. The optical tracking system used for motion-capture is described, together with related issues. In performing the database-construction task it is necessary to pay attention to the linguistic background, as well as to the problems of loading and storing the data efficiently and correctly.

László Havasi, Helga M. Szabó
Dynamic Visualization of Spatially Referenced Information

This paper proposes a dynamic visualization approach that combines the graph metaphor and the spatial metaphor for visualizing and navigating spatially referenced data. It provides users with a “marching” scheme for viewing a series of graphs $G_1, G_2, \ldots, G_n$ that are related to a spatial map. Moreover, it provides three types of navigation mechanisms. This dynamic visualization can facilitate the processes of spatially referenced data mining.

Wu Quan, Mao Lin Huang
WYSIWYG-Tool Tips: Enhancing Tool Tips with Translucent Preview Bitmaps

This paper proposes enhancing the concept of tool tips by presenting translucent preview bitmaps. The basic idea is to give the user a preview of what would happen when a certain operation is invoked. We have implemented two prototypes to evaluate the concept. The first provides WYSIWYG (What You See Is What You Get) tool tips for documents and applications on the desktop. The second is an application that provides previews of dialogs while the user browses through the menu. Initial experience shows that this can benefit the user, who sees what is going to happen rather than only receiving feedback on what has happened. This is of particular interest for actions that cannot be undone.

Heiko Drewes, Albrecht Schmidt
Motion Visualization of Ultrasound Imaging

This paper presents a method for motion visualization in medical ultrasound imaging based on line integral convolution (LIC) algorithms. The motion vector is derived by a hierarchical block matching algorithm with sub-pixel accuracy. The original pixel-based LIC algorithm is then extended to block-based processing, and the proposed method can handle both steady and unsteady motion, with the size of the convolution filter being a function of the motion strength. Test results on both synthetic and in-vivo ultrasound images are reported to verify the effectiveness of the proposed approach. We also present a display technique that shows the grey-scale LIC image for the motion direction and a color-coded motion amplitude to emphasize the motion strength.

Dong C. Liu, Longlong Hou, Paul S. Liu
Two Novel Complete Sets of Similarity Invariants

In this paper, we propose two complete sets of similarity invariant descriptors under the Fourier-Mellin Transform and the Analytical Fourier-Mellin Transform (AFMT) frameworks, respectively. Furthermore, their numerical properties are presented and demonstrated through image reconstruction. Experimental results indicate that our proposed invariant descriptors can fully reconstruct the original image while eliminating any similarity transformation (such as rotation, translation and scale) present in the original image.

Hongchuan Yu, Mohammed Bennamoun
Detection of Text Region and Segmentation from Natural Scene Images

In this paper, we present an approach to the segmentation and detection of text regions in natural scenes. First, clustering-based natural scene segmentation is performed on the histograms of the hue and intensity components separately. Second, a text region extraction method is proposed that uses wavelet-based features to represent the input patterns and a neural network architecture as the classifier. The effectiveness and reliability of the proposed method are demonstrated experimentally on various natural scene images.

Jonghyun Park, Soonyoung Park
ARM Based Microcontroller for Image Capturing in FPGA Design

This paper presents how an ARM7 was designed by repeatedly adding groups of instructions, and how the system was verified on a self-developed FPGA board. This ARM7 was then connected to a CMOS image capturing and processing unit (IPU) implemented on another FPGA board, so that the ARM7 becomes a microcontroller for the IPU. The IPU integrates image capturing, convolution and sorting in the FPGA to perform 3-stage pipelined operations that speed up system operation. Convolution and sorting further help filter out the fixed pattern noise (FPN) and dark current noise of the CMOS image sensor, resulting in better image quality. The FPGA board with the ARM7 and IPU could be used for teaching CPU design, controller design and system-on-chip (SoC) design, since all circuits are in a single FPGA chip.

Chi-Jeng Chang, Wu-Ting Wu, Hui-Ching Su, Zen-Yi Huang, Hsin-Yen Li
Contraction Mapping Calibration

In this paper a new calibration scheme for recovering Euclidean camera parameters from their affine or projective primitives is presented. It is based on a contraction mapping, implying that the obtained solution is unique, i.e. no local minima threaten to yield a non-optimal solution. The approach unifies Euclidean calibration from affine and projective configurations, and fewer cameras ($m \geq 2$) need to be available than in traditional schemes. The algorithm is validated on synthetic and real data.

Nicolas Guilbert, Anders Heyden
Discrimination of Natural Contours by Means of Time-Scale-Frequency Decompositions

This paper evaluates the discriminative potential of time-scale-frequency decompositions for contour-based recognition of natural shapes. Specifically, it provides the analysis and comparison of descriptors derived from the Fourier Transform, the Short-Time Fourier Transform, the Wavelet Transform and the Multi-Resolution Fourier Transform. Linear Discriminant Analysis and Backward Sequential Selection are employed for dimensionality reduction and selection of the most significant features, respectively. A Bayesian Classifier is used for class discrimination. To improve discrimination, a hierarchical classification is adopted. The approaches are analyzed and compared in experiments on digitized leaves.

Leandro A. Loss, Clésio L. Tozzi
Color and Edge Refinement Method for Content Based Image Retrieval

Standard histograms, because of their efficiency and insensitivity to small changes, are widely used for content based image retrieval. But the main disadvantage of histograms is that many images of different appearances can have similar histograms, because histograms provide only a coarse characterization of an image. Color histograms too are widely used and suffer from the same problem. In this paper, we define a technique based on Histogram Refinement [1], which we call Color and Edge Refinement. Like histogram refinement, the Color and Edge Refinement method splits the pixels in a given bucket into several classes. The classes are all related to colors and edges and are based on color and edge coherence vectors.

Taesu Park, Minhyuk Chang, Jongan Park
Selecting a Discriminant Subset of Co-occurrence Matrix Features for Texture-Based Image Retrieval

In the general case, searching for images in a content-based image retrieval (CBIR) system amounts essentially, and unfortunately, to a sequential scan of the whole database. In order to accelerate this process, we want to generate summaries of the image database. In this paper, we focus on the selection of the texture features that will be used as a signature in our forthcoming system. We analysed the descriptors extracted from grey-level co-occurrence matrices (COM) under the constraints imposed by database systems.
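
As background for the abstract above, a grey-level co-occurrence matrix counts how often a pixel with grey level i has a neighbour with grey level j at a fixed offset. The sketch below builds one normalised matrix and derives two classic descriptors from it (contrast and energy). The function names, the single offset, and the choice of these two descriptors are assumptions made for illustration; the abstract does not say which descriptors the authors finally select.

```python
def cooccurrence_matrix(image, levels, dx=1, dy=0):
    """Normalised co-occurrence matrix of a 2D list of integer grey levels.

    Entry [i][j] is the fraction of pixel pairs (p, q), with q offset from p
    by (dx, dy), such that p has level i and q has level j.
    """
    counts = [[0] * levels for _ in range(levels)]
    height, width = len(image), len(image[0])
    for y in range(height):
        for x in range(width):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < height and 0 <= x2 < width:
                counts[image[y][x]][image[y2][x2]] += 1
    total = sum(sum(row) for row in counts)
    if total == 0:
        return counts  # offset larger than the image: no pairs to count
    return [[c / total for c in row] for row in counts]


def contrast(p):
    """Sum of (i - j)^2 * p[i][j]; large when neighbouring levels differ a lot."""
    return sum((i - j) ** 2 * v for i, row in enumerate(p) for j, v in enumerate(row))


def energy(p):
    """Sum of squared entries; large when a few level pairs dominate."""
    return sum(v * v for row in p for v in row)
```

A texture signature for retrieval would typically concatenate such descriptors over several offsets, which is where selecting a small discriminant subset becomes relevant.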

Najlae Idrissi, José Martinez, Driss Aboutajdine
An Automatic Relevance Feedback in Image Retrieval Using Belief Functions

This paper proposes an automatic relevance feedback approach for content-based image retrieval that uses information fusion and requires no user input. The method is proposed as an alternative to a simple ranking of the result images. The idea is to move from a single user-selected query image to a multi-image query in order to gain more information about the type of the query image. Given a query image, the system first computes its feature vector and ranks the images according to a well-chosen similarity measure. For each retrieved image, a degree of belief about its relevance is then assigned as a function of this measure. This degree of belief is then updated using an iterative process. At each iteration, we evaluate the degree of relevance of each retrieved image by combining the belief functions associated with previously retrieved images. Each retrieved image is thus found not by the query image alone but by the query image together with the previously retrieved images. Experimental results are presented to demonstrate that the methodology improves the efficiency and accuracy of retrieval systems.
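
The combination of belief functions mentioned above is commonly carried out with Dempster's rule; the abstract does not state which combination rule the authors use, so the sketch below should be read as an illustration under that assumption. It works on the two-hypothesis frame {relevant, irrelevant}. The mapping from similarity to mass in `mass_from_similarity` and its `trust` parameter are hypothetical and stand in for the paper's own function.

```python
RELEVANT = frozenset({"relevant"})
IRRELEVANT = frozenset({"irrelevant"})
UNKNOWN = RELEVANT | IRRELEVANT  # mass left uncommitted (ignorance)


def mass_from_similarity(similarity, trust=0.9):
    """Hypothetical mapping: commit trust * similarity to 'relevant', rest to ignorance."""
    committed = trust * similarity
    return {RELEVANT: committed, UNKNOWN: 1.0 - committed}


def combine(m1, m2):
    """Dempster's rule: multiply masses, intersect focal sets, renormalise by conflict."""
    combined = {}
    conflict = 0.0
    for a, wa in m1.items():
        for b, wb in m2.items():
            inter = a & b
            if inter:
                combined[inter] = combined.get(inter, 0.0) + wa * wb
            else:
                conflict += wa * wb  # mass that would fall on the empty set
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    return {focal: w / (1.0 - conflict) for focal, w in combined.items()}
```

Iterating `combine` over the masses contributed by previously retrieved images gives an updated degree of belief for each candidate, which is the kind of update loop the abstract outlines.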

Saïd Kharbouche, Patrick Vannorenberghe, Christèle Lecomte, Pierre Miché
A Fast Full Search Algorithm for Variable Block-Based Motion Estimation of H.264

In this paper, we propose a novel fast motion estimation algorithm based on the successive elimination algorithm (SEA) which can dramatically reduce the complexity of variable block size motion estimation in an H.264 encoder. The proposed method applies the conventional SEA hierarchically to the seven block modes. That is, the proposed algorithm avoids unnecessary SAD computations by comparing the previous minimum SAD with a current bound value, which for block sizes larger than 4x4 is obtained by accumulating the current sum norms and the reused SADs of 4x4 blocks. As a result, we obtain a tighter bound in the inequality between SAD and sum norm than in the ordinary SEA. If the basic block size were smaller than 4x4, the bound would become tighter still, but the computational complexity would also increase, especially the addition operations for the sum norm. Compared with the fast full search algorithm of the H.264 JM reference software, our algorithm saves 60 to 70% of computation on average for several image sequences.
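
The conventional SEA that the abstract builds on rests on the inequality |sum(current block) - sum(candidate block)| <= SAD(current, candidate): a candidate whose sum-norm difference already reaches the best SAD found so far cannot improve on it and can be skipped without computing its SAD. The sketch below shows only this single-block-size baseline, not the hierarchical 4x4-based bound of the paper. Function names are hypothetical, and block sums are recomputed naively here, whereas a practical encoder would precompute them.

```python
def block_sum(frame, top, left, size):
    """Sum of the size x size block whose top-left corner is (top, left)."""
    return sum(sum(frame[top + r][left:left + size]) for r in range(size))


def block_sad(cur, ref, top, left, ref_top, ref_left, size):
    """Sum of absolute differences between a current block and a reference block."""
    return sum(abs(cur[top + r][left + c] - ref[ref_top + r][ref_left + c])
               for r in range(size) for c in range(size))


def sea_full_search(cur, ref, top, left, size, radius):
    """Full search with successive elimination.

    Returns (best displacement (dy, dx), best SAD, number of SADs computed).
    """
    cur_sum = block_sum(cur, top, left, size)
    best, best_sad, evaluated = None, float("inf"), 0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            rt, rl = top + dy, left + dx
            if rt < 0 or rl < 0 or rt + size > len(ref) or rl + size > len(ref[0]):
                continue  # candidate block falls outside the reference frame
            bound = abs(cur_sum - block_sum(ref, rt, rl, size))
            if bound >= best_sad:
                continue  # SAD >= bound >= best_sad, so this cannot be better
            sad = block_sad(cur, ref, top, left, rt, rl, size)
            evaluated += 1
            if sad < best_sad:
                best, best_sad = (dy, dx), sad
    return best, best_sad, evaluated
```

Because elimination only discards candidates that provably cannot win, the result matches an exhaustive search; the saving shows up in the `evaluated` count.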

Chan Lim, Hyun-Soo Kang, Tae-Yong Kim, Kook-Yeol Yoo
Adaptive Switching Linear Predictor for Lossless Image Compression

Linear prediction has been extensively researched and a significant number of techniques have been proposed to enhance its effectiveness, among them switching linear predictors. In this paper, we propose a general framework for designing a family of adaptive switching linear predictors. In addition, we will utilize the proposed framework to construct a concrete implementation based on set partitions and relational operators.
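
A well-known example of a switching predictor driven by relational operators is the median edge detector (MED) used in JPEG-LS. It is shown here only to make the idea concrete: it is not the adaptive framework proposed in the paper, whose set partitions are not described in the abstract. The helper names are hypothetical, and pixels outside the image are assumed to be 0.

```python
def med_predict(a, b, c):
    """Predict a pixel from its left (a), upper (b) and upper-left (c) neighbours.

    Comparisons between the neighbours switch among three linear predictors:
    'use a', 'use b', or the planar predictor a + b - c.
    """
    if c >= max(a, b):
        return min(a, b)
    if c <= min(a, b):
        return max(a, b)
    return a + b - c


def _neighbours(image, y, x):
    """Causal neighbours (left, up, up-left); out-of-image pixels count as 0."""
    a = image[y][x - 1] if x > 0 else 0
    b = image[y - 1][x] if y > 0 else 0
    c = image[y - 1][x - 1] if x > 0 and y > 0 else 0
    return a, b, c


def prediction_residuals(image):
    """Residuals that a lossless coder would entropy-code instead of raw pixels."""
    return [[image[y][x] - med_predict(*_neighbours(image, y, x))
             for x in range(len(image[0]))] for y in range(len(image))]


def reconstruct(residuals):
    """Invert prediction_residuals by replaying the predictor in raster order."""
    height, width = len(residuals), len(residuals[0])
    image = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            image[y][x] = residuals[y][x] + med_predict(*_neighbours(image, y, x))
    return image
```

Since the decoder can repeat the same comparisons on already decoded pixels, no side information is needed to signal which predictor was chosen.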

Abdul Itani, Manohar Das
Toward Real Time Fractal Image Compression Using Graphics Hardware

In this paper, we present a parallel fractal image compression method that uses programmable graphics hardware. The main problem of fractal compression is the very high computing time needed to encode images. Our implementation exploits the SIMD architecture and inherent parallelism of recent graphics boards to speed up the baseline approach to fractal encoding. The results we present are achieved on cheap and widely available graphics boards.

Ugo Erra
Motion Based Segmentation Using MPEG Streams and Watershed Method

Many computer vision applications, such as video indexing, summarization and motion segmentation, require the calculation of the motion present in image sequences. In previous work, we presented a new technique that performs Global Motion Estimation on MPEG compressed video. This article presents a method that extends the process to allow fast motion-based segmentation of a video. The method enables real-time segmentation of the background and of objects that have their own local motion. The motion information belonging to each area is also given. Moreover, some indicators warn if the estimation is not reliable.

Renan Coudray, Bernard Besserer
Efficient Depth Edge Detection Using Structured Light

This research features a novel approach that efficiently detects depth edges in real world scenes. Depth edges play a very important role in many computer vision problems because they represent object contours. We strategically project structured light and exploit the distortion of the light pattern in the structured light image along depth discontinuities to reliably detect depth edges. Depending on the distance from the camera or projector, distortion along depth discontinuities may not occur or may not be large enough to detect. For practical application of the proposed approach, we present methods that guarantee the occurrence of the distortion along depth discontinuities for a continuous range of object locations. Experimental results show that the proposed method accurately detects depth edges of human hand and body shapes as well as general objects.

Jiyoung Park, Cheolhwon Kim, Juneho Yi, Matthew Turk
Image Smoothing and Segmentation by Graph Regularization

We propose a discrete regularization framework on weighted graphs of arbitrary topology, which leads to a family of nonlinear filters, such as the bilateral filter or the TV digital filter. This framework, which minimizes a loss function plus a regularization term, is parameterized by a weight function defined as a similarity measure. It is applicable to several problems in image processing, data analysis and classification. We apply this framework to the image smoothing and segmentation problems.
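
One simple member of such a family can be written down explicitly. Assuming a quadratic regulariser, the energy $E(f) = \lambda \sum_u (f_u - f^0_u)^2 + \sum_{\{u,v\}} w_{uv} (f_u - f_v)^2$ has the stationarity condition $f_u = (\lambda f^0_u + \sum_v w_{uv} f_v) / (\lambda + \sum_v w_{uv})$, which can be iterated as a Jacobi-style filter. The sketch below applies it to a 1D chain graph with Gaussian similarity weights. The quadratic choice, the weight function, and all names are assumptions for illustration, not the general framework of the paper.

```python
import math


def chain_graph(signal, sigma):
    """Neighbour lists for a 1D chain; weights decay with the value difference."""
    neighbours = [[] for _ in signal]
    for i in range(len(signal) - 1):
        w = math.exp(-((signal[i] - signal[i + 1]) ** 2) / sigma ** 2)
        neighbours[i].append((i + 1, w))
        neighbours[i + 1].append((i, w))
    return neighbours


def regularize(f0, neighbours, lam, iterations):
    """Jacobi iteration for the quadratic graph regularisation energy (lam > 0).

    Each pass replaces every value by a weighted average of its original
    value (weight lam) and its neighbours' current values (weights w_uv).
    """
    f = list(f0)
    for _ in range(iterations):
        f = [(lam * f0[u] + sum(w * f[v] for v, w in neighbours[u]))
             / (lam + sum(w for _, w in neighbours[u]))
             for u in range(len(f0))]
    return f
```

With weights that shrink across large value differences, smoothing acts mostly within homogeneous runs and much less across edges, which is the behaviour the bilateral filter is known for.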

Sébastien Bougleux, Abderrahim Elmoataz
Backmatter
Metadata
Title
Advances in Visual Computing
Edited by
George Bebis
Richard Boyle
Darko Koracin
Bahram Parvin
Copyright year
2005
Publisher
Springer Berlin Heidelberg
Electronic ISBN
978-3-540-32284-9
Print ISBN
978-3-540-30750-1
DOI
https://doi.org/10.1007/11595755