
2006 | Book

Advances in Visual Computing

Second International Symposium, ISVC 2006 Lake Tahoe, NV, USA, November 6-8, 2006 Proceedings, Part I

Editors: George Bebis, Richard Boyle, Bahram Parvin, Darko Koracin, Paolo Remagnino, Ara Nefian, Gopi Meenakshisundaram, Valerio Pascucci, Jiri Zara, Jose Molineros, Holger Theisel, Tom Malzbender

Publisher: Springer Berlin Heidelberg

Book Series: Lecture Notes in Computer Science


About this book

It is with great pleasure that we welcome you all to the proceedings of the 2nd International Symposium on Visual Computing (ISVC 2006) held in Lake Tahoe. Following a successful meeting last year, we witnessed a much stronger and more productive event this year. ISVC offers a common umbrella for the four main areas of visual computing including vision, graphics, visualization, and virtual reality. Its goal is to provide a forum for researchers, scientists, engineers and practitioners throughout the world to present their latest research findings, ideas, developments and applications in the broader area of visual computing. This year, the program consisted of 13 oral sessions, one poster session, ten special tracks, and six keynote presentations. The response to the call for papers was very strong. We received more than twice the papers received last year. Specifically, we received over 280 submissions for the main symposium from which we accepted 65 papers for oral presentation (23% acceptance) and 56 papers for poster presentation (20% acceptance). Special track papers were solicited separately through the Organizing and Program Committees of each track. A total of 57 papers were accepted for presentation in the special tracks. All papers were reviewed with an emphasis on potential to contribute to the state of the art in the field. Selection criteria included accuracy and originality of ideas, clarity and significance of results, and presentation quality. The review process was quite rigorous, involving two to three independent blind reviews followed by several days of discussion. During the discussion period we tried to correct anomalies and errors that might have existed in the initial reviews.

Table of Contents

Frontmatter
Activity Recognition Via Classification Constrained Diffusion Maps

Applying advanced video technology to understand human activity and intent is becoming increasingly important for video surveillance. In this paper, we perform automatic activity recognition by classifying spatio-temporal features from video sequences. We propose to incorporate class label information to find the optimal heating time for dimensionality reduction using diffusion via random walks. We perform experiments on real data and compare the proposed method with the existing random walk diffusion map method and the dual root minimal spanning tree diffusion method. Experimental results show that our proposed method performs better.
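As background for the abstract above, a generic random-walk diffusion map (without the authors' class-label constraint) can be sketched in a few lines. The kernel bandwidth `sigma`, the diffusion ("heating") time `t`, and the toy two-cluster data below are illustrative assumptions, not the paper's settings:

```python
import numpy as np

def diffusion_map(X, sigma=1.0, t=2, n_components=2):
    """Random-walk diffusion map embedding (illustrative sketch)."""
    # Gaussian affinity from pairwise squared distances
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-sq / (2.0 * sigma ** 2))
    # Row-normalize to a random-walk (Markov) transition matrix
    P = W / W.sum(axis=1, keepdims=True)
    vals, vecs = np.linalg.eig(P)
    order = np.argsort(-vals.real)
    vals = vals.real[order]
    vecs = vecs.real[:, order]
    # Skip the trivial eigenvalue-1 constant eigenvector; scale by lambda^t
    return (vals[1:n_components + 1] ** t) * vecs[:, 1:n_components + 1]

# Toy data: two well-separated clusters (illustrative only)
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.1, (10, 2)), rng.normal(3.0, 0.1, (10, 2))])
emb = diffusion_map(X)
```

For well-separated clusters, the first non-trivial diffusion coordinate separates the two groups; larger `t` contracts within-cluster spread, which is the knob the paper's class labels help tune.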

Yunqian Ma, S. B. Damelin, O. Masoud, N. Papanikolopoulos
Generating and Updating Textures for a Large-Scale Environment

With the rapid development of sensor and modeling technologies, it becomes increasingly feasible to model a large-scale environment. However, the acquisition and updating of textures for such a large-scale environment is still a challenging task, often demanding tedious and time-consuming manual interaction. This paper presents new techniques to generate high-quality textures for given rough urban building models by automatic camera calibration and pose recovery, and to continuously update these textures in real time using videos as a texture resource. A number of static textures are generated for a university campus-sized model and dynamically updated using videos in real time, demonstrating the effectiveness of our algorithms.

Jinhui Hu, Suya You, Ulrich Neumann
Planar Surface Detection in Image Pairs Using Homographic Constraints

Planar surfaces are important characteristics in man-made environments and have been successfully applied to camera calibration and interactive modeling. We develop a method for detecting planes in image pairs under epipolar constraints using planar homographies. In order to extract the whole planes, the normalized cut method is used to segment the original images. We pick those segmented regions that best fit a triangulation of the homography inliers as the detected planes. We illustrate the algorithm’s performance using gray-level and color image pairs.
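The homography-fitting step that planar-surface detectors of this kind rely on can be illustrated with the standard Direct Linear Transform (DLT). This is a textbook sketch, not the authors' full pipeline (which adds epipolar constraints and normalized-cut segmentation):

```python
import numpy as np

def homography_dlt(src, dst):
    """Estimate a 3x3 planar homography H with dst ~ H @ src via the DLT.

    src, dst: (n, 2) arrays of corresponding points, n >= 4.
    """
    A = []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence contributes two linear constraints on H
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # H is the null vector of A: the right singular vector with smallest value
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]

def apply_h(H, pts):
    """Apply a homography to (n, 2) points, with perspective division."""
    p = np.c_[pts, np.ones(len(pts))] @ H.T
    return p[:, :2] / p[:, 2:]
```

Points whose reprojection error under the fitted H is small are the homography inliers that the abstract's triangulation step would consume.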

Qiang He, Chee-hung Henry Chu
Robust Quality-Scalable Transmission of JPEG2000 Images over Wireless Channels Using LDPC Codes

A new error-resilient JPEG2000 wireless transmission scheme is proposed. The proposed scheme exploits the ‘progressive by quality’ structure of the JPEG2000 code-stream and takes into account the effect of channel errors at different quality layers in order to protect the coded bit-stream according to channel conditions using multi-rate low-density parity-check (LDPC) codes, leading to a flexible joint source-channel coding design. The novelty of this adaptive technique lies in its ability to truncate the less important source layers to accommodate optimal channel protection to more important ones to maximize received image quality. Results show that the proposed scheme facilitates considerable gains in terms of subjective and objective quality as well as decoding probability of the retrieved images.

Abdullah Al Muhit, Teong Chee Chuah
A Novelty Detection Approach for Foreground Region Detection in Videos with Quasi-stationary Backgrounds

Detecting regions of interest in video sequences is one of the most important tasks in many high level video processing applications. In this paper a novel approach based on support vector data description is presented, which detects foreground regions in videos with quasi-stationary backgrounds. The main contribution of this paper is the novelty detection approach which automatically segments video frames into background/foreground regions. By using support vector data description for each pixel, the decision boundary for the background class is modeled without the need to statistically model its probability density function. The proposed method is able to achieve very accurate foreground region detection rates even in very low contrast video sequences, and in the presence of quasi-stationary backgrounds. As opposed to many statistical background modeling approaches, the only critical parameter that needs to be adjusted in our method is the number of background training frames.

Alireza Tavakkoli, Mircea Nicolescu, George Bebis
Procedural Image Processing for Visualization

We present a novel Procedural Image Processing (PIP) method and demonstrate its applications in visualization. PIP modulates the sampling positions of a conventional image processing kernel (e.g., an edge detection filter) through a procedural perturbation function. When properly designed, PIP can produce a variety of styles for edge depiction, varying in width, solidity, pattern, etc. In addition to producing artistic stylization, in this paper we demonstrate that PIP can be employed to achieve various visualization tasks, such as contour enhancement, focus+context visualization, importance-driven visualization and uncertainty visualization.

PIP produces unique effects that often either cannot be easily achieved through conventional filters or would require multiple pass filtering. PIP perturbation functions are either defined by analytical expressions or encoded in pre-generated images. We leverage the programmable fragment shader of the current graphics hardware for achieving the operations in real-time.

Xiaoru Yuan, Baoquan Chen
Tracking of Individuals in Very Long Video Sequences

In this paper we present an approach for automatically detecting and tracking humans in very long video sequences. The detection is based on background subtraction using a multi-mode Codeword method. We enhance this method both in terms of representation and in terms of automatically updating the background allowing for handling gradual and rapid changes. Tracking is conducted by building appearance-based models and matching these over time. Tests show promising detection and tracking results in a ten hour video sequence.

P. Fihl, R. Corlin, S. Park, T. B. Moeslund, M. M. Trivedi
A Natural Interface for Sign Language Mathematics

The general goal of our research is the creation of a natural and intuitive interface for input and recognition of American Sign Language (ASL) math signs. The specific objective of this work is the development of two new interfaces for the Mathsigner™ application. Mathsigner™ is an interactive, 3D animation-based game designed to increase the mathematical skills of deaf children. The program makes use of standard input devices such as mouse and keyboard. In this paper we show a significant extension of the application by proposing two new user interfaces: (1) a glove-based interface, and (2) an interface based on the use of a specialized keyboard. So far, the interfaces allow for real-time input and recognition of the ASL numbers zero to twenty.

Nicoletta Adamo-Villani, Bedřich Beneš, Matt Brisbin, Bryce Hyland
A Novel Gait Recognition Method Via Fusing Shape and Kinematics Features

Existing methods of gait recognition are mostly based on either holistic shape information or kinematics features. Both of them are very important cues in human gait recognition. In this paper we propose a novel method via fusing shape and motion features. Firstly, the binary silhouette of a walking person is detected from each frame of the monocular image sequences. Then the static shape is represented using the ratio of the body’s height to width and the pixel number of silhouette. Meanwhile, a 2D stick figure model and trajectory-based kinematics features are extracted from the image sequences for describing and analyzing the gait motion. Next, we discuss two fusion strategies relevant to the above mentioned feature sets: feature level fusion and decision level fusion. Finally, a similarity measurement based on the gait cycles and two different classifiers (Nearest Neighbor and KNN) are carried out to recognize different subjects. Experimental results on UCSD and CMU databases demonstrate the feasibility of the proposed algorithm and show that fusion can be an effective strategy to improve the recognition performance.

Yanmei Chai, Qing Wang, Jingping Jia, Rongchun Zhao
Illumination Normalization for Color Face Images

The performance of appearance based face recognition algorithms is adversely affected by illumination variations. Illumination normalization can greatly improve their performance. We present a novel algorithm for illumination normalization of color face images. Face Albedo is estimated from a single color face image and its co-registered 3D image (pointcloud). Unlike existing approaches, our algorithm takes into account both Lambertian and specular reflections as well as attached and cast shadows. Moreover, our algorithm is invariant to facial pose and expression and can effectively handle the case of multiple extended light sources. The approach is based on Phong’s lighting model. The parameters of the Phong’s model and the number, direction and intensities of the dominant light sources are automatically estimated. Specularities in the face image are used to estimate the directions of the dominant light sources. Next, the 3D face model is ray-casted to find the shadows of every light source. The intensities of the light sources and the parameters of the lighting model are estimated by fitting Phong’s model onto the skin data of the face. Experiments were performed on the challenging FRGC v2.0 data and satisfactory results were achieved (the mean fitting error was 6.3% of the maximum color value).
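For reference, the Phong model that the abstract fits combines ambient, diffuse, and specular terms. A minimal single-light, single-pixel evaluation might look as follows (parameter names are illustrative, not the authors' notation):

```python
import numpy as np

def phong_intensity(n, l, v, kd, ks, shininess, light_intensity, ambient=0.0):
    """Phong reflection at one surface point.

    n: surface normal, l: direction to light, v: direction to viewer
    kd, ks: diffuse/specular coefficients; shininess: specular exponent.
    """
    n = n / np.linalg.norm(n)
    l = l / np.linalg.norm(l)
    v = v / np.linalg.norm(v)
    # Lambertian diffuse term, clamped at grazing/back-facing angles
    diffuse = kd * max(n @ l, 0.0)
    # Mirror reflection of the light direction about the normal
    r = 2.0 * (n @ l) * n - l
    specular = ks * max(r @ v, 0.0) ** shininess
    return ambient + light_intensity * (diffuse + specular)
```

In the paper's setting, the unknowns are the coefficients and the light directions/intensities, recovered by fitting this model to skin pixels; the sketch only shows the forward evaluation.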

Faisal R. Al-Osaimi, Mohammed Bennamoun, Ajmal Mian
Real-Time Detection of Out-of-Plane Objects in Stereo Vision

This paper proposes an automatic approach to detecting objects appearing in front of planar background. A planar homography is estimated with high accuracy in an off-line initialization phase. Given a pair of binocular images, we apply the estimated homography to one of the images, and then compute a similarity map between the transformed image and the other. Normalized cross-correlation is used in the computation of the similarity map to measure the similarity between neighborhoods of overlapping pixels. Normalized cross-correlation measure is superior to absolute difference in alleviating the influence of image noise and small mis-alignment caused by imperfect homography estimation. The similarity map with pixel intensities ranging between 0 and 1 leads to an easy detection of out-of-plane objects because the values of pixels corresponding to planar background are close to 1. Tracking could be incorporated with our out-of-plane object detection method to further improve robustness in live video applications. This approach has been used in tracking people and demonstrated reliable performance.
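The core test the abstract describes (warp one view by the plane homography, then score per-pixel window similarity) can be sketched with a plain normalized cross-correlation map. This brute-force version assumes the images are already aligned and uses an arbitrary 5x5 window:

```python
import numpy as np

def ncc_map(a, b, win=5):
    """Per-pixel normalized cross-correlation between two aligned images.

    a, b: 2D float arrays of equal shape; win: odd window size.
    Pixels on the common plane (well aligned by the homography) score near 1.
    """
    r = win // 2
    h, w = a.shape
    out = np.zeros((h, w), dtype=float)
    for i in range(r, h - r):
        for j in range(r, w - r):
            pa = a[i - r:i + r + 1, j - r:j + r + 1].ravel()
            pb = b[i - r:i + r + 1, j - r:j + r + 1].ravel()
            pa = pa - pa.mean()
            pb = pb - pb.mean()
            denom = np.linalg.norm(pa) * np.linalg.norm(pb)
            out[i, j] = (pa @ pb) / denom if denom > 1e-12 else 0.0
    return out
```

Because NCC is invariant to affine photometric changes within each window, planar background stays near 1 while out-of-plane pixels (which violate the homography) drop, which is exactly the thresholdable map the abstract exploits.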

Weiguang Guan, Patricia Monger
Stereo Imaging with Uncalibrated Camera

3D images provide more information to humans than their 2D counterparts and have many applications in entertainment, scientific data visualization, etc. The ability to generate accurate 3D dynamic scenes and 3D movies from uncalibrated cameras is a challenge. We propose a systematic approach to stereo image/video generation. With our proposed approach, a realistic 3D scene can be created via either a single uncalibrated moving camera or two synchronized cameras. 3D video can also be generated through multiple synchronized video streams. Our approach first uses a Gabor filter bank to extract image features. Second, we develop an improved Elastic Graph Matching method to perform reliable image registration from multi-view images or video frames. Third, a fast and efficient image rectification method based on multi-view geometry is presented to create stereo image pairs. Extensive tests using real images collected from widely separated cameras were performed to validate our proposed approach.

Xiaokun Li, Chiman Kwan, Baoxin Li
Global Hand Pose Estimation by Multiple Camera Ellipse Tracking

Immersive virtual environments with life-like interaction capabilities have very demanding requirements, including high precision and processing speed. These issues raise many challenges for computer vision-based motion estimation algorithms. In this study, we consider the problem of hand tracking using multiple cameras and estimating its 3D global pose (i.e., the position and orientation of the palm). Our interest is in developing an accurate and robust algorithm to be employed in an immersive virtual training environment, called "Virtual GloveboX" (VGX) [1], which is currently under development at NASA Ames. In this context, we present a marker-based hand tracking and 3D global pose estimation algorithm that operates in a controlled, multi-camera environment built to track the user's hand inside VGX. The key idea of the proposed algorithm is tracking the 3D position and orientation of an elliptical marker placed on the dorsal part of the hand using model-based tracking approaches and active camera selection. It should be noted that the use of markers is well justified in the context of our application since VGX naturally allows for the use of gloves without disrupting the fidelity of the interaction. Our experimental results and comparisons illustrate that the proposed approach is more accurate and robust than related approaches. A byproduct of our multi-camera ellipse tracking algorithm is that, with only minor modifications, the same algorithm can be used to automatically re-calibrate (i.e., fine-tune) the extrinsic parameters of a multi-camera system, leading to more accurate pose estimates.

Jorge Usabiaga, Ali Erol, George Bebis, Richard Boyle, Xander Twombly
Vision-Based Self-localization of Autonomous Guided Vehicle Using Landmarks of Colored Pentagons

This paper describes an idea for determining self-localization using visual landmarks. The critical geometric dimensions of a pentagon are used here to locate the relative position of the mobile robot with respect to the pattern. This method has the advantages of simplicity and flexibility. The pentagon is also provided with a unique identification, using invariant features and colors, which enables the system to find the absolute location of the patterns. The algorithm determines the correspondence between observed landmarks and a stored sequence, computes the absolute location of the observer using those correspondences, and calculates the relative position from a pentagon using its five vertices. The algorithm has been implemented and tested. In several trials it computes locations accurate to within 5.4 centimeters in less than 0.3 seconds.

Y. S. Kim, J. C. Kim, E. J. Park, Joonwhoan Lee
An Automated System for Contact Lens Inspection

This paper describes a novel method for the industrial inspection of ophthalmic contact lenses in a time constrained production line environment. We discuss the background to this problem, look at previous solutions and relevant allied work before describing our system. An overview of the system is given together with detailed descriptions of the algorithms used to perform the image processing, classification and inspection system. We conclude with a preliminary assessment of the system performance and discuss future work needed to complete the system.

A. I. Bazin, T. Cole, B. Kett, M. S. Nixon
Efficient Motion Search in Large Motion Capture Databases

Large human motion databases contain variants of natural motions that are valuable for animation generation and synthesis. But retrieving visually similar motions is still a difficult and time-consuming problem. This paper provides methods for identifying visually and numerically similar motions in a large database given a query motion segment. We propose an efficient indexing strategy that represents the motions compactly through a preprocessing step. This representation narrows the range of the database that must be searched. Motions in this range are possible candidates for the final matches. For detailed comparisons between the query and the candidates, we propose an algorithm that compares the motions' curves swiftly. Our methods scale to large human motion databases and achieve high performance and accuracy compared with previous work. We present experimental results on a test database of about 2.9 million frames, or about 27 hours of motion played at 30 Hz.

Yi Lin
Real-Time Rendering of Light Shafts on GPU

In the past, it was difficult to simulate the light shafts effect in real time. One major reason was the high computational expense of performing a physically accurate computation of atmospheric scattering. Another was limited computing resources, especially the lack of power and programmability in graphics hardware. Recently, with the advent of more powerful graphics cards on the standard PC platform and the development of programmable stages in the graphics pipeline, many computationally expensive algorithms have become available in modern commercial games. In this paper, we propose a novel method of rendering light shafts with atmospheric scattering based on actual physical phenomena. The proposed method utilizes the hardware frame buffer object and a mesh refinement pattern to achieve a photorealistic effect at a high frame rate.

Shuyi Chen, Sheng Li, Guoping Wang
Learning the Stylistic Similarity Between Human Motions

This paper presents a computational model of stylistic similarity between human motions that is statistically derived from a comprehensive collection of captured, stylistically similar motion pairs. In this model, a set of hypersurfaces learned by single-class SVM and kernel PCA characterize the region occupied by stylistically similar motion pairs in the space of all possible pairs. The proposed model is further applied to a system for adapting an existing clip of human motion to a new environment, where stylistic distortion is avoided by enforcing stylistic similarity of the synthesized motion to the existing motion. The effectiveness of the system has been verified by 18 distinct adaptations, which produced walking, jumping, and running motions that exhibit the intended styles as well as the intended contact configurations.

Yu-Ren Chien, Jing-Sin Liu
Effects of Layer Partitioning in Collaborative 3D Visualizations

Display technologies that support multiple independent views of the same co-located 3D visualization volume make new forms of collaboration possible. In this field of research, until now most efforts have focused on technical solutions and their applications. The main contribution of this paper is the results from a study comparing integral and partitioned 3D content in a head coupled stereoscopic environment through independent views of a shared 3D visualization.

In our study we used a geospatial task that was solved by ten pairs of collaborating individuals (dyads). We measured task performance by time and error rate for the dyads in two main conditions: a) an integral visualization that presented a map in the display surface and four layers at different depths below the display surface to each of the observers, and b) a partitioned visualization, where two mutually exclusive subsets of the layers were presented to each of the observers together with the map in the display surface.

The results from the study showed significant differences in regard to performance times between the two conditions. Task performance was significantly better in the condition with layer partitioning. Partitioned visualizations can thus, at least in some cases, improve performance in tasks requiring collaboration between users.

Lars Winkler Pettersson, Andreas Kjellin, Mats Lind, Stefan Seipel
GPU-Based Active Contour Segmentation Using Gradient Vector Flow

One fundamental step for image-related research is to obtain an accurate segmentation. Among the available techniques, the active contour algorithm has emerged as an efficient approach to image segmentation. By progressively adjusting a reference curve using a combination of external and internal forces computed from the image, feature edges can be identified. The Gradient Vector Flow (GVF) is an efficient external force calculation for the active contour, and a GPU-centric implementation of the algorithm is presented in this paper. Since the internal SIMD architecture of the GPU enables parallel computing, General Purpose GPU (GPGPU) based processing can be applied to improve the speed of the GVF active contour for large images. Results of our experiments show the potential of GPGPU in the area of image segmentation and the potential of the GPU as a powerful co-processor for traditional CPU computational tasks.
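The GVF external force mentioned above is classically computed by iteratively diffusing the edge map's gradient (Xu and Prince). A minimal CPU-side NumPy sketch, with illustrative values for the regularization weight `mu`, step size, and iteration count, is:

```python
import numpy as np

def gvf(f, mu=0.2, iters=200, dt=0.5):
    """Gradient Vector Flow field (u, v) of an edge map f.

    Iteratively diffuses the gradient of f into homogeneous regions, so the
    external force can pull a contour toward edges from far away.
    """
    # Central-difference gradient of the edge map
    fx = (np.roll(f, -1, 1) - np.roll(f, 1, 1)) / 2.0
    fy = (np.roll(f, -1, 0) - np.roll(f, 1, 0)) / 2.0
    mag2 = fx ** 2 + fy ** 2
    u, v = fx.copy(), fy.copy()
    # 5-point Laplacian (periodic boundary via roll, fine for this sketch)
    lap = lambda g: (np.roll(g, 1, 0) + np.roll(g, -1, 0)
                     + np.roll(g, 1, 1) + np.roll(g, -1, 1) - 4.0 * g)
    for _ in range(iters):
        # Smoothness term everywhere; data term anchors the field near edges
        u = u + dt * (mu * lap(u) - mag2 * (u - fx))
        v = v + dt * (mu * lap(v) - mag2 * (v - fy))
    return u, v
```

Each pixel's update is independent given its neighbors, which is precisely why the computation maps well onto the GPU fragment-shader pipeline the paper targets.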

Zhiyu He, Falko Kuester
Active Single Landmark Based Global Localization of Autonomous Mobile Robots

This paper presents landmark based global self-localization of autonomous mobile robots in a known but highly dynamic environment. The algorithm is based on range estimation to naturally occurring distinct features as it is not possible to modify the environment with special navigational aids. These features are sparse in our application domain and are frequently occluded by other robots. To enable the robot to estimate its absolute position with respect to a single landmark it is equipped with dead-reckoning sensors in addition to the stereo vision system mounted on a rotating head. The pivoted stereo vision system of the robot enables it to measure range and use bi/trilateration based methods as they require fewer landmarks compared to angle based triangulation. Further reduction of landmarks is achieved when robot orientation is estimated independently. Simulation results are presented which illustrate the performance of our algorithm.

Abdul Bais, Robert Sablatnig, Jason Gu, Stefan Mahlknecht
Iterative Estimation of 3D Transformations for Object Alignment

An Iterative Estimation Algorithm (IEA) of 3D transformations between two objects is presented in this paper. Skeletons of the 3D objects are extracted using a fully parallel thinning technique, feature point pairs (landmarks) are extracted from the skeletons automatically with a heuristic rule, and a least squares method and an iterative approach are applied to estimate the 3D transformation matrix. The algorithm has three advantages. First, no initial transformation matrix is needed. Second, user interaction is not required for identifying the landmarks. Third, the time complexity of this algorithm is polynomial. Experiments show that this method works quite well with high accuracy when the translations and rotation angles are small, even when noise exists in the data.

Tao Wang, Anup Basu
Temporal Alignment of Time Varying MRI Datasets for High Resolution Medical Visualization

Four-dimensional (4D) visualization of medical data, which entails the addition of time as the fourth dimension to 3D data, is fast gaining ground as a tool for diagnosis and surgical planning by medical practitioners. However, current medical image acquisition techniques do not support high-resolution 4D capture. Instead, multiple 3D datasets are acquired and a temporal relation is computed between these datasets in order to align them in time. In past work we presented a method of temporal alignment of MRI datasets to generate high-resolution medical data, which can be extended to 4D visualization. In this work, we present the details of our temporal alignment algorithm and also present comparative analysis in order to highlight the advantages of our method.

Meghna Singh, Anup Basu, Mrinal Mandal
Physically Interacting with Four Dimensions

We exploit the combination of a virtual world containing physically-interacting 4D objects with a multimodal haptics-driven user-interface model; the goal is to facilitate the development of accurate cognitive models enabling the visualization of 4D space. Our primary test domain supports tactile interaction with physically colliding and deformable curves and surfaces embedded in 4D, an important and challenging subject area of classical topology. We implement intricate interactions involving 4D curves and surfaces by haptically manipulating 3D projections of these objects.

Hui Zhang, Andrew J. Hanson
Low Level Moving-Feature Extraction Via Heat Flow Analogy

In this paper, an intelligent and automatic moving-object edge detection algorithm is proposed, based on a heat flow analogy. The algorithm starts with anisotropic heat diffusion in the spatial domain to remove noise and sharpen region boundaries for the purpose of obtaining high-quality edge data. Then, isotropic heat diffusion is applied in the temporal domain to calculate the total amount of heat flow. The moving edges are represented as the total amount of heat flowing out from the reference frame. The overall process is completed by non-maxima suppression and hysteresis thresholding to obtain binary moving edges. Evaluation results indicate that this approach has advantages in handling noise in the temporal domain because of the averaging inherent in isotropic heat flow. Results also show that this technique can detect moving edges in image sequences.
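The spatial step resembles classic Perona-Malik anisotropic diffusion, where a conduction coefficient suppresses smoothing across strong edges. The sketch below is that standard scheme with illustrative parameter values, not the authors' exact formulation:

```python
import numpy as np

def perona_malik(img, iters=20, kappa=0.1, lam=0.2):
    """Anisotropic (edge-preserving) heat diffusion, Perona-Malik style.

    kappa: edge threshold; lam: step size (stable for lam <= 0.25).
    """
    u = img.astype(float).copy()
    for _ in range(iters):
        # Finite differences toward the four neighbours (periodic via roll)
        dn = np.roll(u, 1, 0) - u
        ds = np.roll(u, -1, 0) - u
        de = np.roll(u, -1, 1) - u
        dw = np.roll(u, 1, 1) - u
        # Conduction coefficient g decays at strong edges, so flow stops there
        g = lambda d: np.exp(-(d / kappa) ** 2)
        u = u + lam * (g(dn) * dn + g(ds) * ds + g(de) * de + g(dw) * dw)
    return u
```

Small (noise-scale) differences diffuse freely while large (edge-scale) differences are blocked, which delivers the "remove noise and sharpen region boundaries" behavior the abstract wants before the temporal step.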

Cem Direkoğlu, Mark S. Nixon
Shape Tracking and Registration for 4D Visualization of MRI and Structure

We describe our preliminary research on integrating MRI video with a 3D surface scan of a face. Our approach first extracts contours of a video by using snakes [4, 5]; then the outline structure of the video is matched with a close matching contour on the 3D face structure. The matching and alignment of the two representations uses curvature representations along with some simple heuristics about the relative locations of the facial features, such as nose and chin.

Even though techniques like video fluoroscopy [8] can create high-quality images, they subject patients to high volumes of radiation and cannot be used to monitor patients over short time intervals. Our alternative combines MRI video with 3D facial structure to improve visualization for medical professionals. The MRI video was created in our related research [7] by registering multiple MRI sequences of swallowing.

Irene Cheng, Sharmin Nilufar, Anup Basu, Randy Goebel
History Trees as Descriptors of Macromolecular Structures

High-level structural information about macromolecules is now being organized into databases. One of the common ways of storing information in such databases is in the form of three-dimensional (3D) electron microscopic (EM) maps, which are 3D arrays of real numbers obtained by a reconstruction algorithm from EM projection data. We propose and demonstrate a method of automatically constructing, from any 3D EM map, a topological descriptor (which we call a history tree) that is amenable to automatic comparison.

Deniz Sarioz, T. Yung Kong, Gabor T. Herman
Fusing Features in Direct Volume Rendered Images

In this paper, we propose a novel framework which can fuse multiple user-selected features in different direct volume rendered images into a comprehensive image according to users' preferences. The framework relies on three techniques, i.e., user voting, a genetic algorithm, and image similarity. In this framework, we transform the fusing problem into an optimization problem with a novel energy function which is based on user voting and image similarity. The optimization problem can then be solved by the genetic algorithm. Experimental results on some real volume data demonstrate the effectiveness of our framework.

Yingcai Wu, Huamin Qu, Hong Zhou, Ming-Yuen Chan
Binocular Uncalibrated Photometric Stereo

In Uncalibrated Photometric Stereo (UPS), the surface normals and light sources are determined only up to a group of ambiguous Generalized Bas-Relief (GBR) transformations. However, previous works have shown that resolving these ambiguities is rather troublesome. In this paper, a framework of Binocular Uncalibrated Photometric Stereo (B-UPS) is given for accurate stereo matching for Lambertian and non-Lambertian objects. It is also shown that the problem of 3D reconstruction with UPS is converted into that of stereo matching with B-UPS. By this conversion, the intractable GBR transformations can be bypassed. In B-UPS, the Orientation-Consistency (OC) cue [1] for the distant-lighting condition and the Local-Orientation-Consistency (LOC) cue for the non-distant-lighting condition are used together for stereo matching, where the combination of both cues is made possible by a planar-area detection method based on a pseudo-normal-map segmentation scheme. Excellent matching and reconstruction results for objects with constant and spatially-varying BRDFs demonstrate the superiority of B-UPS.

Hui Kong, Pengfei Xu, Eam Khwang Teoh
Empirical Evaluation of a Visual Interface for Exploring Message Boards

This paper introduces a method that presents a number of characteristics of threads in a discussion forum through graphical illustrations. The technique brings together visual components, such as dimension, color, intensity, and position, to present multiple aspects of a thread, including the amount of information, popularity, activity, comparative value, and tenure of the thread. This highly visual abstraction of threads allows us to display a large number of threads, showing overall properties of the contents in a limited screen space. The proposed visualization techniques assist the user in effectively filtering noisy threads from threads having important features. We conducted an experimental study comparing the effectiveness of the developed visual interface to a traditional text-based interface. The study showed that the user's search speed and accuracy in finding noticeable threads from a huge collection of threads improved significantly with the visual navigation tool.

Beomjin Kim, Philip Johnson, Jason Baker
Direct Estimation of the Stereo Geometry from Monocular Normal Flows

The increasing use of active vision systems makes it necessary to determine the relative geometry between the cameras in the system at arbitrary times. There has been some work on on-line estimation of the relative camera geometry, but much of it relies on epipolar geometry, motion correspondences, or even the presence of calibration reference objects in the scene. In this paper, we describe a method that allows the relative geometry of two cameras to be estimated without assuming that their visual fields picture the same object, or that motion correspondences in each camera are fully estimated beforehand. The method starts from monocular normal flows in the two cameras and estimates the relative geometry parameters without even accessing the full optical flows. Experimental results illustrate the performance of the method.

Ding Yuan, Ronald Chung
Singular Value Decomposition-Based Illumination Compensation in Video

This paper presents a singular value decomposition (SVD)-based illumination compensation method for video with varying scene illumination. In video that does not contain scene changes, the color distributions in RGB space differ from frame to frame, mainly due to varying illumination. In this paper, the color distribution of a scene is modeled as an ellipsoid using SVD, and the scene illumination of successive frames is aligned by a linear transformation in RGB space. This transformation effectively removes the effect of illumination change and preserves similarity measures between successive frames, such as the normalized cross correlation, the sum of absolute differences, and the sum of squared differences, which illustrates the effectiveness of the proposed algorithm. Simulation results with several synthetic and real test sequences show that the proposed method is more robust to illumination changes than conventional methods.

Ki-Youn Lee, Rae-Hong Park
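The abstract above describes the core idea concretely enough to illustrate: model each frame's RGB distribution as an ellipsoid (mean plus SVD axes) and map one frame's ellipsoid onto a reference frame's with a single affine transform. The NumPy sketch below is a hedged reconstruction of that idea, not the authors' code; the function names and the whiten-then-recolor formulation are our own assumptions.

```python
import numpy as np

def ellipsoid_model(pixels):
    # Fit an ellipsoid to an (N, 3) RGB point cloud:
    # mean plus principal axes and radii obtained via SVD.
    mean = pixels.mean(axis=0)
    centered = pixels - mean
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    radii = s / np.sqrt(len(pixels))        # per-axis spread
    return mean, radii, vt

def compensate(frame_pixels, ref_pixels):
    # Map the RGB distribution of `frame_pixels` onto that of
    # `ref_pixels` with one affine transform: whiten with the
    # frame's ellipsoid, then re-color with the reference ellipsoid.
    m_f, r_f, vt_f = ellipsoid_model(frame_pixels)
    m_r, r_r, vt_r = ellipsoid_model(ref_pixels)
    z = (frame_pixels - m_f) @ vt_f.T / r_f   # unit sphere
    return z * r_r @ vt_r + m_r               # reference ellipsoid
```

After compensation the mean and covariance of the frame's color distribution match the reference exactly, which is what keeps SAD/SSD/NCC-style similarity measures stable across illumination changes.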
Facial Expression Transformations for Expression-Invariant Face Recognition

This paper presents a method of expression-invariant face recognition that transforms an input face image with an arbitrary expression into its corresponding neutral facial expression image. When a new face image with an arbitrary expression is queried, it is represented by a feature vector using the active appearance model (AAM). The facial expression state of the queried feature vector is then identified by a facial expression recognizer, and the vector is transformed into the neutral facial expression vector using the identified state via either the direct or the indirect facial expression transformation: the former uses the bilinear translation directly to transform the facial expression, while the latter uses the bilinear translation to obtain relative expression parameters and transforms the expression indirectly through them. The neutral facial expression vector is then converted into the neutral facial expression image via AAM reconstruction, and face recognition is finally performed by a distance-based matching technique. Experimental results show that the proposed expression-invariant face recognition method is very robust under a variety of facial expressions.

Hyung-Soo Lee, Daijin Kim
A High-Speed Parallel Architecture for Stereo Matching

The stereo matching algorithm based on belief propagation (BP), like other global methods, achieves a low matching error but suffers from a long processing time. In addition to a low error of less than 2.6% in the Middlebury image simulation, our new BP-based architecture provides a high-speed parallel VLSI structure with time complexity O(N) at a suitably small number of iterations, making it useful as a chip in real-time applications such as robots and navigation.

Sungchan Park, Hong Jeong
Light Simulation in a Distributed Driving Simulator

In this paper we present our work on modularizing and distributing a VR application, the Virtual Night Drive simulator. The main focus of our work is the simulation of headlights. A realistic yet interactive visualization of these lights is essential for a usable driving simulator at night. Modern techniques such as pixel and vertex shaders and volume rendering help realize the complex task of light simulation. But there are still scenarios, especially those with several cars with headlights in one scene, that require distributed setups to run the simulation interactively. In this paper we present an architecture that supports several approaches to distributed light simulation, compare it to existing systems, and give an outlook on remaining work.

Stefan Lietsch, Henning Zabel, Martin Eikermann, Veit Wittenberg, Jan Berssenbrügge
Self-adaptive RBF Neural Networks for Face Recognition

A self-adaptive radial basis function neural network (RBFNN) for the recognition of human faces is proposed in this paper. Conventionally, all hidden layer neurons of an RBFNN contribute to the outputs at the output layer. In this work, a confidence measure is imposed to select a subset of the hidden layer neurons for generating outputs, thereby making the RBFNN self-adaptive in choosing which hidden layer neurons to consider. This also reduces the computation time at the output layer by neglecting the ineffective RBFs. The performance of the proposed method has been evaluated on the ORL and UMIST face databases. The experimental results indicate that the proposed method achieves excellent recognition rates and outperforms some traditional face recognition approaches.

S. Gharai, S. Thakur, S. Lahiri, J. K. Sing, D. K. Basu, M. Nasipuri, M. Kundu
An Improved Representation of Junctions Through Asymmetric Tensor Diffusion

Junctions form critical features in motion segmentation, image enhancement, and object classification, to name but a few application domains. Traditional approaches to identifying junctions include convolutional methods, which involve considerable tuning to handle non-trivial inputs, and diffusion techniques, which address only symmetric structure. A new approach is proposed that requires minimal tuning and can distinguish between the basic, but critically different, 'X' and 'T' junctions. It involves a multi-directional representation of gradient structure and employs asymmetric tensor diffusion to emphasize such junctions. The approach combines the desirable asymmetry of convolutional methods with the robust local support of diffusion.

Shawn Arseneau, Jeremy R. Cooperstock
Accurate Extraction of Reciprocal Space Information from Transmission Electron Microscopy Images

As the study of complex systems has become dominant in physics, the link between computational and physical science has become ever more important. In particular, with the rising popularity of imaging techniques in physics, the development and application of cutting-edge computer vision techniques has become vital. Here we present novel image analysis methods that can be used to extract the position of features in diffraction patterns (reciprocal space) with unprecedented accuracy.

Our first contribution is a method for calculating the nonlinear response of photographic film from the noise in the image, enabling the extraction of accurate intensity information. This allows high-resolution (but non-linear) film to be used in place of low-resolution (but linear) CCD cameras. The second contribution is a method for accurately localising very faint features in diffraction patterns by modelling the features and fitting them with the expectation maximization algorithm applied directly to the image. The accuracy of this technique has been verified on synthetic data.

These methods have been applied to transmission electron microscopy data, and have already enabled discoveries which would have been impossible using previously available techniques.

Edward Rosten, Susan Cox
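The second contribution above, fitting modelled features with expectation maximization directly on the image, can be illustrated with a toy version: one isotropic Gaussian spot plus a uniform background, with pixel intensities acting as unnormalised counts. This is our own minimal sketch of the general EM idea under those simplifying assumptions, not the authors' feature model.

```python
import numpy as np

def fit_spot(image, n_iter=50):
    # EM fit of one isotropic Gaussian spot plus a uniform background
    # to a 2D intensity image; intensities act as unnormalised counts.
    h, w = image.shape
    ys, xs = np.mgrid[0:h, 0:w]
    pts = np.stack([ys.ravel(), xs.ravel()], axis=1).astype(float)
    weights = image.ravel().astype(float)

    # Initialise at the brightest pixel, with a guessed width.
    mu = np.array(np.unravel_index(np.argmax(image), image.shape), float)
    sigma2, pi_spot = 4.0, 0.5
    for _ in range(n_iter):
        d2 = ((pts - mu) ** 2).sum(axis=1)
        p_spot = pi_spot * np.exp(-d2 / (2 * sigma2)) / (2 * np.pi * sigma2)
        p_bg = (1.0 - pi_spot) / (h * w)
        r = p_spot / (p_spot + p_bg)                      # E-step
        wr = weights * r
        mu = (pts * wr[:, None]).sum(axis=0) / wr.sum()   # M-step: centre
        d2 = ((pts - mu) ** 2).sum(axis=1)
        sigma2 = (wr * d2).sum() / (2 * wr.sum())         # M-step: width
        pi_spot = wr.sum() / weights.sum()                # M-step: weight
    return mu, np.sqrt(sigma2)
```

Because the centre is re-estimated from intensity-weighted responsibilities rather than a thresholded mask, the fit reaches sub-pixel accuracy even for faint spots on a flat background.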
GPU Accelerated Isosurface Extraction on Tetrahedral Grids

Visualizing large unstructured grids is extremely useful for understanding natural and simulated phenomena. However, informative volume visualization is difficult to achieve efficiently due to the huge amount of information to process. In this paper, we present a method to efficiently tessellate on a GPU large unstructured tetrahedral grids made of millions of cells. This method avoids data redundancy by storing most of the needed data in textures, which are accessed through vertex texture lookups in the vertex shading unit of modern graphics cards. Results show that our method is about two times faster than the same CPU-based extraction and complementary to previous approaches based on GPU registers: it is less efficient for small grids, but handles grids of millions of tetrahedra in graphics memory, which was impossible with previous work. Future hardware evolutions are expected to make our approach much more efficient.

Luc Buatois, Guillaume Caumon, Bruno Lévy
Enhancing Information on Large Scenes by Mixing Renderings

We propose a new model for the visualization of large-scale scenes. It is designed to enhance pertinent information so that it becomes quickly viewable in a large scene, by mixing different kinds of rendering techniques in the same frame. The mixing is performed in real time during the rendering process using GPU programming; moreover, the rendering techniques used and the key points defined by the user can be changed interactively. We present our model and a new non-photorealistic rendering technique. The images produced look better and provide more information than those of traditional rendering techniques.

Vincent Boyer, Dominique Sobczyk
Auto-focusing in Extreme Zoom Surveillance: A System Approach with Application to Faces

Auto-focusing is an indispensable function for imaging systems used in surveillance and object tracking. In this paper, we study image-based passive auto-focusing control for high magnification (>50×) systems built from off-the-shelf telescopes and digital camcorders, with applications to long-range near-ground surveillance and face tracking. Considering both speed of convergence and robustness to the image degradations induced by high system magnifications and long observation distances, we introduce an auto-focusing mechanism suitable for such applications, covering both hardware design and algorithm development. We focus on the derivation of transition criteria via maximum likelihood (ML) estimation for the selection of adaptive step sizes, and on the use of sharpness measures suited to the evaluation of high magnification images. The efficiency of the proposed system is demonstrated by real-time auto-focusing and tracking of faces at distances of 50 m to 300 m.

Yi Yao, Besma Abidi, Michael Tousek, Mongi Abidi
Trifocal Transfer Based Novel View Synthesis for Micromanipulation

In trifocal transfer based novel view synthesis, matched pixels of the two input views are projected into the novel view. The angle of view of the latter is usually narrow, i.e. the novel view is very close to the input ones. In this paper we improve the method to obtain a large angle of view. A simplex approach is used to compute a model of the virtual view's pose, which allows the computation of the novel view at any desired angle of view. We also show that these results are very useful in micromanipulation tasks, where transferring edges alone is sufficient instead of all the pixels of the input views.

Julien Bert, Sounkalo Dembélé, Nadine Lefort-Piat
Simulation of Diabetic Retinopathy Neovascularization in Color Digital Fundus Images

Diabetic retinopathy (DR) has been identified as a leading cause of blindness. One type of lesion, neovascularization (NV), indicates that the disease has entered a vision-threatening phase. Early detection of NV is thus clinically significant. Efforts have been devoted to use computer-aided analyses of digital retina images to detect DR. However, developing reliable NV detection algorithms requires large numbers of digital retinal images to test and refine approaches. Computer simulation of NV offers the potential of developing lesion detection algorithms without the need for large image databases of real pathology. In this paper, we propose a systematic approach to simulating NV. Specifically, we propose two algorithms based on fractal models to simulate the main structure of NV and an adaptive color generation method to assign photorealistic pixel values to the structure. Moreover, we develop an interactive system that provides instant visual feedback to support NV simulation guided by an ophthalmologist. This enables us to combine the low level algorithms with high-level human feedback to simulate realistic lesions. Experiments suggest that our method is able to produce simulated NVs that are indistinguishable from real lesions.

Xinyu Xu, Baoxin Li, Jose F. Florez, Helen K. Li
Mesh Optimisation Using Edge Information in Feature-Based Surface Reconstruction

One of the most challenging and fundamental problems in computer vision is to reconstruct a surface model given a set of uncalibrated 2D images. Well-established Structure from Motion (SfM) algorithms often result in a sparse set of 3D surface points, but surface modelling based on sparse 3D points is not easy. In this paper, we present a new method to refine and optimise surface meshes using edge information in the 2D images. We design a meshing – edge point detection – re-meshing scheme that can gradually refine the surface mesh until it best fits the true physical surface of the object being modelled. Our method is tested on real images and satisfactory results are obtained.

Jun Liu, Roger Hubbold
Finite Sample Bias of Robust Scale Estimators in Computer Vision Problems

In computer vision applications of robust estimation techniques, it is usually assumed that a large number of data samples are available. As a result, the finite sample bias of estimation processes has been overlooked, despite the fact that many asymptotically unbiased estimators have substantial bias when only a moderate number of data samples are available. Such cases are frequently encountered in computer vision practice, so it is important to choose the right estimator for a given task by knowing its finite sample bias. This paper investigates the finite sample bias of robust scale estimation and analyses the finite sample performance of three modern robust scale estimators (the Modified Statistical Scale Estimator, the Residual Consensus estimator and the Two-Step Scale Estimator) that have been used in computer vision applications. Simulations and real data experiments are used to verify the results.

Reza Hoseinnezhad, Alireza Bab-Hadiashar, David Suter
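The three estimators studied in the paper are not reproduced here, but the phenomenon it investigates is easy to demonstrate with the classical median absolute deviation (MAD), a robust scale estimator that is asymptotically consistent for Gaussian data yet noticeably biased at small sample sizes. The Monte Carlo sketch below is ours, using MAD as a stand-in for the paper's estimators.

```python
import numpy as np

def mad_scale(residuals):
    # Median absolute deviation, scaled by 1.4826 so that it is a
    # consistent estimator of sigma for Gaussian residuals.
    med = np.median(residuals)
    return 1.4826 * np.median(np.abs(residuals - med))

def expected_mad(n, trials=20000, seed=0):
    # Monte Carlo estimate of E[scaled MAD] for N(0, 1) samples of
    # size n; the deviation from 1.0 is the finite sample bias.
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(trials, n))
    med = np.median(x, axis=1, keepdims=True)
    return float(np.mean(1.4826 * np.median(np.abs(x - med), axis=1)))
```

For n = 10 the expectation falls noticeably below 1.0 (the estimator systematically underestimates the true scale), while for n = 1000 the bias is negligible — exactly the kind of small-sample effect the abstract warns about.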
Flexible Segmentation and Smoothing of DT-MRI Fields Through a Customizable Structure Tensor

We present a novel structure tensor for matrix-valued images. It allows for user defined parameters that add flexibility to a number of image processing algorithms for the segmentation and smoothing of tensor fields. We provide a thorough theoretical derivation of the new structure tensor, including a proof of the equivalence of its unweighted version to the existing structure tensor from the literature. Finally, we demonstrate its advantages for segmentation and smoothing, both on synthetic tensor fields and on real DT-MRI data.

Thomas Schultz, Bernhard Burgeth, Joachim Weickert
Using Visualizations to Support Design and Debugging in Virtual Reality

We present a visualization system that helps designers conceptualise interactions in a virtual environment (VE). We use event-condition-action triads (triggersets) for specifying interactions, and provide multiple visualizations: sequence diagrams, floorplans and timelines. We present a two-part study: sequencing VE interactions accurately and debugging mistakes. Subjects were divided into two groups: one received visualizations and triggersets, and the other (a control group) received triggersets only. The visualization group described 72.5% of the sequence correctly on average, compared to 56.4% for the non-visualization group. The visualization group also detected more than twice as many errors as the control group, and worked well with multiple linked windows to build an understanding of the design. Floorplans were most useful for an overview, timelines for understanding specific sequences, and sequence diagrams for sequencing and finding mistakes.

Cara Winterbottom, Edwin Blake, James Gain
Strategies for Part-Based Shape Analysis Using Skeletons

Skeletons are often used as a framework for part-based shape analysis. This paper describes some useful strategies that can be employed to improve the performance of such shape matching algorithms. Four key strategies are proposed. The first is to incorporate ligature-sensitive information into the part decomposition and shape matching processes. The second is to treat part decomposition as a dynamic process in which the selection of the final decomposition of a shape is deferred until the shape matching stage. The third is the need to combine both local and global measures when computing shape dissimilarity. Finally, curvature error between skeletal segments must be weighted by the limb-width profile along the skeleton. Experimental results show that the incorporation of these strategies significantly improves the retrieval accuracy when applied to LEMS’s 99 and 216 silhouette database [10].

Wooi-Boon Goh
Automatic Learning of Articulated Skeletons from 3D Marker Trajectories

We present a novel fully-automatic approach for estimating an articulated skeleton of a moving subject and its motion from body marker trajectories that have been measured with an optical motion capture system. Our method does not require a priori information about the shape and proportions of the tracked subject, can be applied to arbitrary motion sequences, and renders dedicated initialization poses unnecessary. To serve this purpose, our algorithm first identifies individual rigid bodies by means of a variant of spectral clustering. Thereafter, it determines joint positions at each time step of motion through numerical optimization, reconstructs the skeleton topology, and finally enforces fixed bone length constraints. Through experiments, we demonstrate the robustness and efficiency of our algorithm and show that it outperforms related methods from the literature in terms of accuracy and speed.

Edilson de Aguiar, Christian Theobalt, Hans-Peter Seidel
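The first stage described above, grouping markers into rigid bodies with a variant of spectral clustering, can be sketched as follows: markers on the same rigid body keep nearly constant mutual distances, so the variability of each pairwise distance over time yields an affinity matrix whose spectral bipartition separates the bodies. This is a hedged toy version of the idea (two bodies, sign-of-second-eigenvector split), not the authors' algorithm.

```python
import numpy as np

def cluster_rigid_bodies(traj):
    # traj: (T, M, 3) marker positions over T frames.  Markers on the
    # same rigid body keep (almost) constant mutual distances, so the
    # standard deviation of each pairwise distance over time gives a
    # natural affinity for spectral clustering into two bodies.
    T, M, _ = traj.shape
    d = np.linalg.norm(traj[:, :, None, :] - traj[:, None, :, :], axis=-1)
    sigma = d.std(axis=0)                        # (M, M) distance variability
    A = np.exp(-sigma / (sigma.mean() + 1e-12))  # high affinity = rigid pair
    np.fill_diagonal(A, 0.0)
    deg = A.sum(axis=1)
    # Normalized graph Laplacian L = I - D^-1/2 A D^-1/2.
    Dm = np.diag(1.0 / np.sqrt(deg))
    L = np.eye(M) - Dm @ A @ Dm
    vals, vecs = np.linalg.eigh(L)
    return vecs[:, 1] > 0          # two-way partition by the second eigenvector
```

The full method would then estimate joint positions between the clustered bodies; the sketch stops at the segmentation step.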
Real Time Hand Gesture Recognition Including Hand Segmentation and Tracking

In this paper we present a system that performs automatic gesture recognition. The system consists of two main components: (i) a unified technique for segmentation and tracking of the face and hands using a skin detection algorithm, with handling of occlusions between skin objects to keep track of the status of the occluded parts; this is realized by combining three useful features, namely color, motion and position; and (ii) a static and dynamic gesture recognition system. Static gesture recognition is achieved using a robust hand shape classification based on PCA subspaces that is invariant to scale along with small translation and rotation transformations. Combining hand shape classification with position information and using discrete hidden Markov models (DHMMs) allows us to accomplish dynamic gesture recognition.

Thomas Coogan, George Awad, Junwei Han, Alistair Sutherland
Physically-Based Real-Time Diffraction Using Spherical Harmonics

Diffraction, interference, dispersive refraction and scattering are four wavelength-dependent mechanisms that produce iridescent colors. Wavelength-dependent functions need to be sampled at discrete wavelengths in the visible spectrum, which increases the computational cost of rendering iridescence. Furthermore, diffraction requires careful sampling since its response function varies at a higher frequency, with sharper peaks, than interference or dispersive refraction. Consequently, rendering physically accurate diffraction has previously either been approximated with simplified color curves or been limited to offline rendering techniques such as ray tracing. We propose a technique for real-time rendering of physically accurate diffraction on programmable hardware. Our technique adaptively samples the diffraction BRDF and precomputes it in a Spherical Harmonic (SH) basis that preserves the peak intensity of the reflected light. While previous work on diffraction used low dynamic range lights, we preserve the full dynamic range of the incident illumination and the diffractive response over the entire hemisphere of incoming light directions, deferring conversion from a wavelength representation to a tone-mapped RGB triplet until display.

Clifford Lindsay, Emmanuel Agu
3D Segmentation of Mammospheres for Localization Studies

Three-dimensional cell culture assays have emerged as the basis of an improved model system for evaluating therapeutic agents, molecular probes, and exogenous stimuli. However, there is a gap in robust computational techniques for segmenting image data collected through confocal or deconvolution microscopy. The main issues are the volume of data, overlapping subcellular compartments, and variation in the scale and size of the subcompartments of interest. A geometric technique has been developed to bound the solution of the problem by first localizing the center of mass of each cell and then partitioning clumps of cells along minimal intersecting surfaces. An approximate solution to the centers of mass is realized through iterative spatial voting, which is tolerant to variation in shape morphology and overlapping compartments and is shown to have excellent noise immunity. These centers of mass are then used to partition a clump of cells along minimal intersecting surfaces that are estimated by the Radon transform. Examples on real data are given, and the performance of the system is evaluated over a large population of data. Although the proposed strategies have been developed and tested on data collected through fluorescence microscopy, they are applicable to other problems in low-level vision and medical imaging.

Ju Han, Hang Chang, Qing Yang, Mary Helen Barcellos-Hoff, Bahram Parvin
Viewpoint Selection for Angiographic Volume

In this paper, we present a novel viewpoint selection framework for angiographic volume data. We propose several view descriptors based on typical concerns of clinicians for the view evaluation. Compared with conventional approaches, our method can deliver a more representative global optimal view by sampling at a much higher rate in the view space. Instead of performing analysis on sample views individually, we construct a solution space to estimate the quality of the views. Descriptor values are propagated to the solution space where an efficient searching process can be performed. The best viewpoint can be found by analyzing the accumulated descriptor values in the solution space based on different visualization goals.

Ming-Yuen Chan, Huamin Qu, Yingcai Wu, Hong Zhou
Recognizing Action Primitives in Complex Actions Using Hidden Markov Models

There is biological evidence that human actions are composed of action primitives, similarly to the way sentences are composed of words and words of phonemes. Given a set of action primitives and an action composed of these primitives, we present a Hidden Markov Model-based approach that recovers the action primitives in that action. In our approach the primitives may have different lengths, and no clear "divider" between the primitives is necessary. The primitive detection is done online; no storing of past data is necessary. We verify our approach on a large database. Recognition rates are only slightly lower than those obtained when recognizing the action primitives in isolation.

V. Krüger
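The selection between candidate primitives in an HMM-based approach like the one above can be illustrated with the standard scaled forward algorithm: score the observation sequence under each primitive's HMM and pick the most likely. The toy models below (discrete emissions, two states) are illustrative assumptions of ours, not the representation used in the paper.

```python
import numpy as np

def log_likelihood(obs, pi, A, B):
    # Scaled forward algorithm for a discrete HMM.
    # pi: initial state probs, A: transitions, B[state, symbol]: emissions.
    alpha = pi * B[:, obs[0]]
    log_p = np.log(alpha.sum())
    alpha = alpha / alpha.sum()
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
        s = alpha.sum()
        log_p += np.log(s)        # accumulate scaling factors to avoid underflow
        alpha = alpha / s
    return log_p

def recognize(obs, models):
    # Index of the primitive HMM under which obs is most likely.
    return int(np.argmax([log_likelihood(obs, *m) for m in models]))
```

Because only the current scaled forward variable is kept, the classification runs online without storing past observations, matching the constraint mentioned in the abstract.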
Polyhedrization of Discrete Convex Volumes

In recent years the problem of obtaining a reversible discrete surface polyhedrization (DSP) has attracted increasing interest within the discrete geometry community. In this paper we propose the first algorithm for obtaining a reversible polyhedrization with a guaranteed performance, i.e., together with a bound on the ratio of the number of facets of the obtained polyhedron to that of one with a minimal number of facets. The algorithm applies to the case of a convex DSP, when a discrete surface M is determined by a convex body in ℝ³. The performance estimation is based on a new lower bound (in terms of the diameter of M) on the number of 2-facets of an optimal polyhedrization. That bound easily extends to an arbitrary dimension n. We also discuss approaches for solving the general 3D DSP problem.

Valentin E. Brimkov, Reneta Barneva
Automatic Camera Calibration and Scene Reconstruction with Scale-Invariant Features

The goal of our research is to robustly reconstruct general 3D scenes from 2D images, with application to automatic model generation in computer graphics and virtual reality. In this paper we aim to produce relatively dense and well-distributed 3D points which can subsequently be used to reconstruct the scene structure. We present a novel camera calibration and scene reconstruction approach using scale-invariant feature points. A generic high-dimensional vector matching scheme is proposed to enhance efficiency and reduce the computational cost of finding feature correspondences. A framework for structure and motion is also presented that better exploits the advantages of scale-invariant features. In this approach we solve the "phantom points" problem, which greatly reduces the possibility of error propagation. The whole process requires no information other than the input images. The results illustrate that our system is capable of producing accurate scene structure and realistic 3D models within a few minutes.

Jun Liu, Roger Hubbold
Surface Fitting to Curves with Energy Control

An algorithm for surface fitting to curves with energy control is proposed in this paper. Given four boundary curves and a set of unorganized curves, we impose a constrained energy on the desired surface and convert the minimum-energy problem into a linear equation system in the control points of the surface. We prove that this equation system has a unique solution. The proposed algorithm is independent of the coordinate system, and experiments show that the resultant surface is fair.

Wen-Ke Wang, Hui Zhang, Jun-Hai Yong, Jia-Guang Sun
Blob Tracking with Adaptive Feature Selection and Accurate Scale Determination

We propose a novel color based tracking framework in which the object configuration and color feature are simultaneously determined via scale-space filtering. The tracker automatically selects the discriminative color feature that best distinguishes foreground from background. According to that feature, a likelihood image of the target is generated for each incoming frame, in which the target's area appears as a blob. The scale of this blob can be determined from the local maxima of differential scale-space filters. We employ the QP_TR trust region algorithm to search for the local maximum of the multi-scale normalized Laplacian filter of the likelihood image, locating the target and determining its scale simultaneously. Tracking results on example sequences show that the proposed method is resilient to color and lighting changes, describes the target more accurately, and achieves much better tracking precision.

Jingping Jia, David Feng, Yanmei Chai, Rongchun Zhao, Zheru Chi
Self-Calibration with Two Views Using the Scale-Invariant Feature Transform

In this paper, we present a self-calibration strategy to estimate camera intrinsic and extrinsic parameters using the scale-invariant feature transform (SIFT). The accuracy of the estimated parameters depends on how reliably a set of image correspondences is established. The SIFT employed in the self-calibration algorithms plays an important role in accurate estimation of camera parameters, because of its robustness to changes in viewing conditions. Under the assumption that the camera intrinsic parameters are constant, experimental results show that the SIFT-based approach using two images yields more competitive results than the existing Harris corner detector-based approach using two images.

Jae-Ho Yun, Rae-Hong Park
Improved Face Recognition Using Extended Modular Principal Component Analysis

In this paper, we present an improved face recognition algorithm using extended modular principal component analysis (PCA). Compared with a regular PCA-based algorithm, the proposed method significantly improves the recognition rate under large variations in pose, lighting direction, and facial expression. The face images are divided into multiple smaller blocks based on a Gaussian model, and the PCA approach is applied to combined blocks corresponding to the two eyes, the nose, the mouth, and the glabella. The priority for merging blocks is decided using fuzzy logic. Since some of the local facial features do not vary with pose, lighting direction, or facial expression, the proposed technique is robust against these variations.

Changhan Park, Inho Paek, Joonki Paik
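The block-based idea underlying modular PCA can be sketched generically: learn a separate PCA basis per image block and concatenate the per-block coefficients, so that a local variation (expression, lighting) only perturbs a few blocks of the feature vector. The Gaussian-model block placement and fuzzy merging of the paper are not reproduced; the fixed non-overlapping blocks below are a simplifying assumption of ours.

```python
import numpy as np

def block_pca(images, block, k):
    # Modular PCA: split each (h, w) image into non-overlapping blocks
    # and learn a k-dimensional PCA basis per block position.
    n, h, w = images.shape
    bh, bw = block
    bases = {}
    for by in range(0, h, bh):
        for bx in range(0, w, bw):
            X = images[:, by:by+bh, bx:bx+bw].reshape(n, -1)
            mean = X.mean(axis=0)
            # Principal directions via SVD of the centered block data.
            _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
            bases[(by, bx)] = (mean, vt[:k])
    return bases

def project(image, bases, block):
    # Concatenate the per-block PCA coefficients into one feature vector.
    bh, bw = block
    feats = []
    for (by, bx), (mean, comps) in sorted(bases.items()):
        x = image[by:by+bh, bx:bx+bw].ravel() - mean
        feats.append(comps @ x)
    return np.concatenate(feats)
```

Recognition would then compare these concatenated feature vectors with a distance measure, optionally weighting blocks by how stable they are under pose and expression changes.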
Shape Reconstruction by Line Voting in Discrete Space

Shape from silhouettes is a binary geometric tomography, since both the objects and the projections, which are measured as silhouettes, are binary. In this paper, we formulate shape from silhouettes in three-dimensional discrete space. This treatment of the problem implies an ambiguity theorem for the reconstruction of objects in discrete space. Furthermore, we show that in three-dimensional space it is possible to reconstruct a class of non-convex objects from a collection of silhouettes, although in the plane a non-convex object cannot be reconstructed from any collection of silhouettes.

Kosuke Sato, Atsushi Imiya, Tomoya Sakai
Characterization of the Closest Discrete Approximation of a Line in the 3-Dimensional Space

The present paper deals with discrete lines in the 3-dimensional space. In particular, we focus on the minimal 0-connected set of closest integer points to a Euclidean line. We propose a definition which leads to geometric, arithmetic and algorithmic characterizations of naive discrete lines in the 3-dimensional space.

J. -L. Toutant
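A common constructive view of the set studied above, sketched here under our own simplifying assumptions (a line through the origin with a positive driving coordinate), steps along the dominant axis and rounds the two remaining coordinates. This is only an illustration of the object in question, not the paper's arithmetic characterization.

```python
import numpy as np

def naive_line_3d(direction, length):
    # Closest integer points to the Euclidean line t * direction:
    # step along the dominant ("driving") axis and round the other
    # two coordinates.  Consecutive points then differ by at most 1
    # in every coordinate, i.e. the set is 0-connected.
    d = np.asarray(direction, dtype=float)
    axis = int(np.argmax(np.abs(d)))
    assert d[axis] > 0, "sketch assumes a positive driving coordinate"
    return [tuple(np.rint(i * d / d[axis]).astype(int)) for i in range(length)]
```

Since the two non-driving slopes have absolute value at most 1, rounding consecutive positions can never jump by more than one voxel per coordinate, which is what makes the digitization 0-connected.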
Margin Maximizing Discriminant Analysis for Multi-shot Based Object Recognition

This paper discusses general object recognition using image sets, in the scenario where multiple shots are available for each object. As a way of matching sets of images, canonical correlations offer many benefits in accuracy, efficiency, and robustness compared with classical parametric distribution-based and non-parametric sample-based methods. However, all previous methods use canonical correlations in a representative rather than a discriminative way for comparing sets of images. Our purpose is to define a transformation such that, in the transformed space, the sum of the canonical correlations (the cosines of the principal angles between any two subspaces) of the intra-class image sets is maximized while the sum of the canonical correlations of the inter-class image sets is minimized. This is done by learning a margin-maximizing linear discriminant function of the canonical correlations, derived through a novel iterative optimization process. In this way, a discriminative use of canonical correlations is obtained. The proposed method significantly outperforms state-of-the-art methods on two object recognition problems over two large databases: a celebrity face database constructed using Image Google and the ALOI database of generic objects, where hundreds of sets of images are taken at different views.

Hui Kong, Eam Khwang Teoh, Pengfei Xu
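As a rough illustration of the similarity measure the paper builds on (not the authors' discriminant learning), the canonical correlations between two linear subspaces are the singular values of the product of their orthonormal bases. The helper name below is hypothetical:

```python
import numpy as np

def canonical_correlations(A, B):
    """Cosines of the principal angles between the column spans of A and B.

    A, B: (d, k) matrices whose columns span two k-dimensional subspaces
    of R^d.  Returns the k canonical correlations in descending order.
    """
    Qa, _ = np.linalg.qr(A)  # orthonormal basis of span(A)
    Qb, _ = np.linalg.qr(B)  # orthonormal basis of span(B)
    # Singular values of Qa^T Qb are cos(theta_i), the canonical correlations.
    return np.linalg.svd(Qa.T @ Qb, compute_uv=False)

# Two subspaces of R^4 sharing exactly one common direction:
A = np.array([[1., 0.], [0., 1.], [0., 0.], [0., 0.]])
B = np.array([[1., 0.], [0., 0.], [0., 1.], [0., 0.]])
corrs = canonical_correlations(A, B)
```

Since the example subspaces share one direction, the correlations come out as 1 and 0.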
A Novel 3D Statistical Shape Model for Segmentation of Medical Images

A 3D Partitioned Active Shape Model (PASM) is proposed in this paper to address the problems of 3D Active Shape Models (ASM) caused by a limited number of training samples, which is usually the case in 3D segmentation. When training sets are small, 3D ASMs tend to be restrictive, because the plausible area/allowable region spanned by relatively few eigenvectors cannot capture the full range of shape variability. 3D PASMs overcome this limitation by using a partitioned representation of the ASM. Given a Point Distribution Model (PDM), the mean mesh is partitioned into a group of small tiles. The statistical priors of the tiles are estimated by applying Principal Component Analysis to each tile, in order to constrain corresponding tiles during deformation. To avoid inconsistency of shapes between tiles, samples are projected as curves in one hyperspace, instead of point clouds in several hyperspaces. The deformed model points are then fitted into the allowable region of the model by using a curve alignment scheme. Experiments on 3D human brain MRIs show that when the number of training samples is limited, 3D PASMs significantly improve the segmentation results compared to 3D ASMs and 3D Hierarchical ASMs, which are the extension of the 2D Hierarchical ASM to the 3D case.

Zheen Zhao, Eam Khwang Teoh
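The "allowable region" constraint that ASM variants apply follows the standard Point Distribution Model recipe: express a shape's deviation from the mean in the eigenvector basis and clamp each coefficient to a few standard deviations. A minimal sketch of that constraint step (the function name and the ±3σ limit are illustrative assumptions, not taken from the paper):

```python
import numpy as np

def constrain_shape(x, mean, eigvecs, eigvals, k=3.0):
    """Project a shape vector into the ASM 'allowable region'.

    The deviation from the mean is expressed in the eigenvector basis and
    each coefficient b_i is clamped to +/- k*sqrt(lambda_i), the usual
    plausibility limit for a Point Distribution Model.
    """
    b = eigvecs.T @ (x - mean)          # model coefficients
    limit = k * np.sqrt(eigvals)
    b = np.clip(b, -limit, limit)       # stay inside the allowable region
    return mean + eigvecs @ b

rng = np.random.default_rng(0)
mean = np.zeros(6)
eigvecs = np.linalg.qr(rng.standard_normal((6, 2)))[0]  # 2 orthonormal modes
eigvals = np.array([4.0, 1.0])
x = mean + eigvecs @ np.array([10.0, 0.5])  # first coefficient exceeds 3*sqrt(4)=6
x_ok = constrain_shape(x, mean, eigvecs, eigvals)
```

The out-of-range first coefficient (10) is clamped to 6, while the second (0.5) passes through unchanged.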
Scale Consistent Image Completion

Most patch based image completion algorithms fill in missing parts of images by copying patches from the known part of the image into the unknown part. The criterion for preferring one patch over another is the compatibility or consistency of the proposed patch with the nearby region that is known or already completed. In this paper we propose adding another dimension to this consistency criterion, namely, scale. Thus, the preferred patch is chosen by evaluating its consistency with respect to smoothed (less detailed) versions of the image, as well as its surroundings in the current version. Applied recursively, this approach results in a multi-scale framework that is shown to yield a dramatic improvement in the robustness of a good existing image completion algorithm.

Michal Holtzman-Gazit, Irad Yavneh
EXDRAP: An Extended Dead Reckoning Architectural Pattern for the Development of Web-Based DVE Applications

The prosperity of distributed 3D applications on the Web heavily depends on the portability and reusability of the content created. Currently, Web3D formats often fall short in resolving such issues. This paper introduces EXDRAP as a hybrid publishing paradigm for declaratively creating Web-based collaborative virtual reality applications, which we believe improves portability and reusability. The major issues concerning the development of Web-based CVEs are closely investigated, and an extended dead reckoning technique and an optimizing translation mechanism are proposed, which reduce the latency (lag) and the amount of memory taken by the browser, respectively. Based on X3D (the successor to VRML) as the ISO standard for real-time computer graphics on the Web, the concepts are successfully implemented and integrated into the Jakarta Struts Framework. In order to gain maximum portability, the integration of the X3D browser and the server-side technology is made possible through the use of ECMAScript instead of Java on the client end.

Nerssi Nasiri Amini, Mostafa Haghjoo
Optimal Parameterizations of Bézier Surfaces

The parameterization of Bézier surfaces greatly affects the results of rendering and tessellation applications. To achieve optimal parameterization, we present two reparameterization algorithms using linear Möbius transformations and quadratic transformations, respectively. The quadratic reparameterization algorithm can produce more satisfactory results than the Möbius reparameterization algorithm, at the cost of degree elevation. Examples are given to show the performance of our algorithms for rendering and tessellation applications.

Yi-Jun Yang, Jun-Hai Yong, Hui Zhang, Jean-Claude Paul, Jiaguang Sun
Constrained Delaunay Triangulation Using Delaunay Visibility

An algorithm for constructing the constrained Delaunay triangulation (CDT) of a planar straight-line graph (PSLG) is presented. Although the uniform grid method can reduce the cost of visibility determinations, the time needed to construct the CDT is still long. The algorithm proposed in this paper decreases the number of edges involved in the computation of visibility by replacing traditional visibility with Delaunay visibility. With Delaunay visibility introduced, all strongly Delaunay edges are excluded from the computation of visibility. Furthermore, a sufficient condition for DT existence (a CDT whose triangles are all Delaunay) is presented to decrease the number of visibility determinations. The mesh generator is robust and exhibits linear time complexity for randomly generated PSLGs.

Yi-Jun Yang, Hui Zhang, Jun-Hai Yong, Wei Zeng, Jean-Claude Paul, Jiaguang Sun
Immersing Tele-operators in Collaborative Augmented Reality

In a collaborative system, the level of co-presence, the feeling of being with the remote participants in the same working environment, is very important for natural and efficient task performance. One way to achieve such co-presence is to recreate the participants as realistically as possible, for instance, with a 3D whole-body representation. In this paper, we introduce a method to recreate and immerse tele-operators in a collaborative augmented reality (AR) environment. The method starts by capturing the 3D point clouds of the remote operators and reconstructs them in the shared environment in real time. In order to realize interaction among the participants, each operator's motion is tracked using a feature extraction and point matching (PM) algorithm. With participant tracking, various types of 3D interaction become possible.

Jane Hwang, Namgyu Kim, Gerard J. Kim
GrayCut – Object Segmentation in IR-Images

Object segmentation is a crucial task for image analysis and has been studied widely in the past. Most segmentation algorithms rely on changes in contrast or on clustering the same colors only. Yet there seems to be no one-size-fits-all solution to the problem. Nevertheless, graph-based energy minimization techniques have been proven to yield very good results in comparison to other techniques. They combine contrast and color information into an energy minimization criterion. We give a brief overview of two recently proposed techniques and present some enhancements to them. Furthermore, a combination of them into the GrayCut algorithm leads to suitable results for segmenting objects in infrared images.

Christian Ruwwe, Udo Zölzer
Unsupervised Clustering of Shapes

A new method for unsupervised clustering of shapes is proposed. The method consists of two steps: in the first step, a preliminary clustering is obtained by considering the distance among shapes after alignment with Procrustes analysis [1],[2]. This step is based on the minimization of the functional θ(N_cluster) = αN_cluster + (1/N_cluster) Σ_i dist(c_i), where N_cluster is the total number of clusters, dist(c_i) is the intra-cluster variability and α is an appropriate constant. In the second step, the curvature of the shapes belonging to the clusters obtained in the first step is examined to (i) identify possible outliers and (ii) further refine the clusters. The proposed method was tested on the Kimia, Surrey and MPEG7 shape databases and was able to obtain correct clusters, corresponding to perceptually homogeneous object categories. It was able to distinguish shapes with subtle differences, such as birds with one or two feet, and to distinguish among very similar animal species...

Mohammad Reza Daliri, Vincent Torre
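The functional θ minimized in the first step can be evaluated directly once a candidate clustering is given. A small sketch, assuming dist(c_i) is taken as the mean squared distance of a cluster's shape descriptors to their centroid (the paper's exact variability measure may differ):

```python
import numpy as np

def theta(clusters, alpha):
    """theta(N) = alpha*N + (1/N) * sum_i dist(c_i), where dist(c_i) is the
    intra-cluster variability (here: mean squared distance to the centroid).
    Lower is better; alpha trades off cluster count against compactness."""
    n = len(clusters)
    dists = [np.mean(np.sum((c - c.mean(axis=0)) ** 2, axis=1)) for c in clusters]
    return alpha * n + sum(dists) / n

# Two tight, well-separated groups of 2-D descriptors:
a = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1]])
b = np.array([[5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
one_cluster = theta([np.vstack([a, b])], alpha=0.1)
two_clusters = theta([a, b], alpha=0.1)
```

For well-separated groups the two-cluster configuration scores lower (better), since the penalty α per extra cluster is outweighed by the drop in intra-cluster variability.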
Markerless Pose Tracking for Augmented Reality

In this paper a new approach is presented for markerless pose tracking in augmented reality. Using a tracking by detection approach, we estimate the 3D camera pose by detecting natural feature points in each input frame and building correspondences between 2D feature points. Instead of modeling the 3D environment, which is changing constantly and dynamically, we use a virtual square to define a 3D reference coordinate system. Camera pose can hence be estimated relative to it and the calculated 3D pose parameters can be used to render virtual objects into the real world. We propose and implement several strategies for robust matching, pose estimation and refinement. Experimental evaluation has shown that the approach is capable of online pose tracking and augmentation.

Chunrong Yuan
Lip Detection Using Confidence-Based Adaptive Thresholding

In this paper we propose a lip detector based on adaptive thresholding of hue-transformed face images. The adaptation is performed according to the confidence values of the estimated lip regions, where the confidence measures the similarity between the detected lip region and a true lip. We construct simple fuzzy rules for the confidence using true-lip statistics of center position, width and height. The threshold value is adaptively changed so that the confidence of the renewed lip region is maximized. Through lip detection experiments on the VidTimit database we demonstrate the performance enhancement of the proposed method.

Jin Young Kim, Seung You Na, Ronald Cole
Optic Flow Integration at Multiple Spatial Frequencies – Neural Mechanism and Algorithm

In this work we present an iterative multi-scale algorithm for motion estimation that follows the mechanisms of motion processing in the human brain. Keeping the properties of a previously presented neural model of cortical motion integration, we created a computationally fast algorithmic implementation of the model. The novel contribution is the extension of the algorithm to operate on multiple scales without the disadvantages of typical coarse-to-fine approaches. Compared to the single-scale implementation, our multi-scale approach generates dense flow fields faster and reduces erroneous motion estimates. In contrast to other approaches, motion estimation on the fine scale is biased by the coarser scales without being corrupted when erroneous motion cues are generated on the coarser scales, e.g., when small objects are overlooked. This multi-scale approach is also consistent with biological observations: the function of fast feedforward projections to higher cortical areas with large receptive fields, and of feedback connections to earlier areas, as suggested by our approach, might contribute to human motion estimation.

Cornelia Beck, Pierre Bayerl, Heiko Neumann
A Critical Appraisal of the Box Counting Method to Assess the Fractal Dimension of Tree Crowns

In this paper, we study the application of the box counting method (BCM) to estimate the fractal dimension of 3D plant foliage. We use artificial crowns with known theoretical fractal dimension to characterize the accuracy of the BCM and we extend the approach to 3D digitized plants. In particular, errors are experimentally characterized for the estimated values of the fractal dimension. Results show that, with careful protocols, the estimated values are quite accurate. Several limits of the BCM are also analyzed in this context. This analysis is used to introduce a new estimator, derived from the BCM estimator, whose behavior is characterized.

D. Da Silva, F. Boudon, C. Godin, O. Puech, C. Smith, H. Sinoquet
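The basic BCM estimator the paper analyzes can be sketched as follows: count the boxes occupied by the point set at several box sizes and fit the slope of log N(s) against log(1/s). This is a generic illustration, not the authors' protocol or their derived estimator:

```python
import numpy as np

def box_counting_dimension(points, sizes):
    """Estimate the fractal (box counting) dimension of a 3-D point set:
    count occupied boxes N(s) at each box size s and fit the slope of
    log N(s) versus log(1/s)."""
    points = points - points.min(axis=0)  # shift into the positive octant
    counts = []
    for s in sizes:
        occupied = {tuple(idx) for idx in (points // s).astype(int)}
        counts.append(len(occupied))
    slope, _ = np.polyfit(np.log(1.0 / np.array(sizes)), np.log(counts), 1)
    return slope

# A densely sampled unit square embedded in 3-D should have dimension near 2:
rng = np.random.default_rng(1)
pts = np.column_stack([rng.random(20000), rng.random(20000), np.zeros(20000)])
d = box_counting_dimension(pts, sizes=[0.5, 0.25, 0.125, 0.0625])
```

The sensitivity of the estimate to the chosen box sizes is exactly the kind of limit the paper characterizes experimentally.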
3D Surface Reconstruction and Registration for Image Guided Medialization Laryngoplasty

The purpose of our project is to develop an image-guided system for medialization laryngoplasty. One of the fundamental challenges in our system is to accurately register the preoperative 3D CT data to the intraoperative 3D surfaces of the patient. In this paper, we present a combined surface- and fiducial-based registration method to register the preoperative 3D CT data to the intraoperative surface of the larynx. To accurately model the exposed surface area, an active-illumination-based stereo vision technique is used for surface reconstruction. To register the intraoperative point clouds to the preoperative 3D CT data, a shape-prior-based ICP method is proposed to quickly register the two surfaces. The proposed approach is capable of tracking the fiducial markers and reconstructing the surface of the larynx with no damage to the anatomical structure. Although the proposed method is specifically designed for image-guided laryngoplasty, it can be applied to other image-guided surgical areas. We used off-the-shelf digital cameras, an LCD projector and a rapid 3D prototyper to develop our experimental system. The final RMS registration error is less than 1 mm.

Ge Jin, Sang-Joon Lee, James K. Hahn, Steven Bielamowicz, Rajat Mittal, Raymond Walsh
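The registration step builds on the classic point-to-point ICP iteration: match points to their nearest neighbors, then solve the rigid alignment in closed form via SVD. A generic single-iteration sketch (without the paper's shape prior) might look like:

```python
import numpy as np

def icp_step(src, dst):
    """One point-to-point ICP iteration (Horn/Kabsch closed form):
    match each source point to its nearest destination point, then
    apply the rigid transform (R, t) minimizing the squared error."""
    # nearest-neighbour correspondences (brute force)
    d2 = ((src[:, None, :] - dst[None, :, :]) ** 2).sum(-1)
    matched = dst[d2.argmin(axis=1)]
    # closed-form rigid alignment of src onto matched
    cs, cm = src.mean(0), matched.mean(0)
    H = (src - cs).T @ (matched - cm)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:  # guard against reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = cm - R @ cs
    return src @ R.T + t

# Recover a small pure translation in one step (correspondences unambiguous):
dst = np.indices((3, 3, 3)).reshape(3, -1).T.astype(float)  # 27 grid points
src = dst + np.array([0.1, -0.05, 0.02])
aligned = icp_step(src, dst)
```

Because the offset is much smaller than the grid spacing, every nearest-neighbor match is correct and a single iteration aligns the clouds exactly.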
Vision-Based User Interfaces for Health Applications: A Survey

This paper presents a survey of vision-based human-computer interfaces for several key fields in health care: data visualization for image-guided diagnosis, image-guided therapy planning and surgery, the operating room, assistance to motor-impaired patients, and monitoring and support of the elderly. The emphasis is on the contribution of the underlying computer vision techniques to the usability and usefulness of interfaces for each specific domain.

It is also shown that end-user requirements have a significant impact on the algorithmic design of the computer vision techniques embedded in the interfaces.

Alexandra Branzan Albu
Multiple Hypothesis Target Tracking Using Merge and Split of Graph’s Nodes

In this paper, we propose a maximum a posteriori formulation of the multiple target tracking problem. We adopt a graph representation for storing the detected regions as well as their associations over time. The multiple target tracking problem is formulated as a multiple-path search in the graph. Due to noisy foreground segmentation, an object may be represented by several foreground regions and one foreground region may correspond to multiple objects. We introduce merge, split and mean shift operations that add new hypotheses to the measurement graph in order to aggregate or split detected blobs, or to re-acquire objects that have not been detected during stop-and-go motion. To make full use of the visual observations, we consider both motion and appearance likelihoods. Experiments have been conducted on both indoor and outdoor data sets, and a comparison has been carried out to assess the contribution of the new tracker.

Yunqian Ma, Qian Yu, Isaac Cohen
Understanding 3D Emotions Through Compact Anthropometric Autoregressive Models

Reproducing realistic facial expressions is an important challenge in human-computer interaction. In this paper we propose a novel method for modeling and recovering the transitions between different expressions through the use of an autoregressive process. To limit computational complexity, we adopt a compact face representation inspired by the MPEG-4 standard, while in terms of expressions the well-known Facial Action Coding System (FACS), comprising the six dominant expressions, is considered. Transitions between expressions are then modeled as a time series according to a linear model. Explicit constraints derived from face anthropometry and point interactions are embedded in this model and minimize the risk of producing non-realistic configurations. Towards optimal animation performance, a particular hardware architecture is used to provide the 3D depth information of the corresponding facial elements during the learning stage, and the Random Sample Consensus algorithm is used for robust estimation of the model parameters. Promising experimental results demonstrate the potential of this approach.

Charlotte Ghys, Nikos Paragios, Bénédicte Bascle
Graph-Based Multi-resolution Temporal-Based Face Reconstruction

Reproducing high-quality facial expressions is an important challenge in human-computer interaction. Laser scanners offer an expensive solution to this problem, while image-based approaches are a low-resolution alternative. In this paper, we propose a new method for stereo reconstruction from multiple video pairs that is capable of producing high-resolution facial models. To this end, a combinatorial optimization approach is considered and coupled in time to produce high-resolution depth maps. The optimization is addressed with the use of graph cuts, leading to precise reconstructions of facial expressions that can then be used for animation.

Charlotte Ghys, Nikos Paragios, Bénédicte Bascle
Web-Based Interface for the Visualization of Microarray Data

This paper presents the design and development of a web-based interface for the visualization of high-dimensional data such as microarray data. A coordinate-based method, namely the 3D Star Coordinate (3SC) projection technique, is used for the visualization. The proposed web-based interface enables the user to choose an existing dataset from the database, or upload a dataset, and visualize the best possible projection of the data on an applet running in the client web browser. The proposed projection algorithm runs in Matlab on the server side for faster computation, and the results are delivered to the client machine using Java Servlets.

B. Vanteru, J. Shaik, M. Yeasin
3D and Texture Modelling of Precolombian Objects

In this paper we present 3D and texture modelling of Precolombian objects. Our experimental setup consists of a non-contact 3D digitizer for range image acquisition and a high-resolution CCD camera to acquire the intensity images of the target object. The texture mapping process is described as a parameterization function with a range image; we estimate the camera orientation with calibration techniques, using a calibration pattern before data acquisition. We describe a multi-view texture mapping strategy that adequately addresses photography-related problems such as inhomogeneous lighting, highlights and occlusion. For each triangle in the model, the optimal image is selected from the source images using a weighted average of the scalar product between the image normal vector and the triangle normal vector. Finally, we show highly detailed texture models of Precolombian objects.

Jorge Hernández, Flavio Prieto
Segmentation of Triangular Meshes Using Multi-scale Normal Variation

In this paper, we present a scheme that segments triangular meshes into several meaningful patches using multi-scale normal variation. In differential geometry, there is a traditional scheme that segments smooth surfaces into patches such as elliptic, hyperbolic, or parabolic regions, separated by curves such as ridges, valleys, and parabolic curves, by means of the principal curvatures of the surface. We present a similar segmentation scheme for triangular meshes. For this purpose, we develop a simple and robust scheme that approximates the principal curvatures on triangular meshes by a multi-scale normal variation scheme. Using these approximated principal curvatures and modifying the classical segmentation scheme for triangular meshes, we design a scheme that segments triangular meshes into several meaningful regions. This segmentation scheme is implemented by evaluating a feature weight at each vertex, which quantifies the likelihood that the vertex belongs to one of the regions. We test our scheme on several face models and demonstrate its capability by segmenting them into several meaningful regions.

Kyungha Min, Moon-Ryul Jung
Integration of Multiple Methods for Class and Specific Object Recognition

Service robots need an object recognition strategy that can work on various objects and backgrounds. Since no single method can work well in all situations, several methods need to be combined so that the robots can automatically select an appropriate one. In this paper we propose a scheme to classify situations depending on the characteristics of the object of interest, the background and user demand. We classify the situations into three categories and employ different techniques for each. We use SIFT and the biologically motivated object recognition technique developed by Serre et al. for two categories. Since these two methods do not work well on the remaining category of situations, we propose a contour-based technique for it. Through our experiments, we show that the contour-based method performs better than the other two methods for this category of situations.

Al Mansur, Md. Altab Hossain, Yoshinori Kuno
An Efficient Photon Mapping Algorithm for Rendering Light-Emitting Fluids

In this paper, we present a novel and efficient algorithm for rendering light-emitting fluids such as fire. For this purpose, we extend the well-known photon mapping algorithm to volumetric environments. The photons emitted from the fluids are stored in a voxelized space instead of the k-d tree used in the original photon mapping scheme. We further propose an efficient photon collecting algorithm based on Bresenham's algorithm, which collects photons with a reduced computational load. We demonstrate the effectiveness and efficiency of the proposed algorithm by visualizing light-emitting fluids such as fire in various styles.

Kyungha Min
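Photon collection along a viewing ray through a voxel grid can be done with an integer 3-D Bresenham walk, visiting each voxel on the ray exactly once. The following is a generic sketch of such a traversal, not the paper's implementation:

```python
def bresenham3d(p0, p1):
    """Integer 3-D Bresenham walk: returns every voxel on the line from p0
    to p1.  A ray marched this way visits exactly the voxels whose stored
    photons need to be collected, without touching any tree structure."""
    x, y, z = p0
    x1, y1, z1 = p1
    dx, dy, dz = abs(x1 - x), abs(y1 - y), abs(z1 - z)
    sx = 1 if x1 > x else -1
    sy = 1 if y1 > y else -1
    sz = 1 if z1 > z else -1
    voxels = [(x, y, z)]
    if dx >= dy and dx >= dz:            # x is the driving axis
        e1, e2 = 2 * dy - dx, 2 * dz - dx
        for _ in range(dx):
            if e1 > 0: y += sy; e1 -= 2 * dx
            if e2 > 0: z += sz; e2 -= 2 * dx
            e1 += 2 * dy; e2 += 2 * dz
            x += sx
            voxels.append((x, y, z))
    elif dy >= dx and dy >= dz:          # y is the driving axis
        e1, e2 = 2 * dx - dy, 2 * dz - dy
        for _ in range(dy):
            if e1 > 0: x += sx; e1 -= 2 * dy
            if e2 > 0: z += sz; e2 -= 2 * dy
            e1 += 2 * dx; e2 += 2 * dz
            y += sy
            voxels.append((x, y, z))
    else:                                # z is the driving axis
        e1, e2 = 2 * dx - dz, 2 * dy - dz
        for _ in range(dz):
            if e1 > 0: x += sx; e1 -= 2 * dz
            if e2 > 0: y += sy; e2 -= 2 * dz
            e1 += 2 * dx; e2 += 2 * dy
            z += sz
            voxels.append((x, y, z))
    return voxels

path = bresenham3d((0, 0, 0), (4, 2, 1))
```

Like its 2-D counterpart, the walk uses only integer additions and comparisons, which is what keeps the per-ray collection cost low.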
Face Recognition Using 2D and 3D Multimodal Local Features

Machine recognition of faces is very challenging because it is an intra-class recognition problem and the variation in faces is very low compared to other biometrics. Global features have been used extensively for face recognition; however, they are sensitive to variations caused by expressions, illumination, pose, occlusions and makeup. We present a novel 3D local feature for automatic face recognition which is robust to these variations. The 3D features are extracted by uniformly sampling local regions of the face in locally defined coordinate bases, which makes them invariant to pose. The high descriptiveness of this feature makes it well suited to the challenging task of intra-class recognition. In the 2D domain, we use the SIFT descriptor and fuse the results with the 3D approach at the score level. Experiments were performed using the FRGC v2.0 data, and the achieved verification rates at 0.001 FAR were 98.5% and 86.0% for faces with neutral and non-neutral expressions, respectively.

Ajmal Mian, Mohammed Bennamoun, Robyn Owens
Adaptive Background Generation for Video Object Segmentation

In this paper, we present a novel method for generating a background image that applies frame differencing and a median filter to sensitive areas where illumination changes occur. Background generation is widely used as a preprocessing step for video-based tracking, surveillance, and object detection. The proposed method utilizes differences and motion changes between two consecutive frames to cope with illumination changes in an image sequence, and applies a median filter to adaptively generate a robust background. As a result, it enables more efficient background reconstruction with fewer frames than existing methods use.

Taekyung Kim, Joonki Paik
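The idea of keeping only temporally static samples (small frame-to-frame difference) and taking their median can be sketched as follows; the threshold and fallback rule are illustrative assumptions, not the paper's exact procedure:

```python
import numpy as np

def generate_background(frames, motion_thresh=10):
    """Pixel-wise background estimate from a short grayscale sequence.

    For every pixel, only the samples whose frame-to-frame difference is
    below motion_thresh (i.e. static samples) are kept, and their median
    becomes the background value; pixels that were always moving fall
    back to the median over the whole sequence."""
    stack = np.stack([f.astype(np.float64) for f in frames])  # (T, H, W)
    diffs = np.abs(np.diff(stack, axis=0))                    # (T-1, H, W)
    static = diffs < motion_thresh                            # static samples
    masked = np.where(static, stack[1:], np.nan)
    bg = np.nanmedian(masked, axis=0)
    fallback = np.median(stack, axis=0)
    return np.where(np.isnan(bg), fallback, bg)

# Static scene of value 100 with a bright object sweeping one column per frame:
frames = [np.full((4, 6), 100.0) for _ in range(6)]
for t, f in enumerate(frames):
    f[:, t] = 255.0
bg = generate_background(frames)
```

Even though every column is occluded by the moving object at some point, the recovered background is the uniform static scene, because the object frames are rejected as motion samples.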
Omnidirectional Camera Calibration and 3D Reconstruction by Contour Matching

This paper presents a novel approach to both omnidirectional camera calibration and 3D reconstruction of the surrounding scene by contour matching in architectural scenes. By using a quantitative measure that considers the inlier distribution, we can estimate more precise camera model parameters and structure from motion. Since most line segments of man-made objects project to contours in omnidirectional images, the contour matching problem is important in the camera recovery process. We propose a novel 3D reconstruction method based on contour matching in three omnidirectional views. First, two points on the contour and their viewing vectors are used to determine an interpretation plane equation, and we obtain a contour intersecting both the plane and the estimated patch of the camera model. Then, a 3D line segment is calculated from two patches and projected to the contour in the third view, and these matching results are used to refine the camera recovery.

Yongho Hwang, Jaeman Lee, Hyunki Hong
Real-Time GPU-Based Simulation of Dynamic Terrain

Although a great deal of research effort has been devoted to the realistic and efficient rendering of terrain data, the existing research mainly focuses on displaying static terrain. The realistic simulation of dynamic terrain, including vehicle-terrain interactions, is absent in most 3D graphical applications. In this paper, we present a new GPU-based algorithm for dynamic terrain simulation. A key component of this algorithm is a Dynamically-Displaced Height Map (DDHM) which is generated and manipulated on the GPU. In addition, our method achieves real-time performance by using new features of the latest graphics hardware and shader technology. We demonstrate our algorithm by simulating a ground vehicle traveling on soft terrain. Our algorithm is particularly useful for ground based visual simulation applications as well as 3D games.

Anthony S. Aquilio, Jeremy C. Brooks, Ying Zhu, G. Scott Owen
High-Resolution Video from Series of Still Photographs

In this paper, we explore the problem of creating a high-resolution video from a series of still photographs. Instead of enhancing the resolution of a video stream, we treat the generation of high-resolution video as an image synthesis problem. Using the continuous shooting mode of a digital camera, we can capture a series of still photographs at 2 to 3 frames per second. The main challenge in our approach is to synthesize the in-between frames from two consecutive still images. The image synthesis approach varies based on the scene motion and image characteristics. We apply optical flow, image segmentation, image filtering and skeleton-based image warping techniques to generate high-resolution video.

Ge Jin, James K. Hahn
Backmatter
Metadata
Title
Advances in Visual Computing
Editors
George Bebis
Richard Boyle
Bahram Parvin
Darko Koracin
Paolo Remagnino
Ara Nefian
Gopi Meenakshisundaram
Valerio Pascucci
Jiri Zara
Jose Molineros
Holger Theisel
Tom Malzbender
Copyright Year
2006
Publisher
Springer Berlin Heidelberg
Electronic ISBN
978-3-540-48631-2
Print ISBN
978-3-540-48628-2
DOI
https://doi.org/10.1007/11919476
