
2006 | Book

Advances in Visual Computing

Second International Symposium, ISVC 2006 Lake Tahoe, NV, USA, November 6-8, 2006. Proceedings, Part II

Editors: George Bebis, Richard Boyle, Bahram Parvin, Darko Koracin, Paolo Remagnino, Ara Nefian, Gopi Meenakshisundaram, Valerio Pascucci, Jiri Zara, Jose Molineros, Holger Theisel, Tom Malzbender

Publisher: Springer Berlin Heidelberg

Book Series: Lecture Notes in Computer Science


About this book

It is with great pleasure that we welcome you all to the proceedings of the 2nd International Symposium on Visual Computing (ISVC 2006) held in Lake Tahoe. Following a successful meeting last year, we witnessed a much stronger and more productive event this year. ISVC offers a common umbrella for the four main areas of visual computing including vision, graphics, visualization, and virtual reality. Its goal is to provide a forum for researchers, scientists, engineers and practitioners throughout the world to present their latest research findings, ideas, developments and applications in the broader area of visual computing. This year, the program consisted of 13 oral sessions, one poster session, ten special tracks, and six keynote presentations. The response to the call for papers was very strong. We received more than twice the papers received last year. Specifically, we received over 280 submissions for the main symposium from which we accepted 65 papers for oral presentation (23% acceptance) and 56 papers for poster presentation (20% acceptance). Special track papers were solicited separately through the Organizing and Program Committees of each track. A total of 57 papers were accepted for presentation in the special tracks. All papers were reviewed with an emphasis on potential to contribute to the state of the art in the field. Selection criteria included accuracy and originality of ideas, clarity and significance of results, and presentation quality. The review process was quite rigorous, involving two to three independent blind reviews followed by several days of discussion. During the discussion period we tried to correct anomalies and errors that might have existed in the initial reviews.

Table of Contents

Frontmatter
Multiple Description Coding for Robust Video Transmission Over Wireless Ad-Hoc Networks

This paper presents a quality-scalable multiple description coding method for H.264/AVC for error-resilient video transmission in wireless ad-hoc networks. In the proposed system, each video frame is divided into two regions according to the motion activities in the scene, and the more important region is encoded into two independently decodable and equally important descriptions. The two descriptions are transmitted over different paths to a receiver in order to alleviate the effect of the unstable channel conditions of wireless ad-hoc networks. If the two descriptions are received error-free, then a high quality video is reconstructed. If one description is lost, then the other description can still be decoded to produce a basic video quality, and furthermore, the correctly received description is used to estimate the lost information of the corrupted description. The proposed MD coder maintains an adequate video quality as long as both descriptions are not simultaneously lost. Simulation results show that the proposed MD coding combined with a multi-path transport system is largely immune to packet losses, and therefore can be a promising solution for robust video communications in wireless ad-hoc networks.
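
The splitting-and-concealment idea behind multiple description coding can be illustrated with a toy polyphase example. The sketch below (Python/NumPy) is not the paper's H.264-based coder: it simply splits a frame into two descriptions by even and odd rows, merges them losslessly when both arrive, and falls back to row replication when one description is lost.

```python
import numpy as np

def split_descriptions(frame):
    # Polyphase split: even rows form description 0, odd rows description 1.
    return frame[0::2], frame[1::2]

def merge(d0, d1):
    # Both descriptions received: interleave rows to recover the full frame.
    h, w = d0.shape[0] + d1.shape[0], d0.shape[1]
    out = np.empty((h, w), dtype=d0.dtype)
    out[0::2], out[1::2] = d0, d1
    return out

def conceal(d_received):
    # One description lost: replicate received rows to fill the missing ones
    # (a crude stand-in for the error-concealment step described in the paper).
    return np.repeat(d_received, 2, axis=0)

frame = np.random.rand(64, 64).astype(np.float32)
d0, d1 = split_descriptions(frame)
assert np.array_equal(merge(d0, d1), frame)   # error-free case: exact reconstruction
basic = conceal(d0)                           # one path failed: basic quality only
print(np.abs(basic - frame).mean())
```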

Joohee Kim
Emotion-Based Textile Indexing Using Colors, Texture and Patterns

We propose a textile indexing system which can classify textile images based on human emotions. The emotions can be regarded as emotional reactions of human beings when they view specific textile images. The evaluation system starts with extracting features of textile images such as colors, texture and patterns using various image processing techniques. The proposed system utilizes both fuzzy rules and neural networks. The fuzzy rules are determined for six emotional features which can be formulated with respect to color and texture. On the other hand, the neural network is used for recognizing patterns which can be used in classifying textile images based on the 4 other emotional features. For the machine learning component of the system, we selected 70 subjects so that they could view and annotate 160 textile images using ten pairs of emotional features. The fuzzy rule based component of the system uses color features and texture features in order to predict six pairs of emotional features such as (warm, cold), (gay, sober), (cheerful, dismal), (light, dark), (strong, weak), and (hard, soft). The neural-network based component of the system can predict four pairs of emotional features such as (natural, unnatural), (dynamic, static), (unstable, stable) and (gaudy, plain). Our experimental results showed that the proposed system was effective for predicting human emotions based on textile images and improving the accuracy of indexing the textile images based on emotional features.

Soo-jeong Kim, Eun Yi Kim, Karpjoo Jeong, Jee-in Kim
Affine Camera for 3-D Retinal Surface Reconstruction

We study 3D retinal surface reconstruction using an affine camera for the following two reasons: (1) NIH's retinal imaging protocols specify a narrow field of view, and (2) each retinal image has small depth variation. Specifically, we incorporate prior knowledge of human retina geometry in the reconstruction process, and introduce a point-based approach to estimate the retinal spherical surface. We also show that lens distortion removal and affine bundle adjustment reduce the reconstruction error in terms of the deviation from the underlying spherical surface. Simulation results on both synthetic data and real images show the effectiveness and robustness of the proposed algorithm.

Thitiporn Chanwimaluang, Guoliang Fan
The Diagnostic Application of Brain Image Processing and Analysis System for Ischemic Stroke

The diffusion weighted image (DWI) technique is routinely used for the diagnosis and treatment of early stroke due to its superior performance, especially compared with conventional magnetic resonance imaging (MRI), in detecting acute ischemic stroke. Based on DWI examinations, this paper proposes an application of image processing in a computer-aided diagnosis system, which can effectively calculate the volume of a lesion and provide 3D reconstruction data for it. The potential benefits of using our system include higher accuracy of acute stroke lesion definition, reduced time and effort in calculating the volume, and 3D reconstruction images of stroke patients, which can effectively assist doctors in making more accurate diagnoses and treating patients in a more convenient way. Compared with the traditional method, the experimental results have shown the superior performance of the proposed system.

Tzyh-Chyang Chang, Jiann-Der Lee, Chung-Hsien Huang, Tony Wu, Chi-Jen Chen, Shwu-Jiuan Wu
Development of Early Tunnel Fire Detection Algorithm Using the Image Processing

To avoid large-scale damage from fires occurring in tunnels, a system is needed that discovers such incidents quickly and minimizes their effects. However, it is impossible to maintain continuous human observation of tunnel CCTV for 24 hours a day. A fire and smoke detection system based on image processing that warns of a fire state is therefore very convenient, since it makes it possible to minimize damage even when nobody is in front of the monitor. In this paper, we propose image processing algorithms for the early detection of fire and smoke occurrence in tunnels. Fire and smoke detection in tunnels differs from forest fire detection because elements such as car and tunnel lights, which do not occur in the forest environment, are present, so a dedicated algorithm has to be developed. The two algorithms proposed in this paper are able to detect the exact position at an early stage of the event. In addition, by comparing the properties of each algorithm through experiments, we have demonstrated their validity.

Dongil Han, Byoungmoo Lee
Simulation of Artificial Winds Using a Hardware Illumination Technique

Real world winds can be classified into two categories: natural winds and artificial winds. While previous researchers have focused on natural winds, we also need to simulate artificial winds, which may be generated by electric fans, air conditioners, human mouths, etc. In this paper, we present a mathematical model for artificial winds, and also present an efficient method for calculating the forces applied on the objects under the influence of artificial winds. Our method uses a hardware illumination technique to finally calculate the artificial wind forces, and thus the simulation process is accelerated considerably. Hence our model achieves real-time display of visually plausible motions, and its implementation shows sufficient performance for real-time interactivity. Through combining our method with previous natural wind models, we can also expect to build up an integrated wind simulation system for computer graphics applications.

Namkyung Lee, Nakhoon Baek, Kwan Woo Ryu
Learning for Multi-view 3D Tracking in the Context of Particle Filters

In this paper we present an approach to use prior knowledge in the particle filter framework for 3D tracking, i.e. estimating the state parameters such as joint angles of a 3D object. The probability of the object’s states, including correlations between the state parameters, is learned a priori from training samples. We introduce a framework that integrates this knowledge into the family of particle filters and particularly into the annealed particle filter scheme. Furthermore, we show that the annealed particle filter also works with a variational model for level set based image segmentation that does not rely on background subtraction and, hence, does not depend on a static background. In our experiments, we use a four camera set-up for tracking the lower part of a human body by a kinematic model with 18 degrees of freedom. We demonstrate the increased accuracy due to the prior knowledge and the robustness of our approach to image distortions. Finally, we compare the results of our multi-view tracking system quantitatively to the outcome of an industrial marker based tracking system.

Juergen Gall, Bodo Rosenhahn, Thomas Brox, Hans-Peter Seidel
Improving Brightness for a Multi-projector Display Considering Image Content

Very large-area multi-projector display systems have wide applications since they provide immersive environments. The brightness of the projectors in a system is not identical due to the design and aging of the bulbs and the optical properties of the projectors. Therefore, it is imperative to coordinate the brightness level and characteristics of the projectors in the system. This issue has been addressed by many researchers. However, state-of-the-art solutions have considered only the brightness characteristics of the projectors. In this paper, we propose to also take into account the dynamic range of the image content. We expand the dynamic range of the input image to utilize the maximum possible brightness of a projector while satisfying a spatial smoothness constraint. The proposed method achieves a seamless multi-display with an impressive improvement of the brightness level.
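
As a rough illustration of content-dependent dynamic range expansion, the following sketch linearly stretches the used intensity range of an input image toward the projector's full range. The percentile limits and output range are assumed values, and the paper's spatial smoothness constraint across projectors is not modeled here.

```python
import numpy as np

def expand_dynamic_range(img, lo_pct=1.0, hi_pct=99.0, out_max=255.0):
    """Stretch the image's used intensity range toward the display's full range.

    Percentile clipping (lo_pct/hi_pct) keeps a few outlier pixels from
    dictating the mapping; the paper additionally enforces spatial smoothness
    across projectors, which this toy version omits.
    """
    lo, hi = np.percentile(img, [lo_pct, hi_pct])
    stretched = (img.astype(np.float64) - lo) / max(hi - lo, 1e-6) * out_max
    return np.clip(stretched, 0.0, out_max)

img = (np.random.rand(480, 640) * 120 + 60).astype(np.uint8)   # low-contrast input
out = expand_dynamic_range(img)
print(img.min(), img.max(), out.min(), out.max())
```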

Hee-Won Lee, Byung-Uk Lee
VirtualQWERTY: Textual Communication in Virtual Reality

This paper presents VirtualQWERTY, an effective and powerful textual communication toolkit for virtual reality. The first key idea of this paper is that we exploit the text input mechanism of mobile phones. We present a widget whose user interface is similar to that of the keypad on a mobile phone. Each keypad is composed of several buttons that carry several characters. The second key idea of this paper is that the characters on the keypad are arranged by mimicking the character layout of the QWERTY keyboard. These key ideas give users an intuitive and easily learnable textual communication toolkit in virtual reality. In order to exploit VirtualQWERTY, we present an indexing scheme by which users locate the button they want and a selection scheme by which users choose the desired character from the characters on the button using various input devices. After testing this tool with many subjects, we draw an important conclusion about the performance of the textual communication toolkit in virtual environments.

Jaewoo Ahn, Kyungha Min
Networked Heterogeneous Camera System for High Resolution Face Images

This paper presents a system that combines fixed stereo cameras with wide field of view (FOV) and high resolution active Pan/Tilt (PT) cameras into a networked platform. Designed for security applications, the purpose is to provide high resolution face images while covering a wide area. A custom PT unit allowing a fixed camera projection center was designed to hold the high resolution camera. Two such cameras in addition to one stereo camera were implemented in a real scene in order to test the effectiveness of the system. Preliminary results demonstrate the efficiency of the proposed system.

Sofiane Yous, Abdelaziz Khiat, Masatsugu Kidode, Tsukasa Ogasawara
A New Method for Approximating Optimal Parameterization of Polynomial Curves

Rational re-parameterizations of a polynomial curve that preserve the curve degree and the [0,1] parameter domain are characterized by a single degree of freedom. The optimal re-parameterization in this family can be identified, but the existing methods may exhibit too much residual parametric speed variation for motion control and other applications. In this paper, a new method for approximating the optimal parameterization is presented; the parameterization obtained by the new method minimizes the maximum deviation from unit speed within this family. Experiments comparing the efficiency of this algorithm with other methods are also included.

Fenghua Guo, Caiming Zhang
Interpolation by Piecewise Quadric Polynomial to Scattered Data Points

This paper presents a new method to construct a C^1 polynomial surface interpolating scattered data points. The given data points are triangulated into triangle networks, and a C^1 piecewise quadric interpolant patch is constructed in the adjacent region of each point. For the different cases of the number of adjacent points, two different algorithms are used for determining the patches. The surface patch on each triangle is constructed by the weighted combination of the three patches corresponding to the vertices of the triangle. All of the surface patches are put together to form the whole surface with C^1 continuity. The construction of the surface by the new method is simple, and the surface has the shape suggested by the given data points and is more fairing. Comparison examples are included.

Shanshan Gao, Caiming Zhang, Li Zhong
Detection and Localization of the Top Object in the Stack of Objects

This paper proposes a new algorithm that detects the top object in a stack of arbitrarily overlapped objects, using the fact that the boundary edges of an occluding surface are not disconnected by other edges. To determine whether a surface is an occluding or an occluded one, this paper represents the objects in an input image using an attributed relation graph, where a node represents a surface and an arc connecting two nodes shows the adjacency of the two surfaces in the image. An arc is weighted by two relation values, each of which is assigned to one of its nodes and gives the number of edges shared with the surface connected at the opposite side of the arc. Based on the magnitudes of the relation values in the attributed graph, all surfaces are classified as either occluding or occluded ones and grouped as node sets. The top object is selected as the result of the merging process of the node sets. The experimental results have shown that the proposed algorithm efficiently separates the top object in various images of object stacks, with a time complexity of O(n), where n is the number of edges in the image.

Hernsoo Hahn, Youngjoon Han
Recognition of 3D Object Using Attributed Relation Graph of Silhouette’s Extended Convex Hull

This paper presents a new approach to recognizing a 3D object using a single camera, based on the extended convex hull of its silhouette. It aims at minimizing the DB size and simplifying the processes for matching and feature extraction. For this purpose, two concepts are introduced: the extended convex hull and the measurable region. An extended convex hull consists of convex curved edges as well as convex polygons. A measurable region is the cluster of the viewing vectors of a camera, represented as points on the orientation sphere, from which a specific set of surfaces can be measured. A measurable region is represented by the extended convex hull of the silhouette which can be obtained by viewing the object from the center of the measurable region. Each silhouette is represented by a relation graph where a node describes an edge using its type, length, reality, and components. Experimental results are included to show that the proposed algorithm works efficiently even when the objects are overlapped and partially occluded. The time complexity for searching the object model in the database is O(N), where N is the number of silhouette models.

Hernsoo Hahn, Youngjoon Han
Image Retrieval by Local Contrast Patterns and Color

Despite the simplicity of local binary patterns (LBP) or local edge patterns (LEP) for texture description, they do not always convey complex pattern information. Moreover, they are susceptible to various image distortions. Hence we propose a new descriptor called Local Contrast Patterns (LCP), which encodes the joint difference distribution of filter responses that can be effectively computed by higher-order directional Gaussian derivatives. Though statistical moments of the filter responses are typical texture features, various complex patterns (e.g., edges, points, blobs) are well captured by the proposed LCP. Observation shows that any one of the first few derivatives can produce promising results compared to LBP (or LEP). To extract further improved outcomes, two sub-optimal descriptors (LCP1, LCP2) are computed by maximizing local bit frequency and local contrast ratio. A global RGB color histogram is then combined with the proposed LCP descriptors for color-texture retrieval. Experiments with grayscale (Brodatz album) and color-texture (MIT VisTex) databases show that our proposed LCP (LCP+RGB) produces 8% and 2.1% (1.4% and 1.9%) improved recall rates compared to LBP and LEP (LBP+RGB and LEP+RGB) features. The achievement of the lowest rank ratio, i.e., 2.789 for gray images (1.482 for color images), also indicates the potential of the proposed LCP2 (LCP2+RGB) feature.
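
For reference, the LBP baseline the descriptor is compared against can be computed in a few lines. This is a standard 8-neighbour, radius-1 LBP histogram, not the proposed LCP (which builds on directional Gaussian-derivative responses).

```python
import numpy as np

def lbp_histogram(gray):
    """Standard 8-neighbour LBP (radius 1) followed by a 256-bin histogram."""
    g = gray.astype(np.int32)
    c = g[1:-1, 1:-1]
    # Offsets of the 8 neighbours, ordered clockwise from the top-left.
    shifts = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros_like(c)
    for bit, (dy, dx) in enumerate(shifts):
        neigh = g[1 + dy:g.shape[0] - 1 + dy, 1 + dx:g.shape[1] - 1 + dx]
        codes |= ((neigh >= c).astype(np.int32) << bit)   # one bit per neighbour
    hist, _ = np.histogram(codes, bins=256, range=(0, 256))
    return hist / hist.sum()

texture = (np.random.rand(128, 128) * 255).astype(np.uint8)   # placeholder texture
print(lbp_histogram(texture)[:8])
```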

M. K. Bashar, N. Ohnishi
A VR Game Platform Built Upon Wireless Sensor Network

This paper presents a VR game platform built upon wireless sensor network (WSN). The WSN is composed of multiple beacons and a receiver. Each beacon transmits both an ultrasonic pulse and the RF message at the same time. Then, the receiver attached to a moving device uses the standard time difference of arrival technique to estimate its distance from each beacon. The estimated distances are passed to the context-aware game server, which computes the location of the receiver. The location data are fed into VR games, and determine the virtual device’s motion. The prototype implementation proves that WSN and VR games can be neatly integrated, and can lead to a new type of natural interface for the next-generation game.
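
The distance and localization steps described above follow the standard time-difference-of-arrival recipe; a minimal sketch is given below, assuming a 2-D setup, a nominal speed of sound, and a linear least-squares trilateration rather than whatever solver the game server actually uses.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s at ~20 °C; RF propagation delay is treated as zero.

def tdoa_distance(t_rf, t_ultrasound):
    """Distance to a beacon from the arrival-time gap of its RF and ultrasonic pulses."""
    return SPEED_OF_SOUND * (t_ultrasound - t_rf)

def trilaterate(beacons, distances):
    """Linear least-squares 2-D position fix from >= 3 beacon positions and ranges."""
    beacons = np.asarray(beacons, dtype=float)
    d = np.asarray(distances, dtype=float)
    x0, y0, d0 = beacons[0, 0], beacons[0, 1], d[0]
    # Subtracting the first sphere equation from the others linearises the system.
    A = 2.0 * (beacons[1:] - beacons[0])
    b = (d0**2 - d[1:]**2
         + beacons[1:, 0]**2 - x0**2
         + beacons[1:, 1]**2 - y0**2)
    pos, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pos

beacons = [(0.0, 0.0), (5.0, 0.0), (0.0, 5.0), (5.0, 5.0)]   # assumed beacon layout (m)
true_pos = np.array([2.0, 3.0])
dists = [np.linalg.norm(true_pos - np.array(b)) for b in beacons]
# Simulated timestamps: RF arrives (effectively) instantly, ultrasound after d / c.
dists_est = [tdoa_distance(0.0, d / SPEED_OF_SOUND) for d in dists]
print(trilaterate(beacons, dists_est))   # ~ [2. 3.]
```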

Doo-seop Eom, Jungshik Jang, Taeyoung Kim, JungHyun Han
Feature Extraction and Selection for Recognizing Humans by Their Gait

We describe an efficient and effective feature extraction and selection method for identifying humans by their gait. A sequential set of 2D stick figures is extracted from gait silhouette data by determining the joint angles and body points, and it is used to represent the gait signature that is the primitive data for extracting motion parameters. The motion parameters in the gait signatures are stride length, cycle time, speed, and joint angles, and the gait features are extracted from these motion parameters. By measuring the class separability of the extracted features, important features are selected from the original feature sets for classifying human gait patterns. Then, a k-NN classifier is used to analyze the discriminatory ability of the selected features. In experiments, a high gait classification performance of 96.7% has been achieved.
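
A plain k-NN classifier over such gait feature vectors can be sketched as follows; the feature values here are invented placeholders (stride length, cycle time, speed) rather than measurements from the stick-figure extraction described above.

```python
import numpy as np

def knn_classify(train_X, train_y, query, k=3):
    """Plain k-NN: label by majority vote among the k nearest training feature vectors."""
    dists = np.linalg.norm(train_X - query, axis=1)
    nearest = train_y[np.argsort(dists)[:k]]
    labels, counts = np.unique(nearest, return_counts=True)
    return labels[np.argmax(counts)]

# Hypothetical gait feature vectors: [stride length (m), cycle time (s), speed (m/s)].
rng = np.random.default_rng(0)
subject_a = rng.normal([1.40, 1.05, 1.33], 0.03, size=(20, 3))
subject_b = rng.normal([1.55, 0.95, 1.63], 0.03, size=(20, 3))
train_X = np.vstack([subject_a, subject_b])
train_y = np.array([0] * 20 + [1] * 20)

query = rng.normal([1.41, 1.04, 1.34], 0.03)          # an unseen walk by subject A
print(knn_classify(train_X, train_y, query, k=3))     # expected: 0
```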

Jang-Hee Yoo, Mark S. Nixon
Rectification of Illumination in Images Used for Shape from Focus

Shadows occurring in images often lead to undesirable results in computer vision. An inherent weakness of 3D reconstruction from image focus is that it requires the imaged scene to have significant texture. In many real-world applications, surfaces can be smooth and lacking in detectable texture. In such cases, Shape-from-Focus (SFF) generates inaccurate and sparse depth maps. For accurate calculation of the depth map, we consider the effects of illumination. In SFF the images are taken with one light source and shadows occur in each frame. The brighter regions of the images give a more accurate depth map, whereas the less bright regions give a less accurate one. In this paper we propose an algorithm that removes the shadows from the image sequences used for SFF methods. We show the results and compare them with previous results. The simulation results show that the depth maps of objects are improved when the shadows are removed.
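
For context, the classical shape-from-focus pipeline that the proposed illumination rectification feeds into can be sketched as below: a modified-Laplacian focus measure is aggregated over a window and the depth at each pixel is the index of the frame that maximizes it. The shadow-removal step itself is not reproduced here, and the window size is an assumption.

```python
import numpy as np
from scipy.ndimage import convolve, uniform_filter

def modified_laplacian(img):
    # |d2I/dx2| + |d2I/dy2|, a common focus measure for shape from focus.
    kx = np.array([[0, 0, 0], [-1, 2, -1], [0, 0, 0]], dtype=float)
    ky = kx.T
    return np.abs(convolve(img, kx)) + np.abs(convolve(img, ky))

def depth_from_focus(stack, window=9):
    """stack: (n_frames, H, W) images taken at increasing focus settings.
    Returns, per pixel, the index of the frame with the largest local focus measure."""
    sml = np.stack([uniform_filter(modified_laplacian(f), size=window) for f in stack])
    return np.argmax(sml, axis=0)

stack = np.random.rand(12, 64, 64)           # placeholder focus sequence
depth_map = depth_from_focus(stack)
print(depth_map.shape, depth_map.min(), depth_map.max())
```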

S. M. Mannan, Aamir Saeed Malik, Humaira Nisar, Tae-Sun Choi
Bilateral Edge Detection on a Virtual Hexagonal Structure

Edge detection plays an important role in image processing. This paper presents an edge detection method based on bilateral filtering, which achieves better performance than single Gaussian filtering. In this form of filtering, both the spatial closeness and the intensity similarity of pixels are considered in order to preserve important visual cues provided by edges and to reduce the sharpness of transitions in intensity values as well. In addition, the edge detection method proposed in this paper operates on sampled images represented on a newly developed virtual hexagonal structure. Due to the compact and circular nature of the hexagonal lattice, a better quality edge map is obtained on the hexagonal structure than with common edge detection on a square structure. Experimental results using the proposed methods exhibit encouraging performance.
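
On an ordinary square grid, the bilateral-prefiltering idea can be sketched with OpenCV as follows; the hexagonal resampling that the paper's contribution relies on is not reproduced, and the filter and threshold parameters are arbitrary assumptions.

```python
import cv2
import numpy as np

def bilateral_edges(gray, d=9, sigma_color=40, sigma_space=5,
                    low_thresh=50, high_thresh=150):
    """Bilateral prefiltering (spatial closeness + intensity similarity), then edges.

    cv2.bilateralFilter smooths within regions of similar intensity while
    leaving sharp transitions mostly intact, so the subsequent edge detector
    produces a cleaner edge map than plain Gaussian smoothing would.
    """
    smoothed = cv2.bilateralFilter(gray, d, sigma_color, sigma_space)
    return cv2.Canny(smoothed, low_thresh, high_thresh)

gray = (np.random.rand(128, 128) * 255).astype(np.uint8)   # stand-in for a test image
edges = bilateral_edges(gray)
print(edges.dtype, edges.shape, int(edges.max()))
```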

Xiangjian He, Wenjing Jia, Namho Hur, Qiang Wu, Jinwoong Kim, Tom Hintz
Issues and Implementation of C^1 and C^2 Natural Neighbor Interpolation

Smooth local coordinates have been proposed by Hiyoshi and Sugihara (2000) to improve the classical Sibson's and Laplace coordinates. These smooth local coordinates are computed by integrating geometric quantities over weights in the power diagram. In this paper we describe how to efficiently implement the Voronoi-based C^2 local coordinates. The globally C^2 interpolant that Hiyoshi and Sugihara presented in 2004 is then compared to Sibson's and Farin's C^1 interpolants when applied to scattered data interpolation.

T. Bobach, M. Bertram, G. Umlauf
Iris Recognition Using a Low Level of Details

This paper describes a new iris recognition algorithm, which uses a low level of details. Combining statistical classification and elastic boundary fitting, the iris is first localized. Then, the localized iris image is down-sampled by a factor of m, and filtered by a modified Laplacian kernel. Since the output of the Laplacian operator is sensitive to a small shift of the full-resolution iris image, the outputs of the Laplacian operator are computed for all space-shifts. The quantized output with maximum entropy is selected as the final feature representation. Experimentally we showed that the proposed method produces superb performance in iris segmentation and recognition.
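
A rough sketch of the shift-selection idea, assuming integer phases of the factor-m downsampling grid, a plain (not the paper's modified) Laplacian kernel, and uniform quantization:

```python
import numpy as np
from scipy.ndimage import convolve

LAPLACIAN = np.array([[0, 1, 0], [1, -4, 1], [0, 1, 0]], dtype=float)

def entropy(values, bins):
    hist, _ = np.histogram(values, bins=bins)
    p = hist[hist > 0] / hist.sum()
    return -np.sum(p * np.log2(p))

def max_entropy_laplacian_code(iris_full_res, m=4, levels=4):
    """Try every integer phase of the factor-m downsampling grid, Laplacian-filter
    and quantize each, and keep the code whose histogram has the highest entropy."""
    best_code, best_h = None, -1.0
    for dy in range(m):
        for dx in range(m):
            low = iris_full_res[dy::m, dx::m].astype(float)
            response = convolve(low, LAPLACIAN)
            # Uniform quantization of the filter response into a few levels.
            edges = np.linspace(response.min(), response.max() + 1e-9, levels + 1)
            code = np.digitize(response, edges[1:-1])
            h = entropy(code, bins=levels)
            if h > best_h:
                best_h, best_code = h, code
    return best_code, best_h

iris = (np.random.rand(128, 128) * 255).astype(np.uint8)   # placeholder localized iris
code, h = max_entropy_laplacian_code(iris)
print(code.shape, round(h, 3))
```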

Index Terms:

iris segmentation, iris recognition, shift-invariant, multiscale Laplacian kernel.

Jaemin Kim, Seongwon Cho, Daewhan Kim, Sun-Tae Chung
Dynamic Reconstruction of Complex Planar Objects on Irregular Isothetic Grids

The vectorization of discrete regular images has been widely developed in many image processing and synthesis applications, where images are considered as regular static data. Regardless of the final application, we proposed in [14] a reconstruction algorithm for planar graphical elements on irregular isothetic grids. In this paper, we present a dynamic version of this algorithm to control the reconstruction. Indeed, we handle local refinements to efficiently update our complete shape representation. We also illustrate an application of our contribution to the interactive approximation of implicit curves by lines, controlling the topology of the reconstruction.

Antoine Vacavant, David Coeurjolly, Laure Tougne
Layout of Multiple Views for Volume Visualization: A User Study

Volume visualizations can have drastically different appearances when viewed using a variety of transfer functions. A problem then occurs in trying to organize many different views on one screen. We conducted a user study of four layout techniques for these multiple views. We timed participants as they separated different aspects of volume data for both time-invariant and time-variant data using one of four different layout schemes. The layout technique had no impact on performance when used with time-invariant data. With time-variant data, however, the multiple view layouts all resulted in better times than did a single view interface. Surprisingly, different layout techniques for multiple views resulted in no noticeable difference in user performance. In this paper, we describe our study and present the results, which could be used in the design of future volume visualization software to improve the productivity of the scientists who use it.

Daniel Lewis, Steve Haroz, Kwan-Liu Ma
Video Indexing and Retrieval in Compressed Domain Using Fuzzy-Categorization

There has been an increased interest in video indexing and retrieval in recent years. In this work, the indexing and retrieval of visual content is based on features extracted from the compressed domain. Direct processing in the compressed domain spares the decoding time, which is extremely important when indexing a large number of multimedia archives. A fuzzy-categorizing structure is designed in this paper to improve the retrieval performance. In our experiment, a database consisting of basketball videos has been constructed for our study. This database includes three categories: full-court match, penalty and close-up. First, spatial and temporal feature extraction is applied to train the fuzzy membership functions using a minimum-entropy optimization algorithm. Then, the max composition operation is used to generate a new fuzzy feature to represent the content of the shots. Finally, the fuzzy-based representation becomes the indexing feature for the content-based video retrieval system. The experimental results show that the proposed algorithm is quite promising for semantic-based video retrieval.

Hui Fang, Rami Qahwaji, Jianmin Jiang
Computing Homology for Surfaces with Generalized Maps: Application to 3D Images

In this paper, we present an algorithm that efficiently computes generators of the first homology group of a closed surface, orientable or not. Starting with an initial subdivision of a surface, we simplify it to its minimal form (minimal referring to the number of cells), while preserving its homology. Homology generators can thus be directly deduced from the minimal representation of the initial surface. Finally, we show how this algorithm can be used on a 3D labelled image in order to compute the homology of each region described by its boundary.

Guillaume Damiand, Samuel Peltier, Laurent Fuchs
Dynamic Texture Analysis and Synthesis Using Tensor Decomposition

Dynamic textures are sequences of images showing temporal regularity, such as smoke, flames, flowing water, or moving grass. Despite being a multidimensional signal, existing models reshape the dynamic texture into a 2D signal for analysis. In this article, we propose to directly decompose the multidimensional (tensor) signal, free from reshaping operations. We show that decomposition techniques originally applied to study psychometric or chemometric data can be used for this purpose. Since spatial, temporal, and color information are analyzed at the same time, such techniques make it possible to obtain more compact models. Only one third or fewer of the model coefficients are needed for the same quality and synthesis cost as 2D-based models, as illustrated by experiments on real dynamic textures.

Roberto Costantini, Luciano Sbaiz, Sabine Süsstrunk
Color Pair Clustering for Texture Detection

A novel approach to the extraction of image regions of uniform color and its application to automatic texture detection is discussed. The method searches for alternating color patterns, through hierarchical clustering of color pairs from adjacent image regions. The final result is a hierarchy of texture regions, described by their boundaries and a set of features, detected at multiple accuracy levels. The results are presented on some images of natural scenes from the Berkeley segmentation dataset and benchmark.

Lech Szumilas, Allan Hanbury
CPU-GPU Multithreaded Programming Model: Application to the Path Tracing with Next Event Estimation Algorithm

Today’s hardware includes powerful devices such as graphics process units (GPU) that are not always used to their maximum capacities. Our main goal is to take advantage of these unused resources. To achieve this, we abstract GPUs as SIMD streaming coprocessors and use them within the framework of a multithreaded parallel model. Thus we aim to use all the computing power of a modern PC for speeding up a global illumination simulation software.

The global illumination of a virtual scene can be estimated with stochastic methods such as Path Tracing. These methods however remain costly in terms of rendering time, because of the high sampling required to produce good quality frames. Most of the rendering time is spent performing intersection tests between rays and triangles. We propose to speed up the rendering of a frame by using all the available CPUs and GPUs. Our work is based on the ray engine developed by Carr et al. for ray tracing, and is mapped to our parallel programming model.

Christophe Cassagnabère, François Rousselle, Christophe Renaud
Real-Time and Robust Monocular SLAM Using Predictive Multi-resolution Descriptors

We describe a robust system for vision-based SLAM using a single camera which runs in real-time, typically around 30 fps. The key contribution is a novel utilisation of multi-resolution descriptors in a coherent top-down framework. The resulting system provides superior performance over previous methods in terms of robustness to erratic motion, camera shake, and the ability to recover from measurement loss. SLAM itself is implemented within an unscented Kalman filter framework based on a constant position motion model, which is also shown to provide further resilience to non-smooth camera motion. Results are presented illustrating successful SLAM operation for challenging hand-held camera movement within desktop environments.

Denis Chekhlov, Mark Pupilli, Walterio Mayol-Cuevas, Andrew Calway
A Faster Graph-Based Segmentation Algorithm with Statistical Region Merge

The paper presents a modification of a bottom-up graph-theoretic image segmentation algorithm to improve its performance. This algorithm uses Kruskal's algorithm to build minimum spanning trees for segmentation that reflect global properties of the image: a predicate is defined for measuring the evidence of a boundary between two regions, and the algorithm makes greedy decisions to produce the final segmentation. We modify the algorithm by reducing the number of edges required for sorting based on two criteria. We also show that the algorithm produces an over-segmented result and suggest a statistical region merge process that reduces the over-segmentation. We have evaluated the algorithm by segmenting various video clips. Our experimental results indicate the improved performance and quality of segmentation.
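
For reference, the unmodified Felzenszwalb-Huttenlocher procedure that is being improved can be written compactly: sort the grid-graph edges by weight and greedily merge components with a union-find structure whenever the boundary-evidence predicate allows it. The paper's edge-reduction criteria and statistical region merge are not included in this sketch, and the constant k is an arbitrary choice.

```python
import numpy as np

class DisjointSet:
    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n
        self.internal = [0.0] * n        # max MST edge weight inside each component

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]   # path halving
            x = self.parent[x]
        return x

    def union(self, a, b, w):
        a, b = self.find(a), self.find(b)
        if self.size[a] < self.size[b]:
            a, b = b, a
        self.parent[b] = a
        self.size[a] += self.size[b]
        self.internal[a] = max(self.internal[a], self.internal[b], w)

def segment(gray, k=300.0):
    """Greedy Kruskal-style segmentation on the 4-connected pixel grid."""
    h, w = gray.shape
    idx = lambda y, x: y * w + x
    edges = []
    for y in range(h):
        for x in range(w):
            if x + 1 < w:
                edges.append((abs(float(gray[y, x]) - gray[y, x + 1]), idx(y, x), idx(y, x + 1)))
            if y + 1 < h:
                edges.append((abs(float(gray[y, x]) - gray[y + 1, x]), idx(y, x), idx(y + 1, x)))
    edges.sort()                         # Kruskal: process edges by increasing weight
    ds = DisjointSet(h * w)
    for wgt, a, b in edges:
        ra, rb = ds.find(a), ds.find(b)
        if ra == rb:
            continue
        # Boundary-evidence predicate: merge only if the edge is no heavier than the
        # internal variation of either component plus its size-dependent tolerance k/|C|.
        if wgt <= min(ds.internal[ra] + k / ds.size[ra],
                      ds.internal[rb] + k / ds.size[rb]):
            ds.union(ra, rb, wgt)
    return np.array([ds.find(i) for i in range(h * w)]).reshape(h, w)

img = (np.random.rand(40, 40) * 255).astype(np.uint8)
labels = segment(img)
print(len(np.unique(labels)), "components")
```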

Ahmed Fahad, Tim Morris
Sensor Fusion Based Obstacle Detection/Classification for Active Pedestrian Protection System

This paper proposes a sensor fusion based obstacle detection/classification system for an active pedestrian protection system. At the front end of the vehicle, one laser scanner and one camera are installed. Clustering and tracking of range data from the laser scanner generate obstacle candidates. The vision system classifies the candidates into three categories: pedestrian, vehicle, and other. A Gabor filter bank extracts the feature vector of each candidate image. The obstacle classification is implemented by combining two classifiers with the same architecture: a support vector machine for pedestrians and one for vehicles. An obstacle detection system that recognizes the object class can actively protect pedestrians while reducing the false positive rate.

Ho Gi Jung, Yun Hee Lee, Pal Joo Yoon, In Yong Hwang, Jaihie Kim
Combinatorial Pyramids and Discrete Geometry for Energy-Minimizing Segmentation

This paper defines the basis of a new hierarchical segmentation framework based on an energy minimization scheme. This new framework is based on two formal tools. First, a combinatorial pyramid efficiently encodes a hierarchy of partitions. Secondly, discrete geometric estimators precisely measure some important geometric parameters of the regions. These measures, combined with photometrical and topological features of the partition, allow us to design energy terms based on discrete measures. Our segmentation framework exploits these energies to build a pyramid of image partitions with a minimization scheme. Some experiments illustrating our framework are shown and discussed.

Martin Braure de Calignon, Luc Brun, Jacques-Olivier Lachaud
Fast Dense Stereo Matching Using Adaptive Window in Hierarchical Framework

A new area-based stereo matching method in a hierarchical framework is proposed. Local methods generally measure the similarity between image pixels using a local support window. An appropriate support window, in which the pixels have similar disparity, should be selected adaptively for each pixel. Our algorithm consists of the following two steps. In the first step, given an estimated initial disparity map, we obtain an object boundary map to distinguish homogeneous regions from object boundary regions. It is based on the assumption that the depth boundary lies inside the intensity boundary. In the second step, to improve accuracy, we choose the size and shape of the window using the boundary information to acquire an accurate disparity map. Generally, the boundary regions are determined by the disparity information, which itself must be estimated. Therefore, we propose a hierarchical structure for simultaneous boundary and disparity estimation. Finally, we propose a post-processing scheme for the removal of outliers. The algorithm does not use a complicated optimization. Instead, it concentrates on the estimation of an optimal window for each pixel within an improved hierarchical framework; therefore, it is very efficient in computational complexity. The experimental results on the standard data set demonstrate that the proposed method achieves better performance than conventional methods in homogeneous regions and at object boundaries.
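
The fixed-window SAD baseline that the adaptive-window scheme improves on looks roughly like this; the window size, disparity range and winner-take-all selection are assumptions of the sketch, not the paper's settings.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def sad_disparity(left, right, max_disp=32, window=9):
    """Winner-take-all disparity from window-aggregated absolute differences.

    A fixed square window is used for every pixel; the paper instead adapts the
    window's size and shape near object boundaries and refines the result
    hierarchically.
    """
    h, w = left.shape
    costs = np.full((max_disp, h, w), np.inf)
    for d in range(max_disp):
        diff = np.abs(left[:, d:].astype(float) - right[:, :w - d or None].astype(float))
        costs[d, :, d:] = uniform_filter(diff, size=window)   # box aggregation
    return np.argmin(costs, axis=0)

left = (np.random.rand(60, 80) * 255).astype(np.uint8)
right = np.roll(left, -4, axis=1)        # toy pair with a constant 4-pixel shift
disp = sad_disparity(left, right, max_disp=16)
print(np.median(disp))                    # ~4 away from the image borders
```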

SangUn Yoon, Dongbo Min, Kwanghoon Sohn
A New Photographing Apparatus for Skin Maps of Human Face Rendering

Movie content makers generally require high-quality texture maps built from images to render realistic facial skin. However, it is difficult to capture images of the same facial expression from various directions because the expression can change easily. A new data capture apparatus using the polarization properties of light was built to capture mapping images of the human face. It captures images without specular reflection in a moment, and the images are used to make texture maps, bump maps and special characteristic maps. The maps are used to render a realistic human.

Haedong Kim, Howook Jang, Inho Lee
Investigating the Dynamics of Facial Expression

This paper is concerned with capturing the dynamics of facial expression. The dynamics of facial expression can be described as the intensity and timing of a facial expression and its formation. To achieve this we developed a technique that can accurately classify and differentiate between subtle and similar expressions, involving the lower face. This is achieved by using Local Linear Embedding (LLE) to reduce the dimensionality of the dataset and applying Support Vector Machines (SVMs) to classify expressions. We then extended this technique to estimate the dynamics of facial expression formation in terms of intensity and timing.

Jane Reilly, John Ghent, John McDonald
Global Topology Preservation in Isosurface Extraction of Volumetric Data

We study the topology inconsistency between the boundary of the object of interest and the isosurface extracted with Marching Cubes, and present a new method that preserves the topology of the object boundary. Unlike the MC-like algorithms, the new method is not restricted to the Marching Cubes partition of the grid vertices into inner and outer vertices. We find that a pixel carved out by the Marching Cubes algorithm may include a part of the object of interest, and it may even be a key part for preserving the boundary topology. So a boundary pixel is defined as a pixel that includes the object boundary and whose value lies between those of the inner and outer pixels. This paper shows how to find the boundary pixels and the boundary position within them. Thus we can find the object boundary in a pixel whose value lies outside the isovalue, which is regarded as background by the Marching Cubes algorithm. Experiments based on CT data also verify the effectiveness of the new method.

Xingqiang Yang, Caiming Zhang
Real-Time Model-Based SLAM Using Line Segments

Existing monocular vision-based SLAM systems favour interest point features as landmarks, but these are easily occluded and can only be reliably matched over a narrow range of viewpoints. Line segments offer an interesting alternative, as line matching is more stable with respect to viewpoint changes and lines are robust to partial occlusion. In this paper we present a model-based SLAM system that uses 3D line segments as landmarks. Unscented Kalman filters are used to initialise new line segments and generate a 3D wireframe model of the scene that can be tracked with a robust model-based tracking algorithm. Uncertainties in the camera position are fed into the initialisation of new model edges. Results show the system operating in real-time with resilience to partial occlusion. The maps of line segments generated during the SLAM process are physically meaningful and their structure is measured against the true 3D structure of the scene.

Andrew P. Gee, Walterio Mayol-Cuevas
Feature Correspondences from Multiple Views of Coplanar Ellipses

We address the problem of feature correspondences in images of coplanar ellipses, with the objective of benefiting from robust ellipse fitting algorithms. The main difficulty is the lack of immediately available projective invariant points. Therefore, our key idea is to construct virtual line and point features using the invariance of tangency under perspective projection. The proposed method first requires a robust detection of ellipse edge points to fit a parametric model to each ellipse. The feature lines are then obtained by computing the four bitangents to each pair of ellipses. The points are derived by considering the tangent points and the intersection points between bitangents. Results of experimental studies are presented to demonstrate the reliability and robustness of the feature extraction process. Subpixel accuracy is easily achieved. A real application to camera self-calibration is also described.

C. Barat, J. F. Menudet, H. Louhichi, T. Fournel
Evaluation of Subpixel Tracking Algorithms

Evaluation of tracking algorithms can be done in several different ways, using either real or synthetic images. The main benefit of the second alternative is that the environment is completely controlled: there is no problem obtaining the ground truth, and the noise is well known. This paper contains the results from an evaluation of subpixel tracking algorithms. The main focus of the evaluation is to compare the performance of subpixel methods with different computational complexity, in order to see whether the tracking performance justifies more complex algorithms.
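
One of the cheapest subpixel estimators such an evaluation would typically include is a three-point parabolic fit through the correlation peak; the 1-D sketch below is a generic example, not a method taken from the paper.

```python
import numpy as np

def parabolic_subpixel_peak(correlation):
    """Refine the integer argmax of a 1-D correlation curve with a 3-point parabola fit.

    Fitting y = a*x^2 + b*x + c through (-1, y0), (0, y1), (+1, y2) puts the
    vertex at x = (y0 - y2) / (2*(y0 - 2*y1 + y2)).
    """
    i = int(np.argmax(correlation))
    if i == 0 or i == len(correlation) - 1:
        return float(i)                     # no neighbours to fit against
    y0, y1, y2 = correlation[i - 1], correlation[i], correlation[i + 1]
    return i + (y0 - y2) / (2.0 * (y0 - 2.0 * y1 + y2))

# Sample a smooth peak whose true maximum lies between integer positions.
true_peak = 10.3
x = np.arange(20)
corr = np.exp(-0.5 * (x - true_peak) ** 2)
print(parabolic_subpixel_peak(corr))        # close to 10.3
```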

Johan Skoglund, Michael Felsberg
Adaptive Real-Time Rendering for Large-Scale Molecular Models

Real-time surface rendering of large-scale molecular models such as a colon bacillus requires a great number of polygons to be displayed on a display device. Since long latency in display and manipulation is fatal to maintaining presence in a virtual environment, high-performance computing power and high-quality graphical components are required for real-time rendering of such a large-scale molecular model. We propose an algorithm which enables a PC-level computer to render and display large-scale molecular models in real time. The proposed algorithm adaptively visualizes large-scale molecular models. We tested our algorithm with molecular models whose polygon counts range from 533,774 to 2,656,246. Our experiments showed that the frame rates for displaying and manipulating the models ranged from 17.85 to 55.64 frames per second. The frame rates are 4.3 to 6.9 times higher than those obtained using a conventional system. Our system enables biologists to display and manipulate in real time large-scale molecular models that could not be handled fast enough for use in a virtual environment with conventional systems.

Jun Lee, Sungjun Park, Jee-In Kim
An Efficient Algorithm for Connected Attribute Thinnings and Thickenings

Connected attribute filters are morphological operators widely used for their ability to simplify an image without moving its contours. In this paper, we present a fast, versatile and easy-to-implement algorithm for grayscale connected attribute thinnings and thickenings, a subclass of connected filters for the wide range of non-increasing attributes. We show that our algorithm consumes less memory and is computationally more efficient than other available methods on natural images, for strictly identical results.

David Lesage, Jérôme Darbon, Ceyhun Burak Akgül
Creating Multi-layered 3D Images Using Reversible Jump MCMC Algorithms

Standard 3D ranging and imaging systems process only a single return from an assumed single opaque surface. However, there are situations when the laser return consists of multiple peaks due to the footprint of the beam impinging on a target with surfaces distributed in depth or with semi-transparent surfaces. If all these returns are processed, a more informative multi-layered 3D image is created. We propose a unified theory of pixel processing for ladar data using a Bayesian approach that incorporates spatial constraints through a Markov Random Field. The different parameters of the several returns are estimated using reversible jump Markov chain Monte Carlo (RJMCMC) techniques in combination with an adaptive strategy of delayed rejection to improve the estimates of the parameters.

Sergio Hernandez-Marin, Andrew M. Wallace, Gavin J. Gibson
A Multi-Modal Interface for Road Planning Tasks Using Vision, Haptics and Sound

Planning of transportation infrastructure requires analyzing combinations of many different types of geo-spatial information (maps). Displaying all of these maps together in a traditional Geographic Information System (GIS) limits its effectiveness through visual clutter and information overload. Multi-modal interfaces (MMIs) aim to improve the efficiency of human-computer interaction by combining several types of sensory modalities. We present a prototype virtual environment using vision, haptics and sonification for multi-modal GIS scenarios such as road planning. We use a point-haptic device (Phantom) for various haptic effects and sonification to present additional non-visual data while drawing on a virtual canvas. We conducted a user study to gather experience with this multi-modal system and to learn more about how users interact with geospatial data via various combinations of sensory modalities. The results indicate that certain forms of haptics and audio were preferentially used to present certain types of spatial data.

Matt Newcomb, Chris Harding
Venation Pattern Analysis of Leaf Images

This work investigates pattern recognition methods to detect venation patterns on leaves. An automated technique that involves scale-space analysis and an automated technique that includes a combination of edge detectors are compared with a manual technique. A small data set of five images is considered in this initial exploratory work and the results are qualitatively evaluated. The results show that the technique involving scale-space analysis is a promising research direction to pursue.

James Clarke, Sarah Barman, Paolo Remagnino, Ken Bailey, Don Kirkup, Simon Mayo, Paul Wilkin
A Mobile Low-Cost Motion Capture System Based on Accelerometers

Low-cost accelerometers can be employed to create a motion-capture solution for below US$ 100. It may be used in mobile settings employing a portable digital recording device to capture the analog data of 15 degrees of freedom. The solution is integrated with standard 3D animation software. We introduce methods to extract and tweak kinematical as well as timing data from these acceleration sensors, which are attached to an actor’s limbs. These methods take care of the fact that the measured acceleration data alone can neither provide complete nor accurate information to satisfactorily reconstruct the captured motion. Particular emphasis is placed on the ease of use, in particular concerning the calibration of the system.

Jan-Phillip Tiesel, Jörn Loviscach
Fusing Individual Algorithms and Humans Improves Face Recognition Accuracy

Recent work indicates that state-of-the-art face recognition algorithms can surpass humans matching identity in pairs of face images taken under different illumination conditions. It has been demonstrated further that fusing algorithm- and human-derived face similarity estimates cuts error rates substantially over the performance of the best algorithms. Here we employed a pattern-based classification procedure to fuse individual human subjects and algorithms with the goal of determining whether strategy differences among humans are strong enough to suggest particular man-machine combinations. The results showed that error rates for the pairwise man-machine fusions were reduced an average of 47 percent when compared to the performance of the algorithms individually. The performance of the best pairwise combinations of individual humans and algorithms was only slightly less accurate than the combination of individual humans with all seven algorithms. The balance of man and machine contributions to the pairwise fusions varied widely, indicating that a one-size-fits-all weighting of human and machine face recognition estimates is not appropriate.

Alice J. O’Toole, Fang Jiang, Hervé Abdi, P. Jonathon Phillips
A Method for the Automatic Analysis of Colour Category Pixel Shifts During Dichromatic Vision

In this paper we present a method for automatically evaluating the amount of colour changes images undergo when perceived by individuals with colour deficient vision. This measure enables the classification of images based on the extent images visually change when viewed by people with one of the three classes of dichromatic (protanopia, deuteranopia, and tritanopia) colour vision. By measuring the extent that colour images appear perceptually different a designer, or automated layout technique, will have an indication of whether the choice of colour usage in an image could lead to colour ambiguity or colour confusions.

Mike Bennett, Aaron Quigley
Rendering Dynamic Real-World Scenes Using Image Spheres

We introduce Image Spheres, a novel approach for rendering dynamic real-world scenes. As sensors for online geometry reconstruction improve more and more, it is desirable to render their output while up-to-date color information is applied. Images received from cameras mounted on moving pan-tilt units are uploaded to the graphics memory almost immediately after they are received. With very little preprocessing they are used to color the perpetually updated geometry immediately. To control graphics memory usage, our method includes a fast image replacement algorithm which considers redundancy when deciding which image to delete. We demonstrate our results on high-polygon laser scan models combined with geometry data received by a Photonic Mixer Device range sensor.

Stephan Behrendt
Arithmetic Discrete Parabolas

In the present paper, we propose a new definition of discrete parabolas, the so-called arithmetic discrete parabolas. We base our approach on a non-constant thickness function and characterize the 0-connected and 1-connected parabolas in terms of this thickness function. These results extend the well-known characterization of the κ-connectedness of arithmetic discrete lines, depending on the norms ||·||_∞ and ||·||_1 of their normal vector.

I. Debled-Rennesson, E. Domenjoud, D. Jamet
Retinal Spot Lesion Detection Using Adaptive Multiscale Morphological Processing

We present a new spot lesion detection algorithm for retinal images with background diabetic retinopathy (DR) pathologies. The highlight of this algorithm is its capability to deal with all DR-related spot lesions of various sizes and shapes, which is accomplished by a unique adaptive multiscale morphological processing technique. A scale map is generated to delineate lesion areas based on an edge model, and it is used to fuse multiscale morphological processing results for lesion enhancement. Local/relative entropy thresholding techniques are employed to segment lesion regions, and a scale-guided validation process is used to remove over-detections based on the scale map. The proposed algorithm is tested on 30 retinal images where all spot lesions are hand-labelled for performance evaluation. Compared with two existing algorithms, the proposed one significantly improves the overall performance of spot lesion detection, producing higher sensitivity and/or predictive values.
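
A minimal stand-in for the multiscale morphological step is a bank of white and black top-hats at several structuring-element sizes, with the per-pixel maximum taken across scales; the paper's edge-model-driven scale map and entropy thresholding are not reproduced, and the radii and threshold below are arbitrary.

```python
import numpy as np
from scipy.ndimage import white_tophat, black_tophat

def multiscale_spot_response(gray, radii=(2, 4, 8, 16)):
    """Per-pixel maximum over scales of the top-hat responses.

    White top-hat highlights bright spots smaller than the structuring element,
    black top-hat highlights dark spots; taking the maximum across several
    element sizes catches lesions of different diameters.
    """
    g = gray.astype(float)
    bright = np.max([white_tophat(g, size=2 * r + 1) for r in radii], axis=0)
    dark = np.max([black_tophat(g, size=2 * r + 1) for r in radii], axis=0)
    return bright, dark

retina = (np.random.rand(128, 128) * 255).astype(np.uint8)   # placeholder fundus image
bright, dark = multiscale_spot_response(retina)
candidates = (bright > 40) | (dark > 40)     # crude global threshold for illustration
print(int(candidates.sum()), "candidate pixels")
```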

Xin Zhang, Guoliang Fan
Iterative Closest SIFT Formulation for Robust Feature Matching

This paper presents a new feature matching algorithm. The proposed algorithm integrates the Scale Invariant Feature Transform (SIFT) local descriptor into the Iterative Closest Point (ICP) scheme. The new algorithm addresses the problem of finding the appropriate match between repetitive patterns that appear in man-made scenes. The matching of two sets of points is computed by integrating appearance and distance properties between putative match candidates. To demonstrate the performance of the new algorithm, the new approach is applied to real images. The results show that the proposed algorithm increases the number of correct feature correspondences and at the same time significantly reduces matching errors when compared to the original SIFT and ICP algorithms.

Rafael Lemuz-López, Miguel Arias-Estrada
Invertible Polygonalization of 3D Planar Digital Curves and Application to Volume Data Reconstruction

In this paper, we describe a new algorithm to compute in linear time a 3D planar polygonal curve from a planar digital curve, that is a curve which belongs to a digital plane. Based on this algorithm, we propose a new method for converting the boundary of digital volumetric objects into polygonal meshes which aims at providing a topologically consistent and invertible reconstruction, i.e. the digitization of the obtained object is equal to the original digital data. Indeed, we do not want any information to be added or lost. In order to limit the number of generated polygonal faces, our approach is based on the use of digital geometry tools which allow the reconstruction of large pieces of planes.

Martine Dexet, David Cœurjolly, Eric Andres
Lateral and Depth Calibration of PMD-Distance Sensors

A growing number of modern applications such as position determination, object recognition and collision prevention depend on accurate scene analysis. The estimation of an object's distance relative to an observer's position by image analysis or laser scan techniques is thereby still the most time-consuming and expensive part.

A lower-priced and much faster alternative is distance measurement with modulated, coherent infrared light based on the Photonic Mixer Device (PMD) technique. As this approach is a rather new and unexplored method, proper calibration techniques have not been widely investigated yet. This paper describes an accurate distance calibration approach for PMD-based distance sensing.

Marvin Lindner, Andreas Kolb
Autonomous Vehicle Video Aided Navigation – Coupling INS and Video Approaches

As autonomous vehicle systems become more prevalent, their navigation capabilities become increasingly critical. Currently most systems rely on a combined GPS/INS solution for vehicle pose computation, while some systems use a video-based approach. One problem with a GPS/INS approach is the possible loss of GPS data, especially in urban environments. Using only INS in this case causes significant drift in the computed pose. The video-based approach is not always reliable due to its heavy dependence on image texture. Our approach to autonomous vehicle navigation exploits the best of both of these by coupling an outlier-robust video-based solution with INS when GPS is unavailable. This allows accurate computation of the system’s current pose in these situations. In this paper we describe our system design and provide an analysis of its performance, using simulated data with a range of different noise levels.

Chris Baker, Chris Debrunner, Sean Gooding, William Hoff, William Severson
A Method of Improving Cloud Predictions for Real-Time Weather Forecasting and Visualization

Indirect or passive observations using satellite remote sensing in the visible, infrared and microwave spectra provide global coverage of the thermal states of the cloud tops or the ground surface. The commonly employed temperature profile matching techniques using satellite data and numerical weather prediction models are only relatively successful in estimating the cloud top height (CTH) for optically dense middle and high clouds (cloud tops at heights generally greater than two kilometers). Therefore, accurate predictions of low-level CTH present a formidable challenge to the forecasting and nowcasting community. In this study, we present an approach to estimating low-level CTH by combining the above-cloud information extracted from the satellite imagery and the below-cloud information obtained from weather station measurements. Assumed ranges of brightness temperature and CTH are used to process the cloudy pixels for visualization and classification purposes. Our study indicates that the CTH evaluated using satellite data confirms the presence of low-level clouds in the range 400-1000 m. Accurate estimates of the boundary layer CTH can provide better low-level cloud products (e.g., fog or clouds formed by fog lifting) for improved weather forecasting and applications in the research community.

Ramesh Vellore, Darko Koračin, Melanie Wetzel
An Efficient Hardware Architecture for Full-Search Variable Block Size Motion Estimation in H.264/AVC

In this paper, we propose a high-speed hardware architecture for the implementation of full-search variable block size motion estimation (VBSME) suitable for high-quality video compression. For high-quality video with a large frame size and search range, the memory bandwidth is mainly responsible for throughput limitations and power consumption in VBSME. The proposed architecture is designed to reduce the memory bandwidth by adopting a "meander"-like scan for the highly overlapped data of the search area and using on-chip memory to reuse the overlapped data. We can reuse 94% of the previous candidate block for the current one and save about 23% of memory access cycles in a search range of [-16, +15]. The architecture has been prototyped in Verilog HDL, simulated with ModelSim and synthesized with Synopsys Design Compiler using a Samsung 0.18um standard cell library. Under a clock frequency of 51 MHz, the simulation results show that the architecture can achieve real-time processing of a 720x576 picture size at 30 fps with a search range of [-16, +15].
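
A software reference model of the workload is plain full-search block matching with SAD over a [-16, +15] range; the hardware contribution (meander scan and on-chip data reuse) concerns how these SADs are scheduled, which the sketch below does not capture.

```python
import numpy as np

def full_search_sad(cur, ref, bx, by, block=16, search=16):
    """Exhaustive SAD search for the 16x16 block at (bx, by) of the current frame.

    Every candidate displacement in [-search, +search-1]^2 is evaluated, exactly
    the per-macroblock workload the VBSME hardware has to sustain.
    """
    h, w = ref.shape
    target = cur[by:by + block, bx:bx + block].astype(np.int32)
    best_sad, best_mv = None, (0, 0)
    for dy in range(-search, search):
        for dx in range(-search, search):
            y, x = by + dy, bx + dx
            if y < 0 or x < 0 or y + block > h or x + block > w:
                continue                      # candidate falls outside the frame
            cand = ref[y:y + block, x:x + block].astype(np.int32)
            sad = int(np.abs(target - cand).sum())
            if best_sad is None or sad < best_sad:
                best_sad, best_mv = sad, (dx, dy)
    return best_mv, best_sad

ref = (np.random.rand(64, 64) * 255).astype(np.uint8)
cur = np.roll(ref, (2, -3), axis=(0, 1))       # toy motion: content shifts by dx=+3, dy=-2
print(full_search_sad(cur, ref, bx=24, by=24))  # -> ((3, -2), 0)
```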

Seung-Man Pyen, Kyeong-Yuk Min, Jong-Wha Chong, Satoshi Goto
A Domain Reduction Algorithm for Incremental Projective Reconstruction

In this paper we address the problem of recovering the three-dimensional shape of an object and the motion of the camera from multiple feature correspondences in an image sequence. We present a new incremental projective factorization algorithm using a perspective camera model. The original projective factorization method produces robust results; however, it cannot be applied to real-time applications since it is based on a batch processing pipeline and the size of the data matrix grows with each additional frame. The proposed algorithm obtains an estimate of shape and motion for each additional frame by adding a dimension reduction step. A subset of frames is selected by analyzing the contribution of each frame to the reconstruction quality. The main advantage of the novel algorithm is the reduction of the computational cost while keeping the robustness of the original method. Experiments with synthetic and real images illustrate the accuracy and performance of the new algorithm.
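
For readers unfamiliar with the batch method the paper builds on, the following sketch shows the core rank-4 factorization step: a scaled measurement matrix of homogeneous image points (projective depths assumed already estimated and simply set to one here) is factored by a truncated SVD into camera and structure factors defined up to a projective transformation. This is a simplified illustration under those assumptions, not the incremental algorithm proposed in the paper.

```python
import numpy as np

def projective_factorization(points):
    """points: (F, N, 2) image coordinates of N features tracked over F frames.

    Builds the 3F x N measurement matrix with unit projective depths and factors
    it into rank-4 motion (3F x 4) and shape (4 x N) matrices via the SVD.
    """
    F, N, _ = points.shape
    homog = np.concatenate([points, np.ones((F, N, 1))], axis=2)  # (F, N, 3)
    W = homog.transpose(0, 2, 1).reshape(3 * F, N)                # stack frames
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    motion = U[:, :4] * s[:4]      # stacked 3x4 projective cameras
    shape = Vt[:4, :]              # homogeneous structure
    return motion, shape

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    pts = rng.random((5, 20, 2))
    M, S = projective_factorization(pts)
    print(M.shape, S.shape)        # (15, 4) (4, 20)
```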

Rafael Lemuz-López, Miguel Arias-Estrada
An Automated Procedure for Word Balloon Placement in Cinema Comics

Comics have been studied for a long time, as they are the oldest genre of non-photorealistic rendering, and transforming a video into comics automatically is a particularly interesting problem. As a step toward this goal, we introduce in this paper a method to automatically place word balloons in an image. Placing time-serial word balloons while keeping to the layout rules of 2D comics text is not trivial. Our algorithm first positions each word balloon relative to its respective actor while maintaining the reading order of the dialogue, by applying a cartographic labeling algorithm. The position of each balloon is then refined on the basis of a proposed measure that estimates the quality of the balloon layout. Experiments with several commercial films show that the final results compare well with hand-drawn comics.

Bong-Kyung Chun, Dong-Sung Ryu, Won-Il Hwang, Hwan-Gue Cho
Segmentation of Three Dimensional Cell Culture Models from a Single Focal Plane

Three dimensional cell culture models offer new opportunities for development of computational techniques for segmentation and localization. These assays have a unique signature of a clump of cells that corresponds to a functioning colony. Often the nuclear compartment is labeled and then imaged with fluorescence microscopy to provide context for protein localization. These colonies are first delineated from background using the level set method. Within each colony, nuclear regions are then bounded by their center of mass through radial voting, and a local neighborhood for each nucleus is established through Voronoi tessellation. Finally, the level set method is applied again within each Voronoi region to delineate the nuclear compartment. The paper concludes with the application of the proposed method to an experimental dataset, demonstrating a stable solution when iterative radial voting and level set methods are used synergistically.
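
As a small illustration of one step in this pipeline, the sketch below assigns every foreground pixel of a colony mask to its nearest detected nucleus centre, i.e. a discrete Voronoi tessellation. The mask and seed coordinates are synthetic, and the level-set and radial-voting stages of the paper are not reproduced.

```python
import numpy as np

def voronoi_labels(mask, seeds):
    """Label every foreground pixel of `mask` with the index of its nearest seed.

    mask  : (H, W) boolean colony mask
    seeds : (K, 2) array of (row, col) nucleus centres
    """
    rows, cols = np.nonzero(mask)
    pix = np.stack([rows, cols], axis=1)                        # (P, 2) pixel coords
    d2 = ((pix[:, None, :] - seeds[None, :, :]) ** 2).sum(-1)   # (P, K) squared dists
    labels = np.full(mask.shape, -1, dtype=int)
    labels[rows, cols] = d2.argmin(axis=1)
    return labels

if __name__ == "__main__":
    mask = np.ones((64, 64), dtype=bool)
    seeds = np.array([[16, 16], [16, 48], [48, 32]])
    lab = voronoi_labels(mask, seeds)
    print(np.bincount(lab[lab >= 0]))   # number of pixels in each Voronoi region
```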

Hang Chang, Bahram Parvin
A Semi-automatic 3D Reconstruction Algorithm for Telepresence

The use of three-dimensional computer-based models has increased tremendously in the last decades, due to the needs of numerous emerging applications such as telepresence and virtual reality. In the context of telepresence, this paper implements a semi-automatic 3D model reconstruction algorithm from stereo images. The major contribution of this work lies in three main parts: calibration, stereo correspondence, and 3D object alignment. The approach is semi-automatic since the calibration technique requires some user intervention; however, this adds the ability to change the zoom during the acquisition process.

Michel Sarkis, Klaus Diepold
Motion Detection Using an Improved Colour Model

We discuss common colour models for background subtraction and problems related to their utilisation. A novel approach to represent chrominance information more suitable for robust background modelling and shadow suppression is proposed. Our method relies on the ability to represent colours in terms of a 3D-polar coordinate system having saturation independent of the brightness function; specifically, we build upon an Improved Hue, Luminance, and Saturation space (IHLS). The additional peculiarity of the approach is that we deal with the problem of unstable hue values at low saturation by modelling the hue-saturation relationship using saturation-weighted hue statistics. The effectiveness of the proposed method is shown in an experimental comparison with approaches based on Normalised RGB, c1c2c3, and HSV.
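
To make the idea of saturation-weighted hue statistics concrete, the sketch below computes a saturation-weighted circular mean of hue for a sample of pixels, which down-weights the unreliable hue of nearly achromatic pixels. The HSV-style conversion is used only for brevity and is not the IHLS transform of the paper; all names are illustrative.

```python
import numpy as np
import colorsys

def sat_weighted_mean_hue(rgb):
    """Saturation-weighted circular mean of hue (degrees) for (N, 3) RGB in [0, 1]."""
    hs = np.array([colorsys.rgb_to_hsv(*px)[:2] for px in rgb])  # (N, 2): hue, sat
    ang = hs[:, 0] * 2.0 * np.pi
    w = hs[:, 1]                                   # saturation acts as the weight
    mean_vec = (w * np.exp(1j * ang)).sum()        # weighted resultant vector
    return float(np.degrees(np.angle(mean_vec)) % 360.0)

if __name__ == "__main__":
    pixels = np.array([
        [1.0, 0.1, 0.1],   # saturated red: dominates the statistic
        [0.5, 0.5, 0.52],  # nearly grey: hue unreliable, tiny weight
    ])
    print(round(sat_weighted_mean_hue(pixels), 1))  # close to the red hue (~0/360 deg)
```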

Horst Wildenauer, Philipp Blauensteiner, Allan Hanbury, Martin Kampel
Combining Pixelization and Dimensional Stacking

The combination of pixelization and dimensional stacking yields a highly informative visualization that uniquely facilitates feature discovery and exploratory analysis of multidimensional, multivariate data. Pixelization is the mapping of each data point in some set to a pixel in an image. Dimensional stacking is a layout method where N dimensions are projected into 2. We have combined both methods to support visual data mining of a vast neuroscience database. Images produced from this approach have now appeared in the Journal of Neurophysiology [1] and are being used for educational purposes in neuroscience classes at Emory University. In this paper we present our combination of dimensional stacking and pixelization, our extensions to these methods, and how our techniques have been used in neuroscience investigations.
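
A brief sketch of the dimensional-stacking layout may help: the nested (outer-to-inner) dimension indices on each image axis are folded like digits of a mixed-radix number into a single pixel coordinate, so every N-dimensional grid cell maps to one pixel. The dimension names and sizes below are hypothetical, and the neuroscience database itself is not involved.

```python
def stacked_pixel(indices_x, sizes_x, indices_y, sizes_y):
    """Map nested dimension indices (outermost first) to a pixel (x, y)."""
    def fold(indices, sizes):
        pos = 0
        for idx, size in zip(indices, sizes):
            pos = pos * size + idx      # stack dimensions like mixed-radix digits
        return pos
    return fold(indices_x, sizes_x), fold(indices_y, sizes_y)

if __name__ == "__main__":
    # Four hypothetical model parameters, two stacked on each image axis:
    # outer x: parameter_a (5 levels), inner x: parameter_b (10 levels)
    # outer y: parameter_c (5 levels), inner y: parameter_d (10 levels)
    print(stacked_pixel([2, 7], [5, 10], [1, 3], [5, 10]))   # -> (27, 13)
```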

John T. Langton, Astrid A. Prinz, Timothy J. Hickey
Detection and Characterization of Abnormal Vascular Patterns in Automated Cervical Image Analysis

In colposcopy, mosaic and punctation are two major abnormal vessel patterns associated with cervical intraepithelial neoplasia (CIN). Detection and characterization of mosaic and punctation in digital cervical images is a crucial step towards developing a computer-aided diagnosis (CAD) system for cervical cancer screening and diagnosis. This paper presents automated techniques for the detection and characterization of mosaic and punctation vessels in cervical images. The techniques are based on iterative morphological operations with structuring elements of various sizes, in combination with adaptive thresholding. Information about color, region, and shape properties is used to refine the detection results. The techniques have been applied to clinical data with promising results.

Wenjing Li, Allen Poirson
Towards a Modular Network-Distributed Mixed-Reality Learning Space System

We propose a Modular Mixed-Reality Learning Space System (MRLSS) that relies on middleware tools and distributed mixed-reality technologies to support multi-modal communications between local, remote, and virtual audience sets. Each audience set occupies a spatial module represented throughout the system as a cell, which is visually displayed on specifically aligned projection surfaces in each module. A module can host multiple cells and can be categorized based on scalability and technical advantage. For example, an Individual Cell (ICell) could contain a participant with only a web cam and audio. A Classroom Cell (CCell) could be a single classroom. A Virtual Cell (VCell) is a graphically rendered space with unique possibilities for interaction, experience, and exploration. A Studio Cell (SCell) is a specialized facility with advanced systems, services, and scalable spatial capabilities. A University Cell (UCell) can host multiple instances of an MRLSS, i.e., simultaneously host and combine more than one MRLSS.

Timothy J. Rogers, Bedřich Beneš, Gary R. Bertoline
Real-Time Multi-view 3D Object Tracking in Cluttered Scenes

This paper presents an approach to real-time 3D object tracking in cluttered scenes using multiple synchronized and calibrated cameras. The goal is to accurately track targets over a long period of time in the presence of complete occlusion in some of the camera views. In the proposed system, a color histogram is used to represent object appearance. Tracked 3D object locations are smoothed and new locations predicted using a Kalman filter. The predicted 3D object location is then projected onto all camera views to provide a search region for robust 2D object tracking and occlusion detection. The experimental results were validated using ground-truth data obtained from a marker-based motion capture system. The results illustrate that the proposed approach is capable of effective and robust 3D tracking of multiple objects in cluttered scenes.
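
The sketch below illustrates only the filtering step: a constant-velocity Kalman filter on the 3D position, whose prediction is projected into a camera through a given 3x4 projection matrix to centre the 2D search region. The matrices, noise levels, and toy camera are illustrative assumptions, not the paper's calibrated multi-camera setup.

```python
import numpy as np

DT = 1.0 / 30.0                                   # assumed frame interval
F = np.eye(6); F[:3, 3:] = DT * np.eye(3)         # constant-velocity state transition
H = np.hstack([np.eye(3), np.zeros((3, 3))])      # only the 3D position is observed
Q = 1e-3 * np.eye(6)                              # process noise (assumed)
R = 1e-2 * np.eye(3)                              # measurement noise (assumed)

def kalman_step(x, P, z):
    """One predict/update cycle; x: (6,) state [pos, vel], z: (3,) measured position."""
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)
    x_new = x_pred + K @ (z - H @ x_pred)
    P_new = (np.eye(6) - K @ H) @ P_pred
    return x_new, P_new, x_pred[:3]               # also return the predicted position

def project(P_cam, X):
    """Project a 3D point with a 3x4 camera matrix to pixel coordinates."""
    u = P_cam @ np.append(X, 1.0)
    return u[:2] / u[2]

if __name__ == "__main__":
    x, P = np.zeros(6), np.eye(6)
    P_cam = np.hstack([np.eye(3), np.array([[0.0], [0.0], [5.0]])])  # toy camera
    for t in range(5):
        z = np.array([0.1 * t, 0.0, 1.0])          # fake 3D measurements
        x, P, pred = kalman_step(x, P, z)
    print(np.round(project(P_cam, pred), 3))       # centre of the 2D search region
```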

Huan Jin, Gang Qian, Stjepan Rajko
Visualizing Natural Stereo Images in Short Distance: A New Approach

This work addresses the real-time visualization of natural stereo images captured through webcams. After analyzing the conventional stereo model geometry and its constraints related to the viewer’s movement, it became necessary to eliminate those geometric constraints when applying the model to visualizing stereoscopic webcam images in the context of remote interaction. Then, based on the geometry of natural human binocular vision, we propose a stereo processing model for short-distance visualization.

Antonia Lucinelma Pessoa Albuquerque, Sergio Pinheiro, Rubens Melo
Next Best View Algorithms for Interior and Exterior Model Acquisition

Rapid determination of optimal laser range finder placement is becoming increasingly important in both computer vision and robotics. Next-best-view algorithms that make intelligent decisions about where to place the laser range finder are important for scanning both objects and landscapes, to ensure that the scene is fully captured and can be registered accurately. Presented here are methods for determining sensor placement that maximizes the amount of information learned about a scene or object. Using occupancy grids of voxels and ray tracing, the placement of the sensor can be optimized for maximum collection of new data. This work provides fast algorithms that determine optimal sensor placement both outside an object of interest and inside a closed environment. These methods take into consideration the limitations of the sensor, the interaction of the sensor with the environment, and its placement in only known areas to restrict the search space.
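
As a toy illustration of the scoring idea (not the authors' algorithm), the sketch below casts rays from a candidate sensor position through an occupancy grid and counts how many unknown voxels would be revealed before an occupied voxel blocks each ray. The grid encoding, ray sampling, and range limit are simplifying assumptions.

```python
import numpy as np

FREE, OCCUPIED, UNKNOWN = 0, 1, 2

def view_score(grid, origin, directions, step=0.5, max_range=40.0):
    """Count unknown voxels visible from `origin` along the sampled `directions`."""
    seen = set()
    for d in directions:
        d = d / np.linalg.norm(d)
        t = 0.0
        while t < max_range:
            p = origin + t * d
            idx = tuple(np.floor(p).astype(int))
            if any(i < 0 or i >= s for i, s in zip(idx, grid.shape)):
                break                      # ray left the volume
            if grid[idx] == OCCUPIED:
                break                      # ray is blocked
            if grid[idx] == UNKNOWN:
                seen.add(idx)              # this voxel would be newly observed
            t += step
    return len(seen)

if __name__ == "__main__":
    grid = np.full((20, 20, 20), UNKNOWN, dtype=np.uint8)
    grid[:, :, :5] = FREE                          # already-mapped free space
    grid[10, 10, 8] = OCCUPIED                     # an obstacle blocking one ray
    dirs = np.array([[0, 0, 1], [0.2, 0, 1], [0, 0.2, 1]])
    print(view_score(grid, np.array([10.0, 10.0, 0.0]), dirs))
```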

Bradley D. Null, Eric D. Sinzinger
An Experiential Approach to Interacting with Biological Information

Technological advancements in the life sciences have enabled biologists to generate high volumes of heterogeneous, multimedia data. The challenge today lies in correlating and integrating the information in ways that promote a holistic understanding of the underlying biological phenomena. This paper presents our research in designing an experiential information interaction environment for query-exploration of complex biological information. This entails development of a unified presentation-query-exploration environment that incorporates and relates multimodal data repositories and views. Specifically, our interface captures textual, sequence, and structural views of biological entities and presents semantic correlations between them using Gene Ontology annotations. Additionally, the system extracts and displays the spatial-temporal characteristics of the data to facilitate querying and discernment of relationships. The different views of the data are interactive, in order to facilitate information exploration and assimilation. Experiments and examples demonstrate the system’s features, efficacy, and ability to facilitate both concept discovery and information querying.

Naureen Moon, Bibek Dev Bhattarai, Rahul Singh
Convex Shapes and Convergence Speed of Discrete Tangent Estimators

Discrete geometric estimators aim at estimating geometric characteristics of a shape with only its digitization as input data. Such an estimator is multigrid convergent when its estimates tend toward the geometric characteristics of the shape as the digitization step $h$ tends toward 0. This paper studies the multigrid convergence of tangent estimators based on maximal digital straight segment recognition. We show that such estimators are multigrid convergent for some family of convex shapes and that their speed of convergence is on average $\mathcal{O}(h^{\frac{2}{3}})$. Experiments confirm this result and suggest that the bound is tight.

Jacques-Olivier Lachaud, François de Vieilleville
Towards Obtaining an Ideal Real Time Panoramic Video

Panoramic image construction is a well-known image processing technique that allows a large field of view to be captured in a single image or video. Panoramic video construction from videos captured by multiple cameras poses interesting challenges to existing panoramic image construction algorithms. Computation time and output video quality are the major issues in panoramic videos: the computation time must meet the basic real-time constraints of video streams, while the quality should be good enough for the intended use of the stream. In this work, we quantify the various stages of panoramic image construction. We propose various intra-frame and inter-frame optimizations on top of a basic panoramic image construction algorithm, along with a methodology to achieve real-time panoramic videos.

Harsh Dhand, Lakshmi Pavan Daggubati
Object Recognition Using Local Descriptors: A Comparison

Local image descriptors have been widely researched and used, due to their resistance to clutter and partial occlusion, as well as their partial insensitivity to object pose. Recently Mikolajczyk and Schmid [1] compared a number of such descriptors and concluded that the SIFT-based ones perform best in image matching tasks. This paper compares the effect that three local descriptors have on object recognition: SIFT [2], PCA-SIFT [3] and keyed context patches [4]. We use a data set containing images of six objects on clean and cluttered backgrounds, taken around the whole viewing sphere. We conclude that keyed context patches perform best overall, but they are outperformed for some objects by the second best feature, PCA-SIFT.
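
To make the recognition protocol concrete, the sketch below performs standard nearest-neighbour descriptor matching with a distance-ratio test, which is a common way such descriptors are compared; the descriptors here are random stand-ins, and none of the three evaluated feature types is actually computed.

```python
import numpy as np

def ratio_matches(desc_query, desc_model, ratio=0.8):
    """Return (query_idx, model_idx) pairs passing the nearest-neighbour ratio test."""
    matches = []
    for i, q in enumerate(desc_query):
        d = np.linalg.norm(desc_model - q, axis=1)
        nn = np.argsort(d)[:2]                       # two closest model descriptors
        if d[nn[0]] < ratio * d[nn[1]]:              # accept only unambiguous matches
            matches.append((i, int(nn[0])))
    return matches

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    model = rng.random((50, 128))                    # e.g. 128-D SIFT-like vectors
    query = model[:10] + 0.01 * rng.standard_normal((10, 128))  # perturbed copies
    print(len(ratio_matches(query, model)))           # most of the 10 should match
```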

A. Salgian
Improving Spatiotemporal Inpainting with Layer Appearance Models

The problem of removing blemishes in mosaics of building facades caused by foreground objects such as trees may be framed in terms of inpainting. Affected regions are first automatically segmented and then inpainted away using a combination of cues from unoccluded, temporally adjacent views of the same building patch, as well as surrounding unoccluded patches in the same frame. Discriminating the building layer from those containing foreground features is most directly accomplished through parallax due to camera motion over the sequence. However, the intricacy of tree silhouettes often complicates accurate motion-based segmentation, especially along their narrower branches. In this work we describe methods for automatically training appearance-based classifiers from a coarse motion-based segmentation to recognize foreground patches in static imagery and thereby improve the quality of the final mosaic. A local technique for photometric adjustment of inpainted patches which compensates for exposure variations between frames is also discussed.

Thommen Korah, Christopher Rasmussen
Edge Detection Using a Complex-Valued Directional Vector Representation

Edge detection is a fundamental step in computer vision and image analysis. A new method using a modified Laplacian and position mask is proposed for edge detection in both gray-level and color images. A directional field is computed by the proposed masks, and edges are detected by additional simple mask operations. Edge detection results obtained with the proposed masks are compared with those of the Sobel, Laplacian of Gaussian (LoG), and Canny operators.

Sung Bae Kim, Rae-Hong Park
Analysis and Design of Graphical Password Techniques

In recent years, graphical passwords have become a viable alternative to traditional text-based passwords due to their superior ease of recall and potentially large password space. However, there is very limited research on classifying and mathematically analyzing graphical password techniques. In this paper, we conduct a comprehensive study of the existing graphical password techniques. We propose the first taxonomy for graphical password methods and discuss the important elements in designing them. We present a mathematical analysis of the graphical password space. Finally, we present a new graphical password scheme.
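
A small worked example of the kind of password-space calculation discussed: for a hypothetical click-based scheme with G distinguishable regions per image and k ordered clicks, the space is G^k, which the sketch compares against an 8-character lowercase text password. The numbers are illustrative, not taken from the paper.

```python
import math

def click_password_space(regions, clicks):
    """Ordered clicks on one of `regions` distinguishable areas per click."""
    return regions ** clicks

if __name__ == "__main__":
    graphical = click_password_space(regions=100, clicks=5)   # 10^10 combinations
    text = 26 ** 8                                            # 8 lowercase letters
    print(f"graphical: 2^{math.log2(graphical):.1f}  text: 2^{math.log2(text):.1f}")
```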

Xiaoyuan Suo, Ying Zhu, G. Scott Owen
Viewing Scenes Occluded by Smoke

In this paper, we focus on the problem of reconstructing images of scenes occluded by thick smoke. We propose a simple and effective algorithm that creates a single clear image of the scene given only a video sequence as input. Our method is based on two key observations. First, an increase in smoke density induces a decrease in both image contrast and color saturation; measuring the decay of the high-frequency content in each video frame provides an effective way of quantifying the amount of contrast reduction. Second, the dynamic nature of the smoke causes the scene to be partially visible at times. By dividing the video sequence into subregions, our method is able to select the subregion-frame containing the least amount of smoke occlusion over time. Current experiments on different data sets show very promising results.
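
The sketch below mirrors the selection idea in simplified form: each frame is divided into tiles, each tile's high-frequency content is measured with a Laplacian-based sharpness score, and for every tile the frame with the highest score (least smoke attenuation) is kept for the composite. The Laplacian measure and fixed tiling are stand-ins for the paper's exact contrast and saturation cues.

```python
import numpy as np

def laplacian_energy(tile):
    """Simple high-frequency measure: variance of a 4-neighbour Laplacian."""
    lap = (-4.0 * tile[1:-1, 1:-1] + tile[:-2, 1:-1] + tile[2:, 1:-1]
           + tile[1:-1, :-2] + tile[1:-1, 2:])
    return float(lap.var())

def least_smoke_composite(frames, tile=32):
    """frames: (T, H, W) grayscale video; returns a composite of the sharpest tiles."""
    T, H, W = frames.shape
    out = np.zeros((H, W), dtype=frames.dtype)
    for y in range(0, H, tile):
        for x in range(0, W, tile):
            scores = [laplacian_energy(frames[t, y:y + tile, x:x + tile].astype(float))
                      for t in range(T)]
            best = int(np.argmax(scores))
            out[y:y + tile, x:x + tile] = frames[best, y:y + tile, x:x + tile]
    return out

if __name__ == "__main__":
    rng = np.random.default_rng(3)
    clean = rng.integers(0, 256, (128, 128)).astype(np.uint8)
    smoky = (0.3 * clean + 150).astype(np.uint8)        # low-contrast, washed-out frame
    comp = least_smoke_composite(np.stack([clean, smoky]))
    print(np.array_equal(comp, clean))                  # True: the clean frame wins every tile
```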

Arturo Donate, Eraldo Ribeiro
A Non-photorealistic Rendering of Seurat’s Pointillism

In recent years, there has been a trend toward simulating impressionism with computers. Among the various styles of impressionism, we are particularly interested in simulating pointillism, especially the style of Georges-Pierre Seurat, who is deemed the founder of pointillism. The reason his style attracts us is twofold. First, the painting process of pointillism is extremely laborious, so simulating his painting style by computer is desirable. Second, though several existing impressionism algorithms may approximate pointillism with point-like strokes, some delicate features frequently observed in Seurat’s paintings are still not satisfactorily reflected by those general schemes. To simulate Seurat’s painting style, we made careful observations of all accessible Seurat paintings and extracted from them some important features, such as the few primitive colors, point sizes, and the effects of complementary colors and halos. These features have been successfully simulated, and the results are compared not only with Seurat’s existing paintings, but also with previous attempted simulations.

Hui-Lin Yang, Chuan-Kai Yang
Semantically Relevant Image Retrieval by Combining Image and Linguistic Analysis

In this paper, we introduce a novel approach to image-based information retrieval by combining image analysis with linguistic analysis of associated annotation information. While numerous Content Based Image Retrieval (CBIR) systems exist, most of them are constrained to use images as the only source of information. In contrast, recent research, especially in the area of web-search has also used techniques that rely purely on textual information associated with an image. The proposed research adopts a conceptually different philosophy. It utilizes the information at both the image and annotation level, if it detects a strong semantic coherence between them. Otherwise, depending on the quality of information available, either of the media is selected to execute the search. Semantic similarity is defined through the use of linguistic relationships in WordNet as well as through shape, texture, and color. Our investigations lead to results that are of significance in designing multimedia information retrieval systems. These include technical details on designing cross-media retrieval strategies as well as the conclusion that combining information modalities during retrieval not only leads to more semantically relevant performance but can also help capture highly complex issues such as the emergent semantics associated with images.

Tony Lam, Rahul Singh
A New Algorithm for Solid Texture Synthesis

Despite the tremendous rendering power offered by modern GPUs, real-time and photo-realistic rendering is still often out of reach of traditional polygon-based rendering. Thanks to the invention of texture mapping, a scene with a moderate number of triangles can be readily and vividly rendered by today’s popular and inexpensive graphics cards. However, as a desired texture often comes in a very limited size, the technique of texture synthesis, i.e., synthesizing a larger texture from an originally smaller texture, has become a popular research topic in recent years. Numerous techniques have been proposed to successfully synthesize 2D textures in terms of quality and performance. 3D or solid texture synthesis, on the other hand, remains relatively unexplored due to its higher complexity. There are several types of existing algorithms for solid texture synthesis, and among them, the outstanding work by Jagnow et al. [1] opens a new door for solid texture synthesis of discrete particles; however, their work did not address two important issues, thus leaving room for further improvement. First, without the help of stereology, users need to explicitly provide the 3D shapes of target particles for synthesis, and this is especially problematic when often only a 2D texture image is available for solid texture synthesis. Second, the locations and orientations of the 3D particles are resolved by a simulated annealing method, which is intrinsically non-deterministic, and thus optimality is not always guaranteed. To solve the shape problem, we propose a simple algorithm that applies the idea of visual hulls to approximate the shapes of 3D particles when only a 2D image is given; to solve the location and orientation problem, we design a deterministic algorithm that can place these desired 3D particles in space more properly. Most importantly, neither algorithm requires user intervention. We have successfully implemented the proposed algorithm, and experimental results are presented for comparison with previous results and as a proof of concept.

Jia-Wei Chiou, Chuan-Kai Yang
What Can We Learn from Biological Vision Studies for Human Motion Segmentation?

We review recent biological vision studies that are related to human motion segmentation. Our goal is to develop a practically plausible computational framework that is guided by recent cognitive and psychological studies on the human visual system for the segmentation of human body in a video sequence. Specifically, we discuss the roles and interactions of bottom-up and top-down processes in visual perception processing as well as how to combine them synergistically in one computational model to guide human motion segmentation. We also examine recent research on biological movement perception, such as neural mechanisms and functionalities for biological movement recognition and two major psychological tracking theories. We attempt to develop a comprehensive computational model that involves both bottom-up and top-down processing and is deeply inspired by biological motion perception. According to this model, object segmentation, motion estimation, and action recognition are results of recurrent feedforward (bottom-up) and feedback (top-down) processes. Some open technical questions are also raised and discussed for future research.

Cheng Chen, Guoliang Fan
3D Geometry from Uncalibrated Images

We present an automatic pipeline for recovering the geometry of a 3D scene from a set of unordered, uncalibrated images. The contributions in the paper are the presentation of the system as a whole, from images to geometry, the estimation of the local scale for various scene components in the orientation-topology module, the procedure for orienting the cloud components, and the method for dealing with points of contact. The methods are aimed to process complex scenes and non-uniformly sampled, noisy data sets.

George Kamberov, Gerda Kamberova, O. Chum, Š. Obdržálek, D. Martinec, J. Kostková, T. Pajdla, J. Matas, R. Šára
Hierarchical Image Database Navigation on a Hue Sphere

Due to the ever growing size of image collections, efficient and effective ways of browsing and navigating through them, as well as locating and searching for desired images, are in high demand. Most current approaches project image thumbnails onto a 2-dimensional visualisation space so that images that are visually similar are placed close to each other on the display. Unfortunately, this approach is only of limited use for larger image sets, as images occlude each other while certain parts of the projection space remain unused. In this paper we provide an image database navigation approach that does not suffer from these disadvantages. Images are projected onto a spherical visualisation space which users can intuitively rotate to focus on images with a different main hue or brightness. Thumbnails are placed on a regular grid on the sphere’s surface to avoid any overlapping effects. Large image datasets are handled through a clustering technique which allows browsing in a hierarchical manner.
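
A minimal sketch of the placement idea: an image's median hue is mapped to longitude and its median brightness to latitude on a sphere, giving the 3D thumbnail position before any grid snapping. The HSV conversion and the use of the median are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np
import colorsys

def sphere_position(rgb, radius=1.0):
    """Place an (H, W, 3) RGB image (values in [0, 1]) on a hue/brightness sphere."""
    hsv = np.array([colorsys.rgb_to_hsv(*px) for px in rgb.reshape(-1, 3)])
    hue = np.median(hsv[:, 0])            # in [0, 1): maps to longitude
    val = np.median(hsv[:, 2])            # in [0, 1]: maps to latitude
    lon = 2.0 * np.pi * hue
    lat = np.pi * (val - 0.5)             # -pi/2 (dark pole) .. +pi/2 (bright pole)
    return radius * np.array([np.cos(lat) * np.cos(lon),
                              np.cos(lat) * np.sin(lon),
                              np.sin(lat)])

if __name__ == "__main__":
    reddish = np.full((8, 8, 3), [0.8, 0.2, 0.2])
    print(np.round(sphere_position(reddish), 3))   # near the red meridian, upper hemisphere
```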

Gerald Schaefer, Simon Ruszala
Image-Based Informatics for Preclinical Biomedical Research

In 2006, the New England Journal of Medicine selected medical imaging as one of the eleven most important innovations of the past 1,000 years, primarily due to its ability to allow physicians and researchers to visualize the very nature of disease. As a result of the broad-based adoption of micro imaging technologies, preclinical researchers today are generating terabytes of image data from both anatomic and functional imaging modes. In this paper we describe our early research to apply content-based image retrieval to index and manage large image libraries generated in the study of amyloid disease in mice. Amyloidosis is associated with diseases such as Alzheimer’s, type 2 diabetes, chronic inflammation and myeloma. In particular, we will focus on results to date in the area of small animal organ segmentation and description for CT, SPECT, and PET modes and present a small set of preliminary retrieval results for a specific disease state in kidney CT cross-sections.

Kenneth W. Tobin, Deniz Aykac, V. Priya Govindasamy, Shaun S. Gleason, Jens Gregor, Thomas P. Karnowski, Jeffery R. Price, Jonathan Wall
Segmentation-Based Registration of Organs in Intraoperative Video Sequences

Intraoperative optical imaging of exposed organs in visible, near-infrared, and infrared (IR) wavelengths in the body has the potential to be useful for real-time assessment of organ viability and image guidance during surgical intervention. However, the motion of the internal organs presents significant challenges for fast analysis of recorded 2D video sequences. The movement observed during surgery, due to respiration, cardiac motion, blood flow, and mechanical shift accompanying the surgical intervention, causes organ reflection in the image sequence, making optical measurements for further analysis challenging. Correcting alignment is difficult in that the motion is not uniform over the image. This paper describes a Canny edge-based method for segmentation of the specific organ or region under study, along with a moment-based registration method for the segmented region. Experimental results are provided for a set of intraoperative IR image sequences.

James Goddard, Timothy Gee, Hengliang Wang, Alexander M. Gorbach
On Asymmetric Classifier Training for Detector Cascades

This paper examines the Asymmetric AdaBoost algorithm introduced by Viola and Jones for cascaded face detection. The Viola and Jones face detector uses cascaded classifiers to successively filter, or reject, non-faces. In this approach most non-faces are easily rejected by the earlier classifiers in the cascade, thus reducing the overall number of computations. This requires the earlier cascade classifiers to very seldom reject true instances of faces. To reflect this training goal, Viola and Jones introduce a weighting parameter for the AdaBoost iterations and show that it enforces a desirable bound. During their implementation, a modification to the proposed weighting was introduced while enforcing the same bound. The goal of this paper is to examine their asymmetric weighting by putting AdaBoost in the form of Additive Regression, as was done by Friedman, Hastie, and Tibshirani. The author believes this helps to explain the approach and adds another connection between AdaBoost and Additive Regression.
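
For readers who want the mechanics, the sketch below runs discrete AdaBoost with decision stumps and applies a per-round asymmetric multiplier that re-weights positive (face) examples more heavily, in the spirit of the cost-sensitive weighting the paper analyses. The stump learner, the cost factor k, and the data are illustrative assumptions; this is not a reproduction of Viola and Jones' exact scheme.

```python
import numpy as np

def best_stump(X, y, w):
    """Best decision stump (error, feature, threshold, polarity) under weights w."""
    best = (np.inf, 0, 0.0, 1)
    for f in range(X.shape[1]):
        for t in np.unique(np.round(X[:, f], 1)):        # coarse threshold grid
            for pol in (1, -1):
                pred = np.where(pol * (X[:, f] - t) >= 0, 1, -1)
                err = w[pred != y].sum()
                if err < best[0]:
                    best = (err, f, t, pol)
    return best

def asymmetric_adaboost(X, y, rounds=10, k=2.0):
    """y in {-1, +1}; k > 1 makes missed positives costlier than false alarms."""
    w = np.ones(len(y)) / len(y)
    ensemble = []
    for _ in range(rounds):
        w = w * np.where(y == 1, k ** (1.0 / rounds), 1.0)   # asymmetry spread over rounds
        w /= w.sum()
        err, f, t, pol = best_stump(X, y, w)
        alpha = 0.5 * np.log((1.0 - err) / max(err, 1e-10))
        pred = np.where(pol * (X[:, f] - t) >= 0, 1, -1)
        w = w * np.exp(-alpha * y * pred)                    # standard AdaBoost update
        w /= w.sum()
        ensemble.append((alpha, f, t, pol))
    return ensemble

def predict(ensemble, X):
    score = sum(a * np.where(p * (X[:, f] - t) >= 0, 1, -1) for a, f, t, p in ensemble)
    return np.where(score >= 0, 1, -1)

if __name__ == "__main__":
    rng = np.random.default_rng(4)
    X = rng.random((200, 3))
    y = np.where(X[:, 0] + 0.1 * rng.standard_normal(200) > 0.5, 1, -1)
    model = asymmetric_adaboost(X, y)
    print(round(float((predict(model, X) == y).mean()), 2))   # training accuracy
```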

Timothy F. Gee
Active Stabilization of Images Acquired on a Walking Robotic Platform

To increase the quality of scientific data collected from autonomous mobile agents such as rovers and walking robotic devices, biological methods can be mimicked for better navigation and balance control of both the agent itself and the manipulation of scientific instruments. Drawing on the design of the neuro-vestibular control system, the EarBot controller is designed to stabilize a multi-axis camera system mounted atop a moving agent. An eight-legged robot called the SCORPION, designed to navigate and explore rough terrain considered inhospitable to wheeled rovers, is used as the testbed to analyze the EarBot’s functionality and behavior. Eventually, the EarBot will be used to control the balance of the robot itself through expanded modelling of the vestibulo-motor control loops used in postural control. This paper presents the theoretical concepts and initial controller implementations for stabilizing the camera during walking motion of the SCORPION.

Xander Twombly, Richard Boyle, Silvano Colombano
Motion Estimation with Edge Continuity Constraint for Crowd Scene Analysis

This paper presents a new motion estimation method aimed at crowd scene analysis in complex video sequences. The proposed technique makes use of image descriptors extracted from points lying at the maximum curvature on the Canny edge map of an analyzed image. Matches between two consecutive frames are then carried out by searching for descriptors that satisfy both a well-defined similarity metric and a structural constraint imposed by the edge map. A preliminary assessment using real-life video sequences gives both qualitative and quantitative results.

B. Zhan, P. Remagnino, S. A. Velastin, N. Monekosso, L. -Q. Xu
Perceptual Grouping Based on Iterative Multi-scale Tensor Voting

We propose a new approach for perceptual grouping of oriented segments in highly cluttered images based on tensor voting. Segments are represented as second-order tensors and communicate with each other through a voting scheme that incorporates the Gestalt principles of visual perception. An iterative scheme has been devised which removes noise segments in a conservative way using multi-scale analysis and re-voting. We have tested our approach on data sets composed of real objects in real backgrounds. Our experimental results indicate that our method can successfully segment objects in images with up to twenty times more noise segments than object segments.

Leandro Loss, George Bebis, Mircea Nicolescu, Alexei Skourikhine
Fast Prediction Mode Decision Algorithm for H.264 Based on Hierarchical Mode Classification Framework

The latest video compression standard, H.264, supports many intra and inter prediction modes. The mode with the optimal rate-distortion performance is determined using a powerful Lagrangian minimization tool, rate-distortion optimization, which achieves the highest possible coding efficiency but results in much higher computational complexity. In this paper, a fast prediction mode decision algorithm for H.264 is proposed. It is based on a hierarchical mode classification framework which assists fast mode decision by pre-selecting the class for each macroblock using its extracted spatial and temporal features. Since tests for the many modes of non-selected classes are skipped, much of the computation of rate-distortion optimization can be saved. Experimental results show that the proposed method can reduce the execution time of mode decision by 85% on average without perceivable loss in coding rate and quality.

Cheng-dong Shen, Si-kun Li
Camera Self-calibration in Scheimpflug Condition for Air Flow Investigation

Camera calibration is a necessary preliminary step to ensure high-quality measurement in a section of a flow. The bundle adjustment technique is based on the redundancy resulting from multiple views of the calibration target. In this paper we propose a well-adapted protocol for self-calibrating a Scheimpflug camera model. Such a protocol allows calibration by hand-positioning a laser-printed target. Experimental results indicate that with only 5 views we can accurately calibrate a Scheimpflug camera under Stereoscopic Particle Image Velocimetry conditions.

Hanene Louhichi, Thierry Fournel, Jean Marc Lavest, Habib Ben Aissia
Backmatter
Metadata
Title
Advances in Visual Computing
Editors
George Bebis
Richard Boyle
Bahram Parvin
Darko Koracin
Paolo Remagnino
Ara Nefian
Gopi Meenakshisundaram
Valerio Pascucci
Jiri Zara
Jose Molineros
Holger Theisel
Tom Malzbender
Copyright Year
2006
Publisher
Springer Berlin Heidelberg
Electronic ISBN
978-3-540-48627-5
Print ISBN
978-3-540-48626-8
DOI
https://doi.org/10.1007/11919629
