
2011 | Book

Computer Vision/Computer Graphics Collaboration Techniques

5th International Conference, MIRAGE 2011, Rocquencourt, France, October 10-11, 2011. Proceedings

Editors: André Gagalowicz, Wilfried Philips

Publisher: Springer Berlin Heidelberg

Book Series: Lecture Notes in Computer Science


About this book

This book constitutes the refereed proceedings of the 5th International Conference on Computer Vision/Computer Graphics Collaboration Techniques, MIRAGE 2011, held in Rocquencourt, France, in October 2011. The 23 full papers presented were carefully reviewed and selected from numerous submissions. The papers cover a wide range of topics with focus on Computer Vision/Computer Graphics Collaboration Techniques involving image analysis/synthesis approaches especially concerning theoretical, computational, experimental or industrial aspects of model-based image analysis and image-based model synthesis.

Table of Contents

Frontmatter
Bundle Adjustment for Stereoscopic 3D
Abstract
The recent resurgence of stereoscopic 3D films has triggered a high demand for post-processing tools for stereoscopic image sequences. Camera motion estimation, also known as structure-from-motion (SfM) or match-moving, is an essential step in the post-processing pipeline. In order to ensure a high accuracy of the estimated camera parameters, a bundle adjustment algorithm should be employed. We present a new stereo camera model for bundle adjustment. It is designed to be applicable to a wide range of cameras employed in today’s movie productions. In addition, we describe how the model can be integrated efficiently into the sparse bundle adjustment framework, enabling the processing of stereoscopic image sequences with traditional efficiency and improved accuracy. Our camera model is validated by synthetic experiments, on rendered sequences, and on a variety of real-world video sequences.
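The core of bundle adjustment is the minimization of reprojection error over camera and structure parameters. The sketch below uses a hypothetical minimal pinhole model (camera looking down +z, no rotation or distortion, a single baseline parameter), not the paper's camera model; it only illustrates the stereo-rig idea that both views share one set of extrinsics, so the pair is adjusted as a unit:

```python
def project(point, cam_x, focal=1.0):
    """Pinhole projection of a 3D point for a camera offset along x.

    Hypothetical minimal model: the camera looks down +z and is
    translated by cam_x on the x-axis (the stereo baseline).
    """
    x, y, z = point
    return (focal * (x - cam_x) / z, focal * y / z)

def stereo_reprojection_error(point, observations, baseline, focal=1.0):
    """Sum of squared reprojection residuals over a left/right pair.

    `observations` is ((u_l, v_l), (u_r, v_r)). The rig is parameterized
    by a single baseline, so both views share one set of extrinsics --
    the key idea behind a stereo camera model in bundle adjustment.
    """
    err = 0.0
    for cam_x, (u, v) in zip((0.0, baseline), observations):
        pu, pv = project(point, cam_x, focal)
        err += (pu - u) ** 2 + (pv - v) ** 2
    return err
```

A full bundle adjuster would minimize this quantity summed over all points and frames, exploiting the sparsity of the Jacobian as the abstract describes.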
Christian Kurz, Thorsten Thormählen, Hans-Peter Seidel
3D Modeling of Haussmannian Facades
Abstract
Urban modeling has attracted much attention since Google and Microsoft launched their 3D geo software. However, little effort has been made to automate the recognition and reconstruction of urban objects, which is needed to achieve photo-realistic results at the level of the latest interactive video games for high-end applications. This paper addresses the automation of image-based Haussmannian facade recognition and reconstruction. The input image is first rectified and segmented to obtain a rectangular, less distorted facade image extracted from the urban scene. Then, based upon various visual features and architectural knowledge, facade elements including windows, doors and balconies are detected, and their positions and sizes measured. Combined with depth information computed from 3D range data, the facade geometries and textures are produced. Finally, an analysis-synthesis approach is used to reconstruct the full 3D facade representation. The processing pipeline developed for this research has been verified and tested on various Parisian facades and achieved the desired recognition and reconstruction results.
Chun Liu, André Gagalowicz
Lung Cancer Detection from Thoracic CT Scans Using 3-D Deformable Models Based on Statistical Anatomical Analysis
Abstract
In the present paper, we propose a novel recognition method for pulmonary nodules (possible lung cancers) in thoracic CT scans. Pulmonary nodules and blood vessels are represented by 3-D deformable spherical and cylindrical models. The validity of these object models is evaluated using probability distributions that reflect the results of the statistical anatomical analysis of blood vessel trees in human lungs. The fidelity of the object models to CT scans is evaluated by five similarity measurements based on the differences in intensity distributions between the CT scans and templates produced from the object models. Through these evaluations, the posterior probabilities of hypotheses that the object models appear in the CT scans are calculated using Bayes' theorem. Nodule recognition is performed by maximum a posteriori estimation. Experimental results obtained by applying the proposed method to actual CT scans are shown.
Hotaka Takizawa, Shigeyuki Ishii
An Evaluation on Estimators for Stochastic and Heuristic Based Cost Functions Using the Epipolar-Constraint
Abstract
Camera calibration is the process of optimizing the camera parameters. This paper describes an evaluation of different stochastic and heuristic estimators for cost-function minimization in camera calibration.
The optimization algorithm is a standard gradient descent on the epipolar constraint. The results show that the estimators perform similarly on a given set of correspondences. Selecting correspondences according to a given distribution over the frame yields better calibration results; in particular, the yaw angle estimation becomes more robust. In this paper the correspondences are distributed uniformly over the frame using binning [1,2].
The data used in this paper show that binning lowers the error in most calibrations. With binning, neither the \(\mathcal{L}^1\)-norm nor the \(\mathcal{L}^2\)-norm exceeds an error of 4 pixels with respect to the ground truth, whereas calibrations without binning show an error impulse at the 970th calibration. Binning avoids this impulse.
Binning influences the calibration result more than the choice of the right m-estimator or the right norm.
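The epipolar constraint underlying these cost functions, and the spatial binning of correspondences, can be sketched as follows. The grid size and the one-match-per-bin rule in `binned_cost` are illustrative assumptions, not the authors' implementation:

```python
def epipolar_residual(F, x1, x2):
    """Algebraic epipolar residual x2^T F x1 for homogeneous points."""
    Fx1 = [sum(F[i][j] * x1[j] for j in range(3)) for i in range(3)]
    return sum(x2[i] * Fx1[i] for i in range(3))

def binned_cost(F, matches, width, height, bins=4, norm=1):
    """Cost over correspondences spread uniformly via spatial binning.

    Keeps at most one match per cell of a bins x bins grid over the
    frame, then sums |residual|^norm (norm=1 or 2) -- a hypothetical
    sketch of the binning idea.
    """
    kept = {}
    for (x1, x2) in matches:
        bx = min(int(x1[0] / width * bins), bins - 1)
        by = min(int(x1[1] / height * bins), bins - 1)
        kept.setdefault((bx, by), (x1, x2))  # first match wins per bin
    return sum(abs(epipolar_residual(F, x1, x2)) ** norm
               for (x1, x2) in kept.values())
```

A gradient step on the camera parameters would then be taken to reduce this cost, with the choice `norm=1` or `norm=2` selecting the \(\mathcal{L}^1\)- or \(\mathcal{L}^2\)-norm variant.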
Anton Feldmann, Lars Krüger, Franz Kummert
Facial Movement Based Recognition
Abstract
The modelling and understanding of the facial dynamics of individuals is crucial to achieving higher levels of realistic facial animation. We address the recognition of individuals by modelling the facial motions of several subjects. Modelling facial motion comes with numerous challenges, including accurate and robust tracking of facial movement, high-dimensional data processing and non-linear spatio-temporal motion structure. We present a novel framework which addresses these problems through the use of video-specific Active Appearance Models (AAM) and Gaussian Process Latent Variable Models (GP-LVM). Our experiments and results qualitatively and quantitatively demonstrate the framework's ability to differentiate individuals by temporally modelling appearance-invariant facial motion. This supports the proposition that a facial activity model may assist in motion retargeting, motion synthesis and experimental psychology.
Alexander Davies, Carl Henrik Ek, Colin Dalton, Neill Campbell
Towards Temporally-Coherent Video Matting
Abstract
Extracting temporally-coherent alpha mattes in video is an important but challenging problem in post-production. Previous video matting systems are highly sensitive to initial conditions and image noise, thus cannot reliably produce stable alpha mattes without temporal jitter. In this paper we propose an improved video matting system which contains two new components: (1) an accurate trimap propagation mechanism for setting up the initial matting conditions in a temporally-coherent way; and (2) a temporal matte filter which can improve the temporal coherence of the mattes while maintaining the matte structures on individual frames. Experimental results show that compared with previous methods, the two new components lead to alpha mattes with better temporal coherence.
Xue Bai, Jue Wang, David Simons
Leaf Segmentation and Tracking Using Probabilistic Parametric Active Contours
Abstract
Active contours or snakes are widely used for segmentation and tracking. These techniques require the minimization of an energy function, which is generally a linear combination of a data fit term and a regularization term. This energy function can be adjusted to exploit the intrinsic object and image features. This can be done by changing the weighting parameters of the data fit and regularization term. There is, however, no rule to set these parameters optimally for a given application. This results in trial and error parameter estimation. In this paper, we propose a new active contour framework defined using probability theory. With this new technique there is no need for ad hoc parameter setting, since it uses probability distributions, which can be learned from a given training dataset.
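The baseline energy that the probabilistic formulation replaces can be sketched as follows; the `data_fit` callback and the weights `alpha`/`beta` (hypothetical names) are exactly the ad hoc parameters the paper's approach learns from training data instead:

```python
def contour_energy(points, data_fit, alpha=0.5, beta=0.5):
    """Classic snake energy: weighted data-fit plus smoothness terms.

    `data_fit(p)` scores how well point p sits on the target edge
    (lower is better). The weights alpha/beta are the trial-and-error
    parameters a probabilistic active contour would avoid.
    """
    n = len(points)
    e_data = sum(data_fit(p) for p in points)
    e_reg = 0.0
    for i in range(n):
        (x0, y0), (x1, y1) = points[i], points[(i + 1) % n]
        e_reg += (x1 - x0) ** 2 + (y1 - y0) ** 2  # membrane (elasticity) term
    return alpha * e_data + beta * e_reg
```

Segmentation then proceeds by deforming the contour points to minimize this energy; the probabilistic variant replaces the weighted sum with learned probability distributions.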
Jonas De Vylder, Daniel Ochoa, Wilfried Philips, Laury Chaerle, Dominique Van Der Straeten
Gallbladder Segmentation from 2–D Ultrasound Images Using Active Contour Models and Gradient Vector Flow
Abstract
Extracting the shape of the gallbladder from an ultrasonography (USG) image is an important step in software supporting medical diagnostics, as it allows superfluous information which is immaterial in the diagnostic process to be eliminated. In this project, several active contour models were used to segment the shape of the gallbladder, both for cases free of lesions and for those showing specific disease units, namely lithiasis, polyps, and anatomical changes such as folds of the gallbladder. The approximate edge of the gallbladder is found by applying one of the active contour models: the membrane and motion equation model or the gradient vector flow model (GVF-snake). Then, the fragment of the image located outside the identified gallbladder contour is eliminated from the image. The tests carried out showed that the average value of the Dice similarity coefficient for the three active contour models applied reached 81.8%.
Marcin Ciecholewski
Multi-view Alpha Matte for Free Viewpoint Rendering
Abstract
We present a multi-view alpha matting method that requires no user input and is able to deal with any arbitrary scene geometry through the use of depth maps. The algorithm uses multiple observations of the same point to construct constraints on the true foreground color and estimate its transparency. A novel free viewpoint rendering pipeline is also presented that takes advantage of the generated alpha maps to improve the quality of synthesized views over state-of-the-art methods. The results show a clear improvement on image quality by implicitly correcting depth map errors, providing more natural boundaries on transparent regions, and removing artifacts.
Daniel Herrera C., Juho Kannala, Janne Heikkilä
Using Spatially Distributed Patterns for Multiple View Camera Calibration
Abstract
This paper presents an approach to intrinsic and extrinsic camera parameter calibration from a series of photographs or from video. For the reliable and accurate estimation of camera parameters it is common to use specially designed calibration patterns. However, using a single pattern, a globally consistent calibration is only possible from positions and viewing directions from where this single pattern is visible. To overcome this problem, the presented approach uses multiple coded patterns that can be distributed over a large area. A connection graph representing visible patterns in multiple views is generated, which is used to estimate globally consistent camera parameters for the complete scene. The approach is evaluated on synthetic and real-world ground truth examples. Furthermore, the approach is applied to calibrate the stereo-cameras of a robotic head on a moving platform.
Martin Grochulla, Thorsten Thormählen, Hans-Peter Seidel
Optimal Gabor Filters and Haralick Features for the Industrial Polarization Imaging
Abstract
During the past decade, computer vision methods for inline inspection have become an important tool in many industrial processes. Over the same period, polarization imaging techniques evolved rapidly with the development of electro-optic components, e.g. the polarization cameras now available on the market. This paper is dedicated to the application of polarization techniques for visually inspecting complex metallic surfaces. As we briefly recall, this consists of a direct image interpretation based on the measurement of the polarization parameters of the light reflected by the inspected object. The proposed image interpretation procedure consists of Gabor pre-filtering and a Haralick feature detector. It is demonstrated that polarization images permit reaching higher classification rates than a direct interpretation of images without polarization information.
Yannick Caulier, Christophe Stolz
Joint Histogram Modelling for Segmentation Multiple Sclerosis Lesions
Abstract
This paper presents a novel methodology based on joint histograms for the automated and unsupervised segmentation of multiple sclerosis (MS) lesions in cranial magnetic resonance (MR) imaging. Our workflow is composed of three steps: locating the MS lesion region in the joint histogram, segmenting MS lesions, and reducing false positives. The advantages of our approach are that it can segment small lesions, does not require prior skull segmentation, and is robust with regard to noisy and inhomogeneous data. Validation on the BrainWeb simulator and real data demonstrates that our method has an accuracy comparable with other MS lesion segmentation methods.
Ziming Zeng, Reyer Zwiggelaar
Surface Reconstruction of Scenes Using a Catadioptric Camera
Abstract
This paper presents a method to reconstruct a surface from images of a scene taken by an equiangular catadioptric camera. Such a camera is convenient for several reasons: it is low cost, and almost all visible parts of the scene are projected in a single image. Firstly, the camera parameters and a sparse cloud of 3d points are simultaneously estimated. Secondly, a triangulated surface is robustly estimated from the cloud. Both steps are automatic. Experiments are provided from hundreds of photographs taken by a pedestrian. In contrast to other methods working in similar experimental conditions, ours provides a manifold surface in spite of the difficult (passive and sparse) data.
Shuda Yu, Maxime Lhuillier
Creating Chinkin Works in the Virtual Space
Abstract
In this paper, a method enabling people to experience "Chinkin" in a virtual space is introduced. Chinkin is a traditional artistic technique for drawing designs on lacquer ware with fine lines and dots; it was widely used to produce Daimyo's utensils in the Edo period, between the 17th and 19th centuries in Japan. The author develops a virtual Chinkin system based on the virtual sculpting method, an interactive CG creation method developed by the author. Chinkin consists of several processes, and each process is realized by virtual carving (an interactive deformation of solids) and virtual painting (an interactive generation of a 3D texture map). In this system, the user can experience each process of the Chinkin technique in the virtual space and can create virtual sculptures ornamented with Chinkin designs as CG. Both the operation and the result are similar to the real ones, which would be useful for the introduction, education and preservation of the traditional Chinkin technique at museums and schools.
Shinji Mizuno
Real-Time Multi-view Human Motion Tracking Using 3D Model and Latency Tolerant Parallel Particle Swarm Optimization
Abstract
This paper demonstrates how latency tolerant parallel particle swarm optimization can be used to achieve real-time full-body motion tracking. The tracking is realized using multi-view images and articulated 3D model with a truncated cones-based representation of the body. Each CPU core computes fitness score for a single camera. On each node the algorithm uses the current temporary best fitness value without waiting for the global best one from cooperating sub-swarms. The algorithm runs at 10 Hz on eight PC nodes connected by 1 GigE.
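A single particle swarm optimization step of the kind being parallelized can be sketched as follows. The dict layout and parameter values are illustrative assumptions; the `best` argument stands in for the latency-tolerant idea that each node passes its own current temporary best instead of waiting for the global one:

```python
import random

def pso_step(particles, fitness, w=0.7, c1=1.5, c2=1.5, best=None):
    """One velocity/position update of particle swarm optimization.

    Each particle is a dict with 'pos', 'vel', 'pbest', 'pbest_fit'.
    `best` is the swarm-best position currently known to this node; in
    the latency-tolerant parallel variant, a node uses its temporary
    best here rather than blocking on the global one.
    """
    if best is None:
        best = min(particles, key=lambda p: p["pbest_fit"])["pbest"]
    for p in particles:
        p["vel"] = [w * v
                    + c1 * random.random() * (pb - x)
                    + c2 * random.random() * (gb - x)
                    for v, x, pb, gb in zip(p["vel"], p["pos"], p["pbest"], best)]
        p["pos"] = [x + v for x, v in zip(p["pos"], p["vel"])]
        f = fitness(p["pos"])
        if f < p["pbest_fit"]:
            p["pbest"], p["pbest_fit"] = list(p["pos"]), f
    return particles
```

In the paper's setting, `fitness` would be the multi-view matching score of the articulated 3D body model, with each CPU core scoring one camera.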
Bogdan Kwolek, Tomasz Krzeszowski, Konrad Wojciechowski
Snap Image Composition
Abstract
Snap Composition broadens the applicability of interactive image composition. Current tools, like Adobe’s Photomerge Group Shot, do an excellent job when the background can be aligned and objects have limited motion. Snap Composition works well even when the input images include different objects and the backgrounds cannot be aligned. The power of Snap Composition comes from the ability to assign for every output pixel a source pixel in any input image, and from any location in that image. An energy value is computed for each such assignment, representing both the user constraints and the quality of composition. Minimization of this energy gives the desired composition.
Composition is performed once a user marks objects in the different images, and optionally drags them into a new location in the target canvas. The background around the dragged objects, as well as the final locations of the objects themselves, will be automatically computed for seamless composition. If the user does not drag the selected objects to a desired place, they will automatically snap into a suitable location. A video describing the results can be seen in www.vision.huji.ac.il/shiftmap/SnapVideo.mp4 .
Yael Pritch, Yair Poleg, Shmuel Peleg
Towards the Automatic Generation of 3D Photo-Realistic Avatars Using 3D Scanned Data
Abstract
The use of 3D avatars scanned from real-person 3D data has become more and more common in fields such as video games and movies. The target applications for these models impose different constraints on the result. In this paper we propose to generate high-resolution closed meshes of a person for virtual try-on applications; the 3D model should therefore be precise enough for customers to recognize themselves wearing a virtual garment. Our approach deforms a generic model using a fast editing method to fit a point cloud obtained from a 3D scanner. Our system is fully automatic and requires only an unoriented point cloud.
Thibault Luginbühl, Laurent Delattre, André Gagalowicz
Content Based Image Retrieval Using Visual-Words Distribution Entropy
Abstract
Bag-of-visual-words (BOVW) is a representation of images built from a large set of local features. To date, the experimental results presented in the literature have shown that this approach achieves high retrieval scores on several benchmark image databases because of its ability to recognize objects and retrieve near-duplicate (to the query) images. In this paper, we propose a novel method that fuses the spatial relationships of the visual words in an image with the conventional Visual Words method. Incorporating the visual distribution entropy leads to a robust, scale-invariant descriptor. The experimental results show that the proposed method performs better than the classic Visual Words approach, while also outperforming several other descriptors from the literature.
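One way to quantify how evenly a visual word is spread over the image, in the spirit of the distribution entropy described above, is sketched below. This is a hypothetical illustration of adding spatial information to a BOVW histogram, not the paper's exact descriptor:

```python
import math

def distribution_entropy(positions, width, height, bins=2):
    """Shannon entropy of one visual word's spatial distribution.

    Counts the word's occurrences in a bins x bins grid over the image
    and returns the entropy of that distribution: a word spread evenly
    across the frame scores high, a tightly clustered one scores low.
    """
    counts = {}
    for (x, y) in positions:
        cell = (min(int(x / width * bins), bins - 1),
                min(int(y / height * bins), bins - 1))
        counts[cell] = counts.get(cell, 0) + 1
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())
```

Such a per-word entropy could then be concatenated with the usual visual-word frequencies to form a spatially aware descriptor.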
Savvas A. Chatzichristofis, Chryssanthi Iakovidou, Yiannis S. Boutalis
Video Summarization Using a Self-Growing and Self-Organized Neural Gas Network
Abstract
In this paper, a novel method to generate video summaries is proposed, intended mainly for on-line videos. The novelty of this approach lies in the fact that the video summarization problem is treated as a single-query image retrieval problem. According to the proposed method, each frame is considered as a separate image and is described by the recently proposed Compact Composite Descriptors (CCDs) and a visual word histogram. In order to classify the frames into clusters, the method utilizes a powerful Self-Growing and Self-Organized Neural Gas (SGONG) network. Its main advantage is that it adjusts the number of created neurons and their topology automatically. Thus, after training, the SGONG network gives us the appropriate number of output classes and their centers. The extraction of a representative key frame from every cluster leads to the generation of the video abstract. A significant characteristic of the proposed method is its ability to calculate the appropriate number of clusters dynamically. Experimental results are presented to indicate the effectiveness of the proposed approach.
Dim P. Papadopoulos, Savvas A. Chatzichristofis, Nikos Papamarkos
Real-Time Upper-Body Human Pose Estimation Using a Depth Camera
Abstract
Automatic detection and pose estimation of humans is an important task in Human-Computer Interaction (HCI), user interaction and event analysis. This paper presents a model-based approach for detecting and estimating human pose by fusing depth and RGB color data from a monocular view. The proposed system uses Haar-cascade-based detection and template matching to track the most reliably detectable parts, namely the head and torso. A stick figure model is used to represent the detected body parts. The fitting is then performed independently for each limb, using the weighted distance transform map. Fitting each limb independently speeds up the fitting process and makes it robust, avoiding the combinatorial complexity problems common to these types of methods. The output is a stick figure model consistent with the pose of the person in the given input image. The algorithm works in real time, is fully automatic, and can detect multiple non-intersecting people.
Himanshu Prakash Jain, Anbumani Subramanian, Sukhendu Das, Anurag Mittal
Image Matting with Transductive Inference
Abstract
Various matting methods have been proposed to isolate objects from images by extracting alpha mattes. Although they typically work well for images with smooth regions, their ability to deal with complex or textured patterns is limited due to their inductive inference nature. In this paper we present a Transductive Matting algorithm which explicitly treats the matting task as a statistical transductive inference. Unlike previous approaches, we assume the user marked pixels do not fully capture the statistical distributions of foreground and background colors in the unknown region of the given trimap, thus new foreground and background colors are allowed to be recognized in the transductive labeling process. Quantitative comparisons show that our method achieves better results than previous methods on textured images.
Jue Wang
A New Buckling Model for Cloth Simulation
Abstract
Textiles are normally incompressible: when we try to compress them, they immediately buckle. Unfortunately, many cloth simulation solvers disregard this fact. In this paper, we present an efficient method to model buckling using a distance constraint. This constraint is formulated as a linear complementarity problem (LCP) and can be easily integrated within a collision handling process.
Tung Le Thanh, André Gagalowicz
A Public System for Image Based 3D Model Generation
Abstract
This paper presents a service that creates complete and realistic 3D models from a set of photographs taken with a consumer camera. In contrast to other systems, which produce sparse point clouds or individual depth maps, our system automatically generates textured and dense models that require little or no post-processing. Our reconstruction pipeline features automatic camera parameter retrieval from the web and intelligent view selection. This ARC3D system is available as a public, free-to-use web service (http://www.arc3d.be). Results are made available both as a full-resolution model and as a low-resolution version for web browser viewing using WebGL.
David Tingdahl, Luc Van Gool
A Novel Approach to Image Assessment by Seeking Unification of Subjective and Objective Criteria Based on Supervised Learning
Abstract
Image quality assessment is a challenging research topic in imaging engineering and applications, especially when the reference image cannot be accessed, as with aerial images. To address this issue, a novel learning-based evaluation approach was developed. In practice, objective quality criteria alone usually cannot achieve the desired result. Based on the analysis of multiple objective quality assessment criteria, a boosting algorithm with supervised learning, LassBoost (Learn to Assess with Boosting), was employed to unify the multiple objective criteria with subjective criteria. This new approach can effectively fuse multiple objective quality criteria guided by the subjective quality level, such that the subjective/objective criteria can be unified using a weighted regression method. The experimental results illustrate that the proposed method achieves significantly better performance for image quality assessment and can thus provide powerful decision support in imaging engineering and practical applications.
Pipei Huang, Shiyin Qin, Donghuan Lu
Backmatter
Metadata
Title
Computer Vision/Computer Graphics Collaboration Techniques
Editors
André Gagalowicz
Wilfried Philips
Copyright Year
2011
Publisher
Springer Berlin Heidelberg
Electronic ISBN
978-3-642-24136-9
Print ISBN
978-3-642-24135-2
DOI
https://doi.org/10.1007/978-3-642-24136-9