
2012 | Book

Computer Vision – ECCV 2012. Workshops and Demonstrations

Florence, Italy, October 7-13, 2012, Proceedings, Part II

Edited by: Andrea Fusiello, Vittorio Murino, Rita Cucchiara

Publisher: Springer Berlin Heidelberg

Book series: Lecture Notes in Computer Science


About this book

The three-volume set LNCS 7583, 7584 and 7585 comprises the Workshops and Demonstrations which took place in connection with the European Conference on Computer Vision, ECCV 2012, held in Firenze, Italy, in October 2012. The total of 179 workshop papers and 23 demonstration papers was carefully reviewed and selected for inclusion in the proceedings. They were held at workshops with the following themes: non-rigid shape analysis and deformable image alignment; visual analysis and geo-localization of large-scale imagery; Web-scale vision and social media; video event categorization, tagging and retrieval; re-identification; biological and computer vision interfaces; where computer vision meets art; consumer depth cameras for computer vision; unsolved problems in optical flow and stereo estimation; what's in a face?; color and photometry in computer vision; computer vision in vehicle technology: from earth to mars; parts and attributes; analysis and retrieval of tracked events and motion in imagery streams; action recognition and pose estimation in still images; higher-order models and global constraints in computer vision; information fusion in computer vision for concept recognition; 2.5D sensing technologies in motion: the quest for 3D; benchmarking facial image analysis technologies.

Table of Contents

Frontmatter

Second Workshop on Consumer Depth Cameras for Computer Vision (CDC4CV)

High Accuracy TOF and Stereo Sensor Fusion at Interactive Rates

We propose two new GPU-based sensor fusion approaches for time of flight (TOF) and stereo depth data. Data fidelity measures are defined to deal with the fundamental limitations of both techniques alone. Our algorithms combine TOF and stereo, yielding megapixel depth maps, enabling our approach to be used in a movie production scenario. Our local model works at interactive rates but yields noisier results, whereas our variational technique is more robust at a higher computational cost. The results show an improvement over each individual method with TOF interreflection remaining an open challenge. To encourage quantitative evaluations, a ground truth dataset is made publicly available.

Rahul Nair, Frank Lenzen, Stephan Meister, Henrik Schäfer, Christoph Garbe, Daniel Kondermann
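
The local fusion idea, weighting each sensor's depth by a per-pixel fidelity measure, can be illustrated with a minimal sketch (the function name, the simple confidence-weighted average and the fallback rule are illustrative assumptions, not the authors' GPU implementation):

```python
import numpy as np

def fuse_depth(depth_tof, depth_stereo, conf_tof, conf_stereo, eps=1e-6):
    """Confidence-weighted fusion of two depth maps.

    depth_tof, depth_stereo : (H, W) depth maps in metres
    conf_tof, conf_stereo   : (H, W) per-pixel fidelity measures in [0, 1]
    Pixels where both confidences are ~0 fall back to the TOF value.
    """
    w_sum = conf_tof + conf_stereo
    fused = np.where(
        w_sum > eps,
        (conf_tof * depth_tof + conf_stereo * depth_stereo) / np.maximum(w_sum, eps),
        depth_tof)
    return fused
```
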
A Modular Framework for 2D/3D and Multi-modal Segmentation with Joint Super-Resolution

A versatile multi-image segmentation framework for 2D/3D or multi-modal segmentation is introduced in this paper, with possible applications in a wide range of machine vision problems. The framework performs joint segmentation and super-resolution to account for images of unequal resolutions obtained from different imaging sensors. This makes it possible to combine the high-resolution details of one modality with the distinctiveness of another. A set of measures is introduced to weight measurements according to their expected reliability and is utilized in the segmentation as well as the super-resolution. The approach is demonstrated with different experimental setups, and the effects of additional modalities as well as of the framework's parameters are shown.

Benjamin Langmann, Klaus Hartmann, Otmar Loffeld
Real-Time Plane Segmentation and Obstacle Detection of 3D Point Clouds for Indoor Scenes

Scene analysis is an important issue in computer vision, and extracting structural information is one of its fundamental techniques. Taking advantage of depth cameras, we propose a novel fast plane segmentation algorithm and use it to detect obstacles in indoor environments. The proposed algorithm has two steps: an initial segmentation and a refined segmentation. First, the depth image is converted into a 3D point cloud and divided into voxels, which are less sensitive to noise than individual pixels. An area-growing algorithm then extracts the candidate planes according to the normal of each voxel. Second, each point that has not been assigned to any plane is examined to determine whether it actually belongs to one. The two-step strategy has been proven to be a fast segmentation method with high accuracy. The experimental results demonstrate that our method can segment planes and detect obstacles in real time with high accuracy for indoor scenes.

Zhe Wang, Hong Liu, Yueliang Qian, Tao Xu
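
A rough sketch of the two ingredients named in the abstract, voxelization with per-voxel normals and area growing over neighbouring voxels, might look as follows (the voxel size, angle threshold and 26-neighbourhood are assumed parameters; the refinement step that re-examines unassigned points is omitted):

```python
import numpy as np
from collections import deque

def voxel_normals(points, voxel_size=0.05):
    """Group a point cloud into voxels and estimate a normal per voxel via PCA."""
    keys = np.floor(points / voxel_size).astype(int)
    voxels = {}
    for key, p in zip(map(tuple, keys), points):
        voxels.setdefault(key, []).append(p)
    normals = {}
    for key, pts in voxels.items():
        pts = np.asarray(pts)
        if len(pts) < 3:
            continue
        cov = np.cov((pts - pts.mean(0)).T)
        w, v = np.linalg.eigh(cov)
        normals[key] = v[:, 0]          # eigenvector of the smallest eigenvalue
    return normals

def grow_planes(normals, angle_thresh_deg=10.0):
    """Area growing over the 26-neighbourhood of voxels with similar normals."""
    cos_t = np.cos(np.radians(angle_thresh_deg))
    labels, next_label = {}, 0
    offsets = [(i, j, k) for i in (-1, 0, 1) for j in (-1, 0, 1)
               for k in (-1, 0, 1) if (i, j, k) != (0, 0, 0)]
    for seed in normals:
        if seed in labels:
            continue
        labels[seed] = next_label
        queue = deque([seed])
        while queue:
            cur = queue.popleft()
            for off in offsets:
                nb = (cur[0] + off[0], cur[1] + off[1], cur[2] + off[2])
                if nb in normals and nb not in labels and \
                        abs(np.dot(normals[cur], normals[nb])) > cos_t:
                    labels[nb] = next_label
                    queue.append(nb)
        next_label += 1
    return labels
```
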
Combining Textural and Geometrical Descriptors for Scene Recognition

Local description of images is a common technique in much computer vision research. Due to recent improvements in RGB-D cameras, local description of 3D data has also become practical, and the number of studies that make use of this extra information is increasing. However, their applicability is limited by the need for generic combination methods. In this paper, we propose combining textural and geometrical descriptors for scene recognition of RGB-D data. The methods, together with the normalization stages proposed in this paper, can be applied to combine any descriptors obtained from the 2D and 3D domains. This study presents and evaluates different ways of combining multi-modal descriptors within the BoW approach in the context of indoor scene localization. The query's rough location is determined from pre-recorded images and depth maps in an unsupervised image matching manner.

Neslihan Bayramoğlu, Janne Heikkilä, Matti Pietikäinen
Human-Centric Indoor Environment Modeling from Depth Videos

We propose an approach to model indoor environments from depth videos (the camera is stationary when recording the videos), which includes extracting the 3-D spatial layout of the rooms and modeling objects as 3-D cuboids. Different from previous work which purely relies on image appearance, we argue that indoor environment modeling should be human-centric: not only because humans are an important part of indoor environments, but also because the interaction between humans and environments can convey much useful information about the environments. In this paper, we develop an approach to extract physical constraints from human poses and motion to better recover the spatial layout and model the objects inside. We observe that the cues provided by human-environment interaction are very powerful: even without a lot of training data, our method can still achieve promising performance. Our approach is built on depth videos, which makes it more user friendly.

Jiwen Lu, Gang Wang
Human Daily Action Analysis with Multi-view and Color-Depth Data

Improving human action recognition in videos is restricted by the inherent limitations of the visual data. In this paper, we take depth information into consideration and construct a novel dataset of human daily actions. The proposed ACT4² dataset provides synchronized data from 4 views and 2 sources, aiming to facilitate research on action analysis across multiple views and multiple sources. We also propose a new descriptor of depth information for action representation, which depicts the structural relations of spatiotemporal points within the action volume using the distance information in depth data. In experimental validation, our descriptor obtains superior performance to state-of-the-art action descriptors designed for color information, and is more robust to viewpoint variations. The fusion of features from different sources is also discussed, and a simple but efficient method is presented to provide a baseline performance on the proposed dataset.

Zhongwei Cheng, Lei Qin, Yituo Ye, Qingming Huang, Qi Tian
Viewpoint Invariant Matching via Developable Surfaces

Stereo systems, time-of-flight cameras, laser range sensors and consumer depth cameras nowadays produce a wealth of image data with depth information (RGBD), yet the number of approaches that can take advantage of color and geometry data at the same time is quite limited. We address the topic of wide baseline matching between two RGBD images, i.e. finding correspondences from largely different viewpoints for recognition, model fusion or loop detection. Here we normalize local image features with respect to the underlying geometry and show a significantly increased number of correspondences. Rather than moving a virtual camera to some position in front of a dominant scene plane, we propose to unroll developable scene surfaces and detect features directly in the “wall paper” of the scene. This allows viewpoint invariant matching also in scenes with curved architectural elements or with objects like bottles, cans or (partial) cones and others. We prove the usefulness of our approach using several real world scenes with different objects.

Bernhard Zeisl, Kevin Köser, Marc Pollefeys
A Unified Energy Minimization Framework for Model Fitting in Depth

In this paper we present a unified energy minimization framework for model fitting and pose recovery problems with depth cameras. 3D level-set embedding functions are used to represent object models implicitly, and a novel 3D chamfer matching based energy function is minimized by adjusting the generic projection matrix, which can be parameterized differently according to specific applications. Our proposed energy function takes advantage of the gradient of the 3D level-set embedding function and can be efficiently solved by gradient-based optimization methods. We show various real-world applications, including real-time 3D tracking in depth, simultaneous calibration and tracking, and 3D point cloud modeling. We perform experiments on both real and synthetic data to show the superior performance of our method for all the applications above.

Carl Yuheng Ren, Ian Reid
Object Recognition Robust to Imperfect Depth Data

In this paper, we present an adaptive data fusion model that robustly integrates depth and image only perception. Combining dense depth measurements with images can greatly enhance the performance of many computer vision algorithms, yet degraded depth measurements (e.g., missing data) can also cause dramatic performance losses to levels below image-only algorithms. We propose a generic fusion model based on maximum likelihood estimates of fused image-depth functions for both available and missing depth data. We demonstrate its application to each step of a state-of-the-art image-only object instance recognition pipeline. The resulting approach shows increased recognition performance over alternative data fusion approaches.

David F. Fouhey, Alvaro Collet, Martial Hebert, Siddhartha Srinivasa
3D Object Detection with Multiple Kinects

Categorizing and localizing multiple objects in 3D space is a challenging but essential task for many robotics and assisted living applications. While RGB cameras as well as depth information have been widely explored in computer vision, there is surprisingly little recent work combining multiple cameras and depth information. Given the recent emergence of consumer depth cameras such as the Kinect, we explore how multiple cameras and active depth sensors can be used to tackle the challenge of 3D object detection. More specifically, we generate point clouds from the depth information of multiple registered cameras and use the VFH descriptor [20] to describe them. For color images we employ the DPM [3] and combine both approaches with a simple voting scheme across multiple cameras.

On the large RGB-D dataset [12] we show improved performance for object classification on multi-camera point clouds and object detection on color images, respectively. To evaluate the benefit of joining color and depth information of multiple cameras, we recorded a novel dataset with four Kinects showing significant improvements over a DPM baseline for 9 object classes aggregated in challenging scenes. In contrast to related datasets our dataset provides color and depth information recorded with multiple Kinects and requires localizing and categorizing multiple objects in 3D space. In order to foster research in this field, the dataset, including annotations, is available on our web page.

Wandi Susanto, Marcus Rohrbach, Bernt Schiele

Unsolved Problems in Optical Flow and Stereo Estimation

Combining Monocular Geometric Cues with Traditional Stereo Cues for Consumer Camera Stereo

This paper presents an algorithm for considering both stereo cues and structural priors to obtain a geometrically representative depth map from a narrow baseline stereo pair. We use stereo pairs captured with a consumer stereo camera and observe that traditional depth estimation using stereo matching techniques encounters difficulties related to the narrow baseline relative to the depth of the scene. However, monocular geometric cues based on attributes such as lines and the horizon provide additional hints about the global structure that stereo matching misses. We merge both monocular and stereo matching features in a piecewise planar reconstruction framework that is initialized with a discrete inference step, and refined with a continuous optimization to encourage the intersections of hypothesized planes to coincide with observed image lines. We show through our results on stereo pairs of manmade structures captured outside of the lab that our algorithm exploits the advantages of both approaches to infer a better depth map of the scene.

Adarsh Kowdle, Andrew Gallagher, Tsuhan Chen
Quality Assessment of Non-dense Image Correspondences

Non-dense image correspondence estimation algorithms are known for their speed, robustness and accuracy. However, current evaluation methods assess correspondences point-wise and consider only correspondences that are actually estimated. They cannot capture the fact that some algorithms might leave important scene correspondences undetected; such correspondences might be vital for succeeding applications. Additionally, the reference correspondences for real-world scenes are often also sparse, so outliers that do not hit a reference measurement can remain undetected with the current, point-wise evaluation methods. To assess the quality of correspondence fields we propose a histogram-based evaluation metric that does not rely on point-wise comparison and is therefore robust to sparsity in the estimate as well as in the reference.

Anita Sellent, Jochen Wingbermühle
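
The idea of comparing distributions of correspondences rather than individual matches could be sketched roughly as below; the choice of a direction/magnitude histogram and of histogram intersection as the score is an illustrative assumption, not the metric proposed in the paper:

```python
import numpy as np

def flow_histogram(flow, valid, bins=16, max_mag=20.0):
    """2-D histogram over flow direction and magnitude, using valid pixels only."""
    u, v = flow[..., 0][valid], flow[..., 1][valid]
    ang = np.arctan2(v, u)
    mag = np.clip(np.hypot(u, v), 0, max_mag)
    hist, _, _ = np.histogram2d(ang, mag, bins=bins,
                                range=[[-np.pi, np.pi], [0, max_mag]])
    return hist / max(hist.sum(), 1)

def histogram_score(est_flow, est_valid, ref_flow, ref_valid):
    """Compare the distributions of estimated and reference correspondences
    (histogram intersection) instead of matching them point-wise."""
    h_est = flow_histogram(est_flow, est_valid)
    h_ref = flow_histogram(ref_flow, ref_valid)
    return np.minimum(h_est, h_ref).sum()      # 1.0 = identical distributions
```
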
A Complete Confidence Framework for Optical Flow

Assessing the performance of optical flow in the absence of ground truth is of prime importance for its correct interpretation and application. Thus, in recent years, the interest in developing confidence measures has increased. However, owing to its complexity, assessing the capability of such measures to detect areas of poor optical flow performance is still an unsolved problem.

We define a confidence measure in the context of numerical stability of the optical flow scheme and also a protocol for assessing its capability to discard areas of non-reliable flows. Results on the Middlebury database validate our framework and show that, unlike existing measures, our measure is not biased towards any particular image feature.

Patricia Márquez-Valle, Debora Gil, Aura Hernàndez-Sabaté
An Improved Stereo Matching Algorithm with Ground Plane and Temporal Smoothness Constraints

In this study, novel techniques are presented that address the challenges of stereo matching for surveillance and vehicle control. For this purpose, one of the most efficient local stereo matching techniques, the permeability filter, is modified in terms of road plane geometry and temporal consistency in order to take the major challenges of such a scenario into account. Relaxing the smoothness assumption of the permeability filter along the vertical axis enables extraction of the road geometry with high accuracy, even in cases where the ground plane does not contain sufficient textural information. On the other hand, temporal smoothness is enforced by transferring reliable depth assignments against illumination changes, reflections and instant occlusions. According to extensive experiments on a recent challenging stereo video dataset, the proposed modifications provide reliable disparity maps under severe challenges and low texture distribution, improving scene analysis for surveillance-related applications. Although the improvements are illustrated for a specific local stereo matching algorithm, the presented modifications can be applied to other similar stereo algorithms as well.

Cevahir Çığla, A. Aydın Alatan
On the Evaluation of Scene Flow Estimation

This paper surveys the state of the art in evaluating the performance of scene flow estimation and points out the difficulties in generating benchmarks with ground truth which have not allowed the development of general, reliable solutions. Hopefully, the renewed interest in dynamic 3D content, which has led to increased research in this area, will also lead to more rigorous evaluation and more effective algorithms. We begin by classifying methods that estimate depth, motion or both from multi-view sequences according to their parameterization of shape and motion. Then, we present several criteria for their evaluation, discuss their strengths and weaknesses and conclude with recommendations.

Philippos Mordohai
Analysis of KITTI Data for Stereo Analysis with Stereo Confidence Measures

The recently published KITTI stereo dataset provides a new quality of stereo imagery with partial ground truth for benchmarking stereo matchers. Our aim is to test the value of stereo confidence measures (e.g. a left-right consistency check of disparity maps, or an analysis of the slope of a local interpolation of the cost function at the taken minimum) when applied to recorded datasets such as the one published with KITTI. We choose popular measures available in the stereo-analysis literature, and discuss a naive combination of these. Evaluations are carried out using a sparsification strategy. While the best single confidence measure proved to be the left-right consistency check for high disparity map densities, the best overall performance is achieved with the proposed naive measure combination. We argue that there is still demand for more challenging datasets and more comprehensive ground truth.

Ralf Haeusler, Reinhard Klette
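
Two of the building blocks mentioned in the abstract, the left-right consistency check and the sparsification strategy, can be sketched as follows (the threshold tau, the disparity sign convention and the step count are assumptions):

```python
import numpy as np

def lr_consistency_confidence(disp_left, disp_right, tau=1.0):
    """Left-right consistency: a pixel is confident if the disparity of its
    match in the right image agrees with its own disparity within tau pixels."""
    h, w = disp_left.shape
    xs = np.arange(w)[None, :].repeat(h, axis=0)
    ys = np.arange(h)[:, None].repeat(w, axis=1)
    x_right = np.clip(np.round(xs - disp_left).astype(int), 0, w - 1)
    diff = np.abs(disp_left - disp_right[ys, x_right])
    return (diff <= tau).astype(float)          # 1 = consistent, 0 = inconsistent

def sparsification_curve(errors, confidence, steps=20):
    """Progressively remove the least confident pixels and track the mean error
    of the remaining ones (the sparsification strategy used for evaluation)."""
    order = np.argsort(-confidence.ravel())      # most confident first
    err_sorted = errors.ravel()[order]
    n = err_sorted.size
    curve = []
    for k in range(steps + 1):
        keep = max(n - int(k / steps * n), 1)    # pixels kept after removing k/steps
        curve.append(err_sorted[:keep].mean())
    return curve
```
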
Lessons and Insights from Creating a Synthetic Optical Flow Benchmark

With the MPI-Sintel Flow dataset, we introduce a naturalistic dataset for optical flow evaluation derived from the open source CGI movie Sintel. In contrast to the well-known Middlebury dataset, the MPI-Sintel Flow dataset contains longer and more varied sequences with image degradations such as motion blur, defocus blur, and atmospheric effects. Animators use a variety of techniques that produce pleasing images but make the raw animation data inappropriate for computer vision applications if used “out of the box”. Several changes to the rendering software and animation files were necessary in order to produce data for flow evaluation and similar changes are likely for future efforts to construct a scientific dataset from an animated film. Here we distill our experience with Sintel into a set of best practices for using computer animation to generate scientific data for vision research.

Jonas Wulff, Daniel J. Butler, Garrett B. Stanley, Michael J. Black

What’s in a Face?

Modeling and Detection of Wrinkles in Aging Human Faces Using Marked Point Processes

In this paper we propose a new generative model for wrinkles on aging human faces using Marked Point Processes (MPP). Wrinkles are considered as stochastic spatial arrangements of sequences of line segments, and detected in an image by proper localization of line segments. The intensity gradients are used to detect more probable locations and a prior probability model is used to constrain properties of line segments. Wrinkles are localized by sampling MPP using the Reversible Jump Markov Chain Monte Carlo (RJMCMC) algorithm. We also present an evaluation setup to measure the performance of the proposed model. We present results on a variety of images obtained from the Internet to illustrate the performance of the proposed model.

Nazre Batool, Rama Chellappa
How Does Aging Affect Facial Components?

There is growing interest in achieving age invariant face recognition due to its wide applications in law enforcement. The challenge lies in that face aging is quite a complicated process, which involves both intrinsic and extrinsic factors. Face aging also influences individual facial components (such as the mouth, eyes, and nose) differently. We propose a component based method for age invariant face recognition. Facial components are automatically localized based on landmarks detected using an Active Shape Model. Multi-scale local binary pattern and scale-invariant feature transform features are then extracted from each component, followed by random subspace linear discriminant analysis for classification. With a component based representation, we study how aging influences individual facial components on two large aging databases (MORPH Album2 and PCSO). Per component performance analysis shows that the nose is the most stable component during face aging. Age invariant recognition exploiting demographics shows that face aging has more influence on females than males. Overall, recognition performance on the two databases shows that the proposed component based approach is more robust to large time lapses than FaceVACS, a leading commercial face matcher.

Charles Otto, Hu Han, Anil Jain
Spatio-Temporal Multifeature for Facial Analysis

Human faces are complex 3D objects exhibiting geometrical and appearance variations, both local and global, when observed over time. In daily communication, human faces are seen in action, conveying a rich set of information during interaction. Cognitive science explains that human brains are capable of extracting this information very efficiently, resulting in better interaction with others. Our goal is to extract a single feature set which represents multiple facial characteristics. This problem is addressed by the analysis of different feature components for facial classification using a 3D surface model. We propose a unified framework which is capable of extracting multiple kinds of information from human faces while remaining robust against rigid and non-rigid facial deformations. A single feature vector corresponding to a given image is representative of the person's identity, facial expression, gender and age. This feature set, extracted from image sequences, is called the spatio-temporal multifeature (STMF). An STMF is composed of three different feature components and is tested thoroughly to demonstrate its validity. Experimental results on four different databases show that this feature set provides high accuracy while exhibiting robustness. The results are discussed in comparison with different approaches.

Zahid Riaz, Michael Beetz
The Role of Facial Regions in Evaluating Social Dimensions

Facial trait judgments are an important information cue for people. Recent works in the Psychology field have established the basis of face evaluation, defining a set of traits that we evaluate from faces (e.g. dominance, trustworthiness, aggressiveness, attractiveness, threat or intelligence, among others). We rapidly infer information from others' faces: usually within a short period of time (< 1000 ms) we perceive a certain degree of dominance or trustworthiness of another person from the face. Although these perceptions are not necessarily accurate, they influence many important social outcomes (such as the results of elections or court decisions). This topic has also attracted the attention of Computer Vision scientists, and recently a computational model to automatically predict trait evaluations from faces has been proposed. These systems try to mimic human perception by applying machine learning classifiers to a set of labeled data. In this paper we perform an experimental study of the specific facial features that trigger the social inferences. Using previous results from the literature, we propose to use simple similarity maps to evaluate which regions of the face most influence the trait inferences. The correlation analysis is performed using only appearance, and the results from the experiments suggest that each trait is correlated with specific facial characteristics.

David Masip Rodo, Alexander Todorov, Jordi Vitrià Marca
Illumination Normalization Using Self-lighting Ratios for 3D2D Face Recognition

3D2D face recognition is beginning to gain attention from the research community. It takes advantage of 3D facial geometry to normalize the head pose and registers it into a canonical 2D space. In this paper, we present a novel illumination normalization approach for 3D2D face recognition which does not require any training or prior knowledge on the type, number, and direction of the lighting sources. Estimated using an image-specific filtering technique in the frequency domain, a self-lighting ratio is employed to suppress illumination differences. Experimental results on the UHDB11 and FRGC databases indicate that the proposed approach improves the performance significantly for face images with large illumination variations.

Xi Zhao, Shishir K. Shah, Ioannis A. Kakadiaris
Robust Learning from Normals for 3D Face Recognition

We introduce novel subspace-based methods for learning from the azimuth angle of surface normals for 3D face recognition. We show that the normal azimuth angles combined with Principal Component Analysis (PCA) using a cosine-based distance measure can be used for robust face recognition from facial surfaces. The proposed algorithms are well-suited for all types of 3D facial data including data produced by range cameras (depth images), photometric stereo (PS) and shape-from-X (SfX) algorithms. We demonstrate the robustness of the proposed algorithms both in 3D face reconstruction from synthetically occluded samples and in face recognition using the FRGC v2 3D face database and the recently collected Photoface database, where the proposed method achieves state-of-the-art results. An important aspect of our method is that it can achieve good face recognition/verification performance by using raw 3D scans without any heavy preprocessing (i.e., model fitting, surface smoothing etc.).

Ioannis Marras, Stefanos Zafeiriou, Georgios Tzimiropoulos
Coupled Marginal Fisher Analysis for Low-Resolution Face Recognition

Many scenarios require that face recognition be performed at conditions that are not optimal. Traditional face recognition algorithms are not best suited for matching images captured at a low-resolution to a set of high-resolution gallery images. To perform matching between images of different resolutions, this work proposes a method of learning two sets of projections, one for high-resolution images and one for low-resolution images, based on local relationships in the data. Subsequent matching is done in a common subspace. Experiments show that our algorithm yields higher recognition rates than other similar methods.

Stephen Siena, Vishnu Naresh Boddeti, B. V. K. Vijaya Kumar
Exploring Bag of Words Architectures in the Facial Expression Domain

Automatic facial expression recognition (AFER) has undergone substantial advancement over the past two decades. This work explores the application to AFER of bag of words (BoW), a highly mature approach for object and scene recognition. We proceed by first highlighting the reasons that make the BoW task for AFER differ from object and scene recognition, and propose suitable extensions to the BoW architecture for the AFER task. These extensions address some of the limitations of current state-of-the-art appearance-based approaches to AFER. Our BoW architecture is based on the spatial pyramid framework, augmented by multiscale dense SIFT features, and a recently proposed approach for object classification: locality-constrained linear coding with max-pooling. Combining these, we achieve a powerful facial representation that works well even with linear classifiers. We show that a well designed BoW architecture can provide a performance benefit for AFER, and the elements of the proposed architecture are empirically evaluated. The proposed BoW approach surpasses previous state-of-the-art results by achieving an average recognition rate of 96% on two public AFER datasets.

Karan Sikka, Tingfan Wu, Josh Susskind, Marian Bartlett
Kernel Conditional Ordinal Random Fields for Temporal Segmentation of Facial Action Units

We consider the problem of automated recognition of temporal segments (neutral, onset, apex and offset) of Facial Action Units. To this end, we propose the Laplacian-regularized Kernel Conditional Ordinal Random Field model. In contrast to standard modeling approaches to recognition of AUs' temporal segments, which treat each segment as an independent class, the proposed model takes into account ordinal relations between the segments. The experimental results evidence the effectiveness of such an approach.

Ognjen Rudovic, Vladimir Pavlovic, Maja Pantic
Exploring the Facial Expression Perception-Production Link Using Real-Time Automated Facial Expression Recognition

Motor production may play an important role in learning to recognize facial expressions. The present study explores the influence of facial production training on the perception of facial expressions by employing a novel production training intervention built on feedback from automated facial expression recognition. We hypothesized that production training using the automated feedback system would improve an individual’s ability to identify dynamic emotional faces. Thirty-four participants were administered a dynamic expression recognition task before and after either interacting with a production training video game called the Emotion Mirror or playing a control video game. Consistent with the prediction that perceptual benefits are tied to expression production, individuals with high engagement in production training improved more than individuals with low engagement or individuals who did not receive production training. These results suggest that the visual-motor associations involved in expression production training are related to perceptual abilities. Additionally, this study demonstrates a novel application of computer vision for real-time facial expression intervention training.

David M. Deriso, Josh Susskind, Jim Tanaka, Piotr Winkielman, John Herrington, Robert Schultz, Marian Bartlett
Understanding Critical Factors in Appearance-Based Gender Categorization

Gender categorization, based on the analysis of facial appearance, can be useful in a large set of applications. In this paper we investigate the gender classification problem from a non-conventional perspective. In particular, the analysis will aim to determine the factors critically affecting the accuracy of available technologies, better explaining differences between face-based identification and gender categorization.

A novel challenging protocol is proposed, exploiting the dimensions of the Face Recognition Grand Challenge version 2.0 database (FRGC2.0). This protocol is evaluated against several classification algorithms and different kinds of features, such as Gabor and LBP. The results obtained show that gender classification can be made independent of other appearance-based factors such as skin color, facial expression, and illumination condition.

Enrico Grosso, Andrea Lagorio, Luca Pulina, Massimo Tistarelli
Facial Landmarking: Comparing Automatic Landmarking Methods with Applications in Soft Biometrics

Registration is a critical step in computer-based image analysis. In this work we examine the effects of registration in face-based soft biometrics. This form of soft biometrics, better termed facial analytics, takes an image containing a face and returns attributes of that face; here, the attributes of focus are gender and race. Automatic generation of facial analytics relies on accurate registration. Hence, this work evaluates three techniques for dense registration, namely AAM, Stacked ASM and CLM. Further, we evaluate the influence of facial landmark mis-localization resulting from these techniques on gender classification and race determination. To the best of our knowledge, such an evaluation of landmark mis-localization on soft biometrics has not been conducted. We further demonstrate an effective system for gender and race classification based on dense landmarking and multi-factored principal component analysis. The system performs well against a multi-age face dataset for both gender and race classification.

Amrutha Sethuram, Karl Ricanek, Jason Saragih, Chris Boehnen
Gender Recognition Using Cognitive Modeling

In this work, we use cognitive modeling to estimate the "gender strength" of frontal faces, a continuous class variable superseding the traditional binary class labeling. To incorporate this continuous variable we suggest a novel linear gender classification algorithm, the Gender Strength Regression. In addition, we use the gender strength to construct a smaller but refined training set by identifying and removing ill-defined training examples. We use this refined training set to improve the performance of known classification algorithms. Human performance on known data sets is also reported, and surprisingly the task seems to be quite hard for humans. Finally, our results are reproduced on a data set of over 40,000 public Danish LinkedIn profile pictures.

Jens Fagertun, Tobias Andersen, Rasmus Reinhold Paulsen
Periocular Recognition Using Retinotopic Sampling and Gabor Decomposition

We present a new system for biometric recognition using periocular images based on retinotopic sampling grids and Gabor analysis of the local power spectrum. A number of aspects are studied, including: 1) grid adaptation to the dimensions of the target eye vs. grids of constant size, 2) comparison between circular- and rectangular-shaped grids, 3) use of Gabor magnitude vs. phase vectors for recognition, 4) rotation compensation between query and test images, and 5) comparison with an iris machine expert. Results show that our system achieves competitive verification rates compared with other periocular recognition approaches. We also show that top verification rates can be obtained without rotation compensation, thus allowing this step to be removed for computational efficiency. Moreover, performance is not substantially affected if we use a grid of fixed dimensions, and is even better in certain situations, avoiding the need for accurate detection of the iris region.

Fernando Alonso-Fernandez, Josef Bigun
Exploiting Perception for Face Analysis: Image Abstraction for Head Pose Estimation

We present an algorithm to estimate the pose of a human head from a single, low resolution image in real time. It builds on the fundamentals of human perception, i.e. abstracting the relevant details from visual cues. Most images contain far more cues than are required for estimating human head pose. Thus, we use non-photorealistic rendering to eliminate irrelevant details such as expressions from the picture and accentuate the facial features critical to estimating head pose. The maximum likelihood pose range is then estimated by training a classifier on scaled-down abstracted images. The results are extremely encouraging, especially when compared with other recent methods. Moreover, the algorithm is robust to illumination, expression, identity and resolution.

Anant Vidur Puri, Brejesh Lall
Complex Bingham Distribution for Facial Feature Detection

We present a novel method for facial feature point detection in images captured in severely uncontrolled environments, based on a combination of regularized boosted classifiers and a mixture of complex Bingham distributions. The complex Bingham distribution is a rotation-invariant shape representation that can handle pose, in-plane rotation and occlusion better than existing models. Additionally, we regularize a boosted classifier with a variance normalization factor to reduce false positives. Using the two proposed models, we formulate our facial feature detection approach in a Bayesian framework of maximum a-posteriori estimation. This approach allows for the inclusion of the uncertainty of the regularized boosted classifier and the complex Bingham distribution. The proposed detector is tested on different datasets; results show performance comparable to the state of the art on the BioID database and superior performance on uncontrolled datasets.

Eslam Mostafa, Aly Farag

4th Color and Photometry in Computer Vision Workshop 2012

Estimating Surface Normals from Spherical Stokes Reflectance Fields

In this paper we introduce a novel technique for estimating surface normals from the four Stokes polarization parameters of specularly reflected light under a single spherical incident lighting condition that is either unpolarized or circularly polarized. We illustrate the practicality of our technique by estimating surface normals under uncontrolled outdoor illumination from just four observations from a fixed viewpoint.

Giuseppe Claudio Guarnera, Pieter Peers, Paul Debevec, Abhijeet Ghosh
Base Materials for Photometric Stereo

Image-based capture of material appearance has been extensively studied, but the quality of the results and generality of the applied methods leave a lot of room for improvement. Most existing methods rely on parametric models of reflectance and require complex hardware systems or accurate geometric models that are not always available or practical. Rather than independently estimating reflectance properties for each surface point, it is common to express the reflectance as a combination of base materials inherent to each particular object or scene.

We propose a method for efficient and automatic extraction of base materials in a photometric stereo system. After jointly estimating per-pixel reflectances and refined surface normals using these materials, we can render photo-realistic images of complex objects under novel lighting conditions in real time.

David Tingdahl, Christoph Godau, Luc Van Gool
Robust Luminance and Chromaticity for Matte Regression in Polynomial Texture Mapping

Polynomial Texture Mapping (PTM) is a technique employed in a variety of settings, from museums to in-the-field image capture to multi-illuminant microscopy. It consists of illuminating the surface in question with lights from a collection of light directions, each light in turn. To date, the most accurate interpolation employed in PTM consists of two stages: a matte regression stage followed by a further specularity/shadow interpolation. For the first stage, recovering an underlying matte model so as to acquire surface albedo, normals and chromaticity, PTM employs polynomial regression at each pixel, mapping light-direction to luminance. A more accurate model excludes outlier values deriving from specularities and shadows by employing a robust regression from 6-D polynomials to 1-D luminance. Robust methods are guaranteed to automatically find the best representation of the underlying matte content. Here, we retain the idea of using robust methods but instead investigate using a much simpler robust 1-D mode-finder, acting on luminance and on chromaticity components. We then go on to increase accuracy by carrying out 3-D to 1-D regression: this strikes a balance between the best method and the fastest method, with greatly diminished complexity and another large speedup. We show that little accuracy is lost using this much simpler method, and demonstrate the effectiveness of the new method on several image datasets.

Mingjing Zhang, Mark S. Drew
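
For reference, the classical (non-robust) PTM matte regression stage that the paper replaces with a robust mode-finder fits a six-term polynomial in the light direction per pixel; a minimal sketch, assuming the luminance images are stacked along the first axis:

```python
import numpy as np

def ptm_coefficients(luminance_stack, light_dirs):
    """Classical PTM matte regression by least squares.

    luminance_stack : (N, H, W) luminances, one image per light direction
    light_dirs      : (N, 3) unit light vectors; only (lu, lv) are used
    Returns the (H, W, 6) biquadratic coefficients per pixel.
    """
    lu, lv = light_dirs[:, 0], light_dirs[:, 1]
    # 6-D polynomial basis evaluated at each light direction
    A = np.stack([lu**2, lv**2, lu*lv, lu, lv, np.ones_like(lu)], axis=1)  # (N, 6)
    n, h, w = luminance_stack.shape
    Y = luminance_stack.reshape(n, -1)                                     # (N, H*W)
    coeffs, _, _, _ = np.linalg.lstsq(A, Y, rcond=None)                    # (6, H*W)
    return coeffs.T.reshape(h, w, 6)
```

The robust variant described in the abstract would replace the plain least-squares fit with a per-pixel mode-finder on luminance (and chromaticity), so that specular and shadowed observations do not bias the matte model.
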
Illuminant Estimation from Projections on the Planckian Locus

This paper deals with the automatic estimation of the illuminant from a color photograph. While many methods have been developed over recent years, this problem is still open since no method builds on hypotheses that are universal enough to deal with all possible situations. The proposed approach relies on a physical assumption about the possible set of illuminants and on the selection of grey pixels. Namely, a subset of pixels is automatically selected and then projected onto the Planckian locus. A simple voting procedure then yields a robust estimate of the illuminant. As shown by experiments on two classical databases, the method offers state-of-the-art performance among learning-free methods, at a reasonable computational cost.

Baptiste Mazin, Julie Delon, Yann Gousseau
Lighting Estimation in Indoor Environments from Low-Quality Images

Estimating lighting conditions is a crucial step in many applications. In this paper, we show that combining color images with corresponding depth maps (provided by modern depth sensors) makes it possible to improve the estimation of the positions and colors of multiple lights in a scene. Since such devices usually provide low-quality images, for many steps of our framework we propose alternatives to classical algorithms that fail when the image quality is low. Our approach consists in decomposing an original image into specular shading, diffuse shading and albedo. The two shading images are used to render different versions of the original image by changing the light configuration. Then, using an optimization process, we find the lighting conditions that minimize the difference between the original image and the rendered one.

Natalia Neverova, Damien Muselet, Alain Trémeau
Color Constancy Using Single Colors

This work investigates whether von Kries adaptation can be generalized to deal with single colored patches. We investigate which colored patches can give statistically equivalent performance to a white patch for von Kries adaptation. The investigation is then extended to couples of colors, and an analysis of the characteristics of the colors forming the couples is carried out. We focus here on single colors and couples of colors since common objects and logos are usually composed of a small number of colors.

Simone Bianco
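
Von Kries adaptation itself is a per-channel diagonal scaling; a minimal sketch of adapting an image from a single reference patch (the function and argument names are illustrative, not the paper's protocol):

```python
import numpy as np

def von_kries_from_patch(image, patch_rgb_observed, patch_rgb_canonical):
    """Diagonal (von Kries) adaptation computed from a single reference patch.

    image                : (H, W, 3) RGB image under the unknown illuminant, floats in [0, 1]
    patch_rgb_observed   : (3,) mean RGB of the reference patch in `image`
    patch_rgb_canonical  : (3,) RGB of the same patch under the canonical illuminant
    """
    gains = np.asarray(patch_rgb_canonical) / np.maximum(patch_rgb_observed, 1e-6)
    return np.clip(image * gains, 0.0, 1.0)   # independent per-channel scaling
```

The paper's question is essentially for which patch colors these gains, estimated from a non-white patch, remain statistically as reliable as those from a white patch.
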
An Effective Method for Illumination-Invariant Representation of Color Images

This paper proposes a method for the illumination-invariant representation of natural color images. The invariant representation is derived not from spectral reflectance but using only RGB camera outputs. We suppose that the materials of the target objects are dielectric or metal, and that the surfaces include illumination effects such as highlight, gloss, or specularity. We present the procedure for realizing the invariant representation in three steps: (1) detection of specular highlights, (2) illumination color estimation, and (3) invariant representation of reflectance color. The performance of the proposed method is examined in detail in experiments using real-world objects including metals and dielectrics. The limitations of the method are also discussed. Finally, the proposed representation is applied to the edge detection problem for natural color images.

Takahiko Horiuchi, Abdelhameed Ibrahim, Hideki Kadoi, Shoji Tominaga
Specularity, the Zeta-image, and Information-Theoretic Illuminant Estimation

Identification of illumination is an important problem in imaging. In this paper we present a new and effective physics-based colour constancy algorithm which makes use of a novel log-relative-chromaticity planar constraint. We call the new feature the Zeta-image. We show that this new feature is tied to a novel application of the Kullback-Leibler Divergence, here applied to chromaticity values instead of probabilities. The new method requires no training data or tunable parameters. Moreover it is simple to implement and very fast. Our experimental results across datasets of real images show the proposed method significantly outperforms other unsupervised methods while its estimation accuracy is comparable with more complex, supervised, methods. As well, the new planar constraint can be used as a post-processing stage for any candidate colour constancy method in order to improve its accuracy.

Mark S. Drew, Hamid Reza Vaezi Joze, Graham D. Finlayson
Robust Estimation of Pigment Distributions from Multiband Skin Images and Its Application to Realistic Skin Image Synthesis

This paper describes a robust method for estimating pigment distributions on a skin surface from multiband images. The spatial distributions of pigments such as melanin, oxy-hemoglobin and deoxy-hemoglobin give rise to a color texture. The distributions are estimated by using the Kubelka-Munk theory. The accuracy of estimating the pigment distributions is affected by the fine texture of the sulcus cutis and the broad texture of shading caused by the three-dimensional body shape. In order to separate these textures from the color texture, wavelet-based multi-resolution analysis (MRA) is applied to the multiband images before the pigment estimation, because the textures of sulcus cutis and shading predominantly have high and low spatial frequency components in the multiband skin images, respectively. A realistic skin image is synthesized from modified pigment distributions with additional features such as stains, inflammation and bruises by changing the concentrations of melanin, oxy-hemoglobin and deoxy-hemoglobin, respectively. The experimental results of skin image synthesis show the good feasibility of the proposed method.

Motonori Doi, Masahiro Konishi, Akira Kimachi, Shogo Nishi, Shoji Tominaga
A Fisheye Camera System for Polarisation Detection on UAVs

We present a light-weight polarisation sensor that consists of four synchronised cameras equipped with differently oriented polarisers and fisheye lenses allowing us to image the whole sky hemisphere. Due to its low weight and compact size it is well-suited as a biomimetic sensor on-board a UAV. We describe efficient methods for reconstruction of the full-sky polarisation pattern and estimation of sun position. In contrast to state-of-the-art polarisation systems for UAVs that estimate sun azimuth only, our approach can determine sun elevation as well, even in the presence of clouds and for significant pitch and roll angles of the UAV. The calibration and registration of the four fisheye cameras is achieved by extending an existing omni-directional calibration toolbox to multi-camera calibration. We present examples of full-sky reconstruction of the polarisation pattern as well as an analysis of the error in the sun position estimate. In addition, we performed a preliminary test on-board a quadcopter.

Wolfgang Stürzl, Nicole Carey
Time-Lapse Image Fusion

Exposure fusion is a well-known technique for blending multiple, differently-exposed images to create a single frame with wider dynamic range. In this paper, we propose a method that applies and extends exposure fusion to blend visual elements from time sequences while preserving interesting structure. We introduce a time-dependent decay into the image blending process that determines the contribution of individual frames based on their relative position in the sequence, and show how this temporal component can be made dependent on visual appearance. Our time-lapse fusion method can simulate on video the kinds of visual effects that arise in long-exposure photography. It can also create very-long-exposure photographs impossible to capture with current digital sensor technologies.

Francisco J. Estrada
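
A minimal sketch of the time-decayed blending idea, assuming a simple well-exposedness weight and an exponential decay over frame age (both are illustrative choices, not the paper's exact weights):

```python
import numpy as np

def time_lapse_fusion(frames, well_exposedness_sigma=0.2, decay=0.9):
    """Blend a time sequence with exposure-fusion-style quality weights that
    decay with frame age, so that recent frames dominate.

    frames : list of (H, W, 3) float images in [0, 1], oldest first
    """
    acc = np.zeros_like(frames[0])
    w_acc = np.zeros(frames[0].shape[:2])
    n = len(frames)
    for t, f in enumerate(frames):
        gray = f.mean(axis=2)
        # well-exposedness: prefer mid-range pixels (a standard exposure-fusion cue)
        quality = np.exp(-((gray - 0.5) ** 2) / (2 * well_exposedness_sigma ** 2))
        # time-dependent decay: older frames (small t) contribute less
        weight = quality * decay ** (n - 1 - t)
        acc += weight[..., None] * f
        w_acc += weight
    return acc / np.maximum(w_acc, 1e-6)[..., None]
```
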
HDR Imaging under Non-uniform Blurring

Knowledge of scene irradiance is necessary in many computer vision algorithms. In this paper, we develop a technique to obtain the high dynamic range (HDR) irradiance of a scene from a set of differently exposed images captured using a hand-held camera. Any incidental motion induced by camera-shake can result in non-uniform motion blur. This is particularly true for frames captured with high exposure durations. We model the motion blur using a transformation spread function (TSF) that represents space-variant blurring as a weighted average of differently transformed versions of the latent image. We initially estimate the TSF of the blurred frames and then estimate the latent irradiance of the scene.

C. S. Vijay, Paramanand Chandramouli, Rajagopalan Ambasamudram
Semantic Image Segmentation Using Visible and Near-Infrared Channels

Recent progress in computational photography has shown that we can acquire physical information beyond visible (RGB) image representations. In particular, we can acquire near-infrared (NIR) cues with only slight modification to any standard digital camera. In this paper, we study whether this extra channel can improve semantic image segmentation. Based on a state-of-the-art segmentation framework and a novel manually segmented image database that contains 4-channel images (RGB+NIR), we study how to best incorporate the specific characteristics of the NIR response. We show that it leads to improved performances for 7 classes out of 10 in the proposed dataset and discuss the results with respect to the physical properties of the NIR response.

Neda Salamati, Diane Larlus, Gabriela Csurka, Sabine Süsstrunk
Utilization of False Color Images in Shadow Detection

Shadows are illuminated as a result of the Rayleigh scattering phenomenon, which happens to be more effective at small wavelengths of light. We propose the utilization of false color images for shadow detection, since the transformation eliminates the high-frequency blue component and introduces the low-frequency near-infrared channel. The effectiveness of the approach is tested using several shadow-variant texture and color-related cues proposed in the literature. The performances of these cues in regular and false color images are compared and analyzed within a supervised system using a support vector machine classifier.

Yagiz Aksoy, A. Aydın Alatan
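
A common definition of the false-color transform, dropping blue and mapping (NIR, R, G) to the display channels, is sketched below; the exact transformation used by the authors may differ:

```python
import numpy as np

def false_color(rgb, nir):
    """False-color composite: discard the blue channel and shift (NIR, R, G)
    into the (R, G, B) display channels.

    rgb : (H, W, 3) array, nir : (H, W) array, both floats in [0, 1]
    """
    return np.dstack([nir, rgb[..., 0], rgb[..., 1]])
```
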
High Information Rate and Efficient Color Barcode Decoding

The necessity of increasing information density in a given space motivates the use of more colors in color barcodes. A popular system, Microsoft’s HCCB technology, uses four or eight colors per patch. This system displays a color palette of four or eight colors in the color barcode to solve the problem with the dependency of the surface color on the illuminant spectrum, viewing parameters, and other sources. Since the displayed colors cannot be used to encode information, this solution comes at the cost of reduced information rate. In this contribution, we introduce a new approach to color barcode decoding that uses 24 colors per patch and requires a small number of reference colors to display in a barcode. Our algorithm builds groups of colors from each color patch and a small number of reference color patches, and models their evolution due to changing illuminant using a linear subspace. Therefore, each group of colors is represented by one such subspace. Our experimental results show that our barcode decoding algorithm achieves higher information rate with a very low probability of decoding error compared to systems that do display a color palette. The computational complexity of our algorithm is relatively low due to searching for the nearest subspace among 24 subspaces only.

Homayoun Bagherinia, Roberto Manduchi
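
The nearest-subspace decoding step can be sketched as follows; the PCA-style subspace fit, its dimensionality and the mean-centring are illustrative assumptions rather than the authors' exact model:

```python
import numpy as np

def fit_subspace(samples, dim=2):
    """Fit a low-dimensional linear subspace (via SVD) to a set of colour-group
    vectors observed under varying illuminants.  samples : (M, D) array."""
    mean = samples.mean(axis=0)
    _, _, vt = np.linalg.svd(samples - mean, full_matrices=False)
    return mean, vt[:dim]                      # rows of vt[:dim] span the subspace

def nearest_subspace(x, subspaces):
    """Assign a colour-group vector x to the subspace with the smallest
    reconstruction residual (one subspace per encoded colour group)."""
    best, best_err = None, np.inf
    for label, (mean, basis) in subspaces.items():
        d = x - mean
        proj = basis.T @ (basis @ d)           # projection onto the subspace
        err = np.linalg.norm(d - proj)
        if err < best_err:
            best, best_err = label, err
    return best
```
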
Uzawa Block Relaxation Methods for Color Image Restoration

In this paper we propose to investigate the use of a vectorial total variation model with spatially varying regularization and data terms for color image denoising and restoration. We pay attention to two main minimization problems: the minimization of a weighted vectorial total variation term $TV_g$, which acts as a regularization term, using the $L^2$ norm as the data term, or the minimization of the vectorial total variation with a spatially varying $L^1_g$ norm. The optimization process takes benefit of convex optimization tools by introducing an augmented Lagrangian formulation. This formulation leads us to simple and efficient algorithms based on Uzawa block relaxation schemes that are also robust towards the choice of the penalty parameter. In this paper, we study more particularly the impact of spatially varying terms (total variation term or data terms) for color image restoration. A new weighted total variation term is proposed for the restoration of old parchments, and we also compare the use of a weighted total variation term with a spatially varying data term for impulse noise removal in color images.

Cédric Loosli, Stéphanie Jehan-Besson, Jonas Koko

Third Workshop on Computer Vision in Vehicle Technology: From Earth to Mars

Monocular Rear-View Obstacle Detection Using Residual Flow

We present a system for automatically detecting obstacles from a moving vehicle using a monocular wide-angle camera. Our system was developed in the context of finding obstacles, and particularly children, when backing up. The camera viewpoint is transformed to a virtual bird's-eye view. We developed a novel image registration algorithm to obtain ego-motion that, in combination with variational dense optical flow, outputs a residual motion map with respect to the ground. The residual motion map is used to identify and segment 3D and moving objects. Our main contribution is the feature-based image registration algorithm, which is able to separate and obtain ground-layer ego-motion accurately even in cases where the ground covers only 20% of the image, outperforming RANSAC.

Jose Molineros, Shinko Y. Cheng, Yuri Owechko, Dan Levi, Wende Zhang
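
A rough sketch of residual-motion computation on bird's-eye-view frames is shown below; note that it uses a plain ORB+RANSAC homography for ego-motion, which is exactly the kind of baseline the paper's novel registration algorithm is designed to outperform:

```python
import cv2
import numpy as np

def residual_flow(prev_bev, curr_bev):
    """Per-pixel residual motion with respect to the ground plane.
    Assumes 8-bit grayscale bird's-eye-view frames; large residuals suggest
    3D structure or independently moving objects."""
    # sparse matches for the dominant (ground-plane) ego-motion
    orb = cv2.ORB_create(2000)
    k1, d1 = orb.detectAndCompute(prev_bev, None)
    k2, d2 = orb.detectAndCompute(curr_bev, None)
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(d1, d2)
    src = np.float32([k1[m.queryIdx].pt for m in matches])
    dst = np.float32([k2[m.trainIdx].pt for m in matches])
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 3.0)

    # dense optical flow between the frames
    flow = cv2.calcOpticalFlowFarneback(prev_bev, curr_bev, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)

    # motion predicted by the ground-plane homography at every pixel
    h, w = prev_bev.shape
    xs, ys = np.meshgrid(np.arange(w, dtype=np.float32),
                         np.arange(h, dtype=np.float32))
    pts = np.stack([xs, ys], axis=-1).reshape(-1, 1, 2)
    ego = cv2.perspectiveTransform(pts, H).reshape(h, w, 2) - np.stack([xs, ys], axis=-1)

    residual = flow - ego
    return np.linalg.norm(residual, axis=2)    # residual magnitude map
```
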
Subtraction-Based Forward Obstacle Detection Using Illumination Insensitive Feature for Driving-Support

This paper proposes a method for detecting general obstacles on a road by subtracting present and past in-vehicle camera images. The image-subtraction-based object detection approach can be applied to detect any kind of obstacle, whereas existing learning-based methods detect only specific obstacles. To detect general obstacles, the proposed method first computes a frame-by-frame correspondence between the present and past in-vehicle camera image sequences, and then registers the road surfaces between the frames. Finally, obstacles are detected by applying image subtraction to the registered road surface regions with an illumination-insensitive feature for robust detection. Experiments were conducted using several image sequences captured by an actual in-vehicle camera to confirm the effectiveness of the proposed method. The experimental results show that the proposed method can detect general obstacles accurately at a distance sufficient to avoid them safely, even in situations with differing illumination.

Haruya Kyutoku, Daisuke Deguchi, Tomokazu Takahashi, Yoshito Mekada, Ichiro Ide, Hiroshi Murase
Adaptive Visual Obstacle Detection for Mobile Robots Using Monocular Camera and Ultrasonic Sensor

This paper presents a novel vision-based obstacle detection algorithm that is adapted from a powerful background subtraction algorithm: ViBe (VIsual Background Extractor). We describe an adaptive obstacle detection method using monocular color vision and an ultrasonic distance sensor. Our approach assumes an obstacle-free region in front of the robot in the initial frame; however, the method dynamically adapts to its environment in the succeeding frames. The adaptation is performed using a model update rule based on the ultrasonic distance sensor reading. Our detailed experiments validate the proposed concept and the ultrasonic-sensor-based model update.

İbrahim K. İyidir, F. Boray Tek, Doğan Kırcalı
Data-Driven Vehicle Identification by Image Matching

Vehicle identification from images has been predominantly addressed through automatic license plate recognition (ALPR) techniques which detect and recognize the characters in the plate region of the image. We move away from traditional ALPR techniques and advocate for a data-driven approach for vehicle identification. Here, given a plate image region, the idea is to search for a near-duplicate image in an annotated database; if found, the identity of the near-duplicate is transferred to the input region. Although this approach could be perceived as impractical, we actually demonstrate that it is feasible with state-of-the-art image representations, and that it presents some advantages in terms of speed, and time-to-deploy. To overcome the issue of identifying previously unseen identities, we propose an image simulation approach where photo-realistic images of license plates are generated for desired plate numbers. We demonstrate that there is no perceivable performance difference between using synthetic and real plates. We also improve the matching accuracy using similarity learning, which is in the spirit of domain adaptation.

Jose A. Rodriguez-Serrano, Harsimrat Sandhawalia, Raja Bala, Florent Perronnin, Craig Saunders
A Vision-Based Navigation Facility for Planetary Entry Descent Landing

This paper describes a facility set up as a test bed and proof of concept to study open issues of future space missions. The final goal of such studies is to increase on-board autonomy, which is of primary importance for missions covering very large distances. We refer in particular to vision-based modules in charge of acquiring and processing images during the Entry, Descent and Landing (EDL) phases of a lander, contributing to a precise localization of the landing region and a safe landing. We describe the vision-based algorithms already implemented on the facility and a preliminary experimental analysis that allowed us to validate the approaches and yielded very promising results.

Piergiorgio Lanza, Nicoletta Noceti, Corrado Maddaleno, Antonio Toma, Luca Zini, Francesca Odone
CYKLS: Detect Pedestrian’s Dart Focusing on an Appearance Change

We propose a new method for detecting a “pedestrian’s dart” to support driver cognition in real traffic scenarios. The main idea is to detect a sudden appearance change of a pedestrian before the consequent action happens. Our new algorithm, called “Chronologically Yielded values of Kullback-Leibler divergence between Separate frames” (CYKLS), combines two main procedures: (1) computing the appearance change as the Kullback-Leibler divergence between descriptors of frames separated by a fixed time interval, and (2) detecting non-periodic sequences with a new smoothing method from time-series analysis. We can detect pedestrians’ darts with a 22% equal error rate on a dataset that includes 144 dart scenes.
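The first procedure reduces to computing a KL divergence between per-frame descriptors a fixed number of frames apart. A minimal sketch (descriptor choice and interval are assumptions, and the paper's smoothing step is not reproduced):

```python
import numpy as np

def kl_divergence(p, q, eps=1e-8):
    """KL divergence between two normalized descriptor histograms."""
    p = p / (p.sum() + eps)
    q = q / (q.sum() + eps)
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

def appearance_change_series(descriptors, interval):
    """KL divergence between descriptors of frames `interval` apart.

    descriptors: list of per-frame histograms (e.g., of a tracked pedestrian
    box). A sudden, non-periodic peak in this series is the cue that would
    then be smoothed and thresholded to flag a darting pedestrian.
    """
    return [kl_divergence(descriptors[t], descriptors[t - interval])
            for t in range(interval, len(descriptors))]
```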

Masahiro Ogawa, Hideo Fukamachi, Ryuji Funayama, Toshiki Kindo
Pose-Invariant Face Recognition in Videos for Human-Machine Interaction

Human-machine interaction is currently a hot topic in the computer vision and robotics communities. In this context, face recognition algorithms (used as the primary cue for assessing a person’s identity) work well under controlled conditions but degrade significantly when tested in real-world environments. This is mostly due to the difficulty of simultaneously handling variations in illumination, pose, and occlusion. In this paper, we propose a novel approach to robust pose-invariant face recognition for human-robot interaction based on real-time fitting of a 3D deformable model to input images taken from video sequences. More concretely, our approach generates a rectified face image irrespective of the actual head-pose orientation. Experimental results on the Honda video database, using several manifold learning techniques, show a distinct advantage of the proposed method over the standard 2D appearance-based snapshot approach.

Bogdan Raducanu, Fadi Dornaika
Hierarchical Properties of Multi-resolution Optical Flow Computation

Most methods for computing optical flow are variational, assuming that the image function is spatiotemporally continuous and that apparent motions are small. From the viewpoint of the discretization errors of the spatial and temporal derivatives, the appropriate resolution for optical flow depends on both the resolution and the frame rate of the images, since the accuracy of the discrete derivative approximations is limited. Therefore, for low frame-rate images, the appropriate resolution for optical flow should be lower than the resolution of the images. Many traditional methods, however, estimate optical flow at the same resolution as the images; if the image resolution is too high, down-sampling the images is therefore effective for variational methods. In this paper, we analyze the appropriate resolutions for optical flow estimated by variational optical-flow computations from the viewpoint of the error analysis of optical flow. To analyze the appropriate resolutions, we use hierarchical structures constructed from multi-resolution versions of the images. Numerical results show that decreasing the image resolution is effective for variational optical-flow computation on low frame-rate sequences.
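The practical recommendation can be illustrated with a short sketch: compute flow on down-sampled images and rescale the result. OpenCV's Farnebäck flow is used here purely as a stand-in for a variational solver, and the number of pyramid levels is an assumption:

```python
import cv2

def flow_at_lower_resolution(prev_gray, curr_gray, levels_down=1):
    """Compute flow on down-sampled images (illustrative of the paper's point).

    For low frame rates (large displacements), estimating flow at a coarser
    resolution and scaling it back can be more accurate than working at the
    full image resolution.
    """
    p, c = prev_gray, curr_gray
    for _ in range(levels_down):
        p, c = cv2.pyrDown(p), cv2.pyrDown(c)
    flow = cv2.calcOpticalFlowFarneback(p, c, None, 0.5, 3, 15, 3, 5, 1.2, 0)
    # Upsample the flow field and rescale the displacement vectors.
    scale = 2 ** levels_down
    flow = cv2.resize(flow, (prev_gray.shape[1], prev_gray.shape[0])) * scale
    return flow
```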

Yusuke Kameda, Atsushi Imiya, Tomoya Sakai
Semantic Road Segmentation via Multi-scale Ensembles of Learned Features

Semantic segmentation refers to the process of assigning an object label (e.g., building, road, sidewalk, car, pedestrian) to every pixel in an image. Common approaches formulate the task as a random field labeling problem, modeling the interactions between labels by combining local and contextual features such as color, depth, edges, SIFT or HOG. These models are trained to maximize the likelihood of the correct classification given a training set. However, such approaches rely on hand-designed features (e.g., texture, SIFT or HOG) and require considerable computational time in the inference process.

Therefore, in this paper, we focus on estimating the unary potentials of a conditional random field via ensembles of learned features. We propose an algorithm based on convolutional neural networks to learn local features from training data at different scales and resolutions; the diversity between these features is then exploited using a weighted linear combination. Experiments on a publicly available database show the effectiveness of the proposed method for semantic road scene segmentation in still images. The algorithm outperforms appearance-based methods, and its performance is comparable to state-of-the-art methods that use other sources of information such as depth, motion or stereo.
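A minimal sketch of the combination step, assuming the per-scale CNN outputs have already been upsampled to a common resolution (the weighting scheme and numerical details here are placeholders, not the paper's learned values):

```python
import numpy as np

def unary_potentials(scale_probs, weights):
    """Weighted linear combination of per-scale class probability maps.

    scale_probs: list of (H, W, C) soft-max outputs, one per scale, already
    resampled to a common resolution (stand-ins for the features learned at
    different scales); weights: one scalar per scale.
    Returns negative log probabilities usable as CRF unary potentials.
    """
    w = np.asarray(weights, dtype=np.float64)
    w = w / w.sum()
    combined = sum(wi * p for wi, p in zip(w, scale_probs))
    return -np.log(np.clip(combined, 1e-8, 1.0))
```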

Jose M. Alvarez, Yann LeCun, Theo Gevers, Antonio M. Lopez
Monocular Visual Odometry and Dense 3D Reconstruction for On-Road Vehicles

More and more on-road vehicles are equipped with cameras each day. This paper presents a novel method for estimating the relative motion of a vehicle from a sequence of images obtained with a single vehicle-mounted camera. Recently, several researchers in robotics and computer vision have studied the performance of motion estimation algorithms under non-holonomic and planarity constraints. The successful algorithms typically use the smallest number of feature correspondences required by the motion model. It is well established that such minimal algorithms are efficient and robust to outliers when used in a hypothesize-and-test framework such as random sample consensus (RANSAC). In this paper, we show that planar 2-point motion estimation can be solved analytically using a single quadratic equation, without the need for iterative techniques such as the Newton-Raphson method used in existing work. Non-iterative methods are more efficient and do not suffer from local-minima problems. Although 2-point motion estimation generates visually accurate on-road vehicle trajectories, the motion is not precise enough for dense 3D reconstruction because roads are not truly planar. Thus we use the 2-point relative motion algorithm for the initial images, followed by 3-point 2D-to-3D camera pose estimation for the subsequent images. Using this hybrid approach, we generate motion estimates accurate enough for a plane-sweeping algorithm that produces dense depth maps for obstacle detection applications.
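The hypothesize-and-test structure around a 2-point minimal solver can be sketched generically as below; `solve_two_point` and `score_fn` are placeholders for the paper's closed-form quadratic solver and its residual function, which are not reproduced here:

```python
import numpy as np

def ransac_two_point(correspondences, solve_two_point, score_fn,
                     iterations=200, inlier_thresh=1e-3):
    """Generic hypothesize-and-test loop for a 2-point minimal solver.

    `solve_two_point` stands in for a closed-form solver returning candidate
    planar motions (e.g., the real roots of a quadratic); `score_fn` returns
    one residual per correspondence for a candidate motion.
    """
    best_motion, best_inliers = None, 0
    rng = np.random.default_rng(0)
    n = len(correspondences)
    for _ in range(iterations):
        sample = [correspondences[i] for i in rng.choice(n, 2, replace=False)]
        for motion in solve_two_point(sample):
            inliers = int(np.sum(score_fn(motion, correspondences) < inlier_thresh))
            if inliers > best_inliers:
                best_motion, best_inliers = motion, inliers
    return best_motion, best_inliers
```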

Menglong Zhu, Srikumar Ramalingam, Yuichi Taguchi, Tyler Garaas
Erratum: High Accuracy TOF and Stereo Sensor Fusion at Interactive Rates

There was an error in the acknowledgements section of this paper. The correct acknowledgement text is as follows:

This work is part of a joint research project with the Filmakademie Baden-Württemberg, Institute of Animation. It is co-funded by the Intel Visual Computing Institute and under grant 2-4225.16/380 of the ministry of economy Baden-Württemberg as well as further partners Unexpected, Pixomondo, ScreenPlane, Bewegte Bilder and Tridelity. The content is under sole responsibility of the authors.

Rahul Nair, Frank Lenzen, Stephan Meister, Henrik Schäfer, Christoph Garbe, Daniel Kondermann
Backmatter
Metadata
Title: Computer Vision – ECCV 2012. Workshops and Demonstrations
Edited by: Andrea Fusiello, Vittorio Murino, Rita Cucchiara
Copyright year: 2012
Publisher: Springer Berlin Heidelberg
Electronic ISBN: 978-3-642-33868-7
Print ISBN: 978-3-642-33867-0
DOI: https://doi.org/10.1007/978-3-642-33868-7