
2008 | Book

Articulated Motion and Deformable Objects

5th International Conference, AMDO 2008, Port d’Andratx, Mallorca, Spain, July 9-11, 2008. Proceedings

Edited by: Francisco J. Perales, Robert B. Fisher

Publisher: Springer Berlin Heidelberg

Book series: Lecture Notes in Computer Science


About this book

This book constitutes the refereed proceedings of the 5th International Conference on Articulated Motion and Deformable Objects, AMDO 2008, held in Port d'Andratx, Mallorca, Spain, in July 2008. The 36 revised full papers and 7 poster papers presented were carefully reviewed and selected from 64 submissions. The papers are organized in topical sections on computer graphics: human modelling and animation; human motion: analysis, tracking, 3D reconstruction and recognition; multimodal user interaction: VR and AR, speech, biometrics; and advanced multimedia systems: standards, indexed video contents.

Table of Contents

Frontmatter
Inverse Kinematics Using Sequential Monte Carlo Methods

In this paper we propose an original approach to solving the Inverse Kinematics problem. Our framework is based on Sequential Monte Carlo Methods and has the advantage of avoiding the classical pitfalls of numerical inversion methods, since only direct calculations are required. The resulting algorithm accepts arbitrary constraints and exhibits linear complexity with respect to the number of degrees of freedom. Hence, the proposed system is far more efficient for articulated figures with a high number of degrees of freedom.

Nicolas Courty, Elise Arnaud
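The direct-calculation idea behind a Sequential Monte Carlo IK solver can be illustrated with a minimal sketch. This is a hypothetical 2-link planar arm, not the paper's system; the particle count, noise scales and annealing schedule are illustrative assumptions. Note that only forward kinematics is ever evaluated:

```python
import numpy as np

def forward_kinematics(angles, lengths):
    """End-effector position of a planar chain (direct calculation only)."""
    pos, total = np.zeros(2), 0.0
    for a, l in zip(angles, lengths):
        total += a
        pos += l * np.array([np.cos(total), np.sin(total)])
    return pos

def smc_ik(target, lengths, n_particles=500, n_iters=50, sigma=0.2, seed=0):
    """SMC-style IK: diffuse joint-angle particles, weight by goal distance, resample."""
    rng = np.random.default_rng(seed)
    particles = rng.uniform(-np.pi, np.pi, size=(n_particles, len(lengths)))
    for _ in range(n_iters):
        particles += rng.normal(0.0, sigma, particles.shape)   # diffusion step
        errs = np.array([np.linalg.norm(forward_kinematics(p, lengths) - target)
                         for p in particles])
        logw = -errs**2 / (2 * 0.05**2)                        # Gaussian likelihood
        logw -= logw.max()                                     # numerical safety
        w = np.exp(logw)
        w /= w.sum()
        idx = rng.choice(n_particles, size=n_particles, p=w)   # resampling
        particles = particles[idx]
        sigma *= 0.9                                           # anneal the diffusion
    errs = [np.linalg.norm(forward_kinematics(p, lengths) - target) for p in particles]
    return particles[int(np.argmin(errs))]
```

The cost per iteration is one forward-kinematics pass per particle, which is where the linear complexity in the number of degrees of freedom comes from.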
Estimation of Dense, Non-rigid Motion Fields from a Multi-camera Array Using a Hierarchical Mixture Model

The problem of modelling objects of arbitrary complexity for video-based rendering has been much studied in recent years, with the growing interest in ‘free viewpoint’ video and similar applications. Typical approaches fall into two categories: those which approximate surfaces from dense depth maps obtained by generalisations of stereopsis methods and those which employ an explicit geometric representation such as a mesh. While the former has generality with respect to geometry, it is inevitably limited in terms of viewpoint; the latter, on the other hand, sacrifices generality of object geometry for freedom to pick an arbitrary viewpoint. The purpose of the work reported here is to bridge this gap in object representation, by employing a surface element model, but one which is freed from the restrictions of a mesh. Estimation of the model and tracking it through time from multiple cameras is achieved by novel multiresolution stochastic simulation methods. After a brief outline of the method, its use in modelling human motions using data from the Warwick multi-camera studio is presented to illustrate its effectiveness compared to the current state of the art.

Adam Bowen, Andrew Mullins, Roland Wilson, Nasir Rajpoot
Learning to Look at Humans — What Are the Parts of a Moving Body?

We present a system that can segment articulated, non-rigid motion without a priori knowledge of the number of clusters present in the analyzed scenario. We combine existing algorithms for tracking and extend clustering techniques by a self-tuning heuristic. Application to video sequences of humans shows good segmentation into limbs.

Thomas Walther, Rolf P. Würtz
Continuous Hand Gesture Recognition in the Learned Hierarchical Latent Variable Space

We describe a hierarchical approach for recognizing continuous hand gestures. It consists of feature extraction based on hierarchical nonlinear dimensionality reduction and motion modeling based on a Hierarchical Conditional Random Field (Hierarchical CRF). Articulated hands can be decomposed into several hand parts, and we explore the underlying structures of the articulated action spaces for both the hand and the hand parts using a Hierarchical Gaussian Process Latent Variable Model (HGPLVM). In this hierarchical latent variable space, we propose a Hierarchical CRF, which can simultaneously capture the extrinsic class dynamics and learn the relationship between motions of hand parts and class labels, to model the hand motions. Promising recognition performance is obtained on our user-defined hand gesture dataset.

Lei Han, Wei Liang
Real-Time 3D Body Pose Tracking from Multiple 2D Images

We present a human body motion tracking system for an interactive virtual simulation training environment. This system captures images using IR illumination and near-IR cameras to overcome limitations of a dimly lit environment. Features, such as silhouettes and medial axis of blobs are extracted from the images which lack much internal texture. We use a combination of a 2D ICP and particle filtering method to estimate the articulated body configuration of a trainee from image features. The method allows articulation of the arms at elbows and shoulders and of the body at the waist; this is a considerable improvement over previous such methods. Our system works in real-time and is robust to temporary errors in image acquisition or tracking. The system serves as part of a multi-modal user-input device for interactive simulation.

Chi-Wei Chu, Ramakant Nevatia
Applying Space State Models in Human Action Recognition: A Comparative Study

This paper presents comparative results of applying different architectures of generative classifiers (HMM, FHMM, CHMM, Multi-Stream HMM, Parallel HMM) and a discriminative classifier, Conditional Random Fields (CRFs), to human action sequence recognition. The models are fed with histograms of highly informative features such as contour evolution and optical flow. Motion orientation discrimination has been obtained by tiling the bounding box of the subject and extracting features from each tile. We run our experiments on two well-known databases, the KTH and Weizmann databases. The results show that both types of models reach similar scores, with the generative models performing better when used with optical-flow features and the discriminative one performing better when used with shape-context features.

M. Ángeles Mendoza, Nicolás Pérez de la Blanca
Bone Glow: An Improved Method for the Assignment of Weights for Mesh Deformation

Many real-time algorithms for mesh deformation driven by animation of an underlying skeleton make use of a set of per-bone weights associated with each vertex. There are few unguided algorithms for the assignment of these weights, a recently proposed solution being bone heat [1]. In this paper, we briefly discuss bone heat and provide examples where it performs poorly. We then develop a refinement of bone heat, termed bone glow, which, in our validation, performed as well as bone heat in simple cases while not suffering from bone heat’s observed weaknesses.

Rich Wareham, Joan Lasenby
Analysis of Human Motion, Based on the Reduction of Multidimensional Captured Data – Application to Hand Gesture Compression, Segmentation and Synthesis

This paper describes a method to analyze human motion, based on the reduction of multidimensional captured motion data. A Dynamic Programming Piecewise Linear Approximation model is used to automatically extract, in an optimal way, key-postures distributed along the motion data. This non-uniform sub-sampling can be exploited for motion compression, segmentation, or re-synthesis. It has been applied to arm end-point motion for 3D or 6D trajectories. The analysis method is then evaluated using an approximation of the curvature and the tangential velocity, which turns out to be robust to noise and can be calculated on multidimensional data.

Sylvie Gibet, Pierre-François Marteau
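A dynamic-programming piecewise linear approximation of this general kind can be sketched as follows. The uniform chord parameterization and the O(n²·K) recurrence are simplifying assumptions for illustration; the paper's exact cost function may differ:

```python
import numpy as np

def segment_error(X, i, j):
    """Sum of squared distances from samples i..j to the chord X[i] -> X[j],
    assuming uniform parameterization along the segment."""
    if j - i < 2:
        return 0.0
    t = np.linspace(0.0, 1.0, j - i + 1)[:, None]
    chord = (1 - t) * X[i] + t * X[j]
    return float(((X[i:j + 1] - chord) ** 2).sum())

def dp_pla(X, n_segments):
    """Optimal piecewise-linear approximation of trajectory X with a fixed
    number of segments, solved by dynamic programming. Returns the indices
    of the selected key-postures and the total approximation error."""
    n = len(X)
    INF = float("inf")
    cost = [[INF] * n for _ in range(n_segments + 1)]
    back = [[-1] * n for _ in range(n_segments + 1)]
    cost[0][0] = 0.0
    for k in range(1, n_segments + 1):
        for j in range(1, n):
            for i in range(j):
                if cost[k - 1][i] == INF:
                    continue
                c = cost[k - 1][i] + segment_error(X, i, j)
                if c < cost[k][j]:
                    cost[k][j] = c
                    back[k][j] = i
    # Backtrack to recover the key-posture (breakpoint) indices.
    idx, j = [n - 1], n - 1
    for k in range(n_segments, 0, -1):
        j = back[k][j]
        idx.append(j)
    return idx[::-1], cost[n_segments][n - 1]
```

Because each sample in X may be a point in 3D, 6D or any higher-dimensional feature space, the same recurrence applies unchanged to multidimensional captured data.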
Exploiting Structural Hierarchy in Articulated Objects Towards Robust Motion Capture

This paper presents a general analysis framework for exploiting the underlying hierarchical and scalable structure of an articulated object for pose estimation and tracking. The Scalable Human Body Model (SHBM) is presented as a set of human body models ordered according to a hierarchy criterion. The concept of annealing is applied to derive a generic particle filtering scheme able to perform a sequential filtering over the models contained in the SHBM, leading to a structural annealing process. This scheme is applied to perform human motion capture in a multi-camera environment. Finally, the effectiveness of the proposed system is addressed by comparing its performance with the standard and annealed particle filtering approaches over an annotated database.

C. Canton-Ferrer, J. R. Casas, M. Pardàs
Estimating Human Skeleton Parameters and Configuration in Real-Time from Markered Optical Motion Capture

This paper is concerned with real-time approaches to using marker-based optical motion capture to identify, parametrize, and estimate the frame by frame configuration of the human skeleton. An overview of the stages of a system is provided with the main emphasis devoted to two new methods for refining the rotation estimates used within the transformational algorithm class of joint parameter estimation methods. Virtual Marker Insertion uses additional markers inserted at the current estimates of joint location to partially enforce the concurrency of available joint location estimates. This simple algorithm is shown to outperform the methods presented in the literature. A conjugate gradient optimization on a minimal parameterization of the standard transformational algorithm cost function gives superior results, but at considerable computational cost, limiting its probable application to those frames which are actually rendered in a feedback system.

Jonathan Cameron, Joan Lasenby
Dealing with Self-occlusion in Region Based Motion Capture by Means of Internal Regions

Self-occlusion is a common problem in silhouette-based motion capture, which often results in ambiguous pose configurations. In most works this is compensated for by a priori knowledge about the motion or the scene, or by the use of multiple cameras. Here we suggest overcoming this problem by splitting the surface model of the object and tracking the silhouette of each part rather than of the whole object. The splitting can be done automatically by comparing the appearance of the different parts with the Jensen-Shannon divergence. Tracking is then achieved by maximizing the appearance differences of all involved parts and the background simultaneously via gradient descent. We demonstrate the improvements with tracking results from simulated and real-world scenes.

Christian Schmaltz, Bodo Rosenhahn, Thomas Brox, Joachim Weickert, Lennart Wietzke, Gerald Sommer
Model-Based Analysis and Synthesis of Time-Varying Mesh

Time-varying mesh (TVM) is a technique that describes the full shape and motion of a real-world moving object. Thus, TVM can capture and reproduce human behavior and natural movements precisely, such as facial expressions or small changes in clothes. On the other hand, TVM requires large storage space and high computational cost. To solve this problem, we propose a framework for motion editing and synthesis. In our approach, a skeleton model is extracted from each frame, and motion tracking and analysis are conducted using the skeleton models. Then, a representative frame is deformed based on the extracted motion information. As a result, 3D scene representation with a much smaller data size is made possible.

Takashi Maeda, Toshihiko Yamasaki, Kiyoharu Aizawa
Mesh Vertex Pose and Position Interpolation Using Geometric Algebra

This paper presents a method of interpolating between two or more general displacements (rotation and translation). The resulting interpolated path is smooth and possesses a number of desirable properties. It differs from existing algorithms which require factorising the pose into separate rotation and translation components, and is derived from an intuitively appealing framework, i.e. a natural extension of the standard interpolation scheme for pure rotations. While this paper presents the theory behind the interpolation and its description as a tool, we also outline the possible advantages of using this technique for vision and graphics applications.

Rich Wareham, Joan Lasenby
An Audiovisual Talking Head for Augmented Speech Generation: Models and Animations Based on a Real Speaker’s Articulatory Data

We present a methodology developed to derive three-dimensional models of speech articulators from volume MRI and multiple-view video images acquired from one speaker. Linear component analysis is used to model these highly deformable articulators as the weighted sum of a small number of basic shapes corresponding to the articulators’ degrees of freedom for speech. These models are assembled into an audiovisual talking head that can produce augmented audiovisual speech, i.e. can display usually non-visible articulators such as the tongue or velum. The talking head is then animated by recovering its control parameters by inversion from the coordinates of a small number of points on the articulators of the same speaker, tracked by Electro-Magnetic Articulography. The augmented speech produced points the way to promising applications in the domain of speech therapy for speech-retarded children, perception and production rehabilitation of hearing-impaired children, and pronunciation training for second language learners.

Pierre Badin, Frédéric Elisei, Gérard Bailly, Yuliya Tarabalka
Towards Efficiency in Cloth Simulation

Cloth simulation is an extremely expensive task. Realistic cloth models coupled with stable numerical integration demand all the processing power we can spend. Although implicit integration schemes allow us to use large time steps, the exponential time complexity limits the number of particles that are reasonable to use in any simulation. In this paper, we present a technique that simulates cloth surfaces with adaptive level of detail without degrading the surface physical properties or the stability of the simulation. The underlying mesh operations are derived from subdivision surfaces, but the rules are adapted to prevent the numerical simulation from diverging. We also demonstrate how the model constants can be specified in a resolution-independent way. While the use of DLOD effectively allows the formation of folds and wrinkles, the final appearance can still be far from smooth. To further increase the realism we can use a GPU post-processing algorithm to smooth the final rendered surface.

Fernando Birra, Manuel Santos
2D Articulated Body Tracking with Self-occultations Handling

Recently, many methods for human articulated body tracking have been proposed in the literature. These techniques are often computationally intensive and cannot be used for Human-Computer Interfaces. We propose in this article a real-time algorithm for upper body tracking with occultation handling. The tracking is based on an articulated body model, also used to automatically initialize the target. After an independent search for the most likely positions of each limb, a dynamic programming algorithm is used to find the best configuration according to the links between limbs. The self-occultations between the limbs are directly taken into account by the tracking algorithm, and results show the merit of the proposed approach.

Eric Para, Olivier Bernier, Catherine Achard
A Two-Step Approach for Detecting Individuals within Dense Crowds

This paper proposes a two-step approach for detecting individuals within dense crowds. The first step uses an offline-trained Viola-type head detector in still color images of dense crowds against a cluttered background. In the second step, which aims to reduce false alarm rates at the same detection rates, color bin images are constructed from normalized rg color histograms of the windows detected in the first step. Haar-like features extracted from these color bin images are input to a trained cascade of boosted classifiers to separate correct detections from false alarms. Experimental results of both steps are presented as Receiver Operating Characteristic (ROC) curves, in comparison with recent related work. Our proposed two-step approach is able to attain a high detection rate of 90.0% while maintaining the false alarm rate below 40.0%, compared to other work which incurs a high 70.0% false alarm rate while its detection rate is still below 90.0%.

Chern-Horng Sim, Ekambaram Rajmadhan, Surendra Ranganath
3D Modeling for Deformable Objects

This paper presents an efficient B-spline surface reconstruction technique for modelling deformable objects. The differences of our method from previous B-spline fitting approaches are: 1) the reconstructed B-spline patch does not need to be square shaped, which significantly reduces the required number of B-spline patches for reconstruction; 2) the dataset to be reconstructed does not have to be grid data. This is important, especially for 3D scan data, which is an unstructured dense point cloud, normally with holes. A compact 3D shape description can be obtained using our approach. This shape representation allows 3D metamorphosis, direct manipulation of free-form deformation, and level-of-detail control (real-time multi-resolution rendering). The demonstrated results are reconstructed directly from the dense point clouds collected from our 3D scanner (based on a stereo photogrammetry technique) and example datasets provided by Cyberware.

Yi Song, Li Bai
Active-Vision System Reconfiguration for Form Recognition in the Presence of Dynamic Obstacles

This paper presents a novel, agent-based sensing-system reconfiguration methodology for the recognition of time-varying geometry objects or subjects (targets). A multi-camera active-vision system is used to improve form-recognition performance by selecting near-optimal viewpoints along a prediction horizon. The proposed method seeks to maximize the visibility of such a time-varying geometry in a cluttered, dynamic environment. Simulated experiments clearly show a tangible potential performance gain.

Matthew Mackay, Beno Benhabib
View-Invariant Human Action Detection Using Component-Wise HMM of Body Parts

This paper presents a framework for view-invariant action recognition in image sequences. Feature-based human detection becomes extremely challenging when the agent is observed from different viewpoints. Besides, similar actions, such as walking and jogging, are hardly distinguishable by considering the human body as a whole. In this work, we have developed a system which detects human body parts under different views and recognises similar actions by learning temporal changes of the detected body part components. Firstly, human body part detection is performed to find separately three components of the human body, namely the head, legs and arms. We incorporate a number of sub-classifiers, each for a specific range of viewpoints, to detect those body parts. Subsequently, we have extended this approach to distinguish and recognise actions like walking and jogging based on component-wise HMM learning.

Bhaskar Chakraborty, Marco Pedersoli, Jordi Gonzàlez
A Generative Model for Motion Synthesis and Blending Using Probability Density Estimation

The main focus of this paper is to present a method of reusing motion captured data by learning a generative model of motion. The model allows synthesis and blending of cyclic motion, whilst providing it with the style and realism present in the original data. This is achieved by projecting the data into a lower dimensional space and learning a multivariate probability distribution of the motion sequences. Functioning as a generative model, the probability density estimation is used to produce novel motions from the model and gradient based optimisation used to generate the final animation. Results show plausible motion generation and lifelike blends between different actions.

Dumebi Okwechime, Richard Bowden
Gestural Interaction Using Feature Classification

This paper describes our ongoing research on deviceless interaction using hand gesture recognition with a calibrated stereo system. Video-based interaction is one of the most intuitive kinds of Human-Computer Interaction with Virtual Reality applications, due to the fact that users are not wired to a computer. If interaction with three-dimensional environments is considered, pointing, grabbing and releasing are the most intuitive gestures used by humans. This paper describes our video-based gesture recognition system that observes the user in front of a large display screen, identifies three different hand gestures in real time using 2D feature classification, and determines 3D information such as the 3D position of the user’s hand or, when a pointing gesture is performed, the pointing direction. Different scenario applications, such as a virtual chess game against the computer or an industrial scenario, have been developed and tested. To estimate the possible number of distinguishable gestures, a sign language recognition application has been developed and tested using only a single uncalibrated camera.

Cornelius Malerczyk
Predicting Missing Markers to Drive Real-Time Centre of Rotation Estimation

This paper addresses the problem of real-time location of the joints or centres of rotation (CoR) of human skeletons in the presence of missing data. The data is assumed to be 3D marker positions from a motion capture system. We present an integrated framework which predicts the occluded marker positions using a Kalman filter in combination with inferred information from neighbouring markers, and thereby maintains a continuous data flow. The CoR positions can be calculated with high accuracy even in cases where markers are occluded for a long period of time.

Andreas Aristidou, Jonathan Cameron, Joan Lasenby
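The occlusion-bridging role of a Kalman filter in this setting can be illustrated with a minimal constant-velocity sketch for a single marker coordinate. The motion model, noise values and frame rate below are illustrative assumptions; the paper additionally fuses information from neighbouring markers, which this sketch omits:

```python
import numpy as np

class MarkerKF:
    """Constant-velocity Kalman filter for one marker coordinate stream.

    State: [position, velocity]. When the marker is occluded (z is None),
    only the prediction step runs, maintaining a continuous data flow.
    """
    def __init__(self, dt=1.0 / 120, q=0.01, r=1e-4):
        self.F = np.array([[1.0, dt], [0.0, 1.0]])   # constant-velocity model
        self.H = np.array([[1.0, 0.0]])              # we observe position only
        self.Q = q * np.array([[dt**4 / 4, dt**3 / 2],
                               [dt**3 / 2, dt**2]])  # process noise
        self.R = np.array([[r]])                     # measurement noise
        self.x = np.zeros((2, 1))
        self.P = np.eye(2)

    def step(self, z=None):
        # Predict.
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        if z is not None:                            # marker visible: update
            y = np.array([[z]]) - self.H @ self.x
            S = self.H @ self.P @ self.H.T + self.R
            K = self.P @ self.H.T @ np.linalg.inv(S)
            self.x = self.x + K @ y
            self.P = (np.eye(2) - K @ self.H) @ self.P
        return float(self.x[0, 0])
```

During an occlusion the filter simply extrapolates with its last velocity estimate, so downstream CoR computations always receive a position for every frame.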
Accurate Human Motion Capture Using an Ergonomics-Based Anthropometric Human Model

In this paper we present our work on markerless model-based 3D human motion capture using multiple cameras. We use an industry proven anthropometric human model that was modeled taking ergonomic considerations into account. The outer surface consists of a precise yet compact 3D surface mesh that is mostly rigid on body part level apart from some small but important torsion deformations. Benefits are the ability to capture a great amount of possible human appearances with high accuracy while still having a simple to use and computationally efficient model. We have introduced special optimizations such as caching into the model to improve its performance in tracking applications. Available force and comfort measures within the model provide further opportunities for future research.

3D articulated pose estimation is performed in a Bayesian framework, using a set of hierarchically coupled local particle filters for tracking. This makes it possible to sample efficiently from the high dimensional space of articulated human poses without constraining the allowed movements. Sequences of tracked upper-body as well as full-body motions captured by three cameras show promising results. Despite the high dimensionality of our model (51 DOF) we succeed at tracking using only silhouette overlap as weighting function due to the precise outer appearance of our model and the hierarchical decomposition.

Jan Bandouch, Florian Engstler, Michael Beetz
A Deformable Surface Model with Volume Preserving Springs

This paper discusses the possibility of employing a surface model to emulate volume behaviour. This is inspired by a significant interest in employing the surface data due to its simplicity. However, there are issues in properties estimation and volume preservation. Therefore, the aim of the ongoing research includes exploring the potential of a surface mass spring model with shape-preserving springs for volume simulation. Initial evaluations illustrate the feasibility of employing a mass spring model with volume preserving springs to simulate the dynamic behaviour of a soft volume. The proposed framework can be further explored to address other material properties.

Sylvester Arnab, Vinesh Raja
Temporal Nearest End-Effectors for Real-Time Full-Body Human Actions Recognition

In this paper we present a novel method called Temporal Nearest End-Effectors (TNEE) to automatically classify full-body human actions captured in real-time. This method uses a simple representation for modeling actions based exclusively on the recent positions of the user’s end-effectors, i.e. hands, head and feet, relative to the pelvis. With this method, the essential information of full-body movements is retained in a reduced form. The recognition procedure combines the evaluation of the performed poses and the temporal coherence. The performance of TNEE is tested with real motion capture data obtaining satisfactory results for real-time applications.

Oscar Mena, Luis Unzueta, Basilio Sierra, Luis Matey
Fusing Edge Cues to Handle Colour Problems in Image Segmentation

This paper presents a new background subtraction algorithm for segmenting mobile objects from a static background scene. Firstly, a case analysis of colour-motion segmentation problems is presented. Secondly, an architecture which fuses colour, intensity and edge cues is developed to cope with the motion segmentation problems presented in the case analysis. Our approach first combines both colour and intensity cues in order to solve problems such as saturation or the lack of colour when the background model is built. Nonetheless, some colour problems presented in the case analysis, such as camouflage in chroma, remain unsolved. Then, in order to solve these problems, a new cue based on edges is proposed. Finally, our approach fusing colour, intensity and edge cues is presented, thereby obtaining accurate motion segmentation in both indoor and outdoor scenes.

I. Huerta, A. Amato, J. Gonzàlez, J. J. Villanueva
Body-Part Templates for Recovery of 2D Human Poses under Occlusion

Detection of humans and estimation of their 2D poses from a single image are challenging tasks. This is especially true when part of the observation is occluded. However, given a limited class of movements, poses can be recovered given the visible body-parts. To this end, we propose a novel template representation where the body is divided into five body-parts. Given a match, we not only estimate the joints in the body-part, but all joints in the body. Quantitative evaluation on a HumanEva walking sequence shows mean 2D errors of approximately 27.5 pixels. For simulated occlusion of the head and arms, similar results are obtained while occlusion of the legs increases this error by 6 pixels.

Ronald Poppe, Mannes Poel
Autonomous Virtual Agents for Performance Evaluation of Tracking Algorithms

This paper describes a framework which exploits the use of computer animation to evaluate the performance of tracking algorithms. This can be achieved with two different, complementary strategies. On the one hand, augmented reality allows the scene complexity to be increased gradually by adding virtual agents into a real image sequence. On the other hand, the simulation of virtual environments involving autonomous agents provides synthetic image sequences. These are used to evaluate several difficult tracking problems which are under research nowadays, such as performance over long-time runs and the evaluation of sequences containing crowds of people and numerous occlusions. Finally, a general event-based evaluation metric is defined to measure whether the agents and actions in the scene given by the ground truth were correctly tracked, by comparing two event lists. This metric is suitable for evaluating different tracking approaches where the underlying algorithm may be completely different.

Pau Baiget, Xavier Roca, Jordi Gonzàlez
A Manipulable Vision-Based 3D Input Device for Space Curves

This paper presents a novel and user friendly input device for 3D curves. The system is based on a piece of flexible wire and a single off-the-shelf photo camera: the user bends the wire to the desired 3D shape; then, an ad-hoc technique for 3D reconstruction is used to recover its 3D shape (a space curve) from a single image.

The result is a simple, unusual input device with many potential applications, ranging from games to 3D modeling. For untrained users, this is a much more intuitive input technique than alternative methods. A disadvantage is that changes to the wire’s shape are not reflected in real time on the recovered representation.

We give a detailed description of the system’s structure, briefly recall the reconstruction technique, and describe a prototype in which the input device is seamlessly integrated in the popular Blender 3D modeling software. We finally show simple example applications in which the shape of the wire is used to define the trajectory of moving objects, to deform a 3D object, and to animate a 3D character.

Vincenzo Caglioti, Alessandro Giusti, Lorenzo Mureddu, Pierluigi Taddei
A Comparative Study on Using Zernike Velocity Moments and Hidden Markov Models for Hand Gesture Recognition

Hand-gesture recognition presents a challenging problem for computer vision due to the articulated structure of the human hand and the complexity of the environments in which it is typically applied. Solving such a problem requires a robust tracking mechanism which in turn depends on an effective feature descriptor and classifier. Moment invariants, as discriminative feature descriptors, have been used for shape representation for many years. Zernike moments have been particularly attractive for their scale, translation and rotation invariance. More recently, Zernike moments have been extended to a spatio-temporal descriptor, known as the Zernike velocity moment, through combining with the displacement vector of the centre of mass of the target object between video frames. This descriptor has hitherto been demonstrated successfully in human gait analysis. In this paper, we introduce and evaluate the application of Zernike velocity moments in hand-gesture recognition, and compare with a bank of hidden Markov models with Zernike moments as observations. We demonstrate good performance for both approaches, with a substantial increase in performance for the latter method.

Moaath Al-Rajab, David Hogg, Kia Ng
Deformable Volumetric Simplex Meshes

In this article, two possible ways to build volumetric meshes of simplexes are described. One structure uses the crystalline organization of carbon atoms in diamond and it is used to simulate the deformation under internal pressure of a container. The second structure is in layers and it is used to simulate human body tissues that could be cut. We also describe the cutting process for this layered structure.

Luis Cruz, Luis Gerardo de la Fraga
Analytical Simulation of B-Spline Surfaces Deformation

In this paper an analytic solution of an evolution model is proposed in order to deform B-spline parametric surfaces. The deformation model is based on an energy functional associated with a surface, and its variational formulation is introduced. After some simplifications, including assumptions regarding the mass and damping matrices, and taking into account the properties of B-splines when they are used as finite elements, a second-order differential equation is obtained which can be solved analytically. The spatial discretization on which these finite elements are defined and computed reduces to a small number of control points, which are deformed instead of all the surface points, yielding an efficient and fast method for simulating surface deformations.

Manuel González-Hidalgo, Antoni Jaume Capó, Arnau Mir, Gabriel Nicolau-Bestard
Appearance-Based Tracking and Face Identification in Video Sequences

We present a technique for face recognition in videos. We are able to recognise a face in a video sequence, given a single gallery image. By assuming that the face is in an approximately frontal position, we jointly model changes in facial appearance caused by identity and illumination. The identity of a face is described by a vector of appearance parameters. We use an angular distance to measure the similarity of faces and a probabilistic procedure to accumulate evidence for recognition along the sequence. We achieve 93.8% recognition success on a set of 65 sequences of 6 subjects from the La Cascia and Sclaroff database.

José Miguel Buenaposada, Juan Bekios, Luis Baumela
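The abstract above measures face similarity with an angular distance between appearance-parameter vectors. A minimal sketch of that idea, assuming plain feature vectors and nearest-gallery identification (the paper's actual appearance model and evidence-accumulation procedure are not given in the abstract):

```python
import numpy as np

def angular_distance(a, b):
    """Angle (radians) between two appearance-parameter vectors.

    Illustrative only: the paper's appearance model and any
    normalisation step are assumptions here.
    """
    cos_sim = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    # Clip to guard against floating-point values slightly outside [-1, 1].
    return np.arccos(np.clip(cos_sim, -1.0, 1.0))

def identify(probe, gallery):
    """Return the gallery id with the smallest angular distance.

    gallery: dict mapping subject id -> appearance vector.
    """
    return min(gallery, key=lambda sid: angular_distance(probe, gallery[sid]))
```

In a video setting, one such per-frame decision would feed the probabilistic evidence accumulation the abstract mentions.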
Interactive Constrained Deformations of NURBS Surfaces: N-SCODEF

In this paper we propose a generalized SCODEF deformation method for deforming NURBS surfaces. The method supports a wide class of deformation functions applied to a set of user-selected constraints and a wide range of influence zones, expanding those used in the original SCODEF method. We also propose the use of several norms and distances to define and compute the deformation function and the constraint influence zone, ensuring a wide range of deformed shapes.

Marcos Clapés, Manuel González-Hidalgo, Arnau Mir-Torres, Pere A. Palmer-Rodríguez
A 3D Shape Descriptor for Human Pose Recovery

This paper deals with human body pose recovery through a multi-camera system, a key task in monitoring human activity. The proposed algorithm reconstructs the 3D visual hull of the observed body and characterizes its shape with a new 3D shape descriptor. The body pose is then inferred through an original two-stage regression process. As the learning step is independent of the camera configuration, the resulting system is easy to set up. This solution is evaluated on synthetic scenes, and promising results on real images are also presented.

Laetitia Gond, Patrick Sayd, Thierry Chateau, Michel Dhome
Fast Detection and Modeling of Human-Body Parts from Monocular Video

This paper presents a novel and fast scheme to detect different body parts in human motion. Using monocular video sequences, trajectory estimation and body modeling of moving humans are combined in a co-operating processing architecture. More specifically, for every individual person, features of body ratio, silhouette and appearance are integrated into a hybrid model to detect body parts. The conventional assumption of upright body posture is not required. We also present a new algorithm for accurately finding the center point of the human body. The body configuration is finally described by a skeleton model. The feasibility and accuracy of the proposed scheme are analyzed by evaluating its performance for various sequences with different subjects and motion types (walking, pointing, kicking, leaping and falling). Our detection system achieves nearly real-time performance (around 10 frames/second).

Weilun Lao, Jungong Han, Peter H. N. de With
Kinetic Pseudo-energy History for Human Dynamic Gestures Recognition

In this paper we present a new approach, based on the kinetic status history, to automatically determine the starting and ending instants of human dynamic gestures. This method makes it possible to distinguish static or quasi-static poses from dynamic actions during real-time human motion capture, enabling more complex Human-Computer Interaction (HCI). Alongside this procedure, we also present a novel method to recognize dynamic gestures independently of the velocity with which they are performed. The efficiency of this approach is tested with gestures captured with a triple-axis accelerometer and recognized with different statistical classifiers, obtaining satisfactory results for real-time applications.

Luis Unzueta, Oscar Mena, Basilio Sierra, Ángel Suescun
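The segmentation idea in the abstract above, finding gesture start and end instants from a kinetic pseudo-energy, can be sketched as a simple threshold test. Everything below (the threshold, the minimum-length rule, the per-frame velocity input) is an illustrative assumption; the paper's actual criterion is not given in the abstract:

```python
def kinetic_pseudo_energy(velocity):
    # Sum of squared velocity components; a mass factor is irrelevant
    # when only comparing against a threshold.
    return sum(v * v for v in velocity)

def segment_gestures(frames, threshold, min_len=2):
    """Return (start, end) index pairs where the pseudo-energy stays
    above `threshold` for at least `min_len` consecutive frames.

    frames: sequence of per-frame velocity vectors (e.g. derived from
    accelerometer readings). Placeholder rule, not the paper's method.
    """
    segments, start = [], None
    for i, v in enumerate(frames):
        active = kinetic_pseudo_energy(v) > threshold
        if active and start is None:
            start = i                      # gesture begins
        elif not active and start is not None:
            if i - start >= min_len:       # gesture ends, keep if long enough
                segments.append((start, i - 1))
            start = None
    if start is not None and len(frames) - start >= min_len:
        segments.append((start, len(frames) - 1))
    return segments
```

Frames falling below the threshold correspond to the static or quasi-static poses the method separates from dynamic actions.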
An Improved Algorithm for Estimating the ICA Model Concerning the Convergence Rate

The aim of the present paper is to propose an estimation algorithm for the ICA model based on successive approximations. The convergence rate of the successive approximations method is substantiated for the two-dimensional case, which is of practical interest, and we establish the performance of the proposed algorithm in estimating the independent components. A comparative analysis is carried out, and experimentally derived conclusions on the performance of the proposed method for signal recognition applications are drawn in the last section of the paper.

Doru Constantin, Luminita State
Automatic Adjustment of Rigs to Extracted Skeletons

In animation, rigging a character is an elaborate and time-consuming task. A rig is developed for a specific character and cannot be reused with other meshes. In this paper we present a method to automatically adjust a human-like character rig to an arbitrary human-like 3D mesh, using a skeleton extracted from the input mesh. Our method is based on the selection and extraction of feature points to find a correspondence between the extracted skeleton and the animation rig.

Jorge E. Ramirez, Xavier Lligadas, Antonio Susin
Real-Time Recognition of Human Gestures for 3D Interaction

A fundamental natural interaction concept is not yet fully exploited in most existing human-computer interfaces. Recent technological advances have made it possible to naturally and significantly enhance interface perception by means of visual inputs, the so-called Vision-Based Interaction (VBI). In this paper, we present a gesture recognition algorithm where the user's movements are obtained through a real-time vision-based motion capture system. Specifically, we focus on recognizing user motions that carry a particular meaning, that is, gestures. Defining an appropriate representation of the user's motions based on a temporal posture parameterization, we apply non-parametric techniques to learn and recognize the user's gestures in real time. This recognition scheme has been tested for controlling a classic computer videogame. The results show excellent performance in on-line classification, and the method's computational simplicity makes a real-time learning phase possible.

Antoni Jaume-i-Capó, Javier Varona, Francisco J. Perales
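A common non-parametric recognizer of the kind the abstract above refers to is a nearest-neighbour vote over labelled examples. The sketch below assumes fixed-length posture feature vectors and a k-NN rule; the paper's actual temporal posture parameterization is not specified in the abstract:

```python
import math

def euclidean(a, b):
    # Distance between two fixed-length posture feature vectors.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_classify(sample, training, k=3):
    """Label a motion sample by majority vote among its k nearest
    training examples.

    training: list of (feature_vector, gesture_label) pairs.
    """
    neighbours = sorted(training, key=lambda t: euclidean(sample, t[0]))[:k]
    labels = [label for _, label in neighbours]
    return max(set(labels), key=labels.count)
```

Because classification is just a distance search over stored examples, new gestures can be learned at runtime by appending to the training list, which matches the real-time learning phase the abstract highlights.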
Effective Emotional Classification Combining Facial Classifiers and User Assessment

An effective method for the automatic classification of facial expressions into emotional categories is presented. The system classifies the user's facial expression in terms of Ekman's six universal emotions (plus the neutral one), giving a membership confidence value to each emotional category. The method is capable of analysing any subject, male or female, of any age and ethnicity. The classification strategy is based on a combination (weighted majority voting) of the five most used classifiers. Another significant difference from other works is that human assessment is taken into account in the evaluation of the results. The information obtained from the users' classification makes it possible to verify the validity of our results and to increase the performance of our method.

Isabelle Hupont, Sandra Baldassarri, Rafael Del Hoyo, Eva Cerezo
Online and Offline Fingerprint Template Update Using Minutiae: An Experimental Comparison

Although the template fingerprint collected during the registration phase of personal verification systems can in principle be considered representative of the subject's identity, some recent works have pointed out that it is not fully able to follow the intra-class variations of the fingerprint shape. Accordingly, making these systems adaptive to these variations is one of the most interesting problems, often called the "template updating" problem. In this paper, two different approaches to fingerprint template updating are compared experimentally. The first one, already proposed in other works, relies on the concept of "online" template updating, that is, the fingerprint template is updated while the system is operating. As an alternative, we propose "offline" template updating, which requires the collection of a representative batch of samples while the system is operating; these samples contribute to template updating when the system is offline, that is, not operating. Preliminary experiments carried out on the FVC data sets point out some differences between the investigated approaches.

Biagio Freni, Gian Luca Marcialis, Fabio Roli
Users with Disabilities: Maximum Control with Minimum Effort

People with disabilities can benefit greatly from services provided by computers and robots. Access to remote communications and information, as well as to interpersonal communication and environmental control, is assisted by current ubiquitous computers, wired and wireless networks, and intelligent environments. Sensory, physical and/or cognitive restrictions on interacting with computers can be avoided by means of alternative interaction devices and procedures. Nevertheless, alternative methods are usually much slower than standard communications, frequently leading users with disabilities into unbalanced or unsafe situations. Therefore, the main challenge of human-machine interaction systems that have to be used by people with disabilities is to obtain the maximum communication and control with the minimum physical and cognitive effort from the user. This lecture overviews the main techniques used to optimize the control and communication flow, resulting in higher user satisfaction and security.

Julio Abascal
Backmatter
Metadata
Title
Articulated Motion and Deformable Objects
Edited by
Francisco J. Perales
Robert B. Fisher
Copyright Year
2008
Publisher
Springer Berlin Heidelberg
Electronic ISBN
978-3-540-70517-8
Print ISBN
978-3-540-70516-1
DOI
https://doi.org/10.1007/978-3-540-70517-8
