
2013 | Book

Computer Analysis of Images and Patterns

15th International Conference, CAIP 2013, York, UK, August 27-29, 2013, Proceedings, Part I

Edited by: Richard Wilson, Edwin Hancock, Adrian Bors, William Smith

Publisher: Springer Berlin Heidelberg

Book series: Lecture Notes in Computer Science


About this book

The two volume set LNCS 8047 and 8048 constitutes the refereed proceedings of the 15th International Conference on Computer Analysis of Images and Patterns, CAIP 2013, held in York, UK, in August 2013. The 142 papers presented were carefully reviewed and selected from 243 submissions. The scope of the conference spans the following areas: 3D TV, biometrics, color and texture, document analysis, graph-based methods, image and video indexing and database retrieval, image and video processing, image-based modeling, kernel methods, medical imaging, mobile multimedia, model-based vision approaches, motion analysis, natural computation for digital imagery, segmentation and grouping, and shape representation and analysis.

Table of Contents

Frontmatter
Biomedical Imaging: A Computer Vision Perspective

Many computer vision algorithms have been successfully adapted and applied to biomedical imaging applications. However, biomedical computer vision is far more than a mere application field: it is a wide field with huge potential for developing novel concepts and algorithms, and can be seen as a driving force for computer vision research. To emphasize this view, we consider a variety of important topics in biomedical imaging and discuss, by way of example, some of the challenges and the related concepts, techniques, and algorithms.

Xiaoyi Jiang, Mohammad Dawood, Fabian Gigengack, Benjamin Risse, Sönke Schmid, Daniel Tenbrinck, Klaus Schäfers
Rapid Localisation and Retrieval of Human Actions with Relevance Feedback

As the growing volume of multimedia data online demands more sophisticated methods of organisation, we present a practical system for rapid localisation and retrieval of human actions from large video databases. We first temporally segment the database and calculate a histogram-match score for each segment against the query. High-scoring adjacent segments are joined into candidate localised regions using a noise-robust localisation algorithm, and each candidate region is then ranked against the query. Experiments show that this method surpasses the efficiency of previous attempts to perform similar action searches with localisation. We also demonstrate how results can be enhanced using relevance feedback, considering how relevance feedback can be effectively applied in the context of localisation.

Simon Jones, Ling Shao
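The segment-scoring and joining steps described in the abstract above can be sketched roughly as follows; the histograms, the threshold, and the mean-score ranking rule are illustrative assumptions, not the authors' exact pipeline:

```python
def hist_intersection(h1, h2):
    """Histogram-intersection similarity, normalised by the segment's mass."""
    overlap = sum(min(a, b) for a, b in zip(h1, h2))
    total = sum(h1) or 1
    return overlap / total

def candidate_regions(segment_hists, query_hist, threshold=0.5):
    """Score each temporal segment against the query, join adjacent
    high-scoring segments into candidate regions, and rank the regions."""
    scores = [hist_intersection(h, query_hist) for h in segment_hists]
    regions, start = [], None
    for i, s in enumerate(scores):
        if s >= threshold and start is None:
            start = i
        elif s < threshold and start is not None:
            regions.append((start, i - 1))
            start = None
    if start is not None:
        regions.append((start, len(scores) - 1))
    # Rank candidate regions by their mean match score against the query.
    regions.sort(key=lambda r: -sum(scores[r[0]:r[1] + 1]) / (r[1] - r[0] + 1))
    return regions

query = [4, 2, 0, 1]
segments = [[0, 1, 5, 0], [4, 2, 0, 1], [3, 2, 1, 1], [0, 0, 6, 1]]
print(candidate_regions(segments, query))  # → [(1, 2)]
```

Here the middle two segments score above the threshold and are joined into a single candidate region.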
Deformable Shape Reconstruction from Monocular Video with Manifold Forests

A common approach to recovering the structure of a 3D deformable scene and the camera motion from uncalibrated 2D video sequences is to assume that shapes can be accurately represented in linear subspaces. These methods are simple and have proven effective for reconstructing objects with relatively small deformations, but they have considerable limitations when the deformations are large or complex. This paper describes a novel approach to the reconstruction of deformable objects utilising a manifold decision forest technique. The key contribution of this work is the use of random decision forests for shape manifold learning. The learned manifold defines constraints imposed on the reconstructed shapes. Due to the nonlinear structure of the learned manifold, this approach is better suited to large and complex object deformations than linear constraints.

Lili Tao, Bogdan J. Matuszewski
Multi-SVM Multi-instance Learning for Object-Based Image Retrieval

Object-based image retrieval, in which a user is interested only in particular objects within the images, has been an active research topic in recent years. Recently proposed methods try to use both image- and region-level features comprehensively for more satisfactory performance, but they either cannot fully exploit the relationship between the two kinds of features or incur a heavy computational load. In this paper, adopting the support vector machine (SVM) as the basic classifier, a novel multi-instance learning method is proposed. To deal with the different forms of image- and region-level representations, standard SVM and multi-instance SVM are utilized respectively. Moreover, the relationship between images and their segmented regions is also taken into account. A unified optimization framework is developed to incorporate all the available information, and an efficient iterative solution is introduced. Experimental results on the benchmark data set demonstrate the effectiveness of our proposal.

Fei Li, Rujie Liu, Takayuki Baba
Maximizing Edit Distance Accuracy with Hidden Conditional Random Fields

Handwriting recognition aims at predicting a sequence of characters from an image of handwritten text. The main approaches rely on learning statistical models such as Hidden Markov Models or Conditional Random Fields, whose quality is measured through character and word error rates, although they are usually not trained to optimize these criteria. We propose an efficient method for learning Hidden Conditional Random Fields that optimizes the error rate within the large-margin framework.

Antoine Vinel, Thierry Artières
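The criterion being optimized above is the character (or word) error rate, which derives from the edit (Levenshtein) distance between the predicted and reference sequences; a minimal sketch of that metric:

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between a reference and a hypothesis string,
    computed row by row with dynamic programming."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (r != h)))  # substitution
        prev = curr
    return prev[-1]

def character_error_rate(ref, hyp):
    """Edit distance normalised by the reference length."""
    return edit_distance(ref, hyp) / max(len(ref), 1)

print(edit_distance("handwriting", "handwritten"))  # → 3
```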
Background Recovery by Fixed-Rank Robust Principal Component Analysis

Background recovery is a very important theme in computer vision applications. Recent research shows that robust principal component analysis (RPCA) is a promising approach for solving problems such as noise removal, video background modeling, and removal of shadows and specularity. RPCA utilizes the fact that the background is common to multiple views of a scene, and attempts to decompose the data matrix constructed from the input images into a low-rank matrix and a sparse matrix. This is possible if the sparse matrix is sufficiently sparse, which may not hold in computer vision applications. Moreover, algorithmic parameters need to be fine-tuned to yield accurate results. This paper proposes a fixed-rank RPCA algorithm for solving background recovery problems whose low-rank matrices have known ranks. Comprehensive tests show that, by fixing the rank of the low-rank matrix to a known value, the fixed-rank algorithm produces more reliable and accurate results than existing low-rank RPCA algorithms.

Wee Kheng Leow, Yuan Cheng, Li Zhang, Terence Sim, Lewis Foo
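A minimal sketch of the fixed-rank idea, assuming a simple alternating scheme (rank-truncated SVD for the low-rank part, soft-thresholding for the sparse part); the paper's actual algorithm and parameter choices may differ:

```python
import numpy as np

def soft_threshold(x, tau):
    """Shrink entries toward zero; this promotes sparsity in S."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def fixed_rank_rpca(D, rank, lam=0.1, n_iter=50):
    """Alternate a rank-`rank` truncated SVD for the low-rank part L
    with soft-thresholding for the sparse part S, so that D ≈ L + S."""
    S = np.zeros_like(D)
    for _ in range(n_iter):
        U, sigma, Vt = np.linalg.svd(D - S, full_matrices=False)
        L = (U[:, :rank] * sigma[:rank]) @ Vt[:rank]
        S = soft_threshold(D - L, lam)
    return L, S

# Toy data: a rank-1 "background" plus a few sparse "foreground" spikes.
rng = np.random.default_rng(0)
background = np.outer(rng.random(20), rng.random(30))
corrupted = background.copy()
corrupted[5, 7] += 5.0
corrupted[12, 3] -= 4.0
L, S = fixed_rank_rpca(corrupted, rank=1)
print(np.abs(L - background).max())
```

Fixing the rank to the known value (1 here) is what distinguishes this sketch from generic RPCA, where the rank is regularised rather than imposed.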
Manifold Learning and the Quantum Jensen-Shannon Divergence Kernel

The quantum Jensen-Shannon divergence kernel [1] was recently introduced in the context of unattributed graphs, where it was shown to outperform several commonly used alternatives. In this paper, we study the separability properties of this kernel and propose a way to compute a low-dimensional kernel embedding where the separation of the different classes is enhanced. The idea stems from the observation that multidimensional scaling embeddings of this kernel show a strong horseshoe-shaped distribution, a pattern which is known to arise when long-range distances are not estimated accurately. Here we propose to use Isomap, which relies only on local distance information, to embed the graphs into a new vector space with higher class separability. The experimental evaluation shows the effectiveness of the proposed approach.

Luca Rossi, Andrea Torsello, Edwin R. Hancock
Spatio-temporal Manifold Embedding for Nearly-Repetitive Contents in a Video Stream

This paper presents a framework to identify and align nearly-repetitive contents in a video stream using spatio-temporal manifold embedding. The similarities observed in frame sequences are captured by defining two types of correlation graphs: an intra-correlation graph in the spatial domain and an inter-correlation graph in the temporal domain. The presented work is novel in that it does not use any prior information, such as the length and contents of the repetitive scenes. No template is required and no learning process is involved; instead, the approach analyses the video contents using a spatio-temporal extension of SIFT combined with a coding technique. The underlying structure is then reconstructed using manifold embedding. Experiments on a TRECVID rushes video showed that the framework improved the embedding of repetitive sequences over conventional methods, and was thus able to identify repetitive contents in complex scenes.

Manal Al Ghamdi, Yoshihiko Gotoh
Spatio-temporal Human Body Segmentation from Video Stream

We present a framework in which human body volume is extracted from a video stream. Following the line of object-tracking-based methods, our approach detects and segments human body regions by jointly embedding parts and pixels. For all extracted segments, appearance and shape models are learned in order to automatically extract the foreground objects across a sequence of video frames. We evaluated the framework using a challenging set of video clips, consisting of office scenes, selected from the Hollywood2 dataset. The outcome of the experiments indicates that the approach creates better segmentations than recent work.

Nouf Al Harbi, Yoshihiko Gotoh
Sparse Depth Sampling for Interventional 2-D/3-D Overlay: Theoretical Error Analysis and Enhanced Motion Estimation

Patient motion compensation is challenging for dynamic 2-D/3-D overlay in interventional procedures. A first motion compensation approach based on depth-layers has been recently proposed, where 3-D motion can be estimated by tracking feature points on 2-D X-ray images. However, the sparse depth estimation introduces a systematic error. In this paper, we present a theoretical analysis on the systematic error and propose an enhanced motion estimation strategy accordingly. The simulation experiments show that the proposed approach yields a reduced 3-D correction error that is consistently below 2 mm, in comparison to a mean of 6 mm with high variance using the previous approach.

Jian Wang, Christian Riess, Anja Borsdorf, Benno Heigl, Joachim Hornegger
Video Synopsis Based on a Sequential Distortion Minimization Method

The main goal of the proposed method is to select from a video the most “significant” frames for broadcast, preserving content by minimising a potential-distortion criterion. Initially, the video is divided into shots and the number of synopsis frames per shot is computed based on a criterion that takes the variation of visual content into account. Next, the most “significant” frames are sequentially selected so that the visual-content distortion between the initial video and the synoptic video is minimized. Experimental results and comparisons with other methods on several real-life and animation video sequences illustrate the high performance of the proposed scheme.

Costas Panagiotakis, Nelly Ovsepian, Elena Michael
A Graph Embedding Method Using the Jensen-Shannon Divergence

Riesen and Bunke recently proposed a novel dissimilarity-based approach for embedding graphs into a vector space. One drawback of their approach is the computational cost of the graph edit operations required to compute the dissimilarity between graphs. In this paper we explore whether the Jensen-Shannon divergence can be used as a means of computing a fast similarity measure between a pair of graphs. We commence by computing the Shannon entropy of a graph associated with a steady-state random walk. We establish a family of prototype graphs by using an information-theoretic approach to construct generative graph prototypes. With the required graph entropies and a family of prototype graphs to hand, the Jensen-Shannon divergence between a sample graph and a prototype graph can be computed; it is defined as the Jensen-Shannon divergence between the pair of separate graphs and a composite structure formed by the pair. Since the required entropies of the graphs can be efficiently computed, the proposed graph embedding avoids the burdensome graph edit operations. We explore our approach on several graph datasets abstracted from computer vision and bioinformatics databases.

Lu Bai, Edwin R. Hancock, Lin Han
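The entropy-based similarity described above can be illustrated as follows. The steady-state distribution of a random walk on an undirected graph assigns node v the probability degree(v) / 2|E|; the composite structure is taken here to be the disjoint union of the two graphs, which is an assumption and may differ from the paper's construction:

```python
import math

def walk_entropy(edges):
    """Shannon entropy of the steady-state random walk on an undirected
    graph: the stationary probability of node v is degree(v) / 2|E|."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    total = sum(degree.values())
    return -sum((d / total) * math.log(d / total) for d in degree.values())

def jensen_shannon_divergence(edges_a, edges_b):
    """JSD of two graphs: entropy of a composite (here the disjoint
    union, an assumption) minus the mean of the individual entropies."""
    union = [((u, 0), (v, 0)) for u, v in edges_a] + \
            [((u, 1), (v, 1)) for u, v in edges_b]
    return walk_entropy(union) - 0.5 * (walk_entropy(edges_a) + walk_entropy(edges_b))

triangle = [(0, 1), (1, 2), (2, 0)]
path = [(0, 1), (1, 2)]
print(round(jensen_shannon_divergence(triangle, path), 4))  # → 0.6789
```

No edit operations are needed: only degree counts, which is the efficiency argument the abstract makes.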
Mixtures of Radial Densities for Clustering Graphs

We address the problem of unsupervised learning on graphs. The contribution is twofold: (1) we propose an EM algorithm for estimating the parameters of a mixture of radial densities on graphs on the basis of the graph orbifold framework; and (2) we compare orbifold-based clustering algorithms including the proposed EM algorithm against state-of-the-art methods based on pairwise dissimilarities. The results show that orbifold-based clustering methods complement the existing arsenal of clustering methods on graphs.

Brijnesh J. Jain
Complexity Fusion for Indexing Reeb Digraphs

In this paper we combine different quantifications of heat-diffusion thermodynamic depth on digraphs in order to match directed Reeb graphs for 3D shape recognition. Since different real-valued functions can infer different Reeb graphs for the same shape, we exploit a set of quasi-orthogonal representations for comparing the sets of digraphs which encode the 3D shapes. To do so, we fuse complexities computed with the heat-flow thermodynamic depth approach for directed graphs, which has been recently proposed but not yet used for discrimination. In this regard, we do not rely on attributed graphs, as is usual, because we want to explore the limits of purely topological information for structural pattern discrimination. Our experimental results show that (a) our approach is competitive with information-theoretic selection of spectral features, and (b) it outperforms the discriminability of the von Neumann entropy embedded in a thermodynamic depth, and thus spectrally robust, approach.

Francisco Escolano, Edwin R. Hancock, Silvia Biasotti
Analysis of Wave Packet Signature of a Graph

In this paper we investigate a new approach for characterizing both weighted and unweighted graphs using the solution of the edge-based wave equation. The reason for using the wave equation is that it provides a richer and potentially more expressive means of characterizing graphs than the more widely studied heat equation. The wave equation on a graph is defined using the edge-based Laplacian. We commence by defining the eigensystem of the edge-based Laplacian, give a solution of the wave equation, and define signatures for both weighted and unweighted graphs. In the experimental section we apply the proposed method to real-world data and compare its performance with other state-of-the-art methods.

Furqan Aziz, Richard C. Wilson, Edwin R. Hancock
Hearing versus Seeing Identical Twins

Identical twins pose a great challenge to face recognition systems due to their similar appearance. Nevertheless, even though twins may look alike, we believe they speak differently, and hence we propose to use their voice patterns to distinguish between them. Voice is a natural signal to produce and combines physiological and behavioral biometrics, which makes it suitable for twin verification. In this paper, we collect an audio-visual database from 39 pairs of identical twins. Three typical types of voice features are investigated: Pitch, Linear Prediction Coefficients (LPC) and Mel Frequency Cepstral Coefficients (MFCC). For each type of voice feature, we use a Gaussian Mixture Model to model the voice spectral distribution of each subject, and then employ the likelihood ratio of the probe belonging to different classes for verification. The experimental results on this database demonstrate a significant improvement from using voice over facial appearance to distinguish between identical twins. Furthermore, we show that by fusing both types of biometrics, recognition accuracy can be improved further.

Li Zhang, Shenggao Zhu, Terence Sim, Wee Kheng Leow, Hossein Najati, Dong Guo
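A toy sketch of the likelihood-ratio verification step; for brevity a single 1-D Gaussian per subject stands in for the paper's GMMs over spectral features, and all data values are hypothetical:

```python
import math

def gaussian_log_likelihood(samples, mean, var):
    """Total log-likelihood of 1-D samples under N(mean, var)."""
    return sum(-0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)
               for x in samples)

def fit(samples):
    """Maximum-likelihood mean and variance of a 1-D sample."""
    m = sum(samples) / len(samples)
    v = sum((x - m) ** 2 for x in samples) / len(samples)
    return m, v

def verify(probe, claimed_model, impostor_model, threshold=0.0):
    """Accept when the log-likelihood ratio favours the claimed identity."""
    llr = (gaussian_log_likelihood(probe, *claimed_model)
           - gaussian_log_likelihood(probe, *impostor_model))
    return llr > threshold

# Hypothetical pitch values (Hz) for each twin.
twin_a = fit([108.0, 112.0, 110.0, 109.0])
twin_b = fit([118.0, 122.0, 121.0, 119.0])
probe = [111.0, 109.5, 110.5]
print(verify(probe, twin_a, twin_b))  # → True (probe matches twin A)
```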
Voting Strategies for Anatomical Landmark Localization Using the Implicit Shape Model

We address the problem of anatomical landmark localization using monocular camera information only. For person detection, the Implicit Shape Model (ISM) is a well-known method. Recently it was shown that the same local features that are used to detect persons can also be used to give rough estimates of anatomical landmark locations. However, the landmark localization accuracy of the original ISM is far from optimal. We show that a direct application of the ISM to the problem of landmark localization leads to poorly localized vote distributions. In this context, we propose three alternative voting strategies: the use of a reference point, a simple observation-vector filtering heuristic, and an observation-vector weight learning algorithm. These strategies can be combined to further increase localization accuracy. An evaluation on the UMPM benchmark shows that these new voting strategies generate compact and monotonically decreasing vote distributions centered around the ground-truth locations of the landmarks. As a result, the ratio of correct votes increases from only 9.3% for the original ISM up to 42.1% when all voting strategies are combined.

Jürgen Brauer, Wolfgang Hübner, Michael Arens
Evaluating the Impact of Color on Texture Recognition

State-of-the-art texture descriptors typically operate on grey-scale images while ignoring color information. A common way to obtain a joint color-texture representation is to combine the two visual cues at the pixel level. However, such an approach provides sub-optimal results for the texture categorisation task.

In this paper we investigate how to optimally exploit color information for texture recognition. We evaluate a variety of color descriptors, popular in image classification, for texture categorisation. In addition we analyze different fusion approaches to combine color and texture cues. Experiments are conducted on the challenging scene and 10-class texture datasets. Our experiments clearly suggest that in all cases color names provide the best performance, and that late fusion is the best strategy to combine color and texture. Selecting the best color descriptor with the optimal fusion strategy provides a gain of 5% to 8% compared to texture alone on the scene and texture datasets.

Fahad Shahbaz Khan, Joost van de Weijer, Sadiq Ali, Michael Felsberg
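Late fusion, the strategy found best above, combines the per-class scores of separately trained color and texture classifiers rather than their pixel-level features; a minimal sketch with hypothetical class scores:

```python
def late_fusion(color_scores, texture_scores, alpha=0.5):
    """Weighted sum of per-class scores from the two separate cues."""
    return {c: alpha * color_scores[c] + (1 - alpha) * texture_scores[c]
            for c in color_scores}

def predict(color_scores, texture_scores, alpha=0.5):
    """Class with the highest fused score."""
    fused = late_fusion(color_scores, texture_scores, alpha)
    return max(fused, key=fused.get)

color = {"brick": 0.7, "marble": 0.2, "wood": 0.1}
texture = {"brick": 0.3, "marble": 0.6, "wood": 0.1}
print(predict(color, texture))             # → brick (cues weighted equally)
print(predict(color, texture, alpha=0.2))  # → marble (texture dominates)
```

The weight alpha is the fusion parameter one would tune per dataset; the early-fusion alternative would instead concatenate the descriptors before classification.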
Temporal Self-Similarity for Appearance-Based Action Recognition in Multi-View Setups

We present a general data-driven method for multi-view action recognition relying on the appearance of dynamic systems captured from different viewpoints. Thus, we do not depend on 3D reconstruction, foreground segmentation, or accurate detections. We extend earlier approaches based on Temporal Self-Similarity Maps with new low-level image features and similarity measures. Gaussian Process classification in combination with Histogram Intersection Kernels serves as a powerful tool in our approach. Experiments performed on our new combined multi-view dataset as well as on the widely used IXMAS dataset show promising and competitive results.

Marco Körner, Joachim Denzler
Adaptive Pixel/Patch-Based Stereo Matching for 2D Face Recognition

In this paper, we propose adaptive pixel/patch-based stereo matching for 2D face recognition. We do not perform 3D reconstruction but instead define a measure of the similarity of two 2D face images. After rectifying the two images by epipolar geometry, we match them using this similarity for face recognition. The proposed approach has been tested on the CMU PIE and FERET databases and demonstrates superior performance compared to existing methods in real-world situations, including changes in pose and illumination.

Rui Liu, Weiguo Feng, Ming Zhu
A Machine Learning Approach for Displaying Query Results in Search Engines

In this paper, we propose an approach that displays the results of a search engine query in a more effective way. Each web page retrieved by the search engine is subjected to a summarization process and the important content is extracted. The system consists of four stages. First, the hierarchical structures of documents are extracted. Then the lexical chains in documents are identified to build coherent summaries. The document structures and lexical chains are used to learn a summarization model by the next component. Finally, the summaries are formed and displayed to the user. Experiments on two datasets showed that the method significantly outperforms traditional search engines.

Tunga Güngör
A New Pixel-Based Quality Measure for Segmentation Algorithms Integrating Precision, Recall and Specificity

There are several approaches for performance evaluation of image processing algorithms in video-based surveillance systems: Precision/Recall, Receiver Operating Characteristic (ROC), F-measure, Jaccard Coefficient, etc. These measures can be used to find good values for the input parameters of image segmentation algorithms. Different measures can yield different parameter values, considered optimal by one criterion but not by another. Most of the time, the measures express a compromise between two of the three aspects that matter for quality assessment: Precision, Recall and Specificity. In this paper, we propose a new 3-dimensional measure (D_prs) which takes all three aspects into account. It can be considered a 3D generalization of 2D ROC analysis and Precision/Recall curves. To estimate the impact of parameters on segmentation quality, we study the behavior of this measure and compare it with several classical measures. Both objective and subjective evaluations confirm that our new measure allows more stable parameters to be determined than classical criteria, and better segmentations of images to be obtained.

Kannikar Intawong, Mihaela Scuturici, Serge Miguet
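The paper's exact D_prs definition is not reproduced here, but the idea of a measure combining all three aspects can be illustrated as the distance of the point (Precision, Recall, Specificity) from the ideal corner (1, 1, 1) of the unit cube; this is an illustrative stand-in, not the authors' formula:

```python
import math

def prs(tp, fp, tn, fn):
    """Precision, Recall and Specificity from confusion-matrix counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return precision, recall, specificity

def d_prs(tp, fp, tn, fn):
    """Euclidean distance of (P, R, S) from the ideal point (1, 1, 1);
    smaller is better.  An assumption, not the paper's exact measure."""
    p, r, s = prs(tp, fp, tn, fn)
    return math.sqrt((1 - p) ** 2 + (1 - r) ** 2 + (1 - s) ** 2)

print(round(d_prs(tp=90, fp=10, tn=80, fn=20), 4))  # → 0.2354
```

A parameter sweep would then pick the segmentation parameters minimising this distance, rather than trading off only two of the three aspects.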
A Novel Border Identification Algorithm Based on an “Anti-Bayesian” Paradigm

Border Identification (BI) algorithms, a subset of Prototype Reduction Schemes (PRS), aim to reduce the number of training vectors so that the reduced set (the border set) contains only those patterns which lie near the border of the classes and have sufficient information to perform a meaningful classification. However, the true border patterns (“near” borders) cannot perform the task independently, as they are not always able to distinguish the testing samples. Thus, researchers have worked on ways to strengthen the “border” set. A recent development in this field adds more border patterns, i.e., the “far” borders, to the border set, and this process continues until it reaches a stage at which the classification accuracy no longer increases; in this case, the cardinality of the border set is relatively high. In this paper, we design a novel BI algorithm based on a new definition of the term “border”: we opt to select the patterns which lie at the border of the alternate class as the border patterns. Thus, those patterns which are neither on the true discriminant nor too close to the central position of the distributions are added to the “border” set. The border patterns selected in this manner, which are very small in number (for example, five from both classes), have the potential to perform a classification which is comparable to that obtained by well-known traditional classifiers like the SVM, and very close to the optimal Bayes’ bound.

Anu Thomas, B. John Oommen
Assessing the Effect of Crossing Databases on Global and Local Approaches for Face Gender Classification

This paper presents a comprehensive statistical study of the suitability of global and local approaches for face gender classification from frontal non-occluded faces. A realistic scenario is simulated with cross-database experiments where acquisition and demographic conditions considerably vary between training and test images. The performances of three classifiers (1-NN, PCA+LDA and SVM) using two types of features (grey levels and PCA) are compared for the two approaches. Supported by three statistical tests, the main conclusion extracted from the experiments is that if training and test faces are acquired under different conditions from diverse populations, no significant differences exist between global and local solutions. However, global methods outperform local models when training and test sets contain only images of the same database.

Yasmina Andreu Cabedo, Ramón A. Mollineda Cárdenas, Pedro García-Sevilla
BRDF Estimation for Faces from a Sparse Dataset Using a Neural Network

We present a novel five-source near-infrared photometric stereo 3D face capture device. The accuracy of the system is demonstrated by comparison with ground truth from a commercial 3D scanner. We also use the data from the five captured images to model the Bidirectional Reflectance Distribution Function (BRDF) in order to synthesise images from novel lighting directions. A comparison of these synthetic images, created by modelling the BRDF with a three-layer neural network, a linear interpolation method, and the Lambertian model, shows that the neural network produces the most photo-realistic results.

Mark F. Hansen, Gary A. Atkinson, Melvyn L. Smith
Comparison of Leaf Recognition by Moments and Fourier Descriptors

We test various features for the recognition of leaves of woody species. We compare Fourier descriptors, Zernike moments, Legendre moments and Chebyshev moments, all computed from the leaf boundary only. Experimental evaluation on real data indicates that Fourier descriptors slightly outperform the other tested features.

Tomáš Suk, Jan Flusser, Petr Novotný
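Boundary-only Fourier descriptors of the kind compared above can be sketched as follows: the (x, y) boundary points are treated as complex numbers, a DFT is taken, the DC term is dropped to remove translation, and the remaining magnitudes are divided by |c1| to remove scale. The sample boundary is a toy shape, not leaf data:

```python
import cmath

def fourier_descriptors(boundary, n_coeffs=8):
    """Translation- and scale-invariant Fourier descriptors of a closed
    boundary given as a list of (x, y) points."""
    z = [complex(x, y) for x, y in boundary]
    n = len(z)
    coeffs = [sum(z[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                  for t in range(n)) / n
              for k in range(n_coeffs)]
    scale = abs(coeffs[1]) or 1.0
    # Drop the DC term (translation) and divide by |c1| (scale).
    return [abs(c) / scale for c in coeffs[1:]]

square = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1)]
shifted = [(x + 5, y - 3) for x, y in square]
print([round(d, 3) for d in fourier_descriptors(square, 4)])
diff = max(abs(a - b) for a, b in zip(fourier_descriptors(square, 4),
                                      fourier_descriptors(shifted, 4)))
print(diff < 1e-9)  # → True: the descriptors are translation-invariant
```

Using magnitudes only also discards the starting-point and rotation phase, which is the usual choice for shape matching.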
Dense Correspondence of Skull Models by Automatic Detection of Anatomical Landmarks

Determining dense correspondence between 3D skull models is a very important but difficult task due to the complexity of the skulls. Non-rigid registration is at present the predominant approach for dense correspondence. It registers a reference model to a target model and then resamples the target according to the reference. Methods that use manually marked corresponding landmarks are accurate, but manual marking is tedious and potentially error prone. On the other hand, methods that automatically detect correspondence based on local geometric features are sensitive to noise and outliers, which can adversely affect their accuracy. This paper presents an automatic dense correspondence method for skull models that combines the strengths of both approaches. First, anatomical landmarks are automatically and accurately detected to serve as hard constraints for non-rigid registration. They ensure that the correspondence is anatomically consistent and accurate. Second, control points are sampled on the skull surfaces to serve as soft constraints for non-rigid registration. They provide additional local shape constraints for a closer match between the reference and the target. Test results show that, by combining both approaches, our algorithm can achieve more accurate automatic dense correspondence.

Kun Zhang, Yuan Cheng, Wee Kheng Leow
Detection of Visual Defects in Citrus Fruits: Multivariate Image Analysis vs Graph Image Segmentation

This paper presents an application of visual quality control in orange post-harvesting, comparing two different approaches that correspond to two very different methodologies in the area of Computer Vision. The first approach is based on Multivariate Image Analysis (MIA) and was originally developed for the detection of defects in random color textures. It uses Principal Component Analysis and the T² statistic to map the defective areas. The second approach is based on Graph Image Segmentation (GIS). It is an efficient segmentation algorithm that uses a graph-based representation of the image and a predicate to measure the evidence of boundaries between adjacent regions. While the MIA approach performs novelty detection of defects using a trained model of sound color textures, the GIS approach is a strictly unsupervised method with no training required on sound or defective areas. Both methods are compared through experimental work performed on a ground truth of 120 samples of citrus coming from four different cultivars. Although the GIS approach is faster and achieves better results in defect detection, the MIA method produces fewer false detections and does not need the hypothesis that the biggest area in a sample always corresponds to the non-damaged area.

Fernando López-García, Gabriela Andreu-García, José-Miguel Valiente-Gonzalez, Vicente Atienza-Vanacloig
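The GIS approach follows the classic graph-based segmentation idea of merging adjacent regions when the connecting edge weight is below an adaptive boundary predicate (roughly, the component's internal variation plus k/|C|). A 1-D toy sketch, with a hypothetical pixel row and parameter k, not the paper's 2-D implementation:

```python
def segment(pixels, k=1.0):
    """Merge neighbouring regions when the edge weight is below the
    adaptive threshold internal(C) + k/|C| of both components."""
    parent = list(range(len(pixels)))
    size = [1] * len(pixels)
    internal = [0.0] * len(pixels)  # max internal edge weight per component

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    # Edges between horizontal neighbours, sorted by intensity difference.
    edges = sorted((abs(pixels[i + 1] - pixels[i]), i, i + 1)
                   for i in range(len(pixels) - 1))
    for w, a, b in edges:
        ra, rb = find(a), find(b)
        if ra != rb and w <= min(internal[ra] + k / size[ra],
                                 internal[rb] + k / size[rb]):
            parent[rb] = ra
            size[ra] += size[rb]
            internal[ra] = max(internal[ra], internal[rb], w)
    return [find(i) for i in range(len(pixels))]

row = [10, 11, 10, 11, 50, 51, 50, 52]
print(segment(row, k=3.0))  # → [0, 0, 0, 0, 4, 4, 4, 4]
```

The large jump from 11 to 50 stays a boundary because it exceeds both components' adaptive thresholds, splitting the row into two segments.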
Domain Adaptation Based on Eigen-Analysis and Clustering, for Object Categorization

Domain adaptation (DA) is used to obtain better classification accuracy when the training and testing datasets have different distributions. This paper describes a DA algorithm that transforms data from the source domain to match the distribution of the target domain. We use eigen-analysis of the data in both domains to estimate the transformation along each dimension separately. In order to parameterize the distributions in both domains, we perform clustering separately along every dimension prior to the transformation. When applied to the task of object categorization, the proposed DA algorithm gives better results than several state-of-the-art methods.

Suranjana Samanta, Sukhendu Das
Estimating Clusters Centres Using Support Vector Machine: An Improved Soft Subspace Clustering Algorithm

In this paper, a new approach to soft subspace clustering is proposed. It is based on estimating the cluster centres using a multi-class support vector machine (SVM). The method extends the ESSC algorithm, which optimizes an objective function containing three terms: a weighted within-cluster compactness, the entropy of the weights, and a weighted between-cluster separation. First, the SVM is used to compute initial centres and partition matrices. The new formulation of the centres is then integrated in each iteration to yield new centres and membership degrees. A comparative study has been conducted on UCI datasets and different image types, and the results show the effectiveness of the suggested method.

Amel Boulemnadjel, Fella Hachouf
Fast Approximate Minimum Spanning Tree Algorithm Based on K-Means

We present a fast approximate Minimum Spanning Tree (MST) framework on the complete graph of a dataset with N points; any exact MST algorithm can be incorporated into the framework and sped up. It employs a divide-and-conquer scheme to produce an approximate MST with a theoretical time complexity of O(N^1.5), provided the incorporated exact MST algorithm runs in O(N^2) time. Experimental results show that the proposed approximate MST algorithm is computationally efficient, and its accuracy is close to that of the true MST.

Caiming Zhong, Mikko Malinen, Duoqian Miao, Pasi Fränti
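A simplified sketch of the divide-and-conquer idea: cluster the points with k-means (k ≈ √N), run an exact MST (Prim's algorithm here) inside each cluster, then add cheap connecting edges between clusters. The paper's refinement passes are omitted, and the chain-style inter-cluster linking is a crude assumption:

```python
import math
import random

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def kmeans(points, k, n_iter=10, seed=0):
    """Plain Lloyd k-means; returns the non-empty clusters."""
    rng = random.Random(seed)
    centres = rng.sample(points, k)
    for _ in range(n_iter):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[min(range(k), key=lambda i: dist(p, centres[i]))].append(p)
        centres = [(sum(x for x, _ in c) / len(c), sum(y for _, y in c) / len(c))
                   if c else centres[i] for i, c in enumerate(clusters)]
    return [c for c in clusters if c]

def prim_mst(points):
    """Exact MST on the complete graph of `points` (Prim's algorithm)."""
    in_tree, edges = {0}, []
    while len(in_tree) < len(points):
        i, j = min(((i, j) for i in in_tree for j in range(len(points))
                    if j not in in_tree),
                   key=lambda e: dist(points[e[0]], points[e[1]]))
        in_tree.add(j)
        edges.append((points[i], points[j]))
    return edges

def approx_mst(points, k=None):
    """Exact MSTs inside ~sqrt(N) clusters, plus one connecting edge
    between consecutive clusters (a crude linking heuristic)."""
    k = k or max(1, math.isqrt(len(points)))
    clusters = kmeans(points, k)
    edges = []
    for c in clusters:
        edges += prim_mst(c)
    for a, b in zip(clusters, clusters[1:]):
        edges.append(min(((p, q) for p in a for q in b),
                         key=lambda e: dist(*e)))
    return edges

rng = random.Random(1)
pts = [(rng.random(), rng.random()) for _ in range(40)]
edges = approx_mst(pts)
print(len(edges))  # → 39, i.e. N - 1 edges of a spanning tree
```

Each exact sub-MST touches only O(√N) points, which is where the O(N^1.5) overall bound comes from when the exact algorithm is quadratic.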
Fast EM Principal Component Analysis Image Registration Using Neighbourhood Pixel Connectivity

Image registration (IR) is the systematic process of aligning two images of the same or different modalities. The registration of mono- and multimodal images, e.g. magnetic resonance images, poses a particular challenge due to intensity non-uniformities (INU) and noise artefacts. Recent similarity measures, including regional mutual information (RMI) and expectation maximisation for principal component analysis with mutual information (EMPCA-MI), have sought to address this problem. EMPCA-MI incorporates neighbourhood region information to iteratively compute principal components, giving superior IR performance compared with RMI, though it is not always effective in the presence of high INU. This paper presents a modified EMPCA-MI (mEMPCA-MI) similarity measure which introduces a novel pre-processing step to exploit local spatial information using 4- and 8-pixel neighbourhood connectivity. Experimental results using diverse image datasets conclusively demonstrate the improved IR robustness of mEMPCA-MI when adopting second-order neighbourhood representations. Furthermore, mEMPCA-MI with 4-pixel connectivity is notably more computationally efficient than EMPCA-MI.

Parminder Singh Reel, Laurence S. Dooley, K. C. P. Wong, Anko Börner
Fast Unsupervised Segmentation Using Active Contours and Belief Functions

In this paper, we study Active Contour (AC) based global segmentation of vector-valued images using an evidential Kullback-Leibler (KL) distance. We investigate the evidential framework to fuse multiple features derived from vector-valued images. This formulation has two main advantages: 1) the combination of foreground/background information from the multiple channels within the same framework, and 2) the incorporation of heterogeneous knowledge and the reduction of imprecision due to noise. The statistical relation between the image channels is ensured by the Dempster-Shafer rule. We illustrate the performance of our segmentation algorithm on challenging color and textured images.

Foued Derraz, Laurent Peyrodie, Abdelmalik Taleb-Ahmed, Miloud Boussahla, Gerard Forzy
Flexible Hypersurface Fitting with RBF Kernels

This paper gives a method of flexible hypersurface fitting with RBF kernel functions. To fit a hypersurface to a given set of points in a Euclidean space, we can apply the hyperplane fitting method to the points mapped to a high-dimensional feature space. This fitting is equivalent to a one-dimensional reduction of the feature space, obtained by eliminating the linear space spanned by the eigenvector corresponding to the smallest eigenvalue of the variance-covariance matrix of the data points in the feature space. This dimension reduction is called minor component analysis (MCA), which solves the same eigenvalue problem as kernel principal component analysis but extracts the eigenvector corresponding to the least eigenvalue. In general, the feature space is set to a Euclidean space, which is a finite-dimensional Hilbert space. To consider MCA in an infinite-dimensional Hilbert space, a kernel MCA (KMCA), which leads to an MCA in a reproducing kernel Hilbert space, should be constructed. However, the representer theorem does not hold for KMCA, since an infinite number of zero eigenvalues appear in an MCA for the infinite-dimensional Hilbert space. The fitting solution is therefore not determined uniquely in the infinite-dimensional Hilbert space, in contrast to the unique solution in a finite-dimensional Hilbert space. This ambiguity seems disadvantageous because it introduces instability into the fitting, but it also enables flexible fitting. Based on this flexibility, this paper gives a hypersurface fitting method in the infinite-dimensional Hilbert space with RBF kernel functions to realize flexible hypersurface fitting. Although several eigenvectors of the matrix defined from the kernel function at each sample must be considered, we obtain a candidate for a reasonable solution from simulation results under a specific setting. Simulations show that the flexibility of our method remains effective.

Jun Fujiki, Shotaro Akaho
Gender Classification Using Facial Images and Basis Pursuit

In many social interactions, it is important to correctly recognize gender. Researchers have addressed this issue using facial images, ear images and gait. In this paper, we present an approach for gender classification using facial images based upon sparse representation and basis pursuit. In sparse representation, the training data is used to develop a dictionary based on extracted features. Classification is achieved by representing the extracted features of the test data using the dictionary. For this purpose, basis pursuit is used to find the best representation by minimizing the l1 norm. In this work, Gabor filters are used for feature extraction. Experiments are conducted on the FERET data set and the obtained results are compared with other works in this area. The results show improvement in gender classification over existing methods.
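The sparse-representation classification step can be sketched as follows: basis pursuit is cast as a linear program (splitting x = u - v with u, v ≥ 0), and the test sample is assigned to the class whose dictionary atoms reconstruct it with the smallest residual. Using `scipy.optimize.linprog` as the LP solver is an implementation choice for this sketch, not a detail from the paper, and the toy dictionary in the usage below is hypothetical.

```python
import numpy as np
from scipy.optimize import linprog

def basis_pursuit(D, y):
    """Solve min ||x||_1 subject to D x = y via the standard LP split x = u - v."""
    n = D.shape[1]
    res = linprog(np.ones(2 * n),
                  A_eq=np.hstack([D, -D]), b_eq=y,
                  bounds=(0, None), method="highs")
    u, v = res.x[:n], res.x[n:]
    return u - v

def classify(D, class_of_atom, y):
    """Sparse-representation classification: keep only each class's
    coefficients and pick the class with the smallest reconstruction residual."""
    x = basis_pursuit(D, y)
    classes = np.unique(class_of_atom)
    resid = [np.linalg.norm(y - D[:, class_of_atom == c] @ x[class_of_atom == c])
             for c in classes]
    return classes[int(np.argmin(resid))]
```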

Rahman Khorsandi, Mohamed Abdel-Mottaleb
Graph Clustering through Attribute Statistics Based Embedding

This work tackles the problem of graph clustering by an explicit embedding of graphs into vector spaces. We use an embedding methodology based on occurrence and co-occurrence statistics of representative elements of the node attributes. This embedding methodology has already been used for graph classification problems. In the current paper we investigate its applicability to the problem of clustering color-attributed graphs. The ICPR 2010 Graph Embedding Contest serves as our evaluation framework. Explicit and implicit embedding methods are evaluated in terms of their ability to cluster object images represented as attributed graphs. We compare the attribute statistics based embedding methodology to explicit and implicit embedding techniques proposed by the contest participants and show improvements on some of the datasets. We then demonstrate further improvements by means of different vectorial metrics and kernel functions on the embedded graphs.

Jaume Gibert, Ernest Valveny, Horst Bunke, Luc Brun
Graph-Based Regularization of Binary Classifiers for Texture Segmentation

In this paper, we propose to improve a recent texture-based graph regularization model used to perform image segmentation by including a binary classifier in the process. Built upon two non-local image processing techniques, the addition of a classifier gives our model the ability to weight texture features according to their relevance. The graph regularization process is then applied to the initial segmentation provided by the classifier in order to clear it of most imperfections. Results are presented on artificial and medical images and compared to a segmentation algorithm based on active contours driven by classifiers, highlighting the increased generality and accuracy of our model.

Cyrille Faucheux, Julien Olivier, Romuald Boné
Hierarchical Annealed Particle Swarm Optimization for Articulated Object Tracking

In this paper, we propose a novel algorithm for articulated object tracking based on a hierarchical search and particle swarm optimization. Our approach aims to reduce the complexity induced by the high-dimensional state space in articulated object tracking by decomposing the search space into subspaces and then using particle swarms to optimize over these subspaces hierarchically. Moreover, the intelligent search strategy proposed in [20] is integrated into each optimization step to provide a robust tracking algorithm under noisy observation conditions. Our quantitative and qualitative analyses on both synthetic and real video sequences show the efficiency of the proposed approach compared to other existing competitive tracking methods.

Xuan Son Nguyen, Séverine Dubuisson, Christophe Gonzales
High-Resolution Feature Evaluation Benchmark

Benchmark data sets consisting of image pairs and ground truth homographies are used for evaluating fundamental computer vision challenges, such as the detection of image features. The most widely used benchmark provides only low-resolution images. This paper presents an evaluation benchmark consisting of high-resolution images of up to 8 megapixels and highly accurate homographies. State-of-the-art feature detection approaches are evaluated using the new benchmark data. It is shown that existing approaches perform differently on the high-resolution data compared to the same images at lower resolution.

Kai Cordes, Bodo Rosenhahn, Jörn Ostermann
Fully Automatic Segmentation of AP Pelvis X-rays via Random Forest Regression and Hierarchical Sparse Shape Composition

Knowledge of landmarks and contours in anteroposterior (AP) pelvis X-rays is invaluable for computer-aided diagnosis, hip surgery planning and image-guided interventions. This paper presents a fully automatic and robust approach for landmarking and segmentation of both pelvis and femur in a conventional AP X-ray. Our approach is based on random forest regression and hierarchical sparse shape composition. Experiments conducted on 436 clinical AP pelvis X-rays show that our approach achieves an average point-to-curve error of around 1.3 mm for the femur and 2.2 mm for the pelvis, both with success rates of around 98%. Compared to existing methods, our approach exhibits better performance in both robustness and accuracy.

Cheng Chen, Guoyan Zheng
Language Adaptive Methodology for Handwritten Text Line Segmentation

Text line segmentation in handwritten documents is a very challenging task because curved text lines appear frequently. In this paper, we implement a general line segmentation approach for handwritten documents in various languages. A novel connectivity strength parameter, the CSF, is used for deciding which groups of components belong to the same line. Over-segmentation is also removed with the help of a depth-first search approach and iterative use of the CSF. We have implemented and tested this approach on English, Hindi and Urdu text images taken from benchmark databases and find that it is a language-adaptive approach which provides encouraging results. The average accuracy of the proposed technique is 97.30%.

Subhash Panwar, Neeta Nain, Subhra Saxena, P. C. Gupta
Learning Geometry-Aware Kernels in a Regularization Framework

In this paper, we propose a regularization framework for learning geometry-aware kernels. Some existing geometry-aware kernels can be viewed as instances in our framework. Moreover, the proposed framework can be used as a general platform for developing new geometry-aware kernels. We show how multiple sources of information can be integrated in our framework, allowing us to develop more flexible kernels. We present some new kernels based on our framework. The performance of the kernels is evaluated on classification and clustering tasks. The empirical results show that our kernels significantly improve the performance.

Binbin Pan, Wen-Sheng Chen
Motion Trend Patterns for Action Modelling and Recognition

A new method for action modelling is proposed, which combines the trajectory beam obtained by semi-dense point tracking and a local binary trend description inspired by Local Binary Patterns (LBP). The semi-dense trajectory approach represents a good trade-off between reliability and density of the motion field, whereas the LBP component captures relevant elementary motion elements along each trajectory, which are encoded into mixed descriptors called Motion Trend Patterns (MTP). The combination of these two fast operators allows a real-time, online computation of the action descriptors, composed of space-time blockwise histograms of MTP values, which are classified using a fast SVM classifier. An encoding scheme is proposed and compared with the state-of-the-art through an evaluation performed on two academic action video datasets.

Thanh Phuong Nguyen, Antoine Manzanera, Matthieu Garrigues
On Achieving Near-Optimal “Anti-Bayesian” Order Statistics-Based Classification for Asymmetric Exponential Distributions

This paper considers the use of Order Statistics (OS) in the theory of Pattern Recognition (PR). The pioneering work on using OS for classification was presented in [1] for the Uniform distribution, where it was shown that optimal PR can be achieved in a counter-intuitive manner, diametrically opposed to the Bayesian paradigm, i.e., by comparing the testing sample to a few samples distant from the mean. In [2], we showed that the results could be extended to a few symmetric distributions within the exponential family. In this paper, we attempt to extend these results significantly by considering asymmetric distributions within the exponential family, for some of which even the closed-form expressions of the cumulative distribution functions are not available. These distributions include the Rayleigh, Gamma and certain Beta distributions. As in [1] and [2], the new scheme, referred to as Classification by Moments of Order Statistics (CMOS), attains an accuracy very close to the optimal Bayes' bound, as has been shown both theoretically and by rigorous experimental testing.
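The "anti-Bayesian" idea of comparing a test sample to order-statistic points that lie away from the class means can be illustrated for two scalar classes. This sketch uses empirical order statistics and assumes class 0 lies to the left of class 1; the choice of `k` and the use of sample order statistics (rather than their expected moments) are simplifications, not the authors' scheme.

```python
import numpy as np

def cmos_two_class(train0, train1, x, k=2):
    """Classify scalar x by comparing it to order-statistic points that lie
    away from each class mean, toward the other class (CMOS-style)."""
    s0, s1 = np.sort(train0), np.sort(train1)
    p0 = s0[-k]      # a high order statistic of the left class
    p1 = s1[k - 1]   # a low order statistic of the right class
    return 0 if abs(x - p0) < abs(x - p1) else 1
```

Counter-intuitively, the comparison points sit near the inner tails of the two classes, not at their means, yet the induced decision boundary falls close to the optimal one for symmetric overlap.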

Anu Thomas, B. John Oommen
Optimizing Feature Selection through Binary Charged System Search

Feature selection aims to find the most important information in a given set of features. As this task can be seen as an optimization problem, the combinatorial growth of the possible solutions may make an exhaustive search infeasible. In this paper we propose a new nature-inspired feature selection technique based on the Charged System Search (CSS), which has never been applied in this context so far. The wrapper approach combines the exploration power of CSS with the speed of the Optimum-Path Forest classifier to find the set of features that maximizes the accuracy on a validation set. Experiments conducted on four public datasets demonstrate that the proposed approach can outperform some well-known swarm-based techniques.

Douglas Rodrigues, Luis A. M. Pereira, Joao P. Papa, Caio C. O. Ramos, Andre N. Souza, Luciene P. Papa
Outlines of Objects Detection by Analogy

In this paper we propose a new technique for detecting the outlines of objects. We exploit the set of contours computed using the image analogies principle. A set of artificial patterns is used to locate the contours of any query image; each pattern permits the location of contours corresponding to a specific intensity variation. We study these contours, and a theoretical foundation is proposed to explain their slow motion around region boundaries. Experiments are conducted and the obtained results are presented and discussed.

Asma Bellili, Slimane Larabi, Neil M. Robertson
PaTHOS: Part-Based Tree Hierarchy for Object Segmentation

The problem we address in this paper is segmentation and hierarchical grouping in digital images. No constraints are posed on the image acquisition protocol. First, histogram thresholding provides numerous segments in which a homogeneity criterion is respected. Segments are merged using similarity properties and aggregated into a hierarchy based on spatial inclusions. Shape and color features are extracted from the produced segments. Tests performed on Oxford Flower 17 [8] show that our method outperforms a similar one and allows the selection of relevant objects from the hierarchy. In our case, this approach represents the first stage towards flower variety identification.

Loreta Suta, Mihaela Scuturici, Vasile-Marian Scuturici, Serge Miguet
Tracking System with Re-identification Using a Graph Kernels Approach

This paper addresses the people re-identification problem for visual surveillance applications. Our approach is based on a rich description of each occurrence of a person through a graph encoding of its salient points. A person's appearance in a video is encoded by bags of graphs whose similarities are computed by a graph kernel. Such similarities, combined with a tracking system, allow us to distinguish a new person from one re-entering the video. The efficiency of our method is demonstrated through experiments.

Amal Mahboubi, Luc Brun, Donatello Conte, Pasquale Foggia, Mario Vento
Recognizing Human-Object Interactions Using Sparse Subspace Clustering

In this paper, we approach the problem of recognizing human-object interactions from video data. Using only motion trajectories as input, we propose an unsupervised framework for clustering and classifying videos of people interacting with objects. Our method is based on the concept of sparse subspace clustering, which has recently been applied to motion segmentation. Here, we show that human-object interactions can be seen as trajectories lying on a low-dimensional subspace, which can in turn be recovered by subspace clustering. Experimental results on a publicly available dataset show that our approach is comparable to the state-of-the-art.

Ivan Bogun, Eraldo Ribeiro
Scale-Space Clustering on the Sphere

We present an algorithm for scale-space clustering of a point cloud on the sphere, using a methodology for estimating the density distribution of the points in the linear scale space. Our algorithm regards the union of observed point sets as an image defined by delta functions located at the positions of the points on the sphere. A blurred version of this image has a deterministic structure which qualitatively represents the density distribution of the points in a point cloud on a manifold.

Yoshihiko Mochizuki, Atsushi Imiya, Kazuhiko Kawamoto, Tomoya Sakai, Akihiko Torii
The Importance of Long-Range Interactions to Texture Similarity

We have tested 51 sets of texture features for estimating the perceptual similarity between textures. Our results show that these computational features agree with human judgments at an average rate of only 57.76%. In a second experiment we show that the agreement rates between humans and computational features increase when humans are not allowed to use long-range interactions beyond 19×19 pixels. We believe that this experiment provides evidence that humans exploit long-range interactions which are not normally available to computational features.

Xinghui Dong, Mike J. Chantler
Unsupervised Dynamic Textures Segmentation

This paper presents an unsupervised dynamic colour texture segmentation method with an unknown and variable number of texture classes. Single regions with dynamic textures can furthermore change their location as well as their shape. Individual dynamic multispectral texture mosaic frames are locally represented by Markovian features derived from four directional multispectral Markovian models recursively evaluated for each pixel site. The estimated frame-based Markovian parametric spaces are segmented using an unsupervised segmenter derived from the Gaussian mixture model data representation, which exploits contextual information from the segmentation history of previous video frames. For every frame, the segmentation algorithm starts with an over-segmented initial estimation which is adaptively modified until the optimal number of homogeneous texture segments is reached. The presented method is objectively and numerically evaluated on the dynamic textural test set from the Prague Segmentation Benchmark.

Michal Haindl, Stanislav Mikeš
Voting Clustering and Key Points Selection

We propose a method for clustering and key point selection. The proposed clustering, based on a voting maximization scheme, has advantages in cluster compactness and works well for clusters of different densities and/or sizes. Experimental results demonstrate the high performance of the proposed scheme and its application to the video summarization problem.

Costas Panagiotakis, Paraskevi Fragopoulou
Motor Pump Fault Diagnosis with Feature Selection and Levenberg-Marquardt Trained Feedforward Neural Network

We present a system for automatic model-free fault detection based on a feature set from vibrational patterns. The complexity of the feature model is reduced by feature selection. We use a wrapper approach for the selection criteria, incorporating the training of an artificial neural network into the selection process. For fast convergence we train with the Levenberg-Marquardt algorithm. Experiments are presented for eight different fault classes.

Thomas W. Rauber, Flávio M. Varejão
Unobtrusive Fall Detection at Home Using Kinect Sensor

The existing CCD-camera based systems for fall detection require time for installation and camera calibration. They do not preserve privacy adequately and are unable to operate in low lighting conditions. In this paper we show how to achieve automatic fall detection using only depth images. The point cloud corresponding to the floor is delineated automatically using v-disparity images and the Hough transform. The ground plane is extracted by the RANSAC algorithm. The person is detected on the basis of on-line updated depth reference images. Fall detection is achieved using a classifier trained on features representing the extracted person both in depth images and in point clouds. All fall events were recognized correctly on an image set consisting of 312 images, of which 110 contained human falls. The images were acquired by two Kinect sensors placed at two different locations.
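The ground-plane extraction step mentioned above can be sketched with a standard RANSAC plane fit on a 3-D point cloud. This shows only the generic technique; the threshold, iteration count and the plane parameterization n·p + d = 0 are illustrative choices, not values from the paper.

```python
import numpy as np

def ransac_plane(points, iters=200, thresh=0.02, seed=0):
    """Fit a plane n . p + d = 0 to a 3-D point cloud with RANSAC.
    Returns (unit normal, offset d, inlier mask)."""
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(points), dtype=bool)
    best_model = None
    for _ in range(iters):
        # hypothesize a plane from a random non-degenerate triple
        p0, p1, p2 = points[rng.choice(len(points), 3, replace=False)]
        n = np.cross(p1 - p0, p2 - p0)
        norm = np.linalg.norm(n)
        if norm < 1e-12:          # degenerate (collinear) sample
            continue
        n = n / norm
        d = -n @ p0
        inliers = np.abs(points @ n + d) < thresh
        if inliers.sum() > best_inliers.sum():
            best_inliers, best_model = inliers, (n, d)
    return best_model[0], best_model[1], best_inliers
```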

Michal Kepski, Bogdan Kwolek
“BAM!” Depth-Based Body Analysis in Critical Care

We investigate computer vision methods to monitor Intensive Care Units (ICU) and assist in sedation delivery and accident prevention. We propose the use of a Bed Aligned Map (BAM) to analyze the patient's body. We use a depth camera to localize the bed, estimate its surface and divide it into 10 cm × 10 cm cells. Here, the BAM represents the average cell height over the mattress. This depth-based BAM is independent of illumination and bed positioning, improving consistency between patients. This representation allows us to develop metrics to estimate bed occupancy, body localization, body agitation and sleeping position. Experiments with 23 subjects show an accuracy of 88% and 73% in 4-level agitation tests in the supine and fetal positions respectively, while the sleeping position was recognized with 100% accuracy in a 4-class test.

Manuel Martinez, Boris Schauerte, Rainer Stiefelhagen
3-D Feature Point Matching for Object Recognition Based on Estimation of Local Shape Distinctiveness

In this paper, we propose a reliable 3-D object recognition method that can statistically minimize object mismatching. Our method uses a 3-D object model represented as a set of feature points with 3-D coordinates. Each feature point also has an attribute value describing the local shape around the point, represented as an orientation histogram of the normal vector calculated from several neighboring feature points. Importantly, this attribute value encodes the local shape. By estimating the relative similarity of all possible pairs of points in the model, we define the distinctiveness of each point. In the proposed method, only a small number of distinctive feature points are selected and matched against all feature points extracted from an acquired range image. Finally, the position and pose of the target object are estimated from the correctly matched points. Experimental results on real scenes demonstrate that the recognition rate of our method is 93.8%, which is 42.2% higher than that of the conventional Spin Image method. Furthermore, its computing time is about nine times faster than that of the Spin Image method.

Masanobu Nagase, Shuichi Akizuki, Manabu Hashimoto
3D Human Tracking from Depth Cue in a Buying Behavior Analysis Context

This paper presents a real-time approach to tracking the human body pose in 3D space. For buying behavior analysis, the camera is placed on top of the shelves, above the customers. In this top view, markerless tracking is harder. Hence, we use the depth cue provided by the Kinect, which gives discriminative features of the pose. We introduce a new 3D model that is fitted to these data in a particle filter framework. First, the head and shoulder positions are tracked in the 2D space of the acquired images. Then the arm poses are tracked in 3D space. Finally, we demonstrate that an efficient implementation provides a real-time system.

Cyrille Migniot, Fakhreddine Ababsa
A New Bag of Words LBP (BoWL) Descriptor for Scene Image Classification

This paper explores a new Local Binary Patterns (LBP) based image descriptor that makes use of the bag-of-words model to significantly improve classification performance for scene images. Specifically, first, a novel multi-neighborhood LBP is introduced for small image patches. Second, this multi-neighborhood LBP is combined with frequency domain smoothing to extract features from an image. Third, the extracted features are used with spatial pyramid matching (SPM) and a bag-of-words representation to propose an innovative Bag of Words LBP (BoWL) descriptor. Next, a comparative assessment of the proposed BoWL descriptor and the conventional LBP descriptor is carried out for scene image classification using a Support Vector Machine (SVM) classifier. Further, the classification performance of the new BoWL descriptor is compared with the performance achieved by other researchers in recent years using some popular methods. Experiments with three fairly challenging publicly available image datasets show that the proposed BoWL descriptor not only yields significantly higher classification performance than LBP, but also generates results better than or on par with other popular image descriptors.
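The conventional building block that BoWL extends, the 8-neighbour LBP code and its histogram "word", can be sketched as follows. The multi-neighborhood variant, frequency-domain smoothing and SPM stages of the paper are not shown; the ≥-centre bit convention is one common choice.

```python
import numpy as np

def lbp_codes(img):
    """8-neighbour Local Binary Pattern codes for the interior pixels of a
    2-D grayscale image: each neighbour >= centre contributes one bit."""
    c = img[1:-1, 1:-1]
    shifts = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros_like(c, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(shifts):
        nb = img[1 + dy:img.shape[0] - 1 + dy, 1 + dx:img.shape[1] - 1 + dx]
        codes |= ((nb >= c).astype(np.uint8) << bit)
    return codes

def lbp_histogram(img, bins=256):
    """Normalized histogram of LBP codes, the per-patch 'word' used in
    bag-of-words style descriptors."""
    h = np.bincount(lbp_codes(img).ravel(), minlength=bins).astype(float)
    return h / h.sum()
```

A flat patch maps every pixel to code 255 under this convention, since all neighbours compare equal to the centre.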

Sugata Banerji, Atreyee Sinha, Chengjun Liu
Accurate Scale Factor Estimation in 3D Reconstruction

A well-known ambiguity in monocular structure from motion estimation is that 3D reconstruction is possible only up to a similarity transformation, i.e. an isometry composed with isotropic scaling. To resolve this ambiguity, it is commonly suggested to manually measure an absolute distance in the environment and then use it to scale a reconstruction accordingly. In practice, however, such a measurement often cannot be performed with sufficient accuracy, compromising uses of a 3D reconstruction that require true Euclidean measurements. This paper studies three alternative techniques for estimating the scale of a reconstruction and compares them experimentally with the aid of real and synthetic data.

Manolis Lourakis, Xenophon Zabulis
Affine Colour Optical Flow Computation

The purpose of this paper is three-fold. First, we develop an algorithm for the computation of a locally affine optical flow field from multichannel images as an extension of the Lucas-Kanade (LK) method. The classical LK method solves a system of linear equations assuming that the flow field is locally constant. Our method solves a collection of systems of linear equations assuming the flow field is locally affine. For autonomous navigation in a real environment, the adaptation of motion and image analysis algorithms to illumination changes is a fundamental problem, because illumination changes in an image sequence yield counterfeit obstacles. Second, we evaluate the colour channel selection for colour optical flow computation. By selecting an appropriate colour channel, it is possible to avoid these counterfeit obstacle regions in the snapshot image in front of a vehicle. Finally, we introduce an evaluation criterion for the computed optical flow field that requires no ground truth.
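The locally affine extension of LK can be sketched as one least-squares solve per window: the brightness-constancy constraint Ix·u + Iy·v + It = 0 is imposed with u = a0 + a1·x + a2·y and v = a3 + a4·x + a5·y. This single-channel sketch omits the paper's multichannel and colour-selection components.

```python
import numpy as np

def affine_lk(Ix, Iy, It, xs, ys):
    """Solve for a locally affine flow u = a0 + a1*x + a2*y, v = a3 + a4*x + a5*y
    from the constraints Ix*u + Iy*v + It = 0 over a window."""
    Ix, Iy, It = Ix.ravel(), Iy.ravel(), It.ravel()
    x, y = xs.ravel(), ys.ravel()
    # one row per pixel: [Ix, Ix*x, Ix*y, Iy, Iy*x, Iy*y] . p = -It
    A = np.column_stack([Ix, Ix * x, Ix * y, Iy, Iy * x, Iy * y])
    p, *_ = np.linalg.lstsq(A, -It, rcond=None)
    return p  # affine parameters (a0..a5)
```

Setting a1, a2, a4, a5 to zero recovers the classical constant-flow LK system as a special case.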

Ming-Ying Fan, Atsushi Imiya, Kazuhiko Kawamoto, Tomoya Sakai
Can Salient Interest Regions Resume Emotional Impact of an Image?

Salient regions of interest are supposed to contain the interesting keypoints for analysis and understanding. In this paper we study the impact of reducing images to their regions of interest on emotion recognition. We chose a bottom-up visual attention model because we address emotions on a new low-semantic data set, SENSE (Studies of Emotion on Natural image databaSE). We organized two experiments: the first, SENSE1, was conducted on the whole images, and the second, SENSE2, on reduced images. The latter are obtained with a visual attention model, and their size varies from 3% to 100% of the size of the original images. The information collected during these evaluations is the nature and the power of the emotions: for the nature, observers choose between "Negative", "Neutral" and "Positive", and the power varies from "Weak" to "Strong". In both experiments some images received ambiguous categorizations; the participants were not able to decide on their emotional class (Negative, Neutral or Positive). The evaluations on reduced images showed that on average 79% of the images left uncategorized during SENSE1 were categorized during SENSE2 into one of the two major classes. Reducing the size of the area to be observed leads to a better evaluation, possibly because some semantic content is attenuated.

Syntyche Gbèhounou, François Lecellier, Christine Fernandez-Maloigne, Vincent Courboulay
Contraharmonic Mean Based Bias Field Correction in MR Images

One of the key problems in magnetic resonance (MR) image analysis is to remove the intensity inhomogeneity artifact present in MR images, which often degrades the performance of automatic image analysis techniques. In this regard, the paper presents a novel approach for bias field correction in MR images using the merit of the contraharmonic mean, which is used in a low-pass averaging filter to estimate the near-optimum bias field in a multiplicative model. A theoretical analysis is presented to justify the use of the contraharmonic mean for bias field estimation. The performance of the proposed approach, along with a comparison with other bias field correction algorithms, is demonstrated on a set of MR images for different bias fields and noise levels.
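The contraharmonic mean and its use as a low-pass estimate in a multiplicative model can be sketched as follows. The window size, the order `p`, the brute-force loops and the mean renormalization are illustrative choices, not the paper's tuned pipeline.

```python
import numpy as np

def contraharmonic_mean_filter(img, size=3, p=1.0):
    """Contraharmonic mean over a size x size window:
    C_p = sum(I^(p+1)) / sum(I^p).  p = 0 reduces to the arithmetic mean."""
    img = img.astype(float)
    pad = size // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.empty_like(img)
    H, W = img.shape
    for i in range(H):
        for j in range(W):
            w = padded[i:i + size, j:j + size]
            out[i, j] = (w ** (p + 1)).sum() / (w ** p).sum()
    return out

def correct_bias(img, size=15, p=1.0, eps=1e-8):
    """Multiplicative model: observed = true * bias.
    Estimate the bias as a smoothed (contraharmonic) image and divide it out."""
    bias = contraharmonic_mean_filter(img, size, p)
    corrected = img / (bias + eps)
    return corrected / corrected.mean()  # renormalize mean intensity
```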

Abhirup Banerjee, Pradipta Maji
Correlation between Biopsy Confirmed Cases and Radiologist’s Annotations in the Detection of Lung Nodules by Expanding the Diagnostic Database Using Content Based Image Retrieval

In lung cancer computer-aided diagnosis (CAD) systems, obtaining an accurate and available ground truth is critical and time-consuming. In this study, we explore the Lung Image Database Consortium (LIDC) database of pulmonary computed tomography (CT) scans, and implement a content-based image retrieval (CBIR) approach to exploit the limited amount of diagnostically labeled data in order to annotate unlabeled images with diagnoses. By applying the CBIR method iteratively and using pathologically confirmed cases, we expand the set of diagnosed data available to CAD systems from 17 nodules to 121 nodules. We evaluate the method by implementing a CAD system that uses various combinations of lung nodule sets as queries and retrieves similar nodules from the diagnostically labeled dataset. In calculating the precision of this system, the diagnosed dataset and computer-predicted malignancy data are used as ground truth for the undiagnosed query nodules. Our results indicate that CBIR expansion is an effective method for labeling undiagnosed images in order to improve the performance of CAD systems. They also indicate that even limited knowledge of biopsy-confirmed cases can assist physicians, as a second opinion, in marking undiagnosed cases and avoiding unnecessary biopsies.

Preeti Aggarwal, H. K. Sardana, Renu Vig
Enforcing Consistency of 3D Scenes with Multiple Objects Using Shape-from-Contours

In this paper we present a new approach for modelling scenes with multiple 3D objects from images taken from various viewpoints. Such images are segmented using either supervised or unsupervised algorithms. We consider the mean-shift and support vector machines for image segmentation using the colour and texture as features. Back-projections of segmented contours are used to enforce the consistency of the segmented contours with initial estimates of the 3D scene. A study for detecting merged objects in 3D scenes is provided as well.

Matthew Grum, Adrian G. Bors
Expectation Conditional Maximization-Based Deformable Shape Registration

This paper addresses the issue of matching statistical and non-rigid shapes, and introduces an Expectation Conditional Maximization-based deformable shape registration (ECM-DSR) algorithm. Similar to previous works, we cast the statistical and non-rigid shape registration problem into a missing data framework and handle the unknown correspondences with Gaussian Mixture Models (GMM). The registration problem is then solved by fitting the GMM centroids to the data. But unlike previous works where equal isotropic covariances are used, our new algorithm uses heteroscedastic covariances whose values are iteratively estimated from the data. A previously introduced virtual observation concept is adopted here to simplify the estimation of the registration parameters. Based on this concept, we derive closed-form solutions to estimate parameters for statistical or non-rigid shape registrations in each iteration. Our experiments conducted on synthesized and real data demonstrate that the ECM-DSR algorithm has various advantages over existing algorithms.

Guoyan Zheng
Facial Expression Recognition with Regional Features Using Local Binary Patterns

This paper presents a simple yet efficient and fully automatic approach to recognizing six fundamental facial expressions using Local Binary Pattern (LBP) texture features. The proposed system automatically locates four important facial regions, from which uniform LBP features are extracted and concatenated to form a 236-dimensional enhanced feature vector for recognizing the six fundamental expressions. The features are trained using three widely used classifiers: Naive Bayes, a Radial Basis Function Network (RBFN) and a three-layer Multi-Layer Perceptron (MLP3). The notable feature of the proposed method is the use of a few preferred regions of the face to extract the LBP features, as opposed to the entire face. Experimental results on the MMI database show the proficiency of the proposed feature extraction method.
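
A minimal sketch of the 59-bin uniform LBP histogram for one region; concatenating four such histograms gives a 236-dimensional vector (4 × 59), but the region localisation and exact sampling scheme are the paper's and are not reproduced here:

```python
import numpy as np

def _transitions(bits):
    # number of 0/1 changes around the circular 8-bit pattern
    return sum(bits[i] != bits[(i + 1) % 8] for i in range(8))

def uniform_lbp_hist(region):
    # 8-neighbour offsets, clockwise from the top-left pixel
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    h, w = region.shape
    codes = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            c = region[y, x]
            bits = [int(region[y + dy, x + dx] >= c) for dy, dx in offsets]
            if _transitions(bits) <= 2:      # uniform pattern: own bin
                codes.append(sum(b << i for i, b in enumerate(bits)))
            else:                            # all non-uniform patterns
                codes.append(256)            # share one sentinel bin
    # 58 uniform 8-bit codes + 1 non-uniform bin = 59 histogram bins
    uniform = sorted(c for c in range(256)
                     if _transitions([(c >> i) & 1 for i in range(8)]) <= 2)
    index = {c: k for k, c in enumerate(uniform)}
    index[256] = len(uniform)
    hist = np.zeros(len(uniform) + 1)
    for c in codes:
        hist[index[c]] += 1
    return hist / max(hist.sum(), 1)
```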

Anima Majumder, Laxmidhar Behera, Venkatesh K. Subramanian
Global Image Registration Using Random Projection and Local Linear Method

The purpose of this paper is twofold. First, we introduce fast global image registration using random projection. By generating many transformed images from a reference image as entries in a dictionary, nearest-neighbour-search (NNS)-based image registration computes the transformation that establishes the best match among the generated transformations. To reduce the computational cost of NNS without a significant loss of accuracy, we use random projection; furthermore, to reduce the computational complexity of the random projection itself, we use the spectrum-spreading technique and circular convolution. Second, to reduce the space complexity of the dictionary, we introduce an interpolation technique based on the linear subspace method and a local linear property of the pattern space.
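
A generic sketch of NNS after Gaussian random projection; the paper's spectrum-spreading and circular-convolution speed-ups are not reproduced here, and the projection dimension is an illustrative choice:

```python
import numpy as np

def rp_nearest(dictionary, query, k=64, seed=0):
    rng = np.random.default_rng(seed)
    d = dictionary.shape[1]
    # Johnson-Lindenstrauss: a Gaussian random projection approximately
    # preserves pairwise distances, so the best match tends to survive
    P = rng.standard_normal((d, k)) / np.sqrt(k)
    low_dict = dictionary @ P   # project the whole dictionary once
    low_q = query @ P           # project the query
    dists = np.linalg.norm(low_dict - low_q, axis=1)
    return int(np.argmin(dists))
```

The search then runs in the k-dimensional space instead of the original pattern space, which is where the cost reduction comes from.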

Hayato Itoh, Tomoya Sakai, Kazuhiko Kawamoto, Atsushi Imiya
Image Segmentation by Oriented Image Foresting Transform with Geodesic Star Convexity

Anatomical structures and tissues are often hard to segment in medical images due to their poorly defined boundaries, i.e., low contrast relative to other nearby false boundaries. Specifying the boundary polarity and using shape constraints can help to alleviate part of this problem. Recently, the Oriented Image Foresting Transform (OIFT) has been proposed. In this work, we discuss how to incorporate Gulshan's geodesic star convexity prior into the OIFT approach for interactive image segmentation, in order to handle boundary polarity and shape constraints simultaneously. This convexity constraint eliminates undesirable intricate shapes, improving the segmentation of objects with more regular shapes. We include a theoretical proof of the optimality of the new algorithm, in terms of a global maximum of an oriented energy function subject to the shape constraints, and show the gains in accuracy obtained on medical images from thoracic CT studies.

Lucy A. C. Mansilla, Paulo A. V. Miranda
Multi-run 3D Streetside Reconstruction from a Vehicle

Accurate 3D models of real-world scenes are important tools for visualizing and understanding outdoor environments. This paper considers a camera-based 3D reconstruction system in which stereo cameras are mounted on a mobile platform, recording images while moving through the scene. Due to the limited viewing angle of the cameras, the resulting reconstructions often have missing components (e.g., occluded parts of the scene). We propose a stereo-based 3D reconstruction framework for merging multiple runs of reconstructions recorded while driving through a real-world scene in different directions.

Yi Zeng, Reinhard Klette
Interactive Image Segmentation via Graph Clustering and Synthetic Coordinates Modeling

We propose a method for interactive image segmentation. We construct a weighted graph that represents the superpixels and the connections between them. An efficient graph clustering algorithm based on synthetic coordinates yields an initial map of classified pixels. The proposed method then minimizes a min-max Bayesian criterion that has been successfully used on the image segmentation problem, taking into account visual information as well as the given markers. Experimental results and comparisons with other methods demonstrate the high performance of the proposed scheme.

Costas Panagiotakis, Harris Papadakis, Elias Grinias, Nikos Komodakis, Paraskevi Fragopoulou, Georgios Tziritas
Backmatter
Metadata
Title: Computer Analysis of Images and Patterns
Edited by: Richard Wilson, Edwin Hancock, Adrian Bors, William Smith
Copyright year: 2013
Publisher: Springer Berlin Heidelberg
Electronic ISBN: 978-3-642-40261-6
Print ISBN: 978-3-642-40260-9
DOI: https://doi.org/10.1007/978-3-642-40261-6