nach oben

2012 | Buch

Structural, Syntactic, and Statistical Pattern Recognition

Joint IAPR International Workshop, SSPR&SPR 2012, Hiroshima, Japan, November 7-9, 2012. Proceedings

herausgegeben von: Georgy Gimel’farb, Edwin Hancock, Atsushi Imiya, Arjan Kuijper, Mineichi Kudo, Shinichiro Omachi, Terry Windeatt, Keiji Yamada

Verlag: Springer Berlin Heidelberg

Buchreihe : Lecture Notes in Computer Science

Enthalten in: Springer Professional "Wirtschaft+Technik" , Springer Professional "Technik" , Springer Professional "Wirtschaft"

Einloggen, um Zugang zu erhalten

Über dieses Buch

This volume constitutes the refereed proceedings of the Joint IAPR International Workshops on Structural and Syntactic Pattern Recognition (SSPR 2012) and Statistical Techniques in Pattern Recognition (SPR 2012), held in Hiroshima, Japan, in November 2012 as a satellite event of the 21st International Conference on Pattern Recognition, ICPR 2012. The 80 revised full papers presented together with 1 invited paper and the Pierre Devijver award lecture were carefully reviewed and selected from more than 120 initial submissions. The papers are organized in topical sections on structural, syntactical, and statistical pattern recognition, graph and tree methods, randomized methods and image analysis, kernel methods in structural and syntactical pattern recognition, applications of structural and syntactical pattern recognition, clustering, learning, kernel methods in statistical pattern recognition, kernel methods in statistical pattern recognition, as well as applications of structural, syntactical, and statistical methods.

Inhaltsverzeichnis

Frontmatter

Pierre Devijver Award Lecture

Estimation, Learning, and Adaptation: Systems That Improve with Use

The accuracy of automated classification (labeling) of single patterns, especially printed, hand-printed, or handwritten characters, has leveled off. Further gains in accuracy require classifying sequences of patterns. Linguistic context, already widely used, relies on 1-D lexical and syntactic constraints. Style-constrained classification exploits the shape-similarity of sets of same-source (isogenous) characters of either the same or different classes. For understanding tables and forms, 2-D structural and relational constraints are necessary. Applications of pattern recognition that do not exceed the limits of human senses and cognition can benefit from green interaction wherein operator corrections are recycled to the classifier.

George Nagy

Invited Talks

Optimization Techniques for Geometric Estimation: Beyond Minimization

We overview techniques for optimal geometric estimation from noisy observations for computer vision applications. We first describe estimation techniques based on minimization of given cost functions: least squares (LS), maximum likelihood (ML), which includes reprojection error minimization (Gold Standard) as a special case, and Sampson error minimization. We then formulate estimation techniques not based on minimization of any cost function: iterative reweight, renormalization, and hyper-renormalization. Showing numerical examples, we conclude that hyper-renormalization is robust to noise and currently is the best method.

Kenichi Kanatani

Hierarchical Compositional Representations of Object Structure

Visual categorisation has been an area of intensive research in the vision community for several decades. Ultimately, the goal is to efficiently detect and recognize an increasing number of object classes. The problem entangles three highly interconnected issues: the internal object representation, which should compactly capture the visual variability of objects and generalize well over each class; a means for learning the representation from a set of input images with as little supervision as possible; and an effective inference algorithm that robustly matches the object representation against the image and scales favorably with the number of objects.

In this talk I will present our approach which combines a learned compositional hierarchy, representing (2D) shapes of multiple object classes, and a coarse-to-fine matching scheme that exploits a taxonomy of objects to perform efficient object detection. Our framework for learning a hierarchical compositional shape vocabulary for representing multiple object classes takes simple contour fragments and learns their frequent spatial configurations. These are recursively combined into increasingly more complex and class-specific shape compositions, each exerting a high degree of shape variability. At the top-level of the vocabulary, the compositions represent the whole shapes of the objects. The vocabulary is learned layer after layer, by gradually increasing the size of the window of analysis and reducing the spatial resolution at which the shape configurations are learned. The lower layers are learned jointly on images of all classes, whereas the higher layers of the vocabulary are learned incrementally, by presenting the algorithm with one object class after another.

However, in order for recognition systems to scale to a larger number of object categories, and achieve running times logarithmic in the number of classes, building visual class taxonomies becomes necessary. We propose an approach for speeding up recognition times of multi-class part-based object representations. The main idea is to construct a taxonomy of constellation models cascaded from coarse-to-fine resolution and use it in recognition with an efficient search strategy. The structure and the depth of the taxonomy is built automatically in a way that minimizes the number of expected computations during recognition by optimizing the cost-to-power ratio. The combination of the learned taxonomy with the compositional hierarchy of object shape achieves efficiency both with respect to the representation of the structure of objects and in terms of the number of modeled object classes. The experimental results show that the learned multi-class object representation achieves a detection performance comparable to the current state-of-the-art flat approaches with both faster inference and shorter training times.

Aleš Leonardis

Structural, Syntactical, and Statistical Pattern Recognition

Information Theoretic Prototype Selection for Unattributed Graphs

In this paper we propose a prototype size selection method for a set of sample graphs. Our first contribution is to show how approximate set coding can be extended from the vector to graph domain. With this framework to hand we show how prototype selection can be posed as optimizing the mutual information between two partitioned sets of sample graphs. We show how the resulting method can be used for prototype graph size selection. In our experiments, we apply our method to a real-world dataset and investigate its performance on prototype size selection tasks.

Lin Han, Luca Rossi, Andrea Torsello, Richard C. Wilson, Edwin R. Hancock

Graph Kernels: Crossing Information from Different Patterns Using Graph Edit Distance

Graph kernels allow to define metrics on graph space and constitute thus an efficient tool to combine advantages of structural and statistical pattern recognition fields. Within the chemoinformatics framework, kernels are usually defined by comparing the number of occurences of patterns extracted from two different graphs. Such a graph kernel construction scheme neglects the fact that similar but not identical patterns may lead to close properties. We propose in this paper to overcome this drawback by defining our kernel as a weighted sum of comparisons between all couples of patterns. In addition, we propose an efficient computation of the optimal edit distance on a limited set of finite trees. This extension has been tested on two chemoinformatics problems.

Benoit Gaüzère, Luc Brun, Didier Villemin

Mode Seeking Clustering by KNN and Mean Shift Evaluated

Clustering by mode seeking is most popular using the mean shift algorithm. A less well known alternative with different properties on the computational complexity is kNN mode seeking, based on the nearest neighbor rule instead of the Parzen kernel density estimator. It is faster and allows for much higher dimensionalities. We compare the performances of both procedures using a number of labeled datasets. The retrieved clusters are compared with the given class labels. In addition, the properties of the procedures are investigated for prototype selection.

It is shown that kNN mode seeking is well performing and is feasible for large scale problems with hundreds of dimensions and up to a hundred thousand data points. The mean shift algorithm may perform better than kNN mode seeking for smaller dataset sizes.

Robert P. W. Duin, Ana L. N. Fred, Marco Loog, Elżbieta Pękalska

Learning Sparse Kernel Classifiers in the Primal

The increasing number of classification applications in large data sets demands that efficient classifiers be designed not only in training but also for prediction. In this paper, we address the problem of learning kernel classifiers with reduced complexity and improved efficiency for prediction in comparison to those trained by standard methods. A single optimisation problem is formulated for classifier learning which optimises both classifier weights and eXpansion Vectors (XVs) that define the classification function in a joint fashion. Unlike the existing approach of Wu et al, which performs optimisation in the dual formulation, our approach solves the primal problem directly. The primal problem is much more efficient to solve, as it can be converted to the training of a linear classifier in each iteration, which scales

linearly

to the size of the data set and the number of expansions. This makes our primal approach highly desirable for large-scale applications, where the dual approach is inadequate and prohibitively slow due to the solution of

cubic-time

kernel SVM involved in each iteration. Experimental results have demonstrated the efficiency and effectiveness of the proposed primal approach for learning sparse kernel classifiers that clearly outperform the alternatives.

Zhouyu Fu, Guojun Lu, Kai-Ming Ting, Dengsheng Zhang

Graph and Tree Methods

Evolutionary Weighted Mean Based Framework for Generalized Median Computation with Application to Strings

A new general framework for generalized median approximation is proposed based on the concept of weighted mean of a pair of objects. It can be easily adopted for different application domains like strings, graphs or clusterings, among others. The framework is validated for strings showing its superiority over the state-of-the-art.

Lucas Franek, Xiaoyi Jiang

Graph Complexity from the Jensen-Shannon Divergence

In this paper we aim to characterize graphs in terms of structural complexities. Our idea is to decompose a graph into substructures of increasing layers, and then to measure the dissimilarity of these substructures using Jensen-Shannon divergence. We commence by identifying a centroid vertex by computing the minimum variance of its shortest path lengths. From the centroid vertex, a family of centroid expansion subgraphs of the graph with increasing layers are constructed. We then compute the depth-based complexity trace of a graph by measuring how the Jensen-Shannon divergence varies with increasing layers of the subgraphs. The required Shannon or von Neumann entropies are computed on the condensed subgraph family of the graph. We perform graph clustering in the principal components space of the complexity trace vector. Experiments on graph datasets abstracted from bioinformatics and image data demonstrate effectiveness and efficiency of the graphs complexity traces.

Lu Bai, Edwin R. Hancock

Complexity of Computing Distances between Geometric Trees

Geometric trees can be formalized as unordered combinatorial trees whose edges are endowed with geometric information. Examples are skeleta of shapes from images; anatomical tree-structures such as blood vessels; or phylogenetic trees. An inter-tree distance measure is a basic prerequisite for many pattern recognition and machine learning methods to work on anatomical, phylogenetic or skeletal trees. Standard distance measures between trees, such as tree edit distance, can be readily translated to the geometric tree setting. It is well-known that the tree edit distance for unordered trees is generally NP complete to compute. However, the classical proof of NP completeness depends on a particular case of edit distance with integer edit costs for trees with discrete labels, and does not obviously carry over to the class of geometric trees. The reason is that edge geometry is encoded in continuous scalar or vector attributes, allowing for continuous edit paths from one tree to another, rather than finite, discrete edit sequences with discrete costs for discrete label sets. In this paper, we explain why the proof does not carry over directly to the continuous setting, and why it does not work for the important class of trees with scalar-valued edge attributes, such as edge length. We prove the NP completeness of tree edit distance and another natural distance measure, QED, for geometric trees with vector valued edge attributes.

Aasa Feragen

Active Graph Matching Based on Pairwise Probabilities between Nodes

We propose a method to perform active graph matching in which the active learner queries one of the nodes of the first graph and the oracle feedback is the corresponding node of the other graph. The method uses any graph matching algorithm that iteratively updates a probability matrix between nodes (Graduated Assignment, Expectation Maximisation or Probabilistic Relaxation). The oracle’s feedback is used to update the costs between nodes and arcs of both graphs. We present and validate four different active strategies based on the probability matrix between nodes. It is not needed to modify the code of the graph-matching algorithms, since our method simply needs to read the probability matrix and to update the costs between nodes and arcs. Practical validation shows that with few oracle’s feedbacks, the algorithm finds the labelling that the user considers optimal because imposing few labellings the other ones are corrected automatically.

Xavier Cortés, Francesc Serratosa, Albert Solé-Ribalta

On the Relation between the Common Labelling and the Median Graph

In structural pattern recognition, given a set of graphs, the computation of a Generalized Median Graph is a well known problem. Some methods approach the problem by assuming a relation between the Generalized Median Graph and the Common Labelling problem. However, this relation has still not been formally proved. In this paper, we analyse such relation between both problems. The main result proves that the cost of the common labelling upper-bounds the cost of the median with respect to the given set. In addition, we show that the two problems are equivalent in some cases.

Nicola Rebagliati, Albert Solé-Ribalta, Marcello Pelillo, Francesc Serratosa

Randomized Methods and Image Analysis

A Hierarchical Image Segmentation Algorithm Based on an Observation Scale

Hierarchical image segmentation provides a region-oriented scale-space,

i.e.

, a set of image segmentations at different detail levels in which the segmentations at finer levels are nested with respect to those at coarser levels. Most image segmentation algorithms, such as region merging algorithms, rely on a criterion for merging that does not lead to a hierarchy. In addition, for image segmentation, the tuning of the parameters can be difficult. In this work, we propose a hierarchical graph based image segmentation relying on a criterion popularized by Felzenszwalb and Huttenlocher. Quantitative and qualitative assessments of the method on Berkeley image database shows efficiency, ease of use and robustness of our method.

Silvio Jamil F. Guimarães, Jean Cousty, Yukiko Kenmochi, Laurent Najman

A Discrete Scale Space Neighborhood for Robust Deep Structure Extraction

Linear or Gaussian scale space is a well known multi-scale representation for continuous signals. The exploration of its so-called deep structure by tracing critical points over scale has various theoretical applications and allows for the construction of a scale space hierarchy tree. However, implementational issues arise, caused by discretization and quantization errors. In order to develop more robust scale space based algorithms, the discrete nature of computer processed signals has to be taken into account. Aiming at a computationally practicable implementation of the discrete scale space framework, we investigated suitable neighborhoods, boundary conditions and sampling methods. We show that the resulting discrete scale space respects important topological invariants such as the Euler number, a key criterion for the successful implementation of algorithms operating on its deep structure. We discuss promising properties of topological graphs under the influence of smoothing, setting the stage for more robust deep structure extraction algorithms.

Martin Tschirsich, Arjan Kuijper

On the Correlation of Graph Edit Distance and L 1 Distance in the Attribute Statistics Embedding Space

Graph embeddings in vector spaces aim at assigning a pattern vector to every graph so that the problems of graph classification and clustering can be solved by using data processing algorithms originally developed for statistical feature vectors. An important requirement graph features should fulfil is that they reproduce as much as possible the properties among objects in the graph domain. In particular, it is usually desired that distances between pairs of graphs in the graph domain closely resemble those between their corresponding vectorial representations. In this work, we analyse relations between the edit distance in the graph domain and the

distance of the attribute statistics based embedding, for which good classification performance has been reported on various datasets. We show that there is actually a high correlation between the two kinds of distances provided that the corresponding parameter values that account for balancing the weight between node and edge based features are properly selected.

Jaume Gibert, Ernest Valveny, Horst Bunke, Alicia Fornés

Approximate Axial Symmetries from Continuous Time Quantum Walks

The analysis of complex networks is usually based on key properties such as small-worldness and vertex degree distribution. The presence of symmetric motifs on the other hand has been related to redundancy and thus robustness of the networks. In this paper we propose a method for detecting approximate axial symmetries in networks. For each pair of nodes, we define a continuous-time quantum walk which is evolved through time. By measuring the probability that the quantum walker to visits each node of the network in this time frame, we are able to determine whether the two vertices are symmetrical with respect to any axis of the graph. Moreover, we show that we are able to successfully detect approximate axial symmetries too. We show the efficacy of our approach by analysing both synthetic and real-world data.

Luca Rossi, Andrea Torsello, Edwin R. Hancock

A Clustering-Based Ensemble Technique for Shape Decomposition

Ensemble techniques have been very successful in pattern recognition. In this work we investigate ensemble solution for shape decomposition. A clustering-based approach is proposed to determine a final decomposition from an ensemble of input decompositions. A recently published performance evaluation framework consisting of a benchmark database with manual ground truth together with evaluation measures is used to demonstrate the benefit of the proposed ensemble technique.

Sergej Lewin, Xiaoyi Jiang, Achim Clausing

Kernel Methods in Structural and Syntactical Pattern Recognition

Laplacian Eigenimages in Discrete Scale Space

Linear or Gaussian scale space is a well known multi-scale representation for continuous signals. However, implementational issues arise, caused by discretization and quantization errors. In order to develop more robust scale space based algorithms, the discrete nature of computer processed signals has to be taken into account. Aiming at a computationally practicable implementation of the discrete scale space framework we used suitable neighborhoods, boundary conditions and sampling methods. In analogy to prevalent approaches, a discretized diffusion equation is derived from the continuous scale space axioms adapted to discrete two-dimensional images or signals, including requirements imposed by the chosen neighborhood and boundary condition. The resulting discrete scale space respects important topological invariants such as the Euler number, a key criterion for the successful implementation of algorithms operating on its deep structure. In this paper, relevant and promising properties of the discrete diffusion equation and the eigenvalue decomposition of its Laplacian kernel are discussed and a fast and robust sampling method is proposed. One of the properties leads to Laplacian eigenimages in scale space: Taking a reduced set of images can be considered as a way of applying a discrete Gaussian scale space.

Martin Tschirsich, Arjan Kuijper

A Relational Kernel-Based Framework for Hierarchical Image Understanding

While relational representations have been popular in early work on syntactic and structural pattern recognition, they are rarely used in contemporary approaches to computer vision due to their pure symbolic nature. The recent progress and successes in combining statistical learning principles with relational representations motivates us to reinvestigate the use of such representations. More specifically, we show that statistical relational learning can be successfully used for hierarchical image understanding. We employ kLog, a new logical and relational language for learning with kernels to detect objects at different levels in the hierarchy. The key advantage of kLog is that both appearance features and rich, contextual dependencies between parts in a scene can be integrated in a principled and interpretable way to obtain a qualitative representation of the problem. At each layer, qualitative spatial structures of parts in images are detected, classified and then employed one layer up the hierarchy to obtain higher-level semantic structures. We apply a four-layer hierarchy to street view images and successfully detect corners, windows, doors, and individual houses.

Laura Antanas, Paolo Frasconi, Fabrizio Costa, Tinne Tuytelaars, Luc De Raedt

A Jensen-Shannon Kernel for Hypergraphs

In this paper we explore how to construct a Jensen-Shannon kernel for hypergraphs. We commence by calculating probability distribution over the steady state random walk on a hypergraph. The Shannon entropies required to construct the Jensen-Shannon divergence for pairs of hypergraphs are obtained from steady state probability distributions of the random walk. The Jensen-Shannon divergence between a pair of hypergraphs is the difference between the Shannon entropies of the separate hypergraphs and a composite structure. Our proposed kernel is not restricted to hypergraphs. Experiments on (hyper)graph datasets extracted from bioinformatics and computer vision datasets demonstrate the effectiveness and efficiency of the Jensen-Shannon hypergraph kernel for classification and clustering.

Lu Bai, Edwin R. Hancock, Peng Ren

Heat Flow-Thermodynamic Depth Complexity in Directed Networks

In this paper we extend the heat diffusion-thermodynamic depth approach for undirected networks/graphs to directed graphs. This extension is motivated by the need to measure the complexity of structural patterns encoded by directed graphs. It consists of: a) analyzing and characterizing heat diffusion traces in directed graphs, b) extending the thermodynamic depth framework to capture the second-order variability of the diffusion traces to measure the complexity of directed networks. In our experiments we characterize several directed networks derived from different natural languages. We show that our proposed extension finds differences between languages that are blind to the classical analysis of degree distributions.

Francisco Escolano, Boyan Bonev, Edwin R. Hancock

Shape Similarity Based on a Treelet Kernel with Edition

Several shape similarity measures, based on shape skeletons, are designed in the context of graph kernels. State-of-the-art kernels act on bags of walks, paths or trails which decompose the skeleton graph, and take into account structural noise through edition mechanisms. However, these approaches fail to capture the complexity of junctions inside skeleton graphs due to the linearity of the patterns. To overcome this drawback, tree patterns embedded in the plane have been proposed to decompose the skeleton graphs. In this paper, we reinforce the behaviour of kernel based on tree patterns by explictly incorporating an edition mechanism adapted to tree patterns.

Sébastien Bougleux, François-Xavier Dupé, Luc Brun, Myriam Mokhtari

Applications of Structural and Syntactical Pattern Recognition

3D Shape Classification Using Commute Time

This paper describes a commute-time based 3D shape descriptor that is robust with respect to changes in pose and topology. A new and completely unsupervised mesh segmentation algorithm is proposed, which is based on the commute time embedding of the mesh and the k-means clustering using the embedded mesh vertices. We use the discrete Laplace-Beltrami operator to construct the graph Laplacian.

Muhammad Haseeb, Edwin R. Hancock

Conditional Random Fields for Land Use/Land Cover Classification and Complex Region Detection

Developing a complex region detection algorithm that is aware of its contextual relations with several classes necessitates statistical frameworks that can encode contextual relations rather than simple rule-based applications or heuristics. In this study, we present a conditional random field (CRF) model that is generated over the results of a robust local discriminative classifier in order to reveal contextual relations of complex objects and land use/land cover (LULC) classes. The proposed CRF model encodes the contextual relation between the LULC classes and complex regions (airfields) as well as updates labels of the discriminative classifier and labels the complex region in a unified framework. The significance of the developed model is that it does not need any explicit parameters and/or thresholds along with heuristics or expert rules.

Gulcan Can, Orhan Firat, Fatos Tunay Yarman Vural

Recognition of Long-Term Behaviors by Parsing Sequences of Short-Term Actions with a Stochastic Regular Grammar

Human behavior understanding from visual data has applications such as threat recognition. A lot of approaches are restricted to limited time actions, which we call

short-term actions

. Long-term behaviors are sequences of short-term actions that are more extended in time. Our hypothesis is that they usually present some structure that can be exploited to improve recognition of short-term actions. We present an approach to model long-term behaviors using a syntactic approach. Behaviors to be recognized are hand-crafted into the model in the form of grammar rules. This is useful for cases when few (or no) training data is available such as in threat recognition. We use a stochastic parser so we handle noisy inputs. The proposed method succeeds in recognizing a set of predefined long-term interactions in the CAVIAR dataset. Additionally, we show how imposing prior knowledge about the structure of the long-term behavior improves the recognition of short-term actions with respect to standard statistical approaches.

Gerard Sanromà, Gertjan Burghouts, Klamer Schutte

A Comparison between Structural and Embedding Methods for Graph Classification

Structural pattern recognition is a well-know research field that has its birth in the early 80s. Throughout 30 years, structures such as graphs have been compared through optimization of functions that directly use attribute values on nodes and arcs. Nevertheless, in the last decade, kernel and embedding methods appeared. These new methods deduct a similarity value and a final labelling between nodes through representing graphs into a multi-dimensional space. It seems that lately kernel and embedding methods are preferred with respect to classical structural methods. However, both approaches have advantages and drawbacks. In this work, we compare structural methods to embedding and kernel methods. Results show that, with the evaluated datasets, some structural methods give slightly better performance and therefore, it is still early to discard classical structural methods for graph pattern recognition.

Albert Solé-Ribalta, Xavier Cortés, Francesc Serratosa

Improving Fuzzy Multilevel Graph Embedding through Feature Selection Technique

Graphs are the most powerful, expressive and convenient data structures but there is a lack of efficient computational tools and algorithms for processing them. The embedding of graphs into numeric vector spaces permits them to access the state-of-the-art computational efficient statistical models and tools. In this paper we take forward our work on explicit graph embedding and present an improvement to our earlier proposed method, named “fuzzy multilevel graph embedding - FMGE”, through feature selection technique. FMGE achieves the embedding of attributed graphs into low dimensional vector spaces by performing a multilevel analysis of graphs and extracting a set of global, structural and elementary level features. Feature selection permits FMGE to select the subset of most discriminating features and to discard the confusing ones for underlying graph dataset. Experimental results for graph classification experimentation on IAM letter, GREC and fingerprint graph databases, show improvement in the performance of FMGE.

Muhammad Muzzamil Luqman, Jean Yves Ramel, Josep Lladós

Clustering

Dynamic Learning of SCRF for Feature Selection and Classification of Hyperspectral Imagery

This paper investigates the feature selection and contextual classification of hyperspectral images through the sparse conditional random field (SCRF) model. To relieve the heavy degeneration of classification performance caused by the characteristics of the hyperspectral data and the oversparsity when SCRF selects a small feature subset, we develop a dynamic learning framework to train the SCRF. Under the piecewise training framework, the proposed dynamic learning method of SCRF can be implemented efficiently through separated dynamic sparse trainings of simple classifiers defined by corresponding potentials. Experiments on the real-world hyperspectral images attest to the effectiveness of the proposed method.

Ping Zhong, Zhiming Qian, Runsheng Wang

Entropic Selection of Histogram Features for Efficient Classification

This paper addresses the problem of local histogram-based image feature selection for learning binary classifiers. We show a novel technique which efficiently combines histogram feature projection with the conditional mutual information (CMI) based classifier selection scheme. Moreover, we investigate cost-sensitive modifications of the CMI-based selection procedure, which further improves the classification performance. Extensive evaluations show that the proposed methods are suitable for object detection and recognition tasks.

Ákos Utasi

2D Shapes Classification Using BLAST

This paper presents a novel 2D shape classification approach, which exploits in this context the huge amount of work carried out by bioinformaticians in the biological sequence analysis research field. In particular, in the approach presented here, we propose to encode shapes as biological sequences, employing the widely known sequence alignment tool called BLAST (Basic Local Alignment Search Tool) to devise a similarity score, used in a nearest neighbour scenario. Obtained results on standard datasets show the feasibility of the proposed approach.

Pietro Lovato, Manuele Bicego

A New Random Forest Method for One-Class Classification

We propose a new one-class classification method, called One Class Random Forest, that is able to learn from one class of samples only. This method, based on a random forest algorithm and an original outlier generation procedure, makes use of the ensemble learning mechanisms offered by random forest algorithms to reduce both the number of artificial outliers to generate and the size of the feature space in which they are generated. We show that One Class Random Forests perform well on various UCI public datasets in comparison to few other state-of-the-art one class classification methods (gaussian density models, Parzen estimators, gaussian mixture models and one-class SVMs).

Chesner Désir, Simon Bernard, Caroline Petitjean, Laurent Heutte

A New Index Based on Sparsity Measures for Comparing Fuzzy Partitions

This article adresses the problem of assessing how close two strict and/or fuzzy partitions are. A new index based on a measurement of the sparsity of the contingency matrix crossing the partitions is proposed that satisfies the required properties formulated within the paper and presents a low complexity. It is compared to well-known existing indices of the literature, such as the Rand and the Jaccard indices, the transfert distance and some of their recent fuzzy counterparts.

Romain Quéré, Carl Frélicot

Learning

Polichotomies on Imbalanced Domains by One-per-Class Compensated Reconstruction Rule

A key issue in machine learning is the ability to cope with recognition problems where one or more classes are under-represented with respect to the others. Indeed, traditional algorithms fail under class imbalanced distribution resulting in low predictive accuracy over the minority classes. While large literature exists on binary imbalanced tasks, few researches exist for multiclass learning. In this respect, we present here a new method for imbalanced multiclass learning within the One-per-Class decomposition framework. Once the multiclass task is divided into several binary tasks, the proposed reconstruction rule discriminates between safe and dangerous classifications. Then, it sets the multiclass label using information on both data distributions and classification reliabilities provided by each binary classifier, lowering the effects of class skew and improving the performance. We favorably compare the proposed reconstruction rule with the standard One-per-Class method on ten datasets using four classifiers.

Roberto D’Ambrosio, Paolo Soda

The Dipping Phenomenon

One typically expects classifiers to demonstrate improved performance with increasing training set sizes or at least to obtain their best performance in case one has an infinite number of training samples at ones’s disposal. We demonstrate, however, that there are classification problems on which particular classifiers attain their optimum performance at a training set size which is finite. Whether or not this phenomenon, which we term dipping, can be observed depends on the choice of classifier in relation to the underlying class distributions. We give some simple examples, for a few classifiers, that illustrate how the dipping phenomenon can occur. Additionally, we speculate about what generally is needed for dipping to emerge. What is clear is that this kind of learning curve behavior does not emerge due to mere chance and that the pattern recognition practitioner ought to take note of it.

Marco Loog, Robert P. W. Duin

Colour Matching Function Learning

In this paper, we aim at learning the colour matching functions making use of hyperspectral and trichromatic imagery. The method presented here is quite general in nature, being data driven and devoid of constrained setups. Here, we adopt a probabilistic formulation so as to recover the colour matching functions directly from trichromatic and hyperspectral pixel pairs. To do this, we derive a log-likelihood function which is governed by both, the spectra-to-colour equivalence and a generative model for the colour matching functions. Cast into a probabilistic setting, we employ the EM algorithm for purposes of

maximum a posteriori

inference, where the M-step is effected making use of Levenberg-Marquardt optimisation. We present results on real-world data and provide a quantitative analysis based upon a colour calibration chart.

Luis Romero-Ortega, Antonio Robles-Kelly

Constrained Log-Likelihood-Based Semi-supervised Linear Discriminant Analysis

A novel approach to semi-supervised learning for classical Fisher linear discriminant analysis is presented. It formulates the problem in terms of a constrained log-likelihood approach, where the semi-supervision comes in through the constraints. These constraints encode that the parameters in linear discriminant analysis fulfill particular relations involving label-dependent and label-independent quantities. In this way, the latter type of parameters, which can be estimated based on unlabeled data, impose constraints on the former. The former parameters are the class-conditional means and the average within-class covariance matrix, which are the parameters of interest in linear discriminant analysis. The constraints lead to a reduction in variability of the label-dependent estimates, resulting in a potential improvement of the semi-supervised linear discriminant over that of its regular supervised counterpart. We state upfront that some of the key insights in this contribution have been published previously in a workshop paper by the first author. The major contribution in this work is the basic observation that a semi-supervised linear discriminant analysis can be formulated in terms of a principled log-likelihood approach, where the previous solution employed an ad hoc procedure. With the current contribution, we move yet another step closer to a proper formulation of a semi-supervised version of this classical technique.

Marco Loog, Are C. Jensen

Out-of-Sample Embedding by Sparse Representation

A critical aspect of non-linear dimensionality reduction techniques is represented by the construction of the adjacency graph. The difficulty resides in finding the optimal parameters, a process which, in general, is heuristically driven. Recently, sparse representation has been proposed as a non-parametric solution to overcome this problem. In this paper, we demonstrate that this approach not only serves for the graph construction, but also represents an efficient and accurate alternative for out-of-sample embedding. Considering for a case study the Laplacian Eigenmaps, we applied our method to the face recognition problem. Experimental results conducted on some challenging datasets confirmed the robustness of our approach and its superiority when compared to existing techniques.

Bogdan Raducanu, Fadi Dornaika

Kernel Methods in Statistical Pattern Recognition

Extended Analyses for an Optimal Kernel in a Class of Kernels with an Invariant Metric

Learning based on kernel machines is widely known as a powerful tool for various fields of information science such as pattern recognition and regression estimation. An appropriate model selection is required in order to obtain desirable learning results. In our previous work, we discussed a class of kernels forming a nested class of reproducing kernel Hilbert spaces with an invariant metric and proved that the kernel corresponding to the smallest reproducing kernel Hilbert space, including an unknown true function, gives the best model. In this paper, we relax the invariant metric condition and show that a similar result is obtained when a subspace with an invariant metric exists.

Akira Tanaka, Ichigaku Takigawa, Hideyuki Imai, Mineichi Kudo

Simultaneous Learning of Localized Multiple Kernels and Classifier with Weighted Regularization

Kernel classifiers have demonstrated their high performance for many classification problems. For the proper selection of kernel functions, multiple kernel learning (MKL) has been researched. Furthermore, the localized MKL (LMKL) enables to set the weights for the kernel functions at each point. However, the training of the weight functions for kernel functions is a complex nonlinear problem and a classifier can be trained separately after the weights are fixed. The iteration of the two processes are often necessary. In this paper we propose a new framework for MKL/LMKL. In the framework, not kernel functions but mappings to the feature space are combined with weights. We also propose a new learning scheme to train simultaneously weights for kernel functions and a classifier. We realize a classifier by our framework with the Gaussian kernel function and the support vector machine. Finally, we show its advantages by experimental results.

Naoya Inoue, Yukihiko Yamashita

Change-Point Detection in Time-Series Data by Relative Density-Ratio Estimation

The objective of change-point detection is to discover abrupt property changes lying behind time-series data. In this paper, we present a novel statistical change-point detection algorithm that is based on non-parametric divergence estimation between two retrospective segments. Our method uses the relative Pearson divergence as a divergence measure, and it is accurately and efficiently estimated by a method of direct density-ratio estimation. Through experiments on real-world human-activity sensing, speech, and Twitter datasets, we demonstrate the usefulness of the proposed method.

Song Liu, Makoto Yamada, Nigel Collier, Masashi Sugiyama

Online Metric Learning Methods Using Soft Margins and Least Squares Formulations

Online metric learning using margin maximization has been introduced as a way to learn appropriate dissimilarity measures in an efficient way when information as pairs of examples is given to the learning system in a progressive way. These schemes have several practical advantages with regard to global ones in which a training set needs to be processed. On the other hand, they may suffer from a poor performance depending on the quality of the examples and the particular tuning or other implementation details. This paper formulates several online metric learning alternatives using a passive-aggressive schema. A new formulation of the online problem using least squares is also introduced. The relative behavior of the different alternatives is studied and comparative experimentation is carried out to put forward the benefits and weaknesses of each alternative.

Adrian Perez-Suay, Francesc J. Ferri

Shape Analysis Using the Edge-Based Laplacian

This paper presents a novel analysis and application of the eigensystem of the edge-based Laplacian of a graph. The advantage of using the edge-based Laplacian over its vertex-based counterpart is that it significantly expands the set of differential operators that can be implemented in the graph domain. We commence by presenting a new mesh characterization based on the adjacency matrix of the mesh that captures both the geometric and topological properties of the shape. We use the edge-based eigenvalues to develop a novel method for defining pose-invariant signatures for non-rigid three-dimensional shapes based on the edge-based heat kernel. To illustrate the utility of our method, we perform numerous experiments applying the method to correspondence matching and classifying non-rigid three-dimensional shapes represented in terms of meshes.

Furqan Aziz, Richard C. Wilson, Edwin R. Hancock

Applications of Statistical Pattern Recognition

One-Sided Prototype Selection on Class Imbalanced Dissimilarity Matrices

In the dissimilarity representation paradigm, several prototype selection methods have been used to cope with the topic of how to select a small representation set for generating a low-dimensional dissimilarity space. In addition, these methods have also been used to reduce the size of the dissimilarity matrix. However, these approaches assume a relatively balanced class distribution, which is grossly violated in many real-life problems. Often, the ratios of prior probabilities between classes are extremely skewed. In this paper, we study the use of renowned prototype selection methods adapted to the case of learning from an imbalanced dissimilarity matrix. More specifically, we propose the use of these methods to under-sample the majority class in the dissimilarity space. The experimental results demonstrate that the one-sided selection strategy performs better than the classical prototype selection methods applied over all classes.

Mónica Millán-Giraldo, Vicente García, J. Salvador Sánchez

Estimating Surface Characteristics and Extracting Features from Polarisation

In this paper we develop a practical method for estimating shape, color and reflectance using only three images taken under polarised light. We develop a novel and practical framework to optimise the estimates and eliminate the redundant information, then investigate three different methods to compare their class discriminating capacities. We present experiment to demonstrate the validity of the proposed method for a database of fruit objects from 5 different classes, and we show that the proposed method is capable of accurately extracting the features of the input examples. The framework can further be applied in a variety fields of computer vision and pattern recognition domains including object recognition and classification.

Lichi Zhang, Edwin R. Hancock, Jing Wu

Extended Fisher Criterion Based on Auto-correlation Matrix Information

Fisher’s linear discriminant analysis (FLDA) has been attracting many researchers and practitioners for several decades thanks to its ease of use and low computational cost. However, FLDA implicitly assumes that all the classes share the same covariance: which implies that FLDA might fail when this assumption is not necessarily satisfied. To overcome this problem, we propose a simple extension of FLDA that exploits a detailed covariance structure of every class by utilizing revealed by the class-wise auto-correlation matrices. The proposed method achieves remarkable improvements classification accuracy against FLDA while preserving two major strengths of FLDA: the ease of use and low computational costs. Experimental results with MNIST and other several data sets in UCI machine learning repository demonstrate the effectiveness of our method.

Hitoshi Sakano, Tsukasa Ohashi, Akisato Kimura, Hiroshi Sawada, Katsuhiko Ishiguro

Poisoning Adaptive Biometric Systems

Adaptive biometric recognition systems have been proposed to deal with natural changes of the clients’ biometric traits due to multiple factors, like

aging

. However, their adaptability to changes may be exploited by an attacker to compromise the stored templates, either to impersonate a specific client, or to deny access to him. In this paper we show how a carefully designed attack may gradually poison the template gallery of some users, and successfully mislead a simple PCA-based face verification system that performs self-update.

Battista Biggio, Giorgio Fumera, Fabio Roli, Luca Didaci

Modified Divergences for Gaussian Densities

Multivariate Gaussian densities are pervasive in pattern recognition and machine learning. A central operation that appears in most of these areas is to measure the difference between two multivariate Gaussians. Unfortunately, traditional measures based on the Kullback–Leibler (KL) divergence and the Bhattacharyya distance do not satisfy all metric axioms necessary for many algorithms. In this paper we propose a modification for the KL divergence and the Bhattacharyya distance, for multivariate Gaussian densities, that transforms the two measures into distance metrics. Next, we show how these metric axioms impact the unfolding process of manifold learning algorithms. Finally, we illustrate the efficacy of the proposed metrics on two different manifold learning algorithms when used for motion clustering in video data. Our results show that, in this particular application, the new proposed metrics lead to significant boosts in performance (at least 7%) when compared to other divergence measures.

Karim T. Abou–Moustafa, Frank P. Ferrie

Applications of Structural, Syntactical, and Statistical Methods

Graph Database Retrieval Based on Metric-Trees

M-trees are well-know structures used to speed-up queries in databases. In this paper, we evaluate the applicability of m-trees to graph databases. In classical schemes based on metric-trees, the routing information kept in a metric-tree node is a selected element from the sub-cluster that represents. Nevertheless, defining a graph that represents a set of graphs is not a trivial task. We evaluate different graphs-class prototype as routing nodes in the metric tree. The considered prototypes are: Median Graphs, Closure Graphs, First-Order Random Graphs, Function-Described Graphs and Second-Order Random Graphs.

Francesc Serratosa, Xavier Cortés, Albert Solé-Ribalta

Validation of Network Classifiers

This paper develops PAC (probably approximately correct) error bounds for network classifiers in the transductive setting, where the network node inputs and links are all known, the training nodes class labels are known, and the goal is to classify a working set of nodes that have unknown class labels. The bounds are valid for any model of network generation. They require working nodes to be selected independently, but not uniformly at random. For example, they allow different regions of the network to have different densities of unlabeled nodes.

James Li, Abdullah Sonmez, Zehra Cataltepe, Eric Bax

Alignment and Morphing for the Boundary Curves of Anatomical Organs

In this paper, we develop a tracking method for the deformable boundary curves of biological organs using variational registration method. We first define the relative distortion of a pair of curves using curvatures of curves. This minimum distortion aligns corresponding points of a pair of curves. Then, we derive the mean of curves as the curve which minimises the total distortion of a collection of shapes. We compute the intermediate boundary curve of a pair of curves as the mean of these curves.

Keiko Morita, Atsushi Imiya, Tomoya Sakai, Hidetaka Hontan, Yoshitaka Masutani

Unsupervised Clustering of Human Pose Using Spectral Embedding

In this paper we use the spectra of a Hermitian matrix and the coefficient of the symmetric polynomials to cluster different human poses taken by an inexpensive 3

camera, the Microsoft ’Kinect’ for XBox 360. We construct a Hermitian matrix from the joints and the angles subtended by each pair of limbs using the three-dimensional ’skeleton’ data delivered by Kinect. To compute the angles between a pair of limbs we construct the line graph from the given skeleton. We construct pattern vectors from the eigenvectors of the Hermitian matrix. The pattern vectors are embedded into a pattern-space using Principal Component Analysis (PCA). We compere the results obtained with the Laplacian spectra pattern vectors. The empirical results show that using the angular information can be efficiently used to clusters different human poses.

Muhammad Haseeb, Edwin R. Hancock

Human Action Recognition in Video by Fusion of Structural and Spatio-temporal Features

The problem of human action recognition has received increasing attention in recent years for its importance in many applications. Local representations and in particular STIP descriptors have gained increasing popularity for action recognition. Yet, the main limitation of those approaches is that they do not capture the spatial relationships in the subject performing the action. This paper proposes a novel method based on the fusion of global spatial relationships provided by graph embedding and the local spatio-temporal information of STIP descriptors. Experiments on an action recognition dataset reported in the paper show that recognition accuracy can be significantly improved by combining the structural information with the spatio-temporal features.

Ehsan Zare Borzeshi, Oscar Perez Concha, Massimo Piccardi

An Incremental Structured Part Model for Image Classification

The state-of-the-art image classification methods usually require many training samples to achieve good performance. To tackle this problem, we present a novel incremental method in this paper, which learns a part model to classify objects using only a small number of training samples. Our model captures the inherent connections of the semantic parts of objects and builds structural relationship between them. In the incremental learning stage, we use high entropy images that have been accepted by users to update the learned model. The proposed approach is evaluated on two datasets, which demonstrates its advantages over several alternative classification methods in the literature.

Huigang Zhang, Xiao Bai, Jian Cheng, Jun Zhou, Huijie Zhao

Top-Down Tracking and Estimating 3D Pose of a Die

Real-time 3D pose estimation from monocular image sequences is a challenging research topic. Although current methods are able to recover 3D pose, they are severely challenged by the computational cost. To address this problem, we propose a tracking and 3D pose estimation method supported by three main pillars: a pyramidal structure, an aspect graph and the checkpoints. Once initialized the systems performs a top-down tracking. At a high level it detects the position of the object and segments its time-space trajectory. This stage increases the stability and the robustness for the tracking process. Our main objective is the 3D pose estimation, the pose is estimated only in relevant events of the segmented trajectory, which reduces the computational effort required. In order to obtain the 3D pose estimation in the complete trajectory, an interpolation method, based on the aspect graph describing the structure of the object’s surface, can be used to roughly estimate the poses between two relevant events. This early version of the method has been developed to work with a specific type of polyhedron with strong edges, texture and differentiated faces, a die.

Fuensanta Torres, Walter G. Kropatsch

Large Scale Experiments on Fingerprint Liveness Detection

Fingerprint liveness detection consists in extracting measurements, from a fingerprint image, allowing to distinguish between an “alive” fingerprint image, that is, an image coming from the fingertip of the claimed identity, and an artificial replica. Several algorithms have been proposed so far, but the robustness of their performance has not yet been compared when varying several environmental conditions. In this paper, we present a set of experiments investigating the performance of several feature sets designed for fingerprint liveness detection. In particular we assessed the decrease of performance when varying the pressure and the environmental illumination as well as the size of the region of interest (ROI) used for extracting such features. Experimental results on a large data set show the different dependence of some features sets on the investigated conditions.

Gian Luca Marcialis, Luca Ghiani, Katja Vetter, Dirk Morgeneier, Fabio Roli

Implicit and Explicit Graph Embedding: Comparison of Both Approaches on Chemoinformatics Applications

Defining similarities or distances between graphs is one of the bases of the structural pattern recognition field. An important trend within this field consists in going beyond the simple formulation of similarity measures by studying properties of graph’s spaces induced by such distance or similarity measures . Such a problematic is closely related to the graph embedding problem. In this article, we investigate two types of similarity measures. The first one is based on the notion of graph edit distance which aims to catch a global dissimilarity between graphs. The second family is based on comparisons of bags of patterns extracted from graphs to be compared. Both approaches are detailed and their performances are evaluated on different chemoinformatics problems.

Benoit Gaüzère, Makoto Hasegawa, Luc Brun, Salvatore Tabbone

Modeling Spoken Dialog Systems under the Interactive Pattern Recognition Framework

The new Interactive Pattern Recognition (IPR) framework has been recently proposed. This proposal lets a human interact with a Pattern Recognition system allowing the system to learn from the interaction as well as adapt it to the human behavior. The aim of this paper is to apply the principles of IPR to the design of Spoken Dialog Systems (SDS). We propose a new formulation to present SDS as an IPR problem. To this end some extensions to the IPR approach are proposed. Additionally a user model based on the IPR paradigm is also defined. We applied the proposed formulation to compose a preliminary graphical model that has been experimentally developed to deal with a Spanish dialog task. An initial maximum likelihood strategy for the dialog manager actions along with a stochastic simulation of user behavior have allowed to get new dialogs. The preliminary evaluation of these results allowed us to consider this formulation as a promising framework to deal with SDS.

M. Inés Torres, Jose Miguel Benedí, Raquel Justo, Fabrizio Ghigi

Hierarchical Graph Representation for Symbol Spotting in Graphical Document Images

Symbol spotting can be defined as locating given query symbol in a large collection of graphical documents. In this paper we present a hierarchical graph representation for symbols. This representation allows graph matching methods to deal with low-level vectorization errors and, thus, to perform a robust symbol spotting. To show the potential of this approach, we conduct an experiment with the SESYD dataset.

Klaus Broelemann, Anjan Dutta, Xiaoyi Jiang, Josep Lladós

Compact Form of the Pseudo–inverse Matrix in the Approximation of a Star Graph Using the Conductance Electrical Model (CEM)

The Conductance Electrical Model (CEM) transforms a graph into a circuit and can be use to do ”inexact graph isomorphism” as it was shown in [13]. In second stage of this process, we transform the circuit

in a star circuit, using the Moore–Penrose pseudo–inverse of a matrix for which there is a general formula that requires transpose, multiply and invert matrices with a time complexity of

(

), where

is the number of nodes of the graph. However, due to the special structure of the star transformation, we are able to exploit this special structure to compute the pseudo–inverse without using the general Moore–Penrose formula. We have developed a closed formula that can compute the elements of the pseudo–inverse without using that formula, that means without multiplying matrices neither doing the matrix inversion and that moreover can be computed in

(

). This method also eliminates the problems due to computer rounding and due to bad–conditioned problems in mathematical terms.

Manuel Igelmo, Alberto Sanfeliu

A Heuristic Based on the Intrinsic Dimensionality for Reducing the Number of Cyclic DTW Comparisons in Shape Classification and Retrieval Using AESA

Cyclic Dynamic Time Warping (CDTW) is a good dissimilarity of shape descriptors of high dimensionality based on contours, but it is computationally expensive. For this reason, to perform recognition tasks, a method to reduce the number of comparisons and avoid an exhaustive search is convenient. The Approximate and Eliminate Search Algorithm (AESA) is a relevant indexing method because of its drastic reduction of comparisons, however, this algorithm requires a metric distance and that is not the case of CDTW. In this paper, we introduce a heuristic based on the intrinsic dimensionality that allows to use CDTW and AESA together in classification and retrieval tasks over these shape descriptors. Experimental results show that, for descriptors of high dimensionality, our proposal is optimal in practice and significantly outperforms an exhaustive search, which is the only alternative for them and CDTW in these tasks.

Vicente Palazón-González, Andrés Marzal

Support Vector Machines Training Data Selection Using a Genetic Algorithm

This paper presents a new method for selecting valuable training data for support vector machines (SVM) from large, noisy sets using a genetic algorithm (GA). SVM training data selection is a known, however not extensively investigated problem. The existing methods rely mainly on analyzing the geometric properties of the data or adapt a randomized selection, and to the best of our knowledge, GA-based approaches have not been applied for this purpose yet. Our work was inspired by the problems encountered when using SVM for skin segmentation. Due to a very large set size, the existing methods are too time-consuming, and random selection is not effective because of the set noisiness. In the work reported here we demonstrate how a GA can be used to optimize the training set, and we present extensive experimental results which confirm that the new method is highly effective for real-world data.

Michal Kawulok, Jakub Nalepa

A Unified View of Two-Dimensional Principal Component Analyses

Recently, two-dimensional principal component analysis (2D-PCA) and its variants have been proposed by several researchers. In this paper, we summarize their 2DPCA variants, show some equivalence among them, and present a unified view in which the non-iterative 2DPCA variants are interpreted as the non-iterative approximate algorithms for the iterative 2DPCA variants, i.e., the non-iterative 2DPCA variants are derived as the first iterations of the iterative algorithm started from different initial settings. Then we classify the non-iterative 2DPCA variants on the basis of their algorithmic patterns and propose a new non-iterative 2DPCA algorithm based on the classification. The effectiveness of the proposed algorithm is experimentally demonstrated on three publicly accessible face image databases.

Kohei Inoue, Kenji Hara, Kiichi Urahama

Automatic Dimensionality Estimation for Manifold Learning through Optimal Feature Selection

A very important aspect in manifold learning is represented by automatic estimation of the intrinsic dimensionality. Unfortunately, this problem has received few attention in the literature of manifold learning. In this paper, we argue that feature selection paradigm can be used to the problem of automatic dimensionality estimation. Besides this, it also leads to improved recognition rates. Our approach for optimal feature selection is based on a Genetic Algorithm. As a case study for manifold learning, we have considered Laplacian Eigenmaps (LE) and Locally Linear Embedding (LLE). The effectiveness of the proposed framework was tested on the face recognition problem. Extensive experiments carried out on ORL, UMIST, Yale, and Extended Yale face data sets confirmed our hypothesis.

Fadi Dornaika, Ammar Assoum, Bogdan Raducanu

Novel Gabor-PHOG Features for Object and Scene Image Classification

A new Gabor-PHOG (GPHOG) descriptor is first introduced in this paper for image feature extraction by concatenating the Pyramid of Histograms of Oriented Gradients (PHOG) of all the local Gabor filtered images. Next, a comparative assessment of the classification performance of the GPHOG descriptor is made in six different color spaces, namely the RGB, HSV, YCbCr, oRGB, DCS and YIQ color spaces, to propose the novel YIQ-GPHOG and the YCbCr-GPHOG feature vectors that perform well on different object and scene image categories. Third, a novel Fused Color GPHOG (FC-GPHOG) feature is presented by integrating the PCA features of the six color GPHOG descriptors for object and scene image classification, with applications to image search and retrieval. Finally, the Enhanced Fisher Model (EFM) is applied for discriminatory feature extraction and the nearest neighbor classification rule is used for image classification. The effectiveness of the proposed feature vectors for image classification is evaluated using two grand challenge datasets, namely the Caltech 256 dataset and the MIT Scene dataset.

Atreyee Sinha, Sugata Banerji, Chengjun Liu

Binary Gabor Statistical Features for Palmprint Template Protection

The biometric template protection system requires a high-quality biometric channel and a well-designed error correction code (ECC). Due to the intra-class variations of biometric data, an efficient fixed-length binary feature extractor is required to provide a high-quality biometric channel so that the system is robust and accurate, and to allow a secret key to be combined for security. In this paper we present a binary palmprint feature extraction method to achieve a robust biometric channel for template protection system. The real-valued texture statistical features are firstly extracted based on Gabor magnitude and phase responses. Then a bits quantization and selection algorithm is introduced. Experimental results on the HongKong PloyU Palmprint database verify the efficiency of our method which achieves low verification error rate by a robust palmprint binary representation of low bit error rate.

Meiru Mu, Qiuqi Ruan, Xiaoying Shao, Luuk Spreeuwers, Raymond Veldhuis

Class-Dependent Dissimilarity Measures for Multiple Instance Learning

Multiple Instance Learning (MIL) is concerned with learning from sets (bags) of feature vectors (instances), where the individual instance labels are ambiguous. In MIL it is often assumed that positive bags contain at least one instance from a so-called concept in instance space, whereas negative bags only contain negative instances. The classes in a MIL problem are therefore not treated in the same manner. One of the ways to classify bags in MIL problems is through the use of bag dissimilarity measures. In current dissimilarity approaches, such dissimilarity measures act on the bag as a whole and do not distinguish between positive and negative bags. In this paper we explore whether this is a reasonable approach and when and why a dissimilarity measure that is dependent on the bag label, might be more appropriate.

Veronika Cheplygina, David M. J. Tax, Marco Loog

Bidirectional Language Model for Handwriting Recognition

In order to improve the results of automatically recognized handwritten text, information about the language is commonly included in the recognition process. A common approach is to represent a text line as a sequence. It is processed in one direction and the language information via

-grams is directly included in the decoding. This approach, however, only uses context on one side to estimate a word’s probability. Therefore, we propose a bidirectional recognition in this paper, using distinct forward and a backward language models. By combining decoding hypotheses from both directions, we achieve a significant increase in recognition accuracy for the off-line writer independent handwriting recognition task. Both language models are of the same type and can be estimated on the same corpus. Hence, the increase in recognition accuracy comes without any additional need for training data or language modeling complexity.

Volkmar Frinken, Alicia Fornés, Josep Lladós, Jean-Marc Ogier

Hypergraph Spectra for Unsupervised Feature Selection

Most existing feature selection methods focus on ranking individual features based on a utility criterion, and select the optimal feature set in a greedy manner. However, the feature combinations found in this way do not give optimal classification performance, since they neglect the correlations among features. In an attempt to overcome this problem, we develop a novel unsupervised feature selection technique by using hypergraph spectral embedding, where the projection matrix is constrained to be a selection matrix designed to select the optimal feature subset. Specifically, by incorporating multidimensional interaction information (MII) for higher order similarities measure, we establish a novel hypergraph framework which is used for characterizing the multiple relationships within a set of samples. Thus, the structural information latent in the data can be more effectively modeled. Secondly, we derive a hypergraph embedding view of feature selection which casting the feature discriminant analysis into a regression framework that considers the correlations among features. As a result, we can evaluate joint feature combinations, rather than being confined to consider them individually, and are thus able to handle feature redundancy. Experimental results demonstrate the effectiveness of our feature selection method on a number of standard datasets.

Zhihong Zhang, Edwin R. Hancock

Feature Selection Using Counting Grids: Application to Microarray Data

In this paper a novel feature selection scheme is proposed, which exploits the potentialities of a recent probabilistic generative model, the Counting Grid. This model is able to cluster together similar observations, highlighting the compactness of a class and its underlying structure. The proposed feature selection scheme is applied to the expression microarray scenario, a peculiar context with very few patterns and a huge number of features. Experiments on benchmark datasets show that the proposed approach is effective and stable, assessing state-of-the-art classification accuracies.

Pietro Lovato, Manuele Bicego, Marco Cristani, Nebojsa Jojic, Alessandro Perina

Infinite Sparse Factor Analysis for Blind Source Separation in Reverberant Environments

Sound source separation in a real-world indoor environment is an ill-formed problem because sound source mixing is affected by the number of sounds, sound source activities, and reverberation. In addition, blind source separation (BSS) suffers from a permutation ambiguity in a frequency domain processing. Conventional methods have two problems: (1) impractical assumptions that the number of sound sources is given, and (2) permutation resolution as a post processing. This paper presents a non-parametric Bayesian BBS called permutation-free infinite sparse factor analysis (PF-ISFA) that solves the two problems simultaneously. Experimental results show that PF-ISFA outperforms conventional complex ISFA in all measures of BSS_EVAL criteria. In particular, PF-ISFA improves Signal-to-Interference Ratio by 14.45 dB and 5.46 dB under RT

= 30 ms and RT

= 460 ms conditions, respectively.

Kohei Nagira, Takuma Otsuka, Hiroshi G. Okuno

Sparse Discriminant Analysis Based on the Bayesian Posterior Probability Obtained by L1 Regression

Recently the kernel discriminant analysis (KDA) has been successfully applied in many applications. However, kernel functions are usually defined a priori and it is not known what the optimum kernel function for nonlinear discriminant analysis is. Otsu derived the optimum nonlinear discriminant analysis (ONDA) by assuming the underlying probabilities similar with the Bayesian decision theory. Kurita derived discriminant kernels function (DKF) as the optimum kernel functions in terms of the discriminant criterion by investigating the optimum discriminant mapping constructed by the ONDA. The derived kernel function is defined by using the Bayesian posterior probabilities. We can define a family of DKFs by changing the estimation method of the Bayesian posterior probabilities. In this paper, we propose a novel discriminant kernel function based on L1-regularized regression, called L1 DKF. L1 DKF is given by using the Bayesian

posterior

probabilities estimated by L1 regression. Since L1 regression yields a sparse representation for given samples, we can naturally introduce the sparseness into the discriminant kernel function. To introduce the sparseness into LDA, we use L1 DKF as the kernel function of LDA. In experiments, we show sparseness and classification performance of L1 DKF.

Akinori Hidaka, Takio Kurita

Conditional Variance of Differences: A Robust Similarity Measure for Matching and Registration

This paper presents a new similarity measure, the

sum of conditional variance of differences

(SCVD), designed to be insensitive to highly non-linear intensity transformations such as the ones occurring in multi-modal image registration and tracking. It improves on another recently introduced statistical measure, the sum of conditional variances (SCV), which has been reported to outperform comparable information theoretic similarity measures such as mutual information (MI) and cross-cumulative residual entropy (CCRE). We also propose two additional extensions that further increase the robustness of SCV(D) by relaxing the quantisation process and making it symmetric. We demonstrate the benefits of SCVD and improvements on image matching and registration through experiments.

Atsuto Maki, Riccardo Gherardi

A Class Centric Feature and Classifier Ensemble Selection Approach for Music Genre Classification

Music genre classification has attracted a lot of research interest due to the rapid growth of digital music. Despite the availability of a vast number of audio features and classification techniques, genre classification still remains a challenging task. In this work we propose a class centric feature and classifier ensemble selection method which deviates from the conventional practice of employing a single, or an ensemble of classifiers trained with a selected set of audio features. We adopt a binary decomposition technique to divide the multiclass problem into a set of binary problems which are then treated in a class specific manner. This differs from the traditional techniques which operate on the naive assumption that a specific set of features and/or classifiers can perform equally well in identifying all the classes. Experimental results obtained on a popular genre dataset and a newly created dataset suggest significant improvements over traditional techniques.

Hasitha Bimsara Ariyaratne, Dengsheng Zhang, Guojun Lu

A Local Adaptation of the Histogram Radon Transform Descriptor: An Application to a Shoe Print Dataset

In this paper we propose a shape recognition approach applied to a dataset composed of 512 shoeprints where shapes are strongly occluded. We provide a local adaptation of the HRT (Histogram Radon Transform) descriptor. A shoeprint is decomposed into its connect components and describes locally by the local HRT. Then, following this description, we find the best local matching between the connected components and the similarity between two images is defined as mean of local similarity measures.

Makoto Hasegawa, Salvatore Tabbone

A Multiple Classifier System for Classification of Breast Lesions Using Dynamic and Morphological Features in DCE-MRI

In this paper we propose a Multiple Classifier System (MCS) for classifying breast lesions in Dynamic Contrast Enhanced-Magnetic Resonance Imaging (DCE-MRI). The proposed MCS combines the results of two classifiers trained with dynamic and morphological features respectively. Twenty-one malignant and seventeen benign breast lesions, histologically proven, were analyzed. Volumes of Interest (VOIs) have been automatically extracted via a segmentation procedure assessed in a previous study. The performance of the MCS have been compared with histological classification. Results indicated that with automatic segmented VOIs 90% of test-set lesions were correctly classified.

Roberta Fusco, Mario Sansone, Antonella Petrillo, Carlo Sansone

A Comparative Analysis of Forgery Detection Algorithms

The aim of this work is to make an objective comparison between different forgery techniques and present a tool that helps taking a more reliable decision about the integrity of a given image or part of it. The considered techniques, all recently proposed in the scientific community, follow different and complementary approaches so as to guarantee robustness with respect to tampering of different types and characteristics. Experiments have been conducted on a large set of images using an automatic copy-paste tampering generator. Early results point out significant differences about competing techniques, depending also on complexity and side information.

Davide Cozzolino, Giovanni Poggi, Carlo Sansone, Luisa Verdoliva

Low Training Strength High Capacity Classifiers for Accurate Ensembles Using Walsh Coefficients

If a binary decision is taken for each classifier in an ensemble, training patterns may be represented as binary vectors. For a two-class supervised learning problem this leads to a partially specified Boolean function that may be analysed in terms of spectral coefficients. In this paper it is shown that a vote which is weighted by the coefficients enables a fast ensemble classifier that achieves performance close to Bayes rate. Experimental evidence shows that effective classifier performance may be achieved with one epoch of training of an MLP using Levenberg-Marquardt with 64 hidden nodes.

Terry Windeatt, Cemre Zor

A Novel Shadow-Assistant Human Fall Detection Scheme Using a Cascade of SVM Classifiers

Visual recognition of human fall incidents in video clips has been an active research issue in recent years, However, most published methods cannot effectively differentiate between fall-down and fall-like incidents such as sitting and squatting. In this paper, we present a novel shadow-assistant method for detecting human fall. Normally, complex 3-D models are used to estimate the human height. However, to reduce the high computational cost, only the information of moving shadow is used for this context. Because the system is based on a combination of shadow-assistant height estimation, and a cascade of SVM classifiers, it can distinguish between fall-down and fall-like incidents with a high degree of accuracy from very short sequence of 1-10 frames. Our experimental results demonstrate that under bird’s-eye view camera setting, the proposed system still can achieve 100% detect rate and a low false alarm rate, while the detection rate of other fall detection schemes have been dropped dramatically.

Yie-Tarng Chen, You-Rong Lin, Wen-Hsien Fang

Analysis of Co-training Algorithm with Very Small Training Sets

Co-training is a well known semi-supervised learning algorithm, in which two classifiers are trained on two different views (feature sets): the initially small training set is iteratively updated with unlabelled samples classified with high confidence by one of the two classifiers. In this paper we address an issue that has been overlooked so far in the literature, namely, how co-training performance is affected by the size of the initial training set, as it decreases to the minimum value below which a given learning algorithm can not be applied anymore. In this paper we address this issue empirically, testing the algorithm on 24 real datasets artificially splitted in two views, using two different base classifiers. Our results show that a very small training set, even made up of one only labelled sample per class, does not adversely affect co-training performance.

Luca Didaci, Giorgio Fumera, Fabio Roli

Classification of High-Dimension PDFs Using the Hungarian Algorithm

The Hungarian algorithm can be used to calculate the earth mover’s distance, as a measure of the difference between two probability density functions, when the pdfs are described by sets of

points sampled from their distributions. However, information generated by the algorithm about precisely how the pdfs are different is not utilized. In this paper, a method is presented that incorporates this information into a ‘bag-of-words’ type method, in order to increase the robustness of a classification. This method is applied to an image classification problem, and is found to outperform several existing methods.

James S. Cope, Paolo Remagnino

Face Recognition Using Multilinear Manifold Analysis of Local Descriptors

In this paper, we propose to represent a face image as a local descriptor tensor and use a Multilinear Manifold Analysis (MMA) method for discriminant feature extraction, which is used for face recognition. The local descriptor tensor, which is a combination of the descriptor of local regions (K*K-pixel patch) in the image, can represent image more efficient than pixel-level intensity representation, and also than the popular Bag-Of-Feature (BOF) model, which approximately represents each local descriptor as a predefined visual word. Therefore it should be more effective in computational time than the BOF model. For extracting discriminant and compact features from the local descriptor tensor, we propose to use the proposed TMultilinear Manifold Analysis (MMA) algorithm, which has several benefits compared with conventional subspace learning methods such as PCA, ICA, LDA and so on: (1) a natural way of representing data without losing structure information, i.e., the information about the relative positions of pixels or regions; (2) a reduction in the small sample size problem which occurs in conventional supervised learning because the number of training samples is much less than the dimensionality of the feature space; (3) a neighborhood structure preserving in tensor feature space for face recognition and a good convergence property in training procedure. We validate our proposed algorithm on Benchmark database Yale and PIE, and experimental results show recognition rate with the proposed method can be greatly improved compared with conventional subspace analysis methods especially for small training sample number.

Xian-Hua Han, Yen-Wei Chen

A Genetic Inspired Optimization for ECOC

In this work, we propose a novel Genetic Inspired Error Correcting Output Codes (ECOC) Optimization, which looks for an efficient problem-dependent encoding of the multi-class task with high generalization performance. This optimization procedure is based on novel ECOC-Compliant crossover, mutation, and extension operators, which guide the optimization process to promising regions of the search space. The results on several public datasets show significant performance improvements as compared to state-of-the-art ECOC strategies.

Miguel Ángel Bautista, Sergio Escalera, Xavier Baró, Oriol Pujol

Backmatter

Titel: Structural, Syntactic, and Statistical Pattern Recognition
herausgegeben von: Georgy Gimel’farb
Edwin Hancock
Atsushi Imiya
Arjan Kuijper
Mineichi Kudo
Shinichiro Omachi
Terry Windeatt
Keiji Yamada
Verlag: Springer Berlin Heidelberg
Electronic ISBN: 978-3-642-34166-3
Print ISBN: 978-3-642-34165-6
DOI: https://doi.org/10.1007/978-3-642-34166-3