
2010 | Book

Structural, Syntactic, and Statistical Pattern Recognition

Joint IAPR International Workshop, SSPR&SPR 2010, Cesme, Izmir, Turkey, August 18-20, 2010. Proceedings

Edited by: Edwin R. Hancock, Richard C. Wilson, Terry Windeatt, Ilkay Ulusoy, Francisco Escolano

Publisher: Springer Berlin Heidelberg

Book series: Lecture Notes in Computer Science


About this book

This volume in the Springer Lecture Notes in Computer Science (LNCS) series contains the papers presented at the S+SSPR 2010 Workshops, the seventh occasion on which the SPR and SSPR workshops have been held jointly. S+SSPR 2010 was organized by TC1 and TC2, Technical Committees of the International Association for Pattern Recognition (IAPR), and held in Cesme, Izmir, a seaside resort on the Aegean coast of Turkey. The conference took place during August 18–20, 2010, only a few days before the 20th International Conference on Pattern Recognition (ICPR), which was held in Istanbul. The aim of the series of workshops is to create an international forum for the presentation of the latest results and the exchange of ideas between researchers in the fields of statistical and structural pattern recognition. SPR 2010 and SSPR 2010 received a total of 99 paper submissions from many different countries around the world, giving the event a truly international perspective, as has been the case for previous S+SSPR workshops. This volume contains 70 accepted papers, 39 for oral and 31 for poster presentation. In addition to parallel oral sessions for SPR and SSPR, there were two joint oral sessions of interest to both the SPR and SSPR communities. Furthermore, to enhance the workshop experience, there were two joint panel sessions on "Structural Learning" and "Clustering," in which short author presentations were followed by discussion. Another innovation this year was the filming of the proceedings by Videolectures.

Table of Contents

Frontmatter

Invited Talks

From Region Based Image Representation to Object Discovery and Recognition

This paper presents an overview of the work we have done over the last several years on object recognition in images from region-based image representation. The overview focuses on the following related problems: (1) discovery of a single 2D object category frequently occurring in a given image set; (2) learning a model of the discovered category in terms of its photometric, geometric, and structural properties; and (3) detection and segmentation of objects from the category in new images. Images in the given set are segmented, and then each image is represented by a region graph that captures hierarchy and neighbor relations among image regions. The region graphs are matched to extract the maximally matching subgraphs, which are interpreted as instances of the discovered category. A graph-union of the matching subgraphs is taken as a model of the category. Matching the category model to the region graph of a new image yields joint object detection and segmentation. The paper argues that using a hierarchy of image regions and their neighbor relations offers a number of advantages in solving (1)-(3) over the more commonly used point and edge features. Experimental results, also reviewed in this paper, support the above claims. Details of our methods, as well as comparisons with other methods, are omitted here and can be found in the indicated references.

Narendra Ahuja, Sinisa Todorovic
Learning on Manifolds

Mathematical formulation of certain natural phenomena exhibits group structure on topological spaces that resemble Euclidean space only at a small enough scale, which prevents the use of conventional inference methods that require global vector norms. More specifically, in computer vision such underlying notions emerge in differentiable parameter spaces. Here, two Riemannian manifolds, the set of affine transformations and the set of covariance matrices, are elaborated, and their favorable applications in distance computation, motion estimation, object detection and recognition problems are demonstrated after reviewing some of the fundamental preliminaries.

Fatih Porikli
Classification and Trees

Breiman, Friedman, Olshen and Stone recognized that tree classifiers would be very valuable to practicing statisticians. Their CART algorithm became very popular indeed. Designing tree-based classifiers, however, has its pitfalls. It is easy to make them too simple or too complicated, so that Bayes risk consistency is compromised. In this talk, we explore the relationship between the algorithmic complexity of tree-based methods and performance.

Luc Devroye
Structural Patterns in Complex Networks through Spectral Analysis

The study of some structural properties of networks is introduced from a graph-spectral perspective. First, the subgraph centrality of nodes is defined and used to classify essential proteins in a proteomic map. This index is then used to produce a method that allows the identification of superhomogeneous networks. At the same time, this method classifies non-homogeneous networks into three universal structural classes. We give examples of these classes from networks in different real-world scenarios. Finally, a communicability function is studied and shown to be an alternative for defining communities in complex networks. Using this approach a community is unambiguously defined, and an algorithm for its identification is proposed and exemplified on a real-world network.

Ernesto Estrada
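The subgraph centrality mentioned in the abstract has a simple spectral form: EE(i) = [exp(A)]_ii = sum_k phi_k(i)^2 e^{lambda_k}, where (lambda_k, phi_k) is the eigensystem of the adjacency matrix. A minimal NumPy sketch (illustrative only, not the paper's code):

```python
import numpy as np

def subgraph_centrality(A):
    """Subgraph centrality EE(i) = [exp(A)]_ii, computed from the
    eigendecomposition of the (symmetric) adjacency matrix A."""
    lam, phi = np.linalg.eigh(A)          # A = phi @ diag(lam) @ phi.T
    return (phi ** 2) @ np.exp(lam)       # EE(i) = sum_k phi_k(i)^2 * e^{lam_k}

# Path graph 1 - 2 - 3: the middle node takes part in more closed walks,
# so its subgraph centrality is the largest.
A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
ee = subgraph_centrality(A)
```

On this three-node path, the end nodes have equal centrality and the middle node the largest, reflecting its larger number of closed walks.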

Structural Descriptions

Graph Embedding Using an Edge-Based Wave Kernel

This paper describes a new approach for embedding graphs on pseudo-Riemannian manifolds based on the wave kernel. The wave kernel is the solution of the wave equation on the edges of a graph. Under the embedding, each edge becomes a geodesic on the manifold. The eigensystem of the wave-kernel is determined by the eigenvalues and the eigenfunctions of the normalized adjacency matrix and can be used to solve the edge-based wave equation. By factorising the Gram-matrix for the wave-kernel, we determine the embedding co-ordinates for nodes under the wave-kernel. We investigate the utility of this new embedding as a means of gauging the similarity of graphs. We experiment on sets of graphs representing the proximity of image features in different views of different objects. By applying multidimensional scaling to the similarity matrix we demonstrate that the proposed graph representation is capable of clustering different views of the same object together.

Hewayda ElGhawalby, Edwin R. Hancock
A Structured Learning Approach to Attributed Graph Embedding

In this paper, we describe the use of concepts from structural and statistical pattern recognition for recovering a mapping which can be viewed as an operator on the graph attribute-set. This mapping can be used to embed graphs into spaces where tasks such as categorisation and relational matching can be effected. We start from concepts in graph theory to introduce mappings as operators over graph spaces. This treatment leads to the recovery of a mapping based upon the graph attributes which is related to the edge-space of the graphs under study. As a result, this mapping is a linear operator over the attribute set which is associated with the graph topology. Here, we employ an optimisation approach whose cost function is related to the target function used in discrete Markov Random Field approaches. Thus, the proposed method provides a link between concepts in graph theory, statistical inference and linear operators. We illustrate the utility of the recovered embedding for shape matching and categorisation on the MPEG7 CE-Shape-1 dataset. We also compare our results to those yielded by alternatives.

Haifeng Zhao, Jun Zhou, Antonio Robles-Kelly

Machine Learning

Combining Elimination Rules in Tree-Based Nearest Neighbor Search Algorithms

A common activity in many pattern recognition tasks, image processing or clustering techniques involves searching a labeled data set for the nearest point to a given unlabelled sample. To reduce the computational overhead of the naive exhaustive search, a number of fast nearest neighbor search (NNS) algorithms have appeared in recent years. Depending on the structure used to store the training set (usually a tree), different strategies to speed up the search have been defined. In this paper, a new algorithm based on the combination of different pruning rules is proposed. An experimental evaluation and a comparison of its behavior with respect to other techniques have been performed, using both real and artificial data.

Eva Gómez-Ballester, Luisa Micó, Franck Thollard, Jose Oncina, Francisco Moreno-Seco
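The pruning rules combined by such algorithms are typically bounds derived from the triangle inequality. As an illustrative sketch only (a classic pivot-based elimination rule, not the authors' specific combination), the following eliminates any prototype whose lower bound |d(q, pivot) - d(p, pivot)| already reaches the best distance found, since that quantity lower-bounds d(q, p):

```python
import numpy as np

def nn_search_with_pruning(query, points):
    """1-NN search with a triangle-inequality elimination rule:
    precompute every point's distance to a fixed pivot; a point p is
    pruned when |d(q, pivot) - d(p, pivot)| >= best distance so far,
    because that quantity lower-bounds d(q, p)."""
    pivot = points[0]
    d_pivot = np.linalg.norm(points - pivot, axis=1)   # offline table
    dq = np.linalg.norm(query - pivot)
    best_i, best_d, pruned = -1, np.inf, 0
    # visiting points in increasing order of the lower bound tightens best_d quickly
    for i in np.argsort(np.abs(dq - d_pivot)):
        if abs(dq - d_pivot[i]) >= best_d:
            pruned += 1                   # eliminated without a distance computation
            continue
        d = np.linalg.norm(query - points[i])
        if d < best_d:
            best_i, best_d = i, d
    return best_i, best_d, pruned

rng = np.random.default_rng(0)
pts = rng.normal(size=(200, 2))
q = np.array([0.1, -0.2])
i, d, pruned = nn_search_with_pruning(q, pts)
```

The rule is exact: a pruned point can never be strictly closer than the current best, so the result matches exhaustive search while skipping distance computations.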
Localized Projection Learning

It is interesting to compare different criteria for kernel machines. In this paper we do the following: 1) to cope with the scaling problem of projection learning, we propose a dynamic localized projection learning using k nearest neighbors; 2) the localized method is compared with SVM from several viewpoints; and 3) approximate nearest neighbors are shown to be useful in such a localization. As a result, it is shown that SVM is superior to projection learning in many classification problems in its optimal setting, but that setting is not easy to find.

Kazuki Tsuji, Mineichi Kudo, Akira Tanaka
Entropy-Based Variational Scheme for Fast Bayes Learning of Gaussian Mixtures

In this paper, we propose a fast entropy-based variational scheme for learning Gaussian mixtures. The key element of the proposal is to exploit the incremental learning approach to perform model selection through efficient iteration over the Variational Bayes (VB) optimization step in a way that the number of splits is minimized. In order to minimize the number of splits, we select for splitting only the worst kernel in terms of its entropy. Recent Gaussian mixture learning proposals suggest the use of that mechanism when a bypass entropy estimator is available. Here we exploit the recently proposed Leonenko estimator. Our experimental results, both in 2D and in higher dimensions, show the effectiveness of the approach, which reduces the computational cost of state-of-the-art incremental component learners by an order of magnitude.

Antonio Peñalver, Francisco Escolano, Boyan Bonev

Structural Learning

Learning Graph Quantization

This contribution extends learning vector quantization to the domain of graphs. For this, we first identify graphs with points in some orbifold, then derive a generalized differentiable intrinsic metric, and finally extend the update rule of LVQ for generalized differentiable distance metrics. First experiments indicate that the proposed approach can perform comparable to state-of-the-art methods in structural pattern recognition.

Brijnesh J. Jain, S. Deepak Srinivasan, Alexander Tissen, Klaus Obermayer
High-Dimensional Spectral Feature Selection for 3D Object Recognition Based on Reeb Graphs

In this work we evaluate purely structural graph measures for 3D object classification. We extract spectral features from different Reeb graph representations and successfully deal with a multi-class problem. We use an information-theoretic filter for feature selection. We show experimentally that a small change in the order of selection has a significant impact on the classification performance and we study the impact of the precision of the selection criterion. A detailed analysis of the feature participation during the selection process helps us to draw conclusions about which spectral features are most important for the classification problem.

Boyan Bonev, Francisco Escolano, Daniela Giorgi, Silvia Biasotti
Dissimilarity-Based Multiple Instance Learning

In this paper, we propose to solve multiple instance learning problems using a dissimilarity representation of the objects. Once the dissimilarity space has been constructed, the problem is turned into a standard supervised learning problem that can be solved with a general-purpose supervised classifier. This approach is less restrictive than kernel-based approaches and therefore allows for the usage of a wider range of proximity measures. Two conceptually different types of dissimilarity measures are considered: one based on point set distance measures and one based on the earth mover's distance between distributions of within- and between-set point distances, thereby taking relations within and between sets into account. Experiments on five publicly available data sets show competitive performance in terms of classification accuracy compared to previously published results.

Lauge Sørensen, Marco Loog, David M. J. Tax, Wan-Jui Lee, Marleen de Bruijne, Robert P. W. Duin
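A minimal sketch of the dissimilarity-space idea, under simplifying assumptions: bags are embedded by their minimum point-set distance to the training bags (one of the two measure types the paper considers) and classified with a plain 1-NN rule rather than the classifiers evaluated in the paper. The synthetic data and every name here are illustrative:

```python
import numpy as np

def bag_dissimilarity(X, Y):
    """Minimum point-set distance between two bags of feature vectors
    (the paper also considers an earth mover's distance between
    within- and between-set distance distributions)."""
    D = np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=2)
    return D.min()

def to_dissimilarity_space(bags, prototypes):
    """Embed each bag as its vector of dissimilarities to the prototype bags."""
    return np.array([[bag_dissimilarity(b, p) for p in prototypes] for b in bags])

rng = np.random.default_rng(1)
# two classes of bags: class 0 clusters near the origin, class 1 near (5, 5)
train = [rng.normal(loc=c, size=(8, 2)) for c in ([0, 0],) * 5 + ([5, 5],) * 5]
labels = np.array([0] * 5 + [1] * 5)
test_bag = rng.normal(loc=[5, 5], size=(8, 2))

V = to_dissimilarity_space(train, train)          # training bags as vectors
v = to_dissimilarity_space([test_bag], train)[0]  # query bag as a vector
pred = labels[np.argmin(np.linalg.norm(V - v, axis=1))]  # 1-NN in dissimilarity space
```

Once bags live in the dissimilarity space, any standard vector-space classifier can replace the 1-NN step.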
A Game Theoretic Approach to Learning Shape Categories and Contextual Similarities

The search for a model for representing and evaluating the similarities between shapes in a perceptually coherent way is still an open issue. One reason for this is that our perception of similarities is strongly influenced by the underlying category structure. In this paper we aim at jointly learning the categories from examples and the similarity measures related to them. There is a chicken-and-egg dilemma here: class knowledge is required to determine perceived similarities, while the similarities are needed to extract class knowledge in an unsupervised way. The problem is addressed through a game-theoretic approach which allows us to compute 2D shape categories based on a skeletal representation. The approach provides us with both the cluster information needed to extract the categories, and the relevance information needed to compute the category model and, thus, the similarities. Experiments on a database of 1000 shapes showed that the approach outperforms other clustering approaches that do not make use of the underlying contextual information, and provides similarities comparable with a state-of-the-art label-propagation approach which, however, cannot extract categories.

Aykut Erdem, Andrea Torsello

Poster Session

A Comparison between Two Representatives of a Set of Graphs: Median vs. Barycenter Graph

In this paper we consider two existing methods to generate a representative of a given set of graphs that satisfy the following two conditions: on the one hand, that they are applicable to graphs with any kind of labels on nodes and edges, and on the other hand, that they can handle relatively large amounts of data. Namely, the approximate algorithms to compute the Median Graph via graph embedding, and a new method to compute the Barycenter Graph. Our contribution is to give a new algorithm for the barycenter computation and to compare it to the Median Graph. To compare these two representatives, we take into account algorithmic considerations and experimental results on the quality of the representative and its robustness, on several datasets.

Itziar Bardaji, Miquel Ferrer, Alberto Sanfeliu
Impact of Visual Information on Text and Content Based Image Retrieval

Nowadays, multimedia documents composed of text and images are increasingly used, thanks to the Internet and the increasing capacity of data storage. It is more and more important to be able to retrieve needles in this huge haystack. In this paper, we present a multimedia document model which combines textual and visual information. Using a bag-of-words approach, it represents a textual and visual document using a vector for each modality. Given a multimedia query, our model combines scores obtained for each modality and returns a list of relevant retrieved documents. This paper aims at studying the influence of the weight given to the visual information relative to the textual information. Experiments on the multimedia ImageCLEF collection show that results can be improved by learning this weight parameter.

Christophe Moulin, Christine Largeron, Mathias Géry
Automatic Traffic Monitoring from Satellite Images Using Artificial Immune System

Automatic and intelligent road traffic monitoring is a new research issue for high-resolution satellite imagery applications in transportation. One goal of this research is to monitor traffic jams on roads and to recognize traffic density quickly and accurately. This article presents a new approach for recognizing vehicles and roads in high-resolution satellite images of non-urban areas. For road recognition, we use feature extraction and image processing techniques such as the Hough transform, gradients, and thresholding operations, and we present an artificial immune approach to extract vehicle targets from high-resolution panchromatic satellite imagery. Results average about 94 percent, showing that the proposed procedure is suitably effective.

Mehrad Eslami, Karim Faez
Graduated Assignment Algorithm for Finding the Common Labelling of a Set of Graphs

In pattern recognition applications, it is useful to represent objects by attributed graphs considering their structural properties. Besides, some graph matching problems need a Common Labelling between the vertices of a set of graphs. Computing this Common Labelling is an NP-complete problem. State-of-the-art algorithms are composed of two steps: in the first, they compute all pairwise labellings among the graphs, and in the second, they combine this information to obtain a Common Labelling. The drawback of these methods is that global information is only considered in the second step. To solve this problem, by reducing the Common Labelling problem to the quadratic assignment problem, all graph nodes are labelled to a virtual structure, whereby the Common Labelling is generated using global information. We tested the algorithm on both real-world and synthetic data. We show that the algorithm offers better performance than a reference method with the same computational cost.

Albert Solé-Ribalta, Francesc Serratosa
Affinity Propagation for Class Exemplar Mining

This paper focusses on the problem of locating object class exemplars from a large corpus of images using affinity propagation. We use attributed relational graphs to represent groups of local invariant features together with their spatial arrangement. Rather than mining exemplars from the entire graph corpus, we prefer to cluster object specific exemplars. Firstly, we obtain an object specific cluster of graphs using a similarity propagation based graph clustering (SPGC) method. Here a SOM neural net based tree clustering method is used to incrementally cluster a large corpus of local invariant descriptors. The popular affinity propagation based clustering algorithm is then individually applied to each object specific cluster. Using this clustering method, we obtain object specific exemplars together with a high precision for the data associated with each exemplar. The strategy adopted is one of divide and conquer, and this greatly increases the efficiency of mining exemplars. Using the exemplars, we perform recognition using a majority voting strategy that is weighted by nearest neighbor similarity. Experiments are performed on over 80K images spanning ~500 objects, and demonstrate the performance in terms of efficiency, scalability and recognition.

Shengping Xia, Rui Song, Edwin R. Hancock
Guided Informative Image Partitioning

Image partitioning separates an image into multiple visually and semantically homogeneous regions, providing a summary of visual content. Knowing that human observers focus on interesting objects or regions when interpreting a scene, and envisioning the usefulness of this focus in many computer vision tasks, this paper develops a user-attention adaptive image partitioning approach. Given a set of pairs of oversegments labeled by a user as "should be merged" or "should not be merged", the proposed approach produces a fine partitioning in user-defined interesting areas, to retain interesting information, and a coarser partitioning in other regions to provide a parsimonious representation. To achieve this, a novel Markov Random Field (MRF) model is used to optimally infer the relationship ("merge" or "not merge") among oversegment pairs, by using the graph nodes to describe the relationship between pairs. By training an SVM classifier to provide the data term, a graph-cut algorithm is employed to infer the best MRF configuration. We discuss the difficulty in translating this configuration back to an image labelling, and develop a non-trivial post-processing to refine the configuration further. Experimental verification on benchmark data sets demonstrates the effectiveness of the proposed approach.

Nathan Brewer, Nianjun Liu, Lei Wang
Visual Alphabets on Different Levels of Abstraction for the Recognition of Deformable Objects

Recognition systems for complex and deformable objects must handle a variety of possible object appearances. In this paper, a compositional approach to this problem is studied which splits the set of possible appearances into easier sub-problems. To this end, a grammar is introduced that represents objects by a hierarchy of increasingly abstract visual alphabets. These alphabets store features, complex patterns and different views of objects. The geometrical constraints are optimised to the respective level of abstraction. The performance of the method is demonstrated on a cartoon data base with high intra-class variance.

Martin Stommel, Klaus-Dieter Kuhnert
Graph Embedding Based on Nodes Attributes Representatives and a Graph of Words Representation

Although graph embedding has recently been used to extend statistical pattern recognition techniques to the graph domain, some existing embeddings are computationally expensive, as they rely on classical graph-based operations. In this paper we present a new way to embed graphs into vector spaces: we first encapsulate the information stored in the original graph in another graph representation, by clustering the attributes of the graphs to be processed. This new representation makes the association of graphs to vectors an easy step, by simply arranging both node attributes and the adjacency matrix in the form of vectors. To test our method, we use two different databases of graphs whose node attributes are of different nature. A comparison with a reference method shows that this new embedding is better in terms of classification rates, while being much faster.

Jaume Gibert, Ernest Valveny
Extracting Plane Graphs from Images

In order to use structural techniques from graph-based pattern recognition, a first necessary step consists in extracting a graph in an automatic way from an image. We propose to extract plane graphs, because of the algorithmic properties these graphs have for isomorphism-related problems. We also consider the problem of extracting semantically well-founded graphs as a compression issue: we obtain simple graphs from which images similar to the initial image can be rebuilt. The technique we introduce consists in segmenting the original image, extracting interest pixels on the segmented image, then converting these pixels into pointels, which in turn can be related by region-based triangulation. We show the feasibility and interest of this approach in a series of experiments.

Émilie Samuel, Colin de la Higuera, Jean-Christophe Janodet
Indexing Tree and Subtree by Using a Structure Network

In pattern recognition, graphs are becoming more and more attractive as structural pattern representations, due to their richer representational power compared with feature vectors. However, there are many challenging problems in using graphs for pattern recognition. One is that it is difficult to investigate the relationships between graphs effectively, even between trees. In this paper, we focus on the structural relationship analysis of trees, such as tree and subtree isomorphism, maximum common subtree, minimum common supertree, etc., which arises in all kinds of tree recognition problems. For investigating the relationships between tree structures, we propose a structure network that represents the evolutional relationships among the structures of trees. Moreover, for the many tree isomorphism problems appearing in the application of the structure network, we propose a method that encodes the structure of a tree as a numerical sequence, and illustrate its efficiency by comparing it with a traditional matching method for the tree isomorphism problem.

Mingming Zhang, Shinichiro Omachi
Attributed Graph Matching for Image-Features Association Using SIFT Descriptors

Image-features matching based on SIFT descriptors is subject to the misplacement of certain matches due to the local nature of the SIFT representations. Some well-known outlier rejectors aim to remove those misplaced matches by imposing geometrical consistency. We present two graph matching approaches (one continuous and one discrete) aimed at the matching of SIFT features in a geometrically consistent way. The two main novelties are that both local and contextual coherence are imposed during the optimization process, and that a model of structural consistency is presented that accounts for the quality rather than the quantity of the surrounding matches. Experimental results show that our methods achieve good results under various types of noise.

Gerard Sanromà, René Alquézar, Francesc Serratosa
A Causal Extraction Scheme in Top-Down Pyramids for Large Images Segmentation

Applicative fields based on the analysis of large images must deal with two important problems. First, the size in memory of such images usually forbids a global image analysis, thereby inducing numerous problems for the design of a global image partition. Second, due to the high resolution of such images, global features only appear at low resolutions, and a single-resolution analysis may lose important information. The tiled top-down pyramidal model has been designed to solve these two major challenges. This model provides a hierarchical encoding of the image at single or multiple resolutions using a top-down construction scheme. Moreover, the use of tiles bounds the amount of memory required by the model while allowing global image analysis. The main limitation of this model is the splitting step used to build one additional partition from the level above. Indeed, this step requires temporarily refining the split region down to the pixel level, which entails high memory requirements and processing time. In this paper, we propose a new splitting step within the tiled top-down pyramidal framework which overcomes the previously mentioned limitations.

Romain Goffe, Guillaume Damiand, Luc Brun
Fast Population Game Dynamics for Dominant Sets and Other Quadratic Optimization Problems

We propose a fast population game dynamics, motivated by the analogy with infection and immunization processes within a population of "players," for finding dominant sets, a powerful graph-theoretical notion of a cluster. Each step of the proposed dynamics is shown to have a linear time/space complexity and we show that, under the assumption of symmetric affinities, the average population payoff is strictly increasing along any non-constant trajectory, thereby allowing us to prove that dominant sets are asymptotically stable (i.e., attractive) points for the proposed dynamics. The approach is general and can be applied to a large class of quadratic optimization problems arising in computer vision. Experimentally, the proposed dynamics is found to be orders of magnitude faster than and as accurate as standard algorithms.

Samuel Rota Bulò, Immanuel M. Bomze, Marcello Pelillo
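The equilibria sought by the proposed infection-immunization dynamics coincide with those of the classical replicator dynamics, which has quadratic per-step cost but is easy to sketch. The code below is that slower baseline, not the paper's accelerated dynamics; the toy affinity matrix is an assumption for illustration:

```python
import numpy as np

def replicator_dominant_set(A, iters=200):
    """Classical replicator dynamics on an affinity matrix A (zero diagonal);
    the support of the fixed point is a dominant set.  The paper's
    infection-immunization dynamics reaches the same equilibria with
    linear per-step cost."""
    x = np.full(A.shape[0], 1.0 / A.shape[0])   # start at the barycenter
    for _ in range(iters):
        x = x * (A @ x)
        x /= x.sum()        # since x summed to 1, this divides by x^T A x
    return x

# nodes 0-2 form a tight cluster; node 3 is only weakly attached
A = np.array([[0., 1., 1., .1],
              [1., 0., 1., .1],
              [1., 1., 0., .1],
              [.1, .1, .1, 0.]])
x = replicator_dominant_set(A)
```

The weakly attached node is driven to extinction, leaving the population weight spread uniformly over the tight cluster, i.e. the dominant set {0, 1, 2}.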
What Is the Complexity of a Network? The Heat Flow-Thermodynamic Depth Approach

In this paper we establish a formal link between network complexity in terms of Birkhoff-von Neumann decompositions and heat flow complexity (in terms of quantifying the heat flowing through the network at a given inverse temperature). We propose and prove characterization theorems and also two fluctuation laws for sets of networks. Such laws emerge from studying the implications of the Fluctuation Theorem for heat-flow characterization. Furthermore, we also define heat flow complexity in terms of thermodynamic depth, which results in a novel approach for characterizing networks and quantifying their complexity. In our experiments we characterize several protein-protein interaction (PPI) networks and then highlight their evolutive differences, in order to test the consistency of the proposed complexity measure in terms of the second law of thermodynamics.

Francisco Escolano, Miguel A. Lozano, Edwin R. Hancock, Daniela Giorgi
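The paper's full complexity measure involves Birkhoff-von Neumann decompositions and thermodynamic depth; a basic spectral ingredient that is easy to sketch is the heat kernel trace Tr(e^{-beta L}) = sum_k e^{-beta lambda_k}, which quantifies how much heat remains in the network at inverse temperature beta. A hedged NumPy sketch of that ingredient only:

```python
import numpy as np

def laplacian(A):
    """Combinatorial Laplacian L = D - A of an undirected graph."""
    return np.diag(A.sum(axis=1)) - A

def heat_kernel_trace(L, beta):
    """Tr(exp(-beta * L)) = sum_k exp(-beta * lambda_k) over the Laplacian
    spectrum -- a basic quantity behind heat-flow characterisations of
    networks (the paper's measure additionally involves thermodynamic depth)."""
    lam = np.linalg.eigvalsh(L)
    return np.exp(-beta * lam).sum()

# a triangle vs. a path on three nodes: heat diffuses faster on the
# better-connected triangle, so its trace decays faster with beta
tri  = laplacian(np.array([[0., 1., 1.], [1., 0., 1.], [1., 1., 0.]]))
path = laplacian(np.array([[0., 1., 0.], [1., 0., 1.], [0., 1., 0.]]))
```

At beta = 0 both traces equal the number of nodes; as beta grows, the faster decay on the triangle reflects its stronger connectivity.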
New Partially Labelled Tree Similarity Measure: A Case Study

Trees are a powerful data structure for representing data for which hierarchical relations can be defined. They have been applied in a number of fields like image analysis, natural language processing, protein structure, or music retrieval, to name a few. Procedures for comparing trees are very relevant in many tasks where tree representations are involved. The computation of these measures is usually a time-consuming task, and different authors have proposed algorithms that are able to compute them in a reasonable time, through approximated versions of the similarity measure. Other methods require that the trees be fully labelled for the distance to be computed. In this paper, a new measure is presented that is able to deal with trees labelled only at the leaves, and that runs in O(|T_A| × |T_B|) time. Experiments and comparative results are provided.

David Rizo, José M. Iñesta
Complete Search Space Exploration for SITG Inside Probability

Stochastic Inversion Transduction Grammars are a very powerful formalism in Machine Translation that allows a string pair to be parsed with efficient Dynamic Programming algorithms. The usual parsing algorithms that have been previously defined cannot explore the complete search space. In this work, we propose important modifications that consider the whole search space. We formally prove the correctness of the new algorithm. Experimental work shows important improvements in the probabilistic estimation of the models when using the new algorithm.

Guillem Gascó, Joan-Andreu Sánchez, José-Miguel Benedí
Commute-Time Convolution Kernels for Graph Clustering

Commute time has proved to be a powerful attribute for clustering and characterising graph structure, and it is easily computed from the Laplacian spectrum. Moreover, commute time is robust to deletions of random edges and to noisy edge weights. In this paper, we explore the idea of using a convolution kernel to compare the distributions of commute time over pairs of graphs. We commence by computing the commute time distances in graphs. We then use a Gaussian convolution kernel to compare distributions. We use kernel k-means for clustering and kernel PCA for illustration, using the COIL object recognition database.

Normawati A Rahman, Edwin R. Hancock
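The commute time referred to in the abstract follows directly from the Laplacian spectrum: CT(u, v) = vol(G) * sum over lambda_k > 0 of (phi_k(u) - phi_k(v))^2 / lambda_k. A small illustrative NumPy sketch of that computation (not the paper's code):

```python
import numpy as np

def commute_times(A):
    """Pairwise commute times from the Laplacian spectrum:
    CT(u, v) = vol(G) * sum_{k: lam_k > 0} (phi_k(u) - phi_k(v))^2 / lam_k."""
    L = np.diag(A.sum(axis=1)) - A
    lam, phi = np.linalg.eigh(L)
    vol = A.sum()                        # = 2 * number of edges
    nz = lam > 1e-9                      # drop the zero (constant) eigenvector
    diff = phi[:, None, :] - phi[None, :, :]           # phi_k(u) - phi_k(v)
    return vol * ((diff[:, :, nz] ** 2) / lam[nz]).sum(axis=2)

# path graph 1 - 2 - 3: on a tree, CT(u, v) = 2 * |E| * d(u, v)
A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
CT = commute_times(A)
```

On a tree the commute time reduces to 2|E| times the path distance, so the three-node path gives CT(1, 3) = 8 and CT(1, 2) = 4, a handy sanity check.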

Geometric Methods

Non-Euclidean Dissimilarities: Causes and Informativeness

In the process of designing pattern recognition systems one may choose a representation based on pairwise dissimilarities between objects. This is especially appealing when a set of discriminative features is difficult to find. Various classification systems have been studied for such a dissimilarity representation: the direct use of the nearest neighbor rule, the postulation of a dissimilarity space and an embedding to a virtual, underlying feature vector space.

It appears in several applications that the dissimilarity measures constructed by experts tend to have a non-Euclidean behavior. In this paper we first analyze the causes of such choices and then experimentally verify that the non-Euclidean property of the measure can be informative.

Robert P. W. Duin, Elżbieta Pękalska
Non-parametric Mixture Models for Clustering

Mixture models have been widely used for data clustering. However, commonly used mixture models are generally of a parametric form (e.g., mixtures of Gaussian distributions, or GMMs), which significantly limits their capacity to fit the diverse multidimensional data distributions encountered in practice. We propose a non-parametric mixture model (NMM) for data clustering in order to detect clusters generated from arbitrary unknown distributions, using non-parametric kernel density estimates. The proposed model is non-parametric since the generative distribution of each data point depends only on the rest of the data points and the chosen kernel. A leave-one-out likelihood maximization is performed to estimate the parameters of the model. The NMM approach, when applied to clustering high-dimensional text datasets, significantly outperforms state-of-the-art and classical approaches such as k-means, Gaussian Mixture Models, spectral clustering and linkage methods.

Pavan Kumar Mallapragada, Rong Jin, Anil Jain

Structural Methods for Vision

A Probabilistic Approach to Spectral Unmixing

In this paper, we present a statistical approach to spectral unmixing with unknown endmember spectra and unknown illuminant power spectrum. The method presented here is quite general in nature, being applicable to settings in which sub-pixel information is required. The method is formulated as a simultaneous process of illuminant power spectrum prediction and basis material reflectance decomposition via a statistical approach based upon deterministic annealing and the maximum entropy principle. As a result, the method presented here is related to soft clustering tasks with a strategy for avoiding local minima. Furthermore, the final endmembers depend on the similarity between pixel reflectance spectra. Hence, the method does not require a preset number of material clusters or spectral signatures as input. We show the utility of our method on trichromatic and hyperspectral imagery and compare our results to those yielded by alternatives elsewhere in the literature.

Cong Phuoc Huynh, Antonio Robles-Kelly
A Game-Theoretic Approach to the Enforcement of Global Consistency in Multi-view Feature Matching

In this paper we introduce a robust matching technique that allows a very accurate selection of corresponding feature points from multiple views. Robustness is achieved by enforcing global geometric consistency at an early stage of the matching process, without the need for ex-post verification through reprojection. Two forms of global consistency are proposed; in both cases they are reduced to pairwise compatibilities making use of the size and orientation information provided by common feature descriptors. A game-theoretic approach is then used to select a maximally consistent set of candidate matches, in which highly compatible matches are reinforced while incompatible correspondences are driven to extinction. The effectiveness of the approach in estimating camera parameters for bundle adjustment is assessed and compared with state-of-the-art techniques.

Emanuele Rodolà, Andrea Albarelli, Andrea Torsello
An Algorithm for Recovering Camouflage Errors on Moving People

In this paper we present a model-based algorithm that works as a post-processing phase for any foreground object detector. The model is suited to recovering camouflage errors that split the segmentation of an entity into small, unconnected parts. The model does not require training procedures, only information about the estimated size of the person, which is obtainable when an inverse perspective mapping procedure is used.

A quantitative evaluation of the effectiveness of the method, used after four well-known moving object detection algorithms, has been carried out. Performance figures are given on a variety of publicly available databases, selected among those presenting highly camouflaged objects in real scenes, covering both indoor and outdoor environments.

D. Conte, P. Foggia, G. Percannella, F. Tufano, M. Vento

Clustering

Semi-supervised Clustering Using Heterogeneous Dissimilarities

The performance of many clustering algorithms such as k-means depends strongly on the dissimilarity considered to evaluate the sample proximities. The choice of a good dissimilarity is a difficult task because each dissimilarity reflects different features of the data. Therefore, different dissimilarities should be integrated in order to reflect more accurately what is similar for the user and the problem at hand.

In many applications, user feedback or a priori knowledge about the problem provides pairs of similar and dissimilar examples. This side-information may be used to learn a distance metric and to improve the clustering results. In this paper, we address the problem of learning a linear combination of dissimilarities using side information in the form of equivalence constraints. The minimization of the error function is based on a quadratic optimization algorithm. A smoothing term is included that penalizes the complexity of the family of distances and avoids overfitting.

The experimental results suggest that the method proposed outperforms a standard metric learning algorithm and improves the classical k-means clustering based on a single dissimilarity.

Manuel Martín-Merino
On Consensus Clustering Validation

Work on clustering combination has shown that clustering combination methods typically outperform single runs of clustering algorithms. While there is much work reported in the literature on validating data partitions produced by the traditional clustering algorithms, little has been done in order to validate data partitions produced by clustering combination methods. We propose to assess the quality of a consensus partition using a pattern pairwise similarity induced from the set of data partitions that constitutes the clustering ensemble. A new validity index based on the likelihood of the data set given a data partition, and three modified versions of well-known clustering validity indices are proposed. The validity measures on the original, clustering ensemble, and similarity spaces are analysed and compared based on experimental results on several synthetic and real data sets.

João M. M. Duarte, Ana L. N. Fred, André Lourenço, F. Jorge F. Duarte
Pairwise Probabilistic Clustering Using Evidence Accumulation

In this paper we propose a new approach to consensus clustering, built upon the evidence accumulation framework. Our method takes the co-association matrix as its only input and produces as output a soft partition of the dataset, in which each object is probabilistically assigned to a cluster. Our method reduces the clustering problem to a polynomial optimization in the probability domain, which is attacked by means of the Baum-Eagon inequality. Experiments on both synthetic and real benchmark data assess the effectiveness of our approach.
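The co-association matrix that serves as the method's input is straightforward to build from a clustering ensemble: entry (i, j) is the fraction of ensemble partitions that place objects i and j in the same cluster. A minimal sketch (illustrative, not the authors' code):

```python
import numpy as np

def co_association(partitions):
    """Co-association matrix of a clustering ensemble.

    partitions: list of label arrays, one per ensemble member.
    Entry (i, j) is the fraction of partitions that assign i and j
    to the same cluster -- the evidence-accumulation similarity.
    """
    P = np.asarray(partitions)
    m, n = P.shape
    C = np.zeros((n, n))
    for labels in P:
        C += (labels[:, None] == labels[None, :]).astype(float)
    return C / m

# Three toy partitions of four objects.
parts = [np.array([0, 0, 1, 1]),
         np.array([0, 0, 0, 1]),
         np.array([1, 1, 0, 0])]
C = co_association(parts)   # e.g. C[0, 1] = 1.0, C[0, 2] = 1/3
```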

Samuel Rota Bulò, André Lourenço, Ana Fred, Marcello Pelillo
Exploring the Performance Limit of Cluster Ensemble Techniques

Cluster ensemble techniques are a means of boosting clustering performance. However, many cluster ensemble methods are faced with high computational complexity; indeed, the median partition methods are $\mathcal{NP}$-complete. While a variety of approximate approaches for suboptimal solutions have been proposed in the literature, the performance evaluation is typically done by means of ground truth. In contrast, this work explores the question of how well cluster ensemble methods perform in an absolute sense, without ground truth, i.e. how they compare to the (unknown) optimal solution. We present a study of applying and extending a lower bound as an attempt to answer this question. In particular, we demonstrate the tightness of the lower bound, which indicates that there is no more room for further improvement (for the particular data set at hand). The lower bound can thus be considered as a means of exploring the performance limit of cluster ensemble techniques.

Xiaoyi Jiang, Daniel Abdala
Contour Grouping by Clustering with Multi-feature Similarity Measure

Contour grouping is a key task in the computer vision domain. It extracts meaningful object information from low-level image features and provides the input for further processing. Many techniques have been proposed over the decades. As a useful data analysis method in machine learning, clustering is a natural way to perform the grouping task. However, due to many complicating factors in natural images, such as noise and clutter in the background, many clustering algorithms that use only a pairwise similarity measure are not robust enough and often fail to generate grouping results consistent with the visual objects perceived by humans. In this article, we show how grouping performance is improved by utilizing a multi-feature similarity under the information-based clustering framework, compared with other clustering methods using pairwise similarity.

Xue Bai, Siwei Luo, Qi Zou, Yibiao Zhao

Poster Session

A Psychophysical Evaluation of Texture Degradation Descriptors

Digitally delivering a realistic appearance of materials is one of the most difficult tasks of computer vision. An accurate representation of surface texture can be obtained by means of view- and illumination-dependent textures. However, this kind of appearance representation produces massive datasets, so their compression is inevitable. For optimal visual performance of compression methods, their parameters should be tuned to a specific material. We propose a set of statistical descriptors motivated by textural features, and psychophysically evaluate their performance on three subtle artificial degradations of texture appearance. We tested five types of descriptors on five different textures and a combination of thirteen surface shapes and two illuminations. We found that descriptors based on a two-dimensional causal auto-regressive model have the highest correlation with the psychophysical results, and so can be used for automatic detection of subtle changes in rendered textured surfaces in accordance with human vision.

Jiří Filip, Pavel Vácha, Michal Haindl, Patrick R. Green
Content-Based Tile Retrieval System

A content-based tile retrieval system based on an underlying multispectral Markov random field representation is introduced. Single tiles are represented by our proven textural features derived from especially efficient Markovian statistics, supplemented with Local Binary Patterns (LBP) features representing occasional tile inhomogeneities. The Markovian features are, moreover, invariant to illumination colour and robust to variations in illumination direction, so arbitrarily illuminated tiles do not negatively influence the retrieval result. The presented computer-aided tile consulting system retrieves tiles from recent tile production digital catalogues, so that the retrieved tiles have patterns and/or colours as similar as possible to a query tile. The system is verified on a large commercial tile database in a psychovisual experiment.

Pavel Vácha, Michal Haindl
Performance Improvement in Multiple-Model Speech Recognizer under Noisy Environments

The multiple-model speech recognizer has been shown to be quite successful in noisy speech recognition. However, its performance has usually been tested using general speech front-ends which do not incorporate any noise-adaptive algorithms. For an accurate evaluation of the effectiveness of the multiple-model framework in noisy speech recognition, we used state-of-the-art front-ends and compared its performance with the well-known multi-style training method. In addition, we improved the multiple-model speech recognizer by employing the N-best reference HMMs for interpolation and using multiple SNR levels for training each of the reference HMMs.

Jang-Hyuk Yoon, Yong-Joo Chung
On Feature Combination for Music Classification

We address the problem of combining different types of audio features for music classification. Several feature-level and decision-level combination methods have been studied, including kernel methods based on multiple kernel learning, decision-level fusion rules, and stacked generalization. Eight widely used audio features were examined in experiments on multi-feature-based music classification. Results on a benchmark data set demonstrate the effectiveness of using multiple types of features for music classification and identify the most effective combination method for improving classification performance.
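As one concrete instance of the decision-level fusion rules studied, the classical sum rule averages the posterior estimates produced by the per-feature classifiers and takes the arg-max class. A toy sketch (the feature names and probability values are invented for illustration):

```python
import numpy as np

def sum_rule(prob_list):
    """Decision-level fusion by the sum rule: average the per-feature
    classifiers' posterior estimates, then pick the arg-max class."""
    P = np.mean(np.stack(prob_list), axis=0)   # (n_samples, n_classes)
    return P.argmax(axis=1)

# Hypothetical posteriors from two per-feature classifiers,
# for 3 samples and 2 classes.
p_timbre = np.array([[0.90, 0.10], [0.40, 0.60], [0.20, 0.80]])
p_rhythm = np.array([[0.70, 0.30], [0.55, 0.45], [0.10, 0.90]])
labels = sum_rule([p_timbre, p_rhythm])   # fused class decisions
```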

Zhouyu Fu, Guojun Lu, Kai-Ming Ting, Dengsheng Zhang
Information Theoretical Kernels for Generative Embeddings Based on Hidden Markov Models

Many approaches to learning classifiers for structured objects (e.g., shapes) use generative models in a Bayesian framework. However, state-of-the-art classifiers for vectorial data (e.g., support vector machines) are learned discriminatively. A generative embedding is a mapping from the object space into a fixed-dimensional feature space, induced by a generative model which is usually learned from data. The fixed dimensionality of these feature spaces permits the use of state-of-the-art discriminative machines based on vectorial representations, thus bringing together the best of the discriminative and generative paradigms.

Using a generative embedding involves two steps: (i) defining and learning the generative model used to build the embedding; (ii) discriminatively learning a (possibly kernel) classifier on the adopted feature space. The literature on generative embeddings is essentially focused on step (i), usually adopting some standard off-the-shelf tool (e.g., an SVM with a linear or RBF kernel) for step (ii). In this paper, we follow a different route, combining several Hidden Markov Model-based generative embeddings (including the classical Fisher score) with the recently proposed non-extensive information theoretic kernels. We test this methodology on a 2D shape recognition task, showing that the proposed method is competitive with the state of the art.

André F. T. Martins, Manuele Bicego, Vittorio Murino, Pedro M. Q. Aguiar, Mário A. T. Figueiredo
Dynamic Linear Combination of Two-Class Classifiers

In two-class problems, the linear combination of the outputs (scores) of an ensemble of classifiers is widely used to attain high performance. In this paper we investigate techniques aimed at dynamically estimating the coefficients of the linear combination on a pattern-by-pattern basis. We show that such techniques provide better performance than static combination techniques, whose parameters are computed beforehand. The coefficients of the linear combination are dynamically computed according to the Wilcoxon-Mann-Whitney statistic. Reported results on a multi-modal biometric dataset show that the proposed dynamic mechanism attains very low error rates when high levels of precision are required.

Carlo Lobrano, Roberto Tronci, Giorgio Giacinto, Fabio Roli
Large-Scale Text to Image Retrieval Using a Bayesian K-Neighborhood Model

In this paper we introduce a new approach to the problem of image retrieval from text queries. We propose to estimate the word relevance of an image using a neighborhood-based estimator. This estimation is obtained by counting the number of word-relevant images among the K-neighborhood of the image; a Bayesian approach is adopted to define such a neighborhood. The local estimations of all the words that form a query are naively combined in order to score the images according to that query. The experiments show that the results are better and faster than state-of-the-art techniques. Special consideration is given to the computational behaviour and scalability of the proposed approach.

Roberto Paredes
Maximum a Posteriori Based Kernel Classifier Trained by Linear Programming

We propose a new approach to classification problems based on maximum a posteriori (MAP) estimation. We obtain the necessary and sufficient condition for a cost function to estimate the a posteriori probability, and this condition makes clear that the a posteriori probability cannot be estimated using linear programming. In this paper, a kernelized function whose classification result coincides with that of the MAP classifier is estimated instead. By relaxing the problem from estimating the a posteriori probability itself to estimating such a function, the choice of cost function becomes wider. We propose a new cost function of this kind that can be minimized by linear programming. We conducted binary classification experiments on 13 datasets from the UCI repository and compared the results with well-known methods; the proposed method outperforms the other methods on several datasets. We also explain the relation and similarity between the proposed method and the support vector machine (SVM). Furthermore, the proposed method has other advantages for classification: besides being solvable by linear programming, for which many excellent solvers exist, it has no regularization parameter such as the C in the SVM cost function, and its cost function is simple enough that various extensions can be considered in future work.

Nopriadi, Yukihiko Yamashita
Improvement of the Disc Harmonic Moments Descriptor by an Exponentially Decaying Distance Transform

The authors propose an improvement of a recent region-based shape descriptor inspired by the 3D spherical harmonics: the Disc Harmonic Moments Descriptor (DHMD). The binary image is weighted by an exponentially decaying distance transform (EDDT) before applying the disc harmonic transform (DHT), recently introduced as a good shape representation. The performance of the improved DHMD is compared to other recent methods from the same category. Set B of the MPEG-7 CE-1-Shape database is used for experimental validation, and precision-recall pairs are employed to benchmark the compared descriptors. The proposed approach appears to be more efficient and effective than its competitors.

Noureddine Ennahnahi, Mohammed Oumsis, Mohammed Meknassi
Feature Level Fusion of Face and Palmprint Biometrics

This paper presents a feature-level fusion of face and palmprint biometrics, using an improved K-medoids clustering algorithm and an isomorphic graph. The performance of the system has been verified with two distance metrics, namely K-NN and normalized correlation, on two multibiometric databases of face and palmprint images. The experimental results reveal that the feature-level fusion with the improved K-medoids partitioning algorithm exhibits robust performance and a high level of accuracy.

Dakshina Ranjan Kisku, Phalguni Gupta, Jamuna Kanta Sing
Scale and Rotation Invariant Detection of Singular Patterns in Vector Flow Fields

We present a method for detecting and describing features in vector flow fields. Our method models flow fields locally using a linear combination of complex monomials. These monomials form an orthogonal basis for analytic flows with respect to a correlation-based inner product. We investigate the invariance properties of the coefficients of the approximation polynomials under both rotation and scaling operators. We then propose a descriptor for local flow patterns and develop a method for comparing them invariantly with respect to rigid transformations. Additionally, we propose a SIFT-like detector that can automatically detect singular flow patterns at different scales and orientations. Promising detection results are obtained on different fluid flow data.

Wei Liu, Eraldo Ribeiro
Using K-NN SVMs for Performance Improvement and Comparison to K-Highest Lagrange Multipliers Selection

Support Vector Machines (SVMs) can perform very well on noise-free data sets and can usually achieve good classification accuracies when the data is noisy. However, because of overfitting, the accuracy decreases if the SVM is modeled improperly or if the data is excessively noisy or nonlinear. For an SVM, most misclassifications occur when the test data lies close to the decision boundary. In this paper we therefore investigate the support vectors found by the SVM and their effect on the decision when used with the Gaussian kernel. Based on this analysis, we propose a new technique to improve the performance of the SVM by creating smaller clusters along the decision boundary in the higher-dimensional space, reducing the overfitting that arises from model selection or noise. As an alternative SVM tuning method, we also propose using only the K highest Lagrange multipliers to summarize the decision boundary, instead of all the support vectors, and compare the performances. Our test results show that the number of support vectors can be decreased further by using only a fraction of the support vectors found at the training step as a post-processing method.

Sedat Ozer, Chi Hau Chen, Imam Samil Yetik
Automatic Speech Segmentation Based on Acoustical Clustering

In this paper, we present an automatic speech segmentation system based on acoustical clustering plus dynamic time warping. Our system operates in three stages: the first obtains a coarse segmentation as a starting point for the second, which fixes phoneme boundaries in an iterative process of progressive refinement; the third makes a finer adjustment by considering some acoustic parameters estimated at a higher subsampling rate around the boundary to be adjusted. No manually segmented utterances are used in any stage.

The results presented here demonstrate a good learning capability of the system, which uses only the phonetic transcription of each utterance. Our approach obtains results similar to those reported by previous related works on the TIMIT database.

Jon A. Gómez, Emilio Sanchis, María J. Castro-Bleda
An Efficient Iris and Eye Corners Extraction Method

Eye features are among the most important clues for many computer vision applications. In this paper, an efficient method to automatically extract eye features is presented. The extraction relies largely on common knowledge about face and eye structure. Assuming frontal face images, coarse eye regions are first extracted by removing skin pixels in the upper part of the face. Then, the iris circle position and radius are detected using the Hough transform in a coarse-to-fine fashion. In the final step, edges created by the upper and lower eyelids are detected, and polynomials are fitted to those edges so that their intersection points are labeled as eye corners. The algorithm is evaluated on the Bosphorus database, and the obtained results demonstrate that it can locate eye features very accurately. The strength of the proposed method stems from its reproducibility, due to the utilization of simple and efficient image processing methods, while achieving remarkable results without any need for training.

Nesli Erdogmus, Jean-Luc Dugelay

Dissimilarity-Based Methods

An Empirical Comparison of Kernel-Based and Dissimilarity-Based Feature Spaces

The aim of this paper is to find an answer to the question: what is the difference between dissimilarity-based classifications (DBCs) and other kernel-based classifications (KBCs)? In DBCs [11], classifiers are defined among classes; they are not based on the feature measurements of individual objects, but rather on a suitable dissimilarity measure among them. In KBCs [15], on the other hand, classifiers are designed in a high-dimensional feature space transformed from the original input feature space through kernels, such as a Mercer kernel. Thus, the difference between the two approaches can be summarized as follows: the distance kernel of DBCs represents the discriminative information in a relative manner, i.e. through pairwise dissimilarity relations between two objects, while the mapping kernel of KBCs represents the discriminative information uniformly, in a fixed way for all objects. In this paper, we report on an empirical evaluation of classifiers built in the two different representation spaces: the dissimilarity space and the kernel space. Our experimental results, obtained with well-known benchmark databases, demonstrate that when the kernel parameters have not been appropriately chosen, DBCs always achieve better results than KBCs in terms of classification accuracy.
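The dissimilarity-space construction contrasted here with kernel spaces can be sketched in a few lines: each object is represented by its vector of dissimilarities to a fixed representation set, after which any vectorial classifier applies. A toy illustration with Euclidean distances and a 1-NN rule (not the authors' experimental setup; data and prototypes are invented):

```python
import numpy as np

def dissimilarity_space(X, prototypes):
    """Map objects to the dissimilarity space D(x, R): each object
    becomes its vector of Euclidean distances to a fixed
    representation set R (the prototypes)."""
    diff = X[:, None, :] - prototypes[None, :, :]
    return np.sqrt((diff ** 2).sum(-1))

# Toy two-class data; the first point of each class serves as prototype.
X = np.array([[0.0, 0.0], [0.2, 0.1], [3.0, 3.0], [2.9, 3.2]])
y = np.array([0, 0, 1, 1])
R = X[[0, 2]]                          # representation set
D = dissimilarity_space(X, R)          # shape (4, 2): training set in diss. space

# Classify a query point by 1-NN *in the dissimilarity space*.
q = dissimilarity_space(np.array([[0.1, 0.0]]), R)
pred = y[np.argmin(((D - q) ** 2).sum(-1))]
```

Any classifier that accepts fixed-length vectors (SVM, Fisher discriminant, etc.) could replace the 1-NN step.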

Sang-Woon Kim, Robert P. W. Duin
The Dissimilarity Representation as a Tool for Three-Way Data Classification: A 2D Measure

The dissimilarity representation has demonstrated advantages in the solution of classification problems. Meanwhile, the representation of objects by multi-dimensional arrays is necessary in many research areas. However, the development of proper classification tools that take the multi-way structure into account is incipient. This paper introduces the use of the dissimilarity representation as a tool for classifying three-way data, as dissimilarities allow the representation of multi-dimensional objects in a natural way. As an example, the classification of three-way seismic volcanic data is used. A comparison is made between dissimilarity measures used in different representations of the three-way data. 2D dissimilarity measures for three-way data can be useful.

Diana Porro-Muñoz, Robert P. W. Duin, Mauricio Orozco-Alzate, Isneri Talavera, John Makario Londoño-Bonilla
Regularising the Ricci Flow Embedding

This paper concerns the analysis of patterns that are specified in terms of non-Euclidean dissimilarity or proximity rather than ordinal values. In prior work we reported a means of correcting or rectifying the similarities so that the non-Euclidean artifacts are minimized. This is achieved by representing the data using a graph and evolving the manifold embedding of the graph using Ricci flow. Although the method provides encouraging results, it can prove unstable. In this paper we explore how this problem can be overcome using a graph regularisation technique. Specifically, by regularising the curvature of the manifold on which the graph is embedded, we can improve both the stability and the performance of the method. We demonstrate the utility of our method on the standard “Chicken pieces” dataset and show that we can transform the non-Euclidean distances into Euclidean ones.

Weiping Xu, Edwin R. Hancock, Richard C. Wilson
Spherical Embedding and Classification

Most problems in pattern recognition can be posed in terms of the dissimilarities between the objects of interest. A vector-space representation of the objects can be obtained by embedding them as points in Euclidean space. However, many dissimilarities are non-Euclidean and cannot be represented accurately in Euclidean space, which can lead to a loss of information and poor performance. In this paper, we approach this problem by embedding the points in a non-Euclidean curved space, the hypersphere. This is a metric but non-Euclidean space which allows us to define a geometry and therefore construct geometric classifiers. We develop an optimisation-based procedure for embedding objects on hyperspherical manifolds from a given set of dissimilarities. We use the Lie group representation of the hypersphere and its associated Lie algebra to define the exponential map between the manifold and its local tangent space, and can then solve the optimisation problem locally in Euclidean space. This process is efficient enough to allow us to embed large datasets. We also define the nearest-mean classifier on the manifold and give results for the embedding accuracy, the nearest-mean classifier and the nearest-neighbour classifier on a variety of indefinite datasets.
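The exponential map between the hypersphere and its local tangent space, on which the optimisation relies, has a simple closed form, as does its inverse (log) map. A minimal numpy sketch for the unit sphere (function names are illustrative; v is assumed tangent at mu, i.e. v·mu = 0):

```python
import numpy as np

def sphere_exp(mu, v):
    """Exponential map on the unit hypersphere: send the tangent
    vector v at base point mu (with v . mu = 0) to the manifold."""
    nv = np.linalg.norm(v)
    if nv < 1e-12:
        return mu.copy()
    return np.cos(nv) * mu + np.sin(nv) * (v / nv)

def sphere_log(mu, x):
    """Inverse (log) map: the tangent vector at mu pointing toward x,
    with length equal to the geodesic (great-circle) distance."""
    c = np.clip(np.dot(mu, x), -1.0, 1.0)
    theta = np.arccos(c)
    if theta < 1e-12:
        return np.zeros_like(mu)
    u = x - c * mu                     # component of x tangent at mu
    return theta * u / np.linalg.norm(u)

mu = np.array([0.0, 0.0, 1.0])         # north pole
v = np.array([np.pi / 2, 0.0, 0.0])    # tangent vector of length pi/2
x = sphere_exp(mu, v)                  # quarter great circle -> the equator
```

With these two maps, gradient steps taken in the (Euclidean) tangent space can be mapped back to the manifold, which is the essence of solving the embedding problem locally.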

Richard C. Wilson, Edwin R. Hancock

Language

Language Detection and Tracking in Multilingual Documents Using Weak Estimators

This paper deals with the extremely complicated problem of language detection and tracking in real-life electronic applications (for example, Word-of-Mouth (WoM)), where various segments of the text are written in different languages. The difficulties in solving the problem are many-fold. First of all, the analyst has no knowledge of when one language stops and the next starts. Further, the features one uses for any one language (for example, the n-grams) will not be valid for recognizing another. Finally, and most importantly, in most real-life applications, such as WoM, the fragments of text available before the switching are so small that any meaningful classification using traditional estimation methods becomes almost meaningless. Earlier, the authors of [10] recommended that for a variety of problems, the use of strong estimators (i.e., estimators that converge with probability 1) is sub-optimal. In this vein, we propose to solve the current problem using novel estimators that are pertinent to non-stationary environments. The classification results, which involve as many as 8 languages, demonstrate that our proposed methodology is both powerful and efficient.

Aleksander Stensby, B. John Oommen, Ole-Christoffer Granmo
Similarity Word-Sequence Kernels for Sentence Clustering

In this paper, we present a novel clustering approach based on the use of kernels as similarity functions and the C-means algorithm. Several word-sequence kernels are defined and extended to verify the properties of similarity functions. Afterwards, these monolingual word-sequence kernels are extended to bilingual word-sequence kernels and applied to the task of monolingual and bilingual sentence clustering. The motivation of this proposal is to group similar sentences into clusters so that specialised models can be trained for each cluster, with the purpose of reducing both the size and the complexity of the initial task. We provide empirical evidence that the use of bilingual kernels can lead to better clusters, in terms of intra-cluster perplexity.

Jesús Andrés-Ferrer, Germán Sanchis-Trilles, Francisco Casacuberta
Bayesian Adaptation for Statistical Machine Translation

In many pattern recognition problems, learning from training samples is a process that requires large amounts of training data and high computational effort. Sometimes only limited training data and/or limited computational resources are available, but a previous system, trained for a closely related task with enough training material, is also available. This scenario is very frequent in statistical machine translation, where adaptation can be a solution to the problem. In this paper, we present an adaptation technique for (state-of-the-art) log-linear modelling based on the well-known Bayesian learning paradigm. This technique has been applied to statistical machine translation and can easily be extended to other pattern recognition areas in which log-linear models are used. We show empirical results in which a small amount of adaptation data is able to improve both the non-adapted system and a system that optimises the log-linear weights only on the adaptation set.

Germán Sanchis-Trilles, Francisco Casacuberta
A Generative Score Space for Statistical Dialog Characterization in Social Signalling

The analysis of human conversations from a social signalling perspective has recently attracted the joint attention of pattern recognition and psychology researchers. In particular, dialog classification is an appealing recent application whose aim is to go beyond the meaning of the spoken words, focusing instead on the way the sentences are pronounced, by capturing natural (or hidden) characteristics such as the mood of the conversation. An effective strategy for this problem is to encode the turn-taking dynamics in a generative model whose structure is composed of conditional dependencies among first-order Markov processes. In this paper, we follow this strategy, investigating how to boost the classification performance of this model, and of the related higher-order Markov extensions, through the definition of a novel generative score space. Generative score spaces are employed to improve generative classification in a discriminative way, while also allowing a deeper understanding of the processed data through the use of standard pattern recognition strategies. Experiments on real data confirm the validity of our intuition.

Anna Pesarin, Paolo Calanca, Vittorio Murino, Marco Cristani

Multiple Classifiers

A Modular Approach to Training Cascades of Boosted Ensembles

Building on the ideas of Viola-Jones [1] we present a framework for training cascades of boosted ensembles (CoBE) which introduces further modularity and tractability to the training process. It addresses the challenges faced by CoBE frameworks such as protracted runtimes, slow layer convergences and classifier optimization. The framework possesses the ability to bootstrap positive samples and may in turn be extended into the domain of incremental learning. This paper aims to address our framework’s susceptibility to overfitting with possible solutions. Experiments are conducted on face detectors using the bootstrapping of large positive datasets and their accuracy, with respect to overfitting, is examined.

Teo Susnjak, Andre L. Barczak, Ken A. Hawick
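The cascade structure the abstract builds on can be sketched in a few lines: a chain of stage classifiers, each with its own threshold, where a window is rejected as soon as any stage fails. The stage functions below are toy stand-ins, not the paper's trained boosted ensembles.

```python
# Minimal sketch of attentional-cascade evaluation (Viola-Jones style).
# The stages here are hypothetical placeholders, not trained ensembles.

def cascade_classify(x, stages):
    """Pass x through each (score_fn, threshold) stage; reject early."""
    for score_fn, threshold in stages:
        if score_fn(x) < threshold:
            return False          # early rejection: cheap stages prune first
    return True                   # survived every stage -> positive

# Toy stages: increasingly strict checks on a scalar "window response".
stages = [
    (lambda x: x, 0.2),           # cheap, permissive stage
    (lambda x: x, 0.5),           # stricter stage
    (lambda x: x, 0.8),           # final, most selective stage
]

print(cascade_classify(0.9, stages))  # True
print(cascade_classify(0.6, stages))  # False (fails the last stage)
print(cascade_classify(0.1, stages))  # False (rejected by the first stage)
```

The early-exit structure is what makes cascades fast on the overwhelmingly negative windows of a face-detection scan.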
A Linear Combination of Classifiers via Rank Margin Maximization

The method we present builds a weighted linear combination of already trained dichotomizers, where the weights are chosen to maximize the minimum rank margin of the resulting ranking system. This is particularly suited to real applications where key parameters such as costs and priors are difficult to determine exactly. In such cases ranking is needed rather than classification. A ranker can be seen as a more basic system than a classifier, since it orders the samples according to the value the classifier assigns to each of them. Experiments on popular benchmarks, along with a comparison with other typical rankers, show how effective the approach can be.

Claudio Marrocco, Paolo Simeone, Francesco Tortorella
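The objective the abstract describes, the minimum rank margin of a weighted combination, can be written down directly: over all positive/negative pairs, it is the smallest score difference produced by the combined ranker. The sketch below only evaluates this quantity for given weights; the paper's contribution is how to maximize it.

```python
# Hypothetical illustration: minimum rank margin of a weighted linear
# combination of dichotomizer scores (smallest f(pos) - f(neg) over pairs).

def min_rank_margin(weights, pos_scores, neg_scores):
    """pos_scores/neg_scores: lists of per-dichotomizer score tuples."""
    combine = lambda s: sum(w * v for w, v in zip(weights, s))
    return min(combine(p) - combine(n)
               for p in pos_scores for n in neg_scores)

pos = [(0.9, 0.7), (0.8, 0.9)]   # scores of two dichotomizers on positives
neg = [(0.3, 0.4), (0.5, 0.2)]   # ... and on negatives

print(min_rank_margin((0.5, 0.5), pos, neg))  # equal weights
print(min_rank_margin((0.9, 0.1), pos, neg))  # weights change the margin
```

A positive minimum rank margin means the combination ranks every positive above every negative.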
Combination of Dichotomizers for Maximizing the Partial Area under the ROC Curve

In recent years, classifier combination has attracted great interest in the pattern recognition community as a method to improve classification performance. Most combination rules are based on maximizing accuracy and, only recently, the Area under the ROC Curve (AUC) has been proposed as an alternative measure. However, several applications focus only on particular regions of the ROC curve, i.e., those most relevant for the problem. In these cases, looking at a partial section of the AUC is the most suitable approach. In this paper we propose a new algorithm able to maximize only a part of the AUC by means of a linear combination of dichotomizers. Moreover, we empirically show that algorithms that maximize the AUC do not maximize the partial AUC, i.e., the two kinds of maximization are independent.

Maria Teresa Ricamato, Francesco Tortorella
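The quantity being maximized can be made concrete with the empirical pairwise (Wilcoxon) form of the partial AUC: restrict the pairwise comparisons to the negatives falling inside the false-positive range of interest. This is a hedged sketch of the measure, not of the authors' maximization algorithm.

```python
# Empirical partial AUC over the false-positive range [0, t]:
# pairwise comparisons against only the highest-scoring negatives.

def partial_auc(pos, neg, t):
    """pAUC on FPR in [0, t]; returns a value in [0, t]."""
    k = int(t * len(neg))                      # negatives inside the FPR range
    hardest = sorted(neg, reverse=True)[:k]    # highest-scoring negatives
    pairs = sum(p > n for p in pos for n in hardest)
    return pairs / (len(pos) * len(neg))

pos = [0.9, 0.8, 0.7, 0.6]
neg = [0.75, 0.5, 0.4, 0.3]

print(partial_auc(pos, neg, 1.0))   # full AUC
print(partial_auc(pos, neg, 0.25))  # only the hardest quarter of negatives
```

Note how a classifier with a good full AUC can still score poorly on the low-FPR region, which is exactly why the two maximizations come apart.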

Graphs

Ihara Coefficients: A Flexible Tool for Higher Order Learning

The aim of this paper is to seek a compact characterization of irregular unweighted hypergraphs for the purposes of clustering. To this end, we propose a novel hypergraph characterization method using the Ihara coefficients, i.e., the characteristic polynomial coefficients extracted from the Ihara zeta function. We investigate the flexibility of the Ihara coefficients for learning relational structures with different relational orders. Furthermore, we introduce an efficient method for computing the coefficients. Our representation for hypergraphs takes into account not only the vertex connections but also the hyperedge cardinalities, and can thus distinguish different relational orders, a distinction that is prone to ambiguity in the hypergraph Laplacian. In experiments we demonstrate the effectiveness of the proposed characterization for clustering irregular unweighted hypergraphs and its advantages over the spectral characterization of the hypergraph Laplacian.

Peng Ren, Tatjana Aleksić, Richard C. Wilson, Edwin R. Hancock
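For ordinary graphs, the Ihara coefficients can be obtained as the coefficients of det(I - uT), where T is the Perron-Frobenius operator on oriented edges (directed-edge adjacency without backtracking). The sketch below illustrates this graph case only, as a toy check on the triangle; the paper's efficient computation for hypergraphs is not reproduced here.

```python
# Hedged sketch: Ihara coefficients of a simple graph via the
# Perron-Frobenius operator T on oriented (directed) edges.
import numpy as np
from itertools import product

def ihara_coefficients(adj):
    """Coefficients of det(I - u T); entry k is the coefficient of u^k."""
    n = len(adj)
    edges = [(i, j) for i, j in product(range(n), repeat=2) if adj[i][j]]
    m = len(edges)
    T = np.zeros((m, m))
    for a, (i, j) in enumerate(edges):
        for b, (k, l) in enumerate(edges):
            if j == k and l != i:          # consecutive, non-backtracking
                T[a, b] = 1.0
    # det(I - u T) = u^m * charpoly(T)(1/u), so np.poly(T) already lists
    # the coefficients of u^0, u^1, ..., u^m in order.
    return np.poly(T)

triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
print(np.round(ihara_coefficients(triangle), 2))
# for the triangle, det(I - u T) = (1 - u^3)^2 = 1 - 2u^3 + u^6
```

The two cube-term coefficients reflect the two oriented traversals of the triangle's single prime cycle.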
A New Spectral Bound on the Clique Number of Graphs

Many computer vision and pattern recognition problems are intimately related to the maximum clique problem. Due to the intractability of this problem, besides the development of heuristics, one research direction consists in finding good bounds on the clique number of graphs. This paper introduces a new spectral upper bound on the clique number, obtained by exploiting an invariance of the continuous characterization of the clique number introduced by Motzkin and Straus. Experimental results on random graphs show the superiority of our bound over standard bounds from the literature.

Samuel Rota Bulò, Marcello Pelillo
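The Motzkin-Straus characterization underlying the bound states that the maximum of x^T A x over the probability simplex equals 1 - 1/omega(G), where omega is the clique number. A quick numeric illustration, using plain replicator dynamics (a standard heuristic for this program, not the paper's bound):

```python
# Numeric check of the Motzkin-Straus characterization:
# max_{x in simplex} x^T A x = 1 - 1/omega(G).
import numpy as np

def motzkin_straus_clique(A, iters=500):
    x = np.full(len(A), 1.0 / len(A))          # start at the barycenter
    for _ in range(iters):
        Ax = A @ x
        x = x * Ax / (x @ Ax)                  # replicator update
    return 1.0 / (1.0 - x @ A @ x)             # invert 1 - 1/omega

# Triangle {0,1,2} with a pendant vertex 3 attached to vertex 2.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], float)
print(round(motzkin_straus_clique(A)))  # clique number 3
```

Replicator dynamics only guarantees a local maximum in general, which is precisely why provable upper bounds of the kind the paper derives are valuable.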
Large Sample Statistics in the Domain of Graphs

One challenge in bridging the gap between structural and statistical pattern recognition consists in studying combinatorial structures like graphs using probabilistic methods. This contribution presents the structural counterparts of the first and second fundamental theorem in probability, (1) the law of large numbers and (2) the central limit theorem. In addition, we derive characterizations and uniqueness conditions for the mean of graphs. As a special case, we investigate the weighted mean of two graphs. The proposed results establish a sound statistical foundation for unsupervised structural pattern recognition methods.

Brijnesh J. Jain, Klaus Obermayer

Statistical Pattern Recognition

Analysis of the Multi-Dimensional Scale Saliency Algorithm and Its Application to Texture Categorization

A new approach to multi-dimensional Scale Saliency (MDSS) was recently introduced. In this approach, the Scale Saliency algorithm of Kadir and Brady is extended to the multi-dimensional domain. The MDSS algorithm is based on alternative entropy and divergence estimation methods whose complexity does not increase exponentially with data dimensionality. However, MDSS has not yet been applied to any practical problem. In this paper we apply the MDSS algorithm to the texture categorization problem, and we provide further experiments to assess the suitability of different estimators to the algorithm. We also propose a new divergence measure based on the k-d partition algorithm.

Pablo Suau, Francisco Escolano
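One family of entropy estimators with the property the abstract mentions (cost that does not explode with dimensionality) is the k-nearest-neighbour Kozachenko-Leonenko estimator; the sketch below illustrates that family, though the paper's estimators differ in detail. The digamma helper via a finite difference of lgamma is an implementation shortcut, not part of the estimator.

```python
# Hedged sketch of a k-NN (Kozachenko-Leonenko style) entropy estimator.
import math, random

def digamma(x, h=1e-5):
    # crude but adequate: numerical derivative of log-gamma
    return (math.lgamma(x + h) - math.lgamma(x - h)) / (2 * h)

def knn_entropy(points, k=3):
    n, d = len(points), len(points[0])
    # log volume of the d-dimensional unit ball
    log_cd = (d / 2) * math.log(math.pi) - math.lgamma(d / 2 + 1)
    s = 0.0
    for p in points:
        dists = sorted(math.dist(p, q) for q in points if q is not p)
        s += math.log(dists[k - 1])              # distance to k-th neighbour
    return digamma(n) - digamma(k) + log_cd + (d / n) * s

random.seed(0)
sample = [(random.random(), random.random()) for _ in range(1000)]
print(knn_entropy(sample))   # uniform on the unit square: true entropy is 0
```

The brute-force O(n^2) neighbour search here would be replaced by a k-d tree in any real use, which is where the favourable scaling comes from.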
Interactive Image Retrieval Using Smoothed Nearest Neighbor Estimates

Relevance feedback has been adopted by most recent Content-Based Image Retrieval systems to reduce the semantic gap between the subjective similarity among images and the similarity measures computed in a given feature space. Distance-based relevance feedback using nearest neighbors has recently been presented as a good tradeoff between simplicity and performance. In this paper, we analyse some shortcomings of this technique and propose alternatives that help improve the efficiency of the method in terms of the retrieval precision achieved. The resulting method has been evaluated on several repositories that use different feature sets. The results have been compared to those obtained by the nearest neighbor approach in its standard form, suggesting a better performance.

Miguel Arevalillo-Herraez, Francesc J. Ferri
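The standard-form baseline the paper compares against scores each candidate image by its distances to the nearest relevant and nearest non-relevant feedback examples. A minimal sketch of that score (the paper's smoothed alternatives are not reproduced here):

```python
# Distance-based nearest-neighbour relevance score, standard form:
# images closer to relevant than to non-relevant examples score higher.
import math

def nn_relevance(x, relevant, non_relevant):
    dR = min(math.dist(x, r) for r in relevant)      # nearest relevant
    dN = min(math.dist(x, n) for n in non_relevant)  # nearest non-relevant
    return dN / (dR + dN)                            # in (0, 1), higher = better

relevant = [(0.0, 0.0), (1.0, 0.0)]
non_relevant = [(5.0, 5.0)]
print(nn_relevance((0.5, 0.1), relevant, non_relevant))  # near the relevant set
print(nn_relevance((4.5, 4.5), relevant, non_relevant))  # near the non-relevant set
```

Each feedback round re-ranks the repository by this score, so its sensitivity to single outlier examples is one of the shortcomings a smoothed estimate addresses.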
Kernel Fusion of Multiple Histogram Descriptors for Robust Face Recognition

A multiple kernel fusion method combining two multiresolution histogram face descriptors is proposed to create a powerful representation for face recognition. The multiresolution histogram descriptors are based on local binary patterns and local phase coding to achieve invariance to various types of image degradation. The multi-kernel fusion is based on the computationally efficient spectral regression KDA. The proposed face recognition method is evaluated on the FRGC 2.0 database, yielding very impressive results.

Chi-Ho Chan, Josef Kittler, Muhammad Atif Tahir
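The fusion step itself rests on a simple fact: a positive-weighted sum of valid kernel matrices is again a valid kernel, which is what allows the two descriptors' Gram matrices to be combined before the discriminant analysis. A sketch of only that step (the toy features and weights are placeholders, and spectral regression KDA is not reproduced):

```python
# Weighted sum of Gram matrices; non-negative weights preserve
# positive semidefiniteness, so the fused matrix is a valid kernel.
import numpy as np

def fuse_kernels(kernels, weights):
    return sum(w * K for w, K in zip(weights, kernels))

rng = np.random.default_rng(0)
X1 = rng.standard_normal((5, 8))    # toy stand-in for LBP-histogram features
X2 = rng.standard_normal((5, 8))    # toy stand-in for phase-coding features
K1, K2 = X1 @ X1.T, X2 @ X2.T       # two linear base kernels
K = fuse_kernels([K1, K2], [0.6, 0.4])
print(np.linalg.eigvalsh(K).min() >= -1e-9)   # still positive semidefinite
```
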

Structural Methods for OCR

Efficient OCR Post-Processing Combining Language, Hypothesis and Error Models

In this paper, an OCR post-processing method that combines a language model, OCR hypothesis information and an error model is proposed. The approach can be seen as a flexible and efficient way to perform Stochastic Error-Correcting Language Modeling. We use Weighted Finite-State Transducers (WFSTs) to represent the language model, the complete set of OCR hypotheses interpreted as a sequence of vectors of a posteriori class probabilities, and an error model with symbol substitutions, insertions and deletions. This approach combines the practical advantages of a decoupled (OCR + post-processor) model with the error-recovery power of an integrated model.

Rafael Llobet, J. Ramon Navarro-Cerdan, Juan-Carlos Perez-Cortes, Joaquim Arlandis
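A greatly simplified sketch of the decoding idea: the OCR output is a sequence of per-position class-posterior vectors, and the language model restricts which strings may be produced. Here the language model is just a lexicon and only substitutions are handled implicitly through the posteriors; the paper's WFST formulation additionally weights substitutions, insertions and deletions.

```python
# Toy decoupled OCR post-processing: pick the lexicon word that
# maximizes the sum of per-position log posteriors.
import math

def decode(posteriors, lexicon):
    best_word, best_score = None, -math.inf
    for word in lexicon:
        if len(word) != len(posteriors):
            continue                      # no insertions/deletions in this toy
        score = sum(math.log(p.get(c, 1e-9)) for p, c in zip(posteriors, word))
        if score > best_score:
            best_word, best_score = word, score
    return best_word

# Position 2 is ambiguous between 'a' and 'o'; the lexicon resolves it.
posteriors = [{'c': 0.9, 'e': 0.1},
              {'a': 0.55, 'o': 0.45},
              {'t': 0.8, 'l': 0.2}]
lexicon = {'cot', 'cat', 'eel'}
print(decode(posteriors, lexicon))  # 'cat'
```

Keeping the full posterior vectors rather than the 1-best string is what gives the decoupled model back some of the error-recovery power of an integrated one.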
Rejection Threshold Estimation for an Unknown Language Model in an OCR Task

In an OCR post-processing task, a language model is used to find the best transformation of the OCR hypothesis into a string compatible with the language. The cost of this transformation is used as a confidence value to reject the strings that are least likely to be correct, so that the error rate of the accepted strings can be strictly controlled by the user. In this work, the expected error-rate distribution of an unknown language model is estimated from a training set composed of known language models. This means that, after building a new language model, the user can automatically set the expected error rate at an acceptable level instead of having to deal with an arbitrary threshold.

Joaquim Arlandis, Juan-Carlos Perez-Cortes, J. Ramon Navarro-Cerdan, Rafael Llobet
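The thresholding mechanism itself is easy to state: given validation strings with their transformation costs and correctness labels, pick the largest cost threshold whose accepted set stays under the target error rate. This sketches only that mechanism, not the paper's cross-model estimation for unknown language models.

```python
# Pick the largest cost threshold whose accepted strings keep the
# empirical error rate at or below the user's target.
def rejection_threshold(costs, correct, max_error):
    """costs[i]: transformation cost; correct[i]: was string i right?"""
    order = sorted(range(len(costs)), key=lambda i: costs[i])
    best, errors = None, 0
    for rank, i in enumerate(order, start=1):
        errors += not correct[i]
        if errors / rank <= max_error:
            best = costs[i]          # accepting everything up to here is safe
    return best

costs   = [0.1, 0.2, 0.3, 0.5, 0.9]
correct = [True, True, True, False, False]
print(rejection_threshold(costs, correct, max_error=0.25))  # 0.5
```

The paper's point is that this calibration normally needs labelled data for each new language model, which their estimated error-rate distribution avoids.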
A New Editing Scheme Based on a Fast Two-String Median Computation Applied to OCR

This paper presents a new fast algorithm to compute an approximation to the median between two strings of characters representing a 2D shape, and its application to a new classification scheme that decreases the error rate. The median string results from applying a subset of the edit operations in the minimum-cost edit sequence to one of the original strings. The new dataset editing scheme relaxes the deletion criterion of the Wilson editing procedure: in practice, not all instances misclassified by their nearest neighbors are pruned. Instead, an artificial instance is added to the dataset in the expectation that the misclassified instance will be correctly classified in the future. The new artificial instance is the median of the misclassified sample and its same-class nearest neighbor. Experiments on two widely used datasets of handwritten characters show that this preprocessing scheme reduces the classification error in about 78% of the trials.

José Ignacio Abreu Salas, Juan Ramón Rico-Juan
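The two-string median idea can be sketched directly: backtrace a minimum-cost edit sequence between the strings, then apply roughly half of its operations to the first string, landing midway between the two. The paper's construction differs in which operations it selects; this is only an illustration of the principle.

```python
# Approximate two-string median: apply half of a minimum-cost
# edit script (Levenshtein backtrace) to the first string.

def edit_script(a, b):
    """Backtraced Levenshtein: list of (op, position, char) acting on a."""
    n, m = len(a), len(b)
    D = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        D[i][0] = i
    for j in range(m + 1):
        D[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            D[i][j] = min(D[i-1][j-1] + (a[i-1] != b[j-1]),
                          D[i-1][j] + 1, D[i][j-1] + 1)
    ops, i, j = [], n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and D[i][j] == D[i-1][j-1] + (a[i-1] != b[j-1]):
            if a[i-1] != b[j-1]:
                ops.append(('sub', i - 1, b[j-1]))
            i, j = i - 1, j - 1
        elif i > 0 and D[i][j] == D[i-1][j] + 1:
            ops.append(('del', i - 1, None))
            i -= 1
        else:
            ops.append(('ins', i, b[j-1]))
            j -= 1
    return ops[::-1]                      # operations in left-to-right order

def approx_median(a, b):
    """Apply the first half of the edit script to a, right to left."""
    ops = edit_script(a, b)
    s = list(a)
    for op, pos, ch in reversed(ops[:len(ops) // 2]):
        if op == 'sub':
            s[pos] = ch
        elif op == 'del':
            del s[pos]
        else:
            s.insert(pos, ch)
    return ''.join(s)

print(approx_median('karolin', 'kathrin'))  # 'katolin'
```

The result sits between the endpoints in edit distance, which is exactly what the editing scheme needs when it inserts an artificial instance between a misclassified sample and its same-class nearest neighbor.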
Backmatter
Metadata
Title
Structural, Syntactic, and Statistical Pattern Recognition
Edited by
Edwin R. Hancock
Richard C. Wilson
Terry Windeatt
Ilkay Ulusoy
Francisco Escolano
Copyright Year
2010
Publisher
Springer Berlin Heidelberg
Electronic ISBN
978-3-642-14980-1
Print ISBN
978-3-642-14979-5
DOI
https://doi.org/10.1007/978-3-642-14980-1
