
About this Book

This book constitutes the proceedings of the Joint IAPR International Workshop on Structural, Syntactic, and Statistical Pattern Recognition, S+SSPR 2018, held in Beijing, China, in August 2018.
The 49 papers presented in this volume were carefully reviewed and selected from 75 submissions. They were organized in topical sections named: classification and clustering; deep learning and neural networks; dissimilarity representations and Gaussian processes; semi and fully supervised learning methods; spatio-temporal pattern recognition and shape analysis; structural matching; multimedia analysis and understanding; and graph-theoretic methods.

Table of Contents

Frontmatter

Classification and Clustering

Frontmatter

Image Annotation Using a Semantic Hierarchy

With the fast development of smartphones and social media image sharing, automatic image annotation has become a research area of great interest. It enables indexing, extracting and searching in large collections of images in an easier and faster way. In this paper, we propose a model for extending the annotation of images using a semantic hierarchy. The hierarchy is built from vocabulary keyword annotations by combining a mixture of Bernoulli distributions with mixtures of Gaussians.

Abdessalem Bouzaieni, Salvatore Tabbone

Malignant Brain Tumor Classification Using the Random Forest Method

Brain tumor grading is pivotal in treatment planning. Contrast-enhanced T1-weighted MR imaging is commonly used for grading. However, the classification of different types of high-grade gliomas using T1-weighted MR images is still challenging, due to the lack of imaging biomarkers. Previous studies focused only on simple visual features, ignoring the rich information provided by MR images. In this paper, we propose an automatic classification pipeline using a random forest to differentiate WHO Grade III and Grade IV gliomas, by extracting discriminative features based on 3D patches. The proposed pipeline consists of three main steps in both the training and the testing stages. First, we select numerous 3D patches in and around the tumor regions of the given MR images. This suppresses the intensity information from the normal region, which is irrelevant to the classification process. Second, we extract features based on both patch-wise information and subject-wise clinical information, and then refine this step to optimize the performance of malignant tumor classification. Third, we incorporate the classification forest for training/testing the classifier. We validate the proposed framework on 96 malignant brain tumor patients, comprising both Grade III (N = 38) and Grade IV gliomas (N = 58). The experiments show that the proposed framework is valid for high-grade glioma classification, which may help improve the poor prognosis of high-grade gliomas.

Lichi Zhang, Han Zhang, Islem Rekik, Yaozong Gao, Qian Wang, Dinggang Shen

Rotationally Invariant Bark Recognition

An efficient bark recognition method based on a novel wide-sense Markov spiral model textural representation is presented. Unlike alternative bark recognition methods based on various gray-scale discriminative textural descriptions, we benefit from a fully descriptive, rotationally invariant color texture representation. The proposed method significantly outperforms state-of-the-art bark recognition approaches in terms of classification accuracy.

Václav Remeš, Michal Haindl

Dynamic Voting in Multi-view Learning for Radiomics Applications

Cancer diagnosis and treatment often require a personalized analysis for each patient, due to the heterogeneity among the different types of tumor and among patients. Radiomics is a recent medical imaging field that has shown promise in the past few years for achieving this personalization. However, a recent study shows that most state-of-the-art works in Radiomics fail to identify this problem as a multi-view learning task, and that multi-view learning techniques are generally more efficient. In this work, we further investigate the potential of one family of multi-view learning methods based on Multiple Classifier Systems, where one classifier is learnt on each view and all classifiers are combined afterwards. In particular, we propose a random forest based dynamic weighted voting scheme, which personalizes the combination of views for each new patient to classify. The proposed method is validated on several real-world Radiomics problems.

Hongliu Cao, Simon Bernard, Laurent Heutte, Robert Sabourin
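The dynamic weighting idea lends itself to a compact illustration. The sketch below trains one random forest per view and weights each view's class probabilities, per test sample, by the forest's own confidence; this mean-max-posterior weighting rule and the toy data shapes are assumptions for illustration, not the authors' exact scheme.

```python
# Hedged sketch: confidence-weighted voting over per-view random forests.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def fit_per_view_forests(views_train, y_train, n_trees=100, seed=0):
    """Train one random forest per feature view."""
    return [RandomForestClassifier(n_estimators=n_trees, random_state=seed)
            .fit(X, y_train) for X in views_train]

def dynamic_weighted_vote(forests, views_test):
    """Combine per-view class probabilities, weighting each view by the
    forest's confidence (its maximum posterior) for every test sample."""
    probas = [f.predict_proba(X) for f, X in zip(forests, views_test)]
    weights = [p.max(axis=1, keepdims=True) for p in probas]  # per sample
    return sum(w * p for w, p in zip(weights, probas)) / sum(weights)

# Usage with two synthetic views of the same 40 patients:
rng = np.random.default_rng(0)
y = rng.integers(0, 2, 40)
views = [rng.normal(size=(40, 8)) + y[:, None], rng.normal(size=(40, 5))]
forests = fit_per_view_forests(views, y)
print(dynamic_weighted_vote(forests, views).argmax(axis=1))
```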

Iterative Deep Subspace Clustering

Recently, deep learning has been widely used for the subspace clustering problem due to the excellent feature extraction ability of deep neural networks. Most existing methods are built upon auto-encoder networks. In this paper, we propose an iterative framework for unsupervised deep subspace clustering. In our method, we first cluster the given data to update the subspace IDs, and then update the representation parameters of a Convolutional Neural Network (CNN) with the clustering result. By iterating these two steps, we obtain not only a good representation of the given data, but also more precise subspace clustering results. Experiments on both synthetic and real-world data show that our method outperforms the state-of-the-art in subspace clustering accuracy.

Lei Zhou, Shuai Wang, Xiao Bai, Jun Zhou, Edwin Hancock

A Scalable Spectral Clustering Algorithm Based on Landmark-Embedding and Cosine Similarity

We extend our recent work on scalable spectral clustering with cosine similarity (ICPR’18) to other kinds of similarity functions, in particular, the Gaussian RBF. In the previous work, we showed that for sparse or low-dimensional data, spectral clustering with the cosine similarity can be implemented directly through efficient operations on the data matrix such as elementwise manipulation, matrix-vector multiplication and low-rank SVD, thus completely avoiding the weight matrix. For other similarity functions, we present an embedding-based approach that uses a small set of landmark points to convert the given data into sparse feature vectors and then applies the scalable computing framework for the cosine similarity. Our algorithm is simple to implement, has clear interpretations, and naturally incorporates an outliers removal procedure. Preliminary results show that our proposed algorithm yields higher accuracy than existing scalable algorithms while running fast.

Guangliang Chen
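The data-matrix shortcut described above admits a short sketch. With rows of X normalized to unit length, the cosine weight matrix is W = XX^T minus its diagonal, so (ignoring the small per-point diagonal correction in this sketch, and assuming nonnegative data so that all degrees are positive) the spectral embedding can be read off the top singular vectors of D^{-1/2}X without ever forming W. Landmark embedding is assumed to have produced X already; this is illustrative, not the authors' implementation.

```python
# Hedged sketch: spectral clustering with cosine similarity via low-rank
# SVD of the (degree-normalized) data matrix, never forming W = XX^T - I.
import numpy as np
from scipy.sparse.linalg import svds
from sklearn.cluster import KMeans

def cosine_spectral_clustering(X, k):
    X = X / np.linalg.norm(X, axis=1, keepdims=True)   # unit rows
    d = X @ (X.T @ np.ones(X.shape[0])) - 1.0          # degrees of XX^T - I
    Y = X / np.sqrt(d)[:, None]                        # D^{-1/2} X
    U, _, _ = svds(Y, k=k)                             # top-k singular vectors
    U = U / np.linalg.norm(U, axis=1, keepdims=True)   # embed on the sphere
    return KMeans(n_clusters=k, n_init=10).fit_predict(U)

labels = cosine_spectral_clustering(np.random.rand(500, 20), k=3)
```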

Deep Learning and Neural Networks

Frontmatter

On Fast Sample Preselection for Speeding up Convolutional Neural Network Training

We propose a fast hybrid statistical and graph-based sample preselection method for speeding up the CNN training process. To do so, we process each class separately: some candidates are first extracted based on their distances to the class mean. Then, we structure all the candidates in a graph representation and use it to extract the final set of preselected samples. The proposed method is evaluated and discussed on an image classification task, using three data sets that contain up to several hundred thousand images.

Frédéric Rayar, Seiichi Uchida

UAV First View Landmark Localization via Deep Reinforcement Learning

In recent years, the study of Unmanned Aerial Vehicle (UAV) autonomous landing has been a hot research topic. For UAV landmark localization, computer vision algorithms achieve excellent performance. In the computer vision research field, deep learning methods are widely employed in object detection and localization. However, these methods rely heavily on the size and quality of the training datasets. In this paper, we propose the Landmark-Localization Network (LLNet) to solve the UAV landmark localization problem by means of a deep reinforcement learning strategy with small training datasets. The LLNet learns how to transform the bounding box into the correct position through a sequence of actions. To train a robust landmark localization model, we combine the policy gradient method from deep reinforcement learning with supervised learning in the training stage. The experimental results show that the LLNet is able to locate the landmark precisely.

Xinran Wang, Peng Ren, Leijian Yu, Lirong Han, Xiaogang Deng

Context Free Band Reduction Using a Convolutional Neural Network

In this paper, we present a method for context-free band selection and reduction for hyperspectral imaging. Here, we reconstruct the spectral image irradiance in the wild making use of a reduced set of wavelength-indexed bands at input. To this end, we use a deep neural net which employs a learnt sparse input connection map to select relevant bands at input. Thus, the network can be viewed as learning a non-linear, locally supported generic transformation between a subset of input bands at a pixel neighbourhood and the scene irradiance of the central pixel at output. To obtain the sparse connection map we employ a variant of the Levenberg-Marquardt algorithm (LMA) on manifolds which is devoid of the damping factor often used in LMA approaches. We show results on band selection and illustrate the utility of the connection map recovered by our approach for spectral reconstruction against a number of alternatives on widely available datasets.

Ran Wei, Antonio Robles-Kelly, José Álvarez

Local Patterns and Supergraph for Chemical Graph Classification with Convolutional Networks

Convolutional neural networks (CNN) have deeply impacted the field of machine learning. These networks, designed to process objects with a fixed topology, can readily be applied to images, videos and sounds, but cannot be easily extended to structures with an arbitrary topology such as graphs. Examples of applications of machine learning to graphs include the prediction of the properties of molecular graphs, or the classification of 3D meshes. Within the chemical graphs framework, we propose a method to extend networks based on a fixed topology to input graphs with an arbitrary topology. We also propose an enriched feature vector attached to each node of a chemical graph, and a new layer that interfaces graphs of arbitrary topology with a fully connected layer.

Évariste Daller, Sébastien Bougleux, Luc Brun, Olivier Lézoray

Learning Deep Embeddings via Margin-Based Discriminate Loss

Deep metric learning has gained much popularity in recent years, following the success of deep learning. However, existing frameworks of deep metric learning based on contrastive loss and triplet loss often suffer from slow convergence, partially because they employ only one positive and one negative example in each update, without interacting with the other positive or negative examples. In this paper, we first propose the strict discrimination concept to seek an optimal embedding space. Based on this concept, we then propose a new metric learning objective called the Margin-based Discriminate Loss, which keeps the similar and the dissimilar strictly discriminated by pulling multiple positive examples together while pushing multiple negative examples away at each update. Importantly, it does not need expensive sampling strategies. We demonstrate the validity of our proposed loss compared with the triplet loss as well as other competing loss functions for a variety of tasks on fine-grained image clustering and retrieval.

Peng Sun, Wenzhong Tang, Xiao Bai
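A minimal numpy sketch of this loss family follows: every positive example is pulled inside an inner margin around the anchor and every negative is pushed beyond an outer margin, at each update and without sampling. The margin values and the exact hinge form are assumptions; the paper's Margin-based Discriminate Loss may differ in detail.

```python
# Hedged sketch of a margin-based multi-positive/multi-negative loss.
import numpy as np

def margin_discriminate_loss(anchor, positives, negatives, a=0.5, b=1.5):
    """anchor: (d,); positives, negatives: (k, d) embeddings."""
    d_pos = np.linalg.norm(positives - anchor, axis=1)
    d_neg = np.linalg.norm(negatives - anchor, axis=1)
    pull = np.maximum(d_pos - a, 0.0).mean()  # positives past inner margin
    push = np.maximum(b - d_neg, 0.0).mean()  # negatives inside outer margin
    return pull + push
```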

Dissimilarity Representations and Gaussian Processes

Frontmatter

Protein Remote Homology Detection Using Dissimilarity-Based Multiple Instance Learning

A challenging Pattern Recognition problem in Bioinformatics concerns the detection of a functional relation between two proteins even when they show very low sequence similarity – this is the so-called Protein Remote Homology Detection (PRHD) problem. In this paper we propose a novel approach to PRHD, which casts the problem into a Multiple Instance Learning (MIL) framework, which seems very suitable for this context. Experiments on a standard benchmark show very competitive performances, also in comparison with alternative discriminative methods.

Antonella Mensi, Manuele Bicego, Pietro Lovato, Marco Loog, David M. J. Tax

Local Binary Patterns Based on Subspace Representation of Image Patch for Face Recognition

In this paper, we propose a new local descriptor named PCA-LBP for face recognition. In contrast to classical LBP methods, which compare pixels by their single intensity values, our method compares image patches by their multi-dimensional subspace representations. Such a representation of an image patch is defined as its set of coordinates after projection into a subspace whose basis vectors are learned by Principal Component Analysis from selected facial image patches of the training set. Based on this, the PCA-LBP descriptor is computed by applying several LBP operators between the central image patch and its 8 neighbors along each discretized subspace basis. In addition, we propose PCA-CoALBP, which introduces co-occurrences of adjacent patterns to incorporate more spatial information. The effectiveness of the two proposed methods is assessed through evaluation experiments on two public face databases.

Xin Zong

An Image-Based Representation for Graph Classification

This paper studies the relevance of image representations for graph classification. To do so, the adjacency matrix of a given graph is reordered using several matrix reordering algorithms. The resulting matrix is then converted into an image thumbnail, which is used to represent the graph. Experimentation on several chemical graph data sets and an image data set shows that the proposed graph representation performs as well as state-of-the-art methods.

Frédéric Rayar, Seiichi Uchida
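The pipeline above reduces to a few lines. The sketch below reorders an adjacency matrix with reverse Cuthill-McKee (one possible choice among the several reordering algorithms the paper studies) and downsamples it to a fixed-size thumbnail; the thumbnail size and bilinear interpolation are illustrative assumptions.

```python
# Hedged sketch: adjacency matrix -> reordered matrix -> image thumbnail.
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import reverse_cuthill_mckee
from scipy.ndimage import zoom

def graph_to_thumbnail(adj, size=32):
    perm = reverse_cuthill_mckee(csr_matrix(adj), symmetric_mode=True)
    A = np.asarray(adj, dtype=float)[np.ix_(perm, perm)]  # reordered matrix
    return zoom(A, size / A.shape[0], order=1)            # bilinear resize

A = (np.random.rand(50, 50) > 0.9).astype(float)
A = np.maximum(A, A.T)                                    # undirected graph
print(graph_to_thumbnail(A).shape)  # (32, 32)
```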

Visual Tracking via Patch-Based Absorbing Markov Chain

The bounding box description of a target object usually includes background clutter, which easily degrades tracking performance. To handle this problem, we propose a general approach to learn robust object representations for visual tracking. It relies on a novel patch-based absorbing Markov chain (AMC) algorithm. First, we represent the object bounding box with a graph whose nodes are image patches, and introduce a weight for each patch that describes how reliably it belongs to the foreground object, to mitigate background clutter. Second, we propose a simple yet effective AMC-based method to optimize reliable foreground patch seeds, since their quality is very important for patch weight computation. Third, based on the optimized seeds, we again utilize the AMC to compute patch weights. Finally, the patch weights are incorporated into the object feature description, and tracking is carried out using a structured support vector machine algorithm. Experiments on the benchmark dataset demonstrate the effectiveness of our proposed approach.

Ziwei Xiong, Nan Zhao, Chenglong Li, Jin Tang

Gradient Descent for Gaussian Processes Variance Reduction

A key issue in Gaussian Process modeling is to decide on the locations where measurements are going to be taken. A good set of observations will provide a better model. The current state of the art selects such a set so as to minimize the posterior variance of the Gaussian Process by exploiting submodularity. We propose a gradient descent procedure to iteratively improve an initial set of observations so as to minimize the posterior variance directly. The performance of the technique is analyzed under different conditions by varying the number of measurement points, the dimensionality of the domain and the hyperparameters of the Gaussian Process. Results show the applicability of the technique and the clear improvements that can be obtained under different settings.

Lorenzo Bottarelli, Marco Loog
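A hedged sketch of the procedure: starting from an initial set of measurement locations, follow the gradient of the summed posterior variance over a grid (numerical differences here for brevity; the paper works with the gradient of the posterior variance directly). The RBF kernel, its length scale, and the jitter term are assumptions.

```python
# Hedged sketch: descend on measurement locations to shrink GP variance.
import numpy as np

def rbf(A, B, ls=0.3):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls ** 2)

def total_posterior_variance(S, grid):
    K = rbf(S, S) + 1e-6 * np.eye(len(S))               # jittered Gram matrix
    Ks = rbf(grid, S)
    # prior variance 1 minus explained variance, summed over the grid
    return (1.0 - np.einsum('ij,jk,ik->i', Ks, np.linalg.inv(K), Ks)).sum()

def descend(S, grid, lr=0.05, steps=50, eps=1e-4):
    for _ in range(steps):
        base = total_posterior_variance(S, grid)
        g = np.zeros_like(S)
        for idx in np.ndindex(*S.shape):                # finite differences
            Sp = S.copy(); Sp[idx] += eps
            g[idx] = (total_posterior_variance(Sp, grid) - base) / eps
        S = S - lr * g
    return S

grid = np.stack(np.meshgrid(np.linspace(0, 1, 10),
                            np.linspace(0, 1, 10)), -1).reshape(-1, 2)
S_opt = descend(np.random.rand(5, 2), grid)
```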

Semi and Fully Supervised Learning Methods

Frontmatter

Sparsification of Indefinite Learning Models

The recently proposed Kreĭn space Support Vector Machine (KSVM) is an efficient classifier for indefinite learning problems, but has a non-sparse decision function. This very dense decision function prevents practical applications due to a costly out-of-sample extension. In this paper we provide a post-processing technique to sparsify the obtained decision function of a Kreĭn space SVM and variants thereof. We evaluate the influence of different levels of sparsity and employ a Nyström approach to address large scale problems. Experiments show that our algorithm is similarly efficient to the non-sparse Kreĭn space Support Vector Machine, but at substantially lower cost, so that large scale problems can also be processed.

Frank-Michael Schleif, Christoph Raab, Peter Tino

Semi-supervised Clustering Framework Based on Active Learning for Real Data

In this paper, we propose a clustering method for real data based on active learning. Clustering methods are difficult to apply to real data for two reasons. First, real data may include outliers that adversely affect clustering. Second, clustering parameters such as the number of clusters cannot be kept constant, because the number of classes in real data may increase over time. To solve the first problem, we focus on labeling outliers and develop a stream-based active learning framework for clustering, which enables us to label the outliers intensively. To solve the second problem, we develop an algorithm that automatically sets the clustering parameters from a few labeled samples. The experimental results show that our method deals with the problems mentioned above better than conventional clustering methods.

Ryosuke Odate, Hiroshi Shinjo, Yasufumi Suzuki, Masahiro Motobayashi

Supervised Classification Using Feature Space Partitioning

In this paper we consider the supervised classification problem using feature space partitioning. We first apply a heuristic algorithm for partitioning a graph into a minimal number of cliques, and subsequently the cliques are merged by means of the nearest neighbor rule. The main advantage of the new approach, which optimally utilizes the geometrical structure of the training set, is the decomposition of the l-class problem ($$l>2$$) into l single-class optimization problems. We discuss the computational complexity of the proposed method and the resulting classification rules. Experiments comparing the box algorithm and SVM show that in most cases the box algorithm performs better than SVM.

Ventzeslav Valev, Nicola Yanev, Adam Krzyżak, Karima Ben Suliman

Deep Homography Estimation with Pairwise Invertibility Constraint

Recent works have shown that deep learning methods can improve the performance of homography estimation due to the better features extracted by convolutional networks. Nevertheless, these works are supervised and rely heavily on labeled training data, as they aim to estimate the homography as close to the ground truth as possible, which may cause overfitting. In this paper, we propose a Siamese network with a pairwise invertibility constraint for supervised homography estimation. We utilize spatial pyramid pooling modules to improve the quality of the features extracted from each image by exploiting context information. Observing that the two homographies relating a given image pair are inverse matrices of each other, we propose the invertibility constraint to avoid overfitting. To employ the constraint, we adopt the matrix representation of the homography rather than the 4-point parameterization commonly used in other methods. Experiments on a synthetic dataset generated from the MSCOCO dataset show that our proposed method outperforms several state-of-the-art approaches.

Xiang Wang, Chen Wang, Xiao Bai, Yun Liu, Jun Zhou

Spatio-temporal Pattern Recognition and Shape Analysis

Frontmatter

Graph Time Series Analysis Using Transfer Entropy

In this paper, we explore how Schreiber's transfer entropy can be used to develop a new entropic characterisation of graphs derived from time series data. We use the transfer entropy to weight the edges of a graph whose nodes represent time series and whose edges represent the degree of commonality between pairs of time series. The result is a weighted graph which captures the information transfer between nodes over specific time intervals. From the weighted normalised Laplacian we characterise the network at each time interval using the von Neumann entropy computed from the normalised Laplacian spectrum, and study how this entropic characterisation evolves with time and can be used to capture temporal changes and anomalies in network structure. We apply the method to stock-market data, namely time series of closing stock prices on the New York Stock Exchange and NASDAQ markets. This data is augmented with information concerning the industrial or commercial sector to which the stocks belong. We use our method not only to analyse overall market behaviour, but also inter-sector and intra-sector trends.

Ibrahim Caglar, Edwin R. Hancock
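Both ingredients of the method are easy to prototype. The sketch below gives (i) a plug-in histogram estimate of the transfer entropy between two discretized series, usable as an edge weight, and (ii) the von Neumann entropy of the resulting weighted graph from the normalised Laplacian spectrum. The binary up/down discretization of price series and the density-matrix normalization by n are assumptions of this sketch.

```python
# Hedged sketch: transfer entropy edge weights + von Neumann graph entropy.
import numpy as np
from itertools import product

def transfer_entropy(x, y, k=2):
    """Histogram estimate of TE(x -> y) for integer series in {0..k-1}."""
    trip = np.stack([y[1:], y[:-1], x[:-1]], axis=1)
    p3 = np.zeros((k, k, k))
    for a, b, c in trip:
        p3[a, b, c] += 1
    p3 /= p3.sum()
    p_yx = p3.sum(0); p_yy = p3.sum(2); p_y = p3.sum((0, 2))
    te = 0.0
    for a, b, c in product(range(k), repeat=3):
        if p3[a, b, c] > 0:
            te += p3[a, b, c] * np.log((p3[a, b, c] * p_y[b]) /
                                       (p_yy[a, b] * p_yx[b, c]))
    return te

def von_neumann_entropy(W):
    """Entropy of rho = normalised Laplacian / n (no isolated nodes)."""
    d = W.sum(1)
    L = np.eye(len(W)) - W / np.sqrt(np.outer(d, d))
    mu = np.linalg.eigvalsh(L) / len(W)
    mu = mu[mu > 1e-12]
    return float(-(mu * np.log(mu)).sum())

# usage: discretize two price series into up/down moves
r = np.random.randn(2, 300)
x, y = (np.diff(r, axis=1) > 0).astype(int)
print(transfer_entropy(x, y))
```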

Analyzing Time Series from Chinese Financial Market Using a Linear-Time Graph Kernel

Graph-based data has played an important role in representing complex patterns in real-world data, but there is very little work on mining time series with graphs, and existing graph-based time series mining methods tend to use well-selected data. In this paper, we investigate a method for extracting graph structures, which contain structural information that cannot be captured by vector-based data, from the whole of the Chinese financial time series. We call these time-varying networks: each node represents the individual time series of a stock and each undirected edge between two nodes represents the correlation between two stocks. We further review a linear-time graph kernel for labeled graphs and show whether the graph kernel, together with time-varying networks, can be used to analyze Chinese financial time series. In the experiments, we apply our method to the whole Chinese stock market's daily transaction data, i.e., the stock price data, and use the graph kernel to measure similarities between the extracted networks. We then compare the performance of our method against sequence-based and vector-based methods by using kernel principal component analysis to map the results into a low-dimensional feature space. The experimental results demonstrate the efficiency and effectiveness of our method, together with graph kernels, in analyzing Chinese financial time series.

Yuhang Jiao, Lixin Cui, Lu Bai, Yue Wang

A Preliminary Survey of Analyzing Dynamic Time-Varying Financial Networks Using Graph Kernels

In this paper, we investigate whether graph kernels can be used as a means of analyzing time-varying financial market networks. Specifically, we aim to identify, through graph kernels, the significant financial incidents that change the financial network properties. Our financial networks are abstracted from New York Stock Exchange (NYSE) data over 6004 trading days, where each vertex represents the daily return price time series of a stock and each edge represents the correlation between a pair of series. We propose to use two state-of-the-art graph kernels for the analysis, i.e., the Jensen-Shannon graph kernel and the Weisfeiler-Lehman subtree kernel, chosen because they are representative of global and local graph kernels, respectively. We perform kernel Principal Component Analysis (kPCA) on each kernel matrix to embed the networks into a 3-dimensional principal space, where the time-varying networks of all trading days are visualized. Experimental results on the NYSE financial time series demonstrate that graph kernels can clearly distinguish abrupt changes of financial networks with time, and provide a more effective alternative for analyzing the original multiple co-evolving financial time series. We also indicate the prospect of developing novel graph kernels on time-varying networks for multiple co-evolving time series analysis in future work.

Lixin Cui, Lu Bai, Luca Rossi, Zhihong Zhang, Yuhang Jiao, Edwin R. Hancock

Few-Example Affine Invariant Ear Detection in the Wild

Ear detection in the wild, with varying pose, lighting, and complex backgrounds, is a challenging unsolved problem. In this paper, we study affine invariant ear detection in the wild using only a small number of ear example images, and formulate the problem as the task of locating an affine transformation of an ear model in an image. Ear shapes are represented by line segments, which incorporate structural information on line orientation and line-point association. A novel fast line-based Hausdorff distance (FLHD) is then developed to match two sets of line segments. Compared to existing line segment Hausdorff distances, FLHD is one order of magnitude faster with similar discriminative power. As there are a large number of transformations to consider, an efficient global search using a branch-and-bound scheme is presented to locate the ear, which enables our algorithm to handle arbitrary 2D affine transformations. Experimental results on real-world images acquired in the wild and on the Point Head Pose database show the effectiveness and robustness of the proposed method.

Jianming Liu, Yongsheng Gao, Yue Li
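To make the matching step concrete, the sketch below computes a simplified Hausdorff distance between two sets of line segments, summarizing each segment by its midpoint and orientation and mixing the two with an assumed weight; FLHD itself is a more refined (and faster) construction than this stand-in.

```python
# Hedged sketch: a simplified line-segment Hausdorff distance.
import numpy as np

def seg_features(segs):
    """segs: (n, 4) array of segments (x1, y1, x2, y2)."""
    mid = (segs[:, :2] + segs[:, 2:]) / 2.0
    ang = np.arctan2(segs[:, 3] - segs[:, 1],
                     segs[:, 2] - segs[:, 0]) % np.pi
    return mid, ang

def line_hausdorff(A, B, w_angle=1.0):
    ma, aa = seg_features(A); mb, ab = seg_features(B)
    pos = np.linalg.norm(ma[:, None] - mb[None, :], axis=2)
    dang = np.abs(aa[:, None] - ab[None, :])
    dang = np.minimum(dang, np.pi - dang)        # orientation is mod pi
    D = pos + w_angle * dang                     # pairwise segment distance
    return max(D.min(1).max(), D.min(0).max())   # max of directed distances
```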

Line Voronoi Diagrams Using Elliptical Distances

The paper introduces an Elliptical Line Voronoi diagram. In contrast to the classical approaches, it represents a line segment by its end points and computes the distance from a point to a line segment using the Confocal Ellipse-based Distance. The proposed representation offers specific mathematical properties and prioritizes sites of greater length and corners with obtuse angles, without using an additional weighting scheme. These characteristics suit practical applications such as skeletonization and shape smoothing.

Aysylu Gabdulkhakova, Maximilian Langer, Bernhard W. Langer, Walter G. Kropatsch

Structural Matching

Frontmatter

Modelling the Generalised Median Correspondence Through an Edit Distance

On the one hand, classification applications modelled by structural pattern recognition, in which elements are represented as strings, trees or graphs, have been used for the last thirty years. In these models, structural distances are modelled through the correspondence (also called matching or labelling) between all the local elements (for instance nodes or edges) that generates the minimum sum of local distances. On the other hand, the generalised median is a well-known concept used to obtain a reliable prototype of data such as strings, graphs and data clusters. Recently, the structural distance and the generalised median have been put together to define a generalised median of matchings, in order to solve some classification and learning applications. In this paper, we present an improvement in which the Correspondence edit distance is used instead of the classical Hamming distance. Experimental validation shows that the new approach obtains better results in reasonable runtime compared to other median calculation strategies.

Carlos Francisco Moreno-García, Francesc Serratosa

Learning the Sub-optimal Graph Edit Distance Edit Costs Based on an Embedded Model

Graph edit distance has become an important tool in structural pattern recognition since it allows us to measure the dissimilarity of attributed graphs. One of its main constraints is that it requires an adequate definition of edit costs, which ultimately determines which graphs are considered similar. These edit costs are usually defined manually as concrete functions or constants, and little effort has been devoted to learning them. The present paper proposes a framework to define these edit costs automatically. Moreover, we instantiate this framework in two different models, based on neural networks and on probability density functions.

Pep Santacruz, Francesc Serratosa

Ring Based Approximation of Graph Edit Distance

The graph edit distance ($$\mathrm {GED}$$) is a flexible graph dissimilarity measure widely used within the structural pattern recognition field. A widely used paradigm for approximating $$\mathrm {GED}$$ is to define local structures rooted at the nodes of the input graphs and use these structures to transform the problem of computing $$\mathrm {GED}$$ into a linear sum assignment problem with error correction ($$\mathrm {LSAPE}$$). In the literature, different local structures such as incident edges, walks of fixed length, and induced subgraphs of fixed radius have been proposed. In this paper, we propose to use rings as local structure, which are defined as collections of nodes and edges at fixed distances from the root node. We empirically show that this allows us to quickly compute a tight approximation of $$\mathrm {GED}$$.

David B. Blumenthal, Sébastien Bougleux, Johann Gamper, Luc Brun
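The LSAPE paradigm the paper builds on can be sketched compactly: derive node-to-node costs from local structures, pad the cost matrix with deletion and insertion entries, and solve a linear assignment to obtain an upper bound on GED. Here plain degree differences stand in for the ring structures, and uniform node edit costs are assumed.

```python
# Hedged sketch: Riesen/Bunke-style square assignment for a GED bound.
import numpy as np
from scipy.optimize import linear_sum_assignment

def lsape_ged_upper_bound(A1, A2, c_node=1.0, big=1e9):
    d1, d2 = A1.sum(1), A2.sum(1)
    n, m = len(d1), len(d2)
    C = np.zeros((n + m, n + m))
    C[:n, :m] = np.abs(d1[:, None] - d2[None, :])           # substitution
    C[:n, m:] = big * (1 - np.eye(n)) + c_node * np.eye(n)  # deletion
    C[n:, :m] = big * (1 - np.eye(m)) + c_node * np.eye(m)  # insertion
    rows, cols = linear_sum_assignment(C)                   # bottom-right: 0
    return C[rows, cols].sum()

A1 = np.array([[0, 1, 1], [1, 0, 0], [1, 0, 0]], float)
A2 = np.array([[0, 1], [1, 0]], float)
print(lsape_ged_upper_bound(A1, A2))
```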

Graph Edit Distance in the Exact Context

This paper presents a new Mixed Integer Linear Program (MILP) formulation for the Graph Edit Distance (GED) problem. The contribution is an exact method that solves the GED problem for attributed graphs. It has an advantage over the best existing formulation when dealing with dense graphs, because all its constraints are independent of the number of edges in the graphs. The experiments show the efficiency of the new formulation in the exact context.

Mostafa Darwiche, Romain Raveaux, Donatello Conte, Vincent T’Kindt

The VF3-Light Subgraph Isomorphism Algorithm: When Doing Less Is More Effective

We have recently introduced VF3, a general-purpose subgraph isomorphism algorithm that has been demonstrated to be very effective on several datasets, especially on very large and very dense graphs. In this paper we show that on some classes of graphs, the whole power of VF3 may become overkill; indeed, by removing some of its heuristics, and as a consequence some of the data structures they require, we obtain an algorithm that is actually faster. In order to characterize this modified algorithm, called VF3-Light, we have performed an evaluation using several kinds of graphs; besides comparing VF3-Light with VF3, we have also compared it to RI, a fast recent algorithm based on a similar approach.

Vincenzo Carletti, Pasquale Foggia, Antonio Greco, Alessia Saggese, Mario Vento

A Deep Neural Network Architecture to Estimate Node Assignment Costs for the Graph Edit Distance

The problem of finding a distance and a correspondence between a pair of graphs is commonly referred to as the error-tolerant graph matching problem. The Graph Edit Distance is one of the most popular approaches to solve this problem, but it needs a set of parameters and cost functions to be defined a priori. On the other hand, in recent years, Deep Neural Networks have shown very good performance in a wide variety of domains due to their robustness and ability to solve non-linear problems. The aim of this paper is to present a model that computes the assignment costs for the Graph Edit Distance by means of a Deep Neural Network previously trained on a set of pairs of properly matched graphs. We empirically show a major improvement using our method with respect to the state-of-the-art results.

Xavier Cortés, Donatello Conte, Hubert Cardot, Francesc Serratosa

Error-Tolerant Geometric Graph Similarity

Graph matching is the task of computing the similarity between two graphs. Error-tolerant graph matching computes this similarity subject to some tolerance value, whereas exact graph matching requires a strict one-to-one correspondence between the two graphs. In this paper, we present an approach to error-tolerant graph similarity using geometric graphs. We define the vertex distance (dissimilarity) and the edge distance between two graphs, and combine them to compute the graph distance.

Shri Prakash Dwivedi, Ravi Shankar Singh

Learning Cost Functions for Graph Matching

During the last decade, several approaches have been proposed to address detection and recognition problems, by using graphs to represent the content of images. Graph comparison is a key task in those approaches and usually is performed by means of graph matching techniques, which aim to find correspondences between elements of graphs. Graph matching algorithms are highly influenced by cost functions between nodes or edges. In this perspective, we propose an original approach to learn the matching cost functions between graphs’ nodes. Our method is based on the combination of distance vectors associated with node signatures and an SVM classifier, which is used to learn discriminative node dissimilarities. Experimental results on different datasets compared to a learning-free method are promising.

Rafael de O. Werneck, Romain Raveaux, Salvatore Tabbone, Ricardo da S. Torres

Multimedia Analysis and Understanding

Frontmatter

Matrix Regression-Based Classification for Face Recognition

Partial occlusion is a common difficulty in face recognition applications, and many algorithms based on linear representation address such cases. In this paper, we consider the partial occlusion problem via inner-class linear regression. Specifically, we develop a matrix regression-based classification (MRC) method in which the samples of each class are represented as matrices instead of vectors and used to encode a probe image. In the regression step, an L21-norm based matrix regression model is proposed, which efficiently suppresses the effect of occlusion in the probe image. Accordingly, an efficient algorithm is derived to optimize the proposed objective function. In addition, we argue that the corrupted pixels in the probe image should not be considered in the decision step. Thus, we introduce a robust threshold to dynamically eliminate the corrupted rows of the probe image before making a decision. The performance of MRC is evaluated on several datasets and the results are compared with those of other state-of-the-art methods.

Jian-Xun Mi, Quanwei Zhu, Zhiheng Luo
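The L21-norm regression step can be illustrated with a generic iteratively reweighted least squares (IRLS) solver: the probe matrix is represented as a combination of a class's sample matrices, and rows with large residuals (e.g., occluded rows) are automatically down-weighted. This is a textbook IRLS sketch under assumed shapes, not the authors' optimization algorithm.

```python
# Hedged sketch: L21-norm matrix regression of a probe on class samples.
import numpy as np

def l21_matrix_regression(Y, Xs, iters=20, eps=1e-6):
    """Y: (h, w) probe image; Xs: list of (h, w) same-class gallery images."""
    n = len(Xs)
    beta = np.full(n, 1.0 / n)
    for _ in range(iters):
        E = Y - sum(b * X for b, X in zip(beta, Xs))        # residual matrix
        s = 1.0 / np.sqrt(np.linalg.norm(E, axis=1) + eps)  # IRLS row scales
        A = np.stack([(X * s[:, None]).ravel() for X in Xs], axis=1)
        beta = np.linalg.lstsq(A, (Y * s[:, None]).ravel(), rcond=None)[0]
    resid = np.linalg.norm(Y - sum(b * X for b, X in zip(beta, Xs)), axis=1)
    return beta, resid.sum()   # class decision: smallest L21 residual
```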

Plenoptic Imaging for Seeing Through Turbulence

Atmospheric distortion is one of the main barriers to imaging over long distances. Changes in the local refractive index perturb light rays as they pass through, causing distortion in the images captured in a camera. This problem can be overcome to some extent by using a plenoptic imaging system (one which contains an array of microlenses in the optical path). In this paper, we propose a model of image distortion in the microlens images and propose a computational method for correcting the distortion. This algorithm estimates the distortion field in the microlenses. We then propose a second algorithm to infer a consistent final image from the multiple images of each pixel in the microlens array. These algorithms detect the distortion caused by changes in atmospheric refractive index and allow the reconstruction of a stable image even under turbulent imaging conditions. Finally we present some reconstruction results and examine whether there is any increase in performance from the camera system. We demonstrate that the system can detect and track distortions caused by turbulence and reconstruct an improved final image.

Richard C. Wilson, Edwin R. Hancock

Weighted Local Mutual Information for 2D-3D Registration in Vascular Interventions

In this paper, a new similarity measure, WLMI (Weighted Local Mutual Information), based on weighted patches and mutual information, is proposed to register a preoperative 3D CT model to intra-operative 2D X-ray images in vascular interventions. We embed this metric into a 2D-3D registration framework and show that the robustness and accuracy of the registration can be effectively improved by a local image patch selection strategy and a gradient-based weighted joint distribution calculation. Experiments on both synthetic and real X-ray image registration show that the proposed method produces considerably better registration results in a shorter time than the conventional MI and Normalized MI methods.

Cai Meng, Qi Wang, Shaoya Guan, Yi Xie

Cross-Model Retrieval with Reconstruct Hashing

Hashing has been widely used in large-scale vision problems thanks to its efficiency in both storage and speed. For fast cross-modal retrieval, cross-modal hashing (CMH) has received increasing attention recently, owing to its ability to improve the quality of hash coding by exploiting the semantic correlation across different modalities. Most traditional CMH methods focus on designing a good hash function to use supervised information appropriately, but their performance is limited by hand-crafted features. Some deep learning based CMH methods focus on learning good features with deep networks; however, directly quantizing these features may result in a large loss for hashing. In this paper, we propose a novel end-to-end deep cross-modal hashing framework that integrates feature and hash-code learning into the same network, preserving the relationship of features between modalities. For the hashing stage, we design a novel network structure and loss for hash learning, and reconstruct the hash codes back to features to improve the quality of the codes. Experiments on standard cross-modal retrieval databases show that the proposed method yields substantial boosts over the latest state-of-the-art hashing methods.

Yun Liu, Cheng Yan, Xiao Bai, Jun Zhou

Deep Supervised Hashing with Information Loss

Recently, hashing methods based on deep neural networks have greatly improved image retrieval performance by simultaneously learning feature representations and binary hash functions. Most deep hashing methods utilize supervision from semantic labels to preserve distance similarity within local structures, but the global distribution is ignored. We propose a novel deep supervised hashing method which aims to minimize the information loss during the low-dimensional embedding process. More specifically, we use Kullback-Leibler divergences to constrain the compact codes to have a distribution similar to that of the original images. Experimental results show that our method outperforms current state-of-the-art methods on benchmark datasets.

Xueni Zhang, Lei Zhou, Xiao Bai, Edwin Hancock
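One way to read the information-loss objective is as a KL divergence between pairwise similarity distributions before and after embedding, in the spirit of the sketch below; the Gaussian similarity, its bandwidth, and the use of raw codes rather than network activations are all assumptions for illustration.

```python
# Hedged sketch: KL divergence between pairwise similarity distributions.
import numpy as np

def pairwise_distribution(X, sigma=1.0):
    D = ((X[:, None] - X[None, :]) ** 2).sum(-1)
    P = np.exp(-D / (2 * sigma ** 2))
    np.fill_diagonal(P, 0.0)            # ignore self-similarity
    return P / P.sum()

def kl_information_loss(features, codes):
    P = pairwise_distribution(features)  # original-image distribution
    Q = pairwise_distribution(codes)     # embedding distribution
    mask = P > 0
    return float((P[mask] * np.log(P[mask] / (Q[mask] + 1e-12))).sum())
```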

Single Image Super Resolution via Neighbor Reconstruction

Super Resolution (SR) is a complex, ill-posed problem whose aim is to construct the mapping between the low and high resolution manifolds of image patches. Anchored neighborhood regression for SR (namely A+ [15]) has shown promising results. In this paper we present a new regression-based SR algorithm that overcomes the limitations of A+ and benefits from an innovative and simple Neighbor Reconstruction Method (NRM). This is achieved by vector operations on an anchor point and its corresponding neighborhood: NRM reconstructs new patches which are closer to the anchor point in the manifold space. Our method is robust to sparsely-sampled points, increasing PSNR by 0.5 dB compared to the next best method. We comprehensively validate our technique on standardised datasets and compare favourably with the state-of-the-art methods, obtaining a PSNR improvement of up to 0.21 dB over previously-reported work.

Zhihong Zhang, Zhuobin Xu, Zhiling Ye, Yiqun Hu, Lixin Cui, Lu Bai
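The NRM operation itself is a simple vector manipulation: synthesize new patches by moving each neighbor toward its anchor point in feature space. The interpolation factor below is an assumed parameter, and the exact vector operations in the paper may differ.

```python
# Hedged sketch: reconstruct neighbors closer to their anchor point.
import numpy as np

def neighbor_reconstruction(anchor, neighbors, alpha=0.5):
    """anchor: (d,); neighbors: (k, d) -> (k, d) reconstructed patches."""
    return anchor[None, :] + alpha * (neighbors - anchor[None, :])
```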

An Efficient Method for Boundary Detection from Hyperspectral Imagery

In this paper, we propose a novel method for efficient boundary detection in close-range hyperspectral images (HSI). We adopt different spectral similarity measurements to construct a sparse spectral-spatial affinity matrix that characterizes the similarity between the spectral responses of neighboring pixels within a local neighborhood. We then adopt a spectral clustering method, solving the eigenproblem and calculating the eigenvectors corresponding to the smallest eigenvalues. Morphological erosion is applied to each eigenvector to detect the boundary, and the results of all eigenvectors are fused to obtain the final boundary map. Our method is evaluated on a real-world HSI dataset and compared with three alternative methods. The results show that our method outperforms the alternatives and can cope with several scenarios that methods based on color images cannot handle.

Suhad Lateef Al-Khafaji, Jun Zhou, Alan Wee-Chung Liew

Graph-Theoretic Methods

Frontmatter

Bags of Graphs for Human Action Recognition

Bags of visual words are a well-known approach to image classification that has also been used in human action recognition. This model represents images or videos in a structure referred to as a bag of visual words before classification. The process of representing a video as a bag of visual words is known as the encoding process and is based on mapping the interest points detected in the scene into the new structure by means of a codebook. In this paper we propose to improve the representativeness of this model by including the structural relations between interest points using graph sequences. The proposed model achieves very competitive results for human action recognition and could also be applied to other graph sequence classification problems.

Xavier Cortés, Donatello Conte, Hubert Cardot

Categorization of RNA Molecules Using Graph Methods

RNA molecules are a group of biologically active molecules which have a similar structure to DNA. Graph-based methods for classification have shown promise on other biological compounds such as proteins. In this paper, we investigate the use of graph representations of RNA and graph-feature based methods, and their role in classifying RNA into particular categories. We describe a number of possible graph representations of RNA structure and how useful information can be encoded in the graph. We show how graph-kernel and graph-feature methods can be used to provide descriptors for the molecules. Finally, on a moderately-sized database of 419 RNA structures, we explore how these methods can be used to classify RNA into high-level categories provided by the biological context or function of the molecules. We find that graph descriptors give state-of-the-art performance on sequence classification, but that the graph elements of the description do not add useful information beyond the base sequence.

Richard C. Wilson, Enes Algul

Quantum Edge Entropy for Alzheimer’s Disease Analysis

In this paper, we explore how to decompose the global statistical mechanical entropy of a network into components associated with its edges. Commencing from a statistical mechanical picture in which the normalised Laplacian matrix plays the role of the Hamiltonian operator, the thermodynamic entropy can be calculated from the partition function associated with the energy level occupation distributions arising from Bose-Einstein and Fermi-Dirac statistics. Using the spectral decomposition of the Laplacian, we show how to project out the edge-entropy components, so that the detailed distribution of entropy across the edges of a network can be obtained. We apply the resulting method to fMRI activation networks to evaluate its qualitative and quantitative characterisations. The entropic measurement turns out to be an effective tool for identifying structural differences associated with Alzheimer's disease, by selecting the most salient anatomical brain regions.

Jianjia Wang, Richard C. Wilson, Edwin R. Hancock
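A hedged sketch of the statistical-mechanics picture: taking the normalised Laplacian eigenvalues as energy levels, Bose-Einstein occupation numbers yield a thermodynamic entropy, and passing the per-level entropies back through the eigenvectors gives a matrix whose off-diagonal entries can be read as edge components. The temperature, the chemical potential, and this particular edge projection are illustrative choices, not the paper's exact construction.

```python
# Hedged sketch: Bose-Einstein entropy of a graph and per-edge components.
import numpy as np

def bose_einstein_edge_entropy(W, beta=1.0, mu=-1.0):
    d = W.sum(1)
    L = np.eye(len(W)) - W / np.sqrt(np.outer(d, d))   # normalised Laplacian
    lam, phi = np.linalg.eigh(L)                       # energy levels
    n_occ = 1.0 / (np.exp(beta * (lam - mu)) - 1.0)    # BE occupations
    s = (1 + n_occ) * np.log1p(n_occ) - n_occ * np.log(n_occ)
    S_mat = phi @ np.diag(s) @ phi.T                   # entropy matrix function
    total = float(np.trace(S_mat))                     # global entropy
    edge_components = S_mat * (W > 0)                  # project onto edges
    return total, edge_components
```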

Approximating GED Using a Stochastic Generator and Multistart IPFP

The Graph Edit Distance defines the minimal cost of a sequence of elementary operations transforming one graph into another. This versatile concept with an intuitive interpretation is a fundamental tool in structural pattern recognition. However, the exact computation of the Graph Edit Distance is $$\mathcal {NP}$$-complete. Iterative algorithms such as those based on the Frank-Wolfe method provide a good approximation of the true edit distance with low execution times. However, since the underlying cost function is neither concave nor convex, the accuracy of such algorithms depends strongly on the initialization. In this paper, we propose a smart random initializer that reuses promising parts of previously computed solutions.

Nicolas Boria, Sébastien Bougleux, Luc Brun

Offline Signature Verification by Combining Graph Edit Distance and Triplet Networks

Biometric authentication by means of handwritten signatures is a challenging pattern recognition task, which aims to infer a writer model from only a handful of genuine signatures. In order to make it more difficult for a forger to attack the verification system, a promising strategy is to combine different writer models. In this work, we propose to complement a recent structural approach to offline signature verification based on graph edit distance with a statistical approach based on metric learning with deep neural networks. On the MCYT and GPDS benchmark datasets, we demonstrate that combining the structural and statistical models leads to significant improvements in performance, profiting from their complementary properties.

Paul Maergner, Vinaychandran Pondenkandath, Michele Alberti, Marcus Liwicki, Kaspar Riesen, Rolf Ingold, Andreas Fischer

On Association Graph Techniques for Hypergraph Matching

Association graph techniques represent a classical approach to tackle the graph matching problem and recently the idea has been generalized to the case of hypergraphs. In this paper, we explore the potential of this approach in conjunction with a class of dynamical systems derived from the Baum-Eagon inequality. In particular, we focus on the pure isomorphism case and show, with extensive experiments on a large synthetic dataset, that despite its simplicity the Baum-Eagon dynamics does an excellent job at finding globally optimal solutions.

Giulia Sandi, Sebastiano Vascon, Marcello Pelillo

Directed Network Analysis Using Transfer Entropy Component Analysis

In this paper, we present a novel method for detecting directed network characteristics using histogram statistics based on the degree distribution, associated with transfer entropy. The proposed model, grounded in information theory, learns a low-dimensional representation of sample graphs, which is obtained by a transfer entropy component analysis (TECA) model. In particular, we apply transfer entropy to measure the information transfer between different time series. For instance, for fMRI time series data, we can use the transfer entropy to effectively explore the connectivity between different brain functional regions, which plays a significant role in diagnosing Alzheimer's disease (AD) and its prodromal stage, mild cognitive impairment (MCI). With the properties of the directed graph in hand, we further encode it into an advanced graph representation based on the histogram statistics of the degree distribution and multilinear principal component analysis (MPCA). This not only reduces the memory space occupied by the huge transfer entropy matrix, but also gives the features a stronger representational capacity in the low-dimensional feature space. We conduct a classification experiment with the proposed model on fMRI time series data. The experimental results verify that our model can significantly improve the diagnosis accuracy for MCI subjects.

Meihong Wu, Yangbin Zeng, Zhihong Zhang, Haiyun Hong, Zhuobin Xu, Lixin Cui, Lu Bai, Edwin R. Hancock

A Mixed Entropy Local-Global Reproducing Kernel for Attributed Graphs

In this paper, we develop a new mixed entropy local-global reproducing kernel for vertex-attributed graphs, based on depth-based representations that naturally reflect both local and global entropy-based graph characteristics. Specifically, for a pair of graphs, we commence by computing the nested depth-based representations rooted at the centroid vertices. The resulting mixed local-global reproducing kernel for a pair of graphs is computed by measuring a basic $$H^1$$-reproducing kernel between their nested representations associated with different entropy measures. We show that the proposed kernel not only reflects both the local and global graph characteristics through the nested depth-based representations, but also reflects rich edge connection information and vertex label information through the different entropy measures. Moreover, since both the required basic $$H^1$$-reproducing kernel and the nested depth-based representation can be computed in polynomial time, the proposed kernel possesses an efficient computational complexity. Experiments on standard graph datasets demonstrate the effectiveness and efficiency of the proposed kernel.

Lixin Cui, Lu Bai, Luca Rossi, Zhihong Zhang, Lixiang Xu, Edwin R. Hancock

Dirichlet Densifiers: Beyond Constraining the Spectral Gap

In this paper, we derive a new bound for commute times estimation. This bound does not rely on the spectral gap but on graph densification (or graph rewiring). Firstly, we motivate the bound by showing that implicitly constraining the spectral gap through graph densification cannot fully explain some estimations in real datasets. Then, we set our working hypothesis: if densification can deal with a small/moderate degradation of the spectral gap, this is due to the fact that inter-cluster commute distances are considerably shrunk. This suggests a more detailed bound which explicitly accounts for the shrinking effect of densification. Finally, we formally develop this bound, thus uncovering the deep implications of graph densification in commute times estimation.

Manuel Curado, Francisco Escolano, Miguel Angel Lozano, Edwin R. Hancock

Backmatter
