Kernels for acyclic digraphs

doi:10.1016/j.patrec.2012.07.017

Pattern Recognition Letters

Volume 33, Issue 16, 1 December 2012, Pages 2239-2244

https://doi.org/10.1016/j.patrec.2012.07.017

Abstract

This paper proposes two efficient kernels for comparing acyclic, directed graphs. The first kernel counts the number of common paths and allows for weighting by path length and/or by the vertices contained in each particular path. The second kernel counts the number of paths in common minors of the graphs involved and likewise allows for length- and vertex-weighting. Both kernels have an algorithmic complexity that is cubic in the size of the vertex set. The performance of the algorithms is concisely demonstrated using synthetic and real data.

Highlights

► Graph kernel of complexity O(∣V∣³) that evaluates all common paths in DAGs with common vertex set V. ► Graph kernel of complexity O(∣V∣³) that evaluates all paths of all common graph minors in DAGs with common vertex set V. ► Kernels allow for weighting the paths according to the vertices included. ► Performance is demonstrated on synthetic data.

Introduction

Graphs comprise a natural way of representing data structures from a variety of branches of science and engineering. Hence, graph comparison, graph distance, graph classification and graph similarity are important issues in many pattern recognition and machine learning domains (Bunke and Riesen, 2011). Examples are easily found in data mining (Cook and Holder, 2007), computer vision (Kandel et al., 2007), social networks (Faust and Skvoretz, 2002, Kleinnijenhuis et al., 1997, Widmer, 2010), chemistry and bio-informatics (Borgwardt, 2011, Hattori et al., 2003, Sperschneider, 2008), taxonomy (Baum, 2007), image recognition (Wu et al., 2009), analysis of semantic structures (Lin et al., 2010) and e-business (Shvaiko and Euzenat, 2005). After numerical vectors and strings, graphs are probably the most frequently encountered data type in information science.

Kernel methods (Shawe-Taylor and Cristianini, 2004) are a computationally very attractive class of methods since they can be applied in high-dimensional feature spaces while avoiding the direct use of the feature map. A kernel function generates a Gram matrix, which in turn allows for the application of a variety of linear statistical models that serve discrimination, classification and dimensionality reduction (Cristianini and Shawe-Taylor, 2000, Shawe-Taylor and Cristianini, 2004).

Furthermore, kernel functions have very attractive closure properties (Haussler, 1999, Shawe-Taylor and Cristianini, 2004) that allow us to easily combine different feature spaces in learning problems that operate on the Gram matrix. For example, in image recognition, distinct kernels for, e.g., texture and color can be combined to enhance learning performance (Gehler and Nowozin, 2009).
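To illustrate the closure properties mentioned above, the following is a minimal sketch, assuming two toy linear kernels on hypothetical "texture" and "color" feature vectors. It relies only on two standard facts: a sum of kernels with nonnegative weights is again a kernel, and so is the elementwise (Schur) product of two kernels. All names (`gram`, `combine_sum`, `combine_product`, the toy items) are illustrative and do not come from the paper.

```python
def gram(items, kernel):
    """Gram matrix K[i][j] = kernel(items[i], items[j])."""
    return [[kernel(a, b) for b in items] for a in items]


def dot(x, y):
    """Plain inner product of two equal-length numeric tuples."""
    return sum(xi * yi for xi, yi in zip(x, y))


def combine_sum(k1, k2, w1=1.0, w2=1.0):
    """Weighted sum of two Gram matrices; weights must be nonnegative."""
    if w1 < 0 or w2 < 0:
        raise ValueError("weights must be nonnegative to stay a kernel")
    return [[w1 * a + w2 * b for a, b in zip(r1, r2)] for r1, r2 in zip(k1, k2)]


def combine_product(k1, k2):
    """Elementwise (Schur) product of two Gram matrices."""
    return [[a * b for a, b in zip(r1, r2)] for r1, r2 in zip(k1, k2)]


# Hypothetical items: (texture features, color features).
items = [((1.0, 0.0), (2.0,)), ((0.0, 1.0), (3.0,))]

k_texture = gram(items, lambda a, b: dot(a[0], b[0]))
k_color = gram(items, lambda a, b: dot(a[1], b[1]))

k_sum = combine_sum(k_texture, k_color)
k_prod = combine_product(k_texture, k_color)
```

Either combined matrix can be passed to a kernel-based learner in place of a single-feature Gram matrix; the choice of weights is a modeling decision and is not prescribed here.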

Recently, Vishwanathan et al. (2010) proposed very general graph kernels that generalize random walk kernels (Gärtner et al., 2003) and marginalized kernels (Kashima et al., 2003). However, such kernels tend to focus on shorter walks, and attempts to prevent this (e.g. Mahé et al., 2004) lead to tottering, i.e. to a disproportionate contribution of self-repeating walks. The Weisfeiler-Lehman kernels proposed by Shervashidze et al. (2010) are promising: they efficiently handle very large graphs by comparing subtrees of limited height. The aim of the present paper is to propose kernels for directed acyclic graphs that are direct generalizations of the subsequence kernels for strings dealt with in Elzinga et al., 2008, Elzinga et al., 2011, Lodhi et al., 2002, Wang and Lin, 2007. Such kernels are less general, but they efficiently exploit paths instead of walks and do not rely on sampling.

To that end, after concisely discussing some concepts and notation in Section 2 and related work in Section 3, we discuss two kernels based on all common (vertex-weighted) paths in Section 4 and evaluate their performance with synthetic data. In Section 5, we discuss our results.

Section snippets

Preliminaries

Here, we write G = (V, E, L) to denote a graph (e.g. Diestel, 2005) on a set of vertices V = {v₁ , … , v_n} with edges E ⊆ V × V and a set of labels L. For the sake of clarity, we mostly confine ourselves to unlabeled graphs, i.e. graphs where ∣L∣ = 1, and therefore we will often ignore the labeling. However, the results are easily transferred to labeled graphs. An n-walk on G consists of a sequence of n + 1 vertices $v_{i_{1}}, \dots, v_{i_{n + 1}}$ from V such that $(v_{i_{j}}, v_{i_{j + 1}}) \in E$ for all 1 ⩽ j ⩽ n. Note that a walk may visit some or all of its

Related work

Over the last decade, many (kernel) methods have been proposed for graph comparison. Here we confine the discussion to kernel methods based on counting or estimating the number of common walks or paths in a pair of graphs.

The kernels that we propose simply count all common paths of the graphs involved and the more of these common paths, the more similar the graphs. Another and almost equally simple idea is that of a random walk to measure similarity: given two graphs, perform a random walk on

A vertex-weighted paths kernel

Let P^∗ denote the set of all possible paths over a set of vertices V and let $Z^{*}$ denote the set of nonnegative integers. We define an arbitrary but fixed map $r : P^{*} \to Z^{*}$ that assigns to each path p ∈ P^∗ a unique integer $r(p) \in Z^{*}$. Next, for a graph G = (V, E), we create a feature map ϕ(G) = (x₁, x₂ , …) through $x_{r(p)} = \begin{cases} 1 & \text{if } p \in G \\ 0 & \text{otherwise} \end{cases}$ and the inner product 〈ϕ(G), ϕ(G′)〉 now counts the number of common paths of the pair (G,G′). There is a very simple, adjacency matrix based kernel function to evaluate such inner
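What the inner product 〈ϕ(G), ϕ(G′)〉 counts can be sketched in a few lines, under stated assumptions: both graphs are unlabeled DAGs on the same vertex set 0..n-1, and only paths with at least one edge are counted. A path is common to G and G′ exactly when every one of its edges lies in both edge sets, so the count equals the number of paths in the graph with edge set E ∩ E′. The function names are hypothetical, and the recursion below is a plain dynamic program over successors, not the adjacency-matrix formula that the paper develops.

```python
from functools import lru_cache


def count_paths(n, edges):
    """Count directed paths with at least one edge in a DAG on 0..n-1.

    Assumes the edge set is acyclic; a cycle would make the recursion
    below run forever.
    """
    succ = [[] for _ in range(n)]
    for u, v in set(edges):  # ignore duplicate edges
        succ[u].append(v)

    @lru_cache(maxsize=None)
    def paths_from(u):
        # Each successor v contributes the edge (u, v) itself plus every
        # path that continues onward from v.
        return sum(1 + paths_from(v) for v in succ[u])

    return sum(paths_from(u) for u in range(n))


def common_path_count(n, edges_g, edges_h):
    """Number of paths that occur in both DAGs (shared vertex set)."""
    common_edges = set(edges_g) & set(edges_h)
    return count_paths(n, common_edges)


# Two small DAGs on the vertex set {0, 1, 2, 3}.
G = [(0, 1), (1, 2), (0, 2), (2, 3)]
H = [(0, 1), (1, 2), (2, 3), (1, 3)]

k_gh = common_path_count(4, G, H)  # shared edges form the chain 0-1-2-3
k_gg = common_path_count(4, G, G)
k_hh = common_path_count(4, H, H)
```

Here the shared edges form a chain on four vertices, which contains 3 + 2 + 1 = 6 paths, while each graph compared with itself yields 8. Length or vertex weights are deliberately left out of this sketch.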

A comparative evaluation

In this section, the proposed kernels are evaluated by comparing them with some of the graph similarity measures commonly used in the literature. We considered SimPack (Bernstein et al., 2005), a generic Java library for similarity measures. It includes the classical graph similarity measures: Conceptual Similarity, Graph Isomorphism, Subgraph Isomorphism, Maximum Common Subgraph Isomorphism, Graph Isomorphism Covering (Ullman, 1976, Jamil, 2011, Weber et al., in press, Bunke, 1997, Bunke and

Conclusion

The results shown in Fig. 4 and in Table 2, Table 3, on synthetic and real data, demonstrate that the proposed kernels are quite efficient and effective, and that they can easily handle large graphs. The full weighting as in (17) has only a negligible effect on performance (not shown). Tottering is not an issue and implementation is very easy. Furthermore, paths and tickets are more expressive features than walks or just shortest paths. It would be interesting to modify the kernels in

Acknowledgements

We are grateful to two anonymous reviewers whose remarks stimulated the comparative evaluations reported in Section 5. Furthermore, we would like to express our gratitude to Luis Trindade, who provided valuable assistance in the comparative experiments.

References (45)

H. Bunke et al., Recent advances in graph-based pattern recognition with applications in document analysis, Pattern Recognition (2011)
H. Bunke et al., A graph distance metric based on the maximal common subgraph, Pattern Recognition Lett. (1998)
C.H. Elzinga et al., Algorithms for subsequence combinatorics, Theor. Comput. Sci. (2008)
C.H. Elzinga et al., Concordance and consensus, Inform. Sci. (2011)
K. Riesen et al., Approximate graph edit distance computation by means of bipartite graph matching, Image Vision Comput. (2009)
R. Bapat, Graphs and Matrices (2011)
D.A. Baum, Concordance trees, concordance factors, and the exploration of reticulate genealogy, Taxon (2007)
Bernstein, A., Kaufmann, E., Kiefer, C., Bürki, C., 2005. SimPack: A Generic Java Library for Similarity Measures in...
K.M. Borgwardt, Kernel methods in bioinformatics
K.M. Borgwardt et al., Shortest-path kernels on graphs
H. Bunke, On a relation between graph edit distance and maximum common subgraph, Pattern Recognition Lett. (1997)
D.J. Cook et al., Mining Graph Data (2007)
T.H. Cormen et al., Introduction to Algorithms (2009)
N. Cristianini et al., An Introduction to Support Vector Machines and Other Kernel-based Learning Methods (2000)
R. Diestel, Graph Theory (2005)
K. Faust et al., Comparing networks across space and time, size and species, Sociol. Methodol. (2002)
M.L. Fernandez et al., A graph distance metric combining maximum common subgraph and minimum common supergraph, Pattern Recognition Lett. (2001)
T. Gärtner et al., On graph kernels: Hardness results and efficient alternatives
P. Gehler et al., On feature combination for multiclass object classification
M. Hattori et al., Development of a chemical structure comparison method for integrated analysis of chemical and genomic information in the metabolic pathways, J. Amer. Chem. Soc. (2003)
Haussler, D., 1999. Convolution Kernels on Discrete Structures. Tech. Rep. UCSC-CRL-99-10. Department of Computer...
W. Imrich et al., Topics in Graph Theory (2008)


¹: Address: Faculty of Computer Science and Technology, Inner Mongolia University of the Nationalities, PR China.
