Kernels for acyclic digraphs
Highlights
► Graph kernel of complexity O(∣V∣3) that evaluates all common paths in DAG’s with common vertex set V. ► Graph kernel of complexity O(∣V∣3) that evaluates all paths of all common graph-minors in DAG’s with common vertex set V. ► Kernels allow for weighing the paths according to the vertices included. ► Performance is demonstrated on synthetic data.
Introduction
Graphs comprise a natural way of representing data structures from a variety of branches of science and engineering. Hence, graph comparison, graph distance, graph classification and graph similarity are important issues in many pattern recognition and machine learning domains (Bunke and Riesen, 2011). Examples are easily found in data mining (Cook and Holder, 2007), computer vision (Kandel et al., 2007), social networks (Faust and Skvoretz, 2002, Kleinnijenhuis et al., 1997, Widmer, 2010), chemistry and bio-informatics (Borgwardt, 2011, Hattori et al., 2003, Sperschneider, 2008), taxonomy (Baum, 2007), image recognition (Wu et al., 2009), analysis of semantic structures (Lin et al., 2010) and e-business (Shvaiko and Euzenat, 2005). After numerical vectors and strings, graphs are probably the most frequently encountered data type in information science.
Kernel methods (Shaw-Taylor and Cristianini, 2004) are a computationally very attractive class of methods since they can be applied in high-dimensional feature spaces while avoiding the direct use of the feature map. So, a kernel function generates a Gram matrix which in turn allows for the application of a variety of linear statistical models that subserve discrimination, classification and dimensionality reduction (Cristianini and Shawe-Taylor, 2000, Shaw-Taylor and Cristianini, 2004).
Furthermore, kernel functions have very attractive closure properties (Haussler, 1999, Shaw-Taylor and Cristianini, 2004) that allow us to easily combine different feature spaces in learning problems that operate on the Gram matrix. For example, in image recognition, distinct kernels for e.g. texture and color can be combined to enhance learning performance (Gehler and Nowozin, 2009).
Recently, very general graph kernels have been proposed that generalize random walk (Gärtner et al., 2003) and marginalized kernels (Kashima et al., 2003) by Vishwanathan et al. (2010). But such kernels tend to focus on shorter walks and attempts (e.g. Mahé et al., 2004) to prevent this, lead to tottering, i.e. to a disproportionate contribution of self-repeating walks. Promising are the Weisfeiler-Lehman kernels as proposed by Shervazidze et al. (2010) that efficiently handle very large graphs by comparing subtrees of limited height. The aim of the present paper is to propose kernels for directed acyclic graphs that are direct generalizations of subsequence kernels for strings as dealt with in Elzinga et al., 2008, Elzinga et al., 2011, Lodhi et al., 2002, Wang and Lin, 2007. Such kernels are less general but efficiently exploit paths instead of walks and do not rely on sampling.
Thereto, after concisely discussing some concepts and notation in Section 2 and related work in Section 3, we discuss two kernels based on all common (vertex-weighted) paths in Section 4 and evaluate their performance with synthetic data. In Section 5, we discuss our results.
Section snippets
Preliminaries
Here, we write G = (V, E, L) to denote a graph (e.g. Diestel, 2005) on a set of vertices V = {v1 , … , vn} with edges E ⊆ V × V and a set of labels L. For the sake of clarity, we mostly confine to unlabeled graphs, i.e. graphs where ∣L∣ = 1, and therefore, we will often ignore the labeling. However, the results are easily transferred to labeled graphs. An n-walk on G consists of a sequence of n + 1 vertices from V such that for all 1 ⩽ j ⩽ n. Note that a walk may visit some or all of its
Related work
Over the last decade, many (kernel) methods have been proposed for graph comparison. Here we confine the discussion to kernel methods based on counting or estimating the number of common walks or paths in a pair of graphs.
The kernels that we propose simply count all common paths of the graphs involved and the more of these common paths, the more similar the graphs. Another and almost equally simple idea is that of a random walk to measure similarity: given two graphs, perform a random walk on
A vertex-weighted paths kernel
Let P∗ denote the set of all possible paths over a set of vertices V and let denote the set of nonnegative integers. We define an arbitrary but fixed map that assigns to each path p ∈ P∗ a unique integer . Next, for a graph G = (V, E), we create a feature map ϕ(G) = (x1, x2 , …) throughand the inner product 〈ϕ(G), ϕ(G′)〉 now counts the number of common paths of the pair (G,G′). There is a very simple, adjacency matrix based kernel function to evaluate such inner
A comparative evaluation
In this section the proposed kernels are evaluated by comparing with some of the graph similarity measures commonly used in the literature. We considered SimPack (Bernstein et al., 2005), a generic Java library for similarity measures. It includes the classical graph similarity measures: Conceptual Similarity, Graph Isomorphism, Subgraph Isomorphism, Maximum Common Subgraph Isomorphism, Graph Isomorphism Covering (Ullman, 1976, Jamil, 2011, Weber et al., in press, Bunke, 1997, Bunke and
Conclusion
The results shown in Fig. 4 and in Table 2, Table 3, on synthetic and real data, demonstrate that the proposed kernels are quite efficient and effective, and they can easily handle big graphs. The effect of the full weighting as in (17) only has a negligible effect on performance (not shown). Tottering is not an issue and implementation is very easy. Furthermore, paths and tickets are more expressive features than walks or just shortest paths. It would be interesting to modify the kernels in
Acknowledgements
We are grateful to two anonymous reviewers whose remarks stimulated the comparative evaluations reported in Section 5. Furthermore, we like to express our gratitude to Luis Trindade who provided valuable assistance in the comparative experiments.
References (45)
- et al.
Recent advances in graph-based pattern recognition with applications in document analysis
Pattern Recognition
(2011) - et al.
A graph distance metric based on the maximal common subgraph
Pattern Recognition Lett.
(1998) - et al.
Algorithms for subsequence combinatorics
Theor. Comput. Sci.
(2008) - et al.
Concordance and consensus
Inform. Sci.
(2011) - et al.
Approximate graph edit distance computation by means of bipartite graph matching
Image Vision Comput.
(2009) Graphs and Matrices
(2011)Concordance trees, concordance factors, and the exploration of reticulate genealogy
Taxon
(2007)- Bernstein, A., Kaufmann, E., Kiefer, C., Bürki, C., 2005. SimPack: A Generic Java Library for Similarity Measures in...
Kernel methods in bioinformatics
- et al.
Shortest-path kernels on graphs
On a relation between graph edit distance and maximum common subgraph
Pattern Recognition Lett.
Mining Graph Data
Introduction to Algorithms
An Introduction to Support Vector Machines and Other Kernel-based Learning Methods
Graph Theory
Comparing networks across space and time, size and species
Sociol. Methodol.
A graph distance metric combining maximum common subgraph and minimum common supergraph
Pattern Recognition Lett.
On graph kernels: Hardness results and efficient alternatives
On feature combination for multiclass object classification
Development of a chemical structure comparison method for integrated analysis of chemical and genomic information in the metabolic pathways
J. Amer. Chem. Soc.
Topics in Graph Theory
Cited by (13)
Quantifying sequential subsumption
2019, Theoretical Computer ScienceVersatile string kernels
2013, Theoretical Computer ScienceWarshall’s algorithm—survey and applications
2021, Annales Mathematicae et InformaticaeFlowchart-based cross-language source code similarity detection
2020, Scientific ProgrammingComplex spatial region representation and similar matching for multi-object image retrieval
2020, Proceedings of SPIE - The International Society for Optical Engineering
- 1
Address: Faculty of Computer Science and Technology, Inner Mongolia University of the Nationalities, PR China.