Abstract
Finding subgraph isomorphisms is an important problem in many applications which deal with data modeled as graphs. While this problem is NP-hard, in recent years, many algorithms have been proposed to solve it in a reasonable time for real datasets using different join orders, pruning rules, and auxiliary neighborhood information. However, since they have not been empirically compared one another in most research work, it is not clear whether the later work outperforms the earlier work. Another problem is that reported comparisons were often done using the original authors' binaries which were written in different programming environments. In this paper, we address these serious problems by re-implementing five state-of-the-art subgraph isomorphism algorithms in a common code base and by comparing them using many real-world datasets and their query loads. Through our in-depth analysis of experimental results, we report surprising empirical findings.
- J. Cheng, Y. Ke, W. Ng, and A. Lu. Fg-index: towards verification-free query processing on graph databases. In SIGMOD, pages 857-872, 2007. Google ScholarDigital Library
- L. P. Cordella, P. Foggia, C. Sansone, and M. Vento. A (sub)graph isomorphism algorithm for matching large graphs. IEEE PAMI, 26(10):1367-1372, 2004. Google ScholarDigital Library
- W.-S. Han, J. Lee, M.-D. Pham, and J. X. Yu. iGraph: A framework for comparisons of disk-based graph indexing techniques. PVLDB, 3(1):449-459, 2010. Google ScholarDigital Library
- H. He and A. K. Singh. Closure-tree: An index structure for graph queries. In ICDE, pages 38-49, 2006. Google ScholarDigital Library
- H. He and A. K. Singh. Graphs-at-a-time: query language and access methods for graph databases. In SIGMOD, pages 405-418, 2008. Google ScholarDigital Library
- A. Khan, N. Li, X. Yan, Z. Guan, S. Chakraborty, and S. Tao. Neighborhood based fast graph search in large networks. In SIGMOD, pages 901-912, 2011. Google ScholarDigital Library
- D. E. Knuth. Estimating the efficiency of backtrack programs. Technical report, Stanford, CA, USA, 1974. Google ScholarDigital Library
- G. Miklau. Xml data repository. http://www.cs.washington.edu/research/xmldatasets.Google Scholar
- M. Mongiovi, R. D. Natale, R. Giugno, A. Pulvirenti, A. Ferro, and R. Sharan. Sigma: a set-cover-based inexact graph matching algorithm. J. Bioinformatics and Computational Biology, 8(2):199-218, 2010.Google ScholarCross Ref
- R. Shamir and D. Tsur. Faster subtree isomorphism. Theory of Computing Systems, Israel Symposium on the, 0:126, 1997. Google ScholarDigital Library
- H. Shang, Y. Zhang, X. Lin, and J. X. Yu. Taming verification hardness: an efficient algorithm for testing subgraph isomorphism. PVLDB, 1(1):364-375, 2008. Google ScholarDigital Library
- D. Shasha, J. T. L. Wang, and R. Giugno. Algorithmics and applications of tree and graph searching. In PODS, pages 39-52, 2002. Google ScholarDigital Library
- Y. Tian and J. M. Patel. TALE: A tool for approximate large graph matching. In ICDE, pages 963-972, 2008. Google ScholarDigital Library
- J. R. Ullmann. An algorithm for subgraph isomorphism. J. ACM, 23:31-42, January 1976. Google ScholarDigital Library
- X. Yan and J. Han. gspan: Graph-based substructure pattern mining. In ICDM, pages 721-724, 2002. Google ScholarDigital Library
- X. Yan, P. S. Yu, and J. Han. Graph indexing: A frequent structure-based approach. In SIGMOD, pages 335-346, 2004. Google ScholarDigital Library
- S. Zhang, S. Li, and J. Yang. GADDI: distance index based subgraph matching in biological networks. In EDBT, pages 192-203, 2009. Google ScholarDigital Library
- P. Zhao and J. Han. On graph query optimization in large networks. PVLDB, 3(1):340-351, 2010. Google ScholarDigital Library
- P. Zhao, J. X. Yu, and P. S. Yu. Graph indexing: Tree + delta ≥ graph. In VLDB, pages 938-949, 2007. Google ScholarDigital Library
- L. Zou, L. Chen, J. X. Yu, and Y. Lu. A novel spectral coding in a large graph database. In EDBT, pages 181-192, 2008. Google ScholarDigital Library
Index Terms
- An in-depth comparison of subgraph isomorphism algorithms in graph databases
Recommendations
Polynomial-time algorithms for Subgraph Isomorphism in small graph classes of perfect graphs
Given two graphs, Subgraph Isomorphism is the problem of deciding whether the first graph (the base graph) contains a subgraph isomorphic to the second one (the pattern graph). This problem is NP-complete even for very restricted graph classes such as ...
The subgraph isomorphism problem for outerplanar graphs
This paper deals with the subgraph isomorphism problem for outerplanar graphs (SUBOUTISOM). In general, since trees and forests are outerplanar, SUBOUTISOM is NP-complete. We show that SUBOUTISOM remains NP-complete even when the strongest connectivity ...
Induced subgraph isomorphism
The complexity of the subgraph isomorphism problem where the pattern graph is of fixed size is well known to depend on the topology of the pattern graph. Here, we present two results which, in contrast, provide evidence that no topology of an induced ...
Comments