skip to main content
article

An in-depth comparison of subgraph isomorphism algorithms in graph databases

Authors Info & Claims
Published:01 December 2012Publication History
Skip Abstract Section

Abstract

Finding subgraph isomorphisms is an important problem in many applications which deal with data modeled as graphs. While this problem is NP-hard, in recent years, many algorithms have been proposed to solve it in a reasonable time for real datasets using different join orders, pruning rules, and auxiliary neighborhood information. However, since they have not been empirically compared one another in most research work, it is not clear whether the later work outperforms the earlier work. Another problem is that reported comparisons were often done using the original authors' binaries which were written in different programming environments. In this paper, we address these serious problems by re-implementing five state-of-the-art subgraph isomorphism algorithms in a common code base and by comparing them using many real-world datasets and their query loads. Through our in-depth analysis of experimental results, we report surprising empirical findings.

References

  1. J. Cheng, Y. Ke, W. Ng, and A. Lu. Fg-index: towards verification-free query processing on graph databases. In SIGMOD, pages 857-872, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. L. P. Cordella, P. Foggia, C. Sansone, and M. Vento. A (sub)graph isomorphism algorithm for matching large graphs. IEEE PAMI, 26(10):1367-1372, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. W.-S. Han, J. Lee, M.-D. Pham, and J. X. Yu. iGraph: A framework for comparisons of disk-based graph indexing techniques. PVLDB, 3(1):449-459, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. H. He and A. K. Singh. Closure-tree: An index structure for graph queries. In ICDE, pages 38-49, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. H. He and A. K. Singh. Graphs-at-a-time: query language and access methods for graph databases. In SIGMOD, pages 405-418, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. A. Khan, N. Li, X. Yan, Z. Guan, S. Chakraborty, and S. Tao. Neighborhood based fast graph search in large networks. In SIGMOD, pages 901-912, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. D. E. Knuth. Estimating the efficiency of backtrack programs. Technical report, Stanford, CA, USA, 1974. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. G. Miklau. Xml data repository. http://www.cs.washington.edu/research/xmldatasets.Google ScholarGoogle Scholar
  9. M. Mongiovi, R. D. Natale, R. Giugno, A. Pulvirenti, A. Ferro, and R. Sharan. Sigma: a set-cover-based inexact graph matching algorithm. J. Bioinformatics and Computational Biology, 8(2):199-218, 2010.Google ScholarGoogle ScholarCross RefCross Ref
  10. R. Shamir and D. Tsur. Faster subtree isomorphism. Theory of Computing Systems, Israel Symposium on the, 0:126, 1997. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. H. Shang, Y. Zhang, X. Lin, and J. X. Yu. Taming verification hardness: an efficient algorithm for testing subgraph isomorphism. PVLDB, 1(1):364-375, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. D. Shasha, J. T. L. Wang, and R. Giugno. Algorithmics and applications of tree and graph searching. In PODS, pages 39-52, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. Y. Tian and J. M. Patel. TALE: A tool for approximate large graph matching. In ICDE, pages 963-972, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. J. R. Ullmann. An algorithm for subgraph isomorphism. J. ACM, 23:31-42, January 1976. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. X. Yan and J. Han. gspan: Graph-based substructure pattern mining. In ICDM, pages 721-724, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. X. Yan, P. S. Yu, and J. Han. Graph indexing: A frequent structure-based approach. In SIGMOD, pages 335-346, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. S. Zhang, S. Li, and J. Yang. GADDI: distance index based subgraph matching in biological networks. In EDBT, pages 192-203, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. P. Zhao and J. Han. On graph query optimization in large networks. PVLDB, 3(1):340-351, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. P. Zhao, J. X. Yu, and P. S. Yu. Graph indexing: Tree + delta ≥ graph. In VLDB, pages 938-949, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. L. Zou, L. Chen, J. X. Yu, and Y. Lu. A novel spectral coding in a large graph database. In EDBT, pages 181-192, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. An in-depth comparison of subgraph isomorphism algorithms in graph databases

        Recommendations

        Comments

        Login options

        Check if you have access through your login credentials or your institution to get full access on this article.

        Sign in

        Full Access

        • Published in

          cover image Proceedings of the VLDB Endowment
          Proceedings of the VLDB Endowment  Volume 6, Issue 2
          December 2012
          120 pages

          Publisher

          VLDB Endowment

          Publication History

          • Published: 1 December 2012
          Published in pvldb Volume 6, Issue 2

          Qualifiers

          • article

        PDF Format

        View or Download as a PDF file.

        PDF

        eReader

        View online with eReader.

        eReader