skip to main content
research-article

Exploiting vertex relationships in speeding up subgraph isomorphism over large graphs

Published:01 January 2015Publication History
Skip Abstract Section

Abstract

Subgraph Isomorphism is a fundamental problem in graph data processing. Most existing subgraph isomorphism algorithms are based on a backtracking framework which computes the solutions by incrementally matching all query vertices to candidate data vertices. However, we observe that extensive duplicate computation exists in these algorithms, and such duplicate computation can be avoided by exploiting relationships between data vertices. Motivated by this, we propose a novel approach, BoostIso, to reduce duplicate computation. Our extensive experiments with real datasets show that, after integrating our approach, most existing subgraph isomorphism algorithms can be speeded up significantly, especially for some graphs with intensive vertex relationships, where the improvement can be up to several orders of magnitude.

References

  1. A. V. Aho, M. R. Garey, and J. D. Ullman. The transitive reduction of a directed graph. SIAM, 1: 131--137, 1972.Google ScholarGoogle ScholarCross RefCross Ref
  2. Q. Chen, A. Lim, and K. W. Ong. D(k)-index: An adaptive structural summary for graph-structured data. In SIGMOD, pages 134--144, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. L. P. Cordella, P. Foggia, C. Sansone, and M. Vento. A (sub) graph isomorphism algorithm for matching large graphs. Pattern Anal. Mach. Intell., IEEE Trans, 26(10): 1367--1372, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. W. Fan, J. Li, X. Wang, and Y. Wu. Query preserving graph compression. In SIGMOD, pages 157--168, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. M. R. Garey and D. S. Johnson. Computer and intractability. A Guide to the NP-Completeness, 1979. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. W.-S. Han, J. Lee, and J.-H. Lee. Turbo iso: towards ultrafast and robust subgraph isomorphism search in large graph databases. In SIGMOD, pages 337--348, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. H. He and A. K. Singh. Query language and access methods for graph databases. In Manag. and Min. Graph Data, pages 125--160. 2010.Google ScholarGoogle ScholarCross RefCross Ref
  8. R. Kaushik, P. Shenoy, P. Bohannon, and E. Gudes. Exploiting local similarity for indexing paths in graph-structured data. In ICDE, pages 129--140, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. J. Lee, W.-S. Han, R. Kasperovics, and J.-H. Lee. An in-depth comparison of subgraph isomorphism algorithms in graph databases. In VLDB, pages 133--144, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. T. Milo and D. Suciu. Index structures for path expressions. In ICDT, pages 277--295, 1999. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. S. Navlakha, R. Rastogi, and N. Shrivastava. Graph summarization with bounded error. In SIGMOD, pages 419--432, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. H. Shang, Y. Zhang, X. Lin, and J. X. Yu. Taming verification hardness: an efficient algorithm for testing subgraph isomorphism. PVLDB, 1(1): 364--375, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. Z. Sun, H. Wang, H. Wang, B. Shao, and J. Li. Efficient subgraph matching on billion node graphs. PVLDB, 5: 788--799, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. Y. Tian, R. A. Hankins, and J. M. Patel. Efficient aggregation for graph summarization. In SIGMOD, pages 567--580, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. J. R. Ullmann. An algorithm for subgraph isomorphism. JACM, 23(1): 31--42, 1976. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. N. Zhang, Y. Tian, and J. M. Patel. Discovery-driven graph summarization. In ICDE, 2010.Google ScholarGoogle ScholarCross RefCross Ref
  17. P. Zhao and J. Han. On graph query optimization in large networks. PVLDB, 3: 340--351, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Exploiting vertex relationships in speeding up subgraph isomorphism over large graphs
        Index terms have been assigned to the content through auto-classification.

        Recommendations

        Comments

        Login options

        Check if you have access through your login credentials or your institution to get full access on this article.

        Sign in

        Full Access

        PDF Format

        View or Download as a PDF file.

        PDF

        eReader

        View online with eReader.

        eReader