skip to main content
10.1145/2463676.2465300acmconferencesArticle/Chapter ViewAbstractPublication PagesmodConference Proceedingsconference-collections
research-article

Turboiso: towards ultrafast and robust subgraph isomorphism search in large graph databases

Published:22 June 2013Publication History

ABSTRACT

Given a query graph q and a data graph g, the subgraph isomorphism search finds all occurrences of q in g and is considered one of the most fundamental query types for many real applications. While this problem belongs to NP-hard, many algorithms have been proposed to solve it in a reasonable time for real datasets. However, a recent study has shown, through an extensive benchmark with various real datasets, that all existing algorithms have serious problems in their matching order selection. Furthermore, all algorithms blindly permutate all possible mappings for query vertices, often leading to useless computations. In this paper, we present an efficient and robust subgraph search solution, called TurboISO, which is turbo-charged with two novel concepts, candidate region exploration and the combine and permute strategy (in short, Comb/Perm). The candidate region exploration identifies on-the-fly candidate subgraphs (i.e, candidate regions), which contain embeddings, and computes a robust matching order for each candidate region explored. The Comb/Perm strategy exploits the novel concept of the neighborhood equivalence class (NEC). Each query vertex in the same NEC has identically matching data vertices. During subgraph isomorphism search, Comb/Perm generates only combinations for each NEC instead of permutating all possible enumerations. Thus, if a chosen combination is determined to not contribute to a complete solution, all possible permutations for that combination will be safely pruned. Extensive experiments with many real datasets show that TurboISO consistently and significantly outperforms all competitors by up to several orders of magnitude.

References

  1. R. A. Baeza-Yates and G. Valiente. An image similarity measure based on graph matching. In SPIRE, pages 28--38, 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. J. Cheng, Y. Ke, W. Ng, and A. Lu. Fg-index: towards verification-free query processing on graph databases. In SIGMOD, pages 857--872, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. L. P. Cordella, P. Foggia, C. Sansone, and M. Vento. A (sub)graph isomorphism algorithm for matching large graphs. IEEE PAMI, 26(10):1367--1372, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. M. Deshpande, M. Kuramochi, N. Wale, and G. Karypis. Frequent substructure-based approaches for classifying chemical compounds. IEEE Trans. Knowl. Data Eng., 17(8):1036--1050, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. W.-S. Han, J. Lee, M.-D. Pham, and J. X. Yu. iGraph: A framework for comparisons of disk-based graph indexing techniques. PVLDB, 3(1):449--459, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. H. He and A. K. Singh. Graphs-at-a-time: query language and access methods for graph databases. In SIGMOD, pages 405--418, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. A. Khan, N. Li, X. Yan, Z. Guan, S. Chakraborty, and S. Tao. Neighborhood based fast graph search in large networks. In SIGMOD, pages 901--912, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. J. Lee, W.-S. Han, R. Kasperovics, and J.-H. Lee. An in-depth comparison of subgraph isomorphism algorithms in graph databases. PVLDB, 6(2), 2013, http://www-db.knu.ac.kr/vldb13.pdf. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. M. Mongiovi, R. D. Natale, R. Giugno, A. Pulvirenti, A. Ferro, and R. Sharan. Sigma: a set-cover-based inexact graph matching algorithm. J. Bioinformatics and Computational Biology, 8(2):199--218, 2010.Google ScholarGoogle ScholarCross RefCross Ref
  10. M. Ohlrich, C. Ebeling, E. Ginting, and L. Sather. Subgemini: Identifying subcircuits using a fast subgraph isomorphism algorithm. In DAC, pages 31--37, 1993. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. M. E. Saeedy and P. Kalnis. Grami: Generalized frequent pattern mining in a single large graph.Google ScholarGoogle Scholar
  12. H. Shang, Y. Zhang, X. Lin, and J. X. Yu. Taming verification hardness: an efficient algorithm for testing subgraph isomorphism. PVLDB, 1(1):364--375, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. B. Shao, H. Wang, and Y. Li. The trinity graph engine. Technical Report 161291, Microsoft Research, 2012.Google ScholarGoogle Scholar
  14. Z. Sun, H. Wang, H. Wang, B. Shao, and J. Li. Efficient subgraph matching on billion node graphs. PVLDB, 5(9):788--799, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. Y. Tian and J. M. Patel. TALE: A tool for approximate large graph matching. In ICDE, pages 963--972, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. J. R. Ullmann. An algorithm for subgraph isomorphism. J. ACM, 23:31--42, January 1976. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. X. Yan, P. S. Yu, and J. Han. Graph indexing: A frequent structure-based approach. In SIGMOD, pages 335--346, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. X. Yan, P. S. Yu, and J. Han. Graph indexing based on discriminative frequent structure analysis. ACM Trans. Database Syst., 30(4):960--993, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. S. Zhang, M. Hu, and J. Yang. Treepi: A novel graph indexing method. In ICDE, pages 966--975, 2007.Google ScholarGoogle ScholarCross RefCross Ref
  20. S. Zhang, S. Li, and J. Yang. GADDI: distance index based subgraph matching in biological networks. In EDBT, pages 192--203, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. P. Zhao and J. Han. On graph query optimization in large networks. PVLDB, 3(1):340--351, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. P. Zhao, J. X. Yu, and P. S. Yu. Graph indexing: Tree + delta >= graph. In VLDB, pages 938--949, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. L. Zou, L. Chen, J. X. Yu, and Y. Lu. A novel spectral coding in a large graph database. In EDBT, pages 181--192, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Turboiso: towards ultrafast and robust subgraph isomorphism search in large graph databases

        Recommendations

        Comments

        Login options

        Check if you have access through your login credentials or your institution to get full access on this article.

        Sign in
        • Published in

          cover image ACM Conferences
          SIGMOD '13: Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data
          June 2013
          1322 pages
          ISBN:9781450320375
          DOI:10.1145/2463676

          Copyright © 2013 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 22 June 2013

          Permissions

          Request permissions about this article.

          Request Permissions

          Check for updates

          Qualifiers

          • research-article

          Acceptance Rates

          SIGMOD '13 Paper Acceptance Rate76of372submissions,20%Overall Acceptance Rate785of4,003submissions,20%

        PDF Format

        View or Download as a PDF file.

        PDF

        eReader

        View online with eReader.

        eReader