skip to main content
research-article

SAPPER: subgraph indexing and approximate matching in large graphs

Published:01 September 2010Publication History
Skip Abstract Section

Abstract

With the emergence of new applications, e.g., computational biology, new software engineering techniques, social networks, etc., more data is in the form of graphs. Locating occurrences of a query graph in a large database graph is an important research topic. Due to the existence of noise (e.g., missing edges) in the large database graph, we investigate the problem of approximate subgraph indexing, i.e., finding the occurrences of a query graph in a large database graph with (possible) missing edges. The SAPPER method is proposed to solve this problem. Utilizing the hybrid neighborhood unit structures in the index, SAPPER takes advantage of pre-generated random spanning trees and a carefully designed graph enumeration order. Real and synthetic data sets are employed to demonstrate the efficiency and scalability of our approximate subgraph indexing method.

References

  1. D. J. Aldous, The random walk construction of uniform spanning trees and uniform labelled trees, SIAM J. Discrete Math, 1990. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. R. Agrawal and R. Srikant, Fast algorithms for mining association rules, Prof. of VLDB, 1994. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. B. H. Bloom, Space/time trade-offs in hash coding with allowable errors, Communications of the ACM 13 (7), 1970. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. J. Cheng, Y. Ke, W. Ng and A. Lu, FG-Index: towards verification-free query processing on graph databases. Proc. of SIGMOD, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. B. Chazelle, J. Kilian, R. Rubinfeld and A. Tal, The bloomier filter: an efficient data structure for static support lookup tables, Proc. of 5th Annual ACM-SIAM Symposium on Discrete Algorithms, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. L. Cordella, P. Foggia, C. Sansone and M. Vento, A (sub)graph isomorphism algorithm for matching large graphs. PAMI, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. B. Dost, T. Shlomi, N. Gupta, E. Ruppin, V. Bafna and R. Sharan, QNet: a tool for querying protein interaction networks, Proc. of RECOMB, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. T. Nguyen, H. Nguyen, N. Pham, J. AI-Kofahi and T. Nguyen, Graph-based mining of multiple object usage patterns, Proc. of the Joint Meeting of ESEC and ACM SIGSOFT, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. R. Giugno and D. Shasha, GraphGrep: A fast and universal method for querying graphs. Proc. of ICPR, 2002.Google ScholarGoogle ScholarCross RefCross Ref
  10. J. Han, J. Pei and Y. Yin, Mining frequent patterns without candidate generation, Proc. of SIGMOD, 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. H. He and A. K. Singh, Closure-Tree: an index structure for graph queries. Proc. of ICDE, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. H. Jiang, H. Wang, P. Yu and S. Zhou, GString: A novel approach for efficient search in graph databases. Proc. of ICDE, 2007.Google ScholarGoogle ScholarCross RefCross Ref
  13. M. Kanehisa and S. Goto, KEGG: Kyoto encyclopedia of genes and genomes, Nuc. Ac. Res, 2000, 28:27--30Google ScholarGoogle ScholarCross RefCross Ref
  14. M. Koyuturk, A. Grama and W. Szpankowski, Pairwise local alignment of protein interaction networks guided by models of evolution. Proc. of RECOMB, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. F. Mandreoli, R. Martoglia, G. Villani and W. Penzo, Flexible query answering on graph-modeled data. Proc. of EDBT, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. M. Mongiovi, R. Natale, R. Giugno, A, Pulvirenti, and A. Ferro. A set-cover-based approach for inexact graph matching. Proc. of CSB, 2009.Google ScholarGoogle Scholar
  17. R. Pinter, O. Rokhlenko, E. Yeger-Lotem and M. Ziv-Ukelson, Alignment of metabolic pathways, Bioinformatics, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. H. Shang, Y. Zhang, X. Lin, and J. Yu, Taming verification hardness: an efficient algorithm for testing subgraph isomorphism. PVLDB, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. Y. Tian and J. Patel, TALE: a tool for approximate large graph matching, Proc. of ICDE, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. J. Ullmann, An algorithm for subgraph isomorphism. J. ACM, 1976. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. X. Wang, A. Smalter, J. Huan, and G. Lushington, G-Hash: towards fast kernel-based similarity search in large graph databases, Proc. of EDBT, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. X. Yan, P. Yu and J. Han, Graph indexing, a frequent structure-based approach. Proc. of SIGMOD, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. S. Zhang, M. Hu, and J. Yang, Treepi: a novel graph indexing method. Proc. of ICDE, 2007.Google ScholarGoogle ScholarCross RefCross Ref
  24. S. Zhang, S. Li, and J. Yang, Gaddi: distance index based subgraph matching in biological networks. Proc. of EDBT, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. Gene Ontology. http://www.geneontology.org/.Google ScholarGoogle Scholar

Index Terms

  1. SAPPER: subgraph indexing and approximate matching in large graphs
        Index terms have been assigned to the content through auto-classification.

        Recommendations

        Comments

        Login options

        Check if you have access through your login credentials or your institution to get full access on this article.

        Sign in

        Full Access

        • Published in

          cover image Proceedings of the VLDB Endowment
          Proceedings of the VLDB Endowment  Volume 3, Issue 1-2
          September 2010
          1658 pages

          Publisher

          VLDB Endowment

          Publication History

          • Published: 1 September 2010
          Published in pvldb Volume 3, Issue 1-2

          Qualifiers

          • research-article

        PDF Format

        View or Download as a PDF file.

        PDF

        eReader

        View online with eReader.

        eReader