Abstract
With the emergence of new applications, e.g., computational biology, new software engineering techniques, social networks, etc., more data is in the form of graphs. Locating occurrences of a query graph in a large database graph is an important research topic. Due to the existence of noise (e.g., missing edges) in the large database graph, we investigate the problem of approximate subgraph indexing, i.e., finding the occurrences of a query graph in a large database graph with (possible) missing edges. The SAPPER method is proposed to solve this problem. Utilizing the hybrid neighborhood unit structures in the index, SAPPER takes advantage of pre-generated random spanning trees and a carefully designed graph enumeration order. Real and synthetic data sets are employed to demonstrate the efficiency and scalability of our approximate subgraph indexing method.
- D. J. Aldous, The random walk construction of uniform spanning trees and uniform labelled trees, SIAM J. Discrete Math, 1990. Google ScholarDigital Library
- R. Agrawal and R. Srikant, Fast algorithms for mining association rules, Prof. of VLDB, 1994. Google ScholarDigital Library
- B. H. Bloom, Space/time trade-offs in hash coding with allowable errors, Communications of the ACM 13 (7), 1970. Google ScholarDigital Library
- J. Cheng, Y. Ke, W. Ng and A. Lu, FG-Index: towards verification-free query processing on graph databases. Proc. of SIGMOD, 2007. Google ScholarDigital Library
- B. Chazelle, J. Kilian, R. Rubinfeld and A. Tal, The bloomier filter: an efficient data structure for static support lookup tables, Proc. of 5th Annual ACM-SIAM Symposium on Discrete Algorithms, 2004. Google ScholarDigital Library
- L. Cordella, P. Foggia, C. Sansone and M. Vento, A (sub)graph isomorphism algorithm for matching large graphs. PAMI, 2004. Google ScholarDigital Library
- B. Dost, T. Shlomi, N. Gupta, E. Ruppin, V. Bafna and R. Sharan, QNet: a tool for querying protein interaction networks, Proc. of RECOMB, 2007. Google ScholarDigital Library
- T. Nguyen, H. Nguyen, N. Pham, J. AI-Kofahi and T. Nguyen, Graph-based mining of multiple object usage patterns, Proc. of the Joint Meeting of ESEC and ACM SIGSOFT, 2009. Google ScholarDigital Library
- R. Giugno and D. Shasha, GraphGrep: A fast and universal method for querying graphs. Proc. of ICPR, 2002.Google ScholarCross Ref
- J. Han, J. Pei and Y. Yin, Mining frequent patterns without candidate generation, Proc. of SIGMOD, 2000. Google ScholarDigital Library
- H. He and A. K. Singh, Closure-Tree: an index structure for graph queries. Proc. of ICDE, 2006. Google ScholarDigital Library
- H. Jiang, H. Wang, P. Yu and S. Zhou, GString: A novel approach for efficient search in graph databases. Proc. of ICDE, 2007.Google ScholarCross Ref
- M. Kanehisa and S. Goto, KEGG: Kyoto encyclopedia of genes and genomes, Nuc. Ac. Res, 2000, 28:27--30Google ScholarCross Ref
- M. Koyuturk, A. Grama and W. Szpankowski, Pairwise local alignment of protein interaction networks guided by models of evolution. Proc. of RECOMB, 2005. Google ScholarDigital Library
- F. Mandreoli, R. Martoglia, G. Villani and W. Penzo, Flexible query answering on graph-modeled data. Proc. of EDBT, 2009. Google ScholarDigital Library
- M. Mongiovi, R. Natale, R. Giugno, A, Pulvirenti, and A. Ferro. A set-cover-based approach for inexact graph matching. Proc. of CSB, 2009.Google Scholar
- R. Pinter, O. Rokhlenko, E. Yeger-Lotem and M. Ziv-Ukelson, Alignment of metabolic pathways, Bioinformatics, 2005. Google ScholarDigital Library
- H. Shang, Y. Zhang, X. Lin, and J. Yu, Taming verification hardness: an efficient algorithm for testing subgraph isomorphism. PVLDB, 2008. Google ScholarDigital Library
- Y. Tian and J. Patel, TALE: a tool for approximate large graph matching, Proc. of ICDE, 2008. Google ScholarDigital Library
- J. Ullmann, An algorithm for subgraph isomorphism. J. ACM, 1976. Google ScholarDigital Library
- X. Wang, A. Smalter, J. Huan, and G. Lushington, G-Hash: towards fast kernel-based similarity search in large graph databases, Proc. of EDBT, 2009. Google ScholarDigital Library
- X. Yan, P. Yu and J. Han, Graph indexing, a frequent structure-based approach. Proc. of SIGMOD, 2004. Google ScholarDigital Library
- S. Zhang, M. Hu, and J. Yang, Treepi: a novel graph indexing method. Proc. of ICDE, 2007.Google ScholarCross Ref
- S. Zhang, S. Li, and J. Yang, Gaddi: distance index based subgraph matching in biological networks. Proc. of EDBT, 2009. Google ScholarDigital Library
- Gene Ontology. http://www.geneontology.org/.Google Scholar
Index Terms
- SAPPER: subgraph indexing and approximate matching in large graphs
Recommendations
On the Multichromatic Number of s-Stable Kneser Graphs
For positive integers n and s, a subset Sï [n] is s-stable if sï |i-j|ï n-s for distinct i,j∈S . The s-stable r-uniform Kneser hypergraph KGrn,ks-stable is the r-uniform hypergraph that has the collection of all s-stable k-element subsets of [n] as ...
Adjacent vertex-distinguishing edge and total chromatic numbers of hypercubes
An adjacent vertex-distinguishing edge coloring of a simple graph G is a proper edge coloring of G such that incident edge sets of any two adjacent vertices are assigned different sets of colors. A total coloring of a graph G is a coloring of both the ...
Forbidden Subgraphs and Weak Locally Connected Graphs
A graph is called H-free if it has no induced subgraph isomorphic to H. A graph is called $$N^i$$Ni-locally connected if $$G[\{ x\in V(G): 1\le d_G(w, x)\le i\}]$$G[{x?V(G):1≤dG(w,x)≤i}] is connected and $$N_2$$N2-locally connected if $$G[\{uv: \{uw, vw\...
Comments