ABSTRACT
Retrieving related graphs containing a query graph from a large graph database is a key issue in many graph-based applications, such as drug discovery and structural pattern recognition. Because sub-graph isomorphism is a NP-complete problem [4], we have to employ a filter-and-verification framework to speed up the search efficiency, that is, using an effective and efficient pruning strategy to filter out the false positives (graphs that are not possible in the results) as many as possible first, then validating the remaining candidates by subgraph isomorphism checking. In this paper, we propose a novel filtering method, a spectral encoding method, i.e. GCoding. Specifically, we assign a signature to each vertex based on its local structures. Then, we generate a spectral graph code by combining all vertex signatures in a graph. Based on spectral graph codes, we derive a necessary condition for sub-graph isomorphism. Then we propose two pruning rules for sub-graph search problem, and prove that they satisfy the no-false-negative requirement (no dismissal in answers). Since graph codes are in numerical space, we take this advantage and conduct efficient filtering over graph codes. Extensive experiments show that GCoding outperforms existing counterpart methods.
- D. Cai, Z. Shao, X. He, X. Yan, and J. Han. Community mining from multi-relational networks. In PKDD, 2005.Google ScholarCross Ref
- J. Cheng, Y. Ke, W. Ng, and A. Lu. fg-index: Towards verification-free query processing on graph databases. In SIGMOD, 2007. Google ScholarDigital Library
- J. H. D. W. Williams and W. Wang. Graph database indexing using structured graph decomposition. In ICDE, 2007.Google ScholarCross Ref
- S. Fortin. The graph isomorphism problem. Department of Computing Science, University of Alberta, 1996.Google Scholar
- H. H. Goldstine, F. J. Murray, and J. von Neumann. The jacobi method for real symmetric matrices. J. ACM, 6(1):59--96, 1959. Google ScholarDigital Library
- P. Y. H. Jiang, H. Wang and S. Zhou. Gstring: A novel approach for efficient search in graph databases. In ICDE, 2007.Google ScholarCross Ref
- H. He and A. K. Singh. Closure-tree: An index structure for graph queries. In ICDE, 2006. Google ScholarDigital Library
- A. Inokuchi, T. Washio, and H. Motoda. An apriori-based algorithm for mining frequent substructures from graph data. In PKDD, 2000. Google ScholarDigital Library
- C. A. James, D. Weininger, and J. Delany. Daylight theory manual daylisght version 4.82. Daylight Chemical Information Systems, Inc., 2003.Google Scholar
- Y. Ke, J. Cheng, and W. Ng. Correlation search in graph databases. In SIGKDD, 2007. Google ScholarDigital Library
- H. Kitagawa and Y. Ishikawa. False drop analysis of set retrieval with signature files. Inf. Syst., 27(2), 2002.Google Scholar
- M. Kuramochi and G. Karypis. Frequent subgraph discovery. In ICDM, 2001. Google ScholarDigital Library
- E. G. M. Petrakis and C. Faloutsos. Similarity searching in medical image databases. IEEE Transactions on Knowledge and Data Enginnering, 9(3), 1997. Google ScholarDigital Library
- D. Shasha, J. T.-L. Wang, and R. Giugno. Algorithmics and applications of tree and graph searching. In PODS, 2002. Google ScholarDigital Library
- A. Shokoufandeh, S. J. Dickinson, K. Siddiqi, and S. W. Zucker. Indexing using a spectral encoding of topological structure. In CVPR, 1999.Google ScholarCross Ref
- E. Tousidou, P. Bozanis, and Y. Manolopoulos. Signature-based structures for objects with set-valued attributes. Inf. Syst., 27(2), 2002. Google ScholarDigital Library
- J. R. Ullmann. An algorithm for subgraph isomorphism. J. ACM, 23(1), 1976. Google ScholarDigital Library
- D. B. West. Introduction to Graph Theory (2nd Edition). Prentice Hall, 2000.Google Scholar
- P. Willett. Chemical similarity searching. J. Chem. Inf. Comput. Sci, 38(6), 1998.Google ScholarCross Ref
- X. Yan and J. Han. gspan: Graph-based substructure pattern mining. In ICDM, 2002. Google ScholarDigital Library
- X. Yan and J. Han. gspan: Graph-based substructure pattern mining. Proc. of Int. Conf. on Data Mining, 2002. Google ScholarDigital Library
- X. Yan, P. S. Yu, and J. Han. Graph indexing: A frequent structure-based approach. In SIGMOD, 2004. Google ScholarDigital Library
- N. Zhang, M. T. Özsu, I. F. Ilyas, and A. Aboulnaga. Fix: Feature-based indexing technique for xml documents. In VLDB, 2006. Google ScholarDigital Library
- S. Zhang, M. Hu, and J. Yang. Treepi: A novel graph indexing method. In ICDE, 2007.Google ScholarCross Ref
- P. Zhao, J. X. Yu, and P. S. Yu. Graph indexing: Tree + delta >= graph. In VLDB, 2007. Google ScholarDigital Library
Index Terms
- A novel spectral coding in a large graph database
Recommendations
Large Induced Acyclic and Outerplanar Subgraphs of 2-Outerplanar Graph
Albertson and Berman conjectured that every planar graph has an induced forest on half of its vertices. The best known lower bound, due to Borodin, is that every planar graph has an induced forest on two fifths of its vertices. In a related result, ...
Distance spectral spread of a graph
Let D(G)=(d"i","j)"n"x"n denote the distance matrix of a connected graph G with order n, where d"i"j is equal to the distance between vertices v"i and v"j in G. The value D"i=@?"j"="1^nd"i"j (i=1,2,...,n) is called the distance degree of vertex v"i. ...
Graph colouring with no large monochromatic components
For a graph G and an integer t we let mcct(G) be the smallest m such that there exists a colouring of the vertices of G by t colours with no monochromatic connected subgraph having more than m vertices. Let <private-char><inline-graphic mime-subtype="...
Comments