research-article

SAPPER: subgraph indexing and approximate matching in large graphs

Authors:
Shijie Zhang, Case Western Reserve University
Jiong Yang, Case Western Reserve University
Wei Jin, Case Western Reserve University
Proceedings of the VLDB Endowment, Volume 3, Issue 1-2, pp. 1185–1194. https://doi.org/10.14778/1920841.1920988

Published: 01 September 2010

Abstract

With the emergence of new applications in areas such as computational biology, software engineering, and social networks, more data is represented in the form of graphs. Locating occurrences of a query graph in a large database graph is an important research topic. Because the large database graph may contain noise (e.g., missing edges), we investigate the problem of approximate subgraph indexing, i.e., finding the occurrences of a query graph in a large database graph that may have missing edges. We propose the SAPPER method to solve this problem. Using hybrid neighborhood unit structures in its index, SAPPER takes advantage of pre-generated random spanning trees and a carefully designed graph enumeration order. Experiments on real and synthetic data sets demonstrate the efficiency and scalability of our approximate subgraph indexing method.
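To make the problem statement concrete, the following is a minimal brute-force sketch of approximate subgraph matching with a bounded number of missing edges. It is not the SAPPER index or its search strategy: it uses no neighborhood units, spanning trees, or enumeration order. The function name `approximate_matches`, the vertex labels, the `max_missing` threshold, and the toy graphs are all assumptions introduced for illustration.

```python
from itertools import permutations


def approximate_matches(query_edges, query_labels, graph_edges, graph_labels, max_missing):
    """Return (mapping, missing) pairs for every label-preserving injective
    mapping of query vertices to graph vertices in which at most
    `max_missing` query edges have no counterpart in the database graph.

    Brute force over all injective mappings: only usable on tiny inputs.
    """
    g_edges = {frozenset(e) for e in graph_edges}  # undirected edge set
    q_nodes = sorted(query_labels)
    results = []
    for image in permutations(sorted(graph_labels), len(q_nodes)):
        mapping = dict(zip(q_nodes, image))
        # Vertex labels must agree (labels are an assumption of this sketch).
        if any(query_labels[q] != graph_labels[g] for q, g in mapping.items()):
            continue
        # Count query edges whose image is absent from the database graph.
        missing = sum(
            1 for u, v in query_edges
            if frozenset((mapping[u], mapping[v])) not in g_edges
        )
        if missing <= max_missing:
            results.append((mapping, missing))
    return results


# Hypothetical toy data: the query is a labeled triangle.
query_labels = {"a": "A", "b": "B", "c": "C"}
query_edges = [("a", "b"), ("b", "c"), ("a", "c")]

# Database graph: vertices 4, 2, 3 form a full triangle; vertices 1, 2, 3
# form the same triangle with the edge (1, 3) missing.
graph_labels = {1: "A", 2: "B", 3: "C", 4: "A"}
graph_edges = [(1, 2), (2, 3), (4, 2), (4, 3)]

exact = approximate_matches(query_edges, query_labels, graph_edges, graph_labels, 0)
approx = approximate_matches(query_edges, query_labels, graph_edges, graph_labels, 1)
```

With `max_missing = 0` only the complete triangle is reported; allowing one missing edge also reports the occurrence on vertices 1, 2, 3. The cost of trying every injective mapping grows factorially with graph size, which is the kind of blow-up an index over the database graph is meant to avoid.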


Index Terms

1. Information systems
  1. Information retrieval
    1. Document representation
    2. Search engine architectures and scalability
      1. Search engine indexing
2. Mathematics of computing
  1. Discrete mathematics
    1. Graph theory

Index terms have been assigned to the content through auto-classification.

Published in

Proceedings of the VLDB Endowment, Volume 3, Issue 1-2
September 2010
1658 pages
ISSN: 2150-8097
Publisher: VLDB Endowment