research-article

Comparing stars: on approximating graph edit distance

Authors:
Zhiping Zeng

Tsinghua University, Beijing, P.R. China

Tsinghua University, Beijing, P.R. China
View Profile

,
Anthony K. H. Tung

National University of Singapore, Singapore

National University of Singapore, Singapore
View Profile

,
Jianyong Wang

Tsinghua University, Beijing, P.R. China

Tsinghua University, Beijing, P.R. China
View Profile

,
Jianhua Feng

Tsinghua University, Beijing, P.R. China

Tsinghua University, Beijing, P.R. China
View Profile

,
Lizhu Zhou

Tsinghua University, Beijing, P.R. China

Tsinghua University, Beijing, P.R. China
View Profile

Proceedings of the VLDB Endowment Volume 2 Issue 1pp 25–36https://doi.org/10.14778/1687627.1687631

Published:01 August 2009Publication History

Proceedings of the VLDB Endowment

Abstract

Graph data have become ubiquitous and manipulating them based on similarity is essential for many applications. Graph edit distance is one of the most widely accepted measures to determine similarities between graphs and has extensive applications in the fields of pattern recognition, computer vision etc. Unfortunately, the problem of graph edit distance computation is NP-Hard in general. Accordingly, in this paper we introduce three novel methods to compute the upper and lower bounds for the edit distance between two graphs in polynomial time. Applying these methods, two algorithms AppFull and AppSub are introduced to perform different kinds of graph search on graph databases. Comprehensive experimental studies are conducted on both real and synthetic datasets to examine various aspects of the methods for bounding graph edit distance. Result shows that these methods achieve good scalability in terms of both the number of graphs and the size of graphs. The effectiveness of these algorithms also confirms the usefulness of using our bounds in filtering and searching of graphs.

References

H. A. Almohamad and S. O. Duffuaa. A linear programming approach for the weighted graph matching problem. IEEE Trans. PAMI, 15(5):522--525, 1993. Google ScholarDigital Library
N. Augsten, M. Böhlen, and J. Gamper. Approximate matching of hierarchical data using pq-grams. In Proceedings of the 31st international conference on Very large data bases, pages 301--312, 2005. Google ScholarDigital Library
N. Augsten, M. Böhlen, and J. Gamper. An incrementally maintainable index for approximate lookups in hierarchical data. In Proceedings of the 32nd international conference on Very large data bases, pages 247--258, 2006. Google ScholarDigital Library
S. Berretti, A. D. Bimbo, and E. Vicario. Efficient matching and indexing of graph models in content-based retrieval. IEEE Trans. PAMI, 23(10):1089--1105, 2001. Google ScholarDigital Library
C. Borgelt and M. R. Berthold. Mining molecular fragments: Finding relevant substructures of molecules. In ICDM '02. Google ScholarDigital Library
J. M. Bower and H. Bolouri. Computational Modeling of Genetic and Biochemical Networks (Computational Molecular Biology). 2004. Google ScholarDigital Library
A. Broder, R. Kumar, F. Maghoul, P. Raghavan, S. Rajagopalan, R. Stata, A. Tomkins, and J. Wiener. Graph structure in the web: experiments and models. In WWW '00. Google ScholarDigital Library
H. Bunke and K. Shearer. A graph distance metric based on the maximal common subgraph. Pattern Recognition Letters, 19(3--4):255--259, 1998. Google ScholarDigital Library
C. Chen, X. Yan, P. S. Yu, J. Han, D.-Q. Zhang, and X. Gu. Towards graph containment search and indexing. In VLDB '07. Google ScholarDigital Library
M.-L. Fernández and G. Valiente. A graph distance metric combining maximum common subgraph and minimum common supergraph. Pattern Recognition Letters, 22(6--7):753--758, 2001. Google ScholarDigital Library
M. R. Garey and D. S. Johnson. Computers and Intractability; A Guide to the Theory of NP-Completeness. 1990. Google ScholarDigital Library
P. Hart, N. Nilsson, and B. Raphael. A formal basis for the heuristic determination of minimum cost paths. IEEE Trans. SSC, 4(2):100--107, 1968.Google Scholar
H. He and A. Singh. Closure-tree: An index structure for graph queries. ICDE'06. Google ScholarDigital Library
H. He, H. Wang, J. Yang, and P. S. Yu. Blinks: ranked keyword searches on graphs. In SIGMOD '07. Google ScholarDigital Library
H. Hu, X. Y., Y. Hang, J. Han, and X. J. Zhou. Mining coherent dense subgraphs across massive biological network for functional discovery. Bioinformatics, 1(1):1--9, 2005. Google ScholarDigital Library
D. Justice and A. Hero. A binary linear programming formulation of the graph edit distance. IEEE Trans. PAMI, 28(8):1200--1214, 2006. Google ScholarDigital Library
V. Kacholia, S. Pandit, S. Chakrabarti, S. Sudarshan, R. Desai, and H. Karambelkar. Bidirectional expansion for keyword search on graph databases. In VLDB '05. Google ScholarDigital Library
H. W. Kuhn. The hungarian method for the assignment problem. Naval Research Logistics, 2:83--97, 1955.Google ScholarCross Ref
M. Kuramochi and G. Karypis. Frequent subgraph discovery. In ICDM '01. Google ScholarDigital Library
B. T. Messmer and H. Bunke. A new algorithm for error-tolerant subgraph isomorphism detection. IEEE Trans. PAMI, 20(5):493--504, 1998. Google ScholarDigital Library
R. Myers, R. C. Wilson, and E. R. Hancock. Bayesian Graph Edit Distance. IEEE Trans. PAMI, 22(6):628--635, 2000. Google ScholarDigital Library
M. Neuhaus, K. Riesen, and H. Bunke. Fast suboptimal algorithms for the computation of graph edit distance. In SSSPR '06. Google ScholarDigital Library
E. G. M. Petrakis and C. Faloutsos. Similarity searching in medical image databases. IEEE Transactions on Knowledge and Data Engineering, 9(3):435--447, 1997. Google ScholarDigital Library
J. Raymond, E. Gardiner, and P. Willett. RASCAL: Calculation of Graph Similarity using Maximum Common Edge Subgraphs. The Computer Journal, 45(6):631--644, 2002.Google ScholarCross Ref
K. Riesen, S. Fankhauser, and H. Bunke. Speeding up graph edit distance computation with a bipartite heuristic. In MLG '07.Google Scholar
D. Shasha, J. T. L. Wang, and R. Giugno. Algorithmics and applications of tree and graph searching. In PODS '02. Google ScholarDigital Library
S. Srinivasa and S. Kumar. A platform based on the multi-dimensional data modal for analysis of bio-molecular structures. In VLDB '03. Google ScholarDigital Library
Y. Tian, R. C. McEachin, C. Santos, D. J. States, and J. M. Patel. Saga: a subgraph matching tool for biological graphs. Bioinformatics, 23(2):232--239, 2007. Google ScholarDigital Library
N. Trinajstic, J. V. Knop, W. R. Muller, K. Syzmanski, and S. Nikolic. Computational Chemical Graph Theory: Characterization, Enumeration and Generation of Chemical Structures by Computer Methods. 1991. Google ScholarDigital Library
S. Trissl and U. Leser. Fast and practical indexing and querying of very large graphs. In SIGMOD '07. Google ScholarDigital Library
P. Willett, J. Barnard, and G. Downs. Chemical similarity searching. J. Chem. Inf. Comput. Sci, 38(6):983--996, 1998.Google ScholarCross Ref
R. C. Wilson and E. R. Hancock. Structural matching by discrete relaxation. IEEE Trans. PAMI, 19(6):634--648, 1997. Google ScholarDigital Library
X. Yan and J. Han. gspan: Graph-based substructure pattern mining. In ICDM'02. Google ScholarDigital Library
X. Yan, P. S. Yu, and J. Han. Graph indexing: a frequent structure-based approach. In SIGMOD '04. Google ScholarDigital Library
X. Yan, F. Zhu, P. S. Yu, and J. Han. Feature-based similarity search in graph structures. ACM TODS, 31(4), 2006. Google ScholarDigital Library
R. Yang, P. Kalnis, and A. K. H. Tung. Similarity evaluation on tree-structured data. In SIGMOD '05. Google ScholarDigital Library
P. Zhao, J. X. Yu, and P. S. Yu. Graph indexing: Tree + delta >= graph. In VLDB '07. Google ScholarDigital Library

Index Terms

Comparing stars: on approximating graph edit distance

Recommendations

Stars and triangles

A graph is S_m-full if each vertex belongs to an induced star of order m. The stellarity of a vertex ν in a graph is the maximum size of an induced star centered at ν. Answering a question of Füredi, Mubayi, and West, we show that an S_m-full finite graph ...
Read More
Decomposition of complete graphs into paths and stars

Let P"k"+"1 denote a path of length k and let S"k"+"1 denote a star with k edges. As usual K"n denotes the complete graph on n vertices. In this paper we investigate the decomposition of K"n into paths and stars, and prove the following results. Theorem ...
Read More
Stars versus stripes Ramsey numbers

For given simple graphs G1,G2,,Gt, the Ramsey number R(G1,G2,,Gt) is the smallest positive integer n such that if the edges of the complete graph Kn are partitioned into t disjoint color classes giving t graphs H1,H2,,Ht, then at least one Hi has a ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Article

Published in
Proceedings of the VLDB Endowment Volume 2, Issue 1
August 2009
1293 pages
ISSN:2150-8097
Editors:
Serge Abiteboul,
Tova Milo,
Jignesh Patel,
Philippe Rigaux
Issue’s Table of Contents
Sponsors
In-Cooperation
Publisher
VLDB Endowment
Publication History
- Published: 1 August 2009
Published in pvldb Volume 2, Issue 1
Qualifiers
- research-article
Conference
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 113
  Total Citations
  View Citations
- 1,053
  Total Downloads
- Downloads (Last 12 months)145
- Downloads (Last 6 weeks)23
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Comparing stars: on approximating graph edit distance

Proceedings of the VLDB Endowment

Abstract

References

Cited By

Index Terms

Recommendations

Stars and triangles

Decomposition of complete graphs into paths and stars

Stars versus stripes Ramsey numbers

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Qualifiers

Conference

Funding Sources

Other Metrics

Article Metrics

Other Metrics

Cited By

PDF Format

eReader

Digital Edition

Caption

Comparing stars: on approximating graph edit distance

Proceedings of the VLDB Endowment

Abstract

References

Cited By

Index Terms

Recommendations

Stars and triangles

Decomposition of complete graphs into paths and stars

Stars versus stripes Ramsey numbers

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Qualifiers

Conference

Funding Sources

Article Metrics

Other Metrics

PDF Format

eReader

Digital Edition

Share this Publication link

Share on Social Media