skip to main content
research-article

Graph pattern matching: from intractable to polynomial time

Published:01 September 2010Publication History
Skip Abstract Section

Abstract

Graph pattern matching is typically defined in terms of subgraph isomorphism, which makes it an np-complete problem. Moreover, it requires bijective functions, which are often too restrictive to characterize patterns in emerging applications. We propose a class of graph patterns, in which an edge denotes the connectivity in a data graph within a predefined number of hops. In addition, we define matching based on a notion of bounded simulation, an extension of graph simulation. We show that with this revision, graph pattern matching can be performed in cubic-time, by providing such an algorithm. We also develop algorithms for incrementally finding matches when data graphs are updated, with performance guarantees for dag patterns. We experimentally verify that these algorithms scale well, and that the revised notion of graph pattern matching allows us to identify communities commonly found in real-world networks.

References

  1. S. Amer-Yahia, M. Benedikt, and P. Bohannon. Challenges in searching online communities. IEEE Data Eng. Bull., 30(2):23--31, 2007.Google ScholarGoogle Scholar
  2. J. Bang-Jensen and G. Z. Gutin. Digraphs: Theory, Algorithms and Applications. Springer, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. T. Y. Berger-Wolf and J. Saia. A framework for analysis of dynamic social networks. In KDD, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. N. Bruno, N. Koudas, and D. Srivastava. Holistic twig joins: optimal XML pattern matching. In SIGMOD, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. E. P. Chan and H. Lim. Optimization and evaluation of shortest path queries. VLDB J., 16(3):343--369, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. L. Chen, A. Gupta, and M. E. Kurul. Stack-based algorithms for pattern matching on dags. In VLDB, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. J. Cheng, J. X. Yu, B. Ding, P. S. Yu, and H. Wang. Fast graph pattern matching. In ICDE, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. J. Cheng, J. X. Yu, X. Lin, H. Wang, and P. S. Yu. Fast computing reachability labelings for large graphs with high compression rate. In EDBT, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. J. Cho, N. Shivakumar, and H. Garcia-Molina. Finding replicated Web collections. SIGMOD Rec., 29(2), 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. E. Cohen, E. Halperin, H. Kaplan, and U. Zwick. Reachability and distance queries via 2-hop labels. SICOMP, 32(5), 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. W. Fan and P. Bohannon. Information preserving XML schema embedding. TODS, 33(1), 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. W. Fan, J. Li, S. Ma, H. Wang, and Y. Wu. Graph homomorphism revisited for graph matching. PVLDB, 3, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. B. Gallagher. Matching structure and semantics: A survey on graph-based pattern matching. AAAI FS., 2006.Google ScholarGoogle Scholar
  14. H. He and A. K. Singh. Graphs-at-a-time: query language and access methods for graph databases. In SIGMOD, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. M. R. Henzinger, T. Henzinger, and P. Kopke. Computing simulations on finite and infinite graphs. In FOCS, 1995. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. R. Jin, Y. Xiang, N. Ruan, and D. Fuhry. 3-hop: a high-compression indexing scheme for reachability query. In SIGMOD, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. T. Milo and D. Suciu. Index structures for path expressions. In ICDT, 1999. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. L. D. Nardo, F. Ranzato, and F. Tapparo. The subgraph similarity problem. TKDE, 21(5):748--749, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. M. Natarajan. Understanding the structure of a drug trafficking organization: a conversational analysis. Crime Prevention Studies, 11:273--298, 2000.Google ScholarGoogle Scholar
  20. G. Ramalingam and T. Reps. An incremental algorithm for a generalization of the shortest-path problem. J. Algorithms, 21(2):267--305, 1996. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. G. Ramalingam and T. Reps. On the computational complexity of dynamic graph problems. TCS, 158(1--2), 1996. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. G. Ramalingam and T. W. Reps. A categorized bibliography on incremental computation. In POPL, 1993. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. R. Ronen and O. Shmueli. SoQL: A language for querying and creating data in social networks. In ICDE, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. D. Saha. An incremental bisimulation algorithm. In FSTTCS, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. D. Shasha, J. T. L. Wang, and R. Giugno. Algorithmics and applications of tree and graph searching. In PODS, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. L. Terveen and D. W. McDonald. Social matching: A framework and research agenda. ACM Trans. Comput.-Hum. Interact., 12(3), 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. H. Tong, C. Faloutsos, B. Gallagher, and T. Eliassi-Rad. Fast best-effort pattern matching in large attributed graphs. In KDD, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. J. R. Ullmann. An algorithm for subgraph isomorphism. J. ACM, 23(1), 1976. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. H. Wang, H. He, J. Yang, P. S. Yu, and J. X. Yu. Dual labeling: Answering graph reachability queries in constant time. In ICDE, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. K. Yi, H. He, I. Stanoi, and J. Yang. Incremental maintenance of XML structural indexes. In SIGMOD, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. L. Zou, L. Chen, and M. T. Özsu. Distance-join: Pattern match query in a large graph database. In VLDB, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Graph pattern matching: from intractable to polynomial time
      Index terms have been assigned to the content through auto-classification.

      Recommendations

      Comments

      Login options

      Check if you have access through your login credentials or your institution to get full access on this article.

      Sign in

      Full Access

      • Published in

        cover image Proceedings of the VLDB Endowment
        Proceedings of the VLDB Endowment  Volume 3, Issue 1-2
        September 2010
        1658 pages

        Publisher

        VLDB Endowment

        Publication History

        • Published: 1 September 2010
        Published in pvldb Volume 3, Issue 1-2

        Qualifiers

        • research-article

      PDF Format

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader