skip to main content
10.1145/2915516.2915519acmconferencesArticle/Chapter ViewAbstractPublication PageshpdcConference Proceedingsconference-collections
research-article

Distributed Incremental Pattern Matching on Streaming Graphs

Published:31 May 2016Publication History

ABSTRACT

Big data has shifted the computing paradigm of data analysis. While some of the data can be treated as simple texts or independent data records, many other applications have data with structural patterns which are modeled as a graph, such as social media, road network traffic and smart grid, etc. However, there is still limited amount of work has been done to address the velocity problem of graph processing. In this work, we aim to develop a distributed processing system for solving pattern matching queries on streaming graphs where graphs evolve over time upon the arrives of streaming graph update events. To achieve the goal, we proposed an incremental pattern matching algorithm and implemented it on GPS, a vertex centric distributed graph computing framework. We also extended the GPS framework to support streaming graph, and adapted a subgraphcentric data model to further reduce communication overhead and system performance. Our evaluation using real wiki trace shows that our approach achieves a 3x -- 10x speedup over the batch algorithm, and significantly reduces network and memory usage.

References

  1. Apache giraph. http://giraph.apache.org/.Google ScholarGoogle Scholar
  2. D. A. Bader, J. Berry, A. Amos-Binks, D. Chavarría-Miranda, C. Hastings, K. Madduri, and S. C. Poulos. Stinger: Spatio-temporal interaction networks and graphs (sting) extensible representation. Georgia Institute of Technology, Tech. Rep, 2009.Google ScholarGoogle Scholar
  3. J. Brynielsson, J. Hogberg, L. Kaati, C. Martenson, and P. Svenson. Detecting social positions using simulation. In Advances in Social Networks Analysis and Mining, International Conference on, pages 48--55, Aug 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. R. Cheng, J. Hong, A. Kyrola, Y. Miao, X. Weng, M. Wu, F. Yang, L. Zhou, F. Zhao, and E. Chen. Kineograph: Taking the pulse of a fast-changing and connected world. In ACM EuroSys, pages 85--98, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. W. Fan, J. Li, J. Luo, Z. Tan, X. Wang, and Y. Wu. Incremental graph pattern matching. In ACM SIGCOM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. W. Fan, J. Li, S. Ma, N. Tang, Y. Wu, and Y. Wu. Graph pattern matching: From intractable to polynomial time. Proc. VLDB Endow., 3(1--2):264--275, Sept. 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. W. Fan, X. Wang, Y. Wu, and D. Deng. Distributed graph simulation: Impossibility and possibility. Proc. VLDB Endow., 7(12):1083--1094, Aug. 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. A. Fard, M. Nisar, L. Ramaswamy, J. Miller, and M. Saltz. A distributed vertex-centric approach for pattern matching in massive graphs. In Big Data, 2013 IEEE International Conference on, pages 403--411, Oct 2013.Google ScholarGoogle ScholarCross RefCross Ref
  9. M. R. Garey and D. S. Johnson. Computers and Intractability; A Guide to the Theory of NP-Completeness. W. H. Freeman & Co., New York, NY, USA, 1990. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. J. C. Godskesen and S. Nanz. Mobility models and behavioural equivalence for wireless networks. In Coordination Models and Languages, pages 106--122, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. M. R. Henzinger, T. A. Henzinger, and P. W. Kopke. Computing simulations on finite and infinite graphs. In Proceedings of the 36th Annual Symposium on Foundations of Computer Science, FOCS '95, pages 453--, 1995. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. Z. Khayyat, K. Awara, A. Alonazi, H. Jamjoom, D. Williams, and P. Kalnis. Mizan: a system for dynamic load balancing in large-scale graph processing. In ACM EuroSys, pages 169--182, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. Y. Low, D. Bickson, J. Gonzalez, C. Guestrin, A. Kyrola, and J. M. Hellerstein. Distributed graphlab: A framework for machine learning and data mining in the cloud. Proc. VLDB Endow., 5(8):716--727, Apr. 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. S. Ma, Y. Cao, J. Huai, and T. Wo. Distributed graph pattern matching. In Proceedings of the 21st International Conference on World Wide Web, pages 949--958, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. G. Malewicz, M. H. Austern, A. J. Bik, J. C. Dehnert, I. Horn, N. Leiser, and G. Czajkowski. Pregel: A system for large-scale graph processing. In ACM SIGCOM, pages 135--146, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. A. McGregor. Graph stream algorithms: A survey. SIGMOD Rec., 43(1):9--20, May 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. A. Ntoulas, J. Cho, and C. Olston. What's new on the web?: The evolution of the web from a search engine perspective. In Proceedings of the 13th International Conference on World Wide Web, pages 1--12, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. A. Roy, I. Mihailovic, and W. Zwaenepoel. X-stream: Edge-centric graph processing using streaming partitions. In Proceedings of ACM Symposium on Operating Systems Principles, pages 472--488, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. S. Salihoglu and J. Widom. Gps: A graph processing system. In SSDBM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. Y. Simmhan, A. Kumbhare, C. Wickramaarachchi, S. Nagarkar, S. Ravi, C. Raghavendra, and V. Prasanna. Goffish: A sub-graph centric framework for large-scale graph analytics. In Euro-Par, pages 451--462. 2014.Google ScholarGoogle ScholarCross RefCross Ref
  21. I. Stanton and G. Kliot. Streaming graph partitioning for large distributed graphs. In Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 1222--1230, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. Y. Tian, A. Balmin, S. A. Corsten, S. Tatikonda, and J. McPherson. From think like a vertex to think like a graph. Proceedings of the VLDB Endowment, 7(3):193--204, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. L. Vaquero, F. Cuadrado, D. Logothetis, and C. Martella. Adaptive partitioning for large-scale dynamic graphs. In Proceedings of Symposium on Cloud Computing, pages 35:1--35:2, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. C. Wickramaarachchi, M. Frincu, and V. Prasanna. Enabling real-time pro-active analytics on streaming graphs. algorithms, 15:18, 2010.Google ScholarGoogle Scholar
  25. R. S. Xin, J. E. Gonzalez, M. J. Franklin, and I. Stoica. Graphx: A resilient distributed graph system on spark. In First International Workshop on Graph Data Management Experiences and Systems, pages 2:1--2:6, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Distributed Incremental Pattern Matching on Streaming Graphs

          Recommendations

          Comments

          Login options

          Check if you have access through your login credentials or your institution to get full access on this article.

          Sign in
          • Published in

            cover image ACM Conferences
            HPGP '16: Proceedings of the ACM Workshop on High Performance Graph Processing
            May 2016
            60 pages
            ISBN:9781450343503
            DOI:10.1145/2915516

            Copyright © 2016 ACM

            Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

            Publisher

            Association for Computing Machinery

            New York, NY, United States

            Publication History

            • Published: 31 May 2016

            Permissions

            Request permissions about this article.

            Request Permissions

            Check for updates

            Qualifiers

            • research-article

            Acceptance Rates

            HPGP '16 Paper Acceptance Rate5of6submissions,83%Overall Acceptance Rate5of6submissions,83%

            Upcoming Conference

          PDF Format

          View or Download as a PDF file.

          PDF

          eReader

          View online with eReader.

          eReader