Abstract
This paper studies fundamental problems for distributed graph simulation. Given a pattern query Q and a graph G that is fragmented and distributed, a graph simulation algorithm A is to compute the matches Q(G) of Q in G. We say that A is parallel scalable in (a) response time if its parallel computational cost is determined by the largest fragment Fm of G and the size |Q| of query Q, and (b) data shipment if its total amount of data shipped is determined by |Q| and the number of fragments of G, independent of the size of graph G. (1) We prove an impossibility theorem: there exists no distributed graph simulation algorithm that is parallel scalable in either response time or data shipment. (2) However, we show that distributed graph simulation is partition bounded, i.e., its response time depends only on |Q|, |Fm| and the number |Vf| of nodes in G with edges across different fragments; and its data shipment depends on |Q| and the number |Ef| of crossing edges only. We provide the first algorithms with these performance guarantees. (3) We also identify special cases of patterns and graphs when parallel scalability is possible. (4) We experimentally verify the scalability and efficiency of our algorithms.
- Amazon. http://aws.amazon.com/ec2/.Google Scholar
- Facebook statistics. http://www.facebook.com/press/info.php?statistics.Google Scholar
- Full version. http://www.cs.ucsb.edu/~yinghui/full.pdf.Google Scholar
- Trinity. research.microsoft.com/en-us/projects/trinity/.Google Scholar
- V. L. Anh and A. Kiss. Efficient processing regular queries in shared-nothing parallel database systems using tree-and structural indexes. In ADBIS Research Communic, 2007.Google Scholar
- S. Blom and S. Orzan. A distributed algorithm for strong bisimulation reduction of state spaces. STTT, 7(1), 2005. Google ScholarDigital Library
- J. Brynielsson, J. Högberg, L. Kaati, C. Martenson, and P. Svenson. Detecting social positions using simulation. In ASONAM, 2010. Google ScholarDigital Library
- A. Buluç and K. Madduri. Graph partitioning for scalable distributed graph computations. In Graph Partitioning and Graph Clustering, pages 83--102, 2012.Google Scholar
- F. Chung, R. Graham, R. Bhagwan, S. Savage, and G. M. Voelker. Maximizing data locality in distributed systems. JCSS, 72(8): 1309--1316, 2006. Google ScholarDigital Library
- G. Cong, W. Fan, and A. Kementsietsidis. Distributed query evaluation with performance guarantees. In SIGMOD, 2007. Google ScholarDigital Library
- W. Fan, J. Li, S. Ma, N. Tang, Y. Wu, and Y. Wu. Graph pattern matching: From intractable to polynomial time. PVLDB, 3(1), 2010. Google ScholarDigital Library
- W. Fan, X. Wang, and Y. Wu. Performance guarantees for distributed reachability queries. PVLDB, 5(11), 2012. Google ScholarDigital Library
- W. Fan, X. Wang, and Y. Wu. Incremental graph pattern matching. TODS, 38(3), 2013. Google ScholarDigital Library
- A. Fard, M. U. Nisar, L. Ramaswamy, J. A. Miller, and M. Saltz. A distributed vertex-centric approach for pattern matching in massive graphs. In Big Data, 2013.Google Scholar
- S. Flesca, F. Furfaro, and A. Pugliese. A framework for the partial evaluation of SPARQL queries. In SUM, 2008. Google ScholarDigital Library
- J. C. Godskesen and S. Nanz. Mobility models and behavioural equivalence for wireless networks. In Coordination Models and Languages, pages 106--122, 2009. Google ScholarDigital Library
- A. Greenberg, J. Hamilton, D. A. Maltz, and P. Patel. The cost of a cloud: research problems in data center networks. SIGCOMM Comput. Commun. Rev., 39, 2008. Google ScholarDigital Library
- M. R. Henzinger, T. Henzinger, and P. Kopke. Computing simulations on finite and infinite graphs. In FOCS, 1995. Google ScholarDigital Library
- M. Jamali. A distributed method for trust-aware recommendation in social networks. CoRR, 2010.Google Scholar
- R. Kumar, J. Novak, and A. Tomkins. Structure and evolution of online social networks. In KDD, 2006. Google ScholarDigital Library
- J. Leskovec, A. Singh, and J. Kleinberg. Patterns of influence in a recommendation network. In Advances in Knowledge Discovery and Data Mining, pages 380--389. 2006. Google ScholarDigital Library
- Y. Low, D. Bickson, J. Gonzalez, C. Guestrin, A. Kyrola, and J. Hellerstein. Distributed GraphLab: A framework for machine learning and data mining in the cloud. PVLDB, 2012. Google ScholarDigital Library
- J. J. Luczkovich, S. P. Borgatti, J. C. Johnson, and M. G. Everett. Defining and measuring trophic role similarity in food webs using regular equivalence. Journal of Theoretical Biology, 220(3): 303--321, 2003.Google ScholarCross Ref
- S. Ma, Y. Cao, W. Fan, J. Huai, and T. Wo. Capturing topology in graph pattern matching. PVLDB, 5(4), 2011. Google ScholarDigital Library
- S. Ma, Y. Cao, J. Huai, and T. Wo. Distributed graph pattern matching. In WWW, pages 949--958, 2012. Google ScholarDigital Library
- G. Malewicz, M. H. Austern, A. J. C. Bik, J. C. Dehnert, I. Horn, N. Leiser, and G. Czajkowski. Pregel: a system for large-scale graph processing. In SIGMOD, 2010. Google ScholarDigital Library
- F. Rahimian, A. H. Payberah, S. Girdzijauskas, M. Jelasity, and S. Haridi. Ja-be-ja: A distributed algorithm for balanced graph partitioning. Technical report, Swedish Institute of Computer Science, 2013.Google Scholar
- M. Rowe and O. Group. Interlinking distributed social graphs. In WWW, 2009.Google Scholar
- M. Shoaran and A. Thomo. Fault-tolerant computation of distributed regular path queries. TCS, 410(1): 62--77, 2009. Google ScholarDigital Library
- D. Suciu. Distributed query evaluation on semistructured data. TODS, 27(1):1--62, 2002. Google ScholarDigital Library
- Y. Tao, W. Lin, and X. Xiao. Minimal MapReduce algorithms. SIGMOD, 2013. Google ScholarDigital Library
- R. E. Tarjan. Depth-first search and linear graph algorithms. SICOMP, 1(2): 146--160, 1972.Google ScholarDigital Library
- J. R. Ullmann. An algorithm for subgraph isomorphism. JACM, 23(1): 31--42, 1976. Google ScholarDigital Library
- V. Venkataramani et al. TAO: how facebook serves the social graph. In SIGMOD, 2012. Google ScholarDigital Library
- D. P. Woodruff and Q. Zhang. When distributed computation is communication expensive. In DISC, 2013.Google ScholarDigital Library
- K. Zeng, J. Yang, H. Wang, B. Shao, and Z. Wang. A distributed graph engine for web scale rdf data. In VLDB, 2013. Google ScholarDigital Library
Index Terms
- Distributed graph simulation: impossibility and possibility
Recommendations
Collapsible subgraphs of a 4-edge-connected graph
AbstractJaeger in 1979 showed that every 4-edge-connected graph is supereulerian, graphs that have spanning eulerian subgraphs. Catlin in 1988 sharpened Jaeger’s result by showing that every 4-edge-connected graph is collapsible, graphs that ...
Traversability and connectivity of the middle graph of a graph
We define a graph M(G) as an intersection graph @W(F) on the point set V(G) of any graph G. Let X(G) be the line set of G and F = V'(G) @__ __ X(G), where V'(G) indicates the family of all one point subsets of the set V(G). Let M(G) = @W(F). M(G) is ...
Trivially noncontractible edges in a contraction critically 5-connected graph
An edge of a k-connected graph is said to be k-contractible if the contraction of the edge results in a k-connected graph. A k-connected graph with no k-contractible edge is said to be contraction critically k-connected. An edge of a k-connected graph ...
Comments