1 Introduction
-
Q1: Is it possible to design a dynamic labelling algorithm which can efficiently reflect both incremental and decremental updates on graphs for fast and accurate distance computation?
-
Q2: Can such a dynamic labelling algorithm handle multiple updates in parallel in order to offer performance gains over existing dynamic labelling algorithms in the single-update setting?
-
Parallel searches: We exploit interactions between updates and design a novel parallel approach to find affected vertices, which involves both landmark parallelism and anchor parallelism.
-
Bounded space: We bound search spaces for updates to only small portions of graphs that are affected, which are achieved by identifying boundary vertices with respect to updates.
-
Repair inference: We develop a repairing approach that can efficiently infer the new distances of affected vertices to repair their labels through a level-by-level computation from boundary vertices.
1.1 Contributions
-
We study the problem of answering exact distance queries on dynamic graphs by proposing an efficient solution comprising of a fully dynamic distance labelling and online sparsified searches.
-
Our fully dynamic labelling algorithm can uniformly handle both incremental and decremental updates efficiently using a novel parallel search strategy and a bounded repairing inference mechanism.
-
We prove the correctness of our method and show that it can preserve the minimality of distance labelling, as well as provide the complexity analysis.
-
We have evaluated our method on 10 real-world large networks to verify their efficiency, scalability and robustness. The results show that our algorithms significantly outperform the state-of-the-art methods. It can answer queries in the order of milliseconds and maintain very small labelling sizes, even for large graphs with billions of edges. To the best of our knowledge, this is the first study to develop a parallel fully dynamic labelling method for distance queries.
1.2 Outline
2 Related work
3 Problem formulation
4 Parallel fully dynamic framework
4.1 Fully dynamic distance labelling
4.2 Fully dynamic distance querying
5 Parallel maintenance of distance labelling
5.1 Finding affected vertices
5.2 Finding boundary vertices
5.3 Repairing affected vertices
6 Theoretical discussion
6.1 Proof of correctness
FindAffected
finds the set of all affected vertices w.r.t. a landmark r by a sequence of updates B. By Lemma 1, we know that each affected vertex is enqueued into \(\mathcal {Q}\) during a BFS search from b (Lines 4 and 7-8 in Algorithm 2). Thus, \(\mathcal {A}(r, B)\) contains all affected vertices w.r.t. r and (a,b).FindBoundary
finds the set of all boundary vertices w.r.t. a landmark r and updates B. Algorithm 3 finds boundary vertices \({\mathscr{B}}(r, B)\) from all affected vertices \(v \in \mathcal {A}(r, B)\). It adds an affected vertex \(v \in \mathcal {A}(r, B)\) into \({\mathscr{B}}(r, B)\) iff v has at least one unaffected neighbor (Lines 4-7). Algorithm 3 also removes the distance entries of r from all affected vertices (Line 8).RepairAffected
modifies Γ = (H,L) to \({\Gamma }^{\prime }=(H^{\prime },L^{\prime })\) s.t. (1) \((r, d_{G^{\prime }}(r, v)) \in L^{\prime }(v)\) for \(v\in \mathcal {A}(r, B)\) iff \(P_{G^{\prime }}(r,v)\) does not contain any other landmark in the shortest-path from v to r; (2) \(\delta _{H^{\prime }}(r,r^{\prime })=d_{G^{\prime }}(r,r^{\prime })\) for any \(r^{\prime }\in R \backslash \{r\}\). By Lemma 3, starting from boundary vertices with the smallest distance bound, the distances of affected vertices on \(G^{\prime }\) are iteratively inferred in \(\mathcal {A}(r, B)\) and added into their labels via \(\mathcal {Q}_{l}\) if these affected vertices are not prunable (Lines 6, 15-16 and 19). If an affected vertex v is prunable, it is kept in \(\mathcal {Q}_{p}\); if v is also a landmark, \(\delta _{H^{\prime }}(r,v)\) in \(H^{\prime }\) is updated (Lines 5, 10-13, 18). Thus, every vertex v appearing in \(\mathcal {Q}_{p}\) has no distance entry of r in \(L^{\prime }(v)\), whereas every vertex v appearing in \(\mathcal {Q}_{l}\) must have \((r, d_{G^{\prime }}(r,v)))\in L^{\prime }(v)\). By Lemma 4, this proves both (1) and (2). □6.2 Complexity analysis
7 Experiments
7.1 Experimental setup
7.1.1 Baseline methods
-
FulFD: a fully dynamic algorithm proposed in [10] which combines a distance labelling with a graph traversal algorithm to answer distance queries.
-
FulHL: a fully dynamic labelling algorithm proposed in [11] which combines a highway cover labelling with graph traversal algorithm to answer distance queries.
-
IncHL+: an online incremental algorithm proposed in [25], which combines a highway cover labelling with a graph traversal algorithm for answering distance queries;
-
Opt-BiBFS: an online bidirectional BFS algorithm to answer distance queries, using an optimized strategy to expand searches from a direction with less vertices [10].
7.1.2 Datasets
Dataset | Network | |V | | |E| | Avg. deg. | Avg. dist. |
---|---|---|---|---|---|
Skitter | comp (u) | 1.7M | 11M | 13.08 | 5.0 |
Hollywood | social (u) | 1.1M | 114M | 98.91 | 3.9 |
Orkut | social (u) | 3.1M | 117M | 76.28 | 4.2 |
Enwiki | social (d) | 4.2M | 101M | 43.75 | 3.4 |
Livejournal | social (d) | 4.8M | 69M | 17.68 | 5.6 |
Indochina | web (d) | 7.4M | 194M | 40.73 | 7.7 |
IT | web (d) | 41M | 1.2B | 49.77 | 7.0 |
Twitter | social (d) | 42M | 1.5B | 57.74 | 3.6 |
Friendster | social (u) | 66M | 1.8B | 55.06 | 5.0 |
UK | web (d) | 106M | 3.7B | 62.77 | 6.9 |
7.1.3 Test data generation
7.2 Performance comparison
7.2.1 Update time
Dataset | Fully Dynamic Update Time (sec.) | \(\frac {|\mathcal {A} |}{|V |}\) | |||
---|---|---|---|---|---|
ParDHL− | ParDHL | FulHL | FulFD | ||
Skitter | 1.053 | 0.249 | 0.479 | 20.22 | 0.7857 |
Hollywood | 0.188 | 0.046 | 0.117 | 3.136 | 0.0379 |
Orkut | 4.118 | 0.998 | 1.833 | 35.98 | 0.2780 |
Enwiki | 5.103 | 2.244 | 3.586 | 115.9 | 0.4960 |
Livejournal | 0.591 | 0.233 | 0.502 | 13.07 | 0.0492 |
Indochina | 5.182 | 1.487 | 2.674 | 270.7 | 2.1660 |
IT | 44.04 | 9.630 | 55.74 | 416.9 | 2.9187 |
Twitter | 62.82 | 31.24 | 129.8 | 5010 | 0.3939 |
Friendster | 4.597 | 1.856 | 44.31 | 17.16 | 0.0009 |
UK | 33.14 | 11.22 | 74.56 | 431.0 | 0.6949 |
Dataset | Incremental Update Time (sec.) | \(\frac {|\mathcal {A} |}{|V |}\) | ||||
---|---|---|---|---|---|---|
ParDHL− | ParDHL | IncHL | IncHL+ | IncFD | ||
Skitter | 0.446 | 0.090 | 0.170 | 0.457 | 0.576 | 0.3847 |
Hollywood | 0.167 | 0.040 | 0.071 | 0.081 | 0.087 | 0.0405 |
Orkut | 1.949 | 0.430 | 1.698 | 3.026 | 1.990 | 0.1700 |
Enwiki | 0.285 | 0.107 | 0.284 | 0.229 | 0.157 | 0.0075 |
Livejournal | 0.300 | 0.096 | 0.222 | 0.325 | 0.251 | 0.0209 |
Indochina | 6.877 | 1.214 | 3.525 | 167.7 | 504.5 | 2.9487 |
IT | 60.89 | 12.88 | 62.45 | 95.92 | 335.7 | 4.2755 |
Twitter | 2.891 | 1.198 | 14.16 | 0.037 | 0.107 | 0.0004 |
Friendster | 4.388 | 1.724 | 21.57 | 0.169 | 0.220 | 0.0006 |
UK | 20.56 | 5.505 | 42.27 | 21.49 | 469.5 | 0.5029 |
Dataset | Decremental Update Time (sec.) | \(\frac {|\mathcal {A}|}{|V|}\) | |||
---|---|---|---|---|---|
ParDHL− | ParDHL | DecHL | DecFD | ||
Skitter | 1.214 | 0.317 | 0.700 | 24.22 | 0.7466 |
Hollywood | 0.218 | 0.053 | 0.129 | 6.912 | 0.0368 |
Orkut | 4.859 | 1.090 | 1.537 | 69.44 | 0.3197 |
Enwiki | 8.433 | 3.534 | 6.539 | 239.1 | 0.8536 |
Livejournal | 1.032 | 0.384 | 0.665 | 15.00 | 0.0937 |
Indochina | 2.111 | 1.265 | 1.425 | 43.12 | 0.4162 |
IT | 8.351 | 3.54 | 31.26 | 350.5 | 0.1637 |
Twitter | 90.03 | 50.02 | 231.9 | 10628 | 0.4971 |
Friendster | 4.784 | 1.957 | 44.62 | 28.54 | 0.0012 |
UK | 31.76 | 13.07 | 70.53 | 254.1 | 0.5165 |
7.2.2 Labelling size
Dataset | Query Time [ms] | Labelling Size | ||
---|---|---|---|---|
ParDHL | FulFD | ParDHL | FulFD | |
Skitter | 0.029 | 0.020 | 42 MB | 153 MB |
Hollywood | 0.026 | 0.036 | 27 MB | 263 MB |
Orkut | 0.102 | 0.156 | 70 MB | 711 MB |
Enwiki | 0.053 | 0.051 | 82 MB | 608 MB |
Livejournal | 0.043 | 0.051 | 122 MB | 663 MB |
Indochina | 0.788 | 0.767 | 85 MB | 838 MB |
IT | 1.167 | 1.103 | 866 MB | 4.74 GB |
Twitter | 0.868 | 0.174 | 1.14 GB | 3.83 GB |
Friendster | 0.815 | 0.902 | 2.43 GB | 9.14 GB |
UK | 1.174 | 5.233 | 1.78 GB | 11.8 GB |