research-article

Accelerating truss decomposition on heterogeneous processors

Authors:
Yulin Che

Hong Kong University of Science and Technology

Hong Kong University of Science and Technology
View Profile

,
Zhuohang Lai

Hong Kong University of Science and Technology

Hong Kong University of Science and Technology
View Profile

,
Shixuan Sun

National University of Singapore, Singapore

National University of Singapore, Singapore
View Profile

,
Yue Wang

Shenzhen University

Shenzhen University
View Profile

,
Qiong Luo

Hong Kong University of Science and Technology

Hong Kong University of Science and Technology
View Profile

Proceedings of the VLDB Endowment Volume 13 Issue 10pp 1751–1764https://doi.org/10.14778/3401960.3401971

Published:01 June 2020Publication History

Proceedings of the VLDB Endowment

Abstract

Truss decomposition is to divide a graph into a hierarchy of subgraphs, or trusses. A subgraph is a k-truss (k ≥ 2) if each edge is in at least k --- 2 triangles in the subgraph. Existing algorithms work by first counting the number of triangles each edge is in and then iteratively incrementing k to peel off the edges that will not appear in (k + 1)-truss. Due to the data and computation intensity, truss decomposition on billion-edge graphs takes hours to complete on a commodity computer.

We propose to accelerate in-memory truss decomposition by (1) compacting intermediate results to optimize memory access, (2) dynamically adjusting the computation based on data characteristics, and (3) parallelizing the algorithm on both the multicore CPU and the GPU. In particular, we optimize the triangle enumeration with data skew handling, and determine at runtime whether to pursue peeling or direct triangle counting to obtain a certain k-truss. We further develop a CPU-GPU co-processing strategy in which the CPU first computes intermediate results and sends the compacted results to the GPU for further computation. Our experiments on real-world datasets show that our implementations outperform the state of the art by up to an order of magnitude. Our source code is publicly available at https://github.com/RapidsAtHKUST/AccTrussDecomposition.

References

Cub documentation. http://nvlabs.github.io/cub/. Accessed in 2020.Google Scholar
Gcc atomic built-ins. https://gcc.gnu.org/onlinedocs/gcc-4.1.1/gcc/Atomic-Builtins.html. Accessed in 2020.Google Scholar
Openmp documentation. https://www.openmp.org/. Accessed in 2020.Google Scholar
E. Akbas and P. Zhao. Truss-based community search: a truss-equivalence based indexing approach. PVLDB, 10(11):1298--1309, 2017. Google ScholarDigital Library
M. Almasri, O. Anjum, C. Pearson, Z. Qureshi, V. S. Mailthody, R. Nagi, J. Xiong, and W.-m. Hwu. Update on k-truss decomposition on gpu. In HPEC, pages 1--7. IEEE, 2019.Google ScholarCross Ref
P. Boldi, M. Rosa, M. Santini, and S. Vigna. Layered label propagation: A multiresolution coordinate-free ordering for compressing social networks. In WWW, pages 587--596, 2011. Google ScholarDigital Library
P. Boldi and S. Vigna. The webgraph framework i: compression techniques. In WWW, pages 595--602, 2004. Google ScholarDigital Library
L. Chang and L. Qin. Minimum degree-based core decomposition. In Cohesive Subgraph Computation over Large Sparse Graphs, pages 21--39. Springer, 2018.Google ScholarCross Ref
Y. Che. Rmat graph format converter code repository. https://github.com/RapidsAtHKUST/KroneckerBinEdgeListToCSR. Accessed in 2020.Google Scholar
Y. Che. Rmat graph generator code repository. https://github.com/RapidsAtHKUST/Graph500KroneckerGraphGenerator. Accessed in 2020.Google Scholar
Y. Che, Z. Lai, S. Sun, Q. Luo, and Y. Wang. Accelerating all-edge common neighbor counting on three processors. In ICPP, pages 1--10, 2019. Google ScholarDigital Library
Y. Che, Z. Lai, S. Sun, Y. Wang, and Q. Luo. Source code of accelerating truss decomposition on heterogeneous processors. https://github.com/RapidsAtHKUST/AccTrussDecomposition. Accessed in 2020.Google Scholar
N. Chiba and T. Nishizeki. Arboricity and subgraph listing algorithms. SIAM Journal on computing, 14(1):210--223, 1985. Google ScholarDigital Library
J. Cohen. Trusses: Cohesive subgraphs for social network analysis. National security agency technical report, 16:3--1, 2008.Google Scholar
K. Date, K. Feng, R. Nagi, J. Xiong, N. S. Kim, and W.-M. Hwu. Collaborative (cpu+ gpu) algorithms for triangle counting and truss decomposition on the minsky architecture: Static graph challenge: Subgraph isomorphism. In HPEC, pages 1--7. IEEE, 2017.Google ScholarCross Ref
L. Dhulipala, G. Blelloch, and J. Shun. Julienne: A framework for parallel graph algorithms using work-efficient bucketing. In SPAA, pages 293--304, 2017. Google ScholarDigital Library
C. Gui, L. Zheng, P. Yao, X. Liao, and H. Jin. Fast triangle counting on GPU. In HPEC, pages 1--7. IEEE, 2019. Google ScholarDigital Library
Y. Hu, H. Liu, and H. H. Huang. High-performance triangle counting on gpus. In HPEC, pages 1--5. IEEE, 2018.Google ScholarCross Ref
X. Huang, H. Cheng, L. Qin, W. Tian, and J. X. Yu. Querying k-truss community in large and dynamic graphs. In SIGMOD, pages 1311--1322, 2014. Google ScholarDigital Library
H. Kabir and K. Madduri. Parallel k-truss decomposition on multicore systems. In HPEC, pages 1--7. IEEE, 2017.Google ScholarCross Ref
H. Kabir and K. Madduri. Shared-memory graph truss decomposition. In HiPC, pages 13--22. IEEE, 2017.Google ScholarCross Ref
J. Leskovec and A. Krevl. SNAP Datasets: Stanford large network dataset collection. http://snap.stanford.edu/data, June 2014. Accessed in 2020.Google Scholar
L. Lü, T. Zhou, Q.-M. Zhang, and H. E. Stanley. The h-index of a network node and its relation to degree and coreness. Nature communications, 7:10168, 2016.Google ScholarCross Ref
V. S. Mailthody, K. Date, Z. Qureshi, C. Pearson, R. Nagi, J. Xiong, and W.-m. Hwu. Collaborative (cpu+ gpu) algorithms for triangle counting and truss decomposition. In HPEC, pages 1--7. IEEE, 2018.Google Scholar
A. Montresor, F. De Pellegrini, and D. Miorandi. Distributed k-core decomposition. TPDS, 24(2):288--300, 2012. Google ScholarDigital Library
S. Pandey, X. S. Li, A. Buluç, J. Xu, and H. Liu. H-INDEX: hash-indexing for parallel triangle counting on gpus. In HPEC, pages 1--7. IEEE, 2019.Google ScholarCross Ref
R. Pearce. Triangle counting for scale-free graphs at scale in distributed memory. In HPEC, pages 1--4. IEEE, 2017.Google ScholarCross Ref
R. A. Rossi. Fast triangle core decomposition for mining large graphs. In PAKDD, pages 310--322. Springer, 2014.Google ScholarCross Ref
A. E. Sariyüce and A. Pinar. Fast hierarchy construction for dense subgraphs. PVLDB, 10(3):97--108, 2016. Google ScholarDigital Library
A. E. Sariyüce, C. Seshadhri, and A. Pinar. Local algorithms for hierarchical dense subgraph discovery. PVLDB, 12(1):43--56, 2018. Google ScholarDigital Library
A. E. Sariyuce, C. Seshadhri, A. Pinar, and U. V. Catalyurek. Finding the hierarchy of dense subgraphs using nucleus decompositions. In WWW, pages 927--937, 2015. Google ScholarDigital Library
J. Shun and G. E. Blelloch. Ligra: a lightweight graph processing framework for shared memory. In PPoPP, pages 135--146, 2013. Google ScholarDigital Library
J. Singler, P. Sanders, and F. Putze. Mcstl: The multi-core standard template library. In European Conference on Parallel Processing, pages 682--694. Springer, 2007. Google ScholarDigital Library
S. Smith, X. Liu, N. K. Ahmed, A. S. Tom, F. Petrini, and G. Karypis. Truss decomposition on shared-memory parallel systems. In HPEC, pages 1--6. IEEE, 2017.Google ScholarCross Ref
S. Suri and S. Vassilvitskii. Counting triangles and the curse of the last reducer. In WWW, pages 607--614, 2011. Google ScholarDigital Library
A. S. Tom, N. Sundaram, N. K. Ahmed, S. Smith, S. Eyerman, M. Kodiyath, I. Hur, F. Petrini, and G. Karypis. Exploring optimizations on shared-memory platforms for parallel triangle counting algorithms. In HPEC, pages 1--7. IEEE, 2017.Google ScholarCross Ref
C. Voegele, Y.-S. Lu, S. Pai, and K. Pingali. Parallel triangle counting and k-truss identification using graph-centric methods. In HPEC, pages 1--7. IEEE, 2017.Google ScholarCross Ref
J. Wang and J. Cheng. Truss decomposition in massive networks. PVLDB, 5(9):812--823, 2012. Google ScholarDigital Library
J. Wu, A. Goshulak, V. Srinivasan, and A. Thomo. K-truss decomposition of large networks on a single consumer-grade machine. In ASONAM, pages 873--880. IEEE, 2018. Google ScholarDigital Library
J. Zhang, D. G. Spampinato, S. McMillan, and F. Franchetti. Preliminary exploration of large-scale triangle counting on shared-memory multicore system. In HPEC, pages 1--6. IEEE, 2018.Google ScholarCross Ref
Y. Zhang and J. X. Yu. Unboundedness and efficiency of truss maintenance in evolving graphs. In SIGMOD, pages 1024--1041, 2019. Google ScholarDigital Library

Recommendations

Accelerating Graph Structural Clustering Algorithms on Heterogeneous Processors
Read More
Accelerating DynEarthSol3D on tightly coupled CPU-GPU heterogeneous processors

DynEarthSol3D (Dynamic Earth Solver in Three Dimensions) is a flexible, open-source finite element solver that models the momentum balance and the heat transfer of elasto-visco-plastic material in the Lagrangian form using unstructured meshes. It ...
Read More
Accelerating simulation of agent-based models on heterogeneous architectures
GPGPU-6: Proceedings of the 6th Workshop on General Purpose Processor Using Graphics Processing Units

The wide usage of GPGPU programming models and compiler techniques enables the optimization of data-parallel programs on commodity GPUs. However, mapping GPGPU applications running on discrete parts to emerging integrated heterogeneous architectures ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Article

Published in
Proceedings of the VLDB Endowment Volume 13, Issue 10
June 2020
193 pages
ISSN:2150-8097
Editors:
Magdalena Balazinska
University of Washington
,
Xiaofang Zhou
University of Queensland, Australia
Issue’s Table of Contents
Sponsors
In-Cooperation
Publisher
VLDB Endowment
Publication History
- Published: 1 June 2020
Published in pvldb Volume 13, Issue 10
Qualifiers
- research-article
Conference
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 15
  Total Citations
  View Citations
- 153
  Total Downloads
- Downloads (Last 12 months)69
- Downloads (Last 6 weeks)12
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Accelerating truss decomposition on heterogeneous processors

Proceedings of the VLDB Endowment

Abstract

References

Cited By

Recommendations

Accelerating Graph Structural Clustering Algorithms on Heterogeneous Processors

Accelerating DynEarthSol3D on tightly coupled CPU-GPU heterogeneous processors

Accelerating simulation of agent-based models on heterogeneous architectures

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Qualifiers

Conference

Funding Sources

Other Metrics

Article Metrics

Other Metrics

Cited By

PDF Format

eReader

Digital Edition

Caption

Accelerating truss decomposition on heterogeneous processors

Proceedings of the VLDB Endowment

Abstract

References

Cited By

Recommendations

Accelerating Graph Structural Clustering Algorithms on Heterogeneous Processors

Accelerating DynEarthSol3D on tightly coupled CPU-GPU heterogeneous processors

Accelerating simulation of agent-based models on heterogeneous architectures

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Qualifiers

Conference

Funding Sources

Article Metrics

Other Metrics

PDF Format

eReader

Digital Edition

Share this Publication link

Share on Social Media