skip to main content
research-article

Accelerating truss decomposition on heterogeneous processors

Published:01 June 2020Publication History
Skip Abstract Section

Abstract

Truss decomposition is to divide a graph into a hierarchy of subgraphs, or trusses. A subgraph is a k-truss (k ≥ 2) if each edge is in at least k --- 2 triangles in the subgraph. Existing algorithms work by first counting the number of triangles each edge is in and then iteratively incrementing k to peel off the edges that will not appear in (k + 1)-truss. Due to the data and computation intensity, truss decomposition on billion-edge graphs takes hours to complete on a commodity computer.

We propose to accelerate in-memory truss decomposition by (1) compacting intermediate results to optimize memory access, (2) dynamically adjusting the computation based on data characteristics, and (3) parallelizing the algorithm on both the multicore CPU and the GPU. In particular, we optimize the triangle enumeration with data skew handling, and determine at runtime whether to pursue peeling or direct triangle counting to obtain a certain k-truss. We further develop a CPU-GPU co-processing strategy in which the CPU first computes intermediate results and sends the compacted results to the GPU for further computation. Our experiments on real-world datasets show that our implementations outperform the state of the art by up to an order of magnitude. Our source code is publicly available at https://github.com/RapidsAtHKUST/AccTrussDecomposition.

References

  1. Cub documentation. http://nvlabs.github.io/cub/. Accessed in 2020.Google ScholarGoogle Scholar
  2. Gcc atomic built-ins. https://gcc.gnu.org/onlinedocs/gcc-4.1.1/gcc/Atomic-Builtins.html. Accessed in 2020.Google ScholarGoogle Scholar
  3. Openmp documentation. https://www.openmp.org/. Accessed in 2020.Google ScholarGoogle Scholar
  4. E. Akbas and P. Zhao. Truss-based community search: a truss-equivalence based indexing approach. PVLDB, 10(11):1298--1309, 2017. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. M. Almasri, O. Anjum, C. Pearson, Z. Qureshi, V. S. Mailthody, R. Nagi, J. Xiong, and W.-m. Hwu. Update on k-truss decomposition on gpu. In HPEC, pages 1--7. IEEE, 2019.Google ScholarGoogle ScholarCross RefCross Ref
  6. P. Boldi, M. Rosa, M. Santini, and S. Vigna. Layered label propagation: A multiresolution coordinate-free ordering for compressing social networks. In WWW, pages 587--596, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. P. Boldi and S. Vigna. The webgraph framework i: compression techniques. In WWW, pages 595--602, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. L. Chang and L. Qin. Minimum degree-based core decomposition. In Cohesive Subgraph Computation over Large Sparse Graphs, pages 21--39. Springer, 2018.Google ScholarGoogle ScholarCross RefCross Ref
  9. Y. Che. Rmat graph format converter code repository. https://github.com/RapidsAtHKUST/KroneckerBinEdgeListToCSR. Accessed in 2020.Google ScholarGoogle Scholar
  10. Y. Che. Rmat graph generator code repository. https://github.com/RapidsAtHKUST/Graph500KroneckerGraphGenerator. Accessed in 2020.Google ScholarGoogle Scholar
  11. Y. Che, Z. Lai, S. Sun, Q. Luo, and Y. Wang. Accelerating all-edge common neighbor counting on three processors. In ICPP, pages 1--10, 2019. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. Y. Che, Z. Lai, S. Sun, Y. Wang, and Q. Luo. Source code of accelerating truss decomposition on heterogeneous processors. https://github.com/RapidsAtHKUST/AccTrussDecomposition. Accessed in 2020.Google ScholarGoogle Scholar
  13. N. Chiba and T. Nishizeki. Arboricity and subgraph listing algorithms. SIAM Journal on computing, 14(1):210--223, 1985. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. J. Cohen. Trusses: Cohesive subgraphs for social network analysis. National security agency technical report, 16:3--1, 2008.Google ScholarGoogle Scholar
  15. K. Date, K. Feng, R. Nagi, J. Xiong, N. S. Kim, and W.-M. Hwu. Collaborative (cpu+ gpu) algorithms for triangle counting and truss decomposition on the minsky architecture: Static graph challenge: Subgraph isomorphism. In HPEC, pages 1--7. IEEE, 2017.Google ScholarGoogle ScholarCross RefCross Ref
  16. L. Dhulipala, G. Blelloch, and J. Shun. Julienne: A framework for parallel graph algorithms using work-efficient bucketing. In SPAA, pages 293--304, 2017. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. C. Gui, L. Zheng, P. Yao, X. Liao, and H. Jin. Fast triangle counting on GPU. In HPEC, pages 1--7. IEEE, 2019. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. Y. Hu, H. Liu, and H. H. Huang. High-performance triangle counting on gpus. In HPEC, pages 1--5. IEEE, 2018.Google ScholarGoogle ScholarCross RefCross Ref
  19. X. Huang, H. Cheng, L. Qin, W. Tian, and J. X. Yu. Querying k-truss community in large and dynamic graphs. In SIGMOD, pages 1311--1322, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. H. Kabir and K. Madduri. Parallel k-truss decomposition on multicore systems. In HPEC, pages 1--7. IEEE, 2017.Google ScholarGoogle ScholarCross RefCross Ref
  21. H. Kabir and K. Madduri. Shared-memory graph truss decomposition. In HiPC, pages 13--22. IEEE, 2017.Google ScholarGoogle ScholarCross RefCross Ref
  22. J. Leskovec and A. Krevl. SNAP Datasets: Stanford large network dataset collection. http://snap.stanford.edu/data, June 2014. Accessed in 2020.Google ScholarGoogle Scholar
  23. L. Lü, T. Zhou, Q.-M. Zhang, and H. E. Stanley. The h-index of a network node and its relation to degree and coreness. Nature communications, 7:10168, 2016.Google ScholarGoogle ScholarCross RefCross Ref
  24. V. S. Mailthody, K. Date, Z. Qureshi, C. Pearson, R. Nagi, J. Xiong, and W.-m. Hwu. Collaborative (cpu+ gpu) algorithms for triangle counting and truss decomposition. In HPEC, pages 1--7. IEEE, 2018.Google ScholarGoogle Scholar
  25. A. Montresor, F. De Pellegrini, and D. Miorandi. Distributed k-core decomposition. TPDS, 24(2):288--300, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. S. Pandey, X. S. Li, A. Buluç, J. Xu, and H. Liu. H-INDEX: hash-indexing for parallel triangle counting on gpus. In HPEC, pages 1--7. IEEE, 2019.Google ScholarGoogle ScholarCross RefCross Ref
  27. R. Pearce. Triangle counting for scale-free graphs at scale in distributed memory. In HPEC, pages 1--4. IEEE, 2017.Google ScholarGoogle ScholarCross RefCross Ref
  28. R. A. Rossi. Fast triangle core decomposition for mining large graphs. In PAKDD, pages 310--322. Springer, 2014.Google ScholarGoogle ScholarCross RefCross Ref
  29. A. E. Sariyüce and A. Pinar. Fast hierarchy construction for dense subgraphs. PVLDB, 10(3):97--108, 2016. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. A. E. Sariyüce, C. Seshadhri, and A. Pinar. Local algorithms for hierarchical dense subgraph discovery. PVLDB, 12(1):43--56, 2018. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. A. E. Sariyuce, C. Seshadhri, A. Pinar, and U. V. Catalyurek. Finding the hierarchy of dense subgraphs using nucleus decompositions. In WWW, pages 927--937, 2015. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. J. Shun and G. E. Blelloch. Ligra: a lightweight graph processing framework for shared memory. In PPoPP, pages 135--146, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. J. Singler, P. Sanders, and F. Putze. Mcstl: The multi-core standard template library. In European Conference on Parallel Processing, pages 682--694. Springer, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. S. Smith, X. Liu, N. K. Ahmed, A. S. Tom, F. Petrini, and G. Karypis. Truss decomposition on shared-memory parallel systems. In HPEC, pages 1--6. IEEE, 2017.Google ScholarGoogle ScholarCross RefCross Ref
  35. S. Suri and S. Vassilvitskii. Counting triangles and the curse of the last reducer. In WWW, pages 607--614, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  36. A. S. Tom, N. Sundaram, N. K. Ahmed, S. Smith, S. Eyerman, M. Kodiyath, I. Hur, F. Petrini, and G. Karypis. Exploring optimizations on shared-memory platforms for parallel triangle counting algorithms. In HPEC, pages 1--7. IEEE, 2017.Google ScholarGoogle ScholarCross RefCross Ref
  37. C. Voegele, Y.-S. Lu, S. Pai, and K. Pingali. Parallel triangle counting and k-truss identification using graph-centric methods. In HPEC, pages 1--7. IEEE, 2017.Google ScholarGoogle ScholarCross RefCross Ref
  38. J. Wang and J. Cheng. Truss decomposition in massive networks. PVLDB, 5(9):812--823, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  39. J. Wu, A. Goshulak, V. Srinivasan, and A. Thomo. K-truss decomposition of large networks on a single consumer-grade machine. In ASONAM, pages 873--880. IEEE, 2018. Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. J. Zhang, D. G. Spampinato, S. McMillan, and F. Franchetti. Preliminary exploration of large-scale triangle counting on shared-memory multicore system. In HPEC, pages 1--6. IEEE, 2018.Google ScholarGoogle ScholarCross RefCross Ref
  41. Y. Zhang and J. X. Yu. Unboundedness and efficiency of truss maintenance in evolving graphs. In SIGMOD, pages 1024--1041, 2019. Google ScholarGoogle ScholarDigital LibraryDigital Library

Recommendations

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Sign in

Full Access

  • Published in

    cover image Proceedings of the VLDB Endowment
    Proceedings of the VLDB Endowment  Volume 13, Issue 10
    June 2020
    193 pages
    ISSN:2150-8097
    Issue’s Table of Contents

    Publisher

    VLDB Endowment

    Publication History

    • Published: 1 June 2020
    Published in pvldb Volume 13, Issue 10

    Qualifiers

    • research-article

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader