skip to main content
research-article

Efficient bi-triangle counting for large bipartite networks

Published:01 February 2021Publication History
Skip Abstract Section

Abstract

A bipartite network is a network with two disjoint vertex sets and its edges only exist between vertices from different sets. It has received much interest since it can be used to model the relationship between two different sets of objects in many applications (e.g., the relationship between users and items in E-commerce). In this paper, we study the problem of efficient bi-triangle counting for a large bipartite network, where a bi-triangle is a cycle with three vertices from one vertex set and three vertices from another vertex set. Counting bi-triangles has found many real applications such as computing the transitivity coefficient and clustering coefficient for bipartite networks. To enable efficient bi-triangle counting, we first develop a baseline algorithm relying on the observation that each bi-triangle can be considered as the join of three wedges. Then, we propose a more sophisticated algorithm which regards a bi-triangle as the join of two super-wedges, where a wedge is a path with two edges while a super-wedge is a path with three edges. We further optimize the algorithm by ranking vertices according to their degrees. We have performed extensive experiments on both real and synthetic bipartite networks, where the largest one contains more than one billion edges, and the results show that the proposed solutions are up to five orders of magnitude faster than the baseline method.

References

  1. Nesreen K. Ahmed, Jennifer Neville, Ryan A. Rossi, and Nick G. Duffield. 2015. Efficient Graphlet Counting for Large Networks. In ICDM 2015, Atlantic City, NJ, USA, November 14-17, 2015. 1--10. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. Sinan Aksoy, Tamara G. Kolda, and Ali Pinar. 2017. Measuring and modeling bipartite graphs with community structure. J. Complex Networks 5, 4 (2017).Google ScholarGoogle ScholarCross RefCross Ref
  3. Michael J. Barber. 2007. Modularity and community detection in bipartite networks. Phys. Rev. E 76 (2007).Google ScholarGoogle Scholar
  4. Etienne Birmelé. 2009. A scale-free graph model based on bipartite graphs. Discrete Applied Mathematics 157, 10 (2009), 2267--2284. Networks in Computational Biology. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. Andreas Björklund, Petteri Kaski, and Lukasz Kowalik. 2017. Counting Thin Subgraphs via Packings Faster than Meet-in-the-Middle Time. ACM Trans. Algorithms 13, 4 (2017), 48:1--48:26. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. Marco Bressan. 2019. Faster Subgraph Counting in Sparse Graphs. In IPEC.Google ScholarGoogle Scholar
  7. Norishige Chiba and Takao Nishizeki. 1985. Arboricity and Subgraph Listing Algorithms. SIAM J. Comput. 14 (1985), 210--223. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. Luigi P. Cordella, Pasquale Foggia, Carlo Sansone, and Mario Vento. 2004. A (Sub)Graph Isomorphism Algorithm for Matching Large Graphs. IEEE TPAMI 26, 10 (2004). Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Jean-Pierre Eckmann and Elisha Moses. 2002. Curvature of co-links uncovers hidden thematic layers in the World Wide Web. PNAS (2002).Google ScholarGoogle Scholar
  10. Yixiang Fang, Xin Huang, Lu Qin, Ying Zhang, Wenjie Zhang, Reynold Cheng, and Xuemin Lin. 2020. A survey of community search over big graphs. The VLDB Journal (2020), 353--392.Google ScholarGoogle Scholar
  11. Yixiang Fang, Yixing Yang, Wenjie Zhang, Xuemin Lin, and Xin Cao. 2020. Effective and Efficient Community Search over Large Heterogeneous Information Networks. Proc. VLDB Endow. 13, 6 (2020), 854--867. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. Yixiang Fang, Kaiqiang Yu, Reynold Cheng, Laks V.S. Lakshmanan, and Xuemin Lin. 2019. Efficient algorithms for densest subgraph discovery. PVLDB 12, 11 (2019), 1719--1732. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. Peter Floderus, Mirosław Kowaluk, Andrzej Lingas, and Eva-Marta Lundell. 2013. Detecting and Counting Small Pattern Graphs. In Algorithms and Computation. Springer Berlin Heidelberg, 547--557.Google ScholarGoogle Scholar
  14. J. Flum and M. Grohe. 2002. The parameterized complexity of counting problems. In The 43rd Annual IEEE Symposium on Foundations of Computer Science, 2002. Proceedings. 538--547. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. P. Frankl and V. Rödl. 1985. Near Perfect Coverings in Graphs and Hypergraphs. European Journal of Combinatorics 6, 4 (1985), 317 -- 326.Google ScholarGoogle ScholarCross RefCross Ref
  16. R. L. Graham. 1966. Bounds for certain multiprocessing anomalies. The Bell System Technical Journal 45, 9 (1966), 1563--1581.Google ScholarGoogle ScholarCross RefCross Ref
  17. Oded Green, Pavan Yalamanchili, and Lluís-Miquel Munguía. [n.d.]. Fast triangle counting on the GPU. In IA3 2014. 1--8. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. Thomas R. Halford and Keith M. Chugg. 2006. An algorithm for counting short cycles in bipartite graphs. IEEE Trans. Inf. Theory 52, 1 (2006), 287--292. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. Wook-Shin Han, Jinsoo Lee, and Jeong-Hoon Lee. 2013. Turboiso: towards ultrafast and robust subgraph isomorphism search in large graph databases. In Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2013, New York, NY, USA, June 22-27, 2013. 337--348. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. Jiafeng Hu, Reynold Cheng, Kevin Chen-Chuan Chang, Aravind Sankar, Yixiang Fang, and Brian YH Lam. 2019. Discovering maximal motif cliques in large heterogeneous information networks. In ICDE. IEEE, 746--757.Google ScholarGoogle Scholar
  21. Xin Huang, Hong Cheng, Lu Qin, Wentao Tian, and Jeffrey Xu Yu. 2014. Querying k-truss community in large and dynamic graphs. In SIGMOD. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. Xin Huang and Laks V. S. Lakshmanan. 2017. Attribute-Driven Community Search. Proc. VLDB Endow. 10, 9 (2017), 949--960. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. Xin Huang, Wei Lu, and Laks V. S. Lakshmanan. 2016. Truss Decomposition of Probabilistic Graphs: Semantics and Algorithms. In SIGMOD. 77--90. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. Madhav Jha, C. Seshadhri, and Ali Pinar. 2013. A space efficient streaming algorithm for triangle counting using the birthday paradox. In SIGKDD. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. Madhav Jha, C. Seshadhri, and Ali Pinar. 2015. Path Sampling: A Fast and Provable Method for Estimating 4-Vertex Subgraph Counts. In WWW. 495--505. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. Hyeonji Kim, Juneyoung Lee, Sourav S. Bhowmick, Wook-Shin Han, Jeong-Hoon Lee, Seongyun Ko, and Moath H. A. Jarrah. 2016. DUALSIM: Parallel Subgraph Enumeration in a Massive Graph on a Single Machine. In SIGMOD, Fatma Özcan, Georgia Koutrika, and Sam Madden (Eds.). Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. Longbin Lai, Lu Qin, Xuemin Lin, and Lijun Chang. 2017. Scalable subgraph enumeration in MapReduce: a cost-oriented approach. VLDB J. 26, 3 (2017). Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. Longbin Lai, Zhu Qing, Zhengyi Yang, Xin Jin, Zhengmin Lai, Ran Wang, Kongzhang Hao, Xuemin Lin, Lu Qin, Wenjie Zhang, et al. 2019. Distributed subgraph matching on timely dataflow. PVLDB (2019). Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. Pedro Lind, Marta C. Gonzalez, and Hans Herrmann. 2005. Cycles and clustering in bipartite networks. PRE (2005).Google ScholarGoogle Scholar
  30. Boge Liu, Long Yuan, Xuemin Lin, Lu Qin, Wenjie Zhang, and Jingren Zhou. 2019. Efficient (a, bgr)-core Computation: an Index-based Approach. In WWW, Ling Liu, Ryen W. White, Amin Mantrach, Fabrizio Silvestri, Julian J. McAuley, Ricardo Baeza-Yates, and Leila Zia (Eds.). Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. X. Liu and T. Murata. 2009. Community Detection in Large-Scale Bipartite Networks. In 2009 IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. Chenhao Ma, Reynold Cheng, Laks V.S. Lakshmanan, Tobias Grubenmann, Yixiang Fang, and Xiaodong Li. 2019. LINC: a motif counting algorithm for uncertain graphs. PVLDB 13, 2 (2019), 155--168. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. Ali Masoudi-Nejad, Falk Schreiber, and Zahra Razaghi Moghadam Kashani. 2012. Building blocks of biological networks : a review on major network motif discovery algorithms. IET Systems Biology 6, 5 (2012), 164--174.Google ScholarGoogle ScholarCross RefCross Ref
  34. Nathalie Del Vecchio Matthieu Latapy, Clemence Magnien. 2008. Basic notions for the analysis of large two-mode networks. Social Networks 30, 1 (2008).Google ScholarGoogle Scholar
  35. M. E. J. Newman. 2003. The structure and function of complex networks. SIAM REVIEW 45, 2 (06 2003), 167--256.Google ScholarGoogle Scholar
  36. M. E. J. Newman. 2005. Approximating Clustering Coefficient and Transitivity. Journal of Graph Algorithms and Applications 9, 2 (2005), 265--275.Google ScholarGoogle ScholarCross RefCross Ref
  37. Tore Opsahl. 2013. Triadic closure in two-mode networks: Redefining the global and local clustering coefficients. Soc. Networks 35, 2 (2013), 159--167.Google ScholarGoogle ScholarCross RefCross Ref
  38. Roger Pearce. 2017. Triangle counting for scale-free graphs at scale in distributed memory. In HPEC. 1--4.Google ScholarGoogle Scholar
  39. Ali Pinar, C. Seshadhri, and Vaidyanathan Vishal. 2017. ESCAPE: Efficiently Counting All 5-Vertex Subgraphs. In WWW. 1431--1440. Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. Pedro Manuel Pinto Ribeiro, Pedro Paredes, Miguel E.P. Silva, David Oliveira Aparício, and Fernando M. A. Silva. 2019. A Survey on Subgraph Counting: Concepts, Algorithms and Applications to Network Motifs and Graphlets. ArXiv abs/1910.13011 (2019).Google ScholarGoogle Scholar
  41. Garry Robins and Malcolm Alexander. 2004. Small Worlds Among Interlocking Directors: Network Structure and Distance in Bipartite Graphs. Comput. Math. Organ. Theory 10, 1 (2004), 69--94. Google ScholarGoogle ScholarDigital LibraryDigital Library
  42. G. Robins and M. Alexander. 2004. Small Worlds Among Interlocking Directors: Network Structure and Distance in Bipartite Graphs. Computational and Mathematical Organization Theory 10 (2004), 69--94. Google ScholarGoogle ScholarDigital LibraryDigital Library
  43. Seyed-Vahid Sanei-Mehri, Ahmet Erdem Sariyüce, and Srikanta Tirthapura. 2018. Butterfly Counting in Bipartite Networks. In KDD. 2150--2159. Google ScholarGoogle ScholarDigital LibraryDigital Library
  44. Ahmet Erdem Sariyüce and Ali Pinar. 2018. Peeling Bipartite Networks for Dense Subgraph Discovery. In WSDM. 504--512. Google ScholarGoogle ScholarDigital LibraryDigital Library
  45. C. Seshadhri, Ali Pinar, and Tamara G. Kolda. 2012. Fast Triangle Counting through Wedge Sampling. CoRR abs/1202.5230 (2012).Google ScholarGoogle Scholar
  46. Haichuan Shang, Ying Zhang, Xuemin Lin, and Jeffrey Xu Yu. 2008. Taming verification hardness: an efficient algorithm for testing subgraph isomorphism. Proc. VLDB Endow. 1, 1 (2008),364--375. Google ScholarGoogle ScholarDigital LibraryDigital Library
  47. Katherine Faust Stanley Wasserman. [n.d.]. Social Network Analysis: Methods and Applications.Google ScholarGoogle Scholar
  48. C. E. Tsourakakis, P. Drineas, E. Michelakis, I. Koutis, and C. Faloutsos. 2011. Spectral counting of triangles via element-wise sparsification and triangle-based link recommendation. SNAM 1, 2 (2011), 75--81.Google ScholarGoogle Scholar
  49. J. R. Ullmann. 1976. An algorithm for subgraph isomorphism. JOURNAL OF THE ACM 28, 1 (1976), 31--42. Google ScholarGoogle ScholarDigital LibraryDigital Library
  50. Kai Wang, Xin Cao, Xuemin Lin, Wenjie Zhang, and Lu Qin. 2018. Efficient computing of radius-bounded k-cores. In 2018 IEEE 34th International Conference on Data Engineering (ICDE). IEEE, 233--244.Google ScholarGoogle ScholarCross RefCross Ref
  51. Kai Wang, Xuemin Lin, Lu Qin, Wenjie Zhang, and Ying Zhang. 2019. Vertex Priority Based Butterfly Counting for Large-scale Bipartite Networks. Proc. VLDB Endow. 12, 10 (2019), 1139--1152. Google ScholarGoogle ScholarDigital LibraryDigital Library
  52. Kai Wang, Xuemin Lin, Lu Qin, Wenjie Zhang, and Ying Zhang. 2020. Efficient bitruss decomposition for large-scale bipartite graphs. In ICDE. IEEE, 661--672.Google ScholarGoogle Scholar
  53. Kai Wang, Xuemin Lin, Lu Qin, Wenjie Zhang, and Ying Zhang. 2020. Efficient Bitruss Decomposition for Large-scale Bipartite Graphs. In ICDE 2020. 661--672.Google ScholarGoogle Scholar
  54. Pinghui Wang, Junzhou Zhao, Xiangliang Zhang, Zhenguo Li, Jiefeng Cheng, John C. S. Lui, Don Towsley, Jing Tao, and Xiaohong Guan. 2018. MOSS-5: A Fast Method of Approximating Counts of 5-Node Graphlets in Large Graphs. IEEE Trans. Knowl. Data Eng. 30, 1 (2018), 73--86.Google ScholarGoogle ScholarCross RefCross Ref
  55. Taotao Wang and Soung Chang Liew. 2014. Joint Channel Estimation and Channel Decoding in Physical-Layer Network Coding Systems: An EM-BP Factor Graph Framework. IEEE Trans. Wireless Communications 13, 4 (2014), 2229--2245.Google ScholarGoogle ScholarCross RefCross Ref
  56. Junchi Yan, Xu-Cheng Yin, Weiyao Lin, Cheng Deng, Hongyuan Zha, and Xiaokang Yang. 2016. A short survey of recent advances in graph matching. In ICMR. 167--174. Google ScholarGoogle ScholarDigital LibraryDigital Library
  57. Yixing Yang, Yixiang Fang, Xuemin Lin, and Wenjie Zhang. 2020. Effective and Efficient Truss Computation over Large Heterogeneous Information Networks. In ICDE. 901--912. Google ScholarGoogle ScholarDigital LibraryDigital Library
  58. Yixing Yang, Yixiang Fang, Maria E. Orlowska, Wenjie Zhang, Xuemin Lin. [n.d.]. Efficient Bi-triangle Counting for Large Bipartite Networks (Technical report). http://www.cse.unsw.edu.au/~z5151364/technique_report.pdf.Google ScholarGoogle Scholar
  59. Zhaonian Zou. 2016. Bitruss decomposition of bipartite graphs. In International Conference on Database Systems for Advanced Applications. Springer, 218--233. Google ScholarGoogle ScholarDigital LibraryDigital Library

Recommendations

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Sign in

Full Access

  • Published in

    cover image Proceedings of the VLDB Endowment
    Proceedings of the VLDB Endowment  Volume 14, Issue 6
    February 2021
    261 pages
    ISSN:2150-8097
    Issue’s Table of Contents

    Publisher

    VLDB Endowment

    Publication History

    • Published: 1 February 2021
    Published in pvldb Volume 14, Issue 6

    Qualifiers

    • research-article

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader