Abstract
A bipartite network is a network with two disjoint vertex sets and its edges only exist between vertices from different sets. It has received much interest since it can be used to model the relationship between two different sets of objects in many applications (e.g., the relationship between users and items in E-commerce). In this paper, we study the problem of efficient bi-triangle counting for a large bipartite network, where a bi-triangle is a cycle with three vertices from one vertex set and three vertices from another vertex set. Counting bi-triangles has found many real applications such as computing the transitivity coefficient and clustering coefficient for bipartite networks. To enable efficient bi-triangle counting, we first develop a baseline algorithm relying on the observation that each bi-triangle can be considered as the join of three wedges. Then, we propose a more sophisticated algorithm which regards a bi-triangle as the join of two super-wedges, where a wedge is a path with two edges while a super-wedge is a path with three edges. We further optimize the algorithm by ranking vertices according to their degrees. We have performed extensive experiments on both real and synthetic bipartite networks, where the largest one contains more than one billion edges, and the results show that the proposed solutions are up to five orders of magnitude faster than the baseline method.
- Nesreen K. Ahmed, Jennifer Neville, Ryan A. Rossi, and Nick G. Duffield. 2015. Efficient Graphlet Counting for Large Networks. In ICDM 2015, Atlantic City, NJ, USA, November 14-17, 2015. 1--10. Google ScholarDigital Library
- Sinan Aksoy, Tamara G. Kolda, and Ali Pinar. 2017. Measuring and modeling bipartite graphs with community structure. J. Complex Networks 5, 4 (2017).Google ScholarCross Ref
- Michael J. Barber. 2007. Modularity and community detection in bipartite networks. Phys. Rev. E 76 (2007).Google Scholar
- Etienne Birmelé. 2009. A scale-free graph model based on bipartite graphs. Discrete Applied Mathematics 157, 10 (2009), 2267--2284. Networks in Computational Biology. Google ScholarDigital Library
- Andreas Björklund, Petteri Kaski, and Lukasz Kowalik. 2017. Counting Thin Subgraphs via Packings Faster than Meet-in-the-Middle Time. ACM Trans. Algorithms 13, 4 (2017), 48:1--48:26. Google ScholarDigital Library
- Marco Bressan. 2019. Faster Subgraph Counting in Sparse Graphs. In IPEC.Google Scholar
- Norishige Chiba and Takao Nishizeki. 1985. Arboricity and Subgraph Listing Algorithms. SIAM J. Comput. 14 (1985), 210--223. Google ScholarDigital Library
- Luigi P. Cordella, Pasquale Foggia, Carlo Sansone, and Mario Vento. 2004. A (Sub)Graph Isomorphism Algorithm for Matching Large Graphs. IEEE TPAMI 26, 10 (2004). Google ScholarDigital Library
- Jean-Pierre Eckmann and Elisha Moses. 2002. Curvature of co-links uncovers hidden thematic layers in the World Wide Web. PNAS (2002).Google Scholar
- Yixiang Fang, Xin Huang, Lu Qin, Ying Zhang, Wenjie Zhang, Reynold Cheng, and Xuemin Lin. 2020. A survey of community search over big graphs. The VLDB Journal (2020), 353--392.Google Scholar
- Yixiang Fang, Yixing Yang, Wenjie Zhang, Xuemin Lin, and Xin Cao. 2020. Effective and Efficient Community Search over Large Heterogeneous Information Networks. Proc. VLDB Endow. 13, 6 (2020), 854--867. Google ScholarDigital Library
- Yixiang Fang, Kaiqiang Yu, Reynold Cheng, Laks V.S. Lakshmanan, and Xuemin Lin. 2019. Efficient algorithms for densest subgraph discovery. PVLDB 12, 11 (2019), 1719--1732. Google ScholarDigital Library
- Peter Floderus, Mirosław Kowaluk, Andrzej Lingas, and Eva-Marta Lundell. 2013. Detecting and Counting Small Pattern Graphs. In Algorithms and Computation. Springer Berlin Heidelberg, 547--557.Google Scholar
- J. Flum and M. Grohe. 2002. The parameterized complexity of counting problems. In The 43rd Annual IEEE Symposium on Foundations of Computer Science, 2002. Proceedings. 538--547. Google ScholarDigital Library
- P. Frankl and V. Rödl. 1985. Near Perfect Coverings in Graphs and Hypergraphs. European Journal of Combinatorics 6, 4 (1985), 317 -- 326.Google ScholarCross Ref
- R. L. Graham. 1966. Bounds for certain multiprocessing anomalies. The Bell System Technical Journal 45, 9 (1966), 1563--1581.Google ScholarCross Ref
- Oded Green, Pavan Yalamanchili, and Lluís-Miquel Munguía. [n.d.]. Fast triangle counting on the GPU. In IA3 2014. 1--8. Google ScholarDigital Library
- Thomas R. Halford and Keith M. Chugg. 2006. An algorithm for counting short cycles in bipartite graphs. IEEE Trans. Inf. Theory 52, 1 (2006), 287--292. Google ScholarDigital Library
- Wook-Shin Han, Jinsoo Lee, and Jeong-Hoon Lee. 2013. Turboiso: towards ultrafast and robust subgraph isomorphism search in large graph databases. In Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2013, New York, NY, USA, June 22-27, 2013. 337--348. Google ScholarDigital Library
- Jiafeng Hu, Reynold Cheng, Kevin Chen-Chuan Chang, Aravind Sankar, Yixiang Fang, and Brian YH Lam. 2019. Discovering maximal motif cliques in large heterogeneous information networks. In ICDE. IEEE, 746--757.Google Scholar
- Xin Huang, Hong Cheng, Lu Qin, Wentao Tian, and Jeffrey Xu Yu. 2014. Querying k-truss community in large and dynamic graphs. In SIGMOD. Google ScholarDigital Library
- Xin Huang and Laks V. S. Lakshmanan. 2017. Attribute-Driven Community Search. Proc. VLDB Endow. 10, 9 (2017), 949--960. Google ScholarDigital Library
- Xin Huang, Wei Lu, and Laks V. S. Lakshmanan. 2016. Truss Decomposition of Probabilistic Graphs: Semantics and Algorithms. In SIGMOD. 77--90. Google ScholarDigital Library
- Madhav Jha, C. Seshadhri, and Ali Pinar. 2013. A space efficient streaming algorithm for triangle counting using the birthday paradox. In SIGKDD. Google ScholarDigital Library
- Madhav Jha, C. Seshadhri, and Ali Pinar. 2015. Path Sampling: A Fast and Provable Method for Estimating 4-Vertex Subgraph Counts. In WWW. 495--505. Google ScholarDigital Library
- Hyeonji Kim, Juneyoung Lee, Sourav S. Bhowmick, Wook-Shin Han, Jeong-Hoon Lee, Seongyun Ko, and Moath H. A. Jarrah. 2016. DUALSIM: Parallel Subgraph Enumeration in a Massive Graph on a Single Machine. In SIGMOD, Fatma Özcan, Georgia Koutrika, and Sam Madden (Eds.). Google ScholarDigital Library
- Longbin Lai, Lu Qin, Xuemin Lin, and Lijun Chang. 2017. Scalable subgraph enumeration in MapReduce: a cost-oriented approach. VLDB J. 26, 3 (2017). Google ScholarDigital Library
- Longbin Lai, Zhu Qing, Zhengyi Yang, Xin Jin, Zhengmin Lai, Ran Wang, Kongzhang Hao, Xuemin Lin, Lu Qin, Wenjie Zhang, et al. 2019. Distributed subgraph matching on timely dataflow. PVLDB (2019). Google ScholarDigital Library
- Pedro Lind, Marta C. Gonzalez, and Hans Herrmann. 2005. Cycles and clustering in bipartite networks. PRE (2005).Google Scholar
- Boge Liu, Long Yuan, Xuemin Lin, Lu Qin, Wenjie Zhang, and Jingren Zhou. 2019. Efficient (a, bgr)-core Computation: an Index-based Approach. In WWW, Ling Liu, Ryen W. White, Amin Mantrach, Fabrizio Silvestri, Julian J. McAuley, Ricardo Baeza-Yates, and Leila Zia (Eds.). Google ScholarDigital Library
- X. Liu and T. Murata. 2009. Community Detection in Large-Scale Bipartite Networks. In 2009 IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology. Google ScholarDigital Library
- Chenhao Ma, Reynold Cheng, Laks V.S. Lakshmanan, Tobias Grubenmann, Yixiang Fang, and Xiaodong Li. 2019. LINC: a motif counting algorithm for uncertain graphs. PVLDB 13, 2 (2019), 155--168. Google ScholarDigital Library
- Ali Masoudi-Nejad, Falk Schreiber, and Zahra Razaghi Moghadam Kashani. 2012. Building blocks of biological networks : a review on major network motif discovery algorithms. IET Systems Biology 6, 5 (2012), 164--174.Google ScholarCross Ref
- Nathalie Del Vecchio Matthieu Latapy, Clemence Magnien. 2008. Basic notions for the analysis of large two-mode networks. Social Networks 30, 1 (2008).Google Scholar
- M. E. J. Newman. 2003. The structure and function of complex networks. SIAM REVIEW 45, 2 (06 2003), 167--256.Google Scholar
- M. E. J. Newman. 2005. Approximating Clustering Coefficient and Transitivity. Journal of Graph Algorithms and Applications 9, 2 (2005), 265--275.Google ScholarCross Ref
- Tore Opsahl. 2013. Triadic closure in two-mode networks: Redefining the global and local clustering coefficients. Soc. Networks 35, 2 (2013), 159--167.Google ScholarCross Ref
- Roger Pearce. 2017. Triangle counting for scale-free graphs at scale in distributed memory. In HPEC. 1--4.Google Scholar
- Ali Pinar, C. Seshadhri, and Vaidyanathan Vishal. 2017. ESCAPE: Efficiently Counting All 5-Vertex Subgraphs. In WWW. 1431--1440. Google ScholarDigital Library
- Pedro Manuel Pinto Ribeiro, Pedro Paredes, Miguel E.P. Silva, David Oliveira Aparício, and Fernando M. A. Silva. 2019. A Survey on Subgraph Counting: Concepts, Algorithms and Applications to Network Motifs and Graphlets. ArXiv abs/1910.13011 (2019).Google Scholar
- Garry Robins and Malcolm Alexander. 2004. Small Worlds Among Interlocking Directors: Network Structure and Distance in Bipartite Graphs. Comput. Math. Organ. Theory 10, 1 (2004), 69--94. Google ScholarDigital Library
- G. Robins and M. Alexander. 2004. Small Worlds Among Interlocking Directors: Network Structure and Distance in Bipartite Graphs. Computational and Mathematical Organization Theory 10 (2004), 69--94. Google ScholarDigital Library
- Seyed-Vahid Sanei-Mehri, Ahmet Erdem Sariyüce, and Srikanta Tirthapura. 2018. Butterfly Counting in Bipartite Networks. In KDD. 2150--2159. Google ScholarDigital Library
- Ahmet Erdem Sariyüce and Ali Pinar. 2018. Peeling Bipartite Networks for Dense Subgraph Discovery. In WSDM. 504--512. Google ScholarDigital Library
- C. Seshadhri, Ali Pinar, and Tamara G. Kolda. 2012. Fast Triangle Counting through Wedge Sampling. CoRR abs/1202.5230 (2012).Google Scholar
- Haichuan Shang, Ying Zhang, Xuemin Lin, and Jeffrey Xu Yu. 2008. Taming verification hardness: an efficient algorithm for testing subgraph isomorphism. Proc. VLDB Endow. 1, 1 (2008),364--375. Google ScholarDigital Library
- Katherine Faust Stanley Wasserman. [n.d.]. Social Network Analysis: Methods and Applications.Google Scholar
- C. E. Tsourakakis, P. Drineas, E. Michelakis, I. Koutis, and C. Faloutsos. 2011. Spectral counting of triangles via element-wise sparsification and triangle-based link recommendation. SNAM 1, 2 (2011), 75--81.Google Scholar
- J. R. Ullmann. 1976. An algorithm for subgraph isomorphism. JOURNAL OF THE ACM 28, 1 (1976), 31--42. Google ScholarDigital Library
- Kai Wang, Xin Cao, Xuemin Lin, Wenjie Zhang, and Lu Qin. 2018. Efficient computing of radius-bounded k-cores. In 2018 IEEE 34th International Conference on Data Engineering (ICDE). IEEE, 233--244.Google ScholarCross Ref
- Kai Wang, Xuemin Lin, Lu Qin, Wenjie Zhang, and Ying Zhang. 2019. Vertex Priority Based Butterfly Counting for Large-scale Bipartite Networks. Proc. VLDB Endow. 12, 10 (2019), 1139--1152. Google ScholarDigital Library
- Kai Wang, Xuemin Lin, Lu Qin, Wenjie Zhang, and Ying Zhang. 2020. Efficient bitruss decomposition for large-scale bipartite graphs. In ICDE. IEEE, 661--672.Google Scholar
- Kai Wang, Xuemin Lin, Lu Qin, Wenjie Zhang, and Ying Zhang. 2020. Efficient Bitruss Decomposition for Large-scale Bipartite Graphs. In ICDE 2020. 661--672.Google Scholar
- Pinghui Wang, Junzhou Zhao, Xiangliang Zhang, Zhenguo Li, Jiefeng Cheng, John C. S. Lui, Don Towsley, Jing Tao, and Xiaohong Guan. 2018. MOSS-5: A Fast Method of Approximating Counts of 5-Node Graphlets in Large Graphs. IEEE Trans. Knowl. Data Eng. 30, 1 (2018), 73--86.Google ScholarCross Ref
- Taotao Wang and Soung Chang Liew. 2014. Joint Channel Estimation and Channel Decoding in Physical-Layer Network Coding Systems: An EM-BP Factor Graph Framework. IEEE Trans. Wireless Communications 13, 4 (2014), 2229--2245.Google ScholarCross Ref
- Junchi Yan, Xu-Cheng Yin, Weiyao Lin, Cheng Deng, Hongyuan Zha, and Xiaokang Yang. 2016. A short survey of recent advances in graph matching. In ICMR. 167--174. Google ScholarDigital Library
- Yixing Yang, Yixiang Fang, Xuemin Lin, and Wenjie Zhang. 2020. Effective and Efficient Truss Computation over Large Heterogeneous Information Networks. In ICDE. 901--912. Google ScholarDigital Library
- Yixing Yang, Yixiang Fang, Maria E. Orlowska, Wenjie Zhang, Xuemin Lin. [n.d.]. Efficient Bi-triangle Counting for Large Bipartite Networks (Technical report). http://www.cse.unsw.edu.au/~z5151364/technique_report.pdf.Google Scholar
- Zhaonian Zou. 2016. Bitruss decomposition of bipartite graphs. In International Conference on Database Systems for Advanced Applications. Springer, 218--233. Google ScholarDigital Library
Recommendations
Bipartite subgraphs of triangle-free subcubic graphs
Suppose G is a graph with n vertices and m edges. Let n^' be the maximum number of vertices in an induced bipartite subgraph of G and let m^' be the maximum number of edges in a spanning bipartite subgraph of G. Then b(G)=m^'/m is called the bipartite ...
Triangle-free subcubic graphs with minimum bipartite density
A graph is subcubic if its maximum degree is at most 3. The bipartite density of a graph G is max{@e(H)/@e(G):H is a bipartite subgraph of G}, where @e(H) and @e(G) denote the numbers of edges in H and G, respectively. It is an NP-hard problem to ...
Bipartite density of triangle-free subcubic graphs
A graph is subcubic if its maximum degree is at most 3. The bipartite density of a graph G is defined as b(G)=max{|E(B)|/|E(G)|:B is a bipartite subgraph of G}. It was conjectured by Bondy and Locke that if G is a triangle-free subcubic graph, then b(G)>...
Comments