ABSTRACT
Finding dense bipartite subgraphs and detecting the relations among them is an important problem for affiliation networks that arise in a range of domains, such as social network analysis, word-document clustering, the science of science, internet advertising, and bioinformatics. However, most dense subgraph discovery algorithms are designed for classic, unipartite graphs. Subsequently, studies on affiliation networks are conducted on the co-occurrence graphs (e.g., co-author and co-purchase) that project the bipartite structure to a unipartite structure by connecting two entities if they share an affiliation. Despite their convenience, co-occurrence networks come at a cost of loss of information and an explosion in graph sizes, which limit the quality and the efficiency of solutions. We study the dense subgraph discovery problem on bipartite graphs. We define a framework of bipartite subgraphs based on the butterfly motif (2,2-biclique) to model the dense regions in a hierarchical structure. We introduce efficient peeling algorithms to find the dense subgraphs and build relations among them. We can identify denser structures compared to the state-of-the-art algorithms on co-occurrence graphs in real-world data. Our analyses on an author-paper network and a user-product network yield interesting subgraphs and hierarchical relations such as the groups of collaborators in the same institution and spammers that give fake ratings.
- 2016. IMDb. (2016). (www.imdb.com/interfaces).Google Scholar
- 2017. Konect network dataset. (2017). (http://www.discogs.com/).Google Scholar
- S. Aksoy, T. G. Kolda, and A. Pinar. 2017. Measuring and Modeling Bipartite Graphs with Community Structure. Journal of Complex Networks 5, 4 (2017), 581--603.Google ScholarCross Ref
- R. Alberich, J. Miro-Julia, and F Rosselló. 2002. Marvel Universe looks almost like a real social network. CoRR abs/cond-mat/0202174 (2002).Google Scholar
- Vladimir Batagelj and Matjaz Zaversnik. 2002. Generalized Cores. CoRR cs.DS/0202039 (2002).Google Scholar
- V. Batagelj and M. Zaversnik. 2003. An O(m) Algorithm for Cores Decomposition of Networks. CoRR cs/0310049 (2003).Google Scholar
- A. R. Benson, D. F. Gleich, and J. Leskovec. 2016. Higher-order organization of complex networks. Science 353, 6295 (2016), 163--166.Google Scholar
- A. Beutel, W. Xu, V. Guruswami, C. Palow, and C. Faloutsos. 2013. CopyCatch: Stopping Group Attacks by Spotting Lockstep Behavior in Social Networks. In Proceedings of the 22nd International Conference on World Wide Web (WWW '13). 119--130. Google ScholarDigital Library
- S. P. Borgatti and M. G. Everett. 1997. Network analysis of 2-mode data. Social Networks 19, 3 (1997), 243 -- 269.Google ScholarCross Ref
- Ü. V. Çatalyürek and C. Aykanat. 1999. Hypergraph-partitioning-based decomposition for parallel sparse-matrix vector multiplication. IEEE Transactions on Parallel and Distributed Systems 10, 7 (1999), 673--693. Google ScholarDigital Library
- M. Cerinsek and V. Batagelj. 2015. Generalized two-mode cores. Social Networks 42 (2015), 80 -- 87.Google ScholarCross Ref
- S. Chacon. 2009. The 2009 GitHub Contest. (2009). (github.com/blog/466-the-2009-github-contest).Google Scholar
- J. Chen and Y. Saad. 2012. Dense Subgraph Extraction with Application to Community Detection. IEEE Transactions on Knowledge & Data Engineering 24, 7 (2012), 1216--1230. Google ScholarDigital Library
- A. Clauset, E. Tucker, and M. Sainz. 2016. The Colorado Index of Complex Networks. (2016). (icon.colorado.edu).Google Scholar
- J. Cohen. 2008. Trusses: Cohesive subgraphs for social network analysis. National Security Agency Technical Report (2008).Google Scholar
- I.S. Dhillon. 2001. Co-clustering Documents and Words Using Bipartite Spectral Graph Partitioning. In Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '01). 269--274. Google ScholarDigital Library
- M.G. Everett and S.P. Borgatti. 2013. The dual-projection approach for two-mode networks. Social Networks 35, 2 (2013), 204 -- 210.Google ScholarCross Ref
- D. C. Fain and J. O. Pedersen. 2006. Sponsored search: A brief history. Bulletin of the American Society for Information Science and Technology 32, 2 (2006), 12--13.Google ScholarCross Ref
- G. Fei, A. Mukherjee, B. Liu, M. Hsu, M. Castellanos, and R Ghosh. 2013. Exploiting Burstiness in Reviews for Review Spammer Detection. In ICWSM. The AAAI Press.Google Scholar
- C. Giatsidis, D. M. Thilikos, and M. Vazirgiannis. 2011. Evaluating Cooperation in Communities with the k-Core Structure. In ASONAM. 87--93. Google ScholarDigital Library
- D. Gibson, R. Kumar, and A. Tomkins. 2005. Discovering Large Dense Subgraphs in Massive Graphs. In VLDB. 721--732. Google ScholarDigital Library
- E. Gregori, L. Lenzini, and C. Orsini. 2011. k-dense communities in the internet AS-level topology. In International Conf. on Communication Systems and Networks (COMSNETS). 1--10.Google Scholar
- X. Huang, H. Cheng, L. Qin, W. Tian, and J. X. Yu. 2014. Querying K-truss Community in Large and Dynamic Graphs. In Proc. of the ACM SIGMOD International Conf. on Management of Data. 1311--1322. Google ScholarDigital Library
- R. Kumar, P. Raghavan, S. Rajagopalan, and A. Tomkins. 1999. Trawling the Web for Emerging Cyber-communities. In WWW. 1481--1493. Google ScholarDigital Library
- Jerome Kunegis. 2017. Konect network dataset. (2017). (http://konect.uni-koblenz.de).Google Scholar
- M. Latapy, C. Magnien, and N. Del Vecchio. 2008. Basic notions for the analysis of large two-mode networks. Social Networks 30, 1 (2008), 31 -- 48.Google ScholarCross Ref
- Jure Leskovec and Andrej Krevl. 2014. SNAP Datasets: Stanford Large Network Dataset Collection. (June 2014). (snap.stanford.edu/data).Google Scholar
- Michael Ley. 2016. DBLP computer science bibliography. (Sept. 2016). (dblp.uni-trier.de).Google Scholar
- Y. Li, T. Kuboyama, and H. Sakamoto. 2013. Truss Decomposition for Extracting Communities in Bipartite Graph. In IMMM 2013 : The Third International Conference on Advances in Information Mining and Management.Google Scholar
- D. Matula and L. Beck. 1983. Smallest-last ordering and clustering and graph coloring algorithms. Journal of ACM 30, 3 (1983), 417--427. Google ScholarDigital Library
- M. Mitzenmacher, J. Pachocki, R. Peng, C. Tsourakakis, and S. C. Xu. 2015. Scalable Large Near-Clique Detection in Large-Scale Networks via Sampling. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '15). 815--824. Google ScholarDigital Library
- A. P. Mukherjee and S. Tirthapura. 2014. Enumerating Maximal Bicliques from a Large Graph Using MapReduce. In Proceedings of the 2014 IEEE International Congress on Big Data (BIGDATACONGRESS '14). 707--716. Google ScholarDigital Library
- M. E. J. Newman. 2001. Scientific collaboration networks. I. Network construction and fundamental results. Phys. Rev. E 64 (2001), 016131. Issue 1.Google ScholarCross Ref
- M. E. J. Newman. 2001. Scientific collaboration networks. II. Shortest paths, weighted networks, and centrality. Phys. Rev. E 64 (2001), 016132. Issue 1.Google ScholarCross Ref
- M. E. J. Newman. 2001. The structure of scientific collaboration networks. Proceedings of the National Academy of Sciences 98, 2 (2001), 404--409.Google ScholarCross Ref
- T. Opsahl. 2013. Triadic closure in two-mode networks: Redefining the global and local clustering coefficients. Social Networks 35, 2 (2013), 159 -- 167.Google ScholarCross Ref
- G. Robins and M. Alexander. 2004. Small Worlds Among Interlocking Directors: Network Structure and Distance in Bipartite Graphs. Computational & Mathematical Organization Theory 10, 1 (2004), 69--94. Google ScholarDigital Library
- K. Saito and T. Yamada. 2006. Extracting Communities from Complex Networks by the k-dense Method. In IEEE International Conf. on Data Mining Workshops, ICDMW. 300--304. Google ScholarDigital Library
- A. E. Sarıyüce and A. Pinar. 2016. Fast Hierarchy Construction for Dense Subgraphs. Proc. VLDB Endow. 10, 3 (Nov. 2016), 97--108. Google ScholarDigital Library
- A. E. Sarıyüce and A. Pinar. 2017. Peeling Bipartite Networks for Dense Subgraph Discovery. CoRR 1611.02756 (2017). (Extended version).Google Scholar
- A. E. Sarıyüce, C. Seshadhri, A. Pınar, and Ü. V. Çatalyürek. 2015. Finding the Hierarchy of Dense Subgraphs Using Nucleus Decompositions. In Proc. of the International Conf. on World Wide Web (WWW). 927--937. Google ScholarDigital Library
- S. B. Seidman. 1983. Network structure and minimum degree. Social Networks 5, 3 (1983), 269--287.Google ScholarCross Ref
- K. Sim, J. Li, V. Gopalkrishnan, and G. Liu. 2009. Mining maximal quasi-bicliques: Novel algorithm and applications in the stock market and protein networks. Statistical Analysis and Data Mining 2, 4 (2009), 255--273. Google ScholarDigital Library
- C. Tsourakakis. 2015. The K-clique Densest Subgraph Problem. In Proc. of the 24th International Conf. on World Wide Web (WWW '15). 1122--1132. Google ScholarDigital Library
- C. E. Tsourakakis, J. Pachocki, and M. Mitzenmacher. 2017. Scalable Motif-aware Graph Clustering. In Proceedings of the 26th International Conference on World Wide Web (WWW '17). 1451--1460. Google ScholarDigital Library
- A. Verma and S. Butenko. 2012. Network clustering via clique relaxations: A community based approach. In Graph Partitioning and Clustering, DIMACS Workshop. 129--140.Google Scholar
- M. M. Wolf, A. M. Klinvex, and D. M. Dunlavy. 2016. Advantages to Modeling Relational Data using Hypergraphs versus Graphs. In IEEE High Performance Extreme Computing Conference, HPEC.Google Scholar
- Y. Zhang and S. Parthasarathy. 2012. Extracting Analyzing and Visualizing Triangle K-Core Motifs Within Networks. In Proc. of the IEEE International Conf. on Data Engineering (ICDE). 1049--1060. Google ScholarDigital Library
Index Terms
- Peeling Bipartite Networks for Dense Subgraph Discovery
Recommendations
Near-optimal fully dynamic densest subgraph
STOC 2020: Proceedings of the 52nd Annual ACM SIGACT Symposium on Theory of ComputingWe give the first fully dynamic algorithm which maintains a (1−є)-approximate densest subgraph in worst-case time poly(logn, є−1) per update. Dense subgraph discovery is an important primitive for many real-world applications such as community detection,...
A forbidden subgraph characterization of line-polar bipartite graphs
A graph is polar if the vertex set can be partitioned into A and B in such a way that the subgraph induced by A is a complete multipartite graph and the subgraph induced by B is a disjoint union of cliques. Polar graphs are a common generalization of ...
Finding densest k-connected subgraphs
AbstractDense subgraph discovery is an important graph-mining primitive with a variety of real-world applications. One of the most well-studied optimization problems for dense subgraph discovery is the densest subgraph problem, where given an ...
Comments