skip to main content
10.1145/3159652.3159678acmconferencesArticle/Chapter ViewAbstractPublication PageswsdmConference Proceedingsconference-collections
research-article

Peeling Bipartite Networks for Dense Subgraph Discovery

Published:02 February 2018Publication History

ABSTRACT

Finding dense bipartite subgraphs and detecting the relations among them is an important problem for affiliation networks that arise in a range of domains, such as social network analysis, word-document clustering, the science of science, internet advertising, and bioinformatics. However, most dense subgraph discovery algorithms are designed for classic, unipartite graphs. Subsequently, studies on affiliation networks are conducted on the co-occurrence graphs (e.g., co-author and co-purchase) that project the bipartite structure to a unipartite structure by connecting two entities if they share an affiliation. Despite their convenience, co-occurrence networks come at a cost of loss of information and an explosion in graph sizes, which limit the quality and the efficiency of solutions. We study the dense subgraph discovery problem on bipartite graphs. We define a framework of bipartite subgraphs based on the butterfly motif (2,2-biclique) to model the dense regions in a hierarchical structure. We introduce efficient peeling algorithms to find the dense subgraphs and build relations among them. We can identify denser structures compared to the state-of-the-art algorithms on co-occurrence graphs in real-world data. Our analyses on an author-paper network and a user-product network yield interesting subgraphs and hierarchical relations such as the groups of collaborators in the same institution and spammers that give fake ratings.

References

  1. 2016. IMDb. (2016). (www.imdb.com/interfaces).Google ScholarGoogle Scholar
  2. 2017. Konect network dataset. (2017). (http://www.discogs.com/).Google ScholarGoogle Scholar
  3. S. Aksoy, T. G. Kolda, and A. Pinar. 2017. Measuring and Modeling Bipartite Graphs with Community Structure. Journal of Complex Networks 5, 4 (2017), 581--603.Google ScholarGoogle ScholarCross RefCross Ref
  4. R. Alberich, J. Miro-Julia, and F Rosselló. 2002. Marvel Universe looks almost like a real social network. CoRR abs/cond-mat/0202174 (2002).Google ScholarGoogle Scholar
  5. Vladimir Batagelj and Matjaz Zaversnik. 2002. Generalized Cores. CoRR cs.DS/0202039 (2002).Google ScholarGoogle Scholar
  6. V. Batagelj and M. Zaversnik. 2003. An O(m) Algorithm for Cores Decomposition of Networks. CoRR cs/0310049 (2003).Google ScholarGoogle Scholar
  7. A. R. Benson, D. F. Gleich, and J. Leskovec. 2016. Higher-order organization of complex networks. Science 353, 6295 (2016), 163--166.Google ScholarGoogle Scholar
  8. A. Beutel, W. Xu, V. Guruswami, C. Palow, and C. Faloutsos. 2013. CopyCatch: Stopping Group Attacks by Spotting Lockstep Behavior in Social Networks. In Proceedings of the 22nd International Conference on World Wide Web (WWW '13). 119--130. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. S. P. Borgatti and M. G. Everett. 1997. Network analysis of 2-mode data. Social Networks 19, 3 (1997), 243 -- 269.Google ScholarGoogle ScholarCross RefCross Ref
  10. Ü. V. Çatalyürek and C. Aykanat. 1999. Hypergraph-partitioning-based decomposition for parallel sparse-matrix vector multiplication. IEEE Transactions on Parallel and Distributed Systems 10, 7 (1999), 673--693. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. M. Cerinsek and V. Batagelj. 2015. Generalized two-mode cores. Social Networks 42 (2015), 80 -- 87.Google ScholarGoogle ScholarCross RefCross Ref
  12. S. Chacon. 2009. The 2009 GitHub Contest. (2009). (github.com/blog/466-the-2009-github-contest).Google ScholarGoogle Scholar
  13. J. Chen and Y. Saad. 2012. Dense Subgraph Extraction with Application to Community Detection. IEEE Transactions on Knowledge & Data Engineering 24, 7 (2012), 1216--1230. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. A. Clauset, E. Tucker, and M. Sainz. 2016. The Colorado Index of Complex Networks. (2016). (icon.colorado.edu).Google ScholarGoogle Scholar
  15. J. Cohen. 2008. Trusses: Cohesive subgraphs for social network analysis. National Security Agency Technical Report (2008).Google ScholarGoogle Scholar
  16. I.S. Dhillon. 2001. Co-clustering Documents and Words Using Bipartite Spectral Graph Partitioning. In Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '01). 269--274. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. M.G. Everett and S.P. Borgatti. 2013. The dual-projection approach for two-mode networks. Social Networks 35, 2 (2013), 204 -- 210.Google ScholarGoogle ScholarCross RefCross Ref
  18. D. C. Fain and J. O. Pedersen. 2006. Sponsored search: A brief history. Bulletin of the American Society for Information Science and Technology 32, 2 (2006), 12--13.Google ScholarGoogle ScholarCross RefCross Ref
  19. G. Fei, A. Mukherjee, B. Liu, M. Hsu, M. Castellanos, and R Ghosh. 2013. Exploiting Burstiness in Reviews for Review Spammer Detection. In ICWSM. The AAAI Press.Google ScholarGoogle Scholar
  20. C. Giatsidis, D. M. Thilikos, and M. Vazirgiannis. 2011. Evaluating Cooperation in Communities with the k-Core Structure. In ASONAM. 87--93. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. D. Gibson, R. Kumar, and A. Tomkins. 2005. Discovering Large Dense Subgraphs in Massive Graphs. In VLDB. 721--732. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. E. Gregori, L. Lenzini, and C. Orsini. 2011. k-dense communities in the internet AS-level topology. In International Conf. on Communication Systems and Networks (COMSNETS). 1--10.Google ScholarGoogle Scholar
  23. X. Huang, H. Cheng, L. Qin, W. Tian, and J. X. Yu. 2014. Querying K-truss Community in Large and Dynamic Graphs. In Proc. of the ACM SIGMOD International Conf. on Management of Data. 1311--1322. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. R. Kumar, P. Raghavan, S. Rajagopalan, and A. Tomkins. 1999. Trawling the Web for Emerging Cyber-communities. In WWW. 1481--1493. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. Jerome Kunegis. 2017. Konect network dataset. (2017). (http://konect.uni-koblenz.de).Google ScholarGoogle Scholar
  26. M. Latapy, C. Magnien, and N. Del Vecchio. 2008. Basic notions for the analysis of large two-mode networks. Social Networks 30, 1 (2008), 31 -- 48.Google ScholarGoogle ScholarCross RefCross Ref
  27. Jure Leskovec and Andrej Krevl. 2014. SNAP Datasets: Stanford Large Network Dataset Collection. (June 2014). (snap.stanford.edu/data).Google ScholarGoogle Scholar
  28. Michael Ley. 2016. DBLP computer science bibliography. (Sept. 2016). (dblp.uni-trier.de).Google ScholarGoogle Scholar
  29. Y. Li, T. Kuboyama, and H. Sakamoto. 2013. Truss Decomposition for Extracting Communities in Bipartite Graph. In IMMM 2013 : The Third International Conference on Advances in Information Mining and Management.Google ScholarGoogle Scholar
  30. D. Matula and L. Beck. 1983. Smallest-last ordering and clustering and graph coloring algorithms. Journal of ACM 30, 3 (1983), 417--427. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. M. Mitzenmacher, J. Pachocki, R. Peng, C. Tsourakakis, and S. C. Xu. 2015. Scalable Large Near-Clique Detection in Large-Scale Networks via Sampling. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '15). 815--824. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. A. P. Mukherjee and S. Tirthapura. 2014. Enumerating Maximal Bicliques from a Large Graph Using MapReduce. In Proceedings of the 2014 IEEE International Congress on Big Data (BIGDATACONGRESS '14). 707--716. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. M. E. J. Newman. 2001. Scientific collaboration networks. I. Network construction and fundamental results. Phys. Rev. E 64 (2001), 016131. Issue 1.Google ScholarGoogle ScholarCross RefCross Ref
  34. M. E. J. Newman. 2001. Scientific collaboration networks. II. Shortest paths, weighted networks, and centrality. Phys. Rev. E 64 (2001), 016132. Issue 1.Google ScholarGoogle ScholarCross RefCross Ref
  35. M. E. J. Newman. 2001. The structure of scientific collaboration networks. Proceedings of the National Academy of Sciences 98, 2 (2001), 404--409.Google ScholarGoogle ScholarCross RefCross Ref
  36. T. Opsahl. 2013. Triadic closure in two-mode networks: Redefining the global and local clustering coefficients. Social Networks 35, 2 (2013), 159 -- 167.Google ScholarGoogle ScholarCross RefCross Ref
  37. G. Robins and M. Alexander. 2004. Small Worlds Among Interlocking Directors: Network Structure and Distance in Bipartite Graphs. Computational & Mathematical Organization Theory 10, 1 (2004), 69--94. Google ScholarGoogle ScholarDigital LibraryDigital Library
  38. K. Saito and T. Yamada. 2006. Extracting Communities from Complex Networks by the k-dense Method. In IEEE International Conf. on Data Mining Workshops, ICDMW. 300--304. Google ScholarGoogle ScholarDigital LibraryDigital Library
  39. A. E. Sarıyüce and A. Pinar. 2016. Fast Hierarchy Construction for Dense Subgraphs. Proc. VLDB Endow. 10, 3 (Nov. 2016), 97--108. Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. A. E. Sarıyüce and A. Pinar. 2017. Peeling Bipartite Networks for Dense Subgraph Discovery. CoRR 1611.02756 (2017). (Extended version).Google ScholarGoogle Scholar
  41. A. E. Sarıyüce, C. Seshadhri, A. Pınar, and Ü. V. Çatalyürek. 2015. Finding the Hierarchy of Dense Subgraphs Using Nucleus Decompositions. In Proc. of the International Conf. on World Wide Web (WWW). 927--937. Google ScholarGoogle ScholarDigital LibraryDigital Library
  42. S. B. Seidman. 1983. Network structure and minimum degree. Social Networks 5, 3 (1983), 269--287.Google ScholarGoogle ScholarCross RefCross Ref
  43. K. Sim, J. Li, V. Gopalkrishnan, and G. Liu. 2009. Mining maximal quasi-bicliques: Novel algorithm and applications in the stock market and protein networks. Statistical Analysis and Data Mining 2, 4 (2009), 255--273. Google ScholarGoogle ScholarDigital LibraryDigital Library
  44. C. Tsourakakis. 2015. The K-clique Densest Subgraph Problem. In Proc. of the 24th International Conf. on World Wide Web (WWW '15). 1122--1132. Google ScholarGoogle ScholarDigital LibraryDigital Library
  45. C. E. Tsourakakis, J. Pachocki, and M. Mitzenmacher. 2017. Scalable Motif-aware Graph Clustering. In Proceedings of the 26th International Conference on World Wide Web (WWW '17). 1451--1460. Google ScholarGoogle ScholarDigital LibraryDigital Library
  46. A. Verma and S. Butenko. 2012. Network clustering via clique relaxations: A community based approach. In Graph Partitioning and Clustering, DIMACS Workshop. 129--140.Google ScholarGoogle Scholar
  47. M. M. Wolf, A. M. Klinvex, and D. M. Dunlavy. 2016. Advantages to Modeling Relational Data using Hypergraphs versus Graphs. In IEEE High Performance Extreme Computing Conference, HPEC.Google ScholarGoogle Scholar
  48. Y. Zhang and S. Parthasarathy. 2012. Extracting Analyzing and Visualizing Triangle K-Core Motifs Within Networks. In Proc. of the IEEE International Conf. on Data Engineering (ICDE). 1049--1060. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Peeling Bipartite Networks for Dense Subgraph Discovery

        Recommendations

        Comments

        Login options

        Check if you have access through your login credentials or your institution to get full access on this article.

        Sign in
        • Published in

          cover image ACM Conferences
          WSDM '18: Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining
          February 2018
          821 pages
          ISBN:9781450355810
          DOI:10.1145/3159652

          Copyright © 2018 ACM

          © 2018 Association for Computing Machinery. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of the United States government. As such, the United States Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 2 February 2018

          Permissions

          Request permissions about this article.

          Request Permissions

          Check for updates

          Qualifiers

          • research-article

          Acceptance Rates

          WSDM '18 Paper Acceptance Rate81of514submissions,16%Overall Acceptance Rate498of2,863submissions,17%

          Upcoming Conference

        PDF Format

        View or Download as a PDF file.

        PDF

        eReader

        View online with eReader.

        eReader