skip to main content
article

Streaming algorithms for k-core decomposition

Authors Info & Claims
Published:01 April 2013Publication History
Skip Abstract Section

Abstract

A k-core of a graph is a maximal connected subgraph in which every vertex is connected to at least k vertices in the subgraph. k-core decomposition is often used in large-scale network analysis, such as community detection, protein function prediction, visualization, and solving NP-Hard problems on real networks efficiently, like maximal clique finding. In many real-world applications, networks change over time. As a result, it is essential to develop efficient incremental algorithms for streaming graph data. In this paper, we propose the first incremental k-core decomposition algorithms for streaming graph data. These algorithms locate a small subgraph that is guaranteed to contain the list of vertices whose maximum k-core values have to be updated, and efficiently process this subgraph to update the k-core decomposition. Our results show a significant reduction in run-time compared to non-incremental alternatives. We show the efficiency of our algorithms on different types of real and synthetic graphs, at different scales. For a graph of 16 million vertices, we observe speedups reaching a million times, relative to the non-incremental algorithms.

References

  1. J. I. Alvarez-Hamelin, L. Dall'Asta, A. Barrat, and A. Vespignani. k-core decomposition: A tool for the visualization of large scale networks. The Computing Research Repository (CoRR), abs/cs/0504107, 2005.Google ScholarGoogle Scholar
  2. R. Andersen and K. Chellapilla. Finding dense subgraphs with size bounds. In Workshop on Algorithms and Models for the Web Graph (WAW), pages 25-37, 2009. Google ScholarGoogle Scholar
  3. G. D. Bader and C. W. V. Hogue. An automated method for finding molecular complexes in large protein interaction networks. BMC Bioinformatics, 4, 2003.Google ScholarGoogle Scholar
  4. B. Balasundaram, S. Butenko, and I. Hicks. Clique relaxations in social network analysis: The maximum k-plex problem. Operations Research, 59:133-142, 2011. Google ScholarGoogle Scholar
  5. A.-L. Barabási and R. Albert. Emergence of scaling in random networks. Science, 286(5439):509-512, 1999.Google ScholarGoogle Scholar
  6. V. Batagelj and M. Zaversnik. An O(m) algorithm for cores decomposition of networks. The Computing Research Repository (CoRR), cs.DS/0310049, 2003.Google ScholarGoogle Scholar
  7. M. Baur, M. Gaertler, R. Görke, M. Krug, and D. Wagner. Augmenting k-core generation with preferential attachment. Networks and Heterogeneous Media, 3(2):277-294, 2008.Google ScholarGoogle Scholar
  8. D. Chakrabarti, Y. Zhan, and C. Faloutsos. R-MAT: A recursive model for graph mining. In SIAM International Conference on Data Mining (SDM), 2004.Google ScholarGoogle Scholar
  9. J. Cheng, Y. Ke, S. Chu, and M. T. Ozsu. Efficient core decomposition in massive networks. In IEEE International Conference on Data Engineering (ICDE), pages 51-62, 2011. Google ScholarGoogle Scholar
  10. DIMACS. 10th DIMACS implementation challenge. http://www.cc.gatech.edu/dimacs10.Google ScholarGoogle Scholar
  11. S. N. Dorogovtsev, A. V. Goltsev, and J. F. F. Mendes. k-core organization of complex networks. Physical Review Letters, 96, 2006.Google ScholarGoogle Scholar
  12. Y. Dourisboure, F. Geraci, and M. Pellegrini. Extraction and classification of dense communities in the web. In World Wide Web Conference (WWW), pages 461-470, 2007. Google ScholarGoogle Scholar
  13. P. Erdös and A. Rényi. On the evolution of random graphs. In Institute of Mathematics, Hungarian Academy of Sciences, pages 17-61, 1960.Google ScholarGoogle Scholar
  14. S. Fortunato. Community detection in graphs. Physics Reports, 483(3-5):75-174, 2009.Google ScholarGoogle Scholar
  15. M. Gaertler. Dynamic analysis of the autonomous system graph. In International Workshop on Inter-domain Performance and Simulation (IPS), pages 13-24, 2004.Google ScholarGoogle Scholar
  16. C. Giatsidis, D. M. Thilikos, and M. Vazirgiannis. D-cores: Measuring collaboration of directed graphs based on degeneracy. In IEEE International Conference on Data Mining (ICDM), pages 201-210, 2011. Google ScholarGoogle Scholar
  17. C. Giatsidis, D. M. Thilikos, and M. Vazirgiannis. Evaluating cooperation in communities with the k-core structure. In International Conference on Advances in Social Network Analysis and Mining (ASONAM), pages 87-93, 2011. Google ScholarGoogle Scholar
  18. J. Healy, J. Janssen, E. Milios, and W. Aiello. Characterization of graphs using degree cores. In Workshop on Algorithms and Models for the Web Graph (WAW), pages 137-148, 2006. Google ScholarGoogle Scholar
  19. G. Kortsarz and D. Peleg. Generating sparse 2-spanners. Journal of Algorithms, 17(2):222-236, 1994. Google ScholarGoogle Scholar
  20. R.-H. Li and J. X. Yu. Efficient core maintenance in large dynamic graphs. CoRR, abs/1207.4567, 2012.Google ScholarGoogle Scholar
  21. T. Luczak. Size and connectivity of the k-core of a random graph. Discrete Math, 91(1):61-68, 1991. Google ScholarGoogle Scholar
  22. A. A. Nanavati, G. Siva, G. Das, D. Chakraborty, K. Dasgupta, S. Mukherjea, and A. Joshi. On the structural properties of massive telecom call graphs: findings and implications. In ACM International Conference on Information and Knowledge Management (CIKM), pages 435-444, 2006. Google ScholarGoogle Scholar
  23. F. Ozgul, Z. Erdem, C. Bowerman, and C. Atzenbeck. Comparison of feature-based criminal network detection models with k-core and n-clique. In International Conference on Advances in Social Network Analysis and Mining (ASONAM), pages 400-401, 2010. Google ScholarGoogle Scholar
  24. H. Saito, M. Toyoda, M. Kitsuregawa, and K. Aihara. A large-scale study of link spam detection by graph algorithms. In International Workshop on Adversarial Information Retrieval on the Web (AIRWeb), pages 45-48, 2007. Google ScholarGoogle Scholar
  25. R. Samudrala and J. Moult. A graph-theoretic algorithm for comparative modeling of protein structure. Journal of Molecular Biology, 279(1):287-302, 1998.Google ScholarGoogle Scholar
  26. S. B. Seidman. Network structure and minimum degree. Social Networks, 5(3):269-287, 1983.Google ScholarGoogle Scholar
  27. SNAP. Stanford network analysis package. http://snap.stanford.edu/snap.Google ScholarGoogle Scholar
  28. D. Turaga, H. Andrade, B. Gedik, C. Venkatramani, O. Verscheure, J. D. Harris, J. Cox, W. Szewczyk, and P. Jones. Design principles for developing stream processing applications. Software: Practice & Experience, 40(12):1073-1104, 2010. Google ScholarGoogle Scholar
  29. A. Verma and S. Butenko. Network clustering via clique relaxations: A community based approach. 10th DIMACS Implementation Challenge, 2011.Google ScholarGoogle Scholar
  30. S. Wuchty and E. Almaas. Peeling the yeast protein network. PROTEOMICS, 5(2):444-449, 2005.Google ScholarGoogle Scholar
  31. Y. Zhang and S. Parthasarathy. Extracting analyzing and visualizing triangle k-core motifs within networks. In IEEE International Conference on Data Engineering (ICDE), pages 1049-1060, 2012. Google ScholarGoogle Scholar

Index Terms

  1. Streaming algorithms for k-core decomposition
    Index terms have been assigned to the content through auto-classification.

    Recommendations

    Comments

    Login options

    Check if you have access through your login credentials or your institution to get full access on this article.

    Sign in

    Full Access

    • Published in

      cover image Proceedings of the VLDB Endowment
      Proceedings of the VLDB Endowment  Volume 6, Issue 6
      April 2013
      144 pages

      Publisher

      VLDB Endowment

      Publication History

      • Published: 1 April 2013
      Published in pvldb Volume 6, Issue 6

      Qualifiers

      • article

    PDF Format

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader