Abstract
A k-core of a graph is a maximal connected subgraph in which every vertex is connected to at least k vertices in the subgraph. k-core decomposition is often used in large-scale network analysis, such as community detection, protein function prediction, visualization, and solving NP-Hard problems on real networks efficiently, like maximal clique finding. In many real-world applications, networks change over time. As a result, it is essential to develop efficient incremental algorithms for streaming graph data. In this paper, we propose the first incremental k-core decomposition algorithms for streaming graph data. These algorithms locate a small subgraph that is guaranteed to contain the list of vertices whose maximum k-core values have to be updated, and efficiently process this subgraph to update the k-core decomposition. Our results show a significant reduction in run-time compared to non-incremental alternatives. We show the efficiency of our algorithms on different types of real and synthetic graphs, at different scales. For a graph of 16 million vertices, we observe speedups reaching a million times, relative to the non-incremental algorithms.
- J. I. Alvarez-Hamelin, L. Dall'Asta, A. Barrat, and A. Vespignani. k-core decomposition: A tool for the visualization of large scale networks. The Computing Research Repository (CoRR), abs/cs/0504107, 2005.Google Scholar
- R. Andersen and K. Chellapilla. Finding dense subgraphs with size bounds. In Workshop on Algorithms and Models for the Web Graph (WAW), pages 25-37, 2009. Google Scholar
- G. D. Bader and C. W. V. Hogue. An automated method for finding molecular complexes in large protein interaction networks. BMC Bioinformatics, 4, 2003.Google Scholar
- B. Balasundaram, S. Butenko, and I. Hicks. Clique relaxations in social network analysis: The maximum k-plex problem. Operations Research, 59:133-142, 2011. Google Scholar
- A.-L. Barabási and R. Albert. Emergence of scaling in random networks. Science, 286(5439):509-512, 1999.Google Scholar
- V. Batagelj and M. Zaversnik. An O(m) algorithm for cores decomposition of networks. The Computing Research Repository (CoRR), cs.DS/0310049, 2003.Google Scholar
- M. Baur, M. Gaertler, R. Görke, M. Krug, and D. Wagner. Augmenting k-core generation with preferential attachment. Networks and Heterogeneous Media, 3(2):277-294, 2008.Google Scholar
- D. Chakrabarti, Y. Zhan, and C. Faloutsos. R-MAT: A recursive model for graph mining. In SIAM International Conference on Data Mining (SDM), 2004.Google Scholar
- J. Cheng, Y. Ke, S. Chu, and M. T. Ozsu. Efficient core decomposition in massive networks. In IEEE International Conference on Data Engineering (ICDE), pages 51-62, 2011. Google Scholar
- DIMACS. 10th DIMACS implementation challenge. http://www.cc.gatech.edu/dimacs10.Google Scholar
- S. N. Dorogovtsev, A. V. Goltsev, and J. F. F. Mendes. k-core organization of complex networks. Physical Review Letters, 96, 2006.Google Scholar
- Y. Dourisboure, F. Geraci, and M. Pellegrini. Extraction and classification of dense communities in the web. In World Wide Web Conference (WWW), pages 461-470, 2007. Google Scholar
- P. Erdös and A. Rényi. On the evolution of random graphs. In Institute of Mathematics, Hungarian Academy of Sciences, pages 17-61, 1960.Google Scholar
- S. Fortunato. Community detection in graphs. Physics Reports, 483(3-5):75-174, 2009.Google Scholar
- M. Gaertler. Dynamic analysis of the autonomous system graph. In International Workshop on Inter-domain Performance and Simulation (IPS), pages 13-24, 2004.Google Scholar
- C. Giatsidis, D. M. Thilikos, and M. Vazirgiannis. D-cores: Measuring collaboration of directed graphs based on degeneracy. In IEEE International Conference on Data Mining (ICDM), pages 201-210, 2011. Google Scholar
- C. Giatsidis, D. M. Thilikos, and M. Vazirgiannis. Evaluating cooperation in communities with the k-core structure. In International Conference on Advances in Social Network Analysis and Mining (ASONAM), pages 87-93, 2011. Google Scholar
- J. Healy, J. Janssen, E. Milios, and W. Aiello. Characterization of graphs using degree cores. In Workshop on Algorithms and Models for the Web Graph (WAW), pages 137-148, 2006. Google Scholar
- G. Kortsarz and D. Peleg. Generating sparse 2-spanners. Journal of Algorithms, 17(2):222-236, 1994. Google Scholar
- R.-H. Li and J. X. Yu. Efficient core maintenance in large dynamic graphs. CoRR, abs/1207.4567, 2012.Google Scholar
- T. Luczak. Size and connectivity of the k-core of a random graph. Discrete Math, 91(1):61-68, 1991. Google Scholar
- A. A. Nanavati, G. Siva, G. Das, D. Chakraborty, K. Dasgupta, S. Mukherjea, and A. Joshi. On the structural properties of massive telecom call graphs: findings and implications. In ACM International Conference on Information and Knowledge Management (CIKM), pages 435-444, 2006. Google Scholar
- F. Ozgul, Z. Erdem, C. Bowerman, and C. Atzenbeck. Comparison of feature-based criminal network detection models with k-core and n-clique. In International Conference on Advances in Social Network Analysis and Mining (ASONAM), pages 400-401, 2010. Google Scholar
- H. Saito, M. Toyoda, M. Kitsuregawa, and K. Aihara. A large-scale study of link spam detection by graph algorithms. In International Workshop on Adversarial Information Retrieval on the Web (AIRWeb), pages 45-48, 2007. Google Scholar
- R. Samudrala and J. Moult. A graph-theoretic algorithm for comparative modeling of protein structure. Journal of Molecular Biology, 279(1):287-302, 1998.Google Scholar
- S. B. Seidman. Network structure and minimum degree. Social Networks, 5(3):269-287, 1983.Google Scholar
- SNAP. Stanford network analysis package. http://snap.stanford.edu/snap.Google Scholar
- D. Turaga, H. Andrade, B. Gedik, C. Venkatramani, O. Verscheure, J. D. Harris, J. Cox, W. Szewczyk, and P. Jones. Design principles for developing stream processing applications. Software: Practice & Experience, 40(12):1073-1104, 2010. Google Scholar
- A. Verma and S. Butenko. Network clustering via clique relaxations: A community based approach. 10th DIMACS Implementation Challenge, 2011.Google Scholar
- S. Wuchty and E. Almaas. Peeling the yeast protein network. PROTEOMICS, 5(2):444-449, 2005.Google Scholar
- Y. Zhang and S. Parthasarathy. Extracting analyzing and visualizing triangle k-core motifs within networks. In IEEE International Conference on Data Engineering (ICDE), pages 1049-1060, 2012. Google Scholar
Index Terms
- Streaming algorithms for k-core decomposition
Recommendations
Incremental k-core decomposition: algorithms and evaluation
A k-core of a graph is a maximal connected subgraph in which every vertex is connected to at least k vertices in the subgraph. k-core decomposition is often used in large-scale network analysis, such as community detection, protein function prediction, ...
Streaming algorithms for 2-coloring uniform hypergraphs
WADS'11: Proceedings of the 12th international conference on Algorithms and data structuresWe consider the problem of two-coloring n-uniform hypergraphs. It is known that any such hypergraph with at most 1/10√n/ln n 2n hyperedges can be two-colored [7]. In fact, there is an efficient (requiring polynomial time in the size of the input) ...
Comments