skip to main content
research-article

K-core decomposition of large networks on a single PC

Published:01 September 2015Publication History
Skip Abstract Section

Abstract

Studying the topology of a network is critical to inferring underlying dynamics such as tolerance to failure, group behavior and spreading patterns. k-core decomposition is a well-established metric which partitions a graph into layers from external to more central vertices. In this paper we aim to explore whether k-core decomposition of large networks can be computed using a consumer-grade PC. We feature implementations of the "vertex-centric" distributed protocol introduced by Montresor, De Pellegrini and Miorandi on GraphChi and Webgraph. Also, we present an accurate implementation of the Batagelj and Zaversnik algorithm for k-core decomposition in Webgraph. With our implementations, we show that we can efficiently handle networks of billions of edges using a single consumer-level machine within reasonable time and can produce excellent approximations in only a fraction of the execution time. To the best of our knowledge, our biggest graphs are considerably larger than the graphs considered in the literature. Next, we present an optimized implementation of an external-memory algorithm (EMcore) by Cheng, Ke, Chu, and Özsu. We show that this algorithm also performs well for large datasets, however, it cannot predict whether a given memory budget is sufficient for a new dataset. We present a thorough analysis of all algorithms concluding that it is viable to compute k-core decomposition for large networks in a consumer-grade PC.

References

  1. M. Altaf-Ul-Amin, Y. Shinbo, K. Mihara, K. Kurokawa, and S. Kanaya. Development and implementation of an algorithm for detection of protein complexes in large interaction networks. BMC bioinformatics, 7(1):207, 2006.Google ScholarGoogle ScholarCross RefCross Ref
  2. J. I. Alvarez-Hamelin, L. Dall'Asta, A. Barrat, and A. Vespignani. k-core decomposition of internet graphs: hierarchies, self-similarity and measurement biases. arXiv preprint cs/0511007, 2005.Google ScholarGoogle Scholar
  3. J. I. Alvarez-Hamelin, L. Dall'Asta, A. Barrat, and A. Vespignani. Large scale networks fingerprinting and visualization using the k-core decomposition. In Advances in neural information processing systems, pages 41--50, 2005.Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. L. Antiqueira, O. N. Oliveira, L. da Fontoura Costa, and M. d. G. V. Nunes. A complex network approach to text summarization. Information Sciences, 179(5):584--599, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. A.-L. Barabási and J. Frangos. Linked: the new science of networks science of networks. Basic Books, 2014.Google ScholarGoogle Scholar
  6. V. Batagelj and M. Zaversnik. An o (m) algorithm for cores decomposition of networks. arXiv preprint cs/0310049, 2003.Google ScholarGoogle Scholar
  7. D. P. Bertsekas and J. N. Tsitsiklis. Parallel and distributed computation: numerical methods. Prentice-Hall, Inc., 1989. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. K. Bhawalkar, J. Kleinberg, K. Lewi, T. Roughgarden, and A. Sharma. Preventing unraveling in social networks: the anchored k-core problem. In Automata, Languages, and Programming, pages 440--451. Springer, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. P. Boldi and S. Vigna. The webgraph framework i: compression techniques. In Proceedings of the 13th Int. Conference on World Wide Web, pages 595--602. ACM, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. F. Bonchi, F. Gullo, A. Kaltenbrunner, and Y. Volkovich. Core decomposition of uncertain graphs. In Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 1316--1325. ACM, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. J. Cheng, Y. Ke, S. Chu, and M. T. Özsu. Efficient core decomposition in massive networks. In Data Engineering (ICDE), 2011 IEEE 27th Int. Conference on, pages 51--62. IEEE, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. S. N. Dorogovtsev, A. V. Goltsev, and J. F. F. Mendes. K-core organization of complex networks. Physical review letters, 96(4), 2006.Google ScholarGoogle Scholar
  13. N. Du, B. Wu, X. Pei, B. Wang, and L. Xu. Community detection in large-scale social networks. In Proceedings of the 9th WebKDD workshop, pages 16--25. ACM, 2007.Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. S. Fortunato. Community detection in graphs. Physics Reports, 486(3):75--174, 2010.Google ScholarGoogle ScholarCross RefCross Ref
  15. A. V. Goltsev, S. N. Dorogovtsev, and J. Mendes. k-core (bootstrap) percolation on complex networks: Critical phenomena and nonlocal effects. Physical Review E, 73(5), 2006.Google ScholarGoogle ScholarCross RefCross Ref
  16. T. Gutierrez-Bunster, U. Stege, A. Thomo, and J. Taylor. How do biological networks differ from social networks? In Advances in Social Networks Analysis and Mining (ASONAM), 2014 IEEE/ACM Int. Conference on, pages 744--751. IEEE, 2014.Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. J. Healy, J. Janssen, E. Milios, and W. Aiello. Characterization of graphs using degree cores. In Algorithms and Models for the Web-Graph, pages 137--148. Springer, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. N. Korovaiko and A. Thomo. Trust prediction from user-item ratings. Social Network Analysis and Mining, 3(3):749--759, 2013.Google ScholarGoogle ScholarCross RefCross Ref
  19. A. Kyrola, G. E. Blelloch, and C. Guestrin. Graphchi: Large-scale graph computation on just a pc. In OSDI, volume 12, pages 31--46, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. V. E. Lee, N. Ruan, R. Jin, and C. Aggarwal. A survey of algorithms for dense subgraph discovery. In Managing and Mining Graph Data, pages 303--336. Springer, 2010.Google ScholarGoogle ScholarCross RefCross Ref
  21. X. Li, M. Wu, C.-K. Kwoh, and S.-K. Ng. Computational approaches for detecting protein complexes from protein interaction networks: a survey. BMC genomics, 11(Suppl 1):S3, 2010.Google ScholarGoogle ScholarCross RefCross Ref
  22. Y. Low, D. Bickson, J. Gonzalez, C. Guestrin, A. Kyrola, and J. M. Hellerstein. Distributed graphlab: a framework for machine learning and data mining in the cloud. Proceedings of the VLDB Endowment, 5(8):716--727, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. L. Lü and T. Zhou. Link prediction in complex networks: A survey. Physica A: Statistical Mechanics and its Applications, 390(6):1150--1170, 2011.Google ScholarGoogle ScholarCross RefCross Ref
  24. G. Malewicz, M. H. Austern, A. J. Bik, J. C. Dehnert, I. Horn, N. Leiser, and G. Czajkowski. Pregel: a system for large-scale graph processing. In Proceedings of the 2010 ACM SIGMOD Int. Conference on Management of data, pages 135--146. ACM, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. A. Montresor, F. De Pellegrini, and D. Miorandi. Distributed k-core decomposition. Parallel and Distributed Systems, IEEE Transactions on, 24(2):288--300, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. R. Pagh and F. F. Rodler. Cuckoo hashing. Journal of Algorithms, 51(2):122--144, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. S. Pandit, D. H. Chau, S. Wang, and C. Faloutsos. Netprobe: a fast and scalable system for fraud detection in online auction networks. In Proceedings of the 16th Int. conference on World Wide Web, pages 201--210. ACM, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. S. Papadopoulos, Y. Kompatsiaris, A. Vakali, and P. Spyridonos. Community detection in social media. Data Mining and Knowledge Discovery, 24(3):515--554, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. J. Pattillo, N. Youssef, and S. Butenko. On clique relaxation models in network analysis. European Journal of Operational Research, 226(1):9--18, 2013.Google ScholarGoogle ScholarCross RefCross Ref
  30. M. Á. Serrano, M. Boguná, and A. Vespignani. Extracting the multiscale backbone of complex weighted networks. Proceedings of the national academy of sciences, 106(16):6483--6488, 2009.Google ScholarGoogle ScholarCross RefCross Ref
  31. M. Thorup. String hashing for linear probing. In Proceedings of the twentieth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 655--664. Society for Industrial and Applied Mathematics, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. J. Ugander, L. Backstrom, C. Marlow, and J. Kleinberg. Structural diversity in social contagion. Proceedings of the National Academy of Sciences, 109(16):5962--5966, 2012.Google ScholarGoogle ScholarCross RefCross Ref
  33. J. Wang, M. Li, J. Chen, and Y. Pan. A fast hierarchical clustering algorithm for functional modules discovery in protein interaction networks. Computational Biology and Bioinformatics, IEEE/ACM Transactions on, 8(3):607--620, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. L. Weng, F. Menczer, and Y.-Y. Ahn. Virality prediction and community structure in social networks. Scientific reports, 3, 2013.Google ScholarGoogle Scholar
  35. T. Wolf, A. Schroter, D. Damian, L. D. Panjer, and T. H. Nguyen. Mining task-based social networks to explore collaboration in software teams. Software, IEEE, 26(1):58--66, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  36. W.-S. Yang and J.-B. Dia. Discovering cohesive subgroups from social networks for targeted advertising. Expert Systems with Applications, 34(3):2029--2038, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. K-core decomposition of large networks on a single PC
    Index terms have been assigned to the content through auto-classification.

    Recommendations

    Comments

    Login options

    Check if you have access through your login credentials or your institution to get full access on this article.

    Sign in

    Full Access

    • Published in

      cover image Proceedings of the VLDB Endowment
      Proceedings of the VLDB Endowment  Volume 9, Issue 1
      September 2015
      35 pages

      Publisher

      VLDB Endowment

      Publication History

      • Published: 1 September 2015
      Published in pvldb Volume 9, Issue 1

      Qualifiers

      • research-article

    PDF Format

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader