skip to main content
10.1145/2701126.2701157acmconferencesArticle/Chapter ViewAbstractPublication PagesicuimcConference Proceedingsconference-collections
research-article

Lossless graph summarization using dense subgraphs discovery

Authors Info & Claims
Published:08 January 2015Publication History

ABSTRACT

Dense subgraph discovery, in a large graph, is useful to solve the community search problem. Motivated from this, we propose a graph summarization method where we search and aggregate dense subgraphs into super nodes. Since the dense subgraphs have high overlap of common neighbors, thus merging such subgraphs can produce a highly compact summary graph. Whereas the member nodes of each dense subgraph have many edges in common, they also have some edges not belonging to a given subgraph; which in turn reduce the compression ratio. To solve this problem, we propose the concept of AutoPruning that effectively filters the dense subgraphs having higher ratio of common to that of non-common neighbors. To summarize the dense subgraphs, we use the Minimum Description Length (MDL) principle to obtain a highly compact summary with least edge corrections for lossless compression. We propose two alternatives to trade-off the computation time and compression ratio while creating a super node from each dense subgraph. Through experiments on two publicly available real world graphs, we compare the proposed approach with the well known Minimum Degree Measure (MDM), for dense subgraph discovery, and observe very encouraging results.

References

  1. Y. Asahiro, K. Iwama, H. Tamaki, and T. Tokuyama. Greedily finding a dense subgraph. Journal of Algorithms, 34(2): 203--221, 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. P. Boldi and S. Vigna. The webgraph framework i: compression techniques. In Proceedings of the 13th international conference on World Wide Web, pages 595--602. ACM, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. G. Buehrer and K. Chellapilla. A scalable pattern mining approach to web graph compression with communities. In Proceedings of the 2008 International Conference on Web Search and Data Mining, pages 95--106. ACM, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. M. Charikar. Greedy approximation algorithms for finding dense components in a graph. In Approximation Algorithms for Combinatorial Optimization, pages 84--95. Springer, 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. F. Chierichetti, R. Kumar, S. Lattanzi, M. Mitzenmacher, A. Panconesi, and P. Raghavan. On compressing social networks. In Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 219--228. ACM, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. W. Cui, Y. Xiao, H. Wang, J. Hong, and W. Wang. Local search of communities in large graphs. In Proceedings of the 2014 ACM SIGMOD international conference on Management of data. ACM, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. C. Hernández and G. Navarro. Compressed representations for web and social graphs. Knowledge and Information Systems, pages 1--35, 2013.Google ScholarGoogle Scholar
  8. U. Kang and C. Faloutsos. Beyond'caveman communities': Hubs and spokes for graph compression and mining. In Data Mining (ICDM), 2011 IEEE 11th International Conference on, pages 300--309. IEEE, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. K. U. Khan, N. Waqas, and Y.-K. Lee. Set-based approximate approach for lossless graph summarization. The Computing Journal- Under Review, 2014.Google ScholarGoogle Scholar
  10. S. Khuller and B. Saha. On finding dense subgraphs. In Automata, Languages and Programming, pages 597--608. Springer, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. S. Navlakha, R. Rastogi, and N. Shrivastava. Graph summarization with bounded error. In Proceedings of the 2008 ACM SIGMOD international conference on Management of data, pages 419--432. ACM, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. J. Rissanen. Modeling by shortest data description. Automatica, 14(5): 465--471, 1978. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. S. B. Seidman. Network structure and minimum degree. Social networks, 5(3): 269--287, 1983.Google ScholarGoogle Scholar
  14. M. Sozio and A. Gionis. The community-search problem and how to plan a successful cocktail party. In Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 939--948. ACM, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. Y. Tian, R. A. Hankins, and J. M. Patel. Efficient aggregation for graph summarization. In Proceedings of the 2008 ACM SIGMOD international conference on Management of data, pages 567--580. ACM, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. H. Toivonen, F. Zhou, A. Hartikainen, and A. Hinkka. Compression of weighted graphs. In Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 965--973. ACM, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Lossless graph summarization using dense subgraphs discovery

      Recommendations

      Comments

      Login options

      Check if you have access through your login credentials or your institution to get full access on this article.

      Sign in
      • Published in

        cover image ACM Conferences
        IMCOM '15: Proceedings of the 9th International Conference on Ubiquitous Information Management and Communication
        January 2015
        674 pages
        ISBN:9781450333771
        DOI:10.1145/2701126

        Copyright © 2015 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 8 January 2015

        Permissions

        Request permissions about this article.

        Request Permissions

        Check for updates

        Qualifiers

        • research-article

        Acceptance Rates

        Overall Acceptance Rate213of621submissions,34%

      PDF Format

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader