skip to main content
research-article

Robust local community detection: on free rider effect and its elimination

Published:01 February 2015Publication History
Skip Abstract Section

Abstract

Given a large network, local community detection aims at finding the community that contains a set of query nodes and also maximizes (minimizes) a goodness metric. This problem has recently drawn intense research interest. Various goodness metrics have been proposed. However, most existing metrics tend to include irrelevant subgraphs in the detected local community. We refer to such irrelevant subgraphs as free riders. We systematically study the existing goodness metrics and provide theoretical explanations on why they may cause the free rider effect. We further develop a query biased node weighting scheme to reduce the free rider effect. In particular, each node is weighted by its proximity to the query node. We define a query biased density metric to integrate the edge and node weights. The query biased densest subgraph, which has the largest query biased density, will shift to the neighborhood of the query nodes after node weighting. We then formulate the query biased densest connected subgraph (QDC) problem, study its complexity, and provide efficient algorithms to solve it. We perform extensive experiments on a variety of real and synthetic networks to evaluate the effectiveness and efficiency of the proposed methods.

References

  1. http://filer.case.edu/yxw407.Google ScholarGoogle Scholar
  2. R. Andersen, F. Chung, and K. Lang. Local graph partitioning using PageRank vectors. In FOCS, pages 475--486, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. B. Bahmani, R. Kumar, and S. Vassilvitskii. Densest subgraph in streaming and MapReduce. PVLDB, 5(5): 454--465, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. J. W. Berry, B. Hendrickson, R. A. LaViolette, et al. Tolerating the community detection resolution limit with edge weighting. Physical Review E, 83(5): 056119, 2011.Google ScholarGoogle ScholarCross RefCross Ref
  5. M. Charikar. Greedy approximation algorithms for finding dense components in a graph. In APPROX, pages 139--152. 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. M. Ciglan, M. Laclavík, and K. Nørvåg. On community detection in real-world networks and the importance of degree assortativity. In KDD, pages 1007--1015, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. A. Clauset. Finding local community structure in networks. Physical Review E, 72(2): 026132, 2005.Google ScholarGoogle ScholarCross RefCross Ref
  8. W. Cui, Y. Xiao, H. Wang, Y. Lu, and W. Wang. Online search of overlapping communities. In SIGMOD, pages 277--288, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. W. Cui, Y. Xiao, H. Wang, and W. Wang. Local search of communities in large graphs. In SIGMOD, pages 991--1002, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. G. W. Flake, S. Lawrence, and C. L. Giles. Efficient identification of web communities. In KDD, pages 150--160, 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. S. Fortunato. Community detection in graphs. Physics Reports, 486(3): 75--174, 2010.Google ScholarGoogle ScholarCross RefCross Ref
  12. G. Gallo, M. D. Grigoriadis, and R. E. Tarjan. A fast parametric maximum flow algorithm and applications. SIAM J. Comput., 18(1): 30--55, 1989. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. A.-C. Gavin et al. Proteome survey reveals modularity of the yeast cell machinery. Nature, 440(7084): 631--636, 2006.Google ScholarGoogle ScholarCross RefCross Ref
  14. X. Huang, H. Cheng, L. Qin, W. Tian, and J. X. Yu. Querying k-truss community in large and dynamic graphs. In SIGMOD, pages 1311--1322, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. R. Kannan, S. Vempala, and A. Vetta. On clusterings: good, bad and spectral. JACM, 51(3): 497--515, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. R. M. Karp. Reducibility among combinatorial problems. Springer, 1972.Google ScholarGoogle ScholarCross RefCross Ref
  17. G. Karypis and V. Kumar. A fast and high quality multilevel scheme for partitioning irregular graphs. SIAM J. Sci. Comput., 20(1): 359--392, 1998. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. A. Khadivi, A. A. Rad, and M. Hasler. Network community-detection enhancement by proper weighting. Physical Review E, 83(4): 046104, 2011.Google ScholarGoogle ScholarCross RefCross Ref
  19. K. Kloster and D. F. Gleich. Heat kernel based community detection. In KDD, pages 1386--1395, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. I. Kloumann and J. Kleinberg. Community membership identification from small seed sets. In KDD, pages 1366--1375, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. A. Lancichinetti, S. Fortunato, and F. Radicchi. Benchmark graphs for testing community detection algorithms. Physical Review E, 78(4): 046110, 2008.Google ScholarGoogle ScholarCross RefCross Ref
  22. K. J. Lang and R. Andersen. Finding dense and isolated submarkets in a sponsored search spending graph. In CIKM, pages 613--622, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. F. Luo, J. Z. Wang, and E. Promislow. Exploring local community structures in large networks. Web Intelligence and Agent Systems, 6(4): 387--400, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. L. Ma, H. Huang, Q. He, K. Chiew, J. Wu, and Y. Che. GMAC: a seed-insensitive approach to local community detection. In DaWak, pages 297--308, 2013.Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. M. W. Mahoney, L. Orecchia, and N. K. Vishnoi. A local spectral method for graphs: with applications to improving graph partitions and exploring data graphs locally. JMLR, 13(1): 2339--2365, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. K. Mehlhorn. A faster approximation algorithm for the Steiner problem in graphs. IPL, 27(3): 125--128, 1988. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. M. E. Newman. Modularity and community structure in networks. PNAS, 103(23): 8577--8582, 2006.Google ScholarGoogle ScholarCross RefCross Ref
  28. Y. Saad. Iterative methods for sparse linear systems. SIAM, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. B. Saha, A. Hoch, S. Khuller, L. Raschid, and X.-N. Zhang. Dense subgraphs with restrictions and applications to gene annotation graphs. In RECOMB, pages 456--472, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. A. Schrijver. Combinatorial optimization: polyhedra and efficiency. Springer, 2003.Google ScholarGoogle Scholar
  31. M. Sozio and A. Gionis. The community-search problem and how to plan a successful cocktail party. In KDD, pages 939--948, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. D. A. Spielman and S.-H. Teng. Nearly-linear time algorithms for graph partitioning, graph sparsification, and solving linear systems. In STOC, pages 81--90, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. D. A. Spielman and S.-H. Teng. A local clustering algorithm for massive graphs and its application to nearly-linear time graph partitioning. SIAM J. Comput., 42(1): 1--26, 2013.Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. R. Tarjan. Depth-first search and linear graph algorithms. SIAM J. Comput., 1(2): 146--160, 1972.Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. H. Tong, C. Faloutsos, and J. Pan. Fast random walk with restart and its applications. In ICDM, pages 613--622, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  36. C. Tsourakakis, F. Bonchi, A. Gionis, F. Gullo, and M. Tsiarli. Denser than the densest subgraph: extracting optimal quasi-cliques with quality guarantees. In KDD, pages 104--112, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. Y. Wu, R. Jin, and X. Zhang. Fast and unified local search for random walk based k-nearest-neighbor query in large graphs. In SIGMOD, pages 1139--1150, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  38. J. Yang and J. Leskovec. Defining and evaluating network communities based on ground-truth. In ICDM, pages 745--754, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Robust local community detection: on free rider effect and its elimination

      Recommendations

      Comments

      Login options

      Check if you have access through your login credentials or your institution to get full access on this article.

      Sign in

      Full Access

      • Published in

        cover image Proceedings of the VLDB Endowment
        Proceedings of the VLDB Endowment  Volume 8, Issue 7
        February 2015
        124 pages

        Publisher

        VLDB Endowment

        Publication History

        • Published: 1 February 2015
        Published in pvldb Volume 8, Issue 7

        Qualifiers

        • research-article

      PDF Format

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader