Abstract
Given a large network, local community detection aims at finding the community that contains a set of query nodes and also maximizes (minimizes) a goodness metric. This problem has recently drawn intense research interest. Various goodness metrics have been proposed. However, most existing metrics tend to include irrelevant subgraphs in the detected local community. We refer to such irrelevant subgraphs as free riders. We systematically study the existing goodness metrics and provide theoretical explanations on why they may cause the free rider effect. We further develop a query biased node weighting scheme to reduce the free rider effect. In particular, each node is weighted by its proximity to the query node. We define a query biased density metric to integrate the edge and node weights. The query biased densest subgraph, which has the largest query biased density, will shift to the neighborhood of the query nodes after node weighting. We then formulate the query biased densest connected subgraph (QDC) problem, study its complexity, and provide efficient algorithms to solve it. We perform extensive experiments on a variety of real and synthetic networks to evaluate the effectiveness and efficiency of the proposed methods.
- http://filer.case.edu/yxw407.Google Scholar
- R. Andersen, F. Chung, and K. Lang. Local graph partitioning using PageRank vectors. In FOCS, pages 475--486, 2006. Google ScholarDigital Library
- B. Bahmani, R. Kumar, and S. Vassilvitskii. Densest subgraph in streaming and MapReduce. PVLDB, 5(5): 454--465, 2012. Google ScholarDigital Library
- J. W. Berry, B. Hendrickson, R. A. LaViolette, et al. Tolerating the community detection resolution limit with edge weighting. Physical Review E, 83(5): 056119, 2011.Google ScholarCross Ref
- M. Charikar. Greedy approximation algorithms for finding dense components in a graph. In APPROX, pages 139--152. 2000. Google ScholarDigital Library
- M. Ciglan, M. Laclavík, and K. Nørvåg. On community detection in real-world networks and the importance of degree assortativity. In KDD, pages 1007--1015, 2013. Google ScholarDigital Library
- A. Clauset. Finding local community structure in networks. Physical Review E, 72(2): 026132, 2005.Google ScholarCross Ref
- W. Cui, Y. Xiao, H. Wang, Y. Lu, and W. Wang. Online search of overlapping communities. In SIGMOD, pages 277--288, 2013. Google ScholarDigital Library
- W. Cui, Y. Xiao, H. Wang, and W. Wang. Local search of communities in large graphs. In SIGMOD, pages 991--1002, 2014. Google ScholarDigital Library
- G. W. Flake, S. Lawrence, and C. L. Giles. Efficient identification of web communities. In KDD, pages 150--160, 2000. Google ScholarDigital Library
- S. Fortunato. Community detection in graphs. Physics Reports, 486(3): 75--174, 2010.Google ScholarCross Ref
- G. Gallo, M. D. Grigoriadis, and R. E. Tarjan. A fast parametric maximum flow algorithm and applications. SIAM J. Comput., 18(1): 30--55, 1989. Google ScholarDigital Library
- A.-C. Gavin et al. Proteome survey reveals modularity of the yeast cell machinery. Nature, 440(7084): 631--636, 2006.Google ScholarCross Ref
- X. Huang, H. Cheng, L. Qin, W. Tian, and J. X. Yu. Querying k-truss community in large and dynamic graphs. In SIGMOD, pages 1311--1322, 2014. Google ScholarDigital Library
- R. Kannan, S. Vempala, and A. Vetta. On clusterings: good, bad and spectral. JACM, 51(3): 497--515, 2004. Google ScholarDigital Library
- R. M. Karp. Reducibility among combinatorial problems. Springer, 1972.Google ScholarCross Ref
- G. Karypis and V. Kumar. A fast and high quality multilevel scheme for partitioning irregular graphs. SIAM J. Sci. Comput., 20(1): 359--392, 1998. Google ScholarDigital Library
- A. Khadivi, A. A. Rad, and M. Hasler. Network community-detection enhancement by proper weighting. Physical Review E, 83(4): 046104, 2011.Google ScholarCross Ref
- K. Kloster and D. F. Gleich. Heat kernel based community detection. In KDD, pages 1386--1395, 2014. Google ScholarDigital Library
- I. Kloumann and J. Kleinberg. Community membership identification from small seed sets. In KDD, pages 1366--1375, 2014. Google ScholarDigital Library
- A. Lancichinetti, S. Fortunato, and F. Radicchi. Benchmark graphs for testing community detection algorithms. Physical Review E, 78(4): 046110, 2008.Google ScholarCross Ref
- K. J. Lang and R. Andersen. Finding dense and isolated submarkets in a sponsored search spending graph. In CIKM, pages 613--622, 2007. Google ScholarDigital Library
- F. Luo, J. Z. Wang, and E. Promislow. Exploring local community structures in large networks. Web Intelligence and Agent Systems, 6(4): 387--400, 2008. Google ScholarDigital Library
- L. Ma, H. Huang, Q. He, K. Chiew, J. Wu, and Y. Che. GMAC: a seed-insensitive approach to local community detection. In DaWak, pages 297--308, 2013.Google ScholarDigital Library
- M. W. Mahoney, L. Orecchia, and N. K. Vishnoi. A local spectral method for graphs: with applications to improving graph partitions and exploring data graphs locally. JMLR, 13(1): 2339--2365, 2012. Google ScholarDigital Library
- K. Mehlhorn. A faster approximation algorithm for the Steiner problem in graphs. IPL, 27(3): 125--128, 1988. Google ScholarDigital Library
- M. E. Newman. Modularity and community structure in networks. PNAS, 103(23): 8577--8582, 2006.Google ScholarCross Ref
- Y. Saad. Iterative methods for sparse linear systems. SIAM, 2003. Google ScholarDigital Library
- B. Saha, A. Hoch, S. Khuller, L. Raschid, and X.-N. Zhang. Dense subgraphs with restrictions and applications to gene annotation graphs. In RECOMB, pages 456--472, 2010. Google ScholarDigital Library
- A. Schrijver. Combinatorial optimization: polyhedra and efficiency. Springer, 2003.Google Scholar
- M. Sozio and A. Gionis. The community-search problem and how to plan a successful cocktail party. In KDD, pages 939--948, 2010. Google ScholarDigital Library
- D. A. Spielman and S.-H. Teng. Nearly-linear time algorithms for graph partitioning, graph sparsification, and solving linear systems. In STOC, pages 81--90, 2004. Google ScholarDigital Library
- D. A. Spielman and S.-H. Teng. A local clustering algorithm for massive graphs and its application to nearly-linear time graph partitioning. SIAM J. Comput., 42(1): 1--26, 2013.Google ScholarDigital Library
- R. Tarjan. Depth-first search and linear graph algorithms. SIAM J. Comput., 1(2): 146--160, 1972.Google ScholarDigital Library
- H. Tong, C. Faloutsos, and J. Pan. Fast random walk with restart and its applications. In ICDM, pages 613--622, 2006. Google ScholarDigital Library
- C. Tsourakakis, F. Bonchi, A. Gionis, F. Gullo, and M. Tsiarli. Denser than the densest subgraph: extracting optimal quasi-cliques with quality guarantees. In KDD, pages 104--112, 2013. Google ScholarDigital Library
- Y. Wu, R. Jin, and X. Zhang. Fast and unified local search for random walk based k-nearest-neighbor query in large graphs. In SIGMOD, pages 1139--1150, 2014. Google ScholarDigital Library
- J. Yang and J. Leskovec. Defining and evaluating network communities based on ground-truth. In ICDM, pages 745--754, 2012. Google ScholarDigital Library
Index Terms
- Robust local community detection: on free rider effect and its elimination
Recommendations
Local Community Detection via Edge Weighting
Information Retrieval TechnologyAbstractLocal community detection aims at discovering a community from a seed node by maximizing a given goodness metric. This problem has attracted a lot of attention, and various goodness metrics have been proposed in recent years. However, most ...
Community detection in large-scale networks: a survey and empirical evaluation
Community detection is a common problem in graph data analytics that consists of finding groups of densely connected nodes with few connections to nodes outside of the group. In particular, identifying communities in large-scale networks is an important ...
Comments