research-article

Description-Driven Community Detection

Authors:
Simon Pool

Universiteit Utrecht, the Netherlands

Universiteit Utrecht, the Netherlands
View Profile

,
Francesco Bonchi

Yahoo! Research Barcelona, Spain

Yahoo! Research Barcelona, Spain
View Profile

,
Matthijs van Leeuwen

KU Leuven, Heverlee, Belgium

KU Leuven, Heverlee, Belgium
View Profile

ACM Transactions on Intelligent Systems and Technology Volume 5 Issue 2Article No.: 28pp 1–28https://doi.org/10.1145/2517088

Published:30 April 2014Publication History

ACM Transactions on Intelligent Systems and Technology

Abstract

Traditional approaches to community detection, as studied by physicists, sociologists, and more recently computer scientists, aim at simply partitioning the social network graph. However, with the advent of online social networking sites, richer data has become available: beyond the link information, each user in the network is annotated with additional information, for example, demographics, shopping behavior, or interests. In this context, it is therefore important to develop mining methods which can take advantage of all available information. In the case of community detection, this means finding good communities (a set of nodes cohesive in the social graph) which are associated with good descriptions in terms of user information (node attributes).

Having good descriptions associated to our models make them understandable by domain experts and thus more useful in real-world applications. Another requirement dictated by real-world applications, is to develop methods that can use, when available, any domain-specific background knowledge. In the case of community detection the background knowledge could be a vague description of the communities sought in a specific application, or some prototypical nodes (e.g., good customers in the past), that represent what the analyst is looking for (a community of similar users).

Towards this goal, in this article, we define and study the problem of finding a diverse set of cohesive communities with concise descriptions. We propose an effective algorithm that alternates between two phases: a hill-climbing phase producing (possibly overlapping) communities, and a description induction phase which uses techniques from supervised pattern set mining. Our framework has the nice feature of being able to build well-described cohesive communities starting from any given description or seed set of nodes, which makes it very flexible and easily applicable in real-world applications.

Our experimental evaluation confirms that the proposed method discovers cohesive communities with concise descriptions in realistic and large online social networks such as Delicious, Flickr, and LastFM.

References

Rakesh Agrawal, T. Imielinksi, and A. Swami. 1993. Mining association rules between sets of items in large databases. In Proceedings of SIGMOD'93. ACM, 207--216. Google ScholarDigital Library
Yong-Yeol Ahn, James P. Bagrow, and Sune Lehmann. 2010. Link communities reveal multiscale complexity in networks. Nature 466, 7307, 761--764.Google Scholar
Edoardo M. Airoldi, David M. Blei, Stephen E. Fienberg, and Eric P. Xing. 2008. Mixed membership stochastic blockmodels. J. Machine Learn. Res. 9, 1981--2014. Google ScholarDigital Library
Martin Atzmueller and Folke Mitzlaff. 2011. Efficient descriptive community mining. In Proceedings of the 24th International FLAIRS Conference. AAAI Press.Google Scholar
Brian Ball, Brian Karrer, and M. E. J. Newman. 2011. An efficient and principled method for detecting communities in networks. Phys. Rev. E 84.Google ScholarCross Ref
Nikhil Bansal, Avrim Blum, and Shuchi Chawla. 2004. Correlation clustering. Machine Learn. 56, 1--3, 89--113. Google ScholarDigital Library
Nicola Barbieri, Francesco Bonchi, and Giuseppe Manco. 2013. Cascade-based community detection. In Proceedings of the 6th ACM International Conference on Web Search and Data Mining (WSDM'13). Google ScholarDigital Library
Jeffrey Baumes, Mark K. Goldberg, Mukkai S. Krishnamoorthy, Malik Magdon-Ismail, and Nathan Preston. 2005. Finding communities by clustering a graph into overlapping subgraphs. In Proceedings of the IADIS International Conference on Applied Computing (IADIS AC'05). 97--104.Google Scholar
Francesco Bonchi, Aristides Gionis, Francesco Gullo, and Antti Ukkonen. 2012. Chromatic correlation clustering. In Proceedings of the KDD. Google ScholarDigital Library
L. Breiman, J. H. Friedman, R. A. Olshen, and C. J. Stone. 1984. Classification and Regression Trees. Wadsworth.Google Scholar
Fabricio A. Breve, Liang Zhao, Marcos G. Quiles, Witold Pedrycz, and Jiming Liu. 2011. Particle competition and cooperation for uncovering network overlap community structure. In Proceedings of the 8th International Symposium on Advances in Neural Networks (ISNN). Google ScholarDigital Library
B. Bringmann, S. Nijssen, N. Tatti, J. Vreeken, and A. Zimmerman. 2010. Mining sets of patterns. In Proceedings of the Tutorial at ECML PKDD'10. http://www.cs.kuleuven.be/conference/msop/.Google Scholar
Wei Chen, Zhenming Liu, Xiaorui Sun, and Yajun Wang. 2010. A game-theoretic framework to identify overlapping communities in social networks. Data Min. Knowl. Discov. 21, 2, 224--240. Google ScholarDigital Library
G. Dong and J. Li. 1999. Efficient mining of emerging patterns: Discovering trends and differences. In Proceedings of the SIGKDD'99. 43--52. Google ScholarDigital Library
T. S. Evans and R. Lambiotte. 2009. Line graphs, link partitions, and overlapping communities. Phys. Rev. E 80.Google ScholarCross Ref
Santo Fortunato. 2010. Community detection in graphs. Phys. Rep. 486.Google Scholar
M. Girvan and M. Newman. 2002. Community structure in social and biological networks. Proc. Nat. Acad. Sci. 99, 12, 7821--7826.Google ScholarCross Ref
Steve Gregory. 2010. Finding overlapping communities in networks by label propagation. New J. Physics 12, 10.Google ScholarCross Ref
Arijit Khan, Xifeng Yan, and Kun-Lung Wu. 2010. Towards proximity pattern mining in large graphs. In Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD'10). ACM, New York, NY, 867--878. Google ScholarDigital Library
Youngdo Kim and Hawoong Jeong. 2011. The map equation for link communities. Phys. Rev. E 84.Google Scholar
Jussi M. Kumpula, Mikko Kivelä, Kimmo Kaski, and Jari Saramäki. 2008. Sequential algorithm for fast clique percolation. Phys. Rev. E 78.Google ScholarCross Ref
Andrea Lancichinetti, Santo Fortunato, and Janos Kertesz. 2009. Detecting the overlapping and hierarchical community structure of complex networks. New J. Physics 11.Google Scholar
Andrea Lancichinetti, Filippo Radicchi, J. J. Ramasco, and Santo Fortunato. 2011. Finding statistically significant communities in networks. PLoS ONE 6, 4, e18961.Google ScholarCross Ref
Silvio Lattanzi and D. Sivakumar. 2009. Affiliation networks. In Proceedings of STOC. 427--434. Google ScholarDigital Library
D. Leman, A. Feelders, and A. Knobbe. 2008. Exceptional Model Mining. In Proceedings of the ECML/PKDD'08. Vol. 2. 1--16. Google ScholarDigital Library
M. McPherson, L. Smith-Lovin, and J. M. Cook. 2001. Birds of a feather: Homophily in social networks. Ann. Rev. Sociol. 27.Google ScholarCross Ref
Flavia Moser, Recep Colak, Arash Rafiey, and Martin Ester. 2009. Mining cohesive patterns from graphs with feature vectors. In Proceedings of the SIAM International Conference on Data Mining (SDM). SIAM, 593--604.Google ScholarCross Ref
M. Newman. 2004. Fast algorithm for detecting community structure in networks. Phys. Rev. E 69.Google Scholar
M. Newman and M. Girvan. 2004. Finding and evaluating community structure in networks. Phys. Rev. E 69.Google Scholar
Arnau Padrol-Sureda, Guillem Perarnau-Llobet, Julian Pfeifle, and Victor Muntés-Mulero. 2010. Overlapping community search for social networks. In Proceedings of the International Conference on Data Engineering (ICDE). 992--995.Google ScholarCross Ref
G. Palla, I. Derenyi, I. Farkas, and T. Vicsek. 2005. Uncovering the overlapping community structure of complex networks in nature and society. Nature.Google Scholar
G. Palla, I. J. Farkas, P. Pollner, I. Derenyi, and T. Vicsek. 2007. Directed network modules. New J. Phys. 9, 6.Google ScholarCross Ref
J. R. Quinlan. 1993. C4.5: Programs for Machine Learning. Morgan-Kaufmann. Google ScholarDigital Library
Usha Nandini Raghavan, Réka Albert, and Soundar Kumara. 2007. Near linear time algorithm to detect community structures in large-scale networks. Phys. Rev. E 76.Google ScholarCross Ref
Ron Shamir, Roded Sharan, and Dekel Tsur. 2004. Cluster graph modification problems. Discrete Appl. Math. 144, 1--2, 173--182. Google ScholarDigital Library
Arlei Silva, Wagner Meira, Jr., and Mohammed J. Zaki. 2010. Structural correlation pattern mining for large graphs. In Proceedings of the 8th Workshop on Mining and Learning with Graphs (MLG'10). ACM, New York, NY, 119--126. Google ScholarDigital Library
Lei Tang, Xufei Wang, and Huan Liu. 2009. Uncoverning groups via heterogeneous interaction analysis. In Proceedings of the 9th IEEE International Conference on Data Mining (ICDM'09). 503--512. Google ScholarDigital Library
M. van Leeuwen. 2010. Maximal exceptions with minimal descriptions. Data Min. Knowl. Discov. 21, 2, 259--276. Google ScholarDigital Library
Matthijs van Leeuwen, Francesco Bonchi, Börkur Sigurbjörnsson, and Arno Siebes. 2009. Compressing tags to find interesting media groups. In Proceedings of the 18th ACM Conference on Information and Knowledge Management (CIKM'09). 1147--1156. Google ScholarDigital Library
Xufei Wang, Lei Tang, Huiji Gao, and Huan Liu. 2010. Discovering overlapping groups in social media. In Proceedings of the 10th IEEE International Conference on Data Mining (ICDM). Google ScholarDigital Library
J. Xie, S. Kelley, and B. Szymanski. 2013. Overlapping community detection in networks: The state of the art and comparative study. ACM Comput. Surv. 45, 4. Google ScholarDigital Library
Jierui Xie and Boleslaw K. Szymanski. 2012. Towards linear time overlapping community detection in social networks. In Proceedings of the Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD). Google ScholarDigital Library
Jierui Xie, Boleslaw K. Szymanski, and Xiaoming Liu. 2011. SLPA: Uncovering overlapping communities in social networks via a speaker-listener interaction dynamic process. In Proceedings of the ICDM Workshops. 344--349. Google ScholarDigital Library
Yang Zhou, Hong Cheng, and Jeffrey Xu Yu. 2009. Graph clustering based on structural/attribute similarities. Proc. Very Large Datab. 2, 1, 718--729. Google ScholarDigital Library
Albrecht Zimmermann, Björn Bringmann, and Ulrich Rückert. 2010. Fast, effective molecular feature mining by local optimization. In Proceedings of the ECML/PKDD'10. 563--578. Google ScholarDigital Library

Index Terms

Description-Driven Community Detection
1. Information systems
  1. Information systems applications
    1. Data mining
  2. World Wide Web
    1. Web applications
      1. Internet communications tools

Recommendations

Circle based community detection
I-CARE '13: Proceedings of the 5th IBM Collaborative Academia Research Exchange Workshop

The connection patterns among individuals or objects in complex (social) networks possess rich information that can be useful for conducting effecient network analysis. In particular we consider the task of community detection in social networks. ...
Read More
Community evaluation in Facebook groups
Abstract
One of the main points of the Next Generation Internet is to have a user-centric approach where daily behavior and social life of the users are studied and analyzed in order to model networks and services. Indeed, social life represents the ...
Read More
Community detection in social networks using user frequent pattern mining

Recently, social networking sites are offering a rich resource of heterogeneous data. The analysis of such data can lead to the discovery of unknown information and relations in these networks. The detection of communities including `similar' nodes is a ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Article

Published in

ACM Transactions on Intelligent Systems and Technology Volume 5, Issue 2
Special Issue on Linking Social Granularity and Functions
April 2014
347 pages
ISSN:2157-6904
EISSN:2157-6912
DOI:10.1145/2611448
Issue’s Table of Contents

Copyright © 2014 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 30 April 2014
- Accepted: 1 August 2013
- Revised: 1 May 2013
- Received: 1 October 2012
Published in tist Volume 5, Issue 2

Permissions
Request permissions about this article.
Request Permissions

Check for updates
Author Tags
Social networks
behavioral and demographic information
community detection
description
domain knowledge
social patterns
Qualifiers
- research-article
- Research
- Refereed
Conference
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 71
  Total Citations
  View Citations
- 697
  Total Downloads
- Downloads (Last 12 months)33
- Downloads (Last 6 weeks)0
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Description-Driven Community Detection

ACM Transactions on Intelligent Systems and Technology

Abstract

References

Cited By

Index Terms

Recommendations

Circle based community detection

Community evaluation in Facebook groups

Community detection in social networks using user frequent pattern mining

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Conference

Funding Sources

Other Metrics

Article Metrics

Other Metrics

Cited By

PDF Format

eReader

Digital Edition

Caption

Description-Driven Community Detection

ACM Transactions on Intelligent Systems and Technology

Abstract

References

Cited By

Index Terms

Recommendations

Circle based community detection

Community evaluation in Facebook groups

Community detection in social networks using user frequent pattern mining

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Conference

Funding Sources

Article Metrics

Other Metrics

PDF Format

eReader

Digital Edition

Share this Publication link

Share on Social Media