Skip to main content
Erschienen in: Social Network Analysis and Mining 4/2013

01.12.2013 | Original Article

Probabilistic analysis of communities and inner roles in networks: Bayesian generative models and approximate inference

verfasst von: Gianni Costa, Riccardo Ortale

Erschienen in: Social Network Analysis and Mining | Ausgabe 4/2013

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Two fundamental tasks in network analysis are community discovery and role assignment. Hitherto, these have been conducted separately. We argue that their integration provides a deeper understanding of connectivity patterns and present unsupervised learning approaches to the exploratory analysis of communities and inner roles of nodes across their interactions in directed networks. In particular, we propose two Bayesian probabilistic models of network interactions that seamlessly integrate community discovery and role assignment. One is the model of a generative process, in which pairs of nodes in a network are associated with communities and roles in the context of their communities; before that an interaction is possibly established between them. According to the generative semantics of such a model, nodes are represented as probability distributions over communities, while communities are viewed as probability distributions over roles. The other model specifies a generative process based on link partitioning, that associates pairs of nodes with respective roles and then assigns their possible interaction to one link community. This is accomplished by representing nodes as probability distributions over roles and, additionally, by explicitly modeling how roles interact with each other. The foresaid distributions are unknown parameters of the proposed models that are estimated from the observed network interactions through suitable Markov Chain Monte Carlo algorithms for approximated posterior inference and parameter estimation. One model overcomes state-of-the-art competitors in link-prediction over real-world networks. The other exhibits a competitive predictive power. Both models overcome an established probabilistic competitor at identifying relatively recognizable structure in synthetic networks.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Anhänge
Nur mit Berechtigung zugänglich
Literatur
Zurück zum Zitat Aggarwal C (ed) (2011) Social network data analytics. Springer, Berlin Aggarwal C (ed) (2011) Social network data analytics. Springer, Berlin
Zurück zum Zitat Ahn YY, Bagrow JP, Lehmann S (2010) Link communities reveal multiscale complexity in networks. Nature 466:761–764CrossRef Ahn YY, Bagrow JP, Lehmann S (2010) Link communities reveal multiscale complexity in networks. Nature 466:761–764CrossRef
Zurück zum Zitat Airoldi EM, Blei DM, Fienberg SE, Xing EP (2008) Mixed membership stochastic blockmodels. J Mach Learn Res 9:1981–2014MATH Airoldi EM, Blei DM, Fienberg SE, Xing EP (2008) Mixed membership stochastic blockmodels. J Mach Learn Res 9:1981–2014MATH
Zurück zum Zitat Andrieu C, De Freitas N, Doucet A, Jordan MI (2003) An introduction to MCMC for machine learning. Mach Learn 50(1-2):5–43CrossRefMATH Andrieu C, De Freitas N, Doucet A, Jordan MI (2003) An introduction to MCMC for machine learning. Mach Learn 50(1-2):5–43CrossRefMATH
Zurück zum Zitat Ball B, Karrer B, Newman MEJ(2011) An efficient and principled method for detecting communities in networks. Phys Rev E 84:036103 Ball B, Karrer B, Newman MEJ(2011) An efficient and principled method for detecting communities in networks. Phys Rev E 84:036103
Zurück zum Zitat Barbieri N, Costa G, Manco G, Ortale R (2011) Modeling item selection and relevance for accurate recommendations: a bayesian approach. In: Proceedings of ACM conference on recommender systems, pp 21–28 Barbieri N, Costa G, Manco G, Ortale R (2011) Modeling item selection and relevance for accurate recommendations: a bayesian approach. In: Proceedings of ACM conference on recommender systems, pp 21–28
Zurück zum Zitat Barbieri N, Manco G, Ortale R, Ritacco E. (2012) Balancing prediction and recommendation accuracy: hierarchical latent factors for preference data. In: Proceedings of SIAM international conference on data mining, pp 1035–1046 Barbieri N, Manco G, Ortale R, Ritacco E. (2012) Balancing prediction and recommendation accuracy: hierarchical latent factors for preference data. In: Proceedings of SIAM international conference on data mining, pp 1035–1046
Zurück zum Zitat Bickel PJ, Chen A (2009) A nonparametric view of network models and Newman–Girvan and other modularities. Proc Natl Acad Sci 106:21068–21073 Bickel PJ, Chen A (2009) A nonparametric view of network models and Newman–Girvan and other modularities. Proc Natl Acad Sci 106:21068–21073
Zurück zum Zitat Bishop CM (2006) Pattern recognition and machine learning. Information Science and Statistics. Springer, New York, Secaucus, NJ. ISBN 0387310738 Bishop CM (2006) Pattern recognition and machine learning. Information Science and Statistics. Springer, New York, Secaucus, NJ. ISBN 0387310738
Zurück zum Zitat Blei D, Lafferty J (2009) Topic models. In: Srivastava A, Sahami M (eds) Text mining: classification, clustering, and applications. Chapman and Hall/CRC Data Mining and Knowledge Discovery Series, pp71 – 94 Blei D, Lafferty J (2009) Topic models. In: Srivastava A, Sahami M (eds) Text mining: classification, clustering, and applications. Chapman and Hall/CRC Data Mining and Knowledge Discovery Series, pp71 – 94
Zurück zum Zitat Blei DM, Ng AY, Jordan MI (2003) Latent Dirichlet allocation. J Mach Learn Res 3:993–1022MATH Blei DM, Ng AY, Jordan MI (2003) Latent Dirichlet allocation. J Mach Learn Res 3:993–1022MATH
Zurück zum Zitat Box GEP, Tiao GC (1992) Bayesian inference in statistical analysis. Wiley-Interscience, New York Box GEP, Tiao GC (1992) Bayesian inference in statistical analysis. Wiley-Interscience, New York
Zurück zum Zitat Chatterjee N, Sinha S (2008) Understanding the mind of a worm: hierarchical network structure underlying nervous system function in C. elegans. In: Banerjee R, Chakrabarti BK (eds) Progress in brain research. Elsevier, Amsterdam, 145–153 Chatterjee N, Sinha S (2008) Understanding the mind of a worm: hierarchical network structure underlying nervous system function in C. elegans. In: Banerjee R, Chakrabarti BK (eds) Progress in brain research. Elsevier, Amsterdam, 145–153
Zurück zum Zitat Chou B-H, Suzuki E (2010) Discovering community-oriented roles of nodes in a social network. In: Proceedings of international conference on data warehousing and knowledge discovery, pp 52–64 Chou B-H, Suzuki E (2010) Discovering community-oriented roles of nodes in a social network. In: Proceedings of international conference on data warehousing and knowledge discovery, pp 52–64
Zurück zum Zitat Costa G, Ortale R (2012) A bayesian hierarchical approach for exploratory analysis of communities and roles in social networks. In: Proceedings of the IEEE/ACM international conference on advances in social networks analysis and mining, pp 194–201 Costa G, Ortale R (2012) A bayesian hierarchical approach for exploratory analysis of communities and roles in social networks. In: Proceedings of the IEEE/ACM international conference on advances in social networks analysis and mining, pp 194–201
Zurück zum Zitat Creamer G, Rowe R, Hershkop S, Stolfo SJ (2009) Segmentation and Automated Social Hierarchy Detection through Email Network Analysis. In: Zhang H, Spiliopoulou M, Mobasher B (eds)Advances in web mining and web usage analysis. Springer, Berlin, pp 40–58 Creamer G, Rowe R, Hershkop S, Stolfo SJ (2009) Segmentation and Automated Social Hierarchy Detection through Email Network Analysis. In: Zhang H, Spiliopoulou M, Mobasher B (eds)Advances in web mining and web usage analysis. Springer, Berlin, pp 40–58
Zurück zum Zitat Danon L, Duch J, Arenas A, Dfaz-Guilera A (2005) Comparing community structure identification. J Stat Mech Theory Exp 09008 Danon L, Duch J, Arenas A, Dfaz-Guilera A (2005) Comparing community structure identification. J Stat Mech Theory Exp 09008
Zurück zum Zitat Easley D, Kleinberg J (2010) Networks, crowds, and markets: reasoning about a highly connected world. Cambridge University Press, Cambridge Easley D, Kleinberg J (2010) Networks, crowds, and markets: reasoning about a highly connected world. Cambridge University Press, Cambridge
Zurück zum Zitat Evans TS (2010) Clique graphs and overlapping communities. J Stat Mech 12037 Evans TS (2010) Clique graphs and overlapping communities. J Stat Mech 12037
Zurück zum Zitat Evans TS, Lambiotte R (2009) Line graphs, line partitions and overlapping communities. Phys Rev E Stat Phys Plasmas Fluids 80:016105 Evans TS, Lambiotte R (2009) Line graphs, line partitions and overlapping communities. Phys Rev E Stat Phys Plasmas Fluids 80:016105
Zurück zum Zitat Evans TS, Lambiotte R (2010) Line graphs of weighted networks for overlapping communities. Eur Phys J B 77(2):265–272CrossRef Evans TS, Lambiotte R (2010) Line graphs of weighted networks for overlapping communities. Eur Phys J B 77(2):265–272CrossRef
Zurück zum Zitat Fortunato S, Barthélemy M (2007) Resolution limit in community detection. Proc Natl Acad Sci 104(1):36–41CrossRef Fortunato S, Barthélemy M (2007) Resolution limit in community detection. Proc Natl Acad Sci 104(1):36–41CrossRef
Zurück zum Zitat Freeman LC (1977) A set of measures of centrality based upon betweenness. Sociometry 40(1):35–41CrossRef Freeman LC (1977) A set of measures of centrality based upon betweenness. Sociometry 40(1):35–41CrossRef
Zurück zum Zitat Gelman A, Carlin JB, Stern HS, Rubin DB, Dunson DB (2013) Bayesian data analysis. Chapman and Hall/CRC, London Gelman A, Carlin JB, Stern HS, Rubin DB, Dunson DB (2013) Bayesian data analysis. Chapman and Hall/CRC, London
Zurück zum Zitat Henderson K, Eliassi Rad T (2009) Applying latent Dirichlet allocation to group discovery in large graphs. In: Proceedings of ACM symposium on applied computing, pp 1456–1461 Henderson K, Eliassi Rad T (2009) Applying latent Dirichlet allocation to group discovery in large graphs. In: Proceedings of ACM symposium on applied computing, pp 1456–1461
Zurück zum Zitat Henderson K, Eliassi-Rad T, Papadimitriou S, Faloutsos C (2010) Hcdf: s hybrid community discovery framework. In: Proceedings of SIAM international conference data mining, pp 754–765 Henderson K, Eliassi-Rad T, Papadimitriou S, Faloutsos C (2010) Hcdf: s hybrid community discovery framework. In: Proceedings of SIAM international conference data mining, pp 754–765
Zurück zum Zitat Kernighan BW, Lin S (1970) An efficient heuristic procedure for partitioning graphs. Bell Syst Tech J 49(1):291–307CrossRefMATH Kernighan BW, Lin S (1970) An efficient heuristic procedure for partitioning graphs. Bell Syst Tech J 49(1):291–307CrossRefMATH
Zurück zum Zitat Kim Y, Jeong H (2011) Map equation for link community. Phys Rev E Stat Phys Plasmas Fluids 84:026110 Kim Y, Jeong H (2011) Map equation for link community. Phys Rev E Stat Phys Plasmas Fluids 84:026110
Zurück zum Zitat Kolaczyk ED (2009) Statistical analysis of network data. Springer, Berlin Kolaczyk ED (2009) Statistical analysis of network data. Springer, Berlin
Zurück zum Zitat Lancichinetti A, Fortunato S (2009a) Benchmarks for testing community detection algorithms on directed and weighted graphs with overlapping communities. Phys Rev E Stat Phys Plasmas Fluids 80(1):016118 Lancichinetti A, Fortunato S (2009a) Benchmarks for testing community detection algorithms on directed and weighted graphs with overlapping communities. Phys Rev E Stat Phys Plasmas Fluids 80(1):016118
Zurück zum Zitat Lancichinetti A, Fortunato S (2009b) Community detection algorithms: a comparative analysis. Phys Rev E 80:056117 Lancichinetti A, Fortunato S (2009b) Community detection algorithms: a comparative analysis. Phys Rev E 80:056117
Zurück zum Zitat Leskovec J, Lang KJ, Mahoney M (2010) Empirical comparison of algorithms for network community detection. In: Proceedings of international conference on World Wide Web, pp 631–640 Leskovec J, Lang KJ, Mahoney M (2010) Empirical comparison of algorithms for network community detection. In: Proceedings of international conference on World Wide Web, pp 631–640
Zurück zum Zitat Liu JS (2001) Monte Carlo strategies in scientific computing. Springer, Berlin Liu JS (2001) Monte Carlo strategies in scientific computing. Springer, Berlin
Zurück zum Zitat Lorrain F, White HC (1971) The structural equivalence of individuals in social networks. J Math Sociol 1:49–80 Lorrain F, White HC (1971) The structural equivalence of individuals in social networks. J Math Sociol 1:49–80
Zurück zum Zitat McCallum A, Wang X, Corrada-Emmanuel A (2007) Topic and role discovery in social networks with experiments on Enron and academic email. J Artif Intell Res 30(1):249–272 McCallum A, Wang X, Corrada-Emmanuel A (2007) Topic and role discovery in social networks with experiments on Enron and academic email. J Artif Intell Res 30(1):249–272
Zurück zum Zitat Kourtellis N, Alahakoon T, Simha R, Iamnitchi A, Tripathi R (2013) Identifying high betweenness centrality nodes in large social networks. Soc Netw Anal Min (online first list) Kourtellis N, Alahakoon T, Simha R, Iamnitchi A, Tripathi R (2013) Identifying high betweenness centrality nodes in large social networks. Soc Netw Anal Min (online first list)
Zurück zum Zitat Neal RM (1993) Probabilistic inference using Markov chain Monte Carlo methods. Technical report Neal RM (1993) Probabilistic inference using Markov chain Monte Carlo methods. Technical report
Zurück zum Zitat Newman MEJ (2004a) Detecting community structure in networks. Eur Phys J B 38(2):321–330CrossRef Newman MEJ (2004a) Detecting community structure in networks. Eur Phys J B 38(2):321–330CrossRef
Zurück zum Zitat Newman MEJ (2004b) Fast algorithm for detecting community structure in networks. Phys Rev E 69:066133 Newman MEJ (2004b) Fast algorithm for detecting community structure in networks. Phys Rev E 69:066133
Zurück zum Zitat Newman MEJ, Girvan M (2004) Finding and evaluating community structure in networks. Phys Rev E 69(2):026113 Newman MEJ, Girvan M (2004) Finding and evaluating community structure in networks. Phys Rev E 69(2):026113
Zurück zum Zitat Newman MEJ, Leicht EA (2007) Mixture models and exploratory analysis in networks. Proc Natl Acad Sci USA 104:9564–9569CrossRefMATH Newman MEJ, Leicht EA (2007) Mixture models and exploratory analysis in networks. Proc Natl Acad Sci USA 104:9564–9569CrossRefMATH
Zurück zum Zitat Newman M, Barabási A-L, Watts DJ (2006) The structure and dynamics of networks. Princeton University Press, Princeton Newman M, Barabási A-L, Watts DJ (2006) The structure and dynamics of networks. Princeton University Press, Princeton
Zurück zum Zitat Pathak N, Delong C, Banerjee A, Erickson K (2008) Social topic models for community extraction. In: Proceedings of KDD workshop on social network mining and analysis Pathak N, Delong C, Banerjee A, Erickson K (2008) Social topic models for community extraction. In: Proceedings of KDD workshop on social network mining and analysis
Zurück zum Zitat Porter MA, Onnela J-P, Mucha PJ (2009) Communities in networks. Notices Am Math Soc 56(9):1082–1166MathSciNetMATH Porter MA, Onnela J-P, Mucha PJ (2009) Communities in networks. Notices Am Math Soc 56(9):1082–1166MathSciNetMATH
Zurück zum Zitat Pothen A, Simon HD, Liou K-P (1990) Partitioning sparse matrices with eigenvectors of graphs. SIAM J Matrix Anal Appl 11(3):430–452MathSciNetCrossRefMATH Pothen A, Simon HD, Liou K-P (1990) Partitioning sparse matrices with eigenvectors of graphs. SIAM J Matrix Anal Appl 11(3):430–452MathSciNetCrossRefMATH
Zurück zum Zitat Radicchi F, Castellano C, Cecconi F, Loreto V, Parisi D (2004) Defining and identifying communities in networks. Proc Natl Acad Sci USA 101(9):2658–2663CrossRef Radicchi F, Castellano C, Cecconi F, Loreto V, Parisi D (2004) Defining and identifying communities in networks. Proc Natl Acad Sci USA 101(9):2658–2663CrossRef
Zurück zum Zitat Scott J (2000) Social network analysis. SAGE, London Scott J (2000) Social network analysis. SAGE, London
Zurück zum Zitat Scripps J, Tan P-N, Esfahanian A-H (2007a) Exploration of link structure and community-based node roles in network analysis. Proceedings of international conference on data mining, pp 649–654 Scripps J, Tan P-N, Esfahanian A-H (2007a) Exploration of link structure and community-based node roles in network analysis. Proceedings of international conference on data mining, pp 649–654
Zurück zum Zitat Scripps J, Tan P-N, Esfahanian A-H (2007b) Node roles and community structure in networks. In: Proceedings of workshop on web mining and social network analysis (WebKDD and SNA-KDD), pp 26–35 Scripps J, Tan P-N, Esfahanian A-H (2007b) Node roles and community structure in networks. In: Proceedings of workshop on web mining and social network analysis (WebKDD and SNA-KDD), pp 26–35
Zurück zum Zitat Shetty J, Adibi J (2004) Enron email dataset. Technical report. USC Information Sciences Institute Shetty J, Adibi J (2004) Enron email dataset. Technical report. USC Information Sciences Institute
Zurück zum Zitat Sohn Y, Choi M-K, Ahn Y-Y, Lee J, Jeong J (2011) Topological cluster analysis reveals the systemic organization of the caenorhabditis elegans connectome. PLoS Comput Biol 7(5):e1001139 Sohn Y, Choi M-K, Ahn Y-Y, Lee J, Jeong J (2011) Topological cluster analysis reveals the systemic organization of the caenorhabditis elegans connectome. PLoS Comput Biol 7(5):e1001139
Zurück zum Zitat Steyvers M, Griffiths T (2007) Probabilistic topic models. In: Landauer T, McNamara D, Dennis S, Kintsch W (eds) Latent semantic analysis: a road to meaning. Lawrence Erlbaum, Mahwah, NJ, pp 427–448 Steyvers M, Griffiths T (2007) Probabilistic topic models. In: Landauer T, McNamara D, Dennis S, Kintsch W (eds) Latent semantic analysis: a road to meaning. Lawrence Erlbaum, Mahwah, NJ, pp 427–448
Zurück zum Zitat Yoshida T (2013) Weighted line graphs for overlapping community discovery. Soc Netw Anal Min (online first list Yoshida T (2013) Weighted line graphs for overlapping community discovery. Soc Netw Anal Min (online first list
Zurück zum Zitat Wasserman S, Faust K (1994) Social network analysis: methods and applications. Cambridge University Press, Cambridge Wasserman S, Faust K (1994) Social network analysis: methods and applications. Cambridge University Press, Cambridge
Zurück zum Zitat Watts DJ, Strogatz SH (1998) Collective dynamics of small-world networks. Nature 393(6684):440–442CrossRef Watts DJ, Strogatz SH (1998) Collective dynamics of small-world networks. Nature 393(6684):440–442CrossRef
Zurück zum Zitat White JG, Southgate E, Thompson JN, Brenner S (1986) The structure of the nervous system of the nematode Caenorhabditis elegans. Philos Trans R Soc B Biol Sci 314(1165):1–340CrossRef White JG, Southgate E, Thompson JN, Brenner S (1986) The structure of the nervous system of the nematode Caenorhabditis elegans. Philos Trans R Soc B Biol Sci 314(1165):1–340CrossRef
Zurück zum Zitat Winkler R (2003) An introduction to bayesian inference and decision. Probabilistic Publishing, Gainesville, FL Winkler R (2003) An introduction to bayesian inference and decision. Probabilistic Publishing, Gainesville, FL
Zurück zum Zitat Wu Z (2010) A fast and reasonable method for community detection with adjustable extent of overlapping. In: Proceedings of international conference on intelligent systems and knowledge engineering, pp 376–379 Wu Z (2010) A fast and reasonable method for community detection with adjustable extent of overlapping. In: Proceedings of international conference on intelligent systems and knowledge engineering, pp 376–379
Zurück zum Zitat Xie J, Kelley S, Szymanski BK (2013) Overlapping community detection in networks: the state of the art and comparative study. ACM Comput Surv 45(4) Xie J, Kelley S, Szymanski BK (2013) Overlapping community detection in networks: the state of the art and comparative study. ACM Comput Surv 45(4)
Zurück zum Zitat Liaghat Z, Hossein Rasekh A, Mahdavi A (2013) Application of data mining methods for link prediction in social networks. Soc Netw Anal Min (online first list) Liaghat Z, Hossein Rasekh A, Mahdavi A (2013) Application of data mining methods for link prediction in social networks. Soc Netw Anal Min (online first list)
Zurück zum Zitat Zhang H, Qiu B, Giles CL, Foley HC, Yen J (2007) An LDA-based community structure discovery approach for large-scale social networks. In: Proceedings of IEEE international conference on intelligence and security informatics, p 200–207 Zhang H, Qiu B, Giles CL, Foley HC, Yen J (2007) An LDA-based community structure discovery approach for large-scale social networks. In: Proceedings of IEEE international conference on intelligence and security informatics, p 200–207
Zurück zum Zitat Zhou D, Manavoglu E, Li J, Giles CL, Zha H (2006) Probabilistic models for discovering e-communities. In: Proceedings of international conference on World Wide Web, pp 173–182 Zhou D, Manavoglu E, Li J, Giles CL, Zha H (2006) Probabilistic models for discovering e-communities. In: Proceedings of international conference on World Wide Web, pp 173–182
Metadaten
Titel
Probabilistic analysis of communities and inner roles in networks: Bayesian generative models and approximate inference
verfasst von
Gianni Costa
Riccardo Ortale
Publikationsdatum
01.12.2013
Verlag
Springer Vienna
Erschienen in
Social Network Analysis and Mining / Ausgabe 4/2013
Print ISSN: 1869-5450
Elektronische ISSN: 1869-5469
DOI
https://doi.org/10.1007/s13278-013-0130-z

Weitere Artikel der Ausgabe 4/2013

Social Network Analysis and Mining 4/2013 Zur Ausgabe