ABSTRACT
Many social networks can be characterized by a sequence of dyadic interactions between individuals. Techniques for analyzing such events are of increasing interest. In this paper, we describe a generative model for dyadic events, in which each event arises from one of C latent classes, and the properties of the event (sender, recipient, and type) are chosen from distributions over these entities conditioned on the chosen class. We present two algorithms for inference in this model: an expectation-maximization algorithm and a Markov chain Monte Carlo procedure based on collapsed Gibbs sampling. To analyze the model's predictive accuracy, the algorithms are applied to multiple real-world data sets involving email communication, international political events, and animal behavior.
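The generative process described in the abstract can be illustrated with a minimal sketch. It assumes that sender, recipient, and type are drawn independently given the latent class, which the abstract does not state explicitly. The function names (`sample_events`, `class_posterior`) and parameter names are hypothetical and chosen for illustration. The second function computes the posterior over classes for one event, which is the kind of quantity an EM E-step would use under the same assumption. It is not the paper's implementation.

```python
import random


def sample_events(n_events, class_probs, sender_probs, recipient_probs,
                  type_probs, seed=0):
    """Draw (sender, recipient, type) events from a latent-class mixture.

    Assumption: given the class, the three properties are independent.
    class_probs[c] is the probability of class c; sender_probs[c][s],
    recipient_probs[c][r], and type_probs[c][a] are per-class distributions.
    """
    rng = random.Random(seed)
    events = []
    for _ in range(n_events):
        # Pick the latent class for this event.
        c = rng.choices(range(len(class_probs)), weights=class_probs)[0]
        # Pick each property from the distribution tied to that class.
        s = rng.choices(range(len(sender_probs[c])), weights=sender_probs[c])[0]
        r = rng.choices(range(len(recipient_probs[c])), weights=recipient_probs[c])[0]
        a = rng.choices(range(len(type_probs[c])), weights=type_probs[c])[0]
        events.append((s, r, a))
    return events


def class_posterior(event, class_probs, sender_probs, recipient_probs,
                    type_probs):
    """Return P(class | event) for each class by Bayes' rule."""
    s, r, a = event
    joint = [
        class_probs[c] * sender_probs[c][s] * recipient_probs[c][r] * type_probs[c][a]
        for c in range(len(class_probs))
    ]
    total = sum(joint)
    return [j / total for j in joint]
```

With two classes whose sender and recipient distributions do not overlap, every sampled event identifies its class exactly, so the posterior puts all mass on one class. Real data would give softer posteriors, and a full EM or collapsed Gibbs procedure would also need update rules and priors that this sketch omits.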