ABSTRACT
Textual analysis is one way to assess communication type and to moderate the influence of network structure in predictive models of individual behavior. However, few methods are available for incorporating textual content into time-evolving network models. In particular, jointly modeling the evolution of network topology and the change in textual content in time-varying communication data is a difficult challenge. In this work, we propose a Temporally-Evolving Network Classifier (TENC) that incorporates the influence of time-varying edges and temporally-evolving attributes in relational classification models. To facilitate this, we use an evolutionary latent topic approach to automatically discover the latent topics of communications between individuals in a network and to label each communication with its corresponding topic. The message topics are incorporated into the TENC along with time-varying relationships and temporally-evolving attributes, using weighted exponential kernel summarization. We evaluate the utility of the TENC on a real-world classification task, where the aim is to predict the effectiveness of a developer in the Python open-source developer network. We take advantage of the textual content in developer emails and bug communications, both of which evolve over time. The TENC paired with the latent topics significantly improves performance over baseline classifiers that take into account only the static properties of the topics and communications. The results show that the TENC can be used to accurately model the complete set of temporal dynamics in time-evolving communication networks.
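As a rough illustration of the weighted exponential kernel summarization mentioned in the abstract, the Python sketch below collapses timestamped communication edges into a single weighted graph in which recent interactions count more than old ones. It shows one common form of the exponential kernel and is not necessarily the exact formulation used in the TENC. The function name `exponential_kernel_summary`, the parameter `theta`, the use of integer timesteps, and the toy edge list are all assumptions made for this example.

```python
from collections import defaultdict


def exponential_kernel_summary(edges, theta=0.5):
    """Summarize timestamped edges into one exponentially weighted graph.

    edges: iterable of (source, target, timestep) tuples with integer
           timesteps (an assumed input format for this sketch).
    theta: hypothetical decay parameter in (0, 1]; larger values
           discount older interactions more strongly.
    """
    edges = list(edges)
    if not edges:
        return {}

    # The most recent timestep receives the largest weight.
    t_max = max(t for _, _, t in edges)

    summary = defaultdict(float)
    for u, v, t in edges:
        # Each edge contributes theta * (1 - theta)^(age in timesteps).
        summary[(u, v)] += theta * (1.0 - theta) ** (t_max - t)
    return dict(summary)


# Toy example: two developers exchange messages at timesteps 1 and 3.
toy_edges = [("a", "b", 1), ("a", "b", 3), ("b", "c", 3)]
weights = exponential_kernel_summary(toy_edges, theta=0.5)
```

The same weighting could in principle be applied to per-timestep attribute values (for example, topic labels on messages) before they are passed to a relational classifier, although the details of how the TENC does this are not given in the abstract.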
Index Terms
- Modeling the evolution of discussion topics and communication to improve relational classification