research-article

Transfer Learning to Infer Social Ties across Heterogeneous Networks

Authors:
Jie Tang

Tsinghua University, Beijing, China

Tsinghua University, Beijing, China
View Profile

,
Tiancheng Lou

Tsinghua University, Beijing, China

Tsinghua University, Beijing, China
View Profile

,
Jon Kleinberg

Cornell University, Ithaca, NY, USA

Cornell University, Ithaca, NY, USA
View Profile

,
Sen Wu

Tsinghua University, Beijing, China

Tsinghua University, Beijing, China
View Profile

Authors Info & Claims

ACM Transactions on Information Systems Volume 34 Issue 2Article No.: 7pp 1–43https://doi.org/10.1145/2746230

Published:13 April 2016Publication History

ACM Transactions on Information Systems

Abstract

Interpersonal ties are responsible for the structure of social networks and the transmission of information through these networks. Different types of social ties have essentially different influences on people. Awareness of the types of social ties can benefit many applications, such as recommendation and community detection. For example, our close friends tend to move in the same circles that we do, while our classmates may be distributed into different communities. Though a bulk of research has focused on inferring particular types of relationships in a specific social network, few publications systematically study the generalization of the problem of predicting social ties across multiple heterogeneous networks.

In this work, we develop a framework referred to as TranFG for classifying the type of social relationships by learning across heterogeneous networks. The framework incorporates social theories into a factor graph model, which effectively improves the accuracy of predicting the types of social relationships in a target network by borrowing knowledge from a different source network. We also present several active learning strategies to further enhance the inferring performance. To scale up the model to handle really large networks, we design a distributed learning algorithm for the proposed model.

We evaluate the proposed framework (TranFG) on six different networks and compare with several existing methods. TranFG clearly outperforms the existing methods on multiple metrics. For example, by leveraging information from a coauthor network with labeled advisor-advisee relationships, TranFG is able to obtain an F1-score of 90% (8%--28% improvements over alternative methods) for predicting manager-subordinate relationships in an enterprise email network. The proposed model is efficient. It takes only a few minutes to train the proposed transfer model on large networks containing tens of thousands of nodes.

References

Lada A. Adamic and Eytan Adar. 2001. Friends and neighbors on the web. Social Networks 25 (2001), 211--230.Google ScholarCross Ref
Rie Kubota Ando and Tong Zhang. 2005. A framework for learning predictive structures from multiple tasks and unlabeled data. Journal of Machine Learning Research 6 (2005), 1817--1853. Google ScholarDigital Library
Andreas Argyriou, Theodoros Evgeniou, and Massimiliano Pontil. 2006. Multi-task feature learning. In Proceedings of the 18th Neural Information Processing Systems (NIPS’06). 41--48.Google Scholar
Andreas Argyriou, Andreas Maurer, and Massimiliano Pontil. 2008. An algorithm for transfer learning in a heterogeneous environment. In Proceedings of 2008 European Conference on Machine Learning and Knowledge Discovery in Databases (ECML/PKDD’08). 71--85.Google ScholarCross Ref
Lars Backstrom and Jure Leskovec. 2011. Supervised random walks: Predicting and recommending links in social networks. In Proceedings of the 4th ACM International Conference on Web Search and Web Data Mining (WSDM’11). 635--644. Google ScholarDigital Library
Albert-László Barabási and Réka Albert. 1999. Emergence of scaling in random networks. Science 286, 5439 (1999), 509--512.Google Scholar
John Blitzer, Ryan McDonald, and Fernando Pereira. 2006. Domain adaptation with structural correspondence learning. In Proceedings of Conference on Empirical Methods in Natural Language Processing (EMNLP’06). 120--128. Google ScholarDigital Library
Ronald S. Burt. 1992. Structural Holes: The Social Structure of Competition. Harvard University Press.Google ScholarCross Ref
Bin Cao, Nathan Nan Liu, and Qiang Yang. 2010. Transfer learning for collective link prediction in multiple heterogenous domains. In Proceedings of the 27th International Conference on Machine Learning (ICML’10). 159--166.Google ScholarDigital Library
Wei Chen, Yajun Wang, and Siyu Yang. 2009. Efficient influence maximization in social networks. In Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’09). 199--207. Google ScholarDigital Library
Corinna Cortes and Vladimir Vapnik. 1995. Support-vector networks. Machine Learning 20 (1995), 273--297. Google ScholarDigital Library
David J. Crandall, Lars Backstrom, Dan Cosley, Daniel Huttenlocher Siddharth Suri, and Jon Kleinberg. 2010. Inferring social ties from geographic coincidences. Proceedings of the National Academy of Science 107 (Dec. 2010), 22436--22441.Google Scholar
Wenyuan Dai, Yuqiang Chen, Gui-Rong Xue, Qiang Yang, and Yong Yu. 2008. Translated learning: Transfer learning across different feature spaces. In Proceedings of the 22nd Annual Conference on Neural Information Processing Systems (NIPS’08). 353--360.Google Scholar
Wenyuan Dai, Guirong Xue, Qiang Yang, and Yong Yu. 2007a. Co-clustering based classification for out-of-domain documents. In Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’07). 210--219. Google ScholarDigital Library
Wenyuan Dai, Qiang Yang, Gui-Rong Xue, and Yong Yu. 2007b. Boosting for transfer learning. In Proceedings of the 24th International Conference on Machine Learning (ICML’07). 193--200. Google ScholarDigital Library
James A. Davis and Samuel Leinhardt. 1972. The structure of positive interpersonal relations in small groups. In Sociological Theories in Progress, J. Berger (Ed.). Vol. 2. Houghton Mifflin, 218--251.Google Scholar
Christopher P. Diehl, Galileo Namata, and Lise Getoor. 2007. Relationship identification for social network discovery. In Proceedings of the 24th National Conference on Artificial Intelligence and 9th Conference on Innovative Applications of Artificial Intelligence (AAAI’07). 546--552. Google ScholarDigital Library
Yuxiao Dong, Jie Tang, Nitesh V. Chawla, Tiancheng Lou, Yang Yang, and Bai Wang. 2015. Inferring social status and rich club effects in enterprise communication networks. PLOS One 10, 3 (2015), e0119446.Google ScholarCross Ref
Yuxiao Dong, Jie Tang, Sen Wu, Jilei Tian, Nitesh V. Chawla, Jinghai Rao, and Huanhuan Cao. 2012. Link prediction and recommendation across heterogeneous social networks. In Proceedings of the 12th International Conference on Data Mining (ICDM’12). 181--190. Google ScholarDigital Library
Yuxiao Dong, Jing Zhang, Jie Tang, Nitesh V. Chawla, and Bai Wang. 2015. CoupledLP: Link prediction in coupled networks. In Proceedings of the 21nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’15). 199--208. Google ScholarDigital Library
Nathan Eagle, Alex (Sandy) Pentland, and David Lazer. 2009. Inferring social network structure using mobile phone data. Proceedings of the National Academy of Science 106, 36 (2009), 15274--15278.Google ScholarCross Ref
David Easley and Jon Kleinberg. 2010. Networks, Crowds, and Markets: Reasoning About a Highly Connected World. Cambridge University Press. Google ScholarDigital Library
Jing Gao, Wei Fan, Jing Jian, and Jiawei Han. 2008. Knowledge transfer via multiple model local structure mapping. In Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’08). 283--291. Google ScholarDigital Library
Lise Getoor and Ben Taskar. 2007. Introduction to Statistical Relational Learning. MIT Press. Google ScholarDigital Library
Malcolm Gladwell. 2001. The Tipping Point - How Little Things Make a Big Difference (New York, NY: Little Brown, 2000); G. Khermouch and J. Green, Buzz Marketing: Suddenly This Stealth Strategy Is Hot -- but It's Still Fraught with Risk. Business Week (2001), 50.Google Scholar
Bruno Goncalves, Nicola Perra, and Alessandro Vespignani. 2011. Modeling users’ activity on twitter networks: Validation of dunbar’s number. PLoS ONE 6 (08 2011), e22656.Google Scholar
Mark Granovetter. 1973. The strength of weak ties. American Journal of Sociology 78, 6 (1973), 1360--1380.Google ScholarCross Ref
R. Guha, Ravi Kumar, Prabhakar Raghavan, and Andrew Tomkins. 2004. Propagation of trust and distrust. In Proceedings of the 13th International Conference on World Wide Web (WWW’04). 403--412. Google ScholarDigital Library
J. M. Hammersley and P. Clifford. 1971. Markov field on finite graphs and lattices. Unpublished manuscript (1971).Google Scholar
John Hopcroft, Tiancheng Lou, and Jie Tang. 2011. Who will follow you back? Reciprocal relationship prediction. In Proceedings of the 20th ACM International Conference on Information and Knowledge Management (CIKM’11). 1137--1146. Google ScholarDigital Library
Tony Jebara. 2004. Multi-task feature and kernel selection for svms. In Proceedings of the 21th International Conference on Machine Learning (ICML’04). 55--62. Google ScholarDigital Library
George Karypis and Vipin Kumar. 1998. MeTis: Unstrctured Graph Partitioning and Sparse Matrix Ordering System, Version 4.0. http://glaros.dtc.umn.edu/gkhome/fetch/sw/metis/manual.pdf.Google Scholar
Elihu Katz. 1957. The two-step flow of communication: An up-to-date report on an hypothesis. Public Opinion Quarterly 21, 1 (1957), 61--78.Google ScholarCross Ref
Elihu Katz and Paul Felix Lazarsfeld. 1955. Personal Influence. The Free Press, New York, USA.Google Scholar
Leo Katz. 1953. A new status index derived from sociometric analysis. Psychometrika 18, 1 (1953), 39--43.Google ScholarCross Ref
David Kempe, Jon Kleinberg, and Éva Tardos. 2003. Maximizing the spread of influence through a social network. In Proceedings of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’03). 137--146. Google ScholarDigital Library
Nikos Komodakis, Nikos Paragios, and Georgios Tziritas. 2011. MRF energy minimization and beyond via dual decomposition. IEEE Transactions on Pattern Analalysis and Machine Intelligence 33, 3 (2011), 531--552. Google ScholarDigital Library
David Krackhardt. 1992. The strength of strong ties: The importance of philos in organizations. Networks and Organizations: Structure, Form, and Action 216 (1992), 239.Google Scholar
Frank R. Kschischang, Brendan J. Frey, and Hans andrea Loeliger. 2001. Factor graphs and the sum-product algorithm. IEEE Transactions on Information Theory 47 (2001), 498--519. Google ScholarDigital Library
John D. Lafferty, Andrew McCallum, and Fernando C. N. Pereira. 2001. Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In Proceedings of the 18th International Conference on Machine Learning (ICML’01). 282--289. Google ScholarDigital Library
Steffen L. Lauritzen. 1996. Graphical Models. Oxford University Press, Oxford.Google Scholar
Paul Felix Lazarsfeld, Bernard Berelson, and Hazel Gaudet. 1944. The People’s Choice: How the Voter Makes up his Mind in a Presidential Campaign. Columbia University Press, New York, NY, USA.Google Scholar
Su-In Lee, Vassil Chatalbashev, David Vickrey, and Daphne Koller. 2007. Learning a meta-level prior for feature relevance from multiple related tasks. In Proceedings of the 24th International Conference on Machine Learning (ICML’07). 489--496. Google ScholarDigital Library
Jure Leskovec, Daniel Huttenlocher, and Jon Kleinberg. 2010a. Predicting positive and negative links in online social networks. In Proceedings of the 19th International Conference on World Wide Web (WWW’10). 641--650. Google ScholarDigital Library
Jure Leskovec, Daniel Huttenlocher, and Jon Kleinberg. 2010b. Signed networks in social media. In Proceedings of the 28th International Conference on Human Factors in Computing Systems (CHI’10). 1361--1370. Google ScholarDigital Library
Xuejun Liao, Ya Xue, and Lawrence Carin. 2005. Logistic regression with an auxiliary data source. In Proceedings of the 22th International Conference on Machine Learning (ICML’05). 505--512. Google ScholarDigital Library
David Liben-Nowell and Jon M. Kleinberg. 2007. The link-prediction problem for social networks. Journal of the American Society for Information Science and Technology 58, 7 (2007), 1019--1031. Google ScholarDigital Library
Ryan Lichtenwalter, Jake T. Lussier, and Nitesh V. Chawla. 2010. New perspectives and methods in link prediction. In Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’10). 243--252. Google ScholarDigital Library
Xiao Ling, Guirong Xue, Wenyuan Dai, Yun Jiang, Qiang Yang, and Yong Yu. 2008. Can Chinese web pages be classified with english data source? In Proceeding of the 17th International Conference on World Wide Web (WWW’08). 969--978. Google ScholarDigital Library
Tiancheng Lou and Jie Tang. 2013. Mining structural hole spanners through information diffusion in social networks. In Proceedings of the 22th International Conference on World Wide Web (WWW’13). 837--848. Google ScholarDigital Library
Tiancheng Lou, Jie Tang, John Hopcroft, Zhanpeng Fang, and Xiaowen Ding. 2013. Learning to predict reciprocity and triadic closure in social networks. ACM Transactions on Knowledge Discovery from Data 7, 2 (2013), Article No. 5. Google ScholarDigital Library
Julian McAuley and Jure Leskovec. 2014. Discovering social circles in ego networks. ACM Transactions on Knowledge Discovery from Data (TKDD) 8, 1 (2014), 4. Google ScholarDigital Library
Kevin P. Murphy, Yair Weiss, and Michael I. Jordan. 1999. Loopy belief propagation for approximate inference: An empirical study. In Proceedings of the 15th Conference on Uncertainty in Artificial Intelligence (UAI’99). 467--475. Google ScholarDigital Library
G. L. Nemhauser, L. A. Wolsey, and M. L. Fisher. 1978. An analysis of approximations for maximizing submodular set functions. Mathematical Programming 14, 1 (1978), 265--294.Google ScholarDigital Library
M. E. J. Newman. 2001. Clustering and preferential attachment in growing networks. Physical Reviews E 64, 2 (2001), 025102.Google ScholarCross Ref
Lawrence Page, Sergey Brin, Rajeev Motwani, and Terry Winograd. 1999. The PageRank Citation Ranking: Bringing Order to the Web. Technical Report SIDL-WP-1999-0120. Stanford University.Google Scholar
Sinno Jialin Pan and Qiang Yang. 2010. A survey on transfer learning. IEEE Transactions on Knowledge and Data Engineering 22, 10 (Oct. 2010), 1345 --1359. Google ScholarDigital Library
Maayan Roth, Assaf Ben-David, David Deutscher, Guy Flysher, Ilan Horn, Ari Leichtberg, Naty Leiser, Yossi Matias, and Ron Merom. 2010. Suggesting friends using the implicit social graph. In Proceedings of the 16th International Conference on Knowledge Discovery and Data Mining (KDD’10). 233--242. Google ScholarDigital Library
Burr Settles and Mark Craven. 2008. An analysis of active learning strategies for sequence labeling tasks. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP’08). 1070--1079. Google ScholarDigital Library
Xiaoxiao Shi, Wei Fan, and Jiangtao Ren. 2008. Actively transfer domain knowledge. In Proceedings of 2008 European Conference on Machine Learning and Knowledge Discovery in Databases (ECML/PKDD’08). 342--357.Google ScholarCross Ref
Xiaodan Song, Yun Chi, Koji Hino, and Belle L. Tseng. 2007. Identifying opinion leaders in the blogosphere. In Proceedings of the 15th ACM International Conference on Information and Knowledge Management (CIKM’06). 971--974. Google ScholarDigital Library
Chenhao Tan, Lillian Lee, Jie Tang, Long Jiang, Ming Zhou, and Ping Li. 2011. User-level sentiment analysis incorporating social networks. In Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’11). 1397--1405. Google ScholarDigital Library
Chenhao Tan, Jie Tang, Jimeng Sun, Quan Lin, and Fengjiao Wang. 2010. Social action tracking via noise tolerant time-varying factor graphs. In Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’10). 1049--1058. Google ScholarDigital Library
Jie Tang, Tiancheng Lou, and Jon Kleinberg. 2012. Inferring social ties across heterogeneous networks. In Proceedings of the 5th ACM International Conference on Web Search and Data Mining (WSDM’12). 743--752. Google ScholarDigital Library
Jie Tang, Jimeng Sun, Chi Wang, and Zi Yang. 2009. Social influence analysis in large-scale networks. In Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’09). 807--816. Google ScholarDigital Library
Jie Tang, Jing Zhang, Limin Yao, Juanzi Li, Li Zhang, and Zhong Su. 2008. ArnetMiner: Extraction and mining of academic social networks. In Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’08). 990--998. Google ScholarDigital Library
Lei Tang and Huan Liu. 2011. Leveraging social media networks for classification. Data Mining and Knowledge Discovery 23, 3 (2011), 447--478. Google ScholarDigital Library
Wenbin Tang, Honglei Zhuang, and Jie Tang. 2011. Learning to Infer Social Ties in Large Networks. In Proceedings of 2011 European Conference on Machine Learning and Knowledge Discovery in Databases (ECML/PKDD’11). 381--397. Google ScholarDigital Library
Benjamin Taskar, Ming Fai Wong, Pieter Abbeel, and Daphne Koller. 2003. Link prediction in relational data. In Proceedings of the 15th Neural Information Processing Systems (NIPS’03).Google Scholar
Martin J. Wainwright and Michael I. Jordan. 2008. Graphical models, exponential families, and variational inference. Foundational Trends in Machine Learning 1, 1--2 (Jan. 2008), 1--305. Google ScholarDigital Library
Chi Wang, Jiawei Han, Yuntao Jia, Jie Tang, Duo Zhang, Yintao Yu, and Jingyi Guo. 2010. Mining advisor-advisee relationships from research publication networks. In Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’10). 203--212. Google ScholarDigital Library
Liaoruo Wang, Tiancheng Lou, Jie Tang, and John Hopcroft. 2011. Detecting community kernels in large social networks. In Proceedings of 2011 IEEE International Conference on Data Mining (ICDM’11). 784--793. Google ScholarDigital Library
Wim Wiegerinck. 2000. Variational approximations between mean field theory and the junction tree algorithm. In Proceedings of the 16th Conference on Uncertainty in Artificial Intelligence (UAI’00). 626--633. Google ScholarDigital Library
Shaomei Wu, J. M. Hofman, W. A. Mason, and D. J. Watts. 2011. Who says what to whom on twitter. In Proceedings of the 20th International Conference on World Wide Web (WWW’11). 705--714. Google ScholarDigital Library
Rongjing Xiang, Jennifer Neville, and Monica Rogati. 2010. Modeling relationship strength in online social networks. In Proceedings of the 19th International Conference on World Wide Web (WWW’10). 981--990. Google ScholarDigital Library
Eric P. Xing, Michael I. Jordan, and Stuart Russell. 2003. A generalized mean field algorithm for variational inference in exponential families. In Proceedings of 19th Conference on Uncertainty in Artificial Intelligence (UAI’03). 583--591. Google ScholarDigital Library
Yang Yang, Yizhou Sun, Jie Tang, Bo Ma, and Juanzi Li. 2015. Entity matching across heterogeneous sources. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’15). 1395--1404. Google ScholarDigital Library
Zi Yang, Jingyi Guo, Keke Cai, Jie Tang, Juanzi Li, Li Zhang, and Zhong Su. 2010. Understanding retweeting behaviors in social networks. In Proceedings of the 19th ACM Conference on Information and Knowledge Management (CIKM’10). 1633--1636. Google ScholarDigital Library
Jiawei Zhang, Xiangnan Kong, and Philip S. Yu. 2013. Predicting social links for new users across aligned heterogeneous social networks. In ICDM’13. 1289--1294.Google Scholar
Yutao Zhang, Jie Tang, Zhilin Yang, Jian Pei, and Philip Yu. 2015. COSNET: Connecting heterogeneous social networks with local and global consistency. In KDD’15. 1485--1494. Google ScholarDigital Library
Honglei Zhuang, Jie Tang, Wenbin Tang, Tiancheng Lou, Alvin Chin, and Xia Wang. 2012. Actively learning to infer social ties. Data Mining and Knowledge Discovery 25, 2 (2012), 270--297. Google ScholarDigital Library

Index Terms

Transfer Learning to Infer Social Ties across Heterogeneous Networks

Recommendations

Inferring social ties across heterogenous networks
WSDM '12: Proceedings of the fifth ACM international conference on Web search and data mining

It is well known that different types of social ties have essentially different influence on people. However, users in online social networks rarely categorize their contacts into "family", "colleagues", or "classmates". While a bulk of research has ...
Read More
Learning to predict reciprocity and triadic closure in social networks

We study how links are formed in social networks. In particular, we focus on investigating how a reciprocal (two-way) link, the basic relationship in social networks, is developed from a parasocial (one-way) relationship and how the relationships ...
Read More
Actively learning to infer social ties

We study the extent to which social ties between people can be inferred in large social network, in particular via active user interactions. In most online social networks, relationships are lack of meaning labels (e.g., "colleague" and "intimate ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Article

Published in
ACM Transactions on Information Systems Volume 34, Issue 2
April 2016
220 pages
ISSN:1046-8188
EISSN:1558-2868
DOI:10.1145/2891107
Editor:
Maarten de Rijke
University of Amsterdam, The Netherlands
Issue’s Table of Contents
Copyright © 2016 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 13 April 2016
- Accepted: 1 March 2015
- Revised: 1 January 2015
- Received: 1 June 2014
Published in tois Volume 34, Issue 2

Permissions
Request permissions about this article.
Request Permissions

Check for updates
Author Tags
Social ties
predictive model
social influence
social network
Qualifiers
- research-article
- Research
- Refereed
Conference
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 38
  Total Citations
  View Citations
- 840
  Total Downloads
- Downloads (Last 12 months)51
- Downloads (Last 6 weeks)7
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Transfer Learning to Infer Social Ties across Heterogeneous Networks

ACM Transactions on Information Systems

Abstract

References

Cited By

Index Terms

Recommendations

Inferring social ties across heterogenous networks

Learning to predict reciprocity and triadic closure in social networks

Actively learning to infer social ties