Abstract
Users with demographic profiles in social networks offer the potential to understand the social principles that underpin our highly connected world, from individuals, to groups, to societies. In this article, we harness the power of network and data sciences to model the interplay between user demographics and social behavior, and we further study to what extent users’ demographic profiles can be inferred from their mobile communication patterns. By modeling over 7 million users and 1 billion mobile communication records, we find that during the active dating period (i.e., 18 to 35 years old), users actively broaden their social connections with males and females alike, while after reaching 35 years of age people tend to keep small, closed, and same-gender social circles. Further, we formalize the demographic prediction problem of inferring users’ gender and age simultaneously. We propose a factor graph-based WhoAmI method to address the problem by leveraging not only the correlations between network features and users’ gender/age, but also the interrelations between gender and age. In addition, we identify a new problem, coupled network demographic prediction across multiple mobile operators, and present a coupled variant of the WhoAmI method to address its unique challenges. Our extensive experiments demonstrate the effectiveness, scalability, and applicability of the WhoAmI methods. Finally, our study finds a potential predictability of greater than 80% for inferring users’ gender from phone call behavior and of 73% for inferring users’ age from text messaging interactions.
Index Terms
- User Modeling on Demographic Attributes in Big Mobile Social Networks