ABSTRACT
Demographics are widely used in marketing to characterize different types of customers. In practice, however, demographic information such as age, gender, and location is usually unavailable because of privacy and other constraints. In this paper, we aim to harness the power of big data to automatically infer users' demographics from their daily mobile communication patterns. Our study is based on a large real-world mobile network of more than 7,000,000 users and over 1,000,000,000 communication records (CALL and SMS). We discover several interesting social strategies that mobile users frequently use to maintain their social connections. First, young people are very active in broadening their social circles, while seniors tend to keep fewer but more stable connections. Second, female users pay more attention to cross-generation interactions than male users do, though interactions between male and female users are frequent. Third, we find, for the first time, a same-gender triadic pattern that persists over one's lifetime, while more complex opposite-gender triadic patterns appear only among young people.
We further study to what extent users' demographics can be inferred from their mobile communications. As a special case, we formalize a problem of double dependent-variable prediction: inferring user gender and age simultaneously. We propose the WhoAmI method, a Double Dependent-Variable Factor Graph Model, to address this problem by considering not only the effects of features on gender and age, but also the interrelation between gender and age. Our experiments show that the proposed WhoAmI method significantly improves prediction accuracy, by up to 10%, compared with several alternative methods.
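To make the idea of predicting two dependent variables jointly more concrete, the following is a minimal toy sketch, not the WhoAmI model itself. It scores every (gender, age) pair for a single user as the sum of a gender score, an age score, and a coupling term between the two labels, then normalizes over all pairs. The label sets, the score values, and the names `joint_posterior` and `predict` are all assumptions made for illustration; the network structure and learning procedure of the actual method are not modeled here.

```python
import itertools
import math

# Hypothetical label sets, chosen only for this illustration.
GENDERS = ("female", "male")
AGES = ("young", "middle", "senior")


def joint_posterior(gender_scores, age_scores, coupling):
    """Return P(gender, age) proportional to exp(sum of factor scores).

    gender_scores / age_scores: per-label scores, e.g. from user features.
    coupling: optional score for specific (gender, age) pairs, standing in
    for the interrelation between the two dependent variables.
    """
    unnormalized = {}
    for g, a in itertools.product(GENDERS, AGES):
        score = gender_scores[g] + age_scores[a] + coupling.get((g, a), 0.0)
        unnormalized[(g, a)] = math.exp(score)
    z = sum(unnormalized.values())  # normalization constant
    return {pair: value / z for pair, value in unnormalized.items()}


def predict(gender_scores, age_scores, coupling):
    """Pick the most probable (gender, age) pair under the joint model."""
    posterior = joint_posterior(gender_scores, age_scores, coupling)
    return max(posterior, key=posterior.get)


# Made-up scores for one user (assumed values, not from the paper).
gender_scores = {"female": 0.2, "male": 0.0}
age_scores = {"young": 0.5, "middle": 0.4, "senior": 0.0}

# Without coupling, each label is effectively chosen independently.
independent = predict(gender_scores, age_scores, {})

# With a coupling term, the joint choice can differ from the independent one.
coupled = predict(gender_scores, age_scores, {("female", "middle"): 1.0})
```

In this toy setting the independent choice is ("female", "young"), while the coupling term shifts the joint choice to ("female", "middle"). This shows how modeling the interrelation between the two variables can change a prediction that feature scores alone would make.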
Index Terms
- Inferring user demographics and social strategies in mobile social networks