
01-12-2020 | Original Article

Discovering patterns of customer financial behavior using social media data

Authors: Alexander Kalinin, Danila Vaganov, Klavdiya Bochenina

Published in: Social Network Analysis and Mining | Issue 1/2020


Abstract

Social networks are a rich source of information that reflects people’s real lives in the digital space. This makes it possible to infer various aspects of a user’s socioeconomic behavior even when the user does not state them explicitly. In this study, we combine two data sources: the Russian online social network VK.com, which is analogous to the global Facebook platform, and supplementary financial information provided by a bank. Combining online social media data with debit card transactions, we train machine learning models to infer the socioeconomic status (SES) of a user, as well as six purchasing patterns that characterize particular types of customer transactional activity. Namely, we detect whether a user is a driver, parent, gamer, or traveler, and whether he/she prefers to make purchases at night or in the morning. SES is defined as average monthly expenses and treated as a real-valued variable. The following features are extracted as predictors: demographic information from a user’s page, user participation in communities, topics of those communities, text embeddings of user posts, topological characteristics, and graph embeddings of nodes in the friendship graph. The results show the superiority of graph embeddings in both classification and regression tasks (median absolute percentage error MedAPE = 29.7 for SES). Moreover, for drivers (Macro-\(F_1=0.688\)) and parents (Macro-\(F_1=0.679\)), higher scores are reached by concatenating different features. In addition, we investigate feature importance values and find that the topics of a user’s communities and the structure of the user’s network influence the model more strongly than other features. The study demonstrates the power of online social media data for inferring user socioeconomic attributes.
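
The two evaluation metrics named in the abstract can be illustrated with a minimal sketch. It uses the standard textbook definitions of median absolute percentage error and macro-averaged \(F_1\); it is not the authors' code, and it assumes that the reported MedAPE is expressed in percent. The function names `medape` and `macro_f1` and the toy values are hypothetical.

```python
from statistics import median


def medape(y_true, y_pred):
    """Median absolute percentage error, in percent (assumed convention).

    Every true value must be non-zero, since each error is divided by it.
    """
    return 100.0 * median(abs(t - p) / abs(t) for t, p in zip(y_true, y_pred))


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); define it as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy regression targets standing in for average monthly expenses (made-up values).
expenses_true = [100.0, 200.0, 400.0]
expenses_pred = [120.0, 150.0, 400.0]
print(medape(expenses_true, expenses_pred))

# Toy binary labels standing in for one purchasing pattern, e.g. "driver" (made-up values).
driver_true = [1, 1, 0, 0]
driver_pred = [1, 0, 0, 0]
print(macro_f1(driver_true, driver_pred))
```

Because the median ignores the tails of the error distribution, MedAPE is less sensitive than the mean percentage error to a few users with extreme spending. Macro averaging gives each class equal weight, which matters when a pattern such as "driver" covers only a minority of users.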


Footnotes
1
We use the comprehensive collection of stop-words for the Russian language, which is available at “https://github.com/stopwords-iso/stopwords-ru”.
 
Metadata
Title
Discovering patterns of customer financial behavior using social media data
Authors
Alexander Kalinin
Danila Vaganov
Klavdiya Bochenina
Publication date
01-12-2020
Publisher
Springer Vienna
Published in
Social Network Analysis and Mining / Issue 1/2020
Print ISSN: 1869-5450
Electronic ISSN: 1869-5469
DOI
https://doi.org/10.1007/s13278-020-00690-3
