Published in: Social Network Analysis and Mining 1/2020

1 December 2020 | Original Article

Discovering patterns of customer financial behavior using social media data

Authors: Alexander Kalinin, Danila Vaganov, Klavdiya Bochenina

Abstract

Social networks are a rich source of information that reflects people’s real lives in the digital space. This makes it possible to infer various aspects of a user’s socioeconomic behavior, even when the user does not state them explicitly. In this study, we combine two data sources: the Russian online social network VK.com, which is analogous to the global Facebook platform, and supplementary financial information provided by a bank. Combining online social media data with debit card transactions, we train machine learning models to infer the socioeconomic status (SES) of a user, as well as six purchasing patterns that each characterize a particular type of customer transactional activity. Namely, we detect whether a user is a driver, parent, gamer, or traveler, and whether he/she prefers to make purchases at night or in the morning. SES is defined as average monthly expenses and is treated as a real-valued variable. The following features are extracted as predictors: demographic information from the user’s page, the user’s participation in communities, the topics of those communities, text embeddings of user posts, topological characteristics, and graph embeddings of nodes in the friendship graph. The results show the superiority of graph embeddings in both classification and regression tasks (median absolute percentage error MedAPE = 29.7 for SES). Moreover, for drivers (Macro-\(F_1=0.688\)) and parents (Macro-\(F_1=0.679\)), higher scores are reached by concatenating different features. In addition, we investigate feature importance values and find that the topics of a user’s communities and the structure of the user’s network influence the model more strongly than other features. The study demonstrates the power of online social media data for inferring user socioeconomic attributes.
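The two evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes the conventional definitions: MedAPE as the median of \(|y - \hat{y}| / |y| \cdot 100\), and Macro-\(F_1\) as the unweighted mean of per-class \(F_1\) scores. The abstract does not spell out the exact formulation used in the study, so the function names `medape` and `macro_f1` and the sample values below are purely illustrative.

```python
from statistics import median


def medape(y_true, y_pred):
    """Median absolute percentage error, expressed in percent.

    Assumes the conventional definition; true values must be non-zero.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if any(t == 0 for t in y_true):
        raise ValueError("true values must be non-zero")
    errors = [abs(t - p) * 100.0 / abs(t) for t, p in zip(y_true, y_pred)]
    return median(errors)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    scores = []
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical monthly expenses (regression) and binary labels (classification).
expenses_true = [100.0, 200.0, 400.0]
expenses_pred = [110.0, 150.0, 400.0]
labels_true = [1, 1, 0, 0]
labels_pred = [1, 0, 0, 0]

print(medape(expenses_true, expenses_pred))
print(macro_f1(labels_true, labels_pred))
```

Because MedAPE takes the median rather than the mean, a few users with extreme relative errors do not dominate the score, which may be one reason to prefer it for a skewed quantity such as monthly expenses.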

Footnotes
1
We use the comprehensive collection of stop words for the Russian language, which is available at https://github.com/stopwords-iso/stopwords-ru.
 
Metadata
Title
Discovering patterns of customer financial behavior using social media data
Authors
Alexander Kalinin
Danila Vaganov
Klavdiya Bochenina
Publication date
1 December 2020
Publisher
Springer Vienna
Published in
Social Network Analysis and Mining / Issue 1/2020
Print ISSN: 1869-5450
Electronic ISSN: 1869-5469
DOI
https://doi.org/10.1007/s13278-020-00690-3
