Published in: Cognitive Computation 4/2019

19-01-2019

Improving User Attribute Classification with Text and Social Network Attention

Authors: Yumeng Li, Liang Yang, Bo Xu, Jian Wang, Hongfei Lin


Abstract

User attribute classification is an important research topic in social media user profiling and has great commercial value in modern advertising systems. Existing research on user profiling has mostly relied on manually handcrafted features for each attribute classification task, and has partly overlooked the social relations among users. We propose an end-to-end neural network model called the social convolution attention neural network. The model uses a convolution attention mechanism to automatically extract, from social texts, user features relevant to each attribute. By combining semantic context with social network information, it captures the social relations of users and improves attribute classification performance. We evaluate the model on gender, age, and geography classification tasks using the dataset from the SMP CUP 2016 competition. The experimental results demonstrate that the proposed model is effective for automatic user attribute classification, with a particular focus on fine-grained user information. In summary, combining the convolution attention mechanism with social relation information yields an effective model that can significantly improve accuracy across user attribute classification tasks.
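The abstract does not give the architecture in detail, so the following is only a minimal sketch of the general idea, not the authors' model. It assumes that a one-dimensional convolution filter scores each word from its surrounding window, that the softmax of those scores weights the word vectors into a single text vector, and that this vector is concatenated with a precomputed social network embedding. The names `conv_attention_pool` and `user_features`, the kernel shape, and all numeric values are hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def conv_attention_pool(word_vecs, kernel, bias=0.0):
    """Pool word vectors into one text vector using convolution-based attention.

    word_vecs: list of T word embeddings, each a list of d floats.
    kernel:    list of k rows (k odd), each a list of d floats; a hypothetical
               1-D convolution filter that scores each window of k words.
    """
    k = len(kernel)
    d = len(word_vecs[0])
    pad = k // 2
    zero = [0.0] * d
    # Zero-pad both ends so every word has a full window around it.
    padded = [zero] * pad + list(word_vecs) + [zero] * pad

    # One attention score per word position, computed from its window.
    scores = []
    for t in range(len(word_vecs)):
        s = bias
        for j in range(k):
            s += sum(w * x for w, x in zip(kernel[j], padded[t + j]))
        scores.append(s)

    weights = softmax(scores)
    # Attention-weighted sum of the original word vectors.
    pooled = [sum(a * vec[i] for a, vec in zip(weights, word_vecs))
              for i in range(d)]
    return pooled, weights


def user_features(word_vecs, social_vec, kernel):
    """Concatenate the attended text vector with a social network embedding."""
    text_vec, weights = conv_attention_pool(word_vecs, kernel)
    return text_vec + list(social_vec), weights


# Toy inputs (assumed): 6 words with 4-dimensional embeddings, a window of 3,
# and a 3-dimensional node embedding standing in for social network information.
rng = random.Random(0)
words = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(6)]
kernel = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(3)]
social = [0.1, -0.2, 0.3]
features, attn = user_features(words, social, kernel)
```

In a trained model the kernel would be learned per attribute (gender, age, geography), and `features` would feed a classifier; both steps are omitted here.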


Footnotes
1
We also tried adding convolutional neural network (CNN) and long short-term memory (LSTM) layers to encode word representations before the attention layer, but this brought no improvement and increased the time cost. The CNN and LSTM layers remain commented out in our code for future optimization.
 
Metadata
Title
Improving User Attribute Classification with Text and Social Network Attention
Authors
Yumeng Li
Liang Yang
Bo Xu
Jian Wang
Hongfei Lin
Publication date
19-01-2019
Publisher
Springer US
Published in
Cognitive Computation / Issue 4/2019
Print ISSN: 1866-9956
Electronic ISSN: 1866-9964
DOI
https://doi.org/10.1007/s12559-019-9624-y
