2017 | OriginalPaper | Book chapter

Detecting User Occupations on Microblogging Platforms: An Experimental Study

Written by: Xia Lv, Peiquan Jin, Lin Mu, Shouhong Wan, Lihua Yue

Published in: Web and Big Data

Publisher: Springer International Publishing

Abstract

User occupation refers to the professional position of a user in the real world. It is helpful for a number of applications, e.g., personalized recommendation and targeted advertising. However, because of the risk of privacy leaks, many users do not provide their occupation information on microblogging platforms, which makes user occupations hard to detect there. In this paper, we conduct an experimental study on this issue. In particular, we propose an experimental framework for detecting user occupations on microblogging platforms. We first implement a number of classification models and devise various sets of features for user occupation detection. Then, we propose to construct an occupation-oriented lexicon, which is collected by an iterative extension algorithm that considers the semantic similarity and importance of words. We combine the lexicon with the word embedding approach to detect user occupations. We conduct comprehensive experiments and present a set of experimental results. The results show that the lexicon-based word embedding method achieves higher accuracy than traditional feature-based classification models.
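
The abstract does not spell out the iterative extension algorithm, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' method. It assumes a lexicon grown from seed words, a candidate score equal to its best cosine similarity to any current lexicon word multiplied by an importance weight, and a fixed threshold. The function names (`expand_lexicon`, `user_vector`), the scoring rule, the threshold, and the toy vectors and weights are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def expand_lexicon(seeds, vectors, importance, threshold=0.8, max_rounds=10):
    """Grow a lexicon from seed words (assumed scoring rule, see lead-in).

    A candidate joins when its best similarity to any current lexicon word,
    multiplied by its importance weight, reaches the threshold. The loop
    stops when a round adds nothing or after max_rounds.
    """
    lexicon = {w for w in seeds if w in vectors}
    if not lexicon:
        return lexicon
    for _ in range(max_rounds):
        added = set()
        for word, vec in vectors.items():
            if word in lexicon:
                continue
            best = max(cosine(vec, vectors[m]) for m in lexicon)
            if best * importance.get(word, 0.0) >= threshold:
                added.add(word)
        if not added:
            break
        lexicon |= added
    return lexicon


def user_vector(tokens, lexicon, vectors):
    """Average the embeddings of a user's tokens that appear in the lexicon."""
    hits = [vectors[t] for t in tokens if t in lexicon]
    if not hits:
        return None  # no lexicon word in this user's text
    dim = len(hits[0])
    return [sum(v[i] for v in hits) / len(hits) for i in range(dim)]


# Toy 2-d embeddings and importance weights (made up for illustration).
vectors = {
    "doctor": [1.0, 0.0],
    "nurse": [0.9, 0.1],
    "clinic": [0.8, 0.3],
    "pizza": [0.0, 1.0],
    "the": [0.7, 0.7],
}
importance = {"nurse": 1.0, "clinic": 0.9, "pizza": 1.0, "the": 0.1}

lexicon = expand_lexicon(["doctor"], vectors, importance)
print(sorted(lexicon))  # ['clinic', 'doctor', 'nurse']

profile = user_vector(["the", "nurse", "pizza", "clinic"], lexicon, vectors)
print(profile)
```

In this sketch the importance weight keeps a frequent but uninformative word such as "the" out of the lexicon even though its vector is close to lexicon words. The resulting per-user vector could then be passed to any classifier; which classifier the chapter uses is not stated in the abstract.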

Metadata
Title
Detecting User Occupations on Microblogging Platforms: An Experimental Study
Written by
Xia Lv
Peiquan Jin
Lin Mu
Shouhong Wan
Lihua Yue
Copyright year
2017
DOI
https://doi.org/10.1007/978-3-319-63579-8_26