Published in: Knowledge and Information Systems 1/2021

14-10-2020 | Regular Paper

Hashtag recommendation for short social media texts using word-embeddings and external knowledge

Authors: Nagendra Kumar, Eshwanth Baskaran, Anand Konjengbam, Manish Singh


Abstract

With the rapid growth of Twitter in recent years, the number of tweets generated by users has increased tremendously. Twitter lets users add hashtags to facilitate effective categorization and retrieval of tweets. Despite the usefulness of hashtags, a major fraction of tweets do not contain any. Several methods have been proposed to recommend hashtags based on lexical and topical features of tweets. However, existing methods have rarely addressed semantic features or data sparsity in tweet representation. In this paper, we propose a novel method for hashtag recommendation that resolves the data sparsity problem by exploiting the most relevant tweet information from external knowledge sources. In addition to lexical and topical features, the proposed method incorporates semantic features based on word-embeddings and a user influence feature based on users’ influential position. To gain the advantage of hashtag recommendation methods based on different features, the proposed method aggregates these methods using learning-to-rank and generates top-ranked hashtags. Experimental results show that the proposed method significantly outperforms the current state-of-the-art methods.
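
To make the idea of embedding-based semantic scoring and score aggregation concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy table of hand-written 3-dimensional word vectors (`TOY_VECTORS`), represents a tweet as the mean of its word vectors, scores candidate hashtags by cosine similarity, and combines per-feature scores with a hand-picked weighted sum. The weighted sum only stands in for a trained learning-to-rank model; all names, vectors, and weights here are hypothetical.

```python
import math

# Hypothetical toy word vectors; a real system would load pretrained embeddings.
TOY_VECTORS = {
    "football": [0.9, 0.1, 0.0],
    "match": [0.8, 0.2, 0.1],
    "goal": [0.7, 0.1, 0.2],
    "election": [0.0, 0.9, 0.1],
    "vote": [0.1, 0.8, 0.2],
    "sports": [0.85, 0.1, 0.05],
    "politics": [0.05, 0.9, 0.1],
}


def mean_vector(words):
    """Average the vectors of the known words; return None if none are known."""
    known = [TOY_VECTORS[w] for w in words if w in TOY_VECTORS]
    if not known:
        return None
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def semantic_scores(tweet_words, candidate_tags):
    """Score each candidate hashtag by similarity to the tweet's mean vector."""
    tweet_vec = mean_vector(tweet_words)
    scores = {}
    for tag in candidate_tags:
        tag_vec = mean_vector([tag])
        if tweet_vec is None or tag_vec is None:
            scores[tag] = 0.0
        else:
            scores[tag] = cosine(tweet_vec, tag_vec)
    return scores


def aggregate(score_dicts, weights):
    """Combine per-feature scores with fixed weights and rank the hashtags.

    Assumption: a trained learning-to-rank model would replace this
    hand-weighted sum in a real system.
    """
    combined = {}
    for scores, weight in zip(score_dicts, weights):
        for tag, score in scores.items():
            combined[tag] = combined.get(tag, 0.0) + weight * score
    return sorted(combined, key=combined.get, reverse=True)


semantic = semantic_scores(["football", "match", "goal"], ["sports", "politics"])
# Hypothetical scores from another feature-based recommender (e.g. topical).
topical = {"sports": 0.2, "politics": 0.9}
ranking = aggregate([semantic, topical], weights=[0.7, 0.3])
print(ranking)
```

In this sketch the semantic score favours `sports` strongly enough to outrank `politics` even though the hypothetical topical scores prefer the latter, which illustrates why combining recommenders built on different features can change the final ranking.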


Metadata
Title
Hashtag recommendation for short social media texts using word-embeddings and external knowledge
Authors
Nagendra Kumar
Eshwanth Baskaran
Anand Konjengbam
Manish Singh
Publication date
14-10-2020
Publisher
Springer London
Published in
Knowledge and Information Systems / Issue 1/2021
Print ISSN: 0219-1377
Electronic ISSN: 0219-3116
DOI
https://doi.org/10.1007/s10115-020-01515-7
