Published in: Knowledge and Information Systems 1/2021

14.10.2020 | Regular Paper

Hashtag recommendation for short social media texts using word-embeddings and external knowledge

Authors: Nagendra Kumar, Eshwanth Baskaran, Anand Konjengbam, Manish Singh

Abstract

With the rapid growth of Twitter in recent years, the number of tweets generated by users has increased tremendously. Twitter lets users add hashtags to support effective categorization and retrieval of tweets. Despite the usefulness of hashtags, a large fraction of tweets do not contain any. Several methods have been proposed to recommend hashtags based on lexical and topical features of tweets. However, existing methods have rarely addressed semantic features or data sparsity in tweet representation. In this paper, we propose a novel method for hashtag recommendation that resolves the data sparsity problem by exploiting the most relevant tweet information from external knowledge sources. In addition to lexical and topical features, the proposed method incorporates semantic features based on word embeddings and a user influence feature based on users' influential position. To combine the strengths of hashtag recommendation methods built on different features, the proposed method aggregates them using learning-to-rank and generates top-ranked hashtags. Experimental results show that the proposed method significantly outperforms the current state-of-the-art methods.
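As a rough illustration of the idea in the abstract, the sketch below scores candidate hashtags by the cosine similarity between an averaged tweet embedding and a hashtag embedding, then blends that semantic score with a second feature score. It is not the authors' implementation. The toy vectors, the function names, and the fixed weights are all assumptions made for the example, and the fixed weighted sum only stands in for a trained learning-to-rank model.

```python
import math

# Toy 3-dimensional word vectors. These values are hypothetical and only
# serve the illustration; real embeddings would come from a trained model.
EMBEDDINGS = {
    "goal": [0.8, 0.2, 0.1],
    "match": [0.7, 0.3, 0.0],
    "vote": [0.1, 0.8, 0.3],
    "soccer": [0.85, 0.15, 0.05],
    "politics": [0.05, 0.9, 0.25],
}


def mean_vector(words):
    """Average the vectors of the known words; return None if none are known."""
    vectors = [EMBEDDINGS[w] for w in words if w in EMBEDDINGS]
    if not vectors:
        return None
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def semantic_score(tweet_words, hashtag):
    """Similarity between the averaged tweet vector and the hashtag vector."""
    tweet_vec = mean_vector(tweet_words)
    tag_vec = EMBEDDINGS.get(hashtag)
    if tweet_vec is None or tag_vec is None:
        return 0.0
    return cosine(tweet_vec, tag_vec)


def rank_hashtags(tweet_words, candidates, lexical_scores, weights=(0.7, 0.3)):
    """Rank candidates by a fixed weighted sum of semantic and lexical scores.

    The fixed weights are an assumption standing in for a learned ranker.
    """
    w_sem, w_lex = weights
    scored = [
        (w_sem * semantic_score(tweet_words, tag)
         + w_lex * lexical_scores.get(tag, 0.0), tag)
        for tag in candidates
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [tag for _, tag in scored]


ranking = rank_hashtags(["goal", "match"], ["politics", "soccer"], {})
```

In this toy setup a tweet containing "goal" and "match" lies much closer to "soccer" than to "politics", so "soccer" ranks first. A learning-to-rank model would learn the feature weights from labelled tweet and hashtag pairs instead of fixing them by hand.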


Metadata
Title
Hashtag recommendation for short social media texts using word-embeddings and external knowledge
Authors
Nagendra Kumar
Eshwanth Baskaran
Anand Konjengbam
Manish Singh
Publication date
14.10.2020
Publisher
Springer London
Published in
Knowledge and Information Systems / Issue 1/2021
Print ISSN: 0219-1377
Electronic ISSN: 0219-3116
DOI
https://doi.org/10.1007/s10115-020-01515-7
