Published in: Social Network Analysis and Mining 4/2013

1 December 2013 | Original Article

On the impact of text similarity functions on hashtag recommendations in microblogging environments

Authors: Eva Zangerle, Wolfgang Gassler, Günther Specht

Abstract

Microblogging applications such as Twitter are experiencing tremendous success. Microblog users employ hashtags to categorize their posts, with the aim of bringing order to the myriad of microblog messages. However, only a small percentage of messages contain hashtags, and the hashtags in use are very heterogeneous, since they may be chosen freely and may consist of any arbitrary combination of characters. This heterogeneity and the sparse use of hashtags significantly impair search functionality, as messages are not categorized in a homogeneous way. In this paper, we present an approach for recommending hashtags suitable for the message the user is currently entering, with the aim of creating a more homogeneous set of hashtags. Furthermore, we present a detailed study of how the similarity measures used to compute recommendations influence the final set of recommended hashtags.
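
To make the general idea of similarity-based hashtag recommendation concrete, the following is a minimal sketch and not the authors' implementation. It assumes a small in-memory list of past messages, whitespace tokenization, and Jaccard similarity over word sets as one possible similarity measure. The function names (split_message, jaccard, recommend_hashtags) and the rule of scoring each hashtag by its most similar message are hypothetical choices made only for illustration.

```python
from collections import defaultdict


def split_message(text):
    """Split a message into a set of plain words and a set of hashtags."""
    words, hashtags = set(), set()
    for token in text.lower().split():
        token = token.strip(".,!?;:")
        if token.startswith("#") and len(token) > 1:
            hashtags.add(token)
        elif token:
            words.add(token)
    return words, hashtags


def jaccard(a, b):
    """Jaccard similarity of two sets; defined as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def recommend_hashtags(draft, corpus, k=5):
    """Rank hashtags from past messages by similarity to the draft message.

    Assumption: each hashtag is scored by the highest similarity of any
    past message that contains it. Ties are broken alphabetically.
    """
    draft_words, _ = split_message(draft)
    scores = defaultdict(float)
    for message in corpus:
        words, hashtags = split_message(message)
        sim = jaccard(draft_words, words)
        if sim == 0.0:
            continue  # no shared words, so this message contributes nothing
        for tag in hashtags:
            scores[tag] = max(scores[tag], sim)
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [tag for tag, _ in ranked[:k]]


corpus = [
    "great talk on recommender systems #recsys",
    "coffee time #monday",
    "new recommender systems paper #recsys #research",
]
suggestions = recommend_hashtags("reading a paper on recommender systems", corpus)
```

Replacing jaccard with another set- or vector-based measure changes only the scoring step, which is the kind of variation whose effect on the recommended hashtags the study examines.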

Metadata
Title
On the impact of text similarity functions on hashtag recommendations in microblogging environments
Authors
Eva Zangerle
Wolfgang Gassler
Günther Specht
Publication date
1 December 2013
Publisher
Springer Vienna
Published in
Social Network Analysis and Mining / Issue 4/2013
Print ISSN: 1869-5450
Electronic ISSN: 1869-5469
DOI
https://doi.org/10.1007/s13278-013-0108-x
