ABSTRACT
Tagging systems have become major infrastructures on the Web. They allow users to create tags that annotate and categorize content and to share them with other users, which is particularly helpful for searching multimedia content. However, because tagging is constrained neither by a controlled vocabulary nor by annotation guidelines, tags tend to be noisy and sparse. New resources annotated by only a few users, in particular, often have rather idiosyncratic tags that do not reflect a common perspective useful for search. In this paper we introduce an approach based on Latent Dirichlet Allocation (LDA) for recommending tags for resources in order to improve search. Resources annotated by many users, and thus equipped with a fairly stable and complete tag set, are used to elicit latent topics, to which new resources with only a few tags are then mapped. Based on this mapping, other tags belonging to a topic can be recommended for the new resource. Our evaluation shows that the approach achieves significantly better precision and recall than the use of association rules, suggested in previous work, and that it also recommends more specific tags. Moreover, extending resources with these recommended tags significantly improves search for new resources.
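The pipeline described in the abstract (learn latent topics from well-tagged resources, map a sparsely tagged resource onto those topics, recommend further tags from them) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and not the authors' implementation: the abstract does not state the inference method, so collapsed Gibbs sampling is used here; the fold-in step, the function names `train_lda` and `recommend`, the hyperparameter values, and the toy corpus are all hypothetical.

```python
import random
from collections import Counter


def train_lda(docs, num_topics, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Fit LDA by collapsed Gibbs sampling; each doc is one resource's tag list."""
    rng = random.Random(seed)
    vocab = sorted({tag for doc in docs for tag in doc})
    vocab_size = len(vocab)
    topic_tag = [Counter() for _ in range(num_topics)]  # count of tag in topic
    topic_total = [0] * num_topics                      # tags assigned to topic
    doc_topic = [[0] * num_topics for _ in docs]        # topic counts per doc

    def bump(d, z, tag, delta):
        topic_tag[z][tag] += delta
        topic_total[z] += delta
        doc_topic[d][z] += delta

    # Random initial topic assignment for every tag occurrence.
    assign = []
    for d, doc in enumerate(docs):
        zs = [rng.randrange(num_topics) for _ in doc]
        for tag, z in zip(doc, zs):
            bump(d, z, tag, +1)
        assign.append(zs)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, tag in enumerate(doc):
                bump(d, assign[d][i], tag, -1)  # remove current assignment
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_tag[k][tag] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = z
                bump(d, z, tag, +1)

    # phi[k][tag] estimates P(tag | topic k).
    phi = [
        {t: (topic_tag[k][t] + beta) / (topic_total[k] + vocab_size * beta)
         for t in vocab}
        for k in range(num_topics)
    ]
    return phi, vocab


def recommend(phi, vocab, known_tags, top_n=3, alpha=0.1, em_steps=20):
    """Map a sparsely tagged resource to topics, then rank unseen tags."""
    num_topics = len(phi)
    known = [t for t in known_tags if t in phi[0]]  # ignore unknown tags
    theta = [1.0 / num_topics] * num_topics         # P(topic | resource)
    for _ in range(em_steps):                       # simple fold-in (assumed)
        counts = [alpha] * num_topics
        for t in known:
            resp = [theta[k] * phi[k][t] for k in range(num_topics)]
            total = sum(resp)
            for k in range(num_topics):
                counts[k] += resp[k] / total
        norm = sum(counts)
        theta = [c / norm for c in counts]
    # Score each candidate tag by sum over topics of P(tag|topic) * P(topic|resource).
    scores = {
        t: sum(theta[k] * phi[k][t] for k in range(num_topics))
        for t in vocab if t not in known_tags
    }
    return sorted(scores.items(), key=lambda kv: -kv[1])[:top_n]


# Hypothetical toy corpus: tag sets of well-annotated resources.
docs = [
    ["python", "programming", "code", "tutorial"],
    ["python", "code", "software", "programming"],
    ["programming", "software", "tutorial", "code"],
    ["photo", "camera", "lens", "nikon"],
    ["photo", "nikon", "camera", "travel"],
    ["camera", "lens", "photo", "travel"],
]
phi, vocab = train_lda(docs, num_topics=2)
recs = recommend(phi, vocab, known_tags=["python", "code"])
print(recs)
```

In this sketch the new resource carries only the tags `python` and `code`; `recommend` returns the highest-scoring tags it does not yet have. In practice the number of topics and the hyperparameters would need tuning on real tagging data.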