DOI: 10.1145/1639714.1639726
research-article

Latent Dirichlet allocation for tag recommendation

Published: 23 October 2009

ABSTRACT

Tagging systems have become major infrastructures on the Web. They allow users to create tags that annotate and categorize content and to share them with other users, which is particularly helpful for searching multimedia content. However, because tagging is not constrained by a controlled vocabulary or annotation guidelines, tags tend to be noisy and sparse. New resources annotated by only a few users, in particular, often have rather idiosyncratic tags that do not reflect a common perspective useful for search. In this paper we introduce an approach based on Latent Dirichlet Allocation (LDA) for recommending tags for resources in order to improve search. Resources annotated by many users, and thus equipped with a fairly stable and complete tag set, are used to elicit latent topics, to which new resources with only a few tags are mapped. Based on this mapping, other tags belonging to a topic can be recommended for the new resource. Our evaluation shows that the approach achieves significantly better precision and recall than association rules, which were suggested in previous work, and that it also recommends more specific tags. Moreover, extending resources with these recommended tags significantly improves search for new resources.
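
The pipeline described in the abstract (learn latent topics from well-tagged resources, map a sparsely tagged resource onto those topics, then recommend further tags from them) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy tag sets, the names `fit_topics` and `recommend_tags`, the hyperparameters, and the choice of a small collapsed Gibbs sampler for training. The mapping step uses a simple fixed-topic EM update as a stand-in, since the abstract does not say how new resources are mapped to topics.

```python
import random

# Hypothetical toy data: each inner list is the tag set of one well-tagged resource.
TRAINING_TAG_SETS = [
    ["python", "programming", "code", "tutorial"],
    ["python", "code", "software", "programming"],
    ["programming", "software", "tutorial", "code"],
    ["photo", "camera", "landscape", "nature"],
    ["photo", "nature", "travel", "camera"],
    ["landscape", "travel", "photo", "nature"],
]


def fit_topics(tag_sets, n_topics=2, alpha=0.1, beta=0.01, sweeps=200, seed=0):
    """Estimate P(tag | topic) with a collapsed Gibbs sampler over tag sets."""
    rng = random.Random(seed)
    vocab = sorted({tag for tags in tag_sets for tag in tags})
    topic_tag = [dict.fromkeys(vocab, 0) for _ in range(n_topics)]
    topic_total = [0] * n_topics
    res_topic = [[0] * n_topics for _ in tag_sets]
    assignment = []

    def update(r, tag, z, delta):
        # Add or remove one tag occurrence from all count tables.
        topic_tag[z][tag] += delta
        topic_total[z] += delta
        res_topic[r][z] += delta

    # Random initial topic for every tag occurrence.
    for r, tags in enumerate(tag_sets):
        assignment.append([rng.randrange(n_topics) for _ in tags])
        for tag, z in zip(tags, assignment[r]):
            update(r, tag, z, +1)

    for _ in range(sweeps):
        for r, tags in enumerate(tag_sets):
            for i, tag in enumerate(tags):
                update(r, tag, assignment[r][i], -1)
                # Conditional probability of each topic given all other assignments.
                weights = [
                    (res_topic[r][z] + alpha)
                    * (topic_tag[z][tag] + beta)
                    / (topic_total[z] + beta * len(vocab))
                    for z in range(n_topics)
                ]
                assignment[r][i] = rng.choices(range(n_topics), weights)[0]
                update(r, tag, assignment[r][i], +1)

    # Smoothed point estimate of P(tag | topic) from the final counts.
    return [
        {t: (topic_tag[z][t] + beta) / (topic_total[z] + beta * len(vocab))
         for t in vocab}
        for z in range(n_topics)
    ]


def recommend_tags(phi, known_tags, top_n=3, alpha=0.1, em_steps=50):
    """Map a sparsely tagged resource to topics, then rank tags it lacks."""
    n_topics = len(phi)
    seen = [t for t in known_tags if t in phi[0]]  # ignore out-of-vocabulary tags
    theta = [1.0 / n_topics] * n_topics            # P(topic | resource), uniform start
    for _ in range(em_steps):
        counts = [0.0] * n_topics
        for t in seen:
            joint = [theta[z] * phi[z][t] for z in range(n_topics)]
            norm = sum(joint)
            for z in range(n_topics):
                counts[z] += joint[z] / norm
        theta = [(c + alpha) / (len(seen) + alpha * n_topics) for c in counts]
    # Score each unseen tag by P(tag | resource) = sum_z P(tag | z) * P(z | resource).
    scores = {
        t: sum(theta[z] * phi[z][t] for z in range(n_topics))
        for t in phi[0] if t not in seen
    }
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_n]
```

Calling `recommend_tags(fit_topics(TRAINING_TAG_SETS), ["python"])` returns up to three vocabulary tags that the resource does not yet carry, ranked by their estimated probability under the resource's topic mixture. On real data, a tested library implementation of LDA would be a more sensible choice than this hand-rolled sampler.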

Published in

RecSys '09: Proceedings of the third ACM conference on Recommender systems
October 2009
442 pages
ISBN: 9781605584355
DOI: 10.1145/1639714

Copyright © 2009 ACM

Publisher

Association for Computing Machinery, New York, NY, United States

Acceptance Rates

Overall Acceptance Rate: 254 of 1,295 submissions, 20%
