Automatic tag recommendation algorithms for social recommender systems

Authors:
Yang Song, Microsoft Research
Lu Zhang, The Pennsylvania State University
C. Lee Giles, The Pennsylvania State University

ACM Transactions on the Web, Volume 5, Issue 1, Article No. 4, pp. 1–31
https://doi.org/10.1145/1921591.1921595

Published: 17 February 2011

Abstract

The emergence of Web 2.0 and the consequent success of social network Web sites such as Del.icio.us and Flickr have introduced a new concept called social bookmarking, or tagging. Tagging is the act of attaching a relevant user-defined keyword to a document, image, or video, which helps users organize and share their collections. With the rapid growth of Web 2.0, tagged data is becoming increasingly abundant on social network Web sites. An interesting problem is how to automate tag recommendation for users when a new resource becomes available.

In this article, we address the issue of tag recommendation from a machine learning perspective. Based on empirical observations of two large-scale datasets, we first argue that the user-centered approach to tag recommendation is not very effective in practice. Consequently, we propose two novel document-centered approaches that are capable of making effective and efficient tag recommendations in real scenarios. The first method, which is graph-based, represents the tagged data in two bipartite graphs, (document, tag) and (document, word), and then finds document topics by leveraging graph partitioning algorithms. The second method, which is prototype-based, aims at finding the most representative documents within the data collections and uses a sparse multiclass Gaussian process classifier for efficient document classification. For both methods, tags are ranked within each topic cluster/class by a novel ranking method. Recommendations are made by first classifying a new document into one or more topic clusters/classes, and then selecting the most relevant tags from those clusters/classes as machine-recommended tags.
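The two-stage recommendation flow described above (assign a new document to a topic cluster, then return that cluster's top-ranked tags) can be illustrated with a minimal sketch. This is not the authors' method: cluster labels are assumed to be given rather than obtained by graph partitioning or a Gaussian process classifier, cosine similarity to a bag-of-words centroid stands in for the classifier, and relative tag frequency stands in for the article's ranking method. The names `build_clusters`, `recommend`, and the toy data are hypothetical.

```python
from collections import Counter, defaultdict
import math


def bag_of_words(text):
    """Lowercased whitespace tokenization into word counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_clusters(docs):
    """Aggregate word counts and tag counts per cluster.

    docs: iterable of (text, tags, cluster_id). The cluster ids are
    ASSUMED to be known here; the article derives them from the data.
    """
    centroids = defaultdict(Counter)
    tag_counts = defaultdict(Counter)
    for text, tags, cluster_id in docs:
        centroids[cluster_id].update(bag_of_words(text))
        tag_counts[cluster_id].update(tags)
    return centroids, tag_counts


def recommend(text, centroids, tag_counts, n_clusters=1, n_tags=3):
    """Pick the most similar clusters, then rank their tags.

    Each tag's score is its relative frequency within a cluster,
    weighted by the document's similarity to that cluster.
    """
    query = bag_of_words(text)
    scored = sorted(
        ((cosine(query, centroid), cid) for cid, centroid in centroids.items()),
        key=lambda pair: pair[0],
        reverse=True,
    )
    votes = Counter()
    for similarity, cid in scored[:n_clusters]:
        total = sum(tag_counts[cid].values())
        for tag, count in tag_counts[cid].items():
            votes[tag] += similarity * count / total
    return [tag for tag, _ in votes.most_common(n_tags)]


# Toy data, invented purely for illustration.
docs = [
    ("python pandas dataframe tutorial", ["python", "data"], "prog"),
    ("python list comprehension guide", ["python", "programming"], "prog"),
    ("sunset beach photo gallery", ["photo", "travel"], "photo"),
    ("mountain landscape photo tips", ["photo", "nature"], "photo"),
]
centroids, tag_counts = build_clusters(docs)
top_tag = recommend("python dataframe guide", centroids, tag_counts, n_tags=1)
```

Setting `n_clusters` above 1 mirrors the "one or more topic clusters/classes" wording: tags from several clusters are pooled, each weighted by how closely the new document matches that cluster.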

Experiments on real-world data from Del.icio.us, CiteULike, and BibSonomy examine the quality of tag recommendation as well as the efficiency of our recommendation algorithms. The results suggest that our document-centered models can substantially improve the performance of tag recommendation when compared to user-centered methods, as well as to the LDA topic model and SVM classifiers.

Index Terms

1. Computing methodologies
  1. Machine learning
    1. Learning paradigms
      1. Unsupervised learning
        Cluster analysis
2. Information systems
  1. Information retrieval

Published in

ACM Transactions on the Web Volume 5, Issue 1
February 2011
150 pages
ISSN: 1559-1131
EISSN: 1559-114X
DOI: 10.1145/1921591

Copyright © 2011 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 17 February 2011
- Accepted: 1 November 2009
- Revised: 1 March 2009
- Received: 1 September 2008

Author Tags
Gaussian processes
Tagging system
graph partitioning
mixture model
multi-label classification
prototype selection
Qualifiers
- research-article
- Research
- Refereed