ABSTRACT
Researchers have access to large online archives of scientific articles. As a consequence, finding relevant papers has become more difficult. Newly formed online communities of researchers sharing citations provide a new way to solve this problem. In this paper, we develop an algorithm to recommend scientific articles to users of an online community. Our approach combines the merits of traditional collaborative filtering and probabilistic topic modeling. It provides an interpretable latent structure for users and items, and can form recommendations about both existing and newly published articles. We study a large subset of data from CiteULike, a bibliography sharing service, and show that our algorithm provides a more effective recommender system than traditional collaborative filtering.
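The abstract does not spell out the model, so the following is only a minimal sketch of how such a combination might score articles. It assumes each article's latent vector is its topic proportions plus an offset learned from user libraries, and that a user's predicted preference is the inner product of the user vector with that article vector. A newly published article has no library data, so the sketch falls back on topic proportions alone. All names (`predict_score`, `recommend`, `theta`, `offset`) and all numbers are hypothetical and chosen for illustration.

```python
# Illustrative sketch only: the names and toy numbers below are assumptions,
# not taken from the paper.

def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def predict_score(user_vec, theta, offset=None):
    """Score an article for a user.

    theta  : topic proportions of the article (from a topic model).
    offset : adjustment learned from who saved the article; None for a
             newly published article that nobody has saved yet.
    """
    if offset is None:
        item_vec = list(theta)  # content only
    else:
        item_vec = [t + e for t, e in zip(theta, offset)]  # content + offset
    return dot(user_vec, item_vec)

def recommend(user_vec, articles, top_n=2):
    """Return the titles of the top_n highest-scoring articles."""
    scored = [
        (predict_score(user_vec, a["theta"], a.get("offset")), a["title"])
        for a in articles
    ]
    scored.sort(reverse=True)
    return [title for _, title in scored[:top_n]]

# Toy data: three topics, one user mostly interested in the first topic.
user = [0.9, 0.1, 0.0]
articles = [
    {"title": "existing article A", "theta": [0.8, 0.1, 0.1], "offset": [0.1, 0.0, -0.1]},
    {"title": "existing article B", "theta": [0.1, 0.8, 0.1], "offset": [0.0, 0.0, 0.0]},
    {"title": "new article C", "theta": [0.7, 0.2, 0.1]},  # no offset yet
]

top = recommend(user, articles, top_n=2)
```

Because the user and article vectors share the topic space in this sketch, each coordinate of a user vector can be read as interest in one topic, which is one way to interpret the latent structure the abstract mentions.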
Index Terms
- Collaborative topic modeling for recommending scientific articles