ABSTRACT
We report on the live evaluation of various news recommender systems conducted on the website swissinfo.ch. We demonstrate that there is a major difference between offline and online accuracy evaluations. In an offline setting, recommending the most popular stories is the best strategy, while in a live environment this strategy is the poorest. In the online setting, context-tree recommender systems, which profile users in real time, improve the click-through rate by up to 35%. The visit length also increases by a factor of 2.5. Our experience holds important lessons for the evaluation of recommender systems with offline data as well as for the use of the click-through rate as a performance indicator.
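To make the reported improvement concrete, the following minimal sketch shows how a click-through rate and a relative improvement between two recommenders are typically computed. The click and impression counts are hypothetical values chosen for illustration, not data from the study, and the function names are assumptions rather than anything defined in the paper.

```python
def click_through_rate(clicks: int, impressions: int) -> float:
    """Fraction of displayed recommendations that were clicked."""
    if impressions <= 0:
        raise ValueError("impressions must be positive")
    return clicks / impressions


def relative_lift(baseline: float, treatment: float) -> float:
    """Relative change of the treatment over the baseline (0.35 means +35%)."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (treatment - baseline) / baseline


# Hypothetical counts, for illustration only (not taken from the study).
baseline_ctr = click_through_rate(clicks=200, impressions=10_000)
treatment_ctr = click_through_rate(clicks=270, impressions=10_000)
lift = relative_lift(baseline_ctr, treatment_ctr)

print(f"baseline CTR:  {baseline_ctr:.2%}")
print(f"treatment CTR: {treatment_ctr:.2%}")
print(f"relative lift: {lift:.0%}")
```

Note that a relative lift of this kind is distinct from an absolute difference in percentage points: in the hypothetical numbers above, the click-through rate rises by 0.7 percentage points, which corresponds to a 35% relative improvement.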
Index Terms
- Offline and online evaluation of news recommender systems at swissinfo.ch