ABSTRACT
Recommender systems research is being slowed by the difficulty of replicating and comparing research results. Published research uses various experimental methodologies and metrics that are difficult to compare. It also often fails to sufficiently document the details of proposed algorithms or the evaluations employed. Researchers waste time reimplementing well-known algorithms, and the new implementations may miss key details from the original algorithm or its subsequent refinements. When proposing new algorithms, researchers should compare them against finely-tuned implementations of the leading prior algorithms using state-of-the-art evaluation methodologies. With few exceptions, published algorithmic improvements in our field should be accompanied by working code in a standard framework, including test harnesses to reproduce the described results. To that end, we present the design and freely distributable source code of LensKit, a flexible platform for reproducible recommender systems research. LensKit provides carefully tuned implementations of the leading collaborative filtering algorithms, APIs for common recommender system use cases, and an evaluation framework for performing reproducible offline evaluations of algorithms. We demonstrate the utility of LensKit by replicating and extending a set of prior comparative studies of recommender algorithms --- showing limitations in some of the original results --- and by investigating a question recently raised by a leader in the recommender systems community on problems with error-based prediction evaluation.
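The "error-based prediction evaluation" the abstract questions refers to the standard accuracy metrics — RMSE and MAE — computed over predicted vs. actual ratings on held-out data. As a minimal generic sketch of those two metrics (not LensKit code; LensKit's own evaluation framework is Java):

```python
import math

def rmse(pairs):
    """Root mean squared error over (predicted, actual) rating pairs."""
    return math.sqrt(sum((p - a) ** 2 for p, a in pairs) / len(pairs))

def mae(pairs):
    """Mean absolute error over (predicted, actual) rating pairs."""
    return sum(abs(p - a) for p, a in pairs) / len(pairs)

# Hypothetical held-out test pairs: (predicted rating, actual rating)
pairs = [(4.0, 5.0), (3.5, 3.0), (2.0, 2.0)]
```

RMSE penalizes large errors more heavily than MAE; both summarize accuracy as a single number, which is part of why their adequacy for evaluating recommenders has been questioned.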