Elsevier

Physics Reports

Volume 519, Issue 1, October 2012, Pages 1-49
Physics Reports

Recommender systems

https://doi.org/10.1016/j.physrep.2012.02.006Get rights and content

Abstract

The ongoing rapid expansion of the Internet greatly increases the necessity of effective recommender systems for filtering the abundant information. Extensive research for recommender systems is conducted by a broad range of communities including social and computer scientists, physicists, and interdisciplinary researchers. Despite substantial theoretical and practical achievements, unification and comparison of different approaches are lacking, which impedes further advances. In this article, we review recent developments in recommender systems and discuss the major challenges. We compare and evaluate available algorithms and examine their roles in the future developments. In addition to algorithms, physical aspects are described to illustrate macroscopic behavior of recommender systems. Potential impacts and future directions are discussed. We emphasize that recommendation has great scientific depth and combines diverse research fields which makes it interesting for physicists as well as interdisciplinary researchers.

Introduction

Thanks to computers and computer networks, our society is undergoing rapid transformation in almost all aspects. We buy online, gather information by search engines and live a significant part of our social life on the Internet. The fact that many of our actions and interactions are nowadays stored electronically gives researchers the opportunity to study socio-economical and techno-social systems at much better level of detail. Traditional “soft sciences”, such as sociology or economics, have their fast-growing branches relying on the study of these newly available massive data sets [1], [2]. Physicists, with their long experience with data-driven research, have joined this trend and contributed to many fields such as finance [3], [4], network theory [5], [6], [7], [8], [9] and social dynamics [10] which are outside their traditional realm. The study of recommender systems and information filtering in general is no exception with the interest of physicists steadily increasing over the past decade. The task of recommender systems is to turn data on users and their preferences into predictions of users’ possible future likes and interests. The study of recommender systems is at crossroads of science and socio-economic life and its huge potential was first noticed by web entrepreneurs in the forefront of the information revolution. While being originally a field dominated by computer scientists, recommendation calls for contributions from various directions and is now a topic of interest also for mathematicians, physicists, and psychologists. For instance, it is not a coincidence that an approach based on what psychologists know about human behavior scored high in a recent recommendation contest organized by the commercial company Netflix [11].

When computing recommendations for a particular user, the very basic approach is to select the objects favored by other users that are similar to the target user. Even this simple approach can be realized in a multitude of ways—this is because the field of recommendation lacks general “first principles” from which one could deduce the right way to recommend. For example, how best to measure user similarity and assess its uncertainty? How to aggregate divergent opinions from various users? How to handle users for whom little information is available? Should all data be trusted equally or can one detect reckless or intentionally misleading opinions? These and similar issues arise also when methods more sophisticated than those based on user similarity are used. Fortunately, there exist a number of real data sets that can be used to measure and compare performance of individual methods. In consequence, similarly to physics, it is the experiment what decides which recommendation approach is good and which is not.

It would be very misleading to think that recommender systems are studied only because suitable data sets are available. While the availability of data is important for empirical evaluation of recommendation methods, the main driving force comes from practice: electronic systems give us too much choice to handle by ourselves. The interest from industry is hardly surprising—an early book on the nascent field of recommendation, Net Worth by John Hagel III and Marc Singer [12], clearly pointed out the enormous economic impact of “info-mediaries” who can greatly enhance individual consumers’ information capabilities. Most e-commerce web sites now offer various forms of recommendation—ranging from simply showing the most popular items or suggesting other products by the same producer to complicated data mining techniques. People soon realized that there is no unique best recommendation method. Rather, depending on the context and density of the available data, different methods adapting to particular applications are most likely to succeed. Hence there is no panacea, and the best one can do is to understand the underlying premises and recommender mechanisms, then one can tackle many diverse application problems from the real life examples. This is also reflected in this review where we do not try to highlight any ultimate approach to recommendation. Instead, we review the basic ideas, methods and tools with particular emphasis on physics-rooted approaches.

The motivation for writing this review is multifold. Firstly, while extensive reviews of recommender systems by computer scientists already exists [13], [14], [15], the view of physicists is different from that of computer scientists by using more the complex networks approach and adapting various classical physics processes (such as diffusion) for information filtering. We thus believe that this review with its structure and emphasis on respective topics can provide a novel point of view. Secondly, the past decade has already seen a growing interest of physicists in recommender systems and we hope that this review can be a useful source for them by describing the state of the art in language which is more familiar to the physics community. Finally, the interdisciplinary approach presented here might provide new insights and solutions for open problems and challenges in the active field of information filtering.

This review is organized as follows. To better motivate the problem, In Section 2 we begin with a discussion of real applications of recommender systems. Next, in Section 3 we introduce basic concepts–such as complex networks, recommender systems, and metrics for their evaluation–that form a basis for all subsequent exposition. Then we proceed to description of recommendation algorithms where traditional approaches (similarity-based methods in Section 4 and dimensionality reduction techniques in Section 5) are followed by network-based approaches which have their origin in the random walk process well known to all physicists (in Section 6). Methods based on external information, such as social relationships (in Section 7), keywords or time stamps (in Section 8), are also included. We conclude with a brief evaluation of methods’ performance in Section 9 and a discussion on the outlook of the field in Section 10.

Section snippets

Real applications of recommender systems

Thanks to the ever-decreasing costs of data storage and processing, recommender systems gradually spread to most areas of our lives. Sellers carefully watch our purchases to recommend us other goods and enhance their sales, social web sites analyze our contacts to help us connect with new friends and get hooked with the site, and online radio stations remember skipped songs to serve us better in the future (see more examples in Table 1). In general, whenever there is plenty of diverse products

Definitions of subjects and problems

We briefly review in this chapter basic concepts that are useful in the study of recommender systems.

Similarity-based methods

Similarity-based methods represent one of the most successful approaches to recommendation. They have been studied extensively and found various applications in e-commerce [117], [118]. This class of algorithms can be further divided into methods employing user and item similarity, respectively. The basic assumption of a method based on user similarity is that people who agree in their past evaluations tend to agree again in their future evaluations. Thus, for a target user, the potential

Dimensionality reduction techniques

Dimensionality reduction aims at downsizing the amount of relevant data while preserving the major information content. It is often applied in areas such as data mining, machine learning and cluster analysis. Most techniques of dimensionality reduction involve feature extraction which makes use of hidden variables, or so called latent variables, to describe the underlying causes of co-occurrence data. In the context of movie selection, potential viewers may consider genres such as action,

Diffusion-based methods

Similarly as the classical PageRank algorithm [154] brought the order to the Internet by analyzing the directed network of links among web pages, one could aim to obtain recommendations using a network representation of the input data with user preferences. The algorithms presented in this section are all based on specific transformations (projections) of the input data to object–object networks. Personalized recommendations for an individual user are then obtained by using this user’s past

Social filtering

Recommendations made by a recommender system are often less appreciated than those coming from our friends [217] and social influences may play a more important role than similarity of past activities [218], [219]. In addition, accuracy of recommendation can be improved by analyzing social relationships, such as coauthorships in academic literature recommendation [220] and friendships and memberships in product recommendation [221]. Many real systems, such as Delicious.com and Facebook.com,

Meta approaches

Input data for recommendation can be extended far behind the traditional user–item-rating scheme. In this section, we briefly review the so-called meta approaches where additional information of various kind (tags assigned to items by users or time stamps of past evaluations) enters the recommendation process or simply several recommendation methods are combined together (either in an iterative self-evaluating way or by forming a hybrid algorithm). As possibilities for extensions are relatively

Performance evaluation

In this section we briefly compare performance of individual algorithms presented in this review. To test the algorithms, we use two standard data sets: Movielens 1M14 and Netflix.15 The Movielens data set, which contains ratings from 6040 users to 3706 movies (corresponding to the rating density of 4.5102), is used in its original form. Our subset of the original Netflix

Outlook

After reviewing extensively the past work in the field of recommender systems, we describe here a few challenges that the field of recommendation needs to tackle in the future.

To begin with, let us take the conceptual question of the possibility of effective recommendation. For a long time, we all thought that we know ourselves better than anyone else does. Without much fanfare and fuss about philosophical implications, IT experts and online businesses continue in their exploration of the part

Acknowledgments

This work was partially supported by the EU FET-Open Grant231200 (project QLectives) and National Natural Science Foundation of China (Grant Nos. 11075031, 11105024, 61103109 and 60973069). CHY is partially supported by EU FET FP7 project STAMINA (FP7-265496). TZ is supported by the Research Funds for the Central Universities (UESTC).

References (329)

  • M.E.J. Newman et al.

    The Structure and Dynamics of Networks

    (2006)
  • C. Castellano et al.

    Statistical physics of social dynamics

    Review of Modern Physics

    (2009)
  • J. Ellenberg, This psychologist might outsmart the math brains competing for the netflix prize, Wired Magazine, 2008,...
  • J. Hagel et al.

    Net Worth: Shaping Markets When Customers Make the Rules

    (1999)
  • J.L. Herlocker et al.

    Evaluating collaborative filtering recommender systems

    ACM Transactions on Information Systems

    (2004)
  • G. Adomavicius et al.

    Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions

    IEEE Transactions on Knowledge and Data Engineering

    (2005)
  • C. Anderson

    The Long Tail: Why the Future of Business is Selling Less of More

    (2006)
  • E. Brynjolfsson et al.

    Consumer surplus in the digital economy: estimating the value of increased product variety at online booksellers

    Management Science

    (2003)
  • J.B. Schafer et al.

    E-commerce recommendation applications

    Data Mining and Knowledge Discovery

    (2001)
  • P.-Y. Chen, S.-Y. Wu, J. Yoon, The impact of online recommendations and consumer feedback on sales, in: Proceedings of...
  • J. Bennett, S. Lanning, The Netflix prize, in: Proceedings of KDD Cup and Workshop, 2007, pp....
  • R.M. Bell et al.

    Lessons from the Netflix prize challenge

    ACM SIGKDD Explorations Newsletter

    (2007)
  • Y. Koren, The bellkor solution to the netflix grand prize, Report from the Netflix Prize Winners,...
  • M. Piotte, M. Chabbert, The pragmatic theory solution to the netflix grand prize, Report from the Netflix Prize...
  • A. Töscher, M. Jahrer, The bigchaos solution to the netflix grand prize, Report from the Netflix Prize Winners,...
  • A. Narayanan, V. Shmatikov, Robust de-anonymization of large sparse datasets, in: IEEE Symposium on Security and...
  • M.E.J. Newman

    Power laws Pareto distributions and Zipf’s law

    Contemporary Physics

    (2005)
  • W. Weibull

    A statistical distribution function of wide applicability

    J. Appl. Mech.-Trans. ASME

    (1951)
  • Z. Huang et al.

    Applying associative retrieval techniques to alleviate the sparsity problem in collaborative filtering

    ACM Transactions on Information Systems

    (2004)
  • B. Sarwar, J. Konstan, J. Riedl, Incremental singular value decomposition algorithms for highly scalable recommender...
  • C.-H. Jin, J.-G. Liu, Y.-C. Zhang, T. Zhou, Adaptive information filtering for dynamics recommender systems,...
  • M.H. Holmes

    Introduction to Perturbation Methods

    (2005)
  • A.I. Schein et al.

    Methods and metrics for cold-start recommendations

  • X.N. Lam, T. Vu, T.D. Le, A.D. Duong, Addressing cold-start problem in recommendation systems, in: Proceedings of the...
  • L. Zhang, L.-S. Bai, T. Zhou, Crossing recommendation based on multi-B2C behavior, J. Univ. Elect. Sci. Technol. China...
  • S.M. Mcnee et al.

    Being accurate is not enough: how accuracy metrics have hurt recommender systems

  • B. Smyth et al.

    Similarity vs. diversity

  • C.-N. Ziegler et al.

    Improving recommendation lists through topic diversification

  • N. Hurley et al.

    Novelty and diversity in top-N recommendation—analysis and evaluation

    ACM Transactions on Internet Technology

    (2011)
  • T. Zhou et al.

    Solving the apparent diversity–accuracy dilemma of recommender systems

    Proceedings of the National Academy of Sciences of the United States of America

    (2010)
  • B. Mobasher et al.

    Towards trustworthy recommender systems: an analysis of attack models and algorithm robustness

    ACM Transactions on Internet Technology

    (2007)
  • S.K. Lam et al.
  • R. Burke et al.
  • L. Xiang et al.

    Temporal recommendation on graphs via long-and short-term preference fusion

  • R. Sinha, K. Swearingen, The role of transparency in recommender systems, in: Proceedings of the CHI’06 Conference on...
  • A.D.J. Cooke et al.

    Marketing the unfamiliar: the role of context and item-specific information in electronic agent recommendations

    Journal of Marketing Research

    (2002)
  • Z. Huang et al.

    Analyzing consumer–product graphs: empirical findings and applications in recommender systems

    Management Science

    (2007)
  • S. Sahebi, W.W. Cohen, Community-based recommendations: a solution to the cold start problem...
  • C. Song et al.

    Limits of predictability in human mobility

    Science

    (2010)
  • Cited by (986)

    View all citing articles on Scopus
    View full text