Using community preference for overcoming sparsity and cold-start problems in collaborative filtering system offering soft ratings☆
Introduction
Recommender systems (RSs) (Adomavicius and Tuzhilin, 2005, Hwang et al., 2010, Kim et al., 2002) have developed rapidly since they were introduced in the 1990s; in practice, RSs have been applied in a variety of e-commerce applications (Linden et al., 2003, Shambour and Lu, 2011, Kim et al., 2011). In general, RSs collect information about user preferences from multiple sources, estimate user preferences on unseen items, and then generate suitable recommendations based on the estimated data. Logically, the quality of the recommendations in an RS depends mainly on the representation of user preferences and the accuracy of the estimations.
Conventional RSs described in the literature usually provide a rating domain defined as a totally ordered finite set of rating scores, and represent a user’s preference for an item as a single rating score (i.e., a hard rating). However, user preferences are naturally subjective and qualitative; therefore, representing them as hard ratings is not appropriate in some cases (Nguyen and Huynh, 2015). For example, consider an RS offering hard ratings. Suppose that a user has rated two items with rating scores of 4 and 5, respectively; further, this user wants to rate a third item with a score indicating that it is better than the first item but worse than the second. With a single rating score, the user can evaluate the third item only as 4 or 5; as a consequence, this user may hesitate to express his/her preference for the item. In this scenario, it would be more comfortable for the user to use a combination of rating scores, such as the set {4, 5} (i.e., a soft rating). Moreover, even when a user has evaluated an item with a hard rating, this rating might carry imperfect information. For instance, when a user has rated an item with a rating score such as 3, we cannot know exactly what this user thought about the item, because the user may have wanted to evaluate the item as at least 3, or as 3 for 90%. To address such situations, the use of soft ratings in RSs has recently been studied and developed for the purpose of capturing subjective, qualitative, and imperfect information about user preferences (Wickramarathne et al., 2011, Nguyen and Huynh, 2014, Nguyen and Huynh, 2015). With a given rating domain, a user can express his/her preference for an item by using a soft rating as follows:
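- 3 for sure (the set {3} with a probability of 1.0)
- At least 3 (the set of all scores from 3 upward with a probability of 1.0)
- 3 for 90% (the set {3} and the whole rating domain with probabilities of 0.9 and 0.1, respectively)
- Less than 3 (the set of all scores below 3 with a probability of 1.0)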
According to the aforementioned studies, RSs offering soft ratings are developed based on the Dempster-Shafer theory (DST) (Dempster, 1967, Shafer, 1976), which is regarded as one of the most general theories for modeling imperfect information (Wickramarathne et al., 2011).
Furthermore, in the RS research area, recommendation techniques can be divided into three main categories: collaborative filtering, content-based, and hybrid (Adomavicius and Tuzhilin, 2005). Among these, collaborative filtering is regarded as the most well-known technique (Ricci et al., 2011, Jannach et al., 2012). To provide personalized recommendations to an active user, collaborative filtering RSs commonly try to find other users who are expected to have preferences similar to those of the user, and employ those users’ ratings to estimate the original user’s preferences for unseen items. However, collaborative filtering RSs also suffer from two fundamental limitations known as sparsity and cold-start problems (Adomavicius and Tuzhilin, 2005). The first problem is caused when each user only rates a very small number of items; as a result, the number of ratings is insufficient and recommendation performance is significantly affected (Huang et al., 2004). The second problem arises because of missing information about new items and new users.
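To make the neighbourhood idea concrete, the following is a minimal sketch of a conventional user-based collaborative filter over hard ratings. It is a generic textbook scheme, not the system proposed in this paper; the toy ratings, the choice of cosine similarity over co-rated items, and the function names `cosine_sim` and `predict` are all assumptions made for illustration.

```python
import math

# Toy hard ratings (hypothetical data): user -> {item: score}
ratings = {
    "alice": {"i1": 5, "i2": 3, "i3": 4},
    "bob":   {"i1": 4, "i2": 2, "i3": 5, "i4": 4},
    "carol": {"i1": 1, "i2": 5, "i4": 2},
}

def cosine_sim(a, b):
    """Cosine similarity computed over the items both users have rated."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = math.sqrt(sum(a[i] ** 2 for i in common))
    norm_b = math.sqrt(sum(b[i] ** 2 for i in common))
    return dot / (norm_a * norm_b)

def predict(user, item):
    """Similarity-weighted average of the neighbours' ratings for an unseen item."""
    num = den = 0.0
    for other, theirs in ratings.items():
        if other == user or item not in theirs:
            continue
        weight = cosine_sim(ratings[user], theirs)
        num += weight * theirs[item]
        den += weight
    # None means no usable neighbour exists, which is what happens
    # under heavy sparsity or for a brand-new user or item.
    return num / den if den > 0 else None
```

In this sketch, `predict("alice", "i4")` leans toward the score of the more similar neighbour, whereas a user with no ratings at all, or an item nobody has rated, yields `None`. That failure mode is the sparsity and cold-start limitation described above.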
Social networks are growing very fast and play a significant role on the Internet. Additionally, communication and collaboration in social networks have become increasingly convenient and frequent. In a social network, users naturally form various communities whose members interact frequently with one another (Tang and Liu, 2010). These social relations might influence individual behaviors and decisions, including those related to buying items. Commonly, when seeking advice before buying new items, most people tend to trust recommendations from friends in the same community rather than recommendations from outside users. In a community, each member has his/her own preference for an item, and the overall preference of all members is called the community preference for that item. In practice, community preferences can be used for a better understanding of user behaviors and ratings; thus, these preferences can potentially be used for predicting the missing information about user preferences. In other words, community preferences might be useful for overcoming the limitations of collaborative filtering RSs.
On the basis of this observation, this paper proposes a new collaborative filtering RS that is capable of (1) representing user preferences as soft ratings and (2) exploiting community preferences derived from a social network containing all users, which helps address the aforementioned sparsity and cold-start problems. In the system, community preferences are employed for predicting all unprovided user ratings on items (including new users and new items); subsequently, active users receive personalized recommendations based mainly on the provided and predicted ratings.
The remainder of this paper is organized as follows. In Section 2, background information about DST is introduced. In Section 3, related work is discussed. The proposed system is described in Section 4. In Section 5, experiments are presented, and their results are discussed. Finally, conclusions and suggestions for future research are presented in Section 6.
Dempster-Shafer theory
DST (Dempster, 1967, Shafer, 1976), also called evidence theory or the theory of belief functions, is a well-known theory for modeling uncertain, imprecise, and incomplete information. In the context of this theory, let us consider a problem domain represented by a finite set $\Theta$ of mutually exclusive and exhaustive hypotheses (Shafer, 1976). A function $m: 2^{\Theta} \rightarrow [0, 1]$ is called a mass function if it satisfies
$$m(\emptyset) = 0 \quad \text{and} \quad \sum_{A \subseteq \Theta} m(A) = 1.$$
A subset $A \subseteq \Theta$ with $m(A) > 0$ is called a focal element. In
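As a rough illustration of these definitions, the sketch below stores a mass function as a Python dictionary from `frozenset` subsets to masses and checks the defining conditions. The rating domain `THETA = {1, ..., 5}`, the helper names, the tolerance, and the convention of placing the leftover mass of "3 for 90%" on the whole domain are assumptions made for illustration, not notation taken from the paper.

```python
import math

# Assumed rating domain for illustration; the paper's own domain may differ.
THETA = frozenset({1, 2, 3, 4, 5})

def is_mass_function(m, frame=THETA, tol=1e-9):
    """Check that the empty set has zero mass, every subset lies in the
    frame with non-negative mass, and the masses sum to 1."""
    if m.get(frozenset(), 0.0) != 0.0:
        return False
    if any(not subset <= frame or mass < 0 for subset, mass in m.items()):
        return False
    return math.isclose(sum(m.values()), 1.0, abs_tol=tol)

def focal_elements(m):
    """Subsets that carry strictly positive mass."""
    return {subset for subset, mass in m.items() if mass > 0}

# "3 for 90%": 0.9 on {3}; the remaining 0.1 is assumed to go to the whole domain.
three_for_90 = {frozenset({3}): 0.9, THETA: 0.1}
# "At least 3": all mass on {3, 4, 5} under the assumed domain.
at_least_3 = {frozenset({3, 4, 5}): 1.0}
```

Under these assumptions, both example ratings pass `is_mass_function`, and `focal_elements(three_for_90)` contains exactly `{3}` and the whole domain.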
Related work
Over the years, many researchers have focused on tackling the two fundamental problems in collaborative filtering RSs, and various methods have been developed to address them. Among previous studies, matrix factorization (Barjasteh et al., 2015, Bauer and Nanopoulos, 2014, Guo et al., 2015), which exploits latent factors for predicting all unprovided ratings, is a popular method. In addition, some authors proposed combining collaborative filtering with
Data modeling
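Let $\Theta$ denote a rating domain consisting of L preference labels, $\Theta = \{\theta_1, \theta_2, \ldots, \theta_L\}$. In addition, let $\mathbf{U}$ and $\mathbf{I}$ denote a set containing M users and a set consisting of N items, respectively; in other words, $\mathbf{U} = \{U_1, U_2, \ldots, U_M\}$ and $\mathbf{I} = \{I_1, I_2, \ldots, I_N\}$. Users can express their preferences on items by using soft ratings; each soft rating of user $U_i$ on item $I_k$ is modeled by a mass function $r_{i,k}: 2^{\Theta} \rightarrow [0, 1]$. Further, Dempster’s rule of combination (Dempster, 1967) is used for combining information about user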
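For readers unfamiliar with Dempster's rule of combination, here is a minimal sketch of the standard rule applied to two mass functions over the same domain. Which pieces of evidence the proposed system actually combines is not shown in this excerpt, so the two inputs below, the five-score domain `THETA`, and the function name `dempster_combine` are hypothetical choices for illustration only.

```python
from collections import defaultdict

def dempster_combine(m1, m2):
    """Combine two mass functions (dict: frozenset -> mass) with Dempster's rule."""
    combined = defaultdict(float)
    conflict = 0.0
    for b, mass_b in m1.items():
        for c, mass_c in m2.items():
            inter = b & c
            if inter:
                combined[inter] += mass_b * mass_c
            else:
                # Product mass that would fall on the empty set.
                conflict += mass_b * mass_c
    if conflict >= 1.0:
        raise ValueError("total conflict: the mass functions cannot be combined")
    # Renormalise so that the combined masses sum to 1.
    return {a: mass / (1.0 - conflict) for a, mass in combined.items()}

THETA = frozenset({1, 2, 3, 4, 5})            # assumed rating domain
m1 = {frozenset({3}): 0.9, THETA: 0.1}        # "3 for 90%" (hypothetical evidence)
m2 = {frozenset({3, 4, 5}): 1.0}              # "at least 3" (hypothetical evidence)
m3 = {frozenset({4}): 0.5, THETA: 0.5}        # partly conflicting with m1

m12 = dempster_combine(m1, m2)
m13 = dempster_combine(m1, m3)
```

Here `m1` and `m2` do not conflict, so the rule only narrows the whole-domain part of `m1` down to `{3, 4, 5}`. For `m1` and `m3`, the product mass on the empty intersection of `{3}` and `{4}` is discarded and the remainder is renormalised.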
Experiment
To evaluate the proposed system, a data set needs to contain (1) rating data, (2) a social network, and (3) context information. After considering the available data sets, we found that the Flixster data set (Nguyen and Huynh, 2014) satisfies these requirements. Thus, we selected this data set for the experiments.
The Flixster data set contains 535,013 hard ratings from 3,827 users on 1,210 movies with a rating domain consisting of 10 elements; the data
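As a quick check on how sparse this rating matrix is, the short computation below derives the density and sparsity from the three counts reported above. Only those counts come from the text; the variable names are illustrative.

```python
# Counts reported for the Flixster data set in the text.
num_ratings = 535_013
num_users = 3_827
num_items = 1_210

possible = num_users * num_items      # every user-item pair
density = num_ratings / possible      # fraction of pairs that carry a rating
sparsity = 1.0 - density

print(f"possible pairs = {possible:,}")
print(f"density  = {density:.2%}")
print(f"sparsity = {sparsity:.2%}")
```

With these counts, about 11.55% of the user-item pairs are rated, leaving roughly 88.45% of the matrix empty.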
Conclusion
Compared to RSs offering hard ratings, RSs using soft ratings are more effective because they (1) provide better modeling of subjective, qualitative, and imperfect information about user preferences and (2) allow users to express preferences in a more realistic and flexible manner. Recently, social networks have been developing very fast, and these networks play a significant role on the Internet. It is possible that social relations in social networks can naturally influence individual
Acknowledgement
This research work was supported by JSPS KAKENHI Grant No. 25240049.
References (42)
- et al. Recommender systems based on quantitative implicit customer feedback. Decis. Support Syst. (2014)
- Some aspects of Dempster-Shafer evidence theory for classification of multi-modality medical images taking partial volume effect into account. Pattern Recogn. Lett. (1996)
- et al. Leveraging clustering approaches to solve the gray-sheep users problem in recommender systems. Expert Syst. Appl. (2014)
- et al. Merging trust in collaborative filtering to alleviate data sparsity and cold start. Knowl. Based Syst. (2014)
- et al. Coauthorship networks and academic literature recommendation. Electron. Commer. Res. Appl. (2010)
- et al. Hybrid collaborative filtering for high-involvement products: a solution to opinion sparsity and dynamics. Decis. Support Syst. (2015)
- et al. Collaborative filtering based on collaborative tagging for enhancing the quality of recommendation. Electron. Commer. Res. Appl. (2010)
- et al. Commenders: a recommendation procedure for online book communities. Electron. Commer. Res. Appl. (2011)
- et al. A personalized recommendation procedure for internet shopping support. Electron. Commer. Res. Appl. (2002)
- et al. A social recommender mechanism for e-commerce: combining similarity, trust, and relationship. Decis. Support Syst. (2013)
- Facing the cold start problem in recommender systems. Expert Syst. Appl.
- The transferable belief model. Artif. Intell.
- Recommender systems based on social networks. J. Syst. Software
- Collaborative topic regression with social trust ensemble for recommendation in social media systems. Knowl. Based Syst.
- Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions. IEEE Trans. Knowl. Data Eng.
- Cold-start item and user recommendation with decoupled completion and transduction
- Upper and lower probabilities induced by a multivalued mapping. Ann. Math. Stat.
- Community structure in social and biological networks. Proc. Natl. Acad. Sci.
☆ This paper is an extended and revised version of the conference paper presented at CSoNet-2016 (Nguyen and Huynh, 2016).