Elsevier

Knowledge-Based Systems

Volume 60, April 2014, Pages 82-101

Multi-criteria collaborative filtering with high accuracy using higher order singular value decomposition and Neuro-Fuzzy system

https://doi.org/10.1016/j.knosys.2014.01.006

Abstract

Collaborative Filtering (CF) is the most widely used prediction technique in recommender systems. It makes recommendations based on the ratings that users have assigned to items. Most current CF recommender systems maintain only a single rating per user-item pair in the user-item ratings matrix. Multi-criteria CF offers the possibility of more accurate recommendations by considering user preferences on multiple aspects of items. However, in multi-criteria CF, user behavior regarding items’ features is frequently subjective, imprecise and vague. This induces uncertainty in reasoning about and representing items’ features that crisp machine learning techniques cannot fully resolve. Fuzzy methods, in contrast, can better handle this uncertainty. In addition, by considering user perception of items’ features, fuzzy methods can predict users’ preferences more accurately and better alleviate the sparsity problem in overall ratings. Apart from this, in multi-criteria CF users rate different aspects (criteria) of an item, which adds new dimensions and thereby aggravates the scalability problem. Appropriate dimensionality reduction techniques are thus needed that capture all of the dimensions together, without collapsing them into lower dimensions, in order to reveal the latent associations among the components. This study presents a new model for multi-criteria CF using an Adaptive Neuro-Fuzzy Inference System (ANFIS) combined with subtractive clustering and Higher Order Singular Value Decomposition (HOSVD). HOSVD is used for dimensionality reduction to address the scalability problem, and ANFIS is used for extracting fuzzy rules from the experimental dataset, alleviating the sparsity problem in overall ratings, and representing and reasoning about users’ behavior regarding items’ features. Experimental results on a real-world dataset show that the combination of the two techniques remarkably improves the predictive accuracy and recommendation quality of multi-criteria CF.

Introduction

During the last decade, the amount of information available online has increased exponentially, and information overload has become one of the major challenges faced by information retrieval and information filtering systems. Recommender systems are one solution to the information overload problem. In the mid-1990s, recommender systems emerged as an independent research area when researchers shifted their focus to recommendation problems that explicitly rely on the user rating structure [1].

Recommender systems based on Collaborative Filtering (CF) are particularly popular and widely used online [2], [3], [4]. CF algorithms can be divided into two categories: memory-based algorithms and model-based algorithms [3], [5], [6]. Memory-based (or heuristic-based) methods, such as correlation analysis and vector similarity, search the user database for user profiles that are similar to the profile of the active user for whom the recommendation is made [7]. Heuristic-based approaches are classified into user-based and item-based approaches [6], [8]. User-based CF has been the most popular and commonly used memory-based CF strategy [9]. It is based on the premise that similar users will like similar items. Item-based CF was first proposed in [10] as an alternative style of CF that avoids the scalability bottleneck associated with the traditional user-based algorithm. The bottleneck arises from the search for neighbors in a continuously growing population of users. In item-based CF, similarities are calculated between items rather than between users, the intuition being that a user will be interested in items that are similar to items they have liked in the past. Two of the most popular measures for computing similarities between users or between items are the Pearson correlation coefficient and the cosine-based coefficient.
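The two similarity measures named above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes each user profile is a dictionary mapping item identifiers to ratings, and it restricts both measures to co-rated items, which is one common convention among several. The names `alice`, `bob` and the ratings are hypothetical.

```python
from math import sqrt

def co_rated(a, b):
    """Items rated by both users (profiles are dicts: item -> rating)."""
    return sorted(set(a) & set(b))

def pearson(a, b):
    """Pearson correlation over co-rated items; 0.0 when it is undefined."""
    items = co_rated(a, b)
    if len(items) < 2:
        return 0.0
    mean_a = sum(a[i] for i in items) / len(items)
    mean_b = sum(b[i] for i in items) / len(items)
    num = sum((a[i] - mean_a) * (b[i] - mean_b) for i in items)
    den = sqrt(sum((a[i] - mean_a) ** 2 for i in items)) * sqrt(
        sum((b[i] - mean_b) ** 2 for i in items))
    return num / den if den else 0.0

def cosine(a, b):
    """Cosine of the angle between the two co-rated rating vectors."""
    items = co_rated(a, b)
    num = sum(a[i] * b[i] for i in items)
    den = sqrt(sum(a[i] ** 2 for i in items)) * sqrt(
        sum(b[i] ** 2 for i in items))
    return num / den if den else 0.0

# Hypothetical profiles: bob rates every shared item one point lower.
alice = {"m1": 5, "m2": 3, "m3": 4}
bob = {"m1": 4, "m2": 2, "m3": 3, "m4": 5}

print(pearson(alice, bob), cosine(alice, bob))
```

Pearson correlation subtracts each user's mean, so a user who rates consistently lower than another can still be a perfect neighbor, whereas the cosine coefficient compares the raw rating vectors.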

One of the main problems in recommender systems, specifically in CF, is known as the sparsity problem [11], [12], [13], [14]. Memory-based CF approaches also suffer from the scalability problem. Scaling these systems up to real datasets is therefore one of the main challenges, and many studies have sought to overcome it [15], [16], [17], [18].

Compared with memory-based algorithms, model-based algorithms usually scale better in terms of their resource requirements (memory and computing time) and do not require keeping actual user profiles for predictions [5], [10], [19]. Model-based CF adopts an eager learning strategy in which a model of the data, i.e. the users, the items and the users’ ratings for those items, is pre-computed [6], [8], [20]. Several studies have suggested that model-based CF can also produce better predictive accuracy than memory-based CF by using more sophisticated techniques such as matrix factorisation and dimensionality reduction [21], [22].

The ratings provided by users for items are the key input to CF recommender systems. They convey information about the quality of the item along with the preference of the user who shared the rating. The large majority of recommender systems are developed for single-valued ratings. According to Adomavicius and Kwon [23], pure CF-based recommender systems rely solely on product ratings provided by a large user community to generate personalized recommendation lists for each individual online user. In traditional CF systems, the assumption is that customers provide an overall rating for the items they have purchased, for example using a 5-star rating system. However, given the value of customer feedback to the business, customers in some domains are nowadays given the opportunity to provide more fine-grained feedback and to rate products and services along various dimensions [24], [25].

Adomavicius and Kwon [23] introduced schemes for incorporating multi-criteria rating information into the recommendation process. For example, they considered, for each item, multiple criteria ratings and an overall rating that indicates how much the user likes the item based on their perception of the item’s features. They characterized single-rating CF recommenders as systems that attempt to estimate a rating function of the form users × items → R0 for predicting a rating for any given user-item pair, where R0 is a totally ordered set, typically composed of real-valued numbers within a certain range. In multi-criteria recommender systems, by comparison, the rating function takes the form users × items → R0 × R1 × … × Rk. Therefore, in the multi-criteria CF problem, there are m users, n items and k criteria in addition to an overall rating. Users can thus provide a number of explicit ratings for items; an overall rating R0 must be predicted in addition to the k criteria ratings (R1, …, Rk). A multi-criteria CF system can be configured to push new items to users in two ways: either by producing a Top-N list of recommendations for a given target user, or by predicting the target user’s likely utility (or rating) for a particular unseen item. We will refer to these as the recommendation task and the rating prediction task in multi-criteria CF, respectively.
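As a rough illustration of the two tasks, the sketch below represents a multi-criteria rating as a pair (R0, (R1, …, Rk)) and derives a Top-N list from a rating predictor. Everything named here is an assumption for the example: `known`, `predict_overall` and the item identifiers are hypothetical, and the predictor is a stub standing in for whatever model performs the rating prediction task.

```python
from typing import Callable, Dict, List, Tuple

CriteriaRatings = Tuple[float, ...]           # (R1, ..., Rk)
MultiRating = Tuple[float, CriteriaRatings]   # (R0, (R1, ..., Rk))

# Hypothetical known ratings with k = 4 criteria per user-item pair.
known: Dict[Tuple[str, str], MultiRating] = {
    ("u1", "i1"): (4.0, (5.0, 4.0, 3.0, 4.0)),
    ("u1", "i2"): (2.0, (2.0, 3.0, 2.0, 1.0)),
}

def predict_overall(user: str, item: str) -> float:
    """Rating prediction task: a stub with fixed, made-up scores."""
    scores = {"i3": 4.5, "i4": 3.0, "i5": 4.8}
    return scores[item]

def top_n(user: str, candidates: List[str],
          predict: Callable[[str, str], float], n: int) -> List[str]:
    """Recommendation task: rank the user's unseen items by predicted R0."""
    unseen = [i for i in candidates if (user, i) not in known]
    return sorted(unseen, key=lambda i: predict(user, i), reverse=True)[:n]

print(top_n("u1", ["i1", "i3", "i4", "i5"], predict_overall, 2))
```

The recommendation task is built on top of the rating prediction task here: items the user has already rated are filtered out, and the remainder are ordered by the predicted overall rating.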

In the context of personalization applications, traditional single-rating CF has been highly successful; however, CF with multi-criteria ratings for items has rarely been addressed and remains largely unexplored. According to Adomavicius and Kwon [23], the problem of multi-criteria recommendations with a single and an overall rating is still considered an optimization problem. In addition, rating items on multiple aspects in CF recommender systems presents new challenges, such as the sparsity problem in criteria and overall ratings, the scalability problem that grows with each new dimension, and the representation of and reasoning about users’ behavior and preferences regarding items’ features.

With regard to the scalability problem, a multi-criteria recommender system clearly deals with high-dimensional data. Applying a dimensionality reduction technique to more than 2 dimensions is therefore an important issue for multi-criteria CF, yet it has rarely been considered in prior research on such systems. Instead, the 3-dimensional space is typically split into pairwise relations such as user-item, user-criteria and criteria-item, so that existing dimensionality reduction techniques can be applied to reveal the latent associations between the components. As a result, part of the total interaction among the three or more dimensions is lost.

CF recommender systems also suffer from sparsity, i.e. missing values in the user-item ratings matrix, which affects predictive accuracy. Multi-criteria CF recommender systems suffer from this problem on two sides, with missing values in both the overall and the criteria ratings, and the system has to predict these missing ratings with new approaches [24]. In addition, traditional crisp methods used to alleviate sparsity, such as clustering and regression, do not capture users’ exact preferences in multi-criteria CF. The experimental results obtained in this study demonstrate that, for multi-criteria CF, fuzzy methods can better alleviate the sparsity problem in overall ratings, which in turn improves predictive accuracy.

With regard to representing and reasoning about users’ behavior, user perception of and behavior toward items’ features are imprecise, subjective and vague, and all of this has to be taken into account. In multi-criteria CF, it is important to deal with the non-stochastic uncertainty induced by vagueness and imprecision in representing and reasoning about items’ features. For user behavior and perception, for example interest, the uncertainty concerns how precisely the user’s interest can be represented and measured. The system therefore has to consider this issue and predict the overall preference for an item according to the user’s behavior on the item’s different dimensions. Systems built solely on clustering methods, dimensionality reduction techniques or regression approaches usually fail to predict users’ exact preferences regarding items’ features. More sophisticated methods are therefore needed that can accurately predict the user’s overall preference from the items’ features and provide recommendations that are better tailored to the user’s taste.

We describe the contributions of our study as follows:

There have been no prior applications of Higher Order Singular Value Decomposition (HOSVD) together with a Neuro-Fuzzy approach to the multi-criteria CF problem; accordingly, a new model based on HOSVD and ANFIS is developed. For the first time in a multi-criteria CF problem, ANFIS is used to address the sparsity of overall ratings and the uncertainty problem, and HOSVD is applied to the scalability problem.

To overcome the above-mentioned problems, the proposed method combines two well-known techniques from the fields of artificial intelligence and dimensionality reduction. We apply HOSVD for dimensionality reduction on the high-dimensional dataset of user ratings. In addition, the result of the HOSVD decomposition is used for clustering based on cosine-based similarity. The aim of this clustering is to provide a model of similar users from which fuzzy rules can be extracted with high accuracy using the Adaptive Neuro-Fuzzy Inference System (ANFIS), and to improve the efficiency of the model [26]. HOSVD is used to capture all of the dimensions together, without collapsing them into lower dimensions, which is where the traditional approaches have failed. It therefore substantially improves the efficiency and scalability of multi-criteria CF. Owing to the nature of the experimental dataset, we perform HOSVD on a third-order tensor. However, it can also be applied to tensors with more than 3 dimensions. This is one of the main advantages of HOSVD, and it makes HOSVD a flexible and effective approach for multi-criteria CF where other traditional machine learning techniques have failed. It should be noted that the computation time of the HOSVD decomposition is high and grows as the tensor order increases. However, the decomposition can be carried out in the offline phase, with incremental learning used for the data approximation procedure in the online phase.
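A truncated HOSVD of a third-order tensor can be sketched with NumPy as below. This is an illustrative sketch under stated assumptions rather than the authors' procedure: the tensor is a small dense random array standing in for a users × items × criteria ratings tensor (missing ratings would have to be filled in beforehand), and the function names and the truncation ranks (3, 2, 2) are hypothetical.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-n unfolding: rows index `mode`, columns index all other modes."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def mode_dot(tensor, matrix, mode):
    """Mode-n product: multiply `tensor` by `matrix` along axis `mode`."""
    moved = np.moveaxis(tensor, mode, -1)   # bring `mode` to the last axis
    result = moved @ matrix.T               # contract it against the matrix
    return np.moveaxis(result, -1, mode)

def hosvd(tensor, ranks):
    """Truncated HOSVD: one factor matrix per mode plus a core tensor."""
    factors = []
    for mode, rank in enumerate(ranks):
        # Left singular vectors of each unfolding give that mode's basis.
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u[:, :rank])
    core = tensor
    for mode, u in enumerate(factors):
        core = mode_dot(core, u.T, mode)    # project onto each basis
    return core, factors

def reconstruct(core, factors):
    """Approximate the original tensor from the core and the factors."""
    approx = core
    for mode, u in enumerate(factors):
        approx = mode_dot(approx, u, mode)
    return approx

# Hypothetical dense tensor: 5 users x 4 items x 3 criteria.
rng = np.random.default_rng(0)
ratings_tensor = rng.random((5, 4, 3))

core, factors = hosvd(ratings_tensor, ranks=(3, 2, 2))
approx = reconstruct(core, factors)
print(core.shape, approx.shape)
```

All three modes are decomposed from the same tensor, so the core retains interactions among users, items and criteria jointly rather than pair by pair. Keeping the full rank in every mode reproduces the tensor exactly; smaller ranks trade reconstruction error for a more compact representation.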

In the proposed model, ANFIS aims to extract knowledge (rules) from the users’ multi-aspect ratings for use in the overall rating prediction task. The extracted rules are employed to predict unknown ratings, which alleviates the sparsity problem in overall ratings and reveals the real level of user preference for items’ features. ANFIS provides a flexible structure that is suitable for modelling stipulated input-output pairs using a set of induced fuzzy IF-THEN rules with appropriate and varied membership functions (MFs) [27]. With proper training, the resulting Fuzzy Inference System (FIS) serves to predict users’ overall preferences from items’ features. The elements of this model are a fuzzy set, a neural network and data clustering. In addition, the non-stochastic uncertainty emerging from vagueness and imprecision is handled using ANFIS. The MFs produced by ANFIS are used for representing and reasoning about how users provide ratings according to their perception of items’ features. These MFs are continuous and more accurate in representing the features of items and user feedback. Furthermore, to prevent the overfitting problem discussed in previous research [24], [28], subtractive clustering is applied to fine-tune the ANFIS models, and a checking set is used to guard against overfitting to the training data.
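The kind of fuzzy inference that such a trained system performs can be sketched as a first-order Sugeno model with Gaussian MFs, mapping four criteria ratings to an overall rating. The sketch is an assumption-laden illustration: the two rules in `RULES` and all their parameters are set by hand for the example, whereas ANFIS would learn the MF and consequent parameters from data, and the rule structure would come from subtractive clustering.

```python
from math import exp

def gaussian_mf(x, center, sigma):
    """Continuous Gaussian membership degree of x, in (0, 1]."""
    return exp(-((x - center) ** 2) / (2.0 * sigma ** 2))

# Hand-set, hypothetical rules. Each rule has one (center, sigma) MF per
# criterion and a linear consequent: bias + sum(coef * criterion).
RULES = [
    {"mfs": [(1.0, 1.5)] * 4, "coef": [0.20] * 4, "bias": 0.0},  # low ratings
    {"mfs": [(5.0, 1.5)] * 4, "coef": [0.25] * 4, "bias": 0.0},  # high ratings
]

def sugeno_predict(criteria, rules):
    """Overall rating as the firing-strength-weighted mean of rule outputs."""
    weights, outputs = [], []
    for rule in rules:
        strength = 1.0
        for x, (center, sigma) in zip(criteria, rule["mfs"]):
            strength *= gaussian_mf(x, center, sigma)  # product as fuzzy AND
        output = rule["bias"] + sum(
            a * x for a, x in zip(rule["coef"], criteria))
        weights.append(strength)
        outputs.append(output)
    total = sum(weights)
    if total == 0.0:  # numerical underflow: fall back to the plain mean
        return sum(outputs) / len(outputs)
    return sum(w * y for w, y in zip(weights, outputs)) / total

print(sugeno_predict((5.0, 5.0, 5.0, 5.0), RULES))
print(sugeno_predict((3.0, 4.0, 2.0, 5.0), RULES))
```

Because every rule fires to some degree, the prediction varies smoothly with the criteria ratings instead of jumping between crisp classes.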

In the context of product recommendation, in practical applications and situations, customers prefer to rate items or express their preferences for items’ features in linguistic terms, such as {low interest}, {high interest} or {no interest}. This suggests designing multi-criteria CF so that giving ratings to items is user-friendly and convenient. For multi-criteria CF, fuzzy logic and fuzzy sets are therefore more appropriate than crisp approaches for human linguistic reasoning with imprecise concepts. In addition, linguistic terms are more suitable than numerical values for assessing qualitative information, which is usually related to human perceptions, opinions and tastes. Hence, in multi-criteria CF, it is more appropriate to let users express their preferences, knowledge and personal judgments in linguistic terms [29]. From this perspective, we can define users’ degrees of preference regarding a particular item as a set of linguistic terms such as {low interest}, {high interest} or {no interest} for the features of items. Furthermore, the fuzzy approach provides a way to quantify the non-stochastic uncertainty that is induced by imprecision, vagueness and subjectivity. For this kind of uncertainty, modeling with a fuzzy approach is more reliable than traditional statistical methods such as the Bayesian method, which handle uncertainty due to randomness. Moreover, the fuzzy rules discovered from the users’ ratings through ANFIS can be maintained in a rules database and used in subsequent predictions for item recommendation. These properties promise to provide a framework for addressing the representation and inference challenges in multi-criteria CF research.
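To make the link between linguistic terms and numerical ratings concrete, the sketch below assigns a rating a degree of membership in each of the three terms. It is a simplified illustration: the triangular shapes, their breakpoints and the 1 to 5 scale are assumptions chosen for readability, not the MFs used in the paper, which are produced by ANFIS.

```python
def triangular(x, a, b, c):
    """Triangular MF with feet at a and c and peak at b (a <= b <= c)."""
    if x == b:
        return 1.0
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

# Hypothetical linguistic terms on a 1-to-5 rating scale: (a, b, c).
TERMS = {
    "no interest": (1.0, 1.0, 3.0),
    "low interest": (1.0, 3.0, 5.0),
    "high interest": (3.0, 5.0, 5.0),
}

def fuzzify(rating):
    """Degree to which a crisp rating belongs to each linguistic term."""
    return {term: triangular(rating, *abc) for term, abc in TERMS.items()}

print(fuzzify(4.0))
print(fuzzify(1.0))
```

A rating of 4 is partly {low interest} and partly {high interest} at once, which is the graded, non-stochastic notion of uncertainty that a crisp threshold cannot express.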

In this study, we consider the proposed method for movie domain recommender systems. However, the method can also be adopted for e-business and e-government applications recommender systems such as recommender systems developed by Zhang et al. [30] and Shambour and Lu [31], [32] for e-business and e-government applications, respectively.

Finally, we perform an in-depth experimental evaluation using multi-aspect user ratings of items obtained from the Yahoo!Movies network, and conduct several comparisons between our method and other algorithms.

Thus, in comparison with research efforts found in the literature, our work has the following differences. In this research:

  • A new hybrid recommendation model using HOSVD and Neuro-Fuzzy techniques is proposed for increasing the predictive accuracy and improving the scalability of multi-criteria CF.

  • The sparsity issue in overall ratings is addressed using the Neuro-Fuzzy technique.

  • HOSVD is used for scalability improvement.

The remainder of this paper is organized as follows. In Section 2, the research background and related work are described. The HOSVD dimensionality reduction technique, the k-Nearest Neighbor (k-NN) classifier, ANFIS and subtractive clustering are introduced in separate subsections of Section 3. Section 4 provides an overview of the research methodology. Section 5 presents the results and discussion. Finally, conclusions and future work are presented in Section 6.

Section snippets

Research background and related work

In the area of personalized web search, Sun et al. [33] proposed Cube singular value decomposition (CubeSVD) to improve web search. Their CubeSVD analysis, which also used the HOSVD technique, allowed web search activities to be carried out more efficiently. They evaluated the method on MSN search engine data. In the field of recommender systems, several recommendation models have been proposed that use three-dimensional tensors for recommending music, objects and tags. Recommender models, using

Higher Order Singular Value Decomposition (HOSVD)

To represent and recognize high-dimensional data effectively, the dimensionality reduction is conducted on the original dataset for low-dimensional representation [57]. Visualizing, comparing, and decreasing processing time of data are the main advantages of dimensionality reduction techniques. HOSVD is one of the powerful dimensionality reduction techniques for tensor decomposition proposed by Lathauwer et al. [58]. They proposed HOSVD as a generalization of the SVD that is used for tensors

Research methodology

Fig. 5 shows the general framework of the proposed method, which combines HOSVD for dimensionality reduction with ANFIS and subtractive clustering for discovering knowledge from users’ ratings and predicting overall ratings.

In the first step, we apply the HOSVD for dimensionality reduction to reveal the latent associations among the components in the user-item-criteria tensor. Then, we perform cosine-based similarity for clustering to obtain groups of similar users and determine labels

Result and discussion

In order to analyse the effectiveness of the proposed method, several experiments were conducted on the Yahoo!Movies dataset provided by the Yahoo! Research Alliance Webscope program (http://webscope.sandbox.yahoo.com).

On the Yahoo!Movies network, users could rate movies in 4 dimensions (Story, Acting, Direction and Visuals) and assign an overall rating. Ratings were given on a 13-level scale. The four features of each movie were denoted as: C1 = Acting, C2 = Story, C3 = Visuals and C4 = Directing.
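The sketch below shows one way such ratings could be arranged into a user-item-criteria tensor, with the criteria ordered C1 to C4 as above. It is only an illustration: the `records` are invented, the 13 levels are assumed to be encoded as the integers 1 to 13, and missing entries are marked with `None`; none of this reflects the actual file format of the dataset.

```python
# Criteria in the order C1..C4 given in the text.
CRITERIA = ("Acting", "Story", "Visuals", "Directing")

# Hypothetical records: (user, movie, criteria ratings, overall rating),
# with the 13-level scale assumed to be encoded as integers 1..13.
records = [
    ("u1", "m1", {"Acting": 12, "Story": 10, "Visuals": 9, "Directing": 11}, 11),
    ("u1", "m2", {"Acting": 5, "Story": 6, "Visuals": 8, "Directing": 5}, 6),
    ("u2", "m1", {"Acting": 13, "Story": 12, "Visuals": 10, "Directing": 12}, 12),
]

def build_tensor(rows):
    """Return users, movies, a users x movies x criteria tensor, and overall."""
    users = sorted({u for u, _, _, _ in rows})
    movies = sorted({m for _, m, _, _ in rows})
    u_idx = {u: i for i, u in enumerate(users)}
    m_idx = {m: j for j, m in enumerate(movies)}
    tensor = [[[None] * len(CRITERIA) for _ in movies] for _ in users]
    overall = [[None] * len(movies) for _ in users]
    for user, movie, crit, r0 in rows:
        i, j = u_idx[user], m_idx[movie]
        tensor[i][j] = [crit[name] for name in CRITERIA]
        overall[i][j] = r0
    return users, movies, tensor, overall

def density(overall):
    """Fraction of user-movie pairs that have an observed overall rating."""
    cells = [value for row in overall for value in row]
    return sum(value is not None for value in cells) / len(cells)

users, movies, tensor, overall = build_tensor(records)
print(tensor[0][0], overall[0][0], density(overall))
```

The unobserved cells are exactly where the sparsity problem shows up: one minus the density is the share of overall ratings the system has to predict.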

Conclusion and future work

In this paper, a new method was proposed that combines HOSVD with ANFIS and subtractive clustering to improve the recommendation quality and predictive accuracy of multi-criteria CF. We proposed this method to overcome existing shortcomings in multi-criteria CF, namely predicting the overall ratings, sparsity, scalability, and the uncertainty induced by vagueness and imprecision in representing and reasoning about items’ features.

Using HOSVD, we reduced the noise of

Acknowledgements

The authors would like to acknowledge the support of Universiti Teknologi Malaysia (UTM) for providing financial assistance. We would like to thank Prof. Dietmar Jannach for providing us with a multi-criteria data set for our experiment and Ehsan Shekarian for his helpful comments in revising the paper. Appreciation also goes to the editors and anonymous reviewers for their valuable comments and suggestions, which were helpful in improving the paper.

References (81)

  • Q. Shambour et al., A trust-semantic fusion-based recommendation approach for e-business applications, Dec. Supp. Syst. (2012)
  • Y. Cao et al., An intelligent fuzzy-based recommendation system for consumer electronic products, Expert Syst. Appl. (2007)
  • L.M. de Campos et al., A collaborative recommender system based on probabilistic inference from fuzzy observations, Fuzzy Sets Syst. (2008)
  • R.R. Yager, Fuzzy logic methods in recommender systems, Fuzzy Sets Syst. (2003)
  • L.A. Zadeh, Fuzzy sets, Inform. Control (1965)
  • S. Petrovic-Lazarevic et al., Neuro-fuzzy modelling in support of knowledge management in social regulation of access to cigarettes by minors, Knowl.-Based Syst. (2004)
  • A. Bouchachia et al., Enhancement of fuzzy clustering by mechanisms of partial supervision, Fuzzy Sets Syst. (2006)
  • A. Bilge et al., A comparison of clustering-based privacy-preserving collaborative filtering schemes, Appl. Soft Comput. (2013)
  • A. Bilge et al., A scalable privacy-preserving recommendation scheme via bisecting k-means clustering, Inform. Process. Manage. (2013)
  • S.S. Anand, B. Mobasher, Intelligent techniques for web personalization, in: Proceedings of the 2003 International...
  • N. Mehrbakhsh et al., Collaborative filtering recommender systems, Res. J. Appl. Sci., Eng. Technol. (2013)
  • G. Adomavicius et al., Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions, IEEE Trans. Knowl. Data Eng. (2005)
  • M. Deshpande et al., Item-based top-n recommendation algorithms, ACM Trans. Inform. Syst. (TOIS) (2004)
  • G. Bordogna et al., A flexible multi criteria information filtering model, Soft Comput. (2010)
  • B. Sarwar, G. Karypis, J. Konstan, J. Riedl, Application of dimensionality reduction in recommender system-a case...
  • J.A. Konstan et al., GroupLens: applying collaborative filtering to Usenet news, Commun. ACM (1997)
  • B. Sarwar, G. Karypis, J. Konstan, J. Riedl, Item-based collaborative filtering recommendation algorithms, in:...
  • J.L. Herlocker et al., Evaluating collaborative filtering recommender systems, ACM Trans. Inform. Syst. (TOIS) (2004)
  • J.S. Breese, D. Heckerman, C. Kadie, Empirical analysis of predictive algorithms for collaborative filtering, in:...
  • K. Goldberg et al., Eigentaste: a constant time collaborative filtering algorithm, Inform. Retriev. (2001)
  • L.M. de Campos et al., Using second-hand information in collaborative recommender systems, Soft Comput. (2010)
  • Y. Koren et al., Matrix factorization techniques for recommender systems, Computer (2009)
  • P. Symeonidis, M.M. Ruxanda, A. Nanopoulos, Y. Manolopoulos, Ternary semantic analysis of social tags for personalized...
  • G. Adomavicius et al., New recommendation techniques for multicriteria rating systems, Intell. Syst., IEEE (2007)
  • D. Jannach, Z. Karakaya, F. Gedikli, Accuracy improvements for multi-criteria recommender systems, in: Proceedings of...
  • G. Adomavicius et al., Multi-criteria recommender systems
  • V. Nourani et al., A geomorphology-based ANFIS model for multi-station modeling of rainfall-runoff process, J. Hydrol. (2013)
  • S. Sen, J. Vig, J. Riedl, Tagommenders: connecting users to items through tags, in: Proceedings of the 18th...
  • J. Lu et al., A web-based personalized business partner recommendation system using fuzzy semantic techniques, Comput. Intell. (2013)
  • Q. Shambour et al., A hybrid trust-enhanced collaborative filtering recommendation approach for personalized government-to-business e-services, Int. J. Intell. Syst. (2011)