Expert Systems with Applications

Volume 83, 15 October 2017, Pages 300-313
A hybrid recommender system using artificial neural networks

https://doi.org/10.1016/j.eswa.2017.04.046

Highlights

  • Neural network based hybrid recommender system utilizing review metadata is proposed.

  • The system optimizes model hyper-parameters to minimize log-loss.

  • The predictive capability of the model is validated against heterogeneous business categories.

Abstract

In the context of recommendation systems, metadata information from reviews written for businesses has rarely been considered in traditional systems developed using content-based and collaborative filtering approaches. Collaborative filtering and content-based filtering are popular memory-based methods for recommending new products to users, but they suffer from some limitations and fail to provide effective recommendations in many situations. In this paper, we present a deep learning neural network framework that utilizes reviews in addition to content-based features to generate model-based predictions for the business-user combinations. We show that a set of content and collaborative features allows for the development of a neural network model with the goal of minimizing logloss and rating misclassification error using the stochastic gradient descent optimization algorithm. We empirically show that the hybrid approach is a very promising solution when compared to a standalone memory-based collaborative filtering method.

Introduction

When the processing capability of a system is far exceeded by the amount of input, information overload occurs; technology has been the primary cause of this problem in recent times. In addition, when a large amount of data is available in a variety of formats and is produced too fast for the user to process efficiently, it leads to information overload (Edmunds & Morris, 2000). Consequently, when information overload occurs, the quality and effectiveness of decisions suffer (Speier, Valacich, & Vessey, 1999). In the current era of smart devices and Web 2.0, a massive amount of data is generated from a wide variety of sources, including but not limited to social networking sites, multimedia sharing sites, hosted services, email, group communications, and instant messages. According to Vickery and Vickery (2005), it is getting harder for users to find relevant content on the web, and on many occasions this information overload can cause users to omit something important or make an error while making a decision. Whether it is for personal or professional needs, users cannot ignore the information available to them but need intelligent techniques that can efficiently filter the data and present the most relevant information.

A recommender system (RS) is an application that is built to cope with the problem of information overload and provide intelligent suggestions on items to users (Resnick & Varian, 1997; Ricci, Rokach, & Shapira, 2011). This ability of a RS to provide an efficient means to find relevant items is extremely useful, which has led RS to become an important research area that has attracted attention in both academia and industry (Adomavicius & Tuzhilin, 2005). In simple terms, a RS generates a personalized list of ranked items for users to buy from, where rank is computed from heterogeneous sources of data acquired from the user, such as interests, likes, dislikes, and demographics (Ricci et al., 2011). Interest in recommender systems has remained high because of the wide variety of applications that can help deal with information overload by providing personalized recommendations. Examples of these recommendation systems are product recommendations by Amazon.com (Linden, Smith, & York, 2003), movies by Netflix (Amatriain, 2013), news articles (Nanas, Vavalis, & Houstis, 2010), financial services (Felfernig, Isak, Szabo, & Zachar, 2007), and Twitter (Gupta et al., 2013). A RS can support a variety of functions that give service providers reason to deploy these techniques on their infrastructure. Some of the more important functions of a RS are increasing sales by selling more items, improving user experience and satisfaction, and better understanding users’ needs (Ricci et al., 2011).

In this research, we develop a new hybrid RS technique that builds on the capabilities provided by traditional approaches like collaborative filtering and content-based filtering by utilizing the metadata associated with review text to train and build an Artificial Neural Network (ANN). We develop a multi-categorical classification model that predicts the class of a rating. LogLoss, a convex function, is the cost function minimized by applying stochastic gradient descent, and the accuracy of predictions is used to measure the efficiency of the model. We perform computational experiments using the Yelp Academic Dataset and demonstrate a 75% improvement in the accuracy of predictions when compared against a user-based collaborative filtering algorithm. From all the models reviewed during the training phase, we choose the model with the lowest logloss and use its parameter weights for making predictions on the test dataset. We assess the effectiveness of the hybrid model by analyzing the percentage of observations with correct predictions. We also assess the effectiveness of these rating predictions when translated to yes/no recommendations.

Recommender systems are broadly classified into the following categories based on the underlying technique used for making recommendations (Adomavicius & Tuzhilin, 2005; Burke, 2002; Jannach, Zanker, Felfernig, & Friedrich, 2010):

  1. Collaborative filtering: It evaluates the relevance of an item for a user based on the opinion of the members of a community (or cluster) (Nanas et al., 2010). It runs on the premise that users with similar interests tend to prefer similar items.

  2. Content-based filtering: Such systems are developed on the assumption that items with similar characteristics will be rated in a similar way by the users (Wu, Chang, & Liu, 2014). That is, it recommends items that are similar to the ones liked by the user in the past.

  3. Knowledge-based recommender systems: Such a system recommends products based on specific domain knowledge of how certain item features satisfy users’ needs and specifications (Ricci et al., 2011).

  4. Hybrid recommendation systems: Any combination of two or more techniques described above can be categorized as a hybrid recommendation system (Jannach et al., 2010).

Artificial neural networks (ANNs) have a long history, and the research of McCulloch and Pitts (1943) and Cochocki and Unbehauen (1993) is generally considered the beginning of neurocomputing. ANNs started to gain popularity in the 1980s (Cochocki & Unbehauen, 1993) as the computational capabilities of computers started to meet the demands of an ANN. By choosing appropriate non-linear activation functions, ANNs can be modeled to detect complex nonlinear relationships between features and independent variables, to identify higher polynomial features and their interactions, and to leverage the availability of multiple optimization algorithms (Tu, 1996). Therefore, a model built using an ANN is well positioned to learn the complex relationships between users and items, as well as to predict better recommendations. Deep learning is a new and evolving field in machine learning where powerful models are developed by utilizing a deep structured neural network (Theano, 2015).

Extensive work has been done to build recommendation systems using various machine learning techniques such as Bayesian methods (Guo, 2011), clustering (Pham, Cao, Klamma, & Jarke, 2011), ANNs (Gunawardana & Meek, 2009), linear regression (Ge, Liu, Qi, & Chen, 2011), and probabilistic models (Li, Dias, El-Deredy, & Lisboa, 2007). Each method trains a model that learns from the data with the goal of minimizing error while predicting the rating of an item for each user. Hybrid recommendation systems combine two or more recommendation techniques to improve performance and have fewer limitations compared to stand-alone techniques. Below is a list of methods that are commonly used in building hybrid recommendation systems (Burke, 2002):

  • Weighted: A linear combination of predictions from different recommendation techniques is computed to get the final recommendation.

  • Switching: Using a switching criterion, the system switches between different recommendation techniques.

  • Mixed: Recommendations derived from the various techniques are presented as a unified list, without applying any computation to combine the results.

  • Feature Combination: Results from the collaborative technique are used as another feature to build a content-based system over the augmented feature set.

  • Cascade: Multistage technique that combines the results from different recommendation techniques in a prioritized manner.

  • Feature Augmentation: Another technique that runs in multiple stages such that the rating or classification from an initial stage is used as an additional feature in the subsequent stages.

  • Meta-level: Model generated from a recommendation technique acts as an input to the next recommendation technique in the following stage.
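
The first two methods in this list are simple enough to sketch directly. The following is a minimal illustration, not taken from the paper: the function names, the default weight of 0.5, and the `min_ratings` switching criterion are all assumptions chosen for the example.

```python
from typing import Optional


def weighted_hybrid(cf_score: float, cb_score: float, cf_weight: float = 0.5) -> float:
    """Weighted hybrid: linear combination of two recommenders' predictions.

    cf_weight is a hypothetical tuning parameter, not a value from the paper.
    """
    if not 0.0 <= cf_weight <= 1.0:
        raise ValueError("cf_weight must lie in [0, 1]")
    return cf_weight * cf_score + (1.0 - cf_weight) * cb_score


def switching_hybrid(cf_score: Optional[float], cb_score: float,
                     n_user_ratings: int, min_ratings: int = 5) -> float:
    """Switching hybrid: use the collaborative score only when the user has
    enough rating history (an assumed criterion); otherwise fall back to the
    content-based score."""
    if cf_score is not None and n_user_ratings >= min_ratings:
        return cf_score
    return cb_score
```

For instance, `weighted_hybrid(4.0, 2.0, cf_weight=0.75)` blends a collaborative score of 4.0 with a content-based score of 2.0, while `switching_hybrid` returns the content-based score for a user with too little history.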

Content-based and collaborative recommendation techniques will always suffer from “new item” and/or “new user” problems, since both techniques require prior history to produce effective predictions; the hybridization techniques discussed above alleviate some of these limitations.

Balabanović and Shoham (1997) propose a hybrid recommendation engine that performs content-based analysis using user profiles to identify clusters of similar users. This knowledge is used towards building a collaborative recommendation. This approach has the advantage of making good recommendations to users that do not share similarity with other users by building a profile based on content of the items. Melville, Mooney, and Nagarajan (2002) develop a content-boosted collaborative filtering recommendation technique that proposes a method to solve the problem of sparse rating data. For each user, a vector of pseudo user-ratings is created such that if the user has rated an item, that rating is used and if an item is not rated, a pure content-based recommendation system is used to predict the rating. Using the pseudo user-rating vectors for all users, a dense matrix of user-ratings is generated that is then used as input to collaborative filtering to compute final recommendations. User pair similarity is computed using the Pearson correlation coefficient. Harmonic mean weighting is computed to incorporate confidence in correlations, and a final content-boosted collaborative filtering prediction for an active user is generated.
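
The pseudo user-rating step described for Melville et al. (2002) can be sketched as follows. This is an illustrative reading of the paragraph above, not the authors' implementation: missing ratings are represented as `None`, the content-based predictions are assumed to be given, and the harmonic mean weighting is omitted because its exact form is not stated here.

```python
import math
from typing import List, Optional


def pseudo_ratings(observed: List[Optional[float]],
                   content_pred: List[float]) -> List[float]:
    """Keep each observed rating; where the user has not rated an item (None),
    substitute the content-based prediction for that item."""
    return [c if r is None else r for r, c in zip(observed, content_pred)]


def pearson(u: List[float], v: List[float]) -> float:
    """Pearson correlation between two dense rating vectors of equal length.
    Returns 0.0 when either vector has zero variance."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    num = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    den = math.sqrt(sum((a - mean_u) ** 2 for a in u) *
                    sum((b - mean_v) ** 2 for b in v))
    return num / den if den > 0 else 0.0
```

Because every pseudo-rating vector is dense, the similarity between two users can be computed over all items rather than only over the items both users happened to rate.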

In situations where there is a predictable pattern in user actions, as in an e-learning environment, the items a user is interested in depend on how far he/she is in the learning process of any subject/course. Chen, Niu, Zhao, and Li (2014) propose a two-stage hybrid recommendation system for an e-learning environment to recommend items in users’ learning process. The two stages of this system are: (1) item-based collaborative filtering to discover related item sets, and (2) a sequential pattern mining algorithm to filter items according to common learning patterns.

Schein, Popescul, Ungar, and Pennock (2002) propose a hybrid recommendation technique that builds a single probabilistic framework utilizing both content and collaborative features. Using a new performance metric called the CROC curve, it is empirically demonstrated that the various components of the framework combine in an effective manner to enhance the performance characteristics of the recommendation systems. Gunawardana and Meek (2009) also propose a new hybrid system using a model-based approach with unified Boltzmann machines, which are probabilistic models that combine content and collaborative information in a coherent manner. Using the content and collaborative information as feature vectors, parameter weights that reflect how well each feature predicts user actions are learned by model training. This unified approach has an advantage over other approaches because there is no need for careful feature engineering or post-hoc hybridization of distinct recommender systems. The approach of Gunawardana and Meek (2009) is constrained to predicting future binary actions by the user, such as buying a book or watching a movie.

Schein et al. (2002), Balabanović and Shoham (1997), Melville et al. (2002), Chen, Chen, and Wang (2015), and Gunawardana and Meek (2009) all propose different kinds of hybrid models that utilize both content and collaborative information to develop a combination of memory and model-based recommendation systems. In this era of Web 2.0, it is common for most websites to give users an option to write reviews of the products in addition to providing a rating. Text analytics and/or sentiment analysis techniques enable the extraction of reviewers’ sentiment, latent factors, opinions, and contextual information. In situations where ratings are not available, reviews are used to infer the customers’ preference for a product from the opinion expressed in the reviews – referred to as a virtual rating as described by Zhang, Narayanan, and Choudhary (2010); this virtual rating is used in a traditional approach like collaborative filtering.

In situations where both reviews and ratings are available, extensive research has been done on using reviews to augment and enhance ratings with the collaborative filtering technique. Review information is combined with ratings in different ways, such as review helpfulness (Raghavan, Gunasekar, & Ghosh, 2012), context using Latent Dirichlet Allocation (Hariri, Mobasher, Burke, & Zheng, 2011; Moshfeghi, Piwowarski, & Jose, 2011; Zheng, 2014), overall opinion (Blattner & Medo, 2012; Pero & Horváth, 2013), and emotion (Moshfeghi et al., 2011).

Raghavan et al. (2012) propose a two-stage model that first estimates a review quality score using the formula: ratio of “helpful votes” to “total votes” (originally proposed by Kim, Pantel, Chklovski, & Pennacchiotti (2006)) which is weighted with the user rating. For recent reviews that do not have enough votes to compute a quality score, a regression model is trained to predict rating quality score directly from the review text. Various feature extraction methods were evaluated using text mining techniques like bag-of-words, topic modeling, content (metadata) based features, and a hybrid method using both text and metadata-based feature extraction. In the second stage, a probabilistic collaborative filtering model is developed based on a quality score weighted rating.
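
The helpfulness ratio in the first stage is straightforward to compute. The sketch below illustrates it together with one possible way of weighting ratings by quality, a quality-weighted mean. The weighted mean is an assumption made for illustration only (the second stage described above is a probabilistic collaborative filtering model), and `default_quality` is a hypothetical stand-in for the regression model used on reviews without votes.

```python
from typing import Iterable, Optional, Tuple


def quality_score(helpful_votes: int, total_votes: int) -> Optional[float]:
    """Ratio of helpful votes to total votes; None when a review has no votes."""
    if total_votes <= 0:
        return None
    return helpful_votes / total_votes


def quality_weighted_mean(reviews: Iterable[Tuple[float, int, int]],
                          default_quality: float = 0.5) -> float:
    """Mean rating in which each review is weighted by its quality score.

    reviews: iterable of (rating, helpful_votes, total_votes).
    default_quality: assumed placeholder weight for reviews without votes.
    """
    num = den = 0.0
    for rating, helpful, total in reviews:
        q = quality_score(helpful, total)
        weight = default_quality if q is None else q
        num += weight * rating
        den += weight
    if den == 0:
        raise ValueError("no review carries positive weight")
    return num / den
```

With two reviews rated 5 and 1 whose helpfulness ratios are 3/4 and 1/4, the weighted mean is pulled toward the rating of the more helpful review.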

There has been extensive research in the field of recommendation systems developed using contextual information (Adomavicius & Tuzhilin, 2011; Champiri, Shahamiri, & Salim, 2015; Hariri, Mobasher, & Burke, 2012; Hariri, Mobasher, Burke, & Zheng, 2011; Zheng, 2014). Champiri et al. (2015) conduct a review to identify contextual information and methods for recommendations in digital libraries. Hariri et al. (2011) propose a new context-aware model in which the inferred context is used to define a utility function for the items, reflecting how much each item is preferred by a user given his/her current context. Context inference is achieved using labeled LDA, which performed well for the “TripAdvisor” dataset used for this research. Standard item-based k-nearest neighbors is used to compute the utility score for an item i and user u, which is defined as a linear combination of predictedRating(u, i) and contextScore(u, i).

Opinion-based recommendation systems are another domain of active research. Pero and Horváth (2013) utilize both ratings and inferred opinions from product reviews to propose a new model using matrix factorization for predicting ratings. Blattner and Medo (2012) formulate a recommendation model within an opinion formation framework where social types play a major role and users’ opinions are assembled in two stages (external sources and social interactions).

Moshfeghi et al. (2011) tackle the issues of data sparsity and cold-start by proposing a framework that is an extension of LDA and gradient boosted trees that considers item-related semantics and emotion. They identify semantic and emotion space to construct latent groups of users, and in each space the probability that a user likes an item is computed. Finally, the information from different spaces is aggregated using supervised machine learning techniques like gradient boosted trees.

Amini, Nasiri, and Afzali (2014) and Lee, Choi, and Woo (2002) propose new recommender frameworks that also utilize content/metadata in addition to ratings to improve the accuracy of the item predictions. Both utilize an ANN to build a hybrid recommendation engine but differ in the role the ANN plays in their frameworks. Amini et al. (2014) build a hybrid classifier using collaborative and content-based data available in the MovieLens dataset; the following classification models were trained and evaluated: spiking neural network, multi-layer perceptron neural network, decision tree, and naive Bayes. Duch and Jankowski (1999) performed extensive research on activation functions (also referred to as transfer functions in the domain of ANNs), and Ozkan and Erbek (2003) compared the performance of the following most common activation functions (Civco & Waug, 1994; Kaminsky, Barad, & Brown, 1997): linear, logistic/sigmoid, and hyperbolic tangent.
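
For reference, the three activation functions named above can be written in a few lines. This is a generic sketch using their standard definitions; the function names are chosen for the example, and the branch in `logistic` is only there to avoid overflow for large negative inputs.

```python
import math


def linear(x: float) -> float:
    """Identity activation: output equals input."""
    return x


def logistic(x: float) -> float:
    """Logistic (sigmoid) activation, 1 / (1 + exp(-x)), with range (0, 1).
    Evaluated in a form that avoids overflow when x is very negative."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def hyperbolic_tangent(x: float) -> float:
    """Hyperbolic tangent activation with range (-1, 1)."""
    return math.tanh(x)
```

The two non-linear functions are closely related: tanh(x) equals 2 · logistic(2x) − 1, so they differ mainly in output range and centering.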

Scalability and performance are key metrics for deploying a solution in real production systems, and Lee et al. (2002) propose a two-stage self-organizing map (SOM) neural network based recommendation system to improve these key metrics. Lee et al. (2002) initially segment users based on demographics and then cluster them according to their item preferences using a SOM neural network. To recommend items for a user, the user’s cluster is first determined and the CF algorithm is applied to all the users who belong to the same cluster in order to predict the user’s preference for the items. Experimental results show that the proposed system has better predictability than the traditional CF-based approach and also greatly reduces the computational time needed to calculate correlation coefficients. Lee et al. (2002) differentiate their approach from other research by preprocessing the data to decrease the dimensionality of the user and item space before computing correlation coefficients.

Collaborative filtering algorithms based on user ratings are hugely popular. However, in this era of Web 2.0 where user reviews are available for most products/services, review text can be mined efficiently and used in combination with rating data to deliver high quality recommendations. Reviews are typically written in text form describing an assessment of the product or the experience of the service provided. However, a rating is a homogeneous value and does not capture the sentiment and/or context behind users’ experience in the way that a review would. For example, a young family really liked their experience at a restaurant that has child-friendly services readily available and, hence, gives a very high rating for the restaurant with the following review: “This is an excellent restaurant with child-friendly services that makes it an ideal place to visit for families with young children. However, if you are on date this is not ideal because it can be bit noisy”. High quality reviews tend to receive more votes from other users who find the review useful in their decision making process and/or share the sentiment expressed in the reviews. Hence, vote-related metadata associated with reviews can be extremely useful features in the development of a supervised learning model.

As far as we know, there is very little research done utilizing the meta-data associated with reviews and ratings in conjunction with user and business attributes to develop a supervised deep-learning rating prediction model. In general, an efficient recommender system should be able to model and capture the complex, nonlinear relationships between users and businesses. ANNs are particularly well-suited to learn about these relationships. In this paper, we extend the work from previous literature to incorporate quality of ratings, assessed by the number of votes received, in addition to content and collaborative features for businesses and users into the feature space and train a deep learning model using ANNs to predict the rating and evaluate the improvement in recommendation predictions. Further, we demonstrate how well the proposed technique works by performing computational experiments using the Yelp Academic Dataset and comparing its efficiency against a user based collaborative filtering model.

Materials and methods

For this research, we consider the problem of predicting the rating as a multi-label classification problem where each rating is treated as a label. Logarithmic Loss (or Log Loss) is a classification loss function that quantifies the accuracy of a classifier by penalizing false classifications (Murphy, 2012). Log Loss (LL), which is a convex function and can be minimized by stochastic gradient descent, is defined by Eq. (1):

$$LL = -\frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{m=5} y_{ij}\log p_{ij} \qquad (1)$$

where n is the number of training examples, m
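
A direct computation of the multi-class log loss can be sketched as follows. This is a plain illustration of the standard definition, -(1/n) Σᵢ Σⱼ yᵢⱼ log pᵢⱼ, not the paper's training code; the function name, the one-hot label encoding over five rating classes, and the clipping constant `eps` are assumptions made for the example.

```python
import math
from typing import Sequence


def log_loss(y_true: Sequence[Sequence[int]],
             p_pred: Sequence[Sequence[float]],
             eps: float = 1e-15) -> float:
    """Multi-class log loss: -(1/n) * sum_i sum_j y_ij * log(p_ij).

    y_true: one-hot rows, one per example (here 5 columns, one per rating class).
    p_pred: predicted class probabilities with the same shape.
    eps:    lower clip on probabilities so that log(0) is never evaluated.
    """
    n = len(y_true)
    if n == 0 or n != len(p_pred):
        raise ValueError("y_true and p_pred must be non-empty and equally long")
    total = 0.0
    for y_row, p_row in zip(y_true, p_pred):
        for y, p in zip(y_row, p_row):
            if y:  # only the true class contributes for one-hot labels
                total += y * math.log(min(max(p, eps), 1.0))
    return -total / n
```

A confident correct prediction contributes almost nothing to the loss, whereas assigning low probability to the true class is penalized heavily, which is why the loss rewards well-calibrated probabilities rather than only correct top-1 labels.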

Baseline model using collaborative filtering

Neighborhood-based algorithms are a popular choice in collaborative filtering recommendation systems. To assess the performance and efficiency of an ANN-based learning model, we compare its performance against a reference model developed using a user-based collaborative filtering algorithm for PA-based restaurants. A neighborhood for two users is the common set of restaurants that both users have rated. Based on the common neighborhoods, pair-wise user similarities are computed using Pearson
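
A user-based collaborative filtering baseline of this kind can be sketched in a few lines. The similarity below is the Pearson correlation over the restaurants both users have rated, as described above; the prediction rule (the user's mean rating plus a similarity-weighted average of neighbors' mean-centered ratings) is a common textbook choice and an assumption here, since the snippet does not show the exact formula used. All names are hypothetical.

```python
import math
from typing import Dict

Ratings = Dict[str, Dict[str, float]]  # user -> {restaurant: rating}


def pearson_common(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Pearson correlation over the restaurants rated by both users.
    Returns 0.0 with fewer than two common restaurants or zero variance."""
    common = sorted(set(a) & set(b))
    if len(common) < 2:
        return 0.0
    mean_a = sum(a[i] for i in common) / len(common)
    mean_b = sum(b[i] for i in common) / len(common)
    num = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    den = math.sqrt(sum((a[i] - mean_a) ** 2 for i in common) *
                    sum((b[i] - mean_b) ** 2 for i in common))
    return num / den if den > 0 else 0.0


def predict(ratings: Ratings, user: str, item: str) -> float:
    """Predict user's rating of item from similar users who rated it.
    Falls back to the user's own mean rating when no neighbor is usable."""
    own = ratings[user]
    mean_user = sum(own.values()) / len(own)
    num = den = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        sim = pearson_common(own, other_ratings)
        if sim == 0.0:
            continue
        mean_other = sum(other_ratings.values()) / len(other_ratings)
        num += sim * (other_ratings[item] - mean_other)
        den += abs(sim)
    return mean_user if den == 0 else mean_user + num / den
```

Note that the prediction is not clipped to the rating scale in this sketch; a production baseline would typically clip or round the result to a valid rating.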

Results and discussion

Results from the initial set of experiments are presented in Table 6, Table 7, Table 8 sorted by increasing validation loss. T-Model16-L0.0001-D0.1 has the best performance with lowest validation loss of 0.128386 achieved at epoch 2,986 and the training error at this epoch is 0.329144. We also observed that validation loss continues to get better even at 3,000 epochs as shown by the very high epoch at which lowest validation loss is observed. However, the rate of model improvement is

Conclusions and future work

In this section, we summarize the contributions of the research work and present future research problems within the domain of recommendation systems.

References (64)

  • M. Balabanović et al. Fab: Content-based, collaborative recommendation. Communications of the ACM (1997)
  • Blattner, M., & Medo, M. (2012). Recommendation systems in the scope of opinion formation: A model. arXiv.org,...
  • J.S. Breese et al. Empirical analysis of predictive algorithms for collaborative filtering. Proceedings of the fourteenth conference on uncertainty in artificial intelligence (1998)
  • R. Burke. Hybrid recommender systems: Survey and experiments. User Modeling and User-Adapted Interaction (2002)
  • D. Carrillo et al. Multi-label classification for recommender systems (2013)
  • L. Chen et al. Recommender systems based on user reviews: The state of the art. User Modeling and User-Adapted Interaction (2015)
  • W. Chen et al. A hybrid recommendation algorithm adapted in e-learning environments. World Wide Web (2014)
  • Chollet, F. (2015). Keras....
  • D.L. Civco et al. Classification of multispectral, multitemporal, multisource spatial data using artificial neural networks. Proceedings of ASPRS/ACSM (1994)
  • A. Cochocki et al. Neural networks for optimization and signal processing (1993)
  • H.B. Curry. The method of steepest descent for non-linear minimization problems. Quarterly of Applied Mathematics (1944)
  • L.T. DeCarlo. On the meaning and use of kurtosis. Psychological Methods (1997)
  • S. Dooms et al. Offline optimization for user-specific hybrid recommender systems. Multimedia Tools and Applications (2015)
  • W. Duch et al. Survey of neural transfer functions. Neural Computing Surveys (1999)
  • A. Felfernig et al. The vita financial services sales support environment. Proceedings of the 19th national conference on innovative applications of artificial intelligence - volume 2 (2007)
  • X. Ge et al. A new prediction approach based on linear regression for collaborative filtering. 2011 eighth international conference on fuzzy systems and knowledge discovery (FSKD) (2011)
  • S. Geman et al. Neural networks and the bias/variance dilemma. Neural Computation (1992)
  • A. Gunawardana et al. A unified approach to building hybrid recommender systems. Proceedings of the third ACM conference on recommender systems (2009)
  • S. Guo. Bayesian recommender systems: Models and algorithms (2011)
  • P. Gupta et al. Wtf: The who to follow service at twitter. Proceedings of the 22nd international conference on world wide web (2013)
  • N. Hariri et al. Context-aware music recommendation based on latent topic sequential patterns. Proceedings of the sixth ACM conference on recommender systems (2012)
  • N. Hariri et al. Context-aware recommendation based on review mining. IJCAI’11, proceedings of the 9th workshop on intelligent techniques for web personalization and recommender systems (2011)