Introduction
- The combination of a DNN and the DBSCAN clustering algorithm in the CBR core.
- The combination of hybrid similarity criteria and the new Pro-FriendLink algorithm in the CRS.
- The proposed Pro-FriendLink algorithm as a new link-prediction method for RSs.
Related works
Authors | Ref | Methods | Advantages | Disadvantages |
---|---|---|---|---|
Kim et al. | [17] | Collaborative error-reflected models for cold-start recommendation system | 1. High speed | 1. Low accuracy 2. Low precision |
Bobadilla et al. | [18] | A collaborative filtering approach to mitigate the new user cold start problem | 1. Normal accuracy | 1. Complex model |
Byström | [19] | Movie recommendations from user ratings | 1. Good accuracy | 1. Complex model |
Lika et al. | [20] | Facing the cold start problem in recommender system | 1. Fast execution time | 1. High MAE 2. High RMSE |
Pereira and Hruschka | [21] | Simultaneous co-clustering and learning to address the cold start problem in recommendation system | 1. High speed | 1. Low accuracy 2. Low precision |
Sperlì et al. | [22] | A social media recommendation system | 1. Normal accuracy | 1. Complex model |
Kutty et al. | [23] | A people-to-people recommendation system using tensor space models | 1. High speed | 1. Low accuracy 2. Low precision |
Lin and Chi | [24] | Novel movie recommendation system based on collaborative filtering and neural networks | 1. Fast execution time | 1. High MAE 2. High RMSE |
Walek and Fojtik | [25] | Module combining a collaborative filtering system, a content-based system, and a fuzzy expert system | 1. High speed | 1. Complex model |
The proposed method
Phase 1: Content-based recommender system (CBRS)
Clustering all users with the DBSCAN algorithm
User code | Gender | Age | Job |
---|---|---|---|
1 | 1 | 56 | 16 |
2 | 1 | 25 | 15 |
3 | 1 | 45 | 7 |
4 | 1 | 25 | 20 |
5 | 2 | 50 | 9 |
* 0: “other” or not specified
* 1: “academic/educator” |
* 2: “artist” |
* 3: “clerical/admin” |
* 4: “college/grad student” |
* 5: “customer service”
* 6: “doctor/health care” |
* 7: “executive/managerial” |
* 8: “farmer” |
* 9: “homemaker” |
* 10: “K-12 student” |
* 11: “lawyer” |
* 12: “programmer” |
* 13: “retired” |
* 14: “sales/marketing” |
* 15: “scientist” |
* 16: “self-employed” |
* 17: “technician/engineer” |
* 18: “tradesman/craftsman” |
* 19: “unemployed” |
* 20: “writer” |
User code | Gender | Age | Job | Cluster |
---|---|---|---|---|
1 | 1 | 56 | 16 | Cluster_1 |
2 | 1 | 25 | 15 | Cluster_2 |
3 | 1 | 45 | 7 | Cluster_3 |
4 | 1 | 25 | 20 | Cluster_2 |
5 | 2 | 50 | 9 | Cluster_1 |
- It struggles to detect clusters of different densities, and to separate clusters that lie close to each other.
- Determining suitable values for its parameters (eps and minPts) is one of its most important problems.
- It does not perform well on high-dimensional data and high-volume databases.
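Phase 1's user clustering can be sketched with scikit-learn's DBSCAN implementation. This is a minimal illustration only: the feature columns follow the sample table above, but the `eps` and `min_samples` values are illustrative assumptions, not the paper's tuned parameters.

```python
# Sketch of Phase 1 clustering: group users by demographic profile
# with DBSCAN. eps and min_samples are illustrative assumptions and
# must be tuned per dataset (a known DBSCAN weakness noted above).
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.preprocessing import StandardScaler

# Demographic profiles: [gender, age, job code] as in the sample table.
users = np.array([
    [1, 56, 16],
    [1, 25, 15],
    [1, 45, 7],
    [1, 25, 20],
    [2, 50, 9],
])

# Scale features so that age does not dominate the Euclidean distance.
X = StandardScaler().fit_transform(users)

labels = DBSCAN(eps=1.5, min_samples=2).fit_predict(X)
print(labels)  # -1 marks noise points that join no cluster
```

Standardizing before clustering matters here because the raw features live on very different scales (gender in {1, 2}, age up to ~60, job codes up to 20).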
Separation of training and test samples
Classification of new users with DNN
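Once clusters exist, a new (cold-start) user can be routed to one of them from demographic features alone. The paper's DNN architecture is not specified in this outline, so the sketch below uses scikit-learn's `MLPClassifier` as a stand-in; the layer sizes, training labels, and new-user profile are illustrative assumptions.

```python
# Sketch of Phase 1 classification: assign a new user to an existing
# cluster. MLPClassifier stands in for the paper's DNN; hidden layer
# sizes and the toy training data are assumptions.
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

# Training users (gender, age, job code) with cluster labels from DBSCAN.
X_train = np.array([[1, 56, 16], [1, 25, 15], [1, 45, 7],
                    [1, 25, 20], [2, 50, 9]])
y_train = np.array([0, 1, 2, 1, 0])  # cluster indices

scaler = StandardScaler().fit(X_train)
clf = MLPClassifier(hidden_layer_sizes=(16, 8), max_iter=2000,
                    random_state=0)
clf.fit(scaler.transform(X_train), y_train)

# A new user with no rating history still has demographics, so the
# classifier can place them in a behavioural cluster immediately.
pred = int(clf.predict(scaler.transform([[1, 26, 15]]))[0])
print(pred)
```

This is the step that addresses cold start: the cluster assignment gives the new user an initial neighbourhood before any ratings exist.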
Phase 2: CRS based on hybrid similarity criterion
Numeric features
String features
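A hybrid similarity criterion has to compare numeric and string features on a common [0, 1] scale before combining them. The exact formulas are the paper's; the sketch below shows one plausible reading, using a range-normalised distance for numbers and Python's stdlib `difflib` ratio for strings. The field names, value range, and the 50/50 weighting are assumptions.

```python
# Sketch of a hybrid similarity criterion: numeric features compared
# by a normalised distance, string features by a sequence-matching
# ratio; both land in [0, 1] and are blended with assumed weights.
from difflib import SequenceMatcher

def numeric_sim(a, b, value_range):
    """1.0 when equal, 0.0 at the maximum possible difference."""
    return 1.0 - abs(a - b) / value_range

def string_sim(a, b):
    """Similarity of two strings in [0, 1] via difflib's ratio."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def hybrid_sim(u, v, w_num=0.5, w_str=0.5):
    num = numeric_sim(u["age"], v["age"], value_range=100)
    txt = string_sim(u["job"], v["job"])
    return w_num * num + w_str * txt

u = {"age": 25, "job": "scientist"}
v = {"age": 26, "job": "writer"}
s = hybrid_sim(u, v)
print(round(s, 3))
```

Because both components are bounded in [0, 1] and the weights sum to 1, the hybrid score is itself bounded in [0, 1], which keeps later thresholding (for the adjacency matrix) well defined.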
Formation of adjacency matrix
Predicting new user’s rating
- Degree of the nodes.
- User popularity.
- Number of routes.
- Node balance.
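The factors above can be tied together in a small sketch: users whose hybrid similarity exceeds a threshold become linked in the adjacency matrix, node degree and popularity fall out of that matrix, and a new user's rating is predicted from linked neighbours weighted by similarity. The similarity values, the 0.5 threshold, and the weighted-average prediction rule are illustrative assumptions, not the paper's exact formulas (number of routes and node balance are not modelled here).

```python
# Sketch: adjacency matrix from thresholded similarities, then a
# similarity-weighted rating prediction. All values are toy data.
import numpy as np

# Pairwise hybrid similarities for 4 users (assumed values).
sim = np.array([[1.0, 0.8, 0.3, 0.6],
                [0.8, 1.0, 0.2, 0.7],
                [0.3, 0.2, 1.0, 0.4],
                [0.6, 0.7, 0.4, 1.0]])

# Adjacency matrix: link users whose similarity passes the threshold.
A = ((sim >= 0.5) & ~np.eye(4, dtype=bool)).astype(int)

degree = A.sum(axis=1)               # degree of each node
popularity = degree / (len(A) - 1)   # share of possible links held

# Predict user 0's rating for an item from linked neighbours' ratings,
# weighted by similarity (np.nan marks users who have not rated it).
ratings = np.array([np.nan, 4.0, 3.0, 5.0])
mask = (A[0] == 1) & ~np.isnan(ratings)
pred = np.average(ratings[mask], weights=sim[0][mask])
print(round(pred, 3))  # (0.8*4 + 0.6*5) / 1.4 ≈ 4.429
```

Masking out non-neighbours and unrated items before averaging is what lets the prediction work even when the rating vector is sparse, which is exactly the cold-start regime.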
Phase 3: Improved FriendLink algorithm
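The original FriendLink idea scores a pair of nodes by the paths of length 2 to l between them, with shorter paths weighted more and each length normalised by the number of possible paths. The sketch below follows that shape but uses powers of the adjacency matrix, which count walks rather than simple paths, so it is a simplification of the published algorithm rather than the paper's improved variant.

```python
# Sketch of a FriendLink-style similarity on a user graph: sum path
# counts of length 2..l, weighted 1/(l-1) and normalised by the number
# of possible paths. Matrix powers count walks, not simple paths, so
# this is a simplification of the published algorithm.
import numpy as np

def friendlink_similarity(A, l=3):
    n = A.shape[0]
    sim = np.zeros_like(A, dtype=float)
    walk = A.astype(float)
    for i in range(2, l + 1):
        walk = walk @ A                                   # walks of length i
        norm = np.prod([n - j for j in range(2, i + 1)])  # possible paths
        sim += (1.0 / (i - 1)) * walk / norm
    return sim

# Toy 5-user friendship graph (symmetric adjacency matrix).
A = np.array([[0, 1, 1, 0, 0],
              [1, 0, 0, 1, 0],
              [1, 0, 0, 1, 1],
              [0, 1, 1, 0, 0],
              [0, 0, 1, 0, 0]])
S = friendlink_similarity(A, l=3)

# Users 0 and 3 are not linked but share two common friends,
# so their length-2 contribution alone gives 2/3.
print(round(S[0, 3], 3))  # 0.667
```

The 1/(i-1) attenuation is what makes link prediction favour pairs connected by many short paths over pairs connected only through long chains.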
Phase 4: Combining link system and recommender system
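Phase 4's combination step reduces to a weighted blend of the three subsystem scores, with the weight sets matching the evaluation scenarios below (e.g. Scenario 1: W1 = 0.6, W2 = 0.3, W3 = 0.1). Which subsystem each weight attaches to, and the per-item scores, are assumptions in this sketch.

```python
# Sketch of Phase 4: final recommendation score as a weighted blend of
# the content-based, collaborative, and link-prediction scores. The
# weights mirror Scenario 1; the per-item scores are toy values.
def combined_score(cbrs, crs, link, w1=0.6, w2=0.3, w3=0.1):
    """Blend the three subsystem scores with convex weights."""
    assert abs(w1 + w2 + w3 - 1.0) < 1e-9, "weights must sum to 1"
    return w1 * cbrs + w2 * crs + w3 * link

# Scenario 1 weighting of assumed per-item scores on a 0-5 scale.
print(round(combined_score(4.0, 3.5, 5.0), 2))  # 0.6*4 + 0.3*3.5 + 0.1*5 = 3.95
```

Keeping the weights convex (summing to 1) guarantees the combined score stays on the same rating scale as its inputs, so MAE and RMSE remain comparable across scenarios.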
Results
Dataset
Evaluation metrics
Weight | Scenarios |
---|---|
W1 = 0.6, W2 = 0.3, W3 = 0.1 | Scenario 1 |
W1 = 0.3, W2 = 0.6, W3 = 0.1 | Scenario 2 |
W1 = 0.3, W2 = 0.1, W3 = 0.6 | Scenario 3 |
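The two reported metrics are standard and worth pinning down: MAE is the mean absolute deviation between actual and predicted ratings, and RMSE is the root of the mean squared deviation. The rating vectors below are toy values, not results from the paper.

```python
# How the two reported metrics are computed. Toy data only.
import math

def mae(actual, predicted):
    """Mean absolute error between rating vectors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error between rating vectors."""
    return math.sqrt(sum((a - p) ** 2
                         for a, p in zip(actual, predicted)) / len(actual))

actual = [4, 3, 5, 2, 4]
predicted = [3.5, 3, 4, 2.5, 4]
print(mae(actual, predicted))            # (0.5 + 0 + 1 + 0.5 + 0) / 5 = 0.4
print(round(rmse(actual, predicted), 3))
```

Because RMSE squares the residuals, it penalises large misses more heavily than MAE, which is why the two metrics are reported side by side in the tables that follow.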
Methods (100 users) | Step 1 MAE | Step 1 RMSE | Step 2 MAE | Step 2 RMSE | Step 3 MAE | Step 3 RMSE |
---|---|---|---|---|---|---|
Decision tree | 0.86 | 1.09 | 0.86 | 1.09 | 0.86 | 1.09 |
Multifaceted decision tree | 0.9 | 1.12 | 0.91 | 1.15 | 0.9 | 1.14 |
Naïve Bayes | 0.89 | 1.13 | 0.89 | 1.14 | 0.9 | 1.14 |
Random classification | 0.92 | 1.19 | 0.92 | 1.2 | 0.92 | 1.29 |
Proposed method | 0.35 | 0.59 | 0.35 | 0.59 | 0.35 | 0.59 |
Methods (500 users) | Step 1 MAE | Step 1 RMSE | Step 2 MAE | Step 2 RMSE | Step 3 MAE | Step 3 RMSE |
---|---|---|---|---|---|---|
Decision tree | 0.83 | 1.03 | 0.83 | 1.06 | 0.83 | 1.06 |
Multifaceted decision tree | 0.845 | 1.09 | 0.85 | 1.09 | 0.86 | 1.08 |
Naïve Bayes | 0.835 | 1.03 | 0.839 | 1.06 | 0.84 | 1.06 |
Random classification | 0.87 | 1.1 | 0.87 | 1.1 | 0.87 | 1.09 |
Proposed method | 0.76 | 1.03 | 0.76 | 1.03 | 0.76 | 1.03 |
Methods (900 users) | Step 1 MAE | Step 1 RMSE | Step 2 MAE | Step 2 RMSE | Step 3 MAE | Step 3 RMSE |
---|---|---|---|---|---|---|
Decision tree | 0.83 | 1.01 | 0.82 | 1.01 | 0.82 | 1.01 |
Multifaceted decision tree | 0.83 | 1.01 | 0.82 | 1.01 | 0.82 | 1.01 |
Naïve Bayes | 0.83 | 1.01 | 0.82 | 1.01 | 0.82 | 1.01 |
Random classification | 0.86 | 1.04 | 0.82 | 1.04 | 0.82 | 1.04 |
Proposed method | 0.73 | 0.95 | 0.73 | 0.95 | 0.73 | 0.95 |
Method | Type | MAE | Process time (s) |
---|---|---|---|
Scikit-learn | Normal RS | 0.88 | 31.49 |
TensorFlow | Normal RS | 0.88 | 104.336 |
Scikit-learn | Neural network RS | 1.43 | 6264.01 |
TensorFlow | Neural network RS | 0.76 | 136.28 |
MyApproach | Boosting | 0.35 | 98.021 |
Method | Precision (%) | Accuracy (%) |
---|---|---|
Decision tree | 92.7 | 91 |
Neural network | 93.4 | 86 |
SVM | 97.7 | 88 |
Naïve Bayes | 96.5 | 87.1 |
KNN | 96.7 | 92 |
Random forest | 94.1 | 82.5 |
MyApproach | 98.92 | 93.9 |
Discussion
Conclusions
- The execution and processing time of the proposed method is longer than that of some other methods.
- The proposed method cannot be deployed on every online site and system.
- The methods used still need improvement for big-data settings.