1 Introduction
- RQ1: How and by whom are app ratings and reviews manipulated? Through online research and a disguised questionnaire, we identified 43 fake review providers and gathered information about their fake reviewing strategies and offers.
- RQ2: How do fake reviews differ from regular app reviews? We crawled ∼60,000 fake reviews, empirically analyzed them, and compared them with ∼62 million official app reviews from the Apple App Store. We report on quantitative differences between fake reviewers and affected apps and their official counterparts.
- RQ3: How accurately can fake reviews be detected automatically? We developed a supervised classifier to detect fake reviews. In an in-the-wild experiment, we evaluated the performance of multiple classification algorithms, configurations, and classification features.
2 Study Design
2.1 Research Questions
- RQ1: The fake review market reveals how app sales and downloads are manipulated and under which conditions. We investigate the following questions:
- 1. Providers: By whom are fake reviews offered? What strategies do fake review providers follow?
- 2. Offers: What exact services do fake review providers offer and under which conditions?
- 3. Policies: What are providers' policies for submitting fake reviews? Do these reveal indicators to detect fake reviews?
- RQ2: Fake review characteristics reveal empirical differences between official and fake reviews, including the reviewed apps and the reviewers.
- 1. Apps: Which apps are typically affected by fake reviews? What are their categories, prices, and deletion ratio?
- 2. Reviewers: What is a typical fake reviewer, e.g., in terms of number of reviews provided and review frequency?
- 3. Reviews: How do official and fake reviews differ, e.g., with regard to rating, length, votes, submission date, and content?
- RQ3: Fake review detection examines how well supervised machine learning algorithms can detect fake reviews. We focus on the following questions:
- 1. Features: Which machine learning features can be used to automatically detect fake reviews?
- 2. Classification: Which machine learning algorithms perform best to classify reviews as fake/non-fake?
- 3. Optimization: How can classifiers be further optimized? What is the relative importance of the classification features?
- 4. In-the-Wild Experiment: How do the classifiers perform in practice on imbalanced datasets with different proportional distributions of fake and regular reviews?
2.2 Research Method and Data
2.2.1 Data Collection Phase
Provider Id | Provider Type | # Apps | # Reviews | Approach |
---|---|---|---|---|
PRP10 | Paid Review Provider | 77 | – | Crawl |
PRP16 | Paid Review Provider | 19 | 4 | Crawl, Social |
PRP21 | Paid Review Provider | 3 | – | Social |
PRP25 | Paid Review Provider | – | 3 | Social |
PRP26 | Paid Review Provider | – | 10 | Social |
PRP28 | Paid Review Provider | – | 3 | Social |
REP1 | Review Exchange Portal | 268 | – | Crawl |
REP2 | Review Exchange Portal | 277 | – | Crawl |
REP3 | Review Exchange Portal | 2,007 | 60,411 | API, Crawl |
REP5 | Review Exchange Portal | 7 | – | Crawl |
REP6 | Review Exchange Portal | 9 | – | Crawl |
REP8 | Review Exchange Portal | 182 | – | Crawl |
REP9 | Review Exchange Portal | 4 | – | Crawl |
\(\sum = 2,853\) | \(\sum = 60,431\) |
2.2.2 Data Preparation Phase
 | Official Reviews Dataset | Fake Reviews Dataset
---|---|---
# of reviews | 62,617,037 | 8,607 |
# of apps | 1,430,091 | 1,929 |
# of reviewers | 25,333,786 | 721 |
2.2.3 Data Analysis Phase
3 Fake Review Market (RQ1)
3.1 Review Providers and Market Strategies
3.2 Offers and Pricing Models
PRP | Co. | Review iOS Min | Review iOS Max | Review Android Min | Review Android Max | Rating iOS Min | Rating iOS Max | Rating Android Min | Rating Android Max | Install iOS Min | Install iOS Max | Install Android Min | Install Android Max
---|---|---|---|---|---|---|---|---|---|---|---|---|---
1 | IN | 1.35 | 1.50 | ||||||||||
2 | DK | 4.63 | 4.90 | 0.98 | 1.00 | 0.25 | 0.25 | 0.05 | 0.06 | ||||
3 | IN | 0.25 | 0.25 | 0.20 | 0.20 | 0.09 | 0.09 | ||||||
4 | IN | 1.50 | 1.98 | ||||||||||
5 | GB | 1.11 | 1.50 | 0.10 | 0.10 | ||||||||
6 | US | 2.90 | 2.95 | 1.28 | 1.58 | 1.30 | 1.36 | ||||||
7 | RU | 0.25 | 0.25 | 0.20 | 0.20 | 0.10 | 0.10 | ||||||
8 | US | 6.00 | 9.00 | 6.00 | 9.00 | ||||||||
9 | US | 3.33 | 4.17 | 1.00 | 1.50 | 0.09 | 0.15 | ||||||
10 | NL | 1.55 | 1.55 | 1.00 | 1.00 | 0.65 | 0.65 | 0.20 | 0.20 | ||||
11 | US | 2.50 | 4.00 | 3.50 | 5.00 | 0.49 | 0.90 | 0.08 | 0.12 | ||||
12 | CA | 1.00 | 1.00 | ||||||||||
13 | US | 2.15 | 2.50 | 1.59 | 2.50 | 0.34 | 0.46 | 0.13 | 0.20 | ||||
14 | US | 4.30 | 5.00 | 0.85 | 1.20 | 0.35 | 0.38 | 0.35 | 0.38 | ||||
15 | IN | 0.15 | 0.15 | 0.08 | 0.08 | 0.10 | 0.10 | ||||||
16 | RU | 2.09 | 2.99 | 2.99 | 2.99 | ||||||||
17 | US | 5.02 | 5.20 | 2.00 | 2.60 | 0.40 | 0.45 | 0.40 | 0.46 | ||||
18 | DE | 2.50 | 2.50 | 0.17 | 0.17 | ||||||||
19 | US | 8.69 | 10.00 | 3.60 | 4.00 | 1.28 | 1.60 | 1.36 | 1.58 | ||||
20 | VN | 0.05 | 0.05 | 0.05 | 0.05 | 0.10 | 0.10 | 0.05 | 0.05 | ||||
21 | US | 2.00 | 2.00 | 1.40 | 2.00 | ||||||||
22 | US | 1.45 | 2.00 | 0.29 | 0.32 | 0.08 | 0.15 | ||||||
23 | RU | 3.40 | 4.00 | 2.75 | 2.75 | ||||||||
24 | US | 1.00 | 1.00 | 0.80 | 0.80 | 0.15 | 0.15 | ||||||
25 | NL | 1.78 | 3.30 | 1.78 | 3.30 | 0.50 | 0.50 | 0.08 | 0.10 | ||||
26 | RU | 3.00 | 3.00 | ||||||||||
27 | IN | 1.99 | 2.40 | 0.39 | 0.46 | 0.39 | 0.46 | ||||||
28 | CN | 2.09 | 2.99 | 2.39 | 2.99 | 1.00 | 1.99 | ||||||
29 | SG | 3.00 | 3.00 | 3.00 | 3.00 | ||||||||
30 | DE | 1.93 | 4.00 | 0.45 | 0.50 | 0.06 | 0.14 | ||||||
31 | US | 1.00 | 2.00 | 0.50 | 1.00 | ||||||||
32 | IN | 0.50 | 0.50 | ||||||||||
33 | AE | 0.90 | 1.00 | 0.75 | 0.80 | 0.15 | 0.40 | ||||||
34 | IN | 2.00 | 2.00 | 2.00 | 2.00 | 2.00 | 2.00 | 2.00 | 2.00 | 1.60 | 1.67 | 1.60 | 1.67 |
NUM | 17 | 17 | 32 | 32 | 2 | 2 | 10 | 10 | 12 | 12 | 23 | 23 | |
AVG | 3.41 | 4.24 | 1.73 | 2.14 | 1.50 | 2.00 | 0.76 | 0.83 | 0.48 | 0.55 | 0.33 | 0.40 | |
SD | 1.85 | 2.21 | 1.24 | 1.70 | 0.71 | 0.01 | 0.64 | 0.71 | 0.38 | 0.40 | 0.45 | 0.50 |
3.3 Pretended Fake Review Characteristics
3.3.1 Disguised Questionnaire
We have several competitors which gain more and more market share. For this reason we are looking for both positive and negative reviews, positive for our apps and negative for our competitors’ apps. [...]
PRP | Positive ratings | Negative ratings | Custom keywords | Predefined reviews | Real users | Guarantee |
---|---|---|---|---|---|---|
9 | Yes | No | Yes | Yes | Yes | Yes |
10 | Yes | Yes | Yes | No | ||
12 | Yes | Yes | Yes | No | Yes | Yes |
15 | Yes | Yes | No | |||
16 | Yes | No | Yes | Yes | Yes | No |
22 | Yes | Yes | Yes | |||
23 | Yes | No | No | Yes | Yes | No |
25 | Yes | Yes | Yes | Yes | Yes | Yes |
26 | Yes | Yes | Yes | Yes | Yes | Yes |
28 | Yes | No | No | Yes | Yes | No |
29 | Yes | Yes | Yes | Yes | Yes |
3.3.2 Review Policies
REP | Co. | Real Dev. | Install App | Use App | Keep App | Honest | Rating | Length | Copy
---|---|---|---|---|---|---|---|---|---
1 | IN | Yes | No | 1–5 | |||||
2 | ES | Yes | Yes | 5 days | Yes | 3–5 | > 10 words | No | |
3 | US | Yes | Yes | 1–2 days | Yes | 2–3 sentences | |||
4 | US | 1–2 sentences | No | ||||||
5 | GB | Yes | Yes | 1 day | Yes | 4–5 | 1–2 sentences | ||
6 | CN | Yes | Yes | 4 min | 5 days | ||||
7 | GB | Yes | 1–2 sentences | ||||||
8 | SE | Yes | Yes | 2 days | Yes | 3–5 | > 10 words | No | |
9 | RU | Yes | 10 min | 7 days |
3.3.3 Initial Fake Review Indicators
4 Fake Review Characteristics (RQ2)
4.1 Apps
Category | Rank (Fake) | Apps (Fake) | Reviews (Fake) | Rank (Official) | Apps (Official) | Reviews (Official)
---|---|---|---|---|---|---
Books | 21 | 4 (0.21%) | 0.06% | 19 | 25069 (1.75%) | 0.89% |
Business | 12 | 33 (1.71%) | 1.72% | 3 | 130825 (9.15%) | 1.35% |
Catalogs | 23 | 3 (0.16%) | 0.09% | 23 | 10951 (0.77%) | 0.29% |
Education | 3 | 92 (4.77%) | 3.80% | 2 | 131302 (9.18%) | 1.74% |
Entertainment | 4 | 87 (4.51%) | 4.03% | 5 | 79504 (5.56%) | 5.68% |
Finance | 17 | 17 (0.88%) | 1.11% | 14 | 34684 (2.43%) | 1.66% |
Food & Drink | 15 | 19 (0.98%) | 0.65% | 9 | 50944 (3.56%) | 1.34% |
Games | 1 | 1023 (53.03%) | 47.57% | 1 | 326864 (22.86%) | 49.95% |
Health & Fitn. | 5 | 85 (4.41%) | 5.81% | 8 | 54410 (3.80%) | 3.23% |
Lifestyle | 8 | 69 (3.58%) | 4.82% | 4 | 102183 (7.15%) | 2.46% |
Medical | 20 | 13 (0.67%) | 0.65% | 16 | 30101 (2.10%) | 0.42% |
Music | 11 | 36 (1.87%) | 1.99% | 11 | 43874 (3.07%) | 2.72% |
Navigation | 20 | 13 (0.67%) | 1.06% | 20 | 21559 (1.51%) | 0.80% |
News | 24 | 2 (0.10%) | 0.06% | 18 | 26358 (1.84%) | 1.64% |
Newsstand | 23 | 3 (0.16%) | 0.18% | 25 | 1021 (0.07%) | 0.00% |
Photo & Video | 2 | 112 (5.81%) | 7.66% | 12 | 40034 (2.80%) | 6.54% |
Productivity | 10 | 42 (2.18%) | 1.92% | 10 | 44191 (3.09%) | 3.35% |
Reference | 18 | 14 (0.73%) | 0.63% | 15 | 34465 (2.41%) | 1.50% |
Shopping | 13 | 25 (1.30%) | 2.50% | 22 | 15253 (1.07%) | 2.15% |
Social Netw. | 6 | 82 (4.25%) | 3.69% | 17 | 27488 (1.92%) | 5.41% |
Sports | 9 | 45 (2.33%) | 1.79% | 13 | 37060 (2.59%) | 1.14% |
Stickers | 25 | 1 (0.05%) | 0.01% | 21 | 20979 (1.47%) | 0.01% |
Travel | 14 | 22 (1.14%) | 2.08% | 7 | 64846 (4.53%) | 1.30% |
Utilities | 8 | 69 (3.58%) | 4.69% | 6 | 71680 (5.01%) | 3.61% |
Weather | 16 | 18 (0.93%) | 1.43% | 24 | 4446 (0.31%) | 0.82% |
\(\sum = 1,929\) | \(\sum = 1,430,091\) |
4.2 Reviewers
4.3 Reviews
Great for expense tracking ⋆ ⋆ ⋆ ⋆ ⋆ Does a great job for expense tracking. Nice interface and color scheme. Definitely recommend!
Fantastic ⋆ ⋆ ⋆ ⋆ ⋆ Great game, my son loves it. Lots of fun.
5 Fake Review Detection (RQ3)
5.1 Feature Extraction
Category | Name | Type | Null-Values | Example |
---|---|---|---|---|
Reviewer | # Reviews (Total) | Int | 0 | 100 |
% Reviews (per Star-Rating) | [Float] | 0 | [0.7, 0.0, 0.0, 0.0, 0.3] | |
Review Frequency (in Seconds) | Int | 1,734 | 100 | |
Account Usage (in Seconds) | Int | 0 | 600 | |
App | # Reviews (Total) | Int | 0 | 100 |
% Reviews (per Star-Rating) | [Float] | 0 | [0.2, 0.2, 0.2, 0.2, 0.2] | |
Review | Length (in Characters) | Int | 0 | 100 |
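The reviewer-level features in the table above can be computed from an account's review history. The following sketch uses a hypothetical `Review` record and a `reviewer_features` helper (both names are our own, not from the paper); the review frequency is undefined for single-review accounts, which matches the null values noted in the table.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Review:
    rating: int     # star rating, 1-5
    text: str
    timestamp: int  # submission time as Unix seconds

def reviewer_features(reviews: List[Review]) -> dict:
    """Compute reviewer-level features for one account."""
    n = len(reviews)
    # Fraction of the account's reviews per star rating (1..5)
    per_star = [sum(1 for r in reviews if r.rating == s) / n
                for s in range(1, 6)]
    times = sorted(r.timestamp for r in reviews)
    # Account usage: seconds between the first and last review
    usage = times[-1] - times[0]
    # Review frequency: mean seconds between consecutive reviews;
    # undefined (null value) for accounts with a single review
    freq: Optional[float] = usage / (n - 1) if n > 1 else None
    return {"n_reviews": n, "pct_per_star": per_star,
            "review_frequency_s": freq, "account_usage_s": usage}
```

For example, an account with reviews at seconds 0, 100, and 600 has an account usage of 600 seconds and a mean review frequency of 300 seconds.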
5.2 Data Preprocessing
We normalize all feature values using the normalize() method with standard parameters of the preprocessing module provided by scikit-learn (Pedregosa et al. 2011). We additionally standardize them using the scale() method with standard parameters of the same module.

5.3 Classification with Balanced Data
We cross validate using the RepeatedStratifiedKFold method of the model_selection module.

Classifier | Accuracy | Precision | Recall | F1 | AUC/ROC
---|---|---|---|---|---|
RandomForestClassifier | 0.970 | 0.973 | 0.967 | 0.970 | 0.989 |
DecisionTreeClassifier | 0.953 | 0.949 | 0.957 | 0.953 | 0.953 |
MLPClassifier | 0.919 | 0.921 | 0.916 | 0.918 | 0.969 |
SVC(kernel=’rbf’) | 0.901 | 0.879 | 0.930 | 0.904 | 0.959 |
SVC(kernel=’linear’) | 0.899 | 0.878 | 0.926 | 0.902 | 0.960 |
LinearSVC | 0.895 | 0.861 | 0.941 | 0.900 | 0.964 |
GaussianNB | 0.765 | 0.731 | 0.889 | 0.755 | 0.955 |
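A minimal sketch of the preprocessing and cross-validation setup described above. Synthetic data from make_classification stands in for the actual review feature matrix; the split counts and scoring metric are illustrative choices, not the paper's exact configuration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score
from sklearn.preprocessing import scale

# Synthetic, balanced stand-in for the review feature matrix
X, y = make_classification(n_samples=600, n_features=7, random_state=0)
X = scale(X)  # standardize features, as in the preprocessing phase

# Repeated stratified k-fold keeps the fake/regular ratio in every fold
cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=0)
clf = RandomForestClassifier(random_state=0)
scores = cross_val_score(clf, X, y, cv=cv, scoring="f1")
print(round(scores.mean(), 3))
```

Stratification matters here because each fold should preserve the class proportions of the full dataset; repeating the k-fold procedure reduces the variance of the reported scores.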
5.4 Optimization
We select features using the RFECV method from the feature_selection module. The cross validation is performed as described in the previous phase. We tune hyperparameters using GridSearchCV from the model_selection module. This method performs a cross validated, exhaustive search over a predefined grid of parameters for a classification algorithm. After finding the optimal combination of parameters within the grid, we further tune it manually by adding more values around the current best.

5.5 Feature Importance
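A sketch of how recursive feature elimination and feature importances can be obtained with scikit-learn, again on synthetic stand-in data; the sample sizes and the choice of a random forest as the base estimator are assumptions for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFECV
from sklearn.model_selection import StratifiedKFold

X, y = make_classification(n_samples=400, n_features=8, n_informative=3,
                           random_state=0)
# RFECV drops features recursively, keeping only those that
# contribute to cross-validated classification performance
selector = RFECV(RandomForestClassifier(random_state=0),
                 cv=StratifiedKFold(5), scoring="f1")
selector.fit(X, y)
print(selector.n_features_)  # number of retained features
# Relative importance of the retained features, from the fitted forest
print(selector.estimator_.feature_importances_)
```

A random forest's impurity-based importances sum to one, which makes them a convenient way to rank the classification features against each other.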
5.6 Classification with Imbalanced Data
Predicted as fake review | Predicted as regular review | |
---|---|---|
Actual fake review | True positive (TP) | False negative (FN) |
Actual regular review | False positive (FP) | True negative (TN) |
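The confusion-matrix cells above determine precision, recall, and F1. A small worked example with toy labels (1 = fake review, 0 = regular review):

```python
from sklearn.metrics import confusion_matrix

# Toy imbalanced labels: 3 fake reviews among 10
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 1, 0]

# For labels [0, 1], ravel() returns the cells in the order tn, fp, fn, tp
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
precision = tp / (tp + fp)  # fraction of predicted fakes that are truly fake
recall = tp / (tp + fn)     # fraction of true fakes that are found
f1 = 2 * precision * recall / (precision + recall)
print(tp, fp, fn, tn)       # 2 1 1 6
```

Here two fakes are caught (TP), one is missed (FN), and one regular review is wrongly flagged (FP), so precision, recall, and F1 all equal 2/3. Under class imbalance these metrics are more informative than accuracy, which the majority class dominates.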
5.6.1 Classification Results with Imbalanced Data
Skew | Recall DT | Recall MLP | Recall RF | AUC/ROC DT | AUC/ROC MLP | AUC/ROC RF | Precision DT | Precision MLP | Precision RF | F1 DT | F1 MLP | F1 RF
---|---|---|---|---|---|---|---|---|---|---|---|---
90.0 | 0.982 | 0.978 | 0.986 | 0.897 | 0.968 | 0.984 | 0.979 | 0.970 | 0.987 | 0.981 | 0.974 | 0.986 |
80.0 | 0.972 | 0.963 | 0.980 | 0.923 | 0.970 | 0.987 | 0.968 | 0.958 | 0.984 | 0.970 | 0.960 | 0.982 |
70.0 | 0.966 | 0.943 | 0.976 | 0.941 | 0.973 | 0.987 | 0.964 | 0.954 | 0.980 | 0.965 | 0.949 | 0.978 |
60.0 | 0.964 | 0.940 | 0.970 | 0.946 | 0.974 | 0.988 | 0.952 | 0.940 | 0.976 | 0.958 | 0.940 | 0.973 |
50.0 | 0.953 | 0.920 | 0.962 | 0.950 | 0.972 | 0.989 | 0.947 | 0.925 | 0.974 | 0.950 | 0.922 | 0.968 |
40.0 | 0.945 | 0.902 | 0.956 | 0.953 | 0.972 | 0.989 | 0.941 | 0.914 | 0.972 | 0.943 | 0.908 | 0.964 |
30.0 | 0.942 | 0.872 | 0.947 | 0.956 | 0.973 | 0.988 | 0.932 | 0.903 | 0.963 | 0.937 | 0.887 | 0.955 |
20.0 | 0.924 | 0.842 | 0.937 | 0.950 | 0.974 | 0.988 | 0.903 | 0.885 | 0.950 | 0.914 | 0.863 | 0.944 |
10.0 | 0.896 | 0.758 | 0.912 | 0.941 | 0.977 | 0.986 | 0.872 | 0.859 | 0.933 | 0.884 | 0.802 | 0.922 |
9.0 | 0.894 | 0.824 | 0.907 | 0.940 | 0.978 | 0.986 | 0.865 | 0.826 | 0.925 | 0.879 | 0.824 | 0.916 |
8.0 | 0.885 | 0.809 | 0.896 | 0.936 | 0.979 | 0.986 | 0.859 | 0.832 | 0.925 | 0.872 | 0.819 | 0.910 |
7.0 | 0.884 | 0.785 | 0.892 | 0.936 | 0.979 | 0.984 | 0.855 | 0.830 | 0.920 | 0.869 | 0.807 | 0.906 |
6.0 | 0.857 | 0.804 | 0.882 | 0.924 | 0.978 | 0.983 | 0.848 | 0.790 | 0.915 | 0.853 | 0.797 | 0.898 |
5.0 | 0.860 | 0.694 | 0.876 | 0.925 | 0.979 | 0.984 | 0.826 | 0.820 | 0.904 | 0.843 | 0.752 | 0.890 |
4.0 | 0.857 | 0.714 | 0.869 | 0.925 | 0.979 | 0.982 | 0.816 | 0.776 | 0.902 | 0.836 | 0.744 | 0.885 |
3.0 | 0.830 | 0.681 | 0.846 | 0.912 | 0.980 | 0.980 | 0.789 | 0.741 | 0.898 | 0.809 | 0.710 | 0.871 |
2.0 | 0.806 | 0.567 | 0.814 | 0.901 | 0.978 | 0.978 | 0.764 | 0.690 | 0.900 | 0.784 | 0.622 | 0.855 |
1.0 | 0.775 | 0.395 | 0.756 | 0.886 | 0.978 | 0.971 | 0.723 | 0.589 | 0.900 | 0.748 | 0.473 | 0.822 |
0.9 | 0.776 | 0.255 | 0.755 | 0.886 | 0.976 | 0.973 | 0.730 | 0.606 | 0.895 | 0.752 | 0.359 | 0.819 |
0.8 | 0.764 | 0.300 | 0.735 | 0.881 | 0.978 | 0.969 | 0.714 | 0.555 | 0.901 | 0.738 | 0.380 | 0.810 |
0.7 | 0.765 | 0.252 | 0.731 | 0.881 | 0.977 | 0.968 | 0.714 | 0.559 | 0.906 | 0.738 | 0.334 | 0.809 |
0.6 | 0.756 | 0.166 | 0.710 | 0.877 | 0.976 | 0.967 | 0.697 | 0.536 | 0.897 | 0.725 | 0.253 | 0.793 |
0.5 | 0.747 | 0.065 | 0.703 | 0.873 | 0.977 | 0.966 | 0.690 | 0.600 | 0.916 | 0.717 | 0.117 | 0.796 |
0.4 | 0.735 | 0.021 | 0.661 | 0.867 | 0.975 | 0.963 | 0.675 | 0.496 | 0.912 | 0.704 | 0.038 | 0.766 |
0.3 | 0.712 | 0.001 | 0.634 | 0.855 | 0.976 | 0.962 | 0.656 | 0.396 | 0.930 | 0.683 | 0.002 | 0.754 |
0.2 | 0.692 | 0.001 | 0.608 | 0.846 | 0.975 | 0.952 | 0.638 | 0.292 | 0.931 | 0.664 | 0.001 | 0.736 |
0.1 | 0.680 | 0.000 | 0.560 | 0.840 | 0.973 | 0.945 | 0.712 | 0.100 | 0.962 | 0.696 | 0.000 | 0.707 |
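Datasets with a target skew like those in the table can be built by downsampling one class. The sketch below assumes skew denotes the number of fake reviews as a percentage of the regular reviews (our reading of the skew column); the helper `subsample_to_skew` is hypothetical, not from the paper.

```python
import numpy as np

def subsample_to_skew(X_fake, X_reg, skew_pct, rng=None):
    """Downsample the fake-review class so that
    #fake = skew_pct/100 * #regular (assumed skew definition)."""
    rng = np.random.default_rng(rng)
    n_fake = int(round(len(X_reg) * skew_pct / 100.0))
    idx = rng.choice(len(X_fake), size=n_fake, replace=False)
    return X_fake[idx], X_reg

# Toy feature matrices: 500 fake and 1,000 regular reviews
fake = np.zeros((500, 3))
reg = np.ones((1000, 3))
f, r = subsample_to_skew(fake, reg, skew_pct=10.0, rng=0)
print(len(f), len(r))  # 100 1000
```

Sampling without replacement avoids duplicating fake reviews, so every evaluated configuration sees only genuine, distinct instances of the minority class.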
6 Discussion
6.1 Implications
Nice UI ⋆ ⋆ ⋆ ⋆ Very clean and beautiful UI. I like the goal setting and the reminders. I would like to see some animation when scrolling the weekly progress bars.