1 Introduction
- We present several machine learning models and datasets to aid the automatic detection of violations of the human value of honesty in app reviews. Our publicly available replication package helps researchers and practitioners adapt, replicate, and validate our study (Obie et al. 2022).
- We provide insights into the different categories of honesty violations prevalent in app reviews by creating a taxonomy based on a manual analysis of the honesty violations dataset.
- We survey 70 app practitioners and interview 3 practitioners to get their feedback on the prevalence of honesty violations in their mobile apps, the causes of these issues, and our proposed machine learning-based classifier for identifying such violations in user app reviews.
- We present an actionable framework for developers that gives a better understanding of the causes and consequences of honesty violations, and of the strategies that can be used to avoid and fix them.
- We present a set of practical implications and future research directions for dealing with violations of the human value of honesty in apps, to the benefit of end-users and society.
2 Motivating Examples
3 Related Work
4 Research Design
- RQ1. Can we effectively identify reviews documenting honesty violations automatically? We formed a large labelled dataset of app reviews and then trained a variety of machine learning classifiers to answer this RQ. Our best-performing classifier has an F1 score of 0.921.
- RQ2. What types of honesty violations are reported in these app reviews? We manually inspected a sample of 401 honesty violation reviews and classified the honesty violations represented by each into ten distinct categories.
- RQ3. What is app developers’ experience with honesty violations in the mobile apps they develop, and what is their perspective on the automatic detection of honesty violations? We developed four subquestions to answer this RQ. We use in-depth interviews and a broad survey with the participation of 73 mobile app practitioners.
  - RQ3.1. What are the causes of honesty violations in mobile apps, and who is responsible for them? We want to know, based on developers’ experience with honesty violations in the mobile apps they develop, what causes these violations and who is responsible for them.
  - RQ3.2. What are the consequences of honesty violations in mobile apps for end users and app developers/owners, according to developers’ experience? The goal of this RQ is to understand the impact of honesty violations on end users and on the developers and owners of the mobile apps, as experienced by the mobile app developers.
  - RQ3.3. What strategies do developers use to handle honesty violations in mobile apps? This RQ aims to identify the strategies that mobile app developers use to avoid and/or fix reported honesty violations (or whether they do so at all).
  - RQ3.4. What are the benefits of automatically detecting honesty violations in mobile apps? With this research question, we explore the potential benefits of the automatic detection of honesty violations.
5 Automatic Classification of Honesty Violations (RQ1)
5.1 A Dataset of Honesty-Related Reviews
5.1.1 Data Collection
Dataset statistic | Count |
---|---|
Number of Apps | 713 |
App Categories | 25 |
All Reviews | 236,660 |
Honesty-related Reviews (after keywords filter) | 4,885 |
Honesty Violation Reviews (after manual validation) | 401 |
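
The funnel in the table above (all reviews, then keyword-filtered reviews, then manually validated reviews) can be illustrated with a minimal Python sketch. The keyword list, the regular expression, and the function names below are hypothetical and chosen only for illustration; they are not the keywords or the tooling used in the study. Only the counts passed to `retention_rate` come from the table.

```python
import re

# Hypothetical keyword stems, for illustration only (not the study's actual list).
HONESTY_KEYWORDS = ["scam", "fraud", "dishonest", "deceptive", "cheat", "liar", "misleading"]

# Match any keyword stem at a word boundary, followed by optional word characters
# (so "cheat" also matches "cheated" and "cheating").
_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(k) for k in HONESTY_KEYWORDS) + r")\w*",
    re.IGNORECASE,
)


def is_honesty_related(review: str) -> bool:
    """Return True if the review mentions at least one keyword stem."""
    return _PATTERN.search(review) is not None


def filter_reviews(reviews):
    """Keep only reviews that pass the keyword filter, preserving order."""
    return [r for r in reviews if is_honesty_related(r)]


def retention_rate(kept: int, total: int) -> float:
    """Percentage of items that survive a filtering step."""
    return 100.0 * kept / total


if __name__ == "__main__":
    sample = ["This is a SCAM", "Great app", "They cheated me"]
    print(filter_reviews(sample))
    # Counts taken from the dataset table.
    print(f"keyword filter keeps {retention_rate(4885, 236660):.2f}% of all reviews")
    print(f"manual validation keeps {retention_rate(401, 4885):.2f}% of filtered reviews")
```

A keyword filter of this kind trades precision for recall: it keeps any review that mentions a candidate term, which is why a manual validation step is still needed afterwards.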
5.1.2 Data Labeling
5.2 Classification Approach
5.2.1 Data Preparation
5.2.2 Feature Extraction
5.2.3 Model Selection and Tuning
5.2.4 Cross Validation
5.3 Results
Metric | SVM | LR | NN | RF | GBT | DNN | GAN |
---|---|---|---|---|---|---|---|
True negative | 0.432 | 0.407 | 0.358 | 0.371 | 0.358 | 0.407 | 0.383 |
True positive | 0.457 | 0.469 | 0.482 | 0.420 | 0.420 | 0.506 | 0.482 |
False positive | 0.025 | 0.049 | 0.099 | 0.085 | 0.099 | 0.049 | 0.074 |
False negative | 0.086 | 0.074 | 0.062 | 0.124 | 0.124 | 0.037 | 0.062 |
MCC | 0.785 | 0.753 | 0.676 | 0.581 | 0.555 | 0.826 | 0.726 |

Metric | SVM | LR | NN | RF | GBT | DNN | GAN |
---|---|---|---|---|---|---|---|
Accuracy | 0.889 | 0.877 | 0.840 | 0.790 | 0.778 | 0.914 | 0.864 |
Precision | 0.949 | 0.905 | 0.830 | 0.829 | 0.810 | 0.911 | 0.867 |
Recall | 0.841 | 0.864 | 0.886 | 0.773 | 0.773 | 0.932 | 0.886 |
F1 score | 0.892 | 0.884 | 0.857 | 0.800 | 0.791 | 0.921 | 0.876 |
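
The two tables above are linked by the standard definitions of accuracy, precision, recall, F1, and the Matthews correlation coefficient (MCC). The following Python sketch recomputes these metrics from the normalised confusion-matrix entries. The function name `metrics` is illustrative, and because the table entries are rounded to three decimals, the recomputed values may differ from the reported ones in the last digit.

```python
from math import sqrt


def metrics(tp: float, tn: float, fp: float, fn: float) -> dict:
    """Standard binary classification metrics from confusion-matrix entries.

    The entries may be raw counts or proportions; every metric is a ratio,
    so the scale cancels out.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        "mcc": (tp * tn - fp * fn)
        / sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    }


if __name__ == "__main__":
    # DNN column of the confusion-matrix table.
    dnn = metrics(tp=0.506, tn=0.407, fp=0.049, fn=0.037)
    for name, value in dnn.items():
        print(f"{name}: {value:.3f}")
```

MCC is a useful complement to F1 here because it also takes the true negatives into account.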
Metric | Our (DNN) approach | Baseline classifier | Improvement |
---|---|---|---|
Precision | 0.911 | 0.0821 | 11.096x |
Recall | 0.932 | 0.5 | 1.864x |
F1 | 0.921 | 0.1412 | 6.523x |
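
The improvement factors in the comparison above are the ratio of the DNN score to the baseline score for each metric. A short Python check, using only the values reported in the table (the variable names are illustrative):

```python
# Scores copied from the comparison table.
ours = {"precision": 0.911, "recall": 0.932, "f1": 0.921}
baseline = {"precision": 0.0821, "recall": 0.5, "f1": 0.1412}

# Improvement factor = our score divided by the baseline score.
improvement = {name: round(ours[name] / baseline[name], 3) for name in ours}

if __name__ == "__main__":
    for name, factor in improvement.items():
        print(f"{name}: {factor}x")
```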
“It was great! I loved it! But then there was a little problem. At day 5 of me using it, there was a bit of lag on the app. I checked my connection and my phone but everything was fine, then the next morning when I opened the app, it was pure black. I waited for 4-7 minutes but nothing happened. I restarted the app then open it again then everything was fine. Please fix this problem I just don’t want this to happen. I really love this app so please fix it. Thank you.”
“It was nice to see the changes you made. It is easier to delete and move things but, you then overdo it. You overcompensated. More ads. Now, it keeps telling me I’m offline when I’m not. I get to the page, touch visit, nothings happening except it telling me I’m offline, check my network connection. It was fine the other day. All my other apps are working, not sure what is going on. Help.”
“Pinterest used to be a great app for recipes with the occasional ad. Now it’s the single worst app to have because of the overabundance of ads. My screen jumps up and down and will not hold the recipe in place and when I finally do get to the recipe, my screen goes black and the app closes. With all my prep work on the counter ready to go I have to go back and endure the same pain of finding the recipe, scrolling past the ads, and hoping it doesn’t do the same again.”
In these cases, the reviews focus on describing technical issues encountered while using the app rather than on the app providing inaccurate information. Furthermore, the reviews mention “ads” without saying whether the ads were relevant to the user’s preferences. Given the sentiment of such reviews, the content of the ads shown in the app may be neutral or relevant to the user’s preferences. These examples demonstrate two potential limitations of the DNN model: (1) confusion between technical issues and inaccurate information; and (2) confusion between ads and false ads.
6 Categories of Honesty Violations (RQ2)
6.1 Categorisation Approach
Honesty Violation Category | f |
---|---|
Unfair cancellation and refund policies | 48 (12%) |
False advertisements | 55 (14%) |
Delusive subscriptions | 33 (8%) |
Cheating systems | 93 (23%) |
Inaccurate information | 15 (4%) |
Unfair fees | 106 (26%) |
No service | 64 (16%) |
Deletion of reviews | 6 (1.5%) |
Impersonation | 9 (2%) |
Fraudulent-looking apps | 29 (7%) |
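
The percentages in the table above are each category's count relative to the 401 honesty violation reviews. The Python sketch below recomputes them from the counts in the table. The category counts add up to more than 401, which suggests that a single review can be assigned to more than one category. This is an inference from the arithmetic, not a statement taken from the text.

```python
TOTAL_REVIEWS = 401  # honesty violation reviews after manual validation

# Counts copied from the category table.
counts = {
    "Unfair cancellation and refund policies": 48,
    "False advertisements": 55,
    "Delusive subscriptions": 33,
    "Cheating systems": 93,
    "Inaccurate information": 15,
    "Unfair fees": 106,
    "No service": 64,
    "Deletion of reviews": 6,
    "Impersonation": 9,
    "Fraudulent-looking apps": 29,
}

# Share of the 401 reviews that mention each category, in percent.
shares = {category: 100.0 * n / TOTAL_REVIEWS for category, n in counts.items()}

# Total number of category assignments across all reviews.
total_labels = sum(counts.values())

if __name__ == "__main__":
    for category, share in sorted(shares.items(), key=lambda kv: -kv[1]):
        print(f"{category}: {share:.1f}%")
    print(f"category assignments: {total_labels} for {TOTAL_REVIEWS} reviews")
```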
6.2 Results
6.2.1 Unfair Cancellation and Refund Policies
“The app allows you to accidentally sign up to premium with a push of a button. When you want to cancel, however, you can’t do that via the app... You have to go to the webpage, enter details and cancel there.”
“Deceptive billing practices - information on cancelling is circular; emailed a link that advises to email. [It] doesn’t have colour tag functionality across web and app; very poor UX and worse customer service.”
Sometimes, the app also makes it easy for the user to mistakenly activate a premium subscription in the way the interface and flow are designed, e.g.:
“Use with caution. It’s unscrupulous about signing you up for a subscription when you’re skipping past the in-app ads. It’s not made clear once you’ve subscribed, and there’s no way of cancelling it through the app.”
Another aspect of this category focuses on situations where the user perceives the refund steps and policies to be dishonest and unfair. This also involves situations where the refund policy does not cater to accidental subscriptions, e.g.:
“DO NOT SIGN UP FOR FREE TRIAL! IT IS A SCAM. YOU WILL GET CHARGED ANYWAY, AND YOU WILL NEVER GET YOUR MONEY BACK!! Once again, after numerous attempts to blame Google, this developer has still not refunded my $38. Once again, I cancelled 3 full days before the free trial ended but was still charged. Once again, [I] contacted the developer, who told me that I would receive a full refund within 7 to 10 days, and still nothing. I have saved the email, pricing this to be true. DO NOT TRUST THIS DEVELOPER. SCAM!!!!”
6.2.2 False Advertisements
“Couldn’t find Google Assistant integration anywhere. Even though it’s been advertised everywhere when searching the web for the app... It’s even in the description of the app here. That’s false advertising. I will edit my review when it’s out of Beta and working in the final version.”
“The app doesn’t listen to the watch at all. I’ve tried completing and snoozing and it does nothing. The watch app can only add tasks, so the screenshots they’re sharing here are DECEPTIVE.”
In some cases, the app lures users into downloading the app on the basis that it is free-for-use only for the user to find out that the free-for-use is a trial version for a specific time period and not perpetually free as implied in the app description:
“The actual free version doesn’t allow you anything, not even to learn how to use the app properly. That role is filled by 7 days of free premium. The free, on the description, is a lie. Is a paid-only app with temporary free access to its full features that gets practically useless after the 7-day trial... I don’t like to be lied to.”
In addition, the app developers (through the app description) make promises to users to give them certain benefits like a free premium subscription when a particular action is carried out (e.g., inviting a particular number of friends to sign up). However, they never truly fulfil their promises when the user fulfils their end of the bargain. These unfulfilled obligations are perceived by the end-user as a violation of honesty, e.g.:
“I love this app however I sent the link to several friends and they got the app and I received no premium time whatsoever. Don’t be dishonest with your apps. That’s lame.”
Another example relates to scenarios where the user is invited to make certain commitments based on a future reward and the developers bail out on their prior commitment:
“Shame on Them! Liars. I paid for the season pass TWICE (ONCE for my apple device and the other for my Samsung Device). I was falsely promised access to ALL FUTURE CONTENT. Now they are trying to charge me for the Parisian Inspired TOKENS! HOW DARE THEY LIE AND BAIT AND SWITCH.”
6.2.3 Delusive Subscriptions
“I just realised that I have been charged for some crappy premium service fee which I had no idea about when using the app. Why is this charge by default? Why was I not informed in the first place? Beware of scam for useless monthly premium fees!”
“I can’t believe I was charged 55.99. What are you giving me? Gold? I unsubscribed but saw mysterious charge in my bank account.”
Additionally, there is the issue of lack of user consent in the subscription process where certain apps do not provide a confirmation mechanism that prevents accidental subscriptions by the user, e.g.:
“Made me pay 1 year worth of subscription without my confirmation. Only used its free trial because I had to use it once. What a scam...”
In some scenarios, the automatic subscription is hidden behind an in-app ad/feature, and an unsuspecting user who clicks on the feature is automatically subscribed to the premium version of the app without a clear warning or confirmation, e.g.:
“Deceptive practices. If you click the in-app “ad" that simply says enable notifications, you’ll automatically be signed up and billed for their premium service. This bypasses the Google/Apple stores subscription model and bills your card directly. Not to mention it’s impossible to downgrade from this service in the app itself; you have to visit their website, which is a deliberately obstructive hurdle considering you can upgrade in the app just fine.”
6.2.4 Cheating Systems
“This game cheats. It uses words not found in the dictionary. Also it told me a word was unplayable, but it was the first best word option.”
“I play it with my sister often. However, there is the problem of the game and AI cheating. I rolled a 2 and a 3 at the start of the game and it moved me FOUR spaces forward not five. Four. That happened several times and I can assure you I was looking everytime it happened. I am very disappointed at the fact this game is cheating...”
In some of the reviews, users complain that the game works properly when the user loses and parts with money and only freezes when the AI system in the app is about to lose. Based on the reviews, the users seem to be using real money in the games/apps. This complaint is a recurring theme within this category:
“You have to pay for it, then the game just freezes when you win against the CPU? Reset it over and again, keeps freezing unless it rolls something to not land on my property. Also, is the dice rigged against the CPU? Honesty? With as much as I owned in the beginning, none of the 3 CPUs would land on anything I owned. Anytime the last CPU needs to raise money, game freezes, guess ya just can’t win.”
“there’s a glitch in it that freezes the game from continuing when you’re winning. The dice just disappears, but the trains and clouds and aircrafts keep moving. It’s like It is designed so that one doesn’t win them.”
“When playing against the computers when you’re about to win and bankrupt the final computer the game conveniently freezes. It does not allow you to win. Not a very fun game to play, I want my money back.”
We consider this category important as some of these apps require the use of real money to play or for in-app purchases. If apps are dishonest in the underlying process of the systems that are expected to be fair, then that constitutes not only a violation of the value of honesty, it might potentially be a crime. This is worth considering, especially when the exact issue is raised by several users:
“Although you say that the dice is random, i cannot help but feel that it is rigged. Take a look at your reviews, there are many other players that feel the same. Can’t be all of us are wrong. Or maybe we are suffering from mass hysteria?”
Other non-game examples include cases where the user reports not having the full value of the fee they were charged for the app and feels cheated. For instance:
“Whenever I pay for parking the app always steals 5 minutes off my parking time. For example, I pay for 60 minutes and the timer starts at 54 minutes and 59 seconds. I am very upset, this has been happening for a while and probably to many more people as well. That is a lot of money!”
“This app will not give you re requested amount of parking time. If you park for 15 minutes it will immediately say you have 11 minutes left. I understand that you have to charge but at least give me the requested amount of parking time.”
6.2.5 Inaccurate Information
“When you need to pay for additional time, and click ’Recent’ to pay for the most Recently parked in place - the first item is not the place you just parked in so it tricks you into paying for the wrong place (dark pattern). Please make the Recent accurately reflect the most recently parked in place.”
Another example review in this category is quite severe as it relates to a health emergency app providing potentially inaccurate information that might be detrimental to the user:
“Try to use this in an actual emergency and you’ll just end up as a dead idiot holding a cellphone. The information is either useless or completely false in most cases. Don’t bother downloading.”
Other less severe but important reviews where the user perceives the app provides inaccurate information or notification are shown below:
“Do not buy unless you are sure you want to. You will NOT be able to get it set up and working within the 15 minute refund window. The instructions online are so cryptic it (and wrong).”
“Very annoying every time when you open the app it shows you have a notification. Then checking your notifications you don’t have any.”
6.2.6 Unfair Fees
“Went through the sign up process and parked my bike in a bike parking zone. Put in the correct zone details for the bike parking area and got charged a car parking rate. Rang support and they said there is no bike parking at that location. I explained there was and they told me to ring the council.”
Other examples of fees considered by the user to be unfair are:
“The app charges you 0.25 per transaction. So I paid 0.75 to pay for parking it charged me 0.25 service fee then I extended my parking 0.25 and it charged me again 0.25!!! Biggest scam in the world.”
“The only annoying things are that I have to buy any extra Monopoly Board in the same game when I already paid the main game. Can you not give extra Monopoly Boards in the same game for free. You are not fair!”
This category is also reflected in the form of hidden charges where the user is not aware of subsequent charges made to their account. These hidden charges can take the form of a vague bill (as shown in the review below) or not notifying the user with respect to extra charges.
“This is a notorious company with horrible app I’ve ever used. They hide the history and details very deep for you to check and trace. And the monthly bill is also vague. I experienced they secretly bill me!”
“LOOK OUT PEOPLE. THIS IS A SCAM. THEY DID NOT WARN OF A DEPOSIT FEE AND THEY TOOK 33% OF THE DEPOSIT. I RECOMMEND SUING THEM NOW.”
Another related issue within this category is dubious charges where the user account has been charged, and it is not clear why those charges occur. Abnormally high fees (more than the standard subscription fees) and overcharging of the user account are also captured under this category. For example:
“It charged me £74.50 when I bought a ticket for £1.50 it’s a absolute scam I want my money back!”
6.2.7 No Service
“Horrible experience with this app. Causing a lot of frustrations with users. when it fails and I get a ticket there is no much help I can get. sometimes I just pay the fines just because the complaint system is awfully inconvenient. I feel cheated and it looks like a money making tool for whoever is collecting the fines.”
Another related example is shown below:
“I spent 20 euros with all the DLCs included, I feel pretty deceived not being able to play the game.”
6.2.8 Deletion of Reviews
“I left them a negative review and the developer deleted it. Now I’m going to review them on YouTube and all social media platforms. Basically, they are scammers.”
“Deleted my honest review. Warning. Steer clear. They keep trying to make you slip up and pay for premium. I signed up for a free trial last year and they make it too difficult for you to find where to cancel. Was charged about $40... shame such a good app is tarnished by such shady practices.”
6.2.9 Impersonation
“STAY AWAY... this app is a scam. the stickers make it look like it’s Brisbane council approved. it’s not and they are no help. I still got a fine for using the app correctly and the Brisbane council parking police have no access to check if you have paid or not and do not accept this as a payment method.”
Another example in this category reflects situations where users feel that they are interacting with bots instead of humans when they have signed up to the platform to interact with humans. This is similar to false advertising-related lawsuits of the Match.com platform described in Section 2. An example of this is:
“Good game, fake players online. I wanted a challenging Monopoly game. But when I start. I can tell that some are bots not real people online. For example, they quickly trade when it is their turn. A normal human will take some time to choose options.”
6.2.10 Fraudulent-looking Apps
“...Be careful with this kind of dishonest apps”
“This is a fraud app don’t download”
Furthermore, Table 6 shows the breakdown of honesty violations across different app categories. Out of the 401 honesty violation reviews, Games (28.9%), Auto & Vehicles (22.2%), and Finance (12.7%) are the app categories with the highest number of honesty violations, while Medical (0.2%) and Music & Audio (0.2%) are the app categories with the lowest number.
App Category | f |
---|---|
Games | 116 (28.9%) |
Auto & Vehicles | 89 (22.2%) |
Finance | 51 (12.7%) |
Productivity | 47 (11.7%) |
Photography | 26 (6.5%) |
Tools | 19 (4.7%) |
Maps & Navigation | 11 (2.7%) |
Travel and Local | 10 (2.5%) |
Health & Fitness | 7 (1.7%) |
Video Players & Editors | 7 (1.7%) |
Social | 6 (1.5%) |
Communication | 5 (1.2%) |
Entertainment | 3 (0.7%) |
Education | 2 (0.5%) |
Music & Audio | 1 (0.2%) |
Medical | 1 (0.2%) |
7 Developers’ Experience With Honesty Violations in Mobile Apps (RQ3)
7.1 Practitioner Study Design Approach
7.1.1 Step Int: Interview Study
7.1.2 Step Survey: Survey Study
7.2 Interview and Survey Study Results: Participant Information and Their Context
7.2.1 Participant Information
7.2.2 Types of Mobile Apps Participants Develop
7.2.3 Developer Experience: Reported Honesty Violations in App Reviews
7.3 Interview and Survey Study Results
Business | Developers | App Platforms | Users | |
---|---|---|---|---|
Honesty violations in mobile apps | ||||
Causes | Maximise revenue (31) | Poor designing (12) | Vague audits (1) | False claims (competitors in addition to users) (7) |
Market competition (6) | Poor testing (6) | |||
Improper definition of target audience | ||||
Consequences | Bad reputation (22) | Extra work to fix honesty violations (6) | Identity theft (9) | |
Face legal issues (8) | Experience negative emotions (7) | Experience negative emotions (21) | ||
Lose user trust (8) | Harm work performance (3) | Lose trust in apps/ company/ developers (14) | ||
Lose users (7) | Harm personal reputation (7) | Lose money unknowingly (19) | ||
Lose revenue/ business (18) | Lose time (4) | |||
Stop using/ uninstall/ not install apps (13) | ||||
Avoiding strategies | Strengthen designing practices (7) | |||
Strengthen development practices (6) | ||||
Strengthen testing practices (20) | ||||
Be transparent with customers/ users (16) | ||||
Have moral standards (5) | ||||
Fixing strategies | Thoroughly investigate the violation and fix (30) | |||
Hotfix (17) | ||||
Be transparent about the violation with customers/ users (14) | ||||
Have tools in place to resolve honesty violations (2) | ||||
Automatic detection of honesty violations | ||||
Benefits | Retain/ improve reputation (11) | Quick detection of honesty violations (20) | Transparency by knowing what to expect from the app (15) | |
Reduce/ avoid legal risks (4) | Improve developer satisfaction (5) | Find honest apps in stores (4) | ||
Gain more revenue (3) | Avoid fixes (2) | Improve user satisfaction (13) | ||
Retain/ gain users (3) | Reduce effort on fixing (2) | |||
Improve user trust (6) |