We investigated how community feedback affects individual users using a predictive model. First, we developed a model that predicts whether an author changes topic by incorporating three essential features: (i) the author’s properties, (ii) global topic trends driven by news and social events, and (iii) the received feedback. The model achieves high accuracy (≈ 82%) on two datasets from social media platforms (Reddit and Twitter). We then used the model to quantify the feedback effect at the level of individual users. While this effect does not significantly influence most users (67% on Reddit and 85% on Twitter), it affects the remaining users positively rather than negatively, i.e., these users are more inclined to continue with the same topic if they receive positive feedback.
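As a minimal sketch (not our exact implementation), the following snippet illustrates a logistic model of topic continuation built from the three feature groups above; the feature names and the synthetic data are hypothetical stand-ins.

```python
# Minimal sketch of a logistic topic-continuation model (illustration only;
# feature names and synthetic data are hypothetical stand-ins for the three
# feature groups described in the text).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 10_000

# (i) author properties, (ii) global topic trend, (iii) received feedback
author_activity = rng.normal(size=n)   # e.g., posting rate of the author
topic_trend = rng.normal(size=n)       # e.g., global popularity g_k(t) of topic k
feedback = rng.poisson(3, size=n)      # e.g., number of comments / retweets

X = np.column_stack([author_activity, topic_trend, feedback])
# Synthetic ground truth: feedback mildly increases the repeat probability.
logits = 0.5 * author_activity + 0.8 * topic_trend + 0.2 * feedback - 1.0
y = rng.binomial(1, 1 / (1 + np.exp(-logits)))  # 1 = author repeats the topic

model = LogisticRegression().fit(X, y)
print("accuracy:", model.score(X, y))
print("feedback coefficient:", model.coef_[0][2])
```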
The effect of social feedback varies across groups of users and across social media platforms. The percentage of susceptible users is higher on Reddit than on Twitter, but the effect size is larger for Twitter users than for Reddit users. We also note that on Reddit the percentage of susceptible users decreases with user activity, whereas on Twitter it increases with user activity (Tables 11 and 12 in Appendix F, respectively). Expert Twitter accounts often belong to celebrities or organizations, who may make use of social feedback in choosing their next topics so as to maximize engagement. This is not the case on Reddit, where user accounts have significantly lower visibility and organizations and celebrities do not have official accounts; hence there are fewer incentives to optimize posting activity for engagement. Future studies can test these hypotheses by distinguishing between different kinds of users (e.g., celebrities, organizations, casual users) on a given social media platform. Here we focused on highly active users (> 50 posts in six months); the results might be different for less active users. Measuring the effect of community feedback for inactive users is more challenging because they post less frequently. If users are extremely inactive but post in bursts, as is often the case [44, 68], the effect of community feedback can be captured by grouping similar users to obtain a sufficient number of samples per group.
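A minimal sketch of this grouping idea (hypothetical; we did not perform this analysis) could bucket users by activity level and pool their post-level observations before estimating per-group effects:

```python
# Hypothetical grouping of low-activity users (not performed in this study):
# bucket users by activity and pool their posts to get enough samples per group.
import pandas as pd

# Assumed schema: one row per post, with the author's total activity,
# the feedback received, and whether the next post repeated the topic.
posts = pd.DataFrame({
    "user_id":  [1, 1, 2, 2, 3, 4, 4, 4],
    "n_posts":  [3, 3, 2, 2, 1, 12, 12, 12],  # author's activity level
    "feedback": [5, 0, 2, 7, 1, 9, 3, 4],
    "repeated": [1, 0, 0, 1, 0, 1, 0, 1],
})

posts["activity_group"] = pd.cut(posts["n_posts"], bins=[0, 5, 10, 100],
                                 labels=["low", "medium", "high"])
# Pooled repeat rate by activity group and presence of feedback:
pooled = (posts.groupby(["activity_group", posts["feedback"] > 0], observed=True)
               ["repeated"].mean())
print(pooled)
```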
At first glance, the percentage of users susceptible to community feedback might appear small. However, Cheng et al. [9] also report “that negative feedback leads to significant behavioral changes that are detrimental to the community. [...] In contrast, positive feedback does not carry similar effects, and neither encourages rewarded authors to write more, nor improves the quality of their posts.” While that study focused on other behavioral changes, repeating our setup with a focus on negative feedback is a future direction to explore. Another reason the percentage of susceptible users is small could be that users get accustomed to feedback and hence start to “price it in” through certain expectations. For example, Cunha et al. [14] observe “diminishing returns and social feedback on later posts is less important than for the first post.” Though it is theoretically possible to study changes in susceptibility over time, there are technical limitations related to obtaining complete user timelines. Still, differentiating between “fresh” and “experienced” users could be worth pursuing.
7.1 Limitations
This study has the following limitations. First, we focused on the number of comments (Reddit) and retweets (Twitter), but we did not consider the content or sentiment of the feedback. However, as discussed above, the effect of receiving negative feedback can be quite different from that of positive feedback [8]. While retweets typically imply positive feedback, such as support for the author and agreement with the tweet contents [21, 54], comments and replies often contain a mixture of support and criticism [21]. In our dataset, positive, neutral, and negative comments accounted for about 40%, 30%, and 30% of the total comments, respectively. This difference in sentiment is a possible reason why the effect of community feedback is smaller on Reddit than on Twitter. It would be interesting to extend the logistic model to incorporate the sentiment of the comments. At the same time, a negative sentiment does not necessarily indicate an antagonistic position towards the original post. For example, a post about a tragic event is likely to attract many comments with a negative sentiment that nevertheless agree with the original position. Stance detection [40] could hence be a useful direction to explore in the future.
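One plausible way to incorporate sentiment, sketched below, is to split the feedback count into sentiment-specific features before fitting the logistic model. This is a hypothetical extension that we did not implement; the label scheme and feature layout are assumptions.

```python
# Hypothetical extension (not implemented in this study): replace the single
# feedback count with sentiment-specific counts before fitting the model.
import numpy as np
from sklearn.linear_model import LogisticRegression

def sentiment_features(comment_sentiments):
    """Count positive / neutral / negative comments for one post.

    `comment_sentiments` is a list of labels in {"pos", "neu", "neg"},
    e.g., the output of any off-the-shelf sentiment classifier.
    """
    return [comment_sentiments.count(s) for s in ("pos", "neu", "neg")]

# Toy example: three posts with their comment sentiment labels and outcomes.
posts = [["pos", "pos", "neg"], ["neu"], ["neg", "neg", "neu", "pos"]]
y = [1, 0, 0]  # 1 = author repeated the topic

X = np.array([sentiment_features(p) for p in posts])
model = LogisticRegression().fit(X, y)
print(model.coef_)  # separate effects of positive / neutral / negative feedback
```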
Second, topic classification from short texts (e.g., tweets) is still a challenging task. While most subreddit titles were interpretable to us, some topics extracted from tweets were not. This might be another reason why the Reddit and Twitter results differ quantitatively (Table 5). Note that noise in the topic classifier would lead to an underestimate of the effect that community feedback, or any other feature, has on topic continuation, since the dependent variable (whether a topic is repeated or not) becomes more random and less predictable than it actually is. Hence, we believe that our estimates for the percentage of susceptible users and for the gains in topic repeat probability due to community feedback are both lower bounds.
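The attenuation argument can be checked with a quick simulation, sketched below under simplified assumptions (a single feature and symmetric label noise): flipping a fraction of the binary outcomes shrinks the fitted logistic coefficient toward zero.

```python
# Quick check (simplified, hypothetical setup): symmetric label noise in the
# binary outcome attenuates the fitted logistic coefficient toward zero.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 50_000
x = rng.normal(size=(n, 1))                            # one feedback-like feature
y = rng.binomial(1, 1 / (1 + np.exp(-2.0 * x[:, 0])))  # true coefficient = 2.0

for flip_rate in [0.0, 0.1, 0.2]:
    flip = rng.random(n) < flip_rate                   # mislabel a fraction of outcomes
    y_noisy = np.where(flip, 1 - y, y)
    coef = LogisticRegression().fit(x, y_noisy).coef_[0][0]
    print(f"label noise {flip_rate:.0%}: estimated coefficient {coef:.2f}")
```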
Third, we only looked at one type of behavior, topic continuation vs. topic change, and at effects averaged across all topics. Other behaviors, such as the time until the next post or even churn probabilities, could also be studied. Furthermore, the effect might be heterogeneous across topics. Future work is needed to examine different types of behavior change, as well as additional factors that might influence the effect heterogeneity.
Fourth, our current study does not consider who provides feedback, whether a close friend, an acquaintance, or a stranger. Previous work on fact-checking interventions for false statements on Twitter [29, 47] found that the type of social link did affect the likelihood of accepting a fact-checking intervention. While collecting social network information adds certain technical challenges related to API limits, incorporating such information seems a promising future direction.
Fifth, an additional, inherent challenge when collecting data from online platforms is the fact that these platforms change for at least two reasons: (i) their user base changes and, once no longer undergoing exponential growth, generally matures both in terms of expertise on the platform and in terms of biological age; (ii) platforms periodically introduce new features, such as Twitter’s “retweet with comment” [21] or its expansion of the 140-character limit to 280 characters [24]. In a sense, every new feature creates a new platform, making before-after generalizations difficult. While our method is expected to be applicable to future versions of the platforms studied, the quantitative findings might not be.
Sixth, our approach for estimating the treatment effect based on predictive modeling may be affected by model misspecification. We assume the logistic model and identify the confounding variables by exploring possible factors behind the author’s posting behavior. Although the high prediction accuracy (82% for Reddit and Twitter) suggests that our predictive model is reasonable, there are many possible choices for the model, and it is likely that more predictive models will be developed in the future. For instance, it would be interesting to extend the proposed model by incorporating the history of a user’s posting behavior. Additionally, similar to matching methods, our method might miss confounding variables, which may affect the estimate of the community feedback effect. For example, our model neglects temporal information, i.e., the time of the previous post. It would be interesting to develop such a predictive model based on point processes [37]. Our method can control for some unobserved confounders by including the global topic trend in the model. Specifically, we adopted a simple random walk model for the topic trend \(g_{k}(t)\), which has the property of autoregressive smoothness. We note that this model could be extended to incorporate seasonality and rapid changes [35].
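For concreteness, a Gaussian random walk of the kind referred to above can be written as
\[
g_{k}(t) = g_{k}(t-1) + \epsilon_{k}(t), \qquad \epsilon_{k}(t) \sim \mathcal{N}\bigl(0, \sigma_{k}^{2}\bigr),
\]
where the noise variance \(\sigma_{k}^{2}\) is notation introduced here for illustration; this form makes the autoregressive smoothness explicit, and a seasonal extension would add a periodic term to the right-hand side.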
Finally, it is possible that social feedback affects emotions more than observable actions such as topic choice. For example, Maruyama et al. [48] observe that “receiving positive feedback to social media posts instills a psychological sense of community in the poster.” However, they do not report any actual behavior change. Reasoning about internal, mental states from social media data is inherently challenging and something this work does not attempt to do.
7.2 Broader impact
Our results contribute to the discussion on how operant conditioning affects social media users [1, 15] and suggest that social feedback systems are a critical and sensitive part of social media platforms, with an agenda-setting effect. The results of this study have implications for the design of social media. Prior studies show that social feedback influences consumers’ opinions about online content and its propensity to spread [26, 55], whereas this study shows its impact on authors’ decisions about which topic to post next. We note that polarizing or biased topics receive more feedback than impartial topics [73]. One can hypothesize that social influence contributes to this effect by boosting the spread of topics that arouse emotions and elicit quick positive feedback from susceptible users. A potential solution to this issue is a novel design of social rating systems that accounts for users’ susceptibilities.
Finally, we note that topic choice is a higher-level cognitive task [50], related to free will, so it is surprising that it is influenced by social feedback, although the father of operant conditioning considered free will an illusion [62, 65]. It remains an open question how many of our choices are determined by various kinds of feedback, including social feedback, and how many are the result of free will.