1 Introduction
Twitter, the main online medium used by the movement. Twitter users create directed links to follow the messages of other users and communicate through short public messages called tweets. We analyze the content of a large set of tweets about the 15M movement, extracting sentiment values and semantic content related to social and cognitive processes. Our aim is to explore the role of social emotions in group activity and collective action. We address how emotional interaction supports the creation of social movements and how emotional expressions lead to the involvement of the participants of the movement.
Twitter social network. We pay special attention to emotional expression in tweets, social inclusion in the follower network of the participants of the movement, and sentiment polarization in the creation and social response to the movement.
2 Results
2.1 Sentiment analysis in Spanish
2.2 Activity and information cascades
Twitter users that produce a tweet in the cascade, also known as the number of spreaders, \(n_{\mathrm{sp}}\). The associated size of an information cascade corresponds to the number of unique users who receive some tweet of the cascade in their tweet feeds. This quantity, commonly known as the exposure of the tweets in the cascade, is the number of participants who follow at least one spreader, denoted as \(n_{c}\).
2.3 The movement at the local level
Twitter, such that participants were aware of the large attention that the movement was receiving online. To quantify the social activity of each participant, we compute a vector of user features that captures the participant's integration in the movement, level of activity, expressed emotions, and levels of social and cognitive content. We estimate participant integration in the movement in terms of the follower/following network, i.e., a network in which a link from user u to user v is created when the latter follows the former. Thus, links point from a user to its followers, indicating the direction in which information flows. We measure the k-core centrality of a user, \(k_{c}(u)\) (explained in Materials and Methods), where the higher \(k_{c}\), the better integrated the user is. We also control for the number of followers of u, \(k_{\mathrm{out}}(u)\), and the number of participants followed by u, \(k_{\mathrm{in}}(u)\). The level of engagement in the movement is approximated by the total number of tweets about 15M created by the participant, \(n(u)\). We measure the expression of emotions by means of the ratios of positive tweets, \(\operatorname{pos}(u)\), and negative tweets, \(\operatorname{neg}(u)\), and the ratios of words related to social processes, \(\operatorname{soc}(u)\), and cognitive processes, \(\operatorname{cog}(u)\).
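The feature vector described above can be illustrated with a minimal sketch. All function and variable names here are hypothetical, the input formats (a list of `(user, e_m)` pairs and a map from each user to the set of its followers) are assumptions, and the k-core is computed on distinct neighbours ignoring link direction, which may differ from the exact definition given in Materials and Methods. The word-level ratios \(\operatorname{soc}(u)\) and \(\operatorname{cog}(u)\) are omitted because they require a linguistic dictionary.

```python
from collections import defaultdict


def build_network(followers_of, active):
    """Create a link v -> u whenever active user u follows active user v."""
    out_nbrs, in_nbrs = defaultdict(set), defaultdict(set)
    for v, followers in followers_of.items():
        if v not in active:
            continue
        for u in followers:
            if u in active and u != v:
                out_nbrs[v].add(u)  # u is a follower of v
                in_nbrs[u].add(v)   # v is followed by u
    return out_nbrs, in_nbrs


def k_core(nodes, out_nbrs, in_nbrs):
    """Core number by iterative peeling of the lowest-degree node.

    Assumption: degree counts distinct neighbours, ignoring direction.
    """
    nbrs = {u: set(out_nbrs.get(u, ())) | set(in_nbrs.get(u, ())) for u in nodes}
    degree = {u: len(nbrs[u]) for u in nodes}
    remaining, core, k = set(nodes), {}, 0
    while remaining:
        u = min(remaining, key=degree.get)
        k = max(k, degree[u])
        core[u] = k
        remaining.remove(u)
        for w in nbrs[u]:
            if w in remaining:
                degree[w] -= 1
    return core


def user_features(tweets, followers_of):
    """Return n(u), k_c(u), k_in(u), k_out(u), pos(u), neg(u) per active user."""
    by_user = defaultdict(list)
    for user, e_m in tweets:
        by_user[user].append(e_m)
    active = set(by_user)  # users with at least one tweet in the sample
    out_nbrs, in_nbrs = build_network(followers_of, active)
    core = k_core(active, out_nbrs, in_nbrs)
    features = {}
    for u, values in by_user.items():
        n = len(values)
        features[u] = {
            "n": n,
            "k_c": core[u],
            "k_in": len(in_nbrs.get(u, ())),    # participants followed by u
            "k_out": len(out_nbrs.get(u, ())),  # followers of u
            "pos": values.count(1) / n,
            "neg": values.count(-1) / n,
        }
    return features


# Toy data: "x" follows "a" but never tweeted, so it is filtered out.
tweets = [("a", 1), ("a", -1), ("a", 0), ("a", 1), ("b", 0), ("c", 1), ("d", -1)]
followers_of = {"a": {"b", "c", "x"}, "b": {"a", "c"}, "c": {"a", "b"}, "d": {"a"}}
features = user_features(tweets, followers_of)
print(features["a"])
```

The peeling loop is quadratic in the number of users, which is fine for illustration; a dataset of the size analyzed here would call for a bucket-based implementation.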
| | \(n(u)\) | \(k_{c}(u)\) | \(k_{\mathrm{in}}(u)\) | \(k_{\mathrm{out}}(u)\) | \(\operatorname{pos}(u)\) | \(\operatorname{neg}(u)\) | \(\operatorname{soc}(u)\) | \(\operatorname{cog}(u)\) | \(R^{2}\) |
|---|---|---|---|---|---|---|---|---|---|
| \(n(u)\) | | 0.193∗∗∗ | 0.015∗∗ | 0.032∗∗∗ | 0.010∗ | 0.026∗∗∗ | −0.022∗∗∗ | −0.005 | 0.048 |
| \(k_{c}(u)\) | 0.094∗∗∗ | | 0.676∗∗∗ | 0.090∗∗∗ | 0.005 | 0.012∗∗∗ | −0.012∗∗∗ | −0.003 | 0.537 |
Dataset | \(\operatorname{pos}(u)\) | \(\operatorname{neg}(u)\) | \(\operatorname{neu}(u)\) | \(\operatorname{soc}(u)\) | \(\operatorname{cog}(u)\) |
---|---|---|---|---|---|
15M | 0.063 | 0.068 | 0.065 | 0.035 | 0.128 |
15M shuffled | 0.00002 (0.008) | −0.0001 (0.007) | 0.0001 (0.007) | −0.0002 (0.008) | −0.0002 (0.007) |
individuals | 0.261 | 0.364 | 0.315 | 0.336 | 0.358 |
ind. shuffled | 0.029 (0.01) | 0.014 (0.009) | 0.017 (0.009) | 0.028 (0.009) | 0.022 (0.009) |
3 Discussion
Twitter social network. Using a dataset of tweets related to the 15M movement, we track the activity of 84,698 Twitter users. Our analysis includes 556,334 tweets during a period of 32 days, providing an illustration of the structure of the movement in two ways: (i) the dynamics of cascades in the discussion between connected users, and (ii) the individual level of social integration and participation of each user.
Twitter. In line with previous works in social psychology [29], we assess the role of emotions in social interaction and collective action. We test the hypothesis that collective emotions fuel social interaction by analyzing cascades according to their emotional, cognitive, and social content. We find that the sentiment expressed in the first tweet of a cascade does not significantly impact the size of the cascade. Instead, the collective emotions in the cascade are responsible for its size in terms of spreaders and listeners. In particular, cascades without positive content tend to be larger, and their size follows a qualitatively different distribution. The cognitive content of the tweets of a cascade plays no role in their spread. On the other hand, our analysis of social content in the cascades reveals a clear pattern: cascades with large ratios of social-related terms have distributions of listener and spreader sizes that scale with system size, in contrast with cascades with low ratios of social-related terms, which follow distributions that have bounded means.
Twitter, pushing the virality of content above a critical threshold that produces qualitatively different cascading behavior.
4 Materials and methods
4.1 15M tweets and network
Twitter related to the 15M movement in Spain, which brewed for some time on several online social media platforms and rose mainly with the launch of the digital platform Democracia Real Ya (Real Democracy Now). Twitter and Facebook were used to organize a series of protests that took off on the 15th of May, 2011, when demonstrators camped in several cities [30, 31]. From that moment on, camps, demonstrations and protests spread throughout the country, and the 15M became a grassroots movement for additional citizen platforms and organizations. As many of the adherents are online social media users, the growth and stabilization of the movement was closely reflected in time-stamped data of Twitter messages. Some of these tweets were extracted from the Twitter API according to a set of pre-selected keywords (see Table I in Additional file 1), and the collection comprises messages exchanged from the 25th of April at 00:03:26 to the 26th of May at 23:59:55, 2011. The sample of tweets was filtered by the Spanish startup company Cierzo Development Ltd., which operates its own private SMMART (Social Media Marketing Analysis and Reporting Tool) platform, and therefore no further details are available. According to previous reports, the SMMART platform collects one third of the total Twitter traffic. From the sample of tweets we obtained, the follower/following network is extracted: for the active users, i.e., those who posted at least one tweet in the collected sample, the set of followers is retrieved, and the resulting network is filtered to include only the active followers. The resulting network is composed of nodes that represent users, and edges whose direction corresponds to the information flow in Twitter. This way, if a user u is a follower of user v, there will be a directed link from v to u in the network.
4.2 Sentiment analysis
SentiStrength [15], a state-of-the-art sentiment analysis tool for short, informal messages from social media [22]. SentiStrength is used in a wide variety of applications, from the sentiment analysis of stock markets [32], to reactions to political campaigns [13], and interaction in different social networks [22]. We tailored SentiStrength to the Spanish language based on a sentiment corpus of more than 60,000 tweets and evaluated it on an independent corpus of more than 7,000 human-annotated tweets [23]. More details about our application of SentiStrength and the results of this evaluation can be found in the Supplementary Information. After sentiment detection, each tweet m has an associated emotion value \(e_{m}\): \(e_{m}=1\) if the tweet is positive with respect to its emotional charge, \(e_{m}=0\) if the tweet is neutral, and \(e_{m}=-1\) if the tweet is negative. We abbreviate these as positive, neutral and negative tweets, always referring to their emotional charge.
4.3 Linguistic content analysis
4.4 Cascade detection
Twitter is considered to be both a micro-blogging service and a message interchange service, as suggested by the high values of link reciprocity \(\rho\sim0.49\) and the mention functionality. Time-constrained cascades make it possible to account for these frequent situations in which people discuss particular topics using their own words to express their ideas, rather than forwarding a restricted piece of information.
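Once a cascade has been detected, the two size measures defined in Section 2.2, the number of spreaders \(n_{\mathrm{sp}}\) and the exposure \(n_{c}\), can be computed from the follower network. The following is a minimal sketch under stated assumptions: the function name and input format are hypothetical, the cascade is given simply as the list of authors of its tweets, and a spreader who follows another spreader is counted among the exposed users.

```python
from typing import Dict, Iterable, Set, Tuple


def cascade_sizes(
    cascade_authors: Iterable[str],
    followers_of: Dict[str, Set[str]],
) -> Tuple[int, int]:
    """Return (n_sp, n_c) for one cascade.

    n_sp: number of unique users who produced a tweet in the cascade.
    n_c:  number of unique users who follow at least one spreader.
    """
    spreaders = set(cascade_authors)
    exposed: Set[str] = set()
    for s in spreaders:
        # Union of follower sets, so each exposed user is counted once.
        exposed |= followers_of.get(s, set())
    return len(spreaders), len(exposed)


# Toy follower map: followers_of[v] is the set of users who follow v.
followers_of = {"a": {"b", "c"}, "b": {"c", "d"}, "c": set(), "d": {"a"}}
# User "a" tweets twice in the cascade but counts as a single spreader.
n_sp, n_c = cascade_sizes(["a", "b", "a"], followers_of)
print(n_sp, n_c)
```

Taking the union of follower sets rather than summing follower counts avoids counting a user twice when that user follows several spreaders.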