Linking cyber and physical spaces through community detection and clustering in social media feeds

https://doi.org/10.1016/j.compenvurbsys.2014.11.002

Highlights

  • Our analysis includes two major events as captured in Twitter.

  • The themes in cyber and physical communities tend to converge over time.

  • Messages among physical space users are more consistent at the onset of the event.

  • Geolocated users consume more information than they produce.

Abstract

Over the last decade we have witnessed a significant growth in the use of social media. Interactions within their context lead to the establishment of groups that function at the intersection of the physical and cyber spaces, and as such represent hybrid communities. Gaining a better understanding of how information flows in these hybrid communities is a substantial scientific challenge with significant implications for our ability to better harness crowd-contributed content. This paper addresses this challenge by studying how information propagates and evolves over time at the intersection of the physical and cyber spaces. By analyzing the spatial footprint, social network structure, and content in both physical and cyber spaces, we advance our understanding of the information propagation mechanisms in social media. The utility of this approach is demonstrated in two real-world case studies, the first reflecting a planned event (the Occupy Wall Street – OWS – movement’s Day of Action in November 2011), and the second reflecting an unexpected disaster (the Boston Marathon bombing in April 2013). Our findings highlight the intricate nature of the propagation and evolution of information both within and across cyber and physical spaces, as well as the role of hybrid networks in the exchange of information between these spaces.

Introduction

The past few years have witnessed a dramatic increase in the adoption and use of social media (Kaplan & Haenlein, 2010). In the U.S. alone, approximately two-thirds of online users participate in social media (Smith, 2011), spending on average between 3.6 and 6.5 h a month on social networking sites such as Facebook or Twitter (Nielsen, 2012). This has led to an unprecedented increase in the volume of data generated by social media users: every minute, over 270,000 tweets (or retweets) are contributed worldwide (Forbes, 2012), 3000 images are posted on Flickr (Sapiro, 2011), and 100 h of video are uploaded to YouTube (YouTube, 2014). These are but a few examples of the shift that has occurred in recent years toward user-generated digital content. With millions of users around the world, this trend is likely to further intensify (Hollis, 2011) as technological advances empower users to contribute richer data at higher rates.

Social media services and platforms offer a wide array of digital channels for expression and interaction, ranging from forums/message boards (e.g. MacRumors), weblogs (e.g. Blogger, Wordpress), and microblogging (e.g. Twitter, Tumblr, Weibo), to wikis (e.g. Wikipedia, Wikimapia), social networking services (e.g. Facebook, Google+, LinkedIn), and podcasts (video and audio, e.g. iTunes, Ustream). Such media have enabled the general public to contribute, disseminate, and exchange information (Kaplan & Haenlein, 2010), by introducing a bottom-up alternative to complement the traditional top-down nature of Web 1.0 (Schneckenberg, 2009). This has not only resulted in a change in traditional journalism and news reporting (Deuze, 2008, Kwak et al., 2010), but it is also leading to new opportunities within the geographical sciences (Caverlee et al., 2013, Sui and Goodchild, 2011) due to the rich geographic context that social media data often provide. A noteworthy example of this trend is the Livehoods project (Cranshaw, Schwartz, Hong, & Sadeh, 2012), which is used to characterize and understand urban dynamics using social media. Indeed, social media, and microblogging in particular, have already been shown to be useful in predicting pandemics (Chunara et al., 2012, Culotta, 2010, Ritterman et al., 2009) or natural disasters (e.g. Corbane et al., 2012, Crooks et al., 2013, Zook et al., 2010), to name a few.

As we increasingly embrace the use of crowd-contributed content, gaining a better understanding of how physical space events are reported and discussed within these hybrid communities is a substantial theoretical challenge that also has significant application potential. This paper addresses this challenge by studying how information propagates and evolves over time at the intersection of the physical and cyber spaces, considering representative test cases and studying them under the lens of geosocial analysis. By analyzing the spatial footprint, social network structure, and content in both physical and cyber spaces we can advance our understanding of the complex mechanism through which information regarding localized events is propagated through social media.

A particularity of social media that renders such study necessary is the fact that, unlike other forms of volunteered geographic information, contributions there are part of a networking process, whereby individuals share and exchange information with other members of these online communities (Stefanidis et al., 2013). This networking activity may center around a variety of topics, ranging from personal observations on minutiae to commentaries on issues of broader interest (Aiello et al., 2013, Mischaud, 2007). Understanding how people participate in this process remains a substantial, cross-disciplinary theoretical challenge. As a way to address this issue, Farnham and Churchill (2011), for example, discussed the issue of cyber (online) presences, governed by principles of cyber interaction and information flow. However, as these studies emerged from the social psychology domain, they often fail to adequately address the role of the physical space in these cyber interactions. People still live and function primarily in a physical space (rather than the cyber one), and their interactions in this space still play a central role in shaping their behavior. As social media becomes an integral part of our societies, understanding the interplay between cyber presence and the corresponding physical space (so-called “polysocial reality”; Applin & Fischer, 2012) becomes increasingly important, as it will elevate our capability to leverage such content for a variety of purposes.

Mapping and understanding the relations between the cyber and the physical spaces, and in particular the information flow between them, is a substantial scientific research challenge, as these two spaces are often not explicitly related, nor are they studied in tandem. The emergence of geolocated social media presents a unique opportunity to address this challenge by allowing us to link cyber and physical activities through user interactions, and understand how people’s actions and reactions to events manifest themselves across these spaces. Such knowledge is critical in a wide range of applications of broader societal value (e.g. disaster response), providing additional motivation for this research.

Our focus is on studying the connections between the cyber and the physical spaces (especially as they relate to reports of events in the physical space), as they are expressed through social associations and physical proximity. By doing so we will show how we can identify connections across these two spaces, and demonstrate the value of studying both simultaneously rather than separately. We argue that by studying social networks in both physical and cyber spaces, combining social network analysis (SNA) and spatiotemporal data clustering, we can better understand their complex structure. While SNA is a rapidly growing field (e.g., Newman (2010) and Barabási (2012)), it is only recently emerging as a tool in geospatial analysis, and is often underutilized (Ter Wal & Boschma, 2009). Moreover, SNA is itself weakened by the lack of geographic consideration when exploring social relations (e.g. Bosco, 2006). Only recently have we started seeing some early studies that attempt to infuse geography into this issue, addressing for example the geographic scope of topics discussed in online communities (Herdağdelen, Zuo, Gard-Murray, & Bar-Yam, 2013). Our work contributes to this issue by linking the cyber and physical spaces through SNA and spatiotemporal analysis, aiming to bridge the gap between these two fields.
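To make the idea of studying both spaces side by side more concrete, the following is a minimal sketch and not the method used in this paper. It assumes that cyber communities can be approximated by connected components of a retweet graph (a deliberate simplification standing in for a proper community detection algorithm) and that physical groups can be approximated by linking geolocated users who are less than an assumed 1 km apart. The time dimension is ignored, and all user names, coordinates, and thresholds are hypothetical.

```python
from collections import defaultdict
from math import asin, cos, radians, sin, sqrt


def connected_components(nodes, pairs):
    """Group nodes so that any two joined by a chain of pairs share a group."""
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving keeps trees shallow
            n = parent[n]
        return n

    for a, b in pairs:
        parent[find(a)] = find(b)

    groups = defaultdict(set)
    for n in nodes:
        groups[find(n)].add(n)
    # Sort for a deterministic order of the resulting groups.
    return sorted(groups.values(), key=lambda g: sorted(g))


def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (*p, *q))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))


# Hypothetical toy data: (retweeter, original author) pairs and user locations.
retweets = [("ann", "bob"), ("bob", "cat"), ("dan", "eve")]
locations = {
    "ann": (40.7075, -74.0113),  # lower Manhattan
    "dan": (40.7081, -74.0105),  # lower Manhattan
    "eve": (42.3499, -71.0785),  # Boston
}

# Cyber side: users connected through retweets (simplified stand-in for SNA).
users = {u for pair in retweets for u in pair}
cyber_communities = connected_components(users, retweets)

# Physical side: link geolocated users closer than an assumed 1 km threshold.
geo_users = sorted(locations)
near = [
    (a, b)
    for i, a in enumerate(geo_users)
    for b in geo_users[i + 1:]
    if haversine_km(locations[a], locations[b]) < 1.0
]
physical_groups = connected_components(geo_users, near)

print(cyber_communities)
print(physical_groups)
```

In this toy data, two users who stand close together in the physical space belong to different cyber communities, which is the kind of mismatch that only becomes visible when both spaces are examined together.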

The remainder of this paper is organized as follows. In Section 2 we discuss the rise of geosocial media as a new social communication avenue and a novel source of geosocial information. In particular, we discuss the notion of physical presence within social media and its importance for exploring the relation between the cyber and the physical domains. In Section 3 we discuss how communities and groups can be detected in both the cyber and physical space, and how they can be processed to form a ‘hybrid’ geosocial view of communities. To showcase these concepts and their benefits, in Section 4 we present the analysis of two case studies that make use of Twitter data associated with two different types of events: a planned activity during the Occupy Wall Street (OWS) Day of Action (November 17, 2011), and the response to the Boston Marathon Bombing (April 15, 2013). The paper concludes with a summary and outlook in Section 5.
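One way to picture the ‘hybrid’ view that links cyber space communities to physical space groups is as a bipartite graph whose two node sets are the communities and the groups. The sketch below is illustrative only: it assumes that each user already carries a cyber community label and, if geolocated, a physical group label, and it weights each edge by the number of users the two share. The labels, the user names, and the choice of edge weight are assumptions made for this example, not details taken from the paper.

```python
from collections import Counter

# Hypothetical labels, assumed to come from earlier detection steps.
cyber_community = {"ann": "C1", "bob": "C1", "cat": "C1", "dan": "C2", "eve": "C2"}
physical_group = {"ann": "P1", "dan": "P1", "eve": "P2"}  # geolocated users only


def bipartite_meta_graph(cyber, physical):
    """Count users shared by each (cyber community, physical group) pair."""
    edges = Counter()
    for user, group in physical.items():
        if user in cyber:  # only users present in both spaces form a link
            edges[(cyber[user], group)] += 1
    return dict(edges)


meta_graph = bipartite_meta_graph(cyber_community, physical_group)
for (community, group), weight in sorted(meta_graph.items()):
    print(f"{community} -- {group}: {weight} shared user(s)")
```

Here physical group P1 is connected to both cyber communities, so it acts as a bridge between them, while users without a location (bob and cat) contribute only to the cyber side.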

Section snippets

The rise of geosocial media and spatial presence

The power of social media to disseminate information of societal importance has been showcased over the last few years with respect to a range of events, from citizen journalism (e.g. the 2008 Mumbai terrorist attacks; Arthur, 2008), to civil unrest (e.g. the 2011 London riots (Glasgow, Ebaugh, & Fink, 2012) and the ‘Arab Spring’ (Christensen, 2011, Howard et al., 2011)), military operations (e.g. the 2011 U.S. raid on Bin Laden’s hideaway (McCullagh, 2011)) and health (e.g. Culotta, 2010). Within

Cyber space communities and physical space groups

Social media have proven to be a fertile ground for fostering user interaction, thus supporting the large-scale synthesis of the virtual and the real (Gordon and Manosevitch, 2011, Mitra, 2003, Parks, 2011). As a result, communities with various degrees of physical and virtual presence are formed (Mitra & Schwartz, 2001), linking physical spaces, cyber spaces, and human activity. In earlier work, Porter (2004) outlined a typology for virtual communities, and discussed their attributes, namely

Case studies

In order to showcase our analysis approach we present in this section two case studies. In the first case study – the 2011 Occupy Wall Street (OWS) Day of Action – we demonstrate how the different steps of our analysis lead to the creation of the bipartite meta-graph that connects cyber space communities and groups. In the second case – the 2013 Boston Marathon Bombing – we show how our analysis approach enables a detailed examination of information propagation between the cyber and the

Summary and outlook

Studying the connections between cyber space and the physical space has long been a challenge in gaining a deeper understanding of how people act and interact. The emergence of social media provides a lens to study the social connections among individuals, allowing us for the first time to observe the links among the distinct spaces in which we operate. While the phone allowed connecting people in fixed locations and the mobile phone extended that to account for mobility (Wellman, 2001, Kwan,

References (108)

  • Amini, A., et al. (2014). On density-based data streams clustering algorithms: A survey. Journal of Computer Science and Technology.
  • Applin, S. A., Fischer, M. D. (2012). Polysocial reality: prospects for extending user capabilities beyond mixed, dual...
  • Arthur, C. (2008). How Twitter and Flickr recorded the Mumbai terror attacks, The Guardian <http://bit.ly/1j6mhaz>...
  • Barabási, A. (2012). The network takeover. Nature Physics.
  • Berry, M. W., et al. (1999). Matrices, vector spaces, and information retrieval. SIAM Review.
  • Biocca, F., et al. (2003). Toward a more robust theory and measure of social presence: Review and suggested criteria. Presence.
  • Blondel, V. D., et al. (2008). Fast unfolding of communities in large networks. Journal of Statistical Mechanics: Theory and Experiment.
  • Bosco, F. Actor-network theory, networks, and relational approaches in human geography.
  • Boyd, D., Golder, S., Lotan, G. (2010). Tweet, tweet, retweet: conversational aspects of retweeting on twitter. In...
  • Cao, F., Ester, M., Qian, W., Zhou, A. (2006). Density-based clustering over an evolving data stream with noise. In J....
  • Caren, N., et al. (2011). Occupy online: Facebook and the spread of Occupy Wall Street. Social Science Research Network.
  • Caverlee, J., et al. (2013). Towards geo-social intelligence: Mining, analyzing, and leveraging geospatial footprints in social media. IEEE Computer Society Data Engineering Bulletin.
  • Cha, M., Haddadi, H., Benevenuto, F., & Gummadi, P. K. (2010). Measuring user influence in twitter: The million...
  • Cheng, Z., Caverlee, J., Lee, K. (2010). You are where you tweet: A content-based approach to geolocating twitter...
  • Choi, S. S., et al. (2010). A survey of binary similarity and distance measures. Journal of Systemics, Cybernetics and Informatics.
  • Christakos, G. (2000). Modern spatiotemporal geostatistics.
  • Christensen, C. (2011). Twitter revolutions? Addressing social media and dissent. The Communication Review.
  • Chunara, R., et al. (2012). Social and news media enable estimation of epidemiological patterns early in the 2010 Haitian cholera outbreak. The American Journal of Tropical Medicine and Hygiene.
  • Clauset, A., et al. (2004). Finding community structure in very large networks. Physical Review E.
  • Corbane, C., et al. (2012). Relationship between the spatial distribution of SMS messages reporting needs and building damage in 2010 Haiti disaster. Natural Hazards and Earth System Sciences.
  • Cranshaw, J., Schwartz, R., Hong, J. I., Sadeh, N. M. (2012). The livehoods project: Utilizing social media to...
  • Croitoru, A., et al. (2013). GeoSocial gauge: A system prototype for knowledge discovery from geosocial media. International Journal of Geographical Information Science.
  • Crooks, A. T., et al. (2013). #Earthquake: Twitter as a distributed sensor system. Transactions in GIS.
  • Culotta, A. (2010). Towards detecting influenza epidemics by analyzing twitter messages. In Proceedings of the first...
  • Dann, S. (2010). Twitter content classification. First Monday.
  • Deuze, M. (2008). Understanding journalism as newswork: How it changes, and how it remains the same. Westminster Papers in Communication and Culture.
  • Ester, M., Kriegel, H.-P., Sander, J., Xu, X. (1996). A density-based algorithm for discovering clusters in large...
  • Farnham, S. D., Churchill, E. F. (2011). Faceted identity, faceted lives: social and technical issues with being...
  • Fink, C., Piatko, C., Mayfield, J., Chou, D., Finin, T., Martineau, J. (2009). The geolocation of web logs from textual...
  • Forbes (2012). Twitter’s Dick Costolo: Twitter mobile ad revenue beats desktop on some days, <http://onforb.es/KgTWYP>...
  • Friggeri, A., Lambiotte, R., Kosinski, M., Fleury, E. (2012). Psychological aspects of social communities. In 2012 ASE...
  • Gillham, P. F., et al. (2012). Strategic incapacitation and the policing of Occupy Wall Street protests in New York City, 2011. Policing and Society: An International Journal of Research and Policy.
  • Glasgow, K., Ebaugh, A., Fink, C. (2012). #Londonsburning: Integrating geographic topical, and social information...
  • Goodchild, M. F. (2007). Citizens as sensors: The world of volunteered geography. GeoJournal.
  • Gorawski, M., et al. AEC algorithm: A heuristic approach to calculating density-based clustering eps parameter.
  • Gordon, E., et al. (2011). Augmented deliberation: Merging physical and virtual interaction to engage communities in urban planning. New Media & Society.
  • Gruzd, A., et al. (2011). Imagining Twitter as an imagined community. American Behavioral Scientist.
  • Harrison, S., Dourish, P. (1996). Re-place-ing space: The roles of place and space in collaborative systems. In...
  • Herdağdelen, A., et al. (2013). An exploration of social identity: The geography and politics of news-sharing communities in Twitter. Complexity.
  • Hollis, C. (2011). 2011 IDC digital universe study: Big data is here, now what?, <http://bit.ly/kouTgc> [Accessed on...