Linking cyber and physical spaces through community detection and clustering in social media feeds
Introduction
The past few years have witnessed a dramatic increase in the adoption and use of social media (Kaplan & Haenlein, 2010). In the U.S. alone, approximately two-thirds of online users participate in social media (Smith, 2011), spending on average between 3.6 and 6.5 h a month on social networking sites such as Facebook or Twitter (Nielsen, 2012). This has led to an unprecedented increase in the volume of data generated by social media users: every minute, over 270,000 tweets (or retweets) are contributed worldwide (Forbes, 2012), 3000 images are posted on Flickr (Sapiro, 2011), and 100 h of video are uploaded to YouTube (YouTube, 2014). These are but a few examples of the shift that has occurred in recent years toward user-generated digital content. With millions of users around the world, this trend is likely to intensify further (Hollis, 2011) as technological advances empower users to contribute richer data at higher rates.
Social media services and platforms offer a wide array of digital channels for expression and interaction, ranging from forums/message boards (e.g. MacRumors), weblogs (e.g. Blogger, Wordpress), and microblogging (e.g. Twitter, Tumblr, Weibo), to wikis (e.g. Wikipedia, Wikimapia), social networking services (e.g. Facebook, Google+, LinkedIn), and video and audio podcasts (e.g. iTunes, Ustream). Such media have enabled the general public to contribute, disseminate, and exchange information (Kaplan & Haenlein, 2010) by introducing a bottom-up alternative that complements the traditional top-down nature of Web 1.0 (Schneckenberg, 2009). This has not only changed traditional journalism and news reporting (Deuze, 2008, Kwak et al., 2010), but is also leading to new opportunities within the geographical sciences (Caverlee et al., 2013, Sui and Goodchild, 2011) due to the rich geographic context that social media data often provide. A noteworthy example of this trend is the Livehoods project (Cranshaw, Schwartz, Hong, & Sadeh, 2012), which uses social media to characterize and understand urban dynamics. Indeed, social media, and microblogging in particular, have already been shown to be useful in predicting pandemics (Chunara et al., 2012, Culotta, 2010, Ritterman et al., 2009) or natural disasters (e.g. Corbane et al., 2012, Crooks et al., 2013, Zook et al., 2010), to name a few.
As we increasingly embrace the use of crowd-contributed content, gaining a better understanding of how physical space events are reported and discussed within these hybrid communities is a substantial theoretical challenge that also has significant application potential. This paper addresses this challenge by studying how information propagates and evolves over time at the intersection of the physical and cyber spaces, considering representative test cases and studying them under the lens of geosocial analysis. By analyzing the spatial footprint, social network structure, and content in both physical and cyber spaces we can advance our understanding of the complex mechanism through which information regarding localized events is propagated through social media.
A particularity of social media that renders such study necessary is the fact that, unlike other forms of volunteered geographic information, contributions there are part of a networking process, whereby individuals share and exchange information with other members of these online communities (Stefanidis et al., 2013). This networking activity may center around a variety of topics, ranging from personal observations on minutiae to commentaries on issues of broader interest (Aiello et al., 2013, Mischaud, 2007). Understanding how people participate in this process remains a substantial, cross-disciplinary theoretical challenge. Farnham and Churchill (2011), for example, approached it through the notion of cyber (online) presences, governed by principles of cyber interaction and information flow. However, as these studies emerged from the social psychology domain, they often fail to adequately address the role of the physical space in these cyber interactions. People still live and function primarily in a physical space (rather than the cyber one), and their interactions in this space still play a central role in shaping their behavior. As social media becomes an integral part of our societies, understanding the interplay between cyber presence and the corresponding physical space (so-called “polysocial reality” (Applin & Fischer, 2012)) becomes increasingly important, as it will elevate our capability to leverage such content for a variety of purposes.
Mapping and understanding the relations between the cyber and the physical spaces, and in particular the information flow between them, is a substantial scientific research challenge, as these two spaces are often not explicitly related, nor are they studied in tandem. The emergence of geolocated social media presents a unique opportunity to address this challenge by allowing us to link cyber and physical activities through user interactions, and to understand how people’s actions and reactions to events manifest themselves across these spaces. Such knowledge is critical in a wide range of applications of broader societal value (e.g. disaster response), providing additional motivation for this research.
Our focus is on studying the connections between the cyber and the physical spaces (especially as they relate to reports of events in the physical space), as they are expressed through social associations and physical proximity. By doing so we will show how we can identify connections across these two spaces, and demonstrate the value of studying both simultaneously rather than separately. We argue that by studying social networks in both physical and cyber spaces, combining social network analysis (SNA) and spatiotemporal data clustering, we can better understand their complex structure. While SNA is a rapidly growing field (e.g., Newman (2010) and Barabási (2012)), it has only recently emerged as a tool in geospatial analysis, and is often underutilized (Ter Wal & Boschma, 2009). Conversely, SNA itself is weakened by the lack of geographic consideration when exploring social relations (e.g. Bosco, 2006). Only recently have we started seeing some early studies that attempt to infuse geography into this issue, addressing for example the geographic scope of topics discussed in on-line communities (Herdağdelen, Zuo, Gard-Murray, & Bar-Yam, 2013). Our work contributes to this issue by linking the cyber and physical spaces through SNA and spatiotemporal analysis, aiming to bridge the gap between these two fields.
The remainder of this paper is organized as follows. In Section 2 we discuss the rise of geosocial media as a new social communication avenue and a novel source of geosocial information. In particular, we discuss the notion of physical presence within social media and its importance for exploring the relation between the cyber and the physical domains. In Section 3 we discuss how communities and groups can be detected in both the cyber and physical space, and how they can be processed to form a ‘hybrid’ geosocial view of communities. To showcase these concepts and their benefits, in Section 4 we present the analysis of two case studies that make use of Twitter data associated with two different types of events: a planned activity during the Occupy Wall Street (OWS) Day of Action (November 17, 2011), and the response to the Boston Marathon Bombing (April 15, 2013). The paper concludes with a summary and outlook in Section 5.
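As a rough illustration of what a ‘hybrid’ geosocial view might look like in practice, the following Python sketch links cyber space communities to physical space groups through the users they share. It is only a sketch under stated assumptions and not the method used in this paper: connected components stand in for a proper community detection algorithm, a fixed longitude/latitude grid stands in for spatiotemporal clustering, and all function names, the `cell_deg` parameter, and the sample users are hypothetical.

```python
from collections import Counter


def cyber_communities(edges):
    """Map each user to a community id.

    Assumption: connected components of the interaction graph (built with
    union-find) stand in for a real community detection method.
    """
    parent = {}

    def find(user):
        parent.setdefault(user, user)
        while parent[user] != user:
            parent[user] = parent[parent[user]]  # path halving
            user = parent[user]
        return user

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a
    return {user: find(user) for user in list(parent)}


def physical_groups(locations, cell_deg=0.5):
    """Map each geolocated user to a grid cell.

    Assumption: a fixed grid of cell_deg degrees stands in for
    spatiotemporal clustering of geolocated posts.
    """
    return {
        user: (int(lon // cell_deg), int(lat // cell_deg))
        for user, (lon, lat) in locations.items()
    }


def meta_graph(community_of, group_of):
    """Build a weighted bipartite graph between communities and groups.

    An edge (community, group) is weighted by the number of users who
    belong to both. Users without a known location are skipped.
    """
    weights = Counter()
    for user, community in community_of.items():
        if user in group_of:
            weights[(community, group_of[user])] += 1
    return weights


# Hypothetical sample data: interaction pairs (e.g. retweets or mentions)
# and (longitude, latitude) positions for the users who are geolocated.
edges = [("ann", "bob"), ("bob", "cat"), ("dan", "eve")]
locations = {
    "ann": (-74.00, 40.70),
    "bob": (-73.90, 40.80),
    "cat": (-71.06, 42.36),
    "dan": (-71.10, 42.35),
}

community_of = cyber_communities(edges)
group_of = physical_groups(locations)
for (community, group), weight in sorted(meta_graph(community_of, group_of).items()):
    print(community, group, weight)
```

In this toy example one cyber community spans two grid cells, while one grid cell hosts members of two different communities, which is the kind of cross-space link the hybrid view is meant to expose.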
Section snippets
The rise of geosocial media and spatial presence
The power of social media to disseminate information of societal importance has been showcased over the last few years with respect to a range of events, from citizen journalism (e.g. the 2008 Mumbai terrorist attacks; Arthur, 2008), to civil unrest (e.g. the 2011 London riots (Glasgow, Ebaugh, & Fink, 2012) and the ‘Arab Spring’ (Christensen, 2011, Howard et al., 2011)), military operations (e.g. the 2011 U.S. raid on Bin Laden’s hideaway (McCullagh, 2011)) and health (e.g. Culotta, 2010). Within
Cyber space communities and physical space groups
Social media have proven to be a fertile ground for fostering user interaction, thus supporting the large-scale synthesis of the virtual and the real (Gordon and Manosevitch, 2011, Mitra, 2003, Parks, 2011). As a result, communities with various degrees of physical and virtual presence are formed (Mitra & Schwartz, 2001), linking physical spaces, cyber spaces, and human activity. In earlier work, Porter (2004) outlined a typology for virtual communities, and discussed their attributes, namely
Case studies
In order to showcase our analysis approach we present in this section two case studies. In the first case study – the 2011 Occupy Wall Street (OWS) Day of Action – we demonstrate how the different steps of our analysis lead to the creation of the bipartite meta-graph that connects cyber space communities and groups. In the second case – the 2013 Boston Marathon Bombing – we show how our analysis approach enables a detailed examination of information propagation between the cyber and the
Summary and outlook
Studying the connections between cyber space and the physical space has long been a challenge in gaining a deeper understanding of how people act and interact. The emergence of social media provides a lens to study the social connections among individuals, allowing us for the first time to observe the links among the distinct spaces in which we operate. While the phone allowed connecting people in fixed locations and the mobile phone extended that to account for mobility (Wellman, 2001, Kwan,
References (108)
- et al. Users of the world unite! The challenges and opportunities of social media. Business Horizons (2010)
- et al. Basic notions for the analysis of large two-mode networks. Social Networks (2008)
- et al. Twitter spammer detection using data stream clustering. Information Sciences (2014)
- et al. Another look at ‘being there’ experiences in digital media: Exploring connections of telepresence with mental imagery. Computers in Human Behavior (2014)
- et al. Term-weighting approaches in automatic text retrieval. Information Processing & Management (1988)
- The wikification of GIS and its consequences: Or Angelina Jolie’s new tattoo and the future of GIS. Computers, Environment and Urban Systems (2008)
- et al. Supporting geographically-aware web document foraging and sensemaking. Computers, Environment and Urban Systems (2011)
- A survey of stream clustering algorithms
- Aggarwal, C. C., Han, J., Wang, J., Yu, P. S. (2003). A framework for clustering evolving data streams. In J. C....
- et al. Sensing trending topics in Twitter. IEEE Transactions on Multimedia (2013)
- On density-based data streams clustering algorithms: A survey. Journal of Computer Science and Technology
- The network takeover. Nature Physics
- Matrices, vector spaces, and information retrieval. SIAM Review
- Toward a more robust theory and measure of social presence: Review and suggested criteria. Presence
- Fast unfolding of communities in large networks. Journal of Statistical Mechanics: Theory and Experiment
- Actor-network theory, networks, and relational approaches in human geography
- Occupy online: Facebook and the spread of Occupy Wall Street. Social Science Research Network
- Towards geo-social intelligence: Mining, analyzing, and leveraging geospatial footprints in social media. IEEE Computer Society Data Engineering Bulletin
- A survey of binary similarity and distance measures. Journal of Systemics, Cybernetics and Informatics
- Modern spatiotemporal geostatistics
- Twitter revolutions? Addressing social media and dissent. The Communication Review
- Social and news media enable estimation of epidemiological patterns early in the 2010 Haitian cholera outbreak. The American Journal of Tropical Medicine and Hygiene
- Finding community structure in very large networks. Physical Review E
- Relationship between the spatial distribution of SMS messages reporting needs and building damage in 2010 Haiti disaster. Natural Hazards and Earth System Sciences
- GeoSocial gauge: A system prototype for knowledge discovery from geosocial media. International Journal of Geographical Information Science
- #Earthquake: Twitter as a distributed sensor system. Transactions in GIS
- Twitter content classification. First Monday
- Understanding journalism as newswork: How it changes, and how it remains the same. Westminster Papers in Communication and Culture
- Strategic incapacitation and the policing of Occupy Wall Street protests in New York City, 2011. Policing and Society: An International Journal of Research and Policy
- Citizens as sensors: The world of volunteered geography. GeoJournal
- AEC algorithm: A heuristic approach to calculating density-based clustering eps parameter
- Augmented deliberation: Merging physical and virtual interaction to engage communities in urban planning. New Media & Society
- Imagining Twitter as an imagined community. American Behavioral Scientist
- An exploration of social identity: The geography and politics of news-sharing communities in Twitter. Complexity
Cited by (50)
Using crowdsourcing images to assess visual quality of urban landscapes: A case study of Xiamen Island
2023, Ecological Indicators
Information propagation on cyber, relational and physical spaces about covid-19 vaccine: Using social media and splatial framework
2022, Computers, Environment and Urban Systems
Citation Excerpt: For example, one of our limitations is that our analysis only includes cyber, relational and physical spaces while excluding mental and relative spaces. As noted above (Section 2) this was done because relational space inferring networks between objects has more potential to bridge the cyber and physical spaces but a logical next step would be to extend this study to include mental and relative spaces thus allowing one to further clarify the relationships between hybrid spaces and give rise to a human-centered Splatial framework (Shaw & Sui, 2020; Croitoru et al., 2015). Furthermore, as with all studies that involve spatial boundaries, our study also suffers from the modifiable areal unit problem (Openshaw, 1981).
Tweeting the Laurentian Great Lakes: A community opinion analysis about Great Lakes areas as assessed through mentions on Twitter
2022, Journal of Great Lakes Research
Citation Excerpt: Collected metadata included timestamps, user screen names, the number of favorites and retweets generated by collected tweets, the permanent location of users if provided in their Twitter profile, and the geolocation of tweets tagged with location. In our sample only 3% of analyzed tweets included geolocation data; Croitoru et al. (2015) found a similarly low rate of geotagging of tweets. Searches for relevant tweets were conducted every seven days from April 13 through October 14, 2019.
Mapping relationships between mobile phone call activity and regional function using self-organizing map
2021, Computers, Environment and Urban Systems
When the storm is over: Sentiments, communities and information flow in the aftermath of Hurricane Dorian
2020, International Journal of Disaster Risk Reduction
Citation Excerpt: Virtual communities that spring up on social media platforms possess sufficient qualities that can be used for rapid assessment of post-disaster damage as well as follow the narratives of public discourse relating to a disaster [16]. For instance, Croitoru, Wayant, Crooks, Radzikowski, & Stefanidis [17] established a connection between cyber and physical space in the aftermath of a disaster, while Finau et al. [18] showed how communities of social media users were built around the hashtag #StrongerThanWinston following the 2016 Cyclone Winston disaster in Fiji. Aldrich & Meyer [19] highlighted the critical role of social networks in disaster survival and recovery.