Open Access 2022 | Original Paper | Book Chapter

Social Response to COVID-19 SMART Dashboard: Proposal for Case Study

Authors: Karenina Zaballa, Gabriela Fernandez, Carol Maione, Norbert Bonnici, Jarai Carter, Domenico Vito, Ming-Hsiang Tsou

Published in: Participative Urban Health and Healthy Aging in the Age of AI

Publisher: Springer International Publishing

Abstract

The COVID-19 pandemic took a toll on the world’s healthcare infrastructure as well as its social, economic, and psychological well-being. In particular, Italy’s unexpectedly high COVID-19 case and death rates from March to June 2020 captured headlines because of how quickly and severely the outbreak spread. Many governments are currently implementing measures to help contain and slow down the spread of COVID-19. The Social Response to COVID-19 SMART Dashboard was built by researchers at the Metabolism of Cities Living Lab, Center for Human Dynamics in the Mobile Age at San Diego State University, and Politecnico di Milano. The dashboard provides an aggregated view of what people in 10 Italian metropolitan cities (Milan, Venice, Turin, Bologna, Florence, Rome, Naples, Bari, Palermo, and Cagliari) tweeted during the pandemic by monitoring social media behavior in the north, center, south, and islands. It is also a geo-targeted search tool for Twitter messages that monitors the diffusion of information and changes in social behavior. It provides an automatic procedure that helps researchers to: associate tweets with geographic areas; filter noise, for example by removing redundant retweets and using machine learning methods to improve precision; analyze social media data from a spatiotemporal perspective; and visualize social media data in various ways, such as weekly trends, top URLs, top retweets, top mentions, and top hashtags. The Social Response to COVID-19 SMART Dashboard provides a useful tool for policy makers, city planners, research organizations, and health officials to monitor real-time societal perceptions using social media.

1 Introduction

Social media is more ubiquitous than ever, which makes it a useful tool for staying connected during the pandemic. Using automatic data processing for Twitter messages, the Social Response to COVID-19 SMART (Social Media Analytic and Research Testbed) Dashboard helps researchers search Tweets in different cities, filter noise (for example, by removing redundant retweets and using machine learning methods to improve precision), analyze social media data from a spatiotemporal perspective, and visualize social media data in various ways (such as weekly and monthly trends, top URLs, top retweets, top mentions, or top hashtags). The Dashboard uses multiple data mining programs, GIS methods, and advanced geo-targeted social media APIs to track selected topics in space and over time. Searching, processing, and visualizing social media messages from the Twitter Standard Search application programming interface (API) involves multiple components. The filtered statistics for the focus topics and geo-targeted cities are represented visually in the SMART Dashboard.

The daily, near-live monitoring capability of the Dashboard has great potential for local, state, and public health agencies and practitioners to integrate real-time information when investigating large-scale disease outbreaks. For example, the Social Response to COVID-19 SMART Dashboard can be used to study sentiments on COVID-19 and vaccines in Italian cities in response to new policy mandates and curfews. Because the Dashboard captures the temporal and spatial nature of COVID-related policies, behaviors, beliefs, and sentiments through Twitter content, revealing trends across diverse geographic areas, community leaders can use the tool to connect closely with their constituents and mitigate social issues before they become full-blown movements. Another potential use is to monitor public opinion on crisis events such as the COVID-19 outbreak. The Dashboard visualizes, in real time, the most popular media shared on Twitter about the COVID-19 pandemic.

2 Literature Review

To provide more background on the Dashboard, the following areas are discussed in detail: the impact of COVID-19 on the 10 metropolitan cities in Italy, to establish the geographical and temporal constraints; social media analytics and their use cases; and SMART Dashboard 2.0, to outline the history of the dashboard.

2.1 Impact of COVID-19 on the 10 Italian Cities

The COVID-19 pandemic turned the once tourist-filled cities of Italy into ghost towns due to quarantine measures. COVID-19 is caused by the coronavirus SARS-CoV-2 (severe acute respiratory syndrome coronavirus 2), and it presents with symptoms that include “fever or chills, cough, shortness of breath, difficulty in breathing, fatigue or tiredness, muscle or body aches, headaches, new loss of taste or smell, sore throat, congestion or runny nose, nausea or vomiting, and diarrhea” [12]. At the beginning of March 2020, Italy came to the forefront of world health news because of its rapidly rising COVID-19 cases and deaths and because it was the first country outside of Asia to record such high numbers.

To provide more background, Italy’s first confirmed COVID-19 case was reported in the Province of Lodi, Lombardy region, on February 20th, 2020 [11]. The next day, the first COVID-19-related death in Italy and in all of Europe was announced in the province of Padova, Veneto region [1]. Because Italy’s population is aging and older residents are more likely to have comorbidities, the majority of residents are at risk for the disease [2].

Other Twitter dashboard studies have focused on real-time Twitter trend analysis using big data analytics and machine learning techniques [3]. For instance, Garg and Kaur [4] analyzed Twitter data using components of the Cloudera distribution of Hadoop. The study’s objective was to assign a polarity to each tweet; the MapReduce and Apache Spark frameworks were used for sentiment analysis, and the results showed that Apache Spark performed better than MapReduce. Saad and Yang [5] performed sentiment analysis of Twitter data using ordinal regression, while Ahmed and Rodríguez-Díaz [6] performed sentiment analysis on online customer reviews as a form of visualization. Rathod and Barot [7] predicted public opinion on ongoing events by analyzing tweet sentiment with machine learning classifiers such as SVM, Naive Bayes, a logistic classifier, and KNN. SVM was found to be the best classifier, with the lowest mean squared error. Finally, Garg et al. [8] identified trending patterns on Twitter by collecting tweets in real time and identifying trending hashtags at the same time, implemented with the big data technology Spark Streaming. This type of technique can help governments or companies learn more about behaviors and trends related to a given campaign or program, brand or product awareness, and customer needs.

The time frame chosen to provide a proof of concept for the Dashboard is March 3rd to June 25th, 2020. This period is divided into Phases 0, 1, and 2. Phase 0 ran from the report of the first case until the start of the lockdown. Phase 1, the lockdown phase, started on March 11th, 2020 and ended on May 4th, 2020 [9]. Phase 2 lasted from May 4th, 2020 until June 3rd, 2020 [2].

Phase 1 was marked by increased restrictions in Milan in response to the pandemic. Specifically, educational institutions and cultural centers were closed, and religious events and all other events and places that required gathering were prohibited [9]. This included professional sporting events. Visits to family and relatives were prohibited, as was patronizing bars and restaurants. Dining establishments were allowed to offer takeout with limited hours. Face masks were required in all public spaces, indoors and outdoors. In addition, the government required residents to fill out a self-certification form and keep it on their person whenever they left their homes, which enabled contact tracing measures [9]. The lockdown tightened further when the military was ordered to keep lockdown measures in place. Due to travel restrictions, no airports were open for use. The only travel of any kind allowed was to the grocery store, pharmacy, or hospital. Phase 2 then marked the easing of the Phase 1 restrictions. Businesses opened without limits on their hours of operation [9]. Some airports opened, enabling reduced international travel. Public parks also opened, as did public transportation with reduced capacity [9].

2.2 Metropolitan Italian Cities

For this study, major Italian metropolitan cities were explored to understand the interconnections between geographical location, number of COVID-19 cases, social response to the pandemic, and locally enforced measures, based on Twitter data. Table 1 shows the 10 cities that were selected across Italy. These cities were chosen by the Crowdfight International Team, a multidisciplinary research group, based on economic and cultural factors. Since the outbreak started in the North, the team decided to start there, and other cities were added over time. Milan and Turin were chosen to represent the Northwestern region. Venice and Bologna were chosen for the Northeastern region. Florence and Rome represented the Central region. Naples and Bari represented the Southern region. Palermo and Cagliari represented the Islands.

Table 1.

Socio-demographics for the 10 Italian metropolitan cities [13]

Type	City	Region (NUTS-3)	Area (km²)	Population (n)	Radius surveyed, km (miles)	Latitude	Longitude
North West	Milan	Lombardy	1,575	3,190,3405	38 (24)	45.4642	9.1899
	Turin	Piedmont	6,829	2,293,340	59 (37)	45.0703	7.6868
North East	Venice	Veneto	2,462	858,455	56 (35)	45.4408	12.3155
	Bologna	Emilia Romagna	3,702	1,005,831	9 (6)	44.4949	11.3426
Center	Florence	Tuscany	3,514	1,007,435	56 (35)	43.7696	11.2558
	Rome	Lazio	5,352	4,336,915	67 (42)	41.9027	12.4963
South	Naples	Campania	1,171	3,128,702	45 (28)	40.8517	14.2681
	Bari	Apulia	3,821	1,251,004	48 (30)	41.1171	16.8719
Islands	Cagliari	Sardinia	1,248	431,302	37 (23)	39.2238	9.1217
	Palermo	Sicily	5,009	1,276,52	11 (7)	38.1157	13.3615
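
Table 1 pairs each city with a center point and a search radius. As a minimal sketch of how a coordinate could be tested against such a circle, the following assumes great-circle (haversine) distance and treats the first value in the radius column as kilometres. The source does not state how the Twitter API or the Dashboard computes distance, and the `CITIES` dictionary and function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Center (lat, lon) and radius in km taken from Table 1; the layout is hypothetical.
CITIES = {
    "Milan": (45.4642, 9.1899, 38),
    "Rome": (41.9027, 12.4963, 67),
}


def cities_containing(lat, lon, cities=CITIES):
    """Return the names of all cities whose search circle contains the point."""
    return [
        name
        for name, (clat, clon, radius_km) in cities.items()
        if haversine_km(lat, lon, clat, clon) <= radius_km
    ]


print(cities_containing(45.4642, 9.1899))  # the Milan center itself
print(cities_containing(45.0703, 7.6868))  # the Turin center, outside both circles
```

A point can fall inside more than one circle when radii overlap, which is why the function returns a list rather than a single city.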

2.3 SMART Dashboard

The idea of the “prototype created by the Center for Human Dynamics in the Mobile Age at SDSU was to facilitate the rapid dissemination of official alerts and warnings notifications from OES during disaster events via multiple social media channels to targeted demographics” [15]. The platform can identify and recruit the top 1,000 social media volunteers based on their social network influence factors and can help government agencies communicate more effectively with the public [14].

In our study, this same Dashboard was refitted to 10 metropolitan cities in Italy, specifically cities in the north, center, south, and islands [14]. The backend was improved and mounted on larger servers.

3 Methodology

3.1 Data Collection

To provide the analysis, the team began by collecting Twitter data through the Twitter Standard Search API. This involved creating a Twitter Developer account and requesting access tokens and keys, followed by authentication of those keys. The API allows specific metadata to be collected, so the researchers were free to choose which fields to use for the study. Table 2 shows the keywords that were used to harvest the Tweets. These were chosen by the Crowdfight International Team in partnership with the Metabolism of Cities Living Lab under the Center for Human Dynamics in the Mobile Age (HDMA), after discussions with Italian colleagues as well as medical professionals. Keywords were selected for their popularity, as judged by hashtag use and word of mouth.

Table 2.

Social Response to COVID-19 SMART Dashboard selected keywords

ID	Italian keywords	English keywords	Description
1	distanziamento sociale	Social distancing	Refers to the rule of being at least 6 feet apart in public and private spaces to decrease the spread of COVID-19
2	positivi	Tested positive	For COVID-19 virus
3	Stay at home	Stay at home	Refers to the measure used by governments to decrease spread of COVID-19
4	vaccino	Vaccine	self-explanatory
5	Coronavirus/COVID/COVID-19	Coronavirus/COVID/COVID-19	Caused by a coronavirus called SARS-CoV-2; the cause of the pandemic
6	sintomi	Symptoms	Symptoms of COVID-19
7	mascherine	Masks	Medical masks used to prevent spread of COVID-19
8	quarantena	Quarantine	Measure to reduce spread of COVID-19
9	amuchina	Hand sanitizer	Hand sanitizer (slang); may also refer to a brand of bleach and brand of sanitizing products
10	Giuseppe Conte	Giuseppe Conte	Prime Minister of Italy (former)
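
To illustrate how the keywords in Table 2 and the city circles in Table 1 could be combined into one geo-targeted request, here is a minimal sketch that only assembles query parameters and makes no network call. The parameter names `q`, `geocode`, `count`, and `result_type` are those of the Twitter Standard Search API; the function name and the choice to join keywords with `OR` are assumptions, since the source does not describe how the Dashboard's Python collector builds its queries.

```python
def build_search_params(keywords, lat, lon, radius_km, count=100):
    """Assemble query parameters for one geo-targeted keyword search."""
    # Quote multi-word phrases so they are matched as phrases; join terms with OR.
    terms = [f'"{k}"' if " " in k else k for k in keywords]
    return {
        "q": " OR ".join(terms),
        # The geocode parameter has the form "latitude,longitude,radius" with km or mi units.
        "geocode": f"{lat},{lon},{radius_km}km",
        "count": count,
        "result_type": "recent",
    }


# Three keywords from Table 2 and the Milan circle from Table 1.
params = build_search_params(
    ["distanziamento sociale", "mascherine", "quarantena"], 45.4642, 9.1899, 38
)
print(params["q"])
print(params["geocode"])
```

Keeping parameter construction separate from the authenticated request makes the query logic easy to check without API credentials.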

3.2 Data Analysis

In order to understand how the data is analyzed it is important to understand the client and server framework in Fig. 1 below.

The server side of the Dashboard is explained below. Because social media data tends to be unstructured, a NoSQL database, specifically MongoDB, was used [10]. The Twitter Search Engine, coded in Python, was used to specify keywords and time period and to automate collection [10]. The web server is written in Node.js so that there is no need to switch to another server-side language to implement the server [10]. This choice also allows JavaScript and Node modules to be used to expand functionality. Using Node.js for the server also made REST API creation easier, since the API is built with Node.js as well [10]. The client side of the framework is built on HTML5 (HyperText Markup Language 5), JavaScript (JS), and CSS3 (Cascading Style Sheets, Version 3) as the base. On top of this base are the JavaScript libraries listed in Table 3.

Table 3.

JavaScript libraries in the client side [10]

JS library	Utility
jQuery	Easily handles HTML document traversal and manipulation, event handling, animation and AJAX; the API can also be used across multiple browsers
Bootstrap	Well-known framework for HTML, CSS, and JS, creating responsive projects on the web
jQuery MD5	Hashes passwords with MD5 (a hash function rather than encryption)
Leaflet	Used for maps
D3.JS	Used for Word Cloud section
Dygraph	Used for line chart in the Trend Section
Morris.js	Used for bar chart in the Trend and Word Cloud Sections
dataTable	Displays results in table format with filter and sort functions
Moment.js	Formats date and time
Twitter_widget.js	Permits Tweets to be displayed in Twitter style; used in Top Retweets
OWL Carousel	Displays images in a carousel format; used in Top Media

3.3 Dashboard Features

Thanks to the flexibility of the original SMART Dashboard 2.0, the Social Response to COVID-19 Dashboard was created by first changing the geographic targets for tweets during data collection, then changing the keywords and filtering out specific links that may be deemed inappropriate or unrelated to the cause. Each section of the COVID Dashboard is discussed below.

The first components that the user sees are on the screen in Fig. 2 below, which contains the Dashboard Toolbar on the far left, the SMART Index at the top, and the Trend and Top Media sections below the SMART Index. It also houses the “Stop Auto Refresh” button, which lets researchers stop the feed and conduct analyses.

The SMART Dashboard 2.0 Toolbar, on the far left, contains shortcuts to each component of the Dashboard. It also houses the keywords used to extract the Tweets. In addition, it contains the “Download” button for accessing the data in the dashboard, as well as the Privacy Policy and Feedback buttons. The “Home” button enables the selection of keywords and the filtering of Tweets that may be inappropriate or that add noise to the findings. The toolbar also allows keywords to be selected simply by checking and unchecking the desired keywords.

The SMART Index, which consists of the four multi-colored blocks across the top, shows the most current metrics as of the last 10-min refresh. The blocks are discussed from left to right. The first block (blue) shows, from top to bottom, the number of Tweets harvested within the past hour, the date they were extracted, and the distribution of the times at which the Tweets were extracted. The second block (green) shows the number of Tweets extracted in the past 24 h, the current date, and the distribution of the Tweets over time. The third block (yellow) contains the number of Tweets since the day before the current date, along with their distribution across the day before and the current date. The fourth block (pink/salmon) shows the number of Tweets since the beginning of collection and their distribution from the beginning of the Tweet harvest until the current date.

The Trend Section shows the frequency of Tweets generated by the keywords over time through a series of line graphs. Users can hover over any section of the graph to see the number of Tweets, both filtered and unfiltered, in that time frame. Any point on the line can be clicked to show the Tweets at the selected point in time. In addition, the tabs at the top change the time frame. As Fig. 2 shows, users can view Twitter metrics at 10-min, hourly, daily, weekly, and monthly resolution, which shrinks the graph towards the left. The sliding scale at the bottom can also change the span of the timeline shown in the graph.
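
The four SMART Index counters described above can be sketched as simple time-window counts. This is an illustration only: the function name is hypothetical, all timestamps are assumed to be no later than `now`, and "since the day before the current date" is interpreted here as "since midnight yesterday", which the source does not confirm.

```python
from datetime import datetime, timedelta


def smart_index(timestamps, now):
    """Count tweets in the four windows shown by the SMART Index blocks."""
    start_of_yesterday = (now - timedelta(days=1)).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return {
        "past_hour": sum(t > now - timedelta(hours=1) for t in timestamps),
        "past_24h": sum(t > now - timedelta(hours=24) for t in timestamps),
        "since_yesterday": sum(t >= start_of_yesterday for t in timestamps),
        "total": len(timestamps),
    }


now = datetime(2020, 3, 12, 12, 0)
stamps = [
    datetime(2020, 3, 12, 11, 30),  # within the past hour
    datetime(2020, 3, 12, 2, 0),    # today, within the past 24 h
    datetime(2020, 3, 11, 8, 0),    # yesterday, more than 24 h ago
    datetime(2020, 3, 5, 9, 0),     # older than yesterday
]
print(smart_index(stamps, now))
# {'past_hour': 1, 'past_24h': 2, 'since_yesterday': 3, 'total': 4}
```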

The Top Media Section, on the lower left of Fig. 2, shows the most shared images posted within the time frame. The user can change the time frame to show all media from the beginning of extraction, from the past week, from the past day, or from the current date.

The Top URL Section shows the most posted links or web pages within the time frame (see Fig. 4). The user can change the time frame to show all URLs from the beginning of extraction, from the past week, from the past day, or from the current date.

A unique feature of the Social Response to COVID-19 Dashboard is the Word Cloud Section in Fig. 3. It includes a word cloud and a table of the most frequent vocabulary words within the selected time period. The word cloud contains the most frequent words in the corpus for any chosen time period. Larger words are more frequent, while words in smaller fonts are less frequent. Word clouds are an intuitive, decorative, and convenient way to see the most common keywords in a corpus. Future development of the word cloud could include removing stopwords, that is, words used so commonly that they add little to no value to the visualization. In English, for example, these include articles and prepositions such as “the” and “into.” Stopword removal requires selecting a particular language, however, and harvesting geo-tagged Tweets does not guarantee one specific language.
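
The stopword idea can be sketched as a word count that skips a set of common words. This is not the Dashboard's implementation (the word cloud there is drawn client-side with D3.js); the function name and the tiny mixed English and Italian stopword set are illustrative assumptions, and a real list would be far longer and language-specific.

```python
import re
from collections import Counter

# Tiny illustrative stopword set mixing English and Italian; an assumption.
STOPWORDS = {"the", "into", "a", "of", "il", "la", "di", "e", "in"}


def word_frequencies(texts, stopwords=STOPWORDS, top_n=10):
    """Return the top_n most frequent non-stopword tokens as (word, count) pairs."""
    counts = Counter()
    for text in texts:
        # \w+ matches Unicode word characters, so accented Italian letters are kept.
        for word in re.findall(r"\w+", text.lower()):
            if word not in stopwords:
                counts[word] += 1
    return counts.most_common(top_n)


tweets = [
    "La quarantena in Italia",
    "Quarantena e mascherine",
    "The masks of Milan",
]
print(word_frequencies(tweets, top_n=3))
```

The resulting counts are the quantity a word cloud would map to font size and a frequency table would show as bar length.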

In addition, the Vocabulary Frequency Table shows the most frequent words in the corpus in the selected time frame. The information is presented as a bar chart arranged from most frequent to least frequent.

Another unique feature of the Dashboard is the Tweets in Cities section, which shows tweeting rates normalized by city population within the selected time period. Basemaps can be changed to the user’s preference. In our example in Fig. 4, the map is centered on Milan, Italy, and the selected time period is all Tweets since the beginning of the extraction. Other options include the past week, the past day, and the current date. Researchers may also find it useful that the section visualizes where the tweets were collected, together with the collection radius and other statistics such as the total number of Tweets collected in the selected time period and the latest population information that the API can find.
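
Normalizing by population makes cities of different sizes comparable. A minimal sketch follows, assuming the rate is expressed per 100,000 residents; the source does not state the scaling the Dashboard uses, and the tweet counts below are invented for illustration, while the populations come from Table 1.

```python
def tweets_per_100k(tweet_count, population):
    """Tweets per 100,000 residents; the scaling constant is an assumption."""
    if population <= 0:
        raise ValueError("population must be positive")
    return tweet_count / population * 100_000


# (invented tweet count, population from Table 1)
examples = {
    "Rome": (8_674, 4_336_915),
    "Cagliari": (1_294, 431_302),
}
for city, (count, population) in examples.items():
    print(city, round(tweets_per_100k(count, population), 1))
```

In this made-up example Cagliari has far fewer tweets than Rome but a higher normalized rate, which is the kind of difference raw counts would hide.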

The most common Retweets from the selected time period are displayed in the Top Retweets section. As in the other sections, users can select the time period they want: all Tweets since the beginning of extraction, the past week, the day before the current date, or the current date. Each Retweet has its frequency next to it. Retweets are important because they are quantifiable measures of influence. They can also heavily skew a corpus when a study does not restrict itself to original Tweets.

The Top Mentions section shows the most frequent user references (beginning with ‘@’) in the selected time period. This section is notable because mentions are quantifiable measures of reference: they show the frequency of interaction between Twitter users within the collected corpus. The corresponding frequency is displayed next to each user who was mentioned. Users can select the same time periods as in the Top Retweets section.

The Top Hashtags, which refer to an idea or theme of a tweet, are shown below the Top Mentions section. Users create hashtags with the pound sign (‘#’) to refer to certain movements. Like mentions, hashtags are quantifiable measures of reference and of the level of interaction between users and themes. The corresponding frequency is displayed next to each frequent hashtag in the time period, and the same time period options apply.

The last section shows the Geocode Status of the Tweets collected in the selected time period. This is meaningful for researchers because it gives context to the successfully geocoded tweets in the corpus. It can give insight into error rates, so that future experiment parameters can be adjusted accordingly. Corresponding counts and percentages are displayed next to each status (Fig. 5).
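
The Top Mentions and Top Hashtags counts can be sketched with a regular expression and a counter. This is an assumption-laden illustration rather than the Dashboard's method: the Twitter API also returns parsed entities with each Tweet, which a production system would more likely use, and the handles and tweet texts below are made up.

```python
import re
from collections import Counter

HASHTAG = re.compile(r"#(\w+)")
MENTION = re.compile(r"@(\w+)")


def top_entities(texts, pattern, top_n=5):
    """Count pattern matches across texts and return the top_n (entity, count) pairs."""
    counts = Counter()
    for text in texts:
        # Lower-case so that #COVID19 and #covid19 count as one tag (an assumption).
        counts.update(match.lower() for match in pattern.findall(text))
    return counts.most_common(top_n)


# Invented example tweets with hypothetical user handles.
tweets = [
    "#COVID19 restiamo a casa @utente_a",
    "Nuove regole #covid19 #quarantena",
    "@utente_a grazie #covid19",
]
print(top_entities(tweets, HASHTAG))
print(top_entities(tweets, MENTION))
```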

4 Discussion and Future Work

This type of dashboard is successful in filtering certain websites and content, and its unique combination of visualizations increases the potential of the tool to be used in many different settings. For the purposes of social response to COVID-19, it allows policymakers to understand the current behaviors of society and can be used to observe public opinion during and after crisis events or disease outbreaks. The SMART Dashboard is available for use to support response and assistance efforts during the pandemic. Real-time public health information and major events captured using social media are now at the forefront of behavioral measurement, disease surveillance, health promotion, and more. Different cities and regions may reveal different patterns of social media messages and trends. By analyzing the context of social media messages and linking place and time together, we can discover more meaningful patterns and insights, depending on the goals of the study of disease outbreaks and social media activities.

Having expounded on the Dashboard’s capabilities, it is useful to note that the limitations of this study depend on the Twitter Standard Search API, the capacity of the server to store data, the extraction parameters used in data collection, and the specific keywords used in the study. With the constant sharing of ideas online, it is impossible to capture the totality of themes online. In addition, the natural language processing behind the word cloud can be improved by implementing specific stopwords, so that it surfaces meaningful keywords rather than articles and prepositions. Further work can make these techniques, including stopword handling, language-agnostic.

Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

References

1. Alicandro, G., Remuzzi, G., La Vecchia, C.: Italy’s first wave of the COVID-19 pandemic has ended: no excess mortality in May 2020. Lancet 396(1023), E27–E28 (2020). https://doi.org/10.1016/S0140-6736(20)31865-1

2. Indolfi, C., Spaccarotella, C.: The outbreak of COVID-19 in Italy: fighting the pandemic. JACC 2(9), 1414–1418 (2020) (case reports). https://doi.org/10.1016/j.jaccas.2020.03.012

3. Rodrigues, A., Fernandes, R.P., Bhandary, A., Shenoy, C.A., Shetty, A., Anisha, M.: Real-time twitter trend analysis using big data analytics and machine learning techniques. Wireless Communications and Mobile Computing, Hindawi (2021). https://doi.org/10.1155/2021/3920325

4. Garg, K., Kaur, D.: Sentiment analysis on Twitter data using Apache Hadoop and performance evaluation on Hadoop MapReduce and Apache Spark, pp. 233–238. The Steering Committee of The World Congress in Computer Science, Computer Engineering and Applied Computing (WorldComp)

5. Saad, S.E., Yang, J.: Twitter sentiment analysis based on ordinal regression. IEEE Access 7, 163677–163685 (2019)

6. Ahmed, Z., Rodríguez-Díaz, M.: Significant labels in sentiment analysis of online customer reviews of airlines. Sustainability 12(20), 1–18 (2020)

7. Rathod, T., Barot, M.: Trend analysis on Twitter for predicting public opinion on ongoing events. Int. J. Comput. Appl. 180(26), 13–17 (2018)

8. Garg, P., Johari, R., Kumar, H., Bhatia, R.: Trending pattern analysis of Twitter using spark streaming. In: International Conference on Application of Computing and Communication Technologies, pp. 3–13. Springer, Singapore (2018)

9. Minero, G.: Lockdown in Italy, Where Milan (2020). https://www.wheremilan.com/tips/lockdown-in-italy-phase-2/

10. Jung, C.T.: SMART Dashboard Technical Document. The Center for Human Dynamics in the Mobile Age, San Diego State University (2016)

11. Godin, M.: Why Is Italy's Coronavirus Outbreak So Bad?, Time, March 10 2020. https://time.com/5799586/italy-coronavirus-outbreak/

12. Center for Disease Control and Prevention: Symptoms of COVID-19. US Department of Health and Human Services (2021). https://www.cdc.gov/coronavirus/2019-ncov/symptoms-testing/symptoms.html

13. Italian National Institute of Statistics (ISTAT): Popolazione residente per età, sesso e stato civile al 1° Gennaio 2019. ISTAT, Rome (2020). http://demo.istat.it/pop2019/index.html

14. The Center for Human Dynamics in the Mobile Age San Diego State University: SMART Dashboard 2.0 (2021). https://humandynamics.sdsu.edu/SMART2.0.html

15. Zaballa, K., Yang, H.: Social Response to COVID-19 in Italy (Website). The Center for Human Dynamics in the Mobile Age (2020). https://storymaps.arcgis.com/stories/74c499d5ac0a46ffbbc2b28acfa05102

Print ISBN: 978-3-031-09592-4

Electronic ISBN: 978-3-031-09593-1

Copyright year: 2022
DOI: https://doi.org/10.1007/978-3-031-09593-1_12