Published in: Social Network Analysis and Mining 1/2023

Open Access 01.12.2023 | Original Article

Assessment of text-generated supply chain risks considering news and social media during disruptive events

Written by: Soumik Nafis Sadeek, Shinya Hanaoka


Abstract

Information flow is an important task in a supply chain network. Disruptive events often impede this flow due to confounding factors, which may not be identified immediately. The objective of this study is to assess supply chain risks by detecting significant risks, examining risk variations across different time phases and establishing risk sentiment relationships utilizing textual data. We examined two disruptive events—coronavirus disease 2019 (Omicron phase) and the Ukraine–Russia war—between November 2021 and April 2022. Data sources included news media and Twitter. The Latent Dirichlet Allocation algorithm was applied to the textual data to extract potential text-generated risks in the form of “topics.” A proportion of these risks were analyzed to assess their time-varying nature. Natural language processing-based sentiment analysis was applied to these risks to infer the sentiment coming from the media using the ordered probit model. The results identify various unnoticed risks, for example: logistics tension, supply chain resiliency, ripple effect, regional supply chain, etc. that may adversely affect supply chain operations if not considered. The outcomes also indicate that textual data sources are capable of capturing risks before the events actually occur. The outcomes further suggest that text data could be valuable for strategic decision making and improving supply chain visibility.
Notes

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

1 Introduction

Supply Chain Risk Management (SCRM) has been emerging as one of the significant fields in the supply chain domain. It focuses on an organization’s goals and achievements and prepares against any potential risks from disruption, such as global pandemics (e.g., coronavirus disease 2019) and conflicts (e.g., the Ukraine–Russia war), that result in supply disruptions and demand fluctuation. An efficient risk management mechanism begins with an understanding of how various risk sources and their consequences interact with daily supply chain operations, which allows businesses to identify critical issues and respond to them accordingly (Chopra and Sodhi 2014; Stanton 2020).
SCRM is important because disruptions cascade through supply chain networks and expose their vulnerabilities (Cigolini and Rossi 2010). For instance, the coronavirus disease 2019 (COVID-19) pandemic has disrupted regular supply chain mechanisms and hampered global economies through various measured and latent risks (Caniato and Rice 2003). COVID-19 is a highly infectious respiratory disease that has posed a great threat to the general public, and many governments have taken stringent measures to prevent its spread. Globally, supply chains have been facing major disruptions and new demands due to cross-country lockdowns and the emergence of new COVID-19 variants (Delta, Omicron, XE, etc.) (Zhu et al. 2020). Recently, the transition from a reactive to a proactive approach to supply chain risks has gained the attention of researchers and supply chain managers due to the global pandemic (Toorajipour et al. 2021). The coronavirus pandemic has had a clear impact on the supply chains of manufacturers, wholesalers, vendors and retailers. Existing frameworks have not been able to adapt to these disruptions, and organizations have lacked appropriate strategies to deal with them (Sharma et al. 2020). COVID-19 has drastically changed regular global supply chain operations in terms of supply shortages, a lack of just-in-time practices, port congestion and inadequate safety stocks. These problems have largely been triggered by a lack of information disclosure and transparency regarding various supply chain risks (Zhu et al. 2020). In addition, the outbreak of the Ukraine–Russia war has introduced an additional risk to the global supply chain, affecting the global economy, policy and business (Pereira et al. 2022) and causing an alarming humanitarian situation. Furthermore, it has perturbed supply chain operations and has had devastating impacts on food security, agricultural activities and infrastructure (FAO 2022).
The ever-increasing use of digital texts creates opportunities for knowledge extraction, if these texts are properly analyzed using relevant computational methods (Gentzkow et al. 2019). Automated textual analysis substantially reduces the costs of analyzing large corpora of text collections and enables the exploration of new research areas (Grimmer and Stewart 2013). Examples of textual analysis can be seen in stock market analysis (Born et al. 2014; Tetlock 2007), corporate finance (Rajan et al. 2022) and economic policy intervention (Baker et al. 2016).
In the supply chain domain, news-based text data are used in apparel industries (Shah et al. 2021). In addition, Zhou et al. (2021) extracted significant information about risks from earlier research papers in various supply chain domains. Moreover, Swain and Cao (2019) indicated that organizations sharing relevant information on their social media improves individual and overall supply chain performance.
From the above discussion, it is clear that disruptive events have hampered regular supply chain operations, especially where unprecedented factors are involved. News media and people on social media report and share thoughts and experiences in the form of digital texts; thus, textual data could be useful in proactively identifying risks. Therefore, this study highlights the implementation of a framework that could improve supply chain risk management, particularly by identifying contemporary supply chain risks during disruptive events. This study aims to assess supply chain risks during COVID-19 (Omicron phase) and the Ukraine–Russia War. The objectives are i) to detect and assess text-generated supply chain risks from news and social media by analyzing time-varying trends, and ii) to identify the significant supply chain risks and establish risk sentiment synergy.

2 Literature review

Supply chain researchers have been using textual data for more than a decade, but the quantity and popularity of such studies have largely been limited. However, due to computational advancements, this cross-field has recently been gaining momentum. Recent studies on the following topics have been conducted through textual analysis: demand forecasting accuracy (Cui et al. 2018; Wang et al. 2019); inventory performance (Wood et al. 2017); customer preference identification (Mishra et al. 2017); product development idea generation (Chirumalla et al. 2018; Guo et al. 2017; Zhang et al. 2019); distribution and delivery (Cherrett et al. 2015; Kirac and Milburn 2018); reverse logistics (Minnema et al. 2016); business risk identification (Narayanaswami 2018; Zhang et al. 2018); and risk information distribution during disasters and emergencies (Kumar and Singh 2019; Nisar and Prabhakar 2018; Onorati and Díaz 2016). In the following paragraphs, we discuss the text analyses performed in these studies in regard to supply chain risks, risk sentiment and similar issues studied during COVID-19.
When SCRM was slow to adopt textual data in research and practice, Chae (2015) conducted a comprehensive study on Twitter data in the context of supply chain management; in this study, a framework combining text-based content analysis, sentiment analysis and network analysis was proposed. To discover risks and uncertainty in complex supplier selection in the supply chain, Su and Chen (2018) proposed a framework called “Twitter Enabled Supplier Status Assessment,” which was built upon recurrent neural networks and Word2Vec to classify risks and capture word contexts, respectively. This framework assists in selecting global suppliers from textual data, and it helps to identify potential risks and develop negotiation criteria with suppliers.
Akundi et al. (2018) used Twitter to obtain real-time information about product supply chains, to understand the rapidly changing market and to obtain information regarding the public’s sentiments toward smartphone brands such as Apple, Samsung and Huawei using association rule mining. Schmidt et al. (2020) addressed the relationship between supply chain glitches and stock market returns by leveraging Twitter responses at the firm level and by leveraging sentiment reaction. Moreover, Singh et al. (2018) proposed a framework comprising textual data from Twitter to identify risks in the food supply chain through classification using the Support Vector Machine and clustering using a bootstrapped hierarchy. Huang et al. (2020) extended research to the dynamic behavior of the supply chain using online reviews to identify the bullwhip effect and inventory variance amplification. They used data from customers and one e-commerce retailer. Their results indicated that online review adoption increased both the bullwhip effect and amplification. Wichmann et al. (2020) used textual information to enhance supply chain visibility among buyers and suppliers. To tackle unprecedented risks due to increasing supply chain complexity, their study mapped supply chain risks using deep learning and natural language processing (NLP). A similar study was conducted by Sheikhattar et al. (2022), in which they posited that textual data was complementary to building a decision supply system. In addition, Ying et al. (2020) used textual data sourced from banks’ evaluation and approval reports to unveil potential transparency risks in supply chain finance. Teo (2020) forecasted supply chain demand to reduce bullwhip risk by using textual information. They proposed a framework using a deep learning model and NLP to extract information from long text documents. Likewise, Yao et al. (2022) used textual data from financial websites of Chinese pharmaceutical enterprises to predict enterprise credit risk in the supply chain. Using NLP techniques, sentiment features were constructed and compared with traditional features to predict credit risks. Wood et al. (2015) combined textual data, sentiment analysis and big data analytics to leverage text data as a real-time information provider of market demand to the upstream firms of a supply chain network. Swain and Cao (2019) analyzed social media comments posted by various supply chain members who produced opinionated content and performed both opinion mining and sentiment analysis. Their results affirmed that the performance of a supply chain operation is closely related to information sharing among its members.
Sharma et al. (2020) gathered Twitter data from NASDAQ 100 firms to understand the risks they were facing during COVID-19 in supply chain operations. Their study identified the following as critical risks: supply–demand mismatch, a lack of technological enhancement, a lack of supply chain resiliency and difficulties in ensuring sustainability in the supply chain. Meyer et al. (2021) identified that news-based texts could uncover topics such as risk, resilience, sustainability and disruptions. Moreover, Wu et al. (2022) attempted to reconfigure the supply chain model using textual data from social media to understand bullwhip effects and possible disruption during COVID-19. The outcomes identified the gap between academic and public concerns and associated risks. Hirata and Matsuda (2021) extracted Twitter data to highlight the impact of COVID-19 on supply chains, specifically in regard to logistics and shipping. They mined the web to extract internet articles, and their results indicated that the recovery of the supply chain largely depends on China, an alarming finding in terms of global supply chain resiliency. Brandtner et al. (2021) analyzed textual data and customers’ rating data from the five biggest retail chains in Austria regarding the downstream supply chain (point of sale) during COVID-19. Their results indicated that negative sentiments were prevalent among consumers during the pandemic, specifically during periods of political regulation to curb infections. Ganesh and Kalpana (2022) attempted to detect supply chain risks early and in real time using textual data from the Twitter accounts of online supply chain platforms during the pandemic. Extracted risk factors included a shortage of semiconductors, port congestion, lead time increase and production shutdown.
As seen above, text mining could be a potential data source for extracting supply chain risks. However, the studies discussed above have some limitations, such as limited data handling in terms of text treatment, a lack of distribution-based textual data for inferential purposes, a scarcity of resources for news-based risk identification and policy making, and no account of risk propagation and time series textual information. Moreover, to our knowledge, there are no assessment-based studies that consider both the pandemic and the Ukraine–Russia war as possible supply chain risks in terms of sentiments. Therefore, this study attempts to fill this gap by considering uncertainty in textual data, identifying significant risks, integrating sentiments with risks and highlighting how news and Twitter reveal sentiments on a particular topic.

3 Materials and methods

3.1 Data Collection and preprocessing

This study used textual data from news media and Twitter to extract various global supply chain risks. Both of these sources offer real-time or nearly real-time textual data. News media publishes textual articles in an explicit and concise manner, including articles by experts from a particular field of interest and those that contain opinions and diverse information. Twitter offers direct consumer opinions.
News media were selected based on availability, free accessibility and institutional subscription. Based on these criteria, we selected four news outlets: Asian Times, BBC, Nikkei Asia and Reuters. Other than Nikkei Asia, all offered free access to their articles. The Tokyo Institute of Technology has a subscription to Nikkei Asia, allowing us to access its articles. We extracted Twitter data using the Twitter API for Academic Research (API Key and API Access Token), through which each request retrieved tweets from the preceding 7 days. For example, a request made on November 22 for a particular hashtag returned tweets with that hashtag from November 15 to November 22. We repeated the extraction every week to ensure continuity of data collection. We used R statistical software and the “twitteR” package (Gentry 2015) to extract the Twitter data.
Textual data from November 15, 2021, to April 30, 2022, were collected. This time frame covered both the day that Omicron was identified as a Variant of Interest (VOI) and the start date of the Ukraine–Russia war (February 24, 2022). In total, 673 articles were extracted from news media. The inclusion criterion for news articles was at least one occurrence of the bigram “supply chain” within the article. During the process of extracting news articles, the authors examined the geographical coverage and organizational scope reported in the supply chain-related articles. The majority of the articles featured countries, e.g., Japan (124), USA (116), China (64), Taiwan (62), South Korea (30), UK (38), Germany (22), Russia and Ukraine (13). The collected articles mainly focused on various topics such as automobile industries (58), transportation, automobile and semiconductor industries (94), business and finance (87), port congestion (13) and the Ukraine–Russia war (33), among others. For Twitter, we initially extracted 272,732 tweets based on the #supplychain hashtag. We then filtered out advertising and job post tweets. Later, we applied another filter containing the words “omicron,” “variants,” “covid,” “pandemic,” “Ukraine,” “Russia” and “logistics”; tweets containing any of these words were selected. After rigorous data cleanup, 5,372 tweets were included in the final analysis. While collecting data from Twitter, we encountered difficulties in obtaining geo-tagged location information for the tweets and were therefore unable to determine their locations accurately. However, a hashtag query typically returns tweets containing that hashtag from various locations worldwide. Hence, we can assume that our data covered a wide geographic area. This diversification across both news and social media potentially reduced bias in the analysis.
In addition, we followed a relevance (purposive) sampling technique to ensure that the collected text data fit the research objectives and are relevant to the subject domain. This technique is feasible for short time periods and small samples; for wider-scale applications, however, it could be costly and time-consuming (Anandarajan et al. 2019).
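The two-stage tweet filter described above can be sketched roughly as follows. The keyword list comes from the text; the exclusion patterns for advertising and job posts are hypothetical placeholders, since the study does not state how such tweets were recognized, and whole-word, case-insensitive matching is likewise an assumption.

```python
import re

# Keywords listed in the text; whole-word, case-insensitive matching is an assumption.
KEYWORDS = ["omicron", "variants", "covid", "pandemic", "ukraine", "russia", "logistics"]
KEYWORD_RE = re.compile(r"\b(" + "|".join(KEYWORDS) + r")\b", re.IGNORECASE)

# Hypothetical markers of advertising and job-post tweets (not taken from the study).
EXCLUDE_RE = re.compile(r"\b(hiring|job opening|apply now|sponsored)\b", re.IGNORECASE)


def keep_tweet(text):
    """Return True if a tweet passes both the exclusion and the keyword filter."""
    if EXCLUDE_RE.search(text):
        return False
    return KEYWORD_RE.search(text) is not None


def filter_tweets(tweets):
    """Keep only tweets that survive both filters, preserving their order."""
    return [t for t in tweets if keep_tweet(t)]
```

With whole-word matching, a tweet mentioning only "Russian" or "covid19" would be dropped; substring matching would be the more permissive alternative.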
Textual data requires pre-analysis treatment to avoid detecting the wrong information. In this study, treatment involved NLP techniques. For instance, we converted all text to lowercase, removed punctuation, removed tabs, removed numbers, removed relevant stop words and reduced prefixes and suffixes on words to enhance term aggregation. For more information on preprocessing, please refer to Kwartler (2017).
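As a minimal sketch of the preprocessing steps listed above (lowercasing, removal of punctuation, tabs and numbers, stop-word removal and affix reduction), the following uses only the Python standard library. The stop-word list and the suffix-stripping rule are illustrative assumptions; the study worked in R, and its exact stop-word list and stemmer are not given here.

```python
import re
import string

# Tiny illustrative stop-word list (assumption); a real run would use a fuller list.
STOP_WORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "are", "for", "on"}

# Crude suffix stripping as a stand-in for a proper stemmer (assumption).
SUFFIXES = ("ing", "ed", "es", "s")


def crude_stem(token):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Lowercase, drop ASCII punctuation and digits, remove stop words, stem."""
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    text = re.sub(r"\d+", " ", text)
    tokens = text.split()  # split() also collapses tabs and repeated spaces
    return [crude_stem(t) for t in tokens if t not in STOP_WORDS]
```

For example, `preprocess("Ports are\tcongested in 2022, delaying shipments!")` yields the aggregated terms `port`, `congest`, `delay` and `shipment`.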
To conduct the sentiment analysis of the topics detected by LDA, we used the NRC/Affect Intensity Lexicon library (Mohammad and Turney 2013; Mohammad 2021), also known as EmoLex. This is a dictionary-based NLP library that stores and regularly updates the sentiments and emotions associated with specific English words. The NRC library classifies sentiments as “positive” and “negative,” and emotions as “anger,” “anticipation,” “disgust,” “fear,” “surprise,” “joy,” “trust” and “sadness.” The NRC Emotion Lexicon has been widely used to create sentiment scores of documents and tweets in various research studies, including finance (Qian et al. 2022), tourism carrying capacity (Tokarchuk et al. 2022), disaster management (Navarro et al. 2023) and climate change (Upadhyaya et al. 2023). Supply chain issues span a diverse range of subjects, including economics, finance and trade, and the NRC Emotion Lexicon covers a large corpus of financial terms (McCarthy and Gita 2023). Therefore, we decided to use NRC for the current study. To get maximum word coverage, we decided to integrate emotions with sentiments and to keep only positive and negative sentiments. To do this, we summed the words detected in the positive, joy and trust categories as positive sentiment, and the words detected in the negative, anger, anticipation, disgust, fear, sadness and surprise categories as negative sentiment. For example, suppose an article from Nikkei Asia published on November 15, 2021, had the following word counts: positive = 28, negative = 8, anger = 3, anticipation = 9, disgust = 0, fear = 3, joy = 2, sadness = 1, surprise = 1 and trust = 27.
Then, we summed positive sentiments as positive + joy + trust = 28 + 2 + 27 = 57 and negative sentiments as negative + anger + anticipation + disgust + fear + sadness + surprise = 8 + 3 + 9 + 0 + 3 + 1 + 1 = 25. Therefore, the polarity of this document was (positive sentiment − negative sentiment) = 57 − 25 = 32, which indicates that the article was skewed toward positive sentiment. We conducted the sentiment analysis separately for the articles and the tweets. Following this approach, we calculated the sentiment polarity of each article and tweet, resulting in 673 distinct sentiment values for the articles and 5,372 sentiment values for the tweets. These sentiment values were then categorized into seven groups, enabling the regression of text-based risks on sentiments and providing sentiment patterns separately for news and social media. The seven categories are as follows: Extremely Positive (> +20); Moderately Positive (+11 to +20); Weakly Positive (+1 to +10); Neutral (0); Weakly Negative (−1 to −10); Moderately Negative (−11 to −20); and Extremely Negative (< −20). This categorization is applied in the ordered probit model described in Sect. 3.3.2.
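The polarity calculation and the seven-way categorization above can be reproduced with a short sketch. The category groupings and cut-offs are taken from the text; the function names and the dictionary layout of the counts are assumptions made for illustration.

```python
# Grouping of NRC categories into the two polarities, as described in the text.
POSITIVE_KEYS = ("positive", "joy", "trust")
NEGATIVE_KEYS = ("negative", "anger", "anticipation", "disgust", "fear", "sadness", "surprise")


def polarity(counts):
    """Polarity = summed positive-group counts minus summed negative-group counts."""
    pos = sum(counts.get(k, 0) for k in POSITIVE_KEYS)
    neg = sum(counts.get(k, 0) for k in NEGATIVE_KEYS)
    return pos - neg


def sentiment_level(score):
    """Map an integer polarity score to the ordinal levels -3..+3 used in the text."""
    if score > 20:
        return 3   # extremely positive
    if score >= 11:
        return 2   # moderately positive
    if score >= 1:
        return 1   # weakly positive
    if score == 0:
        return 0   # neutral
    if score >= -10:
        return -1  # weakly negative
    if score >= -20:
        return -2  # moderately negative
    return -3      # extremely negative


# Word counts from the worked example in the text.
example = {
    "positive": 28, "negative": 8, "anger": 3, "anticipation": 9, "disgust": 0,
    "fear": 3, "joy": 2, "sadness": 1, "surprise": 1, "trust": 27,
}
print(polarity(example), sentiment_level(polarity(example)))  # 32 3
```

The example article therefore falls into the extremely positive category, since its polarity of 32 exceeds +20.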

3.2 Basic framework

Initially, the Latent Dirichlet Allocation (LDA) algorithm, a text mining technique, was applied to extract specific topics (i.e., a combination of words indicating a unique topic), which we defined as risks for news and Twitter data, separately. Later, a proportion of each risk was extracted to understand how these specific risks represented the articles and tweets. A sentiment analysis was then conducted to extract the sentiment score of each of the articles and tweets using NRC sentiment library. Lastly, we used the ordered probit model to infer sentiments toward risks from the news and Twitter data.

3.3 Modeling methods

3.3.1 Latent Dirichlet Allocation algorithm

LDA was first proposed by Blei et al. (2003). LDA is the most common algorithm for topic modeling. The basic principle of LDA is that every document is a mixture of topics and that every topic is a mixture of words. LDA simultaneously estimates both; it identifies the mixture of words associated with a topic and determines the mixture of topics that represents a document.
Suppose there are \(T\) topics in total. Each topic in a given document can be viewed as being generated from a distribution. Let \(z_{td}\) be the \(t\)th (\(t = 1, \ldots, T\)) topic in the \(d\)th (\(d = 1, \ldots, D\)) document. \(z_{td}\) takes a value between 1 and \(T\) following \(z_{td} \sim {\text{Multinomial}}(\theta_{d})\), where \(\theta_{d} = (\theta_{d1}, \theta_{d2}, \ldots, \theta_{dT})\) refers to the topic probability.
Again, let \(w_{dw}\), where \(w = 1, \ldots, W_{s}\) and \(d = 1, \ldots, D\), be the \(w\)th word used in the \(d\)th document, where \(W_{s}\) denotes the total number of words in the \(d\)th document. \(w_{dw}\) takes a value between 1 and \(K\), where \(K\) is the total number of unique words used in all the documents. Each word is generated according to \(w_{dw} \mid z_{td} \sim {\text{Multinomial}}(\beta_{t})\), where \(\beta_{t} = (\beta_{t1}, \beta_{t2}, \ldots, \beta_{tK})\) is the probability that a word is picked given that topic \(t\) is selected. The above description was adapted from Zhang et al. (2018).
LDA assumes that both the topic and the word probabilities are generated from Dirichlet distributions: \(\theta_{d} \sim {\text{Dirichlet}}(\alpha)\) for each document \(d\), and \(\beta_{t} \sim {\text{Dirichlet}}(\delta)\) for each topic \(t\). Here, both \(\alpha\) and \(\delta\) are hyperparameters of LDA that can be estimated. Following Grun and Hornik (2011), Gibbs sampling was deployed to infer the document distribution (\(\theta_{d}\)) and the topic word distribution (\(\beta_{t}\)). It is important to note that statistical inference from LDA results depends on the choice of hyperparameters, which are usually chosen in an ad hoc manner; this study, however, followed the procedure suggested by Blei et al. (2003).
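To make the generative assumptions concrete, the sketch below simulates a small corpus from the LDA model: topic mixtures and word distributions are drawn from symmetric Dirichlet priors, and each word is produced by first drawing a topic and then a word from that topic. This is a forward simulation only, not the Gibbs sampler used for inference in the study; the function names, corpus sizes and hyperparameter values are illustrative assumptions.

```python
import random


def sample_dirichlet(rng, concentration, size):
    """Draw from a symmetric Dirichlet by normalising Gamma(concentration, 1) draws."""
    draws = [rng.gammavariate(concentration, 1.0) for _ in range(size)]
    total = sum(draws)
    return [d / total for d in draws]


def generate_corpus(n_docs, n_topics, vocab_size, doc_length, alpha, delta, seed=0):
    """Simulate documents as lists of word indices under the LDA generative model."""
    rng = random.Random(seed)
    # beta_t ~ Dirichlet(delta): one word distribution per topic.
    beta = [sample_dirichlet(rng, delta, vocab_size) for _ in range(n_topics)]
    theta, docs = [], []
    for _ in range(n_docs):
        # theta_d ~ Dirichlet(alpha): the topic mixture of document d.
        theta_d = sample_dirichlet(rng, alpha, n_topics)
        words = []
        for _ in range(doc_length):
            # z ~ Multinomial(theta_d): topic assignment for this word position.
            z = rng.choices(range(n_topics), weights=theta_d)[0]
            # w | z ~ Multinomial(beta_z): the word drawn from the chosen topic.
            w = rng.choices(range(vocab_size), weights=beta[z])[0]
            words.append(w)
        theta.append(theta_d)
        docs.append(words)
    return theta, beta, docs
```

Inference runs in the opposite direction: given only the observed documents, it recovers estimates of the topic mixtures and word distributions.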

3.3.2 Ordered probit model

An ordered choice model, the ordered probit model, was used to show how various risk factors generated from news and Twitter texts impact the sentiments of the supply chain. This effect can be represented as:
$$y_{i}^{*} = {{{\varvec{\beta}}\,{\varvec{x}}}}_{{\varvec{i}}} + \varepsilon_{i}$$
(1)
where \({y}_{i}^{*}\) is the latent variable (sentiment level) measuring supply chain sentiments toward the text-based risks of documents i; \({\varvec{\beta}}\) is a vector of parameters; \({{\varvec{x}}}_{{\varvec{i}}}\) is a vector of independent variables (topic proportions/risk proportions) for each document i; and \({\varepsilon }_{i}\) is the error term, which is assumed to be normally distributed.
The present study models sentiment analysis as having seven levels of ordinal categories of dependent variables, where sentiment is defined as follows: −3 = extremely negative, −2 = moderately negative, −1 = weakly negative, 0 = neutral, + 1 = weakly positive, + 2 = moderately positive and + 3 = extremely positive. The observed and coded sentiment level \({y}_{i}\) is classified based on latent variables \({y}_{i}^{*}\), as follows:
$$y_{i} = \left\{ {\begin{array}{*{20}c} { - 3 \;{\text{if}} \; - \infty < y_{i}^{*} \le \mu_{1} } \\ { - 2 \;{\text{if}} \;\mu_{1} < y_{i}^{*} \le \mu_{2} } \\ { - 1 \;{\text{if}}\; \mu_{2} < y_{i}^{*} \le \mu_{3} } \\ {0 \;{\text{if}}\; \mu_{3} < y_{i}^{*} \le \mu_{4} } \\ { + 1\; {\text{if}}\; \mu_{4} < y_{i}^{*} \le \mu_{5} } \\ { + 2 \;{\text{if}}\; \mu_{5} < y_{i}^{*} \le \mu_{6} } \\ { + 3 \;{\text{if}}\; \mu_{6} < y_{i}^{*} < + \infty } \\ \end{array} } \right.$$
(2)
where \({y}_{i}\) is the observed variable for measuring the sentiment level of each news and social media document i, and \({\mu }_{j}\) represents the threshold values for the sentiment level j to be estimated.
The probability that a sentiment score would incur a level of sentiment j (j = −3, −2, −1, 0, + 1, + 2, + 3) is given in the following equation:
$$\begin{aligned} Pr\left( {y_{i} = -3{|}{\varvec{x}}_{{\varvec{i}}} } \right) & = F\left( {\mu_{1} -{\varvec{\beta}}\,{\varvec{x}}_{{\varvec{i}}} } \right) \\ Pr\left( {y_{i} = -2{|}{\varvec{x}}_{{\varvec{i}}} } \right) & = F\left( {\mu_{2} -{\varvec{\beta}}\,{\varvec{x}}_{{\varvec{i}}} } \right) - F\left( {\mu_{1} -{\varvec{\beta}}\,{\varvec{x}}_{{\varvec{i}}} } \right) \\ Pr\left( {y_{i} = -1{|}{\varvec{x}}_{{\varvec{i}}} } \right) & = F\left( {\mu_{3} -{\varvec{\beta}}\,{\varvec{x}}_{{\varvec{i}}} } \right) - F\left( {\mu_{2} -{\varvec{\beta}}\,{\varvec{x}}_{{\varvec{i}}} } \right) \\ Pr\left( {y_{i} =0{|}{\varvec{x}}_{{\varvec{i}}} } \right) & = F\left( {\mu_{4} -{\varvec{\beta}}\,{\varvec{x}}_{{\varvec{i}}} } \right) - F\left( {\mu_{3} -{\varvec{\beta}}\,{\varvec{x}}_{{\varvec{i}}} } \right) \\ Pr\left( {y_{i} = +1{|}{\varvec{x}}_{{\varvec{i}}} } \right) & = F\left( {\mu_{5} -{\varvec{\beta}}\,{\varvec{x}}_{{\varvec{i}}} } \right) - F\left( {\mu_{4} -{\varvec{\beta}}\,{\varvec{x}}_{{\varvec{i}}} } \right) \\ Pr\left( {y_{i} = +2{|}{\varvec{x}}_{{\varvec{i}}} } \right) & = F\left( {\mu_{6} -{\varvec{\beta}}\,{\varvec{x}}_{{\varvec{i}}} } \right) - F\left( {\mu_{5} -{\varvec{\beta}}\,{\varvec{x}}_{{\varvec{i}}} } \right) \\ Pr\left( {y_{i} = +3{|}{\varvec{x}}_{{\varvec{i}}} } \right) & = 1 - \left({\mu_{6} - {\varvec{\beta}}\,{\varvec{x}}_{{\varvec{i}}} } \right) \\ \end{aligned}$$
(3)
where F(.) is the cumulative probability function for standard normal distribution.
This model was estimated using maximum likelihood estimation, and the McFadden pseudo-R-squared value was used to evaluate the fit of the model. For more information about the estimation and evaluation method, please refer to Wooldridge (2012).
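The category probabilities of the ordered probit model can be evaluated numerically as in the sketch below, which takes the coefficient vector and the six thresholds as given rather than estimating them. The function names and the numeric values in the usage note are illustrative assumptions, not estimates from the study. The last function uses the standard definition of the McFadden pseudo-R-squared, one minus the ratio of the fitted model's log-likelihood to that of a thresholds-only model.

```python
import math

LEVELS = (-3, -2, -1, 0, 1, 2, 3)


def normal_cdf(x):
    """Standard normal cumulative distribution function F(.)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def category_probabilities(beta, x, thresholds):
    """Return {level: probability} for one document given beta, x and mu_1..mu_6."""
    if len(thresholds) != len(LEVELS) - 1:
        raise ValueError("expected six thresholds for seven sentiment levels")
    if list(thresholds) != sorted(thresholds):
        raise ValueError("thresholds must be in increasing order")
    xb = sum(b * v for b, v in zip(beta, x))
    # Cumulative probabilities F(mu_j - beta x), padded with 0 and 1 at the ends.
    cdf = [0.0] + [normal_cdf(mu - xb) for mu in thresholds] + [1.0]
    return {level: cdf[k + 1] - cdf[k] for k, level in enumerate(LEVELS)}


def mcfadden_pseudo_r2(loglik_model, loglik_null):
    """McFadden pseudo-R-squared: 1 - lnL(model) / lnL(null)."""
    return 1.0 - loglik_model / loglik_null
```

For instance, `category_probabilities([0.8, -0.5], [0.3, 0.1], [-2.0, -1.2, -0.4, 0.3, 1.0, 1.8])` returns seven probabilities that sum to one. A positive coefficient on a topic proportion shifts probability mass toward the more positive sentiment levels.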

4 Results and discussion

LDA does not itself determine the optimal number of topics for a given corpus, so this number must be selected externally. Using the metrics proposed by Cao et al. (2009) and Deveaud et al. (2014), the optimal number of topics was found to be 20 for both the news and the Twitter data.
In the following section, the results extracted from the LDA algorithm are explained. Tables 1 and 2 show the topic titles and the corresponding words for each topic in news media and on Twitter, respectively. LDA returns the words under each topic ranked by probability, and the topic clusters are also returned in ranked order. LDA does not return topic names; the researchers titled each topic by examining the probabilities and combinations of its words, relying on intuition and domain expertise.
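The intuition behind choosing the number of topics from such metrics can be sketched as follows. Given the topic word distributions of a fitted model, one score measures how similar the topics are to one another (lower is better) and the other how far apart they are (higher is better). These are simplified pairwise forms that only loosely follow the ideas of the two cited metrics; the exact formulas used in the study are not given here, and the function names are assumptions.

```python
import math
from itertools import combinations


def cosine_similarity(p, q):
    """Cosine similarity between two topic word distributions."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm


def symmetric_kl(p, q, eps=1e-12):
    """Symmetrised Kullback-Leibler divergence; eps guards against log(0)."""
    return 0.5 * sum(
        a * math.log((a + eps) / (b + eps)) + b * math.log((b + eps) / (a + eps))
        for a, b in zip(p, q)
    )


def average_topic_similarity(beta):
    """Mean pairwise cosine similarity across topics (lower suggests better separation)."""
    pairs = list(combinations(beta, 2))
    return sum(cosine_similarity(p, q) for p, q in pairs) / len(pairs)


def average_topic_divergence(beta):
    """Mean pairwise divergence across topics (higher suggests better separation)."""
    pairs = list(combinations(beta, 2))
    return sum(symmetric_kl(p, q) for p, q in pairs) / len(pairs)
```

In practice, one would fit LDA for a range of candidate topic numbers, compute both scores for each fitted model and pick the number at which the similarity is low and the divergence is high.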
Table 1
Name and words of each corresponding topic for news media

| Topic name | Topic words |
| --- | --- |
| T1: Shipping, Port and Logistics | Shipping port ports goods biden canada cargo transport container freight logistics air president white house |
| T2: Price Surge | Inflation bank rate central rates policy prices interest monetary fed economy higher price market global consumer |
| T3: Raw Material Import–Export | Battery materials production china nickel batteries material supply lithium global steel raw market chinese indonesia |
| T4: China’s Foreign Strategy | China companies chinese government technology national beijing policy foreign development key industry standards security |
| T5: Risks in Asian Market | Asia world time china japan people nikkei political system covid lot big markets point coming real risks |
| T6: China’s Covid Policy | China covid shanghai omicron city cases lockdown people government variant chinese world virus beijing |
| T7: Electric Car Production | Vehicles electric tesla sales production cars vehicle china car automakers company market auto evs million |
| T8: Retailers and Shopping | Labor company stores holiday chain products forced retailers supply retail store vietnam cotton online xinjiang |
| T9: Food and Oil Price Hit | Prices food price costs rising cost higher inflation increase rise energy oil fuel commodity farmers hit |
| T10: Energy Supply | Energy power emissions solar carbon renewable climate china fuel gas supply oil reduce |
| T11: Supply Chain Revenue | Quarter company billion supply chain revenue demand sales profit million shares costs expects |
| T12: Regional Supply Chain | Billion company business companies group market investors yuan million financial arm capital |
| T13: Economic Growth | Growth economic economy exports supply demand covid pandemic domestic quarter expected gdp |
| T14: Ukraine–Russia War | Russia ukraine war russian oil global gas invasion sanctions country europe exports impact supply energy |
| T15: Automobile Supply Chain | Japan production japanese toyota yen parts company suppliers motor plant auto plants operations chain |
| T16: Chip Industry and Shortage | Chip semiconductor chips industry taiwan manufacturing samsung global billion tsmc production supply |
| T17: Electronic Parts Production | Apple nikkei asia production told foxconn components suppliers company iphone china tech supply maker |
| T18: Real Estate Stress | Housing asia southeast indonesia construction government thailand country local investment region market |
| T19: Supply Chain Shortage | Supply chain global shortage shortages disruptions pandemic demand industry business impact due production |
| T20: Trade in Asia | Trade japan taiwan economic south india korea china security minister australia president indopacific |
Table 2
Name and words of each corresponding topic for social media (Twitter)
Topic name
Topic words
T1: Economic Growth and Omicron
Omicron variant global growth supply chain spread economic amid recovery ongoing disrupt threatens
T2: China’s Zero-Covid Policy
Covid supply chain lockdowns china global woes china’ policy worsen zerocovid imposes meet strains
T3: China’s Port Restrictions
Covid lockdown china cases port shanghai shenzhen restrictions chinese disruption city testing ports
T4: Russian Sanctions
Russian sanctions russia attack increase ukraine key cost ukrainian war cyber port lives disrupts threat
T5: Food Market Impact
Impact war ukraine war latest food russiaukraine read market report ongoing impacted logistics
T6: Export Halt—Ukraine
Russia ukraine war wheat worlds world’ sanctions largest production export shipments neon biggest halts
T7: Cargo Shipping Restrictions
Shipping cargo big air ships ports container west blunder equates europe thousands record cancelled
T8: Supply Chain Disruption
Covid disruptions pandemic world made post era test people business brexit learn companies challenges
T9: Logistics Tension
Supply russia ukraine war chain crisis global war affect impact impacts ceo situation continue logistics
T10: Ripple Effect Supply Chain
Invasion ukraine russian russias russia’ global effects causing disruptions ripple impact markets companies
T11: Food Price Hike
Prices crisis ukraine war food costs higher rising mess hits energy sink strain fuel scary uncertainty warning continues
T12: China’s Response to War
Ukraine russia crisis support data china response business experts great relief tech taiwan push humanitarian financial
T13: Last-mile Logistics
Mile delivery logistics today future article solutions industry retailers changing learn latest inventory food innovation
T14: Omicron Hit on Sales Price
Published omicron economy hit rise reports concerns sales price share industry holiday issues fears supply
T15: Trade Risk Warning
Due disruption news risk warns russia companies der trade potential stop caused coming die country top suspend
T16: Supply Delay—Retailers
Covid disruptions omicron due shortages delays cases surge grocery shelves hit empty finally variants days store time doors
T17: Inflation and War
Inflation impact ukraine decade join high les guerre easing joe minimize monetary consequences low sur discuss live blame
T18: Supply Shortage—War
Ukraine war shortage news firms demand chip sea production due neon issues pandemic wave survey needed means parts
T19: Energy Problem—Russia
Russian oil russia gas imports ban people europe end price energy bans restrictions exports natural start european
T20: Supply Chain Resiliency
Supply chain postcovid global issues war industries affecting coronavirus world find pandemic episode resilience

4.1 Supply chain risk detection using LDA

4.1.1 Topics from news media

In Table 1, 20 topics and corresponding words are listed. These topics are categorized into seven broad risk categories: Trade in Asia, Logistics, Macroeconomic issues, China issues, Energy issues, Semiconductor shortages and Automobile supply chain.
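How a topic model produces word lists like those in Tables 1 and 2 can be illustrated with a very small collapsed Gibbs sampler for Latent Dirichlet Allocation. This is only a sketch: the source does not state which LDA implementation or inference method was used, and the function name `lda_gibbs`, the hyperparameters `alpha` and `beta`, and the toy documents are all assumptions made for illustration.

```python
import random
from collections import Counter

def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Toy collapsed Gibbs sampler for LDA; docs is a list of token lists."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * n_topics for _ in docs]          # tokens of doc d in topic k
    topic_word = [Counter() for _ in range(n_topics)]   # count of word w in topic k
    topic_total = [0] * n_topics                        # tokens assigned to topic k
    assign = []
    # Start from a random topic for every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assign.append(z_d)
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = assign[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Unnormalized conditional probability of each topic.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    # Smoothed per-document topic proportions; each row sums to 1.
    theta = [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in counts]
        for doc, counts in zip(docs, doc_topic)
    ]
    # Most frequent words assigned to each topic (the "topic words").
    top_words = [[w for w, c in tw.most_common(5) if c > 0] for tw in topic_word]
    return theta, top_words

# Hypothetical toy corpus, not data from the study.
docs = [
    "port shipping cargo container port".split(),
    "shipping port freight cargo".split(),
    "inflation price bank rate inflation".split(),
    "price inflation rate bank".split(),
]
theta, top_words = lda_gibbs(docs, n_topics=2)
```

The rows of `theta` play the role of the per-document topic proportions analyzed in Sect. 4.2, and each entry of `top_words` corresponds to a word combination that a researcher would then name by hand, as done for T1 to T20.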
4.1.1.1 Trade in Asia
T5 and T20 are “Risks in Asian Market” and “Trade in Asia,” respectively. T5 represents the following word combination: “asia world time china japan people nikkei political system covid lot big markets point coming real risks.” This word combination indicates plausible risks in the Asian market. Similarly, T20 represents a cluster of “trade japan taiwan economic south india korea china security minister australia president indopacific,” which potentially indicates a trade issue in the Asian region.
4.1.1.2 Logistics and supply chain
In this category, T1 represents “Shipping, Port and Logistics” with the following word combination: “shipping port ports goods biden Canada cargo transport container freight logistics air president white house.” Most of the words in this cluster relate to shipping, ports, containers, freight, logistics and cargo. Drawing on domain knowledge, we therefore attribute this word combination to “Shipping, Port and Logistics” issues in general. The presence of other words, such as biden, Canada, president and white house, may be indicative of the truck driver strikes at the USA–Canada border.
T8 represents the “Retailers and Shopping” risk. The representative word cluster is: “labor company stores holiday chain products forced retailers supply retail store vietnam cotton online xinjiang.” By examining these words, we could infer that, during Omicron, Christmas and Black Friday might be dominant occasions in terms of sales, and that possible labor shortages may be experienced during this time due to the Ukraine–Russia war.
T11, T12 and T19 are “Supply Chain Revenue,” “Regional Supply Chain” and “Supply Chain Shortage,” respectively. Among these topics, supply chain shortages were very common. From the beginning of the pandemic, “global shortage” and “disruptions” had been in the news constantly. However, T11 was interesting, because the news media expected possible revenue in the last quarter of 2021 as indicated in the word combination under this topic. In addition, “Regional Supply Chain” is represented by “billion company business companies group market investors yuan million financial arm capital.” The word “regional” may indicate financial investment in the local market, which vaguely represents supply chain finance.
4.1.1.3 Macroeconomic issues
T2 represents the “Price Surge” risk, with the following word combination: “inflation bank rate central rates policy prices interest monetary fed economy higher price market global consumer.” This topic represents the issue of price hikes, inflation and struggles experienced by financial policymakers. “Price Surge” has been a problem since the beginning of the COVID-19 pandemic due to issues of international trade and has since been impacted by the start of the Ukraine–Russia war. T3, “Raw Material Import–Export,” reflects issues in the importing and exporting of raw materials. T3 represents a cluster of “battery materials production china nickel batteries material supply lithium global steel raw market chinese Indonesia.” The presence of nickel, batteries, lithium and so on might refer to the raw materials needed for the production of electronic parts.
T13 identifies topics related to “Economic Growth,” with the following words: “growth economic economy exports supply demand covid pandemic domestic quarter expected gdp.” The pandemic and the war have both decreased economic growth in most countries. Economic growth largely depends on imports, exports and trade. Therefore, news media identified this topic as an important risk for supply chain management.
4.1.1.4 China issues
T4, “China's Foreign Strategy,” is a dominant topic discussed in news media. Likewise, T6 represents the topic “China’s Covid Policy,” referring to the Zero-Covid Policy adopted by China. Due to China’s international policy, supply chain operations have been experiencing issues related to trade, port congestion, and air and maritime transport.
4.1.1.5 Energy issues
T9 and T10 represent risks related to “Food and Oil Price Hit” and “Energy Supply,” respectively. These topics have both been discussed in the news media, mainly during the Ukraine–Russia war. Energy prices have increased due to Russia and Ukraine no longer exporting gas and oil. Thus, many countries are experiencing energy price hikes and a shaken aviation market due to higher oil prices after the war had begun.
4.1.1.6 Semiconductor shortage
T16 and T17 represent the “Chip Industry and Shortage” and “Electronic Parts Production,” respectively. T16 comprises “chip semiconductor chips industry taiwan manufacturing samsung global billion production supply,” while T17 includes “apple nikkei asia production told foxconn components suppliers company iphone china tech supply maker.” These clusters of words clearly refer to the supply side of chip production and the potential shortage of semiconductors for tech giants.
4.1.1.7 Automobile supply chain
T7 and T15 are “Electric Car Production” and “Automobile Supply Chain,” respectively. Supply chain disruption in this category is largely related to the supply and production of electric and electronic parts, semiconductor supply and energy supply for the continuation of automobile plant production, especially in countries with large automobile industries (e.g., Japan). These issues have already been discussed in the news media during Omicron and the Ukraine–Russia war.

4.1.2 Topics from Twitter

In Table 2, 20 topics and corresponding words are listed. These topics are categorized into four broad risk categories: China issues, Energy issues, Logistics and Economic issues.
4.1.2.1 China issues
Three topics were detected from Twitter concerning China during Omicron and the war, namely T2 “China’s Zero-Covid Policy,” T3 “China’s Port Restrictions” and T12 “China’s Response to War.” Continuous international border restrictions and maritime port restrictions were predominant topics on Twitter. China’s response to the Ukraine–Russia war was a major issue for people in other countries. While other countries were imposing sanctions on Russia, China did not. In addition, China’s stance on Russia was not clear at the time of this study. As a result, two vital economic risks, T15 “Trade Risk Warning” and T17 “Inflation and War,” might have been detected from Twitter as a by-product of this uncertain response.
4.1.2.2 Energy issues
Another issue discussed on Twitter was T19 “Energy Problem—Russia.” Several countries, including some G7 countries, are dependent on the gas and oil supply from Russia. After the start of the war, most developed countries had imposed sanctions on Russia; in response, the Russian Government halted part of its energy supply, especially in Europe. The word combination under this topic—“russian oil russia gas imports ban people europe end price energy bans restrictions exports natural start European”—indicates this issue. Therefore, it is clear that energy supply was deemed a potential risk among Twitter users.
4.1.2.3 Economic issues
Various economic issues, specifically macroeconomic issues, were also detected on Twitter. For instance, T1 represents “Economic Growth and Omicron,” T5 refers to “Food Market Impact” and T6 is “Export Halt—Ukraine.” T1 is a topic also seen in the news media. T5 and T6 refer to Ukraine’s position as a key exporter of products such as oil, corn and wheat. The export of these products has been halted due to the war. Thus, T11 “Food Price Hike” and T14 “Omicron Hit on Sales Price” are also among the topics being discussed on Twitter. The other economic issues (T15 “Trade Risk Warning” and T17 “Inflation and War”) are also discussed on Twitter.
4.1.2.4 Logistics and supply chain issues
In terms of logistics, some topics detected on Twitter were similar to those detected from news media such as T7 “Cargo Shipping Restrictions,” T16 “Supply Delay—Retailers,” T18 “Supply Shortage—War” and T20 “Supply Chain Resiliency.” Interestingly, three topics were detected on Twitter but not in the news media: T9 “Logistics Tension,” T10 “Ripple Effect Supply Chain” and T13 “Last-mile Logistics.” “Logistics Tension” was prevalent at the start of the war. “Ripple Effect Supply Chain” implies the propagation of disruption among the supply chain network and indicates global risks in regular supply operations.
As seen, the news media and Twitter were generally both concerned about ports and shipping, inflation, energy supply, economic growth, supply chain resiliency and shortages during Omicron and the Ukraine–Russia war. Only the news media was concerned with automobile and semiconductor supply chains and regional supply chains. Twitter also detected unique issues, such as China’s port restrictions, Ukraine’s export halt, the ripple effect of supply chain issues and last-mile logistics. A pattern can be seen in that the news media were particularly concerned with the supply side (upstream) of the supply chain, while Twitter was mostly concerned with the demand side (downstream).

4.2 Risk variations across the time frame

In this section, we discuss variations of each topic between November 15, 2021, and April 30, 2022. These variations are shown by day in Fig. 1 for news media and in Fig. 3 for Twitter. To show these variations, the time frame is divided into four phases: Phase 1: Omicron Variant of Interest (Omicron VOI) to Omicron Variant of Concern (Omicron VOC), from November 15, 2021, to November 26, 2021; Phase 2: Omicron VOC to Xe Variant of Concern (Xe VOC), from November 27, 2021, to January 19, 2022; Phase 3: Xe VOC to the start of the Ukraine–Russia war, from January 20, 2022, to February 24, 2022; and Phase 4: after the start of the Ukraine–Russia war, from February 25, 2022, to April 30, 2022. Note that the VOC and VOI designations were taken from the World Health Organization (WHO 2021). This classification shows in which phase each risk was discussed most and whether its variation was statistically significant.
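The phase boundaries above translate directly into a date lookup. The following is a minimal sketch, assuming each document carries a publication date; the names `PHASES` and `phase_of` are illustrative and do not come from the study.

```python
from datetime import date

# Phase boundaries as stated in the text (inclusive start and end dates).
PHASES = [
    (1, date(2021, 11, 15), date(2021, 11, 26)),  # Omicron VOI to Omicron VOC
    (2, date(2021, 11, 27), date(2022, 1, 19)),   # Omicron VOC to Xe VOC
    (3, date(2022, 1, 20), date(2022, 2, 24)),    # Xe VOC to start of the war
    (4, date(2022, 2, 25), date(2022, 4, 30)),    # after the start of the war
]

def phase_of(day):
    """Return the phase number (1-4) for a date, or None outside the study window."""
    for number, start, end in PHASES:
        if start <= day <= end:
            return number
    return None
```

For example, `phase_of(date(2022, 2, 24))` falls in Phase 3, while a date after April 30, 2022, yields `None`.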

4.2.1 Risk variations in the news media

Figure 1 presents the risk variations of each day in the time frame. All of the 20 topics detected by the news media are visualized in a time series to observe trends in variation. Figure 1 shows no specific trends; however, the proportions are observed as stationary, with many fluctuations on some specific days. Due to space limitations and the large number of generated topics, it is difficult to observe and describe each of the topics. Therefore, another visualization plot is presented in Fig. 2, in which the average proportion of each risk detected in the news media is analyzed.
For logistics, we observed that “Shipping, Port and Logistics” was discussed more during Phase 1, and that news coverage about this topic declined later on. However, T8 “Retailers and Shopping,” T11 “Supply Chain Revenue” and T12 “Regional Supply Chain” were relatively steady across all four time frames; this indicates that the news media was constantly reporting on these three topics. In contrast, reportage on T19 “Supply Chain Shortage” increased from 0.018 (1.8%) to 0.058 (5.8%) after Omicron was declared a VOC in late November 2021.
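The average proportions compared here can be computed by grouping per-document topic proportions by phase. The sketch below assumes each record is a `(phase, topic, proportion)` tuple; the function name and the sample values are hypothetical and are not figures from the study.

```python
from collections import defaultdict

def mean_proportion_by_phase(records):
    """Average topic proportion for every (phase, topic) pair."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for phase, topic, proportion in records:
        sums[(phase, topic)] += proportion
        counts[(phase, topic)] += 1
    return {key: sums[key] / counts[key] for key in sums}

# Hypothetical proportions for a single topic in two phases.
records = [
    (1, "T19", 0.01), (1, "T19", 0.03),
    (2, "T19", 0.05), (2, "T19", 0.07),
]
means = mean_proportion_by_phase(records)
```

A rise in the phase mean of a topic, as in this toy input, is the kind of change that the text reports for T19 between Phase 1 and Phase 2.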
In terms of trade in Asia, T5 “Risks in Asian Market” and T20 “Trade in Asia” were both highly detected in Phase 1. T20 declined after November 26, 2021; however, the proportion of this topic reported in the news media was almost steady from that point on.
Regarding issues in China, T4 “China’s Foreign Strategy” and T6 “China’s Covid Policy” were increasingly reported on in the news media from Phase 2 onwards. Among macroeconomic issues, T2 “Price Surge” was highly detected in Phase 1. Moreover, reporting on both T3 “Raw Material Import–Export” and T13 “Economic Growth” increased in Phase 1, Phase 2 and Phase 4. Similarly, reportage on T16 “Chip Industry and Shortage” and T17 “Electronic Parts Production” increased over the entire time frame, especially T17, which increased significantly during Phase 4.

4.2.2 Risk variations on Twitter

Figure 3 presents the topic variation trend of tweets related to supply chain issues. Similar to news media topics, a stationary pattern in the time series plot can be observed; however, there are fluctuations. In Fig. 4, the average proportion of each risk detected on Twitter is shown for the four time frames. For instance, three topics regarding issues in China (i.e., T2 “China’s Zero-Covid Policy,” T3 “China’s Port Restrictions” and T12 “China’s Response to War”) were stable on Twitter across all four time frames. From this we can infer that Twitter users were concerned about China’s stance regarding both the pandemic and the war.
Regarding economic issues, T1 “Economic Growth and Omicron,” T5 “Food Market Impact,” T6 “Export Halt—Ukraine,” T11 “Food Price Hike,” T14 “Omicron Hit on Sales Price,” T15 “Trade Risk Warning” and T17 “Inflation and War” were relatively stable across all time frames on Twitter. In particular, T1, T5, T11 and T14 were tweeted about and discussed more after the Ukraine–Russia war started, when they reached their maximum proportion across the four time frames. T6 and T15 were both discussed the most between Phases 1 and 3, before the start of the Ukraine–Russia war. T19 “Energy Problem—Russia” was discussed the most from late November to early January (Phase 2), before the war started. The discussion was ongoing, although it decreased slightly in late January 2022. However, at the start of the Ukraine–Russia war in February 2022, discussion on this topic increased again.
For logistics issues, discussions on T7 “Cargo Shipping Restrictions” peaked during Phase 2, just before the Xe variant was detected in early January 2022; however, these discussions decreased just before the war started. T9 “Logistics Tension” peaked during Phase 4. T10 “Ripple Effect Supply Chain” was found to be almost stationary during the overall time frame; however, its average proportion shows evidence of concerns on Twitter about this topic.
T13 “Last-mile Logistics” was highly discussed during November 2021. It again increased after the Xe variant was detected in January 2022. T16 “Supply Delay—Retailers” was discussed the most in Phase 4 after the war started, similar to T9 “Logistics Tension” and T18 “Supply Shortage—War.”

4.3 Statistical significance

In this section, we present the statistical significance of the risk factor variations across the four time frames categorized in Sect. 4.2. Here, we compare the means of each risk extracted from the LDA for both news media and Twitter. Each time frame is regarded as a group (phase), and the objective is to examine whether the risk proportions are significantly different between groups. If we find that a risk is not significantly different across the entire time frame, we can infer that the risk may have been reported or discussed in a similar way across the time frame; likewise, if we find that a risk varies significantly between any two time frame categories, we can infer that the risk may have been discussed more or less in one of them than in the other. As the response variable does not follow a normal distribution, we used the Kruskal–Wallis test, a nonparametric alternative to one-way ANOVA. This test cannot determine which groups differ, so for any risk that differed significantly across groups, we then used Dunn’s test to identify which pairs of groups were different. The significance level was 10%. Table 3 presents the results of this analysis for news media; nine risks varied across the time frame with statistical significance, implying that the proportions of these nine topics differ across the four time frames.
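The Kruskal–Wallis statistic can be sketched in a few lines. The version below uses midranks with the usual tie correction and a closed-form upper tail for a chi-square variable with 3 degrees of freedom, which is the case for four phases. The function names are illustrative, the sample data are hypothetical, and the source does not say which software computed the reported values.

```python
import math

def kruskal_wallis(groups):
    """Return (H, df) for a list of samples, using midranks and the tie correction."""
    pooled = sorted((value, g) for g, sample in enumerate(groups) for value in sample)
    n = len(pooled)
    rank_sums = [0.0] * len(groups)
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2.0          # average of ranks i+1 .. j
        for _, g in pooled[i:j]:
            rank_sums[g] += midrank
        t = j - i
        tie_term += t ** 3 - t
        i = j
    h = 12.0 / (n * (n + 1)) * sum(
        r ** 2 / len(s) for r, s in zip(rank_sums, groups)
    ) - 3.0 * (n + 1)
    h /= 1.0 - tie_term / (n ** 3 - n)       # tie correction (equals 1 with no ties)
    return h, len(groups) - 1

def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square variable with 3 degrees of freedom."""
    return (math.erfc(math.sqrt(x / 2.0))
            + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0))

# Hypothetical topic proportions for four phases (not study data).
h, df = kruskal_wallis([[1, 2], [3, 4], [5, 6], [7, 8]])
p_reported = chi2_sf_df3(10.9)   # chi-square value reported for "Shipping, Port and Logistics"
```

Evaluating `chi2_sf_df3(10.9)` gives approximately 0.0123, which agrees with the p value reported for “Shipping, Port and Logistics” in Table 3.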
Table 3
Kruskal–Wallis test for news media
Risks (topics) | Kruskal–Wallis \(\chi^{2}\) | df | p value | Dunn’s test time frame | Contrast | Adjusted p value
Shipping, Port and Logistics | 10.9 | 3 | 0.0123 | 2–3 | −3.20 | 0.0082
Risks in Asian Market | 8.06 | 3 | 0.0447 | 3–4 | −2.82 | 0.0288
China’s Covid Policy | 8.97 | 3 | 0.0296 | 1–3 | 2.47 | 0.0800
 |  |  |  | 3–4 | −2.44 | 0.0892
Electric Car Production | 10.3 | 3 | 0.0163 | 1–3 | −2.76 | 0.0348
 |  |  |  | 1–4 | −2.72 | 0.0387
Food and Oil Price Hit | 8.64 | 3 | 0.0345 | 2–4 | −2.80 | 0.0309
Economic Growth | 6.42 | 3 | 0.0930 | 2–3 | −2.40 | 0.0981
Ukraine–Russia War | 10.6 | 3 | 0.0140 | 2–4 | 3.06 | 0.0131
Electronic Parts Production | 136 | 3 | 0.0000 | 1–4 | 6.79 | 0.0000
 |  |  |  | 2–4 | 9.73 | 0.0000
 |  |  |  | 3–4 | 9.23 | 0.0000
Supply Chain Shortage | 6.27 | 3 | 0.0992 | 1–3 | 2.47 | 0.0806
In Table 3, we observe that the risk related to “Shipping, Port and Logistics” is significantly different across the time frame (p = 0.0123), and Dunn’s test indicates that this difference lies between Phases 2 and 3. This finding is significant (p = 0.0082) and has a contrast value of −3.20, which indicates that this risk was more likely to have been reported in Phase 3. This result implies that risks related to “Shipping, Port and Logistics” were likely to be more prevalent when Xe was declared a variant of concern, and that this continued until the start of the Ukraine–Russia war.
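The contrast and adjusted p value can be read as a Dunn z statistic followed by a multiple-comparison adjustment. The sketch below assumes a two-sided p value with a Bonferroni adjustment over the six pairwise comparisons among four phases. The source does not name the adjustment, so this is an assumption, although it reproduces the reported 0.0082 from the contrast of −3.20. The function names are illustrative.

```python
import math

def dunn_z(mean_rank_a, mean_rank_b, n_a, n_b, n_total, tie_term=0.0):
    """Dunn's z for two groups, given their mean ranks in the pooled ranking.

    tie_term is the sum of (t**3 - t) over tied runs; it is 0 without ties.
    """
    variance = (
        n_total * (n_total + 1) / 12.0 - tie_term / (12.0 * (n_total - 1))
    ) * (1.0 / n_a + 1.0 / n_b)
    return (mean_rank_a - mean_rank_b) / math.sqrt(variance)

def bonferroni_p(z, n_comparisons):
    """Two-sided normal p value times the number of comparisons, capped at 1."""
    two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return min(1.0, two_sided * n_comparisons)

adjusted = bonferroni_p(-3.20, 6)   # contrast reported for Phases 2-3 in Table 3
```

With this sign convention, a negative contrast for the pair 2–3 means the mean rank in Phase 2 is lower than in Phase 3, which matches the reading that the risk was reported more in Phase 3.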
“Supply Chain Shortage,” “Electronic Parts Production” and “China’s Covid Policy” were more likely to be discussed in Phase 1 than other risks. “Ukraine–Russia War” and “Electronic Parts Production” were more likely to be discussed in Phase 2. In Phase 3, “Shipping, Ports and Logistics,” “Electric Car Production,” “Economic Growth” and “Electronic Parts Production” were more likely to be reported on. Finally, in Phase 4, the topics most likely to be discussed were “Risks in Asian Market,” “China’s Covid Policy,” “Electric Car Production” and “Food and Oil Price Hit.”
It is interesting to note that potential supply chain risks caused by the Ukraine–Russia war were discussed for a relatively long period of time (approximately three months) before the war had actually begun. For example, “Electronic Parts Production” risks were heavily reported from Phase 2 to Phase 4. Likewise, risks related to “China’s Covid Policy” were prevalent during Phase 1.
Table 4 presents the results for risks extracted from Twitter. Thirteen risk variations were found to be statistically significant. Among these, “Supply Chain Disruption,” “Cargo Shipping Restrictions,” “Food Price Hike,” “Last-mile Logistics,” “Supply Delay—Retailers,” “Inflation and War” and “Supply Chain Resiliency” were most likely to be tweeted about in almost all of the time frame categories. This implies that people were consistently concerned about possible supply chain risks throughout the entire time frame. Other risks, such as “China’s Port Restrictions” and “China’s Response to War,” were found to be discussed more during Phase 4. Likewise, risks related to logistics were discussed more in Phases 2 and 3 than in Phase 4.
Table 4
Kruskal–Wallis test for Twitter
Risks (topics) | Kruskal–Wallis \(\chi^{2}\) | df | p value | Dunn’s test time frame | Contrast | Adjusted p value
Economic growth and omicron | 13.2 | 3 | 0.0043 | 1–4 | −2.72 | 0.0393
China’s port restrictions | 8.45 | 3 | 0.0375 | 1–4 | −2.76 | 0.0343
Cargo shipping restrictions | 86.1 | 3 | 0.0000 | 1–4 | −3.42 | 0.0000
 |  |  |  | 2–4 | −8.43 | 0.0000
 |  |  |  | 3–4 | −4.69 | 0.0000
Supply chain disruption | 48.8 | 3 | 0.0000 | 1–2 | −4.55 | 0.0000
 |  |  |  | 1–3 | −4.54 | 0.0000
 |  |  |  | 1–4 | 6.06 | 0.0000
 |  |  |  | 2–4 | −3.49 | 0.0028
 |  |  |  | 3–4 | −2.90 | 0.0223
Logistics tension | 30.8 | 3 | 0.0000 | 2–4 | 4.75 | 0.0000
 |  |  |  | 3–4 | 3.78 | 0.0009
Ripple effect supply chain | 16.7 | 3 | 0.0008 | 2–4 | −3.36 | 0.0047
Food price hike | 28.1 | 3 | 0.0000 | 1–2 | −3.31 | 0.0056
 |  |  |  | 1–3 | −2.91 | 0.0216
 |  |  |  | 2–4 | 4.27 | 0.0001
 |  |  |  | 3–4 | 2.95 | 0.0194
China’s response to war | 39.2 | 3 | 0.0000 | 2–4 | −5.33 | 0.0000
 |  |  |  | 3–4 | −3.83 | 0.0000
Last-mile logistics | 74.5 | 3 | 0.0000 | 1–2 | −2.75 | 0.0359
 |  |  |  | 1–4 | −4.87 | 0.0000
 |  |  |  | 2–4 | −5.27 | 0.0000
 |  |  |  | 3–4 | −6.48 | 0.0000
Omicron hit to sale price | 8.66 | 3 | 0.0341 | 1–2 | −2.71 | 0.0067
Supply delay—retailers | 23.6 | 3 | 0.0000 | 1–3 | −2.55 | 0.0648
 |  |  |  | 2–4 | 3.14 | 0.0101
 |  |  |  | 3–4 | 3.99 | 0.0004
Inflation and war | 143 | 3 | 0.0000 | 1–2 | 2.77 | 0.0341
 |  |  |  | 2–3 | −2.99 | 0.0169
 |  |  |  | 2–4 | −11.3 | 0.0000
 |  |  |  | 3–4 | −6.37 | 0.0000
Supply Chain resiliency | 24.5 | 3 | 0.0000 | 1–2 | −3.49 | 0.0029
 |  |  |  | 1–3 | −3.43 | 0.0037
 |  |  |  | 1–4 | −446 | 0.0000

4.4 Assessing risks by sentiment

To understand the effect of the detected risks on the sentiments of news media and social media, we used an ordered choice model. Sentiments were coded as ordered numerical categories, and the ordered logit and ordered probit models were selected as candidate modeling methods. Sentiment categories were the dependent variables, and text-generated risks and time frames were the independent variables. Time frames were used to show the interaction effect with the extracted risks.
From the log-likelihood results of the ordered logit (News Media: −817.80; Twitter: −7810.39) and ordered probit (News Media: −816.78; Twitter: −7810.00) models, the latter shows slightly better estimates, which indicates a better fit for analyzing the effect of text-generated risks on sentiment. Thus, Table 5 presents the final ordered probit model with 13 statistically significant variables, including the interaction effect of both risk and time frame. Significance was set at 10%, which means that if the p-value is less than 0.10, we can reject the null hypothesis and confirm that the variables’ effect on sentiment is statistically significant.
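The two candidate models differ only in the link function, so their log-likelihoods can be computed by the same routine. The sketch below assumes three sentiment categories (0 = negative, 1 = neutral, 2 = positive), a single regressor and hypothetical cutpoints. None of these choices is taken from the source, whose models include several risks and interaction terms.

```python
import math

def normal_cdf(x):
    """Standard normal CDF (probit link)."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def logistic_cdf(x):
    """Standard logistic CDF (logit link)."""
    return 1.0 / (1.0 + math.exp(-x))

def category_probs(xb, cutpoints, cdf):
    """P(y = j) for an ordered model with linear index xb and increasing cutpoints."""
    bounds = [0.0] + [cdf(c - xb) for c in cutpoints] + [1.0]
    return [hi - lo for lo, hi in zip(bounds, bounds[1:])]

def log_likelihood(rows, beta, cutpoints, cdf):
    """rows: (x, y) pairs with one regressor x and an observed category index y."""
    return sum(math.log(category_probs(beta * x, cutpoints, cdf)[y]) for x, y in rows)

beta = -1.20               # news media coefficient for T5 in Table 5
cutpoints = [-0.5, 0.5]    # hypothetical thresholds between categories
p_low = category_probs(beta * 0.0, cutpoints, normal_cdf)    # topic proportion 0
p_high = category_probs(beta * 0.5, cutpoints, normal_cdf)   # topic proportion 0.5

rows = [(0.0, 2), (0.5, 0), (0.2, 1)]   # hypothetical (proportion, sentiment) pairs
ll_probit = log_likelihood(rows, beta, cutpoints, normal_cdf)
ll_logit = log_likelihood(rows, beta, cutpoints, logistic_cdf)
```

With a negative coefficient, raising the topic proportion moves probability mass from the positive category toward the negative one. When two such models have the same number of parameters, the one with the higher (less negative) log-likelihood is preferred, which is the comparison made between −816.78 and −817.80 for news media.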
Table 5
Estimation results of the ordered probit model for news media and Twitter
 
Variable | News media β | News media SE | News media p-value | Twitter β | Twitter SE | Twitter p-value
T5: Risks in Asian Market | −1.20 | 0.49 | 0.015 |  |  | 
T9: Food and Oil Price Hike | 3.33 | 1.19 | 0.005 |  |  | 
T12: Regional Supply Chain | −1.46 | 0.64 | 0.022 |  |  | 
T14: Omicron Hit Sale Price |  |  |  | −0.58 | 0.23 | 0.011
T16: Chip Industry and Shortage | −2.40 | 0.54 | 0.000 |  |  | 
T17: Electronic Parts Production | −1.02 | 0.48 | 0.033 |  |  | 
T18: Supply Shortage—War |  |  |  | −0.31 | 0.19 | 0.091
Phase 1*T17: Inflation and War |  |  |  | −0.58 | 0.33 | 0.077
Phase 1*T9: Logistic Tension |  |  |  | 0.40 | 0.20 | 0.040
Phase 2*T9: Logistic Tension |  |  |  | 0.27 | 0.08 | 0.001
Phase 3*T9: Logistic Tension |  |  |  | 0.43 | 0.09 | 0.000
Phase 3*T16: Chip Industry and Shortage | 0.40 | 0.24 | 0.091 |  |  | 
Phase 3*T19: Supply Chain Shortage | −1.66 | 0.99 | 0.092 |  |  | 
Number of observations | 673 |  |  | 5372 |  | 
LR chi2 | 136.34 |  |  | 126.64 |  | 
Pseudo R-squared | 0.0770 |  |  | 0.0823 |  | 
For news media, the coefficient of T5 “Risks in Asian Market” is negative and statistically significant, which indicates that if the topic proportion of Asian market risks increases by 1%, then positive sentiment will likely decrease. Thus, the overall effect on sentiment is negative. This indicates extreme supply chain risk in the Asian region. This result also shows that the news media reported on possible supply chain disruptions in the Asian region with a negative sentiment.
T12 “Regional Supply Chain” shows extreme negative sentiment and indicates that the news media was generally discussing this topic with negative sentiment. This suggests that regional disruption and risks may increase due to the pandemic and the war. T16 “Chip Industry and Shortage” and T17 “Electronic Parts Production” show possible negative sentiments in the news media.
The interaction of Phase 3 with T19 “Supply Chain Shortage” suggests that this topic might have a time-varying effect: supply chain shortages discussed during Phase 3 were associated with negative sentiment. In contrast, the same time frame might have a positive effect on sentiments for T16 “Chip Industry and Shortage.”
In terms of Twitter, T14 “Omicron Hit on Sales Price” and T18 “Supply Shortage—War” tend to increase users’ negative sentiments. An interesting result was found for the interaction of Phase 1 with possible inflation due to the Ukraine–Russia war (T17). The result shows that inflation was beginning to increase before the war started, and that this topic generated negative sentiment.
Table 5 also shows the overall performance of the ordered probit model for both news media and Twitter. The pseudo-R² for news media is 7.7% and for Twitter is 8.2%, which is a moderately good fit for textual data.
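One common definition of this measure is McFadden’s pseudo-R², which compares the fitted log-likelihood with that of an intercept-only model. The source does not state which definition was used, so the sketch below is an assumption. It also assumes that the LR chi2 in Table 5 is the usual likelihood-ratio statistic against the intercept-only model. Under these assumptions, the news media figures are mutually consistent. The function names are illustrative.

```python
def null_log_likelihood(ll_model, lr_chi2):
    """Recover the intercept-only log-likelihood from LR = 2 * (ll_model - ll_null)."""
    return ll_model - lr_chi2 / 2.0

def mcfadden_pseudo_r2(ll_model, ll_null):
    """McFadden's pseudo-R-squared: 1 - ll_model / ll_null."""
    return 1.0 - ll_model / ll_null

# News media values from the text (log-likelihood) and Table 5 (LR chi2).
ll_null = null_log_likelihood(-816.78, 136.34)
r2 = mcfadden_pseudo_r2(-816.78, ll_null)
```

Here `r2` evaluates to roughly 0.077, in line with the 7.7% reported for news media.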

5 Discussions

In this section, we discuss significant patterns derived from the outcomes across the four phases. During Phase 1 (November 15, 2021, to November 26, 2021), i.e., from the time Omicron was first detected until it was declared a variant of concern, risks associated with logistics were discussed in the highest proportion of all four phases in both media (see Figs. 2 and 4). Among those risks, “Shipping, Port and Logistics” from news and “Supply Chain Disruption,” “Last-mile Logistics” and “Supply Chain Resiliency” from Twitter were statistically significant in terms of topic proportion during Phase 2, Phase 3 and Phase 4 (see Tables 3 and 4). This may imply that discussion of the potential risks involved with these topics persisted in both media until after the start of the Russia–Ukraine conflict. Similarly, the risk associated with “Electric Car Production” was reported most during Phase 1, and the discussion extended into Phase 3 and Phase 4 with statistical significance. This may imply a plausible disruption threat to the automobile supply chain. Moreover, during Phase 1, “Price Surge” and “Export Halt” were widely discussed in news and social media, respectively. The combined result indicates a possible value chain breakdown in the global supply chain: an export halt, caused by the combined interruption of maritime transport by COVID-19 and the conflict, can cause inbound logistics to collapse; product supply may then stop, resulting in price hikes downstream (i.e., on the demand side) of the supply chain.
During Phase 2 (November 27, 2021, to January 19, 2022), i.e., until the Xe variant was detected, risks associated with economic and energy issues emerged. For instance, “Inflation” was discussed more on Twitter in this phase than in any other phase; its variation relative to the other three phases was statistically significant, with a plausible negative sentiment in Phase 1. “Food and Oil Price Hit” was also a dominant topic and persisted until Phase 4 with statistical significance. In addition, “Regional Supply Chain” had its highest proportion during Phase 2 and was discussed with negative sentiment. “Supply Chain Shortage” was also discussed most in this phase, with negative sentiment during Phase 3. This result could imply that COVID-19 was likely to create inflation in the global economy, resulting in possible economic breakdown both globally and regionally. This could in turn create the supply chain shortages that the news media was more concerned about.
During Phase 3 (January 20, 2022, to February 24, 2022), i.e., until the commencement of the Ukraine–Russia war, risks involving material import–export, sanctions and trade came into the limelight and were highlighted more in both media. Risks associated with trade were discussed in both media; the news media, in addition, discussed this issue with negative sentiment, especially regarding the Asian region. It is interesting that possible sanctions on Russia and China’s foreign strategy were discussed more in this phase, before the Russia–Ukraine war actually started. Media commentators might have been warning about the possible geopolitical impact on the supply chain market if war were to take place.
In the phase after the start of the Ukraine–Russia war (Phase 4), specific issues and associated risks were discussed more. For instance, “semiconductor shortage” was widely discussed with negative sentiment across this time frame; “Food Price Hike,” “Supply Delay—Retailers” and “Sales Price” received more attention in this phase, again with negative sentiment. Phase 4 is the period in which the combined effect of COVID-19 and the Ukraine–Russia war was felt. Therefore, the severity and magnitude of these risks could be greater than those of other supply chain risks.
Supply chain policymakers could use this text-based information to support their decision making. For example, the top organizational level, where strategic decisions are generally made, could leverage text-based risk assessment as a qualitative risk assessment and forecast. For instance, in this study, we observed that issues involving ports and shipping, last-mile logistics, etc. extended from Phase 1 to Phase 4. In addition, energy issues, inflation and supply chain shortages persisted from Phase 2 until Phase 4. These risks could be used as a possible forecast for long-term decision making: to set optimum production capacity, to plan inventory efficiently and to allocate resources at the organizational level. From the above outcomes, it can also be noticed that the news media generally discussed the upstream side (i.e., supply side) and social media generally discussed the downstream side (i.e., demand side) of the supply chain. Therefore, this text-based risk assessment could strengthen the link between an organization’s Customer Relationship Management (CRM) and Supplier Relationship Management (SRM). In addition, this text-based information could be integrated with Enterprise Resource Planning (ERP) systems, which might help in tactical planning, routine decision making, and the execution and transaction stages of an organization’s internal supply chain systems. Moreover, natural language-based sentiment analysis could even gauge the magnitude of the risks by measuring positive or negative sentiment, especially among suppliers and consumers. For example, inflation, shortages and trade were more likely than other risks to be discussed with negative sentiment in news and social media. This information might help supply chain policymakers give more weight to these sorts of risks to avoid future disruption.
Moreover, text-based information may add value in supply chain risk management. Rayport and Sviokola (1995) listed three criteria (visibility, mirroring and creating new customer relationships) that usually add value in supply chain information systems. Text-based risk assessment from continuous monitoring of news articles and social media posts could enhance the visibility of physical and monetary flows in the systems. For instance, our results indicated potential risks involving trade and retailers upstream and last-mile logistics downstream. If an organization’s ERP system ingested and analyzed these text-based sources, it could build a picture of parallel risks that might emerge in future, and the organization could therefore prepare for the worst in advance. Thus, text-based risk identification might be able to provide more real-time data that organizations can take advantage of (Bozarth and Hensfield 2019) and ensure visibility of potential risks. In addition, mirroring for specific risks could be designed to replace potential physical risks with potential virtual risks generated from text data in order to test those risks. For example, the news media was discussing possible semiconductor shortages during the COVID-19 time frame; however, this issue emerged after the Ukraine–Russia war had commenced. This particular risk could have been tested in advance, with simulation or by leveraging virtual reality, to assess its likely severity when it was first detected.

6 Conclusion

In this study, we aimed to determine whether analysis of textual data sourced from the news media and Twitter during disruptive events could be beneficial in detecting supply chain risks. We also wished to assess the time-varying nature of the risks generated from the textual data to understand its efficacy, and to understand the sentiments of the news media and social media on the risks that we identified. To achieve these objectives, we used the Latent Dirichlet Allocation (LDA) algorithm to identify risks from news media and Twitter data. We found that the risk proportions varied between November 2021 and April 2022, often changing each date in both types of media in all phases. To understand and compare the significance of each risk in each time frame, we examined the risks using a nonparametric significance test. Finally, we used the ordered probit model to examine the sentiments of the news media and Twitter on each risk.
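The ordered probit step summarized above can be illustrated with a minimal sketch. It assumes three ordered sentiment categories (negative, neutral, positive), a single explanatory variable (the share of a document devoted to one risk topic), and made-up values for the coefficient and cutpoints; none of these numbers come from the study, and the specification actually estimated in the paper may differ. The sketch only shows how a linear index and a set of cutpoints translate into category probabilities through the standard normal distribution function.

```python
from statistics import NormalDist


def ordered_probit_probs(xb, cutpoints):
    """Category probabilities for a linear index xb and increasing cutpoints.

    P(y = j) = Phi(c_j - xb) - Phi(c_{j-1} - xb), with c_0 = -inf and c_J = +inf.
    """
    if list(cutpoints) != sorted(cutpoints):
        raise ValueError("cutpoints must be in increasing order")
    phi = NormalDist().cdf  # standard normal CDF
    # Cumulative probabilities P(y <= j); the last category closes at 1.0.
    cumulative = [phi(c - xb) for c in cutpoints] + [1.0]
    probs = []
    previous = 0.0
    for value in cumulative:
        probs.append(value - previous)
        previous = value
    return probs


# Hypothetical parameters, chosen only for illustration (not estimates from the study).
CUTPOINTS = (-0.5, 0.5)   # thresholds separating negative | neutral | positive
BETA_TOPIC = -0.8         # assumed effect of a topic's share on the sentiment index
LABELS = ("negative", "neutral", "positive")


def sentiment_probs(topic_share):
    """Sentiment probabilities for a document with the given topic share."""
    return ordered_probit_probs(BETA_TOPIC * topic_share, CUTPOINTS)


if __name__ == "__main__":
    for share in (0.0, 0.5):
        rounded = {k: round(p, 3) for k, p in zip(LABELS, sentiment_probs(share))}
        print(f"topic share {share}: {rounded}")
```

With a negative assumed coefficient, a larger topic share shifts probability mass toward the negative category, which is the kind of pattern the study reports for risks such as inflation.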
The results indicate that textual data from the news media and Twitter are able to detect a diverse range of logistics, economic, geopolitical and trade-based risks. Sentiment and time effects add value to the text-generated risks. The news media is primarily concerned with the supply side of the supply chain, while Twitter is mostly concerned with the demand side. Supply chain shortages and port restrictions were identified as possible risks before the war. The news media was able to detect automobile and semiconductor disruptions during Omicron and the conflict. Macroeconomic risks (e.g., inflation) were dominant throughout the time frame and were associated with negative sentiment. Supply chain policymakers can use textual data to monitor phenomena and detect unusual risks and warnings (e.g., regional supply chain issues, real estate stress, etc.), which are not easy to obtain quickly from traditional data. Textual data could also be used to warn the supply chain market about possible disruptions (e.g., the ripple effect, food inflation, etc.). Moreover, risks pertaining to the Ukraine–Russia war, including war-related inflation, were detected well in advance of the start of the war. From the outcomes and discussion, we conclude that text-based data sources, if analyzed properly, could complement traditional supply chain data in risk analysis by providing an important external source of information.
This study has several limitations. For example, we only analyzed articles from four news media sources. The majority of the news media were sourced from the Asian region; therefore, the research outcomes may be representative mainly of the Asian region. In future, we might consider using a more diverse range of news media sources to detect and assess supply chain risks. Twitter data were collected over a short period of time, covering only the Omicron outbreak and the Ukraine–Russia war. A longer time frame could enable detection of a more diverse range of risks. In addition, when interpreted, some topics appear to be correlated in terms of their word combinations. However, LDA could not capture this correlation. Thus, there is plenty of scope to expand upon this study in future research. For example, future studies could analyze data from a broader range of news media sources or use a wider time frame for data collection. LDA could also be integrated with correlation analysis to better understand the correlated topics. More advanced text mining models (e.g., BERT, GPT-2, GPT-3) could also be used for supply chain risk forecasting purposes. Moreover, text-based embedding techniques could be used to forecast supply chain risks from textual data sources using deep learning techniques.
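One simple way to approach the suggested combination of LDA with correlation analysis is to correlate the per-document topic proportions that a fitted topic model already produces. The sketch below assumes such a document-topic matrix is available; the matrix values and the topic names are invented for illustration and are not results from this study. Because the proportions in each document sum to one, some negative correlation between topics arises mechanically, so the coefficients should be read as a rough screening aid rather than as evidence of a substantive relationship.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def topic_correlations(doc_topic, names):
    """Pairwise correlations between the topic columns of a document-topic matrix."""
    columns = list(zip(*doc_topic))  # one tuple of proportions per topic
    result = {}
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            result[(names[i], names[j])] = pearson(columns[i], columns[j])
    return result


# Toy document-topic proportions (each row sums to 1); purely illustrative values.
NAMES = ("inflation", "shortage", "ports")
DOC_TOPIC = [
    [0.6, 0.3, 0.1],
    [0.5, 0.4, 0.1],
    [0.1, 0.1, 0.8],
    [0.2, 0.1, 0.7],
]

if __name__ == "__main__":
    for pair, value in topic_correlations(DOC_TOPIC, NAMES).items():
        print(pair, round(value, 2))
```

In this toy matrix, the hypothetical "inflation" and "shortage" topics rise and fall together across documents, while both move against "ports"; pairs with a high positive coefficient would be candidates for closer inspection or merging.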

Declarations

Competing interests

The authors declare no competing interests.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

References
Anandarajan M, Hill C, Nolan T (2019) Practical text analytics: maximizing the value of the text data. Springer Nature Switzerland
Blei DM, Ng AY, Jordan MI (2003) Latent dirichlet allocation. J Mach Learn Res 3:993–1022
Bozarth CC, Handfield RB (2019) Introduction to operations and supply chain management, 5th edn. Pearson Education, Inc
Caniato F, Rice JB Jr (2003) Building a secure and resilient supply network. Supply Chain Manag Rev 7:22–30
Chopra S, Sodhi MS (2014) Reducing the risk of supply chain disruptions. MIT Sloan Manag Rev 55(3):72–80
Food and Agriculture Organization (2022) Ukraine: Note on the impact of the war on food security in Ukraine – March 25 2022, Rome
Grun B, Hornik K (2011) topicmodels: an R Package for fitting topic models. J Stat Softw 40(13):1–30
Kwartler T (2017) Text mining in practice with R. Wiley publisher, ISBN:978-1-119-28201-3
McCarthy S, Gita A (2023) Enhancing financial market analysis and prediction with emotion corpora and news co-occurrence network. J Risk Financ Manag 16:226
Mohammad SM (2021) Sentiment analysis: Automatically detecting valence, emotions, and other affectual states from text. Arxiv, 2005.11882
Mohammad SM, Turney PD (2013) Crowdsourcing a word-emotion association lexicon. Comput Intell 29
Navarro J, Piña JU, Mas FM, Lahoz-Beltra R (2023) Press media impact of the Cumbre Vieja volcano activity in the island of La Palma (Canary Island): a machine learning and sentiment analysis of the news published during the volcanic eruption of 2021. Int J Disast Risk Reduct 91(1):103694
Qian C, Mathur N, Zakaria NH, Arora R, Gupta V, Ali M (2022) Understanding public opinions on social media for financial sentiment analysis using AI-based techniques. Inf Process Manage 59(6):103098
Rayport J, Sviokla J (1995) Exploiting the virtual value chain. Harv Bus Rev 73(6):75–85
Stanton D (2020) Supply chain management for dummies. Wiley Publishers
Teo WWJ (2020) A natural language processing approach to improve demand forecasting in long supply chains. Masters Thesis, Massachusetts Institute of Technology (MIT)
Tokarchuk O, Barr JC, Cozzio C (2022) How much is too much? Estimating tourism carrying capacity in urban context using sentiment analysis. Tour Manage 91:104522
Upadhyaya A, Fischella M, Nejdl W (2023) Towards sentiment and temporal aided stance detection of climate change tweets. Inf Process Manage 60(4):103325
Wooldridge JM (2012) Introductory econometrics: a modern approach, 5th edn. Pearson, London
Metadata
Title
Assessment of text-generated supply chain risks considering news and social media during disruptive events
Authors
Soumik Nafis Sadeek
Shinya Hanaoka
Publication date
1 December 2023
Publisher
Springer Vienna
Published in
Social Network Analysis and Mining / Issue 1/2023
Print ISSN: 1869-5450
Electronic ISSN: 1869-5469
DOI
https://doi.org/10.1007/s13278-023-01100-0
