18-11-2024 | Original Article

Statistical Data-Driven Modelling and Forecasting: An Application to COVID-19 Pandemic

Authors: Shalabh, Subhra Sankar Dhar, Sabara Parshad Rajeshbhai

Published in: Annals of Data Science

Abstract

One of the key objectives of statistics is to provide a model compatible with the data generated by an unknown random process. Often the unknown process is intractable, and no prior data or information about it is available. Under such circumstances, well-known techniques such as regression modelling may not work. An alternative approach is to observe the general features of the process from the available data, fit a suitable statistical distribution, such as a mixture of certain distributions, to those data, and predict future observations from the fitted model. The COVID-19 pandemic is one example: because it occurred for the first time, no prior data were available to understand its behaviour and progression. In such cases, a data-based statistical modelling procedure can be adopted to predict future occurrences from a small data set. This article presents such an application-oriented, data-based statistical modelling procedure, with an implementation on COVID-19 data. The proposed procedure can be used for modelling and forecasting a wide range of future events.
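
The abstract does not say which distributions are mixed or how the mixture is fitted, so the following is only a minimal sketch of the general idea of fitting a mixture to a small data set. It assumes a two-component, one-dimensional Gaussian mixture fitted by the EM algorithm. The function names, the synthetic data, and the quartile-based initialisation are all hypothetical choices made for illustration, not the authors' procedure.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_two_component_mixture(data, n_iter=200):
    """Fit a two-component 1-D Gaussian mixture by EM.

    Returns (weights, means, sds). Illustrative only: fixed number of
    iterations, no convergence check, no model selection.
    """
    srt = sorted(data)
    n = len(srt)
    # Crude initialisation (an assumption): lower and upper quartiles as means.
    means = [srt[n // 4], srt[(3 * n) // 4]]
    overall_mean = sum(srt) / n
    overall_sd = math.sqrt(sum((x - overall_mean) ** 2 for x in srt) / n)
    sds = [overall_sd, overall_sd]
    weights = [0.5, 0.5]

    for _ in range(n_iter):
        # E-step: responsibility of each component for each observation.
        resp = []
        for x in data:
            p = [weights[k] * normal_pdf(x, means[k], sds[k]) for k in range(2)]
            total = max(p[0] + p[1], 1e-300)  # guard against underflow
            resp.append([p[0] / total, p[1] / total])
        # M-step: re-estimate weights, means and standard deviations.
        for k in range(2):
            rk = sum(r[k] for r in resp)
            weights[k] = rk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / rk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / rk
            sds[k] = max(math.sqrt(var), 1e-6)
    return weights, means, sds


def mixture_pdf(x, weights, means, sds):
    """Density of the fitted mixture at x."""
    return sum(w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sds))


# Synthetic stand-in for a small observed data set (not COVID-19 data).
rng = random.Random(0)
data = [rng.gauss(10.0, 2.0) for _ in range(150)]
data += [rng.gauss(30.0, 3.0) for _ in range(100)]

weights, means, sds = fit_two_component_mixture(data)
# Mean of the fitted mixture, usable as a simple point prediction.
predicted_mean = sum(w * m for w, m in zip(weights, means))
```

Once fitted, `mixture_pdf` gives the estimated density at any value, and `predicted_mean` gives one simple point summary. A real application would also need to choose the number and family of components and to quantify the uncertainty of the fit.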

Metadata
Title
Statistical Data-Driven Modelling and Forecasting: An Application to COVID-19 Pandemic
Authors
Shalabh
Subhra Sankar Dhar
Sabara Parshad Rajeshbhai
Publication date
18-11-2024
Publisher
Springer Berlin Heidelberg
Published in
Annals of Data Science
Print ISSN: 2198-5804
Electronic ISSN: 2198-5812
DOI
https://doi.org/10.1007/s40745-024-00583-8
