
2024 | OriginalPaper | Chapter

Continuous Versus Categorical Imputation Method for Unobserved Count with Zero-Inflation

Authors: Reem Elsousy, Abdel-Salam G. Abdel-Salam

Published in: Mathematical Analysis and Numerical Methods

Publisher: Springer Nature Singapore


Abstract

This chapter addresses the challenge of missing data in count variables with zero-inflation, focusing on the imputation methods Predictive Mean Matching (PMM) and Multinomial Logistic Regression (MLR). It begins by discussing the types of missing data and the importance of accurate imputation to maintain data integrity. The study evaluates the performance of PMM and MLR in an empirical application using traffic accident data, comparing their effectiveness in preserving the original data distribution and model fit. Additionally, a simulation study is conducted to further validate the findings. The chapter highlights the strengths and limitations of both methods, offering insights into their suitability for different scenarios. The results indicate that while both methods can effectively impute missing data, MLR tends to achieve better model fit, particularly with larger sample sizes. The chapter concludes with practical recommendations for the use of these methods in real-world applications.
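As a rough illustration of the first method named in the abstract, the sketch below applies a simplified form of predictive mean matching to a simulated zero-inflated count. It is not the chapter's implementation: the simulated data, the plain least-squares predictor, the donor-pool size `k`, and the helper name `pmm_impute` are all assumptions made for illustration. A full multiple-imputation version would also draw the regression coefficients from their posterior before matching, which is omitted here.

```python
import numpy as np

def pmm_impute(X, y, missing, k=5, rng=None):
    """Simplified predictive mean matching (hypothetical helper, single imputation).

    X: (n, p) covariates; y: (n,) counts, ignored where `missing` is True;
    missing: (n,) boolean mask. Each missing entry is filled with the observed
    value of a donor drawn at random from the k cases whose predicted means
    are closest to the predicted mean of the missing case.
    """
    rng = np.random.default_rng() if rng is None else rng
    X1 = np.column_stack([np.ones(len(y)), X])   # add an intercept column
    obs = ~missing
    # Fit a linear predictor on the observed cases only.
    beta, *_ = np.linalg.lstsq(X1[obs], y[obs], rcond=None)
    pred = X1 @ beta                              # predicted means for all cases
    donors_pred, donors_y = pred[obs], y[obs]
    out = y.copy()
    for i in np.flatnonzero(missing):
        dist = np.abs(donors_pred - pred[i])      # distance in predicted-mean space
        nearest = np.argsort(dist)[:k]            # indices of the k closest donors
        out[i] = donors_y[rng.choice(nearest)]    # borrow one donor's observed value
    return out

# Simulated zero-inflated Poisson outcome (assumed parameters, for illustration only).
rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
structural_zero = rng.random(n) < 0.4             # extra zeros beyond the Poisson part
y = np.where(structural_zero, 0, rng.poisson(np.exp(0.5 + 0.3 * x))).astype(float)

missing = rng.random(n) < 0.2                     # values missing completely at random
y_obs = y.copy()
y_obs[missing] = np.nan

y_imp = pmm_impute(x.reshape(-1, 1), y_obs, missing, k=5, rng=rng)
```

Because every imputed value is copied from an observed case, the imputations in this sketch are always non-negative integers that actually occur in the data, including zeros, even though the underlying predictor is a continuous linear model.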


Copyright Year: 2024
DOI: https://doi.org/10.1007/978-981-97-4876-1_1
