Improving estimation of missing data in historical monthly precipitation by evolutionary methods in the semi-arid area

Environment, Development and Sustainability

Abstract

Precipitation is among the main variables in weather and climate studies, and the length of the statistical period plays a pivotal role in analyzing it accurately. One limitation of meteorological station records is missing data, and analysis based on incomplete data is biased. Historical monthly precipitation for five synoptic stations in Iran (Mashhad, Isfahan, Tehran, Bushehr, and Jask) is available from 1880 onward, but with missing data. In particular, the data for 1941–1949 have a gap that falls during and just after World War II (1939–1945). The present study used several classic and meta-heuristic methods to estimate these missing data, with the root mean square error (RMSE) as the comparison criterion. The neighboring stations of Iran were selected as the independent variables for estimating the missing rainfall data. First, the missing data were restored by fitting several new regression models for monthly precipitation (RMSEs of 9.79, 7.89, 13.43, 6.65, and 20.96 mm). Then, the parameters of the regression models were optimized with a genetic algorithm (GA) and ant colony optimization (ACO), which reduced the RMSEs to 2.56, 2.51, 3.49, 2.48, and 4.02 mm. Artificial neural network (ANN) and support vector regression (SVR) models were also fitted, but they did not improve the accuracy of the estimated data. The missing data were therefore imputed using the evolutionary methods (GA and ACO). As a result, the statistical period of the stations now exceeds 125 years, and the data can serve as a valuable basis for studies of water resources, drought, trend evaluation, climate change, and global warming.
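The two-stage idea in the abstract (fit a regression on neighboring stations, then let an evolutionary search refine its parameters against RMSE) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the authors' setup: the data are synthetic, the model form `b0 + b1*x1**p1 + b2*x2**p2` is hypothetical, and `genetic_search` is a deliberately small real-coded GA written for this example. The RMSE is computed in-sample only.

```python
import numpy as np

def rmse(observed, estimated):
    """Root mean square error, in the units of the data (mm here)."""
    observed = np.asarray(observed, dtype=float)
    estimated = np.asarray(estimated, dtype=float)
    return float(np.sqrt(np.mean((observed - estimated) ** 2)))

def predict(params, x):
    """Hypothetical model form (an assumption): b0 + b1*x1**p1 + b2*x2**p2."""
    b0, b1, p1, b2, p2 = params
    # Keep exponents in a safe range so zero-rain months stay finite.
    p1, p2 = np.clip([p1, p2], 0.1, 3.0)
    return b0 + b1 * x[:, 0] ** p1 + b2 * x[:, 1] ** p2

def genetic_search(x, y, start, generations=200, pop_size=60, sigma=0.1, seed=0):
    """Small real-coded GA: tournament selection, blend crossover,
    Gaussian mutation, and elitism (the best individual always survives)."""
    rng = np.random.default_rng(seed)
    start = np.asarray(start, dtype=float)
    pop = start + rng.normal(0.0, sigma, size=(pop_size, start.size))
    pop[0] = start  # keep the starting point so the result is never worse
    scores = np.array([rmse(y, predict(p, x)) for p in pop])
    for _ in range(generations):
        elite = pop[np.argmin(scores)].copy()
        # Tournament selection: the better of two random individuals wins.
        a, b = rng.integers(0, pop_size, size=(2, pop_size))
        parents = np.where((scores[a] < scores[b])[:, None], pop[a], pop[b])
        # Blend crossover with a randomly shuffled partner.
        partners = parents[rng.permutation(pop_size)]
        w = rng.random((pop_size, 1))
        children = w * parents + (1.0 - w) * partners
        # Gaussian mutation, then reinsert the elite unchanged.
        children += rng.normal(0.0, sigma, size=children.shape)
        children[0] = elite
        pop = children
        scores = np.array([rmse(y, predict(p, x)) for p in pop])
    best = pop[np.argmin(scores)]
    return best, float(scores.min())

# Synthetic monthly totals (mm) for two neighboring stations and one target.
data_rng = np.random.default_rng(42)
neighbors = data_rng.gamma(shape=2.0, scale=15.0, size=(120, 2))
target = np.maximum(
    2.0 + 0.8 * neighbors[:, 0] ** 0.9 + 0.3 * neighbors[:, 1] ** 1.1
    + data_rng.normal(0.0, 2.0, size=120),
    0.0,
)

# Stage 1: ordinary least squares with both exponents fixed at 1.
design = np.column_stack([np.ones(len(target)), neighbors])
b0, b1, b2 = np.linalg.lstsq(design, target, rcond=None)[0]
start = [b0, b1, 1.0, b2, 1.0]
rmse_start = rmse(target, predict(start, neighbors))

# Stage 2: GA refinement of all five parameters, scored by RMSE.
best, rmse_ga = genetic_search(neighbors, target, start)
print(f"RMSE after least squares: {rmse_start:.2f} mm")
print(f"RMSE after GA refinement: {rmse_ga:.2f} mm")
```

Because the elite individual is carried forward each generation, the refined RMSE can never exceed the starting RMSE. How much it improves depends on the model form and data, so this sketch says nothing about the magnitudes reported in the abstract. In practice the fitted parameters would be applied to the neighboring stations' values in the gap years to produce the imputed series.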

Notes

  1. Hyperbolic tangent.

Author information

Corresponding author

Correspondence to Mahboobeh Farzandi.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Farzandi, M., Sanaeinejad, H., Rezaei-Pazhan, H. et al. Improving estimation of missing data in historical monthly precipitation by evolutionary methods in the semi-arid area. Environ Dev Sustain 24, 8313–8332 (2022). https://doi.org/10.1007/s10668-021-01784-4

  • DOI: https://doi.org/10.1007/s10668-021-01784-4
