Abstract
Particulate matter has major impacts on human health in urban regions, and Tehran, one of the most polluted metropolitan cities in the world, struggles to control this pollutant more than any other contaminant. In this study, PM2.5 concentrations were predicted with three statistical modeling methods: (i) decision tree (DT), (ii) Bayesian network (BN), and (iii) support vector machine (SVM). Data collected over three consecutive years (January 2013 to January 2016) were used to develop the models: the first 2 years served as training data, and measurements from the last year were used to test the models. Twelve parameters, covering meteorological variables and concentrations of several chemical species, were explored as potential predictors of PM2.5. According to the SVM sensitivity analysis of PM2.5 and the explicit equations derived from the BN and DT, PM10, NO2, SO2, and O3 are the most important predictors. The impacts of the predictors on PM2.5 were also assessed, and the chemical precursors showed a stronger influence than the meteorological parameters. The capabilities of the models were compared, and the support vector machine performed best according to the evaluation criteria. Nonetheless, the decision tree and Bayesian network methods also provided acceptable results. We suggest that further studies combining the SVM with other methods as hybrids could lead to improved models.
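The chronological train/test design described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the helper name `chronological_split`, the cutoff date of 1 January 2015, the synthetic daily series (not the Tehran measurements), the training-mean baseline used in place of the DT, BN, and SVM models, and the choice of RMSE and MAE as metrics, since the abstract does not name its evaluation criteria.

```python
from datetime import date, timedelta
import math


def chronological_split(records, cutoff):
    """Split (day, value) records into train (before cutoff) and test (on or after cutoff)."""
    train = [r for r in records if r[0] < cutoff]
    test = [r for r in records if r[0] >= cutoff]
    return train, test


def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(
        sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    )


def mae(observed, predicted):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


# Synthetic daily series with a yearly cycle, for illustration only.
start = date(2013, 1, 1)
records = [
    (start + timedelta(days=i), 30.0 + 10.0 * math.sin(2 * math.pi * i / 365.0))
    for i in range(3 * 365)
]

# Hypothetical cutoff: the first two years train, the last year tests.
cutoff = date(2015, 1, 1)
train, test = chronological_split(records, cutoff)

# Naive baseline standing in for a fitted model: predict the training mean.
train_mean = sum(value for _, value in train) / len(train)
observed = [value for _, value in test]
predicted = [train_mean] * len(test)

print("train days:", len(train), "test days:", len(test))
print("RMSE:", round(rmse(observed, predicted), 2))
print("MAE:", round(mae(observed, predicted), 2))
```

Splitting by date rather than at random keeps every test observation later than all of the training data, which mirrors how a forecasting model would be used in practice. A real comparison would replace the baseline with each fitted model and score them all on the same held-out year.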
Acknowledgements
This article is dedicated to the memory of Professor S. A. Sadrnejad, whose loss we deeply regret. The authors are also grateful to Farzin Homayounfar, Dr. E. Kouhestani, and other collaborators for their invaluable comments.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Mehdipour, V., Stevenson, D.S., Memarianfard, M. et al. Comparing different methods for statistical modeling of particulate matter in Tehran, Iran. Air Qual Atmos Health 11, 1155–1165 (2018). https://doi.org/10.1007/s11869-018-0615-z
DOI: https://doi.org/10.1007/s11869-018-0615-z