Abstract
This study addresses how to determine the most relevant variables for estimating hourly NO2 concentrations in a monitoring network located in the Bay of Algeciras (Spain). For each station in the network, hourly estimation models were built using artificial neural networks and multiple linear regression. Meteorological variables and hourly NO2 concentrations from nearby stations were used as inputs, with a feature selection procedure applied as a preliminary step. The resulting models were compared statistically. For each station, the inputs used by the best estimation model were considered the most important for estimating its hourly NO2 concentrations. Such estimations can be a useful resource for providing monitoring networks with autonomous capabilities such as automatic decalibration detection or missing data imputation. Finally, the similarities between stations, according to the relevance of the variables, were analysed with the aid of a hierarchical clustering algorithm.
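The final step described above, grouping stations by which variables are relevant to them, can be illustrated with a minimal sketch. This is not the authors' implementation: the station names, the input sets, the Jaccard distance and the average linkage criterion are all assumptions chosen for illustration, and the data are invented.

```python
from itertools import combinations

# Hypothetical data: for each station, the set of inputs kept by its best model.
# Station and variable names are illustrative only, not taken from the study.
selected_inputs = {
    "station_A": {"wind_speed", "wind_dir", "no2_station_B"},
    "station_B": {"wind_speed", "wind_dir", "no2_station_A"},
    "station_C": {"temperature", "humidity", "no2_station_D"},
    "station_D": {"temperature", "humidity", "no2_station_C", "wind_speed"},
}


def jaccard_distance(a, b):
    """One minus the ratio of shared inputs to all inputs (0 = identical sets)."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def average_linkage(c1, c2):
    """Mean pairwise distance between the members of two clusters."""
    dists = [
        jaccard_distance(selected_inputs[s], selected_inputs[t])
        for s in c1
        for t in c2
    ]
    return sum(dists) / len(dists)


def agglomerate(stations, n_clusters):
    """Repeatedly merge the two closest clusters until n_clusters remain."""
    clusters = [frozenset([s]) for s in sorted(stations)]
    merges = []
    while len(clusters) > n_clusters:
        c1, c2 = min(
            combinations(clusters, 2),
            key=lambda pair: average_linkage(*pair),
        )
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
        merges.append((sorted(c1), sorted(c2), average_linkage(c1, c2)))
    return clusters, merges


clusters, merges = agglomerate(selected_inputs, n_clusters=2)
for left, right, dist in merges:
    print(f"merged {left} + {right} at distance {dist:.2f}")
```

With this toy data, stations A and B end up in one cluster and stations C and D in the other, because each pair shares most of its selected inputs. The distance measure and linkage criterion are the main design choices in a real analysis, and different choices can produce different groupings.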
Acknowledgements
This work is part of the coordinated research projects TIN2014-58516-C2-1-R and TIN2014-58516-C2-2-R supported by MICINN (Ministerio de Economía y Competitividad-Spain). Monitoring data have been kindly provided by the Environmental Agency of the Andalusian Government.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
González-Enrique, J., Turias, I.J., Ruiz-Aguilar, J.J. et al. Spatial and meteorological relevance in NO2 estimations: a case study in the Bay of Algeciras (Spain). Stoch Environ Res Risk Assess 33, 801–815 (2019). https://doi.org/10.1007/s00477-018-01644-0
DOI: https://doi.org/10.1007/s00477-018-01644-0