An artificial neural network ensemble approach to generate air pollution maps

Van Roode, S.; Ruiz-Aguilar, J. J.; González-Enrique, J.; Turias, I. J.

doi:10.1007/s10661-019-7901-6

Published: 07 November 2019

Environmental Monitoring and Assessment, volume 191, article number 727 (2019)

S. Van Roode¹,
J. J. Ruiz-Aguilar²,
J. González-Enrique¹ &
I. J. Turias¹

763 Accesses
23 Citations

Abstract

The objective of this research is to propose an artificial neural network (ANN) ensemble to estimate the hourly NO₂ concentration at unsampled locations. Spatial interpolation methods and regularized linear regression models have been compared as the base models of the ensemble. The case study is the region of the Bay of Algeciras (Spain), a heavily industrialized area with dense traffic. Air pollution data were collected from the monitoring network maintained by the Andalusian Government in the region. On the one hand, two very different methods, inverse distance weighting (IDW) and the least absolute shrinkage and selection operator (LASSO), have been used and compared to generate maps of pollutant concentration values. On the other hand, an ensemble approach has been developed using the outputs of these models. The ensemble is based on an ANN trained with backpropagation. An experimental procedure using cross-validation has been applied to compare the models on several performance indexes (correlation coefficient R, MSE, MAE and the d index of fitness), together with the Friedman test and the Bonferroni correction. The results reveal that the proposed ensemble approach generally performs better than the single models. A validation procedure has been conducted using a leave-one-out strategy over the monitoring stations. IDW yields an average R of 0.72, a maximum R of 0.87, a minimum MSE of 78.00, a minimum MAE of 5.841 and a maximum d of 0.913. LASSO yields an average R of 0.76, a maximum R of 0.86, a minimum MSE of 59.13, a minimum MAE of 5.490 and a maximum d of 0.900. Finally, the ANN ensemble yields an average R of 0.77, a maximum R of 0.87, a minimum MSE of 54.05, a minimum MAE of 4.972 and a maximum d of 0.915.
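The IDW baseline, the leave-one-out validation and the four performance indexes can be illustrated with a minimal Python sketch. It is not the authors' implementation: the station coordinates and readings are hypothetical, the power parameter of 2 is an assumed default, d is taken to be Willmott's index of agreement, and the scores are computed across stations for a single hour rather than over each station's hourly series.

```python
import math


def idw_estimate(target_xy, stations, power=2.0):
    """Inverse distance weighting: a weighted mean of the station values,
    where each weight is 1 / distance**power (power=2 is an assumption)."""
    weighted_sum = 0.0
    weight_total = 0.0
    for (x, y), value in stations:
        dist = math.hypot(target_xy[0] - x, target_xy[1] - y)
        if dist == 0.0:
            return value  # target coincides with a station
        weight = 1.0 / dist ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


def leave_one_out(stations, power=2.0):
    """Estimate every station from all the other stations."""
    estimates = []
    for i, (xy, _value) in enumerate(stations):
        others = stations[:i] + stations[i + 1:]
        estimates.append(idw_estimate(xy, others, power))
    return estimates


def scores(observed, predicted):
    """Return R, MSE, MAE and d for paired observations and predictions."""
    n = len(observed)
    pairs = list(zip(observed, predicted))
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    sq_err = sum((p - o) ** 2 for o, p in pairs)
    abs_err = sum(abs(p - o) for o, p in pairs)
    cov = sum((o - mean_o) * (p - mean_p) for o, p in pairs)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    # Denominator of Willmott's index of agreement (assumed meaning of d).
    spread = sum((abs(p - mean_o) + abs(o - mean_o)) ** 2 for o, p in pairs)
    nan = float("nan")
    r = cov / math.sqrt(var_o * var_p) if var_o > 0 and var_p > 0 else nan
    d = 1.0 - sq_err / spread if spread > 0 else nan
    return {"R": r, "MSE": sq_err / n, "MAE": abs_err / n, "d": d}


# Hypothetical stations: ((x_km, y_km), NO2 reading). Not data from the study.
stations = [((0.0, 0.0), 20.0), ((3.0, 0.0), 35.0),
            ((0.0, 4.0), 28.0), ((3.0, 4.0), 40.0)]
observed = [value for _xy, value in stations]
predicted = leave_one_out(stations)
print(scores(observed, predicted))
```

The same `scores` function could be applied to the outputs of any of the compared models, which is what makes a side-by-side comparison of R, MSE, MAE and d possible.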
The main objective has been to produce adequate atmospheric pollutant concentration maps and, therefore, to obtain estimates for locations other than the monitoring stations. A further objective has been to obtain a system that produces robust measurements. Such a system could be useful for missing data imputation and for detecting reading errors (e.g. unexpected deviations or calibration problems) at some nodes of the network.
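One way such a system might be used for imputation and error detection is sketched below in Python. This is an illustration under assumptions, not a procedure described in the article: the function name `screen_readings`, the station labels and the fixed tolerance are hypothetical, and the estimates are assumed to come from any model fitted on the other stations (IDW, LASSO or the ensemble).

```python
def screen_readings(readings, estimates, tolerance):
    """Compare each station reading with a model estimate built from the
    other stations. Missing readings (None) are replaced by the estimate;
    readings further than `tolerance` from the estimate are flagged."""
    result = {}
    for station, estimate in estimates.items():
        observed = readings.get(station)
        if observed is None:
            result[station] = ("imputed", estimate)
        elif abs(observed - estimate) > tolerance:
            result[station] = ("suspect", observed)
        else:
            result[station] = ("ok", observed)
    return result


# Hypothetical readings and estimates; the tolerance of 15.0 is arbitrary.
readings = {"s1": 20.0, "s2": None, "s3": 90.0}
estimates = {"s1": 22.0, "s2": 30.0, "s3": 35.0}
for station, (status, value) in screen_readings(readings, estimates, 15.0).items():
    print(station, status, value)
```

In practice the tolerance would need to be chosen per station, for instance from the spread of past estimation errors, rather than fixed globally as it is here.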

Acknowledgements

This work is part of the research project TIN2014-58516-C2-2-R, supported by MICINN (Ministerio de Economía y Competitividad, Spain). The data have been kindly provided by the Junta de Andalucía (Spain).

Author information

Authors and Affiliations

Intelligent Modelling of Systems, Department of Computer Science Engineering, University of Cádiz, Polytechnic School of Engineering, 11202, Algeciras, Spain
S. Van Roode, J. González-Enrique & I. J. Turias
Intelligent Modelling of Systems, Department of Civil and Industrial Engineering, University of Cádiz, Polytechnic School of Engineering, 11202, Algeciras, Spain
J. J. Ruiz-Aguilar

Corresponding author

Correspondence to S. Van Roode.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Van Roode, S., Ruiz-Aguilar, J.J., González-Enrique, J. et al. An artificial neural network ensemble approach to generate air pollution maps. Environ Monit Assess 191, 727 (2019). https://doi.org/10.1007/s10661-019-7901-6

Received: 28 March 2019
Accepted: 17 October 2019
Published: 07 November 2019
DOI: https://doi.org/10.1007/s10661-019-7901-6
