Published in: Neural Computing and Applications 11/2020

10 June 2019 | Original Article

A data ensemble approach for real-time air quality forecasting using extremely randomized trees and deep neural networks

Authors: Ebrahim Eslami, Ahmed Khan Salman, Yunsoo Choi, Alqamah Sayeed, Yannic Lops

Abstract

Six generalized machine learning (ML) ensemble models were developed to forecast real-time hourly ozone concentrations for the following day. The models were used to produce next-day hourly ozone forecasts for all of 2017 in the city of Seoul, South Korea. The training dataset was built from observed meteorology and air pollution parameters for the 2014–2016 period. The ensemble models fuse two regression models: a low-ozone peak model and a high-ozone model. For both, extremely randomized trees and deep neural networks were used. A regularization approach was also adopted that adjusts the model toward capturing higher ozone peaks by resampling the training dataset based on the peaks. Adopting the proposed ML ensemble forecasting method in place of single-model ML techniques in mainstream air quality forecasting practice would be beneficial for several reasons. First, the proposed method, which captures daily maximum ozone concentrations during the high-ozone season (April–September), reduces the ozone peak prediction error by 5 to 30 ppb. Second, whereas station-specific (independent) ML models are trained on data in which low-ozone values are more frequent, the proposed models are trained with a uniformly distributed dataset, so they are more generalizable. As a result, unlike station-specific models, they retain their accuracy (yearly index of agreement, IOA = 0.84–0.89) at all stations, with an increase in IOA. The proposed models also make predictions several times faster, requiring only one-time training to predict for an entire station network. Based on a categorical analysis of the training dataset, an algorithm was proposed for selecting the most suitable model for each month. The “best” model further improves the accuracy of both the ML ensemble and individual models by up to 2.4%. This study shows that the ML ensemble modeling approach is a fast, reliable, and robust technique that can benefit environmental decision-makers in urban regions.
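
The abstract reports accuracy as IOA. Assuming this refers to Willmott's index of agreement in its standard form, IOA = 1 − Σ(Pᵢ − Oᵢ)² / Σ(|Pᵢ − Ō| + |Oᵢ − Ō|)², the metric can be sketched as below. The function name and the sample values are illustrative assumptions and are not taken from the article.

```python
from typing import Sequence


def index_of_agreement(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Willmott's index of agreement: 1.0 means a perfect match, 0.0 no agreement.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    if len(observed) == 0 or len(observed) != len(predicted):
        raise ValueError("observed and predicted must be non-empty and equally long")

    mean_obs = sum(observed) / len(observed)

    # Numerator: sum of squared prediction errors.
    squared_error = sum((p - o) ** 2 for o, p in zip(observed, predicted))

    # Denominator: "potential error", with both series measured
    # against the mean of the observations.
    potential_error = sum(
        (abs(p - mean_obs) + abs(o - mean_obs)) ** 2
        for o, p in zip(observed, predicted)
    )
    if potential_error == 0.0:
        raise ValueError("IOA is undefined when both series equal the observed mean")

    return 1.0 - squared_error / potential_error


# Made-up hourly ozone values in ppb, for illustration only.
obs = [20.0, 40.0, 60.0]
pred = [30.0, 40.0, 50.0]
ioa = index_of_agreement(obs, pred)
```

Because the denominator grows with the spread of both series around the observed mean, a forecast that flattens the peaks is penalized even when its average error is small, which may be one reason the metric suits peak-focused ozone evaluation.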

Metadata
Title
A data ensemble approach for real-time air quality forecasting using extremely randomized trees and deep neural networks
Authors
Ebrahim Eslami
Ahmed Khan Salman
Yunsoo Choi
Alqamah Sayeed
Yannic Lops
Publication date
10 June 2019
Publisher
Springer London
Published in
Neural Computing and Applications / Issue 11/2020
Print ISSN: 0941-0643
Electronic ISSN: 1433-3058
DOI
https://doi.org/10.1007/s00521-019-04287-6
