Published in: The Journal of Supercomputing 3/2020

14-11-2017

An implementation of cloud-based platform with R packages for spatiotemporal analysis of air pollution

Authors: Chao-Tung Yang, Yu-Wei Chan, Jung-Chun Liu, Ben-Shen Lou


Abstract

Recently, R has become a popular tool for big data analysis because of its many mature packages for data analysis and visualization, including the analysis of air pollution. Air pollution is of increasing global concern because it greatly impacts the environment and human health. With the rapid development of the IoT and the increasing accuracy of geographical information collected by sensors, a huge amount of air pollution data has been generated. It is therefore difficult to analyze air pollution data effectively and reliably in a single-machine environment because of its inherent memory limitations. In this work, we construct a distributed computing environment based on both RHadoop and SparkR so that the analysis and visualization of air pollution can be performed with R more reliably and effectively. We first use sensors called EdiGreen AirBox to collect air pollution data in Taichung, Taiwan. Then, we adopt the Inverse Distance Weighting (IDW) method to transform the sensor data into a density map. Finally, the experimental results show the accuracy of short-term PM2.5 predictions made with the ARIMA model. In addition, the prediction accuracy is verified with the mean absolute percentage error (MAPE).
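
The two formulas named in the abstract, Inverse Distance Weighting and MAPE, can be illustrated with a minimal sketch. It is written in plain Python rather than R, and it is not the authors' implementation: the function names `idw` and `mape`, the `(x, y, value)` sensor tuples, and the default distance exponent of 2 are all assumptions made for this example.

```python
import math


def idw(sensors, x, y, power=2.0):
    """Estimate a value at (x, y) by Inverse Distance Weighting.

    `sensors` is a list of (sx, sy, value) tuples (a hypothetical layout).
    Each reading is weighted by 1 / distance**power, so nearby sensors
    dominate the estimate.
    """
    if not sensors:
        raise ValueError("at least one sensor reading is required")
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in sensors:
        distance = math.hypot(x - sx, y - sy)
        if distance == 0.0:
            # The query point coincides with a sensor: use its reading.
            return value
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Every value in `actual` must be non-zero, otherwise the
    percentage error is undefined.
    """
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


if __name__ == "__main__":
    # Two made-up PM2.5 readings; the midpoint is equidistant from both.
    readings = [(0.0, 0.0, 10.0), (2.0, 0.0, 30.0)]
    print(idw(readings, 1.0, 0.0))
    # Made-up observed vs. forecast values.
    print(mape([100.0, 200.0], [110.0, 180.0]))
```

Evaluating `idw` over a regular grid of points would give the kind of density map the abstract describes, and `mape` would be applied to observed PM2.5 values against the forecasts of whatever model is being checked.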

Metadata
Title
An implementation of cloud-based platform with R packages for spatiotemporal analysis of air pollution
Authors
Chao-Tung Yang
Yu-Wei Chan
Jung-Chun Liu
Ben-Shen Lou
Publication date
14-11-2017
Publisher
Springer US
Published in
The Journal of Supercomputing / Issue 3/2020
Print ISSN: 0920-8542
Electronic ISSN: 1573-0484
DOI
https://doi.org/10.1007/s11227-017-2189-1
