Published in: The Journal of Supercomputing 8/2020

11 September 2019

Analysis of interpolation algorithms for the missing values in IoT time series: a case of air quality in Taiwan

Authors: Neil Y. Yen, Jia-Wei Chang, Jia-Yi Liao, You-Ming Yong



Abstract

Missing values are common in Internet of Things (IoT) environments for various reasons, including regular maintenance and malfunction. In IoT time-series prediction, missing values may be related to the target labels, so the pattern of missingness is itself informative. Missing values can therefore be a barrier to accurate prediction and analysis in IoT data mining. Although several methods have been proposed to estimate missing values, few studies have compared interpolation methods across conventional and deep learning models, and relatively little research has addressed interpolation in the IoT environment. To address these gaps, this paper uses linear regression, support vector regression, artificial neural networks, and long short-term memory to make time-series predictions for missing values. Finally, a full comparison and analysis of the interpolation methods is presented. We believe that these findings can be of value to future work in IoT applications.
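To make the idea of predicting missing values from a time series concrete, the following is a minimal sketch of the simplest of the four model families named in the abstract, linear regression. It is not the authors' implementation: the function name `impute_with_lagged_regression`, the use of the preceding `n_lags` readings as features, and the walk-forward filling order are all assumptions made for illustration, since this page does not describe the paper's features or model settings.

```python
import numpy as np

def impute_with_lagged_regression(series, n_lags=3):
    """Fill NaNs in a 1-D series by regressing each value on its n_lags predecessors.

    Hypothetical helper for illustration; the lag-window design is an assumption.
    """
    filled = np.asarray(series, dtype=float).copy()

    # Build training rows only from windows that contain no missing entries.
    X, y = [], []
    for t in range(n_lags, len(filled)):
        window = filled[t - n_lags:t + 1]
        if not np.isnan(window).any():
            X.append(window[:-1])   # the n_lags preceding readings
            y.append(window[-1])    # the reading to predict
    if not X:
        raise ValueError("no complete window of length n_lags + 1 to train on")

    # Ordinary least squares with an intercept column.
    A = np.column_stack([np.ones(len(X)), np.array(X)])
    coef, *_ = np.linalg.lstsq(A, np.array(y), rcond=None)

    # Walk forward in time; values filled earlier feed later predictions.
    for t in range(n_lags, len(filled)):
        if np.isnan(filled[t]):
            lags = filled[t - n_lags:t]
            if not np.isnan(lags).any():
                filled[t] = coef[0] + lags @ coef[1:]
    return filled
```

Gaps in the first `n_lags` positions are left untouched in this sketch, because no full lag window precedes them. Swapping the least-squares fit for a support vector regressor or a neural network would keep the same windowing and filling steps and change only the model that maps the lag window to the predicted value.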


Metadata
Title
Analysis of interpolation algorithms for the missing values in IoT time series: a case of air quality in Taiwan
Authors
Neil Y. Yen
Jia-Wei Chang
Jia-Yi Liao
You-Ming Yong
Publication date
11 September 2019
Publisher
Springer US
Published in
The Journal of Supercomputing / Issue 8/2020
Print ISSN: 0920-8542
Electronic ISSN: 1573-0484
DOI
https://doi.org/10.1007/s11227-019-02991-7
