2020 | OriginalPaper | Chapter

GRNN++: A Parallel and Distributed Version of GRNN Under Apache Spark for Big Data Regression

Authors: Sk. Kamaruddin, Vadlamani Ravi

Published in: Data Management, Analytics and Innovation

Publisher: Springer Singapore

Abstract

Among neural network architectures for prediction, the multi-layer perceptron (MLP), radial basis function (RBF) network, wavelet neural network (WNN), general regression neural network (GRNN), and group method of data handling (GMDH) are popular. Of these, GRNN is preferable because it involves single-pass learning and produces reasonably good results. Despite its single-pass learning, GRNN cannot handle big datasets, because its pattern layer is required to store all the cluster centers obtained after clustering all the samples. This paper therefore proposes a hybrid architecture, GRNN++, which makes GRNN scalable for big data by invoking K-means||, a parallel and distributed version of K-means++, in the pattern layer of GRNN. The whole architecture is implemented in the distributed parallel computational architecture of Apache Spark with HDFS. The performance of GRNN++ was measured on a gas sensor dataset of 613 MB under a ten-fold cross-validation setup. The proposed GRNN++ produces a very low mean squared error (MSE). The primary motivation of this article is to present a distributed and parallel version of the traditional GRNN.
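
The idea in the abstract, a GRNN whose pattern layer holds cluster centers rather than every sample, can be sketched in plain Python. This is a minimal single-machine illustration and not the authors' Spark implementation: it uses serial k-means with k-means++ seeding as a stand-in for K-means||, it clusters the joint (input, target) vectors so that each center carries a target value, and it applies a Gaussian kernel with a hand-picked smoothing parameter `sigma`. The function names, the joint clustering, and all parameter values are assumptions made for illustration only.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_pp(points, k, iters=20, seed=0):
    """Serial k-means with k-means++ seeding (assumed stand-in for K-means||)."""
    rng = random.Random(seed)
    centers = [list(rng.choice(points))]
    # Seeding: pick each new center with probability proportional to its
    # squared distance from the nearest center chosen so far.
    while len(centers) < k:
        d2 = [min(sq_dist(p, c) for c in centers) for p in points]
        total = sum(d2)
        if total == 0:  # every point already coincides with a center
            centers.append(list(rng.choice(points)))
            continue
        r = rng.uniform(0, total)
        acc = 0.0
        for p, d in zip(points, d2):
            acc += d
            if acc >= r:
                centers.append(list(p))
                break
    # Lloyd iterations: assign points to the nearest center, then recompute means.
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            j = min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))
            groups[j].append(p)
        centers = [
            [sum(col) / len(g) for col in zip(*g)] if g else c
            for g, c in zip(groups, centers)
        ]
    return centers


def grnn_predict(x, centers, sigma=0.5):
    """GRNN output: kernel-weighted average of the targets stored in the centers.

    Each center is [input_1, ..., input_d, target]; this layout is an assumption.
    """
    weights = [math.exp(-sq_dist(x, c[:-1]) / (2 * sigma ** 2)) for c in centers]
    denom = sum(weights)
    if denom == 0.0:  # all kernels underflowed; fall back to the nearest center
        return min(centers, key=lambda c: sq_dist(x, c[:-1]))[-1]
    return sum(w * c[-1] for w, c in zip(weights, centers)) / denom


def mse(predictions, targets):
    """Mean squared error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


if __name__ == "__main__":
    # Toy data: y = 2x on [0, 1]; each row is (x, y).
    data = [(i / 199, 2 * i / 199) for i in range(200)]
    centers = kmeans_pp(data, k=10, seed=1)
    preds = [grnn_predict([x], centers, sigma=0.1) for x, _ in data]
    print("pattern nodes:", len(centers))
    print("training MSE:", mse(preds, [y for _, y in data]))
```

In this sketch the pattern layer holds only `k` centers, so prediction cost scales with `k` rather than with the number of training samples. A distributed version would replace `kmeans_pp` with a parallel clustering step and evaluate the kernel sums across partitions.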

Metadata
Title
GRNN++: A Parallel and Distributed Version of GRNN Under Apache Spark for Big Data Regression
Authors
Sk. Kamaruddin
Vadlamani Ravi
Copyright Year
2020
Publisher
Springer Singapore
DOI
https://doi.org/10.1007/978-981-32-9949-8_16