Published in: Soft Computing 1/2024

4 November 2023 | Data analytics and machine learning

Examining parallelization in kernel regression

Authors: Orcun Oltulu, Fulya Gokalp Yavuz


Abstract

For a few decades, parallelization has been a growing trend in statistical computing, and researchers have put significant effort into converting or adapting known statistical methods and algorithms to run in parallel. The main drivers of this transition are the rapid growth in the size and volume of data and accelerating hardware development. Divide and (re)combine (DnR) is a parallelization method in which the data or the method is split into smaller pieces, each piece is processed separately, and the results are recombined. DnR can be applied to most regression methods used to reveal relationships in data. Although libraries implementing this approach exist in several programming languages for many regression methods, it has not yet been used for kernel regression. Kernel regression, however, is relatively slow to compute, so parallelization is a useful strategy for reducing its calculation time. In this study, we aim to demonstrate how time efficiency can be achieved by applying DnR methods to kernel regression with several parallelization strategies in R. The results indicate that computation time can be reduced proportionally, at the cost of a trade-off between time and accuracy.
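
The divide and (re)combine idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' R implementation: it is a Python illustration, and the function names `nw_predict` and `dnr_predict`, the Gaussian kernel, the fixed bandwidth, the random split into chunks, and the recombination by simple averaging are all assumptions made for the example. Each chunk gets its own Nadaraya-Watson fit, the fits run concurrently, and the per-chunk predictions are averaged.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor


def nw_predict(x_train, y_train, x_eval, bandwidth):
    """Nadaraya-Watson estimate with a Gaussian kernel at each point of x_eval."""
    preds = []
    for x0 in x_eval:
        # Kernel weight of every training point relative to x0.
        weights = [math.exp(-0.5 * ((x0 - xi) / bandwidth) ** 2) for xi in x_train]
        total = sum(weights)
        weighted_sum = sum(w * yi for w, yi in zip(weights, y_train))
        # If all weights underflow to zero, no estimate is available at x0.
        preds.append(weighted_sum / total if total > 0 else float("nan"))
    return preds


def dnr_predict(x, y, x_eval, bandwidth, n_chunks=4):
    """Hypothetical DnR wrapper: split the data, fit each chunk, average the fits."""
    # Divide: shuffle the indices so that each chunk is a random subsample.
    idx = list(range(len(x)))
    random.Random(0).shuffle(idx)
    chunks = [idx[i::n_chunks] for i in range(n_chunks)]

    def fit_chunk(chunk):
        return nw_predict([x[i] for i in chunk], [y[i] for i in chunk],
                          x_eval, bandwidth)

    # Fit the chunks concurrently. Threads are used here only to show the
    # structure; a process pool would be needed for real CPU parallelism
    # in standard CPython.
    with ThreadPoolExecutor(max_workers=n_chunks) as pool:
        chunk_preds = list(pool.map(fit_chunk, chunks))

    # Recombine: average the per-chunk predictions at each evaluation point
    # (an assumed recombination rule, chosen for simplicity).
    return [sum(col) / n_chunks for col in zip(*chunk_preds)]


# Synthetic data for illustration only: y = sin(x) plus Gaussian noise.
rng = random.Random(42)
x_data = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(400)]
y_data = [math.sin(v) + rng.gauss(0.0, 0.1) for v in x_data]
eval_points = [1.0, 2.0, 3.0, 4.0, 5.0]

full_fit = nw_predict(x_data, y_data, eval_points, bandwidth=0.3)
dnr_fit = dnr_predict(x_data, y_data, eval_points, bandwidth=0.3, n_chunks=4)
```

Comparing `full_fit` with `dnr_fit` shows the kind of accuracy difference that the averaged sub-fits introduce. With `n_chunks=1` the wrapper reduces to the ordinary fit on all the data.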


Metadata
Title
Examining parallelization in kernel regression
Authors
Orcun Oltulu
Fulya Gokalp Yavuz
Publication date
4 November 2023
Publisher
Springer Berlin Heidelberg
Published in
Soft Computing / Issue 1/2024
Print ISSN: 1432-7643
Electronic ISSN: 1433-7479
DOI
https://doi.org/10.1007/s00500-023-09285-4
