
2018 | OriginalPaper | Chapter

Outliers Detection in Regressions by Nonparametric Parzen Kernel Estimation

Authors: Tomasz Galkowski, Andrzej Cader

Published in: Artificial Intelligence and Soft Computing

Publisher: Springer International Publishing


Abstract

An observation that is unusual or differs markedly from all others is called an outlier or anomaly. Appropriate evaluation of data is a crucial problem in modelling real objects or phenomena. Problems currently under investigation are often based on data mass-produced by computer systems, without careful inspection or screening. Because of the great amount of generated and processed information (e.g. so-called Big Data), possible outliers often go unnoticed and, as a result, can be masked. In regression, the situation can be more complicated. In some areas, for instance medicine and geology, particularly seismology (earthquakes), the extremely atypical measurements are precisely the subject of interest, so their identification and evaluation matter in their own right. In this paper, a nonparametric procedure based on the Parzen kernel is applied to estimate an unknown function. Cook's distance formula is used to evaluate which measurements in the input data set can be recognized as outliers and possibly should be removed. Anomaly detection remains an important research problem across diverse areas and application domains.
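
The abstract does not give the estimator or the exact form of Cook's distance used in the paper, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes a Gaussian Parzen kernel, a fixed bandwidth `h`, a normalized kernel-weighted (Nadaraya-Watson type) regression estimate, the trace of the smoother matrix as the effective number of parameters, and a direct leave-one-out evaluation of Cook's distance. The function names are hypothetical and are not taken from the paper.

```python
import numpy as np

def gaussian_kernel(u):
    """Gaussian Parzen kernel (an assumed choice of kernel)."""
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)

def smoother_matrix(x_eval, x_train, h):
    """Kernel weights of each training point at each evaluation point.

    Each row is normalized to sum to 1, so `S @ y_train` is a
    kernel-weighted average of the training responses.
    """
    w = gaussian_kernel((x_eval[:, None] - x_train[None, :]) / h)
    return w / w.sum(axis=1, keepdims=True)

def cooks_distance_kernel(x, y, h):
    """Cook-style influence of each observation on the kernel fit.

    D_i = sum_j (fit_j - fit_j without point i)^2 / (p * s2),
    where p = trace(S) is an assumed effective parameter count and
    s2 is the residual variance estimate with n - p degrees of freedom.
    """
    n = len(x)
    S = smoother_matrix(x, x, h)
    fitted = S @ y
    resid = y - fitted
    p = np.trace(S)
    s2 = resid @ resid / (n - p)
    d = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        # Refit without observation i, evaluated at all n design points.
        fitted_loo = smoother_matrix(x, x[keep], h) @ y[keep]
        d[i] = np.sum((fitted - fitted_loo) ** 2) / (p * s2)
    return d
```

Observations with a large `d[i]` are candidates for inspection. A commonly quoted rule of thumb flags values above `4 / n`, but that cutoff is a convention rather than something stated in the abstract, and the result depends strongly on the bandwidth `h`.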


Metadata
Title
Outliers Detection in Regressions by Nonparametric Parzen Kernel Estimation
Authors
Tomasz Galkowski
Andrzej Cader
Copyright Year
2018
DOI
https://doi.org/10.1007/978-3-319-91262-2_32
