Abstract
In data analysis practice, multifaceted research is often hampered by the methodological variety of specific algorithms, which leads to laborious interpretation and time-consuming study. This paper presents the concept of methodically unified procedures, based on kernel estimators, for three fundamental tasks: outlier detection, clustering, and classification. Their clear interpretation facilitates applications and potential individual modifications. The investigated procedures are distribution-free, which enables the analysis and exploration of data with any distribution, including data whose elements form several separate groups. The results obtained depend not only on the values of particular attributes but, above all, on the complex relationships between them.
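As an illustration of the kind of kernel-estimator procedure the abstract refers to, the following minimal sketch flags elements lying in regions of low estimated density. It is not the procedure from the paper: the one-dimensional setting, the Gaussian kernel, the rule-of-thumb bandwidth, the quantile order `r`, and all function names are assumptions chosen for brevity.

```python
import numpy as np


def gaussian_kde_1d(sample, points, h):
    """Kernel density estimate at `points`, built from a 1-D `sample`
    with a Gaussian kernel and bandwidth h."""
    sample = np.asarray(sample, dtype=float)
    points = np.asarray(points, dtype=float)
    u = (points[:, None] - sample[None, :]) / h
    kernel = np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)
    return kernel.sum(axis=1) / (sample.size * h)


def rule_of_thumb_bandwidth(sample):
    """Normal-reference bandwidth for a Gaussian kernel
    (an assumption of this sketch, not the paper's choice)."""
    sample = np.asarray(sample, dtype=float)
    return 1.06 * sample.std(ddof=1) * sample.size ** (-0.2)


def low_density_mask(sample, r=0.05):
    """Mark elements whose estimated density does not exceed the
    r-quantile of the densities computed at all sample elements."""
    h = rule_of_thumb_bandwidth(sample)
    density = gaussian_kde_1d(sample, sample, h)
    threshold = np.quantile(density, r)
    return density <= threshold


# Synthetic data: two separated groups plus two isolated elements.
rng = np.random.default_rng(0)
data = np.concatenate([
    rng.normal(-3.0, 0.5, 100),
    rng.normal(3.0, 0.5, 100),
    [0.0, 10.0],
])
mask = low_density_mask(data, r=0.02)
print(data[mask])
```

Because the criterion uses the estimated density rather than the distance from a single center, an isolated element lying between two groups can be flagged as well as one lying far outside them, which is the sense in which such a procedure is distribution-free.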
Notes
- 1. In the book [30], with the following notation changes: \(W(K) \mathop{\rightarrow}\limits^{\text{into}} R(K)\) and \(U(K) \mathop{\rightarrow}\limits^{\text{into}} \mu_2(K)\).
- 2. If no such value exists, a single cluster should be recognized and the procedure terminated. The same applies to the unreasonable but formally possible case m = 1, in which set (30) is empty.
References
Aggarwal, C.C.: Outlier Analysis. Springer, New York (2013)
Aggarwal, C.C.: Data Mining. Springer, Cham (2015)
Agresti, A.: Categorical Data Analysis. Wiley, Hoboken (2002)
Biau, G., Devroye, L.: Lectures on the Nearest Neighbor Method. Springer, Cham (2015)
Billingsley, P.: Probability and Measure. Wiley, New York (1995)
Canaan, C., Garai, M.S., Daya, M.: Popular sorting algorithms. World Appl. Program. 1, 62–71 (2011)
Duda, R.O., Hart, P.E., Stork, D.G.: Pattern Classification. Wiley, New York (2001)
Gentle, J.E.: Random Number Generation and Monte Carlo Methods. Springer, New York (2003)
Fodor, J., Roubens, M.: Fuzzy Preference Modelling and Multicriteria Decision Support. Kluwer, Dordrecht (1994)
Fukunaga, K., Hostetler, L.D.: The estimation of the gradient of a density function, with applications in pattern recognition. IEEE Trans. Inf. Theory 21, 32–40 (1975)
Kelley, C.T.: Iterative Methods for Optimization. SIAM, Philadelphia (1999)
Kulczycki, P.: Wykrywanie uszkodzeń w systemach zautomatyzowanych metodami statystycznymi. Alfa, Warsaw (1998)
Kulczycki, P.: Estymatory jądrowe w analizie systemowej. WNT, Warsaw (2005)
Kulczycki, P., Kacprzyk, J., Kóczy, L.T., Mesiar, R., Wisniewski, R. (eds.): Information Technology, Systems Research, and Computational Physics. Springer, Cham (2020)
Kulczycki, P.: Kernel estimators for data analysis. In: Ram, M., Davim, J.P. (eds.) Advanced Mathematical Techniques in Engineering Sciences, pp. 177–202. CRC/Taylor & Francis, Boca Raton (2018)
Kulczycki, P., Charytanowicz, M.: A complete gradient clustering algorithm formed with kernel estimators. Int. J. Appl. Math. Comput. Sci. 20, 123–134 (2010)
Kulczycki, P., Charytanowicz, M., Kowalski, P.A., Łukasik, S.: The complete gradient clustering algorithm: properties in practical applications. J. Appl. Stat. 39, 1211–1224 (2012)
Kulczycki, P., Daniel, K.: Metoda wspomagania strategii marketingowej operatora telefonii komórkowej. Przegląd Statystyczny 56(2), 116–134 (2009). Errata: 56(3–4), 3 (2009)
Kulczycki, P., Kowalski, P.A.: Bayes classification of imprecise information of interval type. Control Cybern. 40, 101–123 (2011)
Kulczycki, P., Kowalski, P.A.: Bayes classification for nonstationary patterns. Int. J. Comput. Methods 12(2), 19 (2015). ID 1550008
Kulczycki, P., Kruszewski, D.: Identification of atypical elements by transforming task to supervised form with fuzzy and intuitionistic fuzzy evaluations. Appl. Soft Comput. 60, 623–633 (2017)
Kulczycki, P., Kruszewski, D.: Detection of rare elements in investigation of medical problems. In: Nguyen, N.T., Gaol, F.L., Hong, T.-P., Trawiński, B. (eds.) Intelligent Information and Database Systems, pp. 257–268. Springer, Cham (2019)
Kulczycki, P., Prochot, C.: Identyfikacja stanów nietypowych za pomocą estymatorów jądrowych. In: Bubnicki, Z., Hryniewicz, O., Kulikowski, R. (eds.) Metody i techniki analizy informacji i wspomagania decyzji, pp. 57–62. EXIT, Warsaw (2002)
Mirkin, B.: Clustering for Data Mining. Taylor & Francis, Boca Raton (2005)
Parrish, R.: Comparison of quantile estimators in normal sampling. Biometrics 46, 247–257 (1990)
Rokach, L., Maimon, O.: Data Mining with Decision Trees. World Scientific, New Jersey (2015)
Silva, J., Faria, E., Barros, R., Hruschka, E., de Carvalho, A., Gama, J.: Data stream clustering: a survey. ACM Comput. Surv. 46, 13 (2013)
Silverman, B.W.: Density Estimation for Statistics and Data Analysis. Chapman and Hall, London (1986)
Sorzano, C.O.S., Vargas, J., Pascual-Montano, A.: A survey of dimensionality reduction techniques. arXiv:1403.2877v1 (2014)
Wand, M., Jones, M.: Kernel Smoothing. Chapman and Hall, London (1995)
Acknowledgments
I would like to express my gratitude to my close associates and former Ph.D. students, Małgorzata Charytanowicz, D.Sc., Karina Daniel, Ph.D., Piotr A. Kowalski, D.Sc., Damian Kruszewski, Ph.D., and Szymon Łukasik, Ph.D., with whom the research summarized in this paper was conducted.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Kulczycki, P. (2020). Methodically Unified Procedures for Outlier Detection, Clustering and Classification. In: Arai, K., Bhatia, R., Kapoor, S. (eds) Proceedings of the Future Technologies Conference (FTC) 2019. FTC 2019. Advances in Intelligent Systems and Computing, vol 1069. Springer, Cham. https://doi.org/10.1007/978-3-030-32520-6_35
DOI: https://doi.org/10.1007/978-3-030-32520-6_35
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-32519-0
Online ISBN: 978-3-030-32520-6
eBook Packages: Intelligent Technologies and Robotics (R0)