Fuzzy clustering algorithm for outlier-interval data based on the robust exponent distance

Phamtoan, Dinh; Nguyenhuu, Khanh; Vovan, Tai

doi:10.1007/s10489-021-02773-w

Published: 06 September 2021

Applied Intelligence, Volume 52, pages 6276–6291 (2022)

Abstract

Outliers are elements of a data set that differ significantly from the others. For many reasons, outliers arise in data analysis across different fields. Because an outlier can cause serious problems in statistical analyses, the topic has attracted the interest of many researchers. This article proposes a fuzzy clustering algorithm for interval data with outliers, based on the robust exponent distance, to overcome a drawback of traditional clustering algorithms, which require the outliers to be cleaned before clustering. The main advantage of this algorithm is that it simultaneously finds a suitable number of clusters, clusters interval data containing outliers, and determines the probability that each interval belongs to each cluster. The proposed algorithm is described step by step through numerical examples and can be run effectively with a Matlab procedure. It is also applied to real air pollution, mushroom, and image data sets. These applications demonstrate the robustness of the proposed algorithm in comparison with existing ones.
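
The abstract does not state the formula for the robust exponent distance or the steps of the algorithm, so the following Python sketch is only an illustration of the general idea, not the authors' method. It assumes that each interval is represented by its (lower, upper) bounds, that the distance has the exponential form 1 - exp(-beta * squared Euclidean distance), and that the number of clusters is fixed by the caller, whereas the article reports choosing it automatically. The names `exp_distance`, `memberships`, `update_centers`, `fuzzy_cluster`, and the parameters `beta` and `m` are hypothetical.

```python
import math

def exp_distance(x, v, beta):
    """Assumed robust distance between two intervals given as (lower, upper)."""
    sq = sum((a - b) ** 2 for a, b in zip(x, v))
    return 1.0 - math.exp(-beta * sq)  # never exceeds 1, so far outliers saturate

def memberships(data, centers, beta, m=2.0, eps=1e-12):
    """Return u[k][i], the membership of interval k in cluster i (rows sum to 1)."""
    u = []
    for x in data:
        inv = [(1.0 / (exp_distance(x, v, beta) + eps)) ** (1.0 / (m - 1.0))
               for v in centers]
        total = sum(inv)
        u.append([w / total for w in inv])
    return u

def update_centers(data, centers, u, beta, m=2.0):
    """One fixed-point update; distant intervals receive weights close to zero."""
    new_centers = []
    for i, v in enumerate(centers):
        w = [(u[k][i] ** m) * (1.0 - exp_distance(x, v, beta))
             for k, x in enumerate(data)]
        total = sum(w)
        if total == 0.0:  # every interval is too far away: keep the old center
            new_centers.append(tuple(v))
            continue
        new_centers.append(tuple(
            sum(wk * x[j] for wk, x in zip(w, data)) / total
            for j in range(len(v))))
    return new_centers

def fuzzy_cluster(intervals, init_centers, beta=0.5, m=2.0, max_iter=100, tol=1e-8):
    """Alternate membership and center updates until the centers stop moving."""
    centers = [tuple(c) for c in init_centers]
    for _ in range(max_iter):
        u = memberships(intervals, centers, beta, m)
        new_centers = update_centers(intervals, centers, u, beta, m)
        shift = max(abs(a - b)
                    for c, n in zip(centers, new_centers)
                    for a, b in zip(c, n))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships(intervals, centers, beta, m)

# Illustrative data (made up): two groups of intervals plus one distant outlier.
intervals = [(1.0, 2.0), (1.1, 2.1), (0.9, 1.9),
             (5.0, 6.0), (5.1, 6.2), (4.9, 5.8),
             (50.0, 60.0)]
centers, u = fuzzy_cluster(intervals, init_centers=[(0.0, 1.0), (6.0, 7.0)])
```

Under these assumptions the bounded distance is what provides robustness: an interval far from every center contributes almost nothing to the center update, so it does not need to be removed beforehand, and its memberships are spread evenly across the clusters.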

Acknowledgments

For Khanh Nguyenhuu and Tai Vovan, this research is funded by the Ministry of Education and Training of Viet Nam under grant number B2021 – TCT – 01.

Author information

Authors and Affiliations

University of Science, Ho Chi Minh City, Vietnam
Dinh Phamtoan
Vietnam National University, Ho Chi Minh City, Vietnam
Dinh Phamtoan
Faculty of Engineering, Van Lang University, Ho Chi Minh City, Vietnam
Dinh Phamtoan
College of Natural Science, Can Tho University, Can Tho City, Vietnam
Khanh Nguyenhuu & Tai Vovan

Corresponding author

Correspondence to Tai Vovan.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Phamtoan, D., Nguyenhuu, K. & Vovan, T. Fuzzy clustering algorithm for outlier-interval data based on the robust exponent distance. Appl Intell 52, 6276–6291 (2022). https://doi.org/10.1007/s10489-021-02773-w

Accepted: 15 August 2021
Published: 06 September 2021
Issue Date: April 2022
DOI: https://doi.org/10.1007/s10489-021-02773-w
