2019 | OriginalPaper | Chapter

A Novel Density-Based Clustering Approach for Outlier Detection in High-Dimensional Data

Authors: Thouraya Aouled Messaoud, Abir Smiti, Aymen Louati

Published in: Hybrid Artificial Intelligent Systems

Publisher: Springer International Publishing

Abstract

Outlier detection, also known as outlier mining, is a central task in data mining and machine learning applications. It matters in medical data because outliers may carry valuable information; however, outlier detection can perform very poorly on high-dimensional data. In this paper, we propose a new outlier detection technique, named Infinite Feature Selection DBSCAN, that uses a feature selection strategy to avoid the curse of dimensionality. The main purpose of the proposed method is to reduce the dimensionality of a high-dimensional data set so that outliers can be identified efficiently using clustering techniques. Simulations on real databases demonstrate the effectiveness of our method in terms of accuracy, error rate, F-score, and retrieval time.
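
The general idea of selecting features first and then treating DBSCAN noise points as outliers can be illustrated with a minimal sketch. It is not the authors' implementation: a plain variance ranking stands in for Infinite Feature Selection, and the function names (`select_features`, `dbscan`, `detect_outliers`), the toy data, and the values of `k`, `eps`, and `min_pts` are all assumptions made for illustration.

```python
import math
from statistics import pvariance


def select_features(rows, k):
    """Keep the k highest-variance columns (a simple stand-in, NOT Inf-FS)."""
    n_cols = len(rows[0])
    variances = [pvariance([r[j] for r in rows]) for j in range(n_cols)]
    ranked = sorted(range(n_cols), key=lambda j: variances[j], reverse=True)
    keep = sorted(ranked[:k])
    return keep, [[r[j] for j in keep] for r in rows]


def dbscan(points, eps, min_pts):
    """Minimal DBSCAN: returns one label per point; -1 marks noise."""
    UNSEEN, NOISE = None, -1
    labels = [UNSEEN] * len(points)

    def neighbours(i):
        # Brute-force eps-neighbourhood (includes the point itself).
        return [j for j, q in enumerate(points)
                if math.dist(points[i], q) <= eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not UNSEEN:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not UNSEEN:
                continue
            labels[j] = cluster
            nbrs = neighbours(j)
            if len(nbrs) >= min_pts:  # j is a core point: keep expanding
                queue.extend(nbrs)
        cluster += 1
    return labels


def detect_outliers(rows, k, eps, min_pts):
    """Reduce dimensionality, cluster, and report noise points as outliers."""
    keep, reduced = select_features(rows, k)
    labels = dbscan(reduced, eps, min_pts)
    return keep, [i for i, lab in enumerate(labels) if lab == -1]


# Hypothetical toy data: two tight groups, one isolated row, and a third
# column that is almost constant and therefore dropped by the ranking.
rows = [
    [0.0, 0.1, 1.00], [0.2, 0.0, 1.01], [0.1, 0.2, 0.99],
    [10.0, 10.1, 1.00], [10.2, 10.0, 1.02], [10.1, 10.2, 0.98],
    [5.0, 20.0, 1.00],
]
kept_columns, outliers = detect_outliers(rows, k=2, eps=0.5, min_pts=3)
print(kept_columns, outliers)
```

On this toy data the near-constant third column is discarded and only the isolated last row is labelled as noise. The brute-force neighbour search is quadratic in the number of points, so this sketch is suitable only for small examples.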

Metadata
Title
A Novel Density-Based Clustering Approach for Outlier Detection in High-Dimensional Data
Authors
Thouraya Aouled Messaoud
Abir Smiti
Aymen Louati
Copyright Year
2019
DOI
https://doi.org/10.1007/978-3-030-29859-3_28
