
2017 | Original Paper | Book Chapter

Feature Selection Based on Density Peak Clustering Using Information Distance Measure

Authors: Jie Cai, Shilong Chao, Sheng Yang, Shulin Wang, Jiawei Luo

Published in: Intelligent Computing Theories and Application

Publisher: Springer International Publishing

Abstract

Feature selection is one of the most important data preprocessing techniques in data mining and machine learning. This paper proposes a new feature selection method based on density peak clustering. The method uses an information distance between features as the clustering metric and applies density peak clustering to group the features; the representative feature of each cluster is then selected to form the final subset. This design avoids selecting an irrelevant representative from a cluster in which most features are irrelevant to the class label. Comparison experiments on ten datasets show that the feature subsets selected by the proposed method yield improved classification accuracy across different classifiers.
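The pipeline the abstract describes — compute an information distance between every pair of features, cluster the features by density peaks, and keep one representative per cluster — can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes discrete features, uses the variation of information as a stand-in for the paper's information distance, and scores cluster centers with the standard Rodriguez–Laio density/distance product using a Gaussian kernel. The function names (`info_distance`, `density_peak_feature_select`) and the cutoff quantile are choices made here, not taken from the paper.

```python
import numpy as np

def entropy(x):
    # Shannon entropy of a discrete feature (natural log)
    _, counts = np.unique(x, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log(p))

def mutual_info(x, y):
    # I(X;Y) = H(X) + H(Y) - H(X,Y), with H(X,Y) from the joint labels
    joint = np.array([f"{a}|{b}" for a, b in zip(x, y)])
    return entropy(x) + entropy(y) - entropy(joint)

def info_distance(x, y):
    # Variation of information, VI(X,Y) = H(X) + H(Y) - 2 I(X;Y):
    # a true metric on discrete variables, used here as the
    # between-feature "information distance".
    return entropy(x) + entropy(y) - 2.0 * mutual_info(x, y)

def density_peak_feature_select(X, n_select, dc_quantile=0.4):
    """Cluster the columns (features) of X by density peaks and return
    the indices of the n_select cluster-center (representative) features."""
    n_feat = X.shape[1]
    # Pairwise information-distance matrix between features
    D = np.zeros((n_feat, n_feat))
    for i in range(n_feat):
        for j in range(i + 1, n_feat):
            D[i, j] = D[j, i] = info_distance(X[:, i], X[:, j])
    # Cutoff distance: a quantile of the off-diagonal distances
    dc = np.quantile(D[np.triu_indices(n_feat, 1)], dc_quantile)
    # Local density rho via a Gaussian kernel (self term removed)
    rho = np.exp(-(D / dc) ** 2).sum(axis=1) - 1.0
    # delta: distance to the nearest feature of higher density
    order = np.argsort(-rho)
    delta = np.zeros(n_feat)
    delta[order[0]] = D[order[0]].max()
    for rank in range(1, n_feat):
        i = order[rank]
        delta[i] = D[i, order[:rank]].min()
    # Cluster centers maximize the density-peak score gamma = rho * delta
    gamma = rho * delta
    return np.argsort(-gamma)[:n_select]
```

On redundant feature groups this tends to pick one dense, central feature per group; the paper additionally uses relevance to the class label when choosing each cluster's representative, which this sketch omits.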


Metadata
Title
Feature Selection Based on Density Peak Clustering Using Information Distance Measure
Authors
Jie Cai
Shilong Chao
Sheng Yang
Shulin Wang
Jiawei Luo
Copyright Year
2017
DOI
https://doi.org/10.1007/978-3-319-63312-1_11