ABSTRACT
We address the problem of feature selection in a kernel space, i.e., selecting the most discriminative and informative features for classification and data analysis. This is a difficult problem because the dimension of a kernel space may be infinite, and little prior work has addressed feature selection in this setting. As a first step, we derive a basis set of the kernel space. Using this basis set, we then extend margin-based feature selection algorithms, which have proven effective even when many features are interdependent. The selected features span a subspace of the kernel space, to which different state-of-the-art classification algorithms can be applied. We conduct extensive experiments on real and simulated data, comparing the proposed method with four baseline algorithms. Both theoretical analysis and experimental results validate the effectiveness of the proposed method.
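The two-step pipeline the abstract describes can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's exact construction: the RBF kernel, the kernel-PCA-style basis derivation, the eigenvalue cutoff, and the simple Relief-style margin weighting below are all assumptions chosen to make the idea concrete.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    # Pairwise squared distances -> Gaussian (RBF) kernel matrix.
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def kernel_basis_projection(X, gamma=1.0, dim=None):
    """Derive an orthonormal basis of the (finite) subspace of the kernel
    space spanned by the training points, and return each point's
    coordinates in that basis (kernel-PCA-style construction)."""
    K = rbf_kernel(X, X, gamma)
    n = len(X)
    H = np.eye(n) - np.ones((n, n)) / n            # centering matrix
    Kc = H @ K @ H                                 # centered kernel matrix
    w, V = np.linalg.eigh(Kc)
    order = np.argsort(w)[::-1]                    # sort eigenvalues, largest first
    w, V = w[order], V[:, order]
    keep = w > 1e-10                               # numerically nonzero directions
    w, V = w[keep], V[:, keep]
    if dim is not None:
        w, V = w[:dim], V[:, :dim]
    return V * np.sqrt(w)                          # explicit kernel-space features

def relief_weights(F, y, n_iter=100, rng=None):
    """Relief-style margin weights over the explicit features F,
    for binary labels y (each class needs at least two samples)."""
    rng = np.random.default_rng(rng)
    n, d = F.shape
    w = np.zeros(d)
    for _ in range(n_iter):
        i = rng.integers(n)
        dists = np.abs(F - F[i]).sum(1)
        dists[i] = np.inf                          # exclude the point itself
        same, diff = y == y[i], y != y[i]
        hit = np.where(same & (dists == dists[same].min()))[0][0]   # nearest hit
        miss = np.where(diff & (dists == dists[diff].min()))[0][0]  # nearest miss
        # Features that separate classes gain weight; noisy ones lose it.
        w += np.abs(F[i] - F[miss]) - np.abs(F[i] - F[hit])
    return w / n_iter
```

Feature selection then amounts to keeping the top-k weighted kernel-space coordinates, e.g. `selected = np.argsort(relief_weights(F, y))[::-1][:k]`, and training any standard classifier on `F[:, selected]`.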
- Feature selection in a kernel space