Elsevier

Pattern Recognition

Volume 37, Issue 6, June 2004, Pages 1299-1302

Rapid and Brief Communication
A reformative kernel Fisher discriminant analysis

https://doi.org/10.1016/j.patcog.2003.10.006

Abstract

A reformative kernel Fisher discriminant method is proposed, derived directly from the naive kernel Fisher discriminant analysis, with superior classification efficiency. In the novel method only a subset of the training patterns, called "significant nodes", needs to be used when classifying a test pattern. A recursive algorithm for selecting the "significant nodes", which is the key of the novel method, is presented in detail. Experiments on benchmark datasets show that the novel method is effective and much more efficient in classification.

Introduction

Kernel Fisher discriminant analysis [1] has attracted much attention and has been applied to many recognition problems because of its conceptual elegance and state-of-the-art performance. The key idea of the kernel Fisher method is to classify test patterns in a high-dimensional feature space by means of the kernel trick. However, it is well known that kernel Fisher discriminant analysis is based on the theory of reproducing kernels; as a result, the classification cost of the naive kernel Fisher discriminant analysis grows in proportion to the number of training patterns. Consequently, if the number of training patterns is large, the method may become impractical, so it is important to improve the classification efficiency of the naive kernel Fisher discriminant analysis whenever a suitable approach is available [2], [3]. In this paper, it is supposed that, in the high-dimensional space introduced by the kernel method, the Fisher discriminant vector can be well approximated by an expansion over a subset of the training patterns. From this supposition, a reformative kernel Fisher discriminant analysis is developed, which is derived directly from the naive kernel Fisher discriminant analysis and based directly on the Fisher criterion. Moreover, when the reformative method classifies a test pattern, it only needs to compute the kernel functions between that pattern and the "significant nodes", a few training patterns selected from the full training set. In practice, therefore, the novel method is much more efficient in classification than the naive kernel Fisher discriminant analysis.

Section snippets

Kernel Fisher discriminant analysis

Kernel Fisher discriminant analysis is based on a conceptual transformation from the input space into a nonlinear high-dimensional feature space. Let {x_i} denote the input patterns. Suppose that the high-dimensional feature space is F and the corresponding nonlinear mapping is φ, i.e. φ(x_i) ∈ F. Consequently, in the feature space F the Fisher criterion is defined by

J(w) = (w′ S_b^φ w) / (w′ S_w^φ w),

where w is the discriminant vector, and S_b^φ and S_w^φ are the between-class and within-class scatter matrices, …
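The snippet breaks off before the kernelized form of the criterion. For orientation, here is a sketch using the standard kernel Fisher formulation; the quantities M_1, M_2, M and N appearing in the next snippet presumably follow these definitions, though the paper's full text is the authority:

```latex
% Sketch: standard kernelization of the Fisher criterion.
% Expand the discriminant vector over the mapped training patterns:
\[
  w = \sum_{i=1}^{\ell} \alpha_i\, \varphi(x_i), \qquad
  J(\alpha) = \frac{\alpha' M \alpha}{\alpha' N \alpha},
\]
% with, for classes c = 1, 2 of sizes \ell_c,
\[
  (M_c)_j = \frac{1}{\ell_c} \sum_{k=1}^{\ell_c} k\bigl(x_j, x^c_k\bigr), \qquad
  M = (M_1 - M_2)(M_1 - M_2)', \qquad
  N = \sum_{c=1,2} K_c \bigl(I - \mathbf{1}_{\ell_c}\bigr) K_c',
\]
% where K_c is the \ell \times \ell_c matrix with entries k(x_j, x^c_k) and
% \mathbf{1}_{\ell_c} is the \ell_c \times \ell_c matrix with all entries 1/\ell_c.
```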

A reformative criterion function

In practice N is generally singular, so α can be solved from the regularized equation

α = (N + μI)^{−1} (M_1 − M_2),

where μ is a positive constant. From the viewpoint of numerical stability, if μ is large enough, N + μI will be positive definite and the problem becomes more stable. We now define the Fisher criterion as

J(α) = α′Mα / (α′Nα + μα′α) = α′Mα / (α′(N + μI)α).

It is provable that

J(α) = (M_1 − M_2)′ α;

indeed, with M = (M_1 − M_2)(M_1 − M_2)′ as in the sketch above, α′Mα = ((M_1 − M_2)′α)², while α = (N + μI)^{−1}(M_1 − M_2) gives α′(N + μI)α = (M_1 − M_2)′α. Obviously, the greater (M_1 − M_2)′α is, the more significant the corresponding patterns are, so (M_1 − M_2)′α can be taken as a …
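The recursive selection algorithm itself is not shown in the snippet. Below is a minimal, non-recursive Python sketch of one plausible reading: M_1 − M_2 and N are built as in the sketch above, "significance" is measured by each pattern's contribution to J(α) = (M_1 − M_2)′α, and the discriminant is re-solved on the reduced expansion. The function names and the fixed node budget are illustrative, not the paper's:

```python
import numpy as np

def kfd_matrices(K, y):
    """Build M1 - M2 and N from the kernel matrix (standard KFDA quantities).

    K : (l, l) kernel matrix over the training patterns
    y : (l,) class labels in {1, 2}
    """
    l = K.shape[0]
    N = np.zeros((l, l))
    Mdiff = np.zeros(l)
    for c, sign in ((1, 1.0), (2, -1.0)):
        Kc = K[:, y == c]                       # l x l_c kernel block for class c
        lc = Kc.shape[1]
        Mdiff += sign * Kc.mean(axis=1)         # kernelized class-mean vector M_c
        # within-class term: K_c (I - 1_{l_c}) K_c'
        N += Kc @ (np.eye(lc) - np.full((lc, lc), 1.0 / lc)) @ Kc.T
    return Mdiff, N

def select_significant_nodes(K, y, mu=1e-3, n_nodes=20):
    """Greedy, non-recursive stand-in for the paper's selection algorithm.

    alpha = (N + mu I)^(-1) (M1 - M2) maximizes the regularized criterion,
    and J(alpha) = (M1 - M2)' alpha, so pattern i contributes
    (M1 - M2)_i * alpha_i to the criterion; keep the largest contributors
    and re-solve the discriminant on the reduced expansion.
    """
    Mdiff, N = kfd_matrices(K, y)
    l = K.shape[0]
    alpha = np.linalg.solve(N + mu * np.eye(l), Mdiff)
    contrib = Mdiff * alpha                     # per-pattern share of J(alpha)
    nodes = np.argsort(contrib)[::-1][:n_nodes] # indices of "significant nodes"
    # restrict the expansion w = sum_i alpha_i phi(x_i) to the chosen nodes
    N_s = N[np.ix_(nodes, nodes)]
    alpha_s = np.linalg.solve(N_s + mu * np.eye(len(nodes)), Mdiff[nodes])
    return nodes, alpha_s
```

A test pattern x would then be projected as f(x) = Σ_j α_j k(x, x_j) over the selected nodes only, so classification cost scales with the number of nodes rather than with the full training set size.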

Experiment

An experiment on 10 benchmark datasets (http://ida.first.gmd.de/~raetsch/data/) is performed. One hundred partitions are generated for each dataset (except "Image" and "Splice", which have only 20 partitions), and every partition includes its own training subset and test subset. The Gaussian kernel k(x, y) = exp(−||x − y||²/(2σ²)) is adopted. For each dataset, σ² is set to the variance of the first training subset. Training is also carried out on the first training subset, while …
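A minimal sketch of this kernel setup, assuming "the variance of the first training subset" means the total variance of its entries (the snippet does not pin this down); the array name X_train is hypothetical:

```python
import numpy as np

def gaussian_kernel(X, Y, sigma2):
    """k(x, y) = exp(-||x - y||^2 / (2 * sigma2)), row-wise over X and Y."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * sigma2))

# One reading of the paper's heuristic: sigma^2 is the total variance of
# the entries of the first training subset.
# sigma2 = X_train.var()
# K = gaussian_kernel(X_train, X_train, sigma2)
```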

About the Author—YONG XU was born in Sichuan, China, in 1972. He received his B.S. degree and M.S. degree in 1994 and 1997, respectively. He is now working toward his Ph.D. degree in Pattern Recognition and Intelligence Systems. He is the author of 8 scientific papers in pattern recognition and image processing. His current interests include face recognition and detection, character recognition, and image processing.

About the Author—JING-YU YANG received his B.S. degree in Computer Science from NUST, Nanjing, China. He is the author of over 100 scientific papers in computer vision, pattern recognition, and artificial intelligence. His current research interests are in the areas of pattern recognition, image processing, artificial intelligence, and expert systems.

About the Author—JIAN YANG was born in Jiangsu, China, in 1973. He obtained his B.S. degree, M.S. degree, and Ph.D. degree in 1995, 1998, and 2002, respectively. He is now a postdoctoral researcher at the University of Zaragoza (Spain) and NUST (China). He is the author of more than 20 scientific papers in pattern recognition and computer vision. His current research interests include face recognition and detection, handwritten character recognition, and data fusion.
