Locality-regularized linear regression discriminant analysis for feature extraction
Introduction
Feature extraction (or dimensionality reduction) has become an effective tool for understanding the underlying structure of data in many real-world applications, such as face recognition, image retrieval [37], fingerprint authentication [47], bioinformatics [45], and data mining [24]. The goal of feature extraction is to remove redundant, irrelevant, and noisy information while preserving the intrinsic information contained in the data.
Over the past few decades, numerous dimensionality reduction (DR) methods have been developed for feature extraction. Principal component analysis (PCA) [33] and linear discriminant analysis (LDA) [1] are the two most representative DR algorithms. PCA aims to find a set of projection axes such that the variance of the projected data is maximized. PCA is an unsupervised approach, as it ignores the useful prior class information. In contrast, LDA is a supervised algorithm: it searches for a set of projection axes such that samples from the same class are brought into close proximity while samples from different classes are simultaneously separated. LDA takes advantage of the class information to discover the discriminant structure of the data; thus, LDA is usually more effective than PCA for supervised pattern recognition tasks. However, LDA still has some limitations. For example, LDA usually suffers from the small sample size (SSS) problem [29] (which renders the within-class scatter matrix singular), and LDA can extract at most C − 1 features (where C is the number of sample classes).
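As a concrete illustration of the two criteria, the following minimal NumPy sketch computes PCA axes from the data covariance and LDA axes from the between- and within-class scatter matrices. The small ridge added to the within-class scatter is one common workaround for the SSS singularity mentioned above; it and all variable names are our illustrative choices, not part of the original formulations.

```python
import numpy as np

def pca_axes(X, d):
    """PCA: top-d eigenvectors of the data covariance (X: n_samples x n_features)."""
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / len(X)
    w, V = np.linalg.eigh(cov)          # eigenvalues in ascending order
    return V[:, ::-1][:, :d]            # directions of maximal variance

def lda_axes(X, y, d):
    """LDA: maximize between-class over within-class scatter; at most C-1 useful axes."""
    mu = X.mean(axis=0)
    Sw = np.zeros((X.shape[1],) * 2)    # within-class scatter
    Sb = np.zeros_like(Sw)              # between-class scatter
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        Sb += len(Xc) * np.outer(mc - mu, mc - mu)
    # generalized eigenproblem Sb v = lambda Sw v; ridge guards against the SSS singularity
    w, V = np.linalg.eig(np.linalg.solve(Sw + 1e-6 * np.eye(len(Sw)), Sb))
    order = np.argsort(-w.real)
    return V[:, order[:d]].real
```

Projecting the data as `X @ lda_axes(X, y, d)` then yields features in which same-class samples cluster and different-class samples separate.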
In recent years, inspired by manifold learning, a family of locality-based DR algorithms has been proposed. Locality preserving projections (LPP) [12] is one of the most popular locality-based DR methods. Unlike PCA and LDA, which can only discover the global Euclidean structure of the data, LPP is able to discover the intrinsic manifold structure of the data. LPP seeks a set of projection directions such that the neighbourhood structure of the data is preserved after projection. To exploit the class information and improve classification performance, several supervised versions of LPP have been proposed, such as class-wise locality preserving projection (CLPP) [23], supervised optimal locality preserving projection (SOLPP) [36], and local similarity preserving projections (LSPP) [16]. In addition, motivated by LDA and LPP, many discriminant locality-based DR algorithms [3], [13], [14], [15], [17], [18], [21], [25], [26], [31], [34], [38], [39], [42], [46] have been developed. These methods combine the ideas behind LDA and LPP, and aim to discover the local and discriminant structures of the data. Yan et al. proposed a general DR framework called graph embedding [39], which provides a powerful platform on which various DR algorithms can be developed. It has also been shown that all the aforementioned methods can be interpreted within the graph embedding framework.
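The neighbourhood-preserving idea behind LPP can be sketched in a few lines: build a k-NN affinity graph with heat-kernel weights, form the graph Laplacian, and solve the resulting generalized eigenproblem, keeping the smallest eigenvectors. The ridge term and the default parameters below are illustrative stabilizing choices of ours, not prescribed by the cited papers.

```python
import numpy as np

def lpp_axes(X, d, k=5, t=1.0):
    """LPP sketch: preserve the k-NN neighbourhood graph after projection.
    X: n_samples x n_features. Returns an n_features x d projection matrix."""
    n = len(X)
    D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)   # pairwise squared distances
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(D2[i])[1:k + 1]                 # k nearest neighbours, skipping self
        W[i, nbrs] = np.exp(-D2[i, nbrs] / t)             # heat-kernel affinity weights
    W = np.maximum(W, W.T)                                # symmetrize the graph
    Dg = np.diag(W.sum(1))
    L = Dg - W                                            # graph Laplacian
    A = X.T @ L @ X
    B = X.T @ Dg @ X + 1e-6 * np.eye(X.shape[1])          # ridge for numerical stability
    w, V = np.linalg.eig(np.linalg.solve(B, A))
    order = np.argsort(w.real)                            # smallest eigenvalues first
    return V[:, order[:d]].real
```

Minimizing the Laplacian quadratic form keeps neighbouring samples close after projection, which is precisely the locality-preservation objective described above.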
After feature extraction, data classification is another important step in pattern recognition. Once distinctive features are obtained using the DR algorithms, a suitable classifier can be chosen to classify them. So far, many useful classification methods [5], [7], [8], [9], [10], [11], [22], [35], [41], [49] have been developed, such as the nearest neighbour classifier (NNC) [5], the nearest feature line (NFL) classifier [22], and the minimum distance classifier (MDC) [7]. Recently, representation-based classification methods have attracted considerable research attention and shown great potential for pattern classification. Sparse representation-based classification (SRC) [20] is one of the most representative of these methods. SRC first codes a testing sample as a sparse linear combination of the training samples, and then classifies the testing sample by evaluating which class leads to the minimum reconstruction error. The SRC method has shown promising and robust results for face recognition, as it handles errors due to occlusion and corruption effectively. However, SRC is usually time-consuming owing to the computation of the reconstruction coefficients by solving an l1-norm minimization problem. Zhang et al. [48] analysed the working mechanism of SRC and pointed out that it is the collaborative representation (CR), and not the l1-norm sparsity, that makes SRC powerful for face recognition. To this end, they proposed a CR-based classification (CRC) method and demonstrated that CRC achieves highly competitive classification results with significantly lower complexity than SRC. Motivated by this promising property of CRC, Yang et al. developed a novel DR algorithm called collaborative representation-based projection (CRP) [43] to preserve the collaborative representation relationships between samples. In addition to SRC and CRC, linear regression classification (LRC) [19] is a well-known representation-based classifier.
LRC codes the testing sample as a linear combination of class-specific training samples, and the decision is made in favour of the class with the minimal reconstruction error. Compared to SRC, LRC spends less time estimating the reconstruction coefficients, since they are obtained in closed form by the least-squares method.
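The LRC decision rule just described reduces to one least-squares fit per class. The following minimal sketch (variable names are ours) reconstructs the query from each class-specific dictionary and returns the class with the smallest residual:

```python
import numpy as np

def lrc_predict(X_train, y_train, x):
    """LRC: reconstruct x from each class's training samples by least squares,
    and return the class with the smallest reconstruction error."""
    best_cls, best_err = None, np.inf
    for c in np.unique(y_train):
        Xc = X_train[y_train == c].T            # N x n_c class-specific dictionary
        beta, *_ = np.linalg.lstsq(Xc, x, rcond=None)
        err = np.linalg.norm(x - Xc @ beta)     # distance to the class subspace
        if err < best_err:
            best_cls, best_err = c, err
    return best_cls
```

Because each `lstsq` call has a closed-form solution, the whole prediction avoids the iterative l1-norm optimization that makes SRC expensive.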
As mentioned above, both feature extraction and data classification are of great importance for pattern recognition. However, most DR algorithms (for feature extraction) are designed independently of the decision rules of classifiers, and the classifier is generally selected heuristically (e.g., based on experience) at the classification stage. The subspaces learned by DR algorithms generally have different characteristics that are invisible to the classifiers. Consequently, a classifier selected after feature extraction cannot effectively exploit the characteristics of the learned subspace, so the DR algorithm may not work well with the selected classifier, which often degrades recognition performance. To better connect a DR algorithm to a specific classifier, some researchers learn the DR algorithm according to the decision rule of the classifier. For example, Yang et al. proposed local mean-based nearest neighbour discriminant analysis (LM-NNDA) [44] and sparse representation classifier steered discriminative projection (SRC-DP) [40] according to the decision rules of the local mean-based nearest neighbour (LM-NN) classifier [27] and SRC, respectively. Inspired by Yang's work, Chen et al. used the decision rule of LRC to develop a novel DR algorithm named reconstructive discriminant analysis (RDA) [4]. These studies demonstrate remarkable results and validate the feasibility and effectiveness of using classifier decision rules to design DR algorithms.
More recently, Brown et al. proposed a locality-regularized linear regression classification (LLRC) [2] method, which uses a manifold learning procedure to extend conventional LRC and increase its accuracy. However, the original feature space cannot guarantee that LLRC performs at its best. Motivated by the aforementioned methods [4], [40], [44], this paper proposes a novel DR algorithm, termed locality-regularized linear regression discriminant analysis (LLRDA), which uses the decision rule of LLRC to generate a feature subspace that is optimal for LLRC. Concretely, the inter-class and intra-class local reconstruction scatters are first constructed to characterize the separability and compactness of the samples, respectively; the feature extraction criterion is then derived by maximizing the ratio between them. Since the decision rule of LLRC steers the design of LLRDA, LLRDA is able to perform well with LLRC in pattern recognition.
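A ratio criterion of this kind is typically solved as a generalized eigenproblem. The sketch below shows only that final step, under the assumption that the inter-class and intra-class local reconstruction scatter matrices (here called `Sb` and `Sw`) have already been formed as the method prescribes; the ratio-trace relaxation and the ridge term are standard simplifications of ours, not the paper's exact derivation.

```python
import numpy as np

def llrda_projection(Sb, Sw, d, eps=1e-6):
    """Given inter-class (Sb) and intra-class (Sw) local reconstruction
    scatter matrices, maximize tr(P^T Sb P) / tr(P^T Sw P) via the common
    ratio-trace relaxation: keep the top-d generalized eigenvectors."""
    w, V = np.linalg.eig(np.linalg.solve(Sw + eps * np.eye(len(Sw)), Sb))
    order = np.argsort(-w.real)             # largest generalized eigenvalues first
    return V[:, order[:d]].real             # N x d projection matrix
```

Directions with large `Sb`-to-`Sw` ratio spread the classes apart while keeping each class's local reconstructions compact, which is the stated goal of the criterion.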
The remainder of this paper is organized as follows. Section 2 briefly reviews LRC and LLRC. Section 3 details the proposed LLRDA method. Section 4 discusses the relationship between LLRDA and other techniques such as LRC and RDA. Section 5 presents the experimental results on CMU PIE, ORL, FERET, and Yale-B face databases to evaluate the effectiveness of LLRDA. Finally, Section 6 concludes and discusses future research directions.
LRC and LLRC
Let X = {x1, x2, ..., xn} denote a set of n training image samples from C classes, and suppose the number of samples from the ith class is ni. Then, we have n = n1 + n2 + ... + nC, where each xi is an N-dimensional column vector representing the ith image. In this section, we briefly review LRC and LLRC.
Basic principle
As discussed earlier, the main advantage of LLRC over LRC is that LLRC exploits the neighbourhood relationships between data samples to enhance classification performance. LLRC assumes that the neighbouring samples from a specific class, rather than all of its samples, lie in a linear subspace. To classify a testing sample, LLRC first computes the local reconstruction error for each class, and then assigns the sample to the class that yields the minimal local reconstruction error.
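The LLRC decision rule described above can be sketched by restricting each class's regression dictionary to the k training samples nearest the query (the parameter name `k` and all variable names are illustrative):

```python
import numpy as np

def llrc_predict(X_train, y_train, x, k=3):
    """LLRC sketch: for each class, reconstruct x by least squares from only
    the k nearest neighbours within that class, then pick the class with the
    smallest local reconstruction error."""
    best_cls, best_err = None, np.inf
    for c in np.unique(y_train):
        Xc = X_train[y_train == c]
        dist = np.linalg.norm(Xc - x, axis=1)        # distances to class-c samples
        nbrs = Xc[np.argsort(dist)[:k]].T            # N x k local dictionary
        beta, *_ = np.linalg.lstsq(nbrs, x, rcond=None)
        err = np.linalg.norm(x - nbrs @ beta)        # local reconstruction error
        if err < best_err:
            best_cls, best_err = c, err
    return best_cls
```

Setting k equal to the class size recovers plain LRC, which makes the locality regularization easy to see: only the neighbourhood of the query, not the whole class, is assumed to be linear.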
Connections to LRC
As introduced in the previous sections, the connections between LLRDA and LRC can be summarized as follows:
- (1) LLRDA is a DR method for extracting features from the original data, whereas LRC is a classifier used to recognize the features obtained by DR methods (e.g., PCA, LDA, and LLRDA).
- (2) LRC is based on the assumption that samples from one specific class lie on the same subspace, whereas LLRDA is based on the assumption that neighbouring samples from one specific class lie on the same subspace.
Experiments
In this section, extensive experiments are conducted on the CMU PIE, ORL, FERET, and Yale-B face databases to verify the effectiveness of LLRDA. In addition, we compare LLRDA with other state-of-the-art feature extraction algorithms such as PCA [33], LDA [1], LPP [12], and RDA [4]. After feature extraction, different classifiers, including NNC [5], MDC [7], LRC [19], and LLRC [2], are adopted to measure the recognition performance of each method; for LLRC, the parameter k is selected by exhaustive search.
Conclusions
In this paper, we propose a novel DR algorithm called locality-regularized linear regression discriminant analysis (LLRDA) for feature extraction. This algorithm inherits the assumption of the LLRC method that neighbouring samples from one specific class should lie on a linear subspace. Based on this assumption, LLRDA seeks a discriminant subspace that is efficient for LLRC by maximizing the inter-class local reconstruction scatter while simultaneously minimizing the intra-class local reconstruction scatter.
Acknowledgements
This work is sponsored by the National Natural Science Foundation of China (Grant Nos. 61503195, 61403188, 61603159 and 61502245), Natural Science Foundation of Jiangsu Province (Grant Nos. BK20150849 and BK20160293), Project funded by China Postdoctoral Science Foundation (Grant No. 2016M600433), Project supported by the Jiangsu Key Laboratory of Image and Video Understanding for Social Safety (Nanjing University of Science and Technology) (Grant No. 30916014107) and Open Fund Project of
References (49)
- et al., Reconstructive discriminant analysis: a feature extraction method induced from linear regression classification, Neurocomputing, 2012.
- et al., Incremental learning for ν-support vector regression, Neural Netw. Off. J. Int. Neural Netw. Soc., 2015.
- et al., Local similarity and diversity preserving discriminant projection for face and handwriting digits recognition, Neurocomputing, 2012.
- et al., Discriminant similarity and variance preserving projection for feature extraction, Neurocomputing, 2014.
- et al., Feature extraction using local structure preserving discriminant analysis, Neurocomputing, 2014.
- et al., Local similarity preserving projections for face recognition, AEU - Int. J. Electron. Commun., 2015.
- et al., Fuzzy local discriminant embedding for image feature extraction, Comput. Electr. Eng., 2015.
- et al., Feature extraction using constrained maximum variance mapping, Pattern Recognit., 2008.
- et al., Kernel class-wise locality preserving projection, Inf. Sci., 2008.
- et al., Face recognition using discriminant locality preserving projections based on maximum margin criterion, Pattern Recognit., 2010.
- A local mean-based nonparametric classifier, Pattern Recognit. Lett.
- Efficient linear discriminant analysis with locality preserving for face recognition, Pattern Recognit.
- Feature extraction using two-dimensional maximum embedding difference, Inf. Sci.
- A rapid learning algorithm for vehicle classification, Inf. Sci.
- Supervised optimal locality preserving projection, Pattern Recognit.
- A multi-manifold discriminant analysis method for image feature extraction, Pattern Recognit.
- A collaborative representation based projections method for feature extraction, Pattern Recognit.
- From classifiers to discriminators: a nearest neighbor rule induced discriminant analysis, Pattern Recognit.
- Face recognition: Eigenfaces vs. Fisherfaces: recognition using class specific linear projection, IEEE Trans. Pattern Anal. Mach. Intell.
- Locality-regularized linear regression for face recognition
- Locality sensitive discriminant analysis
- Nearest neighbor pattern classification, IEEE Trans. Inf. Theory
- From few to many: illumination cone models for face recognition under variable lighting and pose, IEEE Trans. Pattern Anal. Mach. Intell.
- Digital Image Processing