Elsevier

Information Sciences

Volume 429, March 2018, Pages 164-176

Locality-regularized linear regression discriminant analysis for feature extraction

https://doi.org/10.1016/j.ins.2017.11.001

Abstract

Locality-regularized linear regression classification (LLRC) is an effective classifier that shows great potential for face recognition. However, the original feature space cannot guarantee the classification efficiency of LLRC. To alleviate this problem, we propose a novel dimensionality reduction method called locality-regularized linear regression discriminant analysis (LLRDA) for feature extraction. The proposed LLRDA is developed according to the decision rule of LLRC and seeks to generate a subspace that is discriminant for LLRC. Specifically, the intra-class and inter-class local reconstruction scatters are first defined to characterize the compactness and separability of samples, respectively. Then, the objective function for LLRDA is derived by maximizing the inter-class local reconstruction scatter and simultaneously minimizing the intra-class local reconstruction scatter. Extensive experimental results on CMU PIE, ORL, FERET, and Yale-B face databases validate the effectiveness of our proposed method.

Introduction

Feature extraction (or dimensionality reduction) has become an effective tool for understanding the underlying structure of data in many real-world applications, such as face recognition, image retrieval [37], fingerprint authentication [47], bioinformatics [45], and data mining [24]. The goal of feature extraction is to reduce redundant, irrelevant, and noisy information while preserving the intrinsic information contained in the data.

Over the past few decades, numerous dimensionality reduction (DR) methods have been developed for feature extraction. Principal component analysis (PCA) [33] and linear discriminant analysis (LDA) [1] are the two most representative DR algorithms. PCA aims to find a set of projection axes such that the variance of the projected data is maximized. PCA is an unsupervised approach, as it ignores prior class information. In contrast, LDA is a supervised algorithm: it searches for a set of projection axes such that samples from the same class are brought into close proximity while samples from different classes are simultaneously separated. LDA exploits the class information to discover the discriminant structure of the data and is thus usually more effective than PCA for supervised pattern recognition tasks. However, LDA still has some limitations. For example, it usually suffers from the small sample size (SSS) problem [29] (which leads to singularity of the within-class scatter matrix), and it can extract at most C − 1 features (where C is the number of sample classes).
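
The contrast between the two criteria can be illustrated in code. Below is a minimal NumPy sketch (the function names and the eigen-decomposition route are ours, not taken from any reference above); practical LDA implementations typically apply PCA first to avoid the SSS singularity noted above.

```python
import numpy as np

def pca(X, d):
    """PCA: project onto the d directions of maximal variance.
    X has one sample per row, shape (n_samples, n_features)."""
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / (len(X) - 1)
    # eigh returns eigenvalues in ascending order; take the largest d.
    vals, vecs = np.linalg.eigh(cov)
    return vecs[:, ::-1][:, :d]

def lda(X, y, d):
    """LDA: maximize between-class over within-class scatter.
    At most C - 1 useful directions exist (C = number of classes)."""
    mean = X.mean(axis=0)
    Sw = np.zeros((X.shape[1], X.shape[1]))
    Sb = np.zeros_like(Sw)
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)          # within-class scatter
        diff = (mc - mean)[:, None]
        Sb += len(Xc) * diff @ diff.T           # between-class scatter
    # Generalized eigenproblem Sb w = lambda Sw w; Sw is assumed
    # nonsingular here (hence the PCA preprocessing caveat above).
    vals, vecs = np.linalg.eig(np.linalg.solve(Sw, Sb))
    order = np.argsort(vals.real)[::-1]
    return vecs.real[:, order[:d]]
```

Note that `pca` needs no labels while `lda` does, which is exactly the unsupervised/supervised distinction drawn above.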

In recent years, inspired by manifold learning, a family of locality-based DR algorithms has been proposed. Locality preserving projections (LPP) [12] is one of the most popular locality-based DR methods. Unlike PCA and LDA, which can only discover the global Euclidean structure of the data, LPP is able to uncover the intrinsic manifold structure of the data: it seeks a set of projection directions such that the neighbourhood structure of the data is preserved after projection. To exploit the class information to improve classification performance, several supervised versions of LPP have been proposed, such as class-wise locality preserving projection (CLPP) [23], supervised optimal locality preserving projection (SOLPP) [36], and local similarity preserving projections (LSPP) [16]. In addition, motivated by LDA and LPP, many discriminant locality-based DR algorithms [3], [13], [14], [15], [17], [18], [21], [25], [26], [31], [34], [38], [39], [42], [46] have been developed. These methods combine the ideas behind LDA and LPP and aim to discover the local and discriminant structures of the data. Yan et al. proposed a general DR framework called graph embedding [39], which provides a powerful platform on which various DR algorithms can be developed; it has also been shown that all the aforementioned methods can be interpreted within the graph embedding framework.

After feature extraction, data classification is another important step in pattern recognition. Once distinctive features are obtained using DR algorithms, a suitable classifier can be chosen to classify them. So far, many useful classification methods [5], [7], [8], [9], [10], [11], [22], [35], [41], [49] have been developed, such as the nearest neighbour classifier (NNC) [5], the nearest feature line (NFL) classifier [22], and the minimum distance classifier (MDC) [7]. Recently, representation-based classification methods have attracted considerable research attention and show great potential for pattern classification. Sparse representation-based classification (SRC) [20] is one of the most representative such methods. SRC first codes a testing sample as a sparse linear combination of the training samples, and then classifies the testing sample by evaluating which class leads to the minimum reconstruction error. SRC has shown promising and robust results for face recognition, as it can effectively handle errors due to occlusion and corruption. However, SRC is usually time-consuming, owing to the computation of the reconstruction coefficients by solving an l1-norm minimization problem. Zhang et al. [48] analysed the working mechanism of SRC and pointed out that it is the collaborative representation (CR), and not the l1-norm sparsity, that makes SRC powerful for face recognition. Accordingly, they proposed a CR-based classification (CRC) method and demonstrated that CRC achieves highly competitive classification results with significantly less complexity than SRC. Motivated by this promising property of CRC, Yang et al. developed a novel DR algorithm called collaborative representation-based projection (CRP) [43] to preserve the collaborative representation relationships between samples. In addition to SRC and CRC, linear regression classification (LRC) [19] is a well-known representation-based classifier.
LRC codes the testing sample as a linear combination of class-specific training samples, and the decision is made in favour of the class with the minimal reconstruction error. Compared to SRC, LRC takes less time to estimate the reconstruction coefficients, as it uses the least-squares method.
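
The LRC decision rule just described admits a compact sketch. The following NumPy illustration (the function name and the dictionary-of-class-matrices representation are our own choices for exposition) reconstructs the test sample from each class's samples by least squares and picks the class with the smallest residual:

```python
import numpy as np

def lrc_predict(y, class_samples):
    """Linear regression classification (LRC), minimal sketch.
    y: test sample, shape (N,).
    class_samples: dict mapping a class label to an (N, n_i) matrix
    whose columns are that class's training samples."""
    best_label, best_err = None, np.inf
    for label, Xi in class_samples.items():
        # Least-squares reconstruction of y from class-i samples.
        beta, *_ = np.linalg.lstsq(Xi, y, rcond=None)
        err = np.linalg.norm(y - Xi @ beta)
        if err < best_err:
            best_label, best_err = label, err
    return best_label
```

A test sample lying (nearly) in the span of one class's samples yields a (near-)zero residual for that class, which is the class-specific subspace assumption LRC rests on.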

As mentioned above, both feature extraction and data classification are of great importance for pattern recognition. However, most DR algorithms (for feature extraction) were designed independently of the decision rules of the classifiers, and the classifier is generally selected heuristically (e.g., based on experience) at the classification stage. The subspaces learned by DR algorithms generally have different characteristics that are invisible to the classifiers. Consequently, a classifier selected after feature extraction cannot effectively exploit the characteristics of the learned subspace; the DR algorithm may therefore not work well with the selected classifier, which often degrades recognition performance. To better connect a DR algorithm to one specific classifier, some researchers have considered learning the DR algorithm according to the decision rule of the classifier. For example, Yang et al. proposed local mean-based nearest neighbour discriminant analysis (LM-NNDA) [44] and sparse representation classifier steered discriminative projection (SRC-DP) [40] according to the decision rules of the local mean-based nearest neighbour (LM-NN) classifier [27] and SRC, respectively. Inspired by Yang's work, Chen et al. used the decision rule of LRC to develop a novel DR algorithm named reconstructive discriminant analysis (RDA) [4]. These studies demonstrate remarkable results and validate the feasibility and effectiveness of using classification rules to steer the design of DR algorithms.

More recently, Brown et al. proposed a locality-regularized linear regression classification (LLRC) [2] method that uses a manifold learning procedure to extend conventional LRC and improve its accuracy. However, the original feature space cannot guarantee the efficiency of LLRC. Motivated by the aforementioned methods [4], [40], [44], this paper proposes a novel DR algorithm, termed locality-regularized linear regression discriminant analysis (LLRDA), which uses the decision rule of LLRC to generate an efficient feature subspace that is optimal for LLRC. Concretely, the inter-class and intra-class local reconstruction scatters are first constructed to characterize the separability and compactness of the samples, respectively; then, the feature extraction criterion is derived by maximizing the ratio between them. Since the decision rule of LLRC steers the design of LLRDA, LLRDA is able to perform well with LLRC in pattern recognition.
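
Once the two scatter matrices are in hand, the "maximize the ratio" criterion reduces to a standard generalized eigenvalue computation. The sketch below assumes the scatter matrices have already been built (their exact construction from k-nearest-neighbour reconstructions is the paper's contribution and is not reproduced here); the function name, the regularization term, and the ratio-trace relaxation of the criterion are our own illustrative choices:

```python
import numpy as np

def llrda_projection(S_intra, S_inter, d, reg=1e-6):
    """Given precomputed intra-class and inter-class local
    reconstruction scatter matrices, return a projection W whose
    columns maximize inter-class scatter relative to intra-class
    scatter, via the usual generalized eigenvalue relaxation."""
    # Small ridge keeps S_intra invertible (the SSS problem).
    S = S_intra + reg * np.eye(S_intra.shape[0])
    vals, vecs = np.linalg.eig(np.linalg.solve(S, S_inter))
    order = np.argsort(vals.real)[::-1]
    return vecs.real[:, order[:d]]
```

For instance, with S_intra = I and S_inter = diag(3, 1), the top projection direction is the first coordinate axis, where the inter/intra ratio is largest.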

The remainder of this paper is organized as follows. Section 2 briefly reviews LRC and LLRC. Section 3 details the proposed LLRDA method. Section 4 discusses the relationship between LLRDA and other techniques such as LRC and RDA. Section 5 presents the experimental results on CMU PIE, ORL, FERET, and Yale-B face databases to evaluate the effectiveness of LLRDA. Finally, Section 6 concludes and discusses future research directions.

Section snippets

LRC and LLRC

Let X = [x_1, x_2, …, x_n] ∈ ℝ^(N×n) denote a set of n training image samples from C classes, and suppose the number of samples from the ith class is n_i. Then, we have ∑_{i=1}^{C} n_i = n, where x_i ∈ ℝ^(N×1) (i = 1, 2, …, n) is an N-dimensional column vector representing the ith image vector. In this section, we briefly review LRC and LLRC.

Basic principle

As discussed earlier, the main advantage of LLRC over LRC is that LLRC exploits the neighbourhood relationship between data samples to enhance the classification performance. LLRC assumes that the neighbouring samples, rather than the overall samples, from a specific class lie in a linear subspace. To classify a testing sample, LLRC first computes the local reconstruction errors for each class, and then evaluates which class leads to the minimal local reconstruction error. Clearly, if the
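
The locality regularization amounts to one change in the LRC sketch: each class's reconstruction basis is restricted to the k training samples of that class nearest to the test sample. A minimal NumPy illustration (function name and data layout are our own; k is the neighbourhood-size parameter referred to later in the experiments):

```python
import numpy as np

def llrc_predict(y, class_samples, k):
    """Locality-regularized linear regression classification (LLRC),
    minimal sketch: reconstruct the test sample from each class's k
    nearest neighbours only, then pick the class with the smallest
    local reconstruction error."""
    best_label, best_err = None, np.inf
    for label, Xi in class_samples.items():
        # Distances from y to each column (training sample) of Xi.
        dists = np.linalg.norm(Xi - y[:, None], axis=0)
        Xk = Xi[:, np.argsort(dists)[:k]]       # k nearest per class
        beta, *_ = np.linalg.lstsq(Xk, y, rcond=None)
        err = np.linalg.norm(y - Xk @ beta)
        if err < best_err:
            best_label, best_err = label, err
    return best_label
```

Compared with the global LRC rule, only the neighbouring samples of each class are assumed to lie in a linear subspace, which is exactly the assumption stated above.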

Connections to LRC

As introduced in the previous sections, the connections between LLRDA and LRC can be summarized as follows:

  • (1) LLRDA is a DR method for extracting features from the original data, whereas LRC is a classifier used to recognize the features obtained by DR methods (e.g., PCA, LDA, and LLRDA).

  • (2) LRC is based on the assumption that samples from one specific class lie on the same subspace, whereas LLRDA is based on the assumption that neighbouring samples from one specific class lie on the same subspace.

Experiments

In this section, extensive experiments are conducted on the CMU PIE, ORL, FERET, and Yale-B face databases to verify the effectiveness of LLRDA. In addition, we compare LLRDA with other state-of-the-art feature extraction algorithms such as PCA [33], LDA [1], LPP [12], and RDA [4]. After feature extraction, different classifiers such as NNC [5], MDC [7], LRC [19], and LLRC [2] are adopted to measure the recognition performance of each method; for LLRC, the parameter k is selected by exhaustive search.

Conclusions

In this paper, we propose a novel DR algorithm called locality-regularized linear regression discriminant analysis (LLRDA) for feature extraction. This algorithm inherits the assumption of the LLRC method that neighbouring samples from one specific class should lie on a linear subspace. Based on this assumption, LLRDA seeks to find a discriminant subspace that is efficient for LLRC by maximizing the inter-class local reconstruction scatter while simultaneously minimizing the intra-class local reconstruction scatter.

Acknowledgements

This work is sponsored by the National Natural Science Foundation of China (Grant Nos. 61503195, 61403188, 61603159 and 61502245), Natural Science Foundation of Jiangsu Province (Grant Nos. BK20150849 and BK20160293), Project funded by China Postdoctoral Science Foundation (Grant No. 2016M600433), Project supported by the Jiangsu Key Laboratory of Image and Video Understanding for Social Safety (Nanjing University of Science and Technology) (Grant No. 30916014107) and Open Fund Project of

References (49)

  • Y. Mitani et al., A local mean-based nonparametric classifier, Pattern Recognit. Lett. (2006)

  • X. Shu et al., Efficient linear discriminant analysis with locality preserving for face recognition, Pattern Recognit. (2012)

  • M. Wan et al., Feature extraction using two-dimensional maximum embedding difference, Inf. Sci. (2014)

  • X. Wen et al., A rapid learning algorithm for vehicle classification, Inf. Sci. (2015)

  • W.K. Wong et al., Supervised optimal locality preserving projection, Pattern Recognit. (2012)

  • W. Yang et al., A multi-manifold discriminant analysis method for image feature extraction, Pattern Recognit. (2011)

  • W. Yang et al., A collaborative representation based projections method for feature extraction, Pattern Recognit. (2015)

  • J. Yang et al., From classifiers to discriminators: a nearest neighbor rule induced discriminant analysis, Pattern Recognit. (2011)

  • P. Belhumeur et al., Eigenfaces vs. Fisherfaces: recognition using class specific linear projection, IEEE Trans. Pattern Anal. Mach. Intell. (1997)

  • D. Brown et al., Locality-regularized linear regression for face recognition

  • D. Cai et al., Locality sensitive discriminant analysis

  • T. Cover et al., Nearest neighbor pattern classification, IEEE Trans. Inf. Theory (1967)

  • A.S. Georghiades et al., From few to many: illumination cone models for face recognition under variable lighting and pose, IEEE Trans. Pattern Anal. Mach. Intell. (2001)

  • R.C. Gonzalez et al., Digital Image Processing (2006)