
Pattern Recognition

Volume 67, July 2017, Pages 252-262

Low rank representation with adaptive distance penalty for semi-supervised subspace classification

https://doi.org/10.1016/j.patcog.2017.02.017

Highlights

  • A novel LRR with adaptive distance penalty method is proposed for SSL.

  • LRRADP can better capture both the global structure and local affinity of the data.

  • The projection-based LRRADP (LRRADP2) shows impressive robustness.

  • Extensive experiments demonstrate the effectiveness of the proposed method.

Abstract

Graph-based Semi-supervised Subspace Learning (SSL) methods treat both labeled and unlabeled data as nodes in a graph, and then instantiate edges among these nodes by weighting the affinity between the corresponding pairs of samples. Constructing a good graph that discovers the intrinsic structure of the data is critical for SSL tasks such as subspace clustering and classification. Low Rank Representation (LRR) is one of the most powerful subspace clustering methods, and a weighted affinity graph can be constructed from it. Generally, adjacent samples usually belong to the same subspace, and thus nearby points in the graph should have large edge weights. Motivated by this, in this paper we propose a novel LRR with Adaptive Distance Penalty (LRRADP) to construct a good affinity graph. The graph identified by the LRRADP not only captures the global subspace structure of the whole data set but also effectively preserves the neighbor relationships among samples. Furthermore, by projecting the data set into an appropriate subspace, the LRRADP can be further improved to construct an even more discriminative affinity graph. Extensive experiments on different types of baseline datasets demonstrate the effectiveness of the proposed methods. The improved method, named LRRADP2, shows impressive performance on real-world handwritten and noisy data. The MATLAB code of the proposed methods will be available at http://www.yongxu.org/lunwen.html.

Introduction

In many big-data applications, effectively connecting unlabeled data with labeled data is of central importance [1], [2]. For example, in image-based web search and image-based object recognition, labeled data are usually limited, whereas unlabeled data are abundant and readily available on the Internet. In these problems, the goal is to build a connection between the unlabeled and labeled data and then identify the labels of the unlabeled data. Semi-supervised Subspace Learning (SSL) [3], [4], [5], [6] is a family of techniques that exploits the "manifold structure" of the data by using both labeled and unlabeled samples [7], [8].

Constructing a graph of the local connectivity of the data is an effective strategy for SSL due to its success in practice [9], [10], [11]. Graph-based SSL methods treat both labeled and unlabeled samples as nodes in a graph, and then instantiate edges among these nodes, weighted according to the affinity between the corresponding pairs of samples. Supposing that the data set is noiseless and embedded in independent subspaces, the graph identified by an SSL method should have a block-diagonal weight matrix, with each block corresponding to one subspace; subspace clustering can then produce an exactly correct clustering result from this block-diagonal matrix. To this end, we build a weighted graph G = (V, W), where V is the vertex set whose nodes correspond to the N data points and W ∈ R^{N×N} is a symmetric non-negative weight matrix representing the relationships among the nodes. A non-zero weight reflects the affinity between the corresponding nodes, and a zero weight denotes that there is no edge joining them. An ideal similarity matrix W, and hence an ideal weight graph G, is one in which nodes corresponding to points from the same subspace are connected to each other and there is no edge between any two nodes corresponding to points from different subspaces. Thus, given a data set, the problem of graph construction is to determine the weight matrix W. A perfect similarity graph built by SSL has n independent connected components corresponding to the n subspaces, and by applying spectral clustering the labels can then be propagated from the labeled samples to the unlabeled samples over the graph [12], [13], [14], [15].
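The connected-component view above can be checked numerically: for a block-diagonal affinity matrix, the number of (near-)zero eigenvalues of the graph Laplacian L = D − W equals the number of connected components, which is exactly what spectral clustering exploits. A minimal NumPy sketch with illustrative toy weights:

```python
import numpy as np

# Toy block-diagonal affinity: two subspaces, three samples each.
W = np.zeros((6, 6))
W[:3, :3] = 0.8
W[3:, 3:] = 0.6
np.fill_diagonal(W, 0.0)          # no self-loops

D = np.diag(W.sum(axis=1))        # degree matrix
L = D - W                         # unnormalized graph Laplacian

eigvals = np.linalg.eigvalsh(L)
n_components = np.sum(eigvals < 1e-8)
assert n_components == 2          # two blocks -> two connected components
```

With noisy data the small eigenvalues are only approximately zero, which is why spectral clustering uses the eigenvectors associated with the smallest eigenvalues rather than exact components.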

The recent Low Rank Representation (LRR) [13], [14] is a promising weight-graph construction method. LRR aims to find the lowest-rank representation among all candidates that can express the data vectors as linear combinations of the basis vectors in a proper dictionary. Consider a set of data X = [x1, x2, …, xn] in R^d, each column of which is a sample that can be represented by a linear combination of a basis of d-dimensional vectors. If we form the basis matrix as A = [a1, a2, …, am], then X can be represented as

X = AZ,

where Z = [z1, z2, …, zn] is the coefficient matrix, with each zi characterizing how the other samples contribute to the representation of xi. Since a sample can essentially be represented only by data lying in the same subspace, the nonzero elements of zi indicate that the corresponding samples are in the same subspace. Therefore, minimizing the rank of the representation is an appropriate criterion for clustering data drawn from multiple linear subspaces. That is, LRR seeks the lowest-rank representation of the data set:

min_Z rank(Z),  s.t.  X = AZ,

where rank(•) denotes the rank of a matrix. The low-rank constraint guarantees that the coefficients of samples coming from the same subspace are highly correlated. When the data are clean and drawn exactly from linearly independent subspaces, the similarity matrix built in this way is an ideal n-block-diagonal matrix corresponding to the n subspaces.
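As a sanity check on this formulation: when the dictionary is the data itself (A = X) and the data are noiseless, the minimizer of the standard nuclear-norm relaxation of the problem above is known to be the shape interaction matrix Z* = V V^T, where X = U S V^T is the skinny SVD of X [13]. A NumPy sketch (the subspace dimensions and random data are illustrative assumptions, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# Two independent 2-D subspaces in R^6, five clean samples from each.
B1 = rng.standard_normal((6, 2))
B2 = rng.standard_normal((6, 2))
X = np.hstack([B1 @ rng.standard_normal((2, 5)),
               B2 @ rng.standard_normal((2, 5))])

# Skinny SVD of X; r = rank(X).
U, s, Vt = np.linalg.svd(X, full_matrices=False)
r = int(np.sum(s > 1e-10))
V = Vt[:r].T

# Shape interaction matrix: the minimal-nuclear-norm solution of X = XZ.
Z = V @ V.T

assert np.allclose(X, X @ Z)           # Z is a valid self-representation
# Cross-subspace blocks of Z vanish when the subspaces are independent.
assert np.abs(Z[:5, 5:]).max() < 1e-6
```

The block-diagonal pattern of Z is what makes |Z| (suitably symmetrized) usable as the affinity matrix W of the graph.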

LRR is an effective framework for exploring the multiple-subspace structure of data. Based on LRR, many recent efforts have explored ways of constructing a discriminative graph for SSL [16], [17], [18], [19]. Liu et al. [18] proposed a latent low-rank representation method for subspace clustering, which approximates and uses the unobserved data hidden in the observed data to resolve the issue of insufficient sampling. Zhang et al. [19] extended the latent LRR by choosing the sparsest solution in the solution set to increase the robustness of the method. Wei et al. [20] proposed a robust shape interaction method that preprocesses the data using robust PCA [21] and then applies LRR to build the similarity matrix. By combining sparsity with the global structure, Zhuang et al. [22], [23] proposed a nonnegative low-rank and sparse graph for semi-supervised learning. Fang et al. [24], [25] combined nonnegative low-rank representation with semi-supervised clustering learning in one framework, achieving acceptable classification performance.

Conventional LRR-based methods usually focus on constructing the global subspace structure. However, a good graph should not only capture the global structure of all the data but also reveal the intrinsic neighbor relationships among the data [26]. In this paper, we propose a Low Rank Representation with Adaptive Distance Penalty (LRRADP) method, which builds the linear combination using nearby samples as much as possible via an adaptive distance penalty. The affinity graph built by the LRRADP can better capture both the global subspace structure of the whole data set and the local neighbor relationships among the data samples. The similarity graph/matrix identified by the LRRADP works well with conventional semi-supervised classification methods, such as Gaussian Fields and Harmonic Functions (GFHF) [8], for predicting the labels of unlabeled samples. Moreover, the LRRADP is improved to LRRADP2 by projecting the data set into an appropriate subspace.
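For context on the label-prediction step: GFHF [8] propagates labels by solving a harmonic function on the graph. Splitting W into labeled/unlabeled blocks and taking L = D − W, the predictions on the unlabeled nodes are f_u = L_uu^{-1} W_ul f_l. A minimal NumPy sketch (the toy weights and labels are illustrative assumptions):

```python
import numpy as np

# Toy affinity over 5 nodes: nodes 0-1 labeled, nodes 2-4 unlabeled.
W = np.array([[0.0, 0.1, 0.9, 0.0, 0.0],
              [0.1, 0.0, 0.0, 0.8, 0.8],
              [0.9, 0.0, 0.0, 0.1, 0.0],
              [0.0, 0.8, 0.1, 0.0, 0.2],
              [0.0, 0.8, 0.0, 0.2, 0.0]])
n_l = 2                                    # number of labeled nodes
f_l = np.array([[1.0, 0.0],                # node 0: class 0
                [0.0, 1.0]])               # node 1: class 1

L = np.diag(W.sum(axis=1)) - W             # graph Laplacian
L_uu = L[n_l:, n_l:]
W_ul = W[n_l:, :n_l]

# Harmonic solution of Zhu et al. (GFHF): f_u = L_uu^{-1} W_ul f_l
f_u = np.linalg.solve(L_uu, W_ul @ f_l)
labels = f_u.argmax(axis=1)
# Node 2 is tied to node 0, nodes 3-4 to node 1: labels == [0, 1, 1]
```

Any affinity matrix produced by LRRADP or another graph construction method can be plugged in as W here, which is how the graphs are evaluated in the experiments.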

The remainder of this paper is organized as follows. Section 2 introduces the related works of the low rank representation and semi-supervised subspace classification methods. Section 3 proposes an LRR with adaptive distance penalty method (LRRADP) for subspace classification. Section 4 extends the LRRADP to LRRADP2 by projecting the data into an appropriate subspace. Section 5 presents the experimental results and Section 6 concludes this paper.

Section snippets

Low rank representation

To capture the global structure of data, LRR [13], [14] constructs the affinities of an undirected graph. An LRR graph obtains the representation of all data under a global low-rank constraint and is thus better at capturing global data structures. It has been proven that, under suitable conditions, LRR can correctly preserve the membership of samples that belong to the same subspace [13]. Given a data set, each sample can usually be represented by the other samples that lie in the same subspace.

Low rank representation with adaptive distance penalty

Throughout this paper, all matrices are written in uppercase. For a matrix M, the (i, j)th element of M is denoted as [M]i,j. The ith row of M is denoted as [M]i,* and the jth column of M is denoted as [M]*,j. The trace of M is denoted as tr(M). The lp norm of M is denoted as ||M||p. In particular, the Frobenius norm and nuclear norm of M are denoted as ||M||F and ||M||*, respectively. The transpose of M is denoted as MT. M ≥ 0 means that all elements of M are larger than or equal to zero. I denotes the identity matrix.
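For concreteness, the two norms singled out above can be computed as follows (a NumPy sketch; the test matrix is an arbitrary assumption):

```python
import numpy as np

M = np.array([[3.0, 0.0],
              [0.0, 4.0]])

# Frobenius norm: square root of the sum of squared entries.
fro = np.linalg.norm(M, 'fro')
# Nuclear norm: sum of the singular values.
nuc = np.linalg.svd(M, compute_uv=False).sum()

assert np.isclose(fro, 5.0)   # sqrt(9 + 16)
assert np.isclose(nuc, 7.0)   # singular values are 4 and 3
```

The nuclear norm is the convex surrogate used in place of rank(Z) when the LRR problem is actually solved.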

LRRADP2

In general, the quality of the data representation greatly affects the quality of the graph. A good data representation can improve the quality of the graph and, in turn, the performance of SSL. Previous work [22] has shown that by projecting the data with a projection matrix, the embedded data facilitate the subsequent data representation and increase classification accuracy. To improve the data representation, we propose to learn an appropriate subspace in which the graph …

Experiments

In this section, we evaluate the performance of the proposed methods on baseline databases against other state-of-the-art graph construction methods. We combine the graphs identified by the LRRADP, LRRADP2, and conventional popular graph construction methods with the GFHF method to perform semi-supervised classification, and quantitatively evaluate their performance. We test and compare these solvers on six representative data sets, including the COIL20, AR, Extended Yale B, Isolet5, …

Conclusions

In this paper, we considered the general problem of learning from labeled and unlabeled samples and classifying the unlabeled samples, and we proposed a novel low rank representation with adaptive distance penalty, named LRRADP, to learn the affinity graph of the data set. By embedding the adaptive distance penalty into the LRR, the obtained affinity graph can not only better capture the global clustering structure of the whole data but also preserve the local neighbor relationships among the data.

Acknowledgment

This work was partially supported by the National Natural Science Foundation of China (No. 61370163) and the Shenzhen Municipal Science and Technology Innovation Council (No. JCYJ20130329154017293).

Lunke Fei received the B.S. and M.S. degrees in computer science and technology from East China Jiaotong University, China, in 2004 and 2007, respectively. He is currently pursuing the Ph.D. degree in computer science and technology at Shenzhen Graduate School, Harbin Institute of Technology, Shenzhen, China. His current research interests include pattern recognition and biometrics.

References (39)

  • F. Nie et al., Semi-supervised orthogonal discriminant analysis via label propagation, Pattern Recogn. (2009)
  • H. Zhang et al., Robust latent low rank representation for subspace clustering, Neurocomputing (2014)
  • L. He et al., Iterative ensemble normalized cuts, Pattern Recogn. (2016)
  • J. Yan et al., A general framework for motion segmentation: independent, articulated, rigid, non-rigid, degenerate and nondegenerate
  • D. Cai et al., Semi-supervised discriminant analysis
  • X. Zhu, Semi-supervised learning literature survey (2005)
  • S. Yan et al., Graph embedding and extensions: a general framework for dimensionality reduction, IEEE Trans. Pattern Anal. Mach. Intell. (2007)
  • B. Ni et al., Learning a propagable graph for semisupervised learning: classification and regression, IEEE Trans. Knowl. Data Eng. (2012)
  • V. Sindhwani et al., Linear manifold regularization for large scale semi-supervised learning
  • X. Zhu et al., Semi-supervised learning using Gaussian fields and harmonic functions
  • Y. Luo et al., Manifold regularized multitask learning for semi-supervised multilabel image classification, IEEE Trans. Image Process. (2013)
  • F. Nie et al., Flexible manifold embedding: a framework for semi-supervised and unsupervised dimension reduction, IEEE Trans. Image Process. (2010)
  • R. He et al., Nonnegative sparse coding for discriminative semi-supervised learning
  • J. Wang et al., Linear neighborhood propagation and its applications, IEEE Trans. Pattern Anal. Mach. Intell. (2009)
  • G. Liu et al., Robust recovery of subspace structures by low-rank representation, IEEE Trans. Pattern Anal. Mach. Intell. (2013)
  • G. Liu et al., Robust subspace segmentation by low-rank representation
  • J. Shi et al., Normalized cuts and image segmentation, IEEE Trans. Pattern Anal. Mach. Intell. (2000)
  • K. Tang et al., Structure-constrained low-rank representation, IEEE Trans. Neural Networks Learn. Syst. (2014)
  • J. Chen et al., Robust subspace segmentation via low-rank representation, IEEE Trans. Cybern. (2014)

    Yong Xu received the B.S. and M.S. degrees in 1994 and 1997, respectively, and the Ph.D. degree in pattern recognition and intelligence system from the Nanjing University of Science and Technology, Nanjing, China, in 2005. Currently, he is with the Bio-Computing Research Center, Shenzhen Graduate School, Harbin Institute of Technology, Shenzhen, China. His current research interests include pattern recognition, biometrics, bioinformatics, machine learning, image processing, and video analysis.

    Xiaozhao Fang received the M.S. degree in computer science from Guangdong University of Technology in 2008, and the Ph.D. degree in computer science and technology at Shenzhen Graduate School, Harbin Institute of Technology in 2016. He has published more than 15 journal papers. His current research interests include pattern recognition and machine learning.

Jian Yang received the B.S. degree in mathematics from Xuzhou Normal University in 1995, the M.S. degree in applied mathematics from Changsha Railway University in 1998, and the Ph.D. degree from the Nanjing University of Science and Technology (NUST) in 2002, with a focus on pattern recognition and intelligence systems. In 2003, he was a Post-Doctoral Researcher with the University of Zaragoza. He was a Post-Doctoral Fellow with the Biometrics Centre, The Hong Kong Polytechnic University, from 2004 to 2006, and with the Department of Computer Science, New Jersey Institute of Technology, from 2006 to 2007. He is currently a Professor with the School of Computer Science and Technology, NUST. His journal papers have been cited more than 1600 times in the ISI Web of Science and 2800 times in Google Scholar. He has authored over 80 scientific papers in pattern recognition and computer vision. His research interests include pattern recognition, computer vision, and machine learning. He is currently an Associate Editor of Pattern Recognition Letters and of the IEEE Transactions on Neural Networks and Learning Systems.
