Abstract
When only a small number of labeled samples are available, supervised dimensionality reduction methods tend to perform poorly because of overfitting. In such cases, unlabeled samples can help improve performance. In this paper, we propose a semi-supervised dimensionality reduction method that preserves the global structure of unlabeled samples while separating labeled samples in different classes from each other. The proposed method, which we call SEmi-supervised Local Fisher discriminant analysis (SELF), has an analytic form of the globally optimal solution, which can be computed by eigendecomposition. We show the usefulness of SELF through experiments with benchmark and real-world document classification datasets.
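To make the abstract's description more concrete, below is a minimal sketch in Python/NumPy (the paper's reference implementation is in MATLAB; see the link under Additional information) of the kind of criterion SELF optimizes: local Fisher scatter matrices built from the labeled samples are blended with PCA-style total scatter over all samples via a trade-off parameter, and the embedding is read off a generalized eigendecomposition. The function name, the `beta` weighting, and the kNN-based local-scaling affinity (in the spirit of Zelnik-Manor and Perona 2005) are illustrative assumptions rather than the paper's exact formulation.

```python
import numpy as np
from scipy.linalg import eigh


def self_embedding_sketch(X_labeled, y, X_all, dim=2, beta=0.5, knn=7):
    """Return a (d x dim) projection matrix; rows of the X arrays are samples."""
    d = X_all.shape[1]
    n_l = X_labeled.shape[0]

    # Supervised part: simplified local Fisher scatter on the labeled samples.
    # Affinity uses per-point local scaling (self-tuning heuristic).
    D = np.linalg.norm(X_labeled[:, None, :] - X_labeled[None, :, :], axis=2)
    sigma = np.sort(D, axis=1)[:, min(knn, n_l - 1)] + 1e-12  # distance to knn-th neighbor
    A = np.exp(-D ** 2 / (sigma[:, None] * sigma[None, :]))

    S_lw = np.zeros((d, d))  # local within-class scatter
    S_lb = np.zeros((d, d))  # local between-class scatter
    for i in range(n_l):
        for j in range(n_l):
            diff = np.outer(X_labeled[i] - X_labeled[j], X_labeled[i] - X_labeled[j])
            if y[i] == y[j]:
                n_c = np.sum(y == y[i])
                S_lw += 0.5 * (A[i, j] / n_c) * diff
                S_lb += 0.5 * A[i, j] * (1.0 / n_l - 1.0 / n_c) * diff
            else:
                S_lb += 0.5 * (1.0 / n_l) * diff

    # Unsupervised part: total scatter of all (labeled + unlabeled) samples, as in PCA.
    Xc = X_all - X_all.mean(axis=0)
    S_t = Xc.T @ Xc / X_all.shape[0]

    # Semi-supervised combination and generalized eigendecomposition.
    S_b = beta * S_lb + (1.0 - beta) * S_t                            # blended "between" scatter
    S_w = beta * S_lw + (1.0 - beta) * np.eye(d) + 1e-9 * np.eye(d)   # keep positive definite
    eigvals, eigvecs = eigh(S_b, S_w)                                 # solves S_b v = lambda S_w v
    order = np.argsort(eigvals)[::-1]                                 # largest eigenvalues first
    return eigvecs[:, order[:dim]]
```

With `beta = 1` the criterion falls back to a purely supervised local Fisher analysis of the labeled samples, and with `beta = 0` it reduces to PCA on all samples; intermediate values trade the two off, which is the semi-supervised behavior the abstract describes.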
References
Albert, A. (1972). Regression and the Moore-Penrose pseudoinverse. San Diego: Academic Press.
Aronszajn, N. (1950). Theory of reproducing kernels. Transactions of the American Mathematical Society, 68, 337–404.
Bai, Z., Demmel, J., Dongarra, J., Ruhe, A., & van der Vorst, H. (Eds.) (2000). Templates for the solution of eigenvalue problems: a practical guide. Philadelphia: Society for Industrial and Applied Mathematics.
Belkin, M., & Niyogi, P. (2003). Laplacian eigenmaps for dimensionality reduction and data representation. Neural Computation, 15, 1373–1396.
Belkin, M., Niyogi, P., & Sindhwani, V. (2006). Manifold regularization: a geometric framework for learning from labeled and unlabeled examples. Journal of Machine Learning Research, 7, 2399–2434.
Boyd, S., & Vandenberghe, L. (2004). Convex optimization. Cambridge: Cambridge University Press.
Cai, D., He, X., & Han, J. (2007). Semi-supervised discriminant analysis. In Proceedings of the IEEE international conference on computer vision (pp. 1–7), Rio de Janeiro, Brazil.
Chapelle, O., Schölkopf, B., & Zien, A. (Eds.) (2006). Semi-supervised learning. Cambridge: MIT Press.
Chung, F. R. K. (1997). Spectral graph theory. Providence: American Mathematical Society.
Davidov, D., Gabrilovich, E., & Markovitch, S. (2004). Parameterized generation of labeled datasets for text categorization based on a hierarchical directory. In The 27th annual international ACM SIGIR conference (pp. 250–257), Sheffield, UK.
Duffy, N., & Collins, M. (2002). Convolution kernels for natural language. In Advances in neural information processing systems (Vol. 14, pp. 625–632). Cambridge: MIT Press.
Fisher, R. A. (1936). The use of multiple measurements in taxonomic problems. Annals of Eugenics, 7, 179–188.
Friedman, J. H. (1989). Regularized discriminant analysis. Journal of the American Statistical Association, 84, 165–175.
Fukunaga, K. (1990). Introduction to statistical pattern recognition (2nd ed.). San Diego: Academic Press.
Gärtner, T. (2003). A survey of kernels for structured data. SIGKDD Explorations, 5, S268–S275.
Gärtner, T., Flach, P., & Wrobel, S. (2003). On graph kernels: hardness results and efficient alternatives. In Proceedings of the sixteenth annual conference on computational learning theory (pp. 129–143).
Globerson, A., & Roweis, S. (2006). Metric learning by collapsing classes. In Advances in neural information processing systems (Vol. 18, pp. 451–458). Cambridge: MIT Press.
Goldberger, J., Roweis, S., Hinton, G., & Salakhutdinov, R. (2005). Neighbourhood components analysis. In Advances in neural information processing systems (Vol. 17, pp. 513–520). Cambridge: MIT Press.
Guyon, I., & Elisseeff, A. (2003). An introduction to variable and feature selection. Journal of Machine Learning Research, 3, 1157–1182.
He, X., & Niyogi, P. (2004). Locality preserving projections. In Advances in neural information processing systems (Vol. 16, pp. 153–160). Cambridge: MIT Press.
Hinton, G. E., & Salakhutdinov, R. R. (2006). Reducing the dimensionality of data with neural networks. Science, 313, 504–507.
Joachims, T. (2002). Learning to classify text using support vector machines: methods, theory and algorithms. Dordrecht: Kluwer Academic.
Jolliffe, I. T. (1986). Principal component analysis. New York: Springer.
Kashima, H., & Koyanagi, T. (2002). Kernels for semi-structured data. In Proceedings of the nineteenth international conference on machine learning (pp. 291–298). San Mateo: Morgan Kaufmann.
Kashima, H., Tsuda, K., & Inokuchi, A. (2003). Marginalized kernels between labeled graphs. In Proceedings of the twentieth international conference on machine learning (pp. 321–328). San Mateo: Morgan Kaufmann.
Kohavi, R., & John, G. (1997). Wrappers for feature subset selection. Artificial Intelligence, 97, 273–324.
Kondor, R. I., & Lafferty, J. (2002). Diffusion kernels on graphs and other discrete input spaces. In Proceedings of the nineteenth international conference on machine learning (pp. 315–322).
Lodhi, H., Saunders, C., Shawe-Taylor, J., Cristianini, N., & Watkins, C. (2002). Text classification using string kernels. Journal of Machine Learning Research, 2, 419–444.
Loog, M. (2007). A complete characterization of a family of solutions to a generalized Fisher criterion. Journal of Machine Learning Research, 8, 2121–2123.
Loog, M. (2008). On the equivalence of linear dimensionality-reducing transformations. Journal of Machine Learning Research, 9, 2489–2490.
Mika, S., Rätsch, G., Weston, J., Schölkopf, B., Smola, A., & Müller, K.-R. (2003). Constructing descriptive and discriminative nonlinear features: Rayleigh coefficients in kernel feature spaces. IEEE Transactions on Pattern Analysis and Machine Intelligence, 25, 623–628.
Quiñonero-Candela, J., Sugiyama, M., Schwaighofer, A., & Lawrence, N. (Eds.) (2009). Dataset shift in machine learning. Cambridge: MIT Press.
Rätsch, G., Onoda, T., & Müller, K.-R. (2001). Soft margins for AdaBoost. Machine Learning, 42, 287–320.
Roweis, S., & Saul, L. (2000). Nonlinear dimensionality reduction by locally linear embedding. Science, 290, 2323–2326.
Schölkopf, B., Smola, A., & Müller, K.-R. (1998). Nonlinear component analysis as a kernel eigenvalue problem. Neural Computation, 10, 1299–1319.
Shimodaira, H. (2000). Improving predictive inference under covariate shift by weighting the log-likelihood function. Journal of Statistical Planning and Inference, 90, 227–244.
Song, Y., Nie, F., Zhang, C., & Xiang, S. (2008). A unified framework for semi-supervised dimensionality reduction. Pattern Recognition, 41, 2789–2799.
Sugiyama, M. (2007). Dimensionality reduction of multimodal labeled data by local Fisher discriminant analysis. Journal of Machine Learning Research, 8, 1027–1061.
Sugiyama, M., Krauledat, M., & Müller, K.-R. (2007). Covariate shift adaptation by importance weighted cross validation. Journal of Machine Learning Research, 8, 985–1005.
Sugiyama, M., Ide, T., Nakajima, S., & Sese, J. (2008). Semi-supervised local Fisher discriminant analysis for dimensionality reduction. In Advances in knowledge discovery and data mining (pp. 333–344). Berlin: Springer.
Tenenbaum, J. B., de Silva, V., & Langford, J. C. (2000). A global geometric framework for nonlinear dimensionality reduction. Science, 290, 2319–2323.
Weinberger, K., Blitzer, J., & Saul, L. (2006). Distance metric learning for large margin nearest neighbor classification. In Advances in neural information processing systems (Vol. 18, pp. 1473–1480). Cambridge: MIT Press.
Xing, E. P., Ng, A. Y., Jordan, M. I., & Russell, S. (2003). Distance metric learning with application to clustering with side-information. In Advances in neural information processing systems (Vol. 15, pp. 505–512). Cambridge: MIT Press.
Ye, J. (2005). Characterization of a family of algorithms for generalized discriminant analysis on undersampled problems. Journal of Machine Learning Research, 6, 483–502.
Ye, J. (2008). Comments on the complete characterization of a family of solutions to a generalized Fisher criterion. Journal of Machine Learning Research, 9, 517–519.
Zadrozny, B. (2004). Learning and evaluating classifiers under sample selection bias. In Proceedings of the twenty-first international conference on machine learning (pp. 903–910). New York: ACM.
Zelnik-Manor, L., & Perona, P. (2005). Self-tuning spectral clustering. In Advances in neural information processing systems (Vol. 17, pp. 1601–1608). Cambridge: MIT Press.
Zhang, D., Zhou, Z.-H., & Chen, S. (2007). Semi-supervised dimensionality reduction. In Proceedings of the 7th SIAM international conference on data mining (pp. 629–634), Minneapolis, MN, USA.
Additional information
Editor: Roni Khardon.
A preliminary version of this paper was previously published in Sugiyama et al. (2008). A MATLAB implementation of the proposed dimensionality reduction method SELF is available from http://sugiyama-www.cs.titech.ac.jp/~sugi/software/SELF.
Cite this article
Sugiyama, M., Idé, T., Nakajima, S. et al. Semi-supervised local Fisher discriminant analysis for dimensionality reduction. Mach Learn 78, 35–61 (2010). https://doi.org/10.1007/s10994-009-5125-7