
Alternating direction method of multipliers for penalized zero-variance discriminant analysis

Computational Optimization and Applications

Abstract

We consider the task of classification in the high dimensional setting where the number of features of the given data is significantly greater than the number of observations. To accomplish this task, we propose a heuristic, called sparse zero-variance discriminant analysis, for simultaneously performing linear discriminant analysis and feature selection on high dimensional data. This method combines classical zero-variance discriminant analysis, where discriminant vectors are identified in the null space of the sample within-class covariance matrix, with penalization applied to induce sparse structures in the resulting vectors. To approximately solve the resulting nonconvex problem, we develop a simple algorithm based on the alternating direction method of multipliers. Further, we show that this algorithm is applicable to a larger class of penalized generalized eigenvalue problems, including a particular relaxation of the sparse principal component analysis problem. Finally, we establish theoretical guarantees for convergence of our algorithm to stationary points of the original nonconvex problem, and empirically demonstrate the effectiveness of our heuristic for classifying simulated data and data drawn from applications in time-series classification.
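
To make the preceding description concrete, the following is a minimal sketch of the kind of problem and ADMM splitting the abstract refers to. The notation is assumed for illustration and is not necessarily the authors' exact formulation: W and B denote the sample within-class and between-class covariance matrices, N is a matrix whose columns span the null space of W, D is a penalty-weighting matrix (the identity for a plain l1 penalty), gamma > 0 controls the degree of sparsity, and rho > 0 is the ADMM penalty parameter.

% Illustrative penalized zero-variance problem (assumed notation, not the paper's verbatim statement):
\[
  \min_{\|x\|_2 \le 1} \; -\tfrac{1}{2}\, x^{\top} N^{\top} B N x \;+\; \gamma \, \| D N x \|_{1}.
\]
% A standard ADMM splitting introduces the auxiliary variable y = D N x and iterates
\begin{align*}
  x^{k+1} &= \operatorname*{arg\,min}_{\|x\|_2 \le 1} \; -\tfrac{1}{2}\, x^{\top} N^{\top} B N x \;+\; \tfrac{\rho}{2}\, \| D N x - y^{k} + u^{k} \|_2^{2}, \\
  y^{k+1} &= S_{\gamma/\rho}\bigl( D N x^{k+1} + u^{k} \bigr), \\
  u^{k+1} &= u^{k} + D N x^{k+1} - y^{k+1},
\end{align*}
% where S_tau denotes elementwise soft-thresholding and u is the scaled dual variable.

Under this splitting, the x-subproblem is a trust-region-type quadratic minimization over the unit ball, the y-update has a closed-form soft-thresholding solution, and the u-update is the usual dual ascent step; separating the nonconvex quadratic term from the nonsmooth penalty in this way is what makes an ADMM approach natural for this class of problems.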


Acknowledgments

The work presented in this paper was partially carried out while B.P.W. Ames was a postdoctoral fellow at the Institute for Mathematics and its Applications during the IMA’s annual program on Mathematics of Information (supported by National Science Foundation (NSF) award DMS-0931945), and while B.P.W. Ames was a von Kármán instructor at the California Institute of Technology supported by Joel Tropp under Office of Naval Research (ONR) award N00014-11-1002. This research was also supported by University of Alabama Research Grant RG14678. We are grateful to Fadil Santosa, Krystal Taylor, Zhi-Quan Luo, Meisam Razaviyayn, and Line Clemmensen for their insights and helpful suggestions. Finally, we are grateful for the contributions of two anonymous reviewers whose suggestions have greatly improved this manuscript.

Author information

Correspondence to Brendan P. W. Ames.


Cite this article

Ames, B.P.W., Hong, M. Alternating direction method of multipliers for penalized zero-variance discriminant analysis. Comput Optim Appl 64, 725–754 (2016). https://doi.org/10.1007/s10589-016-9828-y

