ABSTRACT
Nonnegative matrix factorization (NMF) is an effective dimension reduction method for nonnegative dyadic data, and has proven useful in many areas, such as text mining, bioinformatics, and image processing. NMF is usually formulated as a constrained non-convex optimization problem, and many algorithms have been developed to solve it. Recently, a coordinate descent method called FastHals was proposed to solve least squares NMF and is regarded as one of the state-of-the-art techniques for the problem. In this paper, we first show that FastHals has an inefficiency: it uses a cyclic coordinate descent scheme and thus performs unneeded descent steps on unimportant variables. We then present a variable selection scheme that uses the gradient of the objective function, arriving at a new coordinate descent method. Our new method is considerably faster in practice, and we show that it has theoretical convergence guarantees. Moreover, when the solution is sparse, as is often the case in real applications, our method benefits by selecting important variables to update more often, resulting in higher speed. For example, on the text dataset RCV1, our method is 7 times faster than FastHals, and more than 15 times faster when the sparsity is increased by adding an L1 penalty. We also develop new coordinate descent methods for NMF when the error is measured by KL-divergence, by applying Newton's method to solve the one-variable subproblems. Experiments indicate that our algorithm for minimizing the KL-divergence is faster than the Lee & Seung multiplicative rule by a factor of 10 on the CBCL image dataset.
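To make the variable-selection idea concrete, the following is a minimal NumPy sketch of gradient-guided greedy coordinate descent for the least-squares subproblem. The function names `gcd_update` and `nmf_gcd` are our own illustration, not the authors' code, and for readability the candidate steps are recomputed in full each pass, whereas an efficient implementation maintains them incrementally; the selection rule itself (pick, per column, the entry whose projected one-variable Newton step most decreases the objective) is the idea the abstract describes.

```python
import numpy as np

def gcd_update(V, W, H, inner_iters=5):
    """Greedy coordinate descent steps on H for
    min_{H >= 0} 0.5 * ||V - W H||_F^2, with W held fixed.

    Instead of cycling over all entries of H, each pass updates only the
    entry in each column whose one-variable step helps the most.
    """
    WtW = W.T @ W                        # k x k Gram matrix (shared Hessian)
    G = WtW @ H - W.T @ V                # gradient of the objective w.r.t. H
    d = np.diag(WtW) + 1e-12             # per-coordinate second derivative
    for _ in range(inner_iters):
        # Projected one-variable Newton step for every entry of H.
        S = np.maximum(H - G / d[:, None], 0.0) - H
        # Exact objective decrease each candidate step would achieve.
        D = -G * S - 0.5 * d[:, None] * S ** 2
        # Greedy selection: the best coordinate in each column.
        rows = np.argmax(D, axis=0)
        cols = np.arange(H.shape[1])
        s = S[rows, cols]
        H[rows, cols] += s
        # Changing one entry of column j only shifts column j's gradient.
        G += WtW[:, rows] * s

def nmf_gcd(V, k, outer_iters=50, seed=0):
    """Alternate greedy coordinate descent on H and, by symmetry, on W."""
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W, H = rng.random((m, k)), rng.random((k, n))
    for _ in range(outer_iters):
        gcd_update(V, W, H)              # update H with W fixed
        Wt = np.ascontiguousarray(W.T)
        gcd_update(V.T, H.T, Wt)         # update W via the transposed problem
        W = Wt.T
    return W, H
```

Each greedy step is guaranteed not to increase the objective (the zero step is always a candidate with decrease 0), and when the optimal factors are sparse most entries never need to be touched, which is where the speedup over cyclic schemes comes from.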
REFERENCES
- M. Berry, M. Browne, A. Langville, P. Pauca, and R. Plemmons. Algorithms and applications for approximate nonnegative matrix factorization. Computational Statistics and Data Analysis, 2007.
- D. Bertsekas and J. Tsitsiklis. Parallel and Distributed Computation: Numerical Methods. Prentice-Hall, 1989.
- A. Cichocki and A.-H. Phan. Fast local algorithms for large scale nonnegative matrix and tensor factorizations. IEICE Transactions on Fundamentals, E92-A(3):708--721, 2009.
- E. Gaussier and C. Goutte. Relation between PLSA and NMF and implications. In Proceedings of the 28th Annual International ACM SIGIR Conference, 2005.
- E. F. Gonzalez and Y. Zhang. Accelerating the Lee-Seung algorithm for non-negative matrix factorization. Technical report, Department of Computational and Applied Mathematics, Rice University, 2005.
- L. Grippo and M. Sciandrone. On the convergence of the block nonlinear Gauss-Seidel method under convex constraints. Operations Research Letters, 26:127--136, 2000.
- P. O. Hoyer. Non-negative sparse coding. In Proceedings of the IEEE Workshop on Neural Networks for Signal Processing, pages 557--565, 2002.
- P. O. Hoyer. Non-negative matrix factorization with sparseness constraints. Journal of Machine Learning Research, 5:1457--1469, 2004.
- C.-J. Hsieh and I. S. Dhillon. Fast coordinate descent methods with variable selection for non-negative matrix factorization. Technical Report TR-11-06, Department of Computer Science, University of Texas at Austin, 2011.
- D. Kim, S. Sra, and I. S. Dhillon. Fast Newton-type methods for the least squares nonnegative matrix approximation problem. In Proceedings of the Sixth SIAM International Conference on Data Mining, pages 343--354, 2007.
- J. Kim and H. Park. Non-negative matrix factorization based on alternating non-negativity constrained least squares and active set method. SIAM Journal on Matrix Analysis and Applications, 30(2):713--730, 2008.
- J. Kim and H. Park. Toward faster nonnegative matrix factorization: A new algorithm and comparisons. In Proceedings of the IEEE International Conference on Data Mining, pages 353--362, 2008.
- Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278--2324, November 1998. MNIST database available at http://yann.lecun.com/exdb/mnist/.
- D. D. Lee and H. S. Seung. Learning the parts of objects by non-negative matrix factorization. Nature, 401:788--791, 1999.
- D. D. Lee and H. S. Seung. Algorithms for non-negative matrix factorization. In T. K. Leen, T. G. Dietterich, and V. Tresp, editors, Advances in Neural Information Processing Systems 13, pages 556--562. MIT Press, 2001.
- D. D. Lewis, Y. Yang, T. G. Rose, and F. Li. RCV1: A new benchmark collection for text categorization research. Journal of Machine Learning Research, 5:361--397, 2004.
- Y. Li and S. Osher. Coordinate descent optimization for l1 minimization with application to compressed sensing; a greedy algorithm. Inverse Problems and Imaging, 3(3):487--503, 2009.
- C.-J. Lin. Projected gradient methods for non-negative matrix factorization. Neural Computation, 19:2756--2779, 2007.
- C. Liu, H.-c. Yang, J. Fan, L.-W. He, and Y.-M. Wang. Distributed non-negative matrix factorization for web-scale dyadic data analysis on MapReduce. In Proceedings of the 19th International Conference on World Wide Web (WWW), 2010.
- P. Paatero and U. Tapper. Positive matrix factorization: A non-negative factor model with optimal utilization of error estimates of data values. Environmetrics, 5:111--126, 1994.
- S. Perkins, K. Lacker, and J. Theiler. Grafting: Fast, incremental feature selection by gradient descent in function space. Journal of Machine Learning Research, 3:1333--1356, 2003.
- J. Piper, P. Pauca, R. Plemmons, and M. Giffin. Object characterization from spectral data using nonnegative factorization and information theory. In Proceedings of the AMOS Technical Conference, 2004.
- R. Zdunek and A. Cichocki. Non-negative matrix factorization with quasi-Newton optimization. In Proceedings of the Eighth International Conference on Artificial Intelligence and Soft Computing (ICAISC), pages 870--879, 2006.