Nonnegative Matrix Factorization with Integrated Graph and Feature Learning

Authors:
Chong Peng

Southern Illinois University at Carbondale, IL, USA

Southern Illinois University at Carbondale, IL, USA
View Profile

,
Zhao Kang

Southern Illinois University at Carbondale, IL, USA

Southern Illinois University at Carbondale, IL, USA
View Profile

,
Yunhong Hu

Yuncheng University, Shanxi Province, China

Yuncheng University, Shanxi Province, China
View Profile

,
Jie Cheng

University of Hawaii at Hilo, HI, USA

University of Hawaii at Hilo, HI, USA
View Profile

,
Qiang Cheng

Southern Illinois University at Carbondale, IL, USA

Southern Illinois University at Carbondale, IL, USA
View Profile

ACM Transactions on Intelligent Systems and Technology Volume 8 Issue 3Article No.: 42pp 1–29https://doi.org/10.1145/2987378

Published:12 January 2017Publication History

ACM Transactions on Intelligent Systems and Technology

Abstract

Matrix factorization is a useful technique for data representation in many data mining and machine learning tasks. Particularly, for data sets with all nonnegative entries, matrix factorization often requires that factor matrices be nonnegative, leading to nonnegative matrix factorization (NMF). One important application of NMF is for clustering with reduced dimensions of the data represented in the new feature space. In this paper, we propose a new graph regularized NMF method capable of feature learning and apply it to clustering. Unlike existing NMF methods that treat all features in the original feature space equally, our method distinguishes features by incorporating a feature-wise sparse approximation error matrix in the formulation. It enables important features to be more closely approximated by the factor matrices. Meanwhile, the graph of the data is constructed using cleaner features in the feature learning process, which integrates feature learning and manifold learning procedures into a unified NMF model. This distinctly differs from applying the existing graph-based NMF models after feature selection in that, when these two procedures are independently used, they often fail to align themselves toward obtaining a compact and most expressive data representation. Comprehensive experimental results demonstrate the effectiveness of the proposed method, which outperforms state-of-the-art algorithms when applied to clustering.

References

Phipps Arabie. 1994. Cluster analysis in marketing research. Adv. Methods Market. Res. (1994), 160--189.Google Scholar
Sanjeev Arora, Rong Ge, Ravindran Kannan, and Ankur Moitra. 2012. Computing a nonnegative matrix factorization--provably. In Proceedings of the 44th Annual ACM Symposium on Theory of Computing. ACM, 145--162. Google ScholarDigital Library
Mikhail Belkin and Partha Niyogi. 2001. Laplacian eigenmaps and spectral techniques for embedding and clustering.. In NIPS, Vol. 14. 585--591.Google ScholarDigital Library
Michael W. Berry, Murray Browne, Amy N. Langville, V. Paul Pauca, and Robert J. Plemmons. 2007. Algorithms and applications for approximate nonnegative matrix factorization. Comput. Stat. Data Anal. 52, 1 (2007), 155--173. Google ScholarCross Ref
D. Cai. 2011. Litekmeans: The fastest matlab implementation of kmeans. Retrieved from http://www.zjucadcg.cn/dengcai/Data/Clustering.html.Google Scholar
Deng Cai, Xiaofei He, Jiawei Han, and Thomas S. Huang. 2011. Graph regularized nonnegative matrix factorization for data representation. IEEE Trans. Pattern Anal. Mach. Intell. 33, 8 (2011), 1548--1560. Google ScholarDigital Library
Deng Cai, Xiaofei He, Xiaoyun Wu, and Jiawei Han. 2008. Non-negative matrix factorization on manifold. In Proceedings of the 8th IEEE International Conference on Data Mining, 2008 (ICDM’08). IEEE, 63--72. Google ScholarDigital Library
Deng Cai, Chiyuan Zhang, and Xiaofei He. 2010. Unsupervised feature selection for multi-cluster data. In Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 333--342. Google ScholarDigital Library
Emmanuel J. Candès, Xiaodong Li, Yi Ma, and John Wright. 2011. Robust principal component analysis? J. ACM 58, 3 (2011), 11.Google ScholarDigital Library
Wei-Chien Chang. 1983. On using principal components before separating a mixture of two multivariate normal distributions. Appl. Stat. (1983), 267--275. Google ScholarCross Ref
Qiang Cheng, Hongbo Zhou, and Jie Cheng. 2011. The fisher-markov selector: Fast selecting maximally separable feature subset for multiclass classification with applications to high-dimensional data. IEEE Trans. Pattern Anal. Mach. Intell. 33, 6 (2011), 1217--1233. Google ScholarDigital Library
Fan R. K. Chung. 1997. Spectral Graph Theory. Vol. 92. American Mathematical Soc.Google Scholar
Scott C. Deerwester, Susan T. Dumais, Thomas K. Landauer, George W. Furnas, and Richard A. Harshman. 1990. Indexing by latent semantic analysis. J. Assoc. Inform. Sci. Technol. 41, 6 (1990), 391--407. Google ScholarCross Ref
Richard O. Duda, Peter E. Hart, and David G . Stork. 2012. Pattern Classification. John Wiley 8 Sons.Google Scholar
Ernie Esser, Michael Möller, Stanley Osher, Guillermo Sapiro, and Jack Xin. 2012. A convex model for nonnegative matrix factorization and dimensionality reduction on physical space. IEEE Trans. Image Process. 21, 7 (2012), 3239--3252. Google ScholarDigital Library
Cédric Févotte and Jérôme Idier. 2011. Algorithms for nonnegative matrix factorization with the β-divergence. Neur. Comput. 23, 9 (2011), 2421--2456. Google ScholarDigital Library
Athinodoros S. Georghiades, Peter N. Belhumeur, and David J Kriegman. 2001. From few to many: Illumination cone models for face recognition under variable lighting and pose. IEEE Trans. Pattern Anal. Mach. Intell. 23, 6 (2001), 643--660. Google ScholarDigital Library
Nicolas Gillis and Stephen A. Vavasis. 2014. Fast and robust recursive algorithms for separable nonnegative matrix factorization. IEEE Trans. Pattern Anal. Mach. Intell. 36, 4 (2014), 698--714. Google ScholarDigital Library
Naiyang Guan, Dacheng Tao, Zhigang Luo, and Bo Yuan. 2012a. NeNMF: An optimal gradient method for nonnegative matrix factorization. IEEE Trans. Sign. Process. 60, 6 (2012), 2882--2898. Google ScholarDigital Library
Naiyang Guan, Dacheng Tao, Zhigang Luo, and Bo Yuan. 2012b. Online nonnegative matrix factorization with robust stochastic approximation. IEEE Trans. Neur. Netw. Learn. Syst. 23, 7 (2012), 1087--1099. Google ScholarCross Ref
Patrik O. Hoyer. 2002. Non-negative sparse coding. In Proceedings of the 2002 12th IEEE Workshop on Neural Networks for Signal Processing, 2002. IEEE, 557--565. Google ScholarCross Ref
Patrik O Hoyer. 2004. Non-negative matrix factorization with sparseness constraints. J. Mach. Learn. Res. 5 (2004), 1457--1469.Google ScholarDigital Library
Jin Huang, Feiping Nie, Heng Huang, and Chris Ding. 2014. Robust manifold nonnegative matrix factorization. ACM Trans. Knowl. Discov. Data 8, 3 (2014), 11.Google ScholarDigital Library
Ian Jolliffe. 2002. Principal Component Analysis. Wiley Online Library.Google Scholar
Zhao Kang, Chong Peng, and Qiang Cheng. 2015. Robust subspace clustering via smoothed rank approximation. IEEE Signal Processing Letters 22, 11 (2015), 2088--2092. Google ScholarCross Ref
Jingu Kim, Yunlong He, and Haesun Park. 2014. Algorithms for nonnegative matrix and tensor factorizations: A unified view based on block coordinate descent framework. J Global Optim. 58, 2 (2014), 285--319. Google ScholarDigital Library
Da Kuang, Haesun Park, and Chris HQ Ding. 2012. Symmetric nonnegative matrix factorization for graph clustering. In SDM, Vol. 12. SIAM, 106--117.Google Scholar
Daniel D. Lee and H. Sebastian Seung. 1999. Learning the parts of objects by non-negative matrix factorization. Nature 401, 6755 (1999), 788--791. Google ScholarCross Ref
Daniel D. Lee and H. Sebastian Seung. 2001. Algorithms for non-negative matrix factorization. In Adv. Neur. Infor. Process. Syst. 556--562.Google Scholar
Tao Li and Chris Ding. 2006. The relationships among various nonnegative matrix factorization methods for clustering. In Proceedings of the 6th International Conference on Data Mining (ICDM’06). IEEE, 362--371. Google ScholarDigital Library
M. Lichman. 2013. UCI Machine Learning Repository. Retrieved from http://archive.ics.uci.edu/ml.Google Scholar
Chuan-bi Lin. 2007a. Projected gradient methods for nonnegative matrix factorization. Neur. Comput. 19, 10 (2007), 2756--2779. Google ScholarDigital Library
Chih-Jen Lin. 2007b. On the convergence of multiplicative update algorithms for nonnegative matrix factorization. IEEE Trans. Neur. Netw. 18, 6 (2007), 1589--1596. Google ScholarDigital Library
Guangcan Liu, Zhouchen Lin, Shuicheng Yan, Ju Sun, Yong Yu, and Yi Ma. 2013. Robust recovery of subspace structures by low-rank representation. IEEE Trans. Pattern Anal. Mach. Intell. 35, 1 (2013), 171--184. Google ScholarDigital Library
Huan Liu and Hiroshi Motoda. 2012. Feature Selection for Knowledge Discovery and Data Mining. Vol. 454. Springer Science 8 Business Media.Google Scholar
Nikos K Logothetis and David L Sheinberg. 1996. Visual object recognition. Annu. Rev. Neurosci. 19, 1 (1996), 577--621. Google ScholarCross Ref
Michael Lyons, Shigeru Akamatsu, Miyuki Kamachi, and Jiro Gyoba. 1998. Coding facial expressions with gabor wavelets. In Proceedings of the 1998 Third IEEE International Conference on Automatic Face and Gesture Recognition. IEEE, 200--205. Google ScholarCross Ref
Sameer A. Nene, Shree K Nayar, Hiroshi Murase, and others. 1996. Columbia Object Image Library (COIL-20). Technical Report. Technical Report CUCS-005-96.Google Scholar
Andrew Y. Ng, Michael I. Jordan, Yair Weiss, and others. 2002. On spectral clustering: Analysis and an algorithm. Adv. Neur. Inform. Process. Syst. 2 (2002), 849--856.Google Scholar
Feiping Nie, Heng Huang, Xiao Cai, and Chris H Ding. 2010. Efficient and robust feature selection via joint 2, 1-norms minimization. In Advances in Neural Information Processing Systems. 1813--1821.Google Scholar
Feiping Nie, Dong Xu, and Xuelong Li. 2012. Initialization independent clustering with actively self-training method. IEEE Trans. Syst. Man Cybernet. B 42, 1 (2012), 17--27. Google ScholarDigital Library
Alexey Ozerov and Cédric Févotte. 2010. Multichannel nonnegative matrix factorization in convolutive mixtures for audio source separation. IEEE Trans. Audio Speech Lang. Process. 18, 3 (2010), 550--563. Google ScholarDigital Library
Pentti Paatero and Unto Tapper. 1994. Positive matrix factorization: A non-negative factor model with optimal utilization of error estimates of data values. Environmetrics 5, 2 (1994), 111--126. Google ScholarCross Ref
Stephen E. Palmer. 1977. Hierarchical structure in perceptual representation. Cogn. Psychol. 9, 4 (1977), 441--474. Google ScholarCross Ref
Chong Peng, Zhao Kang, Huiqing Li, and Qiang Cheng. 2015. Subspace clustering using log-determinant rank approximation. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 925--934. Google ScholarDigital Library
C. Peng, Z. Kang, M. Yang, and Q. Cheng. 2016. Feature selection embedded subspace clustering. IEEE Sign. Process. Lett. 23, 7 (2016), 1018--1022. DOI:http://dx.doi.org/10.1109/LSP.2016.2573159 Google ScholarCross Ref
Ferdinando S. Samaria and Andy C. Harter. 1994. Parameterisation of a stochastic model for human face identification. In Proceedings of the 2nd IEEE Workshop on Applications of Computer Vision, 1994. IEEE, 138--142. Google ScholarCross Ref
Qinbao Song, Jinjie Ni, and Guangtao Wang. 2013. A fast clustering-based feature subset selection algorithm for high-dimensional data. IEEE Trans. Knowl. Data Eng. 25, 1 (2013), 1--14. Google ScholarDigital Library
Dacheng Tao, Xuelong Li, Xindong Wu, and Stephen J. Maybank. 2007. General averaged divergence analysis. In Proceedings of the 7th IEEE International Conference on Data Mining, 2007 (ICDM 2007). IEEE, 302--311. Google ScholarDigital Library
Stephen A. Vavasis. 2009. On the complexity of nonnegative matrix factorization. SIAM J. Optim. 20, 3 (2009), 1364--1377. Google ScholarDigital Library
E. Wachsmuth, M. W. Oram, and D. I. Perrett. 1994. Recognition of objects and their component parts: Responses of single units in the temporal cortex of the macaque. Cerebr. Cortex 4, 5 (1994), 509--522. Google ScholarCross Ref
Jim Jing-Yan Wang, Halima Bensmail, and Xin Gao. 2014. Feature selection and multi-kernel learning for sparse representation on a manifold. Neur. Netw. 51 (2014), 9--16. Google ScholarDigital Library
Yu-Xiong Wang and Yu-Jin Zhang. 2013. Nonnegative matrix factorization: A comprehensive review. IEEE Trans. Knowl. Data Eng. 25, 6 (2013), 1336--1353. Google ScholarDigital Library
Eric P. Xing, Michael I. Jordan, and Richard M. Karp. 2001. Feature selection for high-dimensional genomic microarray data. In Proceedings of ehe18th International Conference on Machine Learning. 601--608.Google Scholar

Index Terms

Nonnegative Matrix Factorization with Integrated Graph and Feature Learning

Recommendations

Quadratic nonnegative matrix factorization

In Nonnegative Matrix Factorization (NMF), a nonnegative matrix is approximated by a product of lower-rank factorizing matrices. Most NMF methods assume that each factorizing matrix appears only once in the approximation, thus the approximation is ...
Read More
Feature Nonlinear Transformation Non-Negative Matrix Factorization with Kullback-Leibler Divergence
Highlights
- A new non-negative matrix factorization decomposition model is proposed.
- A new ...
Abstract
This paper introduces a Feature Nonlinear Transformation Non-Negative Matrix Factorization with Kullback-Leibler Divergence (FNTNMF-KLD) for extracting the nonlinear features of a matrix in standard NMF. This method uses a nonlinear ...
Read More
Feature selection based dual-graph sparse non-negative matrix factorization for local discriminative clustering
Abstract
Non-negative matrix factorization (NMF) can map high-dimensional data into a low-dimensional data space. Feature selection can eliminate the redundant and irrelevant features from the alternative features. In this paper, we propose a ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Article

Published in
ACM Transactions on Intelligent Systems and Technology Volume 8, Issue 3
Special Issue: Mobile Social Multimedia Analytics in the Big Data Era and Regular Papers
May 2017
320 pages
ISSN:2157-6904
EISSN:2157-6912
DOI:10.1145/3040485
Editor:
Yu Zheng
Microsoft Research, China
Issue’s Table of Contents
Copyright © 2017 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 12 January 2017
- Accepted: 1 August 2016
- Revised: 1 June 2016
- Received: 1 February 2016
Published in tist Volume 8, Issue 3

Permissions
Request permissions about this article.
Request Permissions

Check for updates
Author Tags
Non-negative matrix factorization
clustering
feature learning
manifold learning
Qualifiers
- research-article
- Research
- Refereed
Conference
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 29
  Total Citations
  View Citations
- 758
  Total Downloads
- Downloads (Last 12 months)71
- Downloads (Last 6 weeks)3
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Nonnegative Matrix Factorization with Integrated Graph and Feature Learning

ACM Transactions on Intelligent Systems and Technology

Abstract

References

Cited By

Index Terms

Recommendations

Quadratic nonnegative matrix factorization

Feature Nonlinear Transformation Non-Negative Matrix Factorization with Kullback-Leibler Divergence

Feature selection based dual-graph sparse non-negative matrix factorization for local discriminative clustering