Abstract
Matrix factorization is a useful technique for data representation in many data mining and machine learning tasks. Particularly, for data sets with all nonnegative entries, matrix factorization often requires that factor matrices be nonnegative, leading to nonnegative matrix factorization (NMF). One important application of NMF is for clustering with reduced dimensions of the data represented in the new feature space. In this paper, we propose a new graph regularized NMF method capable of feature learning and apply it to clustering. Unlike existing NMF methods that treat all features in the original feature space equally, our method distinguishes features by incorporating a feature-wise sparse approximation error matrix in the formulation. It enables important features to be more closely approximated by the factor matrices. Meanwhile, the graph of the data is constructed using cleaner features in the feature learning process, which integrates feature learning and manifold learning procedures into a unified NMF model. This distinctly differs from applying the existing graph-based NMF models after feature selection in that, when these two procedures are independently used, they often fail to align themselves toward obtaining a compact and most expressive data representation. Comprehensive experimental results demonstrate the effectiveness of the proposed method, which outperforms state-of-the-art algorithms when applied to clustering.
- Phipps Arabie. 1994. Cluster analysis in marketing research. Adv. Methods Market. Res. (1994), 160--189.Google Scholar
- Sanjeev Arora, Rong Ge, Ravindran Kannan, and Ankur Moitra. 2012. Computing a nonnegative matrix factorization--provably. In Proceedings of the 44th Annual ACM Symposium on Theory of Computing. ACM, 145--162. Google ScholarDigital Library
- Mikhail Belkin and Partha Niyogi. 2001. Laplacian eigenmaps and spectral techniques for embedding and clustering.. In NIPS, Vol. 14. 585--591.Google ScholarDigital Library
- Michael W. Berry, Murray Browne, Amy N. Langville, V. Paul Pauca, and Robert J. Plemmons. 2007. Algorithms and applications for approximate nonnegative matrix factorization. Comput. Stat. Data Anal. 52, 1 (2007), 155--173. Google ScholarCross Ref
- D. Cai. 2011. Litekmeans: The fastest matlab implementation of kmeans. Retrieved from http://www.zjucadcg.cn/dengcai/Data/Clustering.html.Google Scholar
- Deng Cai, Xiaofei He, Jiawei Han, and Thomas S. Huang. 2011. Graph regularized nonnegative matrix factorization for data representation. IEEE Trans. Pattern Anal. Mach. Intell. 33, 8 (2011), 1548--1560. Google ScholarDigital Library
- Deng Cai, Xiaofei He, Xiaoyun Wu, and Jiawei Han. 2008. Non-negative matrix factorization on manifold. In Proceedings of the 8th IEEE International Conference on Data Mining, 2008 (ICDM’08). IEEE, 63--72. Google ScholarDigital Library
- Deng Cai, Chiyuan Zhang, and Xiaofei He. 2010. Unsupervised feature selection for multi-cluster data. In Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 333--342. Google ScholarDigital Library
- Emmanuel J. Candès, Xiaodong Li, Yi Ma, and John Wright. 2011. Robust principal component analysis? J. ACM 58, 3 (2011), 11.Google ScholarDigital Library
- Wei-Chien Chang. 1983. On using principal components before separating a mixture of two multivariate normal distributions. Appl. Stat. (1983), 267--275. Google ScholarCross Ref
- Qiang Cheng, Hongbo Zhou, and Jie Cheng. 2011. The fisher-markov selector: Fast selecting maximally separable feature subset for multiclass classification with applications to high-dimensional data. IEEE Trans. Pattern Anal. Mach. Intell. 33, 6 (2011), 1217--1233. Google ScholarDigital Library
- Fan R. K. Chung. 1997. Spectral Graph Theory. Vol. 92. American Mathematical Soc.Google Scholar
- Scott C. Deerwester, Susan T. Dumais, Thomas K. Landauer, George W. Furnas, and Richard A. Harshman. 1990. Indexing by latent semantic analysis. J. Assoc. Inform. Sci. Technol. 41, 6 (1990), 391--407. Google ScholarCross Ref
- Richard O. Duda, Peter E. Hart, and David G . Stork. 2012. Pattern Classification. John Wiley 8 Sons.Google Scholar
- Ernie Esser, Michael Möller, Stanley Osher, Guillermo Sapiro, and Jack Xin. 2012. A convex model for nonnegative matrix factorization and dimensionality reduction on physical space. IEEE Trans. Image Process. 21, 7 (2012), 3239--3252. Google ScholarDigital Library
- Cédric Févotte and Jérôme Idier. 2011. Algorithms for nonnegative matrix factorization with the β-divergence. Neur. Comput. 23, 9 (2011), 2421--2456. Google ScholarDigital Library
- Athinodoros S. Georghiades, Peter N. Belhumeur, and David J Kriegman. 2001. From few to many: Illumination cone models for face recognition under variable lighting and pose. IEEE Trans. Pattern Anal. Mach. Intell. 23, 6 (2001), 643--660. Google ScholarDigital Library
- Nicolas Gillis and Stephen A. Vavasis. 2014. Fast and robust recursive algorithms for separable nonnegative matrix factorization. IEEE Trans. Pattern Anal. Mach. Intell. 36, 4 (2014), 698--714. Google ScholarDigital Library
- Naiyang Guan, Dacheng Tao, Zhigang Luo, and Bo Yuan. 2012a. NeNMF: An optimal gradient method for nonnegative matrix factorization. IEEE Trans. Sign. Process. 60, 6 (2012), 2882--2898. Google ScholarDigital Library
- Naiyang Guan, Dacheng Tao, Zhigang Luo, and Bo Yuan. 2012b. Online nonnegative matrix factorization with robust stochastic approximation. IEEE Trans. Neur. Netw. Learn. Syst. 23, 7 (2012), 1087--1099. Google ScholarCross Ref
- Patrik O. Hoyer. 2002. Non-negative sparse coding. In Proceedings of the 2002 12th IEEE Workshop on Neural Networks for Signal Processing, 2002. IEEE, 557--565. Google ScholarCross Ref
- Patrik O Hoyer. 2004. Non-negative matrix factorization with sparseness constraints. J. Mach. Learn. Res. 5 (2004), 1457--1469.Google ScholarDigital Library
- Jin Huang, Feiping Nie, Heng Huang, and Chris Ding. 2014. Robust manifold nonnegative matrix factorization. ACM Trans. Knowl. Discov. Data 8, 3 (2014), 11.Google ScholarDigital Library
- Ian Jolliffe. 2002. Principal Component Analysis. Wiley Online Library.Google Scholar
- Zhao Kang, Chong Peng, and Qiang Cheng. 2015. Robust subspace clustering via smoothed rank approximation. IEEE Signal Processing Letters 22, 11 (2015), 2088--2092. Google ScholarCross Ref
- Jingu Kim, Yunlong He, and Haesun Park. 2014. Algorithms for nonnegative matrix and tensor factorizations: A unified view based on block coordinate descent framework. J Global Optim. 58, 2 (2014), 285--319. Google ScholarDigital Library
- Da Kuang, Haesun Park, and Chris HQ Ding. 2012. Symmetric nonnegative matrix factorization for graph clustering. In SDM, Vol. 12. SIAM, 106--117.Google Scholar
- Daniel D. Lee and H. Sebastian Seung. 1999. Learning the parts of objects by non-negative matrix factorization. Nature 401, 6755 (1999), 788--791. Google ScholarCross Ref
- Daniel D. Lee and H. Sebastian Seung. 2001. Algorithms for non-negative matrix factorization. In Adv. Neur. Infor. Process. Syst. 556--562.Google Scholar
- Tao Li and Chris Ding. 2006. The relationships among various nonnegative matrix factorization methods for clustering. In Proceedings of the 6th International Conference on Data Mining (ICDM’06). IEEE, 362--371. Google ScholarDigital Library
- M. Lichman. 2013. UCI Machine Learning Repository. Retrieved from http://archive.ics.uci.edu/ml.Google Scholar
- Chuan-bi Lin. 2007a. Projected gradient methods for nonnegative matrix factorization. Neur. Comput. 19, 10 (2007), 2756--2779. Google ScholarDigital Library
- Chih-Jen Lin. 2007b. On the convergence of multiplicative update algorithms for nonnegative matrix factorization. IEEE Trans. Neur. Netw. 18, 6 (2007), 1589--1596. Google ScholarDigital Library
- Guangcan Liu, Zhouchen Lin, Shuicheng Yan, Ju Sun, Yong Yu, and Yi Ma. 2013. Robust recovery of subspace structures by low-rank representation. IEEE Trans. Pattern Anal. Mach. Intell. 35, 1 (2013), 171--184. Google ScholarDigital Library
- Huan Liu and Hiroshi Motoda. 2012. Feature Selection for Knowledge Discovery and Data Mining. Vol. 454. Springer Science 8 Business Media.Google Scholar
- Nikos K Logothetis and David L Sheinberg. 1996. Visual object recognition. Annu. Rev. Neurosci. 19, 1 (1996), 577--621. Google ScholarCross Ref
- Michael Lyons, Shigeru Akamatsu, Miyuki Kamachi, and Jiro Gyoba. 1998. Coding facial expressions with gabor wavelets. In Proceedings of the 1998 Third IEEE International Conference on Automatic Face and Gesture Recognition. IEEE, 200--205. Google ScholarCross Ref
- Sameer A. Nene, Shree K Nayar, Hiroshi Murase, and others. 1996. Columbia Object Image Library (COIL-20). Technical Report. Technical Report CUCS-005-96.Google Scholar
- Andrew Y. Ng, Michael I. Jordan, Yair Weiss, and others. 2002. On spectral clustering: Analysis and an algorithm. Adv. Neur. Inform. Process. Syst. 2 (2002), 849--856.Google Scholar
- Feiping Nie, Heng Huang, Xiao Cai, and Chris H Ding. 2010. Efficient and robust feature selection via joint 2, 1-norms minimization. In Advances in Neural Information Processing Systems. 1813--1821.Google Scholar
- Feiping Nie, Dong Xu, and Xuelong Li. 2012. Initialization independent clustering with actively self-training method. IEEE Trans. Syst. Man Cybernet. B 42, 1 (2012), 17--27. Google ScholarDigital Library
- Alexey Ozerov and Cédric Févotte. 2010. Multichannel nonnegative matrix factorization in convolutive mixtures for audio source separation. IEEE Trans. Audio Speech Lang. Process. 18, 3 (2010), 550--563. Google ScholarDigital Library
- Pentti Paatero and Unto Tapper. 1994. Positive matrix factorization: A non-negative factor model with optimal utilization of error estimates of data values. Environmetrics 5, 2 (1994), 111--126. Google ScholarCross Ref
- Stephen E. Palmer. 1977. Hierarchical structure in perceptual representation. Cogn. Psychol. 9, 4 (1977), 441--474. Google ScholarCross Ref
- Chong Peng, Zhao Kang, Huiqing Li, and Qiang Cheng. 2015. Subspace clustering using log-determinant rank approximation. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 925--934. Google ScholarDigital Library
- C. Peng, Z. Kang, M. Yang, and Q. Cheng. 2016. Feature selection embedded subspace clustering. IEEE Sign. Process. Lett. 23, 7 (2016), 1018--1022. DOI:http://dx.doi.org/10.1109/LSP.2016.2573159 Google ScholarCross Ref
- Ferdinando S. Samaria and Andy C. Harter. 1994. Parameterisation of a stochastic model for human face identification. In Proceedings of the 2nd IEEE Workshop on Applications of Computer Vision, 1994. IEEE, 138--142. Google ScholarCross Ref
- Qinbao Song, Jinjie Ni, and Guangtao Wang. 2013. A fast clustering-based feature subset selection algorithm for high-dimensional data. IEEE Trans. Knowl. Data Eng. 25, 1 (2013), 1--14. Google ScholarDigital Library
- Dacheng Tao, Xuelong Li, Xindong Wu, and Stephen J. Maybank. 2007. General averaged divergence analysis. In Proceedings of the 7th IEEE International Conference on Data Mining, 2007 (ICDM 2007). IEEE, 302--311. Google ScholarDigital Library
- Stephen A. Vavasis. 2009. On the complexity of nonnegative matrix factorization. SIAM J. Optim. 20, 3 (2009), 1364--1377. Google ScholarDigital Library
- E. Wachsmuth, M. W. Oram, and D. I. Perrett. 1994. Recognition of objects and their component parts: Responses of single units in the temporal cortex of the macaque. Cerebr. Cortex 4, 5 (1994), 509--522. Google ScholarCross Ref
- Jim Jing-Yan Wang, Halima Bensmail, and Xin Gao. 2014. Feature selection and multi-kernel learning for sparse representation on a manifold. Neur. Netw. 51 (2014), 9--16. Google ScholarDigital Library
- Yu-Xiong Wang and Yu-Jin Zhang. 2013. Nonnegative matrix factorization: A comprehensive review. IEEE Trans. Knowl. Data Eng. 25, 6 (2013), 1336--1353. Google ScholarDigital Library
- Eric P. Xing, Michael I. Jordan, and Richard M. Karp. 2001. Feature selection for high-dimensional genomic microarray data. In Proceedings of ehe18th International Conference on Machine Learning. 601--608.Google Scholar
Index Terms
- Nonnegative Matrix Factorization with Integrated Graph and Feature Learning
Recommendations
Quadratic nonnegative matrix factorization
In Nonnegative Matrix Factorization (NMF), a nonnegative matrix is approximated by a product of lower-rank factorizing matrices. Most NMF methods assume that each factorizing matrix appears only once in the approximation, thus the approximation is ...
Feature Nonlinear Transformation Non-Negative Matrix Factorization with Kullback-Leibler Divergence
Highlights- A new non-negative matrix factorization decomposition model is proposed.
- A new ...
AbstractThis paper introduces a Feature Nonlinear Transformation Non-Negative Matrix Factorization with Kullback-Leibler Divergence (FNTNMF-KLD) for extracting the nonlinear features of a matrix in standard NMF. This method uses a nonlinear ...
Feature selection based dual-graph sparse non-negative matrix factorization for local discriminative clustering
AbstractNon-negative matrix factorization (NMF) can map high-dimensional data into a low-dimensional data space. Feature selection can eliminate the redundant and irrelevant features from the alternative features. In this paper, we propose a ...
Comments