skip to main content
research-article
Public Access

Nonnegative Matrix Factorization with Integrated Graph and Feature Learning

Authors Info & Claims
Published:12 January 2017Publication History
Skip Abstract Section

Abstract

Matrix factorization is a useful technique for data representation in many data mining and machine learning tasks. Particularly, for data sets with all nonnegative entries, matrix factorization often requires that factor matrices be nonnegative, leading to nonnegative matrix factorization (NMF). One important application of NMF is for clustering with reduced dimensions of the data represented in the new feature space. In this paper, we propose a new graph regularized NMF method capable of feature learning and apply it to clustering. Unlike existing NMF methods that treat all features in the original feature space equally, our method distinguishes features by incorporating a feature-wise sparse approximation error matrix in the formulation. It enables important features to be more closely approximated by the factor matrices. Meanwhile, the graph of the data is constructed using cleaner features in the feature learning process, which integrates feature learning and manifold learning procedures into a unified NMF model. This distinctly differs from applying the existing graph-based NMF models after feature selection in that, when these two procedures are independently used, they often fail to align themselves toward obtaining a compact and most expressive data representation. Comprehensive experimental results demonstrate the effectiveness of the proposed method, which outperforms state-of-the-art algorithms when applied to clustering.

References

  1. Phipps Arabie. 1994. Cluster analysis in marketing research. Adv. Methods Market. Res. (1994), 160--189.Google ScholarGoogle Scholar
  2. Sanjeev Arora, Rong Ge, Ravindran Kannan, and Ankur Moitra. 2012. Computing a nonnegative matrix factorization--provably. In Proceedings of the 44th Annual ACM Symposium on Theory of Computing. ACM, 145--162. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. Mikhail Belkin and Partha Niyogi. 2001. Laplacian eigenmaps and spectral techniques for embedding and clustering.. In NIPS, Vol. 14. 585--591.Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. Michael W. Berry, Murray Browne, Amy N. Langville, V. Paul Pauca, and Robert J. Plemmons. 2007. Algorithms and applications for approximate nonnegative matrix factorization. Comput. Stat. Data Anal. 52, 1 (2007), 155--173. Google ScholarGoogle ScholarCross RefCross Ref
  5. D. Cai. 2011. Litekmeans: The fastest matlab implementation of kmeans. Retrieved from http://www.zjucadcg.cn/dengcai/Data/Clustering.html.Google ScholarGoogle Scholar
  6. Deng Cai, Xiaofei He, Jiawei Han, and Thomas S. Huang. 2011. Graph regularized nonnegative matrix factorization for data representation. IEEE Trans. Pattern Anal. Mach. Intell. 33, 8 (2011), 1548--1560. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. Deng Cai, Xiaofei He, Xiaoyun Wu, and Jiawei Han. 2008. Non-negative matrix factorization on manifold. In Proceedings of the 8th IEEE International Conference on Data Mining, 2008 (ICDM’08). IEEE, 63--72. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. Deng Cai, Chiyuan Zhang, and Xiaofei He. 2010. Unsupervised feature selection for multi-cluster data. In Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 333--342. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Emmanuel J. Candès, Xiaodong Li, Yi Ma, and John Wright. 2011. Robust principal component analysis? J. ACM 58, 3 (2011), 11.Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. Wei-Chien Chang. 1983. On using principal components before separating a mixture of two multivariate normal distributions. Appl. Stat. (1983), 267--275. Google ScholarGoogle ScholarCross RefCross Ref
  11. Qiang Cheng, Hongbo Zhou, and Jie Cheng. 2011. The fisher-markov selector: Fast selecting maximally separable feature subset for multiclass classification with applications to high-dimensional data. IEEE Trans. Pattern Anal. Mach. Intell. 33, 6 (2011), 1217--1233. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. Fan R. K. Chung. 1997. Spectral Graph Theory. Vol. 92. American Mathematical Soc.Google ScholarGoogle Scholar
  13. Scott C. Deerwester, Susan T. Dumais, Thomas K. Landauer, George W. Furnas, and Richard A. Harshman. 1990. Indexing by latent semantic analysis. J. Assoc. Inform. Sci. Technol. 41, 6 (1990), 391--407. Google ScholarGoogle ScholarCross RefCross Ref
  14. Richard O. Duda, Peter E. Hart, and David G . Stork. 2012. Pattern Classification. John Wiley 8 Sons.Google ScholarGoogle Scholar
  15. Ernie Esser, Michael Möller, Stanley Osher, Guillermo Sapiro, and Jack Xin. 2012. A convex model for nonnegative matrix factorization and dimensionality reduction on physical space. IEEE Trans. Image Process. 21, 7 (2012), 3239--3252. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. Cédric Févotte and Jérôme Idier. 2011. Algorithms for nonnegative matrix factorization with the β-divergence. Neur. Comput. 23, 9 (2011), 2421--2456. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. Athinodoros S. Georghiades, Peter N. Belhumeur, and David J Kriegman. 2001. From few to many: Illumination cone models for face recognition under variable lighting and pose. IEEE Trans. Pattern Anal. Mach. Intell. 23, 6 (2001), 643--660. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. Nicolas Gillis and Stephen A. Vavasis. 2014. Fast and robust recursive algorithms for separable nonnegative matrix factorization. IEEE Trans. Pattern Anal. Mach. Intell. 36, 4 (2014), 698--714. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. Naiyang Guan, Dacheng Tao, Zhigang Luo, and Bo Yuan. 2012a. NeNMF: An optimal gradient method for nonnegative matrix factorization. IEEE Trans. Sign. Process. 60, 6 (2012), 2882--2898. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. Naiyang Guan, Dacheng Tao, Zhigang Luo, and Bo Yuan. 2012b. Online nonnegative matrix factorization with robust stochastic approximation. IEEE Trans. Neur. Netw. Learn. Syst. 23, 7 (2012), 1087--1099. Google ScholarGoogle ScholarCross RefCross Ref
  21. Patrik O. Hoyer. 2002. Non-negative sparse coding. In Proceedings of the 2002 12th IEEE Workshop on Neural Networks for Signal Processing, 2002. IEEE, 557--565. Google ScholarGoogle ScholarCross RefCross Ref
  22. Patrik O Hoyer. 2004. Non-negative matrix factorization with sparseness constraints. J. Mach. Learn. Res. 5 (2004), 1457--1469.Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. Jin Huang, Feiping Nie, Heng Huang, and Chris Ding. 2014. Robust manifold nonnegative matrix factorization. ACM Trans. Knowl. Discov. Data 8, 3 (2014), 11.Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. Ian Jolliffe. 2002. Principal Component Analysis. Wiley Online Library.Google ScholarGoogle Scholar
  25. Zhao Kang, Chong Peng, and Qiang Cheng. 2015. Robust subspace clustering via smoothed rank approximation. IEEE Signal Processing Letters 22, 11 (2015), 2088--2092. Google ScholarGoogle ScholarCross RefCross Ref
  26. Jingu Kim, Yunlong He, and Haesun Park. 2014. Algorithms for nonnegative matrix and tensor factorizations: A unified view based on block coordinate descent framework. J Global Optim. 58, 2 (2014), 285--319. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. Da Kuang, Haesun Park, and Chris HQ Ding. 2012. Symmetric nonnegative matrix factorization for graph clustering. In SDM, Vol. 12. SIAM, 106--117.Google ScholarGoogle Scholar
  28. Daniel D. Lee and H. Sebastian Seung. 1999. Learning the parts of objects by non-negative matrix factorization. Nature 401, 6755 (1999), 788--791. Google ScholarGoogle ScholarCross RefCross Ref
  29. Daniel D. Lee and H. Sebastian Seung. 2001. Algorithms for non-negative matrix factorization. In Adv. Neur. Infor. Process. Syst. 556--562.Google ScholarGoogle Scholar
  30. Tao Li and Chris Ding. 2006. The relationships among various nonnegative matrix factorization methods for clustering. In Proceedings of the 6th International Conference on Data Mining (ICDM’06). IEEE, 362--371. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. M. Lichman. 2013. UCI Machine Learning Repository. Retrieved from http://archive.ics.uci.edu/ml.Google ScholarGoogle Scholar
  32. Chuan-bi Lin. 2007a. Projected gradient methods for nonnegative matrix factorization. Neur. Comput. 19, 10 (2007), 2756--2779. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. Chih-Jen Lin. 2007b. On the convergence of multiplicative update algorithms for nonnegative matrix factorization. IEEE Trans. Neur. Netw. 18, 6 (2007), 1589--1596. Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. Guangcan Liu, Zhouchen Lin, Shuicheng Yan, Ju Sun, Yong Yu, and Yi Ma. 2013. Robust recovery of subspace structures by low-rank representation. IEEE Trans. Pattern Anal. Mach. Intell. 35, 1 (2013), 171--184. Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. Huan Liu and Hiroshi Motoda. 2012. Feature Selection for Knowledge Discovery and Data Mining. Vol. 454. Springer Science 8 Business Media.Google ScholarGoogle Scholar
  36. Nikos K Logothetis and David L Sheinberg. 1996. Visual object recognition. Annu. Rev. Neurosci. 19, 1 (1996), 577--621. Google ScholarGoogle ScholarCross RefCross Ref
  37. Michael Lyons, Shigeru Akamatsu, Miyuki Kamachi, and Jiro Gyoba. 1998. Coding facial expressions with gabor wavelets. In Proceedings of the 1998 Third IEEE International Conference on Automatic Face and Gesture Recognition. IEEE, 200--205. Google ScholarGoogle ScholarCross RefCross Ref
  38. Sameer A. Nene, Shree K Nayar, Hiroshi Murase, and others. 1996. Columbia Object Image Library (COIL-20). Technical Report. Technical Report CUCS-005-96.Google ScholarGoogle Scholar
  39. Andrew Y. Ng, Michael I. Jordan, Yair Weiss, and others. 2002. On spectral clustering: Analysis and an algorithm. Adv. Neur. Inform. Process. Syst. 2 (2002), 849--856.Google ScholarGoogle Scholar
  40. Feiping Nie, Heng Huang, Xiao Cai, and Chris H Ding. 2010. Efficient and robust feature selection via joint 2, 1-norms minimization. In Advances in Neural Information Processing Systems. 1813--1821.Google ScholarGoogle Scholar
  41. Feiping Nie, Dong Xu, and Xuelong Li. 2012. Initialization independent clustering with actively self-training method. IEEE Trans. Syst. Man Cybernet. B 42, 1 (2012), 17--27. Google ScholarGoogle ScholarDigital LibraryDigital Library
  42. Alexey Ozerov and Cédric Févotte. 2010. Multichannel nonnegative matrix factorization in convolutive mixtures for audio source separation. IEEE Trans. Audio Speech Lang. Process. 18, 3 (2010), 550--563. Google ScholarGoogle ScholarDigital LibraryDigital Library
  43. Pentti Paatero and Unto Tapper. 1994. Positive matrix factorization: A non-negative factor model with optimal utilization of error estimates of data values. Environmetrics 5, 2 (1994), 111--126. Google ScholarGoogle ScholarCross RefCross Ref
  44. Stephen E. Palmer. 1977. Hierarchical structure in perceptual representation. Cogn. Psychol. 9, 4 (1977), 441--474. Google ScholarGoogle ScholarCross RefCross Ref
  45. Chong Peng, Zhao Kang, Huiqing Li, and Qiang Cheng. 2015. Subspace clustering using log-determinant rank approximation. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 925--934. Google ScholarGoogle ScholarDigital LibraryDigital Library
  46. C. Peng, Z. Kang, M. Yang, and Q. Cheng. 2016. Feature selection embedded subspace clustering. IEEE Sign. Process. Lett. 23, 7 (2016), 1018--1022. DOI:http://dx.doi.org/10.1109/LSP.2016.2573159 Google ScholarGoogle ScholarCross RefCross Ref
  47. Ferdinando S. Samaria and Andy C. Harter. 1994. Parameterisation of a stochastic model for human face identification. In Proceedings of the 2nd IEEE Workshop on Applications of Computer Vision, 1994. IEEE, 138--142. Google ScholarGoogle ScholarCross RefCross Ref
  48. Qinbao Song, Jinjie Ni, and Guangtao Wang. 2013. A fast clustering-based feature subset selection algorithm for high-dimensional data. IEEE Trans. Knowl. Data Eng. 25, 1 (2013), 1--14. Google ScholarGoogle ScholarDigital LibraryDigital Library
  49. Dacheng Tao, Xuelong Li, Xindong Wu, and Stephen J. Maybank. 2007. General averaged divergence analysis. In Proceedings of the 7th IEEE International Conference on Data Mining, 2007 (ICDM 2007). IEEE, 302--311. Google ScholarGoogle ScholarDigital LibraryDigital Library
  50. Stephen A. Vavasis. 2009. On the complexity of nonnegative matrix factorization. SIAM J. Optim. 20, 3 (2009), 1364--1377. Google ScholarGoogle ScholarDigital LibraryDigital Library
  51. E. Wachsmuth, M. W. Oram, and D. I. Perrett. 1994. Recognition of objects and their component parts: Responses of single units in the temporal cortex of the macaque. Cerebr. Cortex 4, 5 (1994), 509--522. Google ScholarGoogle ScholarCross RefCross Ref
  52. Jim Jing-Yan Wang, Halima Bensmail, and Xin Gao. 2014. Feature selection and multi-kernel learning for sparse representation on a manifold. Neur. Netw. 51 (2014), 9--16. Google ScholarGoogle ScholarDigital LibraryDigital Library
  53. Yu-Xiong Wang and Yu-Jin Zhang. 2013. Nonnegative matrix factorization: A comprehensive review. IEEE Trans. Knowl. Data Eng. 25, 6 (2013), 1336--1353. Google ScholarGoogle ScholarDigital LibraryDigital Library
  54. Eric P. Xing, Michael I. Jordan, and Richard M. Karp. 2001. Feature selection for high-dimensional genomic microarray data. In Proceedings of ehe18th International Conference on Machine Learning. 601--608.Google ScholarGoogle Scholar

Index Terms

  1. Nonnegative Matrix Factorization with Integrated Graph and Feature Learning

            Recommendations

            Comments

            Login options

            Check if you have access through your login credentials or your institution to get full access on this article.

            Sign in

            Full Access

            • Published in

              cover image ACM Transactions on Intelligent Systems and Technology
              ACM Transactions on Intelligent Systems and Technology  Volume 8, Issue 3
              Special Issue: Mobile Social Multimedia Analytics in the Big Data Era and Regular Papers
              May 2017
              320 pages
              ISSN:2157-6904
              EISSN:2157-6912
              DOI:10.1145/3040485
              • Editor:
              • Yu Zheng
              Issue’s Table of Contents

              Copyright © 2017 ACM

              Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

              Publisher

              Association for Computing Machinery

              New York, NY, United States

              Publication History

              • Published: 12 January 2017
              • Accepted: 1 August 2016
              • Revised: 1 June 2016
              • Received: 1 February 2016
              Published in tist Volume 8, Issue 3

              Permissions

              Request permissions about this article.

              Request Permissions

              Check for updates

              Qualifiers

              • research-article
              • Research
              • Refereed

            PDF Format

            View or Download as a PDF file.

            PDF

            eReader

            View online with eReader.

            eReader