Skip to main content

2017 | OriginalPaper | Buchkapitel

Distributed Nonnegative Matrix Factorization with HALS Algorithm on MapReduce

verfasst von : Rafał Zdunek, Krzysztof Fonal

Erschienen in: Algorithms and Architectures for Parallel Processing

Verlag: Springer International Publishing

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Nonnegative Matrix Factorization (NMF) is a commonly used method in machine learning and data analysis for feature extraction and dimensionality reduction of nonnegative data. Recently, we observe its increasing popularity in processing massive data, and advances in developing various distributed algorithms for NMF. In the paper, we propose a computational strategy for implementation of the Hierarchical Alternating Least Squares (HALS) algorithm using the MapReduce programming paradigm. Due to this approach, the scalable HALS NMF, which can be implemented on parallel and distributed computer architectures, is obtained. The scalability and efficiency of the proposed algorithm is confirmed in the numerical experiments, performed on large-scale synthetic and recommendation system datasets.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literatur
1.
Zurück zum Zitat Cichocki, A., Zdunek, R., Phan, A.H., Amari, S.I.: Nonnegative Matrix and Tensor Factorizations: Applications to Exploratory Multi-way Data Analysis and Blind Source Separation. Wiley, Hoboken (2009) Cichocki, A., Zdunek, R., Phan, A.H., Amari, S.I.: Nonnegative Matrix and Tensor Factorizations: Applications to Exploratory Multi-way Data Analysis and Blind Source Separation. Wiley, Hoboken (2009)
2.
Zurück zum Zitat Lee, D.D., Seung, H.S.: Algorithms for nonnegative matrix factorization. In: Advances in Neural Information Processing, NIPS, vol. 13, pp. 556–562. MIT Press (2001) Lee, D.D., Seung, H.S.: Algorithms for nonnegative matrix factorization. In: Advances in Neural Information Processing, NIPS, vol. 13, pp. 556–562. MIT Press (2001)
3.
Zurück zum Zitat Liu, C., Yang, H.c., Fan, J., He, L.W., Wang, Y.M.: Distributed nonnegative matrix factorization for web-scale dyadic data analysis on MapReduce. In: Proc. 19th International Conference on World Wide Web (WWW 2010), pp. 681–690. ACM, New York, NY, USA (2010) Liu, C., Yang, H.c., Fan, J., He, L.W., Wang, Y.M.: Distributed nonnegative matrix factorization for web-scale dyadic data analysis on MapReduce. In: Proc. 19th International Conference on World Wide Web (WWW 2010), pp. 681–690. ACM, New York, NY, USA (2010)
4.
Zurück zum Zitat Sun, Z., Li, T., Rishe, N.: Large-scale matrix factorization using MapReduce. In: ICDM Workshops, pp. 1242–1248. IEEE Computer Society (2010) Sun, Z., Li, T., Rishe, N.: Large-scale matrix factorization using MapReduce. In: ICDM Workshops, pp. 1242–1248. IEEE Computer Society (2010)
5.
Zurück zum Zitat Ding, C., Li, T., Jordan, M.I.: Convex and semi-nonnegative matrix factorizations. IEEE Trans. Pattern Anal. Mach. Intell. 32(1), 45–55 (2010)CrossRef Ding, C., Li, T., Jordan, M.I.: Convex and semi-nonnegative matrix factorizations. IEEE Trans. Pattern Anal. Mach. Intell. 32(1), 45–55 (2010)CrossRef
6.
Zurück zum Zitat Yin, J., Gao, L., Zhang, Z.M.: Scalable nonnegative matrix factorization with block-wise updates. In: Calders, T., Esposito, F., Hüllermeier, E., Meo, R. (eds.) ECML PKDD 2014. LNCS, vol. 8726, pp. 337–352. Springer, Heidelberg (2014). doi:10.1007/978-3-662-44845-8_22 Yin, J., Gao, L., Zhang, Z.M.: Scalable nonnegative matrix factorization with block-wise updates. In: Calders, T., Esposito, F., Hüllermeier, E., Meo, R. (eds.) ECML PKDD 2014. LNCS, vol. 8726, pp. 337–352. Springer, Heidelberg (2014). doi:10.​1007/​978-3-662-44845-8_​22
7.
Zurück zum Zitat Liao, R., Zhang, Y., Guan, J., Zhou, S.: CloudNMF: A MapReduce implementation of nonnegative matrix factorization for large-scale biological datasets. Genomics, Proteomics and Bioinform. 12(1), 48–51 (2014)CrossRef Liao, R., Zhang, Y., Guan, J., Zhou, S.: CloudNMF: A MapReduce implementation of nonnegative matrix factorization for large-scale biological datasets. Genomics, Proteomics and Bioinform. 12(1), 48–51 (2014)CrossRef
8.
Zurück zum Zitat Schelter, S., Boden, C., Schenck, M., Alexandrov, A., Markl, V.: Distributed matrix factorization with MapReduce using a series of broadcast-joins. In: ACM Conference on Recommender Systems (RecSys) (2013) Schelter, S., Boden, C., Schenck, M., Alexandrov, A., Markl, V.: Distributed matrix factorization with MapReduce using a series of broadcast-joins. In: ACM Conference on Recommender Systems (RecSys) (2013)
9.
Zurück zum Zitat Tan, W., Cao, L., Fong, L.L.: Faster and cheaper: Parallelizing large-scale matrix factorization on GPUs. CoRR abs/1603.03820 (2016) Tan, W., Cao, L., Fong, L.L.: Faster and cheaper: Parallelizing large-scale matrix factorization on GPUs. CoRR abs/1603.03820 (2016)
10.
Zurück zum Zitat Cichocki, A., Zdunek, R., Amari, S.: Hierarchical ALS algorithms for nonnegative matrix and 3D tensor factorization. In: Davies, M.E., James, C.J., Abdallah, S.A., Plumbley, M.D. (eds.) ICA 2007. LNCS, vol. 4666, pp. 169–176. Springer, Heidelberg (2007). doi:10.1007/978-3-540-74494-8_22 CrossRef Cichocki, A., Zdunek, R., Amari, S.: Hierarchical ALS algorithms for nonnegative matrix and 3D tensor factorization. In: Davies, M.E., James, C.J., Abdallah, S.A., Plumbley, M.D. (eds.) ICA 2007. LNCS, vol. 4666, pp. 169–176. Springer, Heidelberg (2007). doi:10.​1007/​978-3-540-74494-8_​22 CrossRef
11.
Zurück zum Zitat Cichocki, A., Phan, A.H.: Fast local algorithms for large scale nonnegative matrix and tensor factorizations. IEICE Trans. Fund. Electron. Commun. Comput. Sci. E92–A(3), 708–721 (2009)CrossRef Cichocki, A., Phan, A.H.: Fast local algorithms for large scale nonnegative matrix and tensor factorizations. IEICE Trans. Fund. Electron. Commun. Comput. Sci. E92–A(3), 708–721 (2009)CrossRef
12.
Zurück zum Zitat Han, L., Neumann, M., Prasad, U.: Alternating projected Barzilai-Borwein methods for nonnegative matrix factorization. Electron. Trans. Numer. Anal. 36, 54–82 (2009–2010) Han, L., Neumann, M., Prasad, U.: Alternating projected Barzilai-Borwein methods for nonnegative matrix factorization. Electron. Trans. Numer. Anal. 36, 54–82 (2009–2010)
13.
Zurück zum Zitat Kim, J., Park, H.: Fast nonnegative matrix factorization: An active-set-like method and comparisons. SIAM J. Sci. Comput. 33(6), 3261–3281 (2011)MathSciNetCrossRefMATH Kim, J., Park, H.: Fast nonnegative matrix factorization: An active-set-like method and comparisons. SIAM J. Sci. Comput. 33(6), 3261–3281 (2011)MathSciNetCrossRefMATH
14.
Zurück zum Zitat Gillis, N., Glineur, F.: Accelerated multiplicative updates and hierarchical ALS algorithms for nonnegative matrix factorization. Neural Comput. 24(4), 1085–1105 (2012)MathSciNetCrossRef Gillis, N., Glineur, F.: Accelerated multiplicative updates and hierarchical ALS algorithms for nonnegative matrix factorization. Neural Comput. 24(4), 1085–1105 (2012)MathSciNetCrossRef
15.
Zurück zum Zitat Chen, W., Guillaume, M.: HALS-based NMF with flexible constraints for hyperspectral unmixing. EURASIP J. Adv. Sig. Proc. 54, 1–14 (2012) Chen, W., Guillaume, M.: HALS-based NMF with flexible constraints for hyperspectral unmixing. EURASIP J. Adv. Sig. Proc. 54, 1–14 (2012)
16.
Zurück zum Zitat Laudadio, T., Sava, C., Anca, R., Sima, D.M., Wright, A.J., Heerschap, A., Mastronardi, N., Van Huffel, S.: Hierarchical non-negative matrix factorization applied to three-dimensional 3T MRSI data for automatic tissue characterization of the prostate. NMR Biomed. 29(6), 751–758 (2016)CrossRef Laudadio, T., Sava, C., Anca, R., Sima, D.M., Wright, A.J., Heerschap, A., Mastronardi, N., Van Huffel, S.: Hierarchical non-negative matrix factorization applied to three-dimensional 3T MRSI data for automatic tissue characterization of the prostate. NMR Biomed. 29(6), 751–758 (2016)CrossRef
17.
Zurück zum Zitat Wang, Y.X., Zhang, Y.J.: Nonnegative matrix factorization: A comprehensive review. IEEE Trans. on Knowl. Data Eng. 25(6), 1336–1353 (2013)CrossRef Wang, Y.X., Zhang, Y.J.: Nonnegative matrix factorization: A comprehensive review. IEEE Trans. on Knowl. Data Eng. 25(6), 1336–1353 (2013)CrossRef
18.
Zurück zum Zitat Benson, A.R., Lee, J.D., Rajwa, B., Gleich, D.F.: Scalable methods for nonnegative matrix factorizations of near-separable tall-and-skinny matrices. In: Proceedings of Neural Information Processing Systems, pp. 945–953(2014) Benson, A.R., Lee, J.D., Rajwa, B., Gleich, D.F.: Scalable methods for nonnegative matrix factorizations of near-separable tall-and-skinny matrices. In: Proceedings of Neural Information Processing Systems, pp. 945–953(2014)
19.
Zurück zum Zitat Harper, F.M., Konstan, J.A.: The movielens datasets: History and context. ACM Trans. Interact. Intell. Syst. 5(4), 19:1–19:19 (2015)CrossRef Harper, F.M., Konstan, J.A.: The movielens datasets: History and context. ACM Trans. Interact. Intell. Syst. 5(4), 19:1–19:19 (2015)CrossRef
20.
21.
Zurück zum Zitat Buksak, D.: Implementation of nonnegative matrix factorization algorithms in apache spark framework. Master’s thesis, Wroclaw University of Science and Technology Supervised by Dr. R. Zdunek (2016) Buksak, D.: Implementation of nonnegative matrix factorization algorithms in apache spark framework. Master’s thesis, Wroclaw University of Science and Technology Supervised by Dr. R. Zdunek (2016)
Metadaten
Titel
Distributed Nonnegative Matrix Factorization with HALS Algorithm on MapReduce
verfasst von
Rafał Zdunek
Krzysztof Fonal
Copyright-Jahr
2017
DOI
https://doi.org/10.1007/978-3-319-65482-9_14

Premium Partner