Skip to main content

2018 | OriginalPaper | Buchkapitel

Distributed Nonnegative Matrix Factorization with HALS Algorithm on Apache Spark

verfasst von : Krzysztof Fonał, Rafał Zdunek

Erschienen in: Artificial Intelligence and Soft Computing

Verlag: Springer International Publishing

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Nonnegative Matrix Factorization (NMF) is a commonly-used unsupervised learning method for extracting parts-based features and dimensionality reduction from nonnegative data. Many computational algorithms exist for updating the latent nonnegative factors in NMF. In this study, we propose an extension of the Hierarchical Alternating Least Squares (HALS) algorithm to a distributed version using the state-of-the-art framework - Apache Spark. Spark gains its popularity among other distributed computational frameworks because of its in-memory approach which works much faster than well-known Apache Hadoop. The scalability and efficiency of the proposed algorithm is confirmed in the numerical experiments, performed on real data as well as synthetic ones.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literatur
2.
Zurück zum Zitat Cichocki, A., Zdunek, R., Phan, A.H., Amari, S.I.: Nonnegative Matrix and Tensor Factorizations: Applications to Exploratory Multi-way Data Analysis and Blind Source Separation. Wiley, Hoboken (2009)CrossRef Cichocki, A., Zdunek, R., Phan, A.H., Amari, S.I.: Nonnegative Matrix and Tensor Factorizations: Applications to Exploratory Multi-way Data Analysis and Blind Source Separation. Wiley, Hoboken (2009)CrossRef
3.
Zurück zum Zitat Lee, D.D., Seung, H.S.: Learning the parts of objects by non-negative matrix factorization. Nature 401, 788–791 (1999)CrossRef Lee, D.D., Seung, H.S.: Learning the parts of objects by non-negative matrix factorization. Nature 401, 788–791 (1999)CrossRef
4.
Zurück zum Zitat Liu, C., Yang, H.C., Fan, J., He, L.W., Wang, Y.M.: Distributed nonnegative matrix factorization for web-scale dyadic data analysis on MapReduce. In: Proceedings of 19th International Conference on World Wide Web. WWW 2010, pp. 681–690. ACM, New York (2010) Liu, C., Yang, H.C., Fan, J., He, L.W., Wang, Y.M.: Distributed nonnegative matrix factorization for web-scale dyadic data analysis on MapReduce. In: Proceedings of 19th International Conference on World Wide Web. WWW 2010, pp. 681–690. ACM, New York (2010)
5.
Zurück zum Zitat Sun, Z., Li, T., Rishe, N.: Large-scale matrix factorization using MapReduce. In: ICDM Workshops, pp. 1242–1248. IEEE Computer Society (2010) Sun, Z., Li, T., Rishe, N.: Large-scale matrix factorization using MapReduce. In: ICDM Workshops, pp. 1242–1248. IEEE Computer Society (2010)
6.
Zurück zum Zitat Ding, C., Li, T., Jordan, M.I.: Convex and semi-nonnegative matrix factorizations. IEEE Trans. Pattern Anal. Mach. Intell. 32(1), 45–55 (2010)CrossRef Ding, C., Li, T., Jordan, M.I.: Convex and semi-nonnegative matrix factorizations. IEEE Trans. Pattern Anal. Mach. Intell. 32(1), 45–55 (2010)CrossRef
8.
Zurück zum Zitat Han, L., Neumann, M., Prasad, U.: Alternating projected Barzilai-Borwein methods for nonnegative matrix factorization. Electron. Trans. Numer. Anal. 36, 54–82 (2009–2010) Han, L., Neumann, M., Prasad, U.: Alternating projected Barzilai-Borwein methods for nonnegative matrix factorization. Electron. Trans. Numer. Anal. 36, 54–82 (2009–2010)
9.
Zurück zum Zitat Kim, J., Park, H.: Fast nonnegative matrix factorization: an active-set-like method and comparisons. SIAM J. Sci. Comput. 33(6), 3261–3281 (2011)MathSciNetCrossRef Kim, J., Park, H.: Fast nonnegative matrix factorization: an active-set-like method and comparisons. SIAM J. Sci. Comput. 33(6), 3261–3281 (2011)MathSciNetCrossRef
10.
Zurück zum Zitat Gillis, N., Glineur, F.: Accelerated multiplicative updates and hierarchical ALS algorithms for nonnegative matrix factorization. Neural Comput. 24(4), 1085–1105 (2012)MathSciNetCrossRef Gillis, N., Glineur, F.: Accelerated multiplicative updates and hierarchical ALS algorithms for nonnegative matrix factorization. Neural Comput. 24(4), 1085–1105 (2012)MathSciNetCrossRef
11.
Zurück zum Zitat Chen, W., Guillaume, M.: HALS-based NMF with flexible constraints for hyperspectral unmixing. EURASIP J. Adv. Signal Process. 54, 1–14 (2012) Chen, W., Guillaume, M.: HALS-based NMF with flexible constraints for hyperspectral unmixing. EURASIP J. Adv. Signal Process. 54, 1–14 (2012)
12.
Zurück zum Zitat Laudadio, T., Croitor Sava, A.R., Sima, D.M., Wright, A.J., Heerschap, A., Mastronardi, N., Van Huffel, S.: Hierarchical non-negative matrix factorization applied to three-dimensional 3T MRSI data for automatic tissue characterization of the prostate. NMR Biomed. 29(6), 751–758 (2016)CrossRef Laudadio, T., Croitor Sava, A.R., Sima, D.M., Wright, A.J., Heerschap, A., Mastronardi, N., Van Huffel, S.: Hierarchical non-negative matrix factorization applied to three-dimensional 3T MRSI data for automatic tissue characterization of the prostate. NMR Biomed. 29(6), 751–758 (2016)CrossRef
14.
Zurück zum Zitat Zaharia, M., Chowdhury, M., Das, T., Dave, A., Ma, J., McCauly, M., Franklin, M.J., Shenker, S., Stoica, I.: Resilient distributed datasets: a fault-tolerant abstraction for in-memory cluster computing. In: Gribble, S.D., Katabi, D. (eds.) Proceedings of the 9th USENIX Symposium on Networked Systems Design and Implementation, NSDI 2012, San Jose, CA, USA, 25–27 April 2012, pp. 15–28. USENIX Association (2012) Zaharia, M., Chowdhury, M., Das, T., Dave, A., Ma, J., McCauly, M., Franklin, M.J., Shenker, S., Stoica, I.: Resilient distributed datasets: a fault-tolerant abstraction for in-memory cluster computing. In: Gribble, S.D., Katabi, D. (eds.) Proceedings of the 9th USENIX Symposium on Networked Systems Design and Implementation, NSDI 2012, San Jose, CA, USA, 25–27 April 2012, pp. 15–28. USENIX Association (2012)
15.
Zurück zum Zitat Cichocki, A., Phan, A.H.: Fast local algorithms for large scale nonnegative matrix and tensor factorizations. IEICE Trans. Fundam. Electron. Commun. Comput. Sci. E92–A(3), 708–721 (2009)CrossRef Cichocki, A., Phan, A.H.: Fast local algorithms for large scale nonnegative matrix and tensor factorizations. IEICE Trans. Fundam. Electron. Commun. Comput. Sci. E92–A(3), 708–721 (2009)CrossRef
16.
Zurück zum Zitat Harper, F.M., Konstan, J.A.: The movielens datasets: history and context. ACM Trans. Interact. Intell. Syst. 5(4), 19:1–19:19 (2015)CrossRef Harper, F.M., Konstan, J.A.: The movielens datasets: history and context. ACM Trans. Interact. Intell. Syst. 5(4), 19:1–19:19 (2015)CrossRef
Metadaten
Titel
Distributed Nonnegative Matrix Factorization with HALS Algorithm on Apache Spark
verfasst von
Krzysztof Fonał
Rafał Zdunek
Copyright-Jahr
2018
DOI
https://doi.org/10.1007/978-3-319-91262-2_30