
2021 | OriginalPaper | Chapter

Mixture Variational Autoencoder of Boltzmann Machines for Text Processing

Authors: Bruno Guilherme Gomes, Fabricio Murai, Olga Goussevskaia, Ana Paula Couto da Silva

Published in: Natural Language Processing and Information Systems

Publisher: Springer International Publishing


Abstract

Variational autoencoders (VAEs) have been successfully used to learn good representations in unsupervised settings, especially for image data. More recently, mixture variational autoencoders (MVAEs) have been proposed to enhance the representation capabilities of VAEs by assuming that data can come from a mixture distribution. In this work, we adapt MVAEs for text processing by modeling the joint distribution of each component's latent variables and the document's bag-of-words as a Boltzmann Machine, a graphical model popular in natural language processing for performing well on a variety of tasks. The proposed model, MVAE-BM, can learn text representations from unlabeled data without requiring pre-trained word embeddings. We evaluate the representations obtained by MVAE-BM on six corpora in terms of perplexity and of accuracy on binary and multi-class text classification. Despite its simplicity, our results show that MVAE-BM's performance is on par with or superior to that of modern deep learning techniques such as BERT and RoBERTa. Lastly, we show that the mapping to mixture components learned by the model lends itself naturally to document clustering.
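To make the architecture sketched in the abstract concrete, the following is a minimal, hypothetical PyTorch sketch of the general idea, not the authors' implementation: a shared encoder produces a soft mixture-component assignment via the Gumbel-softmax relaxation (the relaxation appears in the chapter's references) together with per-component Gaussian parameters, and each component decodes the latent vector into a distribution over the vocabulary. The class name MVAEBMSketch and all layer sizes are illustrative, and a plain softmax-over-vocabulary decoder stands in for the chapter's Boltzmann-machine likelihood.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class MVAEBMSketch(nn.Module):
        """Illustrative mixture VAE over bag-of-words inputs (not the authors' code)."""

        def __init__(self, vocab_size, n_components=5, latent_dim=50, hidden=256):
            super().__init__()
            self.n_components, self.latent_dim = n_components, latent_dim
            self.encoder = nn.Sequential(nn.Linear(vocab_size, hidden), nn.ReLU())
            self.comp_logits = nn.Linear(hidden, n_components)          # q(c | x)
            self.mu = nn.Linear(hidden, n_components * latent_dim)      # q(z | x, c)
            self.logvar = nn.Linear(hidden, n_components * latent_dim)
            # One decoder per mixture component: latent vector -> vocabulary logits.
            # A softmax decoder stands in for the Boltzmann-machine likelihood.
            self.decoders = nn.ModuleList(
                nn.Linear(latent_dim, vocab_size) for _ in range(n_components))

        def forward(self, bow, tau=0.5):
            h = self.encoder(bow)
            # Differentiable soft component assignment (Gumbel-softmax relaxation).
            c = F.gumbel_softmax(self.comp_logits(h), tau=tau)          # (B, K)
            mu = self.mu(h).view(-1, self.n_components, self.latent_dim)
            logvar = self.logvar(h).view(-1, self.n_components, self.latent_dim)
            # Blend per-component Gaussian parameters with the soft assignment,
            # then apply the usual reparameterization trick.
            mu_c = (c.unsqueeze(-1) * mu).sum(dim=1)
            logvar_c = (c.unsqueeze(-1) * logvar).sum(dim=1)
            z = mu_c + torch.randn_like(mu_c) * (0.5 * logvar_c).exp()
            # Mix the components' word log-probabilities (a Jensen lower bound
            # on the true mixture log-likelihood, which suffices for an ELBO).
            logits = torch.stack([d(z) for d in self.decoders], dim=1)  # (B, K, V)
            log_px = (c.unsqueeze(-1) * F.log_softmax(logits, dim=-1)).sum(dim=1)
            rec = -(bow * log_px).sum(dim=-1)     # multinomial reconstruction term
            kl = -0.5 * (1 + logvar_c - mu_c.pow(2) - logvar_c.exp()).sum(dim=-1)
            return (rec + kl).mean(), c

    # Toy usage with random word counts; all shapes and sizes are arbitrary.
    model = MVAEBMSketch(vocab_size=2000)
    bow = torch.randint(0, 3, (8, 2000)).float()
    loss, assignment = model(bow)
    loss.backward()

The soft assignment returned as `assignment` is what makes the document-clustering use mentioned in the abstract natural: taking its argmax maps each document to a mixture component.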


Literature
1. Conneau, A., et al.: Unsupervised cross-lingual representation learning at scale. In: ACL (2020)
2. Dahl, G.E., Adams, R.P., Larochelle, H.: Training restricted Boltzmann machines on word observations. In: ICML (2012)
4. Devlin, J., Chang, M., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. In: NAACL-HLT (2019)
5. Ding, R., Nallapati, R., Xiang, B.: Coherence-aware neural topic modeling. In: EMNLP (2018)
6. Jang, E., Gu, S., Poole, B.: Categorical reparameterization with Gumbel-softmax. In: ICLR (2017)
7. Jiang, S., Chen, Y., Yang, J., Zhang, C., Zhao, T.: Mixture variational autoencoders. Pattern Recognit. Lett. 128 (2019)
8. Kingma, D.P., Welling, M.: Auto-encoding variational Bayes. In: ICLR (2014)
10. Maddison, C.J., Mnih, A., Teh, Y.W.: The concrete distribution: a continuous relaxation of discrete random variables. In: ICLR (2017)
11. Miao, Y., Grefenstette, E., Blunsom, P.: Discovering discrete latent topics with neural variational inference. In: ICML (2017)
12. Miao, Y., Yu, L., Blunsom, P.: Neural variational inference for text processing. In: ICML (2015)
13. Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. In: NeurIPS (2013)
14. Mnih, A., Gregor, K.: Neural variational inference and learning in belief networks. In: ICML (2014)
15. Ning, X., et al.: Nonparametric topic modeling with neural inference. Neurocomputing 399, 296–306 (2020)
16. Pennington, J., Socher, R., Manning, C.: GloVe: global vectors for word representation. In: EMNLP (2014)
17. Reimers, N., Gurevych, I.: Making monolingual sentence embeddings multilingual using knowledge distillation. arXiv preprint arXiv:2004.09813 (2020)
18. Rousseeuw, P.J.: Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. J. Comput. Appl. Math. 20, 53–65 (1987)
19. Srivastava, A., Sutton, C.: Neural variational inference for topic models. In: NeurIPS (2016)
20. Srivastava, N., Salakhutdinov, R., Hinton, G.: Modeling documents with a deep Boltzmann machine. In: UAI (2013)
21. Sugar, C.A., James, G.M.: Finding the number of clusters in a dataset. J. Am. Stat. Assoc. 98(463), 750–763 (2003)
22. Wu, J., et al.: Neural mixed counting models for dispersed topic discovery. In: ACL (2020)
23. Xiao, Y., Zhao, T., Wang, W.Y.: Dirichlet variational autoencoder for text modeling. CoRR (2018)
24. Xu, J., Durrett, G.: Spherical latent spaces for stable variational autoencoders. In: EMNLP (2018)
25. Yang, Z., Hu, Z., Salakhutdinov, R., Berg-Kirkpatrick, T.: Improved variational autoencoders for text modeling using dilated convolutions. In: ICML (2017)
Metadata
Title
Mixture Variational Autoencoder of Boltzmann Machines for Text Processing
Authors
Bruno Guilherme Gomes
Fabricio Murai
Olga Goussevskaia
Ana Paula Couto da Silva
Copyright Year
2021
DOI
https://doi.org/10.1007/978-3-030-80599-9_5
