Abstract
A language model is developed based on features extracted from a recurrent neural network language model combined with a semantic embedding of the left context of the current word, obtained by probabilistic latent semantic analysis (PLSA). To compute this embedding, the left context is treated as a document. The method reduces the effect of vanishing gradients in the recurrent neural network. Experiments show that adding the topic-based features reduces perplexity by 10%.
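The abstract describes computing a topic distribution for the left context by treating it as a PLSA document and combining it with the recurrent network's features. The sketch below is a minimal, hypothetical illustration of this idea (all names, the toy vocabulary, and the fold-in routine are assumptions, not the authors' implementation): a fixed word-topic matrix is used to infer P(topic | context) by EM "folding-in", and the resulting topic vector is concatenated with a stand-in for the RNN hidden state.

```python
import numpy as np

# Toy setup (assumed for illustration): 6-word vocabulary, 2 topics.
# phi[t, w] = P(word w | topic t); each row sums to 1.
rng = np.random.default_rng(0)
phi = rng.dirichlet(np.ones(6), size=2)   # shape (topics, vocab)

def plsa_fold_in(context_word_ids, phi, n_iter=50):
    """Infer theta = P(topic | context) by EM folding-in,
    treating the left context as a single PLSA document."""
    n_topics, vocab = phi.shape
    theta = np.full(n_topics, 1.0 / n_topics)
    counts = np.bincount(context_word_ids, minlength=vocab)
    for _ in range(n_iter):
        # E-step: responsibilities P(topic | word) under current theta
        p = theta[:, None] * phi                    # (topics, vocab)
        p /= p.sum(axis=0, keepdims=True) + 1e-12
        # M-step: re-estimate theta from expected topic counts
        theta = (p * counts[None, :]).sum(axis=1)
        theta /= theta.sum()
    return theta

# Left context as word ids; the topic vector is concatenated
# with the RNN features to form the hybrid input.
context = np.array([0, 2, 2, 5])
theta = plsa_fold_in(context, phi)
hidden = rng.standard_normal(4)   # stand-in for the RNN hidden state
features = np.concatenate([hidden, theta])
```

Because theta summarizes arbitrarily long left context in a fixed-size vector, it supplies long-range topical information that the recurrent state alone may lose to vanishing gradients.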
Author information
Additional information
This article was translated by the authors.
Mikhail Sergeevich Kudinov. Born 1990. Studied theoretical and computational linguistics at the Faculty of Philology, Moscow State University, 2007–2012. Since 2012, graduate student at the Dorodnitsyn Computing Centre, Russian Academy of Sciences. Fields of interest: natural language processing, automatic speech recognition, language modeling, machine learning. Author of 6 papers.
Aleksandr Aleksandrovich Romanenko. Born 1991. Studied at the Department of Control and Applied Mathematics, Moscow Institute of Physics and Technology, 2008–2014; awarded a Master's degree in 2014. Since 2015, graduate student at the Moscow Institute of Physics and Technology, academic adviser K. V. Vorontsov. Fields of interest: probabilistic topic modeling, machine learning, document analysis. Author of 5 papers.
Cite this article
Kudinov, M.S., Romanenko, A.A. A hybrid language model based on a recurrent neural network and probabilistic topic modeling. Pattern Recognit. Image Anal. 26, 587–592 (2016). https://doi.org/10.1134/S1054661816030123