2017 | Original Paper | Book Chapter

Compositional Sentence Representation from Character Within Large Context Text

Authors: Geonmin Kim, Hwaran Lee, Bokyeong Kim, Soo-young Lee

Published in: Neural Information Processing

Publisher: Springer International Publishing

Abstract

This paper describes the Hierarchical Composition Recurrent Network (HCRN), which consists of a three-level hierarchy of compositional models: character, word, and sentence. The model is designed to overcome two problems of representing a sentence on the basis of its constituent word sequence: the data sparsity problem that arises when estimating embeddings of rare words, and the failure to exploit inter-sentence dependency. In the HCRN, word representations are built from characters, which resolves the data sparsity problem, and inter-sentence dependency is embedded into the sentence representation at the level of sentence composition. We propose a hierarchy-wise language learning scheme to alleviate the optimization difficulties of training deep hierarchical recurrent networks in an end-to-end fashion. The HCRN was evaluated quantitatively and qualitatively on a dialogue act classification task, where it achieved state-of-the-art performance with a test error rate of 22.7% on the SWBD-DAMSL database.
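The three-level composition described in the abstract lends itself to a compact sketch. The following PyTorch code is a minimal, illustrative rendering of the idea: characters are composed into word vectors, word vectors into a sentence vector, and sentence vectors are carried across the dialogue before dialogue-act classification. The GRU cells, layer sizes, and the 42-way label count are assumptions made for illustration, not the authors' reference implementation.

```python
# Minimal sketch of the character -> word -> sentence composition hierarchy.
# Cell type (GRU), dimensions, and n_acts=42 (SWBD-DAMSL-style label set)
# are illustrative assumptions, not the paper's exact configuration.
import torch
import torch.nn as nn

class HCRNSketch(nn.Module):
    def __init__(self, n_chars, char_dim=16, word_dim=64, sent_dim=128, n_acts=42):
        super().__init__()
        self.char_emb = nn.Embedding(n_chars, char_dim)
        # Level 1: compose a word's characters into a word vector.
        self.char_rnn = nn.GRU(char_dim, word_dim, batch_first=True)
        # Level 2: compose a sentence's word vectors into a sentence vector.
        self.word_rnn = nn.GRU(word_dim, sent_dim, batch_first=True)
        # Level 3: compose sentence vectors across the dialogue, so each
        # sentence representation is conditioned on the preceding sentences.
        self.sent_rnn = nn.GRU(sent_dim, sent_dim, batch_first=True)
        self.classifier = nn.Linear(sent_dim, n_acts)  # dialogue-act logits

    def forward(self, dialogue):
        # dialogue: list of sentences; each sentence is a list of 1-D
        # LongTensors of character ids, one tensor per word.
        sent_vecs = []
        for sentence in dialogue:
            word_vecs = []
            for word_chars in sentence:
                _, h = self.char_rnn(self.char_emb(word_chars).unsqueeze(0))
                word_vecs.append(h[-1])                    # (1, word_dim)
            words = torch.cat(word_vecs).unsqueeze(0)      # (1, n_words, word_dim)
            _, h = self.word_rnn(words)
            sent_vecs.append(h[-1])                        # (1, sent_dim)
        sents = torch.cat(sent_vecs).unsqueeze(0)          # (1, n_sents, sent_dim)
        ctx, _ = self.sent_rnn(sents)                      # context-aware sentence states
        return self.classifier(ctx.squeeze(0))             # (n_sents, n_acts)
```

A forward pass over one dialogue returns one row of dialogue-act logits per sentence; because the sentence-level GRU runs over the whole dialogue, later sentences are classified with inter-sentence context, which is the dependency the word-sequence-only baseline ignores.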

Footnotes
2
The number of epochs for which the pre-trained model is frozen was chosen as the best value from preliminary experiments on the validation set.
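For concreteness, one hedged way this freezing step could be realized, continuing the HCRNSketch above, is to hold the pre-trained lower levels fixed for freeze_epochs epochs and then fine-tune the whole hierarchy end-to-end. The attribute names, optimizer choice, and training-loop shape here are illustrative assumptions only.

```python
# Illustrative freezing schedule for hierarchy-wise training; assumes the
# HCRNSketch model above. Names and optimizer are hypothetical, not the
# paper's exact procedure.
import torch
import torch.nn as nn

def set_requires_grad(module, flag):
    for p in module.parameters():
        p.requires_grad = flag

def train_with_freezing(model, batches, epochs, freeze_epochs, lr=1e-3):
    """Keep the pre-trained lower levels frozen for `freeze_epochs` epochs."""
    loss_fn = nn.CrossEntropyLoss()
    set_requires_grad(model.char_rnn, False)   # pre-trained level 1 frozen
    set_requires_grad(model.word_rnn, False)   # pre-trained level 2 frozen
    opt = torch.optim.Adam(
        (p for p in model.parameters() if p.requires_grad), lr=lr)
    for epoch in range(epochs):
        if epoch == freeze_epochs:
            set_requires_grad(model, True)     # unfreeze: fine-tune end-to-end
            opt = torch.optim.Adam(model.parameters(), lr=lr)
        for dialogue, act_labels in batches:
            logits = model(dialogue)           # (n_sents, n_acts)
            loss = loss_fn(logits, act_labels) # one label per sentence
            opt.zero_grad()
            loss.backward()
            opt.step()
```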
 
Metadata
Title
Compositional Sentence Representation from Character Within Large Context Text
Authors
Geonmin Kim
Hwaran Lee
Bokyeong Kim
Soo-young Lee
Copyright Year
2017
DOI
https://doi.org/10.1007/978-3-319-70096-0_69
