2017 | Original Paper | Book Chapter

Compositional Sentence Representation from Character Within Large Context Text

Authors: Geonmin Kim, Hwaran Lee, Bokyeong Kim, Soo-young Lee

Published in: Neural Information Processing

Publisher: Springer International Publishing

Abstract

This paper describes the Hierarchical Composition Recurrent Network (HCRN), which consists of a three-level hierarchy of compositional models: character, word, and sentence. The model is designed to overcome two problems of representing a sentence on the basis of its constituent word sequence: the data sparsity problem that arises when estimating embeddings of rare words, and the failure to exploit inter-sentence dependency. In the HCRN, word representations are built from characters, which resolves the data sparsity problem, and inter-sentence dependency is embedded into the sentence representation at the level of sentence composition. We propose a hierarchy-wise language learning scheme to alleviate the optimization difficulties of training deep hierarchical recurrent networks in an end-to-end fashion. The HCRN was evaluated quantitatively and qualitatively on a dialogue act classification task, where it achieved state-of-the-art performance with a test error rate of 22.7% on the SWBD-DAMSL database.
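The three-level composition described in the abstract lends itself to a compact sketch. The following PyTorch code is a minimal, illustrative rendering of the idea: characters are composed into word vectors, word vectors into a sentence vector, and sentence vectors are carried across the dialogue before dialogue-act classification. The GRU cells, layer sizes, and the 42-way label count are assumptions made for illustration, not the authors' reference implementation.

```python
# Minimal sketch of the character -> word -> sentence composition hierarchy.
# Cell type (GRU), dimensions, and n_acts=42 (SWBD-DAMSL-style label set)
# are illustrative assumptions, not the paper's exact configuration.
import torch
import torch.nn as nn

class HCRNSketch(nn.Module):
    def __init__(self, n_chars, char_dim=16, word_dim=64, sent_dim=128, n_acts=42):
        super().__init__()
        self.char_emb = nn.Embedding(n_chars, char_dim)
        # Level 1: compose a word's characters into a word vector.
        self.char_rnn = nn.GRU(char_dim, word_dim, batch_first=True)
        # Level 2: compose a sentence's word vectors into a sentence vector.
        self.word_rnn = nn.GRU(word_dim, sent_dim, batch_first=True)
        # Level 3: compose sentence vectors across the dialogue, so each
        # sentence representation is conditioned on the preceding sentences.
        self.sent_rnn = nn.GRU(sent_dim, sent_dim, batch_first=True)
        self.classifier = nn.Linear(sent_dim, n_acts)  # dialogue-act logits

    def forward(self, dialogue):
        # dialogue: list of sentences; each sentence is a list of 1-D
        # LongTensors of character ids, one tensor per word.
        sent_vecs = []
        for sentence in dialogue:
            word_vecs = []
            for word_chars in sentence:
                _, h = self.char_rnn(self.char_emb(word_chars).unsqueeze(0))
                word_vecs.append(h[-1])                    # (1, word_dim)
            words = torch.cat(word_vecs).unsqueeze(0)      # (1, n_words, word_dim)
            _, h = self.word_rnn(words)
            sent_vecs.append(h[-1])                        # (1, sent_dim)
        sents = torch.cat(sent_vecs).unsqueeze(0)          # (1, n_sents, sent_dim)
        ctx, _ = self.sent_rnn(sents)                      # context-aware sentence states
        return self.classifier(ctx.squeeze(0))             # (n_sents, n_acts)
```

A forward pass over one dialogue returns one row of dialogue-act logits per sentence; because the sentence-level GRU runs over the whole dialogue, later sentences are classified with inter-sentence context, which is the dependency the word-sequence-only baseline ignores.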

Footnotes
2
The number of epochs for which the pre-trained model is frozen was chosen as the best value from preliminary experiments on the validation set.
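For concreteness, one hedged way this freezing step could be realized, continuing the HCRNSketch above, is to hold the pre-trained lower levels fixed for freeze_epochs epochs and then fine-tune the whole hierarchy end-to-end. The attribute names, optimizer choice, and training-loop shape here are illustrative assumptions only.

```python
# Illustrative freezing schedule for hierarchy-wise training; assumes the
# HCRNSketch model above. Names and optimizer are hypothetical, not the
# paper's exact procedure.
import torch
import torch.nn as nn

def set_requires_grad(module, flag):
    for p in module.parameters():
        p.requires_grad = flag

def train_with_freezing(model, batches, epochs, freeze_epochs, lr=1e-3):
    """Keep the pre-trained lower levels frozen for `freeze_epochs` epochs."""
    loss_fn = nn.CrossEntropyLoss()
    set_requires_grad(model.char_rnn, False)   # pre-trained level 1 frozen
    set_requires_grad(model.word_rnn, False)   # pre-trained level 2 frozen
    opt = torch.optim.Adam(
        (p for p in model.parameters() if p.requires_grad), lr=lr)
    for epoch in range(epochs):
        if epoch == freeze_epochs:
            set_requires_grad(model, True)     # unfreeze: fine-tune end-to-end
            opt = torch.optim.Adam(model.parameters(), lr=lr)
        for dialogue, act_labels in batches:
            logits = model(dialogue)           # (n_sents, n_acts)
            loss = loss_fn(logits, act_labels) # one label per sentence
            opt.zero_grad()
            loss.backward()
            opt.step()
```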
 
Metadata
Title
Compositional Sentence Representation from Character Within Large Context Text
Authors
Geonmin Kim
Hwaran Lee
Bokyeong Kim
Soo-young Lee
Copyright Year
2017
DOI
https://doi.org/10.1007/978-3-319-70096-0_69
