
2019 | Original Paper | Book Chapter

Extending the Transformer with Context and Multi-dimensional Mechanism for Dialogue Response Generation

Authors: Ruxin Tan, Jiahui Sun, Bo Su, Gongshen Liu

Published in: Natural Language Processing and Chinese Computing

Publisher: Springer International Publishing

Abstract

Existing work on generative models for multi-turn dialogue systems is often based on RNNs (recurrent neural networks), even though the Transformer architecture has achieved great success in other areas of NLP. In the multi-turn conversation task, a response is produced from both the source utterance and the utterances of the previous turns, which are regarded as context utterances. However, the vanilla Transformer processes utterances in isolation and hence cannot explicitly model the differences between context utterances and the source utterance. In addition, the same word can have different meanings in different contexts, since the context and source utterances in a multi-turn conversation carry rich information. We propose an end-to-end model for response generation that extends the vanilla Transformer with a context mechanism and a multi-dimensional attention mechanism. With the context mechanism, information from the context utterances can flow to the source utterance, so the two jointly control response generation. The multi-dimensional attention mechanism enables the model to capture more information from the context and source utterances by expanding the scalar attention weights into 2D (per-dimension) vectors. Experiments show that the proposed model outperforms state-of-the-art baselines (+35.8% over the best baseline).
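The full paper is not reproduced on this page, so the exact formulation of the two mechanisms is not available here. As a rough illustration of the ideas named in the abstract, the following PyTorch sketch shows (a) a gate that lets context information flow into the source representation and (b) a feature-wise ("multi-dimensional") attention in which each token receives a vector of per-dimension weights rather than a single scalar, in the spirit of DiSAN-style attention. All module names, tensor shapes, and the specific gating and score functions are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch only -- NOT the authors' code. It approximates the two
# mechanisms named in the abstract under assumed shapes and score functions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ContextGate(nn.Module):
    """Lets context-utterance information flow into the source representation,
    so both jointly influence generation (one plausible gating formulation)."""

    def __init__(self, d_model: int):
        super().__init__()
        self.gate = nn.Linear(2 * d_model, d_model)

    def forward(self, source: torch.Tensor, context: torch.Tensor) -> torch.Tensor:
        # source, context: (batch, seq_len, d_model), assumed position-aligned
        g = torch.sigmoid(self.gate(torch.cat([source, context], dim=-1)))
        return g * source + (1.0 - g) * context  # convex blend per dimension


class MultiDimAttention(nn.Module):
    """Feature-wise attention: the weight over a length-n sequence is an
    n x d_model matrix (a "2D" weight) instead of an n x 1 scalar column,
    so every feature dimension is attended over independently."""

    def __init__(self, d_model: int):
        super().__init__()
        self.w_query = nn.Linear(d_model, d_model, bias=False)
        self.w_key = nn.Linear(d_model, d_model, bias=False)

    def forward(self, query: torch.Tensor, keys: torch.Tensor,
                values: torch.Tensor) -> torch.Tensor:
        # query: (batch, d_model); keys, values: (batch, n, d_model)
        scores = torch.tanh(self.w_query(query).unsqueeze(1) + self.w_key(keys))
        weights = F.softmax(scores, dim=1)    # normalise over tokens, per dimension
        return (weights * values).sum(dim=1)  # (batch, d_model)


if __name__ == "__main__":
    batch, n, d = 2, 5, 64
    src, ctx = torch.randn(batch, n, d), torch.randn(batch, n, d)
    fused = ContextGate(d)(src, ctx)                                # (2, 5, 64)
    pooled = MultiDimAttention(d)(fused.mean(dim=1), fused, fused)  # (2, 64)
    print(fused.shape, pooled.shape)
```

The point this sketch tries to capture is that the softmax is taken over tokens separately for each of the d_model feature dimensions, which is one natural reading of the abstract's "2D vectoring the attention weights"; the scalar-weight attention of the vanilla Transformer is the special case where all dimensions share one weight.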


Metadata
Title
Extending the Transformer with Context and Multi-dimensional Mechanism for Dialogue Response Generation
Authors
Ruxin Tan
Jiahui Sun
Bo Su
Gongshen Liu
Copyright year
2019
DOI
https://doi.org/10.1007/978-3-030-32236-6_16