Published in: Cognitive Computation 3/2021

23 March 2020

A Deep Multi-task Model for Dialogue Act Classification, Intent Detection and Slot Filling

Authors: Mauajama Firdaus, Hitesh Golchha, Asif Ekbal, Pushpak Bhattacharyya

Abstract

An essential component of any dialogue system is language understanding, known as spoken language understanding (SLU). Dialogue act classification (DAC), intent detection (ID) and slot filling (SF) are significant aspects of every dialogue system. In this paper, we propose a deep learning-based multi-task model that performs the DAC, ID and SF tasks together. We use a deep bi-directional recurrent neural network (RNN) with long short-term memory (LSTM) and gated recurrent unit (GRU) cells as the framework of our multi-task model. We apply attention to the LSTM/GRU output for DAC and ID, and the attention outputs are fed to individual task-specific dense layers for these two tasks. For slot filling, the LSTM/GRU output is fed to a softmax layer. Experiments on three datasets, i.e. ATIS, TRAINS and FRAMES, show that the proposed multi-task model performs better than the individual models as well as all the pipeline models. The experimental results also show that our attention-based multi-task model outperforms the state-of-the-art approaches for the SLU tasks. For DAC, we achieve an improvement of more than 2% over the individual model on all the datasets. Similarly, for ID, we obtain an improvement of 1% on the ATIS dataset and a significant improvement of more than 3% on the TRAINS and FRAMES datasets compared to the individual models. For SF, we obtain a 0.8% improvement on ATIS and a 4% improvement on TRAINS and FRAMES with respect to the individual models. The obtained results are validated using statistical significance t-tests.
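The way the three outputs share one encoder can be illustrated with a small forward-pass sketch. It is not the authors' implementation: the encoder outputs are replaced by random stand-in vectors, the attention is assumed to be dot-product attention with one learned query vector per task, and the joint loss is assumed to be an unweighted sum of cross-entropies. All names and sizes (`multitask_forward`, `dac_query`, `N_DAC` and so on) are hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dense(vec, weights, bias):
    """Affine layer; `weights` holds one row per output unit."""
    return [sum(w * x for w, x in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def attention_pool(hidden_states, query):
    """Assumed attention form: weights = softmax(h_t . query) over time steps."""
    weights = softmax([sum(h * q for h, q in zip(h_t, query))
                       for h_t in hidden_states])
    dim = len(hidden_states[0])
    context = [sum(a * h_t[i] for a, h_t in zip(weights, hidden_states))
               for i in range(dim)]
    return context, weights


def multitask_forward(hidden_states, params):
    """Feed one shared sequence of encoder outputs to three task heads."""
    # Utterance-level tasks: attention pooling, then a task-specific dense layer.
    dac_context, _ = attention_pool(hidden_states, params["dac_query"])
    id_context, _ = attention_pool(hidden_states, params["id_query"])
    dac_probs = softmax(dense(dac_context, params["dac_W"], params["dac_b"]))
    id_probs = softmax(dense(id_context, params["id_W"], params["id_b"]))
    # Token-level task: a softmax over slot labels at every time step.
    slot_probs = [softmax(dense(h_t, params["sf_W"], params["sf_b"]))
                  for h_t in hidden_states]
    return dac_probs, id_probs, slot_probs


def joint_loss(dac_probs, id_probs, slot_probs, dac_gold, id_gold, slot_gold):
    """Unweighted sum of the three cross-entropy losses (an assumption)."""
    loss = -math.log(dac_probs[dac_gold]) - math.log(id_probs[id_gold])
    loss -= sum(math.log(p[g]) for p, g in zip(slot_probs, slot_gold))
    return loss


def random_matrix(rng, rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


# Hypothetical sizes: 5 tokens, hidden size 8, 4 dialogue acts, 3 intents, 6 slot labels.
rng = random.Random(0)
T, D, N_DAC, N_ID, N_SLOT = 5, 8, 4, 3, 6
hidden_states = random_matrix(rng, T, D)  # stand-in for Bi-LSTM/GRU outputs
params = {
    "dac_query": [rng.uniform(-0.5, 0.5) for _ in range(D)],
    "id_query": [rng.uniform(-0.5, 0.5) for _ in range(D)],
    "dac_W": random_matrix(rng, N_DAC, D), "dac_b": [0.0] * N_DAC,
    "id_W": random_matrix(rng, N_ID, D), "id_b": [0.0] * N_ID,
    "sf_W": random_matrix(rng, N_SLOT, D), "sf_b": [0.0] * N_SLOT,
}

dac_probs, id_probs, slot_probs = multitask_forward(hidden_states, params)
loss = joint_loss(dac_probs, id_probs, slot_probs,
                  dac_gold=1, id_gold=0, slot_gold=[0, 1, 2, 3, 4])
print(len(dac_probs), len(id_probs), len(slot_probs), loss)
```

Because all three heads read the same hidden states, a gradient from any one task would update the shared encoder, which is the usual motivation for multi-task training over separate or pipelined models.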

Literatur
1.
Zurück zum Zitat Ang J, Liu Y, Shriberg E. Automatic dialog act segmentation and classification in multiparty meetings, In: IEEE International Conference on Acoustics, Speech, and Signal Processing, {ICASSP} '05, Philadelphia, Pennsylvania, USA, March 18-23, 2005, Vol 1, pp 1061–1064. Ang J, Liu Y, Shriberg E. Automatic dialog act segmentation and classification in multiparty meetings, In: IEEE International Conference on Acoustics, Speech, and Signal Processing, {ICASSP} '05, Philadelphia, Pennsylvania, USA, March 18-23, 2005, Vol 1, pp 1061–1064.
2.
Zurück zum Zitat Bapna A, Tur G, Hakkani-Tur D, Heck L. Sequential dialogue context modeling for spoken language understanding, In: Proceedings of the 18th Annual SIGdial Meeting on Discourse and Dialogue, Saarbrucken, Germany, August 15–17, 2017; pp 103–114. Bapna A, Tur G, Hakkani-Tur D, Heck L. Sequential dialogue context modeling for spoken language understanding, In: Proceedings of the 18th Annual SIGdial Meeting on Discourse and Dialogue, Saarbrucken, Germany, August 15–17, 2017; pp 103–114.
3.
Zurück zum Zitat Barahona LMR, Gasic M, Mrkšić N, Su PH, Ultes S, Wen TH, Young S. Exploiting sentence and context representations in deep neural models for spoken language understanding, In: 26th International Conference on Computational Linguistics, (COLING), Proceedings of the Conference: Technical Papers, December 11–16, 2016, Osaka, Japan; pp 258–267. Barahona LMR, Gasic M, Mrkšić N, Su PH, Ultes S, Wen TH, Young S. Exploiting sentence and context representations in deep neural models for spoken language understanding, In: 26th International Conference on Computational Linguistics, (COLING), Proceedings of the Conference: Technical Papers, December 11–16, 2016, Osaka, Japan; pp 258–267. 
4.
Zurück zum Zitat Chen L, Di Eugenio B. Multimodality and dialogue act classification in the RoboHelper Project; In: Proceedings of the SIGDIAL 2013 Conference, The 14th Annual Meeting of the Special Interest Group on Discourse and Dialogue, 22–24 August 2013, SUPELEC, Metz, France; pp 183–192. Chen L, Di Eugenio B. Multimodality and dialogue act classification in the RoboHelper Project; In: Proceedings of the SIGDIAL 2013 Conference, The 14th Annual Meeting of the Special Interest Group on Discourse and Dialogue, 22–24 August 2013, SUPELEC, Metz, France; pp 183–192.
5.
Zurück zum Zitat A. Deoras, R. Sarikaya, Deep belief network based semantic taggers for spoken language understanding., In: INTERSPEECH 2013, 14th Annual Conference of the International Speech Communication Association, Lyon, France, August 25-29, 2013, pp. 2713–2717. A. Deoras, R. Sarikaya, Deep belief network based semantic taggers for spoken language understanding., In: INTERSPEECH 2013, 14th Annual Conference of the International Speech Communication Association, Lyon, France, August 25-29, 2013, pp. 2713–2717.
6.
Zurück zum Zitat Fernandez R, Picard RW. Dialog act classification from prosodic features using support vector machines, In: Speech Prosody 2002, International Conference; 2002. Fernandez R, Picard RW. Dialog act classification from prosodic features using support vector machines, In: Speech Prosody 2002, International Conference; 2002.
7.
Zurück zum Zitat Firdaus M, Bhatnagar S, Ekbal A, Bhattacharyya P. Intent detection for spoken language understanding using a deep ensemble model, In: 15th Pacific Rim International Conference on Artificial Intelligence (PRICAI), Nanjing, China, August 28-31, 2018, Proceedings, Part {I}, Springer, pp 629–642. Firdaus M, Bhatnagar S, Ekbal A, Bhattacharyya P. Intent detection for spoken language understanding using a deep ensemble model, In: 15th Pacific Rim International Conference on Artificial Intelligence (PRICAI), Nanjing, China, August 28-31, 2018, Proceedings, Part {I}, Springer, pp 629–642.
8.
Zurück zum Zitat Firdaus M, Bhatnagar S, Ekbal A, Bhattacharyya P. A deep learning based multi-task ensemble model for intent detection and slot filling in spoken language understanding, In: Neural Information Processing - 25th International Conference, (ICONIP) 2018, Siem Reap, Cambodia, December 13-16, 2018, Proceedings, Part {IV}, Springer, pp 647–658. Firdaus M, Bhatnagar S, Ekbal A, Bhattacharyya P. A deep learning based multi-task ensemble model for intent detection and slot filling in spoken language understanding, In: Neural Information Processing - 25th International Conference, (ICONIP) 2018, Siem Reap, Cambodia, December 13-16, 2018, Proceedings, Part {IV}, Springer, pp 647–658.
9.
Zurück zum Zitat Firdaus M, Kumar A, Ekbal A, Bhattacharyya P. A Multi-task hierarchical approach for intent detection and slot filling, In: Knowledge-Based Systems, Elsevier; vol-183; 2019. Firdaus M, Kumar A, Ekbal A, Bhattacharyya P. A Multi-task hierarchical approach for intent detection and slot filling, In: Knowledge-Based Systems, Elsevier; vol-183; 2019.
10.
Zurück zum Zitat Goo CW, Gao G, Hsu YK, Huo CL, Chen TC, Hsu KW, Chen YN. Slot-gated modeling for joint slot filling and intent prediction, In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT, New Orleans, Louisiana, USA, June 1-6, 2018, Volume 2 (Short Papers), pp 753–757. Goo CW, Gao G, Hsu YK, Huo CL, Chen TC, Hsu KW, Chen YN. Slot-gated modeling for joint slot filling and intent prediction, In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT, New Orleans, Louisiana, USA, June 1-6, 2018, Volume 2 (Short Papers), pp 753–757.
11.
Zurück zum Zitat Gorin AL, Riccardi G, Wright JH. How may I help you? Speech Comm. 1997; vol-23, pp 113–27. Gorin AL, Riccardi G, Wright JH. How may I help you? Speech Comm. 1997; vol-23, pp 113–27.
12.
Zurück zum Zitat Grau S, Sanchis E, Castro MJ, Vilar D. Dialogue act classification using a Bayesian approach, In: 9th Conference Speech and Computer; 2004. Grau S, Sanchis E, Castro MJ, Vilar D. Dialogue act classification using a Bayesian approach, In: 9th Conference Speech and Computer; 2004.
13.
Zurück zum Zitat Guo D, Tur G, Yih Wt, Zweig G. Joint semantic utterance classification and slot filling with recursive neural networks, In: Spoken Language Technology Workshop (SLT), IEEE, South Lake Tahoe, NV, USA, December 7-10, 2014; pp 554–559. Guo D, Tur G, Yih Wt, Zweig G. Joint semantic utterance classification and slot filling with recursive neural networks, In: Spoken Language Technology Workshop (SLT), IEEE, South Lake Tahoe, NV, USA, December 7-10, 2014; pp 554–559.
14.
Zurück zum Zitat Haffner P, Tur G, Wright JH. Optimizing SVMs for complex call classification. In: Acoustics, Speech, and Signal Processing, IEEE International Conference, Hong Kong, April 6-10, 2003, vol 1, pp 632–635. Haffner P, Tur G, Wright JH. Optimizing SVMs for complex call classification. In: Acoustics, Speech, and Signal Processing, IEEE International Conference, Hong Kong, April 6-10, 2003, vol 1, pp 632–635.
15.
Zurück zum Zitat Hakkani-Tür D, Tur G, Chotimongkol A. Using syntactic and semantic graphs for call classification, In: Proceedings of the ACL Workshop on Feature Engineering for Machine Learning in Natural Language Processing; 2005. Hakkani-Tür D, Tur G, Chotimongkol A. Using syntactic and semantic graphs for call classification, In: Proceedings of the ACL Workshop on Feature Engineering for Machine Learning in Natural Language Processing; 2005.
16.
Zurück zum Zitat Hakkani-Tür D, Tür G, Celikyilmaz A, Chen YN, Gao J, Deng L, Wang YY Multi-domain joint semantic frame parsing using bi-directional RNN-LSTM, In: 17th Annual Conference of the International Speech Communication Association, Interspeech, San Francisco, CA, USA, September 8-12, 2016; pp 715–719. Hakkani-Tür D, Tür G, Celikyilmaz A, Chen YN, Gao J, Deng L, Wang YY Multi-domain joint semantic frame parsing using bi-directional RNN-LSTM, In: 17th Annual Conference of the International Speech Communication Association, Interspeech, San Francisco, CA, USA, September 8-12, 2016; pp 715–719.
17.
Zurück zum Zitat Hashemi HB, Asiaee A, Kraft R. Query intent detection using convolutional neural networks, In: International Conference on Web Search and Data Mining, Workshop on Query Understanding; 2016. Hashemi HB, Asiaee A, Kraft R. Query intent detection using convolutional neural networks, In: International Conference on Web Search and Data Mining, Workshop on Query Understanding; 2016.
18.
Zurück zum Zitat He Y, Young S. A data-driven spoken language understanding system, In: IEEE Workshop on Automatic Speech Recognition and Understanding, pp 583–588; 2003. He Y, Young S. A data-driven spoken language understanding system, In: IEEE Workshop on Automatic Speech Recognition and Understanding, pp 583–588; 2003.
19.
Zurück zum Zitat Hochreiter S, Schmidhuber J. Long short-term memory. Neural Comput. 1997;9(8):1735–80.CrossRef Hochreiter S, Schmidhuber J. Long short-term memory. Neural Comput. 1997;9(8):1735–80.CrossRef
20.
Zurück zum Zitat Jeong M, Lee GG. Triangular-chain conditional random fields. IEEE Trans. Audio Speech Lang Process. 2008; vol-16(7); pp 1287–302. Jeong M, Lee GG. Triangular-chain conditional random fields. IEEE Trans. Audio Speech Lang Process. 2008; vol-16(7); pp 1287–302.
21.
Zurück zum Zitat Ji G, Bilmes J. Dialog act tagging using graphical models. In: IEEE International Conference on Acoustics, Speech, and Signal Processing, (ICASSP) '05, Philadelphia, Pennsylvania, USA, March 18-23, 2005; vol 1, pp 33–36. Ji G, Bilmes J. Dialog act tagging using graphical models. In: IEEE International Conference on Acoustics, Speech, and Signal Processing, (ICASSP) '05, Philadelphia, Pennsylvania, USA, March 18-23, 2005; vol 1, pp 33–36.
22.
Zurück zum Zitat Ji Y, Haffari G, Eisenstein J. A Latent variable recurrent neural network for discourse relation language models, arXiv preprint arXiv:1603.01913; 2016. Ji Y, Haffari G, Eisenstein J. A Latent variable recurrent neural network for discourse relation language models, arXiv preprint arXiv:1603.01913; 2016.
23.
Zurück zum Zitat Justo R, Alcaide JM, Torres MI, Walker M. Detection of sarcasm and nastiness: new resources for Spanish language. In: Cognitive Computation; 2018; vol-10; pp 1135–1151. Justo R, Alcaide JM, Torres MI, Walker M. Detection of sarcasm and nastiness: new resources for Spanish language. In: Cognitive Computation; 2018; vol-10; pp 1135–1151.
24.
Zurück zum Zitat Kalchbrenner N, Blunsom P. Recurrent convolutional neural networks for discourse compositionality, In: Proceedings of the Workshop on Continuous Vector Space Models and their Compositionality, CVSM@ACL 2013, Sofia, Bulgaria, August 9, 2013, pp 119–126. Kalchbrenner N, Blunsom P. Recurrent convolutional neural networks for discourse compositionality, In: Proceedings of the Workshop on Continuous Vector Space Models and their Compositionality, CVSM@ACL 2013, Sofia, Bulgaria, August 9, 2013, pp 119–126.
25.
Zurück zum Zitat Keizer S. A Bayesian approach to dialogue act classification, In: BI-DIALOG 2001: Proceedings of the 5th Workshop on Formal Semantics and Pragmatics of Dialogue, pp 210–218; 2001. Keizer S. A Bayesian approach to dialogue act classification, In: BI-DIALOG 2001: Proceedings of the 5th Workshop on Formal Semantics and Pragmatics of Dialogue, pp 210–218; 2001.
26.
Zurück zum Zitat Keizer S, Nijholt A, et al. Dialogue act recognition with Bayesian networks for Dutch dialogues, In: Proceedings of the SIGDIAL 2002 Workshop, The 3rd Annual Meeting of the Special Interest Group on Discourse and Dialogue, Thursday, July 11, 2002 to Friday, July 12, 2002, Philadelphia, PA, USA; Association for Computational Linguistics, pp 88–94. Keizer S, Nijholt A, et al. Dialogue act recognition with Bayesian networks for Dutch dialogues, In: Proceedings of the SIGDIAL 2002 Workshop, The 3rd Annual Meeting of the Special Interest Group on Discourse and Dialogue, Thursday, July 11, 2002 to Friday, July 12, 2002, Philadelphia, PA, USA; Association for Computational Linguistics, pp 88–94.
27.
Zurück zum Zitat Khanpour H, Guntakandla N, Nielsen R. Dialogue act classification in domain-independent conversations using a deep recurrent neural network, In: Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, December 11-16, 2016, Osaka, Japan, pp. 2012–2021. Khanpour H, Guntakandla N, Nielsen R. Dialogue act classification in domain-independent conversations using a deep recurrent neural network, In: Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, December 11-16, 2016, Osaka, Japan, pp. 2012–2021.
28.
Zurück zum Zitat Kim JK, Tur G, Celikyilmaz A, Cao B, Wang YY. Intent detection using semantically enriched word embeddings, In: Spoken Language Technology Workshop (SLT), IEEE, San Diego, CA, USA, December 13-16, 2016; pp 414–419. Kim JK, Tur G, Celikyilmaz A, Cao B, Wang YY. Intent detection using semantically enriched word embeddings, In: Spoken Language Technology Workshop (SLT), IEEE, San Diego, CA, USA, December 13-16, 2016; pp 414–419.
29.
Zurück zum Zitat Kim SN, Cavedon L, Baldwin T. Classifying Dialogue acts in one-on-one live chats, In: Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, 9-11 October 2010, {MIT} Stata Center, Massachusetts, USA; pp 862–871. Kim SN, Cavedon L, Baldwin T. Classifying Dialogue acts in one-on-one live chats, In: Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, 9-11 October 2010, {MIT} Stata Center, Massachusetts, USA; pp 862–871.
30.
Zurück zum Zitat Kim Y, Jernite Y, Sontag D, Rush AM. Character-Aware Neural Language Models, In: Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, February 12-17, 2016, Phoenix, Arizona, USA, pp 2741–2749. Kim Y, Jernite Y, Sontag D, Rush AM. Character-Aware Neural Language Models, In: Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, February 12-17, 2016, Phoenix, Arizona, USA, pp 2741–2749.
31.
Zurück zum Zitat Kim YB, Lee S, Stratos K. ONENET: Joint domain, intent, slot prediction for spoken language understanding, In: Automatic Speech Recognition and Understanding Workshop (ASRU), IEEE, Okinawa, Japan, December 16-20, 2017 pp 547–553. Kim YB, Lee S, Stratos K. ONENET: Joint domain, intent, slot prediction for spoken language understanding, In: Automatic Speech Recognition and Understanding Workshop (ASRU), IEEE, Okinawa, Japan, December 16-20, 2017 pp 547–553.
32.
Zurück zum Zitat Kingma D, Ba J. Adam: a method for stochastic optimization, In: 3rd International Conference on Learning Representations, {ICLR} 2015, San Diego, CA, USA, May 7-9, 2015, Conference Track Proceedings. Kingma D, Ba J. Adam: a method for stochastic optimization, In: 3rd International Conference on Learning Representations, {ICLR} 2015, San Diego, CA, USA, May 7-9, 2015, Conference Track Proceedings.
33.
Zurück zum Zitat Kral P, Cerisara C. Automatic dialogue act recognition with syntactic features. Lang Resour Eval. 2014;48(3):419–41. Kral P, Cerisara C. Automatic dialogue act recognition with syntactic features. Lang Resour Eval. 2014;48(3):419–41.
34.
Zurück zum Zitat Kumar H, Agarwal A, Dasgupta R, Joshi S, Kumar A. Dialogue act sequence labeling using hierarchical encoder with CRF, In: Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, (AAAI-18), the 30th innovative Applications of Artificial Intelligence (IAAI-18), and the 8th AAAI Symposium on Educational Advances in Artificial Intelligence (EAAI-18), New Orleans, Louisiana, USA, February 2-7, 2018, pp 3440–3447. Kumar H, Agarwal A, Dasgupta R, Joshi S, Kumar A. Dialogue act sequence labeling using hierarchical encoder with CRF, In: Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, (AAAI-18), the 30th innovative Applications of Artificial Intelligence (IAAI-18), and the 8th AAAI Symposium on Educational Advances in Artificial Intelligence (EAAI-18), New Orleans, Louisiana, USA, February 2-7, 2018, pp 3440–3447.
35.
Zurück zum Zitat Lauren P, Qu G, Yang J, Watta P, Huang GB, Lendasse A. Generating word embeddings from an extreme learning machine for sentiment analysis and sequence labeling tasks. In: Cognitive Computation, 2018; Springer; vol- 10; pp 625–638. Lauren P, Qu G, Yang J, Watta P, Huang GB, Lendasse A. Generating word embeddings from an extreme learning machine for sentiment analysis and sequence labeling tasks. In: Cognitive Computation, 2018; Springer; vol- 10; pp 625–638.
36.
Zurück zum Zitat Li Y, Yang L, Xu B, Wang J, Lin H. Improving user attribute classification with text and social network attention. In: Cognitive Computation, 2019; Springer; vol- 11; pp 459–468. Li Y, Yang L, Xu B, Wang J, Lin H. Improving user attribute classification with text and social network attention. In: Cognitive Computation, 2019; Springer; vol- 11; pp 459–468.
37.
Zurück zum Zitat Liu B, Lane I. Attention-based recurrent neural network models for joint intent detection and slot filling, In: Interspeech 2016, 17th Annual Conference of the International Speech Communication Association, San Francisco, CA, USA, September 8-12, 2016, pp 685--689. Liu B, Lane I. Attention-based recurrent neural network models for joint intent detection and slot filling, In: Interspeech 2016, 17th Annual Conference of the International Speech Communication Association, San Francisco, CA, USA, September 8-12, 2016, pp 685--689.
38.
Zurück zum Zitat Liu B, Lane I. Joint online spoken language understanding and language modeling with recurrent neural networks. In: Proceedings of the SIGDIAL 2016 Conference, The 17th Annual Meeting of the Special Interest Group on Discourse and Dialogue, 13-15 September 2016, Los Angeles, CA, USA, pp 22-30. Liu B, Lane I. Joint online spoken language understanding and language modeling with recurrent neural networks. In: Proceedings of the SIGDIAL 2016 Conference, The 17th Annual Meeting of the Special Interest Group on Discourse and Dialogue, 13-15 September 2016, Los Angeles, CA, USA, pp 22-30.
39.
Zurück zum Zitat Liu B, Lane I. Dialog context language modeling with recurrent neural networks, In: IEEE International Conference on Acoustics, Speech and Signal Processing; ICASSP, New Orleans, LA, USA, March 5-9, 2017; pp. 5715–5719. Liu B, Lane I. Dialog context language modeling with recurrent neural networks, In: IEEE International Conference on Acoustics, Speech and Signal Processing; ICASSP, New Orleans, LA, USA, March 5-9, 2017; pp. 5715–5719.
40.
Zurück zum Zitat Liu Y. Using SVM and error-correcting codes for multiclass dialog act classification in meeting corpus, In: Ninth International Conference on Spoken Language Processing, Interspeech, Pittsburgh, PA, USA, September 17-21, 2006. Liu Y. Using SVM and error-correcting codes for multiclass dialog act classification in meeting corpus, In: Ninth International Conference on Spoken Language Processing, Interspeech, Pittsburgh, PA, USA, September 17-21, 2006.
41.
Zurück zum Zitat Liu Y, Han K, Tan Z, Lei Y. Using context information for dialog act classification in DNN framework, In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, Copenhagen, Denmark, September 9-11, 2017; pp. 2170–2178. Liu Y, Han K, Tan Z, Lei Y. Using context information for dialog act classification in DNN framework, In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, Copenhagen, Denmark, September 9-11, 2017; pp. 2170–2178.
42.
Zurück zum Zitat Luan Y, Watanabe S, Harsham B. Efficient learning for spoken language understanding tasks with word embedding based pre-training, In: Sixteenth Annual Conference of the International Speech Communication Association, Interspeech, Dresden, Germany, September 6-10, 2015; pp 1398–1402. Luan Y, Watanabe S, Harsham B. Efficient learning for spoken language understanding tasks with word embedding based pre-training, In: Sixteenth Annual Conference of the International Speech Communication Association, Interspeech, Dresden, Germany, September 6-10, 2015; pp 1398–1402.
43.
Zurück zum Zitat McCallum A, Freitag D, Pereira FC. Maximum entropy Markov models for information extraction and segmentation. ICML. 2000;17:591–8. McCallum A, Freitag D, Pereira FC. Maximum entropy Markov models for information extraction and segmentation. ICML. 2000;17:591–8.
44.
Zurück zum Zitat Mesnil G, He X, Deng L, Bengio Y. Investigation of recurrent neural network architectures and learning methods for spoken language understanding, In: 14th Annual Conference of the International Speech Communication Association, Interspeech, Lyon, France, August 25-29, 2013; pp 3771–3775. Mesnil G, He X, Deng L, Bengio Y. Investigation of recurrent neural network architectures and learning methods for spoken language understanding, In: 14th Annual Conference of the International Speech Communication Association, Interspeech, Lyon, France, August 25-29, 2013; pp 3771–3775.
45.
Zurück zum Zitat Mesnil G, Dauphin Y, Yao K, Bengio Y, Deng L, Hakkani-Tur D, et al. Using recurrent neural networks for slot filling in spoken language understanding. IEEE-ACM T Audio Spe. 2015;23(3):530–9. Mesnil G, Dauphin Y, Yao K, Bengio Y, Deng L, Hakkani-Tur D, et al. Using recurrent neural networks for slot filling in spoken language understanding. IEEE-ACM T Audio Spe. 2015;23(3):530–9.
46.
Zurück zum Zitat Moschitti A, Riccardi G, Raymond C. Spoken language understanding with kernels for syntactic/semantic structures. In: IEEE Workshop on Automatic Speech Recognition & Understanding, ASRU, Kyoto, Japan, December 9-13, 2007; pp 183–188. Moschitti A, Riccardi G, Raymond C. Spoken language understanding with kernels for syntactic/semantic structures. In: IEEE Workshop on Automatic Speech Recognition & Understanding, ASRU, Kyoto, Japan, December 9-13, 2007; pp 183–188.
47.
Zurück zum Zitat Papalampidi P, Iosif E, Potamianos A. Dialogue act semantic representation and classification using recurrent neural networks, In: Proc. SEMDIAL 2017 (SaarDial) Workshop on the Semantics and Pragmatics of Dialogue, pp. 77–86; 2017. Papalampidi P, Iosif E, Potamianos A. Dialogue act semantic representation and classification using recurrent neural networks, In: Proc. SEMDIAL 2017 (SaarDial) Workshop on the Semantics and Pragmatics of Dialogue, pp. 77–86; 2017.
48.
Zurück zum Zitat Pennington J, Socher R, Manning C. Glove: global vectors for word representation, In: Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP), October 25-29, 2014, Doha, Qatar, pp 1532–1543. Pennington J, Socher R, Manning C. Glove: global vectors for word representation, In: Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP), October 25-29, 2014, Doha, Qatar, pp 1532–1543.
49.
Zurück zum Zitat Price PJ. Evaluation of spoken language systems: the ATIS domain, In: Speech and Natural Language: Proceedings of a Workshop Held at Hidden Valley, Pennsylvania, June 24-27; 1990. Price PJ. Evaluation of spoken language systems: the ATIS domain, In: Speech and Natural Language: Proceedings of a Workshop Held at Hidden Valley, Pennsylvania, June 24-27; 1990.
50.
Zurück zum Zitat Ravuri S, Stoicke A. A comparative study of neural network models for lexical intent classification, In: IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), Scottsdale, AZ, USA, December 13-17, 2015, pp 368–374. Ravuri S, Stoicke A. A comparative study of neural network models for lexical intent classification, In: IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), Scottsdale, AZ, USA, December 13-17, 2015, pp 368–374.
51.
Zurück zum Zitat Ravuri SV, Stolcke A. Recurrent neural network and LSTM models for lexical utterance classification, In: 16th Annual Conference of the International Speech Communication Association, Interspeech, Dresden, Germany, September 6-10, 2015, pp 135–139. Ravuri SV, Stolcke A. Recurrent neural network and LSTM models for lexical utterance classification, In: 16th Annual Conference of the International Speech Communication Association, Interspeech, Dresden, Germany, September 6-10, 2015, pp 135–139.
52.
Zurück zum Zitat Raymond C, Riccardi G. Generative and discriminative algorithms for spoken language understanding, In: Eighth Annual Conference of the International Speech Communication Association, Interspeech; Antwerp, Belgium, August 27-31, 2007, pp 1605–1608. Raymond C, Riccardi G. Generative and discriminative algorithms for spoken language understanding, In: Eighth Annual Conference of the International Speech Communication Association, Interspeech; Antwerp, Belgium, August 27-31, 2007, pp 1605–1608.
53.
Zurück zum Zitat Ribeiro E, Ribeiro R, de Matos DM. The influence of context on dialogue act recognition, arXiv preprint arXiv:150600839; 2015. Ribeiro E, Ribeiro R, de Matos DM. The influence of context on dialogue act recognition, arXiv preprint arXiv:150600839; 2015.
54.
Zurück zum Zitat Ries K. Hmm and neural network based speech act detection, In: IEEE International Conference on Acoustics, Speech, and Signal Processing, (ICASSP), Phoenix, Arizona, USA, March 15-19, 1999; vol 1, pp 497–500. Ries K. Hmm and neural network based speech act detection, In: IEEE International Conference on Acoustics, Speech, and Signal Processing, (ICASSP), Phoenix, Arizona, USA, March 15-19, 1999; vol 1, pp 497–500.
55.
Zurück zum Zitat Samei B, Li H, Keshtkar F, Rus V, Graesser AC. Context-based speech act classification in intelligent tutoring systems, In: International Conference on Intelligent Tutoring Systems, Springer, pp 236–241; 2014. Samei B, Li H, Keshtkar F, Rus V, Graesser AC. Context-based speech act classification in intelligent tutoring systems, In: International Conference on Intelligent Tutoring Systems, Springer, pp 236–241; 2014.
56.
Zurück zum Zitat Srivastava N, Hinton GE, Krizhevsky A, Sutskever I, Salakhutdinov R. Dropout: a simple way to prevent neural networks from overfitting. J Mach Learn Res. 2014;15(1):1929–58.MathSciNetMATH Srivastava N, Hinton GE, Krizhevsky A, Sutskever I, Salakhutdinov R. Dropout: a simple way to prevent neural networks from overfitting. J Mach Learn Res. 2014;15(1):1929–58.MathSciNetMATH
57.
Zurück zum Zitat Stolcke A, Ries K, Coccaro N, Shriberg E, Bates R, Jurafsky D, et al. Dialogue act modeling for automatic tagging and recognition of conversational speech. Comput Linguist. 2000;26(3):339–73.CrossRef Stolcke A, Ries K, Coccaro N, Shriberg E, Bates R, Jurafsky D, et al. Dialogue act modeling for automatic tagging and recognition of conversational speech. Comput Linguist. 2000;26(3):339–73.CrossRef
58.
Zurück zum Zitat Sun X, Peng X, Ding S. Emotional human machine conversation generation based on long short-term memory. In: Cognitive Computation, 2018; Springer; vol-10(3); pp 389–397. Sun X, Peng X, Ding S. Emotional human machine conversation generation based on long short-term memory. In: Cognitive Computation, 2018; Springer; vol-10(3); pp 389–397.
59.
Zurück zum Zitat Tur G. Model adaptation for spoken language understanding. In: IEEE International Conference on Acoustics, Speech, and Signal Processing, Philadelphia, Pennsylvania, USA, March 18-23, 2005; vol 1, pp 41–44. Tur G. Model adaptation for spoken language understanding. In: IEEE International Conference on Acoustics, Speech, and Signal Processing, Philadelphia, Pennsylvania, USA, March 18-23, 2005; vol 1, pp 41–44.
60.
Zurück zum Zitat Tur G, Hakkani-Tür D, Heck L, Parthasarathy S. Sentence simplification for spoken language understanding, In: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), May 22-27, 2011, Prague Congress Center, Prague, Czech Republic; pp 5628–5631. Tur G, Hakkani-Tür D, Heck L, Parthasarathy S. Sentence simplification for spoken language understanding, In: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), May 22-27, 2011, Prague Congress Center, Prague, Czech Republic; pp 5628–5631.
61.
Zurück zum Zitat Venkataraman A, Ferrer L, Stolcke A, Shriberg E. Training a prosody-based dialog act tagger from unlabeled data, In: Acoustics, Speech, and Signal Processing, Proceedings (ICASSP’03), IEEE International Conference on, IEEE, Hong Kong, April 6-10, 2003; vol 1, pp 272–275. Venkataraman A, Ferrer L, Stolcke A, Shriberg E. Training a prosody-based dialog act tagger from unlabeled data, In: Acoustics, Speech, and Signal Processing, Proceedings (ICASSP’03), IEEE International Conference on, IEEE, Hong Kong, April 6-10, 2003; vol 1, pp 272–275.
62.
Wang P, Song Q, Han H, Cheng J. Sequentially supervised long short-term memory for gesture recognition. In: Cognitive Computation, 2016; Springer; vol-8(5); pp 982–91.
63.
Wang Y, Shen Y, Jin H. A bi-model based RNN semantic frame parsing model for intent detection and slot filling, In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT, New Orleans, Louisiana, USA, June 1-6, 2018, Volume 2 (Short Papers), vol 2, pp 309–314.
64.
Wang Z, Lin Z. Optimal feature selection for learning-based algorithms for sentiment classification. In: Cognitive Computation, 2019; Springer; vol-12, pp 238–248.
65.
Welch BL. The generalization of student’s problem when several different population variances are involved. Biometrika. 1947;34(1/2):28–35.
66.
Xing C, Wu W, Wu Y, Liu J, Huang Y, Zhou M, et al. Topic aware neural response generation. In: Proceedings of the Thirty-First (AAAI) Conference on Artificial Intelligence, February 4-9, 2017, San Francisco, California, USA; pp 3351–3357.
67.
Xu P, Sarikaya R. Convolutional neural network based triangular CRF for joint intent detection and slot filling, In: IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), Olomouc, Czech Republic, December 8-12, 2013, pp 78–83.
68.
Yang Z, Yang D, Dyer C, He X, Smola A, Hovy E. Hierarchical attention networks for document classification, In: 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, San Diego, California, USA, June 12-17, 2016, pp 1480–1489.
69.
Yao K, Zweig G, Hwang MY, Shi Y, Yu D. Recurrent neural networks for language understanding, In: 14th Annual Conference of the International Speech Communication Association (Interspeech), Lyon, France, August 25-29, 2013; pp 2524–2528.
70.
Yao K, Peng B, Zhang Y, Yu D, Zweig G, Shi Y. Spoken language understanding using long short-term memory neural networks, In: IEEE Spoken Language Technology Workshop, SLT 2014, South Lake Tahoe, NV, USA, December 7-10, 2014; pp 189–194.
71.
Yao K, Peng B, Zweig G, Yu D, Li X, Gao F. Recurrent conditional random field for language understanding, In: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Florence, Italy, May 4-9, 2014; pp 4077–4081.
72.
Zhang X, Wang H. A joint model of intent determination and slot filling for spoken language understanding, In: Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, (IJCAI), New York, NY, USA, 9-15 July 2016, pp 2993–2999.
73.
Zhao L, Feng Z. Improving slot filling in spoken language understanding with joint pointer and attention, In: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, ACL 2018, Melbourne, Australia, July 15-20, 2018, Volume 2: Short Papers, pp 426–431.
74.
Zhou H, Huang M, Zhang T, Zhu X, Liu B. Emotional chatting machine: emotional conversation generation with internal and external memory, In: Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, (AAAI-18), the 30th innovative Applications of Artificial Intelligence (IAAI-18), and the 8th AAAI Symposium on Educational Advances in Artificial Intelligence (EAAI-18), New Orleans, Louisiana, USA, February 2-7, 2018; pp 730–739.
75.
Zhou Y, Hu Q, Liu J, Jia Y. Combining heterogeneous deep neural networks with conditional random fields for Chinese dialogue act recognition. In: Neurocomputing, 2015; vol-168; pp 408–17.
76.
Zhu S, Yu K. Encoder-decoder with focus-mechanism for sequence labelling based spoken language understanding, In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing, (ICASSP), New Orleans, LA, USA, March 5-9, 2017, pp 5675–5679.
Metadata
Title
A Deep Multi-task Model for Dialogue Act Classification, Intent Detection and Slot Filling
Authors
Mauajama Firdaus
Hitesh Golchha
Asif Ekbal
Pushpak Bhattacharyya
Publication date
23 March 2020
Publisher
Springer US
Published in
Cognitive Computation, Issue 3/2021
Print ISSN: 1866-9956
Electronic ISSN: 1866-9964
DOI
https://doi.org/10.1007/s12559-020-09718-4
