Skip to main content
Top
Published in: Cognitive Computation 3/2021

23-03-2020

A Deep Multi-task Model for Dialogue Act Classification, Intent Detection and Slot Filling

Authors: Mauajama Firdaus, Hitesh Golchha, Asif Ekbal, Pushpak Bhattacharyya

Published in: Cognitive Computation | Issue 3/2021

Log in

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

An essential component of any dialogue system is understanding the language which is known as spoken language understanding (SLU). Dialogue act classification (DAC), intent detection (ID) and slot filling (SF) are significant aspects of every dialogue system. In this paper, we propose a deep learning-based multi-task model that can perform DAC, ID and SF tasks together. We use a deep bi-directional recurrent neural network (RNN) with long short-term memory (LSTM) and gated recurrent unit (GRU) as the frameworks in our multi-task model. We use attention on the LSTM/GRU output for DAC and ID. The attention outputs are fed to individual task-specific dense layers for DAC and ID. The output of LSTM/GRU is fed to softmax layer for slot filling as well. Experiments on three datasets, i.e. ATIS, TRAINS and FRAMES, show that our proposed multi-task model performs better than the individual models as well as all the pipeline models. The experimental results prove that our attention-based multi-task model outperforms the state-of-the-art approaches for the SLU tasks. For DAC, in relation to the individual model, we achieve an improvement of more than 2% for all the datasets. Similarly, for ID, we get an improvement of 1% on the ATIS dataset, while for TRAINS and FRAMES dataset, there is a significant improvement of more than 3% compared to individual models. We also get a 0.8% enhancement for ATIS and a 4% enhancement for TRAINS and FRAMES dataset for SF with respect to individual models. Results obtained clearly show that our approach is better than existing methods. The validation of the obtained results is also demonstrated using statistical significance t tests.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literature
1.
go back to reference Ang J, Liu Y, Shriberg E. Automatic dialog act segmentation and classification in multiparty meetings, In: IEEE International Conference on Acoustics, Speech, and Signal Processing, {ICASSP} '05, Philadelphia, Pennsylvania, USA, March 18-23, 2005, Vol 1, pp 1061–1064. Ang J, Liu Y, Shriberg E. Automatic dialog act segmentation and classification in multiparty meetings, In: IEEE International Conference on Acoustics, Speech, and Signal Processing, {ICASSP} '05, Philadelphia, Pennsylvania, USA, March 18-23, 2005, Vol 1, pp 1061–1064.
2.
go back to reference Bapna A, Tur G, Hakkani-Tur D, Heck L. Sequential dialogue context modeling for spoken language understanding, In: Proceedings of the 18th Annual SIGdial Meeting on Discourse and Dialogue, Saarbrucken, Germany, August 15–17, 2017; pp 103–114. Bapna A, Tur G, Hakkani-Tur D, Heck L. Sequential dialogue context modeling for spoken language understanding, In: Proceedings of the 18th Annual SIGdial Meeting on Discourse and Dialogue, Saarbrucken, Germany, August 15–17, 2017; pp 103–114.
3.
go back to reference Barahona LMR, Gasic M, Mrkšić N, Su PH, Ultes S, Wen TH, Young S. Exploiting sentence and context representations in deep neural models for spoken language understanding, In: 26th International Conference on Computational Linguistics, (COLING), Proceedings of the Conference: Technical Papers, December 11–16, 2016, Osaka, Japan; pp 258–267. Barahona LMR, Gasic M, Mrkšić N, Su PH, Ultes S, Wen TH, Young S. Exploiting sentence and context representations in deep neural models for spoken language understanding, In: 26th International Conference on Computational Linguistics, (COLING), Proceedings of the Conference: Technical Papers, December 11–16, 2016, Osaka, Japan; pp 258–267. 
4.
go back to reference Chen L, Di Eugenio B. Multimodality and dialogue act classification in the RoboHelper Project; In: Proceedings of the SIGDIAL 2013 Conference, The 14th Annual Meeting of the Special Interest Group on Discourse and Dialogue, 22–24 August 2013, SUPELEC, Metz, France; pp 183–192. Chen L, Di Eugenio B. Multimodality and dialogue act classification in the RoboHelper Project; In: Proceedings of the SIGDIAL 2013 Conference, The 14th Annual Meeting of the Special Interest Group on Discourse and Dialogue, 22–24 August 2013, SUPELEC, Metz, France; pp 183–192.
5.
go back to reference A. Deoras, R. Sarikaya, Deep belief network based semantic taggers for spoken language understanding., In: INTERSPEECH 2013, 14th Annual Conference of the International Speech Communication Association, Lyon, France, August 25-29, 2013, pp. 2713–2717. A. Deoras, R. Sarikaya, Deep belief network based semantic taggers for spoken language understanding., In: INTERSPEECH 2013, 14th Annual Conference of the International Speech Communication Association, Lyon, France, August 25-29, 2013, pp. 2713–2717.
6.
go back to reference Fernandez R, Picard RW. Dialog act classification from prosodic features using support vector machines, In: Speech Prosody 2002, International Conference; 2002. Fernandez R, Picard RW. Dialog act classification from prosodic features using support vector machines, In: Speech Prosody 2002, International Conference; 2002.
7.
go back to reference Firdaus M, Bhatnagar S, Ekbal A, Bhattacharyya P. Intent detection for spoken language understanding using a deep ensemble model, In: 15th Pacific Rim International Conference on Artificial Intelligence (PRICAI), Nanjing, China, August 28-31, 2018, Proceedings, Part {I}, Springer, pp 629–642. Firdaus M, Bhatnagar S, Ekbal A, Bhattacharyya P. Intent detection for spoken language understanding using a deep ensemble model, In: 15th Pacific Rim International Conference on Artificial Intelligence (PRICAI), Nanjing, China, August 28-31, 2018, Proceedings, Part {I}, Springer, pp 629–642.
8.
go back to reference Firdaus M, Bhatnagar S, Ekbal A, Bhattacharyya P. A deep learning based multi-task ensemble model for intent detection and slot filling in spoken language understanding, In: Neural Information Processing - 25th International Conference, (ICONIP) 2018, Siem Reap, Cambodia, December 13-16, 2018, Proceedings, Part {IV}, Springer, pp 647–658. Firdaus M, Bhatnagar S, Ekbal A, Bhattacharyya P. A deep learning based multi-task ensemble model for intent detection and slot filling in spoken language understanding, In: Neural Information Processing - 25th International Conference, (ICONIP) 2018, Siem Reap, Cambodia, December 13-16, 2018, Proceedings, Part {IV}, Springer, pp 647–658.
9.
go back to reference Firdaus M, Kumar A, Ekbal A, Bhattacharyya P. A Multi-task hierarchical approach for intent detection and slot filling, In: Knowledge-Based Systems, Elsevier; vol-183; 2019. Firdaus M, Kumar A, Ekbal A, Bhattacharyya P. A Multi-task hierarchical approach for intent detection and slot filling, In: Knowledge-Based Systems, Elsevier; vol-183; 2019.
10.
go back to reference Goo CW, Gao G, Hsu YK, Huo CL, Chen TC, Hsu KW, Chen YN. Slot-gated modeling for joint slot filling and intent prediction, In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT, New Orleans, Louisiana, USA, June 1-6, 2018, Volume 2 (Short Papers), pp 753–757. Goo CW, Gao G, Hsu YK, Huo CL, Chen TC, Hsu KW, Chen YN. Slot-gated modeling for joint slot filling and intent prediction, In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT, New Orleans, Louisiana, USA, June 1-6, 2018, Volume 2 (Short Papers), pp 753–757.
11.
go back to reference Gorin AL, Riccardi G, Wright JH. How may I help you? Speech Comm. 1997; vol-23, pp 113–27. Gorin AL, Riccardi G, Wright JH. How may I help you? Speech Comm. 1997; vol-23, pp 113–27.
12.
go back to reference Grau S, Sanchis E, Castro MJ, Vilar D. Dialogue act classification using a Bayesian approach, In: 9th Conference Speech and Computer; 2004. Grau S, Sanchis E, Castro MJ, Vilar D. Dialogue act classification using a Bayesian approach, In: 9th Conference Speech and Computer; 2004.
13.
go back to reference Guo D, Tur G, Yih Wt, Zweig G. Joint semantic utterance classification and slot filling with recursive neural networks, In: Spoken Language Technology Workshop (SLT), IEEE, South Lake Tahoe, NV, USA, December 7-10, 2014; pp 554–559. Guo D, Tur G, Yih Wt, Zweig G. Joint semantic utterance classification and slot filling with recursive neural networks, In: Spoken Language Technology Workshop (SLT), IEEE, South Lake Tahoe, NV, USA, December 7-10, 2014; pp 554–559.
14.
go back to reference Haffner P, Tur G, Wright JH. Optimizing SVMs for complex call classification. In: Acoustics, Speech, and Signal Processing, IEEE International Conference, Hong Kong, April 6-10, 2003, vol 1, pp 632–635. Haffner P, Tur G, Wright JH. Optimizing SVMs for complex call classification. In: Acoustics, Speech, and Signal Processing, IEEE International Conference, Hong Kong, April 6-10, 2003, vol 1, pp 632–635.
15.
go back to reference Hakkani-Tür D, Tur G, Chotimongkol A. Using syntactic and semantic graphs for call classification, In: Proceedings of the ACL Workshop on Feature Engineering for Machine Learning in Natural Language Processing; 2005. Hakkani-Tür D, Tur G, Chotimongkol A. Using syntactic and semantic graphs for call classification, In: Proceedings of the ACL Workshop on Feature Engineering for Machine Learning in Natural Language Processing; 2005.
16.
go back to reference Hakkani-Tür D, Tür G, Celikyilmaz A, Chen YN, Gao J, Deng L, Wang YY Multi-domain joint semantic frame parsing using bi-directional RNN-LSTM, In: 17th Annual Conference of the International Speech Communication Association, Interspeech, San Francisco, CA, USA, September 8-12, 2016; pp 715–719. Hakkani-Tür D, Tür G, Celikyilmaz A, Chen YN, Gao J, Deng L, Wang YY Multi-domain joint semantic frame parsing using bi-directional RNN-LSTM, In: 17th Annual Conference of the International Speech Communication Association, Interspeech, San Francisco, CA, USA, September 8-12, 2016; pp 715–719.
17.
go back to reference Hashemi HB, Asiaee A, Kraft R. Query intent detection using convolutional neural networks, In: International Conference on Web Search and Data Mining, Workshop on Query Understanding; 2016. Hashemi HB, Asiaee A, Kraft R. Query intent detection using convolutional neural networks, In: International Conference on Web Search and Data Mining, Workshop on Query Understanding; 2016.
18.
go back to reference He Y, Young S. A data-driven spoken language understanding system, In: IEEE Workshop on Automatic Speech Recognition and Understanding, pp 583–588; 2003. He Y, Young S. A data-driven spoken language understanding system, In: IEEE Workshop on Automatic Speech Recognition and Understanding, pp 583–588; 2003.
19.
go back to reference Hochreiter S, Schmidhuber J. Long short-term memory. Neural Comput. 1997;9(8):1735–80.CrossRef Hochreiter S, Schmidhuber J. Long short-term memory. Neural Comput. 1997;9(8):1735–80.CrossRef
20.
go back to reference Jeong M, Lee GG. Triangular-chain conditional random fields. IEEE Trans. Audio Speech Lang Process. 2008; vol-16(7); pp 1287–302. Jeong M, Lee GG. Triangular-chain conditional random fields. IEEE Trans. Audio Speech Lang Process. 2008; vol-16(7); pp 1287–302.
21.
go back to reference Ji G, Bilmes J. Dialog act tagging using graphical models. In: IEEE International Conference on Acoustics, Speech, and Signal Processing, (ICASSP) '05, Philadelphia, Pennsylvania, USA, March 18-23, 2005; vol 1, pp 33–36. Ji G, Bilmes J. Dialog act tagging using graphical models. In: IEEE International Conference on Acoustics, Speech, and Signal Processing, (ICASSP) '05, Philadelphia, Pennsylvania, USA, March 18-23, 2005; vol 1, pp 33–36.
22.
go back to reference Ji Y, Haffari G, Eisenstein J. A Latent variable recurrent neural network for discourse relation language models, arXiv preprint arXiv:1603.01913; 2016. Ji Y, Haffari G, Eisenstein J. A Latent variable recurrent neural network for discourse relation language models, arXiv preprint arXiv:1603.01913; 2016.
23.
go back to reference Justo R, Alcaide JM, Torres MI, Walker M. Detection of sarcasm and nastiness: new resources for Spanish language. In: Cognitive Computation; 2018; vol-10; pp 1135–1151. Justo R, Alcaide JM, Torres MI, Walker M. Detection of sarcasm and nastiness: new resources for Spanish language. In: Cognitive Computation; 2018; vol-10; pp 1135–1151.
24.
go back to reference Kalchbrenner N, Blunsom P. Recurrent convolutional neural networks for discourse compositionality, In: Proceedings of the Workshop on Continuous Vector Space Models and their Compositionality, CVSM@ACL 2013, Sofia, Bulgaria, August 9, 2013, pp 119–126. Kalchbrenner N, Blunsom P. Recurrent convolutional neural networks for discourse compositionality, In: Proceedings of the Workshop on Continuous Vector Space Models and their Compositionality, CVSM@ACL 2013, Sofia, Bulgaria, August 9, 2013, pp 119–126.
25.
go back to reference Keizer S. A Bayesian approach to dialogue act classification, In: BI-DIALOG 2001: Proceedings of the 5th Workshop on Formal Semantics and Pragmatics of Dialogue, pp 210–218; 2001. Keizer S. A Bayesian approach to dialogue act classification, In: BI-DIALOG 2001: Proceedings of the 5th Workshop on Formal Semantics and Pragmatics of Dialogue, pp 210–218; 2001.
26.
go back to reference Keizer S, Nijholt A, et al. Dialogue act recognition with Bayesian networks for Dutch dialogues, In: Proceedings of the SIGDIAL 2002 Workshop, The 3rd Annual Meeting of the Special Interest Group on Discourse and Dialogue, Thursday, July 11, 2002 to Friday, July 12, 2002, Philadelphia, PA, USA; Association for Computational Linguistics, pp 88–94. Keizer S, Nijholt A, et al. Dialogue act recognition with Bayesian networks for Dutch dialogues, In: Proceedings of the SIGDIAL 2002 Workshop, The 3rd Annual Meeting of the Special Interest Group on Discourse and Dialogue, Thursday, July 11, 2002 to Friday, July 12, 2002, Philadelphia, PA, USA; Association for Computational Linguistics, pp 88–94.
27.
go back to reference Khanpour H, Guntakandla N, Nielsen R. Dialogue act classification in domain-independent conversations using a deep recurrent neural network, In: Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, December 11-16, 2016, Osaka, Japan, pp. 2012–2021. Khanpour H, Guntakandla N, Nielsen R. Dialogue act classification in domain-independent conversations using a deep recurrent neural network, In: Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, December 11-16, 2016, Osaka, Japan, pp. 2012–2021.
28.
go back to reference Kim JK, Tur G, Celikyilmaz A, Cao B, Wang YY. Intent detection using semantically enriched word embeddings, In: Spoken Language Technology Workshop (SLT), IEEE, San Diego, CA, USA, December 13-16, 2016; pp 414–419. Kim JK, Tur G, Celikyilmaz A, Cao B, Wang YY. Intent detection using semantically enriched word embeddings, In: Spoken Language Technology Workshop (SLT), IEEE, San Diego, CA, USA, December 13-16, 2016; pp 414–419.
29.
go back to reference Kim SN, Cavedon L, Baldwin T. Classifying Dialogue acts in one-on-one live chats, In: Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, 9-11 October 2010, {MIT} Stata Center, Massachusetts, USA; pp 862–871. Kim SN, Cavedon L, Baldwin T. Classifying Dialogue acts in one-on-one live chats, In: Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, 9-11 October 2010, {MIT} Stata Center, Massachusetts, USA; pp 862–871.
30.
go back to reference Kim Y, Jernite Y, Sontag D, Rush AM. Character-Aware Neural Language Models, In: Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, February 12-17, 2016, Phoenix, Arizona, USA, pp 2741–2749. Kim Y, Jernite Y, Sontag D, Rush AM. Character-Aware Neural Language Models, In: Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, February 12-17, 2016, Phoenix, Arizona, USA, pp 2741–2749.
31.
go back to reference Kim YB, Lee S, Stratos K. ONENET: Joint domain, intent, slot prediction for spoken language understanding, In: Automatic Speech Recognition and Understanding Workshop (ASRU), IEEE, Okinawa, Japan, December 16-20, 2017 pp 547–553. Kim YB, Lee S, Stratos K. ONENET: Joint domain, intent, slot prediction for spoken language understanding, In: Automatic Speech Recognition and Understanding Workshop (ASRU), IEEE, Okinawa, Japan, December 16-20, 2017 pp 547–553.
32.
go back to reference Kingma D, Ba J. Adam: a method for stochastic optimization, In: 3rd International Conference on Learning Representations, {ICLR} 2015, San Diego, CA, USA, May 7-9, 2015, Conference Track Proceedings. Kingma D, Ba J. Adam: a method for stochastic optimization, In: 3rd International Conference on Learning Representations, {ICLR} 2015, San Diego, CA, USA, May 7-9, 2015, Conference Track Proceedings.
33.
go back to reference Kral P, Cerisara C. Automatic dialogue act recognition with syntactic features. Lang Resour Eval. 2014;48(3):419–41. Kral P, Cerisara C. Automatic dialogue act recognition with syntactic features. Lang Resour Eval. 2014;48(3):419–41.
34.
go back to reference Kumar H, Agarwal A, Dasgupta R, Joshi S, Kumar A. Dialogue act sequence labeling using hierarchical encoder with CRF, In: Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, (AAAI-18), the 30th innovative Applications of Artificial Intelligence (IAAI-18), and the 8th AAAI Symposium on Educational Advances in Artificial Intelligence (EAAI-18), New Orleans, Louisiana, USA, February 2-7, 2018, pp 3440–3447. Kumar H, Agarwal A, Dasgupta R, Joshi S, Kumar A. Dialogue act sequence labeling using hierarchical encoder with CRF, In: Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, (AAAI-18), the 30th innovative Applications of Artificial Intelligence (IAAI-18), and the 8th AAAI Symposium on Educational Advances in Artificial Intelligence (EAAI-18), New Orleans, Louisiana, USA, February 2-7, 2018, pp 3440–3447.
35.
go back to reference Lauren P, Qu G, Yang J, Watta P, Huang GB, Lendasse A. Generating word embeddings from an extreme learning machine for sentiment analysis and sequence labeling tasks. In: Cognitive Computation, 2018; Springer; vol- 10; pp 625–638. Lauren P, Qu G, Yang J, Watta P, Huang GB, Lendasse A. Generating word embeddings from an extreme learning machine for sentiment analysis and sequence labeling tasks. In: Cognitive Computation, 2018; Springer; vol- 10; pp 625–638.
36.
go back to reference Li Y, Yang L, Xu B, Wang J, Lin H. Improving user attribute classification with text and social network attention. In: Cognitive Computation, 2019; Springer; vol- 11; pp 459–468. Li Y, Yang L, Xu B, Wang J, Lin H. Improving user attribute classification with text and social network attention. In: Cognitive Computation, 2019; Springer; vol- 11; pp 459–468.
37.
go back to reference Liu B, Lane I. Attention-based recurrent neural network models for joint intent detection and slot filling, In: Interspeech 2016, 17th Annual Conference of the International Speech Communication Association, San Francisco, CA, USA, September 8-12, 2016, pp 685--689. Liu B, Lane I. Attention-based recurrent neural network models for joint intent detection and slot filling, In: Interspeech 2016, 17th Annual Conference of the International Speech Communication Association, San Francisco, CA, USA, September 8-12, 2016, pp 685--689.
38.
go back to reference Liu B, Lane I. Joint online spoken language understanding and language modeling with recurrent neural networks. In: Proceedings of the SIGDIAL 2016 Conference, The 17th Annual Meeting of the Special Interest Group on Discourse and Dialogue, 13-15 September 2016, Los Angeles, CA, USA, pp 22-30. Liu B, Lane I. Joint online spoken language understanding and language modeling with recurrent neural networks. In: Proceedings of the SIGDIAL 2016 Conference, The 17th Annual Meeting of the Special Interest Group on Discourse and Dialogue, 13-15 September 2016, Los Angeles, CA, USA, pp 22-30.
39.
go back to reference Liu B, Lane I. Dialog context language modeling with recurrent neural networks, In: IEEE International Conference on Acoustics, Speech and Signal Processing; ICASSP, New Orleans, LA, USA, March 5-9, 2017; pp. 5715–5719. Liu B, Lane I. Dialog context language modeling with recurrent neural networks, In: IEEE International Conference on Acoustics, Speech and Signal Processing; ICASSP, New Orleans, LA, USA, March 5-9, 2017; pp. 5715–5719.
40.
go back to reference Liu Y. Using SVM and error-correcting codes for multiclass dialog act classification in meeting corpus, In: Ninth International Conference on Spoken Language Processing, Interspeech, Pittsburgh, PA, USA, September 17-21, 2006. Liu Y. Using SVM and error-correcting codes for multiclass dialog act classification in meeting corpus, In: Ninth International Conference on Spoken Language Processing, Interspeech, Pittsburgh, PA, USA, September 17-21, 2006.
41.
go back to reference Liu Y, Han K, Tan Z, Lei Y. Using context information for dialog act classification in DNN framework, In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, Copenhagen, Denmark, September 9-11, 2017; pp. 2170–2178. Liu Y, Han K, Tan Z, Lei Y. Using context information for dialog act classification in DNN framework, In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, Copenhagen, Denmark, September 9-11, 2017; pp. 2170–2178.
42.
go back to reference Luan Y, Watanabe S, Harsham B. Efficient learning for spoken language understanding tasks with word embedding based pre-training, In: Sixteenth Annual Conference of the International Speech Communication Association, Interspeech, Dresden, Germany, September 6-10, 2015; pp 1398–1402. Luan Y, Watanabe S, Harsham B. Efficient learning for spoken language understanding tasks with word embedding based pre-training, In: Sixteenth Annual Conference of the International Speech Communication Association, Interspeech, Dresden, Germany, September 6-10, 2015; pp 1398–1402.
43.
go back to reference McCallum A, Freitag D, Pereira FC. Maximum entropy Markov models for information extraction and segmentation. ICML. 2000;17:591–8. McCallum A, Freitag D, Pereira FC. Maximum entropy Markov models for information extraction and segmentation. ICML. 2000;17:591–8.
44.
go back to reference Mesnil G, He X, Deng L, Bengio Y. Investigation of recurrent neural network architectures and learning methods for spoken language understanding, In: 14th Annual Conference of the International Speech Communication Association, Interspeech, Lyon, France, August 25-29, 2013; pp 3771–3775. Mesnil G, He X, Deng L, Bengio Y. Investigation of recurrent neural network architectures and learning methods for spoken language understanding, In: 14th Annual Conference of the International Speech Communication Association, Interspeech, Lyon, France, August 25-29, 2013; pp 3771–3775.
45.
go back to reference Mesnil G, Dauphin Y, Yao K, Bengio Y, Deng L, Hakkani-Tur D, et al. Using recurrent neural networks for slot filling in spoken language understanding. IEEE-ACM T Audio Spe. 2015;23(3):530–9. Mesnil G, Dauphin Y, Yao K, Bengio Y, Deng L, Hakkani-Tur D, et al. Using recurrent neural networks for slot filling in spoken language understanding. IEEE-ACM T Audio Spe. 2015;23(3):530–9.
46.
go back to reference Moschitti A, Riccardi G, Raymond C. Spoken language understanding with kernels for syntactic/semantic structures. In: IEEE Workshop on Automatic Speech Recognition & Understanding, ASRU, Kyoto, Japan, December 9-13, 2007; pp 183–188. Moschitti A, Riccardi G, Raymond C. Spoken language understanding with kernels for syntactic/semantic structures. In: IEEE Workshop on Automatic Speech Recognition & Understanding, ASRU, Kyoto, Japan, December 9-13, 2007; pp 183–188.
47.
go back to reference Papalampidi P, Iosif E, Potamianos A. Dialogue act semantic representation and classification using recurrent neural networks, In: Proc. SEMDIAL 2017 (SaarDial) Workshop on the Semantics and Pragmatics of Dialogue, pp. 77–86; 2017. Papalampidi P, Iosif E, Potamianos A. Dialogue act semantic representation and classification using recurrent neural networks, In: Proc. SEMDIAL 2017 (SaarDial) Workshop on the Semantics and Pragmatics of Dialogue, pp. 77–86; 2017.
48.
go back to reference Pennington J, Socher R, Manning C. Glove: global vectors for word representation, In: Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP), October 25-29, 2014, Doha, Qatar, pp 1532–1543. Pennington J, Socher R, Manning C. Glove: global vectors for word representation, In: Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP), October 25-29, 2014, Doha, Qatar, pp 1532–1543.
49.
go back to reference Price PJ. Evaluation of spoken language systems: the ATIS domain, In: Speech and Natural Language: Proceedings of a Workshop Held at Hidden Valley, Pennsylvania, June 24-27; 1990. Price PJ. Evaluation of spoken language systems: the ATIS domain, In: Speech and Natural Language: Proceedings of a Workshop Held at Hidden Valley, Pennsylvania, June 24-27; 1990.
50.
go back to reference Ravuri S, Stoicke A. A comparative study of neural network models for lexical intent classification, In: IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), Scottsdale, AZ, USA, December 13-17, 2015, pp 368–374. Ravuri S, Stoicke A. A comparative study of neural network models for lexical intent classification, In: IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), Scottsdale, AZ, USA, December 13-17, 2015, pp 368–374.
51.
go back to reference Ravuri SV, Stolcke A. Recurrent neural network and LSTM models for lexical utterance classification, In: 16th Annual Conference of the International Speech Communication Association, Interspeech, Dresden, Germany, September 6-10, 2015, pp 135–139. Ravuri SV, Stolcke A. Recurrent neural network and LSTM models for lexical utterance classification, In: 16th Annual Conference of the International Speech Communication Association, Interspeech, Dresden, Germany, September 6-10, 2015, pp 135–139.
52.
go back to reference Raymond C, Riccardi G. Generative and discriminative algorithms for spoken language understanding, In: Eighth Annual Conference of the International Speech Communication Association, Interspeech; Antwerp, Belgium, August 27-31, 2007, pp 1605–1608. Raymond C, Riccardi G. Generative and discriminative algorithms for spoken language understanding, In: Eighth Annual Conference of the International Speech Communication Association, Interspeech; Antwerp, Belgium, August 27-31, 2007, pp 1605–1608.
53.
go back to reference Ribeiro E, Ribeiro R, de Matos DM. The influence of context on dialogue act recognition, arXiv preprint arXiv:150600839; 2015. Ribeiro E, Ribeiro R, de Matos DM. The influence of context on dialogue act recognition, arXiv preprint arXiv:150600839; 2015.
54.
go back to reference Ries K. Hmm and neural network based speech act detection, In: IEEE International Conference on Acoustics, Speech, and Signal Processing, (ICASSP), Phoenix, Arizona, USA, March 15-19, 1999; vol 1, pp 497–500. Ries K. Hmm and neural network based speech act detection, In: IEEE International Conference on Acoustics, Speech, and Signal Processing, (ICASSP), Phoenix, Arizona, USA, March 15-19, 1999; vol 1, pp 497–500.
55.
go back to reference Samei B, Li H, Keshtkar F, Rus V, Graesser AC. Context-based speech act classification in intelligent tutoring systems, In: International Conference on Intelligent Tutoring Systems, Springer, pp 236–241; 2014. Samei B, Li H, Keshtkar F, Rus V, Graesser AC. Context-based speech act classification in intelligent tutoring systems, In: International Conference on Intelligent Tutoring Systems, Springer, pp 236–241; 2014.
56.
go back to reference Srivastava N, Hinton GE, Krizhevsky A, Sutskever I, Salakhutdinov R. Dropout: a simple way to prevent neural networks from overfitting. J Mach Learn Res. 2014;15(1):1929–58.MathSciNetMATH Srivastava N, Hinton GE, Krizhevsky A, Sutskever I, Salakhutdinov R. Dropout: a simple way to prevent neural networks from overfitting. J Mach Learn Res. 2014;15(1):1929–58.MathSciNetMATH
57.
go back to reference Stolcke A, Ries K, Coccaro N, Shriberg E, Bates R, Jurafsky D, et al. Dialogue act modeling for automatic tagging and recognition of conversational speech. Comput Linguist. 2000;26(3):339–73.CrossRef Stolcke A, Ries K, Coccaro N, Shriberg E, Bates R, Jurafsky D, et al. Dialogue act modeling for automatic tagging and recognition of conversational speech. Comput Linguist. 2000;26(3):339–73.CrossRef
58.
go back to reference Sun X, Peng X, Ding S. Emotional human machine conversation generation based on long short-term memory. In: Cognitive Computation, 2018; Springer; vol-10(3); pp 389–397. Sun X, Peng X, Ding S. Emotional human machine conversation generation based on long short-term memory. In: Cognitive Computation, 2018; Springer; vol-10(3); pp 389–397.
59.
go back to reference Tur G. Model adaptation for spoken language understanding. In: IEEE International Conference on Acoustics, Speech, and Signal Processing, Philadelphia, Pennsylvania, USA, March 18-23, 2005; vol 1, pp 41–44. Tur G. Model adaptation for spoken language understanding. In: IEEE International Conference on Acoustics, Speech, and Signal Processing, Philadelphia, Pennsylvania, USA, March 18-23, 2005; vol 1, pp 41–44.
60.
go back to reference Tur G, Hakkani-Tür D, Heck L, Parthasarathy S. Sentence simplification for spoken language understanding, In: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), May 22-27, 2011, Prague Congress Center, Prague, Czech Republic; pp 5628–5631. Tur G, Hakkani-Tür D, Heck L, Parthasarathy S. Sentence simplification for spoken language understanding, In: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), May 22-27, 2011, Prague Congress Center, Prague, Czech Republic; pp 5628–5631.
61.
go back to reference Venkataraman A, Ferrer L, Stolcke A, Shriberg E. Training a prosody-based dialog act tagger from unlabeled data, In: Acoustics, Speech, and Signal Processing, Proceedings (ICASSP’03), IEEE International Conference on, IEEE, Hong Kong, April 6-10, 2003; vol 1, pp 272–275. Venkataraman A, Ferrer L, Stolcke A, Shriberg E. Training a prosody-based dialog act tagger from unlabeled data, In: Acoustics, Speech, and Signal Processing, Proceedings (ICASSP’03), IEEE International Conference on, IEEE, Hong Kong, April 6-10, 2003; vol 1, pp 272–275.
62.
go back to reference Wang P, Song Q, Han H, Cheng J. Sequentially supervised long short-term memory for gesture recognition. In: Cognitive Computation, 2016; Springer; vol-8(5); pp 982–91. Wang P, Song Q, Han H, Cheng J. Sequentially supervised long short-term memory for gesture recognition. In: Cognitive Computation, 2016; Springer; vol-8(5); pp 982–91.
63.
go back to reference Wang Y, Shen Y, Jin H. A bi-model based RNN semantic frame parsing model for intent detection and slot filling, In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT, New Orleans, Louisiana, USA, June 1-6, 2018, Volume 2 (Short Papers), vol 2, pp 309–314. Wang Y, Shen Y, Jin H. A bi-model based RNN semantic frame parsing model for intent detection and slot filling, In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT, New Orleans, Louisiana, USA, June 1-6, 2018, Volume 2 (Short Papers), vol 2, pp 309–314.
64.
go back to reference Wang Z, Lin Z. Optimal feature selection for learning-based algorithms for sentiment classification. In: Cognitive Computation, 2019; Springer; vol-12, pp 238–248. Wang Z, Lin Z. Optimal feature selection for learning-based algorithms for sentiment classification. In: Cognitive Computation, 2019; Springer; vol-12, pp 238–248.
65.
go back to reference Welch BL. The generalization of student’s problem when several different population variances are involved. Biometrika. 1947;34(1/2):28–35.MathSciNetCrossRef Welch BL. The generalization of student’s problem when several different population variances are involved. Biometrika. 1947;34(1/2):28–35.MathSciNetCrossRef
66.
go back to reference Xing C, Wu W, Wu Y, Liu J, Huang Y, Zhou M, et al. Topic aware neural response generation. In: Proceedings of the Thirty-First (AAAI) Conference on Artificial Intelligence, February 4-9, 2017, San Francisco, California, USA; pp 3351–3357. Xing C, Wu W, Wu Y, Liu J, Huang Y, Zhou M, et al. Topic aware neural response generation. In: Proceedings of the Thirty-First (AAAI) Conference on Artificial Intelligence, February 4-9, 2017, San Francisco, California, USA; pp 3351–3357.
67.
go back to reference Xu P, Sarikaya R. Convolutional neural network based triangular CRF for joint intent detection and slot filling, In: IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), Olomouc, Czech Republic, December 8-12, 2013, pp 78–83. Xu P, Sarikaya R. Convolutional neural network based triangular CRF for joint intent detection and slot filling, In: IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), Olomouc, Czech Republic, December 8-12, 2013, pp 78–83.
68.
go back to reference Yang Z, Yang D, Dyer C, He X, Smola A, Hovy E. Hierarchical attention networks for document classification, In: 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, San Diego California, USA, June 12-17, 2016, pp 1480–1489. Yang Z, Yang D, Dyer C, He X, Smola A, Hovy E. Hierarchical attention networks for document classification, In: 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, San Diego California, USA, June 12-17, 2016, pp 1480–1489.
69.
go back to reference Yao K, Zweig G, Hwang MY, Shi Y, Yu D. Recurrent neural networks for language understanding, In: 14th Annual Conference of the International Speech Communication Association (Interspeech), Lyon, France, August 25-29, 2013; pp 2524–2528. Yao K, Zweig G, Hwang MY, Shi Y, Yu D. Recurrent neural networks for language understanding, In: 14th Annual Conference of the International Speech Communication Association (Interspeech), Lyon, France, August 25-29, 2013; pp 2524–2528.
70.
go back to reference Yao K, Peng B, Zhang Y, Yu D, Zweig G, Shi Y. Spoken language understanding using long short-term memory neural networks, In: IEEE Spoken Language Technology Workshop, {SLT} 2014, South Lake Tahoe, NV, USA, December 7-10, 2014; pp 189–194. Yao K, Peng B, Zhang Y, Yu D, Zweig G, Shi Y. Spoken language understanding using long short-term memory neural networks, In: IEEE Spoken Language Technology Workshop, {SLT} 2014, South Lake Tahoe, NV, USA, December 7-10, 2014; pp 189–194.
71.
go back to reference Yao K, Peng B, Zweig G, Yu D, Li X, Gao F. Recurrent conditional random field for language understanding, In: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Florence, Italy, May 4-9, 2014; pp 4077–4081. Yao K, Peng B, Zweig G, Yu D, Li X, Gao F. Recurrent conditional random field for language understanding, In: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Florence, Italy, May 4-9, 2014; pp 4077–4081.
72.
go back to reference Zhang X, Wang H. A joint model of intent determination and slot filling for spoken language understanding, In: Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, (IJCAI), New York, NY, USA, 9-15 July 2016, pp 2993-2999. Zhang X, Wang H. A joint model of intent determination and slot filling for spoken language understanding, In: Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, (IJCAI), New York, NY, USA, 9-15 July 2016, pp 2993-2999.
73.
go back to reference Zhao L, Feng Z. Improving slot filling in spoken language understanding with joint pointer and attention, In: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, {ACL} 2018, Melbourne, Australia, July 15-20, 2018, Volume 2: Short Papers}, pp 426–431. Zhao L, Feng Z. Improving slot filling in spoken language understanding with joint pointer and attention, In: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, {ACL} 2018, Melbourne, Australia, July 15-20, 2018, Volume 2: Short Papers}, pp 426–431.
74.
go back to reference Zhou H, Huang M, Zhang T, Zhu X, Liu B. Emotional chatting machine: emotional conversation generation with internal and external memory, In: Proceedings of the Thirty-Second {AAAI} Conference on Artificial Intelligence, (AAAI-18), the 30th innovative Applications of Artificial Intelligence (IAAI-18), and the 8th {AAAI} Symposium on Educational Advances in Artificial Intelligence (EAAI-18), New Orleans, Louisiana, USA, February 2-7, 2018; pp 730–739. Zhou H, Huang M, Zhang T, Zhu X, Liu B. Emotional chatting machine: emotional conversation generation with internal and external memory, In: Proceedings of the Thirty-Second {AAAI} Conference on Artificial Intelligence, (AAAI-18), the 30th innovative Applications of Artificial Intelligence (IAAI-18), and the 8th {AAAI} Symposium on Educational Advances in Artificial Intelligence (EAAI-18), New Orleans, Louisiana, USA, February 2-7, 2018; pp 730–739.
75.
go back to reference Zhou Y, Hu Q, Liu J, Jia Y. Combining heterogeneous deep neural networks with conditional random fields for Chinese dialogue act recognition. In: Neurocomputing, 2015; Vol - 168; pp 408–17. Zhou Y, Hu Q, Liu J, Jia Y. Combining heterogeneous deep neural networks with conditional random fields for Chinese dialogue act recognition. In: Neurocomputing, 2015; Vol - 168; pp 408–17.
76.
go back to reference Zhu S, Yu K. Encoder-decoder with focus-mechanism for sequence labelling based spoken language understanding, In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing, (ICASSP), New Orleans, LA, USA, March 5-9, 2017, pp 5675–5679. Zhu S, Yu K. Encoder-decoder with focus-mechanism for sequence labelling based spoken language understanding, In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing, (ICASSP), New Orleans, LA, USA, March 5-9, 2017, pp 5675–5679.
Metadata
Title
A Deep Multi-task Model for Dialogue Act Classification, Intent Detection and Slot Filling
Authors
Mauajama Firdaus
Hitesh Golchha
Asif Ekbal
Pushpak Bhattacharyya
Publication date
23-03-2020
Publisher
Springer US
Published in
Cognitive Computation / Issue 3/2021
Print ISSN: 1866-9956
Electronic ISSN: 1866-9964
DOI
https://doi.org/10.1007/s12559-020-09718-4

Other articles of this Issue 3/2021

Cognitive Computation 3/2021 Go to the issue

Premium Partner