
2021 | Original Paper | Book Chapter

Learning to Rank Intents in Voice Assistants

Authors: Raviteja Anantha, Srinivas Chappidi, William Dawoodi

Published in: Conversational Dialogue Systems for the Next Decade

Publisher: Springer Singapore


Abstract

Voice assistants aim to fulfill user requests by choosing the best intent from multiple options generated by their Automated Speech Recognition and Natural Language Understanding sub-systems. However, voice assistants do not always produce the expected results. This can happen because voice assistants choose from ambiguous intents; user-specific or domain-specific contextual information reduces the ambiguity of the user request. Additionally, the user information-state can be leveraged to understand how relevant or executable a specific intent is for a user request. In this work, we propose a novel energy-based model for the intent ranking task, in which we learn an affinity metric and model the trade-off between the meaning extracted from speech utterances and the relevance/executability aspects of the intent. Furthermore, we present a multisource denoising autoencoder based pretraining that is capable of learning fused representations of data from multiple sources. We empirically show that our approach outperforms existing state-of-the-art methods by reducing the error rate by 3.8%, which in turn reduces ambiguity and eliminates undesired dead-ends, leading to a better user experience. Finally, we evaluate the robustness of our algorithm on the intent ranking task and show that it improves robustness by 33.3%.
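The abstract describes two technical ideas: a multisource denoising autoencoder that learns a fused representation from several data sources, and an energy-based model that scores how well each candidate intent fits the utterance and the user's information-state. The sketch below is an illustration only, not the authors' implementation: the class names, feature dimensions, Gaussian corruption, and margin-based ranking loss are assumptions made here for exposition; the chapter defines its own energy function, fusion architecture, and training objective.

```python
# Illustrative sketch only; all architectural choices here are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultisourceDenoisingAutoencoder(nn.Module):
    """Fuses features from several sources into one representation by
    reconstructing the clean concatenated input from a noise-corrupted copy."""

    def __init__(self, source_dims, hidden_dim=128, noise_std=0.1):
        super().__init__()
        self.noise_std = noise_std
        self.encoder = nn.Linear(sum(source_dims), hidden_dim)
        self.decoder = nn.Linear(hidden_dim, sum(source_dims))

    def forward(self, sources):
        x = torch.cat(sources, dim=-1)            # concatenate all sources
        noisy = x + self.noise_std * torch.randn_like(x)
        fused = torch.tanh(self.encoder(noisy))   # fused representation
        recon = self.decoder(fused)
        recon_loss = F.mse_loss(recon, x)         # denoising objective
        return fused, recon_loss


class EnergyIntentRanker(nn.Module):
    """Scores (utterance, intent) pairs with a learned energy; lower energy
    means a more relevant / more executable intent for the request."""

    def __init__(self, utt_dim, intent_dim, hidden_dim=128):
        super().__init__()
        self.scorer = nn.Sequential(
            nn.Linear(utt_dim + intent_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 1),
        )

    def energy(self, utt, intent):
        return self.scorer(torch.cat([utt, intent], dim=-1)).squeeze(-1)

    def ranking_loss(self, utt, pos_intent, neg_intent, margin=1.0):
        # Push the correct intent's energy below an incorrect one's by a margin.
        e_pos = self.energy(utt, pos_intent)
        e_neg = self.energy(utt, neg_intent)
        return F.relu(margin + e_pos - e_neg).mean()
```

Under these assumptions, a training loop would pass each source's features through the autoencoder, use the fused vector as the utterance/user representation, and minimize ranking_loss plus a weighted recon_loss; candidate intents would then be ranked by ascending energy at inference time.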


Metadata
Title
Learning to Rank Intents in Voice Assistants
Authors
Raviteja Anantha
Srinivas Chappidi
William Dawoodi
Copyright Year
2021
Publisher
Springer Singapore
DOI
https://doi.org/10.1007/978-981-15-8395-7_7
