
2020 | Original Paper | Book Chapter

Robust Spoken Language Understanding with RL-Based Value Error Recovery

Authors: Chen Liu, Su Zhu, Lu Chen, Kai Yu

Published in: Natural Language Processing and Chinese Computing

Publisher: Springer International Publishing


Abstract

Spoken Language Understanding (SLU) aims to extract structured semantic representations (e.g., slot-value pairs) from speech-recognized texts, which often suffer from Automatic Speech Recognition (ASR) errors. To alleviate this problem, previous work either applies input adaptations to the speech-recognized texts or corrects ASR errors in predicted values by searching for the candidates most similar in pronunciation. However, these two methods have only been applied separately and independently. In this work, we propose a new robust SLU framework that guides SLU input adaptation with a rule-based value error recovery module. The framework consists of a slot tagging model and a rule-based value error recovery module. We pursue an adapted slot tagging model that can extract potential slot-value pairs mentioned in ASR hypotheses and is suitable for the existing value error recovery module. After value error recovery, we obtain a supervision signal (reward) by comparing the refined slot-value pairs with annotations. Since the operations of value error recovery are non-differentiable, we exploit policy-gradient-based Reinforcement Learning (RL) to optimize the SLU model. Extensive experiments on the public CATSLU dataset show the effectiveness of our proposed approach, which improves the robustness of SLU and outperforms the baselines by significant margins.
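The training scheme described in the abstract — sample a tag sequence from the slot tagging model, refine the extracted values with the non-differentiable recovery module, and feed the resulting reward back via policy gradients — can be sketched as follows. This is a minimal illustrative sketch in PyTorch, not the authors' implementation: the tagger architecture, the `reward_fn` interface, and all names here are assumptions.

```python
import torch
import torch.nn as nn


class SlotTagger(nn.Module):
    """Toy BiLSTM-free slot tagger: embeds tokens and scores per-token tags."""

    def __init__(self, vocab_size, num_tags, hidden=64):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, hidden)
        self.rnn = nn.LSTM(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, num_tags)

    def forward(self, tokens):  # tokens: (batch, seq_len)
        h, _ = self.rnn(self.emb(tokens))
        return torch.log_softmax(self.out(h), dim=-1)  # per-token tag log-probs


def reinforce_step(model, optimizer, tokens, reward_fn):
    """One REINFORCE update: the value error recovery step lives inside
    reward_fn and is treated as a black-box environment, so gradients only
    flow through the sampled tag sequence's log-probability."""
    log_probs = model(tokens)
    dist = torch.distributions.Categorical(logits=log_probs)
    tags = dist.sample()            # sample a tag sequence (exploration)
    reward = reward_fn(tags)        # non-differentiable: recover values,
                                    # compare refined slot-value pairs to gold
    loss = -(dist.log_prob(tags).sum() * reward)  # policy gradient surrogate
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return reward
```

In practice the reward would be, e.g., the F-score of the recovered slot-value pairs against the annotations, possibly with a baseline subtracted to reduce variance.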


Footnotes
1. All possible value candidates of each slot are provided in the domain ontology.
2. E.g., the value candidate set for slot address can be all available addresses saved in the database of a dialogue system.
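Searching such a candidate set for the value closest to a (possibly ASR-corrupted) prediction can be illustrated with a toy sketch. The paper's module matches by pronunciation similarity (e.g., over Chinese pinyin); as a simplifying assumption, this sketch uses plain string edit distance as a stand-in, and all function names here are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance via one-row dynamic programming."""
    m, n = len(a), len(b)
    dp = list(range(n + 1))
    for i in range(1, m + 1):
        prev, dp[0] = dp[0], i          # prev holds D[i-1][j-1]
        for j in range(1, n + 1):
            cur = dp[j]                  # D[i-1][j] before overwrite
            dp[j] = min(dp[j] + 1,       # deletion
                        dp[j - 1] + 1,   # insertion
                        prev + (a[i - 1] != b[j - 1]))  # substitution
            prev = cur
    return dp[n]


def recover_value(predicted, candidates):
    """Replace a predicted value with the closest candidate from the
    slot's ontology (the rule-based value error recovery idea)."""
    return min(candidates, key=lambda c: edit_distance(predicted, c))
```

For example, `recover_value("kitten", ["sitting", "kitchen", "mitten"])` picks `"mitten"`, the candidate one edit away.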
 
Metadata
Title: Robust Spoken Language Understanding with RL-Based Value Error Recovery
Authors: Chen Liu, Su Zhu, Lu Chen, Kai Yu
Copyright year: 2020
DOI: https://doi.org/10.1007/978-3-030-60450-9_7
