
2016 | OriginalPaper | Book chapter

Identifying User Intents in Vietnamese Spoken Language Commands and Its Application in Smart Mobile Voice Interaction

Authors: Thi-Lan Ngo, Van-Hop Nguyen, Thi-Hai-Yen Vuong, Thac-Thong Nguyen, Thi-Thua Nguyen, Bao-Son Pham, Xuan-Hieu Phan

Published in: Intelligent Information and Database Systems

Publisher: Springer Berlin Heidelberg


Abstract

This paper presents a lightweight machine learning model and a fast conjunction matching method for identifying the user intents behind spoken text commands. The model and method were integrated into a mobile virtual assistant for Vietnamese (VAV) so that it can understand what users want to carry out on their smartphones. In the scope of this work, a user intent is an action associated with a particular mobile application. Given an input spoken command, a classifier identifies the application, and a flexible conjunction matching algorithm determines the action. Both the classifier and the conjunction matcher are compact enough to be stored and executed directly on mobile devices. To evaluate them, we annotated a medium-sized data set and conducted experiments under different settings, achieving high accuracy for both application and action identification.
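
The two-stage idea in the abstract (first identify the application, then determine the action) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract names neither the classifier type nor the rule format, so a small multinomial Naive Bayes scorer stands in for the classifier, and "conjunction matching" is read here as "an action matches when all of its keywords occur in the command". The toy English commands, labels, rule sets, and function names are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

# Hypothetical toy training data: (command text, application label).
TRAIN = [
    ("call mom now", "phone"),
    ("call the office", "phone"),
    ("send a message to nam", "sms"),
    ("text lan that i am late", "sms"),
    ("set an alarm for six", "alarm"),
    ("turn off the alarm", "alarm"),
]

# Hypothetical action rules per application. A rule is a conjunction:
# it matches only if ALL of its keywords appear in the command.
# Rules are tried in order, so more specific rules come first.
ACTION_RULES = {
    "alarm": [("cancel_alarm", {"turn", "off"}), ("set_alarm", {"set"})],
    "phone": [("dial", {"call"})],
    "sms": [("compose", {"send", "message"}), ("compose", {"text"})],
}


def train(examples):
    """Collect per-label word counts for a multinomial Naive Bayes scorer."""
    word_counts, label_counts, vocab = {}, Counter(), set()
    for text, label in examples:
        tokens = text.split()
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokens)
        vocab.update(tokens)
    return word_counts, label_counts, vocab


def classify(text, model):
    """Return the application label with the highest log-probability."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        score = math.log(label_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in text.split():
            if tok in vocab:  # unseen words are ignored
                # Add-one smoothing avoids log(0) for words unseen in a label.
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def match_action(text, app):
    """Return the first action whose keyword set is contained in the command."""
    tokens = set(text.split())
    for action, keywords in ACTION_RULES.get(app, []):
        if keywords <= tokens:
            return action
    return None


def identify_intent(text, model):
    """Stage 1: classify the application. Stage 2: match the action."""
    app = classify(text, model)
    return app, match_action(text, app)


MODEL = train(TRAIN)
print(identify_intent("turn off the alarm", MODEL))
print(identify_intent("please call mom", MODEL))
```

Splitting the task this way keeps the learned component small (one label per application), while the per-application rule sets stay easy to inspect and extend, which is in the spirit of the compact, on-device design the abstract describes.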

Metadata
Title
Identifying User Intents in Vietnamese Spoken Language Commands and Its Application in Smart Mobile Voice Interaction
Authors
Thi-Lan Ngo
Van-Hop Nguyen
Thi-Hai-Yen Vuong
Thac-Thong Nguyen
Thi-Thua Nguyen
Bao-Son Pham
Xuan-Hieu Phan
Copyright year
2016
Publisher
Springer Berlin Heidelberg
DOI
https://doi.org/10.1007/978-3-662-49381-6_19
