Published in: Cluster Computing 4/2016

1 December 2016

A novel density-based clustering method using word embedding features for dialogue intention recognition

Authors: Jungsun Jang, Yeonsoo Lee, Seolhwa Lee, Dongwon Shin, Dongjun Kim, Haechang Rim

Abstract

In dialogue systems, understanding user utterances is crucial for providing appropriate responses. Various classification models have been proposed to deal with natural language understanding tasks related to user intention analysis, such as dialogue acts or emotion recognition. However, models that use original lexical features without any modifications encounter the problem of data sparseness, and constructing sufficient training data to overcome this problem is labor-intensive, time-consuming, and expensive. To address this issue, word embedding models that can learn lexical synonyms using vast raw corpora have recently been proposed. However, the analysis of embedding features is not yet sufficient to validate the efficiency of such models. Specifically, using the cosine similarity score as a feature in the embedding space neglects the skewed nature of the word frequency distribution, which can affect the improvement of model performance. This paper describes a novel density-based clustering method that efficiently integrates word embedding vectors into dialogue intention recognition. Experimental results show that our proposed model helps overcome the data sparseness problem seen in previous classification models and can assist in improving the classification performance.
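To make the general idea of density-based clustering over word embeddings concrete, the following is a minimal sketch only. It uses plain textbook DBSCAN with cosine distance, not the novel method the paper proposes, whose details are not given in the abstract. The toy three-dimensional vectors, the names `toy_embeddings`, `cosine_distance`, `region_query`, and `dbscan`, and the parameter values `eps=0.05` and `min_pts=2` are all illustrative assumptions.

```python
import math

NOISE = -1  # label for points that belong to no dense region


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def region_query(vectors, i, eps):
    """Indices of all points within cosine distance eps of point i (including i)."""
    return [j for j in range(len(vectors))
            if cosine_distance(vectors[i], vectors[j]) <= eps]


def dbscan(vectors, eps, min_pts):
    """Textbook DBSCAN: returns one cluster label per vector, NOISE for outliers."""
    labels = [None] * len(vectors)
    cluster_id = 0
    for i in range(len(vectors)):
        if labels[i] is not None:
            continue  # already assigned to a cluster or marked as noise
        neighbors = region_query(vectors, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        labels[i] = cluster_id
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point: joins but does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            j_neighbors = region_query(vectors, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # core point: keep growing the cluster
        cluster_id += 1
    return labels


# Hypothetical toy embeddings; real ones would come from a trained model.
toy_embeddings = {
    "hello": [0.9, 0.1, 0.0],
    "hi": [0.85, 0.15, 0.05],
    "hey": [0.8, 0.2, 0.0],
    "thanks": [0.1, 0.9, 0.1],
    "thank_you": [0.05, 0.95, 0.05],
    "cheers": [0.15, 0.85, 0.1],
    "umbrella": [0.0, 0.1, 0.95],
}

words = list(toy_embeddings)
labels = dbscan([toy_embeddings[w] for w in words], eps=0.05, min_pts=2)
word_to_cluster = dict(zip(words, labels))
print(word_to_cluster)
```

In this toy setting the greeting words share one cluster label, the thanking words share another, and "umbrella" is left as noise. One plausible use, stated here as an assumption rather than as the authors' design, is to feed such cluster labels to a classifier as shared features so that rare words inherit evidence from their denser neighbors.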

Metadata
Title
A novel density-based clustering method using word embedding features for dialogue intention recognition
Authors
Jungsun Jang
Yeonsoo Lee
Seolhwa Lee
Dongwon Shin
Dongjun Kim
Haechang Rim
Publication date
1 December 2016
Publisher
Springer US
Published in
Cluster Computing / Issue 4/2016
Print ISSN: 1386-7857
Electronic ISSN: 1573-7543
DOI
https://doi.org/10.1007/s10586-016-0649-7
