
2016 | Original Paper | Book Chapter

A Hierarchical LSTM Model for Joint Tasks

Authors: Qianrong Zhou, Liyun Wen, Xiaojie Wang, Long Ma, Yue Wang

Published in: Chinese Computational Linguistics and Natural Language Processing Based on Naturally Annotated Big Data

Publisher: Springer International Publishing

Abstract

Previous work has shown that jointly modeling two Natural Language Processing (NLP) tasks is effective for achieving better performance on both tasks, and many task-specific joint models have been proposed. This paper proposes a Hierarchical Long Short-Term Memory (HLSTM) model, together with several variants, for modeling two tasks jointly. The models are flexible enough to handle different types of task combinations and avoid task-specific feature engineering. Besides exploiting correlation information between the two tasks, our models take the hierarchical relations between them into consideration, which has not been discussed in previous work. Experimental results show that our models outperform strong baselines on three different types of task combinations. While both correlation information and hierarchical relations between the two tasks help to improve performance on both, the models especially boost the performance of the task at the top of the hierarchical structure.
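To make the idea of a hierarchical joint model concrete, the sketch below shows one plausible way to wire up a two-level LSTM for a pair of sequence-labeling tasks (e.g., a lower-level task such as word segmentation feeding a higher-level task such as POS tagging). This is a minimal illustrative sketch in PyTorch, not the authors' implementation; all class names, layer sizes, and the specific way the lower LSTM's hidden states feed the upper LSTM are assumptions made for illustration.

```python
import torch
import torch.nn as nn


class HierarchicalLSTMTagger(nn.Module):
    """Illustrative two-level LSTM for jointly labeling two tasks.

    The lower LSTM reads word embeddings and predicts labels for the
    lower-level task; its hidden states are passed upward so the upper
    LSTM can predict labels for the higher-level task. Names and sizes
    are hypothetical, not taken from the paper.
    """

    def __init__(self, vocab_size, embed_dim, hidden_dim,
                 num_lower_labels, num_upper_labels):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lower_lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.upper_lstm = nn.LSTM(hidden_dim, hidden_dim, batch_first=True)
        self.lower_out = nn.Linear(hidden_dim, num_lower_labels)
        self.upper_out = nn.Linear(hidden_dim, num_upper_labels)

    def forward(self, token_ids):
        x = self.embed(token_ids)              # (batch, seq, embed_dim)
        lower_h, _ = self.lower_lstm(x)        # lower-task representations
        upper_h, _ = self.upper_lstm(lower_h)  # upper task sees lower states
        return self.lower_out(lower_h), self.upper_out(upper_h)


# Joint training on a toy batch: summing the two cross-entropy losses lets
# both tasks update the shared embedding and lower-LSTM parameters.
model = HierarchicalLSTMTagger(vocab_size=5000, embed_dim=100, hidden_dim=128,
                               num_lower_labels=4, num_upper_labels=30)
tokens = torch.randint(0, 5000, (2, 10))       # 2 sentences of 10 tokens
lower_gold = torch.randint(0, 4, (2, 10))
upper_gold = torch.randint(0, 30, (2, 10))
lower_logits, upper_logits = model(tokens)
loss = (nn.functional.cross_entropy(lower_logits.reshape(-1, 4), lower_gold.reshape(-1))
        + nn.functional.cross_entropy(upper_logits.reshape(-1, 30), upper_gold.reshape(-1)))
loss.backward()
```

The stacked arrangement is what gives the upper task direct access to the lower task's learned representations, which is one way the hierarchical relation between the two tasks can be exploited.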


Footnotes
2
Sentences with ‘request’ intent are not included, since those sentences contain no slot values.
Metadata
Title
A Hierarchical LSTM Model for Joint Tasks
Authors
Qianrong Zhou
Liyun Wen
Xiaojie Wang
Long Ma
Yue Wang
Copyright Year
2016
DOI
https://doi.org/10.1007/978-3-319-47674-2_27