
2018 | Original Paper | Book Chapter

Shortcut Sequence Tagging

Authors: Huijia Wu, Jiajun Zhang, Chengqing Zong

Published in: Natural Language Processing and Chinese Computing

Publisher: Springer International Publishing


Abstract

Deep stacked RNNs are usually hard to train. Recent studies have shown that shortcut connections across different RNN layers bring substantially faster convergence. However, shortcuts increase the computational complexity of the recurrent computations. To reduce this complexity, we propose the shortcut block, a refinement of the shortcut LSTM block. Our approach replaces the self-connected parts (\(c_t^l\)) of the internal states with shortcuts (\(h_t^{l-2}\)). We present extensive empirical experiments showing that this design performs better than the original shortcuts. We evaluate our method on the CCG supertagging task, obtaining an 8% relative improvement over current state-of-the-art results.
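The abstract's core idea — feeding the hidden state of layer \(l-2\) into the cell-state update of layer \(l\) in place of the self-connected \(c_{t-1}^l\) — can be illustrated with a minimal sketch. The exact gating equations are not given in the abstract, so the formulation below (a standard LSTM step whose forget gate multiplies the shortcut \(h_t^{l-2}\) rather than the previous cell state) is an assumed reading, and all names and dimensions are hypothetical:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def shortcut_lstm_step(x, h_prev, h_skip, params):
    """One step of a hypothetical shortcut block at layer l.

    x      : input from layer l-1 at time t
    h_prev : hidden state of this layer at time t-1
    h_skip : hidden state of layer l-2 at time t (the shortcut h_t^{l-2})
    """
    Wi, Wf, Wo, Wg = params          # each maps concat([x, h_prev]) -> hidden dim
    z = np.concatenate([x, h_prev])
    i = sigmoid(Wi @ z)              # input gate
    f = sigmoid(Wf @ z)              # forget gate
    o = sigmoid(Wo @ z)              # output gate
    g = np.tanh(Wg @ z)              # candidate activation
    # Shortcut block (assumed formulation): the self-connected term
    # f * c_{t-1}^l of a plain LSTM is replaced by f * h_t^{l-2}.
    c = f * h_skip + i * g
    h = o * np.tanh(c)
    return h, c

# Tiny usage example with random weights (hypothetical dimensions).
rng = np.random.default_rng(0)
d_in, d_h = 4, 3
params = [rng.standard_normal((d_h, d_in + d_h)) * 0.1 for _ in range(4)]
x = rng.standard_normal(d_in)
h_prev = np.zeros(d_h)
h_skip = rng.standard_normal(d_h)    # stands in for h_t^{l-2}
h, c = shortcut_lstm_step(x, h_prev, h_skip, params)
print(h.shape)  # (3,)
```

Compared with a plain stacked LSTM, the only change is the single line computing `c`; everything else is the standard gating, which is what makes the refinement cheap relative to full shortcut connections.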


Metadata
Title
Shortcut Sequence Tagging
Authors
Huijia Wu
Jiajun Zhang
Chengqing Zong
Copyright Year
2018
DOI
https://doi.org/10.1007/978-3-319-73618-1_17