2017 | OriginalPaper | Chapter

On Multilingual Training of Neural Dependency Parsers

Authors: Michał Zapotoczny, Paweł Rychlikowski, Jan Chorowski

Published in: Text, Speech, and Dialogue

Publisher: Springer International Publishing

Abstract

We show that a recently proposed neural dependency parser can be improved by joint training on multiple languages from the same family. The parser is implemented as a deep neural network whose only input is the orthographic representation of words. To parse successfully, the network has to discover how linguistically relevant concepts can be inferred from word spellings. We analyze the representations of characters and words learned by the network to establish which properties of the languages it has captured. In particular, we show that the parser has approximately learned to associate Latin characters with their Cyrillic counterparts and that it can group Polish and Russian words that have a similar grammatical function. Finally, we evaluate the parser on selected languages from the Universal Dependencies dataset and show that it is competitive with other recently proposed state-of-the-art methods, while having a simple structure.
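To make the idea of purely orthographic input concrete, the following is a minimal sketch, not the authors' implementation. It assumes a toy two-word list and hypothetical helper names (`build_char_vocab`, `encode`, `script_of`). It illustrates that a Polish word and its Russian counterpart share no input symbols, because Latin and Cyrillic letters have separate Unicode code points, so any association between the two alphabets has to be learned by the model rather than read off the input.

```python
import unicodedata


def build_char_vocab(words):
    """Map every distinct character to an integer id, in first-seen order."""
    vocab = {}
    for word in words:
        for ch in word:
            vocab.setdefault(ch, len(vocab))
    return vocab


def encode(word, vocab):
    """Turn a word into the list of character ids a character-level model would read."""
    return [vocab[ch] for ch in word]


def script_of(ch):
    """Return the first token of the character's Unicode name, e.g. 'LATIN' or 'CYRILLIC'."""
    return unicodedata.name(ch).split()[0]


# Hypothetical toy data: Polish "mama" and its Russian counterpart,
# written with escapes so the Cyrillic code points are explicit
# (U+043C CYRILLIC SMALL LETTER EM, U+0430 CYRILLIC SMALL LETTER A).
polish_word = "mama"
russian_word = "\u043c\u0430\u043c\u0430"

vocab = build_char_vocab([polish_word, russian_word])
polish_ids = encode(polish_word, vocab)
russian_ids = encode(russian_word, vocab)

# The two words look alike to a human reader but use disjoint symbol sets.
shared_ids = set(polish_ids) & set(russian_ids)
```

In this sketch `shared_ids` is empty: the Latin "a" and the Cyrillic "а" receive different ids, so a jointly trained model sees them as unrelated symbols until training brings their representations together.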

Footnotes
1. However, the experiments use the UD 1.3 dataset, which does not include Belarusian and Ukrainian.
2. Conveniently, Unicode has separate code points for Latin and Cyrillic letters.
Metadata
Title: On Multilingual Training of Neural Dependency Parsers
Authors: Michał Zapotoczny, Paweł Rychlikowski, Jan Chorowski
Copyright Year: 2017
DOI: https://doi.org/10.1007/978-3-319-64206-2_37
