2017 | OriginalPaper | Book chapter

Russian Tagging and Dependency Parsing Models for Stanford CoreNLP Natural Language Toolkit

Written by: Liubov Kovriguina, Ivan Shilin, Alexander Shipilo, Alina Putintseva

Published in: Knowledge Engineering and Semantic Web

Publisher: Springer International Publishing

Abstract

This paper describes the implementation of a maximum entropy tagging model and a neural network dependency parser model for Russian in the Stanford CoreNLP toolkit, an extensible pipeline that provides core natural language analysis. Russian is a morphologically rich language and requires full morphological analysis: input texts must be annotated with POS tags, morphological features, and lemmas. By contrast, for languages that do not inflect for case, person, and similar categories, stemming and POS tagging give enough information about the grammatical behavior of a word form. In Russian, rich morphology is accompanied by free word order, which adds indeterminacy to head-finding rules in parsing procedures. We describe the training data, the linguistic features used to train the classifiers, and the training and evaluation of the tagging and parsing models.
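To make the point about morphology and free word order concrete, here is a minimal Python sketch. It is not taken from the paper: the `Token` structure, the helper names, the simplified feature strings, and the Universal Dependencies style labels (`nsubj`, `obj`, `root`) are assumptions chosen for illustration, and the paper's actual tagset and data format may differ. The sketch annotates the same three words in two different orders and checks that the dependency relations are identical, so that case marking rather than position identifies subject and object.

```python
from typing import Dict, List, NamedTuple, Set, Tuple


class Token(NamedTuple):
    """One annotated word form (hypothetical structure, for illustration only)."""
    idx: int      # 1-based position in the sentence
    form: str     # surface word form
    lemma: str    # dictionary form
    upos: str     # part-of-speech tag
    feats: str    # morphological features as "Key=Value|Key=Value", or "_"
    head: int     # position of the syntactic head, 0 for the root
    deprel: str   # dependency relation to the head


def parse_feats(feats: str) -> Dict[str, str]:
    """Split a feature string such as 'Case=Acc|Number=Sing' into a dict."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def relations(sentence: List[Token]) -> Set[Tuple[str, str, str]]:
    """Return order-independent (dependent lemma, relation, head lemma) triples."""
    by_idx = {t.idx: t for t in sentence}
    return {
        (t.lemma, t.deprel, by_idx[t.head].lemma if t.head else "ROOT")
        for t in sentence
    }


# "Мама мыла раму" (subject, verb, object); feature sets are simplified.
SVO = [
    Token(1, "Мама", "мама", "NOUN", "Case=Nom|Gender=Fem|Number=Sing", 2, "nsubj"),
    Token(2, "мыла", "мыть", "VERB", "Gender=Fem|Number=Sing|Tense=Past", 0, "root"),
    Token(3, "раму", "рама", "NOUN", "Case=Acc|Gender=Fem|Number=Sing", 2, "obj"),
]

# "Раму мыла мама" (object, verb, subject): same words, reversed order.
OVS = [
    Token(1, "Раму", "рама", "NOUN", "Case=Acc|Gender=Fem|Number=Sing", 2, "obj"),
    Token(2, "мыла", "мыть", "VERB", "Gender=Fem|Number=Sing|Tense=Past", 0, "root"),
    Token(3, "мама", "мама", "NOUN", "Case=Nom|Gender=Fem|Number=Sing", 2, "nsubj"),
]

if __name__ == "__main__":
    # The grammatical role follows the Case feature, not the word position.
    for tok in OVS:
        print(tok.form, tok.deprel, parse_feats(tok.feats).get("Case"))
    print(relations(SVO) == relations(OVS))
```

A positional head-finding rule such as "the noun before the verb is the subject" would label the second sentence incorrectly, which is why the sketch keys the comparison on lemmas and relations rather than on token positions.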

Metadata
Title
Russian Tagging and Dependency Parsing Models for Stanford CoreNLP Natural Language Toolkit
Written by
Liubov Kovriguina
Ivan Shilin
Alexander Shipilo
Alina Putintseva
Copyright year
2017
DOI
https://doi.org/10.1007/978-3-319-69548-8_8