
2018 | Original Paper | Book Chapter

Cross-Tagset Parsing Evaluation for Russian

Authors: Kira Droganova, Olga Lyashevskaya

Published in: Digital Transformation and Global Society

Publisher: Springer International Publishing


Abstract

Cross-tagset parsing is based on the substitution of one annotation layer for another while processing data within one language. As often as not, either the native tagger or the dependency parser used in the (pre-)annotation of the Gold treebank is not available. The cross-tagset approach allows one to annotate new texts using freely available tools or tools optimized for the user’s needs. We evaluate the robustness of Russian dependency parsing using different morphological and syntactic tagsets in input and output. Qualitative analysis of errors shows that the cross-substitution of three morphological tagsets and two syntactic tagsets causes only a mild drop in performance.
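To make the notion of a "drop in performance" concrete, the following minimal sketch compares a parse produced from native-tagset input with one produced from substituted-tagset input. It assumes the standard unlabeled and labeled attachment scores (UAS, LAS) as the measure; the abstract does not state which metric the authors use. The function name `attachment_scores`, the toy sentence, and its labels are hypothetical and are not data from the chapter.

```python
from typing import List, Tuple

# One token's analysis: (index of its head, dependency label).
Arc = Tuple[int, str]


def attachment_scores(gold: List[Arc], pred: List[Arc]) -> Tuple[float, float]:
    """Return (UAS, LAS) for one predicted analysis against the gold one.

    UAS counts tokens whose head is correct; LAS additionally requires
    the dependency label to match.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and predicted analyses must cover the same tokens")
    if not gold:
        raise ValueError("cannot score an empty analysis")
    n = len(gold)
    uas_hits = sum(1 for (g_head, _), (p_head, _) in zip(gold, pred) if g_head == p_head)
    las_hits = sum(1 for g, p in zip(gold, pred) if g == p)
    return uas_hits / n, las_hits / n


# Hypothetical four-token sentence (illustrative values only).
gold = [(2, "nsubj"), (0, "root"), (2, "obj"), (3, "amod")]
# Parser output when the input carries the treebank's native morphological tags.
native = [(2, "nsubj"), (0, "root"), (2, "obj"), (3, "amod")]
# Parser output when the input tags come from a different tagset.
substituted = [(2, "nsubj"), (0, "root"), (2, "obl"), (2, "amod")]

uas_native, las_native = attachment_scores(gold, native)
uas_sub, las_sub = attachment_scores(gold, substituted)
print(f"UAS drop: {uas_native - uas_sub:.2f}, LAS drop: {las_native - las_sub:.2f}")
```

In a real comparison the scores would be aggregated over all tokens of a held-out test set, once per combination of input and output tagset.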


Metadata
Title
Cross-Tagset Parsing Evaluation for Russian
Authors
Kira Droganova
Olga Lyashevskaya
Copyright Year
2018
DOI
https://doi.org/10.1007/978-3-030-02846-6_31