2020 | OriginalPaper | Chapter

So What’s the Plan? Mining Strategic Planning Documents

Authors: Ekaterina Artemova, Tatiana Batura, Anna Golenkovskaya, Vitaly Ivanin, Vladimir Ivanov, Veronika Sarkisyan, Ivan Smurov, Elena Tutubalina

Published in: Digital Transformation and Global Society

Publisher: Springer International Publishing

Abstract

In this paper, we present RuREBus, a corpus of Russian strategic planning documents. The project is motivated by both language technology and e-government perspectives: it develops new language resources and tools and applies them to e-government research.
We demonstrate a pipeline for creating a text corpus from scratch. First, the annotation schema is designed. Next, texts are marked up using a human-in-the-loop strategy: preliminary annotations are produced by a machine learning model and then manually corrected.
The amount of annotated text is large enough to showcase the insights that can be gained from RuREBus.
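
The human-in-the-loop step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the lexicon lookup below only stands in for a trained model, and the entity labels (ECO, SOC, ACT), the example sentence, and the function names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    start: int  # character offset, inclusive
    end: int    # character offset, exclusive
    label: str  # entity label (hypothetical label set)


def pre_annotate(text, lexicon):
    """Stand-in for a trained model: tag every lexicon phrase found in the text."""
    spans = []
    lowered = text.lower()
    for phrase, label in lexicon.items():
        needle = phrase.lower()
        pos = lowered.find(needle)
        while pos != -1:
            spans.append(Span(pos, pos + len(needle), label))
            pos = lowered.find(needle, pos + 1)
    return sorted(spans, key=lambda s: (s.start, s.end))


def apply_corrections(spans, rejected, added):
    """Human pass: drop rejected model spans, then add spans the model missed."""
    kept = [s for s in spans if s not in rejected]
    return sorted(set(kept) | set(added), key=lambda s: (s.start, s.end))


text = "The region plans to increase investment and reduce unemployment."
# Hypothetical lexicon; a real pipeline would use model predictions instead.
lexicon = {"investment": "ECO", "unemployment": "SOC", "region": "ECO"}

# Step 1: the "model" proposes preliminary annotations.
draft = pre_annotate(text, lexicon)

# Step 2: the annotator rejects one wrong span and adds one missed span.
wrong = [s for s in draft if text[s.start:s.end] == "region"]
start = text.index("increase")
missed = [Span(start, start + len("increase"), "ACT")]
final = apply_corrections(draft, rejected=wrong, added=missed)

for span in final:
    print(span.label, text[span.start:span.end])
```

In a setup like this, the corrected spans would normally be fed back as training data so that the next round of preliminary annotations needs fewer manual fixes.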

Metadata
Copyright Year
2020
DOI
https://doi.org/10.1007/978-3-030-65218-0_16