2018 | OriginalPaper | Chapter

Using Explicit Semantic Analysis and Word2Vec in Measuring Semantic Relatedness of Russian Paraphrases

Authors: Anna Kriukova, Olga Mitrofanova, Kirill Sukharev, Natalia Roschina

Published in: Digital Transformation and Global Society

Publisher: Springer International Publishing

Abstract

In this study we compare two semantic relatedness algorithms, namely Explicit Semantic Analysis (ESA) and Word2Vec. ESA represents text meaning in a high-dimensional space of concepts derived from Wikipedia. Word2Vec generates distributed vector representations from large text corpora. Experiments were carried out on the Russian paraphrase corpus of news titles and the Russian ParaPlag paraphrase corpus. The paper contains a thorough analysis of the results and the evaluation procedure.
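As a rough illustration of the ESA idea described in the abstract, the sketch below maps texts into a space of concepts and compares them by cosine similarity. It is a minimal toy under stated assumptions, not the authors' implementation: the three-entry `CONCEPTS` dictionary stands in for Wikipedia, the English tokens stand in for lemmatized Russian text, and all function names are hypothetical.

```python
import math
from collections import Counter

# Toy stand-in for Wikipedia: concept title -> article text (hypothetical data).
CONCEPTS = {
    "Economy": "bank money market price bank inflation",
    "Sport": "team match goal player match coach",
    "Politics": "election president parliament vote law",
}

def build_term_index(concepts):
    """Map each term to a {concept: TF-IDF weight} vector."""
    n = len(concepts)
    tf = {name: Counter(text.split()) for name, text in concepts.items()}
    # Document frequency: in how many concept articles each term occurs.
    df = Counter(term for counts in tf.values() for term in counts)
    index = {}
    for name, counts in tf.items():
        for term, count in counts.items():
            idf = math.log(n / df[term]) + 1.0  # +1 keeps widely shared terms non-zero
            index.setdefault(term, {})[name] = count * idf
    return index

def text_vector(text, index):
    """Sum the concept vectors of all known terms in the text."""
    vec = Counter()
    for term in text.lower().split():
        for concept, weight in index.get(term, {}).items():
            vec[concept] += weight
    return vec

def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

INDEX = build_term_index(CONCEPTS)

def relatedness(a, b):
    """ESA-style relatedness: cosine between concept-space vectors."""
    return cosine(text_vector(a, INDEX), text_vector(b, INDEX))

if __name__ == "__main__":
    print(relatedness("bank raises price", "market fears inflation"))
    print(relatedness("bank raises price", "team wins match"))
```

A Word2Vec-based measure could reuse the same `cosine` step, with `text_vector` replaced by, for example, an average of pretrained word vectors. That averaging scheme is an assumption made here for illustration; the abstract does not say how the authors combine word vectors.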

Metadata
Title: Using Explicit Semantic Analysis and Word2Vec in Measuring Semantic Relatedness of Russian Paraphrases
Authors: Anna Kriukova, Olga Mitrofanova, Kirill Sukharev, Natalia Roschina
Copyright Year: 2018
DOI: https://doi.org/10.1007/978-3-030-02846-6_28
