2018 | Book

Digital Transformation and Global Society

Third International Conference, DTGS 2018, St. Petersburg, Russia, May 30 – June 2, 2018, Revised Selected Papers, Part II

Editors: Daniel A. Alexandrov, Alexander V. Boukhanovsky, Andrei V. Chugunov, Yury Kabanov, Olessia Koltsova

Publisher: Springer International Publishing

Book Series: Communications in Computer and Information Science

About this book

This two-volume set (CCIS 858 and CCIS 859) constitutes the refereed proceedings of the Third International Conference on Digital Transformation and Global Society, DTGS 2018, held in St. Petersburg, Russia, in May/June 2018.

The 75 revised full papers and the one short paper presented in the two volumes were carefully reviewed and selected from 222 submissions. The papers are organized in topical sections on e-polity: smart governance and e-participation, politics and activism in the cyberspace, law and regulation; e-city: smart cities and urban planning; e-economy: IT and new markets; e-society: social informatics, digital divides; e-communication: discussions and perceptions on the social media; e-humanities: arts and culture; International Workshop on Internet Psychology; International Workshop on Computational Linguistics.

Table of Contents

Frontmatter

E-Society: Digital Divides

Frontmatter
The Winner Takes IT All: Swedish Digital Divides in Global Internet Usage

In the present study, we examined the influence of personality factors and demographic factors on Internet usage. Personality was defined using the Five Factor Model in terms of Openness, Conscientiousness, Extraversion, Agreeableness, and Neuroticism, while demographic factors were defined as gender, age and socioeconomic status (e.g. income and educational attainment). The results from a large, representative Swedish sample (N = 1,694) show that global Internet usage can be explained by a high degree of Extraversion, young age and high socioeconomic status. Our findings are consistent with some previous studies but contrast with others. We discuss the contrasting results in terms of different study designs, cultures and time periods of Internet development. The results are discussed in terms of the “rich get richer” model and digital divides, and of the broader implications our findings might have for society. The study may improve our understanding of future challenges in Internet design.

John Magnus Roos
The Relationship of ICT with Human Capital Formation in Rural and Urban Areas of Russia

In this article the authors attempt to empirically substantiate the link between information and communication technologies and the accumulation of human capital in the cities and rural areas of Russia. To this end, the Cobb-Douglas model was applied. As a result, four statistically significant models were obtained, in which two indicators served as the dependent variables: the number of personal computers in organizations per 100 workers and the number of personal computers with an Internet connection in organizations per 100 workers. The explanatory variables were human capital, measured as the average number of years of education per employed person in the region, the average monthly wage in the region, and the share of urban population in the region. The average level of education is shown to have a positive effect on both indicators, and this effect decreases over time. The influence of the average monthly wage is also positive. The results confirm the assumption that in cities, where there is a higher concentration of human capital, higher population density and higher wages, information and communication technologies are introduced into the production processes of organizations more intensively than in rural areas. A higher level of wages among those employed in the region’s economy also acts as an incentive for organizations to use ICT more actively.

Anna Aletdinova, Alexey Koritsky
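
As an illustration of the kind of specification the abstract above describes, a Cobb-Douglas relationship can be written in log-linear form and estimated by ordinary least squares. The exact equation used by the authors is not given on this page. The symbols below are assumptions made for this sketch: PC_i is the number of personal computers per 100 workers in region i, H_i the average years of education, W_i the average monthly wage, U_i the share of urban population, and the error term is also assumed.

```latex
% Hypothetical Cobb-Douglas form; all symbols are illustrative assumptions
PC_i = A \, H_i^{\alpha} \, W_i^{\beta} \, U_i^{\gamma} \, e^{\varepsilon_i}
\quad\Longrightarrow\quad
\ln PC_i = \ln A + \alpha \ln H_i + \beta \ln W_i + \gamma \ln U_i + \varepsilon_i
```

In this form each exponent reads as an elasticity, so a positive estimate of \alpha would correspond to the reported positive effect of education.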
Toward an Inclusive Digital Information Access: Full Keyboard Access & Direct Navigation

The law prohibits discrimination against people with special needs. Accessibility has become a legal obligation for the State, which must ensure equal opportunities for access to services and knowledge. Many people have difficulty accessing graphical interfaces or controlling the mouse. To promote a high degree of web usability, W3C guidelines emphasize the need to allow the user to interact with web pages not only through a pointing device but through the keyboard as well. Since their introduction, access key implementations have been criticized. This article gives an overview of the drawbacks of access keys and presents perspectives on how to support web app interaction through a keyboard.

Sami Rojbi, Anis Rojbi, Mohamed Salah Gouider

E-Communication: Discussions and Perceptions on the Social Media

Frontmatter
Social Network Sites as Digital Heterotopias: Textual Content and Speech Behavior Perception

This study draws on M. Foucault’s concept of heterotopia and the anthropocentric paradigm of semiosociopsychology introduced by T.M. Dridze. The goal of the pilot research described in this contribution is to examine how people not involved in communication on social network sites (SNS) are influenced by it, what kind of emotional effect SNS discussions produce on bystanders, and what the grounds of this effect are. We analyzed subjects’ (N = 7) emotional shift towards negative and positive emotional reactions in response to discussions of hot topics in domestic and international politics (n = 20). The discussions, in Russian, took place on the “VKontakte” social network platform. The research used an experiment with a three-condition (current emotional state of subjects; subjects’ gender; type of stimulus) between-subjects design. The findings suggest that the negative emotional reactions of uninvolved participants resemble those of people who take an active part in SNS communication, and that discussions of hot topics in domestic and international politics provoke a variety of emotions. The textual content of discussions was mentioned as the main ground for subjects’ emotional reactions. No gender differences in the perception of communicants’ speech behavior or of the textual content of discussions were found.

Liliya Komalova
The Influence of Emoji on the Internet Text Perception

Subject of Research. The paper deals with emojis, small digital images or icons used to express an idea or emotion in electronic communication. The aim of the work is to find dependencies between the use of emojis in text messages and the extent to which the messages attract users’ attention while viewing a page, especially in the Russian-speaking Internet community. Method. The social network “VKontakte” was chosen as the basis of the study, and four of the largest and most popular communities within it were selected. The structure of a typical post in a VKontakte group was studied to identify the most obvious ways of expressing reactions to a post. Using linear regression, graphs were constructed for the relationship between the frequency of emojis in a post and the main indicators of attitude toward the post. Main Results. For all types of communities there is a clear tendency for every type of reaction to a post to decrease as the frequency of emojis in it increases. The posts with the most responses contain no emojis at all, and such posts constitute the majority of the analyzed posts. The only exception is fan communities: they also show this trend, but the “attenuation” of interest is slower. Entertainment and motivational communities also show this slow fading of interest, but less clearly and only in special cases.

Aleksandra Vatian, Antonina Shapovalova, Natalia Dobrenko, Nikolay Vedernikov, Niyaz Nigmatullin, Artem Vasilev, Andrei Stankevich, Natalia Gusarova
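
The regression step mentioned in the abstract above can be sketched as a simple least-squares line relating emoji frequency to a reaction count. This is only an illustration: the function name `fit_line` and the toy numbers are hypothetical and are not taken from the paper, which does not specify its variables or tooling on this page.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x; zero means the slope is undefined.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical toy data: emojis per post vs. number of likes.
emoji_counts = [0, 1, 2, 3, 4]
likes = [50, 42, 37, 30, 21]

slope, intercept = fit_line(emoji_counts, likes)
print(f"likes ~ {slope:.1f} * emojis + {intercept:.1f}")
```

A negative slope on real data would correspond to the reported drop in reactions as emoji frequency grows.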
Power Laws in Ad Hoc Conflictual Discussions on Twitter

Ad hoc discussions have been gaining a growing amount of attention in scholarly discourse. But earlier research has raised doubts about the comparability of ad hoc discussions in social media, as they are formed by unstable, affective, and hardly predictable issue publics. We have chosen inter-ethnic conflicts in the USA, Germany, France, and Russia (six cases altogether, from the Ferguson riots to the attack on Charlie Hebdo) to see whether similar patterns are found in the discussion structure across countries, cases, and vocabulary sets. Choosing degree distribution as the structural proxy for differentiating discussion types, we show that exponents change in the same manner across cases when the discussion density changes; this holds for neutral vs. affective hashtags, as well as for hashtags vs. hashtag conglomerates. This adds to our knowledge of the comparability of ad hoc discussions online, as well as of structural differences between core and periphery in them.

Svetlana S. Bodrunova, Ivan S. Blekanov
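
To make the idea of a degree-distribution exponent concrete, the sketch below counts node degrees in an edge list and estimates a power-law exponent with a common closed-form approximation to the discrete maximum-likelihood estimator. The abstract does not say how the authors estimated their exponents, so this estimator, the function names and the toy graph are all assumptions made for illustration.

```python
import math
from collections import Counter


def degree_counts(edges):
    """Return a Counter mapping each node to its degree in an undirected graph."""
    degrees = Counter()
    for u, v in edges:
        degrees[u] += 1
        degrees[v] += 1
    return degrees


def powerlaw_exponent(degrees, k_min=1):
    """Approximate MLE of alpha for p(k) ~ k^(-alpha), using degrees >= k_min.

    Uses alpha ~= 1 + n / sum(ln(k / (k_min - 0.5))); this approximation is
    generally considered more reliable for larger k_min.
    """
    tail = [k for k in degrees if k >= k_min]
    if not tail:
        raise ValueError("no degrees at or above k_min")
    log_sum = sum(math.log(k / (k_min - 0.5)) for k in tail)
    return 1.0 + len(tail) / log_sum


# Hypothetical toy retweet graph: one hub with four peripheral users.
edges = [("hub", "a"), ("hub", "b"), ("hub", "c"), ("hub", "d")]
degs = degree_counts(edges)
alpha = powerlaw_exponent(list(degs.values()))
print(f"estimated exponent: {alpha:.2f}")
```

Comparing such exponents across hashtags or cases is one way to check whether discussion structure changes in the same manner.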
Topics of Ethnic Discussions in Russian Social Media

The paper reveals the topic structure of ethnic discussions in Russian-speaking social media and explores how these topics are related to the post-Soviet ethnic groups. Having analyzed more than 2.6 million texts from Russian-speaking social media that were published over the two-year period from 2014 to 2015 and contained at least one post-Soviet ethnonym, we conclude that ethnic discussions in these media are full of socially significant and potentially problematic topics (15 topics out of 97 can be regarded as problematic, compared to 4 out of 150 topics in a random sample from VK.com). The most salient topics concern Ukraine-Russia relations during the recent conflict between the two countries. We also found a racial bias in the crime topic against the peoples of the North Caucasus, who are often mentioned in the context of crime and terrorism.

Oleg Nagornyy
Emotional Geography of St. Petersburg: Detecting Emotional Perception of the City Space

Emotional perception of the city space contributes greatly to subjective well-being and is one of the core subjective indicators of the quality of the urban environment. Studies of emotional response to the city space have recently gained popularity within digital humanities. In this paper we present a new system for collecting data on urban emotions: an interactive platform called Imprecity, recently developed at ITMO University and integrated into the wider framework of the Smart Saint-Petersburg project supported by the city administration of Saint-Petersburg, Russia. Once authorized through social networks, an Imprecity user can place emoji on a map of St. Petersburg and write comments on each emotion. Emotions are divided into 5 groups based on the typology of basic emotions defined by Paul Ekman: joy, sadness, anger, disgust, and fear. Imprecity functions as a mobile and desktop version of a website and will be further developed as a mobile app. The emotions and comments collected from users are processed to form recommendations for placemaking; moreover, active users of Imprecity can join together and propose projects for the renovation of specific urban places with the help of experts. We consider the methodological difference between studying emotional perception by processing spontaneous data generated by users online and studying emotionally loaded data created deliberately by users via Imprecity. We show visual analytical tools for processing a test sample of data collected via Imprecity, such as emotional heatmaps, emotional ratings and word clouds. Analysis of the data collected with Imprecity shows that users tend to express more joy than negative emotions, and that positive emotions tend to cluster close to the main points of attraction and major tourist routes. All types of emotions tend to cluster along the major mobility routes, in the city centre as well as in the residential (“sleeping”) districts.

Aleksandra Nenko, Marina Petrova

E-Humanities: Arts & Culture

Frontmatter
Art Critics and Art Producers: Interaction Through the Text

Like most areas of social life, the field of art now extends into cyberspace. In this study, we analyze online reviews by Russian art critics with two objectives. On the one hand, we investigate the patterns of interaction between critics and artists (both contemporary and recognized ones) in Russian art. Since the Russian school of art critique is still in the process of formation, the analysis of web data that we offer is a significant contribution to Russian art studies. On the other hand, we use social network analysis and text mining tools in order to gain more insights from the data and to affirm the applicability of modern tools to classic research tasks. We analyze data from five Russian art magazines, in particular the articles, authors and named entities from these texts. As a result, we identified different patterns of critics’ production that divide this area of web interaction by both the geographical and textual characteristics of agents and articles.

Anastasiia Menshikova, Daria Maglevanaya, Margarita Kuleva, Sofia Bogdanova, Anton Alekseev
Digitalization as a Sociotechnical Process: Some Insights from STS

The production and implementation of digital technologies face multiple restrictions, limitations, obstacles, and barriers. The more ubiquitous they become, the more social situations and interactions they take part in. One possible way to understand more about digitalization is to deconstruct the process of dissemination of technologies and innovations as well as intangible knowledge. The paper presents a mixed review of methodological perspectives that help to grasp the complexity of sociotechnical relations. It draws on studies of knowledge production, the spread of artifacts, and innovation diffusion in order to approach digitalization as a sociotechnical phenomenon.

Liliia V. Zemnukhova
Selection Methods for Quantitative Processing of Digital Data for Scientific Heritage Studies

The methods of the newly emerged field of Digital Humanities are becoming more and more popular in the history of science. These methods influence the establishment of digital information resources that accumulate and aggregate huge amounts of metadata and full-text publications. In a previous publication we used the example of the Russian evolutionary biologist and ecologist Georgy F. Gause to make a preliminary estimate of the potential of digital resources for science studies, including the history of science. We selected prioritized resources to be used in further research. The present study explores methods of selection, processing and quantitative analysis of data extracted from digital information resources. We concentrate on digital information resources that offer structured metadata. We selected, processed and visualized the extracted metadata. Based on an analysis of the results, we drew conclusions about the potential of using digital information resources in the history of science. In addition, the possibility of extracting unstructured metadata was explored.

Dmitry Prokudin, Georgy Levit, Uwe Hossfeld
The Use of Internet of Things Technologies Within the Frames of the Cultural Industry: Opportunities, Restrictions, Prospects

The article presents an analysis of the possibilities and limitations of the use of information and communication technologies, in particular the Internet of things, as an effective tool for artistic and sociocultural practices in the context of the transformation of cultural industries. It is shown that such radical transformations lead to a change in the formats of cultural objects, their content and form. The prospects of technological development are analyzed and a framework for interdisciplinary research is set. Considering two main trends in the field of culture, the fusion of art with science and the high demand for viewers’ participation in art projects, we emphasize the role of technology in the development of media and focus on the prospects that the Internet of things can provide. In addition, analyzing the prospects of contemporary technological tools as creative tools, we argue that the Internet of things and derivative technologies can have a strong influence on design, education and culture: today society faces exponential innovative growth in all areas, but the most promising innovations are those that give the user an active position, the ability to provide feedback, and the option to become a co-author of responsive, recipient-oriented projects that engage complex technical excellence in order to meet the expectations of a contemporary adaptive user, viewer or student.

Ulyana V. Aristova, Alexey Y. Rolich, Alexandra D. Staruseva-Persheeva, Anastasia O. Zaitseva
The Integration of Online and Offline Education in the System of Students’ Preparation for Global Academic Mobility

In this study the authors examined the growing role of global academic mobility in preparing professional staff for the new conditions of a growing global market. The education system is required to search for and propose new approaches and technologies to solve this task, in particular to find ways of teaching English academic discourse, as it is the tool of communication in academic and professional environments. At the same time, many ESL teachers are now studying ways of integrating online courses into classroom education. This ongoing study is devoted to the theoretical and experimental examination of different models that integrate online and offline education for this purpose and for preparing students to use various genres of academic discourse effectively in English as a language of global communication. The findings revealed growth in the various types of competences necessary for global academic mobility.

Nadezhda Almazova, Svetlana Andreeva, Liudmila Khalyapina
LIS Students’ Perceptions of the Use of LMS: An Evaluation Based on TAM (Kuwait Case Study)

Including information technology within the learning and teaching environment, especially learning management systems (LMSs), has been under scrutiny for a long time, and studies in this area have focused on both teachers’ and students’ attitudes toward LMSs. In this paper, an evaluation of the development and application of students’ acceptance of LMSs is presented. Public Authority of Applied Education and Training in Kuwait (PAAET) students’ perceptions of the use of an LMS in a blended learning environment were investigated. The investigation was based on two methods: pre- and post-usage surveys and analyses of students’ actual use of the system through system log mining. The findings show a significant improvement in students’ computer skills after using an LMS. Students were willing to use the compulsory and optional components of an LMS if the motivation to do so was present. The nature of the subject being taught affected the students’ intention and how they used the LMS. There was no significant difference in the students’ perceptions of the use of LMSs before or after actual usage.

Huda R. Farhan

International Workshop on Internet Psychology

Frontmatter
Big Data Analysis of Young Citizens’ Social and Political Behaviour and Resocialization Technics

The paper is based on the experience of an ongoing project, ‘Monitoring and prevention of antisocial behaviour of the young people based on Big Data and communication in social networks’, which was created and is being implemented by the authors with the aim of preventing politically and socially destructive behaviour among Russian adolescents on the basis of Big Data and mediation in social networks. We describe the three stages of the project: educational (preparatory-organizational), analytical (information-prognostic) and mediating (social-pedagogical); and introduce the system ‘Social-political insider’ for the collection, processing and sentiment analysis of data about the spheres of interest, interest groups and subcultures that are in high demand among young people. We then discuss the preliminary results of the first, educational, stage of the project and the special knowledge and skills that are essential for the project team. The importance of such projects, which can create well-trained teams of professionals and organize the monitoring and modification of the social and political behaviour of young people on a systematic basis, is emphasized at the state level as one of the tasks of state youth policy.

Galina Nikiporets-Takigawa, Olga Lobazova
“I Am a Warrior”: Self-Identification and Involvement in Massively Multiplayer Online Role-Playing Games

The aim of this paper is to explore the connections among the indicators of involvement in massively multiplayer online role-playing games (MMORPG) among Russian-speaking adult gamers. A measurement model of involvement in MMORPG playing was considered that includes motivation to play, engagement at the gamer’s level, identification at the game construction level, and presence at the life environment level. The findings revealed statistically significant correlations among all the indicators of involvement, which fit the proposed model. The results indicate that gamers saw more possibilities for social integration in the game than in day-to-day life. Two key tendencies were revealed: intentions toward prosocial and toward competitive game behavior.

Yuliya Proekt, Valeriya Khoroshikh, Alexandra Kosheleva, Violetta Lugovaya, Elena Rokhina
Development of the Internet Psychology in Russia: An Overview

A brief description of the prehistory, history and current state of the art in the development of Internet psychology, or cyberpsychology, in Russia is presented. Prehistory refers to a “non-meeting” stage: unsuccessful enthusiasts of computer networking in Russia (then the USSR) did not turn to social science researchers, including psychologists, who might have provided computer scientists with valuable “human factor” reasons to make their case for innovation fully convincing. The second period, named “culture psychology”, refers to the beginning of the relevant studies, which happened to start before public access to the Internet became available. The main theoretical platform of the culture psychology studies was the Vygotskian paradigm in psychology. The dominant characteristic of the third period was a multi-theoretical approach, and this phase received the ad hoc name “positive psychology” simply because a series of cyberpsychological studies was performed within a positive psychology paradigm. The last period listed refers to the “current studies”: it includes both multi-theoretical works and diverse projects targeted at numerous Internet-mediated activities such as interaction, cognition (e.g. learning), video game playing and various online entertainments.

Alexander Voiskounsky
Problematic Internet Usage and the Meaning-Based Regulation of Activity Among Adolescents

This paper explores the relationship between problematic internet usage (PIU) and the meaning-based regulation of activity among adolescents. Participants were 77 adolescents (36 males, 41 females; M = 15.16 years, SD = 1.1) in grades 9–10 of two secondary schools predominantly for middle and lower-middle socioeconomic-status families in St. Petersburg (Russian Federation). Personal meaning-based regulation of adolescents’ activity can be defined as a structure connected with various aspects of the adolescents’ inner world and behavior. The data obtained make it possible to identify the personality-meaning-based preconditions for PIU in adolescence: difficulties in modelling the conditions for activity and programming behavior to achieve goals; a pronounced tendency to independent activity; a high level of susceptibility to psychological problems. The findings revealed that PIU may combine with a tendency to frequent usage of various electronic devices and a desire to acquire expensive technical novelties. The results may be used in developing psychological prophylaxis and correction of PIU in adolescence.

O. V. Khodakovskaia, I. M. Bogdanovskaya, N. N. Koroleva, A. N. Alekhin, V. F. Lugovaya
Neural Network-Based Exploration of Construct Validity for Russian Version of the 10-Item Big Five Inventory

This study aims to present a new method of exploring the construct validity of questionnaires based on a neural network. Using this test, we further explore convergent validity for the Russian adaptation of TIPI (the Ten-Item Personality Inventory by Gosling, Rentfrow, and Swann). Owing to its small number of questions, TIPI-RU can be used as an express method for surveying large numbers of people, especially in Internet studies. It can also be used alongside other translations of the same questionnaire in intercultural studies. The neural network test for construct validity can be used as a more convenient substitute for a path model.

Anastasia Sergeeva, Bogdan Kirillov, Alyona Dzhumagulova
Impulsivity and Risk-Taking in Adult Video Gamers

Video games are often seen as a cause of numerous psychological changes, both positive and negative, in players. For instance, many authors believe that video games push children and adults towards risky behaviors and impulsivity. The study aimed to analyze both theoretical and empirical evidence of that sort, as well as to investigate parameters of personal and cognitive impulsivity and risk-readiness in adult video gamers. The sample of gamers included 223 participants, all from Russia. Impulsivity and related personal traits were measured with Eysencks’ Impulsiveness Scale (I-7) and Kornilova’s Personal Risk Factors Questionnaire. Impulsivity as a cognitive style was measured by Kagan’s MFFT. No evidence of high impulsivity was found, though video game players who played more than 12 hours per week turned out to be more venturesome than less active gamers. Sex-related differences were investigated: female gamers scored lower in empathy, while male gamers showed higher venturesomeness. In the cognitive style study, video gamers were more accurate than non-gamers and thus showed no tendency towards impulsivity. The results are contrasted with the published data, where applicable.

Nataliya Bogacheva, Alexander Voiskounsky
The Impact of Smartphone Use on the Psychosocial Wellness of College Students

Researchers suggest that excessive smartphone use is correlated with negative psychosocial effects, particularly among younger adults, causing feelings of isolation, depression/anxiety, and restlessness. This pilot study on the psychosocial wellness of 22 college students measured the impact of smartphone use on emotion/mood, dependency, addiction, purpose of life, social communications, and self-consciousness. For our data analysis, we measured frequency with conversion percentages (of 35 questions) using a seven-point Likert scale from strongly disagree to strongly agree, while averaging the scores of each question group pertaining to each hypothesis. While only 22% agreed they were addicted to smartphone use, 68% reported constantly checking their smartphone, with 57% agreeing that they were smartphone dependent. The majority agreed that smartphone use increased anxiety, stress, and feelings of impatience if their phone was not with them. While the majority agreed that the smartphone is their primary means of communication, 90% agreed that nothing is more fun than using their smartphone.

Anthony Faiola, Haleh Vatani, Preethi Srinivas
Detecting and Interfering in Cyberbullying Among Young People (Foundations and Results of German Case-Study)

Information and communication technologies (ICT) play an increasingly significant role in the lives of children and young people. Adolescents use ICT to communicate with others via chat, instant messengers, online communities, etc. They use online offers and services for music, pictures or videos to entertain themselves, to search for new knowledge and information, and to obtain and play online games. Most ICT use is constructive and peaceful and brings no negative online experiences that could be perceived as stressful, but this is not the case for all adolescents. The online world, like the physical world, can bring danger, and one possible danger that adolescents encounter in the online world today is cyberbullying. In the present contribution, we address the online risk of cyberbullying among adolescents. The article deals with the following questions: how cyber-mobbing can be defined from a research perspective, how many adolescents are affected by it in Germany, what makes cyberbullying specific, what we know about the victims and the perpetrators, what the possible consequences are, and what recommendations can be given to adolescents and adults (parents, teachers and educators) who are dealing with cyberbullying problems.

Sebastian Wachs, Wilfried Schubarth, Andreas Seidel, Elena Piskunova

International Workshop on Computational Linguistics

Frontmatter
Anomaly Detection for Short Texts: Identifying Whether Your Chatbot Should Switch from Goal-Oriented Conversation to Chit-Chatting

Goal-oriented conversational agents are systems able to converse with humans in natural language to help them reach a certain goal. The number of goals (or domains) about which an agent can converse is limited, and one of the issues is to identify whether a user is talking about an unknown domain (in order to report a misunderstanding or switch to chit-chatting mode). We argue that this issue can be resolved if we treat it as an anomaly detection task, which belongs to the field of machine learning. The scientific community has developed a broad range of methods for this task, but their applicability to short text data has not been investigated before. The aim of this work is to compare the performance of 6 different anomaly detection methods on Russian and English short texts modeling conversational utterances, proposing the first evaluation framework for this task. As a result of the study, we find that a simple threshold on cosine similarity works better than the other methods for both of the considered languages.

Amir Bakarov, Vasiliy Yadrintsev, Ilya Sochenkov
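
The cosine-similarity threshold named in the abstract above can be illustrated with a few lines of code. This is a minimal sketch under stated assumptions: utterances are represented as plain bag-of-words counts, the threshold value 0.3 is arbitrary, and the helper names (`vectorize`, `cosine`, `is_out_of_domain`) are hypothetical. The paper's own text representation and threshold are not given on this page.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words term counts for a whitespace-tokenized, lowercased text."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def is_out_of_domain(utterance, known_utterances, threshold=0.3):
    """Flag an utterance whose best similarity to known ones is below threshold."""
    vec = vectorize(utterance)
    best = max((cosine(vec, vectorize(k)) for k in known_utterances), default=0.0)
    return best < threshold


# Hypothetical in-domain examples for a table-booking agent.
known = ["book a table for two", "reserve a table tonight"]
print(is_out_of_domain("book a table for four", known))
print(is_out_of_domain("what is your favourite movie", known))
```

An agent could switch to chit-chat mode whenever `is_out_of_domain` returns true; in practice the threshold would be tuned on held-out data.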
Emotional Waves of a Plot in Literary Texts: New Approaches for Investigation of the Dynamics in Digital Culture

Digital technologies provide new opportunities for the study of objects of cultural heritage. The paper investigates dynamics in literary and musical texts. It is hypothesized that, from a linguistic point of view, it is no accident that the action in a text develops from the beginning (the exposition) through the introduction to the climax, and from the climax to the denouement; it always follows a certain tendency, which can be visualized. In this research three ‘small genres’ are investigated: Russian short stories, Russian classical sonnets, and classical Russian romances, which belong to a hybrid genre of both musical and verbal nature. Generalized profiles of plot development were built by means of a statistical time series method, with different parameters for different genres. Thus, literary texts were analysed by measuring sentence length, poetic texts were measured by a stress index, and romances were measured by both a poetic stress index and a musical pitch/duration index. Other variables related to plot development may be used as well. The dynamics of each genre are visualized by means of curves resembling the ‘line of beauty’ proposed by William Hogarth. In conclusion, the results are compared with dynamic contours obtained by applying sentiment analysis to a large collection of texts belonging to world classical literature. The results indicate that there exist some universal regularities in text and plot generation, which can be revealed independently of research methodology.

Gregory Martynenko, Tatiana Sherstinova
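
As a rough illustration of the sentence-length measurement described in the abstract above, the sketch below turns a prose text into a series of sentence lengths and smooths it. The abstract does not name the specific time series method, so the simple moving average, the regex-based sentence splitter, and the function names are all assumptions made for this example.

```python
import re


def sentence_lengths(text):
    """Split on ., ! or ? and return the word count of each sentence."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [len(s.split()) for s in sentences]


def moving_average(series, window=3):
    """Simple moving average; a stand-in for the unspecified
    time series smoothing used to obtain a plot profile."""
    if window < 1:
        raise ValueError("window must be at least 1")
    return [
        sum(series[i:i + window]) / window
        for i in range(len(series) - window + 1)
    ]


story = "It rained. She waited by the door! Nobody came."
lengths = sentence_lengths(story)
print(lengths)
print(moving_average(lengths, window=2))
```

The smoothed series can then be plotted against sentence position to obtain a curve of the kind the abstract compares with Hogarth's ‘line of beauty’.
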
Application of NLP Algorithms: Automatic Text Classifier Tool

This research is dedicated to the design of a decision support system for categorization of scientific literature. The purpose of this work is to research possible ways to apply machine learning algorithms to the automation of manual text categorization. The following stages are considered: preprocessing of raw data, word embedding, model selection, classification model, and software design. At the first stage, in collaboration with VINITI RAS, a training set of 200,000 Russian texts was formed. At the second stage, the word embedding model was justified as a Word2Vec vector representation obtained from the text matrix by “sum” convolution with dimensionality 1500. At the third stage, the quality of the classifiers was estimated, and the logistic regression algorithm with the highest F1 score (0.94) was selected. At the final stage, the ATC (Automatic Text Classifier) application, which embeds the results obtained at the previous stages, was developed. The overall application structure is described. It consists of compact program modules that can be replaced or adapted to the incoming text to achieve the highest classification score.

Aleksandr Romanov, Ekaterina Kozlova, Konstantin Lomotin
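
The “sum” convolution mentioned in the abstract above can be read as summing per-word vectors into a single fixed-size document vector. The sketch below shows that reading only; the toy 3-dimensional embedding table, the function name `sum_pool`, and the decision to skip unknown words are assumptions for illustration (the paper reports a dimensionality of 1500 and uses trained Word2Vec vectors).

```python
def sum_pool(tokens, embeddings, dim):
    """Sum the vectors of all known tokens into one document vector.
    Tokens missing from the embedding table are skipped (an assumption)."""
    doc = [0.0] * dim
    for token in tokens:
        vec = embeddings.get(token)
        if vec is None:
            continue
        for i, value in enumerate(vec):
            doc[i] += value
    return doc


# Hypothetical toy embeddings; real ones would come from a trained model.
toy_embeddings = {
    "neural": [1.0, 0.0, 2.0],
    "network": [0.5, 1.0, 0.0],
}

doc_vector = sum_pool(["neural", "network", "unknown"], toy_embeddings, dim=3)
print(doc_vector)
```

A document vector built this way has the same length for every text, so it can be passed directly to a classifier such as the logistic regression named in the abstract.
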
Structural Properties of Collocations in Tatar-Russian Socio-Political Dictionary of Collocations

This paper discusses some of the issues and challenges encountered during the compilation of the Tatar-Russian Socio-Political Dictionary of collocations, which is based on the data of the available corpora of the Tatar language. The area of collocations within the language system is of particular importance, and the well-known language-specificity of collocations suggests the need for bilingual collocation dictionaries. The main criteria for selecting linguistic data are those of objective (frequency in the corpus) and subjective evaluation (evaluation of the word from the point of view of its thematic, stylistic and collocational value). The main unit in the Dictionary is the noun or verb phrase formed by filling one of possible semantic-syntactic positions of the word and meeting the criteria of semantic completeness. As an exception, we also included certain combinations of header words with postpositions derived from nouns, as long as the corresponding collocations are typical for socio-political discourse. As it is a nontrivial task to fix basic forms of word combinations in morphologically rich languages, special attention is paid to the issue of lemmatization of collocations in the Dictionary (representing grammatical voice forms, fixing and translating predicative phrases, lemmatizing items with polyfunctional affixes, etc.).

Alfiya Galieva, Olga Nevzorova
Computer Ontology of Tibetan for Morphosyntactic Disambiguation

The article presents the experience of developing a computer ontology as one of the tools for automatic natural language processing. A computer ontology that contains a consistent specification of meanings of lexical units with different relations between them represents a model of lexical semantics and both syntactic and semantic valencies, reflecting the Tibetan linguistic picture of the world. The article describes the approach of using a computer ontology as a means of introducing semantic restrictions for morphosyntactic disambiguation on the basis of the corpus of indigenous grammatical treatises.

Aleksei Dobrov, Anastasia Dobrova, Pavel Grokhovskiy, Maria Smirnova, Nikolay Soms
Using Explicit Semantic Analysis and Word2Vec in Measuring Semantic Relatedness of Russian Paraphrases

In this study we compare two semantic relatedness algorithms, namely, Explicit Semantic Analysis (ESA) and Word2Vec. ESA represents text meaning in a high-dimensional space of concepts derived from Wikipedia. Word2Vec generates distributed vector representations from large text corpora. Experiments were carried out on the Russian paraphrase corpus of news titles and the Russian ParaPlag paraphrase corpus. The paper contains a thorough analysis of the results and the evaluation procedure.

Anna Kriukova, Olga Mitrofanova, Kirill Sukharev, Natalia Roschina
Mapping Texts to Multidimensional Emotional Space: Challenges for Dataset Acquisition in Sentiment Analysis

The cornerstone of any sentiment analysis research is labeled data and its acquisition. Canonical corpora for this task contain different reviews (movies, restaurants) where sentiment can be derived from the reviewer’s explicit rating of a reviewed item. Ratings are accompanied by comments, which are used as text samples, and the ratings are converted into labels. Usually emotion labels come in binary form like “negative/positive”. This simplistic approach works well when we are dealing with a binary emotional model, but it fails when we are dealing with more complex emotional models like “Pleasure-Arousal-Dominance (PAD)” or Lövheim’s Cube, when we collect data from various sources and of different types (fiction books, social network conversations, blog posts etc.), or when we delegate labeling to external assessors. In the article, we describe the methodological problems we faced while collecting a dataset for sentiment analysis backed by Lövheim’s Cube, an emotional model that represents an emotion as a point in a three-dimensional space of the balance of three monoamines (Dopamine, Serotonin and Noradrenaline). These problems include the choice of necessary metadata to be collected along with text and labels, the choice of tools used for labeling, and survey design.

Alexander Kalinin, Anastasia Kolmogorova, Galina Nikolaeva, Alina Malikova
On Modelling Domain Ontology Knowledge for Processing Multilingual Texts of Terroristic Content

The paper reports on ongoing research whose main objective is to address problems in ontology conceptualization and techniques for building domain ontologies suitable for processing texts in multiple languages. Such ontologies are useful in different NLP tasks, from creating semantically annotated multilingual resources to multilingual information retrieval, extraction and machine translation. Another research objective is to contribute to the pool of ontological resources for the terrorist domain, as the analysis of terrorism has now been in focus as a matter of national security for more than a decade. The emphasis is on the linguistic issues of ontology development as the main prerequisite of ontology computer realization. Our approach is a mixed top-down and bottom-up technique adjusted to the domain specificity in the multilingual context. The paper argues for a clear division between the language-dependent lexical knowledge and language-independent conceptual knowledge that, nevertheless, should be represented so as to provide as many direct mappings “lexeme-ontological concept” as possible. The approach is illustrated with an ontology prototype to process texts of terroristic content in the English, French, and Russian languages.

Svetlana Sheremetyeva, Anastasia Zinovyeva
Cross-Tagset Parsing Evaluation for Russian

Cross-tagset parsing is based on the substitution of one annotation layer for another while processing data within one language. As often as not, either the native tagger or the dependency parser used in (pre-)annotation of the Gold treebank is not available. The cross-tagset approach allows one to annotate new texts using freely available tools or tools optimized to user’s needs. We evaluate the robustness of Russian dependency parsing using different morphological and syntactic tagsets in input and output. Qualitative analysis of errors shows that the cross-substitution of three morphological tagsets and two syntactic tagsets causes only a mild drop in performance.

Kira Droganova, Olga Lyashevskaya
Active Processes in Modern Spoken Russian Language (Evidence from Russian)

Various application fields of linguistics, including automatic recognition and speech processing, teaching foreign languages and interpreting colloquial speech, characterization of sociolinguistic speech diversity, linguistic “portrayal” of a certain community (social dialect) and a particular persona (idiolect), linguistic examination (for example, for counter-terrorism efforts), among others, require not only extensive lexical and grammatical resources, but also the description of speech production mechanisms, mainly those typical of spontaneous speech. Regrettably, the latter are almost always neglected by traditional dictionaries and grammar books, being out of the scope of linguistic analysis. The knowledge of such mechanisms is necessary for colloquial studies (colloquialistics) per se, a branch of linguistics which studies everyday spoken language. The authors of this article attempt to systematize the processes at work in modern colloquial language, relying on domestic and foreign professional academic literature and on research results obtained from the analysis of the ORD corpus (everyday Russian spoken language). The research was carried out with the support of RFBR (Russian Foundation for Basic Research), grant No. 17-29-09175 “Diagnostic features of sociolinguistic variation in Russian everyday spoken language (evidence from a corpus)”.

Natalia Bogdanova-Beglarian, Yulia Filyasova
Backmatter
Metadata
Title
Digital Transformation and Global Society
Editors
Daniel A. Alexandrov
Alexander V. Boukhanovsky
Andrei V. Chugunov
Yury Kabanov
Olessia Koltsova
Copyright Year
2018
Electronic ISBN
978-3-030-02846-6
Print ISBN
978-3-030-02845-9
DOI
https://doi.org/10.1007/978-3-030-02846-6
