
2015 | Book


Analysis of Images, Social Networks and Texts

4th International Conference, AIST 2015, Yekaterinburg, Russia, April 9–11, 2015, Revised Selected Papers

Editors: Mikhail Yu. Khachay, Natalia Konstantinova, Alexander Panchenko, Dmitry Ignatov, Valeri G. Labunets

Publisher: Springer International Publishing

Book Series: Communications in Computer and Information Science

Part of: Springer Professional "Wirtschaft+Technik", Springer Professional "Technik", Springer Professional "Wirtschaft"

About this book

This book constitutes the proceedings of the Fourth International Conference on Analysis of Images, Social Networks and Texts, AIST 2015, held in Yekaterinburg, Russia, in April 2015.

The 24 full and 8 short papers were carefully reviewed and selected from 140 submissions. The papers are organized in topical sections on analysis of images and videos; pattern recognition and machine learning; social network analysis; text mining and natural language processing.

Frontmatter

Invited Papers

Frontmatter

A Probabilistic Rating System for Team Competitions with Individual Contributions

We study the problem of constructing a probabilistic rating system for team competitions. Unlike previous studies, we consider a setting where the competition can be broken down into relatively small individual tasks, and it is reasonable to assume that each task is done by a single team member. We begin with a simplistic naïve Bayes approach, which in this case reduces to logistic regression, and then develop it into a more complex model with latent variables trained by expectation–maximization. We show experimental results that validate our approach.

Sergey Nikolenko

Sequential Hierarchical Image Recognition Based on the Pyramid Histograms of Oriented Gradients with Small Samples

In this paper we explore an application of pyramid HOG (Histograms of Oriented Gradients) features to the image recognition problem with small samples. Sequential analysis is used to improve the performance of hierarchical methods. We propose to process the next, more detailed level of the pyramid only if the decision at the current level is unreliable. Chow’s reject option, which compares the posterior probability with a fixed threshold, is used to verify recognition reliability. The posterior probability is estimated for the homogeneity-testing probabilistic neural network classifier on the basis of its relation to the Bayesian decision. Experimental results in face recognition are presented. It is shown that the proposed approach increases recognition performance by a factor of 2–4 in comparison with conventional classification of pyramid HOGs.

Andrey V. Savchenko, Vladimir R. Milov, Natalya S. Belova
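
The sequential scheme described in this abstract can be illustrated with a minimal sketch. It assumes each pyramid level is represented by a hypothetical callable that returns posterior probabilities per class; the function name, the callables, and the threshold value are illustrative assumptions, not the authors' implementation.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical interface: a level classifier maps an input to per-class posteriors.
LevelClassifier = Callable[[object], Sequence[float]]


def classify_sequentially(
    x: object, levels: List[LevelClassifier], threshold: float
) -> Tuple[int, int]:
    """Return (class_index, level_used), moving to a finer level only on rejection."""
    if not levels:
        raise ValueError("at least one pyramid level is required")
    best_class = -1
    for level_index, classifier in enumerate(levels):
        posteriors = classifier(x)
        # Pick the class with the largest posterior at this level.
        best_class = max(range(len(posteriors)), key=posteriors.__getitem__)
        # Chow's rule: accept the decision only if its posterior reaches the threshold.
        if posteriors[best_class] >= threshold:
            return best_class, level_index
    # No level was reliable enough: fall back to the decision of the finest level.
    return best_class, len(levels) - 1
```

With a coarse level returning `[0.55, 0.45]` and a finer one returning `[0.1, 0.9]`, a threshold of 0.8 rejects the coarse decision and accepts the finer one, so the costlier level is evaluated only when needed.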

Discerning Depression Propensity Among Participants of Suicide and Depression-Related Groups of Vk.com

In online social networks, high level features of user behavior such as character traits can be predicted with data from user profiles and their connections. Recent publications use data from online social networks to detect people with depression propensity and diagnosis. In this study, we investigate the capabilities of previously published methods and metrics applied to the Russian online social network VKontakte. We gathered user profile data from the most popular communities about suicide and depression on VK.com and performed a comparative analysis between them and randomly sampled users. We have used not only standard user attributes like age, gender, or number of friends but also structural properties of their egocentric networks, with results similar to the study of suicide propensity in the Japanese social network Mixi.com. Our goal is to test the approach and models in this new setting and propose enhancements to the research design and analysis. We investigate the resulting classifiers to identify profile features that can indicate depression propensity of the users in order to provide tools for early depression detection. Finally, we discuss further work that might improve our analysis and transfer the results to practical applications.

Aleksandr Semenov, Alexey Natekin, Sergey Nikolenko, Philipp Upravitelev, Mikhail Trofimov, Maxim Kharchenko

Tutorial

Frontmatter

Normalization of Non-standard Words with Finite State Transducers for Russian Speech Synthesis

This paper describes finite state transducers employed for expansion of numbers, acronyms and graphic abbreviations into full-word numerals and phrases in the task of Russian speech synthesis. The developed finite state transducers cover cardinal and ordinal numbers, convert phone numbers, dates, codes, etc. The developed project is the first Russian open-source normalization system known to the author.

Artem Lukanin

Analysis of Images and Videos

Frontmatter

Transform Coding Method for Hyperspectral Data: Influence of Block Characteristics to Compression Quality

The aim of this paper is to study how block characteristics influence the compression quality of hyperspectral data using the block transform method. The coordinates of a hyperspectral image are not equivalent: two of them are spatial, and the third corresponds to a spectral channel. Thus it is necessary to investigate the algorithm implementation with blocks extended along the spectral axis. Moreover, it is known that in the two-dimensional case, increasing the block size improves compression quality. Hence it is useful to investigate the algorithm implementation with cubic blocks of increased size.

Marina Chicheva, Ruslan Yuzkiv

Fréchet Filters for Color and Hyperspectral Images Filtering

Median filtering has been widely used in scalar-valued image processing as an edge preserving operation. The basic idea is that the pixel value is replaced by the median of the pixels contained in a window around it. In this paper, we extend the notion of the Fréchet vector median to the general Fréchet vector median, which minimizes the Fréchet cost function (FCF) in the form of an aggregation function instead of the ordinary sum. Moreover, we propose to use an aggregation distance instead of the classical one. We use the generalized Fréchet median for constructing new nonlinear filters based on an arbitrary pair of aggregation operators that can be changed independently. For each pair of parameters, we get the unique class of new nonlinear filters.

Ekaterina Ostheimer, Valeriy Labunets, Denis Komarov, Tat’yana Fedorova
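
As a rough illustration of the idea in this abstract, the sketch below selects the window pixel that minimizes an aggregated distance to all other pixels in the window. Restricting candidates to the window's own pixels, using Euclidean distance by default, and the function and parameter names are all assumptions made for the example; the paper's actual aggregation operators are not specified here.

```python
import math
from typing import Callable, Iterable, List, Sequence

Vector = Sequence[float]


def generalized_vector_median(
    window: List[Vector],
    aggregate: Callable[[Iterable[float]], float] = sum,
    distance: Callable[[Vector, Vector], float] = math.dist,
) -> Vector:
    """Return the window pixel with the smallest aggregated distance to the others.

    With aggregate=sum this is the classical vector median; passing another
    aggregation function (for example max) changes the cost being minimized.
    """
    if not window:
        raise ValueError("window must contain at least one pixel")
    return min(
        window,
        key=lambda candidate: aggregate(distance(candidate, p) for p in window),
    )
```

For the window `[(0, 0, 0), (1, 1, 1), (10, 10, 10)]`, both `sum` and `max` select `(1, 1, 1)`, since the outlier `(10, 10, 10)` has the largest aggregated distance under either choice.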

Fast Global Image Denoising Algorithm on the Basis of Nonstationary Gamma-Normal Statistical Model

We consider a Bayesian framework and the corresponding global algorithm for adaptive image denoising, which preserves essential local features while the intensity of the reconstructed image otherwise changes smoothly. The algorithm is based on a special nonstationary gamma-normal statistical model and can handle both Gaussian noise, which is a ubiquitous model in statistical image restoration, and Poissonian noise, which is the most common model for low-intensity imaging used in biomedical imaging. The proposed algorithm is simple to tune and has linear computational complexity with respect to the number of image elements, so it can process large data sets in minimal time.

Inessa Gracheva, Andrey Kopylov, Olga Krasotkina

Theoretical Approach to Developing Efficient Algorithms of Fingerprint Enhancement

A new theoretical approach to construction of efficient algorithms for fingerprint image enhancement is proposed. The approach comprises novel modifications of advanced orientation field estimation techniques such as the method of fingerprint core extraction based on Poincaré indexes and model-based smoothing for the gradient-based approximation of an orientation field by Legendre polynomials, and new adaptive Gabor filtering technique based on holomorphic transformations of coordinates.

Mikhail Yu. Khachay, Maxim Pasynkov

Remote Sensing Data Verification Using Model-Oriented Descriptors

This paper presents a solution to the remote sensing data verification problem. Remote sensing data includes digital image data and metadata, which contains parameters of the satellite image shooting process (Sun and satellite azimuth and elevation angles, shooting time, etc.). The solution is based on the analysis of special numerical characteristics, which directly depend on the shooting parameters: sun position, satellite position and orientation. We propose two fully automatic algorithms for remote sensing data analysis and decision-making based on data compatibility: the first one uses vector data of the shooting territory, the second does not.

Andrey Kuznetsov, Vladislav Myasnikov

New Bi-, Tri-, and Fourlateral Filters for Color and Hyperspectral Images Filtering

In this paper, we investigate the effectiveness of modified bilateral and new tri- and fourlateral denoising filters for grey, color, and hyperspectral image processing. The conventional bilateral filter performs merely a weighted averaging of the local neighborhood pixels. The weight includes two components: a spatial one and a radiometric one. The first component measures the geometric distances between the center pixel and the local neighborhood pixels. The second component measures the radiometric distance between the values of the center pixel and the local neighborhood pixels. Noise affects all pixels, including the center one used as a reference for the tonal filtering. Thus, the noise affecting the center pixel has a disproportionate effect on the result. This suggests the first modification: the center pixel is replaced by the weighted average (with some estimate of the true value) of the neighborhood pixels contained in a window around it. The second modification uses matrix-valued weights. They include four components: spatial, radiometric, inter-channel, and radiometric inter-channel weights. The fourth weight measures the radiometric distance (for grey-level images) between the inter-channel values of the center scalar-valued channel pixel and the local neighborhood channel pixels.

Ekaterina Ostheimer, Valeriy Labunets, Andrey Kurganski, Denis Komarov, Ivan Artemov
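
The conventional bilateral filter described in this abstract can be sketched for a one-dimensional signal as follows. Gaussian kernels for both weight components are a common choice but an assumption here, as are the function and parameter names; the modified tri- and fourlateral filters of the paper are not reproduced.

```python
import math
from typing import List, Sequence


def bilateral_filter_1d(
    signal: Sequence[float], radius: int, sigma_spatial: float, sigma_range: float
) -> List[float]:
    """Smooth a 1-D signal while preserving sharp steps."""
    n = len(signal)
    output = []
    for i in range(n):
        weighted_sum = 0.0
        weight_total = 0.0
        for j in range(max(0, i - radius), min(n, i + radius + 1)):
            # Spatial component: geometric distance between positions i and j.
            w_spatial = math.exp(-((i - j) ** 2) / (2 * sigma_spatial ** 2))
            # Radiometric component: difference between the center and neighbor values.
            w_range = math.exp(-((signal[i] - signal[j]) ** 2) / (2 * sigma_range ** 2))
            weight = w_spatial * w_range
            weighted_sum += weight * signal[j]
            weight_total += weight
        # weight_total >= 1 because the center sample always has weight 1.
        output.append(weighted_sum / weight_total)
    return output
```

Note that the radiometric weight uses the noisy center value `signal[i]` as its reference, which is exactly the sensitivity the abstract's first modification addresses by substituting an estimate of the true value.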

Frequency Analysis of Gradient Descent Method and Accuracy of Iterative Image Restoration

For images with sharp changes of intensity, the appropriate regularization is based on variational functionals. In order to minimize such a functional, the gradient descent approach can be used. In this paper, we analyze the performance of the gradient descent method in the frequency domain and show that the method converges to the sum of the original undistorted function and the kernel function of a linear distortion operator.

Artyom Makovetskii, Alexander Vokhmintsev, Vitaly Kober, Vladislav Kuznetsov

Shape Matching Based on Skeletonization and Alignment of Primitive Chains

We introduce a new shape matching approach based on skeletonization and alignment of primitive chains. At the first stage, the skeleton of a binary image is traversed counterclockwise in order to encode it as a chain of primitives. A primitive describes topological properties of the corresponding edge and consists of a pair of numbers: the length of an edge and the angle between this edge and the next one. We propose extending each primitive with information about the radial function of the skeleton rib. To get a compact width description, we interpolate the radial function by Legendre polynomials and find the vector of Legendre coefficients. Thus the resulting shape representation by the chain of primitives includes not only topological properties but also contour ones. We then suggest a dynamic programming procedure for aligning two primitive chains in order to match the corresponding shapes. Based on the optimal alignment, we propose a pair-wise dissimilarity function, which is evaluated on an artificial image dataset and the Flavia leaf dataset.

Olesia Kushnir, Oleg Seredin

Color Image Restoration with Fuzzy Gaussian Mixture Model Driven Nonlocal Filter

Color image denoising is one of the classical image processing problems, and various techniques have been explored over the years. Recently, the nonlocal means (NLM) filter has been shown to obtain good results for denoising digital images corrupted by Gaussian noise, using a weighted mean among similar patches. In this paper, we consider a fuzzy Gaussian mixture model (GMM) based NLM method for removing mixed Gaussian and impulse noise. By computing an automatic homogeneity map we identify impulse noise locations and utilize an adaptive patch size. Experimental results on color images affected by mixed noise show that our scheme performs better than NLM, anisotropic diffusion and GMM-NLM over different noise levels. Comparisons with respect to structural similarity, color image difference, and peak signal-to-noise ratio error metrics are undertaken, and our scheme performs well overall without generating color artifacts.

V. B. Surya Prasath, Radhakrishnan Delhibabu

A Phase Unwrapping Algorithm for Interferometric Phase Images

Phase unwrapping is the most complicated and unreliable stage of interferometric data processing, which is often used in remote sensing techniques. For real radar scenes, the phase unwrapping problem doesn’t have a unique solution due to phase discontinuity caused by the phase noise and aliasing. A phase unwrapping algorithm for interferometric phase images based on 3d-phase function branch merging and cutting is proposed. It improves reliability of the unwrapped phase used for digital elevation models generation.

Andrey Sosnovsky

Robust Image Watermarking on Triangle Grid of Feature Points

The paper presents a digital image watermarking technique robust to geometric distortions. This technique is based on a novel procedure of building a set of primitives for embedding using Delaunay triangulation on a set of feature points, and uses additive embedding method with linear correlation detector. Much attention is paid to the problem of choosing the most appropriate feature points detector. Conducted experiments demonstrate robustness of the proposed method to a range of geometric distortions.

Alexander Verichev, Victor Fedoseev

Pattern Recognition and Machine Learning

Frontmatter

Traffic Flow Forecasting Algorithm Based on Combination of Adaptive Elementary Predictors

In this paper the problem of traffic flow prediction in the transport network of a large city is considered. For fast calculation of predictions, partition of a transport graph into a certain number of subgraphs based on the territorial principle is proposed. Next, we use a dimension reduction method based on principal components analysis to describe the spatio-temporal distribution of traffic flow condition in subgraphs. A short-term (up to 1 h) traffic flow prediction in each subgraph is calculated by an adaptive linear combination of elementary predictions. In this paper, the elementary predictions are Box-Jenkins time-series models, support vector regression, and the method of potential functions. The proposed traffic prediction algorithm is implemented and tested against the actual travel times over a large road network in Samara, Russia.

Anton Agafonov, Vladislav Myasnikov

Analysis of the Adaptive Nature of Collaborative Filtering Techniques in Dynamic Environment

Collaborative filtering (CF) has been an active area of research for a long time. However, most of the works available in the literature either focus on handling cold start problems (when CF fails to make acceptable predictions due to the lack of ratings) or emphasize improving CF performance in terms of some evaluation statistics. Very few of them address the problems of moving from an initial stage affected by cold start to a steady one. To cope with this progressive nature of CF, we propose to model the entire life cycle of a Recommender System (RS). Specifically, we suggest a combination of two neural network based CF techniques for the implementation of a complete RS framework. We propose to adopt the cold start based algorithm proposed by Bobadilla et al. for the initial stage. For the later stage we propose a new algorithm based on a neural network. We suggest adopting these two algorithms in different stages of CF to ensure better performance and uniformity throughout the RS life cycle.

Khaleda Akhter, Sheikh Muhammad Sarwar

A Texture Fuzzy Classifier Based on the Training Set Clustering by a Self-Organizing Neural Network

The paper presents a fuzzy approach to the texture classification. According to the classifier the texture class is represented as a set of clusters in N-dimensional feature space that allows generating a cluster or clusters with an arbitrary shape and precisely reflecting any group of the vectors connected with the class. For each texture class it configures the self-organizing features map and estimates a degree of the overlap of the neighboring classes. Upon matching the maps each of them creates a set of fuzzy rules reflecting the feature value statistical distribution in its clusters. Advantages of the system are simplicity of the structure generation, functioning and performance. The suggested classification technique is universal and can be used not only as a texture analyzer but independently for many other real-world classification tasks.

Sergey Axyonov, Kirill Kostin, Dmitry Lykom

Learning Representations in Directed Networks

We propose a probabilistic model for learning continuous vector representations of nodes in directed networks. These representations could be used as high quality features describing nodes in a graph and implicitly encoding global network structure. The usefulness of the representations is demonstrated on link prediction and graph visualization tasks. Using representations learned by our method yields results comparable to state-of-the-art methods on link prediction while requiring far fewer computational resources. We develop an efficient online learning algorithm which makes it possible to learn representations from large and non-stationary graphs. It takes less than a day on a commodity computer to learn high quality vectors on the LiveJournal friendship graph consisting of 4.8 million nodes and 68 million links, and representations of reasonable quality can be obtained much faster.

Oleg U. Ivanov, Sergey O. Bartunov

Distorted High-Dimensional Binary Patterns Search by Scalar Neural Network Tree

The paper offers an algorithm (SNN-tree) that extends the binary tree search algorithm so that it can deal with distorted input vectors. Perceptrons are the tree nodes. The algorithm features an iterative solution search and stopping criterion. Unlike the SNN-tree algorithm, popular methods (LSH, k-d tree, BBF-tree, spill-tree) stop working as the dimensionality of the space grows (N > 1000). In this paper we managed to obtain an estimate of the upper bound on the error probability for SNN-tree algorithm. The proposed algorithm works much faster than exhaustive search (26 times faster at N = 10000).

Vladimir Kryzhanovsky, Magomed Malsagov

Hybrid Classification Approach to Decision Support for Endoscopy in Gastrointestinal Tract

This paper provides a new classification approach combining different methods for image and text analysis. In this work the approach is applied to endoscopic images of the gastrointestinal tract and the corresponding text reports. We propose to extract useful information about gastrointestinal tract images from text descriptions using semantic analysis. The text mining algorithm was validated on real text descriptions of endoscopic surveys.

Vyacheslav V. Mizgulin, Dmitry M. Stepanov, Stepan A. Kamentsev, Radi M. Kadushnikov, Evgeny D. Fedorov, Olga A. Buntseva

User Similarity Computation for Collaborative Filtering Using Dynamic Implicit Trust

Collaborative filtering is one of the most prominent techniques in Recommender Systems (RS) for retrieving useful information by using the most similar items or users. However, traditional collaborative filtering approaches face many limitations, such as data sparsity, the semantic similarity assumption, and fake user profiles, and they often ignore users’ evolving interests; such flaws lead to user dissatisfaction and low system performance. To cope with these limitations, we propose a new dynamic trust-based similarity approach. We compute the trust scores of users by means of implicit trust information between them. The experimental results demonstrate that the proposed approach performs better than the existing trust-based recommendation algorithms in terms of accuracy by dealing with the aforementioned limitations.

Falguni Roy, Sheikh Muhammad Sarwar, Mahamudul Hasan

Similarity Aggregation for Collaborative Filtering

In this paper we show how several similarity measures can be combined for finding similarity between a pair of users for performing Collaborative Filtering in Recommender Systems. Through aggregation of several measures we find super similar and super dissimilar user pairs and assign a different similarity value for these types of pairs. We also introduce another type of similarity relationship which we call medium similar user pairs and use traditional JMSD for assigning similarity values for them. By experimentation with real data we show that our method for finding similarity by aggregation performs better than each of the similarity metrics. Moreover, as we apply all the traditional metrics in the same setting, we can assess their relative performance.

Sheikh Muhammad Sarwar, Mahamudul Hasan, Masum Billal, Dmitry I. Ignatov

Distributed Coordinate Descent for L1-regularized Logistic Regression

Logistic regression is a widely used technique for solving classification and class probability estimation problems in text mining, biometrics and clickstream data analysis. Solving logistic regression with L1-regularization in distributed settings is an important problem. This problem arises when the training dataset is very large and cannot fit in the memory of a single machine. We present d-GLMNET, a new algorithm for solving logistic regression with L1-regularization in distributed settings. We empirically show that it is superior to distributed online learning via truncated gradient.

Ilya Trofimov, Alexander Genkin

Social Network Analysis

Frontmatter

Building Profiles of Blog Users Based on Comment Graph Analysis: The Habrahabr.ru Case

Our study is aimed at developing a language-independent tool for building profiles of online community users. To that end, the definition of a comment graph, a convenient representation of user interaction, is studied. A set of comment graph characteristics for users, which forms the basis of the profiling techniques, is suggested. Finally, a user profiling method based on cluster analysis is presented. The described method was applied to the Habrahabr data set.

Alexandra Barysheva, Mikhail Petrov, Rostislav Yavorskiy

Formation and Evolution Mechanisms in Online Network of Students: The Vkontakte Case

The mechanisms of real-world social network formation and evolution are among the most important topics in the field of network science. In this study we collect data about the development of the Vkontakte (a popular Russian social networking site) network of first-year students at a Russian university. We analyze the network formation process from the moment the network is established until its stabilization. Using the Conditional Uniform Graph Test, we compare the graph-level indices of the observed network with random same-size networks that were generated according to random, preferential attachment, and small-world algorithms. We propose two explanatory mechanisms of online network growth: the connected component attachment mechanism and the brokerage mechanism.

Sofia Dokuka, Diliara Valeeva, Maria Yudkevich

Large-Scale Parallel Matching of Social Network Profiles

A profile matching algorithm takes as input a user profile of one social network and returns, if existing, the profile of the same person in another social network. Such methods have immediate applications in Internet marketing, search, security, and a number of other domains, which is why this topic saw a recent surge in popularity. In this paper, we present a user identity resolution approach that uses minimal supervision and achieves a precision of 0.98 at a recall of 0.54. Furthermore, the method is computationally efficient and easily parallelizable. We show that the method can be used to match Facebook, the most popular social network globally, with VKontakte, the most popular social network among Russian-speaking users.

Alexander Panchenko, Dmitry Babaev, Sergei Obiedkov

Identification of Autopoietic Communication Patterns in Social and Economic Networks

Communications form the basis for the functioning of social and economic systems. In every communication act, system agents exchange information, senses, money, services, industrial goods, energy, etc. The communications of economic agents form a network. One of the most important characteristics of a system is its ability to reproduce itself (autopoiesis), which is performed by circular communications in closed contours. The main goal of this work is to consider a technology for identifying autopoietic patterns in social and economic networks. A new approach to initial data collection for the evaluation of social communications is proposed. In this study, the data collected during the implementation of the “KOMPAS TQM” system was analyzed. SNA methods and instruments were used to reveal autopoietic patterns and analyze them.

Dmitry B. Berg, Olga M. Zvereva

Text Mining and Natural Language Processing

Frontmatter

A Heuristic Strategy for Extracting Terms from Scientific Texts

The paper describes a strategy that applies heuristics to combine sets of terminological words and word combinations pre-extracted from a scientific text by several term recognition procedures. Each procedure is based on a collection of lexico-syntactic patterns representing specific linguistic information about terms within scientific texts. Our strategy aims to improve the quality of automatic term extraction from a particular scientific text. The experiments have shown that the strategy gives an 11–17% increase in F-measure compared with commonly used methods of term extraction.

Elena I. Bolshakova, Natalia E. Efremova

Text Analysis with Enhanced Annotated Suffix Trees: Algorithms and Implementation

We present an improved implementation of the Annotated suffix tree method for text analysis (abbreviated as the AST-method). Annotated suffix trees are an extension of the original suffix tree data structure, with nodes labeled by occurrence frequencies for corresponding substrings in the input text collection. They have a range of interesting applications in text analysis, such as language-independent computation of a matching score for a keyphrase against some text collection. In our enhanced implementation, new algorithms and data structures (suffix arrays used instead of the traditional but heavyweight suffix trees) have enabled us to derive an implementation superior to the previous ones in terms of both memory consumption (10 times less memory) and runtime. We describe an open-source statistical text analysis software package, called “EAST”, which implements this enhanced annotated suffix tree method. Besides, the EAST package includes an adaptation of a distributional synonym extraction algorithm that supports the Russian language and allows us to achieve better results in keyphrase matching.

Mikhail Dubov

Morphological Analyzer and Generator for Russian and Ukrainian Languages

pymorphy2 is a morphological analyzer and generator for the Russian and Ukrainian languages. It uses large, efficiently encoded lexicons built from OpenCorpora and LanguageTool data. A set of linguistically motivated rules is developed to enable morphological analysis and generation of out-of-vocabulary words observed in real-world documents. For Russian, pymorphy2 provides state-of-the-art morphological analysis quality. The analyzer is implemented in the Python programming language with optional C++ extensions. Emphasis is put on ease of use, documentation and extensibility. The package is distributed under a permissive open-source license, encouraging its use in both academic and commercial settings.

Mikhail Korobov

Semantic Role Labeling for Russian Language Based on Russian FrameBank

Semantic Role Labeling (SRL) is one of the major research areas in today’s natural language processing. The task can be described as follows: given an input sentence that refers to some situation, find the participants of this situation in the text and assign them semantically motivated labels, or roles. Although the topic has become increasingly popular in the last decade, there have been only a few attempts to apply SRL to the Russian language. We present a supervised semantic role labeling system for Russian based on FrameBank, an actively developing Russian SRL resource analogous to FrameNet and PropBank.

Ilya Kuznetsov

Supervised Approach to Finding Most Frequent Senses in Russian

The paper describes a supervised approach for the detection of the most frequent sense on the basis of RuThes thesaurus, which is a large linguistic ontology for Russian. Due to the large number of monosemous multiword expressions and the set of RuThes relations it is possible to calculate several context features for ambiguous words and to study their contribution in a supervised model for detecting frequent senses.

Natalia Loukachevitch, Ilia Chetviorkin

FrameBank: A Database of Russian Lexical Constructions

Russian FrameBank is a bank of annotated samples from the Russian National Corpus which documents the use of lexical constructions (e.g. argument constructions of verbs and nouns). FrameBank belongs to FrameNet-oriented resources, but unlike Berkeley FrameNet it focuses more on the morphosyntactic and semantic features of individual lexemes rather than the generalized frames, following the theoretical approaches of Construction Grammar (C. Fillmore, A. Goldberg, etc.) and of Moscow Semantic School (J.D. Apresjan, E.V. Paducheva, etc.).

Olga Lyashevskaya, Egor Kashkin

TagBag: Annotating a Foreign Language Lexical Resource with Pictures

Such forms of art as photography or drawing may serve as a uniform language, which represents things that we can either see or imagine. Hence, it is reasonable to use such pictures in order to connect nouns of the natural languages by their meanings. In this paper a study of mapping noun images from an annotated collection to the word senses of a foreign language lexical resource through the usage of a bilingual dictionary has been conducted. In this study, the English-Russian dictionary by V.K. Mueller has been used to enhance the Yet Another RussNet synsets with Flickr photos.

Dmitry Ustalov

BigARTM: Open Source Library for Regularized Multimodal Topic Modeling of Large Collections

Probabilistic topic modeling of text collections is a powerful tool for statistical text analysis. In this paper we announce the BigARTM open source project (http://bigartm.org) for regularized multimodal topic modeling of large collections. Several experiments on the Wikipedia corpus show that BigARTM performs faster and gives better perplexity compared to other popular packages, such as Vowpal Wabbit and Gensim. We also demonstrate several unique BigARTM features, such as additive combination of regularizers, topic sparsing and decorrelation, and multimodal and multilanguage modeling, which are not available in the other software packages for topic modeling.

Konstantin Vorontsov, Oleksandr Frei, Murat Apishev, Peter Romov, Marina Dudarenko

Industry Talk

Frontmatter

ATM Service Cost Optimization Using Predictive Encashment Strategy

ATM cash flow management is a challenging task which involves both machine learning predictions and encashment planning. Banks employ these systems to optimize their costs and improve the overall device availability by reducing the number of device failures. Although cash flow prediction is a common task, the complete design of the cost optimization system is a complex design problem. In this article we present our complete encashment strategy methodology. We evaluate the proposed system design on real world data from one of the Russian banks. We show that one can effectively achieve an 18% cost reduction by employing such a strategy.

Vladislav Grozin, Alexey Natekin, Alois Knoll

Industry Papers

Frontmatter

Comparison of Deep Learning Libraries on the Problem of Handwritten Digit Classification

This paper presents a comparative analysis of several popular and freely available deep learning frameworks. We compare the functionality and usability of the frameworks by applying them to popular computer vision problems such as handwritten digit recognition. Four libraries have been chosen for the detailed study: Caffe, Pylearn2, Torch, and Theano. We give a brief description of these libraries, consider their key features and capabilities, and provide case studies. We also investigate the performance of the libraries. This study allows us to decide which deep learning framework suits us best and will be used in our future research.

Dmitry Kruchinin, Evgeny Dolotov, Kirill Kornyakov, Valentina Kustikova, Pavel Druzhkov

Methods of Localization of Some Anthropometric Features of Face

This paper considers a modified algorithm for localizing facial features, based on the Viola-Jones method, which is characterized by several classification stages. Experiments show an improvement in the method's performance, with 98% of localizations correct.

Svetlana Volkova

Ontological Representation of Networks for IDS in Cyber-Physical Systems

Cyber-Physical Systems (CPSs) combine information and communication technologies with means of controlling physical objects. Modern infrastructure objects such as electrical grids, smart cities, etc. are complex CPSs consisting of multiple interconnected software and hardware complexes. The software they contain requires development and support, and once updates are terminated it can become a target for malicious attacks. To prevent intrusion into networks of cyber-physical objects, one can use Intrusion Detection Systems (IDSs), which are widely used in existing non-cyber-physical networks. CPSs are characterized by formalization and determinacy, which allows a specification-based approach to be applied to IDS development. This paper is devoted to IDS development using an ontology-based representation of networks. This representation allows an IDS to be implemented both at the software level, by comparing the movement of network traffic with its model, and at the physical level, by controlling the connections of network devices. The ontological representation provides a network model that is used to create specifications for the IDS.

Vasily A. Sartakov
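
The specification-based idea in the abstract above, comparing observed network traffic with a model of what is permitted, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's method: the `Flow` type, the host names, the port numbers, and the `find_violations` helper are all hypothetical, and the paper derives its specifications from an ontology rather than a hard-coded set.

```python
from typing import Iterable, List, NamedTuple, Set


class Flow(NamedTuple):
    """A hypothetical, simplified description of one network flow."""
    src: str
    dst: str
    port: int


# Hypothetical network model: the flows that the specification permits.
ALLOWED_FLOWS: Set[Flow] = {
    Flow("sensor-1", "controller", 502),
    Flow("controller", "actuator-1", 502),
}


def find_violations(observed: Iterable[Flow],
                    allowed: Set[Flow] = ALLOWED_FLOWS) -> List[Flow]:
    """Return the observed flows that the specification does not permit."""
    return [flow for flow in observed if flow not in allowed]


# Example traffic: the first flow matches the model, the second does not.
observed = [
    Flow("sensor-1", "controller", 502),
    Flow("laptop", "actuator-1", 22),
]
violations = find_violations(observed)
```

Because a CPS network is largely deterministic, an allow-list of this kind can stay small, and anything outside it is reported as a deviation from the model.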

Determination of the Relative Position of Space Vehicles by Detection and Tracking of Natural Visual Features with the Existing TV-Cameras

During spacecraft maneuvers, especially at the rendezvous and docking stages, one of the most important tasks is to determine the relative positions of the vehicles. The current Russian “Course” and the recently proposed ATV/HTV docking systems are complex and require specific, cumbersome equipment to be mounted on the outer sides of both vehicles. The proposed TV-based docking control system uses the existing cameras, “natural” visible features of the ISS, and an ISS laptop to determine all six relative coordinates of the vehicles. At the training stage, the ISS 3D model and video recordings are used. This paper describes the algorithm flow and the problems of the passive, TV-only approach. The system's efficiency is tested against models, mockups, and recordings of previous rendezvous of “Progress”, “Soyuz”, and ATV spacecraft. The near-term goal of the system is to become an independent docking control system that helps the ground docking control team and the cosmonauts.

Dmitrii Stepanov, Aleksandr Bakhshiev, Dmitrii Gromoshinskii, Nikolai Kirpan, Filipp Gundelakh

Implementation of Agile Concepts in Recommender Systems for Data Processing and Analyses

Recommender systems have recently become an essential part of the majority of modern information systems. This paper considers recommender systems oriented toward supporting data and information processing and analysis. The systems aim to give end users recommendations on the selection and usage of data processing methods and algorithms. Both commonly used and newly developed algorithms are taken into account by the systems. The algorithms differ in their technological and software implementations. Agile features of the recommender systems allow the set of methods and algorithms in use to be continuously modified and enlarged. The key advantage of the systems is the possibility of testing new algorithms on real data processing tasks. An example of a recommender system for processing binary data streams is described.

Alexander Vodyaho, Nataly Zhukova

Backmatter

Title: Analysis of Images, Social Networks and Texts
Editors: Mikhail Yu. Khachay, Natalia Konstantinova, Alexander Panchenko, Dmitry Ignatov, Valeri G. Labunets
Publisher: Springer International Publishing
Electronic ISBN: 978-3-319-26123-2
Print ISBN: 978-3-319-26122-5
DOI: https://doi.org/10.1007/978-3-319-26123-2
