

Information and Software Technologies

24th International Conference, ICIST 2018, Vilnius, Lithuania, October 4–6, 2018, Proceedings


About this book

This book constitutes the refereed proceedings of the 24th International Conference on Information and Software Technologies, ICIST 2018, held in Vilnius, Lithuania, in October 2018.

The 48 papers presented were carefully reviewed and selected from 124 submissions. The papers are organized in topical sections on information systems; business intelligence for information and software systems; software engineering; and information technology applications.

Table of Contents

Frontmatter
Correction to: Application of SEO Metrics to Determine the Quality of Wikipedia Articles and Their Sources
Włodzimierz Lewoniewski, Ralf-Christian Härting, Krzysztof Węcel, Christopher Reichstein, Witold Abramowicz

Information Systems: Special Session on Innovative Applications for Knowledge Transfer Support

Frontmatter
An Information System Supporting the Eliciting of Expert Knowledge for Successful IT Projects

In order to guarantee the success of an IT project, a company must possess expert knowledge. The difficulty arises when experts no longer work for the company but their knowledge is still needed to realise an IT project. This paper presents ExKnowIT, an information system supporting the eliciting of expert knowledge for successful IT projects, which consists of the following modules: (1) identification of experts for successful IT projects, (2) eliciting of expert knowledge on completed IT projects, (3) an expert knowledge base on completed IT projects, (4) the Group Method of Data Handling (GMDH) algorithm, and (5) new knowledge supporting decisions on the selection of a manager for a new IT project. The added value of our system is that these three approaches, namely the elicitation of expert knowledge, the success of an IT project, and the discovery of new knowledge gleaned from the expert knowledge base (the decision model), complement each other.

Justyna Patalas-Maliszewska, Irene Krebs
Metis: A Scalable Natural-Language-Based Intelligent Personal Assistant for Maritime Services

We implement an intelligent personal conversational assistant that communicates in natural language and is designed specifically for the maritime industry. A multi-stage message analysis is performed: first the topic of the request is classified, and then special parsers are applied to extract the parameters needed to execute the task. Our system is scalable and robust, employing generic and efficient algorithms. Our contributions are manifold. First, we present a complex, multi-level natural-language-processing-based system, focused particularly on the maritime domain and incorporating expert knowledge of the field. Next, we introduce a series of algorithms that extract deep information using the syntactic structure of the message. Lastly, we implement and evaluate our approach, testing and proving our system's effectiveness and efficiency.
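
The paper's pipeline is not published as code; purely as an illustration of the two-stage idea (topic classification followed by topic-specific parameter parsing), here is a minimal Python sketch in which every name and rule is a hypothetical stand-in:

```python
import re

# Hypothetical stage 1: classify the topic of the request.
# A real system would use a trained classifier; keyword rules stand in here.
def classify_topic(message: str) -> str:
    text = message.lower()
    if "weather" in text:
        return "weather_report"
    if "eta" in text or "arrival" in text:
        return "port_eta"
    return "unknown"

# Hypothetical stage 2: a topic-specific parser extracts task parameters.
def parse_port_eta(message: str) -> dict:
    port = re.search(r"port of (\w+)", message, re.IGNORECASE)
    return {"port": port.group(1) if port else None}

PARSERS = {"port_eta": parse_port_eta}

def handle(message: str) -> dict:
    topic = classify_topic(message)             # stage 1
    parser = PARSERS.get(topic, lambda m: {})   # stage 2
    return {"topic": topic, "params": parser(message)}

print(handle("What is the ETA at the port of Piraeus?"))
```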

Nikolaos Gkanatsios, Konstantina Mermikli, Serafeim Katsikas
Knowledge Acquisition Using Computer Simulation of a Manufacturing System for Preventive Maintenance

Preventive maintenance is an important component of the Industry 4.0 concept. Modern industry requires intelligent, autonomous and reliable manufacturing systems. Should a manufacturing system fail, it should be able to reorganise itself and put into effect a production plan based on a scenario of selected activities. The problem is how to predict possible failures and prepare scenarios for the behaviour of the system. With regard to the preventive maintenance system, potential failures can be modelled and then, using computer simulation based on simulation experiments, a database of maintenance knowledge can be created. The paper proposes a methodology for the acquisition of maintenance knowledge using computer simulation. The illustrative example was prepared using Tecnomatix Plant Simulation software.

Sławomir Kłos

Information Systems: Special Session on e-Health and Special Session on Digital Transformation

Frontmatter
Rate Your Physician: Findings from a Lithuanian Physician Rating Website

Physician review websites are known around the world. Patients review the subjectively experienced quality of medical services supplied to them and publish an overall rating on the Internet, where quantitative grades and qualitative texts come together. On the one hand, these new possibilities reduce the imbalance of power between health care providers and patients, but on the other hand, they can also damage the usually very intimate relationship between health care providers and patients. Review websites must meet these demands with a high level of responsibility and service quality. In this paper, we look at the situation in Lithuania: in particular, we are interested in the available possibilities of evaluation and interaction, and in the quality of a particular review website measured against the available data. We thereby identify quality weaknesses and lay the foundation for future research.

Frederik S. Bäumer, Joschka Kersting, Vytautas Kuršelis, Michaela Geierhos
Design of an Operator-Controller Based Distributed Robotic System

Modern robots often use more than one processing unit to meet the requirements of robotics. Mobile robots are increasingly designed in a modular manner so that they can be extended for future tasks. The use of multiple processing units leads to a distributed system within a single robot, making the (software) architecture even more important than in single-computer robots. The presented DAEbot was designed to implement the Operator-Controller Module (OCM) on a mobile robot. The OCM, which has been used in other technical systems, splits the system hierarchically into controllers and operator(s). The controllers interact directly with all sensors and actuators within the system; for that reason, hard real-time constraints must be met. The operator, however, processes the information of the controllers, which can be done by model-based principles using state machines. This paper describes the design of the autonomous mobile DAEbot robot, focusing on its architecture with three controllers and two operators as well as the internal communication framework. Furthermore, the simulation capabilities of the system behaviour and some safety features are shown.

Uwe Jahn, Carsten Wolff, Peter Schulz

Information Systems: Special Session on Information and Software Technologies for Intelligent Power Systems

Frontmatter
Deployment of Battery Swapping Stations for Unmanned Aerial Vehicles Subject to Cyclic Production Flow Constraints

Given is a production system in which material handling operations are carried out by a fleet of UAVs. For this case of cyclic multi-product batch production flow, a problem has been formulated that combines split-delivery vehicle routing with time windows and the deployment of battery swapping depots. It is assumed that the times of execution of pickup and delivery operations are known. During these operations, workpieces following different production routes reach and leave workstations cyclically. Given are the number of battery swapping depots and their potential arrangement, as well as the rate of power consumption of a UAV hovering or flying at a constant speed and during take-off and landing. The goal is to find the number of UAVs and the routes they fly to serve all the workstations periodically, within a given takt time, without violating the constraints imposed by the due-time pickup/delivery operations and the collision-free movement of UAVs. A declarative model of the analysed case allows the problem under consideration to be viewed as a constraint satisfaction problem and solved in the Oz Mozart programming environment.

G. Bocewicz, P. Nielsen, Z. Banaszak, A. Thibbotuwawa
Integral Fuzzy Power Quality Assessment for Decision Support System at Management of Power Network with Distributed Generation

This paper is devoted to developing the scientific and methodological foundations for improving information support for decision making in the management of power networks with distributed generation. The power quality index is proposed as the main management criterion. Using the theory of fuzzy sets, the conformity of power quality indicators to electric energy quality limits is assessed. A method for estimating the quality of electrical energy is proposed that represents the measured histogram as a fuzzy quality indicator in the form of fuzzy sets with a step membership function. A method for the integral evaluation of electrical energy quality for different types of load is also developed. The presented method allows rules to be formulated for managing the operating modes of the distributed electrical network by the decision support system.
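
The paper's exact formulation is not reproduced here; as a minimal illustration of a step membership function built from a measured histogram (all bin edges and counts below are invented, not measurements from the paper), one might write:

```python
import numpy as np

# A step membership function for a power quality indicator: the measured
# histogram of, e.g., voltage deviation is reinterpreted as a fuzzy set
# whose membership is constant over each histogram bin.
edges = np.array([-10, -5, 0, 5, 10])        # deviation bins, % of nominal
counts = np.array([5, 40, 45, 10])           # measured histogram (toy data)
mu = counts / counts.max()                   # step membership per bin

def membership(x):
    i = np.searchsorted(edges, x, side="right") - 1
    return float(mu[i]) if 0 <= i < len(mu) else 0.0

print(membership(-2.0), membership(7.0))     # memberships of two readings
```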

Sergii Tymchuk, Oleksandr Miroshnyk, Sergii Shendryk, Vira Shendryk
Development of Models for Computer Systems of Processing Information and Control for Tasks of Ergonomic Improvements

The search for ergonomic reserves of efficiency in computer systems for information processing and control is considered. A set of models of such a computer system, describing it in the necessary cross-sections, was developed. The results can be useful in the design of information provision for decision support systems devoted to the ergonomic quality of automated systems.

Evgeniy Lavrov, Nadiia Pasko

Business Intelligence for Information and Software Systems: Special Session on Intelligent Methods for Data Analysis and Computer-Aided Software Engineering

Frontmatter
Adaptive Resource Provisioning and Auto-scaling for Cloud Native Software

Cloud-native applications (CNA) are developed to run on the cloud in a way that enables them to fully exploit the characteristics of cloud computing. These applications are strongly dependent on automated machinery (i.e. auto-scaling engines, schedulers and cloud resource provisioning software), which enables elasticity and auto-healing. These features improve application availability and resource utilization efficiency and help minimize performance-related SLA violations. This work provides a generic architecture of a software system that enables the elasticity of cloud-native software through automated scaling and resource provisioning. The architecture is based on an analysis of previous work presented by practitioners and academia, and it is cloud platform and vendor agnostic.

Olesia Pozdniakova, Dalius Mažeika, Aurimas Cholomskis
Cloud Software Performance Metrics Collection and Aggregation for Auto-Scaling Module

Cloud computing has made a big impact on the evolution of software architecture. The demand to serve multiple tenants, to include continuous delivery practices in the development process, and to handle increased system load has influenced the style of cloud-based software architecture. Microservice architecture is the preferred architecture, despite its complexity, when scalability is an essential attribute of quality of service. Microservices should be managed, i.e., hardware resources should be adjusted based on application load, and resiliency should be ensured. Popular IaaS and PaaS providers such as Amazon, Azure or OpenStack ensure auto-scaling and elasticity at the infrastructure level. This approach has the following limitations: (1) scaling and resiliency are part of the infrastructure and do not emerge from the application's nature; (2) the software is locked in with a specific vendor; (3) it might be difficult to ensure smooth scalability when running software on different vendors at the same time. We are creating an auto-scaling module for microservice-based applications. Collecting metrics at both the infrastructure and application levels is one important task for auto-scaling. We have chosen to investigate the ELK stack and build an appropriate architecture around it.

Aurimas Cholomskis, Olesia Pozdniakova, Dalius Mažeika
Application of SEO Metrics to Determine the Quality of Wikipedia Articles and Their Sources

The leading online encyclopedia Wikipedia is struggling with inconsistent article quality caused by the collaborative editing model. While one can find many helpful articles with consistent information on Wikipedia, there are also a lot of questionable articles with unclear or unfinished information. The quality of each article may vary over time as different users repeatedly re-edit content. One of the most important elements of Wikipedia articles is the references, which allow readers to verify content and see its sources. Since most of these references are web pages, more information about their quality can be obtained using citation analysis tools. For science and practice, empirical proof of the quality of Wikipedia articles could have a further signal effect, as the citation of Wikipedia articles, especially in scientific practice, is not yet accepted. This paper presents general results of a Wikipedia analysis using metrics from the SISTRIX Toolbox, one of the leading providers of indicators for Search Engine Optimization (SEO). In addition to a preliminary analysis of Wikipedia articles as separate web pages, we extracted data from more than 30 million references in different language versions of Wikipedia and analyzed over 180 thousand of the most popular hosts. In addition, we compared the same sources from different geographical perspectives using country-specific visibility indices.

Włodzimierz Lewoniewski, Ralf-Christian Härting, Krzysztof Węcel, Christopher Reichstein, Witold Abramowicz
How to Describe Basic Urban Pattern in Geographic Information Systems

Spatial patterns play an important role in the Spatial Data Analysis performed by Geographic Information Systems. This paper presents the analysis of the urban pattern description in the form of UML class diagrams covering the aspects of the hierarchy and generalization of patterns and metapatterns. In addition, the data model for keeping the 3D geometric and topographic data of the urban pattern is reviewed. Subsequently, the article presents a survey of the methods and solutions of spatial analysis, concentrating on the methods based on space syntax, which could be used in further research and computerization of the methodology of urban patterns for Geographic Information Systems.

Indraja E. Germanaitė, Rimantas Butleris, Kęstutis Zaleckis
Smart Deployment of Virtual Machines to Reduce Energy Consumption of Cloud Computing Based Data Centers Using Gray Wolf Optimizer

The growth in demand for cloud computing resources at massive data centers has led to high energy consumption and, consequently, increased operating costs. Integration of cloud resources makes it possible to migrate load from overloaded data centers to qualified ones, to release idle nodes, and to reduce virtual machine migrations. One of the most important challenges is choosing the method of placing migrating virtual machines onto nodes. Therefore, in this paper, a solution is proposed to reduce energy consumption in cloud data centers, in which the gray wolf optimizer is used to assign each virtual machine to an appropriate node. The methodology was simulated with the Claudios software. The results of the simulation indicate a decrease in the number of virtual machine migrations, an increase in the efficiency of migration, and a reduction in energy consumption.
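
The authors' implementation is not available here; the sketch below shows only the core gray wolf optimizer update on a generic continuous objective (a stand-in for an energy model; in the paper's setting the position would encode a VM-to-node assignment):

```python
import numpy as np

def objective(x):
    return np.sum(x ** 2)  # stand-in for an energy-consumption model

def gwo(obj, dim=4, n_wolves=10, iters=50, lo=-5.0, hi=5.0, seed=0):
    rng = np.random.default_rng(seed)
    wolves = rng.uniform(lo, hi, (n_wolves, dim))
    for t in range(iters):
        order = np.argsort([obj(w) for w in wolves])
        alpha, beta, delta = wolves[order[:3]]   # three best wolves lead
        a = 2 - 2 * t / iters                    # a decreases linearly 2 -> 0
        for i in range(n_wolves):
            x = np.zeros(dim)
            for leader in (alpha, beta, delta):
                r1, r2 = rng.random(dim), rng.random(dim)
                A, C = 2 * a * r1 - a, 2 * r2
                x += leader - A * np.abs(C * leader - wolves[i])
            wolves[i] = np.clip(x / 3, lo, hi)   # average of the three pulls
    best = min(wolves, key=obj)
    return best, obj(best)

pos, val = gwo(objective)
print(f"best value: {val:.6f}")
```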

Hossein Shahbazi, Sepideh Jamshidi-Nejad
Problem Domain Knowledge Driven Generation of UML Models

The main scope of the article is to show how significant the quality of the problem domain information stored in an Enterprise Model (EM) is for the process of generating Unified Modelling Language (UML) models from the Enterprise Model. The generation process is explained by a top-level transformation algorithm, which is presented in detail through a step-by-step description. The importance of information quality and completeness is illustrated by the example of the Business Rule element stored in the Enterprise Model and its significance for different UML models.

Ilona Veitaite, Audrius Lopata
Visualization for Open Access – A Case Study of Karlstad University and the University of Makerere in Uganda

Open Access (OA) research provides a platform for worldwide knowledge sharing, but the channels and possibilities are still limited, and it is not always clearly defined how the process of publishing should take place. Information systems architectures are intrinsically complex engineering products that can be defined graphically on various levels of abstraction and represented using different aspects of the system. For that reason, enterprise architecture is not easy to comprehend for the different actors involved. Graphical representations of different business scenarios are critical for understanding how different aspects of the system descriptions relate to each other and how requirements from different perspectives are perceived as a whole. The goal of this paper is to introduce a modelling method for visualizing the process of publishing BSc theses in institutional repositories. Universities need to investigate undergraduate students' publication of their graduate work (BSc theses) to promote knowledge of university repositories. Two different modelling methods were used to visualize the publishing process, and two case studies were made at two different universities. The results indicated that the SICM method had more semantic power to visualize the business process in a more comprehensive way.

Prima Gustiené
A Model-Driven Approach for Access Control in Internet of Things (IoT) Applications – An Introduction to UMLOA

The Internet of Things (IoT) is a collection of billions of devices attached to the internet that collect and exchange data using nodes, sensors, and controllers. The world is continuously shifting from traditional approaches to IoT technology in order to meet the demands of modern technological advancement. However, the selection and implementation of the right access control method in IoT applications is always challenging. In this context, OAuth is a renowned access control protocol for IoT applications, but providing access control through OAuth is difficult due to its implementation complexity. Therefore, there is a strong need for a model-based approach that provides a simple access control mechanism in IoT applications while preserving the major OAuth features. This article introduces a Unified Modeling Language profile for OAuth (UMLOA) to model the access control requirements of IoT applications. In particular, UMLOA is capable of modeling confidentiality, integrity, availability, scalability, and interoperability requirements in IoT applications. This provides the basis for transforming UMLOA source models into different target models (e.g. iFogSim) for early verification of access control requirements. The applicability of UMLOA is validated through an intelligent shipping container case study.

Mehreen Khan, Muhammad Waseem Anwar, Farooque Azam, Fatima Samea, Muhammad Fahad Shinwari
A Workflow-Based Large-Scale Patent Mining and Analytics Framework

The analysis of large volumes of complex scientific information such as patents requires new methods and a flexible, highly interactive and easy-to-use platform in order to enable a variety of applications, ranging from information search and semantic analysis to specific text and data mining tasks for information professionals in industry and research. In this paper, we present a scalable patent analytics framework built on top of a big-data architecture and a scientific workflow system. The framework allows essential services for patent analysis to be seamlessly integrated, employing natural language processing as well as machine learning algorithms to deeply structure and semantically annotate patent texts and to realize complex scientific workflows. In two case studies we show how the framework can be used for querying, annotating and analyzing large amounts of patent data.

Mustafa Sofean, Hidir Aras, Ahmad Alrifai

Software Engineering: Special Session on Intelligent Systems and Software Engineering Advances

Frontmatter
SVM Accuracy and Training Speed Trade-Off in Sentiment Analysis Tasks

The SVM technique is one of the best techniques for classifying data, but it performs slowly on big data arrays. This paper introduces a method to improve the speed of SVM classification in sentiment analysis by reducing the training set. The method was tested on the Stanford Twitter sentiment corpus and an Amazon customer reviews dataset. The results show that the introduced method outperforms the standard SVM classification method in execution time.
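
The paper's reduction procedure is not reproduced here; one common way to realize the idea of shrinking the training set before fitting the SVM is to keep only per-class cluster centroids, as in this hedged scikit-learn sketch (all parameters illustrative):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=5000, n_features=20, random_state=0)

# Reduce each class to k cluster centroids before training the SVM.
def reduce_per_class(X, y, k=100):
    Xr, yr = [], []
    for label in np.unique(y):
        km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X[y == label])
        Xr.append(km.cluster_centers_)
        yr.extend([label] * k)
    return np.vstack(Xr), np.array(yr)

Xr, yr = reduce_per_class(X, y)
clf = SVC(kernel="rbf").fit(Xr, yr)       # trains on 200 points, not 5000
print("accuracy on full set:", clf.score(X, y))
```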

Konstantinas Korovkinas, Paulius Danėnas, Gintautas Garšva
J48S: A Sequence Classification Approach to Text Analysis Based on Decision Trees

Sequences play a major role in the extraction of information from data. As an example, in business intelligence, they can be used to track the evolution of customer behaviors over time or to model relevant relationships. In this paper, we focus our attention on the domain of contact centers, where sequential data typically take the form of oral or written interactions, and word sequences often play a major role in text classification, and we investigate the connections between sequential data and text mining techniques. The main contribution of the paper is a new machine learning algorithm, called J48S, that associates semantic knowledge with telephone conversations. The proposed solution is based on the well-known C4.5 decision tree learner, and it is natively able to mix static, that is, numeric or categorical, data and sequential ones, such as texts, for classification purposes. The algorithm, evaluated in a real business setting, is shown to provide competitive classification performances compared with classical approaches, while generating highly interpretable models and effectively reducing the data preparation effort.

Andrea Brunello, Enrico Marzano, Angelo Montanari, Guido Sciavicco
TabbyPDF: Web-Based System for PDF Table Extraction

PDF is one of the most widespread formats for representing non-editable documents. Many PDF documents are machine-readable but remain untagged: they have no tags identifying layout items such as paragraphs, columns, or tables. One of the important challenges with these documents is how to extract tabular data from them. The paper presents a novel web-based system for extracting tables located in untagged PDF documents with a complex layout, recovering their cell structures, and exporting them into a tagged form (e.g. CSV or HTML format). The system uses a heuristic-based approach to table detection and structure recognition. It mainly relies on recovering the human reading order of the text, including document paragraphs and table cells. A prototype of the system was evaluated using the methodology and dataset of the “ICDAR 2013 Table Competition”. The standard F-score metric is 93.64% for the structure recognition phase and 83.18% for table extraction with automatic table detection. The results are comparable with state-of-the-art academic solutions.

Alexey Shigarov, Andrey Altaev, Andrey Mikhailov, Viacheslav Paramonov, Evgeniy Cherkashin
Modification of Parallelization for Fast Sort Algorithm

One of the most important issues in NoSQL databases is the development of applications and facilities for parallel processing in information systems. In this work, the author presents some improvements to the parallel algorithm for merging strings and uses this algorithm to sort large data sets. Tests of sorting with the parallel merging algorithm confirm the reduced time complexity and improved stability of the algorithm.
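
The author's algorithm itself is not shown in the abstract; as a generic illustration of the sort-then-merge parallel pattern it improves on, consider this minimal Python sketch:

```python
import heapq
import random
from multiprocessing import Pool

# Minimal parallel sort sketch: sort chunks in worker processes,
# then perform a k-way merge of the sorted runs.
def sort_chunk(chunk):
    return sorted(chunk)

def parallel_sort(data, workers=4):
    size = (len(data) + workers - 1) // workers
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with Pool(workers) as pool:
        runs = pool.map(sort_chunk, chunks)   # sort chunks in parallel
    return list(heapq.merge(*runs))           # merge the sorted runs

if __name__ == "__main__":
    data = [random.randint(0, 10**6) for _ in range(100_000)]
    assert parallel_sort(data) == sorted(data)
    print("parallel merge sort OK")
```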

Zbigniew Marszałek
Text Semantics and Layout Defects Detection in Android Apps Using Dynamic Execution and Screenshot Analysis

The paper presents a classification of text defects. It provides a list of user interface text defects and a method based on static/dynamic code analysis for detecting such defects in Android applications. The paper proposes a list of static analysis rules for detecting each defect and a tool model implementing those rules. The method and the tool are based on the use of multiple Android application emulators, executing the application along certain execution paths on multiple hardware and software configurations while taking application screenshots. The defects are identified by running the analysis rules on each screenshot and searching for defect patterns. The results are demonstrated by testing a sample Android application.

Šarūnas Packevičius, Dominykas Barisas, Andrej Ušaniov, Evaldas Guogis, Eduardas Bareiša
The Impact of the Cost Function on the Operation of the Intelligent Agent in 2D Games

A large part of technology development depends on the needs of users. Apart from the programs used by large companies or smaller groups, games and graphics are among the most widespread applications and the heaviest hardware loads. Increasing the quality of games by improving their stories requires many more efficient and effective algorithms. In this work, we propose a hybrid approach to the management of opponents' movements in the classic two-dimensional game Tron. Our solution is based on the idea of a simulated annealing algorithm used to select the agent's movement technique depending on the cost function. The algorithm was implemented and tested with different parameter settings. The obtained results are discussed in terms of the advantages and disadvantages of using this type of solution in more complex games.
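
As a purely illustrative sketch of the underlying idea (simulated annealing over candidate moves, with an invented Manhattan-distance cost function rather than the paper's):

```python
import math
import random

# Hypothetical cost: distance of the agent to a target on a Tron-style grid.
def cost(pos, target):
    return abs(pos[0] - target[0]) + abs(pos[1] - target[1])

def anneal_move(pos, target, temp):
    moves = [(0, 1), (0, -1), (1, 0), (-1, 0)]   # grid moves
    candidate = random.choice(moves)
    new_pos = (pos[0] + candidate[0], pos[1] + candidate[1])
    delta = cost(new_pos, target) - cost(pos, target)
    # Accept improvements always; worse moves with probability e^(-delta/T).
    if delta < 0 or random.random() < math.exp(-delta / temp):
        return new_pos
    return pos

pos, target, temp = (0, 0), (8, 5), 10.0
for step in range(200):
    pos = anneal_move(pos, target, temp)
    temp = max(0.1, temp * 0.97)                 # cooling schedule
print("final position:", pos)
```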

Dawid Połap, Marcin Woźniak
The Research on Method of Prediction Mine Earthquake Based on the Information Entropy Principle

Mine earthquake prediction is investigated using the information entropy principle, which shows that the magnitude distribution model does not conform to the G-R or fractal index models, and the reason that mine earthquake magnitudes obey a certain probability distribution is explained. It is proposed to calculate the corresponding information entropy from existing mine earthquake measurements and to forecast the occurrence of mine earthquakes from the calculated entropy: a mine earthquake occurs more easily when the entropy decreases. The forecasting method is tested on mine earthquake monitoring data, and the results show that the method is feasible. Our results provide an effective method for statistical modeling of the mine earthquake distribution and for information entropy prediction of mine earthquakes.
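
The paper's distribution model is not reproduced here; the entropy signal itself reduces to the Shannon entropy of the observed magnitude distribution, which a minimal sketch (with synthetic magnitudes) can illustrate:

```python
import numpy as np

# Bin observed magnitudes, estimate the probability of each magnitude class,
# and compute the Shannon entropy H = -sum(p_i * log(p_i)); per the paper,
# a drop in H signals a higher risk of a mine earthquake.
def magnitude_entropy(magnitudes, bins=10):
    counts, _ = np.histogram(magnitudes, bins=bins)
    p = counts[counts > 0] / counts.sum()
    return -np.sum(p * np.log(p))

rng = np.random.default_rng(0)
window_a = rng.normal(2.0, 0.8, 500)   # broad magnitude distribution
window_b = rng.normal(2.0, 0.2, 500)   # concentrated distribution, lower H
print(magnitude_entropy(window_a), magnitude_entropy(window_b))
```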

Baoxin Jia, Linli Zhou, Yishan Pan, Chengfang Yang, Zhiyong Li
Automated Design Thinking Oriented on Innovatively Personified Projects

The paper presents a version of automating the design thinking approach as applied by the inventor of an innovation in moving from an innovative intention to a tested prototype of a possible project. Such a case arises when the inventor wants to evaluate the conceived value without intermediaries before deciding whether to objectify this value through the design process. The suggested way of automating is based on question-answer interactions between the inventor and the accessible experience, registration of the verbal traces of these interactions in semantic memory, and processing of the traces to achieve architectural and cause-and-effect understanding with the use of tested prototypes. The enumerated actions are implemented in the specialized toolkit OwnWIQA, which is oriented towards innovatively personified projects.

Petr Sosnin
A Comparison of Concept and Global Probabilistic Approximations Based on Mining Incomplete Data

We discuss incomplete data sets with two interpretations of missing attribute values: lost values and “do not care” conditions. For data mining we use two probabilistic approximations, concept and global. Concept probabilistic approximations are well known, while global probabilistic approximations are introduced in this paper. The rationale for introducing global probabilistic approximations is a common opinion in the rough set community that global probabilistic approximations, being closer to the approximated concept, should be more successful. Surprisingly, the results of our experiments show that the error rate, evaluated by ten-fold cross-validation, is smaller for concept probabilistic approximations than for global probabilistic approximations.

Patrick G. Clark, Cheng Gao, Jerzy W. Grzymala-Busse, Teresa Mroczek, Rafal Niemiec
A Study in Granular Computing: Homogenous Granulation

This paper presents a new method of decision system granulation in the family of methods inspired by Polkowski's standard granulation algorithm. The new method is called homogenous granulation. The idea is to create the granules around each training object separately by selecting the smallest r-indiscernibility ratio at which the granule consists of a group of objects of the same class. This is a natural idea, where the indiscernibility level is extended until the indiscernibility class contains a uniform group of objects. After the granulation process, we used random choice to cover the universe of objects and majority voting to create granular reflections of the selected granules. The main advantage of this method is that there is no need to estimate an optimal granulation radius. We performed experiments on data from the UCI repository using five runs of 5-fold cross-validation. The first results of homogenous granulation, in terms of classification accuracy, are comparable with those of previously presented algorithms, with a significant reduction of training data size after granulation.

Krzysztof Ropiak, Piotr Artiemjew
Detection of Dental Filling Using Pixels Color Recognition

A dental filling is a material used in stomatology to fill a cavity in a tooth formed after the treatment of caries or as a result of mechanical or other damage. In this article we show that dental fillings can be detected using the pixel colors of a tooth image, evaluating the size of the filling and the filling gap. We present an algorithm that analyzes the size of a dental filling and its gap. The presented results show that the developed method can find differences between various types of teeth. We also use Student's t-test for dependent variables, which helps to decide whether there is a difference between different types of teeth.

Oleksandra Osadcha, Agata Trzcionka, Katarzyna Pachońska, Marek Pachoński
The Use of an Artificial Neural Network for a Sea Bottom Modelling

Currently, data are often acquired using various remote sensing sensors and systems, which produce big data sets. One important product is digital models of geographical surfaces, including the sea bottom surface. To improve their processing, visualization and management, a reduction of data points is often necessary. The paper presents research on the application of neural networks to the reduction of bathymetric geodata. The research considers radial basis function networks, the single-layer perceptron and the self-organizing Kohonen network. The results show that, in reconstructions of the sea bottom model, a neural network with a smaller number of hidden neurons can replace the original data set, while the Kohonen network can be used for clustering during the reduction of big geodata. A practical implementation of a neural network for the creation of surface models and the reduction of bathymetric data is presented.

Jacek Lubczonek, Marta Wlodarczyk-Sielicka
Application of an Ant Colony Optimization Algorithm in Modeling the Heat Transfer in Porous Aluminum

In this paper, a procedure for solving an inverse heat conduction problem with a fractional derivative is presented. The authors present a time-fractional heat conduction model with the Caputo derivative and Neumann and Robin boundary conditions, which can be applied to describe the process of heat conduction in porous media. Based on temperature measurements, a functional describing the error of the approximate solution is created, and the inverse problem is transformed into finding the minimum of this functional. To solve the inverse problem (find the unknown parameters of the model), the authors apply an Ant Colony Optimization (ACO) algorithm. Finally, an experiment with data from porous aluminum was carried out to check the effectiveness of the proposed algorithm. The goal of this paper is to reconstruct the unknown parameters in a heat conduction model with a fractional derivative and to show that ACO is an effective algorithm that works well on this type of problem.

Rafał Brociek, Damian Słota, Mariusz Król, Grzegorz Matula, Waldemar Kwaśny
Application of the Taylor Transformation to the Systems of Ordinary Differential Equations

In the paper, the Taylor transformation is applied to systems of ordinary differential equations, including nonlinear differential equations. Apart from a description of the method, its computational effectiveness is demonstrated on an example. The efficiency of the proposed method is confirmed by comparing it with selected classical methods devoted to problems of the considered kind. The present paper is an introduction to further research in this area, which is important for the wide range of problems described by means of systems of ordinary differential equations.
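
As a one-example illustration of the Taylor (differential) transform idea, not the paper's implementation: the transform Y(k) = y^(k)(0)/k! turns y' = y, y(0) = 1 into the recurrence (k+1)·Y(k+1) = Y(k), whose solution reproduces the Taylor coefficients of e^x:

```python
import math

N = 12                       # truncation order of the series
Y = [0.0] * (N + 1)
Y[0] = 1.0                   # initial condition y(0) = 1
for k in range(N):
    Y[k + 1] = Y[k] / (k + 1)   # (k + 1) * Y(k + 1) = Y(k)

x = 0.5
approx = sum(Y[k] * x ** k for k in range(N + 1))
print(approx, math.exp(x))   # truncated series vs. the exact solution
```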

Radosław Grzymkowski, Mariusz Pleszczyński
An Introduction to UMLPDSV for Real-Time Dynamic Signature Verification

Signatures are one of the most important behavioral biometric features used to recognize an individual's identity. Handwritten signatures are captured as actual input signals written by the user on electronic devices. The divergent writing patterns of individuals, primarily due to variation in style, shape and steadiness, create real-time challenges in differentiating real signatures from fake ones. To overcome this challenge of signature recognition, this article introduces a model-driven approach for dynamic signature verification. In particular, UMLPDSV (a Unified Modeling Language Profile for Dynamic Signature Verification) is proposed to specify the signature verification requirements at a high abstraction level. This provides the basis for automatically generating target models for different machine learning tools (e.g. a RapidMiner process, Matlab code, etc.) to perform dynamic signature verification. The applicability of UMLPDSV has been validated through an internet banking case study.

Fatima Samea, Muhammad Waseem Anwar, Farooque Azam, Mehreen Khan, Muhammad Fahad Shinwari
Minimization of Power Pulsations in Traction Supply – Application of Ant Colony Algorithm

In this paper, a particular problem related to the transformation of the AC voltage into the DC voltage used in tram supply is considered. A variable component is always present in rectified voltage, and the pulsation of the rectified voltage is influenced by different factors. In a 12-pulse system, where two secondary transformer windings are used (one delta-connected, the other star-connected), an additional factor increasing the pulsation is the unbalance of the output voltages of these windings. A tap changer may be used to compensate (equalize) those voltages, and its setting is optimized here by applying an ant colony algorithm. Different supply voltage variants have been considered, with particular attention paid to distorted voltage containing the 5th and 7th harmonics. The effects of applying the ACO algorithm are demonstrated.

Barbara Kulesz, Andrzej Sikora, Adam Zielonka
Model Driven Architecture Implementation Using Linked Data

We consider tools for developing information systems with the use of Model Driven Architecture (MDA) and Linked Open Data (LOD) technologies. The original idea of LOD is to allow software designers to develop program systems integrated by means of common ontologies and web protocols. The MDA Platform Independent Model (PIM) is expressed as a set of UML diagrams. The PIM forms a LOD graph and its namespace. All the PIM entities are defined as ontology resources, i.e. with URI references to LOD terms. This allows us to translate the PIM UML model into a set of triples and store them in an ontology warehouse for further transformation into a Platform Specific Model (PSM). The ClioPatria ontology server and the SWI-Prolog language are used as tools for PIM and PSM storage, querying and processing. The tools will allow us to combine the static MDA means of code generation and configuration at development time with techniques of flexible data structure processing at run time, thus producing even more productive information system development and maintenance techniques. This research corresponds to the present-day direction of Semantic Web Software Engineering.

Evgeny Cherkashin, Alexey Kopaygorodsky, Ljubica Kazi, Alexey Shigarov, Viacheslav Paramonov
Relationship Between Cohesion and Coupling Metrics for Object-Oriented Systems

Cohesion and coupling are regarded as fundamental features of the internal quality of object-oriented systems (OOS). Analyzing the relationships between cohesion and coupling metrics plays a significant role in developing efficient techniques for determining the external quality of an object-oriented system. Researchers have proposed several metrics to measure cohesion and coupling in object-oriented systems; however, few have analyzed the relationship between cohesion and coupling. This paper empirically investigates the relationships among several cohesion and coupling metrics in object-oriented systems and attempts to find mutual relationships between those metrics by statistically analyzing the results of experiments. Three open-source Java systems were used for experimentation. The empirical study shows that cohesion and coupling metrics are inversely correlated.

Samuel António Miquirice, Raul Sidnei Wazlawick
Data Analysis Algorithm for Click Fraud Recognition

This paper presents an analytical system designed to detect click fraud on the Internet. The algorithm works with the data collected from an advertiser’s website to which the Pay-Per-Click traffic is directed. This traffic is not entirely carried out by humans, as a large part of it is carried out by bots – software running automated tasks. The purpose of the proposed algorithm is to analyze the data of individual clicks coming from advertisements and to automatically classify them as suspicious or correct. The paper presents the mechanisms of comparing different types of data, their classification and the tuning of particular elements of the algorithm. Results of the experimental research confirming the effectiveness of the proposed methods are also presented.

Marcin Gabryel

Information Technology Applications: Special Session on Smart e-Learning Technologies and Applications

Frontmatter
Competence Management in Teacher Assignment Planning

In selecting teachers for courses, effort is always made to ensure the sound use of competences to achieve the desired instructional quality under the assumed cost conditions. The present study is a review of the state of the art in research on competence management problems, in particular the Teacher Assignment Problem (TAP). The article focuses on methods of modelling the TAP, competence models, and the level of competence as a function of time.

Eryk Szwarc, Irena Bach-Dąbrowska, Grzegorz Bocewicz
WBT-Master - Modern Learning Management System

An effective learning management system (LMS) is a vital component of the overall e-learning infrastructure in universities nowadays. A learning management system can be seen as a structured repository of courseware materials provided with additional communication facilities such as discussion forums, annotations, chats, etc. In this paper, we present innovative features of a modern LMS called WBT-Master; these features actually justify the development of yet another LMS. We describe the architecture of the system, the data structuring paradigm, the interface to popular cloud services, the social computing component and the human-computer interface solutions. We also discuss advantages of the implementation, problems that we experienced, and first results of actual usage of the LMS. The practical value of this paper lies in the possible reproduction and further development of the presented technical solutions in other LMSs.

Nikolai Scerbakov, Frank Kappe
The Mobile Application Based on Augmented Reality for Learning STEM Subjects

The app stores are full of programs based on augmented reality. Many studies have shown that augmented reality offers considerable benefits for users' ability to learn new things and for increasing their motivation. However, the majority of these programs are dedicated to entertainment, and only a few are designed for learning. The authors have developed an augmented reality app that provides scientific formulas for math, physics and chemistry and thereby eases exercise solving. The target group is K12 learners in school. The app was uploaded to the app stores for both the iOS and Android operating systems. Further research has to be done on the impact of this type of program. If the impact is positive, the authors suggest improving the app by adding more specific topics.

Tomas Valatkevičius, Andrius Paulauskas, Tomas Blažauskas, Reda Bartkutė
The Ways of Using Augmented Reality in Education

As the improvement of technologies is fast and widespread, it should be used to its best advantage. Emerging technologies like augmented reality are used to play games, while at the same time they can be highly motivating educational tools. However, augmented reality tools are not widespread in schools or higher education institutions; only a few apps are designed for educational purposes. The problem is that there is no single effective model that would help increase the efficiency of the learning process, and the gap between the need for different ways of using augmented reality (AR) in education and the different subjects still exists. The authors provide a model of AR for effective learning processes.

Daina Gudonienė, Tomas Blažauskas
Experience with Distance Learning of Informatics

Information and communication technologies are developing very rapidly, and the teaching of informatics must respond to this rapid development. Training of ICT staff likewise requires flexibility and acceptance of new facts; there is a need to educate oneself and expand one's knowledge and skills continually. The traditional system of full-time education is often only suitable for young students; those who are already employed prefer distance learning or a combined form of education. This article draws on twenty years of distance learning experience in Bachelor and Master of Science programmes in Informatics. The findings confirm that distance learning is more demanding for students than full-time learning: it requires greater motivation and the ability to manage time for self-study. The article compares the results and success rates of attendance and distance students and also describes possibilities for improving the quality of distance learning. Long-term findings show that distance students have significantly lower scores in their first years of study than full-time bachelor's students; in the following years, the differences diminish, and the student results are sometimes comparable.

Rostislav Fojtik

Information Technology Applications: Special Session on Language Technologies

Frontmatter
Word and Phrase Dictionaries Generated with Multiple Translation Paths

Methods used to learn bilingual word embedding mappings, which project the source-language embeddings into the target embedding space, are compared in this paper. Orthogonal transformations, which are robust to noise, can learn to translate between word pairs they have never seen during training (zero-shot translation). Using multiple translation paths at the same time, e.g. Finnish → English → Russian and Finnish → French → Russian, and combining the results was found to improve this process. Four new methods are presented for calculating either the single most similar word or the five most similar words based on the results of multiple translation paths. Of these, the Summation method was found to improve the P@1 translation precision by 1.6 percentage points compared to the best result obtained with a direct translation (Fi → Ru). The probability margin is presented as a confidence score. With similar coverages, the probability margin was found to outperform probability as a confidence score in terms of P@1 and P@5.
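
The learning step for one path can be illustrated with the classical orthogonal Procrustes solution (a sketch with random stand-in embeddings, not the paper's data; combining paths would then, e.g., sum similarity scores over several such mappings):

```python
import numpy as np

# Learn an orthogonal map W between embedding spaces from a seed dictionary:
# W = U V^T, where U S V^T is the SVD of X^T Y (orthogonal Procrustes).
rng = np.random.default_rng(0)
d, n_pairs = 50, 200
X = rng.normal(size=(n_pairs, d))            # source-language seed vectors
true_W, _ = np.linalg.qr(rng.normal(size=(d, d)))
Y = X @ true_W                               # matching target-language vectors

U, _, Vt = np.linalg.svd(X.T @ Y)
W = U @ Vt                                   # orthogonal: W.T @ W = I
print("mapping error:", np.linalg.norm(X @ W - Y))   # ~0 on this toy data
```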

Jouko Vankka, Christoffer Aminoff, Dmitriy Haralson, Janne Siipola
Sentiment Analysis of Lithuanian Texts Using Deep Learning Methods

We describe experiments in sentiment analysis of Lithuanian texts using deep learning methods: Long Short-Term Memory (LSTM) and Convolutional Neural Network (CNN). The methods, used with pre-trained Lithuanian neural word embeddings, are tested with different pre-processing techniques: emoticon restoration, stop word removal, and diacritic restoration/elimination. Regardless of the selected pre-processing technique, the CNN was always outperformed by the LSTM. The best results (reaching an accuracy of 0.612) were achieved with undiacritized texts and undiacritized word embeddings. However, these results are still worse than those obtained using Support Vector Machines or Naive Bayes Multinomial with word frequencies as features.
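
None of the trained models are reproduced here; as a minimal Keras sketch of the LSTM classifier family the paper evaluates (random toy sequences in place of the Lithuanian corpus, a fresh embedding layer in place of the pre-trained Lithuanian embeddings):

```python
import numpy as np
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense

vocab = 1000
X = np.random.randint(1, vocab, size=(200, 20))   # toy token-id sequences
y = np.random.randint(0, 2, size=(200,))          # toy sentiment labels

model = Sequential([
    Embedding(vocab, 32),             # would hold pre-trained embeddings
    LSTM(64),
    Dense(1, activation="sigmoid"),   # positive vs. negative sentiment
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
model.fit(X, y, epochs=2, batch_size=32, verbose=0)
```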

Jurgita Kapočiūtė-Dzikienė, Robertas Damaševičius, Marcin Woźniak
Improvement of Reverse Dictionary by Tuning Word Vectors and Category Inference

A reverse dictionary is a system that returns words based on user descriptions or definitions. OneLook Reverse Dictionary is a commercial reverse dictionary system constructed from existing dictionaries. Hill (2016) reported that another reverse dictionary system was constructed from public dictionaries using word embeddings and that its performance was comparable to that of OneLook Reverse Dictionary at the time of the comparison. In this paper we report that, by selecting word vectors suitable for a reverse dictionary and combining them with Convolutional Neural Network text classification, we improved the reverse dictionary described by Hill. It is very significant that our model can automatically construct a reverse dictionary system from publicly available resources that obtains scores similar to those of OneLook Reverse Dictionary in accuracy@100/1000. We also show that our model can be used as a filter for the OneLook Reverse Dictionary to improve its performance.

Yuya Morinaga, Kazunori Yamaguchi
Determining Quality of Articles in Polish Wikipedia Based on Linguistic Features

Wikipedia is the most popular and the largest user-generated source of knowledge on the Web. The quality of the information in this encyclopedia is often questioned. Therefore, Wikipedians have developed an award system for high-quality articles that follow specific style guidelines. Nevertheless, more than 1.2 million articles in Polish Wikipedia are unassessed. This paper considers over 100 linguistic features for determining the quality of Wikipedia articles in the Polish language. We evaluate our models on 500,000 articles of Polish Wikipedia. Additionally, we discuss the importance of linguistic features for quality prediction.

Włodzimierz Lewoniewski, Krzysztof Węcel, Witold Abramowicz
NLP in OTF Computing: Current Approaches and Open Challenges

On-The-Fly Computing is the vision of covering the software needs of end users through fully automatic compositions of existing software services. End users receive so-called service compositions tailored to their very individual needs, based on natural language software descriptions. This everyday language may contain inaccuracies and incompleteness, which are well-known challenges in requirements engineering. In addition to existing approaches that try to automatically identify and correct these deficits, there are also new trends towards involving users more in the elaboration and refinement process. In this paper, we present the relevant state of the art in the field of automated detection and compensation of multiple inaccuracies in natural language service descriptions and name open challenges that need to be tackled in NL-based software service composition.

Frederik S. Bäumer, Michaela Geierhos
Text Augmentation Techniques for Document Vector Generation from Russian News Articles

In this paper, a document classification system is enhanced through the construction of a text augmentation technique, testing various part-of-speech filters and word vector weighting methods with nine different models for document representation. Subject/object tagging is introduced as a new form of text augmentation, along with a novel classification system grounded in a word weighting method based on the distribution of words among classes of documents. When an augmentation including subject/object tagging, a nouns+adjectives filter and Inverse Document Frequency word weighting was applied, an average increase in classification accuracy of 4.1 percentage points was observed.
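
As a small hedged sketch of one building block named above, representing a document as the IDF-weighted average of its word vectors, with toy random vectors standing in for trained embeddings:

```python
import math
import random
from collections import Counter

random.seed(0)
docs = [["court", "ruling", "law"], ["match", "goal", "team"],
        ["law", "team", "contract"]]
# Toy 3-d word vectors; a real system would load trained embeddings.
vectors = {w: [random.random() for _ in range(3)] for d in docs for w in d}

def idf(word):
    df = sum(word in doc for doc in docs)   # document frequency
    return math.log(len(docs) / df)

def doc_vector(doc):
    acc, total = [0.0, 0.0, 0.0], 0.0
    for word, count in Counter(doc).items():
        w = count * idf(word)               # IDF-weighted contribution
        total += w
        acc = [a + w * v for a, v in zip(acc, vectors[word])]
    return [a / total for a in acc] if total else acc

print(doc_vector(docs[0]))
```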

Christoffer Aminoff, Aleksei Romanenko, Onni Kosomaa, Jouko Vankka
Accounting for Named Entities in Intent Recognition from Short Chats

The operational cost of call centres accounts for a large part of the total spending of any modern organization. As a consequence, the alternative of automated conversation agents powered by Artificial Intelligence (AI) through Natural Language Processing (NLP) has gained major traction over the past years. Efforts to achieve such a level of automation generally rely on predefined business intents, which are in turn tightly related to the business entities or processes that they represent. As the success of an automated conversation agent fully relies on its ability to accurately recognise user intents, a good automated agent is one that recognises the intents it is meant to recognise. In light of the strong relationship that exists between entities and intents, we propose two approaches for accounting for named entities in the task of intent recognition from short chats. The first approach relies on Bi-Normal Separation (BNS) to weight term features that are named entities more heavily than other features, whereas the second approach takes advantage of word embeddings to encode the relationship between entities and chats. Evaluation of the proposed methodologies on a data set composed of one-to-one conversations between human actors suggests that accounting for named entities improves the performance of the intent recognition task.
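
As a hedged sketch of the first idea, BNS term weighting with an extra boost for named-entity terms (the boost factor is our invention for illustration, not the authors' formula):

```python
from scipy.stats import norm

# Bi-Normal Separation: the weight of a term is |F^-1(tpr) - F^-1(fpr)|
# under the standard normal inverse CDF, with rates clipped away from 0/1.
def bns(tp, fn, fp, tn, eps=0.0005):
    pos, neg = tp + fn, fp + tn
    tpr = min(max(tp / pos, eps), 1 - eps)
    fpr = min(max(fp / neg, eps), 1 - eps)
    return abs(norm.ppf(tpr) - norm.ppf(fpr))

# Hypothetical boost: named-entity terms weigh more than other terms.
def weight(term, counts, entities, boost=2.0):
    base = bns(*counts)
    return base * boost if term in entities else base

entities = {"visa"}
print(weight("visa", (80, 20, 5, 95), entities))     # named entity, boosted
print(weight("please", (50, 50, 45, 55), entities))  # ordinary term
```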

Ghislain Landry Tsafack, Sharva Kant
Backmatter
Metadata
Title
Information and Software Technologies
Editors
Robertas Damaševičius
Giedrė Vasiljevienė
Copyright Year
2018
Electronic ISBN
978-3-319-99972-2
Print ISBN
978-3-319-99971-5
DOI
https://doi.org/10.1007/978-3-319-99972-2
