2017 | Book

Artificial Intelligence and Soft Computing

16th International Conference, ICAISC 2017, Zakopane, Poland, June 11-15, 2017, Proceedings, Part II

Edited by: Leszek Rutkowski, Marcin Korytkowski, Rafał Scherer, Ryszard Tadeusiewicz, Lotfi A. Zadeh, Jacek M. Zurada

Publisher: Springer International Publishing

Book series: Lecture Notes in Computer Science

Included in: Springer Professional "Wirtschaft+Technik", Springer Professional "Technik", Springer Professional "Wirtschaft"

About this book

The two-volume set LNAI 10245 and LNAI 10246 constitutes the refereed proceedings of the 16th International Conference on Artificial Intelligence and Soft Computing, ICAISC 2017, held in Zakopane, Poland in June 2017.
The 133 revised full papers presented were carefully reviewed and selected from 274 submissions. The papers included in the second volume are organized in the following five parts: data mining; artificial intelligence in modeling, simulation and control; various problems of artificial intelligence; special session: advances in single-objective continuous parameter optimization with nature-inspired algorithms; special session: stream data mining.

Table of contents

Frontmatter

Data Mining

Frontmatter

Computer Based Stylometric Analysis of Texts in Polish Language

The aim of the paper is to compare stylometric methods in the tasks of authorship, author gender and literary period recognition for texts in Polish. Different feature selection and classification methods were analyzed. Feature sets include the frequencies of common words (the most common, the rarest and all words) and of grammatical classes, as well as simple statistics of selected characters, words and sentences. Because Polish is a highly inflected language, common-word features are calculated as the frequencies of the lexemes obtained by a morpho-syntactic tagger for Polish. Nine different classifiers were analysed. The authors tested the proposed methods on a set of Polish novels. Recognition was performed on whole novels and on chunked texts. The experiments showed that the best results are obtained for features based on all words. For ill-defined problems (with low recognition accuracy) the random forest classifier gave the best results. In other cases (tasks with medium or high recognition accuracy) the multilayer perceptron and the linear regression learned by stochastic gradient descent gave the best results. Moreover, the paper includes an analysis of the statistical importance of the features used.

Maciej Baj, Tomasz Walkowiak

Integration Base Classifiers Based on Their Decision Boundary

Multiple classifier systems are used to improve the performance of base classifiers. One of the most important steps in the formation of multiple classifier systems is the integration process, in which the base classifiers' outputs are combined. The most commonly used classifier outputs are class labels, the ranking list of possible classes or confidence levels. In this paper, we propose an integration process which takes place in the “geometry space”. This means that we use the decision boundary in the integration process. The results of the experiment based on several data sets show that the proposed integration algorithm is a promising method for the development of multiple classifier systems.

Robert Burduk

Complexity of Rule Sets Induced by Two Versions of the MLEM2 Rule Induction Algorithm

We compare two versions of the MLEM2 rule induction algorithm in terms of the complexity of rule sets, measured by the number of rules and the total number of conditions. All data sets used for our experiments are incomplete, with many missing attribute values, interpreted as lost values, attribute-concept values and “do not care” conditions. In our previous research we compared the same two versions of MLEM2, called true and emulated, with regard to an error rate computed by ten-fold cross validation. Our conclusion was that the two versions of MLEM2 do not differ much, and that there is some evidence that lost values are the best. In this research our main objective is to compare both versions of MLEM2 in terms of the complexity of rule sets. The smaller the rule sets, the better. Our conclusion is again that both versions do not differ much. Our secondary objective is to compare three interpretations of missing attribute values. From the complexity point of view, lost values are the worst.

Patrick G. Clark, Cheng Gao, Jerzy W. Grzymala-Busse

Spark-Based Cluster Implementation of a Bug Report Assignment Recommender System

The use of recommenders for bug report triage decisions is especially important in the context of large software development projects, where both the frequency of reported problems and a large number of active developers can pose problems in selecting the most appropriate developer to work on a certain issue. From a machine learning perspective, the triage problem of bug report assignment in software projects may be regarded as a classification problem which can be solved by a recommender system. We describe a highly scalable SVM-based bug report assignment recommender that is able to run on massive datasets. Unlike previous desktop-based implementations of bug report triage assignment recommenders, our recommender is implemented on a cloud platform. The system uses a novel sequence of machine learning processing steps and compares favorably with other SVM-based bug report assignment recommender systems with respect to prediction performance. We validate our approach on real-world datasets from the Netbeans, Eclipse and Mozilla projects.

Adrian-Cătălin Florea, John Anvik, Răzvan Andonie

The Bag-of-Words Method with Dictionary Analysis by Evolutionary Algorithm

In this paper we present innovative solutions that improve the general operational efficiency of the Bag-of-Words algorithm (BoW). The first innovation we put forward is creating a visual words’ dictionary using a clustering algorithm which itself selects the appropriate number of clusters. This solution results in significant automation of image database creation. Another innovation is adding to the BoW model an analytical module whose task is to analyse the visual words’ dictionary and to modify histogram values before storing them in a database. This module is driven by an evolutionary algorithm. The modifications of the BoW algorithm significantly improve the efficiency of image search and classification, which is demonstrated in a variety of experiments.

Marcin Gabryel, Giacomo Capizzi

The Novel Method of the Estimation of the Fourier Transform Based on Noisy Measurements

This article addresses the spectral analysis of signals observed in the presence of noise. We propose a new concept for estimating the frequency content of a signal. The method is derived from the nonparametric methodology of function estimation. We refer to the system model $$y_i = R(x_i) + \epsilon_i,\ i = 1, 2, \ldots, n$$, where $$x_i$$ are deterministic inputs, $$x_i \in D$$, $$y_i$$ are probabilistic outputs, and $$\epsilon_i$$ is measurement noise with zero mean and bounded variance. $$R(\cdot)$$ is a completely unknown function. In this paper we are interested in the frequency spectrum of this unknown function. The unknown function in the model could be estimated using algorithms based on the Parzen kernel. An alternative approach is based on orthogonal series expansions. Nonparametric methodology could also be used for implicit estimation of its spectrum. The main aim of this paper is to propose an original integral version of nonparametric spectrum estimation based on trigonometric series, referring to the classic Fourier transform. The results of numerical experiments are presented.

Tomasz Galkowski, Miroslaw Pawlak
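
As a rough illustration of series-based spectrum estimation under the model above, the following sketch estimates trigonometric Fourier coefficients from samples with a plain Riemann sum. It is not the integral estimator proposed in the paper: the equispaced inputs on [0, 1), the function name `fourier_coefficients` and the noise-free test signal are all assumptions made for this example.

```python
import math

def fourier_coefficients(y, k):
    """Estimate (a_k, b_k) of R on [0, 1) from samples y_i = R(i/n) + noise.

    Assumes equispaced inputs x_i = i/n (an assumption of this sketch).
    """
    n = len(y)
    a_k = 2.0 / n * sum(y_i * math.cos(2 * math.pi * k * i / n)
                        for i, y_i in enumerate(y))
    b_k = 2.0 / n * sum(y_i * math.sin(2 * math.pi * k * i / n)
                        for i, y_i in enumerate(y))
    return a_k, b_k

# Hypothetical test signal: R(x) = cos(2*pi*x), sampled without noise.
n = 200
samples = [math.cos(2 * math.pi * i / n) for i in range(n)]
a1, b1 = fourier_coefficients(samples, 1)
a2, b2 = fourier_coefficients(samples, 2)
```

With zero-mean noise added to `samples`, the averaging in the sums would be expected to damp the noise as `n` grows, which is the intuition behind estimating the spectrum directly from noisy measurements.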

A Complete Efficient FFT-Based Algorithm for Nonparametric Kernel Density Estimation

Multivariate kernel density estimation (KDE) is a very important statistical technique in exploratory data analysis. High-performance KDE is still an open research problem. One of the most elegant and efficient approaches utilizes the Fast Fourier Transform. Unfortunately, the existing FFT-based solution suffers from a serious limitation, as it can operate accurately only with constrained (i.e., diagonal) multivariate bandwidth matrices. In the paper we propose a crucial improvement to this algorithm which relaxes the above-mentioned limitation. A numerical simulation study demonstrates good properties of the new solution.

Jarosław Gramacki, Artur Gramacki

A Framework for Business Failure Prediction

Business failure prediction systems help predict financial failures before they actually happen and provide an early warning for enterprises. Using machine learning techniques instead of traditional statistical models has brought a considerable increase in performance to the area of business failure prediction. This paper presents a framework for predicting business failures by using different machine learning techniques. We also implemented, for inclusion in this framework, a novel model for business failure prediction based on the NARX (nonlinear autoregressive network with exogenous inputs) feedback neural network, which is a recurrent dynamic network with feedback connections. Detailed experiments are conducted to compare the performance of these approaches. In particular, for long-term business failure prediction, there are no other papers investigating the performance of NARX. To the best of our knowledge, this is the first time the NARX algorithm has been applied to long-term business failure prediction.

Irem Islek, Idris Murat Atakli, Sule Gunduz Oguducu

Fuzzy Clustering with $$\varepsilon$$-Hyperballs and Its Application to Data Classification

This paper proposes fuzzy clustering with $$\varepsilon$$-hyperballs as prototypes. It is based on the idea of regions of insensitivity, described by hyperballs of radius $$\varepsilon$$, in which the distances of objects from the centers of the hyperballs are considered equal to zero. The proposed clustering was applied to determine the parameters of fuzzy sets in the antecedents of a classifier based on fuzzy if-then rules. The classification quality obtained for six benchmark datasets was compared with that of reference classifiers. The results show that the proposed method improves classification accuracy.

Michal Jezewski, Robert Czabanski, Jacek Leski
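
The region of insensitivity described in the abstract can be sketched as a distance that is zero anywhere inside a hyperball of radius ε. The abstract does not say how the distance behaves outside the ball; measuring it to the ball's surface, as done here, is an assumption of this sketch, as is the function name.

```python
import math

def eps_insensitive_distance(x, center, eps):
    """Distance from point x to a hyperball of radius eps around `center`.

    Inside the ball the distance is treated as zero (region of insensitivity).
    Outside, this sketch assumes the distance to the ball's surface.
    """
    return max(0.0, math.dist(x, center) - eps)

inside = eps_insensitive_distance((0.3, 0.4), (0.0, 0.0), 1.0)
outside = eps_insensitive_distance((3.0, 4.0), (0.0, 0.0), 1.0)
```

A fuzzy clustering loop could then use such a distance in place of the usual Euclidean one when updating memberships.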

Two Modifications of Yinyang K-means Algorithm

In the paper a very fast algorithm for the K-means clustering problem, called Yinyang K-means, is considered. The algorithm uses an initial grouping of cluster centroids and the triangle inequality to avoid unnecessary distance calculations. We propose two modifications of Yinyang K-means: regrouping of cluster centroids during the run of the algorithm, and replacement of the grouping procedure with a method that generates groups of equal size. The influence of these two modifications on the efficiency of Yinyang K-means is experimentally evaluated using seven datasets. The results indicate that the new grouping procedure reduces the runtime of the algorithm. For one of the tested datasets it runs up to 2.8 times faster.

Wojciech Kwedlo
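
To show how the triangle inequality can rule out a centroid without computing its distance to a point, here is a minimal sketch. It uses the simpler centroid-to-centroid test known from Elkan-style accelerated K-means, not Yinyang's group-based filters, and the function name is hypothetical.

```python
import math

def assign_with_pruning(point, centroids):
    """Return (index of nearest centroid, number of point-centroid distances computed)."""
    best = 0
    best_dist = math.dist(point, centroids[0])
    computed = 1
    for j in range(1, len(centroids)):
        # Triangle inequality: d(c_best, c_j) <= d(x, c_best) + d(x, c_j).
        # If d(c_best, c_j) >= 2 * d(x, c_best), then d(x, c_j) >= d(x, c_best),
        # so centroid j cannot be strictly closer and its distance is skipped.
        if math.dist(centroids[best], centroids[j]) >= 2 * best_dist:
            continue
        d = math.dist(point, centroids[j])
        computed += 1
        if d < best_dist:
            best, best_dist = j, d
    return best, computed

near = assign_with_pruning((0.1, 0.0), [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)])
far = assign_with_pruning((6.0, 0.0), [(0.0, 0.0), (10.0, 0.0)])
```

In a real implementation the centroid-to-centroid distances would be computed once per iteration and reused for all points, which is where the savings come from.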

Detection of the Innovative Logotypes on the Web Pages

The aim of this study was to describe a method for detecting logotypes that indicate the innovativeness of companies, where the images originate from the companies' Internet domains. For this purpose, we developed a system that combines a supervised and a heuristic approach to construct a reference dataset for each logotype category, which is then used by logistic regression classifiers to recognize the logotype category. We propose an approach that uses a one-versus-the-rest strategy to learn the logistic regression classification models that recognize the classes of innovative logotypes. Thanks to this, we can detect whether a given company’s Internet domain contains an innovative logotype or not. Moreover, we find a way to construct a simple, low-dimensional feature space for the image recognition process. The proposed feature space of the logotype classification models is based on the weights of image similarity and on the textual data of the images taken from HTML ALT tags.

Marcin Mirończuk, Michał Perełkiewicz, Jarosław Protasiewicz

Extraction and Interpretation of Textual Data from Czech Insolvency Proceedings

The Czech Insolvency Register currently covers about 200,000 insolvency proceedings. In order to better assess the real impact of indebtedness across Czech society, data about creditors or reasons for debt might be of great value. Unfortunately, the vast majority of such information is contained only in scanned document copies attached to the insolvency proceedings. Therefore, this study aims at finding efficient pre-processing, clustering and classification techniques capable of extracting the wanted information from these approximately 1,200,000 PDF files.

Iveta Mrázová, Peter Zvirinský

Spectral Clustering for Cell Formation with Minimum Dissimilarities Distance

Group Technology (GT) is a useful tool in manufacturing systems. Cell formation (CF) is a part of a cellular manufacturing system, which is the implementation of GT. It is used in designing cellular manufacturing systems, using the similarities between parts in relation to machines to identify part families and machine groups. Spectral clustering has been applied to CF, but these approaches still have several drawbacks. One of them is how to obtain an optimal number of clusters/cells. To address this concern, we propose a spectral clustering algorithm for machine-part CF using minimum dissimilarities distance. Some experimental examples are used to illustrate its efficiency. In summary, the proposed algorithm is more efficient when used in CF with a wide variety of machine/part matrices.

Yessica Nataliani, Miin-Shen Yang

Exercise Recognition Using Averaged Hidden Markov Models

This paper presents a novel learning algorithm for Hidden Markov Models (HMMs) based on multiple learning sequences. For each activity a few left-to-right HMMs are created and then averaged into a single model. The structure of the averaged models is defined by the proposed Sequences Concatenation Algorithm, which is included in this paper. A modification of the action recognition algorithm for such averaged models is also described. The experiments were conducted for the problem of modeling and recognition of 13 chosen warm-up exercises. The input data were collected using the Microsoft Kinect 2.0 depth sensor. The experimental results confirm that an averaged model combines the features of all component models and thus recognizes more sequences. The obtained models do not confuse modeled activities with others.

Aleksandra Postawka

A Study of Cluster Validity Indices for Real-Life Data

In this paper a study of several cluster validity indices for real-life data sets is presented. Moreover, a new version of a validity index is proposed. All these indices can be considered measures of data partitioning accuracy, and their performance is demonstrated for real-life data sets, where three popular algorithms have been applied as underlying clustering techniques, namely the Complete–linkage, Expectation Maximization and K-means algorithms. The indices have been compared taking into account the number of clusters in a data set. The results are useful for choosing the best validity index for a given data set.

Artur Starczewski, Adam Krzyżak

Improvement of the Validity Index for Determination of an Appropriate Data Partitioning

In this paper a detailed analysis of an improvement of the Silhouette validity index is presented. The proposed approach is based on using an additional component which improves cluster validity assessment and provides better results during a clustering process, especially when the naturally existing groups in a data set are located at very different distances. The performance of the modified index is demonstrated for several data sets, where the Complete–linkage method has been applied as the underlying clustering technique. The results show the superiority of the new approach compared to other methods.

Artur Starczewski, Adam Krzyżak
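
For reference, the standard Silhouette index that the paper sets out to improve can be sketched as follows. This is the textbook definition only; the additional component proposed by the authors is not described in the abstract and is not reproduced here. The function name and the toy data are assumptions, and the sketch expects at least two clusters.

```python
import math

def silhouette(points, labels):
    """Mean standard Silhouette width over all points (singleton clusters score 0)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # s(i) = 0 by convention for a singleton cluster
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for other_lab, other in clusters.items() if other_lab != lab)
        total += (b - a) / max(a, b)
    return total / len(points)

# Hypothetical toy data: two well-separated pairs of points.
pts = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = silhouette(pts, [0, 0, 1, 1])
bad = silhouette(pts, [0, 1, 0, 1])
```

Values near 1 indicate compact, well-separated clusters, while negative values suggest that points sit closer to another cluster than to their own.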

Stylometric Features for Authorship Attribution of Polish Texts

Authorship attribution aims at distinguishing texts written by different authors using text features representing their styles. In this paper we investigate stylometric features for the Polish language based on Part of Speech (POS) tagging (including POS bigrams) and function words. Because Polish is highly inflected, the feature space tends to be very large. This concerns POS n-grams in particular. Focusing on POS bigrams, we propose a simplified representation that keeps the feature space compact. We report experiments in which authorship attribution was conducted for documents of varying length, using classifiers from the Weka library. We evaluate classification results for combinations of the following features: POS tags, POS bigrams, function words and simple document statistics. Experiments indicate that the developed features provide good classification performance.

Piotr Szwed

Handwriting Recognition with Extraction of Letter Fragments

This paper focuses on intelligent character recognition of handwritten texts. We apply elements of handwriting movement analysis in order to calculate possibilities of primitive character fragments called strokes. The key feature relies on the processing of uncertainty in the form of fuzzy quality values, starting from the identification of strokes, through the construction of words and phrases, up to the future application of language filters and possible contextual recognition.

Michal Wróbel, Janusz T. Starczewski, Christian Napoli

Multidimensional Signal Transformation Based on Distributed Classification Grid and Principal Component Analysis

In the paper, an analysis of audio signals and a spectral analysis based on sounds recorded by the authors are proposed. To perform the spectral analysis, the authors apply independent Principal Component Analysis. We also propose a novel approach to the Distributed Classification Grid to improve performance and reduce execution time.

Marcin Wyczechowski, Lukasz Was, Slawomir Wiak, Piotr Milczarski, Zofia Stawska, Lukasz Pietrzak

Artificial Intelligence in Modeling, Simulation and Control

Frontmatter

The Concept on Nonlinear Modelling of Dynamic Objects Based on State Transition Algorithm and Genetic Programming

In this paper a new hybrid method to determine the parameters of time-variant non-linear models of dynamic objects is proposed. This method first uses the State Transition Algorithm to create many local models and then applies genetic programming in order to join and simplify those models. This yields a simple model which is not computationally demanding and has high accuracy.

Łukasz Bartczuk, Piotr Dziwiński, Vladimir G. Red’ko

A Method for Non-linear Modelling Based on the Capabilities of PSO and GA Algorithms

Most nonlinear dynamic objects have an Approximate Nonlinear Model (ANM). Its parameters are known or can be determined by one of the typical identification procedures. The model obtained in this way describes the main features of the identified dynamic object well only at some Operating Point (OP). In this approach we use a hybrid model to increase the accuracy of the modeling. The hybrid model is composed of two parts: a base ANM and a Takagi-Sugeno (TS) fuzzy system. A Particle Swarm Optimization with Genetic Algorithm (PSO-GA) was used to identify the parameters of the ANM and the TS fuzzy system. An important advantage of the proposed approach is that it yields the characteristics of the unknown parameters of the ANM, described by the Fuzzy Rules (FR) of the TS fuzzy system. They provide experts with valuable knowledge about the nature of the unknown phenomena.

Piotr Dziwiński, Łukasz Bartczuk, Huang Tingwen

Linguistic Habit Graphs Used for Text Representation and Correction

This paper introduces a novel associative way of storing, compressing, and processing sentences. The Linguistic Habit Graphs (LHG) are introduced as graph models that could be used for spell checking, text correction, proof–reading, and compression of sentences. All the above-mentioned functionalities are available with constant computational complexity as a result of the associative way of text processing and of special kinds of connections and graph nodes that make it possible to activate various important relations between letters and words simultaneously for any given context. Furthermore, using the proposed graph structure, new algorithms have been developed to provide effective text analyses and contextual text correction. These new algorithms can properly locate and often automatically correct typical mistakes in texts written in the language for which the graph was built.

Marcin Gadamer

Dynamic Epistemic Preferential Logic of Action

In 2003, H.P. van Ditmarsch, W. van der Hoek and B.P. Kooi proposed a complete formalism for the representation of actions for Multi-Agent Systems. This paper aims to propose a new preferential extension of this formalism in terms of dynamic-epistemic logic supported by a unique multi-valued logic. This new system is interpreted in the interval fibred semantics on the basis of earlier ideas of D. Gabbay.

Krystian Jobczyk, Antoni Ligeza

Proposal of a Multi-agent System for a Smart Outdoor Lighting Environment

Outdoor smart lighting is more and more popular since it is regarded as a significant example of an ideal and friendly environment. Systems controlling outdoor lighting, considered as context-aware software, are challenging. A multi-agent system for outdoor street lighting, dealing with intelligent software applications in pervasive computing, has been proposed. Smart scenarios, typical for such an intelligent environment, are presented. An agent-based architecture for a multi-agent system, dealing with the well-known framework JADE, is proposed. It allows further testing of these smart scenarios. This is the first proposal and the beginning of a greater work for implementation and testing of smart lighting scenarios carried out in the agent systems. This work describes the rationale of efforts for achieving ecosystems also working in the IoT paradigm by focusing on a rural environment, featuring data collection, as well as event detections and coordinated reactions.

Radosław Klimek

Understanding Human Behavior in Intelligent Environments: A Context-Aware System Supporting Mountain Rescuers

Intelligent environments provide people-centered computing to support people in their daily lives. Understanding human behavior and context information is crucial to providing context-aware and pro-active services for all actors of smart spaces. At the same time, mobile phone network data, collected by suppliers, provide valuable information about human locations and behaviors. This paper presents a unified approach comprising both informal (use cases) and more formal (algorithms) elements, which together yield a common framework that uses information encoded in pervasive datasets to generate, through context-based reasoning, decisions which support actors operating in a smart space. The system is designed to support mountain rescuers. It provides pro-active decision making and warnings about dangerous situations on mountain trails. In this way, the system supports rescuers and makes tourists' stays in the mountains safer.

Radosław Klimek

TLGProb: Two-Layer Gaussian Process Regression Model for Winning Probability Calculation in Two-Team Sports

Sports analytics is gaining much attention in the research community nowadays. This paper deals with a prominent problem in sports analytics, namely, winning probability calculation. In particular, we focus on two-team sports. A novel model called TLGProb is proposed by stacking a non-linear regression model, Gaussian process regression (GPR), to address the complex association between match outcomes and players’ performances. For evaluation, we selected a sports event popular around the world, the National Basketball Association (NBA), as the domain for experiments. Using TLGProb, we correctly predicted 85.28% of outcomes among 1,230 matches in the NBA 2014/2015 season.

Max W. Y. Lam

Fuzzy PID Controllers with FIR Filtering and a Method for Their Construction

In this paper a new structure of fuzzy PID controllers with FIR filters and a method for selecting its parameters are presented. The proposed solution can be particularly important in solving problems with noise in the object’s feedback signals. To confirm the effectiveness of the proposed method, a typical control problem was tested.

Krystian Łapa, Krzysztof Cpałka, Andrzej Przybył, Takamichi Saito
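
The role of FIR filtering on a noisy feedback signal can be illustrated with a minimal filter class. This shows only a generic FIR filter; the fuzzy PID structure and the parameter selection method from the paper are not reproduced. The class name and the moving-average coefficients are assumptions.

```python
from collections import deque

class FIRFilter:
    """Finite impulse response filter: y[n] = sum_k h[k] * x[n-k], zero initial history."""

    def __init__(self, coefficients):
        self.h = list(coefficients)
        # history[0] holds the newest sample; old samples fall off the right end
        self.history = deque([0.0] * len(self.h), maxlen=len(self.h))

    def update(self, sample):
        self.history.appendleft(sample)
        return sum(h * x for h, x in zip(self.h, self.history))

# Hypothetical 4-tap moving average applied to a constant feedback signal.
fir = FIRFilter([0.25, 0.25, 0.25, 0.25])
outputs = [fir.update(4.0) for _ in range(4)]
```

In a control loop, the controller would read `fir.update(measurement)` instead of the raw measurement, trading some delay for reduced noise.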

The Use of Heterogeneous Cellular Automata to Study the Capacity of the Roundabout

This article presents a research study analysing the impact of changing the roundabout island diameter on the roundabout capacity. The study was based on the developed Cellular Automata Model and the implemented simulation system. The developed CA Model takes into account various types of vehicles (cars, trucks and motorcycles) and various sizes of roundabouts; also, it reflects the actual technical conditions of those vehicles (acceleration and braking depending on the vehicle dimensions and function, as well as driving on the roundabout with different speeds that are adequate to the vehicle size). The study was based on the example of a two-lane roundabout with four two-lane feeder roads.

Krzysztof Małecki
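
To give a feel for a cellular automaton with heterogeneous vehicles, here is a minimal sketch of one update step on a single-lane ring. It is a deterministic simplification in the spirit of the Nagel-Schreckenberg traffic model, not the authors' roundabout model: each vehicle occupies one cell, there are no entries or exits, and the per-vehicle maximum speed `vmax` stands in for vehicle type. All names are hypothetical.

```python
def step(road_length, vehicles):
    """Advance all vehicles one time step on a single-lane ring.

    vehicles: list of dicts with 'pos', 'speed' and 'vmax', in any order.
    Returns the updated vehicles, ordered by their previous position.
    """
    ordered = sorted(vehicles, key=lambda v: v["pos"])
    updated = []
    for i, v in enumerate(ordered):
        ahead = ordered[(i + 1) % len(ordered)]
        # free cells between this vehicle and the one ahead (old positions)
        gap = (ahead["pos"] - v["pos"] - 1) % road_length
        # accelerate by one cell per step, capped by vehicle type and by the gap
        speed = min(v["speed"] + 1, v["vmax"], gap)
        updated.append({"pos": (v["pos"] + speed) % road_length,
                        "speed": speed, "vmax": v["vmax"]})
    return updated

# Hypothetical setup: a fast car behind a slow truck on a 10-cell ring.
state = [{"pos": 0, "speed": 0, "vmax": 3}, {"pos": 2, "speed": 0, "vmax": 1}]
after_one = step(10, state)
after_two = step(10, after_one)
```

Here the car is held to the truck's pace because the gap ahead never exceeds one cell, which is the kind of interaction that limits capacity on a shared lane.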

A Method for Design of Hardware Emulators for a Distributed Network Environment

This paper describes a method for hardware implementation of an emulator of nonlinear dynamic objects in FPGA technology. In order to ensure high-fidelity emulation, a new architecture of the arithmetic unit used for operations on real numbers in digital systems has been proposed. The method allows us to obtain high processing performance similar to that obtained in fixed-point systems, while offering a wide range of numbers as in floating-point notation. Based on this idea, a super-scalar architecture of the digital processing unit has been proposed. The described approach provides powerful processing of a matrix state equation with variable coefficients, which are calculated in real time by fuzzy systems. The presented results confirm the high performance of the developed solution.

Andrzej Przybył, Meng Joo Er

Iterative Learning of Optimal Control – Case Study of the Gantry Robot

In [15] the authors proposed an iterative learning algorithm for searching for optimal control of linear dynamic systems. This algorithm has been preliminarily tested on laser power control for the cladding process. The aim of this paper is to present a case study of a similar algorithm applied to control the Z-axis of a gantry robot. The original algorithm from [15] has to be modified in order to cover the case when the tracking signal is the output of the system instead of its whole state, as in [15]. The obtained results indicate a fast rate of convergence of the learning algorithm. One can also observe how the learning of the shapes of the optimal input and output signals converges.

Ewaryst Rafajłowicz, Wojciech Rafajłowicz

An Approach to Robust Urban Transport Management. Mixed Graph-Based Model for Decision Support

In this paper, we present a mathematical model of a public transport network, which can be used to generate alternative routes during crisis situations. It is based on a mixed graph, where decision points are represented by vertices and track sections by edges. Route and vehicle definitions are also provided. We determine the objective function for selecting the most suitable route, as well as the forbidden path set, which contains paths that cannot be executed in real networks. The model definition is preceded by examples and analyses of different types of crisis situations.

Piotr Wiśniewski, Antoni Ligęza
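
The idea of searching for an alternative route in a mixed graph while respecting a forbidden path set can be sketched as below. This is not the authors' model: the objective used here (fewest track sections, found by breadth-first search), the representation of forbidden paths as vertex sequences, and all names are assumptions for illustration.

```python
from collections import deque

def neighbors(v, directed, undirected):
    """Vertices reachable from v over one directed arc or one undirected edge."""
    out = [b for a, b in directed if a == v]
    out += [b for a, b in undirected if a == v]
    out += [a for a, b in undirected if b == v]
    return out

def contains_forbidden(path, forbidden):
    """True if any forbidden vertex sequence occurs as a contiguous part of path."""
    return any(tuple(path[i:i + len(f)]) == tuple(f)
               for f in forbidden for i in range(len(path) - len(f) + 1))

def find_route(start, goal, directed, undirected, forbidden=()):
    """Breadth-first search for a route with the fewest sections, or None."""
    queue = deque([[start]])
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in neighbors(path[-1], directed, undirected):
            if nxt not in path and not contains_forbidden(path + [nxt], forbidden):
                queue.append(path + [nxt])
    return None

# Hypothetical network: one-way sections A->B->D, two-way sections A-C-D.
one_way = [("A", "B"), ("B", "D")]
two_way = [("A", "C"), ("C", "D")]
normal = find_route("A", "D", one_way, two_way)
detour = find_route("A", "D", one_way, two_way, forbidden=[("A", "B", "D")])
back = find_route("D", "A", one_way, two_way)
```

Marking a path as forbidden, for example because a junction does not allow that turn, makes the search fall back to the next admissible route.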

Street Lighting Control, Energy Consumption Optimization

Using a graph formalism to model outdoor lighting infrastructure has proven to be an efficient method for both design and control. It has been tested not only at laboratory scale, but also in a city-scale deployment. The paper proposes further optimization of energy usage when the streetlights are dynamically controlled. The idea is to alter the design process to take the influence of the control schemas into account. As a result, substantial savings in energy consumption can be achieved. The introduced optimization is also modeled with graphs and graph transformations.

Igor Wojnicki, Leszek Kotulski

Various Problems of Artificial Intelligence

Frontmatter

Patterns in Serious Game Design and Evaluation Application of Eye-Tracker and Biosensors

In this paper, a general process of design and evaluation of serious games is presented. There are qualitative and quantitative ways to analyze game dynamics (understood as mechanics activated by the user). Their general principles have been taken into consideration, and a modified pattern-based framework is proposed that allows for the inclusion of data acquired from an eye-tracker and from biosensors used in affective computing. The paper concludes with a case study of design patterns in a serious game that is currently in use as part of OHD training at the authors’ Alma Mater.

Jan K. Argasiński, Iwona Grabska-Gradzińska

Photo-Electro Characterization and Modeling of Organic Light-Emitting Diodes by Using a Radial Basis Neural Network

In this paper we present a new model based on radial basis function neural networks (RBFNNs) that relates the overall electroluminescent density of OLEDs to the voltage and current at different wavelengths. The polymer-based OLEDs considered in this paper were produced in the Optoelectronic Organic Semiconductor Devices Laboratory at Ben Gurion University of the Negev. The simulation results show good agreement between the experimental data and those obtained with the proposed model. These results show that the model is capable of reproducing and interpreting the experimental data.

Shiran Nabha Barnea, Grazia Lo Sciuto, Nathaniel Hai, Rafi Shikler, Giacomo Capizzi, Marcin Woźniak, Dawid Połap

Conditioned Anxiety Mechanism as a Basis for a Procedure of Control Module of an Autonomous Robot

This paper is devoted to the problem of self-control of an autonomous robot in a complex, unknown environment. In such an environment it is impossible to predict all situations the robot could be faced with. Because of this, it is necessary to equip the robot with control procedures that allow it to avoid dangerous scenarios. Mechanisms that serve to avoid threatening events have been worked out during evolution, and living organisms are equipped with them. Conditioned anxiety is one such mechanism. In this paper, the way in which this mechanism can be adapted to control the behaviour of an autonomous robot is presented. The effectiveness of the proposed approach has been verified using the V-REP simulator.

Andrzej Bielecki, Marzena Bielecka, Przemysław Bielecki

Framework for Benchmarking Rule-Based Inference Engines

Rule-based systems constitute state-of-the-art solutions in the area of artificial intelligence. They provide a fast, human-readable and self-explanatory mechanism for encoding knowledge. Owing to the popularity of rules, dozens of inference engines have been developed over the last few decades. They differ in reasoning efficiency depending on many factors, such as model characteristics or deployment platform. Therefore, picking a reasoning engine that best fits the requirements of the system becomes a non-trivial task. The primary objective of the work presented in this paper was to provide a fully automated framework for benchmarking rule-based reasoning engines.

Szymon Bobek, Piotr Misiak

Web-Based Editor for Structured Rule Bases

Knowledge engineering aims at providing methods for efficient knowledge encoding to allow for automatic reasoning. Most of the research in this field is devoted to the design of expressive modeling languages or effective reasoning mechanisms. We argue that a powerful knowledge representation and inference mechanism is not enough to ensure high-quality knowledge bases. It is crucial to provide methods for the creation and visualization of knowledge. This allows an engineer to focus on the task of building the knowledge without the distraction caused by the complexity of the representation, syntax, etc. The original contribution of this paper is a definition of three categories of requirements for visualization and editing software for structured rule bases. We propose a prototype implementation of such a tool and provide an evaluation that involves a comparison with existing approaches and user tests to measure the usability of the solution.

Szymon Bobek, Grzegorz J. Nalepa, Przemysław Babiarz

Parallelization of Image Encryption Algorithm Based on Game of Life and Chaotic System

In this paper, the results of parallelizing an image encryption algorithm based on the Game of Life and a chaotic system are presented. Data dependence analysis of loops is applied in order to parallelize the algorithm. The parallelism of the algorithm is expressed in accordance with the OpenMP standard. The study finds that the most time-consuming loops of the algorithm are suitable for parallelization. Efficiency measurements of the parallel algorithm working in standard modes of operation are shown.

Dariusz Burak

Cognitive Investigation on Pilot Attention During Take-Offs and Landings Using Flight Simulator

The paper presents cognitive studies on pilots' attention during take-off and landing. The studies were conducted using an SMI RED 500 eye tracker and a Saitek Pro 2000 set of pilot instruments. Simulation experiments involved two groups with different flight experience, recording particular attention trajectories during the respective flight phases. The NON-PILOT group comprised members who had less than 80 h of flight time and the PILOT group those with more than 80 h of flight time. The differences in the perception of the flight process between people with different flight experience are presented based on analyses of the conducted measurements. This may be useful advice for junior pilots improving their skills and, as a result, may increase passenger safety during a flight.

Zbigniew Gomolka, Boguslaw Twarog, Ewa Zeslawska

3D Integrated Circuits Layout Optimization Game

This paper is devoted to an original approach to block-level 3D IC layout design. The circuit components are modeled as autonomous mobile agents that explore their virtual world in order to find a globally near-optimal layout solution. The search space is defined by the geometry features, wire connections, goals and constraints of the design task. The approach is illustrated by an example application to one of the MCNC benchmark circuits and implemented using Godot.

Katarzyna Grzesiak-Kopeć, Leszek Nowak, Maciej Ogorzałek

Multi-valued Extension of Putnam-Davis Procedure

In 1960 M. Davis and H. Putnam introduced a logical verification procedure for propositional languages, later called the Putnam-Davis procedure. It found broad application in AI as a basis of the planning paradigm based on satisfiability of formulas. Unfortunately, this procedure refers to satisfiability in classical two-valued logic. This paper aims to propose a multi-valued extension of this procedure that may be sensitive to temporal and preferential aspects of reasoning. This method is evaluated in more practical contexts.

Krystian Jobczyk, Antoni Ligeza

Comparison of Effectiveness of Multi-objective Genetic Algorithms in Optimization of Invertible S-Boxes

The strength of modern ciphers depends largely on cryptographic properties of substitution boxes, such as nonlinearity and transparency order. It is difficult to optimize all such properties because they often contradict each other. In this paper we compare two of the most popular multi-objective genetic algorithms, NSGA-II and its steady-state version, in solving the problem of optimizing invertible substitution boxes. In our research we defined the objectives as cryptographic properties and observed how they change within the population during experiments.

Tomasz Kapuściński, Robert K. Nowicki, Christian Napoli

The Impact of the Number of Averaged Attacker’s Strategies on the Results Quality in Mixed-UCT

Mixed-UCT is a method for finding an efficient mixed strategy for the defender in multi-act Security Games. This paper presents an experimental evaluation of how the number of averaged past attackers (APA) used to define the defender's strategy affects the solution quality of the method. A specifically designed set of test games is proposed for evaluating the Mixed-UCT method with different values of the APA parameter. The results indicate that larger values of APA generally lead to faster convergence of the method, and in some cases also improve the results in terms of the expected defender's payoff value.

Jan Karwowski, Jacek Mańdziuk

Data-Driven Polish Poetry Generator

The paper describes an attempt to create a poetry generator for the Polish language. It is a data-driven approach: grammatical and semantic structures are automatically derived from input text. The system was successfully implemented and the quality of the output "poems" was tested in a "Poetic Turing Test", a public survey whose participants were asked to distinguish between human-written and computer-generated poetry.

Marek Korzeniowski, Jacek Mazurkiewicz

Rule Based Dependency Parser for Polish Language

The paper presents a dependency parser for the Polish language. It uses a simple chain of word-combining rules operating on fully morphosyntactically tagged input instead of a formal grammar model or statistical learning. The proposed approach generates robust dependency trees and allows parsing of uncommon texts, such as poetry. This gives a significant advantage over current state-of-the-art dependency parsers.

Marek Korzeniowski, Jacek Mazurkiewicz

Porous Silica Templated Nanomaterials for Artificial Intelligence and IT Technologies

This paper focuses on two types of novel nanomaterials based on ordered mesoporous silica designed for applications in artificial intelligence and IT technologies: molecular neural networks and super-dense magnetic memories. Electronics undoubtedly needs new solutions for its further development, and nanotechnology can help here. Nanostructured functional materials in particular can help solve the problem of miniaturization.

Magdalena Laskowska, Łukasz Laskowski, Jerzy Jelonkiewicz, Henryk Piech, Tomasz Galkowski, Arnaud Boullanger

Combining SVD and Co-occurrence Matrix Information to Recognize Organic Solar Cells Defects with a Elliptical Basis Function Network Classifier

This paper presents a new methodology based on elliptical basis function (EBF) networks and an innovative feature extraction technique that makes use of co-occurrence matrices and the singular value decomposition (SVD) in order to recognize defects in organic solar cells. The experimental results show that our algorithm achieves a high recognition accuracy of 96% and that the proposed feature extraction technique is very effective in pattern recognition problems that involve texture analysis. The proposed methodology can be used as a tool to optimize the fabrication process of organic solar cells. All the tests carried out for this work used organic solar cells realized in the Optoelectronic Organic Semiconductor Devices Laboratory at Ben Gurion University of the Negev.

Grazia Lo Sciuto, Giacomo Capizzi, Dor Gotleyb, Sivan Linde, Rafi Shikler, Marcin Woźniak, Dawid Połap

An Intelligent Decision Support System for Assessing the Default Risk in Small and Medium-Sized Enterprises

In recent years, default prediction systems have become an important tool for a wide variety of financial institutions, such as banking systems or credit businesses, for which the ability to detect credit and default risks translates into a better financial status. Nevertheless, small and medium-sized enterprises have focused not on customer default prediction but on maximizing the sales rate. Consequently, many companies could not cope with their customers' debt and ended up closing the business. To overcome this issue, this paper presents a novel decision support system for default prediction specially tailored for small and medium-sized enterprises, which retrieves the information related to the customers from an Enterprise Resource Planning (ERP) system and obtains the default risk probability of a new order or client. The resulting approach has been tested in a Graphic Arts printing company in the Basque Country, allowing prioritized and preventive actions to be taken with regard to the default risk probability and the customer's characteristics. Simulation results verify that the proposed scheme achieves better performance than a naïve Random Forest (RF) classification technique in real scenarios with unbalanced datasets.

Diana Manjarres, Itziar Landa-Torres, Imanol Andonegui

Swarm Intelligence in Solving Stochastic Capacitated Vehicle Routing Problem

In this paper, the two most popular Swarm Intelligence approaches, Particle Swarm Optimization (PSO) and Ant Colony Optimization (ACO), are compared in the task of solving the Capacitated Vehicle Routing Problem with Traffic Jams (CVRPwTJ). The CVRPwTJ is a highly challenging optimization problem: the CVRP is already NP-hard, and adding a stochastic layer to its definition (the random occurrence of traffic jams while traversing the planned vehicle routes) further increases the difficulty by requiring that potential solution methods be capable of on-line adaptation of the routes in response to changing traffic conditions. The results presented in the paper shed light on the underlying differences between ACO and PSO in terms of their suitability for solving particular instances of CVRPwTJ.

Jacek Mańdziuk, Maciej Świechowski

LSTM Recurrent Neural Networks for Short Text and Sentiment Classification

Recurrent neural networks are increasingly used to classify text data, displacing feed-forward networks. This article demonstrates how to classify text using Long Short-Term Memory (LSTM) networks and their modifications, i.e. the Bidirectional LSTM network and the Gated Recurrent Unit. We demonstrate the superiority of this method over other algorithms for text classification on three datasets: Spambase Data Set, Farm Advertisement and Amazon book reviews. The results for the first two datasets were compared with an AdaBoost ensemble of feedforward neural networks. For the last dataset, the result is compared to the bag-of-words algorithm. In this article, we focus on classifying two groups in the first two collections, since we are only interested in whether something is classified as spam or as a legitimate message. In the last dataset, we distinguish three classes.

Jakub Nowak, Ahmet Taspinar, Rafał Scherer

Categorization of Multilingual Scientific Documents by a Compound Classification System

The aim of this study was to propose a classification method for documents that simultaneously include text parts in various languages. For this purpose, we constructed a three-level classification system. On its first level, a data processing module prepares a suitable vector space model. Next, in the middle tier, a set of monolingual or multilingual classifiers assigns to each document or its parts the probabilities of belonging to each possible category. The models are trained using Multinomial Naïve Bayes and Long Short-Term Memory algorithms. Finally, in the last component, a multilingual decision module assigns a target class to each document. The module is built on a logistic regression classifier, which receives as inputs the probabilities produced by the classifiers. The system has been verified experimentally. According to the reported results, it can be assumed that the proposed system can deal with textual documents whose content is composed of many languages at the same time. The system can therefore be useful in the automatic organization of multilingual publications or other documents.

Jarosław Protasiewicz, Marcin Mirończuk, Sławomir Dadas

Cognitive Content Recommendation in Digital Knowledge Repositories – A Survey of Recent Trends

This paper presents an overview of the cognitive aspects of the content recommendation process in large heterogeneous knowledge repositories and their application to the design of algorithms for incremental learning of users' preferences, emotions, and satisfaction. This allows the recommendation procedures to align with the present and expected cognitive states of a user, increasing the combined efficiency of recommendation and repository use. The learning algorithm takes into account the results of the cognitive and neural modelling of users' decision behaviour. Inspirations from nature used in recommendation systems differ from the usual mimicking of biological neural processes. Specifically, a cognitive knowledge recommender may follow a strategy to discover emotional patterns in user behaviour and then adjust the recommendation procedure accordingly. Knowledge of cognitive decision mechanisms helps to optimize recommendation goals. Other cognitive recommendation procedures assist users in creating consistent learning or research groups. The primary application field of the above algorithms is a large knowledge repository coupled with an innovative training platform developed within an ongoing Horizon 2020 research project.

Andrzej M. J. Skulimowski

Supporting BPMN Process Models with UML Sequence Diagrams for Representing Time Issues and Testing Models

Business Process Model and Notation (BPMN) is a standard for process modeling. However, such models do not specify time issues such as the time needed to perform tasks or the time for which resources are used. We propose a complementary UML sequence model generated from the BPMN model. Such a model can support time specification and provide direct time visualization. It is also suitable for validation of time matters by domain experts and can be used to estimate and test methods in the systems based on random examination of the critical paths.

Anna Suchenia (Mroczek), Krzysztof Kluza, Krystian Jobczyk, Piotr Wiśniewski, Michał Wypych, Antoni Ligęza

Simulation of Multi-agent Systems with Alvis Toolkit

The paper presents a method of using the Alvis formal modelling language and related software to model and simulate multi-agent systems. The approach is illustrated with an example of a railway traffic management system for a real train station. One of the main advantages of this approach is the possibility of including artificial intelligence (AI) systems encoded in Haskell in Alvis models. Moreover, Alvis models can be developed at a level very close to the final implementation of the corresponding real system. Simulation logs can thus be treated as virtual prototype logs.

Marcin Szpyrka, Piotr Matyasik, Łukasz Podolski, Michał Wypych

Tensor-Based Syntactic Feature Engineering for Ontology Instance Matching

We investigate a machine learning approach to ontology instance matching. We apply syntactic and lexical text analysis as well as tensor-based data representation as means for feature engineering effectively supporting supervised learning based on logistic regression. We experimentally evaluate our approach in the scenario of the SABINE Data linking subtask defined by Ontology Alignment Evaluation Initiative. We show that, as far as the prediction of non-trivial matches is concerned, the use of the proposed tensor-based modelling of lexical and syntactical properties of the ontology instances enables achieving a significant quality improvement.

Andrzej Szwabe, Paweł Misiorek, Jarosław Bąk, Michał Ciesielczyk

Semantic Annotations for Mediation of Complex Rule Systems

The design of Business Intelligence systems capable of effectively handling domain knowledge is a well-known but still unsolved challenge for both Software and Knowledge Engineers. There exist several approaches to extracting and modeling Business Knowledge, most notably Business Processes and Business Rules. However, each of them has its own weaknesses, and it is therefore often desirable to build hybrid models composed of several knowledge representations. In this paper we describe an extension to Business Rules that facilitates the creation of such heterogeneous systems. This is achieved by introducing semantic annotations to existing rule modeling languages. We also present how the additional semantic information is leveraged in the Prosecco project.

Mateusz Ślażyński, Grzegorz J. Nalepa, Szymon Bobek, Krzysztof Kutt

Convolutional Neural Networks for Time Series Classification

This article concerns identifying objects that generate signals from various sensors. Instead of using traditional hand-crafted time series features, we feed the signals as input channels to a convolutional neural network. The network learned low- and high-level features from the data. We describe the process of data preparation, filtering, and the structure of the convolutional network. Experimental results showed that the network was able to learn to recognize objects with high accuracy.

Mariusz Zębik, Marcin Korytkowski, Rafal Angryk, Rafał Scherer

Special Session: Advances in Single-Objective Continuous Parameter Optimization with Nature-Inspired Algorithms

Frontmatter

A DSS Based on Hybrid Ant Colony Optimization Algorithm for the TSP

The traveling salesman problem (TSP) is one of the most studied problems in combinatorial optimization due to its importance. Because it is NP-hard, numerous approximation methods have been proposed to solve it. In this paper, we propose a new hybrid approach which combines local search with the ant colony optimization algorithm (ACO) for solving the TSP. The performance of the proposed algorithm is highlighted through the implementation of a Decision Support System (DSS). Some benchmark problems are selected to test the performance of the proposed hybrid method. We compare our algorithm with the classical ACO and with some well-known methods. The experiments show that the proposed hybrid method improves the quality of solutions compared with the classical ACO algorithm and distinctly reduces computing time. Our approach also outperforms the compared algorithms in most cases in terms of solution quality and robustness.

Islem Kaabachi, Dorra Jriji, Saoussen Krichen

Comparing Strategies for Search Space Boundaries Violation in PSO

In this paper, we compare four methods for controlling a particle's position when it violates the search space boundaries, and their impact on the performance of the Particle Swarm Optimization algorithm (PSO). The methods are: hard borders, soft borders, random position and spherical universe. The goal is to compare the performance of these methods for the classical version of PSO and for a popular modification, the Attractive and Repulsive Particle Swarm Optimization (ARPSO). The experiments were carried out according to CEC benchmark rules and statistically evaluated.

Tomas Kadavy, Michal Pluhacek, Adam Viktorin, Roman Senkerik
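
The four strategies named in the abstract above can be illustrated with a minimal Python sketch. The abstract does not define them, so the interpretations below are assumptions based on how these terms are commonly used in the PSO literature, not the authors' implementation: hard borders clamp a coordinate to the violated bound, random position resamples it uniformly inside the bounds, spherical universe wraps it around periodically, and soft borders leave the position unchanged but assign an out-of-bounds particle the worst possible fitness (minimization assumed). All function names are hypothetical.

```python
import random

def hard_border(x, lo, hi):
    """Assumed 'hard borders': clamp the coordinate to the violated bound."""
    return min(max(x, lo), hi)

def random_position(x, lo, hi, rng=random):
    """Assumed 'random position': resample an out-of-bounds coordinate uniformly."""
    return x if lo <= x <= hi else rng.uniform(lo, hi)

def spherical_universe(x, lo, hi):
    """Assumed 'spherical universe': opposite borders are neighbours (wrap-around)."""
    return lo + (x - lo) % (hi - lo)

def soft_border_fitness(position, bounds, objective):
    """Assumed 'soft borders': the particle may leave the search space,
    but its fitness is then treated as the worst value (minimization)."""
    inside = all(lo <= x <= hi for x, (lo, hi) in zip(position, bounds))
    return objective(position) if inside else float("inf")

def sphere(position):
    """Simple test objective: sum of squares."""
    return sum(x * x for x in position)
```

In a PSO loop, one of the first three functions would be applied per coordinate right after the position update, while the soft-border variant replaces the fitness evaluation instead of modifying the position.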

PSO with Attractive Search Space Border Points

One of the biggest drawbacks of the original Particle Swarm Optimization is premature convergence and fast loss of diversity in the population. In this paper, we propose and discuss a simple yet effective modification to help the PSO maintain diversity and avoid premature convergence. The particles are randomly attracted towards the border points of the search space. We use the CEC13 Benchmark function set to test the performance of the proposed method and compare it to the original PSO.

Michal Pluhacek, Roman Senkerik, Adam Viktorin, Tomas Kadavy

Differential Evolution Driven Analytic Programming for Prediction

This research deals with the hybridization of Analytical Programming (AP), an open framework for symbolic regression, with the Differential Evolution (DE) algorithm in the task of time series prediction. This paper provides closer insight into the applicability and performance of the connection between AP and different strategies of DE. AP can be considered a powerful open framework for symbolic regression thanks to its applicability in any programming language with an arbitrary driving evolutionary or swarm-based algorithm. The motivation behind this research is thus to explore and investigate the differences in performance of AP driven by basic canonical strategies of DE as well as by the state-of-the-art strategy, Success-History based Adaptive Differential Evolution (SHADE). A simple experiment was carried out with a time series consisting of 300 data points of the GBP/USD exchange rate, where the first 2/3 of the data were used for the regression process and the last 1/3 were used to verify the prediction process. The differences between the regression/prediction models synthesized by means of AP, as a direct consequence of the performance of the different DE strategies, are briefly discussed in the conclusion section of the paper.

Roman Senkerik, Adam Viktorin, Michal Pluhacek, Tomas Kadavy, Ivan Zelinka

Archive Analysis in SHADE

The aim of this research paper is to analyze the current optional archive in Success-History based Adaptive Differential Evolution (SHADE), which is used during mutation. The usefulness of the archive is analyzed on the CEC 2015 benchmark set of test functions, where the impact of successful archive use on the final test function value is studied. This paper also proposes a new version of the optional archive named Enhanced Archive (EA), which is also tested on the CEC 2015 benchmark set, and the results are compared with the canonical version. Two research questions are discussed: whether SHADE with EA performs better than canonical SHADE and whether it makes better use of the archive.

Adam Viktorin, Roman Senkerik, Michal Pluhacek, Tomas Kadavy

Special Session: Stream Data Mining

Frontmatter

Learning in Nonstationary Environments: A Hybrid Approach

Solutions present in the literature for learning in nonstationary environments can be grouped into two main families: passive and active. Passive solutions rely on a continuous adaptation of the envisaged learning system, while active ones trigger the adaptation only when needed. Passive and active solutions are somewhat complementary, and one should be preferred over the other depending on the nonstationarity rate and the tolerable computational complexity. The aim of this paper is to introduce a novel hybrid approach that jointly uses an adaptation mechanism (as in passive solutions) and a change detection mechanism that triggers the need to retrain the learning system (as in active solutions).

Cesare Alippi, Wen Qi, Manuel Roveri

Classifier Concept Drift Detection and the Illusion of Progress

When a new concept drift detection method is proposed, a common way to show the benefits of the new method is to use a classifier to perform an evaluation where, each time the new algorithm detects a change, the current classifier is replaced by a new one. Accuracy in this setting is considered a good measure of the quality of the change detector. In this paper we claim that this is not a good evaluation methodology and we show how a non-change detector can improve the accuracy of the classifier in this setting. We claim that this is due to the existence of temporal dependence in the data, and we propose not to evaluate concept drift detectors using only classifiers.

Albert Bifet

Heuristic Regression Function Estimation Methods for Data Streams with Concept Drift

In this paper, regression function estimation methods based on Parzen kernels are investigated. Both the modeled function and the variance of the noise are assumed to be time-varying. The commonly known kernel estimator is extended by adopting two popular tools often applied in concept-drifting data stream scenarios. The first tool is a sliding window, in which only a constant number of recently received data elements affect the estimator. The second one is the forgetting factor. In this case, at each time step past data become less and less important. These heuristic approaches are experimentally compared with the basic mathematically justified estimator and demonstrate similar accuracy.

Maciej Jaworski, Piotr Duda, Leszek Rutkowski, Patryk Najgebauer, Miroslaw Pawlak
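
The two heuristics described in the abstract above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the authors' estimator: the kernel estimator is taken to be of the Nadaraya-Watson form (a kernel-weighted average of the observed outputs), the kernel is Gaussian, and the class names, the `bandwidth` parameter and the fixed list of query points are all hypothetical choices made here.

```python
import math
from collections import deque

def gaussian_kernel(u):
    """Gaussian kernel, used here as one possible Parzen kernel (assumption)."""
    return math.exp(-0.5 * u * u)

class SlidingWindowKernelRegressor:
    """Kernel regression over only the `window` most recent (x, y) pairs."""

    def __init__(self, window, bandwidth):
        self.data = deque(maxlen=window)  # older elements drop out automatically
        self.h = bandwidth

    def update(self, x, y):
        self.data.append((x, y))

    def predict(self, x):
        num = den = 0.0
        for xi, yi in self.data:
            w = gaussian_kernel((x - xi) / self.h)
            num += w * yi
            den += w
        return num / den if den > 0.0 else 0.0

class ForgettingKernelRegressor:
    """Kernel regression at fixed query points; past data are down-weighted
    by the forgetting factor `lam` (0 < lam < 1) at every time step."""

    def __init__(self, query_points, lam, bandwidth):
        self.q = list(query_points)
        self.lam = lam
        self.h = bandwidth
        self.num = [0.0] * len(self.q)
        self.den = [0.0] * len(self.q)

    def update(self, x, y):
        for j, qx in enumerate(self.q):
            k = gaussian_kernel((qx - x) / self.h)
            self.num[j] = self.lam * self.num[j] + k * y
            self.den[j] = self.lam * self.den[j] + k

    def predict(self, j):
        return self.num[j] / self.den[j] if self.den[j] > 0.0 else 0.0
```

The sliding-window variant needs memory proportional to the window size, whereas the forgetting-factor variant keeps only two running sums per query point, which is one reason such recursive forms are attractive for data streams.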

Backmatter

Title: Artificial Intelligence and Soft Computing
Edited by: Leszek Rutkowski
Marcin Korytkowski
Rafał Scherer
Ryszard Tadeusiewicz
Lotfi A. Zadeh
Jacek M. Zurada
Publisher: Springer International Publishing
Electronic ISBN: 978-3-319-59060-8
Print ISBN: 978-3-319-59059-2
DOI: https://doi.org/10.1007/978-3-319-59060-8