Skip to main content

2016 | Buch

Computer Information Systems and Industrial Management

15th IFIP TC8 International Conference, CISIM 2016, Vilnius, Lithuania, September 14-16, 2016, Proceedings

insite
SUCHEN

Über dieses Buch

This book constitutes the proceedings of the 15th IFIP TC8 International Conference on Computer Information Systems and Industrial Management, CISIM 2016, held in Vilnius, Lithuania, in September 2016.

The 63 regular papers presented together with 1 inivted paper and 5 keynotes in this volume were carefully reviewed and selected from about 89 submissions. The main topics covered are rough set methods for big data analytics; images, visualization, classification; optimization, tuning; scheduling in manufacturing and other applications; algorithms; decisions; intelligent distributed systems; and biometrics, identification, security.

Inhaltsverzeichnis

Frontmatter
Correction to: Complex Adaptive Systems and Interactive Granular Computing

The acknowledgement section of this paper originally referred to grant DEC-2013/09/B/ST6/01568. The reference to this grant has been removed from the acknowledgement section at the request of one of the authors.

Andrzej Skowron

Invited Paper

Frontmatter
Neural Networks – State of Art, Brief History, Basic Models and Architecture

The history of neural networks can be traced back to the work of trying to model the neuron. Today, neural networks discussions are occurring everywhere. Neural networks, with their remarkable ability to derive meaning from complicated or imprecise data, can be used to extract patterns and detect trends that are too complex to be noticed by either humans or other computer techniques. A brief history of the neural networks research is presented and some more popular models are briefly discussed. The major attention is on the feed-forward networks and specially to the topology of such the network and method of building the multi-layer perceptrons.

Bohdan Macukow

Rough Set Methods for Big Data Analytics

Frontmatter
Complex Adaptive Systems and Interactive Granular Computing

We discuss an approach to modeling of computations performed by Complex Adaptive Systems (CAS) based on Interactive Granular Computing (IGrC).

Andrzej Skowron
Innovation and Big Data

Innovation is an essential issue for companies and educational institutes. We have seen that well-known companies came into trouble since they did not innovate in time. They kept their well-established way of operating and adhered to their old-fashioned business models. The main topic of this contribution is to address the question: to what extent can the availability of Big Data support the innovation attempts of companies and educational institutes?We start by mentioning five examples of innovation and relate them to data science and big data. We describe basic concepts and developments. We then formulate six possible classes of obstacles and take into account the use of sensitive data. Finally, we arrive at two conclusions and three recommendations.

H. Jaap van den Herik
Imbalanced Data Classification: A Novel Re-sampling Approach Combining Versatile Improved SMOTE and Rough Sets

In recent years, the problem of learning from imbalanced data has emerged as important and challenging. The fact that one of the classes is underrepresented in the data set is not the only reason of difficulties. The complex distribution of data, especially small disjuncts, noise and class overlapping, contributes to the significant depletion of classifier’s performance. Hence, the numerous solutions were proposed. They are categorized into three groups: data-level techniques, algorithm-level methods and cost-sensitive approaches. This paper presents a novel data-level method combining Versatile Improved SMOTE and rough sets. The algorithm was applied to the two-class problems, data sets were characterized by the nominal attributes. We evaluated the proposed technique in comparison with other preprocessing methods. The impact of the additional cleaning phase was specifically verified.

Katarzyna Borowska, Jarosław Stepaniuk
Embedding the V-Detector Algorithm in FPGA

The b-v model is a hybrid immune-based approach for detecting anomalies in high-dimensional datasets. It is based on a negative selection algorithm and utilizes both types of detectors to achieve better results in comparison to single detection models. Also, it is an interesting alternative to well known traditional, statistical approaches, because only positive (self) examples are required at the learning stage. As a result, it is able to detect even unnkown or never met anomalies and this fact is one of the most attractive features of this approach. However, especially in the case of on-line classification, not only high accuracy but also high efficiency is needed. Thus, we propose to embed some complex tasks in a reprogrammable FPGA to offload CPU and speed up the classification process. This paper presents a hardware implementation of the V-Detector algorithm, which is the most complex and time consuming part of b-v model.

Maciej Brzozowski, Andrzej Chmielewski
Attribute Reduction Based on MapReduce Model and Discernibility Measure

This paper discusses two important problems of data reduction. The problems are related to computing reducts and core in rough sets. The authors use the fact that the necessary information about discernibility matrices can be computed directly from data tables, in the case of this paper so called counting tables are used. The discussed problems are of high computational complexity. Hence the authors propose to use the relevant heuristics, MRCR (MapReduce Core and Reduct Generation) implemented using the MapReduce model.

Michal Czolombitko, Jaroslaw Stepaniuk
Mapping Points Back from the Concept Space with Minimum Mean Squared Error

In this article we present a method to map points from the concept space, associated with the fuzzy c–means algorithm, back to the feature space. We assume that we have a probability density function f defined on the feature space (e.g. a normalized density of a data set). For a given point $$\varvec{w}$$ of concept space, we give explicitly a set of points in feature space that are mapped onto $$\varvec{w}$$ and we give a formula for a reverse mapping to the feature space which results in minimum mean squared error, with respect to density f, of the operation of mapping a point of feature space into the concept space and back. We characterize the circumstances under which points can be mapped back into the feature space unambiguously and provide a formula for the inverse mapping.

Wladyslaw Homenda, Tomasz Penza
Representatives of Rough Regions for Generating Classification Rules

Rough set theory provides a useful tool for describing uncertain concepts. The description of a given concept constructed based on rough regions can be used to improve the quality of classification. Processing large data using rough set methods requires efficient implementations as well as alternative approaches to speed up computations. This paper proposes a representative-based approach for rough region-based classification. Positive, boundary, and negative regions are replaced with their representatives sets that preserve information needed for generating classification rules. For data divisible into a relatively low number of equivalence classes representatives sets are considerably smaller than the whole regions. Using a small representation of regions significantly speeds up the process of rule generation.

Piotr Hońko
Hardware Supported Rough Sets Based Rules Generation for Big Datasets

In this paper we propose a combination of capabilities of the Field Programmable Gate Arrays based device and PC computer for rough sets based data processing resulting in generation of decision rules. Solution is focused on big datasets. Presented architecture has been tested in programmable unit on real datasets. Obtained results confirm the significant acceleration of the computation time using hardware supporting rough set operations in comparison to software implementation.

Maciej Kopczynski, Tomasz Grzes, Jaroslaw Stepaniuk

Images, Visualization, Classification

Frontmatter
Information Density Based Image Binarization for Text Document Containing Graphics

In this work, a new clustering based binarization technique has been proposed. Clustering is done depending on the information density of the input image. Here input image is considered as a set of text, images as foreground and some random noises, marks of ink, spots of oil, etc. in the background. It is often quite difficult to separate the foreground from the background based on existing binarization technique. The existing methods offer good result if the input image contains only text. Experimental results indicate that this method is particularly good for degraded text document containing graphic images as well. USC-SIPI database is used for testing phase. It is compared with iterative partitioning, Otsu’s method for seven different metrics.

Soma Datta, Nabendu Chaki, Sankhayan Choudhury
MRI Texture-Based Classification of Dystrophic Muscles. A Search for the Most Discriminative Tissue Descriptors

The study assesses the usefulness of various texture-based tissue descriptors in the classification of canine hindlimb muscles. Experiments are performed on T2-weighted Magnetic Resonance Images (MRI) acquired from healthy and Golden Retriever Muscular Dystrophy (GRMD) dogs over a period of 14 months. Three phases of canine growth and/or dystrophy progression are considered. In total, 39 features provided by 8 texture analysis methods are tested. Features are ranked according to their frequency of selection in a modified Monte Carlo procedure. The top-ranked features are used in differentiation (i) between GRMD and healthy dogs at each phase of canine growth, and (ii) between three phases of dystrophy progression in GRMD dogs. Three classifiers are applied: Adaptive Boosting, Neural Networks, and Support Vector Machines. Small sets of selected features (up to 10) are found to ensure highly satisfactory classification accuracies.

Dorota Duda, Marek Kretowski, Noura Azzabou, Jacques D. de Certaines
Detection of Orbital Floor Fractures by Principal Component Analysis

Principal component analysis (PCA) is a statistical method based on orthogonal transformation, which is used to convert possibly correlated datasets into linearly uncorrelated variables called principal components. PCA is one of the simplest methods based on the eigenvector analysis. This method is widely used in many fields, such as signal processing, quality control or mechanical engineering. In this paper, we present the use of PCA in area of medical image processing. In the medical image processing with subsequent reconstruction of 3D models, data from sources such as Computed Tomography (CT) or Magnetic Resonance Imagining (MRI) are used. Series of images representing axial slices of human body are stored in Digital Imaging and Communications in Medicine (DICOM) format. Physical properties of different body tissues are characterized by different shades of grey of each pixel correlated to the tissue density. Properties of each pixel are then used in image segmentation and subsequent creation of 3D model of human organs. Image segmentation splits digital image into regions with similar properties which are later used to create 3D model. In many cases accurate detections of edges of such objects are necessary. This could be for example the case of a tumour or orbital fracture identification. In this paper, identification of the orbital fracture using PCA method is presented as an example of application of the method in the area of medical image processing.

Daniel Krpelik, Milan Jaros, Marta Jarosova, Petr Strakos, Tereza Buresova, Alena Vasatova, Tomas Karasek
An Approach to Cell Nuclei Counting in Histological Image Analysis

The paper describes a technique for automated cell nuclei counting. In this study, the primary goal is to provide simple and effective automated scheme of cell nuclei counting. The experiments on public data set of histology images have demonstrated acceptable level of calculation results.

Maryna Lukashevich, Valery Starovoitov
Automatic Segmentation Framework for Fluorescence in Situ Hybridization Cancer Diagnosis

In this paper we address a problem of HER2 and CEN-17 reactions detection in fluorescence in situ hybridization images. These images are very often used in situation where typical biopsy examination is not able to provide enough information to decide on the type of treatment the patient should undergo. Here the main focus is placed on the automatization of the procedure. Using an unsupervised neural network and principal component analysis, we present a segmentation framework that is able to keep the high segmentation accuracy. For comparison purposes we test the neural network approach against an automatic threshold method.

Marcin Stachowiak, Łukasz Jeleń
Neural Network Classification of SDR Signal Modulation

With the rising popularity of Software Defined Radios (SDR), there is a strong demand for automatic detection of the modulation type and signal parameters. Automatic modulation classification is an approach to identify the modulation type and its parameters such as the carrier frequency or symbol rate. In electronic warfare, it enables real-time signal interception and processing. In civil applications, it can be used, e.g., by the amateur radio operators to automatically set the transceiver to the appropriate modulation and communication protocol. This paper presents a modulation classification driven by a neural network. A set of signal features are provided as an input of the neural network. The paper discusses the relevance of different signal features and its impact on the success rate of the neural network classification. The proposed approach is tested on both artificial and real samples captured by the SDR.

Jakub Stebel, Michal Krumnikl, Pavel Moravec, Petr Olivka, David Seidl
Skull Stripping for MRI Images Using Morphological Operators

One of the most common MRI (Magnetic Resonance Imaging) use is a brain visualisation. Brain anatomy is highly complicated therefore it might be difficult to extract only these structures which have diagnostic value. In a consequence it is so necessary to develop and apply most efficient brain’s segmentation algorithms. One of the first steps in case of neurological MRI analysis is skull stripping. It involves removing extra-meningeal tissue from the head image, therefore it is essential to find the best method to determine the brain and skull boundaries. In T1-weighted images, cerebrospinal fluid (CSF) space and skull are dark, that is why the edges between the brain and the skull are well-marked but even strong edges might be unsettled because of finite resolution during MRI acquisition or the presence of other anatomical partial structures within the brain (connections between the brain and optic nerves or brainstem). There are many ways to perform this operation, none of them is not so great as to constitute a standard proceedings. In many cases, there are limitations associated with the development environment, license and images input that hinder skull stripping without specialised software. Proposed method is free of these constraints. It is based on application of morphological operations and image filtration to enhance the result of the edge detection and to provide better tissues separation. The efficiency was compared with other methods, common in commercial use, and the results of this comparison was presented in this paper.

Joanna Swiebocka-Wiek
Empirical Assessment of Performance Measures for Preprocessing Moments in Imbalanced Data Classification Problem

The article concerns the problem of imbalanced data classification, when classes, into which elements belong, are not equally represented. In the classification model building process cross-validation technique is one of the most popular to assess the efficacy of a classifier. While over-sampling methods are used to create new objects to obtain the balance between the number of objects in classes, inappropriate usage of the preprocessing moment has a direct impact on the achieved results. In most cases they are overestimated. To present and assess this phenomenon in this paper three preprocessing techniques (SMOTE, Safe-level SMOTE, SPIDER) and their modifications are used to make new elements of data sets to balance cardinalities of classes, and two classification methods (SVM, C4.5) are compared. k-folds cross-validation technique ($$k=10$$) considering two moments of preprocessing approaches is performed. The measures as precision, recall, F-measure and area under the ROC curve (AUC) are calculated and compared.

Paweł Szeszko, Magdalena Topczewska

Optimization, Tuning

Frontmatter
Optimization of Chosen Transport Task by Using Generic Algorithms

The paper presents genetic algorithms, their properties and capabilities in solving computational problems. Using a genetic algorithm an optimization task of transportation - production regard to the transport and processing of milk will be investigated. For the network of collection centers and processing plants (factories) the cost-optimal transportation plan regarding to raw materials to the relevant factories will be established. It is assumed that the functions defining the costs of processing are polynomials of the second degree. To solve the problem the program that uses genetic algorithms written in MATLAB will be used.

Anna Burduk, Kamil Musiał
Fast Branch and Bound Algorithm for the Travelling Salesman Problem

New strategies are proposed for implementing algorithms based on Branch and Bound scheme. Those include two minimal spanning tree lower bound modifications, a design based on the fact that edges in the optimal tour can never cross in the euclidean TSP and parallelization of Branch and Bound scheme. Proposed approaches are compared with primary algorithms.

Radosław Grymin, Szymon Jagiełło
Simulations for Tuning a Laser Power Control System of the Cladding Process

Our aim is present the methodology of simulations for repetitive processes and tuning control systems for them in the presence of noise. This methodology is applied for tuning a laser power control system of the cladding process. Even the simplest model of this process is nonlinear, making analytical tuning rather difficult. The proposed approach allows us to select quickly the structure of the control system and to optimize its parameters. Preliminary comparisons with experimental results on a robot-based laser cladding systems are also reported. These comparisons are based on the temperature measurements, observations by a camera and IR camera.

Piotr Jurewicz, Wojciech Rafajłowicz, Jacek Reiner, Ewaryst  Rafajłowicz
Automated Application of Inventory Optimization

We present automated application of inventory optimization based on sales forecast. Inventory stock optimization is very required issue by companies recent years, however inventory models are based on the sales expectation. Therefore, the problem of optimizing inventory stock is divided into two parts, sales forecast and setting optimal inventory. We describe an automated solution to model selection for sales forecast and the inventory setting based on those predictions. In the end, we present our validation of the system through historical simulation. We compare simulations results against real inventory levels. Due to the large number of different length time series, this simulation was run in parallel on cluster and was parallelized in R. The algorithms were developed and tested on inventory time series from real data sets of the K2 atmitec company (Sales forecasting and inventory optimization are parts of ERP solution K2, which is provided by K2 atmitec s.r.o. company, http://www.k2.cz/en/).

Tomáš Martinovič, Kateřina Janurová, Kateřina Slaninová, Jan Martinovič
Optimal Ellipse Based Algorithm as an Approximate and Robust Solution of Minimum Volume Covering Ellipse Problem

We propose a algorithm to give a approximate solution of a minimal covering circle or ellipse of a set of points. The iterative algorithm is based on the optimal ellipse which best describe a given set of points.

Krzysztof Misztal, Jacek Tabor, Jakub Hyła
Process of Point Clouds Merging for Mapping of a Robot’s Working Environment

The lidar is nowadays increasingly used in many robotic applications. Nevertheless the 3D lidars are still very expensive and their use on small robots is not economical. This article briefly introduces a construction of cheap 3D lidar for indoor usage based on the 2D laser range finder. Subsequently, this article introduces process of merging acquired point clouds. The every pair of neighboring point clouds is oriented in space to fit together in the best possible way. The result of this process is a 3D map of the robot working environment. This map can be segmented and further used for a navigation. In this way, it is also possible to map inaccessible and dangerous areas.

Petr Olivka, Michal Krumnikl, Pavel Moravec, David Seidl
Techniques of Czech Language Lossless Text Compression

For lossless data compression of the texts of natural language and for achieving better compression ratio we can use linguistic and grammatical properties extracted from the text analysis. This work deals with usage of word order, word categories and grammatical rules in sentences and sentence units in Czech language. Special grammatical properties of this language which are different from for example English language are used here. Further, there is an algorithm designed for searching similarities in analyzed sentence structures and its next processing to final compressed file. For analysis of the sentence units a special tool is used which allows parsing on more levels.

Jiří Ševčík, Jiří Dvorský
Optimization of Combining of Self Organizing Maps and Growing Neural Gas

The paper deals with the issue of high dimensional data clustering. One possible way to cluster this kind of data is based on Artificial Neural Networks (ANN) such as Growing Neural Gas (GNG) or Self Organizing Maps (SOM). Parallel modification, Growing Neural Gas with pre-processing by Self Organizing Maps, and its implementation on the HPC cluster is presented in the paper. Some experimental results are also presented. We focus on effective preprocessing for GNG. The clustering is realized on the output layer of SOM and the data for GNG are distributed into parallel processes.

Lukáš Vojáček, Pavla Dráždilová, Jiří Dvorský

Scheduling in Manufacturing and Other Applications

Frontmatter
Robust Tabu Search Algorithm for Planning Rail-Truck Intermodal Freight Transport

In this paper a new efficient tabu search algorithm for assigning freight to the intermodal transport connections was developed. There were also formulated properties of the problem that can be used to design robust heuristic algorithms based on the local search methods. The quality of solutions produced by the tabu search algorithm and by often recommended greedy approach were also compared.

Wojciech Bożejko, Radoslaw Grymin, Szymon Jagiełło, Jarosław Pempera
Multi-machine Scheduling with Setup Times

In this paper we are considering the problem of tasks scheduling executed simultaneously on multiple identical machines with a cost-criterion, which is the product of the tasks execution time and the number of used machines. Moreover, it is assumed that between the tasks performed sequentially there must be setup of the machines performed. Solution to the problem comes down to a generalization of a two-dimensional packing problem. The paper presents simulated annealing algorithm with different variants of packing strategy. The conducted computational experiments proved that designated solution differs little from some lower bounds for the constraints of the objective function.

Wojciech Bożejko, Łukasz Kacprzak, Piotr Nadybski, Mieczysław Wodecki
Two Step Algorithm for Virtual Machine Distributed Replication with Limited Bandwidth Problem

This article presents a proposal of solution for the problem of optimization of the virtual machine (VM) backup or replication process in architecture with multiple locations where efficient bandwidth usage and maximal tardiness for single VM are objectives. The two step algorithm is considered, where in the first step a set of tasks is partitioned into smaller subsets for load balancing. In the second step - tabu search algorithm is used to minimize weighted sum of tardiness. The paper contains the results of computational experiments on the scalability of the presented method.

Wojciech Bożejko, Piotr Nadybski, Mieczysław Wodecki
An Optimization Based Software Tool for Individual Automated Guideway Transit Systems

Automated Guideway Transit is a form of public transportation tools where fully driverless vehicles operate in order to move passengers between specific locations. Automated Guideway Transit includes a large variety of systems including mass transit systems and limited people automated guideway transit systems. In this paper, we focus on a specific limited people mover system called personal rapid transit. We develop in this paper an optimization based software which could manage efficiently the empty vehicles movements within the personal rapid transit system. Within this context, a branch and bound algorithm is proposed and its efficiently is proved on a set of instances taken from the literature. Our algorithm is shown to get good quality results.

Ezzeddine Fatnassi
Determination of the Optimal Routes for Autonomous Unmanned Aerial Vehicle Under Varying Wind with Using of the Traveling Salesman Problem Algorithm

The goal of this paper is to propose and test the algorithm for traveling salesman problem (TSP) for autonomous unmanned aerial vehicles. In this paper we consider the situation when the multicopter is flying under a variable wind and is intended to visit indicated points. We analyze the efficiency of the algorithm in case of limited flying time on a constant height.

Jerzy Greblicki, Maciej Walczyński
Cyber Security of the Application Layer of Mission Critical Industrial Systems

In this paper we focus on proposing the effective methods of cyber protection of the application layer. We also discuss how this challenge is related to mission critical industrial and manufacturing systems. In this paper we propose two step HTTP request analysis method that engages request segmentation, statistical analysis of the extracted content and machine learning on the imbalanced data. In this work, we particularly addressed the segmentation technique that allows us to divide the large dataset on smaller subsets and learn the classifiers in a significantly shorter time. In our experiments we evaluated several classifiers that are popular in data mining community. The results of our experiments are obtained on a benchmark CSIC’10 HTTP dataset. The proposed approach allows us to further improve the achieved results of protecting application layer in comparison to other benchmark approaches.

Rafał Kozik, Michał Choraś, Rafał Renk, Witold Hołubowicz
Kohonen SOM for Image Slides Sequencing

In this paper we present a new approach to sequencing of 2-D slide images with the goal of 3-D object reconstructions. We focus our attention mostly on microscope images obtained from serial sections of biological tissues. For adequate 3-D body reconstruction the correct sequence of section should be available. We propose a modified SOM Kohonen networks with open chain topology for true sequence estimation based on based on 2-D image features invariant to section image translation and rotation, but sensitive to a scale of an object on the slide. We provide an experimental validation of the proposed method using artificial section images of a mechanical device and real microscopic histological section images.

Ewa Skubalska-Rafajłowicz, Aneta Górniak
An Evolutionary Approach to Cyclic Real World Scheduling

Real world problems are often an inspiration for engineers to model and solve new scheduling problems. In this paper a cyclic scheduling of jobs with uncertain data is considered. A minimization of the cycle time represents an important economical factor considered by the companies. Proposed model was tested and data concerning processing times was obtained. Due to the nature of stations and human operator factor, we deal with uncertain data modeled using fuzzy numbers. For the modeled production station a fuzzy genetic algorithm was proposed and tested against deterministic algorithms. Proposed algorithm outperformed all deterministic algorithms.

Dominik Żelazny

Algorithms

Frontmatter
Performance Evaluation of Probabilistic Time-Dependent Travel Time Computation

Computational performance of route planning algorithms has become increasingly important in recent real navigation applications with many simultaneous route requests. Navigation applications should recommend routes as quickly as possible and preferably with some added value. This paper presents a performance evaluation of the main part of probabilistic time-dependent route planning algorithm. The main part of the algorithm computes the full probability distribution of travel time on routes with Monte Carlo simulation. Experiments show the performance of the algorithm and suggest real possibilities of use in modern navigation applications.

Martin Golasowski, Radek Tomis, Jan Martinovič, Kateřina Slaninová, Lukáš Rapant
Flexible Global Constraint Extension for Dynamic Time Warping

Dynamic Time Warping algorithm (DTW) is an effective tool for comparing two sequences which are subject to some kind of distortion. Unlike the standard methods for comparison, it is able to deal with a different length of compared sequences or with reasonable amount of inaccuracy. For this reason, DTW has become very popular and it is widely used in many domains. One of its the biggest advantages is a possibility to specify definable amount of benevolence while evaluating similarity of two sequences. It enables to percept similarity through the eyes of domain expert, in contrast with a strict sequential comparison of opposite sequence elements. Unfortunately, such commonly used definition of benevolence cannot be applied on DTW modifications, which were created for solving specific tasks (e.g. searching the longest common subsequence). The main goal of this paper is to eliminate weaknesses of commonly used approach and to propose a new flexible mechanism for definition of benevolence applicable to modifications of original DTW.

Tomáš Kocyan, Kateřina Slaninová, Jan Martinovič
Orthogonal Illuminations in Two Light-Source Photometric Stereo

In this paper we investigate the case of ambiguous shape reconstruction from two light-source photometric stereo based on illuminating the unknown Lambertian surface. So-far this problem is merely well-understood for two linearly independent light-source directions with one illumination assumed as overhead. As already established, a necessary and sufficient condition to disambiguate the entire shape reconstruction process is controlled by the satisfaction of the corresponding second-order linear PDE with constant coefficients in two independent variables. This work extends the latter to an arbitrary pair of light-source directions transforming the above constraint into a special nonlinear PDE. In addition, a similar ambiguity analysis is also performed for a special configuration of two light-source directions assumed this time as orthogonal and contained in the vertical plane. Finally, this work is supplemented by illustrative examples exploiting symbolic computation used within a framework of continuous reflectance map model (i.e. an image irradiance equation) and applied to a genuine Lambertian surfaces.

Ryszard Kozera, Alexander Prokopenya
Graph Clustering Using Early-Stopped Random Walks

Very fast growth of empirical graphs demands clustering algorithms with nearly-linear time complexity. We propose a novel approach to clustering, based on random walks. The idea is to relax the standard spectral method and replace eigenvectors with vectors obtained by running early-stopped random walks. We abandoned iterating the random walk algorithm to convergence but instead stopped it after the time that is short compared with the mixing time. The computed vectors constitute a local approximation of the leading eigenvectors. The algorithm performance is competitive to the traditional spectral solutions in terms of computational complexity. We empirically evaluate the proposed approach against other exact and approximate methods. Experimental results show that the use of the early stop procedure does not influence the quality of the clustering on the tested real world data sets.

Małgorzata Lucińska, Sławomir T. Wierzchoń
Methods of Synthesis of Controlled Random Tests

Controlled random tests, methods of their generation, main criteria used for their synthesis, such as the Hamming distance and the Euclidean distance, as well as their application to the testing of both hardware and software systems are discussed. Available evidences suggest that high computational complexity is one of the main drawbacks of these methods. Therefore we propose a technique to overcome this problem. A method for synthesizing multiple controlled random tests based on the use of the initial random test and addition operation has been proposed. The resulting multiple tests can be interpreted as a single controlled random test. The complexity of its construction is significantly lower than the complexity of the construction of classical random tests. Examples of generated tests as well as estimates of their effectiveness compared to other solutions have been presented in experimental studies.

Ireneusz Mrozek, Vyacheslav Yarmolik
Blocking and Deadlocking Phenomena in Two-Server Tandem Configuration with Optional Feedback – Modeling and Parameter Sensitivity Investigation

Tandem queues provide good mathematical models of computer systems and networks, and their detailed examination is important for theory and applications. The study presented in this paper is based on performance analysis of a two-server computer network with blocking and deadlocking. New, practical results provided describe performance of a three-node Markovian queuing network with finite capacity buffers. The results highlight an area where measures of effectiveness, such as Quality of Service (QoS) are essential. In conclusion, a two-dimensional state graph is constructed, followed by a set of steady-state equations along with their probabilities for each of the states.

Walenty Oniszczuk
Mutual Information for Quaternion Time Series

Mutual Information method is a widely used method for estimation of time delay value in the process of time delay embedding. It’s designed for a univariate scalar time series. In the real systems often many outputs of investigated system are available. In this case a multivariate time delay estimation method is necessary if one may require to perform the uniform time delay embedding. The special case of multivariate data is a kinematic time series(e.g. quaternion time series or Euler angles time series). The main goal of this paper is to provide a method for this case: Mutual Information method’s extension for quaternion time series. The results are also compared with previosly presented quaternion’s angle method. The method was tested on the real kinematic data - the recordings of human gait.

Michał Piórek
Algorithm of Allophone Borders Correction in Automatic Segmentation of Acoustic Units

In concatenative speech synthesis the fundamental factor with heavy influence on synthesized speech quality is the database of acoustic units. In case of bases received in automatic way, the key matter is suitable marking the borders of acoustic units. This article describes the algorithm of correction of acoustic units borders appointive in automatic way. It is based on two factors specified and tested here. It also describes worked out method of grade of acoustic units database, which allows to observe the influence of introduced correction on the base quality.

Janusz Rafałko

Decisions

Frontmatter
Ensemble of Classifiers with Modification of Confidence Values

In the classification task, the ensemble of classifiers have attracted more and more attention in pattern recognition communities. Generally, ensemble methods have the potential to significantly improve the prediction base classifier which are included in the team. In this paper, we propose the algorithm which modifies the confidence values. This values are obtained as an outputs of the base classifiers. The experiment results based on thirteen data sets show that the proposed method is a promising method for the development of multiple classifiers systems. We compared the proposed method with other known ensemble of classifiers and with all base classifiers.

Robert Burduk, Paulina Baczyńska
Fuzzy Random Forest with C–Fuzzy Decision Trees

In this paper a new classification solution which joins C–Fuzzy Decision Trees and Fuzzy Random Forest is proposed. Its assumptions are similar to the Fuzzy Random Forest, but instead of fuzzy trees it consists of C–Fuzzy Decision Trees. To test the proposed classifier there was performed a set of experiments. These experiments were performed using four datasets: Ionosphere, Dermatology, Pima–Diabetes and Hepatitis. Created forest was compared to C4.5 rev. 8 Decision Tree and single C–Fuzzy Decision Tree. The influence of randomness on the classification accuracy was also tested.

Łukasz Gadomer, Zenon A. Sosnowski
On Using Speed as the Criteria of State Selection for Minimization of Finite State Machines

This paper presents a heuristic method for minimization of incompletely specified Mealy finite state machines. In this method, such optimization criteria as the speed and possibility of merging other states are taken into account already at the stage of minimizing internal states. Algorithms for the estimation of optimization criteria values are described. The proposed method is based on two states merging. Experimental results for two styles of state encoding and two types of programmable structures are presented. The results show that this approach to minimization of FSM in most of cases is more effective than classical methods in respect of FSM performance.

Adam Klimowicz
PLACE_SUBST Transformation of P/T Petri Process Nets and Its Properties

Petri nets is one of mathematical modeling languages for the description of all kind of parallel systems. Property-preserving Petri net process algebras (PPPA) were originally designed for the specification and verification of Petri net processes representing the manufacturing systems. PPPA does not need to verify the composition of Petri net processes because all their algebraic operators preserve the specified set of the properties. These original PPPA are generalized for the newly introduced class of the P/T Petri process nets (PTPN) in this article. The only one PLACE_SUBST transformation is defined for the class of PTPN and its chosen properties are presented. The PTPN can be with the support of the PLACE_SUBST transformation easily applied into the area of design, simulation and verification of multithreading programming systems executed in parallel or distributed environment. This fact is demonstrated on the simple example of the client-server distributed programming system.

Ivo Martiník
Fuzzy Dempster-Shafer Modelling and Decision Rules

In this study, we discuss the use of Dempster-Shafer theory as a well-rounded algorithmic vehicle in the construction of fuzzy decision rules. The concept of fuzzy granulation realized via fuzzy clustering is aimed at the discretization of continuous attributes. Detailed experimental studies are presented concerning well-known medical data sets available on the Web.

Zenon A. Sosnowski, Jarosław S. Walijewski
Classification of Protein Interactions Based on Sparse Discriminant Analysis and Energetic Features

Prediction of protein-protein interaction (PPI) types is an important problem in life sciences because of fundamental role of PPIs in many biological processes. In this paper we propose a new classification approach based on the extended classical Fisher linear discriminant analysis (FLDA) to predict obligate and non-obligate protein-protein interactions. To characterize properties of the protein interaction, we proposed to use the binding free energies (total of 282 features). The obtained results are better than in the previous studies.

Katarzyna Stąpor, Piotr Fabian
Ensembles of Heterogeneous Concept Drift Detectors - Experimental Study

For the contemporary enterprises, possibility of appropriate business decision making on the basis of the knowledge hidden in stored data is the critical success factor. Therefore, the decision support software should take into consideration that data usually comes continuously in the form of so-called data stream, but most of the traditional data analysis methods are not ready to efficiently analyze fast growing amount of the stored records. Additionally, one should also consider phenomenon appearing in data stream called concept drift, which means that the parameters of an using model are changing, what could dramatically decrease the analytical model quality. This work is focusing on the classification task, which is very popular in many practical cases as fraud detection, network security, or medical diagnosis. We propose how to detect the changes in the data stream using combined concept drift detection model. The experimental evaluations confirm its pretty good quality, what encourage us to use it in practical applications.

Michał Woźniak, Paweł Ksieniewicz, Bogusław Cyganek, Krzysztof Walkowiak

Intelligent Distributed Systems

Frontmatter
Harmony Search for Data Mining with Big Data

In this paper, some harmony search algorithms have been proposed for data mining with big data. Three areas of big data processing have been studied to apply new metaheuristics. The first problem is related to MapReduce architecture that can be supported by a team of harmony search agents in grid infrastructure. The second dilemma involves development of harmony search in preprocessing of data series before data mining. Moreover, harmony search as a classification algorithm is studied as the third application. Finally, some outcomes for numerical experiments are submitted.

Jerzy Balicki, Piotr Dryja, Waldemar Korłub
Harmony Search for Self-configuration of Fault–Tolerant and Intelligent Grids

In this paper, harmony search algorithms have been proposed to self-configuration of intelligent grids for big data processing. Self-configuration of computer grids lies in the fact that new computer nodes are automatically configured by software agents and then integrated into the grid. A base node works due to several configuration parameters that define some aspects of data communications and energy power consumption. We propose some optimization agents that are based on harmony search to find a suboptimal configuration of fault–tolerant grids processing big data. Criteria such as probability that all tasks meet their deadlines and also a reliability of grid are considered. Finally, some experimental results have been considered.

Jerzy Balicki, Waldemar Korłub, Jacek Paluszak, Maciej Tyszka
A Study on Fuzzy Cognitive Map Optimization Using Metaheuristics

Fuzzy Cognitive Maps (FCMs) are a framework based on weighted directed graphs which can be used for system modeling. The relationships between the concepts are stored in graph edges and they are expressed as real numbers from the $$[-1,1]$$ interval (called weights). Our goal was to evaluate the effectiveness of non-deterministic optimization algorithms which can calculate weight matrices (i.e. collections of all weights) of FCMs for synthetic and real-world time series data sets. The best results were reported for Differential Evolution (DE) with recombination based on 3 random individuals, as well as Particle Swarm Optimization (PSO) where each particle is guided by its neighbors and the best particle. The choice of the algorithm was not crucial for maps of size roughly up to 10 nodes, however, the difference in performance was substantial (in the orders of magnitude) for bigger matrices.

Aleksander Cisłak, Władysław Homenda, Agnieszka Jastrzębska
Pattern Recognition with Rejection
Combining Standard Classification Methods with Geometrical Rejecting

The motivation of our study is to provide algorithmic appro-aches to distinguish proper patterns, from garbage and erroneous patterns in a pattern recognition problem. The design assumption is to provide methods based on proper patterns only. In this way the approach that we propose is truly versatile and it can be adapted to any pattern recognition problem in an uncertain environment, where garbage patterns may appear. The proposed attempt to recognition with rejection combines known classifiers with geometric methods used for separating native patterns from foreign ones. Empirical verification has been conducted on datasets of handwritten digits classification (native patterns) and handwritten letters of Latin alphabet (foreign patterns).

Wladyslaw Homenda, Agnieszka Jastrzebska, Piotr Waszkiewicz, Anna Zawadzka
Connecting Household Weather Sensors to IoT World

As the Internet of Things applications become more widespread, we face the need to replace existing implementations with Internet-enabled sensors. However, these sensors often provide worse battery life, have higher costs per unit and are not backward-compatible with existing weather stations. This paper concentrates on the possibility of decoding the weather station data by a cheap device which would forward them to the Internet of Things frameworks. The basics steps of receiving and decoding of the wireless weather sensor data are discussed together with the common problems which are associated with their capture and processing.

Pavel Moravec, Michal Krumnikl, Petr Olivka, David Seidl
An Electronic Document for Distributed Electronic Services

The paper presents the role of documents in the implementation of various types of transactions. The main features of the document determining its usefulness in the effective exchange of legal information, ensuring the authenticity, integrity and non-repudiation of origin are presented. Considering the general background of the document, the concept of an electronic document having significant (in terms of legal effectiveness) features of traditional document as well as those features that allow its operation and processing in the virtual space has been presented. An example of the use of an electronic document in the implementation of typical transactions, which are reflecting traditional electronic transaction will be discussed as well.

Gerard Wawrzyniak, Imed El Fray

Biometrics, Identification, Security

Frontmatter
FE8R - A Universal Method for Face Expression Recognition

This paper proposes a new method for recognition of face expressions, called FE8R. We studied 6 standard expressions: anger, disgust, fear, happiness, sadness, surprise, and additional two: cry and natural. For experimental evaluation samples from MUG Facial Expression Database and color FERET Database were taken, with addition of cry expression. The proposed method is based on the extraction of characteristic objects from images by gradient transformation depending on the coordinates of the minimum and maximum points in each object on the face area. The gradient is ranked in $$[-15,+35]$$ degrees. Essential objects are studied in two ways: the first way incorporates slant tracking, the second is based on feature encoding using BPCC algorithm with classification by Backpropagation Artificial Neural Networks. The achieved classification rates have reached 95 %. The second method is proved to be fast and producing satisfactory results, as compared to other approaches.

Majida Albakoor, Khalid Saeed, Mariusz Rybnik, Mohamad Dabash
A Fault-Tolerant Authenticated Key-Conference Agreement Protocol with Forward Secrecy

In conference channels, users communicate with each other using a conference key that is used to encrypt messages. There are two basic approaches in which the key can be established. In the first one, a central server is used (with a chairman role). The server generates the key and distributes it to participants. The second approach is that all participants compute a key without a chairman. In this paper, we introduce a special type of group authentication using secret sharing, which provides an efficient way to authenticate multiple users belonging to the same group without the chairman. Our proposed protocol is a many-to-many type of authentication. Unlike most user authentication protocols that authenticate a single user each time, our proposed protocol authenticates all users of a group at once.

Tomasz Hyla, Jerzy Pejaś
Towards Generating a Unique Signature for Remote User by Keystrokes Dynamics

Keystrokes dynamics has been used for quite sometimes in authentication of users. The technique has immense possibilities due to ease of implementation and un-obtrusive nature. Researchers have been working for attaining improved accuracy rate of user identification. Such techniques are validated using standard data-set. As it turns out, the quality of input data is very much important for generating an accurate use pattern vector. In this paper, an application for data collection has been presented. The application, besides creating a user data-set, also generates a signature vector database.

Puja Mukherjee, Rituparna Chaki
A Multimodal Biometric User Identification System Based on Keystroke Dynamics and Mouse Movements

In this work it is shown how the behavioral biometrics allows to strengthen security of a personal computer during casual use. The user does not have to be even aware of verification system running in the background. Unfortunately, short passwords do not supply enough data for keystroke dynamics algorithms to be precise enough to keep the way and level the biometrics system requires. Behavioral biometrics cannot grant such authentication level as the other physiological biometric methods, e.g. fingerprint or retina scan. However, their transparency in analyzing data allows to merge methods into multimodal systems with a minimal cost. The benefit of keystroke dynamics is that it can be easily connected with some other biometric methods, especially with other human input interface devices. In this paper an approach to analyze keystroke dynamics along with mouse movement is presented. Even though both of the features are of behavioral character and hence with low repeatability, the results are good and promising for further research and modification.

Piotr Panasiuk, Maciej Szymkowski, Marcin Dąbrowski, Khalid Saeed
An Efficient Three Factor Based Remote User Authentication Protocol for Distributed Networks

In distributed networks, one major security drawback is to identify the legitimate remote users of a web service on the Internet. To eliminate this security problem, many researchers have been proposed smart card based remote user authentication for secure communication in wireless networks. The wireless networks mostly use password based protocols that are based on two factors-smart card and PIN. But, this type of authentication protocols are susceptible to password guessing attack, stolen verifier attack, replay attack etc. In this paper, we propose a three factor based mutual authentication protocol using smart card in distributed networks, which resists all possible attacks. This protocol is suitable for hand held devices due to the low computational and communicational cost.

Ashish Singh, Kakali Chatterjee

Miscellanous

Frontmatter
Music Emotion Maps in Arousal-Valence Space

In this article we present the approach in which the detection of emotion is modeled by the pertinent regression problem. Conducting experiments required building a database, annotation of samples by music experts, construction of regressors, attribute selection, and analysis of selected musical compositions. We obtained a satisfactory correlation coefficient value for SVM for regression algorithm at 0.88 for arousal and 0.74 for valence. The result applying regressors are emotion maps of the musical compositions. They provide new knowledge about the distribution of emotions in musical compositions. They reveal new knowledge that had only been available to music experts until this point.

Jacek Grekow
Attribute Grammars for Controlling House Layout Customization

This paper introduces a new framework for design automation. The discussion is focused on syntactic processing nested in a selected architecture problem of house layout customization. House layout is interpreted as multi dimensional concepts structure of domain knowledge. Syntactic structuring is based on context-free methods. We propose constructions of context-free grammars driven by concepts of multidimensional knowledge space. The formalism affords consistent representation of an approximate grammatical correctness of house layout. Furnishing grammars with attributes allows for information flow between domain’s concepts supporting features disclosing and constrains verification. The approach is validated on some simple house layout and the directions for the future development of the system are outlined. The paper focuses on conceptual framework only and, as such, is intended to stimulate further research into the various implementation considerations that are prerequisite of large-scale applications.

Władysław Homenda, Krystian Kwieciński
The Radio Direction Finding with Advantage of the Software Defined Radio

The radio-frequency engineering recently has gone through extensive development. Software-Defined Radio (SDR) plays an important role in this development, bringing new possibilities to radio-frequency engineering and enables us to look at existing radio technologies in a new, innovative way. One such technology is a Doppler antenna, which allows us to find the direction from which the radio signal of given frequency is being transmitted. The SDR allows us to design and construct a highly-simplified system based on the Doppler antenna array. This paper deals with the design and implementation of such system used for the localization using the Doppler effect and the tests of the designed Doppler antenna system.

Josef Hrabal, David Seidl, Michal Krumnikl, Pavel Moravec, Petr Olivka

Open Access

A Quality Assessment Framework for Large Datasets of Container-Trips Information

Customs worldwide are facing the challenge of supervising huge volumes of containerized trade arriving to their country with resources allowing them to inspect only a minimal fraction of it. Risk assessment procedures can support them on the selection of the containers to inspect. The Container-Trip information (CTI) is an important element for that evaluation, but is usually not available with the needed quality. Therefore, the quality of the computed CTI records from any data sources that may use (e.g. Container Status Messages), needs to be assessed. This paper presents a quality assessment framework that combines quantitative and qualitative domain specific metrics to evaluate the quality of large datasets of CTI records and to provide a more complete feedback on which aspects need to be revised to improve the quality of the output data. The experimental results show the robustness of the framework in highlighting the weak points on the datasets and in identifying efficiently cases of potentially wrong CTI records.

Michail Makridis, Raúl Fidalgo-Merino, José-Antonio Cotelo-Lema, Aris Tsois, Enrico Checchi
Synthesis of High-Speed Finite State Machines in FPGAs by State Splitting

A synthesis method of high-speed finite state machines (FSMs) in field programmable gate arrays (FPGAs) based on LUT (Look Up Table) by internal state splitting is offered. The method can be easily included in designing the flow of digital systems in FPGA. Estimations of the number of LUT levels are presented for an implementation of FSM transition functions in the case of sequential and parallel decomposition. Split algorithms of FSM internal states for the synthesis of high-speed FSMs are described. The experimental results showed a high efficiency of the offered method. FSM performance increases by 1.52 times on occasion. In conclusion, the experimental results were considered, and prospective directions for designing high-speed FSMs are specified.

Valery Salauyou
Backmatter
Metadaten
Titel
Computer Information Systems and Industrial Management
herausgegeben von
Prof. Khalid Saeed
Władysław Homenda
Copyright-Jahr
2016
Electronic ISBN
978-3-319-45378-1
Print ISBN
978-3-319-45377-4
DOI
https://doi.org/10.1007/978-3-319-45378-1

Premium Partner