Recognition of Fuzzy or Incompletely Described Objects

A typical pattern recognition problem consists in assigning a given object (the result of an observation) to one of several previously defined similarity classes of objects. The problem has a unique solution if the classes are disjoint; otherwise the object may be assignable to a class only at a limited certainty level. A more general problem arises if the object to be recognized has not been described with full accuracy. Two kinds of uncertainty are considered: missing components of the object’s description, and inaccurate description of some of the object’s features or parameters. An approach to the recognition of ill-described objects based on the concepts of relative logic is proposed. This brings the proposed approach closer to natural human decision making supported by intuition and, as such, makes it useful when the input data of the recognition problem are uncertain.

Juliusz L. Kulikowski

Evaluating and Comparing Classifiers: Review, Some Recommendations and Limitations

Evaluating the performance of a supervised classification learning method in terms of its prediction ability on independent data is very important in machine learning. It is also almost unthinkable to carry out any research work without comparing the newly proposed classifier with existing ones. This paper reviews the most important aspects of the classifier evaluation process, including the choice of evaluation metrics (scores) and the statistical comparison of classifiers. A critical view, recommendations and limitations of the reviewed methods are presented. The article provides a quick guide to the complexity of the classifier evaluation process and warns the reader about common bad habits.

Katarzyna Stąpor
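
As a small illustration of why the choice of evaluation metric discussed in the abstract above matters, the following sketch computes accuracy, precision, recall and F1 from binary predictions. It is not taken from the paper: the function name `binary_scores` and the toy labels are assumptions made for this example only.

```python
def binary_scores(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    if not pairs:
        raise ValueError("at least one labelled example is required")
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    # Ratios with an empty denominator are reported as 0.0 by convention here.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical, imbalanced hold-out labels: 8 negatives and 2 positives.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
# A trivial classifier that always predicts the majority class.
always_negative = [0] * len(y_true)
scores = binary_scores(y_true, always_negative)
print(scores)
```

On this toy data the trivial classifier reaches high accuracy while its recall and F1 are zero, which is one of the pitfalls a review of evaluation practice would typically point out.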

Multi-aspect Assessment and Classification of Porous Materials Designed for Tissue Engineering

The paper presents an approach to the classification of porous materials used in biomedicine, based on computer-aided analysis of scanning electron microscope images of sections of the examined material. Because the pores visible in the images vary in size and are highly irregular in form, selected morphological parameters are used to describe the samples of the porous material. The space of morphological parameters is automatically divided into porosity classes, which are a step towards establishing quality classes of porous material based on suggestions of experts. An approach to verifying the utility of the morphological parameters for discriminating materials according to their porosity is also proposed.

Małgorzata Przytulska, Juliusz L. Kulikowski

Enhancing English-Japanese Translation Using Syntactic Pattern Recognition Methods

In this paper, we present a novel approach to Machine Translation (MT) using syntactic Pattern Recognition (PR) methods. Our aim is to evaluate the possibility of using syntactic PR techniques in this mature field, and to identify any potential benefits that can be gleaned from such an approach. To make use of syntactic PR techniques, we propose a system that performs string-matching to pair English sentence structures with Japanese structures, as opposed to matching strings in and of themselves, and thus to facilitate translation between the languages. (The specific languages, namely English and Japanese, were chosen because their sentence structures are completely dissimilar. This, however, proves the point that such syntactic methods will be applicable for other pairs of languages too.) In order to process the sentence structures of either language as a string, we have created a representation that replaces the tokens of a sentence with their respective Part-of-Speech tags. Further, to perform the actual string-matching operation, we make use of the OptPR algorithm, an award-winning syntactic PR scheme that has been proven to achieve optimal accuracy, and that also attains the information theoretic bound. Through our experiments, we show that our implementation obtains results superior to those of a standard statistical MT system on our data set. Our results provide the additional guarantee of generating a known sentence structure in the target language. With further research, this system could be expanded to have a more complete coverage of the languages worked with. The incorporation of such PR techniques in MT, in general, and the OptPR algorithm, in particular, are both pioneering.

Thomas McMahon, B. John Oommen
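
The representation described in the abstract above, in which the tokens of a sentence are replaced by their Part-of-Speech tags and the resulting tag string is matched against known structures, can be sketched roughly as follows. This is only an illustration: the OptPR algorithm is not reproduced here, and plain Levenshtein edit distance is swapped in as the matching criterion. The tag names, the example sentence and the list of known structures are all hypothetical.

```python
def pos_string(tagged_sentence):
    """Replace each (token, tag) pair with its Part-of-Speech tag."""
    return [tag for _token, tag in tagged_sentence]


def edit_distance(a, b):
    """Levenshtein distance between two sequences (a stand-in for OptPR)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def closest_structure(query, known_structures):
    """Return the known tag sequence with the smallest edit distance."""
    return min(known_structures, key=lambda s: edit_distance(query, s))


# Hypothetical set of known sentence structures (tag sequences).
known = [
    ["DET", "NOUN", "VERB", "NOUN"],
    ["PRON", "VERB", "DET", "NOUN"],
    ["DET", "ADJ", "NOUN", "VERB"],
]
# Hypothetical tagged input sentence.
sentence = [("the", "DET"), ("dog", "NOUN"), ("quickly", "ADV"),
            ("eats", "VERB"), ("meat", "NOUN")]
query = pos_string(sentence)
best = closest_structure(query, known)
print(query, "->", best)
```

In a system of the kind the abstract describes, each known source-language structure would additionally be paired with a target-language structure; that mapping is omitted here.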

Novel Results on Random Walk-Jump Chains That Possess Tree-Based Transitions

The most difficult task in analyzing and appraising algorithms in Artificial Intelligence (AI) involves their formal mathematical analysis. In general, such an analysis is intractable because of the size of the search space and the fact that the transitions between the states within this space can be very intricate. That is why AI algorithms are, for the most part, evaluated empirically and experimentally, i.e., by simulations. However, whenever such an analysis is undertaken, it usually involves an analysis of the underlying stochastic process. In this connection, the most common tools used involve Random Walks (RWs), a field that has been extensively studied for more than a century [6]. These walks have traditionally been on a line, and the generalizations to two and three dimensions have been obtained by extending the random steps to the corresponding neighboring positions in one or many of the dimensions. The analysis of RWs on a tree has received little attention, even though it is an important topic, since a tree is a counterpart space representation of a line whenever there is some ordering on the nodes on the line. Nevertheless, RWs on a tree entail moving to non-neighbor states in the space, which makes the analysis involved and, in many cases, impossible. Such an analysis is precisely what we achieve in this rather pioneering paper. The applications of this paper are numerous. Indeed, the RW on the tree that this paper models is a type of generalization of dichotomous search with faulty feedback about the direction of the search, which makes the model pertinent to real-life applications. To resolve this, we advocate the concept of “backtracking” transitions in order to efficiently explore the search space. Interestingly, it is precisely these “backtracking” transitions that naturally render the chain “time reversible”. By doing this, we are able to bridge the gap between deterministic dichotomous search and its faulty version, explained in detail in [21].

Anis Yazidi, B. John Oommen

Travel Time Prediction for Trams in Warsaw

The paper compares different methods for predicting tram travel times in Warsaw. Predictions are constructed from historical GPS positions of trams. Three different prediction approaches were implemented and compared with the official timetables and the real travel times. The results show that the official timetables provide only approximate travel times, especially in rush hours. The proposed prediction methods outperform the official schedule in terms of travel time precision and may be used as a more accurate source of travel times for passengers.

Adam Zychowski, Konstanty Junosza-Szaniawski, Aleksander Kosicki

Diagnostic Rule Extraction Using the Dempster-Shafer Theory Extended for Fuzzy Focal Elements

The Dempster-Shafer theory and the fuzzy set theory are suitable tools for medical diagnosis support. They can deal with uncertainty of medical knowledge and imprecision of data. This paper presents a study of medical knowledge representation by means of the Dempster-Shafer theory extended with the fuzzy set theory and introduces a new rule selection algorithm. The presented method makes interpretable and reliable rule extraction possible. The method is elaborated and its performance is tested on a popular medical data set. The results show that, owing to its simple rule base and clear inference method, the presented method can be useful for cooperation between the knowledge engineer and the diagnostician.

Sebastian Porebski, Ewa Straszecka

Gait Recognition Using Motion Trajectory Analysis

Gait recognition has received significant attention in recent years due to its applications in numerous fields of computer vision, particularly in automated person identification in visual surveillance and monitoring systems. In this paper, we propose a novel algorithm for gait recognition using spatio-temporal motion characteristics of a person. The proposed algorithm consists of four steps. First, motion features are extracted from the video sequence; in the second step, they are used to generate a codebook. In the third step, the local descriptors are encoded using Fisher vector encoding. Finally, the encoded features are classified using a linear Support Vector Machine (SVM). The performance of the proposed algorithm is evaluated and compared with state-of-the-art methods on two widely used gait databases, TUM GAID and CASIA-A. The recognition results demonstrate the effectiveness of the proposed algorithm.

Muhammad Hassan Khan, Frederic Li, Muhammad Shahid Farid, Marcin Grzegorzek

Methodology of the Construction of a GDPLL(k) Grammar-Based Syntactic Pattern Recognition System

GDPLL(k) grammars have been introduced as a tool for the construction of syntactic pattern recognition-based systems. The grammars have been successfully used in several different applications. The practical experience with the implementation of a syntactic pattern recognition system based on GDPLL(k) grammars has served to define methodological guidelines for constructing such systems. The paper presents the key methodological issues.

Mariusz Flasiński, Janusz Jurek

Determining of an Estimate of the Equivalence Relation on the Basis of Pairwise Comparisons

The paper presents two approaches to solving a discrete programming problem that arises when estimating an equivalence relation on the basis of pairwise comparisons with random errors. The problem minimizes the sum of absolute differences between the relation form and the comparisons. The problem is NP-hard and can be solved with exact algorithms for sets of moderate size, about 50 elements. For larger sets (at least 100 elements) and/or multiple comparisons, it is necessary to apply heuristic algorithms. The paper presents original results: a statistical preprocessing step, based on two proposed tests, which allows an optimal or suboptimal solution to be determined for large sets at acceptable computational cost.

Leszek Klukowski
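
The objective mentioned in the abstract above, the sum of absolute differences between the relation form and the comparisons, can be illustrated with a minimal sketch. The encoding is an assumption made for this example, not the paper's notation: the relation form of a pair is 1 when both elements fall into the same class of a candidate partition and 0 otherwise, and each pairwise comparison is a 0/1 value that may be erroneous. The names `relation_form` and `total_absolute_difference` and the toy data are hypothetical.

```python
def relation_form(labels, i, j):
    """1 if elements i and j share a class in the candidate partition, else 0."""
    return 1 if labels[i] == labels[j] else 0


def total_absolute_difference(labels, comparisons):
    """Sum |relation form - comparison| over all pairs and repeated comparisons.

    `comparisons` maps a pair (i, j) to a list of 0/1 comparison results,
    so multiple comparisons of the same pair are supported.
    """
    return sum(
        abs(relation_form(labels, i, j) - result)
        for (i, j), results in comparisons.items()
        for result in results
    )


# Hypothetical comparisons for 4 elements whose underlying classes are
# {0, 1} and {2, 3}; the result for pair (1, 2) is an error.
comparisons = {
    (0, 1): [1], (0, 2): [0], (0, 3): [0],
    (1, 2): [1], (1, 3): [0], (2, 3): [1],
}
candidate_a = [0, 0, 1, 1]  # classes {0, 1} and {2, 3}
candidate_b = [0, 0, 0, 1]  # classes {0, 1, 2} and {3}
cost_a = total_absolute_difference(candidate_a, comparisons)
cost_b = total_absolute_difference(candidate_b, comparisons)
print(cost_a, cost_b)
```

An estimate of the relation is a partition with the smallest such cost; enumerating all partitions is feasible only for very small sets, which is consistent with the abstract's remark that larger instances call for heuristics.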

Classification of Body Regions Based on MRI Log Files

Every Siemens Magnetic Resonance Imaging (MRI) system consistently writes events into log files while the system is running. The log files and their contents are constantly refined by software developers, so the information they contain differs between software versions. One piece of information that is missing in some log files is the examined body region. As the body region is crucial for usage analysis, we used pattern recognition methods to estimate the examined body region for software versions that do not log it automatically. We learned the examined body region from a set of MRI acquisition parameters, such as grid and voxel size, and could classify body region information with a classification rate of up to 94.7%. We compared Bayesian Network augmented Naïve Bayes, Decision Trees, and Neural Networks, and found that Neural Networks gave the best classification rate.

Nadine Kuhnert, Oliver Lindenmayr, Andreas Maier

Semi-automatic Segmentation of Scattered and Distributed Objects

This paper presents a novel object segmentation technique to extract objects that are potentially scattered or distributed over the whole image. The goal of the proposed approach is to achieve accurate segmentation with minimal and easy user assistance. The user provides input in the form of a few mouse clicks on the target object, which are used to characterize its statistical properties using a Gaussian mixture model. This model determines the primary segmentation of the object, which is refined by morphological operations to reduce false positives. We observe that the boundary pixels of the target object are potentially misclassified. To obtain an accurate segmentation, we recast our objective as a graph partitioning problem, which is solved using the graph cut technique. The proposed technique is tested on several images to segment various types of distributed objects, e.g. fences, railings and flowers. We also show some remote sensing application examples, e.g. segmentation of roads and rivers from aerial images. The obtained results show the effectiveness of the proposed technique.

Muhammad Shahid Farid, Maurizio Lucenteforte, Muhammad Hassan Khan, Marco Grangetto

Playback Attack Detection: The Search for the Ultimate Set of Antispoof Features

Automatic speaker verification systems are vulnerable to several kinds of spoofing attacks. Some of them can be quite simple; for example, the playback of an eavesdropped recording requires neither specialized equipment nor knowledge, but may still pose a serious threat to a biometric identification module built into an e-banking application. In this paper we follow a recent approach and convert recordings to images, assuming that the original voice can be distinguished from its played-back version through the analysis of local texture patterns. We propose improvements to the state-of-the-art solution, but also show its severe limitations. This in turn leads to the fundamental question: is it possible to find one set of features that is characteristic of all playback recordings? We look for the answer by performing a series of optimization experiments, but in general the problem remains open.

Maciej Smiatacz

A Vision-Based Method for Automatic Crack Detection in Railway Sleepers

In this paper, a method for automatic selection and classification of sleeper cracks is presented. The method includes three main sequential steps: image pre-processing, sleeper detection and crack detection. Two approaches, a rule-based method and a template matching method in the frequency domain, are proposed for the sleeper detection step. We utilize adaptive threshold binarization to handle the challenging task of crack detection under non-uniform lighting conditions, and a hierarchical structure for the decision-making step. Two unsupervised classifiers are exploited to detect the cracks. The results show that the presented method achieves an overall detection accuracy of at least 87 percent.

Ahmad Delforouzi, Amir Hossein Tabatabaei, Muhammad Hassan Khan, Marcin Grzegorzek

Towards Privacy-Aware Keyboards

As shown by various studies, the dynamics of typing on a keyboard is characteristic of individual persons. On the one hand, this may allow for person identification based on keystroke dynamics in various applications. On the other hand, in certain situations, such as chat-based anonymous helplines or web searches on sensitive topics, users may not want to reveal their identity. In general, there are various methods to increase the protection of personal data. In this paper, we propose the concept of a privacy-aware keyboard, i.e., a keyboard which transmits keyboard events (such as the pressing or releasing of a key) with small random delays in order to ensure that the identity of the user is difficult to infer from her typing dynamics. We use real-world keystroke dynamics data to simulate privacy-aware keyboards with uniformly random delay and Gaussian delay. The experimental results indicate that the proposed techniques may make an important contribution to preserving the anonymity of users.

Krisztian Buza, Piroska B. Kis
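
The idea in the abstract above, transmitting key events with small random delays drawn from a uniform or a Gaussian distribution, can be sketched as follows. The details are assumptions made for this illustration and are not taken from the paper: events are represented as (timestamp in milliseconds, key, action) tuples, negative Gaussian samples are clamped to zero because an event cannot be sent before it occurs, and the original order of events is preserved. All function names and parameter values are hypothetical.

```python
import random


def uniform_delay(max_ms):
    """Delay sampler: uniformly random delay in [0, max_ms] milliseconds."""
    return lambda rng: rng.uniform(0.0, max_ms)


def gaussian_delay(mean_ms, std_ms):
    """Delay sampler: Gaussian delay; negative samples are clamped later."""
    return lambda rng: rng.gauss(mean_ms, std_ms)


def delay_events(events, sampler, rng):
    """Return the events with randomly delayed timestamps.

    events: list of (timestamp_ms, key, action) tuples sorted by timestamp.
    """
    delayed = []
    last_sent = float("-inf")
    for timestamp, key, action in events:
        sent = timestamp + max(0.0, sampler(rng))  # never send before the event
        sent = max(sent, last_sent)                # keep the original order
        delayed.append((sent, key, action))
        last_sent = sent
    return delayed


# Hypothetical keystroke log: typing "hi".
events = [(0.0, "h", "press"), (80.0, "h", "release"),
          (150.0, "i", "press"), (210.0, "i", "release")]
rng = random.Random(0)  # fixed seed so the sketch is reproducible
uniform_out = delay_events(events, uniform_delay(30.0), rng)
gaussian_out = delay_events(events, gaussian_delay(15.0, 5.0), rng)
print(uniform_out)
print(gaussian_out)
```

Larger delays would presumably hide more of the typing rhythm at the price of a less responsive keyboard; how that trade-off plays out on real data is what the paper's experiments address.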

Saliency-Based Optimization for the Histogram of Oriented Gradients-Based Detection Methods

The paper presents research on using a graphical saliency cue to optimize the computation of distance metrics for HoG features. Saliency values have been computed in the area of a HoG descriptor at macro and micro scale. The macro scale uses HoG features as a global descriptor for an image presenting a particular object, whereas the micro scale uses feature points and a k-nearest neighbours approach to create a similarity measure. The mechanism has been tested on a chosen database consisting of 20 000 images. Promising results have been achieved for the macro scale approach.

Grzegorz Kurzejamski, Marcin Iwanowski

Efficient Sketch Recognition Based on Shape Features and Multidimensional Indexing

Face sketch recognition on real forensic mug shot photo galleries is a complex task, since a large number of images needs to be matched in a few seconds to produce a useful outcome. Several effective solutions for sketch-based subject identification have recently been proposed, but the cost of linear search makes them unscalable when large databases have to be scanned. In this work we propose an approach which combines the use of efficient shape features for sketch-photo matching with a suitable indexing structure based on dimensionality reduction. The proposed method provides a preliminary set of candidate photos to be used as input for the final identification based on state-of-the-art techniques, offering scalability and time efficiency without noticeably compromising recognition accuracy, as confirmed by the experimental results.

Simone Buoncompagni, Annalisa Franco, Dario Maio

Performance Evaluation of Selected Thermal Imaging-Based Human Face Detectors

The paper is devoted to the problem of face detection in thermal imagery. Its aim is to investigate several contemporary general-purpose object detectors known to be accurate when working in visible lighting conditions. The employed classifiers are based on the AdaBoost learning method with three types of low-level descriptors, namely Haar-like features, Histogram of Oriented Gradients (HOG), and Local Binary Patterns. Additionally, the performance of the recently proposed Max-Margin Object-Detection Algorithm combined with a HOG feature extractor, and of a Deep Neural Network-based approach, has been investigated. Experiments performed on images taken in controlled and uncontrolled conditions, gathered in our own benchmark database and in a few other databases, support the final observations and conclusions.

Paweł Forczmański

On a New Method of Dynamic Integration of Fuzzy Linear Regression Models

The study considers the problem of ensemble regression with fuzzy linear regression (FLR) models. For this case a novel method of integration is proposed, in which the fuzzy responses of the base FLR models are integrated first and the fuzzy response of the common model is then defuzzified. Four different operators are defined for the integration procedure. The performance of the proposed methods, which integrate FLR base models on the soft level, was compared against a state-of-the-art integration method on the crisp level using computer-generated datasets with linear, second-order and third-order models and different variances of Gaussian disturbances. The root mean square error was applied as the criterion of method quality. The results of the computer experiments clearly show that in many cases the proposed methods significantly outperform the reference approach.

Jakub Kozerski, Marek Kurzynski

Ensemble Machine Learning Approach for Android Malware Classification Using Hybrid Features

Feature-based learning plays a crucial role in building and sustaining security. Determining from its extracted features whether a piece of software is a benign or malign process, and in particular classifying it into the correct malware family, improves the security of the operating system and protects critical user information. In this paper, we present a novel hybrid feature-based classification system for Android malware samples. Static features, such as permissions requested by mobile applications and hidden payload, and dynamic features, such as API calls, installed services and network connections, are extracted for classification. We apply machine learning and evaluate the classification accuracy of different classifiers on Android malware features extracted from a fairly large set of 3339 samples belonging to 20 malware families. The evaluation study was carried out in a scalable manner on 5 guest machines and took 8 days of processing. The testing accuracy reaches 92%.

Abdurrahman Pektaş, Tankut Acarman

An Ensemble of Weak Classifiers for Pattern Recognition in Motion Capture Clouds of Points

This paper treats the labeling of a cloud of points as a classification problem and proposes an ensemble of weak classifiers to solve it. First, we define a set of geometrical features over small subsets of the cloud of points. Then, we apply an AdaBoost-like strategy to select a collection of features achieving a target accuracy in detecting whether a labeling as a whole is correct. Furthermore, we use these features to generate the labeling of the points in the cloud. We demonstrate the approach on a real dataset obtained from the measurement of gait motion of persons, for which the ground truth labeling has been carried out manually. Results are encouraging, achieving high accuracy in both tasks (correct label detection and label generation) at a reduced computational cost.

Juan Luis Jiménez-Bascones, Manuel Graña

Portable Dynamic Malware Analysis with an Improved Scalability and Automatisation

Malware is deployed ubiquitously to steal safety- or liability-critical information and to damage the compromised systems. In this paper, we present a portable, scalable and transparent system for dynamic analysis of malware targeting Windows OS. The portability feature is enabled by introducing a driver capable of collecting the behavioural activities of analysed samples at a low kernel level, and detection of new malware in the latest version of Windows OS is guaranteed without waiting for its signature update. A large volume and variety of malicious behaviour is monitored and analysed by the presented virtual, scalable and automated system deployment. An end-to-end design is presented, and functional tests of the portability feature are conducted by compiling the developed kernel driver component on the analysis machine. Evaluation is performed using recently captured malware samples that are automatically analysed and detected on Windows 8 Ultimate 64-bit and Windows 10 OS.

Abdurrahman Pektaş, Tankut Acarman

Projection-Based Person Identification

The increasing interest in person identification based on keystroke dynamics can be attributed to several factors. First of all, it is a cheap and widely applicable technique, while online services such as internet banking or online tax declaration require reliable person identification methods. Furthermore, there are various attack techniques against the existing identification methods, so combining the existing methods with new person identification methods could improve the reliability of the identification. Recent research shows that person identification based on machine learning using keystroke dynamics data works surprisingly well. This is because the dynamics of typing is characteristic of each user, and a user is hardly able to mimic the typing dynamics of another user. In this paper, we propose to use a projection-based classification technique for the task of person identification based on keystroke dynamics.

Dora Neubrandt, Krisztian Buza

3-Steps Keyboard: Reduced Interaction Interface for Touchless Typing with Head Movements

This paper introduces a novel technique for touchless typing with head movements that allows any character of the alphabet to be reached in only three steps. Head movements are frequently used for human-computer interaction by users with motor impairments who are unable to operate standard computer input devices. Typing is a major difficulty in such interfaces: many directional head movements are required to reach subsequent characters on the on-screen keyboard, and an additional mechanism (such as an eye blink or opening the mouth) supplements the selection process. In this paper, a reduced interaction keyboard for touchless typing with head movements is proposed. The solution is based on recognition of head movements in four main directions.

Adam Nowosielski

An Algorithm for Selective Preprocessing of Multi-class Imbalanced Data

In this paper we propose a new algorithm called SPIDER3 for selective preprocessing of multi-class imbalanced data sets. While it borrows selected ideas (i.e., combination of relabeling and local resampling) from its predecessor – SPIDER2, it introduces several important extensions. Unlike SPIDER2, it is able to handle directly multi-class problems. Moreover, it considers the relevance of specific decision classes to control the order of their processing. Finally, it uses information about relations between specific classes (modeled with misclassification costs) to better control the extent of changes introduced locally to preprocessed data. We performed a computational experiment on artificial 3-class data sets to evaluate and compare SPIDER3 to SPIDER2 with temporarily aggregated classes and the results confirmed advantages of the new algorithm.

Szymon Wojciechowski, Szymon Wilk, Jerzy Stefanowski

The Method of Person Verification by Use of Finger Knuckle Images

The paper proposes a personal identity verification method based on images of finger knuckles. The knuckle images were recorded using a digital camera and then processed to extract the furrows appearing on them. The verification was performed by comparing the locations and courses of the furrows in the pattern being verified and in the reference image of finger knuckles. In order to determine the similarity between the images, a new similarity measure was proposed. The analysis of finger knuckle images is complicated by the fact that the location and size of the same furrow may differ between subsequent images obtained from the same person. This problem results from the elasticity of human skin. To minimize it, the paper proposes matching the furrows with each other before they are compared. For this purpose, a method based on Thin Plate Spline and Shape Context has been used. The usability of this method was verified experimentally.

Rafal Doroz, Krzysztof Wrobel, Piotr Porwik, Hossein Safaverdi

On a New Method for Improving Weak Classifiers Using Bayes Metaclassifier

In this paper, a new algorithm called the Bayes metaclassifier (BMC) is introduced as a method for improving the performance of weak classifiers. In general, BMC constitutes a probabilistic generalization of any base classifier and has the form of the Bayes scheme. To validate BMC classification, two experiments were designed. In the first, three synthetic datasets were generated from a normal distribution to calculate and empirically check the upper bound on the improvement of a base classifier when the BMC approach is applied. In the second, to validate the usefulness of the algorithm, extensive simulations on 22 available benchmarks were performed, comparing the BMC model against 8 base classifiers with different design paradigms.

Marcin Majak, Marek Kurzyński

Recent Advances in Image Pre-processing Methods for Palmprint Biometrics

Biometric identification may be used in real-life applications such as security, forensics and common smartphones. However, to ensure the robustness of biometric methods, a proper pre-processing method has to be applied. In this paper we focus only on this part of the whole recognition process. Finding an appropriate method is crucial, especially when the identification system is dedicated to a mobile scenario. An image acquired by a smartphone may be of lower quality, noisier and more blurred. In this paper we discuss only the palmprint as a feature that may distinguish people. It is used in identification systems but is still rarely implemented on mobile devices.

Agata Wojciechowska, Michał Choraś, Rafał Kozik

Some Properties of Consensus-Based Classification

The objective of this paper is to consider some properties of decisions produced by classifiers that are in consensus. Consensus allows strong classifiers to obtain very reliable classification of the objects on which consensus has been reached. For objects on which consensus is not reached, a reclassification procedure based on other classification algorithms should be applied. Properties of different consensuses are described using an algebraic approach and a performance evaluation routine.

Vitaliy Tayanov, Adam Krzyżak, Ching Suen

Knowledge Based Active Partition Approach for Heart Ventricle Recognition

The paper presents a method of automatic localization of heart ventricles in CT images. Analysis of their shape can be an important element of pulmonary embolism diagnosis. For that purpose active partitions, a generalization of the active contour approach, were used with a superpixel representation of image content. Active partitions, similarly to active contours, possess a natural ability to incorporate external experience into the object localization process. This means that not only the information contained in the image itself but also the experience of the radiologist and medical knowledge can be used to improve segmentation results.

Arkadiusz Tomczyk, Piotr S. Szczepaniak

The Method of Teeth Region Detection in Panoramic Dental Radiographs

Radiographs which visualise human dentition as a single panoramic photograph are commonly used for prognosis and diagnosis in dental care and for biometric identification of individuals. Automatic analysis of these images can provide useful information about patients. This paper proposes a method for the introductory phase of teeth segmentation: selection of a region of interest which contains every tooth in the pantomograph. The proposed method finds an area containing all the teeth and the line dividing the upper and lower jaws, called the teeth line, using adaptive power-law transformation, the Hölder exponent, and the energy and entropy of the pantomograph. The teeth area is found correctly in 81.9% of cases, and the teeth line in 34.0%. The method is the introductory step towards single-tooth segmentation, dentition evaluation according to the presence of strong dental interventions, and automatic generation of a dental score for a patient.

Jakub Zak, Anna Korzynska, Lukasz Roszkowiak, Krzysztof Siemion, Sebastian Walerzak, Monika Walerzak, Konrad Walerzak

Color Independent Quality Assessment of 3D Printed Surfaces Based on Image Entropy

The paper focuses on the visual quality assessment of 3D printed surfaces, which can be helpful both in detecting a decrease in quality during the printing process and in the quality inspection of previously printed objects. The basic assumption of the proposed approach is that each distortion of the regular patterns visible on the side surfaces of objects printed using Fused Deposition Modeling (FDM) technology causes an increase in the local image entropy. However, due to the different colors of the filaments used in our experiments, a reliable prediction of the absolute entropy values can be troublesome. The proposed solution utilizes a combined quality indicator based on the entropy and its variance, calculated for the hue component as well as for the RGB channels depending on the color of the filament, which allows proper detection of low quality surfaces regardless of the filament’s color.

Krzysztof Okarma, Jarosław Fastowicz
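
The assumption stated in the abstract above, that a distortion of a regular printed pattern raises the local image entropy, can be illustrated with a minimal sketch. It computes the Shannon entropy of pixel values in non-overlapping square tiles of a single-channel image. The tile size, the toy image and the function names are assumptions made for this example; the combined indicator based on hue, RGB channels and entropy variance described in the abstract is not reproduced.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def local_entropy(image, block):
    """Entropy of each non-overlapping block x block tile of a 2D image.

    image: list of rows, each a list of integer pixel values.
    """
    rows, cols = len(image), len(image[0])
    result = []
    for r in range(0, rows - block + 1, block):
        row_entropies = []
        for c in range(0, cols - block + 1, block):
            tile = [image[y][x]
                    for y in range(r, r + block)
                    for x in range(c, c + block)]
            row_entropies.append(shannon_entropy(tile))
        result.append(row_entropies)
    return result


# Hypothetical 4 x 8 image: the left half is a regular striped pattern,
# the right half is a distorted region in which every pixel value differs.
image = [
    [0, 0, 0, 0, 0, 1, 2, 3],
    [255, 255, 255, 255, 4, 5, 6, 7],
    [0, 0, 0, 0, 8, 9, 10, 11],
    [255, 255, 255, 255, 12, 13, 14, 15],
]
entropies = local_entropy(image, 4)
print(entropies)
```

The regular tile contains only two equally frequent values, while the distorted tile contains sixteen distinct ones, so its entropy is higher.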

Raster Maps Search Using Text Queries and Reasoning

Search engines have become one of the indispensable elements of modern information processing systems. One of the challenges in this area is to build an interface between pictorial and text data that would allow for image search using text queries. In the paper, a concept is proposed that combines inference, ontology and natural language processing to formulate queries to the pictorial data of raster map images. The proposed system analyses a user’s query formulated in natural language and finds places on the map which meet its criteria. The proposed approach remarkably improves the ease of access to raster map content.

Arkadiusz Cacko, Marcin Iwanowski

Nuclei Recognition Using Iterated Conditional Modes Approach

The paper presents an Iterated Conditional Modes (ICM) based method for nuclei recognition in cytological images. It approximates nuclei by circles and ellipses. The first step is to find the coordinates and sizes of the circles. To find a good configuration of circles, the ICM approach is employed to maximize the probability of the configuration given the image data. However, nucleus shape appears to be more elliptical than circular. Unfortunately, finding nuclei using ellipses is computationally expensive because, at a given point, an ellipse has three parameters (minor axis, major axis and angle), while a circle has only one (radius). To tackle this problem, we propose a heuristic procedure to estimate ellipses based on previously determined circles. To test the effectiveness of the method, it was applied to recognize disks in synthetically generated images and nuclei in microscopic images of breast cancer tissue.

Marcin Skobel, Marek Kowal, Józef Korbicz

Feature Selection in Texts

Feature selection is used in many application areas relevant to expert and intelligent systems, such as machine learning, data mining, cheminformatics and natural language processing. In this study we propose methods for feature selection and feature analysis based on Support Vector Machines (SVM) with linear kernels. We explore how these techniques can be used to obtain interesting information for further exploration of text data. The results provide satisfactory observations which may lead to progress in the field of feature selection.

Magdalena Wiercioch

Image Processing and Analysis in Lung Cancer Cells Growth

Modern diagnostic methods yield a wide range of information about research material. Patient diagnosis is performed using highly specialized tools which effectively support medical diagnostic processes. This paper focuses on the analysis of the growth and migration of lung cancer cell cultures in vitro. Most publications on cell growth rate are based on the analysis of changes in surface area and, less often, in the width of the scratch. This study identified additional parameters, such as cell angle, number of cells and distance between cells, each measured separately for cells above and below the scratch, which characterise cell migration and have not been considered previously. Analysis of the arrangement of the cells and of the distances between them allows the level of cell migration to be determined. The experiments showed high cell proliferation on the first day, followed by clear migration with increasing distances between cells. A change in the angle of cells that begin to migrate was also noted. The analysis confirmed that these additional parameters correctly differentiate the evaluated group of images. The developed image processing and analysis algorithm operates on a collection of microscopic images of lung cancer cells propagated in vitro and acquires data on the growth and migration of cancer cells. The resulting data describe the parameters of the studied images in the form of growth profiles over time and the type of growth.

Przemysław Jędrusik, Łukasz Walusiak, Ilona Bednarek, Robert Koprowski, Zygmunt Wróbel

Frame-to-Frame Visual Odometry: The Importance of Local Transformations

Trajectory estimation is of pivotal importance for mobile robots. Visual Odometry (VO) allows localizing a robot from passive vision data in a frame-to-frame fashion. The VO problem can be solved in different ways, hence an evaluation of these algorithms on real benchmark data is of interest. We focus on feature-based n-point methods based on RGB images. Used in monocular vision, these methods allow camera rotation to be estimated, but only a few of them provide translation estimates, and then only up to an unknown scale. In the context of commodity RGB-D cameras, we also compare these methods with the Kabsch algorithm, which uses full depth information.

Aleksander Kostusiak

Chiron: A Robust Recommendation System with Graph Regularizer

Recommendation systems have been widely used by commercial service providers for giving suggestions to users. Collaborative filtering (CF) systems, one of the most popular recommendation systems, utilize the history of behaviors of the aggregate user-base to provide individual recommendations and are effective when almost all users faithfully express their opinions. However, they are vulnerable to malicious users biasing their inputs in order to change the overall ratings of a specific group of items. CF systems largely fall into two categories - neighborhood-based and (matrix) factorization-based - and the presence of adversarial input can influence recommendations in both categories, leading to instabilities in estimation and prediction. Although the robustness of different collaborative filtering algorithms has been extensively studied, designing an efficient system that is immune to manipulation remains a challenge. We propose a novel hybrid recommendation system with an adaptive graph user/item similarity-regularization - Chiron. Chiron ties the performance benefits of dimensionality reduction (via factorization) with the advantage of neighborhood clustering (through regularization). We demonstrate, using extensive comparative experiments, that Chiron is resistant to manipulation by large and lethal attacks.

Saber Shokat Fadaee, Mohammad Sajjad Ghaemi, Hossein Azari Soufiani, Ravi Sundaram

A 2D/3D Convolutional Neural Network for Brain White Matter Lesion Detection in Multimodal MRI

White matter hyperintensities (WMH) are characteristic of various brain diseases, so automated detection tools have a broad clinical spectrum. Deep learning architectures have recently been very successful in the segmentation of brain lesions, such as ictus or tumour lesions. We propose a Convolutional Neural Network composed of four parallel data paths whose input is a mixture of 2D/3D windows extracted from multimodal magnetic resonance imaging of the brain. The architecture is lighter than others proposed in the literature for lesion detection, so its training is faster. We carry out computational experiments on a dataset of multimodal imaging from 18 subjects, achieving results competitive with state-of-the-art approaches.

Leire Roa-Barco, Oscar Serradilla-Casado, Mikel de Velasco-Vázquez, Asier López-Zorrilla, Manuel Graña, Darya Chyzhyk, Catherine Price

Automated Segmentation of Visceral Adiposity in MRI in Obese Children

Childhood obesity is a growing concern in the healthcare system, because it leads to a life of dependence and chronic health problems in adulthood. Non-alcoholic liver fat and visceral adiposity are two biomarkers of the health status of the child. Some studies try to measure the impact of exercise and improved habits on the reduction of these biomarkers. The studies use a fat-enhancing magnetic resonance imaging sequence, but visceral fat is difficult to segment manually. In this paper we describe the automated process that we have devised, providing some visual results. The process has four main phases: (a) image inhomogeneity correction, (b) removal of spurious features such as arms and navel, (c) identification of the peripheral and visceral volumes, (d) localization of the intervertebral disks, which need to be removed. As there is no manual tracing to use as a gold standard, we cannot provide quantitative performance measures.

Manuel Graña, Oier Echaniz, Beatriz Rodriguez-Vigil, Idoia Labayen

EEG Classification for MI-BCI with Independent Component Analysis

Independent Component Analysis (ICA) is often used in EEG signal processing, but only when the matrix of channels is big enough. If the matrix is small, the artifacts usually cannot be extracted as single components, and deciding which components should be removed from the component set is difficult. Therefore, in our opinion, in order to apply ICA successfully to low-dimensional EEG, the strategy for dealing with the component set should be reversed. In the paper we propose a strategy of searching for components correlated with the desired brain activity, instead of looking for artifact components. Since brain activity depends on the task at hand, different tasks would require adaptation of the proposed approach, but the overall scheme is independent of the task. We describe the strategy and illustrate it with an experiment using a simple two-state Motor Imagery Brain Computer Interface. Our results show that even when we added only one spare channel to the core set of two channels (C3 and C4) essential for hand movement recognition, we obtained an 18% increase in recognition accuracy after applying ICA and our strategy.

Izabela Rejer, Paweł Górski

Objective Description of Choral Singers Voice Quality Using Glottal-to-Noise Excitation Ratio

The article presents the results of research combining signal analysis and medical aspects. It focuses on the analysis of singers’ voice signals using the Glottal-to-Noise Excitation Ratio (GNE) parameter, which is used here to analyse singing voice quality. The GNE values were calculated from the recorded samples. The results show that GNE may be useful in assessing the overall singing quality of choir singers. The results may also be useful for the development of tools for computer analysis of singers’ or speakers’ voices.

Edward Półrolniczak

Permutation-Based Diversity Measure for Classifier-Chain Approach

In this paper, the problem of multi-label classification using the classifier chain scheme is addressed. We deal with the problem of building a diverse classifier-chain-based ensemble. For this purpose, we propose a permutation-based criterion of chain diversity. The final ensemble is built using a multi-objective genetic algorithm, which optimises classification quality and chain diversity simultaneously. The proposed methods were evaluated using 29 benchmark datasets. The comparison was performed using four different multi-label evaluation measures. The experimental study reveals that the proposed approach provides better classification quality than response-based diversity criteria.

Pawel Trajdos, Marek Kurzynski

Static Posed Versus Genuine Smile Recognition

Recognition of a posed or fake smile is a vital and challenging research topic, and growing interest has been observed in the computer vision and machine learning community. The state-of-the-art algorithms in this field focus on the dynamics of facial expressions, while several psychologists suggest that the main difference between a posed and a spontaneous smile should be observed in different muscle contractions in the upper part of the face. Therefore, in this work we evaluate the accuracy of recognition based only on face appearance using High-Dimensional Local Binary Patterns. Smile authenticity is analyzed on a set of images extracted at the smile apex phase from the UvA-NEMO database. The obtained results indicate that the analyzed algorithms can spot a fake smile much better than a human, but worse than systems that incorporate facial dynamics.

Krystian Radlak, Natalia Radlak, Bogdan Smolka

Competitive Detector of Changes with a Statistical Test

The detector of jumps or changes in the function value and its derivative designed with the use of the concept of competing approximators is revisited. The previously defined condition for the existence of a jump in the function value is extended by introducing a statistical test of significance. This extension makes it possible to eliminate some false positive detections which appeared in the previously obtained results. The features of the extended detector are demonstrated on some artificial and real-life data.

Leszek J. Chmielewski, Konrad Furmańczyk, Arkadiusz Orłowski

A Novel Simulated Annealing Based Training Algorithm for Data Stream Processing Ensemble Classifier

Training of compound ensemble classifier systems might be a computationally complex and hence time-consuming task. Not only do the elementary classifiers have to be trained, but the model of the ensemble also has to be updated. Therefore, the efficiency of the training should be considered a compound quality which consists of not only classification accuracy but also running time. This gains special importance when dealing with data streams, where data arrive at a high pace and the system update has to be done promptly. In this paper we present an application of a Simulated Annealing based algorithm to the training of a data stream processing ensemble. Our method is evaluated in a series of experiments which show that our ensemble performs very effectively in terms of accuracy and processing time.

Konrad Jackowski

Tweet Classification Framework for Detecting Events Related to Health Problems

In this paper we present and validate the MC (Multiclassifier) system for classification of Tweets related to flu and its symptoms. The proposed method consists of a preprocessing phase, which applies the NLTK processor with a converter from text corpora into feature space, and, as a last step, an ensemble of heterogeneous classifiers fused at the support level for Tweet classification. We have checked two methods for translating text into feature space. The first one uses the standard Term Frequency times Inverse Document Frequency (TF-IDF), while the second one is enriched with hashtag analysis and word reduction after n-gram generation. Our preliminary results indicate that Twitter can be an excellent platform for sensing real events. The most important task in proper event detection is a feature extraction technique that takes into account not only text corpora, but also sentiment analysis and message intention.

Marcin Majak, Andrzej Zolnierek, Katarzyna Wegrzyn, Lamine Bougueroua
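
The first text-to-feature mapping mentioned in the abstract above, term frequency times inverse document frequency, can be sketched as follows. This is one common textbook variant (tf as relative frequency, idf as the natural logarithm of N/df) and is only an assumption; the abstract does not state which weighting or smoothing the authors used. The function name `tf_idf` and the sample tweets are hypothetical.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per tokenised document.

    Assumed variant: tf = count / document length, idf = ln(N / df).
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))  # count each term once per document
    weights = []
    for tokens in documents:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical, already tokenised tweets.
tweets = [
    "flu fever cough".split(),
    "flu shot today".split(),
    "sunny day today".split(),
]
vectors = tf_idf(tweets)
```

In the first tweet, "fever" receives a higher weight than "flu" because "flu" also occurs in another tweet; a term present in every document would receive weight zero under this variant.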

Real-Time Image Content Assessment for Underwater Robot Manoeuvring Based on Structural Tensor Analysis

The paper presents an efficient method for real-time image analysis for manoeuvring of an underwater robot. Image analysis is done after computing the structural tensor components, which reveal texture-rich and texture-less areas. To allow power-efficient underwater operation in real time, the method is implemented on the Jetson TK1 self-standing graphics card using the CUDA compute architecture. The laboratory experimental results show that the system is capable of processing about 40 Full HD images per second while allowing orientation toward texture-specific regions for obstacle avoidance.

Jakub Nawała, Bogusław Cyganek

Distributed DBSCAN Algorithm – Concept and Experimental Evaluation

One of the most popular clustering algorithms is DBSCAN, which is known to be efficient and highly resistant to noise. In this paper we propose its distributed implementation. Distributed computing is a very fast growing way of solving problems in big datasets using a multinode cluster, rather than parallelization on one computer. Using its features in a proper way can lead to higher performance and, what is probably more important, higher scalability. In order to show the added value of this way of designing and implementing algorithms, we compare our results with GPU parallelization. On the basis of the obtained results we formulate proposals for improving our solution.

Adam Merk, Piotr Cal, Michał Woźniak

Combining Active Learning and Self-Labeling for Data Stream Mining

Data stream mining is among the most vital contemporary data science challenges. In this work we concentrate on the issue of the actual availability of true class labels. The assumption that the ground truth for each instance becomes known right after processing it is far from realistic, due to the usually high costs connected with its acquisition. Active learning is an attractive solution to this problem, as it selects the most valuable instances for labeling. In this paper, we propose to augment the active learning module with a self-labeling approach. This allows the classifier to automatically label instances for which it displays the highest certainty and to use them for further training. Although in this preliminary work we use a static threshold for self-labeling, the obtained results are encouraging. Our experimental study shows that this approach complements the active learning strategy and improves data stream classification, especially in scenarios with a very small labeling budget.

Łukasz Korycki, Bartosz Krawczyk
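
The interplay described in the abstract above, self-labeling with a static certainty threshold next to a budgeted active learning query, can be sketched as a per-instance decision rule. This is an illustration under assumptions, not the authors' procedure: the margin-based query rule, the parameter `query_margin` and the function name `process_instance` are hypothetical, since the abstract does not specify the active learning strategy.

```python
def process_instance(posteriors, threshold, budget_left, query_margin=0.2):
    """Decide how a stream instance obtains its label.

    posteriors: dict mapping class -> probability estimated by the classifier.
    Returns (action, label) with action in {"self_label", "query", "skip"}.
    """
    ranked = sorted(posteriors.items(), key=lambda kv: kv[1], reverse=True)
    best_class, best_p = ranked[0]
    second_p = ranked[1][1] if len(ranked) > 1 else 0.0

    # Self-labeling: accept the classifier's own prediction when its
    # certainty reaches the static threshold.
    if best_p >= threshold:
        return "self_label", best_class
    # Active learning (assumed margin-based strategy): spend labeling
    # budget only on instances the classifier is unsure about.
    if budget_left > 0 and best_p - second_p < query_margin:
        return "query", None
    return "skip", None


confident = process_instance({"a": 0.95, "b": 0.05}, threshold=0.9, budget_left=1)
uncertain = process_instance({"a": 0.55, "b": 0.45}, threshold=0.9, budget_left=1)
no_budget = process_instance({"a": 0.55, "b": 0.45}, threshold=0.9, budget_left=0)
```

Self-labeled instances cost nothing from the budget, which is why such a rule can matter most when the labeling budget is very small; the caller is expected to decrement the budget after each "query" action.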

Drift Detection Algorithm Using the Discriminant Function of the Base Classifiers

Recently, several approaches have been proposed to deal with concept drift detection. In this paper we propose a new concept drift detection algorithm based on decision templates. The decision templates are obtained from the outputs of the base classifiers that form an ensemble of classifiers. Experiments on several publicly available data sets verify the effectiveness of the proposed algorithm.

Robert Burduk

An Algorithm for Detecting the Instant of Olfactory Stimulus Perception, Using the EEG Signal and the Hilbert-Huang Transform

The paper describes an approach to detecting the instant of olfactory stimulus perception. Classification of olfactory stimuli in EEG is a complex but very important task. It makes it possible to describe cognitive processes and can help in medical diagnosis. Owing to the chemical-electrical nature of olfactory perception, a solution is needed that detects the onset of a stimulus in the EEG signal. Otherwise, classification of olfactory stimuli would be more complex because of inaccurate localization of objects in the learning set. Therefore, the paper proposes using the Hilbert-Huang transform in pre-processing. The proposed approach is evaluated and proves to be usable.

Edward Puchala, Maciej Krysmann
