2017 | Book

Intelligent Information and Database Systems

9th Asian Conference, ACIIDS 2017, Kanazawa, Japan, April 3–5, 2017, Proceedings, Part II

Edited by: Ngoc Thanh Nguyen, Satoshi Tojo, Le Minh Nguyen, Bogdan Trawiński

Publisher: Springer International Publishing

Book Series: Lecture Notes in Computer Science

About this Book

The two-volume set LNAI 10191 and 10192 constitutes the refereed proceedings of the 9th Asian Conference on Intelligent Information and Database Systems, ACIIDS 2017, held in Kanazawa, Japan, in April 2017. The total of 152 full papers accepted for publication in these proceedings was carefully reviewed and selected from 420 submissions.

They were organized in topical sections named: Knowledge Engineering and Semantic Web; Social Networks and Recommender Systems; Text Processing and Information Retrieval; Intelligent Database Systems; Intelligent Information Systems; Decision Support and Control Systems; Machine Learning and Data Mining; Computer Vision Techniques; Advanced Data Mining Techniques and Applications; Intelligent and Context Systems; Multiple Model Approach to Machine Learning; Applications of Data Science; Artificial Intelligence Applications for E-services; Automated Reasoning and Proving Techniques with Applications in Intelligent Systems; Collective Intelligence for Service Innovation, Technology Opportunity, E-Learning and Fuzzy Intelligent Systems; Intelligent Computer Vision Systems and Applications; Intelligent Data Analysis, Applications and Technologies for Internet of Things; Intelligent Algorithms and Brain Functions; Intelligent Systems and Algorithms in Information Sciences; IT in Biomedicine; Intelligent Technologies in the Smart Cities in the 21st Century; Analysis of Image, Video and Motion Data in Life Sciences; Modern Applications of Machine Learning for Actionable Knowledge Extraction; Mathematics of Decision Sciences and Information Science; Scalable Data Analysis in Bioinformatics and Biomedical Informatics; and Technological Perspective of Agile Transformation in IT organizations.

Table of Contents

Frontmatter

Applications of Data Science

Frontmatter
Exploring Spatial and Social Factors of Crime: A Case Study of Taipei City

Recognizing the significance of transparency and accessibility of government information, the Taipei Government recently published city-wide crime data to encourage relevant research. In this project, we explore the underlying relationships between crime and various geographic, demographic and socioeconomic factors. First, we collect a total of 25 datasets from the City and other publicly available sources, and select statistically significant features via correlation tests and feature selection techniques. With the selected features, we use machine learning techniques to build a data-driven model capable of describing the relationship between high crime rates and the various factors. Our results demonstrate the effectiveness of the proposed methodology by providing insights into interactions between key geographic, demographic and socioeconomic factors and the city crime rate. The study shows that the top three factors affecting crime rate are educational attainment, marital status, and distance to schools. The results were presented to Taipei City officials for future government policy decision making.

Nathan Kuo, Chun-Ming Chang, Kuan-Ta Chen
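
To make the correlation-test screening step concrete, the following minimal Python sketch keeps only the features whose Pearson correlation with the crime rate is statistically significant. The DataFrame layout and column name are hypothetical; the authors' actual pipeline is not reproduced here.

    import pandas as pd
    from scipy.stats import pearsonr

    def select_significant_features(df: pd.DataFrame, target: str, alpha: float = 0.05):
        """Keep features whose Pearson correlation with the target is significant."""
        selected = []
        for col in df.columns:
            if col == target:
                continue
            r, p = pearsonr(df[col], df[target])
            if p < alpha:
                selected.append((col, r, p))
        # Rank by absolute correlation strength.
        return sorted(selected, key=lambda t: abs(t[1]), reverse=True)

    # Hypothetical usage: one row per district, 'crime_rate' plus candidate factors.
    # ranked = select_significant_features(districts_df, target="crime_rate")
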
A Fuzzy Logic Based Network Intrusion Detection System for Predicting the TCP SYN Flooding Attack

Fuzzy logic is a powerful tool for reasoning under uncertainty, and since uncertainty is an intrinsic characteristic of intrusion analysis, fuzzy logic is an appropriate tool for analyzing intrusions in a network. This paper presents a fuzzy logic based network intrusion detection system to predict Neptune, a type of Transmission Control Protocol synchronize (TCP SYN) flooding attack. The performance of the proposed fuzzy logic based system is compared to that of a decision tree, one of the well-known machine learning techniques. The results indicate that the difference between the proposed system and the decision tree, in terms of predicting the proportion of attacks in the data, is negligible.

Nenekazi Nokuthala Penelope Mkuzangwe, Fulufhelo Vincent Nelwamondo
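
As an illustration of how a fuzzy rule base can flag a SYN flood, here is a tiny Mamdani-style sketch in Python; the membership thresholds and the two input features (SYN rate and handshake-completion ratio) are illustrative assumptions, not the paper's rule base.

    def tri(x, a, b, c):
        """Triangular membership function on [a, c] with peak at b."""
        if x <= a or x >= c:
            return 0.0
        return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

    def syn_flood_risk(syn_rate, synack_ratio):
        high_rate = tri(syn_rate, 100, 500, 1000)      # half-open SYNs per second
        low_ratio = tri(synack_ratio, -0.1, 0.0, 0.5)  # completed handshakes / SYNs
        attack = min(high_rate, low_ratio)             # rule: high rate AND low ratio
        normal = min(1.0 - high_rate, 1.0 - low_ratio)
        # Defuzzify as the relative strength of the 'attack' consequent.
        return attack / (attack + normal) if attack + normal > 0 else 0.0

    print(syn_flood_risk(syn_rate=600, synack_ratio=0.05))  # ~0.89, likely attack
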
Analytical Ideas to Improve Daily Demand Forecasts: A Case Study

With the growing popularity of app-based taxi aggregators, bike-sharing systems and supermarkets across the world, it has become important to forecast short-term (often daily) demand accurately. Imprecise forecasts generally result in daily losses due to over- or under-stocking. This paper proposes multiple analytical constructs for demand prediction using Capital Bikeshare's data as an example. The aim is to provide novel and business-justified ideas on feature engineering and subsequently use these features to create different analytical constructs for the actual prediction problem. A comparison of different modeling techniques for solving the same problem is also included. The findings demonstrate that a decomposed multi-stage prediction performs better than pure forecasting or prediction approaches. Ensembling results show that a cross-construct ensemble may perform better than traditional multiple-learner ensembles within the same construct.

Sougata Deb
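
A decomposed multi-stage prediction, as favoured by the paper, can be sketched as a trend model plus a residual corrector. The split into calendar and context features is an assumption for illustration; the paper's concrete constructs differ.

    from sklearn.ensemble import GradientBoostingRegressor
    from sklearn.linear_model import LinearRegression

    def fit_two_stage(X_calendar, X_context, y):
        # Stage 1: capture trend/seasonality from calendar features alone.
        trend = LinearRegression().fit(X_calendar, y)
        # Stage 2: predict the remaining error from context (weather, events, ...).
        corrector = GradientBoostingRegressor().fit(X_context, y - trend.predict(X_calendar))
        return trend, corrector

    def predict_two_stage(trend, corrector, X_calendar, X_context):
        return trend.predict(X_calendar) + corrector.predict(X_context)
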
Increasing the Detection of Minority Class Instances in Financial Statement Fraud

Financial statement fraud has proven to be difficult to detect without the assistance of data analytical procedures. In the fraud detection domain, minority class instances cannot be readily found using standard machine learning algorithms. Moreover, incomplete instances or features tend to be removed from investigations, which could lead to greater class imbalance. In this study, a combination of imputation, feature selection and classification is shown to increase the identification of minority samples given severely imbalanced data.

Stephen Obakeng Moepya, Fulufhelo V. Nelwamondo, Bhekisipho Twala
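
The combination the abstract describes (imputation, then feature selection, then an imbalance-aware classifier) maps naturally onto a scikit-learn pipeline. This is a generic sketch; the specific imputer, selector and classifier used in the paper are not assumed here.

    from sklearn.pipeline import Pipeline
    from sklearn.impute import SimpleImputer
    from sklearn.feature_selection import SelectKBest, f_classif
    from sklearn.ensemble import RandomForestClassifier

    # Impute instead of dropping incomplete records, select features, and weight
    # the minority class so rare fraud instances are not ignored by the learner.
    clf = Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("select", SelectKBest(f_classif, k=10)),
        ("model", RandomForestClassifier(class_weight="balanced", random_state=0)),
    ])
    # clf.fit(X_train, y_train); evaluate with recall on the minority (fraud) class.
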
Information Technology Services Evaluation Based ITIL V3 2011 and COBIT 5 in Center for Data and Information

The growing role of IT is directly proportional to growing investment, and is also accompanied by a large increase in expenses. The indicators of successful IT implementation at the Center for Data and Information of the Ministry of Defence of the Republic of Indonesia are services that are reliable, available, fast, and accurate. With maturely planned IT governance, the implementation of IT services is expected to go well and to embody good IT governance. Success in providing information services can have a positive impact on organizations and society in general. Thus, the investments made by the Ministry of Defence in the implementation of ICT should not contradict the goals of the organization or the expectations of stakeholders. This study aims to measure the capability of information services in the Ministry of Defence in order to improve stakeholder satisfaction. Capability is measured using COBIT 5 with a qualitative method and the case study method. The stages of this research are the analysis of the condition of eight COBIT 5 processes, the targeting of process improvement areas, gap analysis, and the determination of strategies for achieving capability. The end result of this research is a set of recommendations for adopting policies and procedures from ITIL V3 2011, as well as Key Performance Indicator (KPI) recommendations for the Center for Data and Information of the Ministry of Defence.

Firman Hartawan, Jarot S. Suroso

Artificial Intelligence Applications for E-services

Frontmatter
To Solve the TDVRPTW via Hadoop MapReduce Parallel Computing

The convenience of online shopping has made it commonplace for everyone. With the increase in online transactions, optimization of the Vehicle Routing Problem (VRP) is an important issue in logistics and transportation. The TDVRPTW is a crucial variant that adds a given time window to the VRP. This paper targets solving the TDVRPTW using Hadoop MapReduce and compares the effectiveness of Hadoop with a single machine. We used an existing program to cluster the demand nodes and then calculated a route for every cluster using a random method and heuristic algorithms including the nearest time window algorithm, the nearest neighbor algorithm and 2-opt. After that, we executed parallel computing in Hadoop by implementing the program on MapReduce. We used the Solomon benchmark problems as the basis of the experimental examples and conducted the experiments. This research showed that Hadoop MapReduce calculates the best solution more efficiently than a single machine.

Bo-Yi Li, Chen-Shu Wang
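
Of the route heuristics named above, 2-opt is the easiest to show compactly: it keeps reversing a segment of the route while that shortens it. A plain single-machine Python sketch (the MapReduce parallelization is omitted):

    def route_length(route, dist):
        return sum(dist[route[i]][route[i + 1]] for i in range(len(route) - 1))

    def two_opt(route, dist):
        """Repeatedly reverse a segment while doing so shortens the route."""
        improved = True
        while improved:
            improved = False
            for i in range(1, len(route) - 2):
                for j in range(i + 1, len(route) - 1):
                    candidate = route[:i] + route[i:j + 1][::-1] + route[j + 1:]
                    if route_length(candidate, dist) < route_length(route, dist):
                        route, improved = candidate, True
        return route
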
MapReduce-Based Frequent Pattern Mining Framework with Multiple Item Support

Mining frequent patterns in big data has become increasingly problematic. To solve the problem of a single support value, many efficient itemset mining algorithms that set multiple support values for individual items have been proposed in the past, which makes them feasible for real-life applications. Parallel and distributed computing are valid approaches for dealing with large datasets. In order to reduce the search space, we use the MISFP-growth algorithm, which avoids the rebuilding and post-pruning steps. Accordingly, in this paper we propose a model that uses the MapReduce framework to implement parallelization under multiple support values, thereby improving the overall performance of mining frequent patterns and rare items accurately and efficiently.

Chen-Shu Wang, Shiang-Lin Lin, Jui-Yen Chang
Balanced k-Means

K-Means is a very common method of unsupervised learning in data mining, introduced by Steinhaus in 1956. Over time, many enhanced variants of k-Means have been introduced and applied. One significant characteristic of k-Means is its randomized initialization. This paper proposes a balanced k-Means method, in which the number of items distributed among clusters is more balanced, providing more equal-sized clusters. Cases suitable for applying this method, such as the Travelling Salesman Problem (TSP), are also discussed. To further enhance performance and usability, we plan to add a learning ability to this method in future work.

Chen-Ling Tai, Chen-Shu Wang
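
The core difference from standard k-Means is the assignment step: each cluster receives a capacity of ceil(n / k). A minimal sketch of such a balanced assignment (one iteration; re-estimating centroids and looping is omitted), under the assumption of a greedy distance-ordered strategy:

    import numpy as np

    def balanced_assign(points, centroids):
        """Greedily assign point-centroid pairs in order of distance,
        respecting a per-cluster capacity of ceil(n / k)."""
        n, k = len(points), len(centroids)
        cap = -(-n // k)  # ceil(n / k)
        d = np.linalg.norm(points[:, None] - centroids[None, :], axis=2)
        order = np.dstack(np.unravel_index(np.argsort(d, axis=None), d.shape))[0]
        labels, counts = np.full(n, -1), np.zeros(k, dtype=int)
        for p, c in order:
            if labels[p] == -1 and counts[c] < cap:
                labels[p], counts[c] = c, counts[c] + 1
        return labels
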
An Optimal Farmland Allocation E-Service Deployment

This research develops an intelligent farmland allocation e-service. Its objective is to provide an effective production and cooperation model between peasants and social enterprises. The production and marketing manager needs to visit organic-certified farmlands, offer advice, and help decide what types of agricultural commodities will be planted; social enterprises offer their peasants plant-to-order contracts. The research therefore creates an intelligent farmland allocation e-service for organic produce, in which users adopt an optimization model capable of searching for the best combination of agricultural commodities based on predefined lowest planting costs. The research can also be viewed as an automatic recommendation service for peasants and the production and marketing managers of social enterprises. Traditionally, the social enterprise in our case study used manual allocation and worksheet file management. The research introduces an innovative service for social enterprises and peasants that improves the previous service process. The social enterprise can create a new way to collaborate and improve productivity. By embodying agricultural innovation, the peasants and the social enterprise can achieve the co-creation of business and social value. The proposed service can also support the environmentally friendly and sustainable development of safe agricultural commodities.

Wei-Feng Tung, Chun-Liang Pan

Automated Reasoning and Proving Techniques with Applications in Intelligent Systems

Frontmatter
Anticipatory Runway Incursion Prevention Based on Inaccurate Position Surveillance Information

To build a practical anticipatory runway incursion prevention system (ARIPS), it is necessary to predict runway incursions based on inaccurate position information of aircraft and vehicles. To this end, this paper proposes a series of improved methods to predict runway incursions based on inaccurate position surveillance information for ARIPSs. The evaluation shows that our system can handle different types of runway incursions based on inaccurate position information, deal with the momentary absence of surveillance data, and produce few false detections under non-runway-incursion circumstances.

Kai Shi, Hai Yu, Zhiliang Zhu, Jingde Cheng
The Relation Between Syntax Restriction of Temporal Logic and Properties of Reactive System Specification

Open reactive systems provide services to users through interaction between the users and the environments of the systems. There are several methods for describing formal specifications of reactive systems; temporal logic is one of them. An open reactive system specification is defined to be realizable if and only if there is a program that satisfies the specification regardless of how the environment and the users of the reactive system behave. There are several methods for deciding realizability of open reactive systems. These methods are based on automata theory, and their complexity is at least doubly exponential in the length of a specification. This paper shows the relation between syntax restrictions and realizability properties of reactive system specifications. This relation can reduce the complexity of deciding the properties of reactive system specifications.

Noriaki Yoshiura
Measuring Interestingness of Theorems in Automated Theorem Finding by Forward Reasoning: A Case Study in Peano’s Arithmetic

Wos proposed 33 basic research problems for the automated reasoning field; one of them is the problem of automated theorem finding. The problem has not been solved until now. It implicitly requires metrics for measuring the interestingness of found theorems. We have previously proposed metrics to measure the interestingness of theorems found by the forward reasoning approach, and have used them to measure the interestingness of theorems of NBG set theory. To confirm the generality of the proposed metrics, we have to apply them in other mathematical fields. This paper presents a case study in Peano's arithmetic to show the generality of the proposed metrics. We evaluate the interestingness of theorems of Peano's arithmetic obtained by forward reasoning and confirm the effectiveness of the metrics.

Hongbiao Gao, Jingde Cheng
A Predicate Suggestion Algorithm for Automated Theorem Finding with Forward Reasoning

The problem of automated theorem finding (ATF for short) is one of the 33 basic research problems in automated reasoning. To solve the ATF problem, an ATF method using forward reasoning based on strong relevant logics has been proposed and studied. In this method, predicate abstraction plays an important role. However, in the current method, the targets of predicate abstraction are predicates that the executor of ATF already knows. This paper presents a predicate suggestion algorithm to suggest previously unknown predicates and create abstraction rules for predicate abstraction in the ATF method with forward reasoning. The paper also shows, through a case study, that the proposed algorithm is effective.

Yuichi Goto, Hongbiao Gao, Jingde Cheng

Collective Intelligence for Service Innovation, Technology Opportunity, E-Learning and Fuzzy Intelligent Systems

Frontmatter
Modeling a Multi-criteria Decision Support System for Capital Budgeting Project Selection

Capital budgeting project selection is an important part of strategic decision-making in every enterprise, because successful new investment projects contribute essentially to an enterprise's financial growth, value proposition and strategic intent. Therefore, the main purpose of this study was to present a design philosophy and operation process for modeling a decision support system to handle capital budgeting project selection problems. To achieve this purpose, the goal of the study was two-fold. The first objective was to propose a new fuzzy multi-criteria decision-making method for project alternative comparison and selection. The second objective was to employ the new fuzzy multi-criteria decision-making method to model the computational architecture of the decision support system for capital budgeting project selection. Finally, an algorithm and a numerical example summarizing the design philosophy and operation process of the modeled multi-criteria DSS were illustrated, and the results indicate that the objectives of the study were achieved.

Kuo-Sui Lin, Jui-Ching Pan
Innovative Diffusion Chance Discovery

The purpose of this study is to explore the innovative use value created by early adopters on the Internet. The aim is to affect the purchasing intention of the early majority when there is no social relationship network between early adopters and the early majority, and so to build a new innovation diffusion model. To achieve this, the innovative use value created on the Internet by early adopters was designed into commercial fliers to directly stimulate the early majority. This is used to observe whether new products are accepted by the early majority and to predict a new product's diffusion chances. The experimental results proved that this method can indeed move the early majority toward groups with high purchase intention, and showed how the social networks of the early majority work to influence the late majority.

Chao-Fu Hong, Mu-Hua Lin
Chance Discovery in a Group-Trading Model - Creating an Innovative Tour Package with Freshwater Fish Farms at Yilan

Information technology is developing very fast in modern days, which has brought new opportunities for electronic commerce businesses in the tourism industry. Crossing the chasm between early adopters and the early majority is an important issue for tourism innovation diffusion. In this paper, a group-trading model called the Core Broker Model is used to create an innovative tour product and to diffuse the innovation through e-markets on the Internet. A core broker organizes an innovation team, using text mining techniques, to create a featured tour package and initiate a joint-selling project for different types of providers. The core broker then recruits market brokers to form a market team to promote the tourism innovation and puts it on websites for consumers to group-purchase e-coupons for the trip. With the advantages of group-trading in the model, the chance for the innovation diffusion to cross the chasm is high.

Pen-Choug Sun, Chao-Fu Hong, Tsu-Feng Kuo, Rahat Iqbal
Using Sentiment Analysis to Explore the Association Between News and Housing Prices

In recent years, semi-structured and unstructured data have received substantial attention. Previous studies on sentiment analysis and opinion mining have indicated that media information features sentiment factors that can affect investor decisions. However, few studies have explored the correlation between news sentiment and housing prices; hence, the present study was conducted to investigate this correlation. A method was proposed to collect and filter news information and analyze the correlation between news sentiment and housing prices. The results indicate that news sentiment can serve as a reference for evaluating housing price trends.

Hsiao-Fang Yang, Jia-Lang Seng
Importance-Performance Analysis Based Evaluation Method for Security Incident Management Capability

SEI's Incident Management Capability Metrics provide an overview of how metrics can be used to evaluate and improve organizations' information security incident management capability. However, several deficiencies remain when using SEI's metrics to measure the function areas of security incident management capability. An importance-performance analysis (IPA) based evaluation method for measuring and improving organizations' information security incident management capability is proposed in this paper. The evaluation method produces a four-quadrant IPA matrix that considers both importance and performance simultaneously, to better identify function areas needing improvement. A numerical example showed that the proposed method is efficient for deploying a continuous improvement program and for better allocating limited resources.

Chih-Chung Chiu, Kuo-Sui Lin
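
The four-quadrant IPA matrix itself is simple enough to show directly; the function-area names and scores below are hypothetical, purely to illustrate the quadrant logic.

    def ipa_quadrant(importance, performance, imp_mean, perf_mean):
        """Classify a function area into the four-quadrant IPA matrix."""
        if importance >= imp_mean and performance < perf_mean:
            return "Concentrate here"          # high importance, low performance
        if importance >= imp_mean:
            return "Keep up the good work"
        if performance < perf_mean:
            return "Low priority"
        return "Possible overkill"

    areas = {"Triage": (4.6, 2.9), "Reporting": (3.1, 4.2), "Training": (2.8, 2.5)}
    imp_mean = sum(v[0] for v in areas.values()) / len(areas)
    perf_mean = sum(v[1] for v in areas.values()) / len(areas)
    for name, (imp, perf) in areas.items():
        print(name, ipa_quadrant(imp, perf, imp_mean, perf_mean))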

Intelligent Computer Vision Systems and Applications

Frontmatter
Moment Shape Descriptors Applied for Action Recognition in Video Sequences

Algorithms for the recognition of human activities have found application in many computer vision systems, for example in visual content analysis approaches and in video surveillance systems, where they can be employed for the recognition of single gestures, simple actions, interactions and even behaviour. In this paper an approach to human action recognition based on shape analysis is presented. Sets of binary silhouettes extracted from video sequences representing a person performing an action are used as input data. The developed approach is composed of several algorithms, including those for shape representation and matching. It can deal with sequences of different numbers of frames, and none of the frames has to be removed. The paper provides some initial experimental results on classification using the proposed approach and moment shape description algorithms, namely Zernike Moments, Moment Invariants and Contour Sequence Moments.

Katarzyna Gościewska, Dariusz Frejlichowski
Automatic Detection of Singular Points in Fingerprint Images Using Convolution Neural Networks

Minutiae-based matching, the most popular approach used in fingerprint matching algorithms, calculates similarity by finding the maximum number of matched minutiae pairs in two given fingerprints. With no prior knowledge of an anchor or clue for matching, this becomes a combinatorial problem. Global features of the fingerprints (e.g., singular core and delta points) can be used as anchors to speed up the matching process. Most approaches use the conventional Poincare Index method with additional techniques to improve the detection of core and delta points. Our approach uses Convolutional Neural Networks, which have achieved state-of-the-art results in many computer vision tasks, to automatically detect those points. In experiments on the FVC2002 database, we achieved accuracy and false alarm rates of (96%, 7.5%) and (90%, 6%) for detecting core and delta points, respectively. These results are comparable to those of detection algorithms built on human knowledge.

Hong Hai Le, Ngoc Hoa Nguyen, Tri-Thanh Nguyen
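
A small convolutional patch classifier in the spirit of the paper can be sketched in Keras: does a fingerprint patch contain a core, a delta, or neither? The layer sizes, patch size and class layout are illustrative assumptions, not the authors' architecture.

    import tensorflow as tf

    model = tf.keras.Sequential([
        tf.keras.Input(shape=(32, 32, 1)),
        tf.keras.layers.Conv2D(16, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(32, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(3, activation="softmax"),  # core / delta / background
    ])
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    # model.fit(patches, labels, epochs=10)  # patches: (N, 32, 32, 1), scaled to [0, 1]
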
Vehicle Detection in Hsuehshan Tunnel Using Background Subtraction and Deep Belief Network

This paper proposes a method to detect vehicles in the Hsuehshan Tunnel. Vehicle detection in the tunnel is a challenging problem due to the use of heterogeneous cameras, varied camera setup locations, low-resolution videos, poor tunnel illumination, and lights reflected on the tunnel wall. Furthermore, the vehicles to be detected vary greatly in shape, color, size, and appearance. The proposed method is based on background subtraction and a Deep Belief Network (DBN) with a three-hidden-layer architecture. Experimental results show that it can detect vehicles in the tunnel effectively, with an accuracy rate of 96.59%.

Bo-Jhen Huang, Jun-Wei Hsieh, Chun-Ming Tsai
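
The background-subtraction front end can be sketched with OpenCV's adaptive Mixture-of-Gaussians model; the video path, morphology kernel and area threshold are placeholders, and the DBN classification stage is omitted.

    import cv2

    cap = cv2.VideoCapture("tunnel.mp4")  # placeholder path
    subtractor = cv2.createBackgroundSubtractorMOG2(history=500, varThreshold=16)
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        mask = subtractor.apply(frame)
        mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)  # suppress noise
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        # Candidate vehicle boxes would be passed to the DBN classifier here.
        candidates = [cv2.boundingRect(c) for c in contours if cv2.contourArea(c) > 400]
    cap.release()
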
Segment Counting Versus Prime Counting in the Ulam Square

Points that correspond to prime numbers in the Ulam square form straight line segments of various lengths. It is shown that the relation between the number of segments and the number of primes present in a given square grows steadily with the growing values of the prime numbers in the studied range, up to 25 009 991, and is close to a double logarithm. These observations were also tested on random dot images to check whether the findings were not merely a result of the abundance of data. In random images the densities of the longer segments and the lengths of the longest ones are considerably smaller than in the Ulam square, while for the shorter segments it is the opposite. This could lead to the cautious presumption that the structure of the set of primes might contain long-range relations which do not depend on scale.

Leszek J. Chmielewski, Arkadiusz Orłowski, Grzegorz Gawdzik
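
For readers who want to reproduce the input data, an Ulam square is straightforward to generate: walk a square spiral from the centre and mark the primes. A compact sketch (the segment detection itself is not shown):

    import numpy as np
    from sympy import isprime

    def ulam_square(n):
        """Binary n x n image with 1s at the prime positions of the Ulam spiral."""
        img = np.zeros((n, n), dtype=np.uint8)
        x = y = n // 2
        dx, dy, step, k = 1, 0, 1, 1
        while k <= (n + 2) ** 2:           # walk slightly past the square to fill corners
            for _ in range(2):             # two legs per step length
                for _ in range(step):
                    if 0 <= x < n and 0 <= y < n and isprime(k):
                        img[y, x] = 1
                    k += 1
                    x, y = x + dx, y + dy
                dx, dy = -dy, dx           # turn 90 degrees
            step += 1
        return img
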
Improving Traffic Sign Recognition Using Low Dimensional Features

In recent decades, research on autonomous vehicles has become popular in the computer vision community, since such vehicles are equipped with cameras for sensing the environment to help navigation. Cameras provide a lot of information and are low-cost sensors compared to the other sensors that can be mounted on a vehicle. One piece of visual information that an autonomous vehicle can acquire for its navigation is the traffic sign. Thus, this work addresses a traffic sign recognition framework as part of an autonomous vehicle. For recognizing a traffic sign, it is assumed that traffic sign regions have been extracted using maximally stable extremal regions (MSER). Using a heuristic rule on geometric properties, false detections are excluded. Furthermore, traffic sign images are classified using low-dimensional features encoded with an adversarial auto-encoder. Using this strategy, the classification task can be performed with 2-dimensional features while improving the classification results over high-dimensional grayscale features. Extensive experiments carried out on the German traffic sign recognition database show that the proposed method provides reliable results.

Laksono Kurnianggoro, Wahyono, Kang-Hyun Jo
Image Processing Approach to Diagnose Eye Diseases

Image processing and machine learning techniques are used for the automatic detection of abnormalities in the eye. The proposed methodology requires a clear photograph of the eye (not necessarily a fundoscopic image), from which the chromatic and spatial properties of the sclera and iris are extracted. These features are used in the diagnosis of the various diseases considered. Changes in the colour of the iris are a symptom of corneal infections and cataract, the spatial distribution of different colours distinguishes diseases like subconjunctival haemorrhage and conjunctivitis, and the spatial arrangement of iris and sclera is an indicator of palsy. We used various classifiers, of which the AdaBoost classifier was found to give substantially higher accuracy, about 95%, compared to the others (k-NN and naive Bayes). To evaluate the accuracy of the proposed method, we used 150 samples, of which 23% were used for testing and 77% for training.

M. Prashasthi, K. S. Shravya, Ankit Deepak, Manjunath Mulimani, Koolagudi G. Shashidhar
Weakly-Labelled Semantic Segmentation of Fish Objects in Underwater Videos Using a Deep Residual Network

We propose the use of a 152-layer Fully Convolutional Residual Network (ResNet-FCN) for non-motion-based semantic segmentation of fish objects in underwater videos that is robust to varying backgrounds and changes in illumination. For supervised training, we use weakly-labelled ground truth derived from motion-based adaptive Mixture of Gaussians background subtraction. Segmentation results for videos taken from six different sites at a benthic depth of around 10 m using ResNet-FCN give a fish object average precision of 65.91% and an average recall of 83.99%. The network is able to correctly segment fish objects solely through color-based input features, without the need for motion cues, and it can detect fish objects even in frames that have strong changes in illumination due to wave motion at the sea surface. It can segment fish objects located far from the camera despite varying benthic background appearance and differences in aquatic hues.

Alfonso B. Labao, Prospero C. Naval Jr.
A New Feature Extraction in Dorsal Hand Recognition by Chromatic Imaging

Biometric authentication is a trending topic in biometrics that increases the security of personal data. Clients effortlessly authenticate to any kind of system, since the controllers, which could be one-time or continuous, stealthily validate the biometric characteristics. The enhancements could be used as the main authentication protocol or implemented as a second security layer. We therefore propose a dorsal hand recognition and validation procedure to increase the security of personal computers. A camera mounted on top of a laptop monitor captures the hands on the keyboard, and the frames are identified by an adaptive chromatic method. Afterwards, new geometrical features are extracted as the key biometric traits to be analyzed in a Levenberg-Marquardt based neural controller for validation.

Orcan Alpar, Ondrej Krejcar
Testing the Limits of Detection of the ‘Orange Skin’ Defect in Furniture Elements with the HOG Features

In principle, the orange skin surface defect can be successfully detected with a set of relatively simple image processing techniques. To assess the technical possibilities of classifying relatively small surfaces, the Histogram of Oriented Gradients (HOG) and the Support Vector Machine were used on two sets of about 400 surface patches each. Color, grey and binarized images were used in the tests. For grey images the worst classification accuracy was 91%, and for binarized images it was 99%. For color images the results were generally worse. The experiments have shown that the cell size in the HOG feature extractor should be no more than 4 by 4 pixels, which corresponds to 0.12 by 0.12 mm on the object surface.

Leszek J. Chmielewski, Arkadiusz Orłowski, Grzegorz Wieczorek, Katarzyna Śmietańska, Jarosław Górski
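
The abstract pins down the key extractor setting (cells no larger than 4 x 4 pixels), which translates directly into a HOG call; the SVM kernel and the patch handling below are assumptions.

    import numpy as np
    from skimage.feature import hog
    from sklearn.svm import SVC

    def hog_features(gray_patch):
        """HOG with a small cell, following the finding that cells larger than
        4 x 4 pixels degrade accuracy on this defect."""
        return hog(gray_patch, orientations=9, pixels_per_cell=(4, 4),
                   cells_per_block=(2, 2))

    # patches: equal-sized grayscale arrays; labels: 1 = orange skin defect.
    # X = np.stack([hog_features(p) for p in patches])
    # clf = SVC(kernel="rbf").fit(X, labels)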

Intelligent Data Analysis, Applications and Technologies for Internet of Things

Frontmatter
Reaching Safety Vehicular Ad Hoc Network of IoT

The Internet of Things (IoT) is the interconnection of things in a network. Sensors are embedded into equipment to support the transfer of sensor information between carriers via communication technology, enabling smart automatic control and applications. The Vehicular Ad hoc NETwork (VANET) is one application of the IoT that allows vehicles within the network to communicate effectively with one another. It is important that VANETs operate on a safe and reliable network topology. Therefore, in this study, a Reliable VANET Agreement Protocol (RVAP) is proposed. RVAP allows all fault-free nodes to reach safe agreement with a minimal number of rounds of data gathering, and tolerates the maximal number of allowable faulty components in the VANET.

Shu-Ching Wang, Shih-Chi Tseng, Shun-Sheng Wang, Kuo-Qin Yan
Short-Term Load Forecasting in Smart Meters with Sliding Window-Based ARIMA Algorithms

Forecasting of electricity consumption for residential and industrial customers is an important task providing intelligence to the smart grid. Accurate forecasting should allow a utility provider to plan resources as well as to take control actions to balance the supply and demand of electricity. This paper presents two non-seasonal and two seasonal sliding window-based ARIMA (Auto Regressive Integrated Moving Average) algorithms. These algorithms are developed for short-term forecasting of hourly electricity load. The algorithms integrate non-seasonal and seasonal ARIMA models with the OLIN (Online Information Network) methodology. To evaluate our approach, we use a real hourly consumption data stream recorded by six smart meters during a 16-month period.

Dima Alberg, Mark Last
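
A sliding-window ARIMA, the core of the approach, can be sketched with statsmodels: refit on the most recent window and forecast one hour ahead. The window length and (p, d, q) order are illustrative; the paper's OLIN integration is not shown.

    import pandas as pd
    from statsmodels.tsa.arima.model import ARIMA

    def sliding_arima_forecast(series: pd.Series, window: int = 24 * 28, order=(2, 1, 1)):
        """Refit a non-seasonal ARIMA on the last `window` hours at every step
        and forecast the next hour."""
        forecasts = []
        for t in range(window, len(series)):
            model = ARIMA(series.iloc[t - window:t], order=order).fit()
            forecasts.append(model.forecast(steps=1).iloc[0])
        return pd.Series(forecasts, index=series.index[window:])
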
Bus Drivers Fatigue Measurement Based on Monopolar EEG

When people are tired, their conscious activities are sluggish and slow brainwaves predominate in their brains. More specifically, a person's health may be affected and the chance of accidents may rise due to lowered alertness resulting from reduced activity in the cerebral cortex. In this study, by means of a portable electroencephalograph and a brain-computer interface, the brainwaves of bus drivers were measured to assess their fatigue. The results indicate that, through the proposed empirical formula, we were able to measure people's fatigue state with results comparable to a sophisticated medical instrument.

Chin-Ling Chen, Chong-Yan Liao, Rung-Ching Chen, Yung-Wen Tang, Tzay-Farn Shih
An Improved Sleep Posture Recognition Based on Force Sensing Resistors

In this paper, we applied six force-sensing resistor (FSR) sensors to perform sleep posture recognition. An analog-to-digital converter (ADC) is used to extract the resistance signals of the FSRs. The recorded FSR signals are averaged into a reference pattern of six values. The reference and test patterns of the postures are matched using the mean squared error (MSE) method. With a scale-adjusting method, a recognition accuracy of 87% is obtained. Moreover, after moving average windows are adopted to remove high-frequency ripple, the recognition accuracy improves to 96% with window length L = 7.

Yung-Fa Huang, Yi-Hsiang Hsu, Chia-Chi Chang, Shing-Hong Liu, Ching-Chuan Wei, Tsung-Yu Yao, Chuan-Bi Lin
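
The MSE matching step is easy to show end to end; the six-value reference patterns below are hypothetical stand-ins for the averaged FSR recordings.

    import numpy as np

    REFERENCES = {  # hypothetical six-FSR reference patterns per posture
        "supine": np.array([0.9, 0.8, 0.7, 0.7, 0.8, 0.9]),
        "left":   np.array([0.2, 0.9, 0.9, 0.3, 0.1, 0.1]),
        "right":  np.array([0.1, 0.1, 0.3, 0.9, 0.9, 0.2]),
    }

    def classify_posture(sample, references=REFERENCES):
        """Pick the posture whose reference pattern has the lowest MSE to the sample."""
        mse = {name: float(np.mean((sample - ref) ** 2))
               for name, ref in references.items()}
        return min(mse, key=mse.get)

    print(classify_posture(np.array([0.85, 0.8, 0.72, 0.68, 0.75, 0.88])))  # supine
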
Remaining Useful Life Estimation-A Case Study on Soil Moisture Sensors

This paper presents an approach to estimate the remaining useful life of sensors. First, a system state machine is defined to divide the sampled data received from the sensors into different categories. Then, the sampled data sets are sent to the fault model to detect whether a fault has occurred. The time of occurrence of each type of fault is recorded and weighted with a different coefficient. The weighted values are accumulated to form a trend data graph. An exponential curve fit is then used to approximate the trend of the data and determine the remaining useful life function, and a threshold is generated from the cumulative fault value. The experimental results show that the proposed model has a precision of 66.67% and a recall rate near 100% within a 10-hour timespan. Thus, the proposed model may not only prolong the life span of sensors, but may also reduce the cost of replacing them.

Fang-Chien Chai, Chun-Chih Lo, Mong-Fong Horng, Yau-Hwang Kuo
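
The exponential curve fit and threshold crossing can be sketched with SciPy; the synthetic fault trend and threshold below are made up purely to show the mechanics.

    import numpy as np
    from scipy.optimize import curve_fit

    def remaining_useful_life(t, faults, threshold):
        """Fit a * exp(b t) to the cumulative weighted fault value and return how
        long until the fitted curve crosses the failure threshold."""
        def model(t, a, b):
            return a * np.exp(b * t)
        (a, b), _ = curve_fit(model, t, faults, p0=(max(faults[0], 1e-6), 0.01))
        t_fail = np.log(threshold / a) / b   # invert a * exp(b t) = threshold
        return t_fail - t[-1]

    t = np.arange(1.0, 101.0)                # hypothetical hourly fault record
    faults = 0.5 * np.exp(0.03 * t)
    print(remaining_useful_life(t, faults, threshold=50.0))  # ~53.5 hours left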

Intelligent Algorithms and Brain Functions

Frontmatter
Neurofeedback System for Training Attentiveness

Attention Deficit Disorder (ADD) has long been recognized as a public health concern among children; its symptoms include impulsiveness, inattentiveness and lack of focus. The consequence is children with poor academic performance and discipline, which has a negative impact on their future. Current treatment for ADD uses powerful psycho-stimulant drugs to reduce aggression and enhance concentration. However, there are always risk factors and adverse effects with these drugs. Moreover, drugs do not alter the dysfunctional condition. Forefront research in biomedical engineering unveils neurofeedback, which presents an exciting alternative approach to neural-related disorders. Our ultimate goal is to develop a neurofeedback system to enable anyone with an attention deficit to practice regulating their brain to reach an attentive state of mind, with reduced dependency on drug-related intervention. Relying on neuroplasticity, neurofeedback focuses on training the brain through activities to circumvent the dysfunctional condition. In this paper, such a system has been developed and applied to normal healthy subjects, to establish the protocol for the EEG subband and electrode placement as well as for system functional testing. It consists of a wireless EEG acquisition module, a feature extraction module, an IoT database module, an Intel Edison microcontroller board and a feedback activity center, the humanoid robot. The protocol for subband and electrode placement is established with the short-time Fourier transform (STFT) and the fast Fourier transform (FFT). The system rewards the subject if the root mean square voltage of his beta subband at Fp1 exceeds the target voltage, i.e., when he is attentive.

Khuan Y. Lee, Emir Eiqram Hidzir, Muhd Redzuan Haron
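
The reward criterion (beta-band RMS at Fp1 exceeding a target) reduces to a small amount of spectral arithmetic. A rough numpy sketch, assuming a windowed FFT estimate; the sampling rate and band limits are conventional values, not necessarily the paper's.

    import numpy as np

    def beta_rms(eeg_window, fs=256.0):
        """Approximate RMS voltage of the 13-30 Hz beta band of one EEG window."""
        win = eeg_window * np.hanning(len(eeg_window))
        spectrum = np.fft.rfft(win)
        freqs = np.fft.rfftfreq(len(win), d=1.0 / fs)
        band = (freqs >= 13.0) & (freqs <= 30.0)
        power = np.mean(np.abs(spectrum[band]) ** 2) / len(win)
        return np.sqrt(power)

    # reward = beta_rms(window) > target_voltage  # would trigger the robot feedback
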
Building Classifiers for Parkinson’s Disease Using New Eye Tribe Tracking Method

Parkinson's Disease (PD) is the second most common neurodegenerative disease, and it causes severe complications in patients' daily life. PD remains unspecified in many aspects, including the best treatment, prediction of its progression and precise diagnosis. In our study we built machine learning (ML) models which address some of those issues by helping to improve symptom evaluation precision using advanced biomarkers such as fast eye movements. We built and compared model accuracy relying on data from two systems for recording eye movements: one is a saccadometer (Ober Consulting) and the other is based on the Eye Tribe (ET1000). We reached 85% accuracy in the prediction of neurologic attributes based on the ET and 82% accuracy with the saccadometer, with the help of rough set theory. The purpose of this study was to compare the ET with a clinically approved eye movement measurement device, the Ober saccadometer. We demonstrated on 8 PD patients that both systems give comparable results based on neurological and eye movement measurement attributes.

Artur Szymański, Stanisław Szlufik, Dariusz M. Koziorowski, Andrzej W. Przybyszewski
Rules Found by Multimodal Learning in One Group of Patients Help to Determine Optimal Treatment to Other Group of Parkinson’s Patients

We have already demonstrated that measurements of eye movements in Parkinson's disease (PD) are diagnostic. We performed experimental measurements of fast reflexive saccades (RS) in PD patients in order to predict the effects of different therapies. We also found rules by means of data mining and machine learning (ML) in order to classify how different doses of medication determined motor symptom (UPDRS III) improvements. These rules, obtained from one group of 23 patients on medication only, were applied to another group of 18 patients under medication and DBS (deep brain stimulation) therapies in order to predict motor symptom changes. Parameters such as the patient's age and neurological and saccade parameters gave a global accuracy of 76% in the motor symptom predictions, based on cross-validation. Our approach demonstrated that rough set rules are universal between groups of patients with different therapies, which may help to predict optimal treatments for individual PD patients.

Andrzej W. Przybyszewski, Stanislaw Szlufik, Piotr Habela, Dariusz M. Koziorowski

Intelligent Systems and Algorithms in Information Sciences

Frontmatter
Reasoning in Formal Systems of Extended RDF Networks

It is a fact that the RDF(S) model has been declared the ground base for implementations of further web development conceptions. RDF provides a common and flexible way to decompose knowledge into elementary statements, which allows networks of indivisible knowledge atoms to be represented by RDF triples or by RDF graph vectors. The article presents two graph-based formal systems, GRDF and RDFCFL, defined on the basis of an extended RDF model with the help of clausal form logic principles and notation. The transformation process from first-order predicate logic (FOPL) to the RDF graph notation preserves language expressivity, and moreover both presented systems share partial decidability with FOPL. As an example, reasoning about consequents in a monotonic version of the RDFCFL system is shown.

Alena Lukasová, Martin Žáček, Marek Vajgl
Big Data Filtering Through Adaptive Resonance Theory

The aim of the article is to use Adaptive Resonance Theory (ART1) for big data filtering. ART1 is used for preprocessing of the training set. This allows finding typical patterns in the full training set and thus covering the whole space of solutions. A neural network adapted on the reduced training set has greater generalization ability. The work also discusses the influence of the vigilance parameter setting on filtering the training set. The proposed method of big data filtering through Adaptive Resonance Theory is experimentally verified by controlling the behavior of an autonomous robot in an unknown environment. All obtained results are evaluated in the conclusion.

Adam Barton, Eva Volna, Martin Kotyrba
The Effectiveness of the Simplicity in Evolutionary Computation

Current research in Evolutionary Computation concentrates on proposing ever more sophisticated methods that are supposed to be more effective than their predecessors. New mechanisms, like linkage learning (LL), that improve overall method effectiveness are also proposed. These research directions are promising and lead to effectiveness increases that cannot be questioned. Nevertheless, in this paper we concentrate on a situation in which the simplification of a method leads to an improvement in its effectiveness. We show situations where primitive methods, like Random Search (RS) combined with local search, can compete with highly sophisticated and highly effective methods. The presented results were obtained for an up-to-date, practical, NP-complete problem, namely the Routing and Spectrum Allocation of Multicast and Unicast Flows (RSA/MU) in Elastic Optical Networks (EONs). None of the considered test cases is trivial: the number of solutions an evolutionary method may encode is large.

Michal Witold Przewozniczek, Krzysztof Walkowiak, Michal Aibin
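
The RS-plus-local-search baseline the paper champions fits in a dozen lines; `evaluate`, `random_solution` and `neighbors` are problem-specific callables the caller supplies (the RSA/MU encoding is not reproduced here).

    def rs_with_local_search(evaluate, random_solution, neighbors, iterations=1000):
        """Draw random solutions and hill-climb each one; keep the overall best."""
        best, best_val = None, float("-inf")
        for _ in range(iterations):
            sol = random_solution()
            improved = True
            while improved:                  # first-improvement hill climbing
                improved = False
                for nb in neighbors(sol):
                    if evaluate(nb) > evaluate(sol):
                        sol, improved = nb, True
                        break
            val = evaluate(sol)
            if val > best_val:
                best, best_val = sol, val
        return best, best_val
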
Genetic Algorithm for Self-Test Path and Circular Self-Test Path Design

The article presents the use of a Genetic Algorithm to search for non-linear Autonomous Test Structures (ATS) in the Built-In Testing approach. Such structures essentially include Self-Test Paths (STP) and Circular Self-Test Paths (CSTP) and their modifications. Non-linear structures are more difficult to analyze than widely used structures like an independent Test Pattern Generator and a Test Response Compactor realized by a Linear Feedback Shift Register. To reduce the time-consuming test simulation of sequential circuits, an approach based on a stochastic model of pseudo-random testing was used. The use of the stochastic model significantly affects the time effectiveness of the evolutionary search for autonomous structures. In the test simulation procedure, the memory block of the sequential circuit is not disconnected. This approach does not require a special selection of memory registers as in BILBOs. A series of studies on the ISCAS'89 test circuit set was carried out. The results of the study are very promising.

Miłosław Chodacki

IT in Biomedicine

Frontmatter
The Use of Tuned Shape Window for the Improvement of Scars Imaging in Static Renal Scintigraphy in Children

Physiological renal processes are evaluated based on planar images registered by a gamma camera, SPECT or PET. However, the detection of small disorders in standard planar scintigraphic imaging is difficult and sometimes impossible. The aim of the research conducted at the Pomeranian Medical University is to increase the sensitivity of the method for detecting areas of renal functional disorders called scarring. The image recorded as a result of the test was subjected to digital processing; for that purpose, a novel window function was used for filter design. Standard and processed images were presented to three independent experts, and the diagnosis results were subjected to statistical analysis. As a result, a large share of diagnosis changes was reported across the conducted tests, along with a strong correlation between a positive evaluation of the processed images and a change in diagnosis, as well as an improved ability to evaluate kidney condition.

Janusz Pawel Kowalski, Bozena Birkenfeld, Piotr Zorga, Jakub Peksinski, Grzegorz Mikolajczak
PCA-SCG-ANN for Detection of Non-structural Protein 1 from SERS Salivary Spectra

With non-structural protein 1 (NS1) acknowledged as a biomarker for Dengue fever, the need to automate the detection of NS1 from salivary surface-enhanced Raman spectroscopy (SERS) spectra, with claimed sensitivity down to a single molecule, has become imminent. The choice of Principal Component Analysis (PCA) termination criterion and artificial neural network (ANN) topology critically affects the performance and efficiency of a PCA-SCG-ANN classifier. This paper explores the effect of the number of hidden nodes in the ANN topology and of the PCA termination criterion on the performance of the PCA-SCG-ANN classifier for the detection of NS1 from SERS spectra of subjects' saliva. The Eigenvalue-One-Criterion (EOC), Cumulative Percentage Variance (CPV) and Scree criteria, integrated with ANN topologies containing from 3 to 100 hidden nodes, are investigated. The performance of a total of 42 classifier models is examined and compared in terms of accuracy, precision and sensitivity. The experiments show that the EOC criterion paired with an ANN topology of 13 hidden nodes outperforms the other models, with a performance of [Accuracy 91%, Precision 94%, Sensitivity 94%, Specificity 96%].

N. H. Othman, Khuan Y. Lee, A. R. M. Radzol, W. Mansor
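
The Eigenvalue-One-Criterion that wins in the comparison is a one-liner on top of scikit-learn's PCA: keep the components whose eigenvalue exceeds 1 on standardized spectra. A generic sketch (the SCG-trained ANN stage is omitted):

    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.preprocessing import StandardScaler

    def pca_eigenvalue_one(spectra):
        """Project standardized SERS spectra onto components with eigenvalue > 1."""
        X = StandardScaler().fit_transform(spectra)
        eigenvalues = PCA().fit(X).explained_variance_
        n_keep = int(np.sum(eigenvalues > 1.0))
        return PCA(n_components=n_keep).fit_transform(X), n_keep
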
Prediction of Arterial Blood Gases Values in Premature Infants with Respiratory Disorders

Arterial blood gas (ABG) sampling is a method for acquiring neonatal patients' acid-base status. Variations of blood gasometry parameter values over time can be modelled using multi-layer artificial neural networks (ANNs). Accurate predictions of future levels of blood gases can be useful in supporting therapeutic decision making. In the paper, several ANN models are trained using growing numbers of feature vectors, and an assessment is made of the influence of input matrix size on the accuracy of the ANNs' prediction capabilities.

Wiesław Wajs, Hubert Wojtowicz, Piotr Wais, Marcin Ochab
Extraction of Optical Disc Geometrical Parameters with Using of Active Snake Model with Gradient Directional Information

An analysis of the optical disc is a challenging task in the field of clinical ophthalmology. The optical disc (OD) is frequently utilized as a reference parameter for the time evolution of retinal changes; therefore, its analysis is significantly important. In clinical practice, there are particular problems with the lower quality of retinal records acquired by the RetCam 3 retinal probe and with poor observability of the OD area. As a consequence, many algorithms are unable to precisely approximate the OD area. We propose a method based on the active snake model that carries out automatic extraction of the optical disc area even in spots where the OD is not clearly observable or the image edges are completely missing. Furthermore, the proposed solution calculates the OD centroid and the respective area for further comparison of the OD with retinal lesions.

Jan Kubicek, Juraj Timkovic, Marek Penhaker, Martin Augustynek, Iveta Bryjova, Vladimir Kasik
Segmentation of Vascular Calcifications and Statistical Analysis of Calcium Score

Assessment of blood vessels is a current task in the field of clinical angiography. Calcification spots are usually well observable in native CT angiography records; nevertheless, there is an absence of methods precisely calculating the calcium score (CS), which serves as an indicator of blood vessel deterioration by the calcification process. The paper deals with a segmentation method for the segmentation, extraction and differentiation of vascular calcifications based on multilevel Otsu thresholding. The method generates a mathematical model of the physiological and calcified parts of blood vessels and consequently allows CS calculation as the ratio of calcified to physiological blood vessel. We performed an analysis of blood vessel modeling and CS calculation on a wide dataset of 90 patients' records. We compared our CS results with three independent clinical experts. On the basis of this comparative analysis, we propose estimated intervals for CS, serving for objective analysis of this vascular parameter as a function of calcification level. The main clinically applicable result is the VesselsCalc software application for complex analysis of blood vessels, modeling of calcification areas and calculation of the calcium score.

Jan Kubicek, Iveta Bryjova, Jan Valosek, Marek Penhaker, Martin Augustynek, Martin Cerny, Vladimir Kasik
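
Multilevel Otsu thresholding and the calcified-to-physiological ratio can be sketched with scikit-image; the three-class split (background, soft tissue, calcification) is an assumption for illustration.

    import numpy as np
    from skimage.filters import threshold_multiotsu

    def calcium_score_ratio(vessel_roi):
        """Split a vessel ROI into three intensity classes with multilevel Otsu,
        then return the calcified-to-physiological pixel ratio."""
        t_low, t_high = threshold_multiotsu(vessel_roi, classes=3)
        physiological = (vessel_roi > t_low) & (vessel_roi <= t_high)
        calcified = vessel_roi > t_high
        return calcified.sum() / max(int(physiological.sum()), 1)
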
Rough Hypercuboid and Modified Kulczynski Coefficient for Disease Gene Identification

The most important objective of human genetics research is the discovery of genes associated with a disease. In this respect, a new algorithm for gene selection is presented, which wisely integrates information from gene expression profiles and protein-protein interaction networks. The rough hypercuboid approach is used for identifying differentially expressed genes from the microarray, while a new measure of similarity is proposed to exploit the protein interaction network and thereby determine the pairwise functional similarity of proteins. The proposed algorithm aims to maximize relevance and functional similarity, and utilizes them as an objective function for the identification of a subset of genes that it predicts as disease genes. The performance of the proposed algorithm is compared with other related methods on several cancer-associated data sets.

Ekta Shah, Pradipta Maji
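
For orientation, the classical Kulczynski (second) coefficient on two proteins' interaction neighbourhoods looks as follows; the paper's modified variant is not specified here, and the protein names are hypothetical.

    def kulczynski(a: set, b: set) -> float:
        """Classical Kulczynski similarity: mean of the two overlap fractions."""
        if not a or not b:
            return 0.0
        inter = len(a & b)
        return 0.5 * (inter / len(a) + inter / len(b))

    # Hypothetical PPI neighbourhoods of two proteins:
    print(kulczynski({"TP53", "BRCA1", "MDM2"}, {"TP53", "MDM2", "ATM", "CHEK2"}))
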
Detection of Raynaud’s Phenomenon by Thermographic Testing for Finger Thermoregulation

Raynaud's phenomenon is a disorder of blood flow in the fingers, generally said to be a vasospastic response to cold or emotional stress due to blockage of a constricted digital artery. However, some cases triggered by excessive use of the hands are also reported, besides the idiopathic vasospasms. During an attack, the vessels temporarily narrow, limiting the blood supply to the fingers, so the small arteries may thicken over time. Severe episodes result in numbness, color change of the affected fingers, and eventually gangrene. Therefore, in this paper we present a detection methodology for early diagnosis to prevent reduced blood flow, based on thermal image processing of thermoregulation in the fingers. Several experiments were conducted under varying conditions to understand the differences between states after the cooling process. The first results are greatly encouraging: reduced blood flow is mathematically identified by grid histogram matrices of the red-channel conversion of higher radiation.

Orcan Alpar, Ondrej Krejcar

Intelligent Technologies in the Smart Cities in the 21st Century

Frontmatter
Enhancing Energy Efficiency of Adaptive Lighting Control

Lighting standards (generally based on the CEN/TR 13201 standard) allow assigning different lighting classes to a single roadway depending on the actual traffic flow. Applying a less restrictive lighting class decreases energy consumption; thanks to this, setting lighting classes according to actual road conditions yields up to a 34% reduction in power usage. This result may be further improved if control is performed on a well-designed installation. The optimization of a lighting design for one lighting class is the subject of intensive research. In this article we show, in turn, that optimization within one class is not sufficient for preparing an optimal solution. A metric allowing the comparison of two solutions is also introduced. The energy consumption in the presented case is reduced by only 2.73%, but in the context of the global energy consumption of lighting systems (estimated at 3 GWh in the EU) this seems to be of interest.

Adam Sędziwy, Leszek Kotulski, Artur Basiura
Comparative Analysis of Selected Algorithms in the Process of Optimization of Traffic Lights

Optimal settings of traffic lights and traffic light cycles are important tasks in modeling modern, ordered traffic in smart cities. This article comparatively analyzes the effectiveness of selected optimization algorithms for an identified area. In particular, it compares a genetic algorithm, particle swarm optimization, differential evolution and the Monte Carlo method with two new approaches: an evolution strategy involving covariance matrix adaptation, and an archipelago topology consisting of four islands running different algorithms, to optimize phase lengths in fixed-time traffic signals. The developed simulation solutions achieved a quantitative improvement in the selection of optimal phase durations of traffic lights for the tested roads with junctions.

K. Małecki, P. Pietruszka, S. Iwan
Knowledge Representation Framework for Agent–Based Economic Systems in Smart City Context

The agent-based economic systems essentially need precise configuration data and access to knowledge from various information sources in order to function properly. Main sources of such information are national statistical economic data, regional statistics, company performance indicators, etc. Generally, information sources of various formats and levels of detail are to be used. The main aim of the paper is to present a general framework for knowledge management used in the smart city context, allowing efficient employment and distribution of such data. The knowledge layer serves as an ontological intermediary between information resources and agents themselves, and is used mainly for improvement of the model efficiency especially in the following areas: (1) inter-agent communication, (2) system parameters configuration, (3) meta-data for improved search processes, and (4) unification of data exchange.

Martina Husáková, Petr Tučník
The Principles of Model Building Concepts Which Are Applied to the Design Patterns for Smart Cities

The involvement of citizens in decision-making processes is one of the main features of smart cities. Such involvement is reflected in the form of requirements towards the city and the benefits which are expected from the city. Requirements and benefits are thus the primary language of communication between decision-makers and urban residents. To develop such a language, it becomes necessary to develop design patterns for Smart Cities that could integrate the requirements and benefits into ontological concepts referring to the rules describing design patterns. The article proposes the construction of a model for converting requirements and benefits, recorded in natural language, into ontological concepts of the principles referring to Smart City patterns. The study verifies the developed model in an experimental environment. It applies ontologies for both languages, of benefits and of requirements. Then it rates the mapping of both ontologies in relation to the sample requirements and benefits presented for Smart Cities. After that, the similarity of both ontologies is assessed and the concepts for the standard pattern rules are defined. This approach provides the conditions for the development of Smart City patterns and for their use in the decision-making processes which are so important for the development of Smart Cities.

Katarzyna Ossowska, Liliana Szewc, Cezary Orłowski
Assessment and Optimization of Air Monitoring Network for Smart Cities with Multicriteria Decision Analysis

Environmental monitoring networks need to be designed in an efficient way, to minimize costs and maximize the information granted by their operation. Gathering data from monitoring stations is also the essence of Smart Cities. The Agency of Regional Air Quality Monitoring in the Gdańsk Metropolitan Area (pol. ARMAAG) was assessed in terms of its efficiency in obtaining a variety of information. The one-month average concentrations of seven parameters (data for three years) from eight monitoring stations were the input data to a multicriteria decision analysis assessment, and the least effective station was identified.

Aleksander Orłowski, Mariusz Marć, Jacek Namieśnik, Marek Tobiszewski
Urban Air Quality Forecasting: A Regression and a Classification Approach

We employ Computational Intelligence (CI) methods to model air pollution for the Greater Gdańsk Area in Poland. The forecasting problem is addressed with both classification and regression algorithms. In addition, we present an ensemble method that allows for the use of a single Artificial Neural Network-based model for the whole area of interest. Results indicate good model performance with a correlation coefficient between forecasts and measurements for the hourly PM10 concentration 24 h in advance reaching 0.81 and an agreement index (Cohen’s kappa) up to 54%. Moreover, the ensemble model demonstrates a decrease in Mean Square Error in comparison to the best simple model. Overall results suggest that the specific modelling approach can support the provision of air quality forecasts at an operational basis.

Kostas Karatzas, Nikos Katsifarakis, Cezary Orlowski, Arkadiusz Sarzyński

Analysis of Image, Video and Motion Data in Life Sciences

Frontmatter
Ethnicity Distinctiveness Through Iris Texture Features Using Gabor Filters

Research in iris biometrics has focused on utilizing iris features as a means of identity verification and authentication. However, not enough research has been done to explore iris textures in order to determine soft biometrics such as gender and ethnicity. Researchers have reported that iris texture features contain information that is linked to human genetics and is highly discriminative between eyes of different ethnicities. This work applies image processing and machine learning techniques, designing a bank of Gabor filters to develop a model that extracts iris textures to distinguish individuals according to ethnicity. On a database of 30 subjects with 120 images, results show that the mean amplitude computed from Gabor magnitude and phase provides a correct ethnic distinction of 93.33% between African Black and Caucasian subjects. The compactness of the produced feature vector promises a suitable integration with an existing iris recognition system.
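
A minimal sketch of the kind of Gabor-filter-bank feature extraction the abstract describes; the kernel parameters, strip size and four orientations are illustrative assumptions, not the authors' settings.

import numpy as np
from scipy.signal import convolve2d

def gabor_kernel(ksize=21, sigma=4.0, theta=0.0, lam=10.0):
    # Complex Gabor kernel: Gaussian envelope times complex sinusoid.
    half = ksize // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    return np.exp(-(x**2 + y**2) / (2 * sigma**2)) \
         * np.exp(2j * np.pi * xr / lam)

# Hypothetical normalized iris strip (rows = radius, cols = angle).
iris = np.random.rand(64, 256)

# Bank of 4 orientations; the mean Gabor magnitude per orientation
# forms a compact feature vector, in the spirit of the abstract.
features = []
for theta in np.linspace(0, np.pi, 4, endpoint=False):
    resp = convolve2d(iris, gabor_kernel(theta=theta), mode='same')
    features.append(np.abs(resp).mean())   # mean amplitude
print(features)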

Gugulethu Mabuza-Hocquet, Fulufhelo Nelwamondo, Tshilidzi Marwala
Diminishing Variant Illumination Factor in Object Recognition

Objects going undetected by a camera due to poor light intensity or appearing shadows are problems that can occur in object detection. This can lead to losses, especially in industrial applications. The purpose of this research is to correct the illumination factor, particularly shadows, in an image to be analyzed, by combining two methods, namely adaptive single-scale retinex and shadow removal. Retinex smoothing and shadow removal are performed after an image is captured. The object detection accuracy obtained is 95.45%, measured with an experimental image detection program and random sampling of 22 images from the two datasets used in this study: the "Shadow Removal Online Dataset and Benchmark for Variable Scene Categories" and "Klik BCA", obtained from the simulation process. The method can be applied in real-time conditions, where the processing speed is stable and fast enough to be used by industrial companies for quality control.
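
For the illumination-correction step, a plain single-scale retinex (not the adaptive variant the paper combines with shadow removal) can be sketched as follows; the sigma and the synthetic shadowed frame are placeholders.

import numpy as np
from scipy.ndimage import gaussian_filter

def single_scale_retinex(img, sigma=80.0):
    # Retinex: log(image) minus log(Gaussian-smoothed illumination).
    img = img.astype(float) + 1.0          # avoid log(0)
    return np.log(img) - np.log(gaussian_filter(img, sigma))

# Hypothetical grayscale frame with a dark (shadowed) half.
frame = np.tile(np.linspace(40, 200, 320), (240, 1))
frame[:, :160] *= 0.4                      # simulated shadow
out = single_scale_retinex(frame)
print(out.min(), out.max())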

Ardian Yunanto, Iman Herwidiana Kartowisastro
Fast Moving UAV Collision Avoidance Using Optical Flow and Stereovision

Unmanned aerial vehicles are becoming popular, but their autonomous operation is constrained by their collision avoidance ability during high-velocity movement. We propose a simple collision avoidance scheme for fast, business-grade fixed-wing aircraft based on optical flow and stereovision. We calculate optical flow on the parts of the image that are essential for collision avoidance and enlarge the analysed area only as long as the framerate allows, thus avoiding the need to stretch calculations over several frames.
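
A sketch of the core idea, computing optical flow only on a region of interest, using OpenCV's Farnebäck method as one possible flow estimator (the paper does not commit to a specific one); the ROI, threshold and synthetic frames are assumptions.

import numpy as np
import cv2  # OpenCV

# Two consecutive grayscale frames (placeholders for camera input).
prev = np.random.randint(0, 255, (480, 640), np.uint8)
curr = np.roll(prev, 3, axis=1)            # simulated 3 px shift

# Restrict flow computation to a central region of interest, in the
# spirit of analysing only the parts essential for collision avoidance.
roi = (slice(160, 320), slice(220, 420))
flow = cv2.calcOpticalFlowFarneback(prev[roi], curr[roi], None,
                                    0.5, 3, 15, 3, 5, 1.2, 0)
mag = np.linalg.norm(flow, axis=2)
if mag.mean() > 2.0:                       # assumed expansion threshold
    print("possible obstacle: trigger evasive manoeuvre")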

Damian Pęszor, Marzena Wojciechowska, Konrad Wojciechowski, Marcin Szender
Towards the Notion of Average Trajectory of the Repeating Motion of Human Limbs

This paper defines the average trajectory of the repeating motion of a limb joint. The majority of existing results communicate motion analyses by means of various numeric parameters, which do not necessarily provide desirable feedback for those without deep knowledge of the subject. The average trajectory is defined as a pair consisting of a trajectory, describing the shape of the joint motion, and a pipe-shaped neighbourhood, describing the variability of the observed motions. The proposed notion allows visualisation in three-dimensional space, which is easily interpretable and in turn may be used as feedback communicating the results of a training or therapy session. Numeric parameters are associated with the average trajectory to validate the proposed definition.
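
A naive construction in the spirit of this definition, assuming repetitions are resampled to a common length, averaged point-wise, and the pipe radius taken as the largest point-wise deviation; the paper's exact definition is not reproduced here.

import numpy as np

def average_trajectory(reps, n=100):
    # Resample every repetition (a (T_i, 3) joint trajectory) to a
    # common length, average point-wise, and describe variability by
    # a per-sample radius, a stand-in for the pipe-shaped neighbourhood.
    resampled = []
    for r in reps:
        t_old = np.linspace(0.0, 1.0, len(r))
        t_new = np.linspace(0.0, 1.0, n)
        resampled.append(np.column_stack(
            [np.interp(t_new, t_old, r[:, d]) for d in range(3)]))
    stack = np.stack(resampled)              # (n_reps, n, 3)
    mean = stack.mean(axis=0)                # the average trajectory
    radius = np.linalg.norm(stack - mean, axis=2).max(axis=0)
    return mean, radius

# Placeholder repetitions of differing lengths.
reps = [np.cumsum(np.random.randn(120 + 10 * i, 3), axis=0)
        for i in range(5)]
mean, radius = average_trajectory(reps)
print(mean.shape, radius.shape)              # (100, 3) (100,)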

Sven Nõmm, Aaro Toomela, Ilia Gaichenja
Interfered Seals and Handwritten Characters Removal for Prescription Images

Text detection in prescription images is a challenging problem because the prescription images are captured by heterogeneous cameras and under different illuminations. Further, the text is often interfered with by affixed seals and by handwritten characters. In this paper, a binarization method based on Niblack's method is proposed to remove the interfering affixed seals and handwritten characters in prescription images. Experimental results show that the proposed method can threshold the text and remove interference effectively, with a recognition accuracy of 93.38%. This result is compared with four other methods.
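
For reference, classic Niblack thresholding, on which the proposed binarization builds, computes a local threshold T = m + k·s from the local mean m and standard deviation s. The window size and k below are conventional defaults, and the seal/handwriting-removal extensions of the paper are not reproduced.

import numpy as np
from scipy.ndimage import uniform_filter

def niblack_threshold(img, window=25, k=-0.2):
    # Niblack: T = local mean + k * local standard deviation.
    img = img.astype(float)
    mean = uniform_filter(img, window)
    sq_mean = uniform_filter(img**2, window)
    std = np.sqrt(np.maximum(sq_mean - mean**2, 0))
    return img > mean + k * std            # True = background/white

# Hypothetical grayscale prescription scan.
scan = np.random.randint(0, 255, (200, 300)).astype(np.uint8)
binary = niblack_threshold(scan)
print(binary.mean())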

Wen-Hong Zhang, Teng-Hui Tseng, Chun-Ming Tsai
Neuromuscular Fatigue Analysis of Soldiers Using DWT Based EMG and EEG Data Fusion During Load Carriage

This research reports the peripheral and central fatigue of soldiers during load carriage on a treadmill. Electromyography (EMG) was used to investigate peripheral fatigue of the lower extremity muscles, and electroencephalography (EEG) was used for central fatigue detection on the frontal lobe of the brain. EMG data were processed using Db5 and Rbio3.1 discrete wavelet transforms with six levels of decomposition, and EEG data were iteratively transformed into multi-resolution subsets of coefficients using the Db8 wavelet function to perform the power spectrum analysis of alpha, beta and theta waves. The peak alpha frequency (PAF) was also calculated for the EEG signals. The majority of significant results (p < 0.05) from the EMG signals were observed in the lower extremity muscles using the Db5 wavelet function under all conditions, whereas at the frontal cortex significant changes were only observed under unloaded conditions. Significant changes (p < 0.05) in the PAF were also detected under certain conditions in the pre-frontal and frontal cortex. A significant increase in heart rate and in ratings of perceived exertion was seen under all conditions. Hence, peripheral fatigue was the cause of the exhaustion sustained by the soldiers during load carriage, which sends signals to the brain for the decision to stop the exercise.
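
A minimal sketch of the six-level db5 decomposition mentioned in the abstract, using the PyWavelets package on a placeholder EMG epoch; the relative sub-band energies are one common summary statistic, not necessarily the authors' exact measure.

import numpy as np
import pywt  # PyWavelets

# Placeholder 1-second EMG epoch sampled at 1 kHz.
emg = np.random.randn(1000)

# Six-level discrete wavelet decomposition with the db5 wavelet,
# mirroring the Db5 / six-level setup described in the abstract.
coeffs = pywt.wavedec(emg, 'db5', level=6)

# Relative energy per sub-band is a common fatigue indicator.
energies = np.array([np.sum(c**2) for c in coeffs])
print(energies / energies.sum())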

D. N. Filzah P. Damit, S. M. N. Arosha Senanayake, Owais A. Malik, Nor Jaidi Tuah
Manifold Methods for Action Recognition

Among the broad spectrum of published methods for recognizing human actions in video sequences, one approach stands out, differing from the rest by not relying on detection of interest points or events, extraction of features, region segmentation or finding trajectories, which are all prone to errors. It is based on representing a time segment of a video sequence as a point on a manifold, and uses a geodesic distance defined on the manifold for comparing and classifying video segments. A manifold-based representation of a video sequence is obtained starting with a 3D array of consecutive image frames, or a 3rd-order tensor, which is decomposed into three 3 × k arrays that are mapped to a point on a manifold. This article presents a review of manifold-based methods for human activity recognition, as well as of sparse-coding methods for images that also rely on a manifold representation. Results of a human activity classification experiment using an implemented action recognition method based on a manifold representation illustrate the presentation.
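
One concrete instance of a geodesic distance on a manifold of subspaces (a Grassmann manifold), computed from principal angles; the mapping from video-segment tensors to subspaces is only indicated in the comments, and the data below is random.

import numpy as np

def grassmann_distance(A, B):
    # Geodesic distance between span(A) and span(B): the 2-norm of
    # the principal angles, obtained from the SVD of the product of
    # orthonormal bases of the two subspaces.
    Qa, _ = np.linalg.qr(A)
    Qb, _ = np.linalg.qr(B)
    s = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    angles = np.arccos(np.clip(s, -1.0, 1.0))
    return np.linalg.norm(angles)

# Each video segment is a 3rd-order tensor; unfolding it along one
# mode and keeping k leading singular vectors maps the segment to a
# point on a Grassmann manifold (placeholder data below).
rng = np.random.default_rng(0)
seg1 = rng.normal(size=(100, 5))   # k = 5 basis, ambient dim 100
seg2 = rng.normal(size=(100, 5))
print(grassmann_distance(seg1, seg2))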

Agnieszka Michalczuk, Kamil Wereszczyński, Jakub Segen, Henryk Josiński, Konrad Wojciechowski, Artur Bąk, Sławomir Wojciechowski, Aldona Drabik, Marek Kulbacki
Optical Flow Based Face Anonymization in Video Sequences

In this paper we present a method for the anonymization of people's faces in video, with results analyzed on the basis of optical flow methods. The anonymization is based on face detection. Because of the mistakes made by such detectors in video sequences, gaps and false detections appear. These are recognized using the results of face detection together with optical flow analysis. In this paper we describe face detectors and the results of an optical-flow-based analysis method. We present a novel method of filling gaps and recognizing false detections with the use of optical flow, and then present visual results.

Kamil Wereszczyński, Agnieszka Michalczuk, Jakub Segen, Magdalena Pawlyta, Artur Bąk, Jerzy Paweł Nowacki, Marek Kulbacki
An Analysis of the Centre of Mass Behavior During Treadmill Walking

The authors present preliminary results of an analysis of the behavior of the centre of mass during treadmill walking by means of the sample entropy, which quantifies the regularity of a time series. The research focuses on the centre of mass trajectories in the mediolateral, anteroposterior and longitudinal axes, recorded using motion capture. From among several entropy measures, the sample entropy was chosen to assess the influence of both walking speed and ground inclination on the regularity of the movements of the centre of mass. The results were compared with the sample entropy values for periodic, chaotic and stochastic signals.
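
The sample entropy measure the authors chose can be sketched directly from its definition; m = 2 and r = 0.2 below are the customary defaults, not necessarily the study's settings.

import numpy as np

def sample_entropy(x, m=2, r=0.2):
    # SampEn(m, r) = -ln(A / B), where B counts template pairs of
    # length m within tolerance r*std (Chebyshev distance), and A
    # counts the same for length m + 1.
    x = np.asarray(x, float)
    tol = r * x.std()
    def count(mm):
        templ = np.array([x[i:i + mm] for i in range(len(x) - mm)])
        d = np.abs(templ[:, None] - templ[None, :]).max(axis=2)
        return (d <= tol).sum() - len(templ)   # exclude self-matches
    return -np.log(count(m + 1) / count(m))

# A regular signal yields low SampEn; noise yields a higher value.
t = np.linspace(0, 20 * np.pi, 600)
print(sample_entropy(np.sin(t)), sample_entropy(np.random.randn(600)))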

Henryk Josiński, Adam Świtoński, Agnieszka Michalczuk, Konrad Wojciechowski, Jerzy Paweł Nowacki
A Bayesian Framework for Chemical Shift Assignment

Nuclear magnetic resonance (NMR) spectroscopy is one of the techniques used in structural biology and drug discovery. A critical step in the analysis of NMR data lies in automating the assignment of NMR signals to nuclei in the studied macromolecules. This procedure, known as sequence-specific resonance assignment, is usually carried out manually. Manual analysis of NMR data results in high costs, lengthy analyses and proneness to user-specific errors. To address this problem, we propose a new Bayesian approach in which resonance assignment is formulated as maximum a posteriori inference over continuous variables.

Adam Gonczarek, Piotr Klukowski, Maciej Drwal, Paweł Świątek

Modern Applications of Machine Learning for Actionable Knowledge Extraction

Frontmatter
Traditional vs. Machine Learning Techniques: Customer Propensity

In today's world there is a need for speedy tools and techniques to convert big data into information. Traditional techniques are robust but might carry the inherent bias of a data scientist's modeling skills. Machine Learning (ML) models make the machine learn from the data and might not suffer from human bias. Some researchers argue that ML techniques can turn models around faster for big data, while others contend that traditional techniques would still outperform when the data is not so big. This paper tests the arguments raised above in the Oil & Gas (ONG) industry, for actionable knowledge extraction on a not-so-big sample of customer data, to predict customers' campaign response. Our experiment reveals that on our data ML does not outclass traditional modeling but is either slightly better or on par in terms of model accuracy, i.e. the percentage of instances classified correctly. It also establishes that although ML results do not improve much on accuracy, some of the models can be developed much faster.

Mamta A. Rajnayak, Snigdha Moitra, Charu Nahata
Analytics on the Impact of Leadership Styles and Leadership Outcome

Data mining and machine learning approaches are helping organisations to streamline and optimise their processes and assisting management in devising more effective and efficient business strategies. In organisations, the management or leadership style plays an important role in employees' motivation, work satisfaction, work commitment and productivity. Moreover, leadership style shapes the organisational culture. The aim of this paper is to explore the impact of various leadership styles on employees using data mining techniques. Significant research has taken place in this area using simple statistical methods such as correlation and regression analysis. However, no research in this area has used data mining algorithms to extract useful information from the data to examine how leadership styles influence employees. In this research, rule- and decision-tree-based algorithms are used to extract actionable information from the collected data.
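
A sketch of how a decision tree yields readable if-then rules of the kind such studies extract; the feature names, synthetic survey data and tree depth are all hypothetical.

import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

# Placeholder survey data: columns could encode leadership-style
# scores (e.g., transformational, transactional, laissez-faire) and
# the target a binary "high work satisfaction" flag.
rng = np.random.default_rng(0)
X = rng.integers(1, 6, size=(200, 3))
y = (X[:, 0] >= 4).astype(int)             # synthetic relationship

tree = DecisionTreeClassifier(max_depth=2).fit(X, y)
# Readable if-then rules are the "actionable information" here.
print(export_text(tree, feature_names=[
    'transformational', 'transactional', 'laissez_faire']))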

Waseem Ahmad, Muhammad Akhtaruzamman
An Investigation into the Relationship of Strategic Planning Practices and Organizational Performance Using Advanced Data Mining Techniques

This paper presents an investigation of the strategic planning practices of small businesses in Rotorua. The Rotorua district has a population of around 70,000 residents and is located in the heart of New Zealand's North Island. Tourism is Rotorua's largest employer, contributing around $593 million per year, or 10% of the district's economy. The strong link between planning and performance is well understood for larger organisations, but less so for small and medium-sized firms. This study explores the degree of strategic planning undertaken by Rotorua-based businesses. The research methodology comprised a combination of quantitative and qualitative data collection via a structured questionnaire and a semi-structured interview. The collected data were analyzed using advanced machine learning (classification) techniques. Findings suggest that a firm's industry sector influences the degree of formal planning. Other factors that influence the degree of planning formality include the age of the business, annual revenue, and the number of employees. More significantly, this study observes that formal business training and the age of the business are the two features responsible for separating more formal planners from less formal ones.

Philip Bright, Waseem Ahmad, Uswa Zahra

Mathematics of Decision Sciences and Information Science

Frontmatter
A Ranking Procedure with the Shapley Value

This paper considers the problem of electing candidates for a certain position based on ballots filled in by voters. We suggest a voting procedure using cooperative game theory methods. For this, it is necessary to construct a characteristic function from the preference profile of the voters. The Shapley value serves as the ranking method: the winner is the candidate with the maximum Shapley value. Finally, we explore the properties of the designed procedure.
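
A minimal sketch of ranking by the Shapley value; the characteristic function below (a coalition's worth is the number of voters ranking one of its candidates first) is a deliberately simple stand-in, not the paper's construction.

from itertools import permutations

def shapley(players, v):
    # Shapley value: average marginal contribution of each player
    # over all orderings of coalition formation.
    phi = {p: 0.0 for p in players}
    perms = list(permutations(players))
    for order in perms:
        coalition = frozenset()
        for p in order:
            phi[p] += v(coalition | {p}) - v(coalition)
            coalition = coalition | {p}
    return {p: s / len(perms) for p, s in phi.items()}

# Hypothetical preference profile: each ballot ranks the candidates.
profile = [('a', 'b', 'c'), ('b', 'a', 'c'), ('a', 'c', 'b')]
v = lambda S: sum(1 for ballot in profile if ballot[0] in S)
scores = shapley({'a', 'b', 'c'}, v)
print(max(scores, key=scores.get))  # the winner: 'a'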

Aleksei Kondratev, Vladimir Mazalov
Optimal Design of Robust Combinatorial Mechanisms for Substitutable Goods

In this paper we consider a multidimensional mechanism design problem for selling discrete substitutable items to a group of buyers. Previous work on this problem has mostly focused on a stochastic description of the valuations used by the seller. However, in certain applications no prior information regarding buyers' preferences is known. To address this issue, we consider uncertain valuations and formulate the problem in a robust optimization framework: the objective is to minimize the maximum regret. For a special case of the revenue-maximizing pricing problem we present a solution method based on a mixed-integer linear programming formulation.
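
The minimax-regret objective can be illustrated on the simplest special case, one item and one buyer whose valuation is only known to lie in an interval, by brute force over a price grid; the paper's combinatorial MILP formulation is not reproduced here.

import numpy as np

# Single item, one buyer, valuation known only to lie in [a, b].
a, b = 2.0, 10.0
prices = np.linspace(a, b, 401)
vals = np.linspace(a, b, 401)

# regret(p, v) = best achievable revenue (price the item at v)
# minus realized revenue (p if the buyer accepts, else 0).
revenue = np.where(vals[None, :] >= prices[:, None],
                   prices[:, None], 0.0)
regret = vals[None, :] - revenue
worst = regret.max(axis=1)                 # max regret per price
print(prices[worst.argmin()])              # about b / 2 = 5.0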

Maciej Drwal
Communication and KP-Model

This paper treats a Bayesian routing problem from the epistemic point of view. We discuss the role of communication among all users about their individual conjectures on the others' selections of channels in the network game. We focus on the expectations of social costs and the corresponding individual conjectures, and we show that a revision process of all users' conjectures on the expected social costs, carried out through communication of messages about these conjectures among all users, yields a Nash equilibrium for the social cost in the underlying KP-model.

Takashi Matsuhisa

Scalable Data Analysis in Bioinformatics and Biomedical Informatics

Frontmatter
Orchestrating Task Execution in Cloud4PSi for Scalable Processing of Macromolecular Data of 3D Protein Structures

The growing amount of biological data, including macromolecular data describing 3D protein structures, encourages the scientific community to reach for the computing resources of the cloud in order to process and analyze the data on a large scale. Among the many analytical processes performed in bioinformatics, this applies to protein structure alignment and similarity searching. In this paper, we show a parameter-sweep-based approach to scheduling computations related to massive 3D protein structure alignments performed with the Cloud4PSi system running on the Microsoft Azure public cloud.
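
A parameter sweep in its simplest form enumerates the cross product of input settings into independent tasks; the parameter names and values below are illustrative, not Cloud4PSi's actual configuration.

from itertools import product

# Hypothetical sweep: each combination of query structure, alignment
# algorithm and batch size becomes an independent task that a cloud
# worker role can dequeue and execute.
query_proteins = ['1A0R', '2HBS', '3CQV']    # placeholder PDB codes
algorithms = ['jCE', 'jFATCAT']              # assumed algorithm names
batch_sizes = [10, 50]

tasks = [{'query': q, 'alg': a, 'batch': b}
         for q, a, b in product(query_proteins, algorithms, batch_sizes)]
print(len(tasks), 'independent alignment tasks queued')  # 12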

Dariusz Mrozek, Artur Kłapciński, Bożena Małysiak-Mrozek
Probabilistic Neural Network Inferences on Oligonucleotide Classification Based on Oligo: Target Interaction

Oligonucleotides are small non-coding regulatory RNA or DNA sequences that bind to specific mRNA locations to impart gene regulation. Identifying oligonucleotides among other small non-coding RNA sequences, such as miRNAs and piRNAs, is still challenging, as oligos exhibit a notable overlap in sequence length and properties with these RNA categories. This work focuses on a probabilistic oligonucleotide classification method based on distinct underlying feature vectors to identify oligos among other regulatory classes. We propose a computational approach using a probabilistic neural network (PNN) based on oligo:target binding characteristics. The performance measures showed promising results when compared with other existing computational methods. The role and contribution of the extracted features were estimated using receiver operating curves. Our study suggests the potential of probabilistic approaches over non-probabilistic techniques in oligonucleotide classification problems.
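
A probabilistic neural network is essentially a Parzen-window classifier: one Gaussian kernel per training pattern, averaged per class in the summation layer. A minimal sketch on placeholder feature vectors (the smoothing parameter and features are assumptions):

import numpy as np

def pnn_predict(X_train, y_train, x, sigma=0.5):
    # Pattern layer: one Gaussian kernel per training example.
    # Summation layer: average kernel response per class.
    # Output layer: class with the largest estimated density.
    classes = np.unique(y_train)
    dens = []
    for c in classes:
        d = X_train[y_train == c] - x
        k = np.exp(-np.sum(d**2, axis=1) / (2 * sigma**2))
        dens.append(k.mean())
    return classes[int(np.argmax(dens))]

# Placeholder oligo feature vectors (e.g., binding-derived features).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (20, 4)), rng.normal(3, 1, (20, 4))])
y = np.array([0] * 20 + [1] * 20)          # 0 = other RNA, 1 = oligo
print(pnn_predict(X, y, X[25]))            # expected: 1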

Abdul Rahiman Anusha, S. S. Vinodchandra
Scalability of a Genomic Data Analysis in the BioTest Platform

The BioTest platform is dedicated to the processing of biomedical data that originate from various measurement techniques. This includes next-generation sequencing (NGS), which attracts the attention of researchers all over the world due to its broad possibilities in determining the structure of DNA and RNA. However, the analysis of data provided by NGS requires large disk space and is time-consuming, becoming a challenge for data processing systems. In this paper, we have analyzed the possibility of scaling the BioTest platform in terms of genomic data analysis and platform architecture. Scalability tests were carried out using next-generation sequencing data and relied on methods for the detection of somatic mutations and polymorphisms in human DNA. Our results show that the platform is scalable, allowing a significant reduction in the execution time of the performed calculations. However, the scalability depends on the experiment methodology and the homogeneity of the resources required by each task, which in NGS studies can be highly variable.

Krzysztof Psiuk-Maksymowicz, Dariusz Mrozek, Roman Jaksik, Damian Borys, Krzysztof Fujarewicz, Andrzej Swierniak
Quantifying the Effect of Metapopulation Size on the Persistence of Infectious Diseases in a Metapopulation

We investigate the three-dimensional relationship between periodicity, persistence and synchronization and its effect on disease persistence in a metapopulation. Persistence is dominated by synchronization effects, while synchronization is dominated by the coupling strength and the interaction between local population size and human movement. Here we focus on the important role of population size in disease persistence. We implement simulations of stochastic dynamics in a susceptible-exposed-infectious-recovered (SEIR) metapopulation model in space. Applying the continuous-time Markov description of the deterministic model equations, the direct method of Gillespie [10], from the class of Monte-Carlo simulation methods, allows us to simulate exactly the transmission of diseases through the seasonally forced and spatially structured SEIR metapopulation model. Our findings show that disease persistence in the metapopulation can be formulated as an exponential survival model on data simulated by the stochastic model. Increasing the metapopulation size leads to a clear decrease of both local and global extinction rates. The curve of the coupling rate against the extinction rate resembles a convex function, attains its minimum in the medium interval, and has a curvature directly proportional to the metapopulation size.
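
Gillespie's direct method can be illustrated for a single SEIR patch, without the coupling and seasonal forcing of the full metapopulation model; all rates and population sizes below are illustrative.

import numpy as np

def gillespie_seir(beta, sigma, gamma, S, E, I, R, t_end=100.0):
    # Gillespie's direct method: draw the time to the next event
    # from an exponential distribution with the total rate, then
    # pick the event proportionally to its individual rate.
    rng = np.random.default_rng(0)
    t = 0.0
    while t < t_end and (E + I) > 0:
        N = S + E + I + R
        rates = np.array([beta * S * I / N,   # infection  S -> E
                          sigma * E,          # incubation E -> I
                          gamma * I])         # recovery   I -> R
        total = rates.sum()
        t += rng.exponential(1.0 / total)
        event = rng.choice(3, p=rates / total)
        if event == 0:   S -= 1; E += 1
        elif event == 1: E -= 1; I += 1
        else:            I -= 1; R += 1
    return t  # extinction (or stop) time of the local epidemic

print(gillespie_seir(0.5, 1/8, 1/5, S=995, E=0, I=5, R=0))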

Cam-Giang Tran-Thi, Marc Choisy, Jean Daniel Zucker
Large-Scale Data Classification System Based on Galaxy Server and Protected from Information Leak

In this work we present SPICY (SPecialized Classification sYstem), an application for supervised data analysis (feature selection, classification, model validation and model selection) with a structure that prevents the data processing workflow from the so-called information leak. An information leak may result in an optimistically biased assessment of classification quality, especially for large-scale, small-sample data sets. The application uses the Galaxy Server environment, which originally allows the user to process data manually and is not protected against the information leak. The way the classification model is built by the user and the specific structure of all implemented methods make an information leak impossible. The absence of information leak in the presented supervised data analysis tool is demonstrated on numerical examples using synthetic and real data sets.
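
The information leak itself is easy to demonstrate: selecting features on the full data before cross-validation inflates the accuracy estimate even on label-free noise, whereas selecting inside each fold does not. A sketch with scikit-learn (SPICY itself is Galaxy-based; this only illustrates the phenomenon):

import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.svm import LinearSVC
from sklearn.model_selection import cross_val_score

# Small-sample, large-scale data: 40 samples, 2000 features, labels
# random, so the true accuracy should be near chance (0.5).
rng = np.random.default_rng(0)
X, y = rng.normal(size=(40, 2000)), rng.integers(0, 2, 40)

# Correct: feature selection happens inside each CV fold, so no
# information from the test fold leaks into the selected features.
pipe = Pipeline([('select', SelectKBest(f_classif, k=10)),
                 ('clf', LinearSVC())])
print(cross_val_score(pipe, X, y, cv=5).mean())   # near 0.5

# Leaky: selecting on the full data first inflates the estimate,
# typically well above chance despite the random labels.
X_sel = SelectKBest(f_classif, k=10).fit_transform(X, y)
print(cross_val_score(LinearSVC(), X_sel, y, cv=5).mean())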

Krzysztof Fujarewicz, Sebastian Student, Tomasz Zielański, Michał Jakubczak, Justyna Pieter, Katarzyna Pojda, Andrzej Świerniak

Technological Perspective of Agile Transformation in IT organizations

Frontmatter
Agents of RUP Processes Model for IT Organizations Readiness to Agile Transformation Assessment

A significant problem in modern software engineering is the maturity assessment of organizations developing software. We propose the use of the process model of the RUP development methodology as a pattern for comparison with the tested project. Percentage values of the accordance coefficient determine the task accordance of the tested project with the pattern of the activity flow. This RUP model concept is based on multi-agent based simulation (MABS). It presents agents and their behaviours as well as objects placed in the agent system environment. The behaviour of agents is presented as a set of finite state automatons. The usefulness of the method for assessing an organization's maturity was examined in a two-part experiment. The result of the first part of the experiment was used in the second part as the process pattern to determine the accordance of a sample project with the result of the simulated model. The results confirmed the usefulness of the model in maturity assessment.

Włodzimierz Wysocki, Cezary Orłowski, Artur Ziółkowski, Grzegorz Bocewicz
Evaluation of Readiness of IT Organizations to Agile Transformation Based on Case-Based Reasoning

Nowadays many IT organizations decide to change their way of delivering software from the classic waterfall approach to agile. This transition is called "agile transformation" (AT). The problem with this process is that some companies start an AT without any prior analysis. As a result, many transitions fail and organizations must return to their old methods of delivering. The cost of such a return is significant, and the number of projects with a violated project management triangle is larger than before. In this paper the authors describe the results of the conducted research and a model for evaluating readiness for agile transformation based on case-based reasoning.
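
In its simplest retrieval-only form, case-based reasoning finds the most similar past case and reuses its outcome; the features, cases and outcomes below are entirely hypothetical.

import numpy as np

# Hypothetical case base: each row describes a past organization by
# numeric features (e.g., team size, projects per year, fraction of
# staff trained in agile), with the recorded transformation outcome.
cases = np.array([[50, 10, 0.2],
                  [200, 40, 0.6],
                  [30, 5, 0.8]])
outcomes = ['failed', 'succeeded', 'succeeded']

query = np.array([40, 8, 0.7])             # the organization assessed
scale = cases.max(axis=0)                  # crude normalization
d = np.linalg.norm(cases / scale - query / scale, axis=1)
print('predicted readiness outcome:', outcomes[int(d.argmin())])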

Cezary Orłowski, Tomasz Deręgowski, Miłosz Kurzawski, Artur Ziółkowski
Building Dedicated Project Management Process Basing on Historical Experience

The project management process used to manage an IT project can be a key aspect of project success. Existing knowledge does not provide a method enabling IT organizations to choose a project management methodology and processes adjusted to their unique needs. As a result, IT organizations use processes which are not tailored to their specifics and do not meet their basic needs. This paper is an attempt to fill this gap. It describes a method for selecting the management methodologies, processes and engineering practices most adequate to the project characteristics. Choices are made on the basis of the organization's historical experience. The bespoke project management process covers technical and non-technical aspects of software development and is adjusted to unique project challenges and needs. The created process is a hybrid based on the CMMI for Development model; it derives from different sources and uses elements of waterfall and agile approaches, different engineering practices and process improvement methods.

Cezary Orłowski, Tomasz Deręgowski, Miłosz Kurzawski, Artur Ziółkowski
Describing Criteria for Selecting a Scrum Tool Using the Technology Acceptance Model

Scrum teams extensively use tools to support their processes, but little attention has been given to the criteria a Scrum team applies in its selection of such a tool. A greenfield approach was used to explore these criteria. To this end, twelve Scrum teams were asked to list criteria and assign weights in their decision processes. After having chosen and used a tool for a number of Sprints, the teams also evaluated the selected tools. Using the Technology Acceptance Model to structure the findings, two major categories were identified: perceived usefulness, i.e. criteria directly related to Scrum, and perceived ease of use. Most teams listed more or less the same criteria. Within the categories several specific subcategories were distinguished, for instance burn-down chart support or multi-platform aspects. Teams evaluated more issues, positive or negative, within the Scrum-related criteria. The findings indicate that Scrum teams prefer perceived usefulness over perceived ease of use. In other words: specific support for Scrum, especially its artefacts, is of greater value to a team than general tool considerations.

Gerard Wagenaar, Sietse Overbeek, Remko Helms
Backmatter
Metadata
Title
Intelligent Information and Database Systems
Edited by
Ngoc Thanh Nguyen
Satoshi Tojo
Le Minh Nguyen
Bogdan Trawiński
Copyright year
2017
Electronic ISBN
978-3-319-54430-4
Print ISBN
978-3-319-54429-8
DOI
https://doi.org/10.1007/978-3-319-54430-4