2010 | Book

Artificial Intelligence Applications and Innovations

6th IFIP WG 12.5 International Conference, AIAI 2010, Larnaca, Cyprus, October 6-7, 2010. Proceedings

Editors: Harris Papadopoulos, Andreas S. Andreou, Max Bramer

Publisher: Springer Berlin Heidelberg

Book Series: IFIP Advances in Information and Communication Technology

About this book

The abundance of information and increase in computing power currently enable researchers to tackle highly complicated and challenging computational problems. Solutions to such problems are now feasible using advances and innovations from the area of Artificial Intelligence. The general focus of the AIAI conference is to provide insights on how Artificial Intelligence may be applied in real-world situations and serve the study, analysis and modeling of theoretical and practical issues. This volume contains the papers selected for presentation at the 6th IFIP Conference on Artificial Intelligence Applications and Innovations (AIAI 2010), held in Larnaca, Cyprus, during October 6–7, 2010. IFIP AIAI 2010 was co-organized by the University of Cyprus and the Cyprus University of Technology and was sponsored by the Cyprus University of Technology, Frederick University and the Cyprus Tourism Organization. AIAI 2010 is the official conference of the WG12.5 “Artificial Intelligence Applications” working group of IFIP TC12, the International Federation for Information Processing Technical Committee on Artificial Intelligence (AI). AIAI is a conference that grows in significance every year, attracting researchers from different countries around the globe. It maintains high quality standards and welcomes research papers describing technical advances and engineering and industrial applications of intelligent systems. AIAI 2010 was not confined to introducing how AI may be applied in real-life situations, but also included innovative methods, techniques, tools and ideas of AI expressed at the algorithmic or systemic level.

Table of Contents

Frontmatter

Invited Talks

How Artificial Intelligence May Be Applied in Real World Situations

In the modern information era, managers must recognize the competitive opportunities represented by decision-support tools. A new family of such systems, based on recent advances in Artificial Intelligence, combines prediction and optimization techniques to assist decision makers in complex, rapidly changing environments. These systems address two fundamental questions: What is likely to happen in the future? And what is the best course of action? These modern AI systems include elements of data mining, predictive modelling, forecasting, optimization, and adaptability, and aim at providing significant cost savings and revenue increases for businesses. The talk introduces the concepts behind the construction of such systems and indicates the current challenging research issues. Several real-world examples will be shown and discussed.

Zbigniew Michalewicz
Modern Machine Learning Techniques and Their Applications to Medical Diagnostics

The talk presents several machine learning techniques and their applications to clinical decision-making. In many problems of computer-aided medical diagnosis and treatment, a program must be capable of learning from previously accumulated patient data records and extrapolating to make a diagnosis for a new patient by considering their symptoms. Many machine learning and statistical techniques have been developed to help in clinical decision making, among them decision trees, Bayesian techniques, discriminant analysis, neural networks and many others. These techniques usually deal with conventional, small-scale, low-dimensional problems, and their application to modern high-dimensional data sets with many thousands of attributes (symptoms) usually leads to serious computational problems. Several new techniques, such as the Support Vector Machine (SVM), have been developed to tackle the problem of dimensionality by transferring the problem into a high-dimensional space and solving it in that space. They are based on so-called kernel methods and can very often solve high-dimensional problems with good accuracy. However, a typical drawback of techniques such as the SVM is that they usually do not provide any useful measure of confidence for new, unclassified examples (new patients). Recently a new set of techniques, called Conformal Predictors, has been developed that allows predictions to be made with valid measures of confidence. The approach is based on approximations to the universal measures of confidence given by the algorithmic theory of randomness and allows us to compute diagnostic classes and estimate the confidence of the diagnostics for high-dimensional data. The talk will present Conformal Predictors and their applications in medicine.
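As a rough illustration of the conformal prediction idea mentioned in the talk (a generic sketch, not the speaker's algorithm; the nonconformity scores are assumed to come from some underlying classifier):

    import numpy as np

    def conformal_p_value(calibration_scores, test_score):
        # calibration_scores: nonconformity scores of past (calibration) patients,
        # computed under the candidate diagnostic label
        # test_score: nonconformity score of the new patient under that label
        scores = np.asarray(calibration_scores, dtype=float)
        return (np.sum(scores >= test_score) + 1) / (len(scores) + 1)

    # A diagnostic label is kept in the prediction region when its p-value exceeds
    # the chosen significance level, e.g. p > 0.05 for 95% confidence.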

Alexander Gammerman
Innovative Applications of Artificial Intelligence Techniques in Software Engineering

Artificial Intelligence (AI) techniques have been successfully applied in many areas of software engineering. The complexity of software systems has limited the application of AI techniques in many real-world applications. This talk provides an insight into applications of AI techniques in software engineering and into how the innovative application of AI can assist in achieving ever more competitive and firm schedules for software development projects as well as Information Technology (IT) management. The pros and cons of using AI techniques are investigated, and specifically the application of AI in IT management, software application development and software security is considered.

Organisations that build software applications do so in an environment characterised by limited resources and increased pressure to reduce costs and development schedules. Organisations need to build software applications adequately and quickly. One approach to achieving this is to use automated software development tools from the very initial stage of software design up to software testing and installation. Considering software testing as an example, automated software systems can assist in most software testing phases.

On the other hand, data security, availability, privacy and integrity are very important issues for the success of a business operation. Data security and privacy policies in business are governed by business requirements and government regulations. AI can also assist in software security, privacy and reliability. Implementing data security using data encryption solutions remains at the forefront of data security. Many solutions to data encryption at this level are expensive, disruptive and resource intensive. AI can be used for data classification in organizations. It can assist in identifying and encrypting only the relevant data, thereby saving time and processing power. Without data classification, organizations using encryption would simply encrypt everything and consequently impact users more than necessary. Data classification is essential and can assist organizations with their data security, privacy and accessibility needs. This talk explores the use of AI techniques (such as fuzzy logic) for data classification and suggests a method that can determine requirements for the classification of organizations’ data for security and privacy based on organizational needs and government policies. Finally, the application of Fuzzy Cognitive Maps (FCMs) in IT management is discussed.

Masoud Mohammadian

Machine Learning

Linear Probability Forecasting

In this paper we consider two online multi-class classification problems: classification with linear models and with kernelized models. The predictions can be thought of as probability distributions. The quality of predictions is measured by the Brier loss function. We suggest two computationally efficient algorithms for these problems; the second algorithm is derived by considering a new class of linear prediction models. We prove theoretical guarantees on the cumulative losses of the algorithms. We kernelize one of the algorithms and prove theoretical guarantees on the loss of the kernelized version. We perform experiments and compare our algorithms with logistic regression.
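For reference, a minimal sketch of the Brier loss used to score a predicted probability distribution against the observed class (my own illustration, not code from the paper):

    import numpy as np

    def brier_loss(p, true_class):
        # p: predicted probability distribution over the classes (should sum to 1)
        # true_class: index of the class that was actually observed
        p = np.asarray(p, dtype=float)
        y = np.zeros_like(p)
        y[true_class] = 1.0
        return float(np.sum((p - y) ** 2))

    print(brier_loss([0.8, 0.1, 0.1], 0))  # about 0.06: a confident, correct forecast costs little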

Fedor Zhdanov, Yuri Kalnishkan
The Importance of Similarity Metrics for Representative Users Identification in Recommender Systems

In this paper we explore the efficiency of recommendations provided by representative users on behalf of cluster members. Clustering is used to moderate the scalability and diversity issues faced by most recommendation algorithms. We show through extended evaluation experiments that cluster representatives make successful recommendations, outperforming the K-nearest neighbor approach that is common in recommender systems based on collaborative filtering. However, the selection of representative users depends heavily on the similarity metric that is used to identify users with similar preferences. It is shown that the use of different similarity metrics leads, in general, to different representative users, while the commonly used Pearson coefficient is the poorest similarity metric in terms of representative user identification.
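As an illustration of the similarity metrics under discussion, a generic Pearson coefficient between two users, computed over their co-rated items only (a sketch, not the authors' implementation; the dictionary-based rating format is an assumption):

    import numpy as np

    def pearson_similarity(ratings_a, ratings_b):
        # ratings_a, ratings_b: dicts mapping item id -> rating for two users
        common = sorted(set(ratings_a) & set(ratings_b))
        if len(common) < 2:
            return 0.0                                  # too few co-rated items to correlate
        a = np.array([ratings_a[i] for i in common], dtype=float)
        b = np.array([ratings_b[i] for i in common], dtype=float)
        a, b = a - a.mean(), b - b.mean()
        denom = np.sqrt((a ** 2).sum() * (b ** 2).sum())
        return float(a @ b / denom) if denom > 0 else 0.0

    print(pearson_similarity({1: 5, 2: 3, 3: 4}, {1: 4, 2: 2, 3: 5}))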

Olga Georgiou, Nicolas Tsapatsoulis
An Optimal Scaling Approach to Collaborative Filtering Using Categorical Principal Component Analysis and Neighborhood Formation

Collaborative Filtering (CF) is a popular technique employed by Recommender Systems, a term used to describe intelligent methods that generate personalized recommendations. The most common and accurate approaches to CF are based on latent factor models. Latent factor models can tackle two fundamental problems of CF, data sparsity and scalability, and have received considerable attention in recent literature. In this work, we present an optimal scaling approach to address both of these problems using Categorical Principal Component Analysis for the low-rank approximation of the user-item ratings matrix, followed by a neighborhood formation step. The optimal scaling approach has the advantage that it can be easily extended to the case where there are missing data, and restrictions for ordinal and numerical variables can be easily imposed. We considered different measurement levels for the user ratings on items, starting with a multiple nominal level and consecutively applying nominal, ordinal and numeric levels. Experiments were executed on the MovieLens dataset, aiming to evaluate the aforementioned options in terms of accuracy. Results indicated that a combined approach (multiple nominal measurement level with a “passive” missing data strategy) clearly outperformed the other tested options.

Angelos I. Markos, Manolis G. Vozalis, Konstantinos G. Margaritis
A Classroom Observation Model Fitted to Stochastic and Probabilistic Decision Systems

This paper focuses on solving the problems of preparing and normalizing data that are captured from a classroom observation and are linked with significant relevant properties. We adapt these data using a Bayesian model that creates normalization conditions for a well-fitted artificial neural network. We separate the method into two stages: first, implementing the data variable in a functional multi-factorial normalization analysis using a normalizing constant, and then using the constructed vectors containing normalization values in the learning and testing stages of the selected learning vector quantization neural network.

Marios Poulos, Vassilios S. Belesiotis, Nikolaos Alexandris
Prediction with Confidence Based on a Random Forest Classifier

Conformal predictors represent a new flexible framework that outputs region predictions with a guaranteed error rate. The efficiency of such predictions depends on the nonconformity measure that underlies the predictor. In this work we designed new nonconformity measures based on a random forest classifier. Experiments demonstrate that the proposed conformal predictors are more efficient than current benchmarks on noisy mass spectrometry data (and at least as efficient on other types of data) while maintaining the property of validity: they output fewer multiple predictions, and the ratio of mistakes does not exceed the preset level. When forced to produce singleton predictions, the designed conformal predictors are at least as accurate as the benchmarks and sometimes significantly outperform them.

Dmitry Devetyarov, Ilia Nouretdinov

Fuzzy Logic Techniques

A Generic Tool for Building Fuzzy Cognitive Map Systems

A generic system for simulating complex dynamical systems along the paradigm of fuzzy cognitive maps (FCM) has been created and tested. The proposed system enables a user to design appropriate FCM structures by specifying the desired concepts and the various parameters, such as sensitivities, as well as a variety of shaping functions. The user is able to see the results, change the parameters, modify the functions, and rerun the system, observing the alteration of the final results and drawing new conclusions. The system is introduced and demonstrated using a simple real case. The results of a usability test of the system suggest that the system is capable of simulating complicated FCM structures in an effective manner, helping the user to reduce the degree of risk during decision making.
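A minimal sketch of the FCM simulation step such a tool implements (the standard activation update with a sigmoid shaping function; the concepts and weights below are purely illustrative):

    import numpy as np

    def fcm_step(activations, weights, shaping=lambda x: 1.0 / (1.0 + np.exp(-x))):
        # activations: current concept activation vector A(t)
        # weights[i, j]: causal influence (sensitivity) of concept i on concept j
        # update: A_j(t+1) = f( A_j(t) + sum_i A_i(t) * w_ij )
        return shaping(activations + activations @ weights)

    A = np.array([0.5, 0.2, 0.8])                 # three illustrative concepts
    W = np.array([[0.0, 0.6, -0.3],
                  [0.4, 0.0, 0.5],
                  [0.0, -0.2, 0.0]])
    for _ in range(20):                           # iterate until the activations settle
        A = fcm_step(A, W)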

Maria Papaioannou, Costas Neocleous, Anastasis Sofokleous, Nicos Mateou, Andreas Andreou, Christos N. Schizas
A Fuzzy Rule-Based Approach to Design Game Rules in a Mission Planning and Evaluation System

Simulations and wargames offer powerful representations to model the mechanics and psychology of military operations, which are inherently complex. They offer mechanisms to predict and assess the effectiveness of mission plans and operations in achieving military objectives. In this paper, we present a new approach to designing the game rules of wargames using fuzzy rule bases, for quantitatively evaluating the effectiveness of air tasking missions. We determine the comparative damage relative to the intended damage for a target, taking into account the effects of operational characteristics to compute the possibilistic damage to the target as opposed to the probability of damage to the target. The cookie-cutter method to compute the damage is modeled as a fuzzy variable. The effectiveness of the mission is obtained by comparing the damage to targets with the cost and significance of the target in meeting the mission objectives. Damage assessment computation for targets using fuzzy rule bases gave more realistic results when used in field training and deployment of the system.

D. Vijay Rao, Jasleen Kaur
One-Dimensional Linear Local Prototypes for Effective Selection of Neuro-Fuzzy Sugeno Model Initial Structure

We consider a Takagi-Sugeno-Kang (TSK) fuzzy rule-based system used to model a memory-less nonlinearity from numerical data. We develop a simple and effective technique that allows us to remove irrelevant inputs, choose the number of membership functions for each input, and propose well-estimated starting values of membership functions and consequent parameters. All this will make the fuzzy model more concise and transparent. The final training procedure will be shorter and more effective.

Jacek Kabziński
Lasso: Linkage Analysis of Serious Sexual Offences
A Decision Support System for Crime Analysts and Investigators

One of the most important considerations when investigating a serious sexual offence is to find out whether it can be linked to other offences. If this can be done, then there is a considerable dividend in terms of additional evidence and new lines of enquiry. The central problem is the construction of a satisfactory typology of these crimes, but little progress has been made. It is the authors’ contention that difficulties arise from the inadequacy of the classical or ‘crisp set’ paradigm. Complex events like crimes cannot be described satisfactorily in this way, and it is proposed that fuzzy set theory offers a powerful framework within which crime can be portrayed in a sensitive and perceptive manner that can enhance the search for associations between offences.

Don Casey, Phillip Burrell

Evolutionary Computation

Forecasting Euro – United States Dollar Exchange Rate with Gene Expression Programming

In the current paper we present the application of our Gene Expression Programming Environment to forecasting the Euro-United States Dollar exchange rate. Specifically, using the GEP Environment we tried to forecast the value of the exchange rate using its previous values. The data for the EURO-USD exchange rate are available online from the European Central Bank (ECB). The environment was developed using the JAVA programming language and is an implementation of a variation of Gene Expression Programming. Gene Expression Programming (GEP) is a new evolutionary algorithm that evolves computer programs (which can take many forms: mathematical expressions, neural networks, decision trees, polynomial constructs, logical expressions, and so on). The computer programs of GEP, irrespective of their complexity, are all encoded in linear chromosomes. The linear chromosomes are then expressed or translated into expression trees (branched structures). Thus, in GEP, the genotype (the linear chromosomes) and the phenotype (the expression trees) are different entities (both structurally and functionally). This is the main difference between GEP and classical tree-based Genetic Programming techniques.

Maria A. Antoniou, Efstratios F. Georgopoulos, Konstantinos A. Theofilatos, Spiridon D. Likothanassis
Automatically Designing Robot Controllers and Sensor Morphology with Genetic Programming

Genetic programming provides an automated design strategy to evolve complex controllers based on evolution in nature. In this contribution we use genetic programming to automatically evolve efficient robot controllers for a corridor following task. Based on tests executed in a simulation environment we show that very robust and efficient controllers can be obtained. Also, we stress that it is important to provide sufficiently diverse fitness cases, offering a sound basis for learning more complex behaviour. The evolved controller is successfully applied to real environments as well. Finally, controller and sensor morphology are co-evolved, clearly resulting in an improved sensor configuration.

Bert Bonte, Bart Wyns
Multiple Criteria Performance Analysis of Non-dominated Sets Obtained by Multi-objective Evolutionary Algorithms for Optimisation

The paper shows the importance of a multi-criteria performance analysis in evaluating the quality of non-dominated sets. The sets are generated by the use of evolutionary algorithms, more specifically through SPEA2 or NSGA-II. Problem examples from different problem domains are analyzed on four criteria of quality. These four criteria, namely cardinality of the non-dominated set, spread of the solutions, hyper-volume, and set coverage, do not favour any algorithm along the problem examples. In the Multiple Shortest Path Problem (MSPP) examples, the spread of solutions is the decisive factor for the 2S|1M configuration, and the cardinality and set coverage for the 3S configuration. The differences in set coverage values between SPEA2 and NSGA-II in the MSPP are small since both algorithms have almost identical non-dominated solutions. In the Decision Tree examples, the decisive factors are set coverage and hyper-volume. The computations show that the decisive criterion or criteria vary in all examples except for the set coverage criterion. This shows the importance of a binary measure in evaluating the quality of non-dominated sets, as the measure itself tests for dominance. The various criteria are confronted by means of a multi-criteria decision tool.
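For reference, the binary set coverage measure C(A, B) mentioned above can be sketched generically as the fraction of solutions in B weakly dominated by at least one solution in A (my own illustration, assuming minimisation of all objectives):

    def dominates(a, b):
        # a weakly dominates b if a is no worse than b in every objective (minimisation)
        return all(x <= y for x, y in zip(a, b))

    def set_coverage(A, B):
        # C(A, B): fraction of solutions in B covered by at least one solution in A
        covered = sum(1 for b in B if any(dominates(a, b) for a in A))
        return covered / len(B)

    front_a = [(1.0, 4.0), (2.0, 2.0)]
    front_b = [(1.5, 4.5), (3.0, 1.5)]
    print(set_coverage(front_a, front_b))  # 0.5: only (1.5, 4.5) is covered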

Gerrit K. Janssens, José Maria Pangilinan
Efficiency and Robustness of Three Metaheuristics in the Framework of Structural Optimization

Due to the technological advances in computer hardware and software tools, structural optimization has been gaining continuously increasing interest over the last two decades. The purpose of the present work is to quantitatively compare three metaheuristic optimization algorithms, namely the Differential Evolution, Harmony Search and Particle Swarm Optimization methods, in the framework of structural optimization. The comparison of the optimizers is performed with reference to their efficiency (overall computing demands) and robustness (capability to detect near-optimal solutions). The optimum design of a real-world overhead traveling crane is used as the test bed application for conducting optimization test runs.

Nikos D. Lagaros, Dimos C. Charmpis

Medical Informatics and Biomedical Engineering

A Fuzzy Non-linear Similarity Measure for Case-Based Reasoning Systems for Radiotherapy Treatment Planning

This paper presents a decision support system for treatment planning in brain cancer radiotherapy. The aim of a radiotherapy treatment plan is to apply radiation in a way that destroys tumour cells but minimizes the damage to healthy tissue and organs at risk. Treatment planning for brain cancer patients is a complex decision-making process that relies heavily on the subjective experience and expert domain knowledge of clinicians. We propose to capture this experience by using case-based reasoning. Central to the working of our case-based reasoning system is a novel similarity measure that takes into account the non-linear effect of the individual case attributes on the similarity measure. The similarity measure employs fuzzy sets. Experiments carried out to evaluate the similarity measure using real brain cancer patient cases show promising results.

Rupa Jagannathan, Sanja Petrovic, Angela McKenna, Louise Newton
A Soft Computing Approach for Osteoporosis Risk Factor Estimation

This research effort deals with the application of Artificial Neural Networks (ANNs) in order to help the diagnosis of cases with an orthopaedic disease, namely osteoporosis. Probabilistic Neural Networks (PNNs) and Learning Vector Quantization (LVQ) ANNs were developed for the estimation of osteoporosis risk. PNNs and LVQ ANNs are both feed-forward networks; however, they differ in terms of their architecture, structure and optimization approach. The obtained results of successful prognosis over pathological cases lead to the conclusion that in this case the PNNs (96.58%) outperform the LVQ (96.03%) networks; thus they provide an effective potential soft computing technique for the evaluation of osteoporosis risk. The ANN with the best performance was used for the contribution assessment of each risk feature towards the prediction of this disease. Moreover, the available data underwent statistical processing using Receiver Operating Characteristic (ROC) analysis in order to determine the most significant factors for the estimation of osteoporosis risk. The results of the PNN model are in accordance with the ROC analysis and identify age as the most significant factor.

Dimitrios Mantzaris, George Anastassopoulos, Lazaros Iliadis, Konstantinos Kazakos, Harris Papadopoulos
Protein Secondary Structure Prediction with Bidirectional Recurrent Neural Nets: Can Weight Updating for Each Residue Enhance Performance?

Successful protein secondary structure prediction (PSSP) is an important step towards modelling protein 3D structure, with several practical applications. Even though several PSSP algorithms have been proposed over the last four decades, we are still far from achieving high accuracy. The Bidirectional Recurrent Neural Network (BRNN) architecture of Baldi et al. [1] is currently considered one of the optimal computational neural network architectures for addressing the problem. In this paper, we implement the same BRNN architecture but use a modified training procedure. More specifically, our aim is to identify the effect of the contribution of local versus global information by varying the length of the segment on which the Recurrent Neural Networks operate for each residue position considered. For training the network, the backpropagation learning algorithm with an online training procedure is used, where the weight updates occur for every amino acid, as opposed to Baldi et al. [1], where the weight updates are applied after the presentation of the entire protein. Our results with a single BRNN are better than those of Baldi et al. [1] by three percentage points (Q3) and comparable to the results of [1] when they use an ensemble of 6 BRNNs. In addition, our results improve even further when the sequence-to-structure output is filtered in a post-processing step with a novel Hidden Markov Model-based approach.

Michalis Agathocleous, Georgia Christodoulou, Vasilis Promponas, Chris Christodoulou, Vassilis Vassiliades, Antonis Antoniou
Contourlet Transform for Texture Representation of Ultrasound Thyroid Images

Texture representation of ultrasound (US) images is currently considered a major issue in medical image analysis. This paper investigates the texture representation of thyroid tissue via features based on the Contourlet Transform (CT) using different types of filter banks. A variety of statistical texture features based on CT coefficients have been considered through a selection schema. The Sequential Float Feature Selection (SFFS) algorithm with a k-NN classifier has been applied in order to investigate the most representative set of CT features. For the experimental evaluation a set of normal and nodular ultrasound thyroid textures has been utilized. The maximum classification accuracy was 93%, showing that CT based texture features can be successfully applied for the representation of different types of texture in US thyroid images.

Stamos Katsigiannis, Eystratios G. Keramidas, Dimitris Maroulis
Assessment of Stroke Risk Based on Morphological Ultrasound Image Analysis with Conformal Prediction

Non-invasive ultrasound imaging of carotid plaques allows for the development of plaque image analysis in order to assess the risk of stroke. In our work, we provide reliable confidence measures for the assessment of stroke risk, using the Conformal Prediction framework. This framework provides a way for assigning valid confidence measures to predictions of classical machine learning algorithms. We conduct experiments on a dataset which contains morphological features derived from ultrasound images of atherosclerotic carotid plaques, and we evaluate the results of four different Conformal Predictors (CPs). The four CPs are based on Artificial Neural Networks (ANNs), Support Vector Machines (SVMs), Naive Bayes classification (NBC), and k-Nearest Neighbours (k-NN). The results given by all CPs demonstrate the reliability and usefulness of the obtained confidence measures on the problem of stroke risk assessment.

Antonis Lambrou, Harris Papadopoulos, Efthyvoulos Kyriacou, Constantinos S. Pattichis, Marios S. Pattichis, Alexander Gammerman, Andrew Nicolaides

Text Mining and Natural Language Processing

Concept Based Representations as Complement of Bag of Words in Information Retrieval

Information Retrieval models that represent texts not merely as collections of the words they contain, but rather as collections of the concepts they contain (through synonym sets or latent dimensions), are known as Bag-of-Concepts (BoC) representations. In this paper we use random indexing, which uses co-occurrence information among words to generate semantic context vectors, and then represent the documents and queries as BoC. In addition, we use a novel representation, Holographic Reduced Representation, previously proposed in cognitive models, which can encode relations between words. We show that these representations can be successfully used in information retrieval, can associate terms, and, when combined with the traditional vector space model, improve effectiveness in terms of mean average precision.

Maya Carrillo, Aurelio López-López
Information Fusion for Entity Matching in Unstructured Data

Every day the global media system produces an abundance of news stories, all containing many references to people. An important task is to automatically generate reliable lists of people by analysing news content. We describe a system that leverages large amounts of data for this purpose. Lack of structure in this data gives rise to a large number of ways to refer to any particular person. Entity matching attempts to connect references that refer to the same person, usually employing some measure of similarity between references. We use information from multiple sources in order to produce a set of similarity measures with differing strengths and weaknesses. We show how their combination can improve precision without decreasing recall.

Omar Ali, Nello Cristianini
An Example-Tracing Tutor for Teaching NL to FOL Conversion

In this paper we present an Example-tracing Tutor for the conversion of a sentence written in natural language (NL) to a sentence written in first order logic (FOL), which is a basic knowledge representation language. The tutor is based on scripting the process of the NL to FOL conversion and has been authored using the Cognitive Tutoring Authoring Tool (CTAT), in which we have implemented a complete student interface and created a Behavior Recorder graph for the above process.

Themistoklis Chronopoulos, Isidoros Perikos, Ioannis Hatzilygeroudis
Learning the Preferences of News Readers with SVM and Lasso Ranking

We attack the task of predicting which news stories are more appealing to a given audience by comparing ‘most popular stories’, gathered from various online news outlets over a period of seven months, with stories that did not become popular despite appearing on the same page at the same time. We cast this as a learning-to-rank task and train two different learning algorithms to reproduce the preferences of the readers within each of the outlets. The first method is based on Support Vector Machines, the second on the Lasso. By using just words as features, SVM ranking can reach significant accuracy in correctly predicting the preference of readers for a given pair of articles. Furthermore, by exploiting the sparsity of the solutions found by the Lasso, we can also generate lists of keywords that are expected to trigger the attention of the outlets’ readers.
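One common way to set up such a pairwise learning-to-rank task (a sketch under the assumption of a linear SVM on word-count features, with randomly generated stand-in data; not the authors' exact pipeline):

    import numpy as np
    from sklearn.svm import LinearSVC

    # x_popular[k] and x_unpopular[k] stand in for word-count feature vectors of two
    # articles shown on the same page at the same time (random data for illustration).
    rng = np.random.default_rng(0)
    x_popular = rng.random((200, 50))
    x_unpopular = rng.random((200, 50))

    # Each pair yields two training examples: the feature difference labelled +1 and
    # its negation labelled -1, so the learned weight vector induces a ranking score.
    diffs = np.vstack([x_popular - x_unpopular, x_unpopular - x_popular])
    labels = np.concatenate([np.ones(200), -np.ones(200)])

    ranker = LinearSVC().fit(diffs, labels)
    score = ranker.decision_function      # higher score means predicted to be preferred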

Elena Hensinger, Ilias Flaounas, Nello Cristianini

Knowledge Representation and Reasoning

A Comparison of Two Ontology-Based Semantic Annotation Frameworks

The paper compares two semantic annotation frameworks that are designed for unstructured and ungrammatical domains. Both frameworks, namely ontoX (ontology-driven information Extraction) and BNOSA (Bayesian network and ontology based semantic annotation), extensively use ontologies during the knowledge building, rule generation and data extraction phases. Both of them claim to be scalable, as they allow a knowledge engineer to employ them for any other domain by simply plugging the corresponding ontology into the framework. They differ, however, in the ways conflicts are resolved and missing values are predicted. OntoX uses two heuristic measures, named level of evidence and level of confidence, for conflict resolution, while the same task is performed by BNOSA with the aid of Bayesian networks. BNOSA also uses Bayesian networks to predict missing values. The paper compares the performance of BNOSA and ontoX on the same data set and analyzes their strengths and weaknesses.

Quratulain Rajput, Sajjad Haider
A Tool for Automatic Creation of Rule-Based Expert Systems with CFs

This paper introduces a tool, namely ACRES (Automatic CReator of Expert Systems), which can automatically produce rule-based expert systems as CLIPS scripts from a dataset containing knowledge about a problem domain in the form of a large number of cases. The rules are created via a simple systematic approach and make use of certainty factors (CFs). CFs of the same conclusion can be combined using either the MYCIN method or a generalization of MYCIN’s method. The latter method requires the calculation of some weights, based on a training dataset, via the use of a genetic algorithm. The creation of an expert system is outlined. Small-scale experimental results comparing the above methods with each other and with a neural network are finally presented.
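For reference, a short sketch of the classical MYCIN rule for combining certainty factors of the same conclusion (the standard textbook formulation, not the tool's own code):

    def combine_cf(cf1, cf2):
        # MYCIN combination of two certainty factors supporting the same conclusion
        if cf1 >= 0 and cf2 >= 0:
            return cf1 + cf2 * (1 - cf1)
        if cf1 < 0 and cf2 < 0:
            return cf1 + cf2 * (1 + cf1)
        return (cf1 + cf2) / (1 - min(abs(cf1), abs(cf2)))

    print(combine_cf(0.6, 0.5))   # 0.8: two positive pieces of evidence reinforce each other
    print(combine_cf(0.6, -0.4))  # about 0.33: conflicting evidence weakens the belief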

Ioannis Hatzilygeroudis, Konstantinos Kovas
Non-standard Reasoning Services for the Verification of DAML+OIL Ontologies

Ontology plays a pivotal role in the development of the Semantic Web, providing an understanding of various domains that can be communicated between people and applications. Motivated by J. S. Dong’s work, we propose a new approach to interpreting DAML+OIL in a lightweight modeling language for software design, Alloy, which is used to provide a non-standard reasoning service for the verification of DAML+OIL ontologies. To do so, Jena is first used to parse ontology documents into classes, properties and statements; next we use algorithms to translate them into an Alloy model; the Alloy Analyzer is then used to check and reason about this model. The experiments show that our method greatly improves on J. S. Dong’s work and is distinguished from traditional ontology reasoners in property checking and reasoning.

Yingjie Song, Rong Chen
Algorithms for the Reconciliation of Ontologies in Open Environments

The dynamically changing nature of the Semantic Web means that ontologies, which form part of the Semantic Web, need to be constantly modified in order to adapt to the outer environment. In this paper we make a careful analysis of the complexity of ontology changes in an open environment. The main contents discussed are as follows. First, we point out all possible relation types between any two ontology change sequences, including the directly conflicting, indirectly conflicting, dependent and compatible relations, according to the definition of an ontology change. We then propose a new algorithm, named the Algorithm of Searching Maximum and Sequential Ontology Change Sequence Set (ASMSOCSS), to find all maximum and sequential ontology change sequence subsets in the prime ontology change sequence set, and prove the independence of the result obtained after running ASMSOCSS. Finally, we put forward an algorithm that uses these maximum and sequential ontology change sequence sets to create new ontology versions according to the dependence relations between ontology change sequences.

Yaqing Liu, Rong Chen, Hong Yang
Knowledge-Based Support for Software Engineering

The existing ambiguity of the notion of software engineering is mainly due to the fact that it is based on and depends on knowledge. The new definition of the term “software engineering” proposed in this paper takes that fact into account. The main subject of discussion in the paper is how three different types of knowledge, namely declarative explicit, declarative structured (ontologies) and tacit, can be used for effective support of software engineering as both practice and academic subject. Illustrative examples are shown along with some trends towards more intensive use of knowledge for the support of software engineering.

Dencho Batanov

Planning and Scheduling

A Hybrid Searching Method for the Unrelated Parallel Machine Scheduling Problem

This work addresses the NP-hard problem of scheduling a set of jobs on unrelated parallel machines with the overall objective of minimizing makespan. The solution presented proposes a greedy constructive algorithm followed by the application of a Variable Neighborhood Descent strategy that continually improves the incumbent solution until a local optimum is reached. The strength of the approach lies in the adoption of different objectives at various stages of the search to avoid early entrapment in local optima and, mainly, in the hybridization of heuristic methods and mathematical programming for the definition and exploration of neighborhood structures. Experimental results on a large set of benchmark problems attest to the efficacy of the proposed approach.

Christoforos Charalambous, Krzysztof Fleszar, Khalil S. Hindi
Aiding Interactive Configuration and Planning: A Constraint and Evolutionary Approach

This communication proposes a two-step interactive aiding system dealing with product configuration and production planning. The first step assists, interactively and simultaneously, the configuration of a product and the planning of its production process. A second step then completes the two previous tasks thanks to a constrained multi-criteria optimisation that proposes to the user a set of solutions belonging to a Pareto front minimizing cost and cycle time. The first section of the paper introduces the problem. The second proposes a solution for the first step, relying on constraint filtering for both configuration and planning. The following sections propose an evolutionary optimisation process and first computational results.

Paul Pitiot, Michel Aldanondo, Elise Vareilles, Paul Gaborit, Meriem Djefel, Claude Baron
Decentralized Services Orchestration Using Intelligent Mobile Agents with Deadline Restrictions

The need for better performance drives service orchestration towards decentralization. In a recent approach, the integrator - which traditionally centralizes all corporate services and business logic - remains as a repository of interface services, but no longer needs to know all business logic and business workflows. There are several techniques using this approach, including hybrid solutions, peer-to-peer solutions and trigger-based mechanisms. A more flexible approach regarding environment configuration, and one not fully explored in service orchestration technology, is the use of intelligent mobile agents. In this paper, we present new adaptive heuristics for mobile agents to decentralize orchestration through missions (services) that correspond to the stages of the business flow, with the ability to trade off the quality of the result against the deadline of the mission. Some test case scenarios are presented and the collected data are analyzed, pointing out the advantages and disadvantages of each heuristic.

Alex Magalhães, Lau Cheuk Lung, Luciana Rech
Mobile Robot-Assisted Cellular Environment Coverage

The robotic coverage problem of a known rectangular cellular environment with obstacles is considered in this article. The robot can move only at directions parallel and perpendicular to the sides of the rectangle and can cover one cell at each time unit. A suboptimal minimum-time complete area coverage path-planning algorithm is presented. This algorithm minimizes a distance cost metric related to its current position and the quadtree-based decomposed blocks of the unexplored space. The efficiency of the suggested algorithm is shown through simulation studies and the implementation of a non-linear controller designed for a two-wheeled robot to achieve tracking of reference trajectories. For the implementation of the controller, the localization of the robotic vehicle was necessary and it was achieved via image processing.

Georgios Siamantas, Konstantinos Gatsis, Antony Tzes

Feature Selection and Dimensionality Reduction

A Novel Feature Selection Method for Fault Diagnosis

A new method for automated feature selection is introduced. The application domain of this technique is fault diagnosis, where robust features are needed for modeling the wear level and therefore diagnosing it accurately. A robust feature in this field is one that exhibits a strong correlation with the wear level. The proposed method aims at selecting such robust features while at the same time ascertaining that they are as weakly correlated with each other as possible. The results of this technique on the extracted features for a real-world problem appear promising. It is possible to use the proposed technique for other feature selection applications, with minor adjustments to the original algorithm.

Zacharias Voulgaris, Chris Sconyers
Dimensionality Reduction for Distance Based Video Clustering

Clustering of video sequences is essential in order to perform video summarization. Because of the high spatial and temporal dimensions of the video data, dimensionality reduction becomes imperative before performing Euclidean distance based clustering. In this paper, we present non-adaptive dimensionality reduction approaches using random projections on the video data. Assuming the data to be a realization from a mixture of Gaussian distributions allows for further reduction in dimensionality using random projections. The performance and computational complexity of the K-means and the K-hyperline clustering algorithms are evaluated with the reduced dimensional data. Results show that random projections with an assumption of Gaussian mixtures provides the smallest number of dimensions, which leads to very low computational complexity in clustering.
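A minimal sketch of the non-adaptive random projection step described above (a generic Gaussian random matrix; the dimensions and data are illustrative placeholders, not the paper's setup):

    import numpy as np

    rng = np.random.default_rng(0)
    n_frames, d, k = 500, 4096, 64       # e.g. 500 frames, 4096-d features, 64 target dimensions

    X = rng.random((n_frames, d))        # stand-in for the high-dimensional video features
    R = rng.normal(0.0, 1.0 / np.sqrt(k), size=(d, k))   # random projection matrix

    X_low = X @ R                        # Euclidean distances are approximately preserved
    # K-means (or K-hyperline) clustering can now be run on X_low at much lower cost.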

Jayaraman J. Thiagarajan, Karthikeyan N. Ramamurthy, Andreas Spanias
Towards Stock Market Data Mining Using Enriched Random Forests from Textual Resources and Technical Indicators

The present paper deals with a special Random Forest data mining technique designed to alleviate the significant issue of high dimensionality in volatile and complex domains, such as stock market prediction. Since it is widely accepted that the media affect the behavior of investors, information from both technical analysis and textual data from various online financial news resources is considered. Different experiments are carried out to evaluate different aspects of the problem, returning satisfactory results. The results show that the trading strategies guided by the proposed data mining approach generate higher profits than the buy-and-hold strategy, as well as those guided by the level-estimation based forecasts of standard linear regression models and other machine learning classifiers such as Support Vector Machines, ordinary Random Forests and Neural Networks.

Manolis Maragoudakis, Dimitrios Serpanos
On the Problem of Attribute Selection for Software Cost Estimation: Input Backward Elimination Using Artificial Neural Networks

Many parameters affect the cost evolution of software projects. In the area of software cost estimation and project management, the main challenge is to understand and quantify the effect of these parameters, or ‘cost drivers’, on the effort expended to develop software systems. This paper aims at investigating the effect of cost attributes on software development effort using empirical databases of completed projects and building Artificial Neural Network (ANN) models to predict effort. The prediction performance of various ANN models with different combinations of inputs is assessed in an attempt to reduce the models’ input dimensions. The latter is performed using one of the most popular saliency measures of network weights, namely Garson’s Algorithm. The proposed methodology provides insight into the interpretation of ANNs, which may be used for capturing nonlinear interactions between variables in complex software engineering environments.
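A compact sketch of Garson's algorithm as it is commonly described for a single-hidden-layer network (a generic implementation of the weight-based saliency measure, not the paper's code; array shapes are assumptions):

    import numpy as np

    def garson_importance(w_ih, w_ho):
        # w_ih: input-to-hidden weights, shape (n_inputs, n_hidden)
        # w_ho: hidden-to-output weights, shape (n_hidden,) for a single output unit
        contrib = np.abs(w_ih) * np.abs(w_ho)            # |w_ij| * |w_jo| per input/hidden pair
        contrib /= contrib.sum(axis=0, keepdims=True)    # share of each input within a hidden unit
        importance = contrib.sum(axis=1)                 # accumulate the shares over hidden units
        return importance / importance.sum()             # relative saliency of each cost driver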

Efi Papatheocharous, Andreas S. Andreou

Engineering Intelligent Systems

A Fast Mobile Face Recognition System for Android OS Based on Eigenfaces Decomposition

This paper presents a speed-optimized face recognition system designed for mobile devices. Such applications may be used in the context of pervasive and assistive computing for supporting elderly people suffering from dementia in recognizing persons, or for the development of cognitive memory games. Eigenfaces decomposition and Mahalanobis distance calculation have been utilized, and the recognition application has been developed for Android OS. The initial implementation and the corresponding results have proven the feasibility and value of the proposed system.
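A rough sketch of eigenfaces-plus-Mahalanobis matching in the spirit of the system described (a standard PCA formulation; the array shapes and gallery handling are assumptions, not the app's code):

    import numpy as np

    def fit_eigenfaces(faces, k=20):
        # faces: (n_images, n_pixels) matrix of flattened training face images
        mean = faces.mean(axis=0)
        centered = faces - mean
        _, s, vt = np.linalg.svd(centered, full_matrices=False)
        eigenfaces = vt[:k]                                # top-k principal components
        variances = (s[:k] ** 2) / (len(faces) - 1)        # variance along each eigenface
        return mean, eigenfaces, variances

    def mahalanobis(a, b, variances):
        # distance in eigenface space, with each axis scaled by its variance
        return np.sqrt(np.sum((a - b) ** 2 / variances))

    # Recognition: project the probe and every gallery image with (img - mean) @ eigenfaces.T,
    # then return the gallery identity with the smallest Mahalanobis distance to the probe.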

Charalampos Doukas, Ilias Maglogiannis
Application of Conformal Predictors to Tea Classification Based on Electronic Nose

In this paper, we present an investigation into the performance of conformal predictors for discriminating the aroma of different types of tea using an electronic nose system based on gas sensors. We propose a new non-conformity measure for the implementation of conformal predictors based on Support Vector Machine for multi-class classification problems. The experimental results have shown the good performance of the implemented conformal predictors.

Ilia Nouretdinov, Guang Li, Alexander Gammerman, Zhiyuan Luo
Detecting and Confining Sybil Attack in Wireless Sensor Networks Based on Reputation Systems Coupled with Self-organizing Maps

The Sybil attack is one of the most aggressive and evasive attacks in sensor networks and can affect many aspects of network functioning. Thus, its efficient detection is of the highest importance. In order to resolve this issue, in this work we propose to couple reputation systems with agents based on the self-organizing map algorithm trained for detecting outliers in data. The response of the system consists in assigning low reputation values to the compromised nodes, rendering them isolated from the rest of the network. The main improvement of this work lies in the way of calculating reputation, which is more flexible and discriminative in distinguishing attacks from normal behavior. The self-organizing map algorithm deploys a feature space based on sequences of sensor outputs. Our solution offers many benefits: a scalable solution, fast response to adversarial activities, the ability to detect unknown attacks, high adaptability and low consumption. The testing results demonstrate its high ability in detecting and confining the Sybil attack.

Zorana Banković, David Fraga, José M. Moya, Juan Carlos Vallejo, Álvaro Araujo, Pedro Malagón, Juan-Mariano de Goyeneche, Daniel Villanueva, Elena Romero, Javier Blesa
Statistical Fault Localization with Reduced Program Runs

A typical approach to software fault localization is to pinpoint buggy statements by comparing failing program runs with some successful runs. Most research works along this line require a large number of failing runs and successful runs. The required execution data inevitably contain a large number of redundant or noisy execution paths, which leads to lower efficiency and accuracy of pinpointing. In this paper, we present an improved fault localization method based on statistical analysis of the difference between reduced program runs. To do so, we first use a clustering method to eliminate the redundancy in execution paths, next calculate the statistics of the difference between the reduced failing runs and successful runs, and then rank the buggy statements in a generated bug report. The experimental results show that our algorithm works many times faster than Wang’s, and performs better than competitors in terms of accuracy.

Lina Hong, Rong Chen
Fuzzy Cognitive Map for Software Testing Using Artificial Intelligence Techniques

This paper discusses a framework to assist test managers to evaluate the use of AI techniques as a potential tool in software testing. Fuzzy Cognitive Maps (FCMs) are employed to evaluate the framework and make decision analysis easier. A what-if analysis is presented that explores the general application of the framework. Simulations are performed to show the effectiveness of the proposed method. The framework proposed is innovative and it assists managers in making efficient decisions.

Deane Larkman, Masoud Mohammadian, Bala Balachandran, Ric Jentzsch

Intelligent User Environments and HCI

Learning User Preferences in Ubiquitous Systems: A User Study and a Reinforcement Learning Approach

Our study concerns a virtual assistant that proposes services to the user based on the user's currently perceived activity and situation (ambient intelligence). Instead of asking the user to define their preferences, we acquire them automatically using a reinforcement learning approach. Experiments showed that our system succeeded in learning user preferences. In order to validate the relevance and usability of such a system, we first conducted a user study in which 26 non-expert subjects were interviewed using a model of the final system. This paper presents the methodology of applying reinforcement learning to a real-world problem, together with experimental results and the conclusions of the user study.
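As a loose illustration of the reinforcement learning setting (a generic tabular Q-learning sketch; the situations, services and reward signal are hypothetical stand-ins, and the paper's exact formulation may differ):

    import random
    from collections import defaultdict

    Q = defaultdict(float)            # Q[(situation, service)] -> estimated preference value
    alpha, gamma, epsilon = 0.1, 0.9, 0.1

    def choose_service(situation, services):
        # epsilon-greedy: mostly propose the service the user has liked best so far
        if random.random() < epsilon:
            return random.choice(services)
        return max(services, key=lambda s: Q[(situation, s)])

    def update(situation, service, reward, next_situation, services):
        # reward comes from the user's reaction to the proposal (accept or reject)
        best_next = max(Q[(next_situation, s)] for s in services)
        Q[(situation, service)] += alpha * (reward + gamma * best_next - Q[(situation, service)])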

Sofia Zaidenberg, Patrick Reignier, Nadine Mandran
Decision Oriented Programming in HCI: The Multi-Attribute Decision Language MADL

In Human Computer Interaction (HCI), the computer has to take many decisions in order to react in the way the human wants. As decisions in HCI are diverse, contradictory, and hard to measure, it is hard to study and model them, e.g. finding a mapping from the user’s preferences to adaptations of the user interface. To ease these tasks, we developed MADL (Multi-Attribute Decision Language). This programming language, based on Multi-Attribute Decision Making (MADM), is designed to model and make hierarchical multi-attribute decisions and is based on an analysis of decisions and goals in HCI. It fosters respecting HCI-specific characteristics in development, like uncertainty and risk, can be embedded easily in other applications, and allows the inclusion of, and experimentation with, different decision rules. User interface logic can also be modeled by non-programmers and more easily separated from the business logic. The applicability is shown in three use cases.

Bjoern Zenker
Investigating the Role of Mutual Cognitive Environment for End-User Programming

In this paper we present a situated end-user programming approach in which the user co-constructs, in an iterative process, a mutual cognitive environment with the system. We argue that the co-construction of a mutual cognitive environment between the human and the system is key to social human-computer interaction. Preliminary results are illustrated with a step-by-step case study: a user teaches the system new perceptual and abstract concepts using hand gestures and an interactive learning table.

Rémi Barraquand, Patrick Reignier
On the Quantification of Aging Effects on Biometric Features

Biometric templates are often used in intelligent human computer interaction systems that include automated access control and personalization of user interaction. The effectiveness of biometric systems is directly linked with aging, which causes modifications of biometric features. For example, the long-term performance of person identification systems decreases, as biometric templates derived from aged subjects may display substantial differences when compared to reference templates, whereas in age estimation, aging variation allows the age of a subject to be estimated. In this paper we attempt to quantify the effects of aging for different biometric modalities, facilitating in that way the design of systems that use biometric features. In this context, the homogeneity of the statistical distributions of biometric features belonging to certain age classes is quantified, enabling the definition of age-sensitive and age-invariant biometric features. Experimental results demonstrate the applicability of the method in quantifying aging effects.

Andreas Lanitis, Nicolas Tsapatsoulis

Environmental Modeling

Fuzzy Inference Systems for Automatic Classification of Earthquake Damages

This paper presents efficient models in the area of damage potential classification of seismic signals. After an earthquake, one of the most important actions that the authorities must take is to inspect structures and estimate the degree of damage. The interest is obvious for several reasons such as public safety, economic resource management and infrastructure. This approach provides a comparative study between Mamdani-type and Sugeno-type fuzzy inference systems (FIS). The fuzzy models use a set of artificial accelerograms in order to classify structural damage in a specific structure. Previous studies propose a set of twenty well-known seismic parameters which are essential for the description of the seismic excitation. The proposed fuzzy systems use an input vector of twenty seismic parameters instead of the earthquake accelerogram and produce classification rates of up to 90%. Experimental results indicate that these systems are able to classify structural damage accurately. Both of them produce the same level of correct classification rates, but the Mamdani-type has a slight superiority.

Petros-Fotios Alvanitopoulos, Ioannis Andreadis, Anaxagoras Elenas
A Fuzzy Inference System Using Gaussian Distribution Curves for Forest Fire Risk Estimation

This paper describes the development of a fuzzy inference system on the MATLAB platform. The system uses three distinct Gaussian distribution fuzzy membership functions in order to estimate the partial and overall risk indices due to wildfires in the southern part of Greece. The behavior of each curve has been investigated in order to determine which one fits the specific problem and the specific areas better. Regardless of the characteristics of each function, the risky areas have been spotted from 1984 to 2007. The results have shown a reliable performance over time and encourage its wider use in the near future.
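For reference, a short sketch of the kind of Gaussian membership function such a system evaluates (the centre and spread values below are illustrative, not the paper's risk model):

    import numpy as np

    def gaussian_mf(x, centre, sigma):
        # degree of membership of input x in a fuzzy set centred at `centre`
        return np.exp(-0.5 * ((x - centre) / sigma) ** 2)

    # e.g. membership of a temperature reading in a hypothetical "high fire risk" set
    print(gaussian_mf(38.0, centre=40.0, sigma=5.0))  # about 0.92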

Lazaros Iliadis, Stergios Skopianos, Stavros Tachos, Stefanos Spartalis
Evolutionary Prediction of Total Electron Content over Cyprus

Total Electron Content (TEC) is an ionospheric characteristic used to derive the signal delay imposed by the ionosphere on trans-ionospheric links and subsequently overcome its negative impact on accurate position determination. In this paper, an Evolutionary Algorithm (EA), and particularly a Genetic Programming (GP) based model, is designed. The proposed model is based on the main factors that influence the variability of the predicted parameter on diurnal, seasonal and long-term time scales. Experimental results show that the GP model, which is based on TEC measurements obtained over a period of 11 years, has produced a good approximation of the modeled parameter and can be implemented as a local model to account for the ionospheric error in positioning. The GP-based approach performs better than the existing Neural Network-based approach in several cases.

Alexandros Agapitos, Andreas Konstantinidis, Haris Haralambous, Harris Papadopoulos
A Multi-layer Perceptron Neural Network to Predict Air Quality through Indicators of Life Quality and Welfare

This paper considers the similarity between two measures of air pollution/quality control, on the one hand, and widely used indicators of life quality and welfare, on the other. We have developed a multi-layer perceptron neural network system which is trained to predict the measurements of air quality (emissions of sulphur and nitrogen oxides), using Eurostat data for 34 countries. We used life expectancy, healthy life years, infant mortality, Gross Domestic Product (GDP) and GDP growth rate as a set of inputs. Results were dominated by GDP growth rate and GDP. Obtaining accurate estimates of air quality measures can help in deciding on distinct dimensions to be considered in multidimensional studies of welfare and quality of life.

Kyriaki Kitikidou, Lazaros Iliadis
Backmatter
Metadata
Title
Artificial Intelligence Applications and Innovations
Editors
Harris Papadopoulos
Andreas S. Andreou
Max Bramer
Copyright Year
2010
Publisher
Springer Berlin Heidelberg
Electronic ISBN
978-3-642-16239-8
Print ISBN
978-3-642-16238-1
DOI
https://doi.org/10.1007/978-3-642-16239-8
