2018 | Book

AI 2018: Advances in Artificial Intelligence

31st Australasian Joint Conference, Wellington, New Zealand, December 11-14, 2018, Proceedings

About this book

This book constitutes the proceedings of the 31st Australasian Joint Conference on Artificial Intelligence, AI 2018, held in Wellington, New Zealand, in December 2018.

The 50 full and 26 short papers presented in this volume were carefully reviewed and selected from 125 submissions. The papers were organized in topical sections named: agents, games and robotics; AI applications and innovations; computer vision; constraints and search; evolutionary computation; knowledge representation and reasoning; machine learning and data mining; planning and scheduling; and text mining and NLP.

Table of Contents

Frontmatter

Agents, Games and Robotics

Frontmatter
Compromise as a Way to Promote Convention Emergence and to Reduce Social Unfairness in Multi-Agent Systems

Recently, the study of social conventions has attracted much attention. We notice that different agents may tend to establish different conventions, even though they share common interests in convention emergence. We model such scenarios as competitive-coordination games. We hypothesize that agents may fail to establish a convention under these scenarios and that introducing the option of compromise may help solve this problem. An experimental study confirms this hypothesis. In particular, it is shown that, besides promoting convention emergence, compromise also significantly reduces undesirable social unfairness. In addition, we discuss how the reward of coordination via compromise affects convention emergence, social efficiency and unfairness.

Shuyue Hu, Ho-fung Leung
Hierarchical Population-Based Learning for Optimal Large-Scale Coalition Structure Generation in Smart Grids

Large-scale Coalition Structure Generation poses a key challenge in Cooperative Game Theory and Multi-Agent Systems owing to its NP-hard computational complexity. State-of-the-art algorithms, such as Optimal Dynamic Programming, can only solve the problem on a small scale, e.g. 20 agents, with an excessive running time. Our previous study, using population-based learning on the same scale, outperforms others and reveals an immense potential for efficiency and accuracy. In this study we further advance the problem to larger scales, e.g. 80 agents. Firstly, we show that our PBIL-MW algorithm can obtain an approximately optimal solution. Furthermore, we propose a Hierarchical PBIL-MW approach with a termination scheme that achieves significant efficiency gains with only small losses in accuracy. It provides an alternative solution for applications where time restrictions are essential.

Sean Hsin-Shyuan Lee, Jeremiah D. Deng, Martin K. Purvis, Maryam Purvis
Empowerment-Driven Single Agent Exploration for Locating Multiple Wireless Transmitters

Unmanned Aerial Vehicles (UAVs) have attracted significant interest in recent years, as they have been shown to be effective in supporting a wide range of applications in many different areas, including logistics, search and rescue (SAR) [3], public safety communications [8], infrastructure monitoring [9], precision agriculture [4], forestry [5], and telecommunications [2]. Specifically, we focus on search and exploration in the context of search and rescue. In our presented work, success is measured by an agent's ability to find all transmitters in as little time as possible. Through the use of a challenging discretized simulation environment, we investigate the practicality of an empowerment-driven exploration behaviour (EEB) for locating an unknown number of wireless transmitters with minimal prior knowledge about the locations of obstacles, transmitters and their properties. With problem-specific adaptations to the algorithm, including the ability to detect non-identifying signals from transmitters, the empowerment-driven agent performs close to an idealistic Bayesian agent that has unrealistic information about the environment, and well above a random walk agent. We show that our empowerment-driven algorithm has practical potential and lays a foundation for future work in this area.

Daniel Barry, Andreas Willig, Graeme Woodward
A Logic for Reasoning About Game Descriptions

General game playing aims to develop autonomous computer players capable of playing any formally described game. The biggest challenge for such a player is to understand a game and acquire useful knowledge about it from its description. This paper aims to develop a logical approach for reasoning about game rules. We introduce a modal logic with a sound and complete axiomatic system. The logic extends Zhang and Thielscher’s framework with two modalities to express game rules and reason about game outcomes. We use a well-known strategy game, Hex, to demonstrate how to use the logic to standardise game descriptions and verify properties of a game description.

Dongmo Zhang
General Language Evolution in General Game Playing

General Game Playing (GGP) is concerned with the development of programs capable of expertly playing a game by just receiving its rules and without human intervention. Its standard Game Description Language (GDL) has been extended to include incomplete-information games; the extended version is named GDL-II. Different algorithms have been recommended to play games in GDL-II; however, none of them can solve coordination games properly. One reason for this shortcoming is their inability to generate the necessary coordination language. On the other hand, most existing language evolution techniques focus on generating a common language without considering its generality or its use for problem solving. In this paper, we extend GGP with language evolution to develop a general language generation technique. The new technique can be combined with GGP algorithms for incomplete-information games and assist players in automatically generating a common language to solve cooperation problems.

Armin Chitizadeh, Michael Thielscher
Intra-task Curriculum Learning for Faster Reinforcement Learning in Video Games

In this paper we present a new method for improving reinforcement learning training times under the following two assumptions: (1) we know the conditions under which the environment gives reward; and (2) we can control the initial state of the environment at the beginning of a training episode. Our method, called intra-task curriculum learning, presents the different episode starting states to an agent in order of increasing distance to immediate reward.
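A minimal sketch of this idea under stated assumptions: the environment exposes a way to set the initial state (the reset_to method here is hypothetical, as is the problem-specific distance_to_reward heuristic), and the agent is trained on start states sorted from nearest to farthest from immediate reward.

```python
def intra_task_curriculum(env, agent, start_states, distance_to_reward,
                          episodes_per_state=10):
    # Assumption (2) of the abstract: we control the initial state of each
    # training episode. Present start states in order of increasing
    # distance to immediate reward.
    for state in sorted(start_states, key=distance_to_reward):
        for _ in range(episodes_per_state):
            obs = env.reset_to(state)  # hypothetical API for setting the start state
            done = False
            while not done:
                action = agent.act(obs)
                obs, reward, done = env.step(action)
                agent.learn(obs, reward, done)
```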

Nathaniel du Preez-Wilkinson, Marcus Gallagher, Xuelei Hu
An Approach for Task Execution in Dynamic Multirobot Environment

In this paper we consider the problem of task execution in dynamic environments. We introduce a formal framework for a dynamic environment, and model the behavior of the robots using communicating automata. Based on this model, we suggest a distributed approach for task execution that can handle multiple tasks arriving at the same instant of time. We have implemented the approach using ARGoS, a multi-robot simulator.

Amar Nath, A. R. Arun, Rajdeep Niyogi
Monocular ORB-SLAM on a Humanoid Robot for Localization Purposes

This work presents an application of ORB-SLAM on an iGus bipedal humanoid robotic platform. The method was adapted from its original implementation into the framework used by the NUbots robotic soccer team and used for localization purposes. The paper presents a description of the challenges of implementing the adaptation, as well as several tests in which the method’s performance is analyzed to determine its suitability for further development and use on medium-sized humanoid robots. To conduct the tests, we determined the robot’s real location using a high-accuracy, camera-based infrared tracking system. Two experiments were performed to estimate the robustness of the method to the vibration and constant camera wobbling inherent to a bipedal walk, and its ability to deal with the kidnapped robot problem. The tests indicate that ORB-SLAM is suitable for implementation on a medium-sized humanoid robot in situations comparable to a robotic soccer environment, and requires relatively low computational resources, leaving enough CPU power for other tasks. Additionally, since ORB-SLAM is robust to the difficulties associated with humanoid motion, we conclude that it provides a good SLAM algorithm to enhance with features specific to the humanoid robotic platform.

Daniel Ginn, Alexandre Mendes, Stephan Chalup, Jake Fountain

AI Applications and Innovations

Frontmatter
Impact of Compression Ratio and Reconstruction Methods on ECG Classification for E-Health Gadgets: A Preliminary Study

In IoT applications, it is often necessary to achieve an optimal trade-off between data compression and data quality. This study investigates the effect of Compressed Sensing and reconstruction algorithms on ECG arrhythmia detection using SVM classifiers. To neutralise the mutual effect of compression and reconstruction algorithms on one another, we consider each reconstruction algorithm with various compression ratios and vice versa. The employed reconstruction algorithms are Basis Pursuit (BP) and Orthogonal Matching Pursuit (OMP). We employ two steps: (a) identifying a proper compression ratio that retains the essential information of ECG signals, and (b) assessing the impact of the two reconstruction algorithms and their exactness on the quality of classification. The findings of this study are threefold: (a) Remarkably, the SVM classifier requires few samples to detect ECG arrhythmia. (b) The results indicate that for compression ratios up to around 1:7, ECG signals are recovered and then classified with the same quality for both algorithms; however, as the compression ratio increases, BP outperforms OMP in terms of ECG arrhythmia detection. (c) A negative correlation between compression ratio and signal quality is observed, which reflects the intuitive trade-off between them.

Sophie Zareei, Jeremiah D. Deng
Intelligent Fault Detection of High-Speed Railway Turnout Based on Hybrid Deep Learning

With the purpose of detecting turnout faults in a timely manner without labelled data or fault data, this paper proposes a hybrid deep learning framework combining a DDAE (Deep Denoising Auto-encoder) and a one-class SVM (Support Vector Machine) for turnout fault detection using only normal data. The proposed method achieves an accuracy of 98.67% on a real turnout dataset of current curves, which suggests that this work realizes the purpose of detecting faults with only normal data and provides a basis for the intelligent fault detection of turnouts.

Zhi Zhuang, Guohua Zhang, Wei Dong, Xinya Sun, Chuanjiang Wang
Uncovering Discriminative Knowledge-Guided Medical Concepts for Classifying Coronary Artery Disease Notes

Text classification is the challenging task of allocating each document to its correct predefined class. Often, there are irrelevant features which introduce noise into the learning step and reduce the precision of prediction. Hence, more efficient methods are needed to select or extract meaningful features to avoid noise and overfitting. In this work, an ontology-guided method utilizing the taxonomical structure of the Unified Medical Language System (UMLS) is proposed. This method extracts, as features, the concepts of phrases appearing in the documents that relate to diseases or symptoms. The efficiency of this method is evaluated on the 2010 Informatics for Integrating Biology and the Bedside (i2b2) data set. The obtained experimental results show that the proposed ontology-based method significantly improves classification accuracy.

Mahdi Abdollahi, Xiaoying Gao, Yi Mei, Shameek Ghosh, Jinyan Li
A Multi-tree Genetic Programming Representation for Melanoma Detection Using Local and Global Features

Melanoma is the deadliest type of skin cancer, accounting for nearly 75% of skin-cancer deaths. However, the survival rate is high if it is diagnosed at an early stage. This study develops a novel classification approach to melanoma detection using a multi-tree genetic programming (GP) method. Existing approaches have employed various feature extraction methods to extract features from skin cancer images, where these different types of features are used individually for skin cancer image classification. However, they remain unable to use all these features together in a meaningful way to achieve performance gains. In this work, Local Binary Patterns are used to extract local information from gray and color images. Moreover, to capture global information, color variation between the lesion and skin regions, and geometrical border shape features are extracted. Genetic operators such as crossover and mutation are designed accordingly to fit the objectives of the proposed method. The performance of the proposed method is assessed using two skin image datasets and compared with six commonly used classification algorithms as well as the single-tree GP method. The results show that the proposed method significantly outperforms all these classification methods. Being interpretable, this method may help dermatologists identify prominent skin image features specific to a type of skin cancer.

Qurrat Ul Ain, Harith Al-Sahaf, Bing Xue, Mengjie Zhang
Impacts of Climate Change and Socioeconomic Development on Electric Load in California

In order to develop policies to mitigate the impacts of climate change on energy consumption, it is imperative to understand and quantify the impacts of climate change and socioeconomic development on residential electric load. This paper develops a feed-forward neural network to model the complex relationships among socioeconomic factors, weather, distributed renewable generation, and electric load at the census block group level. The influence of different explanatory variables on electric load is quantified through the layer-wise relevance propagation method. A case study with 4,000 census block groups in southern California is conducted. The results show that temperature, housing units, and solar PV systems have the highest influence on net electric load. The scenario analysis reveals that the net electric load of disadvantaged communities is much more sensitive to rising temperature than that of non-disadvantaged ones; hence, they are much more vulnerable to climate change.

Jie Shi, Nanpeng Yu
Implementing Propositional Networks on FPGA

The speed of game rules processing plays an essential role in the performance of a General Game Playing (GGP) agent. Propositional Networks (propnets) are an example of a highly efficient representation of game rules. So far, in GGP, only software implementations of propnets have been proposed and investigated. In this paper, we present the first implementation of propnets on Field-Programmable Gate Arrays (FPGAs), showing that they perform between 25 and 58 times faster than a software-propnet for most of the tested games. We also integrate the FPGA-propnet within an MCTS agent, discussing the challenges of the process, and possible solutions for the identified shortcomings.
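As background, a propnet is essentially a boolean circuit over a game's propositions that is re-evaluated at every state update; a minimal software sketch is shown below (the gate encoding is illustrative, not the paper's FPGA design).

```python
def eval_propnet(gates, base_values):
    # gates: list of (name, op, input_names) in topological order;
    # base_values: truth values of the base/input propositions.
    vals = dict(base_values)
    for name, op, srcs in gates:
        if op == "and":
            vals[name] = all(vals[s] for s in srcs)
        elif op == "or":
            vals[name] = any(vals[s] for s in srcs)
        elif op == "not":
            vals[name] = not vals[srcs[0]]
    return vals

# Example: legal <- moved and (not blocked)
gates = [("free", "not", ["blocked"]), ("legal", "and", ["moved", "free"])]
print(eval_propnet(gates, {"moved": True, "blocked": False})["legal"])  # True
```

A hardware propnet can evaluate many such gates in parallel rather than sequentially as in this software loop, which is plausibly where the reported speed-ups come from.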

Cezary Siwek, Jakub Kowalski, Chiara F. Sironi, Mark H. M. Winands
A Genetic Programming Hyper-heuristic Approach for Online Resource Allocation in Container-Based Clouds

The popularity of container-based clouds stems from their ability to deploy and run applications without launching an entire virtual machine (VM) for each application. Container-based clouds support flexible deployment of applications and therefore have the potential to reduce the energy consumption of data centers. With the goal of energy reduction, optimizing the allocation of containers is more difficult than in traditional VM-based clouds because of the finer granularity of resources. Little research has been conducted on applying human-designed heuristics to balanced and unbalanced resources. In this paper, we first compare three human-designed heuristics and show that they cannot handle balanced and unbalanced resource scenarios well. We then propose a learning-based algorithm, genetic programming hyper-heuristic (GPHH), to automatically generate a suitable heuristic for allocating containers in an online fashion. The results show that the proposed GPHH evolves better heuristics than the human-designed ones in terms of energy consumption across a range of cloud scenarios.

Boxiong Tan, Hui Ma, Yi Mei
A Computational Framework for Autonomous Self-repair Systems

This paper describes a novel computational framework for damage detection and regeneration in an artificial tissue of cells resembling living systems. We represent the tissue as an Auto-Associative Neural Network (AANN) consisting of a single layer of perceptron neurons (cells) with local feedback loops. This allows the system to recognise its state and geometry as a form of collective intelligence. Signalling entropy is used as a global (emergent) property characterising the state of the system. The repair system has two submodels: global sensing and local sensing. Global sensing detects changes in the whole system state and identifies the general damage region based on the change in system entropy. Local sensing is then applied with the AANN to find the exact damage locations and repair the damage. The results show that the method allows robust and efficient damage detection and accurate regeneration.

Tran Nguyen Minh-Thai, Jagannath Aryal, Sandhya Samarasinghe, Michael Levin
Investigation of Unsupervised Models for Biodiversity Assessment

Significant animal species loss has been observed in recent decades due to habitat destruction, which puts environmental integrity and biodiversity at risk. Traditional ways of assessing biodiversity are limited in terms of both time and space, and have high cost. Since the presence of animals can be indicated by sound, acoustic recordings have recently been used to estimate species richness. Bioacoustic recordings are typically made in habitats over several weeks, so they contain a large collection of different sounds. Birds are of particular interest due to their distinctive calls and because they are useful ecological indicators. To assess biodiversity, manually determining how many different types of birds are present in such lengthy audio is extremely cumbersome. Towards providing automated support for this task, in this paper we investigate and propose a clustering-based approach to assist in the automated assessment of biodiversity. Our approach first estimates the number of different species and their volumes, which are then used to derive a biodiversity index. Experimental results with real data indicate that our proposed approach estimates a biodiversity index value close to the ground truth.

KVSN Rama Rao, Saurabh Garg, James Montgomery
Field-Regularised Factorization Machines for Mining the Maintenance Logs of Equipment

Failure prediction is very important for railway infrastructure. Traditionally, data from various sensors are collected for this task, while the value of maintenance logs is often neglected. Maintenance records of equipment usually indicate equipment status and can therefore be valuable for predicting equipment faults. In this paper, we propose Field-regularised Factorization Machines (FrFMs) to predict failures of railway points from maintenance logs. The Factorization Machine (FM) and its variants are state-of-the-art algorithms designed for sparse data and are widely used in click-through rate prediction and recommendation systems. Categorical variables are converted to binary features through one-hot encoding and then fed into these models; however, field information is lost in this process. We propose Field-regularised Factorization Machines to incorporate such valuable information. Experiments on a data set of railway maintenance logs and on another public data set show the effectiveness of our method.
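For context, the standard second-order FM prediction that such models build on is sketched below; the field-based regulariser itself is the paper's contribution and is only indicated in a comment.

```python
import numpy as np

def fm_predict(x, w0, w, V):
    # Second-order factorization machine:
    #   y = w0 + sum_i w_i x_i + sum_{i<j} <V_i, V_j> x_i x_j,
    # computed with the usual O(k*n) identity instead of the double sum.
    linear = w0 + x @ w
    interactions = 0.5 * np.sum((x @ V) ** 2 - (x ** 2) @ (V ** 2))
    return linear + interactions

# FrFM (per the abstract) would additionally regularise training so that the
# latent vectors V_i of features belonging to the same field are tied together.
```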

Zhibin Li, Jian Zhang, Qiang Wu, Christina Kirsch
Real-Time Collusive Shill Bidding Detection in Online Auctions

Shill bidding occurs when a seller introduces fake bids into an auction to artificially inflate an item’s final price, thereby cheating legitimate bidders. Shill bidding detection becomes more difficult when a seller involves multiple collaborating shill bidders. Colluding shill bidders can distribute the work evenly among each other to collectively reduce their chances of being detected. Previous detection methods wait until an auction ends before determining who the shill bidders are. However, if colluding shill bidders are not detected during the auction, an honest bidder can potentially be cheated by the end of the auction. This paper presents a real-time collusive shill bidding detection algorithm for identifying colluding shill bidders while an auction is running. Experimental results on auction data show that the algorithm can potentially highlight colluding shill bidders in real time.

Nazia Majadi, Jarrod Trevathan, Neil Bergmann
OHC: Uncovering Overlapping Heterogeneous Communities

A heterogeneous correlation network represents relationships (edges) among source-typed and attribute-typed objects (nodes). It can be used to model an academic collaboration network, describing connections among authors and published papers. To date, there has been little research into mining communities in heterogeneous networks. The objective of our research is to discover overlapping communities that include all node and edge types in a heterogeneous correlation network. We describe an algorithm, OHC, that detects overlapping communities in heterogeneous correlation networks. Inspired by a homogeneous community scoring function, the Triangle Participation Ratio (TPR), OHC finds target heterogeneous communities and then expands them recursively with triangle-forming nodes. Experiments on different real-world networks demonstrate that OHC identifies heterogeneous communities that are tightly connected internally according to two traditional scoring functions. Additionally, analyzing the top-ranking heterogeneous communities in a case study, we evaluate the results qualitatively.
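A minimal sketch of the homogeneous TPR score mentioned above, i.e. the fraction of a community's nodes that participate in at least one triangle inside the community (the heterogeneous extension is the paper's contribution):

```python
from itertools import combinations

def tpr(nodes, edges):
    # Triangle Participation Ratio of a community given as a node set and
    # an edge list; only edges internal to the community are considered.
    nodes = set(nodes)
    adj = {v: set() for v in nodes}
    for u, v in edges:
        if u in nodes and v in nodes:
            adj[u].add(v)
            adj[v].add(u)
    in_triangle = set()
    for v in nodes:
        for u, w in combinations(adj[v], 2):
            if w in adj[u]:          # v-u, v-w, u-w all present: a triangle
                in_triangle.update((v, u, w))
    return len(in_triangle) / len(nodes) if nodes else 0.0

print(tpr({1, 2, 3, 4}, [(1, 2), (2, 3), (1, 3), (3, 4)]))  # 0.75
```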

Ranran Bian, Yun Sing Koh, Gillian Dobbie, Anna Divoli

Computer Vision

Frontmatter
A Genetic Programming Approach for Constructing Foreground and Background Saliency Features for Salient Object Detection

Salient Object Detection (SOD) methods have been widely investigated in order to mimic the human visual system in selecting regions of interest from complex scenes. The majority of existing SOD methods have focused on designing and combining handcrafted features. This process relies on domain knowledge and expertise and becomes increasingly difficult as the complexity of candidate models increases. In this paper, we develop an automatic feature combination method for saliency features to reduce the need for human intervention and domain knowledge. The proposed method contains three phases: two Genetic Programming (GP) phases to construct foreground and background features, and a spatial blending phase to combine those features. The foreground and background features are constructed to complement each other, so that each can compensate for the other’s shortcomings. This method is compared with state-of-the-art methods on four different benchmark datasets. The results indicate that the new automatic method is comparable with the state-of-the-art methods and even improves SOD performance on some datasets.

Shima Afzali, Harith Al-Sahaf, Bing Xue, Christopher Hollitt, Mengjie Zhang
Distinction Between Ships and Icebergs in SAR Images Using Ensemble Loss Trained Convolutional Neural Networks

With global warming, more new shipping routes will be opened and utilized by more and more ships in the polar regions, particularly in the Arctic. Synthetic aperture radar (SAR) has been widely used for ship and iceberg monitoring for maritime surveillance and safety in Arctic waters. At present, compared with detecting ships or icebergs, the task of distinguishing between ships and icebergs in SAR images remains challenging. In this work, we propose a novel loss function, called ensemble loss, to train convolutional neural networks (CNNs); it is a convex function that incorporates the traits of both cross entropy and hinge loss. The ensemble-loss-trained CNN model for distinguishing between ships and icebergs is evaluated on a real-world SAR data set, achieving a classification accuracy of 90.15%. Experiments on another real image data set also confirm the effectiveness of the proposed ensemble loss.
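The abstract does not give the exact formula; a hedged reading is a convex combination of the two convex components, sketched below for a binary case with labels in {-1, +1} (the weight alpha is illustrative).

```python
import numpy as np

def ensemble_loss(score, label, alpha=0.5):
    # Both components are convex in the real-valued score, so any convex
    # combination (alpha in [0, 1]) is convex as well.
    ce = np.log1p(np.exp(-label * score))    # logistic cross entropy
    hinge = max(0.0, 1.0 - label * score)    # hinge loss
    return alpha * ce + (1.0 - alpha) * hinge
```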

Xiaocai Zhang, Yuansheng Liu, Yi Zheng, Zhixun Zhao, Jinyan Li, Yang Liu
Shark Detection from Aerial Imagery Using Region-Based CNN, a Study

Shark attacks have been a very sensitive issue for Australians and for many other countries. Thus, providing safety and security around beaches is fundamental in the current climate. Safety for both human beings and underwater creatures (sharks, whales, etc.) is essential while people continue to visit and use the beaches heavily for recreation and sports. Hence, an efficient, automated and real-time approach to monitoring beaches and detecting various objects (e.g. human activities, large fish, sharks, whales, surfers, etc.) is necessary to avoid unexpected casualties and accidents. The use of technologies such as drones and machine learning techniques is a promising direction in such challenging circumstances. This paper investigates the potential of Region-based Convolutional Neural Networks (R-CNN) for detecting various marine objects, and sharks in particular. Three network architectures, namely Zeiler and Fergus (ZF), Visual Geometry Group (VGG16), and VGG_M, were considered for analysis and for identifying their potential. A dataset consisting of 3957 video frames was used for the experiments. The VGG16 architecture with Faster R-CNN performed better than the others, with an average precision of 0.904 for detecting sharks.

Nabin Sharma, Paul Scully-Power, Michael Blumenstein
A Hybrid Differential Evolution Approach to Designing Deep Convolutional Neural Networks for Image Classification

Convolutional Neural Networks (CNNs) have demonstrated their superiority in image classification, and evolutionary computation (EC) methods have recently been surging to automatically design the architectures of CNNs to save the tedious work of manually designing them. In this paper, a new hybrid differential evolution (DE) algorithm with a newly added crossover operator, named DECNN, is proposed to evolve CNN architectures of any length. There are three new ideas in the proposed DECNN method. Firstly, an existing effective encoding scheme is refined to cater for variable-length CNN architectures; secondly, new mutation and crossover operators are developed for variable-length DE to optimise the hyperparameters of CNNs; finally, a new second crossover is introduced to evolve the depth of the CNN architectures. The proposed algorithm is tested on six widely used benchmark datasets and the results are compared to 12 state-of-the-art methods, which shows the proposed method is highly competitive with the state-of-the-art algorithms. Furthermore, the proposed method is also compared with IPPSO, a method using particle swarm optimisation with a similar encoding strategy, and the proposed DECNN outperforms IPPSO in terms of accuracy.

Bin Wang, Yanan Sun, Bing Xue, Mengjie Zhang
A Gaussian Filter-Based Feature Learning Approach Using Genetic Programming to Image Classification

Learning image features automatically from the problem being tackled is more effective for classification. However, it is very difficult due to image variations and the high dimensionality of image data. This paper proposes a new feature learning approach based on Gaussian filters and genetic programming (GauGP) for image classification. Genetic programming (GP) is a well-known evolutionary learning technique and has been applied to many visual tasks, showing good learning ability and interpretability. In the proposed GauGP method, a new program structure, a new function set and a new terminal set are developed, which allow it to detect small regions in the input image and to learn discriminative features using Gaussian filters for image classification. The performance of GauGP is examined on six different data sets of varying difficulty and compared with four GP methods, eight traditional approaches and convolutional neural networks. The experimental results show GauGP achieves significantly better or similar performance in most cases.

Ying Bi, Bing Xue, Mengjie Zhang
Image Up-Sampling for Super Resolution with Generative Adversarial Network

When carrying out single-image Super Resolution (SR) with deep learning, bicubic interpolation is typically used to up-sample low-resolution images before inputting them into SR methods. In this preprocessing, such basic interpolation methods introduce blur and noise into the processed images, and these noisy images can affect the SR results. Focusing on this point, we propose a new image up-sampling method utilizing a Generative Adversarial Network (GAN). In this work, we improve the image evaluation criterion in the generator part of the GAN by combining Multi-Scale Structural Similarity (MS-SSIM) and the L1 norm. Our experiments confirm that the proposed method creates qualitatively better up-sampled images. Quantitatively, it achieves average improvements of 0.90 dB in PSNR, 3.35% in SSIM, and 1.28% in MS-SSIM on the Set5 and Set14 datasets compared with bicubic interpolation.
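A sketch of the combined generator criterion as described, assuming an ms_ssim helper returning values in [0, 1] (e.g. the ms_ssim function from the pytorch_msssim package); the weight alpha is illustrative.

```python
import torch

def mixed_loss(sr, hr, ms_ssim, alpha=0.84):
    # Combine a structural term (1 - MS-SSIM) with a pixel-wise L1 term.
    l1 = torch.mean(torch.abs(sr - hr))
    return alpha * (1.0 - ms_ssim(sr, hr)) + (1.0 - alpha) * l1
```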

Shohei Tsunekawa, Katsufumi Inoue, Michifumi Yoshioka
A Model for Learning Representations of 3D Objects Through Tactile Exploration: Effects of Object Asymmetries and Landmarks

In this paper, we develop a neural network model that learns representations of 3D objects via tactile exploration. The basic principle is that the hand is considered as an autonomous ‘navigating agent’, traveling within the ‘environment’ of a 3D object. We adapt a model of hippocampal place cells, which learns the structure of a 2D environment by exploiting constraints imposed by the environment’s boundaries on the agent’s movement, and perceptual information about landmarks in the environment. In the current paper, our focus is on 3D analogues of these 2D information sources. We systematically investigate the information about object geometry that is provided by navigation constraints in a simple cuboid, and by tactile landmarks. We find that an asymmetric cuboid conveys more information to the navigator than a symmetric cuboid (i.e., a cube) – and that landmarks convey additional information independently from asymmetry.

Xiaogang Yan, Alistair Knott, Steven Mills
Cyclist Trajectory Prediction Using Bidirectional Recurrent Neural Networks

Predicting a long-term horizon of the trajectories of vulnerable road users such as cyclists has become an essential task for the reliable operation of highly and fully automated vehicles. In the literature, this problem is often tackled using linear dynamics-based approaches built on recursive Bayesian filters. These approaches are usually challenged when it comes to predicting long-horizon trajectories (more than 1 s), and they also have difficulties in predicting non-linear motions such as the maneuvers cyclists perform in traffic environments. In this work, we propose two novel models based on deep stacked recurrent neural networks for the task of cyclist trajectory prediction to overcome some of the aforementioned challenges. Our proposed predictive models achieve robust prediction results when evaluated on a real-life cyclist trajectory dataset collected using vehicle-based sensors in an urban traffic environment. Furthermore, our proposed models outperform traditional approaches with an improvement of more than 50% in mean error score averaged over all predicted cyclist trajectories.

Khaled Saleh, Mohammed Hossny, Saeid Nahavandi

Constraints and Search

Frontmatter
Diversified Late Acceptance Search

The well-known Late Acceptance Hill Climbing (LAHC) search aims to overcome the main downside of traditional Hill Climbing (HC) search, which is often quickly trapped in a local optimum due to strictly accepting only non-worsening moves within each iteration. In contrast, LAHC also accepts worsening moves, by keeping a circular array of fitness values of previously visited solutions and comparing the fitness values of candidate solutions against the least recent element in the array. While this straightforward strategy has proven effective, there are nevertheless situations where LAHC can unfortunately behave in a similar manner to HC. For example, when a new local optimum is found, often the same fitness value is stored many times in the array. To address this shortcoming, we propose new acceptance and replacement strategies to take into account worsening, improving, and sideways movement scenarios with the aim to improve the diversity of values in the array. Compared to LAHC, the proposed Diversified Late Acceptance Search approach is shown to lead to better quality solutions that are obtained with a lower number of iterations on benchmark Travelling Salesman Problems and Quadratic Assignment Problems.
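For reference, a minimal sketch of plain LAHC, the baseline whose acceptance and replacement rules the paper diversifies (minimisation form; the neighbour and fitness functions are problem-specific):

```python
def lahc(initial, fitness, neighbour, history_len=50, max_iters=100_000):
    # Late Acceptance Hill Climbing: accept a candidate if it is no worse
    # than the current solution OR no worse than the fitness recorded
    # history_len iterations ago (the "late" comparison).
    current, f_cur = initial, fitness(initial)
    best, f_best = current, f_cur
    history = [f_cur] * history_len          # circular array of fitness values
    for it in range(max_iters):
        cand = neighbour(current)
        f_cand = fitness(cand)
        slot = it % history_len              # least recent element
        if f_cand <= f_cur or f_cand <= history[slot]:
            current, f_cur = cand, f_cand
            if f_cur < f_best:
                best, f_best = current, f_cur
        history[slot] = f_cur                # plain LAHC replacement rule
    return best, f_best
```

When the search stalls at a local optimum, f_cur stays fixed and the same value fills the whole array, which is exactly the shortcoming the diversified acceptance and replacement strategies target.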

Majid Namazi, Conrad Sanderson, M. A. Hakim Newton, Md Masbaul Alam Polash, Abdul Sattar
Hyper-heuristic Based Local Search for Combinatorial Optimisation Problems

Combinatorial optimisation is often needed for solving real-world problems, which are often NP-hard, so exact methods are not suitable. Instead, local search methods are often effective for finding near-optimal solutions quickly. However, it is difficult to determine which local search, with what parameter setting, is best for a given problem. In this study, two complex combinatorial optimisation problems are used: Multi-capacity Bin Packing Problems (MCBPP) and the Google Machine Reassignment Problem (GMRP). Our experiments show that no single local search method can consistently achieve the best results; they are sensitive to the problem search space and to their parameters. Therefore, we propose a hyper-heuristic-based method, which automatically selects the most appropriate local search during the search and tunes its parameters accordingly. The results show that our proposed hyper-heuristic approach is effective and can achieve the overall best results on multiple instances of both MCBPP and GMRP.
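A minimal sketch of the selection idea, assuming a pool of low-level local searches and a simple epsilon-greedy credit scheme (the paper's actual selection and parameter-tuning rules are more involved):

```python
import random

def hyper_heuristic(initial, cost, heuristics, iters=1000, eps=0.2):
    # heuristics: list of functions mapping a solution to a neighbouring one.
    credit = [0.0] * len(heuristics)           # running credit per heuristic
    best, f_best = initial, cost(initial)
    for _ in range(iters):
        if random.random() < eps:              # explore a random heuristic
            i = random.randrange(len(heuristics))
        else:                                  # exploit the best performer so far
            i = max(range(len(heuristics)), key=lambda j: credit[j])
        cand = heuristics[i](best)
        f_cand = cost(cand)
        credit[i] += f_best - f_cand           # reward any improvement achieved
        if f_cand < f_best:
            best, f_best = cand, f_cand
    return best, f_best
```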

Ayad Turky, Nasser R. Sabar, Simon Dunstall, Andy Song
Distributed Model Predictive Control of Linear Systems with Coupled Constraints Based on Collective Neurodynamic Optimization

Distributed model predictive control explores an array of local predictive controllers that synthesize the control of subsystems independently yet communicate in order to cooperate efficiently in achieving the closed-loop control performance. Distributed model predictive control problems naturally result in sequential distributed optimization problems that require real-time solution. This paper presents a collective neurodynamic approach to designing and implementing the distributed model predictive control of linear systems in the presence of globally coupled constraints. For each subsystem, a neurodynamic model minimizes its cost function using local information only. According to the communication topology of the network, the neurodynamic models share information with their neighbours to reach consensus on the optimal control actions to be carried out. The collective neurodynamic models are proven to guarantee the global optimality of the model predictive control system.

Zheng Yan, Jie Lu, Guangquan Zhang
Constraint-Guided Local Search for Single Mixed-Operation Runway

The aircraft sequencing problem (ASP) is to schedule the operation times of departing and arriving aircraft such that their deviation from the desired operation times is minimised. Two types of hard constraint make this problem very challenging: a time window constraint on the operation time of each aircraft, and a minimum separation time between each pair of aircraft. ASP is known to be NP-hard. Although some progress has been made in recent years in solving ASP, existing techniques still rely on generic algorithms that usually lack problem-specific knowledge. This leads either to low-quality solutions or to struggling with large-sized problems. In this work, we propose a constraint-guided local search algorithm that advances the ASP search by injecting specific knowledge of the problem into its different phases. In the intensification phase, we propose a greedy approach that gives higher priority to aircraft that are more problematic and create more delay. In the diversification phase, we employ a bounded diversification technique that controls the new position of each selected aircraft and does not allow it to move very far from its current position. Computational results show that the proposed algorithm outperforms the existing state-of-the-art methods by a considerable margin.

Vahid Riahi, M. A. Hakim Newton, Abdul Sattar

Evolutionary Computation

Frontmatter
A Hybrid GP-KNN Imputation for Symbolic Regression with Missing Values

In data science, missingness is a serious challenge when dealing with real-world data sets. Although many imputation approaches have been proposed to tackle missing values in machine learning, most studies focus on the classification task rather than the regression task. To the best of our knowledge, no study has been conducted to investigate the use of imputation methods when performing symbolic regression on incomplete real-world data sets. In this work, we propose a new imputation method called GP-KNN, a hybrid method employing two concepts: Genetic Programming Imputation (GPI) and K-Nearest Neighbour (KNN). GP-KNN considers both feature and instance relevance. The experimental results show that the proposed method performs better than state-of-the-art imputation methods in most of the considered cases, with respect to both imputation accuracy and symbolic regression performance.

Baligh Al-Helali, Qi Chen, Bing Xue, Mengjie Zhang
Adaptive Reference Point Generation for Many-Objective Optimization Using NSGA-III

In the Non-Dominated Sorting Genetic Algorithm-III (NSGA-III), the diversity of solutions is guided by a set of uniformly distributed reference points in the objective space. However, uniformly distributed reference points may not be efficient for problems with disconnected and non-uniform Pareto fronts. Such problems may have reference points that are never associated with any Pareto-optimal solution and thus become useless during evaluation; the existence of these useless reference points significantly affects the performance of NSGA-III. To address this issue, a new reference point adaptation mechanism is proposed that generates reference points according to the distribution of the candidate solutions. The proposed adaptation method improves the performance of the evolutionary search and promotes population diversity for better exploration. The proposed approach is evaluated on a number of unconstrained benchmark problems and compared with NSGA-III and other reference point adaptation approaches. Experimental results on several benchmark problems clearly show a prominent improvement in performance from using the proposed reference point adaptation mechanism in NSGA-III.
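For context, the uniformly distributed reference points NSGA-III starts from are typically generated with the Das-Dennis simplex-lattice design sketched below, where m is the number of objectives and p the number of divisions per objective; the paper's contribution is adapting such points to the candidate solutions.

```python
from itertools import combinations

def das_dennis(m, p):
    # All points on the unit simplex whose coordinates are multiples of 1/p,
    # enumerated via stars-and-bars: choose m-1 bar positions among p+m-1 slots.
    points = []
    for bars in combinations(range(p + m - 1), m - 1):
        prev, w = -1, []
        for b in bars:
            w.append((b - prev - 1) / p)
            prev = b
        w.append((p + m - 2 - prev) / p)
        points.append(w)
    return points

print(len(das_dennis(3, 4)))  # C(6, 2) = 15 reference points for 3 objectives
```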

Atiya Masood, Gang Chen, Yi Mei, Mengjie Zhang
Improving Representation in Benchmarking of Binary PSO Algorithms

Binary PSO algorithms are extensions of the PSO algorithm that enjoy some of the social intelligence properties of the original algorithm. The intensive local search ability is one of the most important characteristics of PSO. In this paper, we argue that, when evaluating binary PSO algorithms against common real-value benchmark problems—a common practice in the literature—the representation of the search space can have a significant effect on the results. For this purpose we propose the use of reflected binary code, which is a minimal change ordering representation for mapping a binary genotype space to a real phenotype space, while preserving the notion of locality in the phenotype space.
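Reflected binary code is the classic Gray code, in which consecutive integers map to codewords differing in exactly one bit, so a one-bit genotype flip corresponds to a minimal phenotype move. A minimal sketch of the two integer mappings:

```python
def binary_to_gray(n: int) -> int:
    # Standard reflected binary (Gray) encoding.
    return n ^ (n >> 1)

def gray_to_binary(g: int) -> int:
    # Invert the encoding by folding the shifted XOR prefixes back in.
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n

# Locality check: adjacent integers differ in exactly one Gray-code bit.
assert all(bin(binary_to_gray(i) ^ binary_to_gray(i + 1)).count("1") == 1
           for i in range(255))
```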

Shinichi Yamada, Kourosh Neshatian
Investigation of a Simple Distance Based Ranking Metric for Decomposition-Based Multi/Many-Objective Evolutionary Algorithms

Multi-objective problems with more than three objectives, more commonly referred to as many-objective problems, have lately been a subject of significant research interest. Decomposition of the objective space is one of the most widely used approaches, where the original problem is decomposed into several single-objective sub-problems and solved collaboratively. The sub-problems are defined using reference vectors, to which candidate solutions are assigned based on some proximity measure (e.g. perpendicular distance, angle, etc.). The individuals attached to a given reference vector can thus be considered as a sub-population trying to solve that sub-problem. To create selection pressure among the members of the sub-population, several measures have been proposed in the past, such as weighted sum, penalty boundary intersection, achievement scalarizing function, and Tchebycheff. While competitive, some of them require parameters or reference points for implementation, which is far from ideal. The aim of this study is to investigate an alternative, less explored avenue: the use of distance-based ranking within a decomposition-based algorithm. Towards this end, we propose an improved version of an existing distance-based metric and embed it within a decomposition-based evolutionary algorithm (DBEA-MDR). We characterize its performance through comprehensive benchmarking on a range of regular and inverted DTLZ/WFG problems. While the performance of DBEA-MDR under conventional benchmarking practice (judging the quality of the final population) is not competitive with existing state-of-the-art algorithms, selecting a diverse set of solutions (of the same size as the population) from the archive significantly improves its performance, which in a number of cases surpasses that of the other algorithms. Based on these observations, apart from highlighting the scope for improvement in the presented strategy, the study also emphasizes the need to examine existing benchmarking practices further. In particular, instead of judging performance by the final population, a better approximation set could be drawn from the archive, and performance judged on such sets would be more reflective of the true performance of the algorithms.

Hemant Kumar Singh, Kalyan Shankar Bhattacharjee, Tapabrata Ray, Sanaz Mostaghim
Hierarchical Learning Classifier Systems for Polymorphism in Heterogeneous Niches

Learning classifier systems (LCSs) have been successfully adapted to real-world domains with the claim of human-readable rule populations. However, due to the inherent rich characteristic of the employed representation, it is possible to represent the underlying patterns in multiple (polymorphic) ways, which obscures the most informative patterns. A novel rule reduction algorithm is proposed based on ensembles of multiple trained LCSs populations in a hierarchical learning architecture to reduce the local diversity and global polymorphism. The primary aim of this project is to interrogate the hidden patterns in LCSs’ trained population rather than improve the predictive power on test sets. This enables successful visualization of the importance of features in data groups (niches) that can contain heterogeneous patterns, i.e. even if different patterns result in the same class the importance of features can be found.

Yi Liu, Will N. Browne, Bing Xue
A Cooperative Coevolutionary Algorithm for Real-Time Underground Mine Scheduling

We apply a cooperative coevolutionary algorithm for the real-time evolution of schedules in underground mines. The algorithm evolves simultaneously both truck dispatching and traffic light schedules for one-lane roads. The coevolutionary approach achieves high production with fewer trucks than both the widely-used DISPATCH algorithm, and commonly-used greedy heuristics.

Wesley Cox, Tim French, Mark Reynolds, Lyndon While
Particle Swarm Optimization for Feature Selection with Adaptive Mechanism and New Updating Strategy

Feature selection is an important data preprocessing technique in the emerging fields of artificial intelligence and data mining, which aims at finding a small set of features from the original dataset with respect to predetermined targets. Particle swarm optimization (PSO) has been widely used to address feature selection problems because of its easy implementation, efficiency and simplicity. However, in high-dimensional problems, its ability to select discriminative features that yield a high classification rate is limited. To address this issue, a particle swarm optimization method with an adaptive mechanism and a new updating strategy is proposed to choose the best features for improving the correct classification rate. The proposed approach, named EPSO, is verified and compared with three other meta-heuristic algorithms and four recent PSO-based feature selection methods. The experimental results and statistical tests demonstrate the efficiency and feasibility of the EPSO approach in obtaining higher classification accuracy with a smaller number of features. Therefore, the proposed EPSO algorithm can be successfully used as a novel feature selection strategy.

Ke Chen, Fengyu Zhou, Bing Xue
An Improved Genetic Programming Hyper-Heuristic for the Uncertain Capacitated Arc Routing Problem

This paper uses a Genetic Programming Hyper-Heuristic (GPHH) to evolve routing policies for the Uncertain Capacitated Arc Routing Problem (UCARP). Given a UCARP instance, the GPHH evolves feasible solutions in the form of decision making policies which decide the next task to serve whenever a vehicle completes its current service. Existing GPHH approaches have two drawbacks. First, they tend to generate small routes by routing through the depot and refilling prior to the vehicle being fully loaded. This usually increases the total cost of the solution. Second, existing GPHH approaches cannot control the extra repair cost incurred by a route failure, which may result in higher total cost. To address these issues, this paper proposes a new GPHH algorithm with a new No-Early-Refill filter to prevent generating small routes, and a novel Flood Fill terminal to better handle route failures. Experimental studies show that the newly proposed GPHH algorithm significantly outperforms the existing GPHH approaches on the Ugdb and Uval benchmark datasets. Further analysis has verified the effectiveness of both the new filter and terminal.

Jordan MacLachlan, Yi Mei, Juergen Branke, Mengjie Zhang
Uncovering Performance Envelopes Through Optimum Design of Tests

Test and evaluation is a process used to determine whether a product/system satisfies its performance specifications across its entire operating regime. The operating regime is typically defined using factors such as types of terrains/sea-states/altitudes, weather conditions, operating speeds, etc., and involves multiple performance metrics. With each test being expensive to conduct, and with multiple factors and performance metrics under consideration, the design of a test and evaluation schedule is far from trivial. Design of experiments (DOE) continues to be the most prevalent approach to deriving test plans, although there is significant opportunity to improve this practice through optimization. In this paper, we introduce a surrogate-assisted optimization approach to uncover the performance envelope with a small number of tests. The approach relies on principles of decomposition to deal with multiple performance metrics and employs a bi-directional search along each reference vector to identify the best and worst performance simultaneously. To limit the number of tests, the search is guided by multiple surrogate models. At every iteration the approach delivers a test plan involving at most $K_T$ tests, and the information acquired is used to generate future test plans. In order to evaluate the performance of the proposed approach, a set of scalable test functions with various Pareto front characteristics and objective space bias is introduced. The performance of the approach is quantitatively assessed and compared with two popular DOE strategies, namely Latin Hypercube Sampling (LHS) and Full Factorial Design (FFD). Further, we also demonstrate its practical use on a simulated catapult system.

Tapabrata Ray, Ahsanul Habib, Hemant Kumar Singh, Michael Ryan
Towards Fully Automated Semantic Web Service Composition Based on Estimation of Distribution Algorithm

Web service composition has been a challenging research area, where many researchers have worked on composition problems that optimize the Quality of Service and/or the quality of semantic matchmaking of composite solutions. This NP-hard problem has been successfully handled by many Evolutionary Computation techniques with promising results. Estimation of Distribution Algorithms (EDAs) have shown initial promise in solving fully automated service composition, and their success strongly relies on distribution models and sampling techniques. Our recently published work proposed a Node Histogram-based approach to fully automated service composition. However, many services present in sampled optimized queues do not contribute to the decoded solutions of those queues. Therefore, efforts should be made to focus on learning the distributions of component services in the solutions. Consequently, we aim to learn more suitable distributions by considering services that satisfy service dependency in the solutions, and we use an Edge Histogram Matrix to learn restricted sampled outcomes satisfying that dependency. Besides that, we propose effective sampling techniques with high efficiency in a straightforward implementation. Our experimental evaluation using benchmark datasets shows that our proposed EDA-based approach outperforms two recent approaches in both efficiency and effectiveness.

Chen Wang, Hui Ma, Gang Chen, Sven Hartmann
Genetic Programming with Multi-tree Representation for Dynamic Flexible Job Shop Scheduling

Flexible job shop scheduling (FJSS) can be regarded as an optimization problem in production scheduling that captures practical and challenging issues in real-world scheduling tasks such as order picking in manufacturing and cloud computing. Given a set of machines and jobs, FJSS aims to determine which machine will process a particular job (the routing rule) and which job a particular machine will process next (the sequencing rule). In addition, dynamic changes are unavoidable in real-world applications. These features make real-time scheduling difficult. Genetic programming (GP) is well known for the flexibility of its representation, and tree-based GP is widely and typically used to evolve priority functions for different decisions. However, a key issue for the tree-based representation is how to capture both the routing and sequencing rules simultaneously. To address this issue, we propose to use multi-tree GP (MTGP) to evolve routing and sequencing rules together. To enhance the performance of the MTGP algorithm, a novel tree swapping crossover operator is proposed and embedded into MTGP. The results suggest that the multi-tree representation can achieve much better performance, with smaller rules and less training time, than cooperative co-evolution for GP in solving dynamic FJSS problems. Furthermore, the proposed tree swapping crossover operator can greatly improve the performance of MTGP.

Fangfang Zhang, Yi Mei, Mengjie Zhang

Knowledge Representation and Reasoning

Frontmatter
Adaptive Inference on Probabilistic Relational Models

Standard approaches for inference in probabilistic relational models include lifted variable elimination (LVE) for single queries. To efficiently handle multiple queries, the lifted junction tree algorithm (LJT) uses a first-order cluster representation of a model, employing LVE as a subroutine in its steps. Adaptive inference concerns efficient inference under changes in a model. If the model changes, LJT restarts, possibly unnecessarily discarding information. The purpose of this paper is twofold: (i) to adapt the cluster representation to incremental changes, and (ii) to transform LJT into an adaptive version, enabling LJT to preserve as many computations as possible. Adaptive LJT quickly returns to answering queries after changes, which is especially important for time-critical applications or online query answering.

Tanya Braun, Ralf Möller
Reinterpretation with Systems of Spheres

Communicating agents in open environments such as the semantic web face the problem of inter-ontological ambiguity, i.e., the problem that one agent uses a (constant, role or concept) name differently than another agent. In this paper, we propose a strategy for online ambiguity resolution relying on the ideas of belief revision and reinterpretation. The data structures guiding the conflict resolution are systems of spheres, which, in particular, allow a resolution result to be selected from among other potential results. The paper defines operators for (iterated) reinterpretation based on systems of spheres and shows that they fulfill a desirable set of properties (postulates).

Özgür Lütfü Özçep
Sparse Approximation for Gaussian Process with Derivative Observations

We propose a sparse Gaussian process model to approximate a full Gaussian process with derivatives when a large number of function observations $t$ and derivative observations $t'$ exist. By introducing a small number $m$ of inducing points, the complexity of posterior computation can be reduced from $\mathcal{O}((t+t')^{3})$ to $\mathcal{O}((t+t')m^{2})$. We also demonstrate the usefulness of our approach in Bayesian optimisation. Experiments demonstrate the superiority of our approach.
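The complexity reduction follows from the usual inducing-point construction, in which the full covariance over the $t+t'$ function and derivative observations is replaced by a low-rank (Nyström-type) form; a sketch under that standard assumption:

$$K \;\approx\; K_{\cdot m}\, K_{mm}^{-1}\, K_{m\cdot},$$

so posterior computations only require factorising the $m \times m$ matrix $K_{mm}$ plus matrix products over the $t+t'$ observations, giving the stated $\mathcal{O}((t+t')m^{2})$ cost.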

Ang Yang, Cheng Li, Santu Rana, Sunil Gupta, Svetha Venkatesh
Counterfactual Inference with Hidden Confounders Using Implicit Generative Models

In observational studies, a key problem is to estimate the causal effect of a treatment on some outcome. Counterfactual inference tries to handle it by directly learning the treatment exposure surfaces. One of the biggest challenges in counterfactual inference is the existence of unobserved confounders, which are latent variables that affect both the treatment and outcome variables. Building on recent advances in latent variable modelling and efficient Bayesian inference techniques, deep latent variable models, such as variational auto-encoders (VAEs), have been used to ease this challenge by learning the latent confounders from the observations. However, for the sake of tractability, the posterior over the latent variables used in existing methods is assumed to be Gaussian with a diagonal covariance matrix. This specification is quite restrictive and may even contradict the underlying truth, limiting the quality of the resulting generative models and of the causal effect estimation. In this paper, we propose to take advantage of implicit generative models to circumvent this limitation by using black-box inference models. To make inference for the implicit generative model with an intractable likelihood, we adopt recent implicit variational inference based on adversarial training to obtain a close approximation to the true posterior. Experiments on simulated and real data show that the proposed method matches the state of the art.

Fujin Zhu, Adi Lin, Guangquan Zhang, Jie Lu
On the Use of Matrix Based Representation to Deal with Automatic Composer Recognition

In this article, the use of a matrix-based representation of pieces is tested for the classification of musical pieces by some well-known classical composers. The pieces in two corpora have been represented in two ways: as matrices of interval pair probabilities, and as a set of 12 global features which had previously been used in a similar task. The classification accuracies of both representations have been computed using several supervised classification algorithms. A class binarization technique has also been applied to study how the accuracies change with this kind of method. Promising results have been obtained, showing that both the matrix representation and the class binarization techniques are suitable for the automatic composer recognition problem.

Izaro Goienetxea, Iñigo Mendialdua, Basilio Sierra
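As a rough illustration of what a matrix of interval-pair probabilities can look like (our reconstruction from the abstract, not the authors' exact definition), one can count consecutive pairs of melodic intervals and normalise:

import numpy as np

def interval_pair_matrix(pitches, lo=-12, hi=12):
    # Melodic intervals in semitones, clipped to a fixed range.
    iv = np.clip(np.diff(pitches), lo, hi) - lo
    n = hi - lo + 1
    M = np.zeros((n, n))
    for a, b in zip(iv[:-1], iv[1:]):       # consecutive interval pairs
        M[a, b] += 1
    return M / max(M.sum(), 1)              # normalise to pair probabilities

# e.g. a scale fragment as MIDI pitches:
print(interval_pair_matrix([60, 62, 64, 65, 67]).sum())  # 1.0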
Flood-Fill Q-Learning Updates for Learning Redundant Policies in Order to Interact with a Computer Screen by Clicking

We present a specialisation of Q-learning for the problem of training an agent to click on a computer screen. In this problem formulation the agent sees the pixels of the screen as input, and selects a pixel as output. The task of selecting a pixel to click on involves selecting an action from a large discrete action space in which many of the actions are completely equivalent in terms of reinforcement learning state transitions. We propose to exploit this by performing simultaneous Q-learning updates for equivalent actions. We use the flood fill algorithm on the input image to determine the action (pixel) equivalence.

Nathaniel du Preez-Wilkinson, Marcus Gallagher, Xuelei Hu
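A minimal sketch of the shared update, assuming Q is a per-pixel value array for the current state and the clicked action is a (row, column) pixel; the 4-connected flood fill and the hyperparameters are illustrative:

import numpy as np

def flood_fill(img, seed):
    # All pixels 4-connected to seed that share the seed's value.
    h, w = img.shape
    target, seen, stack = img[seed], {seed}, [seed]
    while stack:
        r, c = stack.pop()
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < h and 0 <= nc < w and (nr, nc) not in seen \
                    and img[nr, nc] == target:
                seen.add((nr, nc))
                stack.append((nr, nc))
    return seen

def q_update_equivalent(Q, screen, action, reward, q_next_max,
                        alpha=0.1, gamma=0.99):
    # One TD target, written to every pixel equivalent to the clicked one.
    td = reward + gamma * q_next_max
    for rc in flood_fill(screen, action):
        Q[rc] += alpha * (td - Q[rc])

screen = np.array([[0, 0, 1],
                   [0, 1, 1]])
Q = np.zeros(screen.shape)
q_update_equivalent(Q, screen, (0, 0), reward=1.0, q_next_max=0.0)
# Q[0, 0], Q[0, 1] and Q[1, 0] all received the same update.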
Answering Multiple Conjunctive Queries with the Lifted Dynamic Junction Tree Algorithm

The lifted dynamic junction tree algorithm (LDJT) answers filtering and prediction queries efficiently for probabilistic relational temporal models by building and then reusing a first-order cluster representation of a knowledge base for multiple queries and time steps. We extend LDJT to answer conjunctive queries over multiple time steps by avoiding eliminations, while keeping the complexity of answering a conjunctive query low. The extended version of LDJT saves computations compared to an existing approach for answering multiple lifted conjunctive queries.

Marcel Gehrke, Tanya Braun, Ralf Möller
Preventing Unnecessary Groundings in the Lifted Dynamic Junction Tree Algorithm

The lifted dynamic junction tree algorithm (LDJT) answers filtering and prediction queries efficiently for probabilistic relational temporal models by building and then reusing a first-order cluster representation of a knowledge base for multiple queries and time steps. Unfortunately, a non-ideal elimination order can lead to groundings. We extend LDJT (i) to identify unnecessary groundings and (ii) to prevent groundings by delaying eliminations through changes in a temporal first-order cluster representation. The extended version of LDJT answers multiple temporal queries orders of magnitude faster than the original version.

Marcel Gehrke, Tanya Braun, Ralf Möller

Machine Learning and Data Mining

Frontmatter
Feature Standardisation in Symbolic Regression

While standardisation of variables is a common practice for many machine learning algorithms, it is rarely seen in the literature on genetic programming for symbolic regression. This paper compares the predictive performance of unscaled and standardised genetic programming, using artificial datasets and benchmark problems. Linear scaling is also applied to genetic programming for these problems. We show that unscaled genetic programming provides worse predictive performance than genetic programming augmented by linear scaling and/or standardisation as it is highly sensitive to the magnitude and range of explanatory or response variables. While linear scaling does provide better predictive performance on the simple artificial datasets, we attribute much of its success to an implicit standardisation within the predictive model. For benchmark problems, the combination of linear scaling and standardisation provides greater stability than only applying linear scaling to genetic programming. Also, for many of the simple artificial datasets, unscaled genetic programming produces larger individuals, which is undesirable in the search for parsimonious models.

Caitlin A. Owen, Grant Dick, Peter A. Whigham
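Both transformations compared in the paper are simple to state; a sketch (ours) of z-score standardisation and of linear scaling, which fits an intercept and slope mapping a GP individual's raw output g onto the response y:

import numpy as np

def standardise(X):
    # z-score each feature: zero mean, unit variance.
    mu, sd = X.mean(0), X.std(0)
    return (X - mu) / np.where(sd > 0, sd, 1.0)

def linear_scale(g, y):
    # Least-squares a, b such that y ~ a + b * g
    # (Keijzer-style linear scaling of a GP individual's output).
    b = np.cov(g, y, bias=True)[0, 1] / max(np.var(g), 1e-12)
    a = y.mean() - b * g.mean()
    return a, b

g = np.array([1.0, 2.0, 3.0]); y = np.array([3.0, 5.0, 7.0])
print(linear_scale(g, y))   # (1.0, 2.0)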
Genetic Programming with Interval Functions and Ensemble Learning for Classification with Incomplete Data

Missing values are an unavoidable issue in many real-world datasets. Classification with incomplete data has to be addressed carefully, because inadequate treatment often leads to large classification errors. Interval genetic programming (IGP) is an approach that directly uses genetic programming to evolve an effective and efficient classifier for incomplete data. This paper proposes a method to improve IGP for classification with incomplete data by integrating IGP with ensemble learning to build a set of classifiers. Experimental results show that integrating IGP and ensemble learning to evolve a set of classifiers for incomplete data can achieve better accuracy than IGP alone. The proposed method is also more accurate than other common methods for classification with incomplete data.

Cao Truong Tran, Mengjie Zhang, Bing Xue, Peter Andreae
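The interval idea underlying IGP is easy to illustrate (our sketch, not the authors' code): a missing feature is propagated through the evolved expression as a whole interval, while an observed value is the degenerate interval (v, v):

def iv_add(a, b):
    # Interval addition: [a0 + b0, a1 + b1].
    return (a[0] + b[0], a[1] + b[1])

def iv_mul(a, b):
    # Interval multiplication: extremes over all endpoint products.
    ps = (a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1])
    return (min(ps), max(ps))

# A missing feature enters as (feature_min, feature_max) over the training
# data; an observed value v enters as (v, v).
print(iv_add((1, 1), iv_mul((0, 2), (3, 3))))   # (1, 7): 1 + [0, 2] * 3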
Stochastic Conjugate Gradient Descent Twin Support Vector Machine for Large Scale Pattern Classification

With the advent of technology, the amount of data available for learning is increasing day by day. However, machine learning algorithms such as Support Vector Machines (SVMs), while effective, are slow in dealing with this huge inflow of information. Recent research has largely focussed on increasing the scalability of machine learning algorithms, including algorithmic-level speed-ups such as TWSVM [10] and LS-SVM [18], and training-level speed-ups such as the Newton-Armijo method [12], the Coordinate Descent Method [8], etc. Among these, recently proposed stochastic gradient based methods have attracted significant attention. However, these methods suffer from the inherent problems of the stochastic gradient methodology, such as ill-conditioning and slow convergence near minima. In this paper, we propose a Stochastic Conjugate Gradient Descent based Twin Support Vector Machine (SCG-TWSVM) which improves upon the limitations of the Stochastic Gradient Descent Support Vector Machine (SG-SVM) and the Stochastic Gradient Twin Support Vector Machine (SG-TWSVM), and leads to a more robust, effective and generalizable classifier. We also extend our proposed classifier to the non-linear case by using the kernel trick. We perform extensive experiments on a variety of machine learning benchmark datasets as well as real-world datasets, which demonstrate the efficacy of our proposed approach on large scale problems compared to related methods.

Sweta Sharma, Reshma Rastogi
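Under the hood, twin SVM training reduces to solving regularised linear systems, and the conjugate-gradient core is classical; a minimal deterministic CG solver for Ax = b with symmetric positive-definite A (the stochastic variant in the paper additionally samples data at each step, which is not shown):

import numpy as np

def conjugate_gradient(A, b, tol=1e-8, max_iter=None):
    # Solve Ax = b for symmetric positive-definite A.
    x = np.zeros_like(b)
    r = b - A @ x            # residual
    p = r.copy()             # search direction
    rs = r @ r
    for _ in range(max_iter or len(b)):
        Ap = A @ p
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p   # conjugate direction update
        rs = rs_new
    return x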
A Novel Synthetic Over-Sampling Technique for Imbalanced Classification of Gene Expressions Using Autoencoders and Swarm Optimization

A new synthetic minority class over-sampling approach for binary (normal/cancer) classification of microarray gene expression data is proposed. The idea is to exploit a previously trained autoencoder in combination with the Particle Swarm Optimisation algorithm to generate new synthetic examples of the minority class for solving the class imbalance problem. Experiments using two different autoencoder representation sizes (500 and 30) and two base classifiers (Support Vector Machine and naïve Bayes) show that the proposed method is able to generate discriminating representations that outperformed state-of-the-art methods such as Synthetic Minority Class Over-sampling Technique and Density-Based Synthetic Minority Class Over-sampling Technique in many test cases.

Maisa Daoud, Michael Mayo
Co-evolution of Novel Tree-Like ANNs and Activation Functions: An Observational Study

Deep convolutional neural networks (CNNs) represent the state-of-the-art model structure for image classification problems. However, deep CNNs suffer from issues of interpretability and are difficult to train. This work presents new tree-like shallow ANNs, and offers a novel approach to exploring and examining the relationship between activation functions and network performance. The proposed work is examined on the MNIST and CIFAR10 datasets, finding surprising results relating to the necessity and benefit of activation functions in this new type of shallow network. In particular, the work finds high-accuracy networks for the MNIST dataset which utilise pooling operations as the only non-linearity, and demonstrates a certain invariance to the specific form of activation function on the more complicated CIFAR10 dataset.

Damien O’Neill, Bing Xue, Mengjie Zhang
Lift-Per-Drift: An Evaluation Metric for Classification Frameworks with Concept Drift Detection

Data streams with concept drift change over time. Detecting drift allows remedial action, but this can come at a cost, e.g. training a new classifier. Prequential accuracy is commonly used to evaluate the impact of drift detection frameworks on data stream classification, but recent work shows that frequent periodic drift detection can provide better accuracy than state-of-the-art drift detection techniques. We discuss how sequentiality, the degree of consecutive matching class labels across instances, allows high accuracy without a classifier learning to differentiate classes. We propose a novel metric: lift-per-drift (lpd). This measures drift detection performance through its impact on classification accuracy, penalised by the number of drifts detected in a dataset. This metric solves three problems: lpd cannot be increased by periodic, frequent drifts; lpd clearly shows when using drift detection increases classifier error; and lpd does not require knowledge of where real drifts occurred. We show how lpd can be tuned to be sensitive to the cost of each drift. Our experiments show that lpd is not artificially increased through sequentiality; that lpd highlights when drift detection has caused a loss in accuracy; and that it is sensitive to changes in true-positive and false-positive drift detection rates.

Robert Anderson, Yun Sing Koh, Gillian Dobbie
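The paper defines lpd precisely; purely as an illustration of the properties listed above (and not necessarily the authors' exact formula), a metric of the following shape behaves as described: it is negative when detection hurts accuracy, shrinks under frequent detections, and exposes a per-drift cost knob:

def lift_per_drift_like(acc_with, acc_without, n_drifts, drift_cost=1.0):
    # Hypothetical formalisation: accuracy lift over a no-detection
    # baseline, penalised by the number of detected drifts.
    lift = acc_with - acc_without        # negative if detection hurts
    return lift / (1.0 + drift_cost * n_drifts)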
Genetic Programming Based on Granular Computing for Classification with High-Dimensional Data

Classification tasks become more challenging in the presence of the curse of dimensionality. Recently, there has been an increasing number of datasets with thousands of features. Some classification algorithms often need feature selection to avoid the curse of dimensionality. Genetic programming (GP) has shown success in classification tasks. GP does not require a separate feature selection step because of its built-in capability to automatically select informative features. However, GP-based methods are often computationally intensive if a good classification accuracy is to be achieved. Based on perspectives from granular computing (GrC), this paper proposes a new approach to linking features hierarchically for GP-based classification. Experiments on seven high-dimensional datasets show the effectiveness of the proposed algorithm in terms of saving training time and enhancing classification accuracy, compared to baseline methods.

Wenbin Pei, Bing Xue, Lin Shang, Mengjie Zhang
Random-Sets for Dealing with Uncertainties in Relevance Feature

Most relevance discovery models only consider document-level evidence, which may introduce uncertainties to relevance features. Research in information retrieval shows that adopting passage-level (i.e. paragraph-level) evidence can improve the performance of different models in various retrieval tasks. This paper proposes an innovative and effective relevance method based on paragraph evidence to reduce uncertainties in the relevance features discovered by existing models. The method exploits latent topics in the relevance feedback collection to estimate the implicit paragraph relevance. It uses random sets to effectively model the intricate relationships between paragraphs, topics and features to deal with the associated uncertainties. Experiments are conducted using the standard RCV1 dataset, its TREC filtering collections and six popular performance measures. The results confirm that the proposed Uncertainty Reduction (UR) method can significantly enhance the performance of 12 models for relevance feature selection.

Abdullah Semran Alharbi, Md Abul Bashar, Yuefeng Li
Multiple-Task Learning and Knowledge Transfer Using Generative Adversarial Capsule Nets

It is common for practical data to have multiple attributes of interest. For example, a picture can be characterized in terms of its content, e.g. the categories of the objects in the picture, while at the same time the image style, such as photo-realistic or artistic, is also relevant. This work is motivated by taking advantage of all available sources of information about the data, including those not directly related to the target of analytics. We propose an explicit and effective knowledge representation and transfer architecture for image analytics by employing Capsules for deep neural network training based on generative adversarial nets (GAN). The adversarial scheme helps discover capsule representations of data with different semantic meanings in the respective dimensions of the capsules. The data representation includes one subset of variables that are particularly specialized for the target task, obtained by eliminating information about the irrelevant aspects. We theoretically show the elimination by mixing conditional distributions of the represented data. Empirical evaluations show the proposed method is effective for both standard transfer-domain recognition tasks and zero-shot transfer.

Ancheng Lin, Jun Li, Lujuan Zhang, Zhenyuan Ma, Weiqi Luo
Link Prediction via Factorization Machines

Link prediction is the problem of predicting the existence of edges in a network. It is a fundamental research problem in graph mining and has numerous applications in social networks, bioinformatics, e-commerce, etc. A major challenge for link prediction is that real-world networks are becoming extremely large. The large network size leads to huge sparsity in the network's adjacency matrix, on which most existing link prediction methods (such as matrix factorization) rely. Moreover, when networks become very large, there exists a non-trivial link imbalance problem, where the numbers of known present and known absent links are significantly different. Such sparsity and imbalance issues significantly decrease the performance of existing link prediction methods. To address these challenges, we propose a Balanced Factorization Machine (BFM), which performs link prediction on very sparse networks by learning interactions among nodes and edges of the network in a supervised learning setting. Through extensive experiments on real-world network datasets, we show that our BFM method significantly outperforms existing link prediction methods.

Lile Li, Wei Liu
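The factorization machine scoring function itself is standard (Rendle's model, computable in linear time via a well-known identity); in link prediction the feature vector x would typically one-hot encode the two endpoint nodes plus any edge features. The balancing scheme that makes the method a Balanced FM is the paper's contribution and is not shown:

import numpy as np

def fm_score(x, w0, w, V):
    # 2-way FM: w0 + <w, x> + sum_{i<j} <v_i, v_j> x_i x_j,
    # computed in O(kn) via 0.5 * sum_f [(Vx)_f^2 - (V^2 x^2)_f].
    linear = w0 + w @ x
    s = V.T @ x                      # (k,)
    s2 = (V ** 2).T @ (x ** 2)       # (k,)
    return linear + 0.5 * np.sum(s ** 2 - s2)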
Discovering Granger-Causal Features from Deep Learning Networks

In this research, we propose deep networks that discover Granger causes from multivariate temporal data generated in financial markets. We introduce a Deep Neural Network (DNN) and a Recurrent Neural Network (RNN) that discover Granger-causal features for bivariate regression on bivariate time series data distributions. These features are subsequently used to discover Granger-causal graphs for multivariate regression on multivariate time series data distributions. Our supervised feature learning process in proposed deep regression networks has favourable F-tests for feature selection and t-tests for model comparisons. The experiments, minimizing root mean squared errors in the regression analysis on real stock market data obtained from Yahoo Finance, demonstrate that our causal features significantly improve the existing deep learning regression models.

Aneesh Sreevallabh Chivukula, Jun Li, Wei Liu
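The underlying notion is the classical Granger test: x Granger-causes y if adding lags of x improves a regression of y on its own lags. A minimal linear version using an F statistic is sketched below (ours); the paper's deep variant replaces these least-squares regressions with DNN/RNN regressors:

import numpy as np

def lagmat(s, p):
    # Columns s[t-1], ..., s[t-p] aligned with targets s[p:].
    return np.column_stack([s[p - k: len(s) - k] for k in range(1, p + 1)])

def granger_f(y, x, p=2):
    t = y[p:]
    Zr = np.column_stack([np.ones(len(t)), lagmat(y, p)])   # restricted
    Zu = np.column_stack([Zr, lagmat(x, p)])                # + lags of x
    rss = lambda Z: np.sum((t - Z @ np.linalg.lstsq(Z, t, rcond=None)[0]) ** 2)
    rss_r, rss_u = rss(Zr), rss(Zu)
    df2 = len(t) - Zu.shape[1]
    return ((rss_r - rss_u) / p) / (rss_u / df2)   # large F: x helps predict y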
A New Family of Generative Adversarial Nets Using Heterogeneous Noise to Model Complex Distributions

Generative adversarial nets (GANs) are an effective framework for constructing data models and enjoy desirable theoretical justification. On the other hand, realizing GANs for practical complex data distributions often requires careful configuration of the generator, discriminator, objective function and training method, and can involve much non-trivial effort. We propose a novel family of generative adversarial nets, where we employ both continuous noise and random binary codes in the generating process. The binary codes in the new GAN model (named BGAN) play the role of categorical latent variables, which helps improve model capability and training stability when dealing with complex data distributions. BGAN has been evaluated and compared with existing GANs trained with state-of-the-art methods on both synthetic and practical data. The empirical evaluation shows the effectiveness of BGAN.

Ancheng Lin, Jun Li, Lujuan Zhang, Lei Shi, Zhenyuan Ma
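The heterogeneous noise input is simple to sketch (our reading of the abstract): the generator consumes a concatenation of continuous Gaussian noise and random binary codes, the latter acting as categorical latent variables. Sizes here are illustrative:

import numpy as np

def heterogeneous_noise(batch, n_cont=100, n_bits=10, rng=None):
    rng = rng or np.random.default_rng()
    z = rng.normal(size=(batch, n_cont))           # continuous noise
    c = rng.integers(0, 2, size=(batch, n_bits))   # random binary codes
    return np.concatenate([z, c], axis=1)          # generator input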
Twin Bounded Large Margin Distribution Machine

In order to speed up the learning time of the large margin distribution machine (LDM) and improve the generalization performance of the twin bounded support vector machine (TBSVM), a novel method named the twin bounded large margin distribution machine (TBLDM) is proposed in this paper. The central idea of TBLDM is to seek a pair of nonparallel hyperplanes by optimizing the positive and negative margin distributions on the basis of TBSVM. The experimental results indicate that the proposed TBLDM is a fast, effective and robust classifier.

Haitao Xu, Brendan McCane, Lech Szymanski
Concept Drift Detector Selection for Hoeffding Adaptive Trees

Dealing with evolving data requires strategies for detecting and quantifying change, and for forgetting irrelevant examples during the model revision process. Designing an adaptive classifier that is suitable for different types of streams requires us to understand the characteristics of the data stream. Current adaptive classifiers have built-in concept drift detectors used as an estimator at each node. Our research aim is to investigate the use of different drift detectors for the Hoeffding Adaptive Tree (HAT), an adaptive classifier. We propose three variants of the classifier, called HAT$$_{SEED}$$, HAT$$_{HDDM_A}$$, and HAT$$_{PHT}$$.

Moana Stirling, Yun Sing Koh, Philippe Fournier-Viger, Sri Devi Ravana
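Of the drift estimators compared, the Page-Hinkley test behind HAT$$_{PHT}$$ is the easiest to state; a sketch of the classic test for an upward shift in a stream (e.g. of 0/1 classification errors), with illustrative delta and lambda values:

class PageHinkley:
    # Page-Hinkley test for an increase in the stream mean.
    def __init__(self, delta=0.005, lam=50.0):
        self.delta, self.lam = delta, lam
        self.n, self.mean, self.cum, self.cum_min = 0, 0.0, 0.0, 0.0

    def update(self, x):
        self.n += 1
        self.mean += (x - self.mean) / self.n      # running mean
        self.cum += x - self.mean - self.delta     # cumulative deviation
        self.cum_min = min(self.cum_min, self.cum)
        return self.cum - self.cum_min > self.lam  # True => drift detected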

Planning and Scheduling

Frontmatter
Evolutionary Multitask Optimisation for Dynamic Job Shop Scheduling Using Niched Genetic Programming

Dynamic job shop scheduling (DJSS) problems are combinatorial optimisation problems where dynamic events occur during processing that prevent scheduling algorithms from predicting the optimal solutions in advance. DJSS problems have been studied extensively due to the difficulty of the problem and their applicability to real-world scenarios. This paper deals with a DJSS problem with dynamic job arrivals and machine breakdowns. A standard genetic programming (GP) approach that evolves dispatching rules, which is effective for DJSS problems with dynamic job arrivals, has difficulty generalising over problem instances with different machine breakdown scenarios. This paper proposes a niched GP approach that incorporates multitasking to simultaneously evolve multiple rules that can effectively cope with different machine breakdown scenarios. The results show that the niched GP approach can evolve rules for the different machine breakdown scenarios faster than the combined computation times of the benchmark GP approach, and that these rules significantly outperform the benchmark GP's evolved rules. The analysis shows that the specialist rules effective for DJSS problem instances with zero machine breakdowns behave differently from the rules effective for DJSS problem instances with machine breakdowns and from the generalist rules, but there is also large variance in the behaviours of the zero-machine-breakdown specialist rules.

John Park, Yi Mei, Su Nguyen, Gang Chen, Mengjie Zhang
Autonomous Vehicle Constrained Path Planning Using Opal for Dynamic Networks

Mobile communications networks may be required to operate in highly dynamic environments. We consider a communications network in an urban scenario where a UAV is used as a radio relay between mobile ground-based nodes. The UAV's flight path is constrained by buildings, such that ground-based nodes can lose connectivity. We present the Opal system to generate, through multi-objective optimisation, network solutions for such a scenario. Opal is shown to develop novel behaviours which are effective in reducing network disconnection time.

Kin-Ping Hui, Damien Phillips, Asanka Kekirigoda, Alan Allwright
A Dynamic Planner for Object Assembly Tasks Based on Learning the Spatial Relationships of Its Parts from a Single Demonstration

In this paper, we propose a general system for enabling robots to generate assembly plans for assisting people during assembly tasks. Such a plan is derived from a 3D occupancy grid that the system generates while observing a person performing an assembly task. Our proposed system uses the acquired 3D occupancy grid and a graph search to generate an assembly plan. This plan is used to guide users during assembly tasks to create a similar object. If the user deviates from the suggested plan, our system automatically validates whether the new state is solvable or not and reacts accordingly. Forward assembly planning is an NP-hard problem, but we introduce pruning methods for the search tree that make the approach practical.

Ahmed Abbas, Frederic Maire, Sareh Shirazi, Feras Dayoub, Markus Eich
Surrogate-Assisted Genetic Programming for Dynamic Flexible Job Shop Scheduling

Genetic programming (GP) has been widely used for automatically evolving priority rules for solving job shop scheduling problems. However, one of the main drawbacks of GP is its intensive computation time. This paper investigates appropriate surrogates for GP to reduce its computation time without sacrificing its performance in solving dynamic flexible job shop scheduling (DFJSS) problems. Firstly, an adaptive surrogate strategy with dynamic fidelities of simulation models is proposed. Secondly, we propose a generation-range-based surrogate strategy in which homogeneous (heterogeneous) surrogates are used in the same (different) ranges of generations. The results show that these two surrogate strategies with GP are efficient: the computation time is reduced by 22.9% to 27.2% and 32.6% to 36.0%, respectively. The test performance shows that the proposed approaches can obtain rules of at least similar quality to the rules obtained by the GP approach without surrogates. Moreover, GP with adaptive surrogates achieves significantly better performance in one out of six scenarios. This paper confirms the potential of using surrogates to solve DFJSS problems.

Fangfang Zhang, Yi Mei, Mengjie Zhang
Virtual Roundabout Protocol for Autonomous Vehicles

This paper investigates intersection management protocols in an environment in which all vehicles are autonomous and capable of communication. Three potential protocols for intersection control were considered and implemented as a multi-robot system under the Robot Operating System platform. Although these protocols mimic conventional control mechanisms for human driving, their behaviour under autonomous driving is different. We found that the virtual roundabout, a protocol under which all vehicles follow the real-world rules of roundabouts without a physical roundabout, is the most effective of the three protocols with respect to traffic load, safety distance and traffic imbalances.

Jianglin Qiao, Dongmo Zhang, Dave de Jonge
Multi-objective Container Consolidation in Cloud Data Centers

In recent years, container-based clouds have become increasingly popular for their lightweight nature. Existing works on container consolidation mainly focus on reducing the energy consumption of cloud data centers. However, reducing energy consumption often results in container migrations, which have a big impact on the performance (i.e. availability) of the applications in the containers. In this paper, we treat container consolidation as a multi-objective optimization problem with the objectives of minimizing the total energy consumption and minimizing the total number of container migrations within a certain period of time, and present an NSGA-II based algorithm to find solutions to the container consolidation problem. Our experimental evaluation, based on real-world workloads, demonstrates that our proposed approach can lead to further energy savings and a significant reduction in container migrations at the same time, compared with existing approaches.

Tao Shi, Hui Ma, Gang Chen
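The bi-objective bookkeeping at the heart of an NSGA-II approach is dominance checking; a sketch over candidate consolidation plans scored as (energy, migrations), both minimised:

def dominates(a, b):
    # a dominates b iff a is no worse in every objective and better in one.
    return all(x <= y for x, y in zip(a, b)) and \
           any(x < y for x, y in zip(a, b))

def pareto_front(plans):
    # plans: list of (energy_kwh, n_migrations) tuples, both minimised.
    return [p for p in plans
            if not any(dominates(q, p) for q in plans if q != p)]

print(pareto_front([(10, 5), (8, 9), (12, 4), (9, 9)]))
# -> [(10, 5), (8, 9), (12, 4)]; (9, 9) is dominated by (8, 9)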

Text Mining and NLP

Frontmatter
Enhancing Decision Boundary Setting for Binary Text Classification

Text classification is the task of assigning a set of text documents to predefined classes based on a classifier that learns from training samples, labelled or unlabelled. Binary text classifiers provide a way to separate related documents from a large dataset. However, existing binary text classifiers are not grounded in reality due to the issue of overfitting. They try to find a clear boundary between relevant and irrelevant objects rather than understand the decision boundary. Normally, the decision boundary cannot be described as a clear boundary because of the numerous uncertainties in text documents. This paper attempts to address this issue by proposing an effective model based on the sliding window (SW) technique and Support Vector Machines (SVMs) to deal with the uncertain boundary and to improve the effectiveness of binary text classification. The model sets the decision boundary by dividing the training documents into three distinct regions (positive, boundary, and negative) to ensure the certainty of the extracted knowledge describing relevant information. It then organizes training samples for the learning task to build a classifier based on multiple SVMs. Experimental results on the standard Reuters Corpus Volume 1 (RCV1) dataset and TREC topics for text classification show that the proposed model significantly outperforms six state-of-the-art baseline models in binary text classification.

Aisha Rashed Albqmi, Yuefeng Li, Yue Xu
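One plausible way to realise the three-region split (our sketch; the paper derives the region boundaries from its sliding-window technique rather than the fixed cut-offs assumed here) is to partition training documents by an initial SVM's decision score:

import numpy as np
from sklearn.svm import LinearSVC

def three_regions(X, y, low=-0.5, high=0.5):
    # Partition training documents by a first SVM's decision score into
    # negative, boundary and positive regions (thresholds are assumptions).
    scores = LinearSVC().fit(X, y).decision_function(X)
    return scores < low, (scores >= low) & (scores <= high), scores > high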
Stability of Word Embeddings Using Word2Vec

The word2vec model has been previously shown to be successful in creating numerical representations of words (word embeddings) that capture the semantic and syntactic meanings of words. This study examines the issue of model stability in terms of how consistent these representations are given a specific corpus and set of model parameters. Specifically, the study considers the impact of word embedding dimension size and frequency of words on stability. Stability is measured by comparing the neighborhood of words in the word vector space model. Our results demonstrate that the dimension size of word embeddings has a significant effect on the consistency of the model. In addition, the effect of the frequency of the target words on stability is identified. An approach to mitigate the effects of word frequency on stability is proposed.

Mansi Chugh, Peter A. Whigham, Grant Dick
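Neighbourhood-based stability can be operationalised as top-k neighbour overlap between two word2vec runs; a numpy sketch (ours), assuming E1 and E2 are embedding matrices from two runs sharing the same vocabulary indexing:

import numpy as np

def topk_neighbours(E, i, k=10):
    # Cosine nearest neighbours of word i in embedding matrix E (V, d).
    En = E / np.linalg.norm(E, axis=1, keepdims=True)
    sims = En @ En[i]
    sims[i] = -np.inf                    # exclude the word itself
    return set(np.argsort(-sims)[:k])

def neighbourhood_stability(E1, E2, i, k=10):
    # Jaccard overlap of word i's top-k neighbourhoods across two runs.
    a, b = topk_neighbours(E1, i, k), topk_neighbours(E2, i, k)
    return len(a & b) / len(a | b)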
Using Word Embeddings with Linear Models for Short Text Classification

Text documents often contain information relevant for a particular domain in short “snippets”. The social science field of peace and conflict studies is such a domain, where identifying, classifying and tracking drivers of conflict from text sources is important, and snippets are typically classified by human analysts using an ontology. One issue in automating this process is that snippets tend to contain infrequent “rare” terms which lack class-conditional evidence. In this work we develop a method to enrich a bag-of-words model by complementing rare terms in the text to be classified with related terms from a Word Vector model. This method is then combined with standard linear text classification algorithms. By reducing sparseness in the bag-of-words, these enriched models perform better than the baseline classifiers. A second issue is to improve performance on “small” classes having only a few examples, and here we show that Paragraph Vectors outperform the enriched models.

Alfred Krzywicki, Bradford Heap, Michael Bain, Wayne Wobcke, Susanne Schmeidl
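A schematic version of the enrichment step (our reconstruction): augment each rare token in a snippet with its nearest neighbours from a word vector model before building the bag-of-words. Here neighbours() is a hypothetical lookup into such a model:

def enrich(tokens, freq, neighbours, min_freq=5, k=3):
    # Complement rare terms with k related terms from a word-vector model.
    out = []
    for t in tokens:
        out.append(t)
        if freq.get(t, 0) < min_freq:       # "rare" term: little evidence
            out.extend(neighbours(t, k))    # hypothetical embedding lookup
    return out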
Towards Compiling Textbooks from Wikipedia

In this paper, we explore challenges in compiling a pedagogic resource like a textbook on a given topic from relevant Wikipedia articles, and present an approach towards assisting humans in this task. We present an algorithm that attempts to suggest the textbook structure from Wikipedia based on a set of seed concepts (chapters) provided by the user. We also conceptualize a decision support system where users can interact with the proposed structure and the corresponding Wikipedia content to improve its pedagogic value. The proposed algorithm is implemented and evaluated against the outline of online textbooks on five different subjects. We also propose a measure to quantify the pedagogic value of the suggested textbook structure.

Ditty Mathew, Sutanu Chakraborti
Measuring Language Complexity Using Word Embeddings

The analysis of word patterns from a corpus has previously been examined using a number of different word embedding models. These models create a numeric representation of word co-occurrence and are able to capture some of the syntactic and semantic relationships of words in a document. Assessing language complexity has been considered for many years through the use of simple indexes and basic statistical properties (word frequency, etc.); however, little work has been done on using word embeddings to develop language complexity measures. This paper describes preliminary work on measuring language complexity using clustered word embeddings to produce network transition models. The structural measures of these transition networks are shown to represent basic properties of language complexity and may be used to infer some aspects of the underlying generative grammar.

Peter A. Whigham, Mansi Chugh, Grant Dick
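A minimal sketch of the pipeline described (our reconstruction; sklearn's KMeans is assumed): cluster the vocabulary embeddings into states, map the text to a state sequence, estimate a transition matrix, and take its entropy rate as one structural complexity measure:

import numpy as np
from sklearn.cluster import KMeans

def transition_entropy(text_ids, vocab_vecs, k=8, seed=0):
    # text_ids: np.array of vocabulary row indices for the running text.
    states = KMeans(n_clusters=k, random_state=seed,
                    n_init=10).fit_predict(vocab_vecs)
    seq = states[text_ids]                          # text as a state sequence
    T = np.zeros((k, k))
    for a, b in zip(seq[:-1], seq[1:]):             # count state transitions
        T[a, b] += 1
    P = T / np.maximum(T.sum(1, keepdims=True), 1)  # row-stochastic matrix
    pi = T.sum(1) / max(T.sum(), 1)                 # empirical state weights
    logP = np.log2(np.where(P > 0, P, 1))           # 0 * log 0 := 0
    return -np.sum(pi[:, None] * P * logP)          # entropy rate, bits/step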
Backmatter
Metadata
Title
AI 2018: Advances in Artificial Intelligence
Editors
Tanja Mitrovic
Bing Xue
Xiaodong Li
Copyright Year
2018
Electronic ISBN
978-3-030-03991-2
Print ISBN
978-3-030-03990-5
DOI
https://doi.org/10.1007/978-3-030-03991-2
