
About this Book

The papers in this volume are the refereed papers presented at AI-2016, the Thirty-sixth SGAI International Conference on Innovative Techniques and Applications of Artificial Intelligence, held in Cambridge in December 2016 in both the technical and the application streams.

They present new and innovative developments and applications, divided into technical stream sections on Knowledge Discovery and Data Mining, Sentiment Analysis and Recommendation, Machine Learning, AI Techniques, and Natural Language Processing, followed by application stream sections on AI for Medicine and Disability, Legal Liability and Finance, Telecoms and eLearning, and Genetic Algorithms in Action. The volume also includes the text of short papers presented as posters at the conference.

This is the thirty-third volume in the Research and Development in Intelligent Systems series, which also incorporates the twenty-fourth volume in the Applications and Innovations in Intelligent Systems series. These series are essential reading for those who wish to keep up to date with developments in this important field.



Research and Development in Intelligent Systems XXXIII


Harnessing Background Knowledge for E-Learning Recommendation

The growing availability of good quality, learning-focused content on the Web makes it an excellent source of resources for e-learning systems. However, learners can find it hard to retrieve material well-aligned with their learning goals because of the difficulty in assembling effective keyword searches due to both an inherent lack of domain knowledge, and the unfamiliar vocabulary often employed by domain experts. We take a step towards bridging this semantic gap by introducing a novel method that automatically creates custom background knowledge in the form of a set of rich concepts related to the selected learning domain. Further, we develop a hybrid approach that allows the background knowledge to influence retrieval in the recommendation of new learning materials by leveraging the vocabulary associated with our discovered concepts in the representation process. We evaluate the effectiveness of our approach on a dataset of Machine Learning and Data Mining papers and show it to outperform the benchmark methods.
Blessing Mbipom, Susan Craw, Stewart Massie

Knowledge Discovery and Data Mining


Category-Driven Association Rule Mining

The quality of rules generated by ontology-driven association rule mining algorithms is constrained by the algorithm’s effectiveness in exploiting the usually large ontology in the mining process. We present a framework built around superimposing a hierarchical graph structure on a given ontology to divide the rule mining problem into disjoint subproblems whose solutions can be iteratively joined to find global associations. We present a new metric for evaluating the interestingness of generated rules based on where their constructs fall within the ontology. Our metric is anti-monotonic on subsets, making it usable in an Apriori-like algorithm which we present here. The algorithm categorises the ontology into disjoint subsets utilising the hierarchical graph structure and uses the metric to find associations in each, joining the results using the guidance of anti-monotonicity. The algorithm optionally embeds built-in definitions of user-specified filters to reflect user preferences. We evaluate the resulting model using a large collection of patient health records.
Zina M. Ibrahim, Honghan Wu, Robbie Mallah, Richard J. B. Dobson
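The anti-monotonicity property exploited here is the same one that powers classical Apriori pruning: if a metric fails on an itemset, it fails on every superset, so candidates need only be grown from itemsets that survived the previous level. The sketch below illustrates this with plain support as the metric; it is not the authors' ontology-driven algorithm, and the transaction data is invented for illustration.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Minimal Apriori-style frequent itemset miner.

    Support is anti-monotone: every subset of a frequent itemset is
    frequent, so size-k candidates are grown only from frequent
    (k-1)-itemsets, and pruned if any (k-1)-subset is infrequent.
    """
    def support(itemset):
        return sum(1 for t in transactions if itemset <= t) / len(transactions)

    items = {frozenset([i]) for t in transactions for i in t}
    current = {s for s in items if support(s) >= min_support}
    result = {s: support(s) for s in current}
    k = 2
    while current:
        # join step: merge pairs of (k-1)-itemsets into k-itemsets
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # prune step: all (k-1)-subsets must already be frequent
        candidates = {c for c in candidates
                      if all(frozenset(sub) in result
                             for sub in combinations(c, k - 1))}
        current = {c for c in candidates if support(c) >= min_support}
        result.update({c: support(c) for c in current})
        k += 1
    return result
```

With five toy transactions over items a, b, c and a 0.6 support threshold, all singletons and pairs survive but the triple {a, b, c} is pruned.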

A Comparative Study of SAT-Based Itemsets Mining

Mining frequent itemsets from transactional datasets is a well-known problem, and various methods have been studied to address it. Recently, original proposals have emerged from the cross-fertilisation between data mining and artificial intelligence. In these declarative approaches, the itemset mining problem is modelled either as a constraint network or as a propositional formula whose models correspond to the patterns of interest. In this paper, we focus on the propositional satisfiability (SAT) based itemset mining framework. Our main goal is to enhance the efficiency of SAT model enumeration algorithms, an issue that is crucial for the scalability and competitiveness of such declarative itemset mining approaches. In this context, we analyse in depth the effect of the different SAT solver components on the efficiency of model enumeration. Our analysis covers the main components of modern SAT solvers, such as restarts, activity-based variable ordering heuristics and the clause learning mechanism. Through extensive experiments, we show that these classical components play an essential role in improving performance by pushing forward the efficiency of SAT solvers. More precisely, our experimental evaluation includes a comparative study of enumerating all the models corresponding to the closed frequent itemsets, and is extended to the Top-k itemset mining problem.
Imen Ouled Dlala, Said Jabbour, Lakhdar Sais, Boutheina Ben Yaghlane
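To illustrate the underlying task only (not the solvers studied in the paper), enumerating all models of a CNF formula can be done naively by testing every assignment; a real SAT-based itemset miner replaces this with incremental solving plus blocking clauses, which is exactly where restarts, branching heuristics and clause learning matter.

```python
from itertools import product

def enumerate_models(clauses, n_vars):
    """Brute-force model enumeration for a CNF formula.

    Literals are nonzero integers, negative meaning negated
    (e.g. [[1, 2], [-1, -2]] is (x1 or x2) and (not x1 or not x2)).
    A stand-in for the blocking-clause enumeration of a real solver.
    """
    models = []
    for bits in product([False, True], repeat=n_vars):
        assign = {i + 1: b for i, b in enumerate(bits)}
        # a clause is satisfied if any of its literals is true
        if all(any(assign[abs(l)] == (l > 0) for l in c) for c in clauses):
            models.append(assign)
    return models
```

For the two-variable "exactly one" formula above, only the two assignments setting one variable true survive.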

Mining Frequent Movement Patterns in Large Networks: A Parallel Approach Using Shapes

This paper presents the Shape based Movement Pattern (ShaMP) algorithm, an algorithm for extracting Movement Patterns (MPs) from network data that can later be used, for example, for prediction purposes. The principal advantage offered by the ShaMP algorithm is that it lends itself to parallelisation, so that very large networks can be processed. The concept of MPs is fully defined together with the realisation of the ShaMP algorithm. The algorithm is evaluated by comparing its operation with a benchmark Apriori based approach, the Apriori based Movement Pattern (AMP) algorithm, using artificial networks and large social networks generated from the Cattle Tracking System (CTS) in operation in Great Britain (GB).
Mohammed Al-Zeyadi, Frans Coenen, Alexei Lisitsa

Sentiment Analysis and Recommendation


Emotion-Corpus Guided Lexicons for Sentiment Analysis on Twitter

Conceptual frameworks for emotion-to-sentiment mapping have been proposed in Psychology research. In this paper we study this mapping from a computational modelling perspective, with a view to establishing the role of an emotion-rich corpus in lexicon-based sentiment analysis. We propose two different methods which harness an emotion-labelled corpus of tweets to learn word-level numerical quantification of sentiment strengths over a positive-to-negative spectrum. The proposed methods model the emotion corpus using a generative unigram mixture model (UMM), combined with the emotion-sentiment mapping proposed in Psychology (Cambria et al., 28th AAAI Conference on Artificial Intelligence, pp. 1515–1521, 2014) [1], for automated generation of sentiment lexicons. Sentiment analysis experiments on benchmark Twitter data sets confirm the quality of our proposed lexicons. Further, a comparative analysis with standard sentiment lexicons suggests that the proposed lexicons lead to significantly better performance in both sentiment classification and sentiment intensity prediction tasks.
Anil Bandhakavi, Nirmalie Wiratunga, Stewart Massie, P. Deepak
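A much simpler stand-in for the paper's generative unigram mixture model conveys the idea of corpus-derived sentiment strengths: map each emotion label to a polarity (the mapping below is a hypothetical miniature of the Psychology-derived mapping the paper cites) and score words by their relative frequency under each polarity.

```python
from collections import Counter

# Hypothetical emotion-to-polarity mapping, a miniature of the kind of
# emotion-sentiment mapping the paper builds on.
EMOTION_POLARITY = {"joy": +1, "trust": +1, "anger": -1, "sadness": -1}

def build_lexicon(labelled_tweets):
    """Score each word on a [-1, +1] spectrum from an emotion-labelled corpus.

    Simplified relative-frequency sketch (not the paper's UMM):
    score(w) = (pos_count - neg_count) / (pos_count + neg_count).
    """
    pos, neg = Counter(), Counter()
    for text, emotion in labelled_tweets:
        counter = pos if EMOTION_POLARITY[emotion] > 0 else neg
        counter.update(text.lower().split())
    lexicon = {}
    for word in set(pos) | set(neg):
        p, n = pos[word], neg[word]
        lexicon[word] = (p - n) / (p + n)
    return lexicon
```

On three invented tweets, "great" scores +1.0, "terrible" scores -1.0, and a word seen under both polarities such as "day" lands at 0.0.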

Context-Aware Sentiment Detection from Ratings

The explosion of user-generated content, especially tweets and customer reviews, makes it possible to build sentiment lexicons automatically by harnessing the consistency between the content and its accompanying emotional signal, whether explicit or implicit. In this work we describe novel techniques for automatically producing domain-specific sentiment lexicons that are optimised for the language patterns and idioms of a given domain. We describe how we use review ratings as sentiment signals. We also describe an approach to recognising contextual variations in sentiment and show how these variations can be exploited in practice. We evaluate these ideas in a number of different product domains.
Yichao Lu, Ruihai Dong, Barry Smyth

Recommending with Higher-Order Factorization Machines

The information about customers accumulated by the big players of internet business is incredibly large. The main purpose of collecting these data is to provide customers with suitable offers in order to increase sales and profit. Recommender systems cope with these large amounts of data and have thus become an important factor of success for many companies. One promising approach to generating capable recommendations is the Factorization Machine. This paper presents an approach to extending the basic 2-way Factorization Machine model to higher-order interactions. We show how to implement the necessary additional term for 3-way interactions in the model equation so as to retain the advantage of linear complexity. Furthermore, we carry out a simulation study which demonstrates that modeling 3-way interactions improves the prediction quality of a Factorization Machine.
Julian Knoll
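The linear-complexity trick the paper extends to 3-way interactions is standard for the 2-way case: the pairwise interaction term can be rewritten so it is computed in O(nk) rather than O(n²k). A sketch of the identity, with a naive O(n²) reference for comparison (illustrative only, not the paper's extended model):

```python
def fm_pairwise(x, v):
    """2-way Factorization Machine interaction term in linear time.

    Uses the standard identity
      sum_{i<j} <v_i, v_j> x_i x_j
        = 0.5 * sum_f [ (sum_i v_{i,f} x_i)^2 - sum_i (v_{i,f} x_i)^2 ]
    so the cost is O(n * k) instead of O(n^2 * k).
    """
    n, k = len(v), len(v[0])
    total = 0.0
    for f in range(k):
        s = sum(v[i][f] * x[i] for i in range(n))
        s_sq = sum((v[i][f] * x[i]) ** 2 for i in range(n))
        total += 0.5 * (s * s - s_sq)
    return total

def fm_pairwise_naive(x, v):
    """O(n^2) reference: explicit sum over pairs i < j."""
    n, k = len(v), len(v[0])
    return sum(
        sum(v[i][f] * v[j][f] for f in range(k)) * x[i] * x[j]
        for i in range(n) for j in range(i + 1, n)
    )
```

The two functions agree exactly; the paper's contribution is deriving an analogous linear-complexity term for the 3-way case.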

Machine Learning


Multitask Learning for Text Classification with Deep Neural Networks

Multitask learning, the concept of solving multiple related tasks in parallel, promises to improve generalization performance over the traditional divide-and-conquer approach in machine learning. The training signals of related tasks induce a bias that helps to find better hypotheses. This paper reviews the concept of multitask learning and prior work on it. An experimental evaluation is conducted on a large-scale text classification problem: a deep neural network is trained to classify English newswire stories by their overlapping topics in parallel. The results are compared to the traditional approach of training a separate deep neural network for each topic, and confirm the initial hypothesis that multitask learning improves generalization.
Hossein Ghodrati Noushahr, Samad Ahmadi

An Investigation on Online Versus Batch Learning in Predicting User Behaviour

An investigation into how to produce fast and accurate predictions of user behaviour on the Web is conducted. First, the problem of predicting user behaviour is formulated as a classification task, and the main challenges of such real-time prediction are specified: the accuracy and the time complexity of the prediction. Second, a method for comparing online and batch (offline) algorithms used for user behaviour prediction is proposed. Last, the performance of these algorithms is empirically explored using data from a popular question-and-answer platform, Stack Overflow. It is demonstrated that a simple online learning algorithm outperforms state-of-the-art batch algorithms and performs as well as a deep learning algorithm, Deep Belief Networks. The proposed comparison method, together with the experimental evidence provided, can be used to choose a machine learning set-up for predicting user behaviour on the Web in scenarios where accuracy and time performance are the main concerns.
Nikolay Burlutskiy, Miltos Petridis, Andrew Fish, Alexey Chernov, Nour Ali
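The online-versus-batch contrast can be made concrete with the classic perceptron, an online algorithm that updates after each example and never revisits old data; a batch learner would instead refit on the full history. This is a generic sketch on invented data, not the specific algorithms compared in the paper.

```python
def perceptron_online(stream, n_features, lr=1.0):
    """Online learning sketch: one pass over the stream, one cheap
    update per example, constant memory in the number of examples seen
    (the model never revisits old data, unlike a batch learner)."""
    w = [0.0] * n_features
    b = 0.0
    for x, y in stream:          # labels y in {-1, +1}
        score = sum(wi * xi for wi, xi in zip(w, x)) + b
        if y * score <= 0:       # mistake-driven update
            w = [wi + lr * y * xi for wi, xi in zip(w, x)]
            b += lr * y
    return w, b
```

On a small linearly separable stream a single pass already yields a separator, which is the appeal of online methods when prediction latency matters.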

A Neural Network Test of the Expert Attractor Hypothesis: Chaos Theory Accounts for Individual Variance in Learning

By positing that complex, abstract memories can be formalised as network attractors, the present paper introduces chaos theory into the field of psychological learning and, in particular, into the field of expertise acquisition. The expert attractor hypothesis is that the cortical re-organisation of biological networks via neural plasticity leads to a stable state that implements the memory template underpinning expert performance. An artificial neural network model of chess players’ strategic thinking, termed Templates for Expert Knowledge Simulation, was used to simulate, in 500 individuals, the learning of 128 positions belonging to 8 different chess strategies. The behavioural performance of the system as a novice and as an expert, and its variance in learning, are all in line with psychological findings. Crucially, the distribution of weights, the learning curves, and the evolution of the distribution of weights support the attractor hypothesis. Following a discussion of the psychological implications of the simulations, the next steps towards unravelling the chaotic features of the human mind are evoked.
P. Chassy

AI Techniques


A Fast Algorithm to Estimate the Square Root of Probability Density Function

A fast maximum likelihood estimator based on a linear combination of Gaussian kernels is introduced to represent the square root of a probability density function. It is shown that, if the kernel centres and kernel width are known, then the underlying problem can be formulated as a Riemannian optimization problem. The first-order Riemannian geometry of the sphere manifold and vector transport are explored, and the well-known Riemannian conjugate gradient algorithm is then used to estimate the model parameters. For completeness, the k-means clustering algorithm and a grid search are applied to determine the centres and the kernel width respectively. Illustrative examples demonstrate that the proposed approach is effective in constructing the estimate of the square root of a probability density function.
Xia Hong, Junbin Gao

3Dana: Path Planning on 3D Surfaces

An important issue when planning the tasks that a mobile robot has to perform is the path that it has to follow. Classical path planning algorithms focus on minimizing the total distance, generally assuming a flat terrain. Newer approaches also include traversability cost maps to define the terrain characteristics. However, this approach may generate unsafe paths in realistic environments, as the terrain relief is lost in the discretisation. In this paper we focus on the path planning problem when dealing with a Digital Terrain Model (DTM). Over such a DTM we have developed 3Dana, an any-angle path planning algorithm. The objective is to obtain candidate paths that may be longer than the ones obtained with classical algorithms, but safer. In 3Dana we can also consider other parameters to improve path quality: the maximum slope allowed by the robot and the heading changes along the path. These constraints allow infeasible paths to be discarded while minimizing the heading changes of the robot. To demonstrate the effectiveness of the proposed algorithm, we present results for paths obtained on real Mars DTMs.
Pablo Muñoz, María D. R-Moreno, Bonifacio Castaño

Natural Language Processing


Covert Implementations of the Turing Test: A More Level Playing Field?

It has been suggested that a covert Turing Test, possibly in a virtual world, provides a more level playing field for a chatbot, and hence an earlier opportunity to pass the Turing Test (or an equivalent) in its overt, declared form. This paper examines two recent covert Turing Tests in order to test this hypothesis. In one test (at Loyola Marymount), run as a covert-singleton test, 39 of the 50 subjects who talked to the chatbot avatar did not identify that the avatar was being driven by a chatbot (a 78 % deception rate). In a more recent experiment at the University of Worcester, groups of students took part in a set of problem-based learning chat sessions, each group having an undeclared chatbot. Not one participant volunteered the fact that a chatbot was present (a 100 % deception rate). However, the chatbot character was generally seen as the least engaged participant—highlighting that a chatbot needs to concentrate on achieving legitimacy once it can successfully escape detection.
D. J. H. Burden, M. Savin-Baden, R. Bhakta

Context-Dependent Pattern Simplification by Extracting Context-Free Floating Qualifiers

Qualification may occur anywhere within a temporal utterance. To reduce the ensuing pattern complexity for context-dependent systems such as Enguage™, it is necessary to remove the qualified value from the utterance, rendering the utterance atemporal and presenting the value as the contextual variable when. This is possible because a qualifier—at 7:30 or until today—is immediately recognisable as such if preceding a time value: when is context-free. This appropriation gives insight into the nature of the context-dependent processing of habitual natural language. While the difference between the resultant concepts—how many coffees do I have and how old am I—is perhaps not that great despite their differing origins, this work ensures the mediation system remains practical and effective. This research is informed by a prototype for the health-tech app Memrica Prompt, which supports independent living for people with early-stage dementia.
M. J. Wheatman

Short Papers


Experiments with High Performance Genetic Programming for Classification Problems

In recent years there have been many papers concerned with significantly improving the computational speed of Genetic Programming (GP) through the exploitation of parallel hardware. The benefits of timeliness, or of being able to consider larger datasets, are obvious. However, a question remains as to whether there are wider benefits to this high performance GP approach. Consequently, this paper investigates leveraging this performance by using a higher degree of evolution and ensemble approaches, in order to discern whether any improvement in classification accuracy can be achieved from high performance GP, thereby advancing the technique itself.
Darren M. Chitty

Towards Expressive Modular Rule Induction for Numerical Attributes

The Prism family is an alternative set of predictive data mining algorithms to the more established decision tree algorithms. Prism classifiers are more expressive and user friendly than decision trees and achieve a similar accuracy, even outperforming decision trees in some cases, especially where there is noise and there are clashes in the training data. However, Prism algorithms still tend to overfit on noisy data; this has led to the development of pruning methods which allow the Prism algorithms to generalise better over the dataset. The work presented in this paper aims to address the problem of overfitting at the rule induction stage for numerical attributes by proposing a new numerical rule term structure based on the Gaussian probability density function. This new rule term structure is not only expected to lead to a more robust classifier, but also lowers the computational requirements, as fewer rule terms need to be induced.
Manal Almutairi, Frederic Stahl, Mathew Jennings, Thien Le, Max Bramer
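The proposed rule-term idea, summarising a numerical attribute with one Gaussian-based interval rather than many threshold terms, can be sketched as follows. This is an illustrative reading of the abstract, not the authors' exact formulation; the width multiplier k is an invented parameter.

```python
import math

def gaussian_rule_term(values, k=2.0):
    """Fit a Gaussian to the numeric attribute values covered by a rule's
    target class and express a single rule term as the interval
    mean +/- k * std, instead of searching many '< v' / '>= v' cut points.

    Returns (lo, hi, predicate) where predicate tests membership."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    std = math.sqrt(var)
    lo, hi = mean - k * std, mean + k * std
    return lo, hi, (lambda x: lo <= x <= hi)
```

One such interval term replaces a pair of threshold terms, which is where the claimed reduction in induced rule terms comes from.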

OPEN: New Path-Planning Algorithm for Real-World Complex Environment

This paper tackles the single-source shortest-path problem in the challenging context of navigation through real-world, natural environments such as a ski area, where traditional on-site sign posts may be limited or unavailable. For this purpose, we propose a novel approach to planning the shortest path in a directed acyclic graph (DAG) built on geo-location data mapped from available web databases through Google Maps and/or Google Earth. Our new path-planning algorithm, which we call OPEN, is run against this resulting graph and provides the optimal path in a computationally efficient way. Our approach was demonstrated on real-world cases, and it outperforms state-of-the-art path-planning algorithms.
J. I. Olszewska, J. Toman
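While OPEN itself is not reproduced here, the baseline task it addresses, single-source shortest paths in a weighted DAG, admits an O(V + E) solution by relaxing edges in topological order, with no priority queue. A sketch on an invented four-node graph:

```python
from collections import defaultdict, deque

def dag_shortest_path(edges, source, target):
    """Single-source shortest path in a weighted DAG by relaxing edges
    in topological order (Kahn's algorithm), O(V + E).

    edges: dict node -> list of (neighbour, weight).
    Returns (distance, path), or (inf, []) if target is unreachable."""
    indeg = defaultdict(int)
    nodes = set(edges)
    for node, out in edges.items():
        for nb, _ in out:
            indeg[nb] += 1
            nodes.add(nb)
    queue = deque(n for n in nodes if indeg[n] == 0)
    dist = {n: float("inf") for n in nodes}
    prev = {}
    dist[source] = 0.0
    while queue:
        node = queue.popleft()
        for nb, w in edges.get(node, []):
            if dist[node] + w < dist[nb]:   # relax edge in topological order
                dist[nb] = dist[node] + w
                prev[nb] = node
            indeg[nb] -= 1
            if indeg[nb] == 0:
                queue.append(nb)
    if dist[target] == float("inf"):
        return float("inf"), []
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return dist[target], path[::-1]
```

Because a DAG has no cycles, a single topological sweep suffices; this is the structural advantage a DAG-based planner can exploit over general graph search.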

Encoding Medication Episodes for Adverse Drug Event Prediction

Understanding the interplay among the multiple factors leading to Adverse Drug Reactions (ADRs) is crucial to increasing drug effectiveness, individualising drug therapy and reducing incurred cost. In this paper, we propose a flexible encoding mechanism that can effectively capture the dynamics of multiple medication episodes of a patient at any given time. We enrich the encoding with a drug ontology and patient demographics data and use it as a base for an ADR prediction model. We evaluate the resulting predictive approach under different settings using real anonymised patient data obtained from the EHR of the South London and Maudsley (SLaM), the largest mental health provider in Europe. Using the profiles of 38,000 mental health patients, we identified 240,000 affirmative mentions of dry mouth, constipation and enuresis and 44,000 negative ones. Our approach achieved 93 % prediction accuracy and 93 % F-Measure.
Honghan Wu, Zina M. Ibrahim, Ehtesham Iqbal, Richard J. B. Dobson

Applications and Innovations in Intelligent Systems XXIV


A Genetic Algorithm Based Approach for the Simultaneous Optimisation of Workforce Skill Sets and Team Allocation

In large organisations with multi-skilled workforces, the continued optimisation and adaptation of the skill sets of each of the engineers in the workforce is very important. However, a change in skill sets can have an impact on an engineer's usefulness to any team: if an engineer has skills easily obtainable by others in the team, that engineer might be more useful in a neighbouring team where those skills are scarce. A typical way to handle skilling and resource movement would be to perform them in isolation. This is a sub-optimal way of optimising the workforce overall, as better combinations can be found if the effect of upskilling some of the workforce is also evaluated against the resultant move recommendations at the time the solutions are being evaluated. This paper presents a genetic algorithm based system for the optimal selection of engineers to be upskilled, with simultaneous suggestions of engineers who should swap teams. The results show that combining team moves and engineer upskilling in the same optimisation process leads to an increase in coverage across the region. The combined optimisation produces better coverage than only moving engineers between teams, only upskilling engineers, or performing both operations in isolation. The developed system has been deployed in BT's iPatch optimisation system, with improvements integrated from stakeholder feedback.
A. J. Starkey, H. Hagras, S. Shakya, G. Owusu

Legal Liability, Medicine and Finance


Artificial Intelligence and Legal Liability

A recent issue of a popular computing journal asked which laws would apply if a self-driving car killed a pedestrian. This paper considers the question of legal liability for artificially intelligent computer systems. It discusses whether criminal liability could ever apply; to whom it might apply; and, under civil law, whether an AI program is a product that is subject to product design legislation or a service to which the tort of negligence applies. The issue of sales warranties is also considered. A discussion of some of the practical limitations that AI systems are subject to is also included.
J. K. C. Kingston

SELFBACK—Activity Recognition for Self-management of Low Back Pain

Low back pain (LBP) is the most significant contributor to years lived with disability in Europe and results in significant financial cost to European economies. Guidelines for the management of LBP have self-management at their cornerstone, where patients are advised against bed rest, and to remain active. In this paper, we introduce SELFBACK, a decision support system used by the patients themselves to improve and reinforce self-management of LBP. SELFBACK uses activity recognition from wearable sensors in order to automatically determine the type and level of activity of a user. This is used by the system to automatically determine how well users adhere to prescribed physical activity guidelines. Important parameters of an activity recognition system include windowing, feature extraction and classification. The choices of these parameters for the SELFBACK system are supported by empirical comparative analyses which are presented in this paper. In addition, two approaches are presented for detecting step counts for ambulation activities (e.g. walking and running) which help to determine activity intensity. Evaluation shows the SELFBACK system is able to distinguish between five common daily activities with 0.9 macro-averaged F1 and detect step counts with 6.4 and 5.6 root mean squared error for walking and running respectively.
Sadiq Sani, Nirmalie Wiratunga, Stewart Massie, Kay Cooper
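The windowing and feature-extraction stages mentioned in the abstract follow a common pattern: slice the sensor stream into overlapping windows and summarise each with simple statistics. This is a generic sketch of that pattern; the SELFBACK system's actual parameter choices are precisely what the paper's empirical comparison investigates.

```python
def sliding_windows(signal, window, overlap):
    """Split a 1-D sensor stream into overlapping fixed-length windows."""
    step = window - overlap
    return [signal[i:i + window]
            for i in range(0, len(signal) - window + 1, step)]

def window_features(w):
    """Summarise one window with simple time-domain statistics
    (mean, variance, max, min) for a downstream activity classifier."""
    n = len(w)
    mean = sum(w) / n
    var = sum((x - mean) ** 2 for x in w) / n
    return [mean, var, max(w), min(w)]
```

For a 10-sample stream with a window of 4 and an overlap of 2, this yields four windows, each reduced to a short feature vector.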

Automated Sequence Tagging: Applications in Financial Hybrid Systems

Internal data published by a firm regarding its financial position, governance, people and reaction to market conditions are all believed to impact the underlying company's valuation. With an abundance of heterogeneous information, coupled with the ever-increasing processing power of machines, narrow AI applications are now managing investment positions and making decisions on behalf of humans. As unstructured data becomes more common, disambiguating structure from text-based documents remains an attractive research goal in the Finance and Investment industry. Statistical approaches are considered high risk in industrial applications, and deterministic methods are typically preferred. In this paper we experiment with hybrid (ensemble) approaches to Named Entity Recognition in order to reduce the implementation and run-time risk involved with modern stochastic methods.
Peter Hampton, Hui Wang, William Blackburn, Zhiwei Lin

Telecoms and E-Learning


A Method of Rule Induction for Predicting and Describing Future Alarms in a Telecommunication Network

In order to gain insights into the events and issues that may cause alarms in parts of IP networks, intelligent methods that capture and express causal relationships are needed. Methods that are both predictive and descriptive are rare, and those that do predict are often limited to using a single feature from a vast data set. This paper follows the progression of a Rule Induction Algorithm that produces rules with strong causal links, rules that are both descriptive and able to predict events ahead of time. The algorithm is based on an information-theoretic approach to extracting rules comprising a conjunction of network events that are significant prior to network alarms. An empirical evaluation of the algorithm is provided.
Chris Wrench, Frederic Stahl, Thien Le, Giuseppe Di Fatta, Vidhyalakshmi Karthikeyan, Detlef Nauck

Towards Keystroke Continuous Authentication Using Time Series Analytics

An approach to Keystroke Continuous Authentication (KCA) is described founded on a time series analysis based approach that, unlike previous work on KCA (using feature vector representations), takes the sequencing of keystrokes into consideration. The significance of KCA is in the context of online assessments and examinations used in eLearning environments and MOOCs, which are becoming increasingly popular. The process is fully described and analysed, including comparison with established feature vector approaches. Our proposed method outperforms these other approaches to KCA (with a detection accuracy of 94 %, compared to 79.53 %), a clear indicator that the proposed time series analysis based KCA has significant potential.
Abdullah Alshehri, Frans Coenen, Danushka Bollegala
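A standard way to compare keystroke-timing sequences while preserving their ordering, which flat feature vectors discard, is dynamic time warping. The abstract does not state the paper's exact time series measure, so treat this as an illustration of the general idea only.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences,
    e.g. successive key-hold or flight times. Unlike a flat feature
    vector, the alignment respects the order of keystrokes."""
    inf = float("inf")
    n, m = len(a), len(b)
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of the three admissible alignments
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]
```

Identical or merely stretched typing rhythms score near zero, while a genuinely different rhythm scores high, which is the basis for flagging an imposter mid-session.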

Genetic Algorithms in Action


EEuGene: Employing Electroencephalograph Signals in the Rating Strategy of a Hardware-Based Interactive Genetic Algorithm

We describe a novel interface and development platform for an interactive Genetic Algorithm (iGA) that uses Electroencephalograph (EEG) signals as an indication of fitness for selection for successive generations. A gaming headset was used to generate EEG readings corresponding to attention and meditation states from a single electrode. These were communicated via Bluetooth to an embedded iGA implemented on the Arduino platform. The readings were taken to measure subjects’ responses to predetermined short sequences of synthesised sound, although the technique could be applied to any appropriate problem domain. The prototype provided sufficient evidence to indicate that use of the technology in this context is viable. However, the approach taken was limited by the technical characteristics of the equipment used and only provides a proof of concept at this stage. We discuss some of the limitations of using biofeedback systems and suggest possible improvements that might be made with more sophisticated EEG sensors and other biofeedback mechanisms.
C. James-Reynolds, E. Currie

Spice Model Generation from EM Simulation Data Using Integer Coded Genetic Algorithms

In the electronics industry, circuits often use passive planar structures, such as coils and transmission line elements. In contrast to discrete components, these structures are distributed and show a frequency response which is difficult to model. The typical way to characterise such structures is with matrices that describe their behaviour in the frequency domain. These matrices, also known as S-parameters, do not provide any insight into the actual physics of the planar structures. When simulations in the time domain are required, a network representation is more suitable than S-parameters. In this research, a network is generated that exhibits the same frequency response as the given S-parameters whilst allowing for a fast and exact simulation in the time domain. For this, it is necessary to find optimum component values for the network, which is achieved in this work by using an integer coded genetic algorithm with power mutation. It is shown that the proposed method is capable of finding near-optimal solutions within reasonable computation time. The advantage of this method is that it uses small networks with fewer passive components, compared to traditional methods that produce much larger networks comprising many active and passive devices.
Jens Werner, Lars Nolle
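A minimal integer-coded GA conveys the optimisation loop. Note that this sketch uses plain random-reset mutation rather than the power mutation operator of the paper, and the three "component values" are an invented toy target rather than a real network response.

```python
import random

def integer_ga(fitness, bounds, pop_size=40, generations=80, pm=0.1, seed=1):
    """Minimal integer-coded GA: tournament selection, one-point
    crossover, elitism, and random-reset mutation (standing in for the
    paper's power mutation). bounds: inclusive (lo, hi) per gene."""
    rng = random.Random(seed)
    def rand_ind():
        return [rng.randint(lo, hi) for lo, hi in bounds]
    pop = [rand_ind() for _ in range(pop_size)]
    for _ in range(generations):
        nxt = sorted(pop, key=fitness)[:2]                 # elitism
        while len(nxt) < pop_size:
            a = min(rng.sample(pop, 3), key=fitness)       # tournament
            b = min(rng.sample(pop, 3), key=fitness)
            cut = rng.randrange(1, len(bounds))
            child = a[:cut] + b[cut:]                      # one-point crossover
            for g, (lo, hi) in enumerate(bounds):
                if rng.random() < pm:                      # random-reset mutation
                    child[g] = rng.randint(lo, hi)
            nxt.append(child)
        pop = nxt
    return min(pop, key=fitness)

# Toy surrogate problem: recover three integer "component values".
target = [12, 7, 3]
best = integer_ga(lambda ind: sum((g - t) ** 2 for g, t in zip(ind, target)),
                  bounds=[(0, 20)] * 3)
```

In the paper's setting the fitness would instead compare the network's simulated frequency response against the given S-parameters.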

Short Papers


Dendritic Cells for Behaviour Detection in Immersive Virtual Reality Training

This paper presents cross-disciplinary research in artificial intelligence (AI) and virtual reality (VR), describing a real application of aircraft door operation training conducted in an immersive virtual environment. In the context of the study, AI plays an imperative role in detecting trainee misbehaviour, such as inappropriate steps and positions, in a virtual training environment that mimics a real training scenario. Trainees’ behaviours are detected by the classical Dendritic Cell Algorithm (DCA), a signal-based classification approach inspired by the segmented detection of, and interaction with, signal molecules in human dendritic cells. The resulting approach demonstrated accurate detection and classification, as evidenced in the experimental studies. This position paper also reveals the ability of the DCA method in human behaviour detection and classification in a dynamic environment.
N. M. Y. Lee, H. Y. K. Lau, R. H. K. Wong, W. W. L. Tam, L. K. Y. Chan

Interactive Evolutionary Generative Art

Interactive Genetic Algorithms may be used as an interface for exploring the solution spaces of generative computer art systems. In this paper, we describe the application of “Eugene”, a hardware interactive Genetic Algorithm controller, to the production of fractal and algorithmic images; Processing was used to generate the images. In tests with users, it was found that there was adequate exploration of the solution space, although the results were not always as the user expected. Additional testing explored this, and the results indicate that the mapping strategies used in encoding the solution are important if the interface allows for goal-oriented user tasks.
L. Hernandez Mengesha, C. J. James-Reynolds

Incorporating Emotion and Personality-Based Analysis in User-Centered Modelling

Understanding complex user behaviour under various conditions, scenarios and journeys is fundamental to improving the user-experience for a given system. Predictive models of user reactions, responses—and in particular, emotions—can aid in the design of more intuitive and usable systems. Building on this theme, the preliminary research presented in this paper correlates events and interactions in an online social network against user behaviour, focusing on personality traits. Emotional context and tone is analysed and modelled based on varying types of sentiments that users express in their language using the IBM Watson Developer Cloud tools. The data collected in this study thus provides further evidence towards supporting the hypothesis that analysing and modelling emotions, sentiments and personality traits provides valuable insight into improving the user experience of complex social computer systems.
Mohamed Mostafa, Tom Crick, Ana C. Calderon, Giles Oatley

An Industrial Application of Data Mining Techniques to Enhance the Effectiveness of On-Line Advertising

Nowadays, online behavioural targeting is one of the most popular and profitable business strategies in display advertising. It is based on the analysis of web users’ behaviour, using machine learning to optimise web advertising. The objective of this paper is to identify consumers who have not previously observed an advert but are “possible prospects” more likely to purchase the advertised product. By identifying prospective customers, online advertisers may be able to optimise campaign performance and maximise their revenue, as well as deliver advertisements tailored to a variety of user interests. Our work presents various benchmark machine-learning algorithms and attribute pre-processing techniques in the context of behavioural targeting. The performance of the experiments is evaluated using the key performance metric, the predicted conversion rate. Our experimental results indicate that the presented data mining framework can identify prospective customers in the vast majority of cases. These results seem promising, indicating the need for further studies in the area of data mining for online display advertising.
Maria Diapouli, Miltos Petridis, Roger Evans, Stelios Kapetanakis