2018 | Book

Advances in Artificial Intelligence

31st Canadian Conference on Artificial Intelligence, Canadian AI 2018, Toronto, ON, Canada, May 8–11, 2018, Proceedings

About this book

This book constitutes the refereed proceedings of the 31st Canadian Conference on Artificial Intelligence, Canadian AI 2018, held in Toronto, ON, Canada, in May 2018.

The 16 regular papers and 18 short papers presented together with 7 Graduate Student Symposium papers and 4 Industry Track papers were carefully reviewed and selected from 72 submissions. The focus of the conference was on artificial intelligence research and advanced information and communications technology.

Table of Contents

Frontmatter

Long Papers

Frontmatter
Compressing Bayesian Networks: Swarm-Based Descent, Efficiency, and Posterior Accuracy

Local models in Bayesian networks (BNs) reduce space complexity, facilitate acquisition, and can improve inference efficiency. This work focuses on Non-Impeding Noisy-AND Tree (NIN-AND Tree or NAT) models whose merits include linear complexity, being based on simple causal interactions, expressiveness, and generality. We present a swarm-based constrained gradient descent for more efficient compression of BN CPTs (conditional probability tables) into NAT models. We show empirically that multiplicatively factoring NAT-modeled BNs allows significant speed up in inference for a reasonable range of sparse BN structures. We also show that such gain in efficiency only causes reasonable approximation errors in posterior marginals in NAT-modeled real world BNs.

Yang Xiang, Benjamin Baird
De-Causalizing NAT-Modeled Bayesian Networks for Inference Efficiency

Conditional independence encoded in Bayesian networks (BNs) avoids combinatorial explosion in the number of variables. However, BNs are still subject to exponential growth of space and inference time in the number of causes per effect variable in each conditional probability table (CPT). A number of space-efficient local models exist that allow efficient encoding of dependency between an effect and its causes, and can also be exploited for improved inference efficiency. We focus on the Non-Impeding Noisy-AND Tree (NIN-AND Tree or NAT) models due to their multiple merits. In this work, we develop a novel framework, de-causalization of NAT-modeled BNs, by which causal independence in NAT models can be exploited for more efficient inference. We demonstrate its exactness and efficiency impact on inference based on lazy propagation (LP).

Yang Xiang, Dylan Loker
A Novel Evaluation Methodology for Assessing Off-Policy Learning Methods in Contextual Bandits

We propose a novel evaluation methodology for assessing off-policy learning methods in contextual bandits. In particular, we provide a way to use data from any given Randomized Control Trial (RCT) to generate a range of observational studies with synthesized “outcome functions” that can match the user’s specified degrees of sample selection bias, which can then be used to comprehensively assess a given learning method. This is especially important in evaluating methods developed for precision medicine, where deploying a bad policy can have devastating effects. As the outcome function specifies the real-valued quality of any treatment for any instance, we can accurately compute the quality of any proposed treatment policy. This paper uses this evaluation methodology to establish a common ground for comparing the robustness and performance of the available off-policy learning methods in the literature.

Negar Hassanpour, Russell Greiner
Synthesizing Controllers: On the Correspondence Between LTL Synthesis and Non-deterministic Planning

Linear Temporal Logic ($\mathsf{LTL}$) synthesis can be understood as the problem of building a controller that defines a winning strategy, for a two-player game against the environment, where the objective is to satisfy a given $\mathsf{LTL}$ formula. It is an important problem with applications in software synthesis, including controller synthesis. In this paper we establish the correspondence between $\mathsf{LTL}$ synthesis and fully observable non-deterministic (FOND) planning. We study $\mathsf{LTL}$ interpreted over both finite and infinite traces. We also provide the first explicit compilation that translates an $\mathsf{LTL}$ synthesis problem to a FOND problem. Experiments with state-of-the-art $\mathsf{LTL}$ FOND and synthesis solvers show automated planning to be a viable and effective tool for highly structured $\mathsf{LTL}$ synthesis problems.

Alberto Camacho, Jorge A. Baier, Christian Muise, Sheila A. McIlraith
Logic-Based Benders Decomposition for Two-Stage Flexible Flow Shop Scheduling with Unrelated Parallel Machines

We study a two-stage flexible flow shop scheduling problem (FFSP) with the objective of makespan minimization. There is a single machine in stage 1 and unrelated parallel machines in stage 2. We propose a logic-based Benders decomposition (LBBD) algorithm, which decomposes this problem into a mixed-integer programming (MIP) master problem that sequences jobs on stage 1 and assigns jobs to machines on stage 2, and a set of constraint programming sub-problems that aim to find a feasible schedule on stage 2. Extensive computational results show that LBBD outperforms the best-known MIP model for this problem in terms of both computational time and ability to prove optimality over the majority of test instances. Additional experiments show that the superiority of LBBD over the monolithic MIP model holds regardless of whether algorithm tuning features are applied.

Yingcong Tan, Daria Terekhov
Advice-Based Exploration in Model-Based Reinforcement Learning

Convergence to an optimal policy using model-based reinforcement learning can require significant exploration of the environment. In some settings such exploration is costly or even impossible, such as in cases where simulators are not available, or where there are prohibitively large state spaces. In this paper we examine the use of advice to guide the search for an optimal policy. To this end we propose a rich language for providing advice to a reinforcement learning agent. Unlike constraints which potentially eliminate optimal policies, advice offers guidance for the exploration, while preserving the guarantee of convergence to an optimal policy. Experimental results on deterministic grid worlds demonstrate the potential for good advice to reduce the amount of exploration required to learn a satisficing or optimal policy, while maintaining robustness in the face of incomplete or misleading advice.

Rodrigo Toro Icarte, Toryn Q. Klassen, Richard Anthony Valenzano, Sheila A. McIlraith
Deep Super Learner: A Deep Ensemble for Classification Problems

Deep learning has become very popular for tasks such as predictive modeling and pattern recognition in handling big data. Deep learning is a powerful machine learning method that extracts lower level features and feeds them forward for the next layer to identify higher level features that improve performance. However, deep neural networks have drawbacks, which include many hyper-parameters and infinite architectures, opaqueness into results, and relatively slower convergence on smaller datasets. While traditional machine learning algorithms can address these drawbacks, they are not typically capable of the performance levels achieved by deep neural networks. To improve performance, ensemble methods are used to combine multiple base learners. Super learning is an ensemble that finds the optimal combination of diverse learning algorithms. This paper proposes deep super learning as an approach which achieves log loss and accuracy results competitive with deep neural networks while employing traditional machine learning algorithms in a hierarchical structure. The deep super learner is flexible, adaptable, and easy to train with good performance across different tasks using identical hyper-parameter values. Using traditional machine learning requires fewer hyper-parameters, allows transparency into results, and has relatively fast convergence on smaller datasets. Experimental results show that the deep super learner has superior performance compared to the individual base learners, single-layer ensembles, and in some cases deep neural networks. Performance of the deep super learner may further be improved with task-specific tuning.

Steven Young, Tamer Abdou, Ayse Bener
One Single Deep Bidirectional LSTM Network for Word Sense Disambiguation of Text Data

Due to recent technical and scientific advances, we have a wealth of information hidden in unstructured text data such as offline/online narratives, research articles, and clinical reports. To mine these data properly despite their innate ambiguity, a Word Sense Disambiguation (WSD) algorithm can avoid a number of difficulties in the Natural Language Processing (NLP) pipeline. However, considering the large number of ambiguous words in one language or technical domain, we may encounter limiting constraints for proper deployment of existing WSD models. This paper attempts to address the problem of one-classifier-per-one-word WSD algorithms by proposing a single Bidirectional Long Short-Term Memory (BLSTM) network which, by considering senses and context sequences, works on all ambiguous words collectively. Evaluated on the SensEval-3 benchmark, our model achieves results comparable with top-performing WSD algorithms. We also discuss how applying additional modifications alleviates the model fault and the need for more training data.

Ahmad Pesaranghader, Ali Pesaranghader, Stan Matwin, Marina Sokolova
MedFact: Towards Improving Veracity of Medical Information in Social Media Using Applied Machine Learning

Since the advent of Web 2.0 and social media, anyone with an Internet connection can create content online, even if it is uncertain or fake information, which has attracted significant attention recently. In this study, we address the challenge of uncertain online health information by automating systematic approaches borrowed from evidence-based medicine. Our proposed algorithm, MedFact, enables recommendation of trusted medical information within health-related social media discussions and empowers online users to make informed decisions about the credibility of online health information. MedFact automatically extracts relevant keywords from online discussions and queries trusted medical literature with the aim of embedding related factual information into the discussion. Our retrieval model takes into account layperson terminology and hierarchy of evidence. Consequently, MedFact is a departure from current consensus-based approaches for determining credibility using “wisdom of the crowd”, binary “Like” votes and ratings, popular in social media. Moving away from subjective metrics, MedFact introduces objective metrics. We also present preliminary work towards a granular veracity score by using supervised machine learning to compare statements within uncertain social media text and trusted medical text. We evaluate our proposed algorithm on various data sets from existing health social media involving both patient and medic discussions, with promising results and suggestions for ongoing improvements and future research.

Hamman Samuel, Osmar Zaïane
Reranking Candidate Lists for Improved Lexical Induction

Identifying translations in bilingual material, also referred to as the Bilingual Lexicon Induction (BLI) task, is a challenge that has attracted many researchers for a long time. In this paper, we investigate the reranking of two types of state-of-the-art approaches that have been used for the task. We test our reranker on four language pairs (translating from English), analyzing the influence of the frequency of the source terms we seek to translate. Our reranking approach almost invariably leads to performance gains for all the translation directions we consider.

Laurent Jakubina, Philippe Langlais
Analysis of Social Media Posts for Early Detection of Mental Health Conditions

This paper presents a multipronged approach to predict early risk of mental health issues from user-generated content in social media. Supervised learning and information retrieval methods are used to estimate the risk of depression for a user given the content of their posts on Reddit. The approach presented here was evaluated on the CLEF eRisk 2017 pilot task. We describe the details of five systems submitted to the task, and compare their performance. The comparisons show that combining information retrieval and machine learning methods gives the best results.

Antoine Briand, Hayda Almeida, Marie-Jean Meurs
Motor Bearing Fault Diagnosis Using Deep Convolutional Neural Networks with 2D Analysis of Vibration Signal

Bearings are critical components in rotating machinery, and it is crucial to diagnose their faults at an early stage. Existing fault diagnosis methods are mostly limited to manual features and traditional artificial intelligence learning schemes such as neural networks, support vector machines, and k-nearest neighbors. Unfortunately, interpretation and engineering of such features require substantial human expertise. This paper proposes an adaptive deep convolutional neural network (ADCNN) that utilizes cyclic spectrum maps (CSMs) of raw vibration signals as bearing health states to automate the feature extraction and classification process. The CSMs are two-dimensional (2D) maps that show the distribution of cycle energy across different bands of the vibration spectrum. The efficiency of the proposed algorithm (CSM+ADCNN) is validated using a benchmark dataset collected from bearing tests. Experimental results indicate that the proposed method outperforms the state-of-the-art algorithms, yielding 8.25% to 13.75% classification performance improvement.

M. M. Manjurul Islam, Jong-Myon Kim
Mobile App for Detection of Counterfeit Banknotes

Mobile phone usage has become very common. The market continues to grow as more advanced functionalities are incorporated in mobile phones. They possess sufficient computational capability for the identification and authentication of banknotes. This paper presents a mobile application for the recognition of banknote denominations and detection of counterfeit Nigerian Naira notes using Unity 3D, a multiplatform mobile application development system. The system extracted the face value of the banknotes and evaluated its performance using a combination of several KNN distance measures based on a Cascaded Ensemble approach. It was tested on the Android and iOS platforms using a Samsung Galaxy S6 and an iPhone 6 respectively. The experimental results presented a 99.27% recognition rate and a 94.70% detection rate, at an average processing time of 0.02 ms.

Tamarafinide V. Dittimi, Ching Y. Suen
A Multiagent Framework for Understanding Addiction

In this paper, we provide a framework for examining the problem of addiction that considers both internal factors like self-control as well as external factors denoted as the environment. We do this by considering the Prisoner’s Dilemma, a game-theoretic concept examined by competitive multiagent systems researchers. In particular, we devise an iterated Prisoner’s Dilemma involving you and future you. We devote considerable effort to defining the notion of selfhood from previous literature in economics, as this is critical in examining addiction. The main contribution is a framework that enables calibration of alternate scenarios of behaviour for addicted individuals: in essence an application of artificial intelligence for the important social problem of modeling addiction, yielding some intuitive and explanatory results. We briefly comment on the main underlying assumptions and biases as well as mention future work that could be derived from this research, including commentary on how reasoning about both current and future reflections of self may be useful in general for multiagent decision making.

Wasif Khan, Robin Cohen
Infusing Domain Knowledge to Improve the Detection of Alzheimer’s Disease from Everyday Motion Behaviour

Alzheimer’s disease (AD) can severely impair the independent lifestyle of a person. Dem@Care is a European research project that conducted a study for timely diagnosis of Alzheimer’s disease by collecting everyday motion data from couples (or dyads), with one person in each couple having AD. Their results suggest that AD can be detected using everyday motion data from accelerometers. Their evaluation was based on leave-one-person-out cross-validation. However, this evaluation can introduce bias in the classification results because one person from a dyad is present in the training set while the other is being tested. In this paper, we revisit the Dem@Care study and propose a new evaluation method that performs leave-one-dyad-out cross-validation to remove the dataset selection bias. We then introduce new domain-specific features based on dynamic and static intervals of motion that significantly improve the classification results. We further show an increase in performance by combining the proposed features with new time, frequency domain and baseline features used in the Dem@Care study.
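
The difference between the two evaluation schemes can be illustrated with a small sketch. This is not the authors' code: the function name `leave_one_dyad_out` and the dyad labels are hypothetical, and the snippet only shows how holding out both members of a dyad at once keeps partners from appearing on both sides of a split.

```python
def leave_one_dyad_out(dyad_ids):
    """Yield (train_indices, test_indices), holding out one whole dyad per fold.

    dyad_ids[i] is the (hypothetical) dyad label of sample i, so both
    partners of a couple always land on the same side of the split.
    """
    for held_out in sorted(set(dyad_ids)):
        test = [i for i, d in enumerate(dyad_ids) if d == held_out]
        train = [i for i, d in enumerate(dyad_ids) if d != held_out]
        yield train, test


# Six people forming three dyads (labels are made up for illustration).
dyads = ["d1", "d1", "d2", "d2", "d3", "d3"]
folds = list(leave_one_dyad_out(dyads))
```

With leave-one-person-out, a fold testing person 0 would still train on person 1 from the same dyad; here each fold's training set contains no sample from the held-out dyad.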

Chao Bian, Shehroz S. Khan, Alex Mihailidis
An Incremental Machine Learning Algorithm for Nuclear Forensics

This paper presents an incremental machine learning algorithm that identifies the origin, or provenance, of samples of nuclear material. This is part of work being undertaken by the Canadian National Nuclear Forensics Library development program, which seeks to build a comprehensive database of signatures for radioactive and nuclear materials under Canadian regulatory control. One difficulty with this application is the small ratio of the number of examples over the number of classes. So, we introduce variants to a basic generative algorithm, based on ideas from the robust statistics literature and elsewhere, to address this issue and to improve robustness to attribute noise. We show experimentally the effectiveness of the approach, and the problems that arise, when adding new examples and classes.

Chris Drummond

Short Papers

Frontmatter
MML-Based Approach for Determining the Number of Topics in EDCM Mixture Models

This paper proposes an unsupervised algorithm for learning a finite mixture model of the exponential family approximation to the Dirichlet Compound Multinomial (EDCM). An important part of the mixture modeling problem is determining the number of components that best describes the data. In this work, we extend the Minimum Message Length (MML) principle to determine the number of topics (clusters) in the case of text modeling using a mixture of EDCMs. Parameter estimation is based on the previously proposed deterministic annealing expectation-maximization approach. The proposed method is validated using several document collections. A comparison with results obtained for other selection criteria is provided.

Nuha Zamzami, Nizar Bouguila
Constrained Bayesian Optimization for Problems with Piece-wise Smooth Constraints

This paper proposes a new formulation of Gaussian process for constraints with piece-wise smooth conditions. Combining ideas from decision trees and Gaussian processes, it is shown that the new model can effectively identify the non-smooth regions and tackle the non-smoothness in piece-wise smooth constraint functions. A constrained Bayesian optimizer is then constructed to handle optimization problems with both noisy objective and constraint functions.

Aliakbar Gorji Daronkolaei, Amir Hajian, Tonya Custis
Dimensionality Reduction and Visualization by Doubly Kernelized Unit Ball Embedding

In this paper, we present a nonlinear dimensionality reduction algorithm that aims to preserve the local structure of data by building and exploiting a neighborhood graph. The cost function is defined to minimize the discrepancy between the similarities of points in the input and output spaces. We propose an effective way to calculate the input and output similarities based on Gaussian and polynomial kernel functions. By maximizing the within-cluster cohesion and between-cluster separation, KUBE remarkably improves the quality of clustering algorithms on the low-dimensional embedding. Our experiments on image recognition datasets show that KUBE can learn the structure of manifolds and that it significantly improves the clustering quality.

Behrouz Haji Soleimani, Stan Matwin
Accelerated Gradient and Block-Wise Gradient Methods for Big Data Factorization

A problem often encountered in the analysis of large-scale data is the approximation of a given matrix $A \in \mathbf{R}^{m \times n}$ by $UV^T$, where $U \in \mathbf{R}^{m \times r}$, $V \in \mathbf{R}^{n \times r}$ and $r < \min\{m, n\}$. The aim of this paper is to tackle this problem by proposing an accelerated gradient descent algorithm as well as its stochastic counterpart. These frameworks are suitable candidates to surmount the computational difficulties in computing the SVD form of big matrices. On the other hand, big data are usually presented and stored in some fixed-size blocks, which is an incentive to further propose a block-wise gradient descent algorithm for their low-rank approximation. A stochastic block-wise gradient method will further be suggested to enhance the computational efficiencies when a large number of blocks are presented in the problem. Under some standard assumptions, we investigate the convergence property of the block-wise approach. Computational results for both synthetic and real-world data are provided in this paper.
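
As a rough illustration of the underlying problem, the sketch below fits a low-rank factorization by plain (non-accelerated, non-stochastic, non-block-wise) gradient descent on the squared Frobenius error. It is a baseline, not the accelerated or block-wise methods the abstract proposes; the objective $\frac{1}{2}\|A - UV^T\|_F^2$, the function names, the step size and the starting point are all assumptions made for the example.

```python
def matmul_t(U, V):
    """Return the product U V^T for U (m x r) and V (n x r) as nested lists."""
    return [[sum(u[k] * v[k] for k in range(len(u))) for v in V] for u in U]


def loss(A, U, V):
    """Half the squared Frobenius norm of the residual A - U V^T."""
    P = matmul_t(U, V)
    return 0.5 * sum((A[i][j] - P[i][j]) ** 2
                     for i in range(len(A)) for j in range(len(A[0])))


def gd_step(A, U, V, lr):
    """One gradient step: grad_U = -(A - UV^T) V and grad_V = -(A - UV^T)^T U."""
    m, n, r = len(A), len(A[0]), len(U[0])
    P = matmul_t(U, V)
    R = [[A[i][j] - P[i][j] for j in range(n)] for i in range(m)]  # residual
    new_U = [[U[i][k] + lr * sum(R[i][j] * V[j][k] for j in range(n))
              for k in range(r)] for i in range(m)]
    new_V = [[V[j][k] + lr * sum(R[i][j] * U[i][k] for i in range(m))
              for k in range(r)] for j in range(n)]
    return new_U, new_V


def factorize(A, U, V, lr=0.05, steps=500):
    """Run a fixed number of plain gradient steps from the given starting point."""
    for _ in range(steps):
        U, V = gd_step(A, U, V, lr)
    return U, V


# A rank-1 toy matrix, so r = 1 can reproduce it exactly.
A = [[1.0, 2.0], [2.0, 4.0]]
U0, V0 = [[0.5], [0.5]], [[0.5], [0.5]]
U, V = factorize(A, U0, V0)
```

The fixed step size works here only because the toy matrix is tiny and well scaled; larger problems would need a step-size rule, which is part of what accelerated and stochastic variants address.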

M. Reza Peyghami, Kevin Yang, Shengyuan Chen, Zijiang Yang, Masoud Ataei
Learning Belief Revision Operators

The beliefs of an agent change in response to new information. Formal belief change operators have been introduced to model this change. Although the properties of belief change operators are well understood, there has been little work on specifying exactly where these operators come from. In this paper, we propose that belief revision operators can be learned from data. In other words, by looking at the behaviour of an agent, we can use basic machine learning algorithms to determine exactly how they revise their beliefs. This is a preliminary paper advocating a particular approach, and demonstrating its feasibility. Fundamentally, we are concerned with the manner in which machine learning techniques can be used to learn formal models of knowledge and belief. We suggest that this kind of advance will be important for future applications of AI.

Aaron Hunter
Solving Constraint Satisfaction Problems Using Firefly Algorithms

Constraint Satisfaction Problems (CSPs) are known to be hard to solve and require a backtrack search algorithm with exponential time cost. Metaheuristics have recently gained much reputation for solving complex problems and can be employed as an alternative to tackle CSPs even if, in theory, they do not guarantee a complete solution to the problem. This paper proposes a new Discrete Firefly Algorithm (DFA) and investigates its applicability for dealing with CSPs. To assess the performance of the proposed DFA, experiments have been conducted on CSP instances, randomly generated based on the Model RB. The results of the experiments clearly demonstrate the significant performance of the proposed method in dealing with CSPs. For all the instances tested, DFA succeeds in finding a complete solution that satisfies all constraints in a reasonable amount of time.

Mahdi Bidar, Malek Mouhoub, Samira Sadaoui, Mohsen Bidar
An AI Planning-Based Approach to the Multi-Agent Plan Recognition Problem

Multi-Agent Plan Recognition (MAPR) is the problem of inferring the goals and plans of multiple agents given a set of observations. While previous MAPR approaches have largely focused on recognizing team structures and behaviors, given perfect and complete observations, in this paper, we address potentially unreliable observations and temporal actions. We propose a multi-step compilation technique that enables the use of AI planning for the computation of the probability distributions of plans and goals, given observations. We present results of an experimental evaluation on a novel set of benchmarks, using several temporal and diverse planners.

Maayan Shvo, Shirin Sohrabi, Sheila A. McIlraith
Predicting Transportation Modes of GPS Trajectories Using Feature Engineering and Noise Removal

Understanding transportation mode from GPS (Global Positioning System) traces is an essential topic in the data mobility domain. In this paper, a framework is proposed to predict transportation modes. This framework follows a sequence of five steps: (i) data preparation, where GPS points are grouped in trajectory samples; (ii) point features generation; (iii) trajectory features extraction; (iv) noise removal; (v) normalization. We show that the extraction of the new point features (bearing rate, the rate of rate of change of the bearing rate) and the global and local trajectory features, like medians and percentiles, enables many classifiers to achieve high accuracy (96.5%) and F1 (96.3%) scores. We also show that the noise removal task affects the performance of all the models tested. Finally, the empirical tests where we compare this work against state-of-the-art transportation mode prediction strategies show that our framework is competitive and outperforms most of them.
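
To make the point-feature step more concrete, here is a minimal sketch of one such feature. It assumes that "bearing rate" means the change in compass bearing between consecutive segments divided by the elapsed time, and it wraps the change into [-180, 180); both choices, along with the function names, are assumptions and not taken from the paper.

```python
import math


def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial compass bearing in degrees [0, 360) from point 1 to point 2."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    y = math.sin(dlon) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return math.degrees(math.atan2(y, x)) % 360.0


def bearing_rates(points):
    """Bearing change per second between consecutive segments.

    points is a list of (lat, lon, t_seconds) tuples. The change is wrapped
    into [-180, 180) so that a turn from 350 to 10 degrees counts as +20.
    """
    bearings = [bearing_deg(a[0], a[1], b[0], b[1])
                for a, b in zip(points, points[1:])]
    rates = []
    for i in range(1, len(bearings)):
        dt = points[i + 1][2] - points[i][2]  # duration of the later segment
        diff = (bearings[i] - bearings[i - 1] + 180.0) % 360.0 - 180.0
        rates.append(diff / dt)
    return rates


# Head north for 10 s, then east for 10 s (made-up coordinates).
track = [(0.0, 0.0, 0), (1.0, 0.0, 10), (1.0, 1.0, 20)]
rates = bearing_rates(track)
```

Trajectory-level features such as medians and percentiles would then be computed over lists like `rates` for each trajectory sample.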

Mohammad Etemad, Amílcar Soares Júnior, Stan Matwin
Prediction of Container Damage Insurance Claims for Optimized Maritime Port Operations

A company operating in a commercial maritime port often experiences clients filing insurance claims on damaged shipping containers. In this work, multiple classifiers have been trained on synthesized data to predict such insurance claims. The results show that Random Forests outperform other classifiers on typical machine learning metrics. Further, insights into the importance of various features in this prediction are discussed, as is their deviation from expert opinions. This information facilitates selective information collation to predict container claims, and to rank data sources by relevance. To our knowledge, this is the first publication to investigate the factors associated with container damage and claims, as opposed to ship damage or other related problems.

Ashwin Panchapakesan, Rami Abielmona, Rafael Falcon, Emil Petriu
Drug-Target Interaction Network Predictions for Drug Repurposing Using LASSO-Based Regularized Linear Classification Model

It is well known that biological and experimental methods for drug discovery are time-consuming and expensive. New efforts have been explored to perform drug repurposing through predicting drug-target interaction networks using biological and chemical properties of drugs and targets. However, due to the high-dimensional nature of the data sets extracted from drugs and targets, which have hundreds of thousands of features and relatively small numbers of samples, traditional machine learning approaches, such as logistic regression analysis, cannot analyze these data efficiently. To overcome this issue, we proposed a LASSO-based regularized linear classification model to predict drug-target interactions, which were used for drug repurposing for inflammatory bowel disease. Experiments showed that the model outperformed the traditional logistic regression model.

Jiaying You, Md. Mohaiminul Islam, Liam Grenier, Qin Kuang, Robert D. McLeod, Pingzhao Hu
Optimal Scheduling for Smart Charging of Electric Vehicles Using Dynamic Programming

We propose a formulation of the smart charging problem that can be solved by dynamic programming. It allows the optimal charging schedule of EVs to be determined in order to minimize the cost, considering the different driving patterns of each car owner as well as electricity prices that vary according to supply and demand. Conclusive experiments are conducted through simulations, relying upon a database storing the history of the real use of vehicles over several months and an hourly electricity price.
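
A heavily simplified version of such a formulation can be sketched as follows. The state is (hour, energy units still needed), the car can take at most one unit per hour and only while plugged in, and the prices and availability pattern are made up; none of this is the authors' actual model, which the abstract does not spell out.

```python
def optimal_charging(prices, plugged_in, units_needed):
    """Minimum-cost charging plan by backward dynamic programming.

    prices[t]      hypothetical electricity price for hour t
    plugged_in[t]  True if the car is available for charging in hour t
    units_needed   energy units to deliver by the end of the horizon
    Returns (cost, hours_charged), or (inf, None) if the demand cannot be met.
    """
    T = len(prices)
    INF = float("inf")
    # cost[t][k]: cheapest way to deliver k more units using hours t..T-1
    cost = [[INF] * (units_needed + 1) for _ in range(T + 1)]
    cost[T][0] = 0.0
    for t in range(T - 1, -1, -1):
        for k in range(units_needed + 1):
            best = cost[t + 1][k]  # stay idle in hour t
            if k > 0 and plugged_in[t]:
                best = min(best, prices[t] + cost[t + 1][k - 1])  # charge one unit
            cost[t][k] = best
    if cost[0][units_needed] == INF:
        return INF, None
    # Walk forward and recover one optimal schedule from the table.
    schedule, k = [], units_needed
    for t in range(T):
        if k > 0 and plugged_in[t] and cost[t][k] == prices[t] + cost[t + 1][k - 1]:
            schedule.append(t)
            k -= 1
    return cost[0][units_needed], schedule


# Four hours, car away during hour 2, two units required.
result = optimal_charging([5, 1, 4, 2], [True, True, False, True], 2)
```

In this toy setting a greedy choice of the cheapest available hours would give the same answer; the table-based form is shown because it extends to richer states, such as battery level constraints tied to driving patterns.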

Karol Lina López, Christian Gagné
Combining MCTS and A3C for Prediction of Spatially Spreading Processes in Forest Wildfire Settings

In recent years, Deep Reinforcement Learning (RL) algorithms have shown super-human performance in a variety of Atari and classic board games such as chess and Go. Research into applications of RL in other domains with spatial considerations, like environmental planning, is still in its nascent stages. In this paper, we introduce a novel combination of Monte-Carlo Tree Search (MCTS) and A3C algorithms on an online simulator of a wildfire, on a pair of forest fires in Northern Alberta (Fort McMurray and Richardson fires) and on historical Saskatchewan fires previously compared by others to a physics-based simulator. We conduct several experiments to predict fire spread for several days before and after the given spatial information of fire spread and ignition points. Our results show that the advancements in Deep RL applications in the gaming world have advantages in spatially spreading real-world problems like forest fires.

Sriram Ganapathi Subramanian, Mark Crowley
Text-Based Detection of Unauthorized Users of Social Media Accounts

Although social media platforms can assist organizations’ progress, they also make them vulnerable to unauthorized users gaining access to their account and posting as the organization. This can have negative effects on the company’s public appearance and profit. Once attackers gain access to a social media account, they are able to post any content from that account. In this paper, we propose an author verification task in the realm of blog posts to detect and block unauthorized users based on the textual content of their unauthorized post. We use different methods to represent a document, such as word frequency and word2vec, and we train two different classifiers over these document representations. The experimental results show that regardless of the classifier the word2vec method outperforms other representations.

Milton King, Dima Alhadidi, Paul Cook
N-Gram Based Approach for Automatic Prediction of Essay Rubric Marks

Automatic Essay Scoring, applied to the prediction of grades for dimensions of a scoring rubric, can provide automatic detailed feedback on students’ written assignments. We apply a character and word n-gram based technique proposed originally for authorship identification, the Common N-Gram (CNG) classifier, to this task. We report promising results for the rubric mark prediction for essays by CNG, and perform analysis of suitability of different types of n-grams for the task.
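
For readers unfamiliar with CNG, the sketch below shows a character n-gram variant under the commonly cited formulation: each class is summarized by a profile of its most frequent n-grams with normalized frequencies, and a text is assigned to the class whose profile has the smallest relative-difference dissimilarity. The profile size, the n-gram length, the function names and the toy "marks" are assumptions for illustration, not settings from the paper.

```python
from collections import Counter


def profile(text, n=3, size=100):
    """Map the `size` most frequent character n-grams to normalized frequencies."""
    grams = Counter(text[i:i + n] for i in range(len(text) - n + 1))
    total = sum(grams.values())
    return {g: c / total for g, c in grams.most_common(size)}


def cng_dissimilarity(p1, p2):
    """Sum of squared relative frequency differences over the union of n-grams."""
    d = 0.0
    for g in set(p1) | set(p2):
        f1, f2 = p1.get(g, 0.0), p2.get(g, 0.0)
        d += (2.0 * (f1 - f2) / (f1 + f2)) ** 2
    return d


def classify(text, class_profiles, n=3, size=100):
    """Return the label whose profile is least dissimilar to the text's profile."""
    p = profile(text, n, size)
    return min(class_profiles,
               key=lambda label: cng_dissimilarity(p, class_profiles[label]))


# Toy class profiles standing in for essays that received each rubric mark.
marks = {
    "high": profile("the quick brown fox"),
    "low": profile("zzzz zzzz zzzz"),
}
predicted = classify("the quick fox", marks)
```

An n-gram missing from one profile contributes the maximum value of 4 to the sum, so profiles with little overlap are pushed far apart.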

Magdalena Jankowska, Colin Conrad, Jabez Harris, Vlado Kešelj
Matching Résumés to Job Descriptions with Stacked Models

We describe a method for matching résumés to job descriptions provided by employers, and evaluate it on real data from a Canadian company specialized in e-recruitment. We model the task as classifying each résumé as suitable or not for a follow-up interview. We evaluate the methods on two datasets with approximately 1,500 real job descriptions and approximately 70,000 résumés, from two important industry sectors, considering several models individually and also stacked. Our stacked model shows high accuracy (often above 0.8) and consistently outperforms standard methods, including neural networks.

Peng Xu, Denilson Barbosa
Towards a Comprehensive Evaluation of Recommenders: A Cognition-Based Approach

Evaluating Recommender Systems (RSs) is a challenging issue that is significantly magnified by the multifaceted properties of RSs, which make it insufficient to use only one metric to evaluate recommenders. This challenge necessitates a unified evaluation model that comprehensively assesses multiple aspects of the recommender. This position paper proposes a cognition-based comprehensive evaluation to evaluate the main activities of RSs. We base the proposed model on the cognitive dimension of Bloom’s taxonomy, a widely used model for classifying learning objectives in the teaching area. We created a phase-wise mapping between RSs and Bloom’s taxonomy to come up with an overall evaluation for recommenders. Based on these connections, we believe that the proposed evaluation model has the potential to support the decision of selecting the most appropriate recommender systems by giving a benchmarked score for different aspects of RSs.

Alaa Alslaity, Thomas Tran
A Sentence-Level Sparse Gamma Topic Model for Sentiment Analysis

Online consumer reviews have become an essential source of information for understanding markets and customer preferences. This research introduces a novel topic model to identify product attributes and sentiments toward them at the sentence level. The model uses a recursive definition of topic distribution in a sentence to avoid the problem of over-parametrization in topic models. The introduction of the inference network enables the utilization of rich features in the content to drive the identification of sentiments, in contrast with other multi-aspect sentiment analysis models that rely on single words. The sentence topic model has a superior performance in producing coherent topics, and the sentence topic-sentiment model outperforms the existing model on the task of predicting product attribute rating.

Tao Chen, Jeffrey Parsons
Topic Detection and Document Similarity on Financial News

Traders often rely on financial news to come up with predictions for stock price changes. Dealing with vast amounts of news data makes it essential to use an automated methodology to identify the news items relevant to given criteria. In this study we use Latent Dirichlet Allocation (LDA) to model the correlation of news items with stock price time series data. The LDA model is trained with news items from a time window in the past, and the trained model is then used to measure the similarity between the current news items and the news items used for training. The calculated similarity measure can be used as a predictor for switching points in the future. We tested our methodology using a collection of about 1,700,000 financial news items published between 2015-01-01 and 2015-12-31, and compared the results with various standard classification techniques. Our results indicate that the use of LDA instead of standard classification techniques makes it possible to achieve the same level of performance with a much smaller feature space.

Saeede Sadat Asadi Kakhki, Can Kavaklioglu, Ayse Bener

Graduate Student Symposium Papers

Frontmatter
Software Defect Prediction from Code Quality Measurements via Machine Learning

Improvement in software development practices to predict and reduce software defects can lead to major cost savings. The goal of this study is to demonstrate the value of static analysis metrics in predicting software defects at a much larger scale than previous efforts. The study analyses data collected from more than 500 software applications, across 3 multi-year software development programs, and uses over 150 software static analysis measurements. A number of machine learning techniques, such as neural networks and random forests, are used to determine whether seemingly innocuous rule violations can be used as significant predictors of software defect rates.

Ross MacDonald
Automated Scheduling: Reinforcement Learning Approach to Algorithm Policy Learning

Automated planning and scheduling continues to be an important part of artificial intelligence research and practice [6, 7, 11]. Many commonly-occurring scheduling settings include multiple stages and alternative resources, resulting in challenging combinatorial problems with high-dimensional solution spaces. The literature for solving such problems is dominated by specialized meta-heuristic algorithms.

Yingcong Tan
Estimating Vineyard Grape Yield from Images

Agricultural yield estimation from natural images is a challenging problem to which machine learning can be applied. Convolutional Neural Networks have advanced the state of the art in many machine learning applications such as computer vision, speech recognition and natural language processing. The proposed research uses convolutional neural networks to develop models that can estimate the weight of grapes on a vine using an image. Trained and tested with a dataset of 60 images of grape vines, the system manages to achieve a cross-validation yield estimation accuracy of 87%.

Tanya Monga
Real-Time Deep Learning Pedestrians Classification on a Micro-Controller

Deep learning neural networks are among the most advanced tools for object classification. However, they are computationally expensive and have performance issues in real-time applications. This research’s use case is the efficient design and deployment of deep learning neural networks on palm-sized computers like the Raspberry Pi (RPi) as an in-vehicle monitoring system (IVMS) for real-time pedestrian classification. I have developed a system based on a neural network template named Cafenet that runs on an RPi and can classify pedestrians using deep learning. In addition, I have proposed a new classification system based on multiple RPi boards, which offers users two modes of pedestrian detection: fast classification and accurate classification. The experimental results show that the device can classify pedestrians in real time and that the detection accuracy is acceptable.

Zhaoyang Huang
A Unified Evaluation Framework for Recommenders

Recommender Systems are usually evaluated by one or two metrics. Due to the multifaceted nature of recommender systems, however, it is insufficient to evaluate them using only one metric. This paper presents my Ph.D. research agenda on evaluating recommenders from different points of view. In particular, I aim to provide a comprehensive evaluation framework that merges different metrics and produces an overall evaluation result. The proposed framework is built on an inferred correlation between the most important metrics and a weight function that assigns different weights to different metrics based on the application area of the recommender. This work can be used to evaluate different recommender types that are applied to the most popular application areas such as movies, documents, etc.

Alaa Alslaity
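
The weight-function idea in the abstract above can be pictured as a weighted average of per-metric scores, with the weights chosen by application area. The sketch below is only that picture, not the framework itself: the metric names, the areas, and all weight values are hypothetical, and the inferred correlation between metrics that the abstract mentions is not modeled.

```python
# Hypothetical weights per application area; the values are illustrative only.
AREA_WEIGHTS = {
    "movies": {"accuracy": 0.5, "diversity": 0.3, "novelty": 0.2},
    "documents": {"accuracy": 0.7, "diversity": 0.1, "novelty": 0.2},
}


def overall_score(metric_scores, area):
    """Combine per-metric scores (each assumed to lie in [0, 1]) into one value.

    The result is the weighted average of the metrics listed for `area`.
    """
    weights = AREA_WEIGHTS[area]
    total_weight = sum(weights.values())
    weighted_sum = sum(weights[name] * metric_scores[name] for name in weights)
    return weighted_sum / total_weight
```

With these made-up weights, the same recommender can rank differently in different areas, which is the behavior the abstract describes for an area-dependent weight function.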
Early Detection of Alzheimer’s Disease Using Deep Learning

Using a combination of methods from image processing, signal processing and deep learning, we aim to develop a model to predict whether or not a patient will develop symptomatic Alzheimer’s disease using Diffusion MRI (dMRI) imaging data. We first propose a 3D multichannel convolutional neural network (CNN) architecture to distinguish patients with Alzheimer’s from normal controls, then propose an extension of our architecture to incorporate multiple scans from a patient’s history to improve classification accuracy and predict future prognosis. Finally, we discuss methods for performing data augmentation to add diversity and robustness to our unique and comparatively small dataset.

Laura McCrackin
Learning with Prior Domain Knowledge and Insufficient Annotated Data

Machine learning exploits data to learn, but when not enough data is available (often due to increasingly complex models) or the quality of the data is insufficient, prior domain knowledge from experts can be incorporated to guide the learner. Prior knowledge typically employed in machine learning tends to be concise, single statements. But for many problems, knowledge is much messier: extracting it requires in-depth discussions with domain experts, and collecting all the relevant knowledge often takes many iterations of model development and expert feedback. In the Bayesian learning paradigm, we learn which hypotheses are most likely given the data as evidence. How can we refine this model when new feedback is given by domain experts? We are working with domain experts on a problem where data is expensive, but we also have prior knowledge. This research has two objectives: (1) automatically refine models using prior knowledge, and (2) handle various forms of prior knowledge elicited from experts in a unified framework.

Matthew Dirks

Industry Track

Frontmatter
Predicting Crime Using Spatial Features

Our study aims to build a machine learning model for crime prediction using geospatial features for different categories of crime. The reverse geocoding technique is applied to retrieve open street map (OSM) spatial data. This study also proposes finding hotpoints extracted from crime hotspot areas found by Hierarchical Density-Based Spatial Clustering of Applications with Noise (HDBSCAN). A spatial distance feature is then computed based on the position of different hotpoints for various types of crime, and this value is used as a feature for classifiers. We test the engineered features on crime data from the Royal Canadian Mounted Police of Halifax, NS. We observed a significant performance improvement in crime prediction using the newly generated spatial features.

Fateha Khanam Bappee, Amílcar Soares Júnior, Stan Matwin
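
One plausible reading of the spatial distance feature in the abstract above is the distance from a location to the nearest hotpoint of each crime type. The sketch below assumes that reading, takes the hotpoints as already extracted (the HDBSCAN clustering step is not shown), and uses the haversine great-circle distance; the function names and the coordinates in any example call are hypothetical, and the paper may define the feature differently.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def hotpoint_distance_features(location, hotpoints_by_type):
    """Return, per crime type, the distance from `location` to its nearest hotpoint.

    `hotpoints_by_type` is a hypothetical mapping from a crime type to a
    non-empty list of (lat, lon) hotpoints assumed to come from clustering.
    """
    lat, lon = location
    return {
        crime_type: min(haversine_km(lat, lon, h_lat, h_lon)
                        for h_lat, h_lon in points)
        for crime_type, points in hotpoints_by_type.items()
    }
```

The resulting dictionary can be appended to a feature vector, one column per crime type, before training a classifier.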
A Tool for Defining and Simulating Storage Strategies on the Smart Grid

Intelligent distribution of electrical power is a key problem on the Smart Grid. It is known that the introduction of micro-storage devices, such as electric cars, can lead to benefits for consumers both in terms of power cost and emissions output. This process requires consumers to be educated on the importance of power storage, and it also requires the development of intelligent power storage strategies. This paper introduces a simulation tool that can be used to achieve both of these goals. In particular, our software allows users to easily define a Smart Grid topology, and then use a simple scripting language to define intelligent power storage strategies for simulated consumers. The software permits evaluation of different strategies, which can lead to practical improvements for consumers.

Dan Russell, Aaron Hunter
Decision Assist for Self-driving Cars

Research into self-driving cars has grown enormously in the last decade primarily due to the advances in the fields of machine intelligence and image processing. An under-appreciated aspect of self-driving cars is actively avoiding high traffic zones, low visibility zones, and routes with rough weather conditions by learning different conditions and making decisions based on trained experiences. This paper addresses this challenge by introducing a novel hierarchical structure for dynamic path planning and experiential learning for vehicles. A multistage system is proposed for detecting and compensating for weather, lighting, and traffic conditions as well as a novel adaptive path planning algorithm named Checked State A3C. This algorithm improves upon the existing A3C Reinforcement Learning (RL) algorithm by adding state memory which provides the ability to learn an adaptive model of the best decisions to take from experience.

Sriram Ganapathi Subramanian, Jaspreet Singh Sambee, Benyamin Ghojogh, Mark Crowley
Rule Mining and Prediction Using the Flek Machine – A New Machine Learning Engine

One of the exciting areas in data science is the development of new machine learning engines to do data mining, analytics and prediction. In this paper, we introduce the “Flek Machine” – an innovative AI engine that learns a Bayes Net model from binary data. FlekML, the core machine learning engine inside, builds a rich model that can be manipulated by the Toolkit to do rule mining, discover associations and association maps as well as make predictions. The Flek Machine enables binary, multi-class and multi-label classifications all on the fly and over the same built model. This tool has several use cases such as customer behaviour analysis, predicting equipment failure in IoT, or detecting drug combinations that produce side effects.

Abbas Taher
Backmatter
Metadata
Title
Advances in Artificial Intelligence
Edited by
Ebrahim Bagheri
Jackie C.K. Cheung
Copyright Year
2018
Electronic ISBN
978-3-319-89656-4
Print ISBN
978-3-319-89655-7
DOI
https://doi.org/10.1007/978-3-319-89656-4