2019 | Book

Advances in Artificial Intelligence

32nd Canadian Conference on Artificial Intelligence, Canadian AI 2019, Kingston, ON, Canada, May 28–31, 2019, Proceedings

Edited by: Marie-Jean Meurs, Frank Rudzicz

Publisher: Springer International Publishing

Book series: Lecture Notes in Computer Science

Included in: Springer Professional "Wirtschaft+Technik", Springer Professional "Technik", Springer Professional "Wirtschaft"

About this book

This book constitutes the refereed proceedings of the 32nd Canadian Conference on Artificial Intelligence, Canadian AI 2019, held in Kingston, ON, Canada, in May 2019.
The 27 regular papers and 34 short papers presented together with 8 Graduate Student Symposium papers and 4 Industry Track papers were carefully reviewed and selected from 132 submissions. The focus of the conference was on artificial intelligence research and advanced information and communications technology.

Table of Contents

Frontmatter

Regular Papers

Frontmatter

Categorizing Emails Using Machine Learning with Textual Features

We developed an application that automates the process of assigning emails received in a generic request inbox to one of fourteen predefined topic categories. To build this application, we compared the performance of several classifiers in predicting the topic category, using an email dataset extracted from this inbox, which consisted of 8,841 emails over three years. The algorithms ranged from linear classifiers operating on n-gram features to deep learning techniques such as CNNs and LSTMs. For our objective, we found that the best-performing model was a logistic regression classifier using n-grams with TF-IDF weights, achieving 90.9% accuracy. The traditional models performed better than the deep learning models for this dataset, likely in part due to the small dataset size, and also because this particular classification task may not require the ordered sequence representation of tokens that deep learning models provide. Eventually, a bagged voting model was selected which combines the predictive power of the top eight models, with accuracy of 92.7%, surpassing the performance of any of the individual models.

Haoran Zhang, Jagadish Rangrej, Saad Rais, Michael Hillmer, Frank Rudzicz, Kamil Malikov
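
The n-gram TF-IDF features mentioned in the abstract above can be illustrated with a minimal sketch. The tokenisation (lowercased whitespace split), the unsmoothed `idf = log(N / df)` variant, and the three sample emails are assumptions made for illustration; the abstract does not state which weighting variant or preprocessing the authors used.

```python
import math
from collections import Counter


def ngrams(tokens, n_max=2):
    """Return all word n-grams of length 1..n_max as space-joined strings."""
    return [" ".join(tokens[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)]


def tfidf(docs, n_max=2):
    """Map each document to {ngram: tf * idf}, with idf = log(N / df) (assumed variant)."""
    grams = [Counter(ngrams(d.lower().split(), n_max)) for d in docs]
    # Document frequency: in how many documents each n-gram occurs.
    df = Counter(g for counts in grams for g in counts)
    n_docs = len(docs)
    return [{g: tf * math.log(n_docs / df[g]) for g, tf in counts.items()}
            for counts in grams]


# Hypothetical sample emails, not taken from the dataset described above.
emails = ["reset my password", "password reset request", "invoice payment due"]
weights = tfidf(emails)
```

N-grams that occur in few emails (here "invoice payment") receive a larger weight than those shared across emails ("password"); the resulting sparse vectors are the kind of input a linear classifier such as logistic regression would consume.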

Weakly Supervised, Data-Driven Acquisition of Rules for Open Information Extraction

We propose a way to acquire rules for Open Information Extraction, based on lemma sequence patterns (including potential typographical symbols) linking two named entities in a sentence. Rule acquisition is data-driven and requires little supervision. Given an arbitrary relation, we identify, in a large corpus, pairs of entities that are linked by the relation and then gather, score and rank other phrases that link the same entity pairs. We experimented with 81 relations and acquired 20 extraction rules for each by mining ClueWeb12. We devised a semi-automatic evaluation protocol to measure recall and precision and found them to be at most 79.9% and 62.4% respectively. Verbal patterns are of better quality than non-verbal ones, although the latter achieve a maximum recall of 76.5%. The strategy proposed does not necessitate expensive resources or time-consuming handcrafted resources, but does require a large amount of text.

Fabrizio Gotti, Philippe Langlais

Measuring Human Emotion in Short Documents to Improve Social Robot and Agent Interactions

Social robots and agents can interact with people better if they can infer their affective state (emotions). While they cannot yet recognise affective state from tone and body language, they can use the fragments of speech that they (over)hear. We show that emotions – as conventionally framed – are difficult to detect. We suggest, from empirical results, that this is because emotions are the wrong granularity; and that emotions contain subemotions that are much more clearly separated from one another, and so are both easier to detect and to exploit.

David Skillicorn, Nasser Alsadhan, Richard Billingsley, Mary-Anne Williams

Exploiting Symmetry of Independence in d-Separation

In this paper, we exploit the symmetry of independence in the implementation of d-separation. We show that it can matter whether the search is conducted from start to goal or vice versa. Analysis reveals it is preferable to approach observed v-structure nodes from the bottom. Hence, a measure, called depth, is suggested to decide whether the search should run from start to goal or from goal to start. One salient feature is that depth can be computed during a pruning optimization step widely implemented. An empirical comparison is conducted against a clever implementation of d-separation. The experimental results are promising in two aspects. The effectiveness of our method increases with network size, as well as with the amount of observed evidence, culminating with an average time savings of 9% in the 9 largest BNs used in our experiments.

Cory J. Butz, André E. dos Santos, Jhonatan S. Oliveira, Anders L. Madsen

Personality Extraction Through LinkedIn

LinkedIn is a professional social network used by many recruiters as a way to look for potential employees and communicate with them. In order to facilitate communication, it is possible to use personality models to gain a better understanding of what drives the person of interest. This paper first looks at the possibility of collecting a corpus on LinkedIn labelled with a personality model, which has never been done before, then looks at the possibility of extracting the user's personality under two different models. We show that we can achieve precision ranging from 73.7% to 80.5% on the DiSC personality model and from 80.7% to 86.2% on the MBTI personality model. These results are similar to what has been found on other social networks such as Facebook or Twitter, which is surprising given the more professional nature of LinkedIn. Finally, an analysis of the significance of the results and of the possible sources of errors is presented.

Frédéric Piedboeuf, Philippe Langlais, Ludovic Bourg

Solving Influence Diagrams with Simple Propagation

Recently, Simple Propagation was introduced as an algorithm for belief update in Bayesian networks using message passing in a junction tree. The algorithm differs from other message passing algorithms such as Lazy Propagation in the message construction process. The message construction process in Simple Propagation identifies relevant potentials and variables to eliminate using the one-in, one-out-principle. This paper introduces Simple Propagation as a solution algorithm for influence diagrams with discrete variables. The one-in, one-out-principle is not directly applicable to influence diagrams. Hence, the principle is extended to cope with decision variables, utility functions, and precedence constraints to solve influence diagrams. Simple Propagation is demonstrated on an extensive example and a number of useful and interesting properties of the algorithm are described.

Anders L. Madsen, Cory J. Butz, Jhonatan Oliveira, André E. dos Santos

Uncertain Evidence for Probabilistic Relational Models

Standard approaches for inference in probabilistic relational models include lifted variable elimination (LVE) for single queries. To efficiently handle multiple queries, the lifted junction tree algorithm (LJT) uses a first-order cluster representation of a model, employing LVE as a subroutine in its steps. LVE and LJT can only handle certain evidence. However, most events are not certain. The purpose of this paper is twofold: (i) to adapt LVE, presenting $$\text{LVE}^{evi}$$, to handle uncertain evidence and (ii) to incorporate uncertain evidence for multiple queries in LJT, presenting $$\text{LJT}^{evi}$$. With $$\text{LVE}^{evi}$$ and $$\text{LJT}^{evi}$$, we can handle uncertain evidence for probabilistic relational models, while benefiting from the lifting idea. Further, we show that uncertain evidence does not have a detrimental effect on completeness results and leads to similar runtimes as certain evidence.

Marcel Gehrke, Tanya Braun, Ralf Möller

Enhanced Collaborative Filtering Through User-Item Subgroups, Particle Swarm Optimization and Fuzzy C-Means

Recommender systems are information filtering systems that assist users in retrieving relevant information from massive amounts of data. Collaborative filtering (CF) is the most widely used technique in recommender systems for predicting the interests of a user in particular items. In traditional CF, preferences for all items from many users are collected in the prediction process, and this may include items that are irrelevant to the active user (the user for whom the prediction is made). Recently, subgroup-based methods have emerged which take into account the correlation between users and a set of items to rule out consideration of superfluous items. In this paper, our objective is to explore CF that considers only user-item subgroups, each consisting of a subset of similar users based on a subset of items. We propose a novel hybrid framework based on Particle Swarm Optimization and Fuzzy C-Means clustering that optimizes the searching behaviour of user-item subgroups in CF. The proposed algorithm is evaluated and compared with several state-of-the-art algorithms using benchmark datasets. Accuracy metrics such as precision, recall and mean average precision are used to find the top N recommended items.

Ayangleima Laishram, Vineet Padmanabhan

TextKD-GAN: Text Generation Using Knowledge Distillation and Generative Adversarial Networks

Text generation is of particular interest in many NLP applications such as machine translation, language modeling, and text summarization. Generative adversarial networks (GANs) have achieved remarkable success in high-quality image generation in computer vision, and recently GANs have gained a lot of interest from the NLP community as well. However, achieving similar success in NLP is more challenging due to the discrete nature of text. In this work, we introduce a method using knowledge distillation to effectively exploit the GAN setup for text generation. We demonstrate how autoencoders (AEs) can be used for providing a continuous representation of sentences, which is a smooth representation that assigns non-zero probabilities to more than one word. We distill this representation to train the generator to synthesize similar smooth representations. We perform a number of experiments to validate our idea using different datasets and show that our proposed approach yields better performance in terms of the BLEU score and Jensen-Shannon distance (JSD) measure compared to traditional GAN-based text generation approaches without pre-training.

Md. Akmal Haidar, Mehdi Rezagholizadeh

SALSA-TEXT: Self Attentive Latent Space Based Adversarial Text Generation

Inspired by the success of self attention mechanism and Transformer architecture in sequence transduction and image generation applications, we propose novel self attention-based architectures to improve the performance of adversarial latent code-based schemes in text generation. Adversarial latent code-based text generation has recently gained a lot of attention due to its promising results. In this paper, we take a step to fortify the architectures used in these setups, specifically AAE and ARAE. We benchmark two latent code-based methods (AAE and ARAE) designed based on adversarial setups. In our experiments, the Google sentence compression dataset is utilized to compare our method with these methods using various objective and subjective measures. The experiments demonstrate the proposed (self) attention-based models outperform the state-of-the-art in adversarial code-based text generation.

Jules Gagnon-Marchand, Hamed Sadeghi, Md. Akmal Haidar, Mehdi Rezagholizadeh

Automatically Learning a Human-Resource Ontology from Professional Social-Network Data

In this work, we build an automatically learned ontology in the domain of Human Resources by using a simple, efficient and undemanding procedure. Our principal challenge is to tackle the problem of automatically grouping human-provided job titles into a hierarchy and by similarity (as they are presented in human-made HR ontologies). We use the Louvain algorithm, a greedy optimization method that, given a sufficient amount of data, interconnects domain-specific jobs that have more skills in common than jobs from different domains. In our case, we used publicly available profiles from LinkedIn (written in English by users in France). An automatic evaluation was performed and shows that the resulting ontology is similar in size and structure to ESCO (one of the most complete human-made ontologies for HR). The whole procedure allows recruitment professionals to easily generate and update this ontology with virtually no human intervention.

David Alfonso-Hermelo, Philippe Langlais, Ludovic Bourg

Efficient Transformer-Based Sentence Encoding for Sentence Pair Modelling

Modelling a pair of sentences is important for many NLP tasks such as textual entailment (TE), paraphrase identification (PI), semantic relatedness (SR) and question answer pairing (QAP). Most sentence pair modelling work has looked only at the local context to generate a distributed sentence representation without considering the mutual information found in the other sentence. The proposed attentive encoder uses the representation of one sentence generated by a multi-head transformer encoder to guide the focussing on the most semantically relevant words from the other sentence using multi-branch attention. Evaluating this novel sentence encoder on the TE, PI, SR and QAP tasks shows notable improvements over the standard Transformer encoder as well as other current state-of-the-art models.

Mahtab Ahmed, Robert E. Mercer

Instance Ranking and Numerosity Reduction Using Matrix Decomposition and Subspace Learning

One way to deal with the ever increasing amount of available data for processing is to rank data instances by usefulness and reduce the dataset size. In this work, we introduce a framework to achieve this using matrix decomposition and subspace learning. Our central contribution is a novel similarity measure for data instances that uses the basis obtained from matrix decomposition of the dataset. Using this similarity measure, we propose several related algorithms for ranking data instances and performing numerosity reduction. We then validate the effectiveness of these algorithms for data reduction on several datasets for classification, regression, and clustering tasks.

Benyamin Ghojogh, Mark Crowley

Hybrid Temporal Situation Calculus

We present a hybrid discrete-continuous extension of Reiter’s temporal situation calculus, directly inspired by hybrid systems in control theory. While keeping to the foundations of Reiter’s approach, we extend it by adding a time argument to all fluents that represent continuous change. Thereby, we ensure that change can happen not only because of actions, but also due to the passage of time. We present a systematic methodology to derive, from simple premises, a new group of axioms which specify how continuous fluents change over time within a situation. We study regression for our new hybrid action theories and demonstrate what reasoning problems can be solved. Finally, we show that our hybrid theories indeed capture hybrid automata.

Vitaliy Batusov, Giuseppe De Giacomo, Mikhail Soutchanski

3D Depthwise Convolution: Reducing Model Parameters in 3D Vision Tasks

Standard 3D convolution operations usually require more memory and computation than 2D convolution operations. This increases the difficulty of developing deep neural nets for many 3D vision tasks. In this paper, we investigate the possibility of applying depthwise separable convolutions in 3D scenarios and introduce the use of 3D depthwise convolution. A 3D depthwise convolution splits a single standard 3D convolution into two separate steps, which drastically reduces the number of parameters in 3D convolutions, by more than one order of magnitude. We experiment with 3D depthwise convolution on popular CNN architectures and also compare it with a similar structure called pseudo-3D convolution. The results demonstrate that, with 3D depthwise convolutions, 3D vision tasks like classification and reconstruction can be carried out with more lightweight neural networks while still delivering comparable performance.

Rongtian Ye, Fangyu Liu, Liqiang Zhang
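
The parameter saving claimed in the abstract above can be checked with simple arithmetic. The sketch below assumes the two steps are a per-channel (depthwise) k×k×k convolution followed by a 1×1×1 pointwise convolution, as in the usual 2D depthwise separable design, and it ignores bias terms; the abstract does not spell out the exact decomposition, and the layer sizes are hypothetical.

```python
def conv3d_params(c_in, c_out, k):
    """Weights in a standard 3D convolution with a k x k x k kernel (no bias)."""
    return k ** 3 * c_in * c_out


def separable_conv3d_params(c_in, c_out, k):
    """Weights in an assumed depthwise (k^3 per channel) + pointwise (1x1x1) pair."""
    depthwise = k ** 3 * c_in   # one k x k x k filter per input channel
    pointwise = c_in * c_out    # 1 x 1 x 1 convolution mixing channels
    return depthwise + pointwise


# Hypothetical layer: 64 input channels, 128 output channels, 3 x 3 x 3 kernel.
standard = conv3d_params(64, 128, 3)             # 221184
separable = separable_conv3d_params(64, 128, 3)  # 9920
reduction = standard / separable                 # roughly 22x fewer weights
```

For this layer the reduction is a little over one order of magnitude, which is consistent with the figure quoted in the abstract under the stated assumptions.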

Identifying Misaligned Spans in Parallel Corpora Using Change Point Detection

Parallel corpora are the basic resource for many multilingual natural language processing models. Recent advances in, e.g. neural machine translation have shown that the quality of the alignment in the corpus has a crucial impact on the quality of the resulting model, renewing interest in filtering automatically aligned corpora to increase their quality. In this contribution, we investigate the use of a fast change point detection method to detect possibly problematic parts of a parallel corpus. We demonstrate its performance on German-English corpora of 11k and 31k sentences, achieve a boundary identification performance above 80% and improve the detection of genuine parallel sentences up to 88%. To our knowledge this is the first application of change point detection to the problem of error detection in noisy corpora.

Andrea Pagotto, Patrick Littell, Yunli Wang, Cyril Goutte

In Vino Veritas: Estimating Vineyard Grape Yield from Images Using Deep Learning

Agricultural harvest estimation is an important, yet challenging problem to which machine learning can be applied. There is value in having better methods of yield estimation based on data that can be captured with inexpensive technology in the field. This research investigates five approaches to using convolutional neural networks (CNNs) to develop models that can estimate the weight of grapes on the vine from an image taken by a smartphone. The results indicate that a combination of image processing and deep CNN machine learning can produce models that are sufficiently accurate within a variety of grape for data captured at harvest time. The best approach involved transfer learning, where a CNN is developed starting from the weights of a pretrained density map model that learns to output the location of grapes in the image. The best model achieved an MAE of 157 g over a mean average weight of 1335 g, or an MAE% of 11.8.

Daniel L. Silver, Tanya Monga

Options in Multi-task Reinforcement Learning - Transfer via Reflection

Temporally extended actions such as options are known to lead to improvements in reinforcement learning (RL). At the same time, transfer learning across different RL tasks is an increasingly active area of research. Following Baxter’s formalism for transfer, the corresponding RL question considers the benefit that an RL agent can achieve on new tasks based on experience from previous tasks in a common “learning environment”. We address this in the specific context of goal-based multi-task RL, where the different tasks correspond to different goal states within a common state space, and we introduce Landmark Options Via Reflection (LOVR), a flexible framework that uses options to transfer domain knowledge. As an explicit analog of principles in transfer learning, we provide theoretical and empirical results demonstrating that when a set of landmark states covers the state space suitably, then a LOVR agent that learns optimal value functions for these in an initial phase and deploys the associated optimal policies as options in the main phase, can achieve a drastic reduction in cumulative regret compared to baseline approaches.

Nicholas Denis, Maia Fraser

The Invisible Power of Fairness. How Machine Learning Shapes Democracy

Many machine learning systems make extensive use of large amounts of data regarding human behaviors. Several researchers have found various discriminatory practices related to the use of human-related machine learning systems, for example in the field of criminal justice, credit scoring and advertising. Fair machine learning is therefore emerging as a new field of study to mitigate biases that are inadvertently incorporated into algorithms. Data scientists and computer engineers are making various efforts to provide definitions of fairness. In this paper, we provide an overview of the most widespread definitions of fairness in the field of machine learning, arguing that the ideas highlighting each formalization are closely related to different ideas of justice and to different interpretations of democracy embedded in our culture. This work intends to analyze the definitions of fairness that have been proposed to date to interpret the underlying criteria and to relate them to different ideas of democracy.

Elena Beretta, Antonio Santangelo, Bruno Lepri, Antonio Vetrò, Juan Carlos De Martin

Maize Insects Classification Through Endoscopic Video Analysis

Early identification of insects in grains is of paramount importance to avoid losses. Instead of sampling and visual/laboratory analysis of grains, we propose carrying out the insect identification task automatically, using endoscopic video analysis methods. As the classification of moving objects in video relies heavily on their precise segmentation, we propose a new background subtraction method and compare its results with the main methods in the literature according to a comprehensive review. The background subtraction method relies on a binarization process that uses two thresholds: a global and a local threshold. The binarized results are combined by adding details of the object obtained by the local threshold to the result of the global threshold. Experimental results, obtained through visual analysis of the segmentation results and using an SVM classifier, suggest that the proposed segmentation method produces more accurate results than the state-of-the-art background subtraction methods.

André R. de Geus, Marcos A. Batista, Marcos N. Rabelo, Celia Z. Barcelos, Sérgio F. da Silva

Collaborative Clustering Approach Based on Dempster-Shafer Theory for Bag-of-Visual-Words Codebook Generation

Feature encoding methods play an important role in the performance of recognition tasks. The Bag-of-Visual-Words (BoVW) paradigm aims to assign the feature vectors to the codebook visual words. However, in the codebook generation phase, different clustering algorithms can be used, each giving a different set of visual words. Thus, the choice of a discriminative set of visual words is a challenging task. In this work, we propose an enhanced bag-of-visual-words codebook generation approach using a collaborative clustering method based on the Dempster-Shafer Theory (DST). First, we built three codebooks using the k-means, the Fuzzy C-Means (FCM), and the Gaussian Mixture Model (GMM) clustering algorithms. Then, we computed the Agreement Degrees Vector (ADV) between the clusters of the pairs (k-means, GMM) and (k-means, FCM). We merged the obtained ADVs using the DST in order to generate the cluster weights. We evaluated the proposed approach for Remote Sensing Image Scene Classification (RSISC). The results proved the effectiveness of our proposed approach and showed that it can be applied to different recognition tasks in various domains.

Sabrine Hafdhellaoui, Yaakoub Boualleg, Mohamed Farah
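
The abstract above merges evidence using Dempster-Shafer Theory. As a generic illustration, the sketch below implements Dempster's rule of combination for two mass functions over subsets of a frame of discernment. How the authors turn Agreement Degrees Vectors into mass functions is not stated in the abstract, so the example masses and the names `m1`, `m2` are purely hypothetical.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two mass functions (dict: frozenset -> mass) with Dempster's rule."""
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            # Mass flows to the intersection of the two focal elements.
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            # Disjoint focal elements contribute to the conflict K.
            conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("total conflict: the sources cannot be combined")
    # Normalise by 1 - K so the combined masses sum to one.
    return {s: w / (1.0 - conflict) for s, w in combined.items()}


# Hypothetical two-element frame {a, b} and two sources of evidence.
A, B = frozenset({"a"}), frozenset({"b"})
AB = A | B
m1 = {A: 0.6, AB: 0.4}
m2 = {B: 0.5, AB: 0.5}
fused = dempster_combine(m1, m2)
```

In this example the conflict mass is 0.3 (source one backing `a` against source two backing `b`), so the remaining products are rescaled by 1/0.7.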

Memory-Efficient Backpropagation for Recurrent Neural Networks

Recurrent Neural Networks (RNN) process sequential data to capture the time-dependency in the input signal. Training a deep RNN conventionally involves segmenting the data sequence to fit the model into memory. Increasing the segment size permits the model to better capture long-term dependencies at the expense of creating larger models that may not fit in memory. Therefore, we introduce a technique to allow designers to train a segmented RNN and obtain the same model parameters as if the entire data sequence was applied regardless of the segment size. This enables an optimal capturing of long-term dependencies. This technique can increase the computational complexity during training. Hence, the proposed technique grants designers the flexibility of balancing memory and runtime requirements. To evaluate the proposed method, we compared the total loss achieved on the testing dataset after every epoch while varying the size of the segments. The results we achieved show matching loss graphs irrespective of the segment size.

Issa Ayoub, Hussein Al Osman

A Behavior-Based Proactive User Authentication Model Utilizing Mobile Application Usage Patterns

Access to smart home networks is mostly achieved through end-user devices, especially mobile phones, but such devices are susceptible to theft or loss. In this paper, we present a user authentication model based on application access events, using only a small amount of information, thus reducing the computation time. To validate our model, we utilize a public real-world dataset collected from real users over a long period of time, in an uncontrolled manner. The model is evaluated by differentiating between users who utilize shared apps at the same daily intervals. In addition, we evaluate various classification approaches regarding legitimate user classification in compliance with the history of app usage events. The results demonstrate the capacity of the presented model to authenticate users with high true positive and true negative rates.

Yosef Ashibani, Qusay H. Mahmoud

Sparseout: Controlling Sparsity in Deep Networks

Dropout is commonly used to help reduce overfitting in deep neural networks. Sparsity is a potentially important property of neural networks, but is not explicitly controlled by Dropout-based regularization. In this work, we propose Sparseout, a simple and efficient variant of Dropout that can be used to control the sparsity of the activations in a neural network. We theoretically prove that Sparseout is equivalent to an $$L_q$$ penalty on the features of a generalized linear model and that Dropout is a special case of Sparseout for neural networks. We empirically demonstrate that Sparseout is computationally inexpensive and is able to control the desired level of sparsity in the activations. We evaluated Sparseout on image classification and language modelling tasks to see the effect of sparsity on these tasks. We found that sparsity of the activations is favorable for language modelling performance while image classification benefits from denser activations. Sparseout provides a way to investigate sparsity in state-of-the-art deep learning models. Source code for Sparseout can be found at https://github.com/najeebkhan/sparseout .

Najeeb Khan, Ian Stavness

Crowd Prediction Under Uncertainty

In this paper we use a newly published method, DFFT, to estimate counts of crowds in unseen environments. Our main objective is to explore the relationship between noise in the input snapshots (of crowds) that a DFFT-based pipeline requires and the errors made in the predictions. If such a relationship exists, we could apply our pipeline to the understanding of crowds; this application is extremely important to our industrial partners, who see utility in predicting crowds for objectives such as security of spatial planning. Our explorations indicate the possibility of such a characterization, but it depends on features of the actual environment being studied. Here we present 2 simulated environments of different difficulty, and we show how the predictions DFFT issues are of varying quality. We discuss the reasons we hypothesize are behind these performances and we set the ground for further experiments.

Luis Da Costa, Jean-François Rajotte

Optimized Random Walk with Restart for Recommendation Systems

Many sophisticated recommendation methods have been developed to produce recommendations for users. Among them, Random Walk with Restart (RWR) is one of the most widely used techniques. However, RWR has a large time complexity of $$O(k(n+m)^3)$$ and a memory complexity of $$O((n+m)^2)$$. Reducing this computational complexity is of great practical importance. In this paper, we propose an optimized version of random walk with restart, called the Optimized Random Walk with Restart (ORWR), and conduct theoretical and empirical studies on its performance. Mathematical analysis shows that using this technique the time complexity reduces to $$O(nm^2)$$ and the memory complexity to $$O(nm)$$. Experiments on three different recommendation problems using real-world datasets confirm that the proposed ORWR method improves both the time and memory cost of recommendation.

Seyyed Mohammadreza Rahimi, Rodrigo Augusto de Oliveira e Silva, Behrouz Far, Xin Wang
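
For readers unfamiliar with the baseline, plain Random Walk with Restart can be sketched as a power iteration on a column-normalised adjacency matrix. This is the textbook iterative formulation, not the ORWR optimisation proposed in the paper above, and not necessarily the formulation behind the complexity figures quoted there; the restart probability, iteration count, and the tiny graph are assumptions chosen for illustration.

```python
def random_walk_with_restart(adj, seed, restart=0.15, iters=200):
    """Iterate r <- (1 - c) * W r + c * e, with W the column-normalised adjacency."""
    n = len(adj)
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    r = [0.0] * n
    r[seed] = 1.0  # the walk starts at the seed node
    for _ in range(iters):
        nxt = [0.0] * n
        for i in range(n):
            # Probability mass flowing into node i from its neighbours.
            flow = sum(adj[i][j] / col_sums[j] * r[j]
                       for j in range(n) if col_sums[j] > 0)
            nxt[i] = (1 - restart) * flow + (restart if i == seed else 0.0)
        r = nxt
    return r


# Hypothetical path graph 0 - 1 - 2, with the walk restarting at node 0.
graph = [[0, 1, 0],
         [1, 0, 1],
         [0, 1, 0]]
scores = random_walk_with_restart(graph, seed=0)
```

The scores form a probability distribution over nodes; in a recommendation setting the nodes would be users and items, and the items with the highest scores relative to a seed user would be recommended.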

Inter and Intra Document Attention for Depression Risk Assessment

We take interest in the early assessment of risk for depression in social media users. We focus on the eRisk 2018 dataset, which represents users as a sequence of their written online contributions. We implement four RNN-based systems to classify the users. We explore several aggregation methods to combine predictions on individual posts. Our best model reads through all writings of a user in parallel but uses an attention mechanism to prioritize the most important ones at each timestep.

Diego Maupomé, Marc Queudot, Marie-Jean Meurs

Short Papers

Frontmatter

A Shallow Learning - Reduced Data Approach for Image Classification

Shepard Interpolation Neural Networks (SINN) lay a foundation for addressing the flaws of deep algorithms. Because they are inspired by statistical interpolation techniques rather than biological brains, the approach can be mathematically proven and the neuron interactions can be intuitively described. They also possess the ability to discriminate well with limited training data during the algorithm process. To enhance SINN beyond regular vectorized image inputs, we utilize hand-designed and natural image features to help the SINN perform better on benchmark image classification data sets. We compare these input feature vectors using the SINN framework on three benchmark image classification test sets, showing results comparable to the state-of-the-art (SOTA) for a fraction of the computational and memory requirements, owing to SINN's shallow learning ability.

Kaleb E. Smith, Phillip Williams

Multi-class Ensemble Learning of Imbalanced Bidding Fraud Data

E-auctions are vulnerable to Shill Bidding (SB), the toughest fraud to detect due to its resemblance to usual bidding behavior. To avoid financial losses for genuine buyers, we develop an SB detection model based on multi-class ensemble learning. For our study, we utilize a real SB dataset, but since the data are unlabeled, we combine a robust data clustering technique and a labeling approach to categorize the training data into three classes. To solve the issue of imbalanced SB data, we use an advanced multi-class over-sampling method. Lastly, we compare the predictive performance of ensemble classifiers trained with balanced and imbalanced SB data. Combining data sampling with ensemble learning improved the classifier accuracy, which is significant in fraud detection problems.

Farzana Anowar, Samira Sadaoui

Towards Causal Analysis of Protocol Violations

When a protocol specified within a given system fails to ensure some desired properties, it is important to identify the actual causes of this failure. In this paper, we utilize a formal model of causal analysis in the situation calculus to show how one can specify the actual causes of such violations in non-deterministic protocols defined within dynamic systems. We show that our definition has some desirable properties.

Shakil M. Khan, Mikhail Soutchanski

Lexicographic Preference Trees with Hard Constraints

The CP-net and the LP-tree are two fundamental graphical models for representing a user’s qualitative preferences. Constrained CP-nets have been studied in the past, in which a very expensive operation between outcomes, called dominance testing, is required. In this paper, we propose a recursive backtrack search algorithm that we call Search-LP to find the most preferable feasible outcome for an LP-tree extended with a set of hard constraints. Search-LP instantiates the variables with respect to a hierarchical order defined by the LP-tree. Since the LP-tree represents a total order over the outcomes, Search-LP simply returns the first feasible outcome without performing dominance testing. We prove that this returned outcome is preferable to every other feasible outcome.

Sultan Ahmed, Malek Mouhoub
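
The backtrack search summarized in the abstract above can be illustrated with a minimal sketch. It is not the authors' Search-LP: it assumes a simplified LP-tree in which the importance order of the variables and each variable's preference over its values are unconditional, and the names `search_lp`, `consistent`, and the menu example are hypothetical. Values are tried from most to least preferred, so the first complete assignment that satisfies the hard constraints is the lexicographically best feasible outcome.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Assignment = Dict[str, str]
# A hard constraint is (scope, predicate). The predicate is only checked
# once every variable in its scope has been assigned.
Constraint = Tuple[Sequence[str], Callable[[Assignment], bool]]


def consistent(assignment: Assignment, constraints: List[Constraint]) -> bool:
    """Return False if any fully instantiated constraint is violated."""
    for scope, predicate in constraints:
        if all(var in assignment for var in scope) and not predicate(assignment):
            return False
    return True


def search_lp(
    order: Sequence[str],
    preferences: Dict[str, Sequence[str]],
    constraints: List[Constraint],
    assignment: Optional[Assignment] = None,
) -> Optional[Assignment]:
    """Depth-first backtracking in importance order (simplified assumption).

    Returns the first feasible complete assignment, or None if none exists.
    """
    assignment = {} if assignment is None else assignment
    if len(assignment) == len(order):
        return dict(assignment)
    var = order[len(assignment)]          # next most important variable
    for value in preferences[var]:        # most preferred value first
        assignment[var] = value
        if consistent(assignment, constraints):
            result = search_lp(order, preferences, constraints, assignment)
            if result is not None:
                return result
        del assignment[var]               # undo and try the next value
    return None


# Hypothetical example: choosing a menu under two hard constraints.
order = ["main", "drink", "dessert"]
preferences = {
    "main": ["fish", "meat"],
    "drink": ["white", "red"],
    "dessert": ["cake", "fruit"],
}
constraints = [
    (("main", "drink"),
     lambda a: not (a["main"] == "fish" and a["drink"] == "white")),
    (("drink", "dessert"),
     lambda a: not (a["drink"] == "red" and a["dessert"] == "cake")),
]
best = search_lp(order, preferences, constraints)
print(best)
```

In this sketch no pairwise comparison of outcomes is needed, because the order in which values are tried already encodes the preference order.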

Supervised Versus Unsupervised Deep Learning Based Methods for Skin Lesion Segmentation in Dermoscopy Images

Image segmentation is considered a crucial step in automatic dermoscopic image analysis as it affects the accuracy of subsequent steps. The huge progress in deep learning has recently revolutionized the image recognition and computer vision domains. In this paper, we compare a supervised deep learning based approach with an unsupervised deep learning based approach for the task of skin lesion segmentation in dermoscopy images. Results show that, with the default parameter settings and network configurations proposed in the original approaches, the unsupervised approach could detect fine structures of skin lesions on some occasions, but the supervised approach achieves much higher accuracy in terms of Dice coefficient and Jaccard index: 77.7% vs. 40% and 67.2% vs. 30.4%, respectively. With a proposed modification to the unsupervised approach, the Dice and Jaccard values improved to 54.3% and 44%, respectively.

Abder-Rahman Ali, Jingpeng Li, Thomas Trappenberg
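
For readers unfamiliar with the two overlap measures reported in the abstract above, the following sketch computes the Dice coefficient and the Jaccard index for a pair of binary masks. The function name `dice_and_jaccard`, the flat 0/1 mask representation, and the convention of returning 1.0 when both masks are empty are assumptions made for illustration, not details taken from the paper.

```python
from typing import Sequence, Tuple


def dice_and_jaccard(pred: Sequence[int], truth: Sequence[int]) -> Tuple[float, float]:
    """Compute (Dice, Jaccard) for two flat binary masks of equal length."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    pred_size = sum(1 for p in pred if p)
    truth_size = sum(1 for t in truth if t)
    union = pred_size + truth_size - intersection
    if union == 0:
        # Both masks are empty; treating this as perfect overlap is a convention.
        return 1.0, 1.0
    dice = 2 * intersection / (pred_size + truth_size)
    jaccard = intersection / union
    return dice, jaccard


# Toy example: 6-pixel masks that share 2 foreground pixels.
predicted = [1, 1, 1, 0, 0, 0]
reference = [0, 1, 1, 1, 0, 0]
dice, jaccard = dice_and_jaccard(predicted, reference)
print(f"Dice={dice:.3f}, Jaccard={jaccard:.3f}")
```

For a single pair of masks, Dice is never smaller than Jaccard, which is consistent with the ordering of the values quoted in the abstract.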

Lifted Temporal Maximum Expected Utility

The lifted dynamic junction tree algorithm (LDJT) efficiently answers exact filtering and prediction queries for temporal probabilistic relational models by building and then reusing a first-order cluster representation of a knowledge base for multiple queries and time steps. To also support sequential online decision making, we extend the underlying model of LDJT with action and utility nodes, resulting in parameterised probabilistic dynamic decision models, and introduce meuLDJT to efficiently solve the exact lifted temporal maximum expected utility problem, while also answering marginal queries efficiently.

Marcel Gehrke, Tanya Braun, Ralf Möller

Automatic Generation of Video Game Character Images Using Augmented Structure-and-Style Networks

We propose a fast, flexible approach to automatically generating two-dimensional character images for video games. We treat the generation of character images as a two-part machine learning problem. The first task is to generate structured images that represent common attributes of characters as images. The second task is to add details to the structured images that appear to fit with some overall theme for each image. For both tasks, we employ generative adversarial network architectures, with modifications to improve their performance for their respective tasks. The resulting As2-GAN approach generates character images that are as realistic as those generated with the DCGAN approach, and are more consistent in quality and structural resemblance to images from the dataset. The As2-GAN approach also provides image creators with more control over images than typical one-step methods, while being able to generate high quality images using small datasets.

Matthew T. Mann, Howard J. Hamilton

Weaving Information Propagation: Modeling the Way Information Spreads in Document Collections

Information usually spreads between people by means of textual documents. During such propagation, a piece of information can either remain the same or mutate. We propose to formulate information spread with a set of time-ordered document chains along which some information has likely been transmitted. This formulation is different from the usual graph view of a transmission process as it integrates a notion of lineage of the information. We also propose a way to construct a candidate set of document chains for the information propagation in a corpus of documents. We show that most of the chains have been judged as plausible by human experts.

Charles Huyghues-Despointes, Leila Khouas, Julien Velcin, Sabine Loudcher

Mitigating Overfitting Using Regularization to Defend Networks Against Adversarial Examples

Recent work has shown that neural networks are vulnerable to adversarial examples. There is an ongoing discussion about whether this problem is related to overfitting. While many researchers stress that overfitting is not related to adversarial sensitivity, Galloway et al. [4] showed that mitigating overfitting improves the accuracy on adversarial examples. In this study, we add to the view that overfitting is a factor in adversarial sensitivity. To make this argument, we include two directions in our study: the first is to evaluate several standard regularization techniques with adversarial attacks, and the second is to evaluate binarized stochastic neural networks on adversarial examples. We report that strong regularizations, including binarized stochastic neural networks, not only reduce overfitting but also help the networks withstand adversarial attacks. Supplemental materials are available at https://github.com/ykubo82/ovf .

Yoshimasa Kubo, Thomas Trappenberg

Self-training for Cell Segmentation and Counting

Learning semantic segmentation and object counting often needs a large amount of training data, while manual labeling is expensive. The goal of this paper is to train such networks on a small set of annotations. We propose an Expectation Maximization (EM)-like self-training method that first trains a model on a small amount of labeled data and then adds unlabeled data with the model’s own predictions as labels. We find that the methods of thresholding used to generate pseudo-labels are critical and that only one of the methods proposed here can slightly improve the model’s performance on semantic segmentation. However, we also show that the induced value changes in the prediction map helped to isolate cells, which we use in a new counting algorithm.

J. Luo, S. Oore, P. Hollensen, A. Fine, T. Trappenberg

Enhancing Unsupervised Pretraining with External Knowledge for Natural Language Inference

Unsupervised pretraining such as BERT (Bidirectional Encoder Representations from Transformers) [2] represents the most recent advance on learning representation for natural language, which has helped achieve leading performance on many natural language processing problems. Although BERT can leverage large corpora, we assume it cannot learn all needed semantics and knowledge for natural language inference (NLI). In this paper, we leverage human-authorized external knowledge to further improve BERT, and our results show that BERT, the current state-of-the-art pretraining framework, can benefit from external knowledge.

Xiaoyu Yang, Xiaodan Zhu, Huasha Zhao, Qiong Zhang, Yufei Feng

An Experiment for Background Subtraction in a Dynamic Scene

This paper aims to analyze a background subtraction algorithm. Different from traditional methods, we feed the trained network with the target and background images. The paper focuses on how to get background images without using the temporal median filter. We use Gaussian mixture models to produce background images. In this way, the accuracy of background images increases. We also study the difference between grayscale and RGB images, and the effect of adding the foreground masks from the convolutional neural networks to the Gaussian mixture models. Experiments conducted on the 2014 ChangeDetection.net dataset show that our proposed method outperforms several state-of-the-art methods, including IUTIS-5, PAWCS, and SuBSENSE.

Ting-Yuan Lin, Jeng-Sheng Yeh, Fu-Che Wu, Yung-Yu Chuang, Andrew Dellinger

Machine Translation on a Parallel Code-Switched Corpus

Code-switching (CS) is the phenomenon that occurs when a speaker alternates between two or more languages within an utterance or discourse. In this work, we investigate the existence of code-switching in formal text, namely proceedings of multilingual institutions. Our study is carried out on Arabic-English code-mixing in a parallel corpus extracted from official documents of the United Nations. We build a parallel code-switched corpus with two reference translations, one in pure Arabic and the other in pure English. We also carry out a human evaluation of this resource with the aim of using it to evaluate the translation of code-switched documents. To the best of our knowledge, no corpus of this kind previously existed; the one we propose is unique. This paper examines several methods to translate a code-switched corpus: conventional statistical machine translation, end-to-end neural machine translation, and multitask learning.

M. A. Menacer, D. Langlois, D. Jouvet, D. Fohr, O. Mella, K. Smaïli

Weighting Words Using Bi-Normal Separation for Text Classification Tasks with Multiple Classes

An important use of natural language processing is creating vector representations of documents as features in a classification task. The traditional bag-of-words approach uses one-hot vector representations of words that aggregate into a sparse vector representation of the document. This representation can be enhanced by weighting words that contribute the most to a classification task. In this paper, we propose a generalization of the Bi-Normal Separation metric that enhances vector representations of documents and outperforms TF-IDF scaling algorithms for one-of-m classification tasks.

Jean-Thomas Baillargeon, Luc Lamontagne, Étienne Marceau
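
As background for the abstract above, the sketch below computes the original two-class Bi-Normal Separation score for a single word, namely the absolute difference between the inverse standard normal CDF of the word's true positive rate and that of its false positive rate. It does not implement the multi-class generalization the paper proposes. The function name `bns`, the count-based arguments, and the clipping bound `eps=0.0005` (a commonly used value that keeps the inverse CDF finite) are assumptions for illustration.

```python
from statistics import NormalDist


def bns(tp: int, fn: int, fp: int, tn: int, eps: float = 0.0005) -> float:
    """Two-class Bi-Normal Separation score for one word.

    tp / fn: positive documents that do / do not contain the word.
    fp / tn: negative documents that do / do not contain the word.
    """
    if tp + fn == 0 or fp + tn == 0:
        raise ValueError("both classes need at least one document")
    tpr = tp / (tp + fn)
    fpr = fp / (fp + tn)
    # Clip the rates away from 0 and 1, where the inverse CDF is undefined.
    tpr = min(max(tpr, eps), 1 - eps)
    fpr = min(max(fpr, eps), 1 - eps)
    inverse_cdf = NormalDist().inv_cdf
    return abs(inverse_cdf(tpr) - inverse_cdf(fpr))


# A word present in 90% of positive and 10% of negative documents
# scores high; a word equally common in both classes scores zero.
discriminative = bns(tp=90, fn=10, fp=10, tn=90)
uninformative = bns(tp=10, fn=10, fp=10, tn=10)
print(round(discriminative, 4), uninformative)
```

A score like this would typically replace the IDF factor when scaling term counts, so that words that separate the classes well receive larger weights.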

A Deep Learning Approach for Diagnosing Long QT Syndrome Without Measuring QT Interval

For decades, ECG segmentation and QT interval measurement have been two fundamental steps in ECG-based diagnosis of the long QT syndrome (LQTS). However, due to the subjective nature of the definition of Q and T wave boundaries and confusion with an adjacent U wave, it suffers from a high degree of inter- and intra-analyst variability. In this paper, without measuring the QT interval and extracting the ECG waves, we propose a convolutional neural network which receives the raw ECG signal, and classifies every heartbeat as Normal or LQTS. The network is trained using a dataset of genotype-positive LQTS, and genotype-negative normal ECGs of family relatives. Experimental results reveal a high accuracy in diagnosing LQTS non-invasively, with a very low computational complexity, guaranteeing the clinical application of the proposed method.

Habib Hajimolahoseini, Damian Redfearn, Andrew Krahn

Learning Career Progression by Mining Social Media Profiles

With the popularity of social media, large amounts of data have given us the possibility to learn and build products to optimize certain areas of our existence. In this work, we focus on exploring methods by which we can model the career trajectory of a given candidate, with the help of data mining techniques applied to professional social media data. We first discuss our efforts to normalize raw data in order to obtain data good enough for training predictive models. We then report the experiments we conducted. Results show that we can predict job transitions with 67% accuracy when looking at the top 10 predictions.

Zakaria Soliman, Philippe Langlais, Ludovic Bourg
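
The evaluation measure quoted in the abstract above, accuracy over the top 10 predictions, counts a prediction as correct when the true next job appears anywhere among the highest-ranked candidates. A minimal sketch follows; the function name `top_k_accuracy` and the job titles are hypothetical and are not taken from the paper's data.

```python
from typing import List, Sequence


def top_k_accuracy(ranked_predictions: Sequence[Sequence[str]],
                   true_labels: Sequence[str],
                   k: int = 10) -> float:
    """Fraction of examples whose true label is among the first k predictions."""
    if len(ranked_predictions) != len(true_labels):
        raise ValueError("need one ranked list per true label")
    if not true_labels:
        raise ValueError("need at least one example")
    hits = sum(
        1
        for ranked, truth in zip(ranked_predictions, true_labels)
        if truth in ranked[:k]
    )
    return hits / len(true_labels)


# Hypothetical ranked predictions of the next job title for three profiles.
ranked: List[List[str]] = [
    ["senior developer", "team lead", "architect"],
    ["data scientist", "analyst", "manager"],
    ["consultant", "manager", "director"],
]
truth = ["team lead", "manager", "professor"]
print(top_k_accuracy(ranked, truth, k=2))
print(top_k_accuracy(ranked, truth, k=3))
```

Increasing k can only keep the score the same or raise it, so a top-10 figure should be read as more lenient than exact-match accuracy.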

Toward a Conceptual Framework for Understanding AI Action and Legal Reaction

Artificial Intelligence (AI) refers to computational components that process, classify, make decisions, or act on information from data inputs, and in recent years more capable autonomous systems with real-time decision-making properties have become tenable. In this landscape it becomes imperative to consider the socio-technical implications of such systems, particularly at the legal level. This work facilitates this discussion, broadly highlighting the relationship between law and AI, and proposes a conceptual framework to understand the intersection between these disciplines. AI designers and legal reasoners are encouraged to apply this work to identify the connections and constraints involved when developing AI systems, and the legal responses to these systems.

Raheena Dahya, Alexis Morris

Unsupervised Sentiment Analysis of Objective Texts

Unsupervised learning is an emerging approach in sentiment analysis. In this paper, we apply unsupervised word and document embedding algorithms, Word2Vec and Doc2Vec, to medical and scientific text. We use SentiWordNet as the benchmark measure. Our empirical study is done on the Obesity NLP Challenge data set and four Science subgroups from Reuters 20 Newsgroups. Our results show that Word2Vec demonstrates a reliable performance in sentiment analysis of the text, whereas Doc2Vec requires more detailed studies.

Qufei Chen, Marina Sokolova

Efficient Sequence Labeling with Actor-Critic Training

Neural approaches to sequence labeling often use a Conditional Random Field (CRF) to model their output dependencies. We set out to establish Recurrent Neural Networks (RNNs) as an efficient alternative to CRFs, especially in tasks with a large number of output labels. We propose an adjusted actor-critic reinforcement learning algorithm to fine-tune the RNN (AC-RNN). Our comprehensive experiments suggest that AC-RNN efficiently matches the performance of the CRF on NER and CCG tagging, and outperforms it on Machine Transliteration, with an overall faster training time and smaller memory footprint.

Saeed Najafi, Colin Cherry, Grzegorz Kondrak

Detecting Depression from Voice

In this paper, we present our exploration of different machine-learning algorithms for detecting depression by analyzing the acoustic features of a person’s voice. We have conducted our study on benchmark datasets, in order to identify the best framework for the task, in anticipation of deploying it in a future application.

Mashrura Tasnim, Eleni Stroulia

A Machine Learning Method for Nipple-Areola Complex Localization for Chest Masculinization Surgery

Appropriately positioning the Nipple-Areola Complex (NAC) during chest masculinization surgery is a principal determinant of the aesthetic success of the procedure. Nonetheless, today, this positioning process relies on the subjective judgement of the surgeon. Therefore, this paper proposes a novel machine learning solution that leverages Artificial Neural Networks (ANNs) for estimating the NAC location on the chest wall. A dataset composed of 173 pictures of male subjects of various ages and body types was used. The ANN was fed a set of feature inputs based on distance ratios between features of the upper body that are common between both biological sexes (e.g. umbilicus, anterior axillary fold, suprasternal notch). Using the proposed ANN regressive model, we achieved a Root Mean Square Error (RMSE) of 0.0617 for the ratio of distances from the suprasternal notch to the center between the NACs, and from the latter point to the umbilicus. Furthermore, an RMSE of 0.0560 for the ratio of the distances between the NACs and from the anterior axillary fold to the umbilicus was obtained. Our results demonstrate that machine learning can be used to support the surgeon in the localization of the NAC for chest masculinization surgery.

Mohammad Ghodratigohar, Kevin Cheung, Natalie Baddour, Hussein Al Osman

A Generic Evolutionary Algorithm for Efficient Multi-Robot Task Allocations

Task allocation in multi-robot teams is conventionally carried out using customized algorithms against individual distributions due to their NP-hard nature. The expanding range of autonomous multi-robot operations demands a generic allocation scheme capable of working across a variety of problem distributions. This paper presents a novel task allocation scheme, based on an evolutionary algorithm, that is capable of working across a range of multi-robot problem distributions. Qualitative analysis against exact optimal solutions and a state-of-the-art auction-based scheme verifies the capabilities of the proposed algorithm.

Muhammad Usman Arif

Neural Prediction of Patient Needs in an Ovarian Cancer Online Discussion Forum

Social media is an important source to learn the concerns and needs of patients and caregivers in home care settings. However, manually identifying their needs can be labor-intensive and time-consuming. In this paper, we address the problem of need detection, automatically identifying patient needs in text. We explore both neural and traditional machine learning approaches, and evaluate them on a newly annotated dataset in an ovarian cancer discussion forum. We discuss issues and challenges of this novel task.

Hyeju Jang, Young Ji Lee, Giuseppe Carenini, Raymond Ng, Grace Campbell, Kendall Ho

Fully End-To-End Super-Resolved Bone Age Estimation

With the release of large-scale bone age assessment datasets and competitions looking at solving the problem of bone age estimation, there has been a large boom of machine learning in medical imaging which has attempted to solve this problem. Although many of these approaches use convolutional neural networks, they often include some specialized form of preprocessing which is often lengthy. We propose using a subpixel convolution layer in addition to an attention mechanism similar to those developed by Luong et al. in order to overcome some of the implicit problems with assuming particular placement and orientation of radiographs due to forced preprocessing.

Mohammed Gasmallah, Farhana Zulkernine, Francois Rivest, Parvin Mousavi, Alireza Sedghi

Name2Vec: Personal Names Embeddings

Predicting if two names refer to the same entity is an important task for many domains, such as information retrieval, record linkage and data integration. In this paper, we propose to create name-embeddings by employing a Doc2Vec methodology, where each name is viewed as a document and each letter in the name is considered a word. Our hypothesis is that representing names as documents, with letters as words, will help capture the internal structure of names and relationships among letters. We present and discuss an experimental study where we explore the effect of various parameters, and we assess the stability of the models built for the embedding of names. Our results show that the new proposed method can predict with high accuracy when a pair of names matches.

Jeremy Foxcroft, Adrian d’Alessandro, Luiza Antonie

Genome-Wide Canonical Correlation Analysis-Based Computational Methods for Mining Information from Microbiome and Gene Expression Data

Multi-omics datasets are very high-dimensional in nature and have relatively few samples compared to the number of features. Canonical correlation analysis (CCA)-based methods are commonly used for reducing the dimensions of such multi-view (multi-omics) datasets to test the associations among the features from different views and to make them suitable for downstream analyses (classification, clustering etc.). However, most of the CCA approaches suffer from lack of interpretability and result in poor performance in the downstream analyses. Presently, there is no well-explored comparison study for CCA methods with application to multi-omics datasets (such as microbiome and gene expression datasets). In this study, we address this gap by providing a detailed comparison study of three popular CCA approaches: regularized canonical correlation analysis (RCC), deep canonical correlation analysis (DCCA), and sparse canonical correlation analysis (SCCA), using a multi-omics dataset consisting of microbiome and gene expression profiles. We evaluated the methods in terms of the total correlation score and the classification performance. We found that SCCA provides reasonable correlation scores in the reduced space, enables interpretability, and also provides the best classification performance among the three methods.

Rayhan Shikder, Pourang Irani, Pingzhao Hu

Semantic Roles: Towards Rhetorical Moves in Writing About Experimental Procedures

Scholarly writing in the experimental biomedical sciences follows the IMRaD (Introduction, Methods, Results, and Discussion) structure. Many Biomedical Natural Language Processing tasks take advantage of this structure. The task of interest in this paper is the identification of semantic roles of procedural verbs as a first step toward identifying rhetorical moves, text segments that are rhetorical and perform specific communicative goals, in the Methods section. Based on a descriptive taxonomy of rhetorical moves structured around IMRaD, the foundational linguistic knowledge needed for a computationally feasible model of the rhetorical moves is described: semantic roles. Using the observation that the structure of scholarly writing in the laboratory-based experimental sciences closely follows the laboratory procedures, we focus on the procedural verbs in the Methods section. Our goal is to provide FrameNet and VerbNet-like information for the specialized domain of biochemistry. This paper presents the semantic roles required to achieve this goal.

Mohammed Alliheedi, Robert E. Mercer

Compression Improves Image Classification Accuracy

We study the relationship between the accuracy of image classification and the level of image compression. Specifically, we look at how various levels of JPEG and SVD compression affect the score of the correct answer in Inception-v3, a TensorFlow-based image classifier trained on the ImageNet database. Surprisingly, the compression seems to improve the ability of Inception-v3 to recognize images, with the best performance seen at fairly high degrees of compression for most images tested (with half achieving maximal score at JPEG quality under 15, corresponding to more than tenfold reduction in file size). The same behaviour holds for images compressed using the singular value decomposition (SVD) method. This phenomenon suggests that even significant compression can be beneficial rather than detrimental to image classification accuracy, in particular for convolutional neural networks. Understanding when and why compression helps, and which compression algorithm and compression ratio are optimal for any given image, remains an open problem.

Nnamdi Ozah, Antonina Kolokolova

Predicting Commentaries on a Financial Report with Recurrent Neural Networks

Aim: The paper aims to automatically generate commentaries on financial reports. Background: Analysing and commenting on financial reports is critical to evaluate the performance of a company so that management may change course to meet the targets. Generating commentaries is a task that relies on the expertise of analysts. Methodology: We propose an encoder-decoder architecture based on Recurrent Neural Networks (RNN) that are trained on both financial reports and commentaries. This architecture learns to generate those commentaries from the patterns detected in the data. The proposed model is assessed on both synthetic and real data. We compare different neural network combinations on both encoder and decoder, namely GRU, LSTM and one-layer neural networks. Results: The accuracy of the generated commentaries is evaluated using BLEU, ROUGE and METEOR scores and the probability of commentary generation. The results show that a combination of a one-layer neural network as encoder and an LSTM as decoder provides a higher accuracy. Conclusion: We observe that the LSTM highly depends on long-term memory, particularly in learning from real commentaries.

Karim El Mokhtari, John Maidens, Ayse Bener

Applications of Feature Selection Techniques on Large Biomedical Datasets

The main goal of this paper is to determine the best feature selection algorithm to use on large biomedical datasets. Feature Selection shows a potential role in analyzing large biomedical datasets. Four different feature selection techniques have been employed on large biomedical datasets. These techniques were Information Gain, Chi-Squared, Markov Blanket Discovery, and Recursive Feature Elimination. We measured the efficiency of the selection, the stability of the algorithms, and the quality of the features chosen. Of the four techniques used, the Information Gain and Chi-Squared filters were the most efficient and stable. Both Markov Blanket Discovery and Recursive Feature Elimination took significantly longer to select features, and were less stable. The features selected by Recursive Feature Elimination were of the highest quality, followed by Information Gain and Chi-Squared, and Markov Blanket Discovery placed last. For the purpose of education (e.g. those in the biomedical field teaching data techniques), we recommend Information Gain or Chi-Squared filter. For the purpose of research or analyzing, we recommend one of the filters or Recursive Feature Elimination, depending on the situation. We do not recommend the use of Markov Blanket discovery for the situations used in this trial, keeping in mind that the experiments were not exhaustive.

Nicolas Ewen, Tamer Abdou, Ayse Bener
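
Of the four techniques compared in the abstract above, the Information Gain filter is the simplest to show in a few lines: it scores a feature by how much knowing its value reduces the entropy of the class label. The sketch below handles discrete features only; the function names and the toy data are assumptions for illustration and do not reproduce the paper's experimental setup.

```python
from collections import Counter
from math import log2
from typing import Dict, Hashable, List, Sequence


def entropy(labels: Sequence[Hashable]) -> float:
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(feature_values: Sequence[Hashable],
                     labels: Sequence[Hashable]) -> float:
    """Entropy of the labels minus their entropy conditioned on the feature."""
    if len(feature_values) != len(labels) or not labels:
        raise ValueError("need equally sized, non-empty sequences")
    n = len(labels)
    groups: Dict[Hashable, List[Hashable]] = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Toy data: one feature that determines the class and one that is unrelated.
labels = [0, 0, 1, 1]
informative = ["a", "a", "b", "b"]
irrelevant = ["a", "b", "a", "b"]
print(information_gain(informative, labels))
print(information_gain(irrelevant, labels))
```

A filter of this kind scores each feature independently and keeps the highest-scoring ones, which helps explain why filters tend to be fast compared with wrapper methods such as Recursive Feature Elimination that retrain a model repeatedly.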

DeepAnom: An Ensemble Deep Framework for Anomaly Detection in System Processes

Model checking and verification using Kripke structures and computational tree logic* (CTL*) use abstractions from the process to create the state-transition graphs that verify the model behavior. This scheme of profiling the behavior of a process means that the depth of the model behavior that can be synthesized correlates with the level of the model abstraction. Therefore, for complex processes, this approach does not produce a fine-grained behavioral model and does not capture the execution time interactions amongst processes, hardware, and the kernel because of state explosion problems. Hence, in this paper, we introduce DeepAnom: an ensemble deep framework for anomaly detection in system processes. DeepAnom targets anomalies in both time-driven and event-driven processes. We test the model with a dataset generated from an autonomous aerial vehicle application, and the results confirm our hypothesis that DeepAnom presents a deeper view of the system processes and can therefore capture anomalies of various scenarios.

Okwudili M. Ezeme, Michael Lescisin, Qusay H. Mahmoud, Akramul Azim

Predicting Sparse Clients’ Actions with CPOPT-Net in the Banking Environment

The digital revolution of the banking system, together with evolving European regulations, has pushed the major banking actors to innovate through new uses of their clients’ digital information. Given highly sparse client activities, we propose CPOPT-Net, an algorithm that combines the CP canonical tensor decomposition, a multidimensional matrix decomposition that factorizes a tensor as the sum of rank-one tensors, and neural networks. CPOPT-Net efficiently removes sparse information with a gradient-based resolution while relying on neural networks for time series predictions. Our experiments show that CPOPT-Net is capable of performing accurate predictions of the clients’ actions in the context of personalized recommendation. CPOPT-Net is the first algorithm to use non-linear conjugate gradient tensor resolution with neural networks to propose predictions of financial activities on a public data set.

Jeremy Charlier, Radu State, Jean Hilger

Contextual Generation of Word Embeddings for Out of Vocabulary Words in Downstream Tasks

Over the past few years, the use of pre-trained word embeddings to solve natural language processing tasks has considerably improved performance across the board. However, even though these embeddings are trained on gigantic corpora, the vocabulary is fixed and thus numerous out-of-vocabulary words appear in specific downstream tasks. Recent studies proposed models able to generate embeddings for out-of-vocabulary words given their morphology and context. These models assume that we have sufficient textual data in hand to train them. In contrast, we specifically tackle the case where such data is no longer available and we rely only on pre-trained embeddings. As a solution, we introduce a model that predicts meaningful embeddings from the spelling of a word as well as from the context in which it appears for a downstream task, without the need for pre-training on a given corpus. We thoroughly test our model on a joint tagging task in three different languages. Results show that our model helps consistently on all languages, outperforms other ways of handling out-of-vocabulary words and can be integrated into any neural model to predict out-of-vocabulary words.

Nicolas Garneau, Jean-Samuel Leboeuf, Luc Lamontagne

Using a Deep CNN for Automatic Classification of Sleep Spindles: A Preliminary Study

In this work we applied a deep convolutional neural network to a binary classification task of clinical relevance, namely detecting sleep spindles. Specifically, we studied the conditions that are conducive to successful training on small data, emphasizing how the number of processing layers and the relative proportion of the two classes of events affect performance. We demonstrate that, in contrast with our expectations, the number of processing layers did not influence performance. Instead, the relative proportion of events affected the speed of learning but did not affect accuracy. This ceases to be the case when one class represents less than 30% of the total events, wherein training does not lead to improvement above the chance level. Overall, this preliminary study provides a picture of the dynamics that characterize training on small data, while providing further insights to explore the potential of automatic detection of sleep spindles based on deep learning.

Francesco Usai, Thomas Trappenberg

Graduate Student Symposium Papers

Frontmatter

Principal Sample Analysis for Data Ranking

Because of the ever growing amounts of data, challenges have appeared for storage and processing, making data reduction still an important field of study. Numerosity reduction or prototype selection is one of the primary methods of data reduction. In this paper, we propose some possible improvements for Principal Sample Analysis (PSA) which is a numerosity reduction algorithm. The improvements are PSA in Hilbert space, improving its time complexity using anchor points, sample size estimation using PAC learning, and PSA for regression and clustering tasks.

Benyamin Ghojogh

Discrete-Event Systems for Modelling Decision-Making in Human Motor Control

Artificial intelligence, control theory and neuroscience have a long history of interplay. An example is human motor control: optimal feedback control describes low-level motor functions and reinforcement learning explains high-level decision-making, but where the two meet is not as well understood. Here I formulate the human motor decision-making problem, describe how discrete-event systems could model it and lay out future research paths to fill in this gap in the literature.

Richard Hugh Moulton

Event Prediction in Social Graphs Using 1-Dimensional Convolutional Neural Network

Social network graphs and structures possess implicit knowledge embedded in their nodes and edges, which may be exploited, using effective and efficient methods, for event prediction on these network structures. Thus, understanding the intrinsic patterns of relationship among spatial social actors, as well as their respective properties, is crucial for event prediction in social network graphs. Generally, event prediction problems are considered to be NP-Complete. This research work proposes an original approach (Graph-ConvNet) for predicting events in social network structures using a one-dimensional convolutional neural network (1D-ConvNet) model. Two distinct methodologies are proposed, each with its own characteristics and advantages. The first methodology introduces a pre-convolution layer that reframes the input social network graph as a two-dimensional adjacency matrix. Thereafter, feature-extraction operations are applied to reduce the linear dimensionality (across the x-axis) of the input matrix before it is introduced to a repetitive series of non-linear convolution and pooling operations. The second methodology operates on a joint input comprising the edge list (E) of the social graph and its associated feature space matrix. Based on the observations and findings from experiments thus far, the first method is suitable for relatively smaller network graphs (nodes ≤ 30,000), while the second method is a good fit for much larger network graphs (nodes > 30,000). Training and evaluation of these proposed approaches have been done on datasets (compiled in November 2017) extracted from real social network communities in 3 European countries, where each dataset comprises an average of 280,000 edges and 48,000 nodes.

Bonaventure C. Molokwu
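
The first methodology in the abstract above (graph, then adjacency matrix, then convolution and pooling) can be illustrated with a minimal sketch. This is not the authors' Graph-ConvNet: the kernel weights, the pooling size, and the function names `adjacency_matrix`, `conv1d`, and `max_pool1d` are assumptions made for illustration, and the dimensionality-reduction step is omitted.

```python
from typing import List, Tuple


def adjacency_matrix(num_nodes: int, edges: List[Tuple[int, int]]) -> List[List[int]]:
    """Reframe an undirected edge list as a 2-D adjacency matrix."""
    matrix = [[0] * num_nodes for _ in range(num_nodes)]
    for u, v in edges:
        matrix[u][v] = 1
        matrix[v][u] = 1
    return matrix


def conv1d(row: List[float], kernel: List[float]) -> List[float]:
    """Slide the kernel along one row (no padding) and apply a ReLU non-linearity."""
    k = len(kernel)
    return [
        max(0.0, sum(row[i + j] * kernel[j] for j in range(k)))
        for i in range(len(row) - k + 1)
    ]


def max_pool1d(values: List[float], size: int) -> List[float]:
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(values[i:i + size]) for i in range(0, len(values) - size + 1, size)]


# Hypothetical toy graph: a path over five nodes.
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
A = adjacency_matrix(5, edges)

# Hypothetical fixed weights; a trained model would learn these.
kernel = [1.0, -1.0, 1.0]

# One convolution + pooling pass per node (per row of the adjacency matrix).
features = [max_pool1d(conv1d(row, kernel), 2) for row in A]
```

Each row of the adjacency matrix is treated as a one-dimensional signal describing a node's neighbourhood, which is what makes a 1D convolution applicable to graph data in this kind of setup.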

A Framework for Determining Effective Team Members Using Evolutionary Computation in Dynamic Social Networks

The team formation problem (TFP) concerns the process of bringing experts from Social Networks (SN) together as teams in a collaborative working environment for a productive outcome. It has been proven to be an NP-hard problem. Our findings on a static SN using Evolutionary Computation (EC) showed a significant improvement over state-of-the-art methods on different datasets such as DBLP and the Palliative care network. Since complexity and dynamics are challenging properties of real-world SN, our current research focuses on these properties in discovering new individuals for the teams. The process of detecting suitable members for teams is typically a real-time application of link prediction. Although different methods have been proposed to enhance the performance of link prediction, these methods need significant improvement in accuracy. Moreover, we examine the changes in attributes over time between individuals of the SN, especially on the co-authorship network. We introduce a time-varying score function to evaluate active researchers, which uses the number of new collaborations and the number of frequent collaborations with existing connections. We also incorporate the shortest distance between any two individuals and introduce a score function to evaluate the skill similarity between any two individuals to form an effective team. We formulate link prediction as a multi-objective optimization problem with three objectives: the score of active researchers, skill similarity, and shortest distance. We solved this problem by applying the NSGA-II and MOCA frameworks.

Kalyani Selvarajah
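
The three-objective formulation in the abstract above can be illustrated with a Pareto-dominance check, the comparison that non-dominated sorting in NSGA-II is built on. This is a minimal sketch, not the authors' implementation: it assumes the active-researcher score and skill similarity are maximised while shortest distance is minimised, and the candidate names and objective values are hypothetical.

```python
from typing import Dict, List, Tuple

# (active-researcher score, skill similarity, shortest distance)
Objectives = Tuple[float, float, float]


def to_minimisation(obj: Objectives) -> Tuple[float, float, float]:
    """Negate the two maximised objectives so all three are minimised."""
    active, similarity, distance = obj
    return (-active, -similarity, distance)


def dominates(a: Objectives, b: Objectives) -> bool:
    """True if a is no worse than b on every objective and better on at least one."""
    a_min, b_min = to_minimisation(a), to_minimisation(b)
    no_worse = all(x <= y for x, y in zip(a_min, b_min))
    strictly_better = any(x < y for x, y in zip(a_min, b_min))
    return no_worse and strictly_better


def pareto_front(candidates: Dict[str, Objectives]) -> List[str]:
    """Return the names of candidates that no other candidate dominates."""
    return sorted(
        name
        for name, obj in candidates.items()
        if not any(
            dominates(other, obj)
            for other_name, other in candidates.items()
            if other_name != name
        )
    )


# Hypothetical candidate links with made-up objective values.
candidates = {
    "A": (0.9, 0.8, 2.0),
    "B": (0.7, 0.9, 1.0),
    "C": (0.6, 0.7, 3.0),  # worse than A on all three objectives
    "D": (0.9, 0.8, 2.5),  # ties A on two objectives, worse on distance
}
front = pareto_front(candidates)
```

Here `front` contains A and B: neither beats the other on all three objectives, so a multi-objective method keeps both as trade-off solutions rather than collapsing them into a single weighted score.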

Generating Accurate Virtual Examples for Lifelong Machine Learning

Lifelong machine learning (LML) is an area of machine learning research concerned with the human-like, persistent, and cumulative nature of learning. An LML system’s objective is to consolidate new information into an existing machine learning model without catastrophically disrupting the prior information. Our research addresses this LML retention problem for creating a knowledge consolidation network through task rehearsal without retaining the prior task’s training examples. We discovered that the training data reconstruction error from a trained Restricted Boltzmann Machine can be successfully used to generate accurate virtual examples from the reconstructed set of a uniform random set of examples given to the trained model. We also defined a measure for comparing the probability distributions of two datasets given to a trained network model, based on their reconstruction mean square errors.

Sazia Mahfuz

Towards a Novel Data Representation for Classifying Acoustic Signals

In this paper, we evaluate a novel data representation of acoustic signals that builds upon the traditional spectrogram representation through interpolation. The novel representation is used in training a deep Convolutional Neural Network for the task of marine mammal species classification. The resulting classifier is compared in terms of performance to several other classifiers trained on traditional spectrograms.

Mark Thomas

Safe Policy Learning with Constrained Return Variance

In a safety-critical application, it is desirable that the agent performs in a reliable and repeatable manner, which the conventional setting in reinforcement learning (RL) often fails to provide. In this work, we derive a novel algorithm to learn a safe hierarchical policy by constraining the direct estimate of the variance in the return in the Option-Critic framework [1]. We first present a novel theorem of safe control in policy gradient methods and then extend the derivation to the Option-Critic framework.

Arushi Jain
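
To make the idea of limiting return variance concrete, the sketch below scores a set of episodes by mean return minus a weighted variance term. This is a generic mean-variance criterion written under stated assumptions, not the algorithm derived in the paper: the penalty weight `psi`, the reward sequences, and the function names are all hypothetical, and variance is estimated from sampled episodes rather than learned directly.

```python
from statistics import mean, pvariance
from typing import List


def discounted_return(rewards: List[float], gamma: float) -> float:
    """Sum of rewards discounted by gamma, accumulated from the last step backwards."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g


def variance_penalised_score(episodes: List[List[float]], gamma: float, psi: float) -> float:
    """Mean return minus psi times the (population) variance of the return."""
    returns = [discounted_return(ep, gamma) for ep in episodes]
    return mean(returns) - psi * pvariance(returns)


# Hypothetical reward sequences collected under two policies.
steady_policy = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]  # returns 3 and 3
erratic_policy = [[0.0, 0.0, 0.0], [3.0, 3.0, 1.0]]  # returns 0 and 7

steady_score = variance_penalised_score(steady_policy, gamma=1.0, psi=0.1)
erratic_score = variance_penalised_score(erratic_policy, gamma=1.0, psi=0.1)
```

The erratic policy has the higher mean return (3.5 against 3.0), but its variance penalty lowers its score below that of the steady policy, which is the kind of preference for repeatable behaviour the abstract describes.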

Artificial Intelligence-Based Latency Estimation for Distributed Systems

Network latency is an important metric, especially for distributed systems. Depending on the system size, network latency can be either explicitly measured or predicted. However, prediction methods suffer from several drawbacks which lead to poor performance. The goal of this study is to demonstrate a novel method of network latency estimation that offers accuracy and efficiency gains over existing work. A number of machine learning techniques, such as conventional linear regression, convolutional neural networks, and support vector machines, are used to predict the value of the end-to-end latency between any given pair of nodes. Two datasets, Ubique and iConnect-Ubisoft, are used for training and testing the machine learning algorithms.

Shady A. Mohammed
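
Of the techniques listed in the abstract above, conventional linear regression is the simplest to sketch. The example below fits a one-variable least-squares line and uses it to predict latency for an unmeasured node pair. The input feature (a distance between nodes) and all numbers are hypothetical; the abstract does not state which features the study uses.

```python
from typing import List, Tuple


def fit_line(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Ordinary least squares for y = slope * x + intercept."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope: float, intercept: float, x: float) -> float:
    """Predicted latency for a node pair with feature value x."""
    return slope * x + intercept


# Hypothetical measurements: (distance between nodes, measured latency in ms).
distance = [100.0, 200.0, 300.0, 400.0]
latency_ms = [12.0, 22.0, 32.0, 42.0]

slope, intercept = fit_line(distance, latency_ms)
estimate = predict(slope, intercept, 500.0)  # pair that was never measured
```

Predicting rather than measuring matters at scale because the number of node pairs grows quadratically with the number of nodes, so a fitted model lets a system estimate latency for pairs it has not probed.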

Backmatter

Title: Advances in Artificial Intelligence
Edited by: Marie-Jean Meurs, Frank Rudzicz
Publisher: Springer International Publishing
Electronic ISBN: 978-3-030-18305-9
Print ISBN: 978-3-030-18304-2
DOI: https://doi.org/10.1007/978-3-030-18305-9