About this book

This book constitutes the refereed proceedings of the Third Mediterranean Conference on Pattern Recognition and Artificial Intelligence, MedPRAI 2019, held in Istanbul, Turkey, in December 2019.

The 18 revised full papers and one short paper presented were carefully selected from 54 submissions. The papers cover recent advances in different areas of pattern recognition and artificial intelligence, such as statistical, structural and syntactic pattern recognition, machine learning, data mining, neural networks, computer vision, multimedia systems, and information retrieval.



Room-Level Indoor Localization with Artificial Neural Networks

An indoor localization system determines the location of users or assets in indoor environments. Important applications of indoor localization include smart home systems, indoor navigation and tracking systems. In this work, a reliable neural network model is developed for localizing users at room level. The model is based on Wi-Fi signals received by the users’ devices in different rooms from various Wi-Fi access points. A neural network with two hidden layers and sigmoid activation functions is trained with back-propagation on the collected signal data. Some of the signals are set to 0 during the training process, which makes the model considerably more stable when some of the data required for prediction are not available. A new dataset is collected for evaluation in addition to the existing datasets. The performance of the model on the existing datasets and on the newly collected dataset is discussed and evaluated. The results are promising in terms of reliability and accuracy.
Ahmet Serdar Karadeniz, Mehmet Önder Efe
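
The zero-masking step mentioned in this abstract can be illustrated with a minimal Python sketch. The abstract does not say which signals are set to 0 or how often, so the independent random masking, the `drop_prob` parameter and the function name `mask_signals` are assumptions made only for illustration.

```python
import random


def mask_signals(rssi, drop_prob, rng):
    """Return a copy of one signal vector with some entries replaced by 0.

    Each entry is zeroed independently with probability drop_prob
    (an assumed scheme; the paper may select signals differently).
    """
    return [0.0 if rng.random() < drop_prob else value for value in rssi]


# Hypothetical signal strengths (dBm) from four access points.
rng = random.Random(0)
sample = [-45.0, -60.0, -72.0, -80.0]
masked = mask_signals(sample, drop_prob=0.5, rng=rng)
```

Applying such a mask to each training sample exposes the network to inputs with missing access points, which is the situation the abstract says the model should tolerate at prediction time.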

A Simple Authentication Method with Multilayer Feedforward Neural Network Using Keystroke Dynamics

Keystroke dynamics is a widely accepted behavioral biometric for user recognition and verification, which has been studied for nearly a century. Intrinsically, this biometric is used together with id/password authentication, forming multi-factor authentication. Several anomaly detection algorithms have been proposed for this task. While some proposals handle this problem by measuring data distance, taking correlation and dependence into account, others train complex and time-consuming models such as deep neural networks to reach the right approximation. Our paper presents a simple, accurate and lightweight method for user authentication. We show the effectiveness of our approach through comparisons with existing methods that also used the CMU keystroke dynamics benchmark dataset. Using a feedforward multilayer neural network with resilient backpropagation, we obtained an Equal Error Rate (EER) of 0.049 for authentication, with an overall identification accuracy of 94.7%.
Ahmet Melih Gedikli, Mehmet Önder Efe
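
The Equal Error Rate reported in this abstract is the operating point at which the false acceptance rate and the false rejection rate are equal. The sketch below shows one common way to estimate it from two lists of anomaly scores; the score convention (higher means more anomalous), the threshold sweep and the function name are assumptions, not details taken from the paper.

```python
def equal_error_rate(genuine, impostor):
    """Estimate the EER from anomaly scores (higher = more suspicious).

    A sample is accepted when its score is <= the threshold. The function
    sweeps every observed score as a threshold and returns the mean of the
    two error rates at the threshold where they are closest.
    """
    best_gap, eer = None, None
    for threshold in sorted(set(genuine) | set(impostor)):
        # Genuine users rejected because their score exceeds the threshold.
        frr = sum(s > threshold for s in genuine) / len(genuine)
        # Impostors accepted because their score is within the threshold.
        far = sum(s <= threshold for s in impostor) / len(impostor)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Hypothetical scores: one genuine sample overlaps the impostor range.
genuine_scores = [0.1, 0.2, 0.3, 0.6]
impostor_scores = [0.4, 0.7, 0.8, 0.9]
print(equal_error_rate(genuine_scores, impostor_scores))
```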

A Comparative Study of ANN Tuning Methods for Multiclass Daily Activity and Fall Recognition

Smartphones and other sensor-enabled devices are used very frequently in daily life. Movement data obtained by the sensors of these devices can be interpreted by artificial intelligence algorithms, which may be critically helpful in some daily life issues. Such a daily activity and fall classification mechanism is particularly important for rapid and accurate medical intervention for elderly people who live alone. In addition, real-time human activity recognition (HAR) is important for healthcare solutions and for better assistance from intelligent personal assistants (IPAs). In this study, the dataset is obtained from 6 different wearable sensors. It contains 20 daily activities and 16 fall motions across 3060 observations. To classify these movements separately, 3 different Artificial Neural Network (ANN) training algorithms were chosen as the basis: gradient descent, gradient descent with momentum, and Adam. Dropout and L2 regularization techniques are used to obtain better results on the test data. The results show that the ANN-based approach correctly recognizes the daily activities and falls with an accuracy of 94.58% on the test set.
Tevfik Aktay, Mehmet Önder Efe
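
To make the difference between the three training algorithms named in this abstract concrete, the following sketch applies each update rule to a one-dimensional toy objective, f(w) = (w - 3)^2. The objective, the learning rates and the other hyperparameter values are illustrative assumptions; the paper trains a full ANN on sensor data.

```python
import math


def grad(w):
    """Gradient of the toy objective f(w) = (w - 3)^2."""
    return 2.0 * (w - 3.0)


def gradient_descent(w, lr=0.1, steps=500):
    for _ in range(steps):
        w -= lr * grad(w)
    return w


def momentum(w, lr=0.1, beta=0.9, steps=500):
    velocity = 0.0
    for _ in range(steps):
        # Accumulate an exponentially decaying sum of past gradients.
        velocity = beta * velocity + grad(w)
        w -= lr * velocity
    return w


def adam(w, lr=0.1, b1=0.9, b2=0.999, eps=1e-8, steps=500):
    m = v = 0.0
    for t in range(1, steps + 1):
        g = grad(w)
        m = b1 * m + (1 - b1) * g        # first-moment estimate
        v = b2 * v + (1 - b2) * g * g    # second-moment estimate
        m_hat = m / (1 - b1 ** t)        # bias correction
        v_hat = v / (1 - b2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w


for optimiser in (gradient_descent, momentum, adam):
    print(optimiser.__name__, optimiser(0.0))
```

Momentum reuses past gradients to keep moving in a consistent direction, while Adam additionally rescales each step by a running estimate of the gradient magnitude.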

Exploring Model Transfer Potential for Airborne LiDAR Point Cloud Classification

The deep learning paradigm has been shown to be an effective framework in many applications, including airborne light detection and ranging point cloud classification. However, even a simple deep neural network has large quantities of parameters, and the optimal parameters generally need several hours to be learned. In this paper, we propose a framework to take full advantage of existing deep neural networks in image processing domains and to reduce the training time for classification. The framework is composed of four key steps: (1) calculate low-level features; (2) transform three-dimensional point clouds into multi-scale feature images by the proposed feature image generation strategy; (3) extract multi-scale deep features from the feature images by introducing transfer learning, i.e., a pre-trained neural network; and (4) learn higher-level features via a fully connected network and fuse higher-level features using a convolutional neural network. Our framework has been evaluated using a benchmark dataset provided by the International Society for Photogrammetry and Remote Sensing, and experimental results show that the proposed framework can reduce the time needed for obtaining an optimal classification model and effectively classify nine objects, such as buildings, the ground, and cars, with an overall accuracy of 90.1%, which is beneficial for providing reliable information for further applications.
Yuzhun Lin, Chuan Zhao, Daoji Li, Junfeng Xu, Baoming Zhang

Haralick Feature Guided Network for the Improvement of Generalization in Landcover Classification

This study examined the application of semantic segmentation in landcover classification, a recently popular task in the field of remote sensing. Most semantic segmentation methods exhibit strong sample dependence. As a result, they tend to have high prediction accuracy in similar areas, but low accuracy in other areas or in the same area at different time phases. Our approach utilizes three Haralick features to enhance the generalization ability. In addition, several variants were implemented for comparison. We found that these features can effectively improve the generalization of landcover classification.
Yuzhun Lin, Daoji Li, Chuan Zhao, Junfeng Xu, Baoming Zhang

Determining Location and Detecting Changes Using a Single Training Video

This paper proposes a new approach to find a robot’s current location and to detect any changes in its path, using monocular vision. A single obstacle-free training video is first recorded and saved. Then, a moving robot can use its camera to find its current location within its path by matching current frames with the ones from the training video. This frame-to-frame matching is performed using extracted feature points. Once a match is found, the corresponding frames are aligned (registered) using a homography that is calculated based on the matched feature points. This makes it possible to compensate for viewpoint changes between the observed and saved frames. Finally, we compare the regions of interest (ROIs) of the aligned frames using their colour histograms. We carried out seventeen tests using this approach. The videos, for both training and testing, were recorded using an off-the-shelf phone camera by walking down different paths. Four tests were performed in an outdoor environment, and thirteen in an indoor environment. Our tests have shown excellent results, with an accuracy above 95% for most of them, for finding the robot’s location and for detecting obstacles in the robot’s path. Both training and testing videos used in our tests were realistic and very challenging, as they consisted of a mix of indoor and outdoor environments with cluttered backgrounds, repetitive floor textures and glare.
Ryan Bluteau, Boubakeur Boufama, Pejman Habashi
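
The final step of this pipeline, comparing the ROIs of two aligned frames by their colour histograms, can be sketched as below. The abstract does not state the histogram layout or the comparison measure, so the joint RGB histogram with 4 bins per channel and the histogram-intersection score are assumptions chosen for illustration.

```python
def colour_histogram(pixels, bins=4):
    """Normalised joint RGB histogram of a list of (r, g, b) pixels (0..255).

    bins is assumed to divide 256 evenly.
    """
    hist = [0.0] * (bins ** 3)
    step = 256 // bins
    for r, g, b in pixels:
        index = (r // step) * bins * bins + (g // step) * bins + (b // step)
        hist[index] += 1.0
    return [count / len(pixels) for count in hist]


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1 means the two histograms are identical."""
    return sum(min(a, b) for a, b in zip(h1, h2))


# Hypothetical ROIs: the saved frame shows floor, the observed one an obstacle.
floor_roi = [(200, 200, 200)] * 4
obstacle_roi = [(20, 20, 220)] * 4
similarity = histogram_intersection(
    colour_histogram(floor_roi), colour_histogram(obstacle_roi)
)
```

A low similarity between the ROI in the training frame and the ROI in the aligned current frame would then be read as a change in the path.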

Improving Deep Learning Parkinson’s Disease Detection Through Data Augmentation Training

Deep learning has been successfully applied to different classification applications where large data are available. However, the lack of data makes it more difficult to predict Parkinson’s disease (PD) with deep models, which require a sufficient amount of training data. Online handwriting dynamic signals can provide more detailed and complex information for the PD detection task. In our previous work [1], two different deep models were studied for time series classification: the convolutional neural network (CNN) and the convolutional neural network-bidirectional long short-term memory network (CNN-BLSTM). Different approaches were applied to encode pen-based signals into images for the CNN model, while the raw time series were used directly with the CNN-BLSTM model. We showed that both the CNN model with spectrogram images as input and the CNN-BLSTM model improve the performance of time series classification applied to early PD stage detection. However, these approaches did not outperform classical support vector machine (SVM) classification applied to pre-engineered features. In this paper we investigate transfer learning and data augmentation approaches in order to train these models for PD detection on large-scale data. Various data augmentation methods for pen-based signals are proposed. Our experimental results show that the CNN-BLSTM model used with the combination of Jittering and Synthetic data augmentation methods provides promising results in the context of early PD detection, with accuracy reaching 97.62%. We have illustrated that a deep architecture can surpass models trained on pre-engineered features even though the available data are limited.
Catherine Taleb, Laurence Likforman-Sulem, Chafic Mokbel
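
Jittering, one of the augmentation methods named in this abstract, is commonly implemented by adding small random noise to each sample of a time series. The sketch below assumes zero-mean Gaussian noise with a user-chosen standard deviation `sigma`; the noise model and its parameters in the paper may differ.

```python
import random


def jitter(signal, sigma, rng):
    """Return a noisy copy of a 1-D time series (assumed Gaussian jitter)."""
    return [value + rng.gauss(0.0, sigma) for value in signal]


# Hypothetical pen-pressure samples and three augmented copies.
rng = random.Random(42)
pressure = [0.10, 0.35, 0.60, 0.55, 0.20]
augmented = [jitter(pressure, sigma=0.02, rng=rng) for _ in range(3)]
```

Each augmented copy keeps the label of the original recording, so a small dataset yields several slightly different training examples per subject.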

Human Action Recognition Using Stereo Trajectories

This paper proposes a new method that uses a pair of uncalibrated stereo videos, without the need for three-dimensional reconstruction, for human action recognition (HAR). Two stereo views of the same scene, obtained from two different cameras, are used to create a set of two-dimensional trajectories. Then, we calculate disparities between them and fuse them with the trajectories to obtain our disparity-augmented trajectories, which are used in our HAR method. The obtained results show, on average, a 2.40% improvement when using disparity-augmented trajectories compared to using the classical 2D trajectory information only. Furthermore, we have also tested our method on the challenging Hollywood 3D dataset and obtained competitive results at a faster speed than some state-of-the-art methods.
Pejman Habashi, Boubakeur Boufama, Imran Shafiq Ahmad

Deep Convolutional Neural Network with 2D Spectral Energy Maps for Fault Diagnosis of Gearboxes Under Variable Speed

For industrial safety, correct classification of gearbox fault conditions is necessary. One of the most crucial tasks in data-driven fault diagnosis is determining the best set of features by analyzing the statistical parameters of the signals. However, under variable speed conditions, these statistical parameters are incapable of uncovering the dynamic characteristics of different fault conditions of gearboxes. More recently, several deep learning algorithms have been used to improve the performance of the feature selection process, but domain knowledge expertise is still necessary. In this paper, a combination of domain knowledge analysis and a deep neural network is proposed. Using the input acoustic emission (AE) signal, a two-dimensional spectrum energy map (2D AE-SEM) is created to form an identical fault pattern for various speed conditions of gearboxes. Then, a deep convolutional neural network (DCNN) is proposed to investigate the detailed structure of the 2D input for final fault classification. This 2D AE-SEM offers a graphical depiction of acoustic emission spectral characteristics. Our proposed system offers vigorous and dynamic classification performance through the proposed DCNN, with a high diagnostic fault classification accuracy of 96.37% in all considered scenarios.
Md Junayed Hasan, Jongmyon Kim

Bayesian Convolutional Neural Network: Robustly Quantify Uncertainty for Misclassifications Detection

For safety- and mission-critical systems relying on Convolutional Neural Networks (CNNs), it is crucial to avoid incorrect predictions that can cause accidents or financial crises. This can be achieved by quantifying and interpreting the predictive uncertainty. Current methods for uncertainty quantification rely on Bayesian CNNs that approximate Bayesian inference via dropout sampling. This paper investigates different dropout methods to robustly quantify the predictive uncertainty for misclassification detection. Specifically, the following questions are addressed: In which layers should activations be sampled? Which dropout sampling mask should be used? What dropout probability should be used? How should the number of ensemble members be chosen? How should ensemble members be combined? How should the classification uncertainty be quantified? To answer these questions, experiments were conducted on three datasets using three different network architectures. Experimental results showed that the classification uncertainty is best captured by averaging the predictions of all stochastic CNNs sampled from the Bayesian CNN and by validating the predictions of the Bayesian CNN with three uncertainty measures, namely the predictive confidence, predictive entropy and standard deviation thresholds. The results further showed that the optimal dropout method, specified through the sampling location, sampling mask, inference dropout probability, and number of stochastic forward passes, depends on both the dataset and the designed network architecture. Notwithstanding this, I propose to sample inputs to max pooling layers with a cascade of a Multiplicative Gaussian Mask (MGM) followed by a Multiplicative Bernoulli Spatial Mask (MBSM) to robustly quantify the classification uncertainty while keeping the loss in performance low.
Cedrique Rovile Njieutcheu Tassi
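
The averaging of stochastic predictions and the three uncertainty measures named in this abstract can be sketched as follows. The input is assumed to be a list of softmax vectors, one per stochastic forward pass; taking the standard deviation over the winning class only is an assumption, since the abstract does not define it precisely.

```python
import math


def summarise_passes(probs):
    """Combine T stochastic softmax outputs for a single input.

    Returns (predicted label, predictive confidence, predictive entropy,
    standard deviation of the winning-class probability across passes).
    """
    passes, classes = len(probs), len(probs[0])
    mean = [sum(p[c] for p in probs) / passes for c in range(classes)]
    label = max(range(classes), key=lambda c: mean[c])
    confidence = mean[label]
    entropy = -sum(m * math.log(m) for m in mean if m > 0.0)
    std = math.sqrt(sum((p[label] - confidence) ** 2 for p in probs) / passes)
    return label, confidence, entropy, std


# Hypothetical outputs: the passes agree in one case and disagree in the other.
stable = summarise_passes([[0.9, 0.1], [0.9, 0.1], [0.9, 0.1]])
unstable = summarise_passes([[0.9, 0.1], [0.1, 0.9]])
```

A prediction would then be flagged as a likely misclassification when its confidence falls below, or its entropy or standard deviation rises above, a chosen threshold.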

Particle Filter for Trajectories of Movers from Laser Scanned Dataset

A laser scanner avoids the privacy problem because it records only scanned data points rather than real-world video, which gives it many benefits over a video camera. This paper presents an approach to detect and track movers from laser-scanned datasets. The laser-scanned data points from each scan are treated as a video frame. Blobs are extracted from each frame. A support vector machine (SVM) and the Hungarian method, along with a particle filter, are used to obtain the trajectories of movers. Experimental results on the same laser-scanned dataset demonstrate that the approach of SVM with the Hungarian method using a particle filter outperforms both the threshold-based approach with the Hungarian method using a Kalman filter and the approach of SVM with the Hungarian method using a Kalman filter.
Md. Haidar Sharif
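
In tracking pipelines of this kind, the Hungarian method is typically used to match existing tracks to the blobs detected in the next frame at minimum total cost. The sketch below solves the same assignment problem by exhaustive search instead of the Hungarian method, which keeps it short but makes it practical only for a handful of blobs; the Euclidean cost and the function name are assumptions.

```python
import math
from itertools import permutations


def associate(tracks, detections):
    """Match each track to a distinct detection, minimising total distance.

    Exhaustive search over assignments (not the Hungarian method itself);
    assumes len(tracks) <= len(detections).
    """
    best_cost, best = math.inf, None
    for perm in permutations(range(len(detections)), len(tracks)):
        cost = sum(
            math.dist(tracks[i], detections[j]) for i, j in enumerate(perm)
        )
        if cost < best_cost:
            best_cost, best = cost, perm
    return list(best)


# Hypothetical positions: predicted track locations and new blob centroids.
tracks = [(0.0, 0.0), (10.0, 10.0)]
detections = [(9.0, 9.0), (1.0, 1.0)]
matches = associate(tracks, detections)  # matches[i] = detection for track i
```

The Hungarian method returns the same optimal matching in polynomial time, which matters once the number of movers grows.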

Video Steganography Using 3D Convolutional Neural Networks

In steganography, the aim is to hide information of interest inside other data for secure data transfer, such as hiding an image in another image. The same task can be performed in video steganography. One approach to steganography is to manipulate the least significant bits (LSB). Increasing the precision of such approaches is still challenging and desirable. Attention to the sequence of video frames is an important issue that affects the accuracy of these methods. In this paper, a 3D convolutional neural network has been employed for the first time, to our knowledge, to perform video steganography. The incentive of exploiting the spatial and temporal features together pushed us toward this approach. Implementing the proposed method on the UCF101 dataset led, on average, to an improvement of 22.75 bits per pixel and an output video frame enhancement of 1 dB, respectively, compared to the LSB approach.
Mahdi Abdolmohammadi, Rahil Mahdian Toroghi, Azam Bastanfard
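
The LSB baseline that this abstract compares against can be sketched in a few lines: each secret bit overwrites the least significant bit of one cover value. The flat list of 8-bit pixel values and the function names are assumptions for illustration; the paper's own method is a 3D convolutional network, which is not shown here.

```python
def embed_lsb(cover, bits):
    """Hide one bit in the least significant bit of each cover value."""
    stego = list(cover)
    for index, bit in enumerate(bits):
        stego[index] = (stego[index] & ~1) | bit
    return stego


def extract_lsb(stego, count):
    """Read back the first `count` hidden bits."""
    return [value & 1 for value in stego[:count]]


# Hypothetical 8-bit pixel values and a 4-bit secret.
cover = [200, 201, 14, 15]
secret = [1, 0, 1, 0]
stego = embed_lsb(cover, secret)
recovered = extract_lsb(stego, len(secret))
```

Because only the lowest bit changes, each pixel value moves by at most 1, which is why the hidden data is hard to see but also why capacity is limited to one bit per value.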

DeepRank: Adapting Neural Tensor Networks for Ranking the Recommendations

Online real estate property portals are attracting a large audience because they make it easy to find properties for rental or sale/purchase. With a few clicks, a real estate portal can display relevant information to a user by ranking the searched items according to the user’s specifications. It is important that the ranking displays the most relevant search results to the user. Therefore, an efficient ranking algorithm that takes the user’s context into account is crucial for enhancing the user experience in finding real estate properties online. This paper proposes an expressive Neural Tensor Network to rank the properties returned by a search based on the similarity between two property entities. Previous similarity techniques do not take into account the numerous complex features used to define a property. We show that the performance can be enhanced if the property entities are represented as an average of their constituent features before finding the similarity between them. The proposed method takes each feature into account dynamically and ranks properties according to similarity with an accuracy of 86.6%.
Raaiha Humayun Kabir, Bisma Pervaiz, Tayyeba Muhammad Khan, Adnan Ul-Hasan, Raheel Nawaz, Faisal Shafait

An Intelligent Context Aware Recommender System for Real-Estate

Finding products and items that meet user needs in a large online space is difficult. Time spent searching before finding a relevant item can be a significant time sink for users. As in other economic sectors, growing Internet usage has also changed user behavior in the real-estate market. Advancements in virtual reality offer virtual tours and interactive maps and floor plans, which make online rental websites very popular among users. With the abundance of information, recommender systems become more important than ever to give the user relevant property suggestions and reduce search time. A sophisticated recommender in this domain can help reduce the need for a real-estate agent. Session-based user behavior and the lack of user profiles lead to the use of traditional recommendation methods. In this research, we propose an approach for real-estate recommendation based on the Gated Orthogonal Recurrent Unit (GORU) and Weighted Cosine Similarity. GORU captures the user search context, and weighted cosine similarity improves the rank of pertinent properties. We have used the data of an online public real estate web portal (AARZ.PK). The data represent the original behavior of users on an online portal. We have used Recall, User coverage and Mean Reciprocal Rank (MRR) metrics to evaluate our system against other state-of-the-art techniques. The proposed solution outperforms various baselines and state-of-the-art RNN-based solutions.
Faiza Rehman, Hira Masood, Adnan Ul-Hasan, Raheel Nawaz, Faisal Shafait
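
Two of the quantities mentioned in this abstract, weighted cosine similarity and Mean Reciprocal Rank, have compact definitions that the sketch below illustrates. The per-feature weighting scheme shown here is one common formulation and is an assumption; the abstract does not specify how the weights are defined or chosen.

```python
import math


def weighted_cosine(a, b, weights):
    """Cosine similarity where feature i contributes with weight weights[i]."""
    dot = sum(w * x * y for w, x, y in zip(weights, a, b))
    norm_a = math.sqrt(sum(w * x * x for w, x in zip(weights, a)))
    norm_b = math.sqrt(sum(w * y * y for w, y in zip(weights, b)))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mean_reciprocal_rank(ranked_lists, targets):
    """Average of 1/rank of the target item; 0 when the target is missing."""
    total = 0.0
    for ranked, target in zip(ranked_lists, targets):
        if target in ranked:
            total += 1.0 / (ranked.index(target) + 1)
    return total / len(targets)


# Hypothetical sessions: the clicked property is ranked 1st, then 2nd.
mrr = mean_reciprocal_rank([["a", "b"], ["b", "a"]], ["a", "a"])
```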

Integration of Fuzzy Clustering into the Case Base Reasoning for the Prediction of Response to Immunotherapy Treatment

The medical diagnostic process closely resembles the case-based reasoning (CBR) cycle. This resemblance has prompted several research groups to build on CBR, a problem-solving paradigm based on past experiences, in the design of medical decision support systems. In this article, we propose a medical decision support system for dermatology, based on fuzzy logic, to predict the response of a patient with plantar and common warts to immunotherapy treatment. The aim of this work is to improve the retrieval step, which is a very important phase in the CBR cycle, by incorporating a segmentation technique, fuzzy clustering. The proposed approach is composed of two parts: clustering with the Fuzzy C-Means (FCM) algorithm and case-based reasoning implemented with the JColibri platform. FCM is used to reduce the search space and to solve the problem of rapid retrieval of similar cases.
Fatima Saadi, Baghdad Atmani, Fouad Henni
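
The way Fuzzy C-Means can narrow the retrieval search space is easiest to see from its membership formula, which assigns a case a degree of belonging to every cluster. The sketch below implements the standard FCM membership update for a single point; the fuzzifier value `m = 2` and the idea of searching only the highest-membership cluster are assumptions about how it would be applied here.

```python
import math


def memberships(point, centres, m=2.0):
    """Standard FCM membership of one point in each cluster (sums to 1)."""
    dists = [math.dist(point, centre) for centre in centres]
    zeros = dists.count(0.0)
    if zeros:
        # The point coincides with one or more centres.
        return [1.0 / zeros if d == 0.0 else 0.0 for d in dists]
    exponent = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_i / d_k) ** exponent for d_k in dists) for d_i in dists
    ]


# Hypothetical new case and two cluster centres in a 2-D feature space.
u = memberships((0.0, 0.0), [(1.0, 0.0), (3.0, 0.0)])
nearest_cluster = max(range(len(u)), key=lambda k: u[k])
```

Retrieval can then compare the new case only against stored cases from `nearest_cluster` rather than against the whole case base.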

Music Generation Using an Interactive Evolutionary Algorithm

Music generation with the aid of computers has recently attracted the attention of many scientists in the area of artificial intelligence. Deep learning techniques have advanced sequence production methods for this purpose. Yet, a challenging problem is how to evaluate music generated by a machine. In this paper, a methodology has been developed based upon an interactive evolutionary optimization method, in which the scoring of the generated music is performed primarily by human experts during training. This music quality scoring is modeled using a BiLSTM recurrent neural network. The new music generated by a genetic algorithm is then evaluated using this BiLSTM network. The results clearly show that the proposed method is able to create pleasurable melodies with desired styles and pieces. This method is also quite fast compared to state-of-the-art data-oriented evolutionary systems.
Majid Farzaneh, Rahil Mahdian Toroghi

