
About this book

This volume constitutes the refereed proceedings of the 9th International Conference on Image and Signal Processing, ICISP 2020, which was due to be held in Marrakesh, Morocco, in June 2020. The conference was cancelled due to the COVID-19 pandemic.
The 40 revised full papers were carefully reviewed and selected from 84 submissions. The contributions presented in this volume were organized in the following topical sections: digital cultural heritage & color and spectral imaging; data and image processing for precision agriculture; machine learning application and innovation; biomedical imaging; deep learning and applications; pattern recognition; segmentation and retrieval; mathematical imaging & signal processing.

Table of Contents


Digital Cultural Heritage and Color and Spectral Imaging


Approach to Analysis the Surface Geometry Change in Cultural Heritage Objects

The three-dimensional digitization of cultural heritage objects during the different stages of the conservation process is an important tool for objective documentation. Further data analysis is also important to monitor, estimate and understand any possible change as accurately as possible. In this work, cultural heritage (CH) objects were selected for 3D scanning, analysis and visualisation of the change or degradation of their surface over time. The main goal of this work is to develop analysis and visualization methods for CH objects to assess local changes in their surface geometry and support the documentation of conservation processes. The analysis was based on the change in global distance between scans taken before and after chemical cleaning of a chosen object. A new local neighborhood distance histogram is proposed as a local measure of surface change, based on an optimized k-neighborhood search algorithm, to assess the local geometry change around a focus point.
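
As an illustration of the idea only (not the authors' implementation), the local neighborhood distance histogram can be sketched as follows: for a focus point on the "before" scan, collect the distances to its k nearest neighbors on the "after" scan and bin them. The function names, k and the bin width are hypothetical.

```python
import math
from collections import Counter

def knn_distances(point, scan, k):
    """Distances from a focus point to its k nearest neighbours in a scan,
    by brute force (a real implementation would use a k-d tree)."""
    return sorted(math.dist(point, q) for q in scan)[:k]

def local_distance_histogram(point, scan, k=3, bin_width=0.5):
    """Toy local neighbourhood distance histogram: bin the k nearest
    'before -> after' distances around the focus point."""
    return Counter(int(d // bin_width) for d in knn_distances(point, scan, k))
```

A peak in the high-distance bins around a point would then flag a local surface change, e.g. material removed by cleaning.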

Sunita Saha, Piotr Foryś, Jacek Martusewicz, Robert Sitnik

Towards the Tactile Discovery of Cultural Heritage with Multi-approach Segmentation

This paper presents a new way to access visual information in museums through tactile exploration, and related techniques to efficiently transform visual data into tactile objects. Accessibility to cultural heritage and artworks for people with visual impairments requires the segmentation of images and paintings to extract and classify their contents into meaningful elements which can then be presented through a tactile medium. In this paper, we investigate the feasibility and how to optimize the tactile discovery of an image. First, we study the emergence of image comprehension through tactile discovery, using 3D-printed objects extracted from paintings. Later, we present a dynamic Force Feedback Tablet (F2T) used to convey the 2D shape and texture information of objects through haptic feedback. We then explore several image segmentation methods to automate the extraction of meaningful objects from selected artworks, to be presented to visually impaired people through the F2T. Finally, we evaluate how to best combine the F2T’s haptic effects in order to convey the extracted objects and features to the users, with the aim of facilitating the comprehension of the represented objects and their affordances.

Ali Souradi, Christele Lecomte, Katerine Romeo, Simon Gay, Marc-Aurele Riviere, Abderrahim El Moataz, Edwige Pissaloux

Use of Imaging Techniques as a Support for the Preventive Conservation Strategy of Wall Paintings: Application to the Medieval Decors of the Château de Germolles

Imaging techniques were used to document and monitor physical damage to the unique wall paintings at the Château de Germolles, Burgundy, France. Photogrammetry combined with scanning proved to be the most appropriate techniques to monitor the evolution of microcrack networks in the cornice overhanging the paintings, preserved as a witness to 19th-century additions. However, the application of these techniques was challenged by the given constraints of a working height of 4 m and the required accuracy of a tenth of a millimetre. A special effort was therefore necessary to ensure sufficient stability of the acquisition protocol and to keep it relevant during the four measurement campaigns planned over the two years of the project. The analysis of photogrammetric data made it possible to document certain macro-deformations of the cornice according to the seasons of the year. The microcracks could be visualized and monitored from transformed 3D models of each segment for the different campaigns. The results obtained show only local movements, mainly on the walls that are most exposed to the specific climatic conditions of each season.

Christian Degrigny, Frank Boochs, Laura Raddatz, Jonas Veller, Carina Justus, Matthieu Pinette

Multispectral Dynamic Codebook and Fusion Strategy for Moving Objects Detection

The Codebook model is one of the popular real-time models for background subtraction to detect moving objects. In this paper, we propose two techniques to adapt the original Codebook algorithm to multispectral images: a dynamic mechanism and a fusion strategy. For each channel, only the absolute spectral value is used to calculate the spectral similarity between the current frame pixel and the reference average value in the matching process, which simplifies the matching equations. Besides, the decision boundaries are obtained from statistical information extracted from the data and continuously adjust themselves to scene changes. Results demonstrate that with the proposed techniques we can achieve accuracy comparable to that of other methods on the same multispectral background subtraction dataset.
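
A hedged sketch of the per-channel matching idea described above (not the authors' exact equations): a pixel matches a codeword when, in every spectral channel, the absolute difference from the codeword's average value stays within a data-driven boundary.

```python
def matches_codeword(pixel, codeword_avg, boundaries):
    """Per-channel absolute spectral test against one codeword."""
    return all(abs(p - avg) <= b
               for p, avg, b in zip(pixel, codeword_avg, boundaries))

def classify_pixel(pixel, codebook, boundaries):
    """Background if the pixel matches any codeword, else foreground."""
    return "background" if any(matches_codeword(pixel, cw, boundaries)
                               for cw in codebook) else "foreground"
```

In the dynamic variant the abstract describes, the boundaries would be re-estimated from the observed data rather than fixed.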

Rongrong Liu, Yassine Ruichek, Mohammed El Bagdouri

A Spectral Hazy Image Database

We introduce a new database to promote visibility enhancement techniques intended for spectral image dehazing. SHIA (Spectral Hazy Image database for Assessment) is composed of two real indoor scenes, M1 and M2, at 10 levels of fog each and their corresponding fog-free (ground-truth) images, taken in the visible and near-infrared ranges every 10 nm from 450 to 1000 nm. SHIA contains 1540 images of 1312 × 1082 pixels. All images are captured under the same illumination conditions. Three well-known image dehazing methods based on different approaches were adjusted and applied to the spectral foggy images. This study confirms once again a strong dependency between dehazing methods and fog densities. It urges the design of spectral-based image dehazing able to handle simultaneously the accurate estimation of the parameters of the visibility degradation model and the limitation of artifacts and post-dehazing noise. The database can be downloaded freely.

Jessica El Khoury, Jean-Baptiste Thomas, Alamin Mansouri

A Bottom-Up Approach for Pig Skeleton Extraction Using RGB Data

Animal behavior analysis is a crucial task for industrial farming. In an indoor farm setting, extracting the key joints of animals is essential for tracking an animal over a longer period of time. In this paper, we propose a deep network that exploits transfer learning to train for pig skeleton extraction in an end-to-end fashion. The backbone of the architecture is an hourglass stacked DenseNet. To train the network, keyframes are selected from the test data using a K-means sampler. In total, 9 keypoints are annotated, which gives a detailed behavior analysis in the farm setting. Extensive experiments are conducted, and the quantitative results show that the network has the potential to increase tracking performance by a substantial margin.

Akif Quddus Khan, Salman Khan, Mohib Ullah, Faouzi Alaya Cheikh

Data and Image Processing for Precision Agriculture


Deep Transfer Learning Models for Tomato Disease Detection

Vegetable crops in Morocco, and especially in the Sous-Massa region, are exposed to parasitic diseases and pest attacks which affect the quantity and quality of agricultural production. Precision farming is introduced as one of the biggest revolutions in agriculture; it is committed to improving crop protection by identifying, analyzing and managing variability, delivering effective treatment in the right place, at the right time, and at the right rate. The main purpose of this study is to find the most suitable machine learning model to detect tomato crop diseases in standard RGB images. To deal with this problem we consider the deep learning models DenseNet (161 and 121 layers) and VGG16 with transfer learning. Our study is based on images of infected plant leaves divided into 6 classes of pest attacks and plant diseases. The results were promising, with an accuracy of up to 95.65% for DenseNet161, 94.93% for DenseNet121 and 90.58% for VGG16.

Maryam Ouhami, Youssef Es-Saady, Mohamed El Hajji, Adel Hafiane, Raphael Canals, Mostafa El Yassa

Machine Learning-Based Classification of Powdery Mildew Severity on Melon Leaves

Precision agriculture faces challenges related to plant disease detection. Plant phenotyping assesses plant appearance to select the genotypes that best resist varying environmental conditions via plant variety testing. In this process, official plant variety tests are currently performed in vitro by visual inspection of samples placed in a culture medium. In this communication, we demonstrate the potential of a computer vision approach to perform such tests in a much faster and more reproducible way. We highlight the benefit of fusing contrasts coming from front and back light. To the best of our knowledge, this is illustrated for the first time on the classification of the severity of powdery mildew, a fungus, on melon leaves, with 95% accuracy.

Mouad Zine El Abidine, Sabine Merdinoglu-Wiedemann, Pejman Rasti, Helin Dutagaci, David Rousseau

Vine Disease Detection by Deep Learning Method Combined with 3D Depth Information

Vine disease detection (VDD) is an important asset to predict a probable contagion by viruses or fungi. Diseases that spread through the vineyard have a huge economic impact and are therefore considered a challenge for viticulture. Automatic detection and mapping of vine diseases at an early stage can help limit their impact and reduce the use of chemicals. This study deals with the problem of locating symptomatic areas in images from an unmanned aerial vehicle (UAV) using the visible and infrared domains. This paper proposes a new method, based on segmentation by the convolutional neural network SegNet and a depth map (DM), to delineate the symptomatic regions in the vine canopy. The results obtained show that SegNet combined with the depth information gives better accuracy than SegNet segmentation alone.

Mohamed Kerkech, Adel Hafiane, Raphael Canals, Frederic Ros

A Random Forest-Cellular Automata Modeling Approach to Predict Future Forest Cover Change in Middle Atlas Morocco, Under Anthropic, Biotic and Abiotic Parameters

This study aims to predict forest species cover changes in the Sidi M’Guild Forest (Middle Atlas, Morocco). The approach combines remote sensing and GIS and is based on training a Cellular Automata and Random Forest (RF) regression model to predict species cover transitions. Five covariates that drive such transitions were chosen according to Pearson’s test. The model was trained using forest cover stratum transition probabilities between 1990 and 2004 and then validated using the 2018 forest species cover map. Validation of the predicted map against that of 2018 shows an overall agreement between the two maps (72%) for each number of RF trees used. The projected 2032 forest species cover map indicates a strong regression of Atlas cedar and thuriferous juniper cover; a medium regression of the mixture of holm oak and thuriferous juniper, the mixture of Atlas cedar and thuriferous juniper, and sylvatic and asylvatic vacuums; a very strong progression of holm oak and of the mixture of Atlas cedar, holm oak and thuriferous juniper; and a medium progression of the mixture of Atlas cedar and holm oak. These findings provide important insights for planners, natural resource managers and policy-makers to reconsider their strategies to ensure sustainability goals.
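
To illustrate the cellular-automata side of such a model (a toy sketch under stated assumptions, not the paper's calibrated model): each cell converts to a target cover class when its RF-predicted transition probability exceeds a threshold and the class is already present in its 4-neighbourhood. The threshold and the neighbourhood rule are assumptions for illustration.

```python
def ca_step(grid, prob, threshold=0.5):
    """One synchronous cellular-automata update: a cell changes to the
    target class when its (model-predicted) transition probability exceeds
    the threshold and at least one 4-neighbour already holds that class."""
    rows, cols = len(grid), len(grid[0])
    new = [row[:] for row in grid]          # update from the old grid only
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == "target":
                continue
            neighbours = [grid[nr][nc]
                          for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                          if 0 <= nr < rows and 0 <= nc < cols]
            if prob[r][c] > threshold and "target" in neighbours:
                new[r][c] = "target"
    return new
```

Iterating such a step from the 2018 map, with probabilities supplied by the regression model, would yield a projection like the 2032 map described above.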

Anass Legdou, Hassan Chafik, Aouatif Amine, Said Lahssini, Mohamed Berrada

Machine Learning Application and Innovation


Incep-EEGNet: A ConvNet for Motor Imagery Decoding

The brain-computer interface connects the brain with machines, using brainwaves as a means of communication for several applications that help improve human life. Unfortunately, electroencephalography, which is mainly used to measure brain activity, produces noisy, non-linear and non-stationary signals that weaken the performance of Common Spatial Pattern (CSP) techniques. As a solution, deep learning avoids the drawbacks of traditional techniques, but it is still not used to its full potential. In this paper, we propose a new approach based on Convolutional Neural Networks (ConvNets) that decodes the raw signal to achieve state-of-the-art performance using an architecture based on Inception. The obtained results show that our method outperforms state-of-the-art filter bank common spatial patterns (FBCSP) and ShallowConvNet on dataset IIa of the BCI Competition IV.

Mouad Riyad, Mohammed Khalil, Abdellah Adib

Fuzzy-Based Approach for Assessing Traffic Congestion in Urban Areas

The very rapid evolution of urban areas calls for reflection on citizens’ mobility inside cities. This mobility problem is highlighted by increases in time, distance and social and economic costs, whereas the congestion management approaches implemented rarely meet road users’ expectations. To overcome this problem, a novel approach for evaluating urban traffic congestion is proposed. Factors such as the imprecision of traffic records, users’ perception of the road’s level of service and variation in sample data are mandatory to describe the real traffic condition. To meet these requirements, a fuzzy inference-based method is suggested. It combines three independent congestion measures (speed ratio, volume-to-capacity ratio and decreased speed ratio) into a single composite measure, the congestion index. To run the proposed fuzzy model, the traffic dataset of Austin, Texas is used. Although it is still not possible to determine the best congestion measure, the proposed approach gives a composite view of traffic congestion by combining the three independent measures and incorporating their uncertainty.
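
A minimal sketch of the fuzzy-fusion idea (the membership shapes and the simple averaging are illustrative assumptions, not the paper's actual rule base): fuzzify each of the three measures with a triangular "congested" membership and combine the degrees into a composite index.

```python
def tri(x, a, b, c):
    """Triangular fuzzy membership function on [a, c], peaking at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def congestion_index(speed_ratio, vc_ratio, dec_speed_ratio):
    """Toy composite congestion index: average the 'congested' membership
    degree of each of the three measures."""
    congested = (
        tri(1 - speed_ratio, 0.2, 1.0, 1.8),   # low speed ratio -> congested
        tri(vc_ratio, 0.5, 1.2, 1.9),          # high volume/capacity -> congested
        tri(dec_speed_ratio, 0.2, 1.0, 1.8),   # large speed drop -> congested
    )
    return sum(congested) / len(congested)
```

A full fuzzy inference system would replace the averaging with a rule base and defuzzification, but the fuzzification step is the one sketched here.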

Sara Berrouk, Abdelaziz El Fazziki, Mohammed Sadgal

Big Data and Reality Mining in Healthcare: Promise and Potential

Nowadays individuals are creating a huge amount of data; with a cell phone in every pocket, a laptop in every bag and wearable sensors everywhere, the fruits of this information are easy to see, but the information itself is less noticeable. This data could be particularly useful in making people’s lives healthier and easier, by contributing not only to understanding new diseases and therapies but also to predicting outcomes at earlier stages and making real-time decisions. In this paper, we explain the potential benefits of big data to healthcare and explore how it improves treatment and empowers patients, providers and researchers. We also describe the capabilities of reality mining in terms of individual health, social network mapping, behavior patterns and treatment monitoring. We illustrate the benefits of reality mining analytics that promote patients’ health, enhance medicine, reduce cost and improve healthcare value and quality. Furthermore, we highlight some challenges that big data analytics faces in healthcare.

Hiba Asri, Hajar Mousannif, Hassan Al Moatassime, Jihad Zahir

A Dataset to Support Sexist Content Detection in Arabic Text

Social media have become a viral source of information. This huge amount of data offers an opportunity to study the feelings and opinions of crowds toward any subject using sentiment analysis, which remains a challenging area for the Arabic language. In this research, we present our approach to building a thematic training set by combining manual and automatic annotation of Arabic texts addressing discrimination and violence against women.

Oumayma El Ansari, Zahir Jihad, Mousannif Hajar

Multistage Deep Neural Network Framework for People Detection and Localization Using Fusion of Visible and Thermal Images

In computer vision, object detection and classification are active fields of research, with applications in a diverse range of fields such as surveillance, autonomous cars and robotic vision. Many intelligent systems have been built to reach the accuracy of human perception, but none has quite achieved it yet. Convolutional Neural Networks (CNNs) and deep learning architectures are used to achieve human-like perception for object detection and scene identification. We propose a novel method that combines previously used techniques: a model which takes multispectral images, fuses them together, drops the uninformative images and then provides semantic segmentation for each object (person) present in the image. In our proposed methodology, a CNN is used for the fusion of visible and thermal images, and deep learning architectures are used for classification and localization. Fusion of visible and thermal images combines the informative features of both into one image, using an encoder-decoder architecture. The fused image is then fed into the ResNet-152 architecture for classification. Images obtained from ResNet-152 are then fed into Mask R-CNN, which uses the ResNet-101 architecture, for the localization of persons. The results clearly show that the fused model outperforms the visible-only model for object localization and gives promising results for person detection for surveillance purposes. Our proposed model achieves a miss rate of 5.25%, which is much better than the previous state-of-the-art method applied to the KAIST dataset.
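
For reference, the miss rate quoted above is the standard pedestrian-detection metric: the fraction of ground-truth persons the detector fails to find. A minimal definition:

```python
def miss_rate(true_positives, false_negatives):
    """Miss rate: false negatives over all actual positives, i.e. the
    fraction of ground-truth persons the detector misses."""
    return false_negatives / (true_positives + false_negatives)
```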

Bushra Khalid, Muhammad Usman Akram, Asad Mansoor Khan

Biomedical Imaging


Diagnosing Tuberculosis Using Deep Convolutional Neural Network

One of the global topmost causes of death is tuberculosis (TB), which is caused by the mycobacterium bacillus. The increasing rate of infection and of recorded deaths from TB is a result of its transmissibility, the lack of early diagnosis, and the shortage of professional radiologists in developing regions where TB is most prevalent. Tuberculosis is unquestionably curable but needs to be detected early for the necessary treatment to be effective. Many screening techniques are available, but the chest radiograph has proven valuable for screening pulmonary diseases, although it depends hugely on the interpretational skill of an expert radiologist. We propose a computer-aided detection model using deep convolutional neural networks to automatically detect TB from the Montgomery County (MC) tuberculosis radiographs. Our proposed model reached 87.1% validation accuracy and was evaluated using a confusion matrix and accuracy as metrics.

Mustapha Oloko-Oba, Serestina Viriri

Semantic Segmentation of Diabetic Foot Ulcer Images: Dealing with Small Dataset in DL Approaches

Foot ulceration is the most common complication of diabetes and represents a major health problem all over the world. If these ulcers are not adequately treated in an early stage, they may lead to lower limb amputation. Considering the low-cost and prevalence of smartphones with a high-resolution camera, Diabetic Foot Ulcer (DFU) healing assessment by image analysis became an attractive option to help clinicians for a more accurate and objective management of the ulcer. In this work, we performed DFU segmentation using Deep Learning methods for semantic segmentation. Our aim was to find an accurate fully convolutional neural network suitable to our small database. Three different fully convolutional networks have been tested to perform the ulcer area segmentation. The U-Net network obtained a Dice Similarity Coefficient of 97.25% and an intersection over union index of 94.86%. These preliminary results demonstrate the power of fully convolutional neural networks in diabetic foot ulcer segmentation using a limited number of training samples.
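
The two reported metrics are standard overlap measures between a predicted mask and the ground truth; for binary masks given as sets of pixel coordinates they can be computed as:

```python
def dice_and_iou(pred, truth):
    """Dice similarity coefficient and intersection-over-union for two
    binary masks represented as sets of (row, col) pixel coordinates."""
    inter = len(pred & truth)
    dice = 2 * inter / (len(pred) + len(truth))
    iou = inter / len(pred | truth)
    return dice, iou
```

For a single mask pair the two are linked by IoU = Dice / (2 − Dice); averaged over a dataset, as in the figures above, the relation holds only approximately.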

Niri Rania, Hassan Douzi, Lucas Yves, Treuillet Sylvie

DermoNet: A Computer-Aided Diagnosis System for Dermoscopic Disease Recognition

Research on skin lesion diseases is currently one of the hottest topics in medical research and has gained a lot of attention in the last few years. However, existing skin lesion methods mainly rely on conventional Convolutional Neural Networks (CNNs), and the performance of skin lesion recognition is far from satisfactory. Therefore, to overcome the aforementioned drawbacks of traditional methods, we propose a novel Computer-Aided Diagnosis (CAD) system, named DermoNet, based on Multi-Scale Feature Level (MSFL) blocks and Multi-Level Feature Fusion (MLFF). Further, the DermoNet approach yields a significant enhancement in dealing with the challenge of small training data sizes in the dermoscopic domain, high similarity between classes, and overfitting. Extensive experiments are conducted on a public dermoscopic dataset, and the results demonstrate that DermoNet outperforms state-of-the-art approaches. Hence, DermoNet can achieve excellent diagnostic efficiency in the auxiliary diagnosis of skin lesions.

Ibtissam Bakkouri, Karim Afdel

A New Method of Image Reconstruction for PET Using a Combined Regularization Algorithm

Positron emission tomography (PET) is a medical imaging technique that provides functional information about physiological processes. The goal of PET is to reconstruct the distribution of radioisotopes in the body by measuring the emitted photons. Computational methods are designed to solve the inverse problem known as “image reconstruction from projections.” In this paper, the iterative Algebraic Reconstruction Technique (ART) is regularized by combining Tikhonov and total variation regularization: at each ART iteration, the combined regularization is applied to the current image estimate to remove background noise while preserving edges. Quality measurements and visual inspection show a significant improvement in image quality compared to other algorithms.

Abdelwahhab Boudjelal, Abderrahim El Moataz, Zoubeida Messali

Visualizing Blood Flow of Palm in Different Muscle Tense State Using High-Speed Video Camera

In this paper, we propose a method to visualize blood flow in the palm under different states of muscle tension using an RGB high-speed camera. Recently, new modalities have been needed to develop more accurate systems for non-contact multi-modal affect analysis. We therefore focus on muscle tension, which is caused by stress and is thus one of the effective modalities for non-contact multi-modal affect analysis. However, it is very difficult to measure muscle tension in a real environment because it requires a contact-type sensor. Therefore, we use iPPG to visualize the pulse wave during muscle tension from skin video taken with the RGB video camera. As a result of this experiment, we found that it is possible to recognize the difference in the pulse wave during muscle tension from the video that visualizes the pulse wave. From this result, non-contact measurement of muscle tension can be expected to become feasible.

Ryo Takahashi, Keiko Ogawa-Ochiai, Norimichi Tsumura

Deep Learning and Applications


Segmentation of Microscopic Image of Colorants Using U-Net Based Deep Convolutional Networks for Material Appearance Design

In this study, U-Net based deep convolutional networks are used to segment particle regions in microscopic images of colorants. The material appearance of products is greatly affected by the distribution of particle sizes, so it is important to obtain this distribution when designing the material appearance of products. To obtain the particle size distribution, it is necessary to segment the particle regions in the microscopic image of colorants. Conventionally, this segmentation is performed manually using simple image processing, which leads to low reproducibility. Therefore, in this paper, to extract the particle regions with high reproducibility, segmentation is performed using U-Net based deep convolutional networks. We improved U-Net type deep convolutional networks based on the feature maps trained for microscopic images of colorants. As a result, we obtained more accurate segmentation results with the improved network than with the conventional U-Net.

Mari Tsunomura, Masami Shishikura, Toru Ishii, Ryo Takahashi, Norimichi Tsumura

A Deep CNN-LSTM Framework for Fast Video Coding

High Efficiency Video Coding (HEVC) doubles the compression rate of the previous H.264 standard for the same video quality. To improve coding efficiency, HEVC adopts a hierarchical quadtree-structured Coding Unit (CU). However, the computational complexity increases significantly due to the full Rate-Distortion Optimization (RDO) search required to find the optimal Coding Tree Unit (CTU) partition. This paper proposes a deep learning model to predict the HEVC CU partition at inter-mode, instead of the brute-force RDO search. To train the model, a large-scale database for HEVC inter-mode is first built. Then, to predict the CU partition, we propose a model combining a Convolutional Neural Network (CNN) and a Long Short-Term Memory (LSTM) network. The simulation results prove that the proposed scheme achieves a good compromise between complexity reduction and RD performance compared to existing approaches.

Soulef Bouaafia, Randa Khemiri, Fatma Ezahra Sayadi, Mohamed Atri, Noureddine Liouane

Microcontrollers on the Edge – Is ESP32 with Camera Ready for Machine Learning?

Most machine learning tasks need big computing power, but some can be done with microcontrollers. In this paper the well-known ESP32 SoC is analyzed. It is usually used in IoT devices for data measurement, but some authors have started to use simple machine learning algorithms with it. This paper analyzes the possibility of using an ESP32 with a built-in camera for machine learning algorithms. The research focuses on the durations of photographing and photograph processing, because these can be a bottleneck for machine learning tasks. For this purpose, logistic regression has been implemented on an ESP32 with a camera and used to differentiate two handwritten letters (“o” and “x”) in greyscale pictures. The logistic regression weights were calculated in the cloud and then transferred to the ESP32. The output results, the durations of photographing and processing, and the impact of the implemented PSRAM memory on performance were analyzed. It can be concluded that the ESP32 with a camera can be used for some simple machine learning tasks and for taking and preparing camera pictures for other, more powerful processors. The Arduino IDE still does not provide a sufficient level of optimization for the implemented PSRAM memory.
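
The split described above, training in the cloud and inference on the device, is easy to see in code. A minimal sketch of the on-device part (shown in Python for readability; the ESP32 version would be C/C++), with hypothetical weights:

```python
import math

def logistic_predict(pixels, weights, bias):
    """On-device part of the pipeline: a dot product over the greyscale
    pixel values followed by a sigmoid; the weights arrive pre-trained."""
    z = bias + sum(w * p for w, p in zip(weights, pixels))
    return 1.0 / (1.0 + math.exp(-z))
```

A probability above 0.5 would be read as one letter class and below as the other; the 0.5 cutoff and the class labels are assumptions here.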

Kristian Dokic

Speech Enhancement Based on Deep AutoEncoder for Remote Arabic Speech Recognition

Remote applications that deal with speech need the speech signal to be compressed. First, speech coding transforms the continuous waveform into numerical form. Then, the digitized signal is compressed with or without loss of information. This transformation affects the original waveform and degrades the performance of subsequent recognition of the speech signal. Meanwhile, transmission is another source of speech degradation. To restore the original “clean” speech, speech enhancement (SE) is widely used, and deep learning algorithms are nowadays the state of the art. In this paper, the target application is a remote Arabic speech recognition system, and the aim of using SE is to improve the accuracy of the speech recognizer. For that purpose, a Deep AutoEncoder (DAE) is used. The effect of the DAE-based SE is studied through different configurations, and performance is evaluated through accuracy. The results show an improvement of about 3.17 points between the accuracy prior to SE and that computed with the enhanced speech.

Bilal Dendani, Halima Bahi, Toufik Sari

Pattern Recognition


Handwriting Based Gender Classification Using COLD and Hinge Features

Gender classification from handwriting is still considered challenging due to the visual similarity between male and female handwritten documents. This paper presents a new method based on Cloud of Line Distribution (COLD) and Hinge features for distinguishing gender from handwriting. The SVM classifier combination assigns the class based on the maximum of the two decision values resulting from the COLD and Hinge features. The proposed approach is evaluated on the standard QUWI dataset, following the framework protocol described in the ICFHR 2016 competition. The results obtained are promising compared with the classification rates reported in the literature.
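
The fusion rule described above can be sketched as follows (a toy reading of the abstract, with hypothetical ±1 class labels): each SVM emits a signed decision value, and the combination keeps the one with the larger magnitude.

```python
def fuse_decisions(d_cold, d_hinge):
    """Keep the SVM decision value with the larger magnitude and return
    the sign of that value as the predicted class (+1 / -1)."""
    d = d_cold if abs(d_cold) >= abs(d_hinge) else d_hinge
    return 1 if d >= 0 else -1
```

The intuition is that the classifier farther from its decision boundary is the more confident of the two.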

Abdeljalil Gattal, Chawki Djeddi, Ameur Bensefia, Abdellatif Ennaji

Extraction and Recognition of Bangla Texts from Natural Scene Images Using CNN

The semantic information present in scene images can be useful for viewers searching for a specific location, shop or address. This type of information can also be useful in license-plate detection, controlling vehicles on the road, robot navigation, and assisting visually impaired persons. An efficient method is presented in this paper to detect and extract Bangla texts from scene images based on a connected-component approach along with rule-based filtering and a vertical scanning scheme. Extracted characters are then recognized using a Convolutional Neural Network (CNN). The method consists of four basic consecutive steps: detection and extraction of the Region of Interest (ROI), segmentation of the words, extraction of characters, and recognition of the extracted characters. After extracting the ROI from the input image, connected component (CC) analysis and bounding-box techniques are used for the segmentation of Bangla words. To separate and extract Bangla characters from the segmented words, a vertical scanning based method with a dynamic threshold value is applied. Finally, character recognition is carried out using the CNN. The proposed algorithm was applied to 600 scene images of different writing styles and colors, obtaining 89.25% accuracy in text detection and 94.50% accuracy in the extraction of characters. We achieved recognition accuracies of 99.30% and 95.76% for Bangla digits and characters respectively, and 95.39% for digits and characters combined.
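
The connected component (CC) analysis used in the word-segmentation step is classic flood-fill labelling; a minimal 4-connected sketch on a binary mask (not the authors' implementation):

```python
def connected_components(mask):
    """4-connected component labelling by iterative flood fill on a binary
    mask (list of rows of 0/1). Returns the component count and a label
    grid where 0 means background."""
    rows, cols = len(mask), len(mask[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and not labels[r][c]:
                count += 1
                stack = [(r, c)]
                labels[r][c] = count
                while stack:
                    y, x = stack.pop()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and not labels[ny][nx]):
                            labels[ny][nx] = count
                            stack.append((ny, nx))
    return count, labels
```

A bounding box per label then yields the word candidates that the rule-based filtering operates on.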

Rashedul Islam, Md Rafiqul Islam, Kamrul Hasan Talukder

Detection of Elliptical Traffic Signs

Detection of elliptical features is a challenging and important task in computer vision. Ellipses can describe many objects in real images, such as manufactured objects, cells, balls or traffic signs. Furthermore, an ellipse is defined by five parameters (the center coordinates, the semi-major axis, the semi-minor axis and the orientation), which requires considerable computational power to estimate. Some non-ideal ellipses also cause difficulties for detection, such as occlusion, the appearance of multiple ellipses at the same time, and non-parallel ellipses. In this paper, we are interested in detecting elliptical traffic signs. We present a method for detecting different cases of ellipses in images: simple, partially occluded, non-parallel and multiple. The method selects three lines to find the ellipse center, and then calculates the values of the semi-minor and semi-major axes. Experiments show that the proposed method performs well on real images, with or without noise.

Manal El Baz, Taher Zaki, Hassan Douzi

Image-Based Place Recognition Using Semantic Segmentation and Inpainting to Remove Dynamic Objects

Place recognition is an important step in intelligent driving, allowing the vehicle to recognize where it is in order to plan its route. Obtaining distinguishable features can ensure the success of image-based place recognition. However, generating features that are robust across images with drastic appearance changes is still a challenging problem. Following the development of convolutional neural networks, deep features are frequently chosen over local features in place recognition tasks. But even deep features generated by powerful neural models can yield unsatisfactory recognition results, perhaps due to the lack of an information-selection process. Semantic segmentation allows recognizing and classifying image information, and semantic segmentation followed by image inpainting provides a way of detecting, deleting and reconstructing distracting information. This paper shows that dynamic information present in images, such as vehicles and pedestrians, damages the performance of place recognition, and proposes a feature extraction system that includes a step to decrease the presence of dynamic information in an image. This system is composed of two stages: 1) dynamic object detection and removal, which deletes unstable objects recognized by a semantic segmentation method; 2) image inpainting, which reconstructs the background of the removed regions. The robustness of the proposed approach is evaluated over three datasets by comparison with deep-feature-based place recognition approaches that do not perform such selection.
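The remove-then-reconstruct idea can be illustrated with a crude diffusion-based fill, a toy stand-in for the learned inpainting model the paper relies on: pixels flagged by the (here hypothetical) segmentation mask are repeatedly replaced by the average of their four neighbours.

```python
import numpy as np

def inpaint(image, mask, iters=200):
    """Fill masked pixels (mask == True) by repeatedly averaging their
    4-neighbours; a crude diffusion-based stand-in for a learned
    inpainting stage."""
    img = image.astype(float).copy()
    img[mask] = img[~mask].mean()          # rough initialisation
    for _ in range(iters):
        padded = np.pad(img, 1, mode='edge')
        avg = (padded[:-2, 1:-1] + padded[2:, 1:-1]
               + padded[1:-1, :-2] + padded[1:-1, 2:]) / 4.0
        img[mask] = avg[mask]              # only masked pixels change
    return img
```

Diffusion reconstructs smooth backgrounds exactly (a linear intensity ramp, for instance, is recovered perfectly), which is why it is a reasonable didactic proxy even though real systems use learned models.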

Linrunjia Liu, Cindy Cappelle, Yassine Ruichek

CNN-SVM Learning Approach Based Human Activity Recognition

Although it has been studied for a long time, human activity recognition remains a big challenge to tackle. Recently, several deep learning approaches have been proposed to enhance recognition performance in different areas of application. In this paper, we combine a recent deep learning-based method with a traditional classifier based on hand-crafted feature extractors, replacing the artisanal feature extraction step. To this end, we use a deep convolutional neural network that extracts more powerful features from sequences of video frames. The resulting feature vector is then fed to a support vector machine (SVM) classifier that assigns each instance to its corresponding label, thereby recognizing the performed activity. The proposed architecture was trained and evaluated on the MSR Daily Activity 3D dataset and performs better than state-of-the-art methods.
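The SVM stage of such a pipeline can be sketched with a minimal primal linear SVM trained by subgradient descent on the regularised hinge loss. This is a generic stand-in fed with arbitrary feature vectors, not the authors' exact classifier or CNN features:

```python
import numpy as np

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Minimal linear SVM: subgradient descent on the regularised hinge
    loss max(0, 1 - y(w.x + b)) + lam/2 ||w||^2.  Labels y in {-1, +1}."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        active = margins < 1                 # points violating the margin
        if active.any():
            grad_w = lam * w - (y[active][:, None] * X[active]).mean(axis=0)
            grad_b = -y[active].mean()
        else:
            grad_w, grad_b = lam * w, 0.0
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def predict(X, w, b):
    """Assign each feature vector to the sign of the decision function."""
    return np.sign(X @ w + b)
```

In the paper's setting, `X` would hold the feature vectors produced by the convolutional network for each video sequence.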

Hend Basly, Wael Ouarda, Fatma Ezahra Sayadi, Bouraoui Ouni, Adel M. Alimi

Convolutional Neural Networks Backbones for Object Detection

Detecting objects in images is an extremely important step in many image and video analysis applications. Object detection is considered one of the main challenges in the field of computer vision; it focuses on identifying and locating objects of different classes in an image. In this paper, we aim to highlight the important role of deep learning, and convolutional neural networks in particular, in the object detection task. We analyze the various state-of-the-art convolutional neural networks serving as backbones in object detection models, test and evaluate them on common up-to-date datasets and benchmarks, and outline the main features of each architecture. We demonstrate that some convolutional neural network architectures have yielded very promising state-of-the-art results, first in image classification and then in object detection. The results have surpassed all traditional methods and, in some cases, outperformed human performance.

Ayoub Benali Amjoud, Mustapha Amrouch

Object Detector Combination for Increasing Accuracy and Detecting More Overlapping Objects

Object detection is considered the cornerstone of many modern applications such as drone vision and self-driving cars. Object detectors, mainly those based on Convolutional Neural Networks (CNNs), have received great attention from researchers because they yield remarkable results. However, most of them fail when it comes to detecting overlapping and small objects in images. There are two families of detectors: the first detects more objects but with imprecise bounding boxes, while the second does the opposite. In this paper, we propose a solution to this problem by combining the two families, in a way similar to classifier combination. Our solution has been validated through the combination of two well-known detectors: Faster R-CNN, which detects more objects, and YOLO, which produces accurate bounding boxes. The approach is nonetheless general and can be applied to other detectors. Our method was evaluated on the PASCAL VOC dataset and gave promising results.
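One plausible combination rule along these lines (an illustrative sketch, not the paper's exact procedure) is to keep every detection from the high-recall detector while borrowing coordinates from the high-precision detector whenever the two overlap strongly, measured by intersection-over-union (IoU):

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / float(area_a + area_b - inter)

def combine(boxes_recall, boxes_precise, thr=0.5):
    """Keep every detection from the high-recall detector, replacing its
    coordinates by a matching high-precision box when IoU exceeds thr."""
    merged = []
    for r in boxes_recall:
        match = max(boxes_precise, key=lambda p: iou(r, p), default=None)
        if match is not None and iou(r, match) >= thr:
            merged.append(match)       # precise coordinates win
        else:
            merged.append(r)           # unmatched detection kept as-is
    return merged
```

This captures the division of labour described in the abstract: recall from one family, localisation accuracy from the other.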

Khaoula Drid, Mebarka Allaoui, Mohammed Lamine Kherfi

Segmentation and Retrieval


Graph-Based Image Retrieval: State of the Art

The paper deals with the problem of semantic Image Retrieval. The image has recently gained popularity in several domains, such as medicine and marketing, and plays a vital role in documentation. However, finding relevant visual information in an image is a huge task for the Image Retrieval community and a much-discussed issue in digital image processing. Images must be retrieved from large collections in order to respond to the user's needs, and Image Retrieval processes based on classical techniques may not satisfy the user. For several years, great efforts have been devoted to integrating semantic aspects in order to enhance the relevance of the results and ensure high-level consideration of image content. This paper presents a state of the art of Image Retrieval approaches using graph theory, motivated by the growing interest in graphs for their performance, representational power, and ability to integrate semantic aspects. We review a number of recent graph-based approaches in Image Retrieval, aiming to determine the factors that add semantic aspects to an Image Retrieval system.

Imane Belahyane, Mouad Mammass, Hasna Abioui, Ali Idarrou

A New Texture Descriptor: The Homogeneous Local Binary Pattern (HLBP)

This paper presents a simple and novel descriptor named Homogeneous Local Binary Pattern (HLBP) for texture analysis. Its purpose is to improve the Local Binary Pattern (LBP) approach by exploiting a homogeneity criterion over local regions, using the General Adaptive Neighborhood (GAN) principle. The HLBP is generated using this homogeneity criterion, which helps represent significant features based on relationships between neighboring pixels. The main idea of HLBP is to threshold the distance between the current pixel and each of its neighbors with a homogeneity tolerance value; this corresponds better to the underlying spatial structures and consequently allows extracting highly distinctive invariant features from the image. To assess the performance of our proposed descriptor, we use the Outex database and compare against the basic LBP. The experimental results show that the proposed Homogeneous Local Binary Pattern gives good performance in terms of classification accuracy.
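For reference, the baseline LBP that HLBP builds on compares each pixel with its eight neighbours and packs the comparison signs into an 8-bit code; HLBP replaces the hard sign test with the homogeneity-tolerance threshold described above. A minimal numpy version of the classic 3x3 LBP:

```python
import numpy as np

# Clockwise 8-neighbourhood starting at the top-left pixel
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]

def lbp(image):
    """Classic 3x3 Local Binary Pattern: each interior pixel receives the
    8-bit code of (neighbour >= centre), bit i weighted by 2**i."""
    h, w = image.shape
    codes = np.zeros((h, w), dtype=np.uint8)
    for i, (dy, dx) in enumerate(OFFSETS):
        neigh = image[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        bits = (neigh >= image[1:-1, 1:-1]).astype(np.uint8)
        codes[1:-1, 1:-1] |= bits << i
    return codes
```

The texture descriptor is then the histogram of these codes over the image, and classification compares histograms.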

Ibtissam Al Saidi, Mohammed Rziza, Johan Debayle

Considerably Improving Clustering Algorithms Using UMAP Dimensionality Reduction Technique: A Comparative Study

Dimensionality reduction is widely used in machine learning and big data analytics, since it helps to analyze and visualize large, high-dimensional datasets. In particular, it can considerably help with tasks like data clustering and classification. Recently, embedding methods have emerged as a promising direction for improving clustering accuracy: they can preserve the local structure and simultaneously reveal the global structure of data, thereby improving clustering performance. In this paper, we investigate how to improve the performance of several clustering algorithms using one of the most successful embedding techniques: Uniform Manifold Approximation and Projection, or UMAP. This technique has recently been proposed as a manifold learning technique for dimensionality reduction, based on Riemannian geometry and algebraic topology. Our main hypothesis is that UMAP permits finding the most clusterable embedding manifold, and we therefore apply it as a preprocessing step before clustering. We compare the results of several well-known clustering algorithms, such as k-means, HDBSCAN, GMM and Agglomerative Hierarchical Clustering, when they operate on the low-dimensional feature space yielded by UMAP. A series of experiments on several image datasets demonstrates that the proposed method allows each of the clustering algorithms studied to improve its performance on each dataset considered. In terms of the accuracy measure, the improvement can reach a remarkable rate of 60%.
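The embed-then-cluster pipeline has a simple shape regardless of the embedding used. The sketch below substitutes PCA-via-SVD for UMAP (UMAP itself requires the umap-learn package) and pairs it with plain Lloyd k-means; it is a structural illustration, not a reproduction of the paper's experiments:

```python
import numpy as np

def reduce_dim(X, k=2):
    """PCA via SVD, standing in here for UMAP as the embedding step."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:k].T

def kmeans(X, centroids, iters=50):
    """Plain Lloyd iterations from the given initial centroids."""
    for _ in range(iters):
        d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(axis=1)
        centroids = np.stack([X[labels == j].mean(axis=0)
                              for j in range(len(centroids))])
    return labels
```

Swapping `reduce_dim` for `umap.UMAP(n_components=2).fit_transform` turns this into the preprocessing step the paper studies.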

Mebarka Allaoui, Mohammed Lamine Kherfi, Abdelhakim Cheriet

Logo Detection Based on FCM Clustering Algorithm and Texture Features

Logo detection methods usually depend on logo shapes and need training data or a priori information about the processed images, which limits their effectiveness in real-world applications. In this paper, we tackle these challenges by exploiting textural information. Specifically, we propose a novel approach to administrative logo detection based on fuzzy classification with a multi-fractal texture feature, capable of automatically characterizing texture measures that describe logo and non-logo regions. Experimental results on two real datasets confirm the feasibility of the proposed method for degraded administrative documents, and extensive comparative evaluations demonstrate its superiority over state-of-the-art methods.
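The fuzzy classification stage typically means fuzzy c-means (FCM), which alternates between soft membership updates and weighted centre updates. A generic numpy sketch of plain FCM follows; the paper couples this kind of clustering with multi-fractal texture features, which are not reproduced here:

```python
import numpy as np

def fcm(X, c=2, m=2.0, iters=100, seed=0):
    """Plain fuzzy c-means: alternate membership and centre updates.
    m > 1 is the fuzzifier; U[i, k] is sample i's membership in cluster k."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)          # memberships sum to 1
    for _ in range(iters):
        Um = U ** m
        centres = (Um.T @ X) / Um.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2) + 1e-12
        inv = 1.0 / d ** (2.0 / (m - 1.0))     # u_ik ∝ d_ik^(-2/(m-1))
        U = inv / inv.sum(axis=1, keepdims=True)
    return U, centres
```

For logo detection, each sample would be a texture feature vector of an image region, and regions with high membership in the "logo" cluster would be retained.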

Wala Zaaboub, Lotfi Tlig, Mounir Sayadi, Basel Solaiman

Mathematical Imaging and Signal Processing


Discrete p-bilaplacian Operators on Graphs

In this paper, we first introduce a new family of operators on weighted graphs called p-bilaplacian operators, which are the analogue on graphs of the continuous p-bilaplacian operators. We then turn to study regularized variational and boundary value problems associated to these operators. For instance, we study their well-posedness (existence and uniqueness). We also develop proximal splitting algorithms to solve these problems. We finally report numerical experiments to support our findings.
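As a rough guide to the objects involved (the paper's exact definitions may differ), the weighted graph Laplacian and a p-bilaplacian built from it can be written as:

```latex
% Weighted graph Laplacian of u at vertex x (sum over neighbours y)
\Delta_w u(x) = \sum_{y \sim x} w(x,y)\,\bigl(u(y) - u(x)\bigr)

% One common form of the p-bilaplacian: apply \Delta_w, pass through a
% p-Laplacian-type nonlinearity, then apply \Delta_w again
\Delta^2_{w,p}\, u(x) = \Delta_w\!\bigl(\lvert \Delta_w u \rvert^{p-2}\,\Delta_w u\bigr)(x)

% Associated variational energy whose Euler--Lagrange equation
% yields the operator above
J_p(u) = \frac{1}{p} \sum_{x} \lvert \Delta_w u(x) \rvert^{p}
```

The regularized variational problems the paper studies add a data-fidelity term to an energy of this type, which is what makes proximal splitting algorithms applicable.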

Imad El Bouchairi, Abderrahim El Moataz, Jalal Fadili

Image Watermarking Based on Fourier-Mellin Transform

Geometric attacks are still a challenging issue in image watermarking. In this paper, the robustness of different insertion positions and shapes of the watermark is evaluated in a watermarking scheme based on the Fourier-Mellin transform. We propose diagonal, rectangular, and circular insertion of the mark, and evaluate the robustness of these techniques against geometric deformations such as rotation, scaling and translation (RST). Results show that circular insertion performs better under translation and scaling attacks, while diagonal insertion is better under rotation and combined RST attacks. The latter point makes diagonal insertion preferable in industrial applications, since combined RST attacks often occur in practice, for example when printing an image on a physical support and scanning it back (the print-scan attack).
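The RST robustness of such schemes rests on a standard Fourier property: translating an image changes only the phase of its spectrum, never the magnitude. The snippet below checks this shift invariance numerically; the log-polar resampling step of the full Fourier-Mellin transform, which additionally turns rotation and scaling into shifts, is omitted here.

```python
import numpy as np

# A circular shift of the image multiplies its DFT by a pure phase
# factor, so the magnitude spectrum is unchanged.
rng = np.random.default_rng(1)
image = rng.random((64, 64))
shifted = np.roll(image, shift=(5, -9), axis=(0, 1))

mag = np.abs(np.fft.fft2(image))
mag_shifted = np.abs(np.fft.fft2(shifted))
# mag and mag_shifted agree up to floating-point error
```

Embedding the watermark in this magnitude (or in the subsequent log-polar domain) is what makes the mark survive the geometric attacks listed above.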

Khadija Gourrame, Hassan Douzi, Rachid Harba, Riad Rabia, Frederic Ros, Mehamed ElHajji

A New Sparse Blind Source Separation Method for Determined Linear Convolutive Mixtures in Time-Frequency Domain

This paper presents a new Blind Source Separation method for linear convolutive mixtures, which exploits the sparsity of source signals in the time-frequency domain. This method especially brings a solution to the artifacts problem that affects the quality of signals separated by existing time-frequency methods. These artifacts are in fact introduced by a time-frequency masking operation, used by all these methods. Indeed, by focusing on the case of determined mixtures, we show that this problem can be solved with much less restrictive sparsity assumptions than those of existing methods. Test results show the superiority of our new proposed method over existing ones based on time-frequency masking.

Mostafa Bella, Hicham Saylani

Proposed Integration Algorithm to Optimize the Separation of Audio Signals Using the ICA and Wavelet Transform

In the present work, an integration of two combined methodologies is developed for the blind separation of mixed audio signals: independent component analysis (ICA) and the discrete wavelet transform (DWT). The DWT optimizes processing time by decreasing the amount of data before the signals are processed by ICA. A traditional signal processing methodology, the wavelet transform, is thus combined with a statistical method, ICA, which assumes that the source signals are mixed and statistically independent of each other. The problem refers to the very common situation in which a human being listens to several sound sources at the same time, the brain being able to pay attention to the message of a particular signal. The results are very satisfactory, effectively achieving signal separation; only a small background noise and an attenuation in the amplitude of the recovered signal are noticeable, and the signal's message remains identifiable.
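The data reduction step can be illustrated with one level of the Haar DWT, the simplest wavelet: it halves the signal into approximation and detail coefficients, and ICA can then operate on the shorter approximation band. This is a generic sketch of the transform, not the authors' specific wavelet choice.

```python
import numpy as np

def haar_dwt(x):
    """One level of the Haar DWT: returns (approximation, detail)
    coefficients, each half the length of the (even-length) input."""
    x = np.asarray(x, dtype=float)
    even, odd = x[0::2], x[1::2]
    return (even + odd) / np.sqrt(2), (even - odd) / np.sqrt(2)

def haar_idwt(approx, detail):
    """Inverse of one Haar level: perfect reconstruction."""
    x = np.empty(2 * len(approx))
    x[0::2] = (approx + detail) / np.sqrt(2)
    x[1::2] = (approx - detail) / np.sqrt(2)
    return x
```

Because the transform is invertible, the separated approximation signals can be mapped back to the time domain after ICA with no information loss from the transform itself.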

Enrique San Juan, Ali Dehghan Firoozabadi, Ismael Soto, Pablo Adasme, Lucio Cañete

ECG Signal Analysis on an Embedded Device for Sleep Apnea Detection

Low-cost embedded devices with computational power have the potential to revolutionise the detection and management of many diseases. This is especially true for conditions like sleep apnea, which require continuous long-term monitoring. In this paper, we give details of a portable, cost-effective and customisable Electrocardiograph (ECG) signal analyser for real-time sleep apnea detection. We have developed a data analysis pipeline with which we can identify sleep apnea from a single-lead ECG signal. Our method combines dataset extraction, segmentation, signal cleaning, filtration and, finally, apnea detection using Support Vector Machines (SVM). We analysed our proposed implementation through a complete run on the MIT-Physionet dataset. Due to its low computational complexity, our proposed method is well suited for deployment on embedded devices such as the Raspberry Pi.
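Pipelines of this kind usually reduce the ECG to R-peak positions and RR-interval statistics before classification. The toy sketch below (a naive threshold detector, not the paper's cleaning and filtration chain) shows the shape of that feature extraction step:

```python
import numpy as np

def detect_r_peaks(signal, threshold):
    """Naive R-peak detector: local maxima above a fixed threshold."""
    s = np.asarray(signal, dtype=float)
    peaks = [i for i in range(1, len(s) - 1)
             if s[i] > threshold and s[i] >= s[i - 1] and s[i] > s[i + 1]]
    return np.array(peaks)

def rr_features(peaks, fs):
    """Mean and standard deviation of RR intervals in seconds -- the kind
    of scalar features one might feed to an SVM stage."""
    rr = np.diff(peaks) / fs
    return rr.mean(), rr.std()
```

Per-minute windows of such features are exactly the sort of low-dimensional input that keeps the SVM cheap enough for a Raspberry Pi.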

Rishab Khincha, Soundarya Krishnan, Rizwan Parveen, Neena Goveas

