
2018 | Book

Articulated Motion and Deformable Objects

10th International Conference, AMDO 2018, Palma de Mallorca, Spain, July 12-13, 2018, Proceedings


About this book

This book constitutes the refereed proceedings of the 10th International Conference on Articulated Motion and Deformable Objects, AMDO 2018, held in Palma de Mallorca, Spain, in July 2018.
The 12 papers presented were carefully reviewed and selected from 26 submissions. The papers address the following topics: advanced computer graphics and immersive videogames; human modeling and animation; human motion analysis and tracking; 3D human reconstruction and recognition; multimodal user interaction and applications; ubiquitous and social computing; design tools; input technology; programming user interfaces; 3D medical deformable models and visualization; deep learning methods for computer vision and graphics; and multibiometrics.

Table of Contents

Frontmatter
Mammographic Mass Segmentation Using Fuzzy C–means and Decision Trees
Abstract
Breast cancer is the second most frequent cancer among Cuban women. Mammographic image processing is challenging due to the diversity of breast anatomy and the low contrast of the images, which hinders good border definition and the visualization of anomalies. The aim of this research is the segmentation and classification of mass anomalies. Two algorithms are presented: (i) an efficient mass segmentation approach based on a modification of Fuzzy C-means that operates on the image histogram, and (ii) a method for classifying regions of interest corresponding to masses, based on a binary decision tree. The novelty in classifier training lies in using co-occurrence matrices of the region of interest's radial image. The images are pre-processed with re-scaling, CLAHE and homogeneity filters. The experiments showed the importance of mammogram pre-processing and the effectiveness of the fuzzy method. In the classification step we obtained 90% sensitivity and 72% specificity; after reducing false positives we reached 87% sensitivity and 88% specificity.
Damian Valdés-Santiago, Raúl Quintana-Martínez, Ángela León-Mecías, Marta Lourdes Baguer Díaz-Romañach
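
To make the first algorithm concrete, here is a minimal Python sketch of a histogram-based Fuzzy C-means: it clusters the 256 gray levels, weighted by their histogram counts, instead of every pixel. The function name, cluster count and fuzzifier value are illustrative assumptions, not the authors' implementation.

    import numpy as np

    def histogram_fcm(image, n_clusters=3, m=2.0, n_iter=100, tol=1e-5):
        """Cluster the 256 gray levels instead of every pixel, weighting
        each level by its histogram count; far cheaper than pixel-wise FCM."""
        levels = np.arange(256, dtype=np.float64)
        hist, _ = np.histogram(image, bins=256, range=(0, 256))
        weights = hist.astype(np.float64)

        # Initialise cluster centres evenly across the gray range.
        centers = np.linspace(levels.min(), levels.max(), n_clusters)
        for _ in range(n_iter):
            # Distances between each gray level and each centre (256 x c).
            dist = np.abs(levels[:, None] - centers[None, :]) + 1e-9
            # Fuzzy memberships with fuzzifier m, normalised per gray level.
            u = 1.0 / (dist ** (2.0 / (m - 1)))
            u /= u.sum(axis=1, keepdims=True)
            # Histogram-weighted centre update.
            um = (u ** m) * weights[:, None]
            new_centers = (um * levels[:, None]).sum(axis=0) / um.sum(axis=0)
            if np.max(np.abs(new_centers - centers)) < tol:
                centers = new_centers
                break
            centers = new_centers

        # Hard label for each gray level, then map back to the image.
        labels = np.argmin(np.abs(levels[:, None] - centers[None, :]), axis=1)
        return labels[image], centers

Applied to a uint8 mammogram, the returned label map partitions the breast into intensity classes from which candidate mass regions can be drawn.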
Refining the Pose: Training and Use of Deep Recurrent Autoencoders for Improving Human Pose Estimation
Abstract
In this paper, a discriminative human pose estimation system based on deep learning is proposed for monocular video sequences. Our approach combines a simple but efficient Convolutional Neural Network (CNN) that directly regresses the 3D pose with a recurrent denoising autoencoder that refines the pose using the temporal information contained in the sequence of previous frames. Our architecture also allows integrated training of both parts in order to better model the space of activities: noisy but realistic poses produced by the partially trained CNN are used to enhance the training of the autoencoder. The system has been evaluated on two standard datasets, HumanEva-I and Human3.6M, comprising more than 15 different activities. We show that our simple architecture can provide state-of-the-art results.
Niall McLaughlin, Jesus Martinez del Rincon
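
A minimal PyTorch sketch of the refinement idea follows: a recurrent denoising autoencoder maps a window of noisy per-frame pose regressions to a cleaned pose for the current frame. The joint count, layer sizes and training setup are assumptions for illustration, not the paper's configuration.

    import torch
    import torch.nn as nn

    class PoseDenoisingAE(nn.Module):
        def __init__(self, n_joints=17, hidden=256):
            super().__init__()
            pose_dim = n_joints * 3                    # flattened (x, y, z) per joint
            self.encoder = nn.LSTM(pose_dim, hidden, batch_first=True)
            self.decoder = nn.Linear(hidden, pose_dim) # refine the current frame

        def forward(self, noisy_seq):
            # noisy_seq: (batch, frames, pose_dim) of CNN pose regressions.
            out, _ = self.encoder(noisy_seq)
            return self.decoder(out[:, -1])            # refined pose, last frame

    model = PoseDenoisingAE()
    noisy = torch.randn(8, 10, 17 * 3)                 # 8 clips, 10-frame windows
    refined = model(noisy)                             # (8, 51)
    loss = nn.functional.mse_loss(refined, torch.randn(8, 51))  # vs. ground truth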
How Can Deep Neural Networks Be Generated Efficiently for Devices with Limited Resources?
Abstract
Despite the increasing hardware capabilities of embedded devices, running a Deep Neural Network (DNN) in such systems remains a challenge. As the trend in DNNs is to design ever more complex architectures, the computation time on low-resource devices increases dramatically due to their limited memory. Moreover, the physical memory needed to store the network parameters grows with its complexity, hindering the deployment of a feasible model on the target hardware. Although a compressed model helps reduce RAM consumption, a large number of consecutive deep layers still increases the computation time. Despite the wide literature on DNN optimization, there is a lack of documentation for the practical and efficient deployment of these networks. In this paper, we propose efficient model generation by analyzing the parameters and their impact, and we address the design of a simple and comprehensive pipeline for optimal model deployment.
Unai Elordi, Luis Unzueta, Ignacio Arganda-Carreras, Oihana Otaegui
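
As one concrete compression step of the kind the paper discusses, the sketch below applies post-training dynamic quantization in PyTorch, shrinking linear-layer weights from float32 to int8. The model is a placeholder, not the authors' pipeline.

    import torch
    import torch.nn as nn

    model = nn.Sequential(               # stand-in for a trained DNN
        nn.Linear(512, 256), nn.ReLU(),
        nn.Linear(256, 10),
    )

    # Quantize all Linear layers' weights to int8 after training.
    quantized = torch.quantization.quantize_dynamic(
        model, {nn.Linear}, dtype=torch.qint8
    )

    def param_bytes(m):
        return sum(p.numel() * p.element_size() for p in m.parameters())

    print(param_bytes(model))            # float32 baseline; int8 is ~4x smaller
    x = torch.randn(1, 512)
    print(quantized(x).shape)            # inference still works: (1, 10)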
Controlling a Smartphone with Brain-Computer Interfaces: A Preliminary Study
Abstract
Nowadays, smartphones are an essential part of our lives. The wide range of functionalities that they offer, from calling and taking photos to sharing information and contacting people, has made them a useful tool. However, they remain inaccessible to disabled people who are unable to control their motor functions. In this preliminary study, we have developed a Brain-Computer Interface (BCI) system that allows users to control two main functionalities of their smartphones using their own brain signals. In particular, given the importance of socializing apps in today's world, the system includes the control of a social networking service and an instant messaging service: Twitter and Telegram, respectively. The system has been tested with 10 healthy subjects, who were asked to perform several tasks, reaching an average accuracy of 92.3%. Preliminary results show that users can successfully control the system, bridging the accessibility gap in smartphone applications.
Víctor Martínez-Cagigal, Eduardo Santamaría-Vázquez, Roberto Hornero
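
For readers unfamiliar with the classification step such a BCI requires, here is a hedged sketch: detecting attended versus unattended stimuli from stimulus-locked EEG epochs with a linear discriminant classifier. The epoch shape, channel count and classifier choice are assumptions; the abstract does not specify the pipeline.

    import numpy as np
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

    rng = np.random.default_rng(0)
    # 200 epochs x 8 channels x 100 samples (e.g. 0-800 ms at 125 Hz); in a
    # real system these come from the EEG amplifier, with labels from the
    # stimulation paradigm (1 = attended stimulus).
    epochs = rng.standard_normal((200, 8, 100))
    labels = rng.integers(0, 2, 200)

    X = epochs.reshape(len(epochs), -1)        # flatten channels x time
    clf = LinearDiscriminantAnalysis().fit(X[:150], labels[:150])
    print("held-out accuracy:", clf.score(X[150:], labels[150:]))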
Capturing Industrial Machinery into Virtual Reality
Abstract
In this paper we set out to find a new technical and commercial solution for easily acquiring a virtual model of existing machinery for visualisation in a VR environment. To this end we introduce an image-based scanning approach with an initial focus on a monocular (handheld) capturing device such as a portable camera. Camera poses are estimated with a Simultaneous Localisation and Mapping (SLAM) technique. Depending on the required quality, offline calibration is incorporated by means of ArUco markers placed within the captured scene. Once the images are captured, they are compressed in a format that allows rapid low-latency streaming and decoding on the GPU. Finally, when the model is viewed in a VR environment, an optical flow method is used to interpolate between the triangulation of the captured viewpoints to deliver a smooth VR experience. We believe our tool will facilitate the capturing of machinery into VR, providing a wide range of benefits such as marketing, offsite help and remote maintenance.
Jeroen Put, Nick Michiels, Fabian Di Fiore, Frank Van Reeth
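
The marker-based calibration step can be illustrated with OpenCV's ArUco module, as in the sketch below. The marker dictionary, marker size and camera intrinsics are placeholders, and the calls use the legacy aruco API of opencv-contrib-python (newer versions expose cv2.aruco.ArucoDetector instead); this is not the authors' code.

    import cv2
    import numpy as np

    dictionary = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_6X6_250)
    frame = cv2.imread("capture.png")
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    corners, ids, _ = cv2.aruco.detectMarkers(gray, dictionary)
    if ids is not None:
        # Assumed pinhole intrinsics; in practice these come from calibration.
        K = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1]])
        dist = np.zeros(5)
        # Marker side length of 5 cm is an assumption.
        rvecs, tvecs, _ = cv2.aruco.estimatePoseSingleMarkers(
            corners, 0.05, K, dist)
        print("camera-relative marker poses:", tvecs)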
Leishmaniasis Parasite Segmentation and Classification Using Deep Learning
Abstract
Leishmaniasis is considered a neglected disease that causes thousands of deaths annually in some tropical and subtropical countries. Of the various techniques used to diagnose leishmaniasis, manual microscopy is considered the gold standard, yet there is a need for automatic techniques that can detect parasites in a robust and unsupervised manner. In this paper we present a procedure for automating the detection process based on a deep learning approach. We train a U-net model that successfully segments Leishmania parasites and classifies them into promastigotes, amastigotes and adhered parasites.
Marc Górriz, Albert Aparicio, Berta Raventós, Verónica Vilaplana, Elisa Sayrol, Daniel López-Codina
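
The following toy PyTorch model sketches the U-net pattern the paper builds on: an encoder-decoder with a skip connection and a per-pixel classification head. The depth, channel counts and four-class output (background plus the three parasite forms) are illustrative assumptions.

    import torch
    import torch.nn as nn

    def block(cin, cout):
        return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU(),
                             nn.Conv2d(cout, cout, 3, padding=1), nn.ReLU())

    class TinyUNet(nn.Module):
        def __init__(self, n_classes=4):
            super().__init__()
            self.enc1 = block(3, 16)
            self.enc2 = block(16, 32)
            self.up = nn.ConvTranspose2d(32, 16, 2, stride=2)
            self.dec = block(32, 16)          # 16 upsampled + 16 skip channels
            self.head = nn.Conv2d(16, n_classes, 1)

        def forward(self, x):
            s = self.enc1(x)                  # full-resolution features
            b = self.enc2(nn.functional.max_pool2d(s, 2))
            u = self.up(b)                    # back to full resolution
            return self.head(self.dec(torch.cat([u, s], dim=1)))

    logits = TinyUNet()(torch.randn(1, 3, 64, 64))  # (1, 4, 64, 64) per pixel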
Robust Pedestrian Detection for Semi-automatic Construction of a Crowded Person Re-Identification Dataset
Abstract
The problem of re-identification of people in a crowd commonly arises in real application scenarios, yet it has received less attention than it deserves. To facilitate research on this problem, we have embarked on constructing a new person re-identification dataset with many instances of crowded indoor and outdoor scenes. This paper proposes a two-stage robust method for pedestrian detection against complex crowded backgrounds to provide bounding box annotations. The first stage generates pedestrian proposals using Faster R-CNN and locates each pedestrian using Non-maximum Suppression (NMS). Candidates in dense proposal regions are merged to identify crowd patches. We then apply a bottom-up human pose estimation method to detect individual pedestrians within the crowd patches. The locations of all subjects are obtained from the bounding boxes produced by the two stages. The identity of the detected subjects throughout each video is then automatically annotated using multiple features and spatio-temporal cues. Experimental results on a crowded pedestrian dataset demonstrate the effectiveness and efficiency of the proposed method.
Zengxi Huang, Zhen-Hua Feng, Fei Yan, Josef Kittler, Xiao-Jun Wu
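
The NMS step in the first stage is a standard procedure; a plain NumPy version is sketched below. The (x1, y1, x2, y2) box format and the IoU threshold are conventional choices, not values taken from the paper.

    import numpy as np

    def nms(boxes, scores, iou_thr=0.5):
        x1, y1, x2, y2 = boxes.T
        areas = (x2 - x1) * (y2 - y1)
        order = scores.argsort()[::-1]         # best-scoring boxes first
        keep = []
        while order.size > 0:
            i = order[0]
            keep.append(i)
            # Intersection of the top box with all remaining ones.
            xx1 = np.maximum(x1[i], x1[order[1:]])
            yy1 = np.maximum(y1[i], y1[order[1:]])
            xx2 = np.minimum(x2[i], x2[order[1:]])
            yy2 = np.minimum(y2[i], y2[order[1:]])
            inter = np.clip(xx2 - xx1, 0, None) * np.clip(yy2 - yy1, 0, None)
            iou = inter / (areas[i] + areas[order[1:]] - inter)
            order = order[1:][iou <= iou_thr]  # drop heavily overlapping boxes
        return keep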
Shape and Appearance Based Sequenced Convnets to Detect Real-Time Face Attributes on Mobile Devices
Abstract
In computer vision, the classification of facial attributes has attracted deep interest from researchers and corporations. Deep Neural Network based approaches are now widespread for such tasks and have reached higher detection accuracies than earlier manually designed approaches. Our paper reports how preprocessing and face image alignment influence accuracy when detecting face attributes. More importantly, it demonstrates how combining a representation of the shape of a face with its appearance, organized as a sequence of convolutional neural networks, improves the classification of facial attributes compared with previous work on the FER+ dataset. While most studies in the field have tried to improve detection accuracy by averaging multiple very deep networks, the work presented here concentrates on building efficient models while maintaining high accuracy. By taking advantage of the face shape component and relying on an efficient shallow CNN architecture, we unveil the first available, highly accurate real-time implementation for mobile browsers.
Nicolas Livet, George Berkowski
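
One way to picture the shape-plus-appearance combination is to stack a landmark heat-map channel onto the RGB face crop before a shallow CNN, as in this hedged sketch. The heat-map encoding and network depth are assumptions about the general idea, not the authors' architecture.

    import torch
    import torch.nn as nn

    class ShallowAttrNet(nn.Module):
        def __init__(self, n_attrs=8):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(4, 16, 3, stride=2, padding=1), nn.ReLU(),  # 3 RGB + 1 shape
                nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1),
            )
            self.head = nn.Linear(32, n_attrs)

        def forward(self, rgb, shape_map):
            x = torch.cat([rgb, shape_map], dim=1)  # appearance + shape channels
            return self.head(self.features(x).flatten(1))

    net = ShallowAttrNet()
    out = net(torch.randn(2, 3, 64, 64), torch.randn(2, 1, 64, 64))  # (2, 8)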
Image Colorization Using Generative Adversarial Networks
Abstract
Over the last decade, automatic image colorization has been of significant interest for several application areas, including the restoration of aged or degraded images. The problem is highly ill-posed due to the large degrees of freedom in the assignment of color information. Many recent developments in automatic colorization involve images that share a common theme or require highly processed data, such as semantic maps, as input. In our approach, we attempt to fully generalize the colorization procedure using a conditional Deep Convolutional Generative Adversarial Network (DCGAN). The network is trained on publicly available datasets such as CIFAR-10 and Places365. The results of the generative model are compared with those of traditional deep neural networks.
Kamyar Nazeri, Eric Ng, Mehran Ebrahimi
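
The conditional-GAN setup can be sketched as follows: a generator maps the grayscale channel to chrominance channels, while the discriminator judges (gray, color) pairs. Both networks below are toy stand-ins for the paper's DCGAN.

    import torch
    import torch.nn as nn

    G = nn.Sequential(                   # grayscale -> 2 chrominance channels
        nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
        nn.Conv2d(32, 2, 3, padding=1), nn.Tanh(),
    )
    D = nn.Sequential(                   # judges concatenated (gray, color) input
        nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 1),
    )

    gray = torch.randn(4, 1, 32, 32)
    fake_color = G(gray)
    d_fake = D(torch.cat([gray, fake_color], dim=1))   # conditional discriminator
    g_loss = nn.functional.binary_cross_entropy_with_logits(
        d_fake, torch.ones_like(d_fake))               # generator tries to fool D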
Multimodal Deep Learning for Advanced Driving Systems
Abstract
Multimodal deep learning is about learning features over multiple modalities. Impressive progress has been made in deep learning solutions that rely on a single sensor modality for advanced driving; however, these approaches cover only certain functionalities. The potential of multimodal sensor fusion has been exploited very little, even though research vehicles are commonly equipped with various sensor types. How to combine their data to achieve a complex scene analysis, and thereby improve robustness in driving, is still an open question. While surveys exist for intelligent vehicles and for deep learning, to date there is no survey on multimodal deep learning for advanced driving. This paper attempts to narrow this gap by providing the first review that analyzes the existing literature together with two indispensable elements: sensors and datasets. We also provide our insights on future challenges and work to be done.
Nerea Aranjuelo, Luis Unzueta, Ignacio Arganda-Carreras, Oihana Otaegui
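
The feature-level fusion pattern such surveys review can be illustrated with a toy example: per-modality encoders whose features are concatenated before a shared prediction head. The modalities, sizes and output classes below are purely illustrative.

    import torch
    import torch.nn as nn

    class FusionNet(nn.Module):
        def __init__(self):
            super().__init__()
            self.cam = nn.Sequential(       # camera branch
                nn.Conv2d(3, 8, 3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten())
            self.lidar = nn.Sequential(nn.Linear(64, 8), nn.ReLU())  # range scan
            self.head = nn.Linear(16, 2)    # e.g. obstacle vs. free road

        def forward(self, img, scan):
            fused = torch.cat([self.cam(img), self.lidar(scan)], dim=1)
            return self.head(fused)

    out = FusionNet()(torch.randn(2, 3, 32, 32), torch.randn(2, 64))  # (2, 2)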
Optical Recognition of Numerical Characters in Digital Images of Glucometers
Abstract
This project develops a software application for Optical Character Recognition (OCR) that enables the automatic recognition and generation of sequences of numerical characters from real digital images of a conventional glucometer. The readings are taken from a diabetic patient and are displayed as numerical values ranging between 50 and 600 milligrams per deciliter (mg/dl). Digitizing this information allows the identification and storage of the data for later statistical processing. Such data are crucial in controlling a patient's diabetes, since the endocrinologist requires a log with 5 daily measurements over a 1 to 3 month period. In general, the glucometers designed in the last 10 years do not include a data acquisition system. Hence, an app that allows both the patient and the specialist to review the data statistically and graphically facilitates prescribing medication and making treatment decisions.
Claudia Marcela Ospina Mosquera, Octavio José Salcedo Parra, Miguel J. Espitia R.
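
A plausible digit-extraction step for such an application is sketched below with OpenCV: threshold the display image and cut out digit-sized contours for a downstream classifier. The threshold method and size filters are guesses, not the paper's values.

    import cv2

    img = cv2.imread("glucometer.png", cv2.IMREAD_GRAYSCALE)
    _, binary = cv2.threshold(img, 0, 255,
                              cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    digits = []
    for c in contours:
        x, y, w, h = cv2.boundingRect(c)
        if h > 20 and 0.2 < w / h < 1.0:   # keep digit-shaped blobs only
            digits.append((x, cv2.resize(binary[y:y + h, x:x + w], (28, 28))))

    digits.sort(key=lambda d: d[0])        # left-to-right reading order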
A Comparison of Text String Similarity Algorithms for POI Name Harmonisation
Abstract
This article analyses options for harmonising key point-of-interest (POI) data across different geosocial networks. Harmonisation is carried out using the five most effective algorithms for comparing the similarity of text strings. The main aim of the article is to identify the most appropriate algorithm for harmonising the differing names of de facto identical points of interest across geosocial networks. The comparison covers the following algorithms: Jaro-Winkler similarity, Levenshtein similarity, Damerau-Levenshtein similarity, the Jaccard similarity coefficient and cosine similarity.
Jiří Kysela
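
Two of the compared measures are easy to state in plain Python: Levenshtein similarity (edit distance normalised to [0, 1]) and the Jaccard coefficient, computed here over character bigrams. The bigram tokenisation is an assumption; the paper may define the sets differently.

    def levenshtein_similarity(a, b):
        # Classic dynamic-programming edit distance, one row at a time.
        prev = list(range(len(b) + 1))
        for i, ca in enumerate(a, 1):
            cur = [i]
            for j, cb in enumerate(b, 1):
                cur.append(min(prev[j] + 1,                 # deletion
                               cur[j - 1] + 1,              # insertion
                               prev[j - 1] + (ca != cb)))   # substitution
            prev = cur
        return 1 - prev[-1] / max(len(a), len(b), 1)

    def jaccard_bigrams(a, b):
        A = {a[i:i + 2] for i in range(len(a) - 1)}
        B = {b[i:i + 2] for i in range(len(b) - 1)}
        return len(A & B) / len(A | B) if A | B else 1.0

    print(levenshtein_similarity("Cafe Central", "Café Central"))  # ~0.92
    print(jaccard_bigrams("Cafe Central", "Café Central"))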
Backmatter
Metadata
Title
Articulated Motion and Deformable Objects
Editors
Francisco José Perales
Josef Kittler
Copyright Year
2018
Electronic ISBN
978-3-319-94544-6
Print ISBN
978-3-319-94543-9
DOI
https://doi.org/10.1007/978-3-319-94544-6
