
2015 | Book

New Trends in Image Analysis and Processing -- ICIAP 2015 Workshops

ICIAP 2015 International Workshops, BioFor, CTMR, RHEUMA, ISCA, MADiMa, SBMI, and QoEM, Genoa, Italy, September 7-8, 2015, Proceedings

Edited by: Vittorio Murino, Enrico Puppo, Diego Sona, Marco Cristani, Carlo Sansone

Publisher: Springer International Publishing

Book Series: Lecture Notes in Computer Science


About this book

This book constitutes the refereed proceedings of seven workshops held at the 18th International Conference on Image Analysis and Processing, ICIAP 2015, in Genoa, Italy, in September 2015: International Workshop on Recent Advances in Digital Security: Biometrics and Forensics, BioFor 2015; International Workshop on Color in Texture and Material Recognition, CTMR 2015; International Workshop on Medical Imaging in Rheumatology: Advanced Applications for the Analysis of Inflammation and Damage in the Rheumatoid Joint, RHEUMA 2015; International Workshop on Image-Based Smart City Application, ISCA 2015; International Workshop on Multimedia Assisted Dietary Management, MADiMa 2015; International Workshop on Scene Background Modeling and Initialization, SBMI 2015; and International Workshop on Image and Video Processing for Quality of Multimedia Experience, QoEM 2015.

Table of Contents

Frontmatter

BioFor 2015 - International Workshop on Recent Advances in Digital Security: Biometrics and Forensics

Frontmatter
Reflectance Normalization in Illumination-Based Image Manipulation Detection

One approach to detect spliced images is to compare the lighting environment of suspicious objects or persons in the scene. The original method, proposed by Johnson and Farid, requires an investigator to mark occluding contours of multiple objects, from which the distribution of the incident light intensity is estimated. Unfortunately, this method imposes relatively strict constraints on the user and on the scene under investigation.

In this work, we propose a color-normalization approach to relax one important constraint. With our modification, a user is able to select the contours from multiple different materials (instead of having to use a single material). The proposed method will automatically compensate the differences in the reflected intensities. We demonstrate the robustness of the method with a carefully designed ground-truth dataset, consisting of 10 subjects, each of them under 3 controlled lighting environments. With the proposed method, lighting direction as a forensic cue becomes applicable to a much wider range of natural images.

Christian Riess, Sven Pfaller, Elli Angelopoulou
Evaluation of Residual-Based Local Features for Camera Model Identification

Camera model identification is of interest for many applications. In-camera processes, specific to each model, leave traces that can be captured by features designed ad hoc and used for reliable classification. In this work we investigate the use of blind features based on the analysis of image residuals. In particular, features are extracted locally based on co-occurrence matrices of selected neighbors and then used to train an SVM classifier. Experiments on the well-known Dresden database show this approach to provide state-of-the-art performance.
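
As a rough illustration of this kind of pipeline (not the authors' exact implementation), the sketch below computes a truncated first-order residual, builds a co-occurrence histogram of adjacent residual values, and trains an SVM on those histograms. The image lists, labels, and the quantization/truncation parameters are assumed placeholders.

    import numpy as np
    from sklearn.svm import SVC

    def residual_cooccurrence(img, T=2, q=1):
        # High-pass residual: horizontal first differences, quantized and truncated to [-T, T]
        img = np.asarray(img, dtype=float)
        res = np.clip(np.round((img[:, 1:] - img[:, :-1]) / q).astype(int), -T, T)
        # Co-occurrence of horizontally adjacent residual pairs, flattened into a histogram
        pairs = (res[:, :-1] + T) * (2 * T + 1) + (res[:, 1:] + T)
        hist = np.bincount(pairs.ravel(), minlength=(2 * T + 1) ** 2).astype(float)
        return hist / hist.sum()

    # train_imgs, train_labels, test_imgs are assumed inputs:
    # grayscale arrays and the corresponding camera-model ids
    X = np.array([residual_cooccurrence(im) for im in train_imgs])
    clf = SVC(kernel="rbf").fit(X, train_labels)
    predictions = clf.predict([residual_cooccurrence(im) for im in test_imgs])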

Francesco Marra, Giovanni Poggi, Carlo Sansone, Luisa Verdoliva
Biometric Walk Recognizer

In this paper we present a comparative test of different approaches to gait recognition by smartphone accelerometer. Our work provides a twofold contribution. The first one is related to the use of the low-cost, built-in sensors that nowadays equip most mobile devices. The second one is related to the use of our system in identification mode. Instead of being used just to verify the identity of the device owner, it can also be used for identification among a set of enrolled subjects. Whether the identification is carried out remotely or its results are transmitted to a server, the system can also be exploited in a multibiometric setting. Its results can be fused with those from computer-vision based gait recognition, as well as other biometric modalities, to enforce identification for accessing critical locations/services. We obtained the best results by matching complete walk captures (Recognition Rate of 0.95), but the implicit limitation is represented by the fixed number of steps in the walks. Therefore we also investigated methods based on first dividing the signal into steps. The best of these achieved a Recognition Rate of 0.88.

Maria De Marsico, Alessio Mecca
Touchstroke: Smartphone User Authentication Based on Touch-Typing Biometrics

Smartphones are becoming pervasive and widely used for a large variety of activities from social networking to online shopping, from message exchanging to mobile gaming, to mention just a few. Many of these activities generate private information or require storing on the phone user credentials and payment details. In spite of being so security and privacy critical, smartphones are still widely protected by traditional authentication mechanisms such as PINs and passwords, whose limitations and drawbacks are well known and documented in the security community. New accurate, user-friendly and effective authentication mechanisms are required. To this end, behavior-based authentication has recently attracted a significant amount of interest in both commercial and academic contexts.

This paper proposes a new bi-modal biometric authentication solution, Touchstroke, which makes use of the user’s hand movements while holding the device and the timing of touch-typing (i.e., typing input on the touchscreen of a smartphone) when the user enters a text-independent 4-digit PIN/password. We implemented and tested the new biometrics in real smartphones. Preliminary results are encouraging, showing high accuracy. Thus, our solution is a plausible alternative to traditional authentication mechanisms.

Attaullah Buriro, Bruno Crispo, Filippo Del Frari, Konrad Wrona
EEG/ECG Signal Fusion Aimed at Biometric Recognition

The recognition of individuals based on behavioral and biological characteristics has made important strides over the past few years. Growing interest has recently been devoted to the study of physiological measures, which include the electrical activity of the brain (EEG) and heart (ECG). Even if the use of multimodal approaches overcomes several limitations of traditional uni-modal biometric systems, the simultaneous use of EEG and ECG characteristics has been scarcely investigated. In this paper, we present a set of preliminary results derived from the investigation of a biometric system based on the fusion of simple features simultaneously extracted from EEG and ECG signals. The reported results show high performance both for the uni-modal approaches (the best performance being EER = 11.17 and EER = 3.83 for EEG and ECG, respectively) and for their fusion (EER = 2.94). However, caution should be exercised in the interpretation of the reported results, mainly because the analysis was performed on a limited set of subjects.

Silvio Barra, Andrea Casanova, Matteo Fraschini, Michele Nappi
Fusion of Holistic and Part Based Features for Gender Classification in the Wild

Gender classification (GC) in the wild is an active area of current research. In this paper, we focus on the combination of a holistic state-of-the-art approach based on features extracted from the facial pattern, with patch-based approaches that focus on inner facial areas. Those regions are selected for being relevant to the human system according to the psychophysics literature: the ocular and the mouth areas. The resulting proposed GC system outperforms previous approaches, reducing the classification error of the holistic approach by roughly 30%.

Modesto Castrillón-Santana, Javier Lorenzo-Navarro, Enrique Ramón-Balmaseda
A Hand Gesture Approach to Biometrics

In this paper we present a biometric technique based on hand gestures. By means of the Microsoft Kinect sensor, the user’s hand is tracked while following a circle moving on the screen. Both 3D data about the position of the hand and 2D data about the position of the screen pointer are provided to different classifiers (SVM, Naive Bayes, Classification Tree, KNN, Random Forest and Neural Networks). Experiments carried out with 20 testers have demonstrated that the method is very promising for both identification and verification (with success rates above 90%), and can be a viable biometric solution, especially for soft biometric applications.

Nahumi Nugrahaningsih, Marco Porta, Giuseppe Scarpello
Quis-Campi: Extending in the Wild Biometric Recognition to Surveillance Environments

Efforts in biometrics are being made to extend robust recognition techniques to in-the-wild scenarios. Nonetheless, and despite being a very attractive goal, human identification in the surveillance context remains an open problem. In this paper, we introduce a novel biometric system – Quis-Campi – that effectively bridges the gap between surveillance and biometric recognition while having a minimum amount of operational restrictions. We propose a fully automated surveillance system for human recognition purposes, attained by combining human detection and tracking, further enhanced by a PTZ camera that delivers data with enough quality to perform biometric recognition. Along with the system concept, implementation details for both hardware and software modules are provided, as well as preliminary results over a real scenario.

João C. Neves, Gil Santos, Sílvio Filipe, Emanuel Grancho, Silvio Barra, Fabio Narducci, Hugo Proença

CTMR 2015 - Color in Texture and Material Recognition

Frontmatter
On Comparing Colour Spaces From a Performance Perspective: Application to Automated Classification of Polished Natural Stones

In this paper we investigate the problem of choosing the adequate colour representation for automated surface grading. Specifically, we discuss the pros and cons of different colour spaces, point out some common misconceptions about their use, and propose a number of ‘best practices’ for colour conversion. To put the discussion into practice we generated a new dataset of 25 classes of natural stone products which we used to systematically compare and evaluate the performance of seven device-dependent and three device-independent colour spaces through two classification strategies. With the nearest neighbour classifiers no significant difference emerged among the colour spaces considered, whereas with the linear classifier it was found that device-independent Lab and Luv spaces performed significantly better than the others.

Francesco Bianconi, Raquel Bello, Antonio Fernández, Elena González
Methods for Predicting Spectral Response of Fibers Blends

Textile companies usually manufacture fabrics using a mix of pre-colored fibers according to a traditional recipe based on their own experience. Unfortunately, mainly due to the fiber dyeing process, the colorimetric distance between the obtained fabric and the desired one is often unsatisfactory with respect to a colorimetric threshold established by the technicians. In such cases, colorists are required to slightly change the original recipe in order to reduce the colorimetric distance. This trial-and-error process is time-consuming and requires the work of highly skilled operators. Computer-based color recipe assessment methods have been proposed in the scientific literature to address this issue. However, many methods are still far from being reliably predictive when the fabric is composed of a high number of components. Accordingly, the present work proposes two alternative methods, based on Kubelka-Munk and subtractive mixing, able to reliably predict the spectrophotometric response of a fabric obtained by means of any variation of a recipe. The assessment performed on a prototypal implementation of the two methods demonstrates that they are suitable for reliable prediction of the spectral response of fabric blends.

Rocco Furferi, Lapo Governi, Yary Volpe
Texture Classification Using Rotation Invariant LBP Based on Digital Polygons

This paper investigates the use of digital polygons as a replacement for circular interpolated neighbourhoods for extracting texture features through Local Binary Patterns. The use of digital polygons has two main advantages: it reduces the computational cost and avoids the high-frequency loss resulting from pixel interpolation. The solution proposed in this work employs a sub-sampling scheme over Andres’ digital circles. The effectiveness of the method was evaluated in a supervised texture classification experiment over eight different datasets. The results showed that digital polygons outperformed interpolated circular neighbourhoods in most cases.
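
To make the idea concrete, the following sketch computes an LBP histogram without any pixel interpolation, using a ring of integer offsets as the neighbourhood and a minimum-over-rotations mapping for rotation invariance. The offsets are an illustrative radius-2 ring, not the exact sub-sampled Andres circles of the paper.

    import numpy as np

    # Integer offsets roughly tracing a radius-2 digital circle (illustrative stand-in
    # for a sub-sampled digital polygon; no interpolation is needed for these samples)
    OFFSETS = [(0, 2), (1, 1), (2, 0), (1, -1), (0, -2), (-1, -1), (-2, 0), (-1, 1)]
    R = 2

    def lbp_digital(img):
        img = np.asarray(img, dtype=float)
        h, w = img.shape
        center = img[R:h - R, R:w - R]
        code = np.zeros(center.shape, dtype=np.int64)
        for bit, (dy, dx) in enumerate(OFFSETS):
            neigh = img[R + dy:h - R + dy, R + dx:w - R + dx]
            code += (neigh >= center).astype(np.int64) << bit
        # Rotation invariance: map every code to the minimum over its 8 circular bit rotations
        rot_min = np.array([min(((c >> k) | (c << (8 - k))) & 0xFF for k in range(8))
                            for c in range(256)])
        hist = np.bincount(rot_min[code].ravel(), minlength=256).astype(float)
        return hist / hist.sum()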

Juan Pardo-Balado, Antonio Fernández, Francesco Bianconi
Analysis of Albedo Influence on Surface Urban Heat Island by Spaceborne Detection and Airborne Thermography

Urban environment overheating is gaining growing importance for its consequences on citizens’ comfort and energy consumption. The surface albedo is one of the parameters with the greatest influence on the local temperature; therefore, its punctual and large-scale detection could give a significant contribution to the urban microclimate assessment. A comparison of satellite data with airborne infrared thermography images is proposed for the city of Florence, starting from temperature analyses and moving to surface albedo assessments. It is shown that, despite the higher resolution of the aircraft surveys, their limited area coverage, sporadic availability, and high cost make the satellite-retrieved data competitive, considering that the current 30 m pixel size of the Landsat images already seems suitable for construction material classification.

Giorgio Baldinelli, Stefania Bonafoni
An Interactive Tool for Speed up the Analysis of UV Images of Stradivari Violins

UV fluorescence photography is widely used in the study of artworks, in particular for the analysis of historical musical instruments. This technique allows seeing important details which cannot be observed with visible light, such as retouching, different paint coats or worn areas. The complexity of the interpretation of the surface of a violin is proportional to its state of preservation: more alterations correspond to a wider range of colors. We designed an interactive tool able to help the scientist understand the composition of the surface and in particular the distribution of the colors on the entire instrument, avoiding perception illusions. The result is achieved using a quantized histogram in HSV color space. The tests were performed on UV imagery of the Stradivari violin collection stored by the “Museo del Violino” in Cremona.

Piercarlo Dondi, Luca Lombardi, Marco Malagodi, Maurizio Licchelli, Tommaso Rovetta, Claudia Invernizzi
Local Angular Patterns for Color Texture Classification

The description of color texture under varying lighting conditions is still an open issue. We defined a new color texture descriptor, that we called Local Angular Patterns, specially designed to be robust to changes in the color of the illuminant. The results show that our descriptor outperforms the state-of-the-art on a dataset of food textures.

Claudio Cusano, Paolo Napoletano, Raimondo Schettini
Complexity Perception of Texture Images

Visual complexity perception plays an important role in the fields of both psychology and computer vision: it can be useful not only to investigate human perception but also to better understand the properties of the objects being perceived. In this paper we investigate the complexity perception of texture images. To this end we perform a psycho-physical experiment on real texture patches. The complexity of each image is assessed on a continuous scale. At the end of the evaluation, each observer indicates the criteria used to assess texture complexity. The criteria used most frequently are regularity, understandability, familiarity and edge density. As candidate complexity measures we consider thirteen image features and correlate each of them with the subjective scores collected during the experiment. The strength of these correlations is evaluated in terms of the Pearson correlation coefficient. The four measures that show the highest correlations are energy, edge density, compression ratio and a visual clutter measure, in accordance with the verbal descriptions collected by the questionnaire.
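
A minimal sketch of how one such correlation could be computed, assuming the images and the mean subjective scores are available as Python lists; edge density is taken here simply as the fraction of Canny edge pixels, which is only one possible definition.

    import numpy as np
    from scipy.stats import pearsonr
    from skimage import color, feature

    def edge_density(img_rgb):
        # Fraction of pixels marked as edges by a Canny detector (illustrative measure)
        gray = color.rgb2gray(img_rgb)
        return feature.canny(gray).mean()

    # images: list of RGB arrays; scores: mean subjective complexity per image (assumed inputs)
    measure = [edge_density(im) for im in images]
    r, p = pearsonr(measure, scores)
    print(f"Pearson r = {r:.3f} (p = {p:.3g})")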

Gianluigi Ciocca, Silvia Corchs, Francesca Gasparini

RHEUMA 2015 – Medical Imaging in Rheumatology: Advanced Applications for the Analysis of Inflammation and Damage in the Rheumatoid Joint

Frontmatter
An MRI Study of Bone Erosions Healing in the Wrist and Metacarpophalangeal Joints of Patients with Rheumatoid Arthritis

Bone erosions, considered the hallmark of rheumatoid arthritis (RA), are shown more accurately by MRI than by conventional radiography (CR). Erosion healing is exceptional when studied by CR. This study is concerned with an extremity-dedicated MRI evaluation of erosion changes in patients with RA followed over time. Wrist and metacarpo-phalangeal (MCP) joints of 57 RA patients were imaged with a dedicated-extremity 0.2 T MRI at baseline and follow-up. A decrease of the RAMRIS erosion score indicating erosion healing, calculated both by conventional visual judgement and by a semi-automated method, was seen in 7 (12.3%) patients at the wrist and in 3 (5.3%) at the MCPs. In the same locations, RAMRIS was unchanged in 17 (29.8%) and 31 (54.4%) patients, and worsened in 33 (57.9%) and 17 (29.8%), respectively. Healing of erosions occurs, although rarely, in patients with RA when studied with sensitive imaging techniques, such as MRI.

Francesca Barbieri, Veronica Tomatis, Giuseppe Zampogna, Elena Aleo, Valentina Prono, Stefania Migone, Patrizia Parascandolo, Lorenzo Cesario, Gianni Viano, Marco Amedeo Cimmino
RheumaSCORE: A CAD System for Rheumatoid Arthritis Diagnosis and Follow-Up

Recently, computer-aided diagnosis (CAD) has become one of the major research subjects in medical imaging and diagnostic radiology. The goal of a CAD is to improve the quality and productivity of physicians’ job by improving the accuracy and consistency of radiological diagnosis. This paper describes RheumaSCORE, a CAD system specialized for the diagnosis and treatment of patients affected by bone erosions, as a consequence of one of the most common and serious forms of arthritis, the Rheumatoid Arthritis (RA), and gives an overview of its main features.

Patrizia Parascandolo, Lorenzo Cesario, Loris Vosilla, Gianni Viano
A Database of Segmented MRI Images of the Wrist and the Hand in Patients with Rheumatic Diseases

This paper is concerned with the ideation, organization and distribution of a database of segmented MRI images - and associated clinical parameters - of the wrist and the hand in patients affected by a variety of the most frequent rheumatic diseases. The final goal is empowering future biomedical research thanks to the completeness of details and cases. MRI Images were analyzed by means of the software RheumaSCORE (Softeco Sismat Srl), which performs semi-automatic segmentation of the bones, returns the volume of bones and erosions, as well as their tri-dimensional reconstruction. In order to favor its exploitation, the database of segmented images, along with many relevant clinical anthropometric parameters, are available online through the Patient Browser platform (Softeco Sismat Srl). Moreover, the original images and their clinical parameters are accessible online through the dedicated DICOM viewer QuantaView (CAMELOT Biomedical Systems Srl).

Veronica Tomatis, Marco A. Cimmino, Francesca Barbieri, Giulia Troglio, Patrizia Parascandolo, Lorenzo Cesario, Gianni Viano, Loris Vosilla, Marios Pitikakis, Andrea Schiappacasse, Michela Moraldo, Matteo Santoro
Novel Automatic Tool for Magnetic Resonance Imaging Quantification of Bone Erosion Scoring in Rheumatoid Arthritis

Rheumatoid arthritis (RA) is a systemic disease that affects the synovial joints. Currently, the gold standard measurement for tracking the progression of the disease involves a semi-quantitative assessment of bone erosion, bone marrow edema and synovitis, as seen in magnetic resonance images (MRI). The work presented in this paper identifies how computer automation can be used to quantify bone erosion volumes in MRI without expert and time consuming interventions. This tool is fully integrated in a computer aided diagnosis (CAD) system named RheumaSCORE (Softeco Sismat Srl). Preliminary results of qualitative and quantitative validation are presented and discussed at the end of the paper.

Patrizia Parascandolo, Lorenzo Cesario, Loris Vosilla, Francesca Barbieri, Marco Amedeo Cimmino, Gianni Viano
Optimizing and Evaluating a Graph-Based Segmentation of MRI Wrist Bones

In this paper, a quantitative evaluation of the graph-based segmentation method presented in a previous work is performed. The algorithm, starting from a single source element belonging to a region of interest, aims at finding the optimal path minimizing a new cost function for all elements of a digital volume. The method is an adaptive, unsupervised, and semi-automatic approach.

For the assessment, a training phase and a testing phase are considered. The system is able to learn and adapt to the ground truth. The performance of the method is estimated by computing classical indices from the confusion matrix, similarity measures, and distance measures.

Our work is based on the segmentation and 3D reconstructions of carpal bones derived from Magnetic Resonance Imaging (MRI) volumetric data of patients affected by rheumatic diseases.

Sonia Nardotto, Roberta Ferretti, Laura Gemme, Silvana Dellepiane
Generation of 3D Canonical Anatomical Models: An Experience on Carpal Bones

The paper discusses the initial results obtained for the generation of canonical 3D models of anatomical parts, built on real patient data. 3D canonical models of anatomy are key elements in computer-assisted diagnosis; for instance, they can support pathology detection, semantic annotation of patient-specific 3D reconstructions, and quantification of pathological markers. Our approach is focused on carpal bones and on the elastic analysis of 3D reconstructions of these bones, which are segmented from MRI scans, represented as 0-genus triangle meshes, and parameterized on the sphere. The original method [8] relies on a set of sparse correspondences, defined as matching vertices. For medical applications, it is desirable to constrain the mean shape generation to set up the correspondences among a larger set of anatomical landmarks, including vertices, lines, and areas. Preliminary results are discussed and future development directions are sketched.

Imon Banerjee, Hamid Laga, Giuseppe Patanè, Sebastian Kurtek, Anuj Srivastava, Michela Spagnuolo

ISCA 2015 - Image-Based Smart City Application

Frontmatter
DicomPrint, an Application for Managing DICOM Images

Digital Imaging and Communications in Medicine (DICOM) is a standard for handling, storing, printing and transmitting information in medical imaging. It includes the file format and the networking protocol. The image consists of a list of attributes which contains a) image metadata such as size, dimensions, resolution, etc., and b) patient metadata such as name, sex, ID, age, etc. The process of reading and printing the image is in itself difficult because of the adaptation between different modalities and devices.

The aim of this work is to develop an application that reads, views and prints DICOM images of a certain user, regardless of the device. The image can be printed with and without the patient metadata and has to be integrated into the DICOMRX (RxScan) software, developed by GrowIdeas Ltd, which examines the patient and scans his/her inner parts. The developed application selects a DICOM image from a folder, extracts its metadata and loads these data into a form. The form is then printed according to the needs of the doctor.

Edlira Kalemi, Edlira Martiri, Brunela Manaj, Dionis Prifti
Accurate Positioning and Orientation Estimation in Urban Environment Based on 3D Models

This paper describes a positioning algorithm for mobile phones based on image recognition. The use of image recognition based (IRB) positioning in mobile applications is characterized by the availability of a single camera for estimating the camera position and orientation. Prior knowledge of the 3D environment is needed in the form of a database of images with associated spatial information, which can be built by projecting the 3D model onto a set of synthetic solid images (range + RGB images). The IRB procedure proposed by the authors can be divided into two steps: the selection from the database of the image most similar to the query image used to locate the camera, and the estimation of the position and orientation of the camera based on the 3D data available in the reference image. The MPEG standard Compact Descriptors for Visual Search (CDVS) has been used to greatly reduce the processing time. Some practical results of the localization methodology in an outdoor environment are described in terms of processing time and accuracy of position and attitude.

Giorgio Ghinamo, Cecilia Corbi, Piero Lovisolo, Andrea Lingua, Irene Aicardi, Nives Grasso
An Hippocampal Segmentation Tool Within an Open Cloud Infrastructure

This study presents a fully automated algorithm for the segmentation of the hippocampus in structural Magnetic Resonance Imaging (MRI) and its deployment as a service on an open cloud infrastructure. Optimal atlas strategies for multi-atlas learning are combined with a voxel-wise classification approach. The method’s efficiency is optimized as training atlases are previously registered to a data-driven template; accordingly, for each test MRI scan only one registration is needed. The selected optimal atlases are used to train dedicated random forest classifiers whose labels are fused by majority voting. The method’s performance was tested on a set of 100 MRI scans provided by the Alzheimer’s Disease Neuroimaging Initiative (ADNI). Leave-one-out results (Dice = 0.910 ± 0.004) show the presented method compares well with other state-of-the-art techniques and with a benchmark segmentation tool such as FreeSurfer. The proposed strategy significantly improves on a standard multi-atlas approach (p < .001).
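
As a hedged illustration of the label-fusion step only, the sketch below performs majority voting across binary label volumes produced by the per-atlas classifiers and computes the Dice overlap used for evaluation; the co-registered label volumes are assumed inputs.

    import numpy as np

    def majority_vote(label_maps):
        # label_maps: list of binary 3D arrays (1 = hippocampus) from the per-atlas classifiers
        votes = np.sum(np.stack(label_maps, axis=0), axis=0)
        return (votes * 2 > len(label_maps)).astype(np.uint8)  # strict majority

    def dice(pred, truth):
        # Dice overlap between a predicted mask and a ground-truth mask
        inter = np.logical_and(pred, truth).sum()
        return 2.0 * inter / (pred.sum() + truth.sum())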

Nicola Amoroso, Sabina Tangaro, Rosangela Errico, Elena Garuccio, Anna Monda, Francesco Sensi, Andrea Tateo, Roberto Bellotti, for the Alzheimer’s Disease Neuroimaging Initiative
A Survey on Traffic Light Detection

Traffic light detection is an important matter in urban environments during the transition to fully autonomous driving. A large body of literature has been generated in recent years, approaching different pattern recognition strategies. In this paper we present a survey summarizing relevant works in the field of detection of both suspended and supported traffic lights. This survey organizes the different methods, highlighting the main research areas in the computer vision field.

Moises Diaz, Pietro Cerri, Giuseppe Pirlo, Miguel A. Ferrer, Donato Impedovo
Saliency-Based Keypoint Reduction for Augmented-Reality Applications in Smart Cities

In this paper we show that Saliency-based keypoint selection makes natural landmark detection and object recognition quite effective and efficient, thus enabling augmented reality techniques in a plethora of applications in smart city contexts. As a case study we address a tour of a museum where a modern smart device like a tablet or smartphone can be used to recognize paintings, retrieve their pose and graphically overlay useful information.

Simone Buoncompagni, Dario Maio, Davide Maltoni, Serena Papi
A Likelihood-Based Background Model for Real Time Processing of Color Filter Array Videos

One of the first tasks executed by a vision system made of fixed cameras is background (BG) subtraction, and a particularly challenging context for real-time applications is the athletic one, because of illumination changes, moving objects and cluttered scenes. The aim of this work is to extract a BG model, based on statistical likelihood, able to process color filter array (CFA) images taking into account the intrinsic variance of each gray level of the sensor, named Likelihood Bayer Background (LBB). The BG model should not be too computationally complex while being highly responsive, in order to extract a robust foreground. Moreover, the mathematical operations used in the formulation should be parallelizable, working on image patches, and computationally efficient, exploiting the dynamics of a pixel within its integer range. Both simulations and experiments on real video sequences demonstrate that this BG model shows great performance and robustness during the real-time processing of scenes extracted from a soccer match.

Vito Renó, Roberto Marani, Nicola Mosca, Massimiliano Nitti, Tiziana D’Orazio, Ettore Stella
Smart Maintenance to Support Digital Life

This paper describes experiments performed to validate the possibility of delivering smart maintenance to utilities and telecommunication operators. First, the paper describes the reference architectural model of smart cities and introduces smart maintenance and social smart maintenance. Then, the paper outlines the performed experiments and the related results.

Federico Bergenti, Massimo Chiappone, Danilo Gotta
FSSGR: Feature Selection System to Dynamic Gesture Recognition

Dynamic gesture recognition systems based on computer vision techniques have been frequently used in fields such as medicine, games and sign language. Usually, these systems have an execution-time problem caused by the elevated number of features or attributes extracted for gesture classification. This work presents a system for dynamic gesture recognition that uses Particle Swarm Optimization to reduce the feature vector while increasing the classification capability. The FSSGR system (Feature Selection System for Dynamic Gesture Recognition) solved the gesture recognition problem on the RPPDI dataset, achieving a 99.21% classification rate with the same vector size as previous works on the same database, and with a better response time.

Diego G. S. Santos, Rodrigo C. Neto, Bruno Fernandes, Byron Bezerra
Interoperability of Biometric Systems: Analysis of Geometric Characteristics of Handwritten Signatures

Handwritten signatures are considered one of the most useful biometric traits for personal verification. In the networked society, in which a multitude of different devices can be used for signature acquisition, specific research is still needed to determine the extent to which features of an input signature depend on the characteristics of the signature apposition process.

In this paper an experimental investigation was carried out on constrained signatures, which were acquired using writing boxes of different areas and shapes, and the different behaviour of geometric features with respect to the writing boxes is discussed.

Giuseppe Pirlo, Fabrizio Rizzi, Annalisa Vacca, Donato Impedovo
Computer Aided Evaluation (CAE) of Morphologic Changes in Pigmented Skin Lesions

Changes in mole patterns are important elements in detecting cancerous skin lesions, and early-stage detection is a key factor in completely curing the pathology. In this paper, an automatic system for mole tracking is presented. The method has been realized as a mobile app and can be used by patients to periodically perform a careful self-examination of their pigmented skin lesions. The implemented method receives as input two segmented images of the same pigmented skin lesion, corresponding to the current image and the image from before the last period under test. The method performs image matching and change evaluation adopting a three-stage artificial neural network and provides as output a risk indicator related to the morphological changes of the skin lesion.

Maria Rizzi, Matteo D’Aloia, Gianpaolo Cice
Analytical Method and Research of Uyghur Language Chunks Based on Digital Forensics

In the digital forensics process based on Uyghur-language information, stem, affix, synonym mark and other characteristics are added to the feature-based English and Chinese chunks according to the relevant characteristics of the Uyghur language. As performance evaluation indexes, accuracy rate, recall rate and F value are adopted in this paper. The tests indicate that the scale of the Uyghur chunk database has a great effect on the model performance.

Yasen Aizezi, Anwar Jamal, Dilxat Mamat, Ruxianguli Abdurexit, Kurban Ubul
CLICK TEATRO Project: Augmented Reality and Promotion of Theater Events

This work describes the activity related to the field of Augmented Reality (AR) developed in the “Integrated Multimedia Assets to promote TheaTrical” (CLICK TEATRO) living lab project.

The “Integrated Multimedia Assets to promote TheaTrical” project realizes a system addressed to citizens in order to create a link between theater activities and commercial operators. Through the use of a smartphone app, the user will receive content related to the show (additional content, interviews, etc.) and associated services (restaurants, car parks, hotels, bars, etc.). This information will be generated and managed by means of Augmented Reality tools.

Finally, the involved activities will enter a virtuous circle: the user will gain discount bonuses to be used for purchasing other tickets. The app has been developed for iOS systems.

Donato Barbuzzi, Bachir Boussahel, Francesca De Carlo, Angelo Galiano, Donato Impedovo, Annalisa Longo
A Marker-Based Image Processing Method for Detecting Available Parking Slots from UAVs

Due to the considerable number of vehicles in many cities, the parking problem is a long-term phenomenon and represents one of the main causes of traffic congestion. Unmanned Aerial Vehicles (UAVs) can handle automatic monitoring of traffic, pollution and other interesting services in urban areas non-invasively. UAVs are usually equipped with one or more onboard cameras and with other electronic sensors. In this context, a method for parking slot occupancy detection in parking areas is presented. For the recognition of free parking spaces, pictures of urban areas captured by the onboard camera of the UAV are georeferenced and processed for marker detection. The implemented system shows good results in terms of robustness and reliability. Moreover, it paves the way for an improved management of urban spaces.

Matteo D’Aloia, Maria Rizzi, Ruggero Russo, Marianna Notarnicola, Leonardo Pellicani
Crosswalk Recognition Through Point-Cloud Processing and Deep-Learning Suited to a Wearable Mobility Aid for the Visually Impaired

In smart cities, computer vision has the potential to dramatically improve the quality of life of people suffering from visual impairments. In this field, we have been working on a wearable mobility aid aimed at detecting, in real time, obstacles in front of a visually impaired person. Our approach relies on a custom RGBD camera, with FPGA on-board processing, worn like traditional eyeglasses, and effective point-cloud processing implemented on a compact and lightweight embedded computer. This latter device also provides feedback to the user by means of a haptic interface as well as audio messages. In this paper we address crosswalk recognition which, as pointed out by several visually impaired users involved in the evaluation of our system, is a crucial requirement in the design of an effective mobility aid. Specifically, we propose a reliable methodology to detect and categorize crosswalks by leveraging point-cloud processing and deep-learning techniques. The experimental results reported, on 10000+ frames, confirm that the proposed approach is invariant to head/camera pose and extremely effective even when dealing with the large occlusions typically found in urban environments.

Matteo Poggi, Luca Nanni, Stefano Mattoccia
Early Diagnosis of Neurodegenerative Diseases by Handwritten Signature Analysis

Handwritten signatures are generally considered a powerful biometric trait for personal verification. Recently, handwritten signatures have also been investigated for the early diagnosis of neurodegenerative diseases. This paper presents a new approach for early diagnosis of neurodegenerative diseases through the analysis of handwritten dynamic signatures. For this purpose, the sigma-lognormal model was considered and dynamic parameters are extracted from the signatures. Based on these parameters, the health condition of the signer is analysed in terms of Alzheimer’s disease. The approach is cheap and effective; therefore it can be considered a very promising direction for further research.

Giuseppe Pirlo, Moises Diaz, Miguel Angel Ferrer, Donato Impedovo, Fabrizio Occhionero, Urbano Zurlo
A New Context-Aware Computing Method for Urban Safety

Recently, various research efforts have been made to analyze urban environments. In particular, predicting urban safety by means of visual perception is very important for most people. In this paper, we propose a context-aware urban safety prediction method that measures the contexts of urban environments through visual information. In our context-aware evaluation, we define and extract positive and negative visual associations with urban safety. Then, we add these associations to a computational model of urban safety. Our experimental results show better performance than previous approaches.

Hyeon-Woo Kang, Hang-Bong Kang

MADiMa 2015 - 1st International Workshop on Multimedia Assisted Dietary Management

Frontmatter
Estimating the Nutrient Content of Commercial Foods from their Label Using Numerical Optimization

We propose a method for automatically estimating the amount of a given nutrient contained in a commercial food. The method applies when no part of any ingredient is removed in the preparation process. First, we automatically bound the amount of each ingredient used to prepare the food using the information provided on its label (ingredient list and Nutrition Facts label) along with the nutrition information for at least some of the ingredients. Using these bounds (minimum and maximum amount for each ingredient), we obtain an initial set of bounds (minimum and maximum amount) for the nutrient considered. We then utilize the Simplex algorithm to refine these bounds on the nutrient content. Our motivating application is the management of medical diets that require keeping track of certain nutrients, such as phenylalanine (Phe) in the case of the inherited metabolic disease phenylketonuria (PKU). To test our method, we used it to estimate the Phe content of 25 commercial foods. In a majority of cases (17/25), the bounds obtained were within 10.4 mg of each other, and thus our method provided a very accurate estimate (±5.2 mg) for the Phe content of the foods.
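
A minimal sketch of the bounding step with a linear-programming solver (scipy.optimize.linprog standing in for a generic Simplex implementation): the nutrient content is minimized and maximized subject to per-ingredient amount bounds and a total-weight constraint. The ingredient densities, bounds and serving size below are invented illustrative numbers, not data from the paper.

    import numpy as np
    from scipy.optimize import linprog

    # Per-ingredient Phe density (mg per g) and amount bounds (g) -- illustrative values only
    phe_per_g = np.array([0.5, 2.1, 0.0])       # e.g. flour, milk powder, sugar (hypothetical)
    bounds = [(10, 60), (5, 30), (0, 40)]       # min/max grams per ingredient from the label
    total_weight = 100.0                        # serving size in grams (assumed)

    A_eq = np.ones((1, len(phe_per_g)))         # ingredient amounts must sum to the serving size
    b_eq = [total_weight]

    lo = linprog(c=phe_per_g, A_eq=A_eq, b_eq=b_eq, bounds=bounds)    # minimize Phe
    hi = linprog(c=-phe_per_g, A_eq=A_eq, b_eq=b_eq, bounds=bounds)   # maximize Phe
    print(f"Phe between {lo.fun:.1f} and {-hi.fun:.1f} mg per serving")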

Jieun Kim, Mireille Boutin
The Use of Temporal Information in Food Image Analysis

We have developed a dietary assessment system that uses food images captured by a mobile device. Food identification is a crucial component of our system. Achieving high classification rates is challenging due to the large number of food categories and the variability in food appearance. In this paper, we propose to improve food classification by incorporating temporal information. We employ recursive Bayesian estimation to incrementally learn from a person’s eating history. We show that an improvement in food classification accuracy of 11% can be achieved.
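
A minimal sketch of a recursive Bayesian update of per-class food priors from a person’s eating history; the class list, the history of classifier score vectors, and the uniform initial prior are assumptions, not the authors’ exact formulation.

    import numpy as np

    def bayes_update(prior, likelihood):
        # posterior ∝ likelihood (from the image classifier) × prior (from eating history)
        post = likelihood * prior
        return post / post.sum()

    classes = ["pasta", "salad", "apple"]            # hypothetical food categories
    prior = np.full(len(classes), 1.0 / len(classes))
    for likelihood in history_of_classifier_scores:  # assumed: one score vector per past meal
        prior = bayes_update(prior, np.asarray(likelihood, dtype=float))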

Yu Wang, Ye He, Fengqing Zhu, Carol Boushey, Edward Delp
Tastes and Textures Estimation of Foods Based on the Analysis of Its Ingredients List and Image

Recently, the number of cooking recipes on the Web has been increasing. However, it is difficult to search them by tastes or textures, although these are actually important considering the nature of the contents. Therefore, we propose a method for estimating the tastes and textures of a cooking recipe by analyzing its ingredients list and food image. Concretely, the proposed method refers to an ingredients feature from the “ingredients list” and image features from the “food image” in a cooking recipe. We confirmed the effectiveness of the proposed method through an experiment.

Hiroki Matsunaga, Keisuke Doman, Takatsugu Hirayama, Ichiro Ide, Daisuke Deguchi, Hiroshi Murase
Food Recognition and Leftover Estimation for Daily Diet Monitoring

Here we propose a system for automatic dietary monitoring of canteen customers based on robust computer vision techniques. The proposed system recognizes foods and estimates food leftovers. Results achieved on 1000 customers of a real canteen are promising.

Gianluigi Ciocca, Paolo Napoletano, Raimondo Schettini
Mobile Computing and Artificial Intelligence for Diet Management

This paper proposes a software architecture for automatic diet management and recipe analysis. We devise a virtual dietitian that is able: (1) to recover the nutritional information directly from a specific recipe, (2) to reason over recipes and diets with flexibility, i.e., by allowing some forms of diet disobedience, and (3) to persuade the user to minimize these acts of disobedience.

Alessandro Mazzei, Luca Anselma, Franco De Michieli, Andrea Bolioli, Matteo Casu, Jelle Gerbrandy, Ivan Lunardi
Highly Accurate Food/Non-Food Image Classification Based on a Deep Convolutional Neural Network

“Food” is an emerging topic of interest for the multimedia and computer vision community. In this paper, we investigate food/non-food classification of images. We show that a CNN, which is the state-of-the-art technique for general object classification, can perform accurately for this problem. For the experiments, we used three different datasets of images: (1) images we collected from Instagram, (2) the Food-101 and Caltech-256 datasets, and (3) the dataset we used in [4]. We investigated combinations of training and testing using all three of them. As a result, we achieved high accuracies of 96%, 95% and 99% on the three datasets, respectively.

Hokuto Kagaya, Kiyoharu Aizawa
A Printer Indexing System for Color Calibration with Applications in Dietary Assessment

In image-based dietary assessment, color is a very important feature for food identification. One issue with using color in image analysis is the calibration of the color image capture system. In this paper we propose an indexing system for color camera calibration using printed color checkerboards, also known as fiducial markers (FMs). To use the FM for color calibration one must know which printer was used to print the FM, so that the correct color calibration matrix can be used. We have designed a printer indexing scheme that allows one to determine which printer was used to print the FM based on a unique arrangement of color squares and binarized marks (used for error control) printed on the FM. Using normalized cross correlation and pattern detection, the index corresponding to the printer for a particular FM can be determined. Our experimental results show this scheme is robust against most types of lighting conditions.
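
As a hedged illustration of the matching step, the sketch below scores a candidate fiducial-marker patch against stored per-printer reference patches with zero-mean normalized cross correlation and returns the best-matching printer index; the template dictionary is an assumed input.

    import numpy as np

    def ncc(a, b):
        # Zero-mean normalized cross-correlation between two equally sized gray patches
        a = a - a.mean()
        b = b - b.mean()
        denom = np.sqrt((a ** 2).sum() * (b ** 2).sum())
        return float((a * b).sum() / denom) if denom > 0 else 0.0

    def printer_index(patch, templates):
        # templates: dict mapping printer id -> reference patch (assumed, one per calibrated printer)
        scores = {pid: ncc(patch.astype(float), t.astype(float)) for pid, t in templates.items()}
        return max(scores, key=scores.get)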

Shaobo Fang, Chang Liu, Fengqing Zhu, Carol Boushey, Edward Delp
Food Object Recognition Using a Mobile Device: State of the Art

In this paper, nine mobile food recognition systems are described based on their system architecture and their core properties (the core properties and experimental results are shown on the last page). While the mobile hardware increased its power through the years (2009-2013) and the food detection algorithms were optimized, there was still no uniform approach to the question of food detection. Also, some systems used additional information for better detection, such as voice data, OCR and bounding boxes. Three systems included a volume estimation feature. The first five systems were implemented on a client-server architecture, while the last three took advantage of the hardware available in later years and proposed a client-only architecture.

Simon Knez, Luka Šajn
On the Exploitation of One Class Classification to Distinguish Food Vs Non-Food Images

In recent years automatic food image understanding has become an important research challenge for society, because of the serious impact that food intake has on human life. Food recognition engines can help the monitoring of a patient’s diet and food intake habits. Nevertheless, distinguishing among different classes of food is not the first question for assisted dietary monitoring systems. Prior to asking what class of food is depicted in an image, a computer vision system should be able to distinguish between food and non-food images. In this work we consider a one-class classification method to distinguish food vs non-food images. The UNICT-FD889 dataset is used for training purposes, whereas two other datasets of food and non-food images have been downloaded from Flickr to test the method. Taking into account previous works, we used a Bag-of-Words representation considering different feature spaces to build the codebook. To give the community the possibility to work on the considered problem, the datasets used in our experiments are made publicly available.

Giovanni Maria Farinella, Dario Allegra, Filippo Stanco, Sebastiano Battiato
Food Recognition Using Consensus Vocabularies

Food recognition is an interesting and challenging problem with applications in medical, social and anthropological research areas. The high variability of food images makes the recognition task difficult for current state-of-the-art methods. It has been proved that the exploitation of multiple features to capture complementary aspects of the image contents is useful to improve the discrimination of different food items. In this paper we exploit an image representation based on the consensus among visual vocabularies built on different feature spaces. Starting from a set of visual codebooks, a consensus clustering technique is used to build a consensus vocabulary used to represent food pictures with a Bag-of-Visual-Words paradigm. This new representation is employed together with a SVM for recognition purpose.

Giovanni Maria Farinella, Marco Moltisanti, Sebastiano Battiato
Using Small Checkerboards as Size Reference: A Model-Based Approach

Monitoring diet is crucial for preventing or dealing with many chronic diseases. Therefore, plenty of different methods have been developed to serve this purpose. Among these, automatic diet monitoring based on mobile devices is of particular interest. An automatic system is supposed to be able to detect the type and amount of food intake. This work suggests using a small checkerboard in food images as a size reference to aid in estimating food amount. Although the checkerboard is a simple pattern, most off-the-shelf algorithms do not perform well in detecting small checkerboards. This paper extends a previous work, presenting a new stochastic model-based algorithm to detect small checkerboards. The algorithm first locates the checkerboard in the food image and then applies a customized corner detection algorithm to the located region. Experimental results show notably better performance in comparison to basic methods and to the previous version of the method.

Hamid Hassannejad, Guido Matrella, Monica Mordonini, Stefano Cagnoni
Fractal Nature of Chewing Sounds

In the battle against obesity as well as eating disorders, non-intrusive dietary monitoring has been investigated by many researchers. For this purpose, one of the most promising modalities is the acoustic signal captured by a common microphone placed inside the outer ear canal. Various chewing detection algorithms for this type of signal exist in the literature. In this work, we perform a systematic analysis of the fractal nature of chewing sounds, and find that the fractal dimension is substantially different between chewing and talking. This holds even for severely down-sampled versions of the recordings. We derive chewing detectors based on the fractal dimension of the recorded signals that can clearly discriminate chewing from non-chewing sounds. We experimentally evaluate snacking detection based on the proposed chewing detector, and we compare our approach against well-known counterparts. Experimental results on a large dataset of 10 subjects with a total recording duration of more than 8 hours demonstrate the high effectiveness of our method. Furthermore, there is an indication that discrimination between different properties (such as crispness) is possible.
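
For illustration, the sketch below estimates a fractal dimension per short audio frame using Katz’s estimator, which is one common choice and not necessarily the one used in the paper; frames with systematically different values could then be separated by a learned threshold.

    import numpy as np

    def katz_fd(x):
        # Katz fractal dimension of a 1-D signal (one possible estimator; the paper's may differ)
        x = np.asarray(x, dtype=float)
        n = len(x) - 1
        dists = np.sqrt(1.0 + np.diff(x) ** 2)   # successive point-to-point distances
        L = dists.sum()                           # total curve length
        d = np.sqrt(np.arange(1, len(x)) ** 2 + (x[1:] - x[0]) ** 2).max()  # max extent from start
        return np.log10(n) / (np.log10(n) + np.log10(d / L))

    def frame_fds(signal, frame_len=2048, hop=1024):
        # Fractal dimension per short frame; chewing vs. talking thresholds would be learned
        return [katz_fd(signal[i:i + frame_len])
                for i in range(0, len(signal) - frame_len, hop)]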

Vasileios Papapanagiotou, Christos Diou, Zhou Lingchuan, Janet van den Boer, Monica Mars, Anastasios Delopoulos
Objective and Subjective Meal Registration via a Smartphone Application

SPLENDID is a research programme that develops a novel preventive intervention for young people at risk for obesity and eating disorders. The SPLENDID app, a novel smartphone application that mediates the monitoring and modification of the participants’ eating and activity behaviors, resides in the intervention’s core. The app receives and manages eating and physical activity related signals from three communicating sensors as well as subjective user input. In this paper, we present two discrete meal registration mechanisms – subjective and objective – that have been implemented and incorporated in the SPLENDID app, along with the relevant user feedback. In objective meal registration, the app records meal information with the help of a portable food weight scale, while an electronic meal report is employed for the subjective registration. Certain components of the proposed registration mechanisms and the relevant feedback have been evaluated with respect to usability on forty young adolescents, yielding promising results.

Ioannis Moulos, Christos Maramis, Ioannis Ioakimidis, Janet van den Boer, Jenny Nolstam, Monica Mars, Cecilia Bergh, Nicos Maglaveras
Towards an Engaging Mobile Food Record for Teenagers

In the frame of the PEGASO European project, we aim at promoting healthier lifestyles among teenagers, focusing on alimentary education and physical activity. This paper presents a novel concept of a mobile food record developed following a multidisciplinary approach to innovate both the monitoring and the user experience. This mobile food record does not count calories but is focused on tracking dietary patterns and supporting the adoption of target behaviours. Moreover, the introduction of game mechanics developed through participatory design techniques aims at sustaining engagement in the long term.

Maurizio Caon, Stefano Carrino, Federica Prinelli, Valentina Ciociola, Fulvio Adorni, Claudio Lafortuna, Sarah Tabozzi, José Serrano, Laura Condon, Omar Abou Khaled, Elena Mugellini
MANGO - Mobile Augmented Reality with Functional Eating Guidance and Food Awareness

The prevention of cardiovascular diseases becomes more and more important, as malnutrition accompanies today’s fast-moving society. While most people know the importance of adequate nutrition, information on advantageous food is often not at hand, such as during daily activities. Decision making on individual dietary management is closely linked to the food shopping decision. Since food shopping often requires fast decision making, due to stressful and crowded situations, the user needs meaningful assistance, with clear and rapidly available associations from food items to dietary recommendations. This paper presents the first results of the Austrian project MANGO, which develops mobile assistance for instant, situated information access via Augmented Reality (AR) functionality to support the user during everyday grocery shopping. Within a modern diet - the functional eating concept - the user is advised which fruits and vegetables to buy according to his individual profile. This specific oxidative stress profile is created through a short in-app survey. Using a built-in image recognition system, the application automatically classifies video-captured food using machine learning and computer vision methodology, such as Random Forests classification and multiple color feature spaces. The user can decide to display additional nutrition information along with alternative proposals. We demonstrate that the application is able to recognize food classes in real time, under real-world shopping conditions, and associates dietary recommendations using situated AR assistance.

Georg Waltner, Michael Schwarz, Stefan Ladstätter, Anna Weber, Patrick Luley, Horst Bischof, Meinrad Lindschinger, Irene Schmid, Lucas Paletta
Dish Detection and Segmentation for Dietary Assessment on Smartphones

Diet-related chronic diseases severely affect personal and global health. However, managing or treating these diseases currently requires long training and high personal involvement to succeed. Computer vision systems could assist with the assessment of diet by detecting and recognizing different foods and their portions in images. We propose novel methods for detecting a dish in an image and segmenting its contents with and without user interaction. All methods were evaluated on a database of over 1600 manually annotated images. The dish detection scored an average of 99% accuracy with a 0.2 s/image run time, while the automatic and semi-automatic dish segmentation methods reached average accuracies of 88% and 91% respectively, with an average run time of 0.5 s/image, outperforming competing solutions.

Joachim Dehais, Marios Anthimopoulos, Stavroula Mougiakakou
FooDD: Food Detection Dataset for Calorie Measurement Using Food Images

Food detection, classification, and analysis have been the topic of in-depth studies for a variety of applications related to eating habits and dietary assessment. For the specific topic of calorie measurement of food portions with single and mixed food items, the research community needs a dataset of images for testing and training. In this paper we introduce FooDD: a Food Detection Dataset of 3000 images that offers a variety of food photos taken with different cameras under different illuminations. We also provide examples of food detection using graph cut segmentation and deep learning algorithms.

Parisa Pouladzadeh, Abdulsalam Yassine, Shervin Shirmohammadi
CNN-Based Food Image Segmentation Without Pixel-Wise Annotation

We propose a CNN-based food image segmentation method which requires no pixel-wise annotation. The proposed method consists of food region proposals by selective search and bounding box clustering, back-propagation-based saliency map estimation with the CNN model fine-tuned on the UEC-FOOD100 dataset, GrabCut guided by the estimated saliency maps, and region integration by non-maximum suppression. In the experiments, the proposed method outperformed RCNN on food region detection as well as on the PASCAL VOC detection task.
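
A hedged sketch of the saliency-guided GrabCut step, assuming a saliency map normalized to [0, 1] and illustrative thresholds: high-saliency pixels seed probable foreground, low-saliency pixels seed background, and OpenCV’s grabCut refines the mask.

    import cv2
    import numpy as np

    def grabcut_from_saliency(img_bgr, saliency, lo=0.3, hi=0.7):
        # Seed GrabCut from a saliency map: high-saliency pixels -> probable foreground,
        # low-saliency pixels -> definite background (thresholds are assumed parameters).
        mask = np.full(saliency.shape, cv2.GC_PR_BGD, np.uint8)
        mask[saliency >= hi] = cv2.GC_PR_FGD
        mask[saliency <= lo] = cv2.GC_BGD
        bgd, fgd = np.zeros((1, 65), np.float64), np.zeros((1, 65), np.float64)
        cv2.grabCut(img_bgr, mask, None, bgd, fgd, 5, cv2.GC_INIT_WITH_MASK)
        # Return a binary food mask (foreground or probable foreground)
        return np.isin(mask, (cv2.GC_FGD, cv2.GC_PR_FGD)).astype(np.uint8)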

Wataru Shimoda, Keiji Yanai
Food Recognition for Dietary Assessment Using Deep Convolutional Neural Networks

Diet management is a key factor for the prevention and treatment of diet-related chronic diseases. Computer vision systems aim to provide automated food intake assessment using meal images. We propose a method for the recognition of already segmented food items in meal images. The method uses a 6-layer deep convolutional neural network to classify food image patches. For each food item, overlapping patches are extracted and classified and the class with the majority of votes is assigned to it. Experiments on a manually annotated dataset with 573 food items justified the choice of the involved components and proved the effectiveness of the proposed system yielding an overall accuracy of 84.9%.
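
A minimal sketch of the patch-voting step described above: overlapping patches are extracted from an already segmented food item, each patch is classified, and the item receives the class with the majority of votes. The patch classifier is an assumed callable; the 6-layer CNN itself is not reproduced here, and the patch size and stride are placeholders.

    import numpy as np

    def classify_food_item(item_rgb, classify_patch, patch=32, stride=16):
        # classify_patch: assumed callable mapping a patch array -> integer class label
        h, w = item_rgb.shape[:2]
        votes = []
        for y in range(0, h - patch + 1, stride):
            for x in range(0, w - patch + 1, stride):
                votes.append(classify_patch(item_rgb[y:y + patch, x:x + patch]))
        return np.bincount(np.asarray(votes)).argmax()   # class with the majority of patch votes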

Stergios Christodoulidis, Marios Anthimopoulos, Stavroula Mougiakakou

SBMI 2015 - Scene Background Modeling and Initialization

Frontmatter
Towards Benchmarking Scene Background Initialization

Given a set of images of a scene taken at different times, the availability of an initial background model that describes the scene without foreground objects is the prerequisite for a wide range of applications, ranging from video surveillance to computational photography. Even though several methods have been proposed for scene background initialization, the lack of a common groundtruthed dataset and of a common set of metrics makes it difficult to compare their performance. To take the first steps towards an easy and fair comparison of these methods, we assembled a dataset of sequences frequently adopted for background initialization, selected or created ground truths for quantitative evaluation through a selected suite of metrics, and compared results obtained by some existing methods, making all the material publicly available.

Lucia Maddalena, Alfredo Petrosino
Simple Median-Based Method for Stationary Background Generation Using Background Subtraction Algorithms

The estimation of the background image from a video sequence is necessary in some applications. Computing the median for each pixel over time is effective, but it fails when the background is visible for less than half of the time. In this paper, we propose a new method leveraging the segmentation performed by a background subtraction algorithm, which reduces the set of color candidates for each pixel before the median is applied. Our method is simple and fully generic, as any background subtraction algorithm can be used. While recent background subtraction algorithms are excellent at detecting moving objects, our experiments show that the frame difference algorithm compares advantageously to more advanced ones. Finally, we present the background images obtained on the SBI dataset, which appear to be almost perfect. The source code of our method can be downloaded at http://www.ulg.ac.be/telecom/research/sbg.
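A sketch of the idea described above, using a simple frame difference as the background subtraction step (an assumption; any algorithm could be substituted) to restrict, per pixel, the set of colour candidates over which the temporal median is taken:

```python
# Median background estimation restricted to samples labelled as background.
import numpy as np

def median_background(frames, diff_thresh=25):
    """frames: array of grayscale frames (T, H, W), uint8."""
    frames = np.asarray(frames, dtype=np.uint8)
    # Frame difference: flag a pixel as moving if it changed strongly since
    # the previous frame; the first frame is kept as a static candidate.
    moving = np.zeros_like(frames, dtype=bool)
    moving[1:] = np.abs(frames[1:].astype(int) - frames[:-1].astype(int)) > diff_thresh
    candidates = np.where(moving, np.nan, frames.astype(float))
    # Median over the remaining (static) candidates; fall back to the plain
    # median where every sample was flagged as moving.
    background = np.nanmedian(candidates, axis=0)
    fallback = np.median(frames, axis=0)
    return np.where(np.isnan(background), fallback, background).astype(np.uint8)
```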

Benjamin Laugraud, Sébastien Piérard, Marc Braham, Marc Van Droogenbroeck
Multi-modal Background Model Initialization

Background subtraction is a widely used technique for detecting moving objects in image sequences. Very often, background subtraction approaches assume the availability of one or more clear frames (i.e., without foreground objects) at the beginning of the input image sequence. This strong assumption is not always correct, especially when dealing with dynamic backgrounds. In this paper, we present the results of an on-line, real-time background initialization method, called IMBS, which generates a reliable initial background model even if no clear frames are available. The accuracy of the proposed approach is evaluated on a set of seven publicly available benchmark sequences. Experimental results demonstrate that IMBS generates accurate background models with respect to eight different quality metrics.

Domenico D. Bloisi, Alfonso Grillo, Andrea Pennisi, Luca Iocchi, Claudio Passaretti
Background Modeling by Weightless Neural Networks

Background initialization is the task of computing a background model by processing a set of preliminary frames in a video scene. The initial background estimation serves as a bootstrap model for video segmentation of foreground objects, although the background estimation can be refined and updated during steady-state operation of video processing systems. In this paper we approach the background modeling problem with a weightless neural network called WiSARD$$^{rp}$$. The proposed approach is straightforward, since the computation is pixel-based and it exploits a dedicated neural network to model each pixel's background using the same training rule.
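A heavily simplified sketch of a weightless, RAM-based per-pixel discriminator, written from the general WiSARD principle rather than from the paper's WiSARD$$^{rp}$$ variant; the encoding, tuple size, and threshold are illustrative choices:

```python
# Minimal WiSARD-style (RAM-based, weightless) per-pixel background model.
class PixelDiscriminator:
    def __init__(self, bits=16, tuple_size=4):
        self.tuple_size = tuple_size
        self.n_rams = bits // tuple_size
        self.rams = [set() for _ in range(self.n_rams)]

    def _addresses(self, value):
        # Thermometer-encode an 8-bit intensity into binary inputs, then split
        # them into tuples that address the RAM nodes.
        bits = self.n_rams * self.tuple_size
        code = [1 if value >= (i + 1) * 256 // (bits + 1) else 0 for i in range(bits)]
        for r in range(self.n_rams):
            chunk = code[r * self.tuple_size:(r + 1) * self.tuple_size]
            yield int(''.join(map(str, chunk)), 2)

    def train(self, value):
        # Training writes the observed addresses into the RAM nodes.
        for r, addr in enumerate(self._addresses(value)):
            self.rams[r].add(addr)

    def response(self, value):
        # Fraction of RAM nodes that have already seen this address.
        return sum(addr in self.rams[r]
                   for r, addr in enumerate(self._addresses(value))) / self.n_rams

# A pixel could be labelled background when, e.g., response(value) >= 0.75.
```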

Massimo De Gregorio, Maurizio Giordano
BMTDL for Scene Modeling on the SBI Dataset

In this paper we evaluate our method for Background Modeling Through Dictionary Learning (BMTDL) and sparse coding on the recently proposed Scene Background Initialization (SBI) dataset. The BMTDL, originally proposed in [1] for the specific purpose of detecting the foreground of a scene, leverages the availability of long-time observations, where foreground objects can be treated as noise. The SBI dataset refers to more general scene modeling problems, such as video segmentation, compression, or editing, where video sequences may be short and often include foreground objects occupying a large portion of the image for the majority of the sequence. The experimental analysis we report is very promising and shows that the BMTDL is also appropriate for these different and challenging conditions.
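A hedged sketch in the spirit of dictionary learning and sparse coding for background modelling (not the authors' BMTDL implementation): frames are flattened, a small dictionary is learnt over time, and pixels poorly reconstructed by the sparse code are treated as foreground "noise". Atom count, sparsity weight, and threshold are assumptions.

```python
# Background modelling via dictionary learning and sparse coding (sketch).
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning

def sparse_background_model(frames, n_atoms=10, alpha=1.0):
    """frames: (T, H, W) grayscale array; returns per-frame foreground masks."""
    T, H, W = frames.shape
    X = frames.reshape(T, H * W).astype(float) / 255.0
    dico = MiniBatchDictionaryLearning(n_components=n_atoms, alpha=alpha)
    codes = dico.fit(X).transform(X)
    reconstruction = codes @ dico.components_
    residual = np.abs(X - reconstruction).reshape(T, H, W)
    return residual > 0.1        # illustrative foreground threshold
```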

Nicoletta Noceti, Alessandra Staglianò, Alessandro Verri, Francesca Odone
Comparison of Matrix Completion Algorithms for Background Initialization in Videos

Background model initialization is commonly the first step of the background subtraction process. In practice, several challenges perturb this process, such as dynamic backgrounds, bootstrapping, illumination changes, and image noise. In this context, this work investigates background model initialization as a matrix completion problem. Thus, we consider the image sequence (or video) as a partially observed matrix. First, a simple joint motion-detection and frame-selection operation is performed: redundant frames are eliminated, and moving regions are represented by zeros in our observation matrix. The second stage evaluates nine popular matrix completion algorithms on the Scene Background Initialization (SBI) dataset and analyzes them with respect to the background model challenges. The experimental results show the good performance of the LRGeomCG [17] method over its direct competitors.
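To illustrate the "video as a partially observed matrix" idea, a simple iterative SVD soft-thresholding (SoftImpute-like) completion is sketched below; the paper evaluates nine dedicated algorithms such as LRGeomCG instead, and the threshold and iteration count here are arbitrary assumptions.

```python
# Background initialization posed as low-rank matrix completion (sketch).
import numpy as np

def complete_background(frames, mask_moving, tau=5.0, n_iter=50):
    """frames: (T, H, W); mask_moving: boolean (T, H, W), True where a pixel
    is covered by motion and hence treated as a missing entry."""
    T, H, W = frames.shape
    M = frames.reshape(T, H * W).astype(float)
    observed = ~mask_moving.reshape(T, H * W)
    X = np.where(observed, M, 0.0)            # initial fill for missing entries
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        s = np.maximum(s - tau, 0.0)          # soft-threshold singular values
        low_rank = (U * s) @ Vt
        X = np.where(observed, M, low_rank)   # keep observed entries fixed
    # Collapse the completed sequence into a single background image.
    return np.median(X, axis=0).reshape(H, W)
```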

Andrews Sobral, Thierry Bouwmans, El-hadi Zahzah
Real-Time Implementation of Background Modelling Algorithms in FPGA Devices

The article discusses the possibilities of hardware implementation of foreground object segmentation and background modelling algorithms in FPGA. The potential benefits, as well as the challenges and problems associated with porting algorithms from general-purpose processors (CPU) to reconfigurable logic (FPGA), are presented. Several hardware implementations of well-known methods are also reviewed: GMM, Codebook, Clustering, ViBe, and PBAS. The last algorithm was also evaluated on the SBI dataset.

Tomasz Kryjak, Marek Gorgon
A Perfect Estimation of a Background Image Does Not Lead to a Perfect Background Subtraction: Analysis of the Upper Bound on the Performance

The quest for the “best” background subtraction technique is ongoing. Although considerable effort has been devoted to developing flexible and efficient methods, some elementary questions remain unanswered. One of them is the existence of an intrinsic upper bound on the performance. In fact, data are affected by noise, and it is therefore illusory to believe that a perfect segmentation can be achieved. This paper explores some intrinsic limitations of the principle of background subtraction by studying the impact of several limiting factors separately. One of our conclusions is that even if an algorithm were able to compute a perfect background image, this would not be sufficient to achieve a perfect segmentation with background subtraction, due to other intrinsic limitations.
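A small numerical illustration of that argument (the noise level and threshold are arbitrary assumptions, not taken from the paper): even with a perfect background image, sensor noise alone makes plain thresholded subtraction imperfect.

```python
# Even a perfect background model yields false positives under noise.
import numpy as np

rng = np.random.default_rng(0)
background = np.full((240, 320), 120.0)                    # "perfect" background
frame = background + rng.normal(0, 5, background.shape)    # noisy frame, no object
detection = np.abs(frame - background) > 10                # thresholded subtraction
print(f"false positives with a perfect background: {detection.mean():.2%}")
# With sigma = 5 and a 2-sigma threshold, roughly 4-5% of background pixels
# are still misclassified as foreground.
```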

Sébastien Piérard, Marc Van Droogenbroeck
Nonlinear Background Filter to Improve Pedestrian Detection

In this paper, we propose a simple nonlinear filter which improves the detection of pedestrians walking in a video. We do so by first accumulating the temporal gradient of moving objects into a motion history image (MHI). Then, to each frame of the video, we apply a motion-guided nonlinear filter whose goal is to smudge out background details while leaving foreground moving objects untouched. The resulting blurry-background image is then fed to a pedestrian detector. Experiments reveal that, for a given miss rate, our motion-guided nonlinear filter can decrease the number of false positives per image (FPPI) by a factor of up to 26. Our method is simple, computationally light, and can be applied to a variety of videos to improve the performance of almost any kind of pedestrian detector.
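A sketch of the two steps described above: accumulate frame differences into a motion history image, then blur regions with little recent motion before running a pedestrian detector. The exact nonlinear filter of the paper is not reproduced, and all parameters are illustrative.

```python
# Motion-history-guided background smoothing before pedestrian detection.
import cv2
import numpy as np

def motion_guided_blur(prev_gray, curr_gray, curr_bgr, mhi,
                       decay=0.9, diff_thresh=20):
    """Update the motion history image `mhi` (float32, same size as the frame)
    and return a frame whose static background has been smoothed."""
    motion = (cv2.absdiff(curr_gray, prev_gray) > diff_thresh).astype(np.float32)
    mhi = np.maximum(mhi * decay, motion)           # recent motion keeps MHI high
    blurred = cv2.GaussianBlur(curr_bgr, (15, 15), 0)
    weight = mhi[..., None]                         # broadcast over colour channels
    smudged = (weight * curr_bgr + (1.0 - weight) * blurred).astype(np.uint8)
    return smudged, mhi
```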

Yi Wang, Sébastien Piérard, Song-Zhi Su, Pierre-Marc Jodoin

QoEM 2015 - Workshop on Image and Video Processing for Quality of Multimedia Experience

Frontmatter
Full-Reference SSIM Metric for Video Quality Assessment with Saliency-Based Features

This paper uses models of visual attention to estimate human visual perception and thus improve Video Quality Assessment metrics. This work reports on the use of a saliency-based model in a full-reference structural similarity metric to create new metrics that take into account regions that strongly attract human attention. Correlation results with the differential mean opinion score values from the LIVE Video Quality Database are presented and discussed.
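A hedged sketch of the general idea: weight a full-reference SSIM map by a saliency map so that distortions in attention-grabbing regions count more. The specific saliency model and pooling used in the paper are not reproduced here.

```python
# Saliency-weighted pooling of a full-reference SSIM map.
import numpy as np
from skimage.metrics import structural_similarity

def saliency_weighted_ssim(reference, distorted, saliency):
    """All inputs are 2-D arrays of the same size; `saliency` is assumed
    to be non-negative (e.g. a per-pixel attention map)."""
    _, ssim_map = structural_similarity(reference, distorted,
                                        data_range=255, full=True)
    weights = saliency / (saliency.sum() + 1e-12)
    return float((ssim_map * weights).sum())
```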

Eduardo Romani, Wyllian Bezerra da Silva, Keiko Verônica Ono Fonseca, Dubravko Culibrk, Alexandre de Almeida Prado Pohl
Video Quality Assessment for Mobile Devices on Mobile Devices

The pervasiveness of mobile devices and ubiquitous broadband Internet access have laid the foundations for video content to be consumed increasingly on smartphones and tablets. As over 85% of global consumer traffic by 2016 is estimated to be generated by streaming video content, video quality as perceived by end-users of such devices is becoming an important issue. Most of the studies concerned with Video Quality Assessment (VQA) for mobile devices have been carried out in a carefully controlled environment, thus potentially failing to take into account variables or effects present in real-world conditions. In this paper, we compare the results of the traditional approach to VQA for mobile devices with those obtained in real-world conditions using a physical mobile device, for the same video test set. Results indicate that a difference in perceived video quality between the two settings exists, laying the foundations for further research to explain the reasons behind it.

Milan Mirkovic, Dubravko Culibrk, Srdjan Sladojevic, Andras Anderla
An Efficient SIMD Implementation of the H.265 Decoder for Mobile Architecture

This paper focuses on an efficient optimization of the H.265 video decoder on architectures suitable for mobile devices. The solutions developed to support the H.265 features and the achieved performance are presented. The most demanding modules have been optimized with Single Instruction Multiple Data (SIMD) instructions, with particular attention to memory handling and the minimization of memory transfers. The effectiveness of the proposed solutions has been demonstrated on the ARM architecture; in particular, we selected the dual-core Cortex A9 processor with the NEON SIMD extension.

Massimo Bariani, Paolo Lambruschini, Marco Raggio, Luca Pezzoni
Kinematics Analysis Multimedia System for Rehabilitation

Driven by recent advances in information and communications technology, tele-rehabilitation services based on multimedia processing are emerging. Gait analysis is common to many rehabilitation programs, being, for example, periodically performed in post-stroke recovery assessment. Since current optical diagnostic and patient assessment tools tend to be expensive and not portable, this paper proposes a novel marker-based tracking system using a single depth camera, providing a cost-effective solution that enables tele-rehabilitation services from home and local clinics. The proposed system can simultaneously generate motion patterns even against a complex background using the proposed geometric model-based algorithm. It autonomously provides gait analysis results through a customised, user-friendly application that facilitates seamless navigation through the captured scene and multi-view video data processing, designed with feedback from practitioners to maximise user experience. The locally processed rehabilitation data can be accessed by cross-platform mobile devices using cloud-based services, enabling emerging tele-rehabilitation practices.

Minxiang Ye, Cheng Yang, Vladimir Stankovic, Lina Stankovic, Andrew Kerr
Evaluation of Signal Processing Methods for Attention Assessment in Visual Content Interaction

Eye movements and changes in pupil dilation are known to provide information about a viewer's attention and interaction with visual content. This paper evaluates different statistical and signal processing methods for autonomously analysing pupil dilation signals and extracting information about a viewer's attention when perceiving visual information. In particular, using a commercial video-based eye tracker to estimate pupil dilation and gaze fixation, we demonstrate that wavelet-based signal processing provides an effective tool for pupil dilation analysis, and we discuss the effect that different image content has on pupil dilation and a viewer's attention.
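As an illustration of wavelet-based analysis of a pupil-dilation signal (using PyWavelets; the wavelet family, decomposition level, and features are assumptions, not the paper's exact configuration):

```python
# Wavelet decomposition of a pupil-dilation time series into sub-band energies.
import numpy as np
import pywt

def pupil_wavelet_features(pupil_signal, wavelet='db4', level=4):
    """pupil_signal: 1-D array of pupil diameter samples over time."""
    coeffs = pywt.wavedec(pupil_signal, wavelet, level=level)
    # Energy of each sub-band: the approximation captures slow dilation trends,
    # the detail bands capture faster fluctuations (e.g. blinks, rapid responses).
    return [float(np.sum(np.square(c))) for c in coeffs]
```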

Georgia Elafoudi, Vladimir Stankovic, Lina Stankovic, Deepti Pappusetti, Hari Kalva
Why You Trust in Visual Saliency

Image understanding is a simple task for a human observer. Visual attention is automatically directed to interesting regions, first by natural objective stimuli and then by prior knowledge. Saliency maps try to simulate the human response and use actual eye-movement measurements as ground truth. An interesting question is: how much can corruption in a digital image affect saliency detection with respect to the original image? One contribution of this work is to compare the performance of standard approaches with respect to different types of image corruption and different threshold values on the saliency maps. If the corruption can be estimated and/or the threshold is fixed, the results of this work can also be used to help select the method with the best performance.
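A hedged sketch of the experimental question above: compute a simple saliency map (spectral residual, one standard approach, implemented here from scratch rather than taken from the paper) for an image and a corrupted copy, and measure how much the two maps agree. The noise type, level, and agreement score are illustrative assumptions.

```python
# Compare saliency maps of an image and a noise-corrupted copy.
import cv2
import numpy as np

def spectral_residual_saliency(gray):
    """Spectral residual saliency map, normalised to [0, 1]."""
    small = cv2.resize(gray.astype(np.float32), (64, 64))
    spectrum = np.fft.fft2(small)
    log_amp = np.log(np.abs(spectrum) + 1e-8)
    residual = log_amp - cv2.blur(log_amp, (3, 3))
    saliency = np.abs(np.fft.ifft2(np.exp(residual + 1j * np.angle(spectrum)))) ** 2
    saliency = cv2.GaussianBlur(saliency, (9, 9), 2.5)
    return cv2.resize(saliency / saliency.max(), gray.shape[::-1])

def saliency_agreement(gray, sigma=20):
    """Correlation between the saliency of `gray` and of a noisy copy."""
    noisy = np.clip(gray + np.random.normal(0, sigma, gray.shape), 0, 255)
    s0, s1 = spectral_residual_saliency(gray), spectral_residual_saliency(noisy)
    return float(np.corrcoef(s0.ravel(), s1.ravel())[0, 1])   # 1.0 = unaffected
```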

Edoardo Ardizzone, Alessandro Bruno, Luca Greco, Marco La Cascia
Backmatter
Metadata
Title
New Trends in Image Analysis and Processing -- ICIAP 2015 Workshops
Edited by
Vittorio Murino
Enrico Puppo
Diego Sona
Marco Cristani
Carlo Sansone
Copyright Year
2015
Electronic ISBN
978-3-319-23222-5
Print ISBN
978-3-319-23221-8
DOI
https://doi.org/10.1007/978-3-319-23222-5
