
2009 | Book

Image Analysis and Processing – ICIAP 2009

15th International Conference Vietri sul Mare, Italy, September 8-11, 2009 Proceedings

Edited by: Pasquale Foggia, Carlo Sansone, Mario Vento

Publisher: Springer Berlin Heidelberg

Book Series: Lecture Notes in Computer Science


About this book

This book constitutes the refereed proceedings of the 15th International Conference on Image Analysis and Processing, ICIAP 2009, held in Vietri sul Mare, Italy, in September 2009. The 107 revised full papers presented together with 3 invited papers were carefully reviewed and selected from 168 submissions. The papers are organized in topical sections on computer graphics and image processing, low and middle level processing, 2D and 3D segmentation, feature extraction and image analysis, object detection and recognition, video analysis and processing, pattern analysis and classification, learning, graphs and trees, applications, shape analysis, face analysis, medical imaging, and image analysis and pattern recognition.

Table of Contents

Frontmatter

Invited Talks

Learning with Missing or Incomplete Data

The problem of learning with missing or incomplete data has received a lot of attention in the literature [6,10,13,21,23]. The reasons for missing data can be manifold, ranging from sensor failures in engineering applications to the deliberate withholding of information in medical questionnaires (in the case of missing input feature values) or the lack of solved (labelled) cases required by supervised learning algorithms (in the case of missing labels). And though such problems are very interesting from the practical and theoretical point of view, there are very few pattern recognition techniques which can deal with missing values in a straightforward and efficient manner. This is in sharp contrast to the very efficient way in which humans deal with unknown data and are able to perform various pattern recognition tasks given only a subset of input features or a few labelled reference cases.

In the context of pattern recognition or classification systems the problem of missing labels and the problem of missing features are very often treated separately.

The availability or otherwise of labels determines the type of learning algorithm that can be used, and has led to the well-known split into supervised, unsupervised and the more recently introduced hybrid/semi-supervised classes of learning algorithms.

Commonly, using supervised learning algorithms enables the design of robust and well-performing classifiers. Unfortunately, in many real-world applications labelling of the data is costly and thus possible only to some extent. Unlabelled data, on the other hand, is often available in large quantities, but a classifier built using unsupervised learning is likely to demonstrate performance inferior to its supervised counterpart. The interest in mixed supervised and unsupervised learning is thus a natural consequence of this state of affairs, and various approaches have been discussed in the literature [2,5,10,12,14,15,18,19]. Our experimental results have shown [10] that, when supported by unlabelled samples, much less labelled data is generally required to build a classifier without compromising the classification performance. If only a very limited amount of labelled data is available, the results based on random selection of labelled samples show high variability, and the performance of the final classifier depends more on how reliable the labelled data samples are than on the use of additional unlabelled data. This points to an interesting trade-off between the information content of the observed data (in this case the available labels) and the impact achievable by employing sophisticated data processing algorithms, a trade-off we will revisit when discussing approaches dealing with missing feature values.

Bogdan Gabrys
Image Analysis and Machine Learning: How to Foster a Stronger Connection?

In this talk I try to answer the question stated in its title. I discuss some of the challenges encountered in the use of Machine Learning in image analysis, including examples from our own work. In particular, I look at examples of successful recent research that uses advanced Machine Learning for specific image analysis tasks, and I try to generalize the lessons learned from this research. I argue that some of the reasons for these successes are consistent with current research trends in Machine Learning. These new approaches resolve some of the difficulties encountered in the past between the two fields, e.g. in the area of feature selection. I look further at some of the recent research trends in Machine Learning (e.g. Active Learning) which might be interesting in image analysis and processing. I also speculate that new research in cognitive neuroscience might provide interesting cues to Artificial Intelligence in general, and to Computer Vision and image analysis in particular.

Stan Matwin
Computer Analysis of Lighting in Realist Master Art: Current Methods and Future Challenges

We review several computer-based techniques for analyzing the lighting in images that have proven valuable when addressing questions in the history and interpretation of realist art. These techniques fall into two general classes: model independent (where one makes no assumption about the three-dimensional shape of the rendered objects) and model dependent (where one does make some assumptions about their three-dimensional shape). An additional statistical algorithm integrates the estimates of lighting position or direction produced by different such techniques. We conclude by discussing several outstanding problems and future directions in the analysis of lighting in realist art.

David G. Stork

Contest Session

First International Fingerprint Liveness Detection Competition—LivDet 2009

Fingerprint recognition systems are vulnerable to artificial spoof fingerprint attacks, such as molds made of silicone, gelatin or Play-Doh. “Liveness detection”, which detects vitality information from the biometric signature itself, has been proposed to defeat these kinds of spoof attacks. The goal of the LivDet 2009 competition is to compare different methodologies for software-based fingerprint liveness detection with a common experimental protocol and a large dataset of spoof and live images. The competition is open to all academic and industrial institutions that have a solution to the software-based fingerprint vitality detection problem. Four submissions resulted in successful completion: Dermalog, ATVS, and two anonymous participants (one industrial and one academic). Each participant submitted an algorithm as a Win32 console application. Performance was evaluated on three datasets, from three different optical scanners, each with over 1500 images of “fake” and over 1500 images of “live” fingerprints. The best results were obtained by the algorithm submitted by Dermalog, with 2.7% FRR and 2.8% FAR on the Identix (L-1) dataset. The competition aims to become a reference event for academic and industrial research in software-based fingerprint liveness detection and to raise the visibility of this important research area, in order to decrease the risk posed by spoof attacks to fingerprint systems.
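
For reference, the reported FRR/FAR pair follows directly from a score threshold. A minimal sketch in Python, assuming the (hypothetical) convention that higher scores mean "live"; the synthetic score distributions are illustrative only:

```python
import numpy as np

def frr_far(live_scores, fake_scores, threshold):
    """FRR: fraction of live fingers rejected as fake.
    FAR: fraction of fakes accepted as live.
    Assumed convention: higher score = more likely live."""
    live_scores = np.asarray(live_scores)
    fake_scores = np.asarray(fake_scores)
    frr = np.mean(live_scores < threshold)   # live samples rejected
    far = np.mean(fake_scores >= threshold)  # fake samples accepted
    return frr, far

# Sweep thresholds to trade the two error rates against each other
live = np.random.normal(0.8, 0.10, 1500)   # synthetic stand-in scores
fake = np.random.normal(0.4, 0.15, 1500)
for t in (0.5, 0.6, 0.7):
    print(t, frr_far(live, fake, t))
```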

Gian Luca Marcialis, Aaron Lewicke, Bozhao Tan, Pietro Coli, Dominic Grimberg, Alberto Congiu, Alessandra Tidu, Fabio Roli, Stephanie Schuckers

Computer Graphics and Image Processing

SVR-Based Jitter Reduction for Markerless Augmented Reality

The ability to augment a video stream with consistent virtual contents is an attractive Computer Vision application. The first Augmented Reality (AR) proposals required the scene to be endowed with special markers. Recently, thanks to developments in the field of natural invariant local features, similar results have been achieved in a markerless scenario. The computer vision community is now equipped with a set of relatively standard techniques to solve the underlying markerless camera pose estimation problem, at least for planar textured reference objects. The majority of proposals, however, do not exploit temporal consistency across frames in order to reduce some disturbing effects of per-frame estimation, namely the visualization of short spurious estimations and jitter. We propose a new method based on Support Vector Regression to mitigate these undesired effects while preserving the ability to work in real time. Our proposal can be used as a post-processing step independent of the chosen pose estimation method, thus providing an effective and easily integrable building block for AR applications.
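
As a rough illustration of the idea, the sketch below smooths one pose parameter over a sliding window with off-the-shelf Support Vector Regression. The window length, kernel and hyperparameters are assumptions for illustration, not the authors' settings:

```python
import numpy as np
from sklearn.svm import SVR

def smooth_pose_parameter(frame_times, raw_values, C=10.0, epsilon=0.01):
    """Fit an SVR to the recent history of one pose parameter
    (e.g. a translation component) and return the regressed value
    for the latest frame; the regression damps jitter and isolated
    spurious per-frame estimates."""
    svr = SVR(kernel="rbf", C=C, epsilon=epsilon)
    t = np.asarray(frame_times, dtype=float).reshape(-1, 1)
    svr.fit(t, np.asarray(raw_values, dtype=float))
    return svr.predict(t[-1:])[0]

# Sliding window of noisy per-frame estimates of one parameter
times = np.arange(30)
noisy = np.sin(times / 10.0) + np.random.normal(0.0, 0.05, 30)
print(smooth_pose_parameter(times, noisy))
```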

Samuele Salti, Luigi Di Stefano
Re-photography and Environment Monitoring Using a Social Sensor Network

This paper presents a technology capable of enabling the creation of a diffuse, calibrated vision-sensor network from the wealth of socially generated geo-referenced imagery freely available on the Internet. Through the implementation of an accurate image registration system, based on image processing, terrain modelling and subsequent correlation, we will demonstrate how images taken by the public can potentially be used as a means to gather environmental information from a unique, ground-level viewpoint normally denied to non-terrestrial sensors (consider vertical or overhanging cliffs). Moreover, we will also show how this registration technology can be used to synthesize new views using sections of photos taken from a variety of places and times.

Paul Chippendale, Michele Zanin, Claudio Andreatta
Region-Based Illuminant Estimation for Effective Color Correction

Several algorithms have been proposed in the literature to recover the illuminant chromaticity of the original scene. These algorithms work well only when their prior assumptions are satisfied, and the best and worst algorithms may differ from scene to scene. In particular, for certain images a “do nothing” strategy may be preferable. Starting from these considerations, we have developed a region-based color constancy algorithm able to automatically select among (and/or blend) different color corrections, including a conservative “do nothing” strategy. The strategy to be applied is selected without any a priori knowledge of the image content, using only low-level image analysis.

Simone Bianco, Francesca Gasparini, Raimondo Schettini

Low and Middle Level Processing

A New Technique for Image Magnification

A discrete technique for image magnification is presented, which produces the resulting image in one scan of the input image and does not require any threshold. The technique allows the user to magnify an image with any integer zooming factor. The performance of the algorithm is evaluated using the standard criterion based on the Peak Signal-to-Noise Ratio (PSNR). The obtained results are visually good, since artifacts do not significantly affect the magnified images.
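
The PSNR criterion used here is the standard one; a minimal implementation for 8-bit images:

```python
import numpy as np

def psnr(reference, test, peak=255.0):
    """Peak Signal-to-Noise Ratio in dB; higher values mean the
    magnified image is closer to the reference."""
    reference = np.asarray(reference, dtype=np.float64)
    test = np.asarray(test, dtype=np.float64)
    mse = np.mean((reference - test) ** 2)
    return float("inf") if mse == 0.0 else 10.0 * np.log10(peak * peak / mse)
```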

Carlo Arcelli, Maria Frucci, Gabriella Sanniti di Baja
Nonlocal Similarity Image Filtering

We exploit the recurrence of structures at different locations, orientations and scales in an image to perform denoising. While previous methods based on “nonlocal filtering” identify corresponding patches only up to translations, we consider more general similarity transformations. Due to the additional computational burden, we break the problem down into two steps: First, we extract similarity invariant descriptors at each pixel location; second, we search for similar patches by matching descriptors. The descriptors used are inspired by scale-invariant feature transform (SIFT), whereas the similarity search is solved via the minimization of a cost function adapted from local denoising methods. Our method compares favorably with existing denoising algorithms as tested on several datasets.
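
For context, the translational baseline that the paper generalizes is plain nonlocal means: average pixels in a search window, weighted by patch similarity. A minimal per-pixel sketch (interior pixels only; parameters are illustrative):

```python
import numpy as np

def nl_means_pixel(image, y, x, patch=3, search=7, h=10.0):
    """Denoised estimate for one interior pixel: weighted average of
    pixels in a search window, with weights from patch similarity.
    (y, x) must be far enough from the border for all windows to fit."""
    r = patch // 2
    ref = image[y - r:y + r + 1, x - r:x + r + 1].astype(np.float64)
    num = den = 0.0
    for j in range(y - search, y + search + 1):
        for i in range(x - search, x + search + 1):
            cand = image[j - r:j + r + 1, i - r:i + r + 1].astype(np.float64)
            w = np.exp(-np.sum((ref - cand) ** 2) / (h * h))
            num += w * image[j, i]
            den += w
    return num / den
```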

Yifei Lou, Paolo Favaro, Stefano Soatto, Andrea Bertozzi
FESID: Finite Element Scale Invariant Detector

Recently, finite element based methods have been used to develop gradient operators for edge detection that have improved angular accuracy over standard techniques. Interest point detectors have become a more prominent issue in the field of image processing, and to this end we expand upon this research, developing a finite element scale-invariant interest point detector based on the same multi-scale approach used in the SURF detector. The operator differs in that the autocorrelation matrix is used to select the interest point location, and the derivative and smoothing operations are combined into one operator developed through the use of the finite element framework.

Dermot Kerr, Sonya Coleman, Bryan Scotney

2D and 3D Segmentation

Conditions for Segmentation of 2D Translations of 3D Objects

Various computer vision applications involve recovery and estimation of multiple motions from images of dynamic scenes. The exact nature of objects’ motions and the camera parameters are often not known a priori, and therefore the most general motion model (the fundamental matrix) is applied. Although the estimation of a fundamental matrix and its use for motion segmentation are well understood, the conditions governing the feasibility of segmentation for different types of motions are yet to be discovered. In this paper, we study the feasibility of separating 2D translations of 3D objects in a dynamic scene. We show that successful segmentation of 2D translations depends on the magnitude of the translations, the average distance between the camera and the objects, the focal length of the camera and the level of noise. An extensive set of controlled experiments using both synthetic and real images was conducted to show the validity of the proposed constraints. In addition, we quantified the conditions for successful segmentation of 2D translations in terms of the magnitude of those translations and the average distance between the camera and the objects in motion, for a given camera. These results are of particular importance for practitioners designing solutions for computer vision problems.

Shafriza Nisha Basah, Alireza Bab-Hadiashar, Reza Hoseinnezhad
Segmentation of Wood Fibres in 3D CT Images Using Graph Cuts

To completely segment all individual wood fibres in volume images of fibrous materials presents a challenging problem, but is important for understanding the micro-mechanical properties of composite materials. This paper presents a filter that identifies and closes pores in wood fibre walls, simplifying the shape of the fibres. After this filter, a novel segmentation method based on graph cuts identifies individual fibres. The methods are validated on a realistic synthetic fibre data set and then applied to μCT images of wood fibre composites.

Erik L. G. Wernersson, Anders Brun, Cris L. Luengo Hendriks
Semantic-Based Segmentation and Annotation of 3D Models

3D objects have become widely available and used in different application domains. It is thus becoming fundamental to use, integrate and develop techniques for extracting and maintaining their embedded knowledge. These techniques should be encapsulated in portable and intelligent systems able to semantically annotate 3D object models in order to improve their usability and indexing, especially in innovative web cooperative environments. We have lately been moving in this direction, with the definition and development of data structures, methods and interfaces for structuring and semantically annotating complex 3D models (and scenes), even those changing in time, according to ontology-driven metadata and ontology-driven processes. Here, we concentrate on the tools for segmenting manifold 3D models and on the underlying structural representation that we build and manipulate. We also describe the first prototype of an annotation tool which allows hierarchical, semantic-driven tagging of the segmented model and provides an interface from which the user can inspect and browse the entire segmentation graph.

Laura Papaleo, Leila De Floriani

Feature Extraction and Image Analysis

Reducing Keypoint Database Size

Keypoints are high dimensional descriptors for local features of an image or an object. Keypoint extraction is the first task in various computer vision algorithms, where the keypoints are then stored in a database used as the basis for comparing images or image features. Keypoints may be based on image features extracted by feature detection operators or on a dense grid of features. Both ways produce a large number of features per image, causing both time and space performance challenges when upscaling the problem.

We propose a novel framework for reducing the size of the keypoint database by learning which keypoints are beneficial for a specific application and using this knowledge to filter out a large portion of the keypoints. We demonstrate this approach on an object recognition application that uses a keypoint database. By using leave-one-out K-nearest-neighbor regression we significantly reduce the number of keypoints with a relatively small reduction in performance.
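
One plausible reading of this scheme, sketched below under assumed details: score each keypoint by the fraction of its nearest neighbours (excluding itself, i.e. leave-one-out) that share its object label, then prune low-scoring keypoints. The threshold and the label-agreement criterion are assumptions for illustration:

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def keypoint_utility(descriptors, labels, k=5):
    """Utility of each keypoint: fraction of its k nearest neighbours
    (self excluded, i.e. leave-one-out) carrying the same object label."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(descriptors)
    _, idx = nn.kneighbors(descriptors)
    neighbour_labels = labels[idx[:, 1:]]   # column 0 is the point itself
    return (neighbour_labels == labels[:, None]).mean(axis=1)

descriptors = np.random.rand(1000, 128)             # stand-in for SIFT-like data
labels = np.random.randint(0, 10, size=1000)        # object identity per keypoint
keep = keypoint_utility(descriptors, labels) > 0.4  # illustrative threshold
reduced_db = descriptors[keep]
```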

Shahar Jamshy, Eyal Krupka, Yehezkel Yeshurun
Automatic Estimation of the Inlier Threshold in Robust Multiple Structures Fitting

This paper tackles the problem of estimating the inlier threshold in RANSAC-like approaches to multiple model fitting. An iterative approach finds the maximum of a score function which resembles the Silhouette index used in clustering validation. Although several methods have been proposed to solve this problem for the single-model case, this is the first attempt to address multiple models. Experimental results demonstrate the performance of the algorithm.

Roberto Toldo, Andrea Fusiello
Integration of Contextual Information in Online Handwriting Representation

Robust handwriting recognition of complex patterns of arbitrary scale, orientation and location is yet elusive to date, as reaching a good recognition rate is not trivial for most of the application developments in this field. Cursive scripts with complex character shapes, such as Arabic and Persian, make the recognition task even more challenging. This complexity requires sophisticated representations and learning methods, and comprehensive data samples. A direct approach to achieving better performance is to focus on designing more powerful building blocks of a handwriting recognition system, namely pattern representation and pattern classification. In this paper we aim to scale up the efficiency of online recognition systems for Arabic characters by integrating novel representation techniques into efficient classification methods. We investigate the idea of incorporating two novel feature representations for online character data. We advocate the usefulness and practicality of these features in classification methods using neural networks and support vector machines. The combination of the proposed representations with related classifiers can offer a module for recognition tasks which can deal with any two-dimensional online pattern. Our empirical results confirm the higher distinctiveness and robustness to character deformations obtained by the proposed representation compared to currently available techniques.

Sara Izadi, Ching Y. Suen

Object Detection and Recognition

Plant Leaf Identification Using Multi-scale Fractal Dimension

Taxonomical classification of plants is a very complex and time-consuming task, mostly due to the great biodiversity of species and the fact that most measurements of plants are traditionally performed manually. This paper presents a novel approach to plant identification based on leaf texture. Initially, the texture is modelled as a surface, so that complexity analysis using multi-scale fractal dimension can be performed over the generated surface, resulting in a feature vector which represents texture complexity in terms of the spatial scale. The results show the potential of the approach, which outperforms traditional texture analysis methods such as co-occurrence matrices, Gabor filters and Fourier descriptors.
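
For orientation, the classical box-counting estimate underlying fractal-dimension features looks as follows; the multi-scale variant described above keeps the whole log-log curve as a feature vector rather than a single slope (the scales below are illustrative):

```python
import numpy as np

def box_counting_dimension(binary_img, scales=(2, 4, 8, 16, 32)):
    """Count boxes of several sizes that contain foreground pixels and
    fit the slope of log(count) against log(1/size)."""
    counts = []
    for s in scales:
        h = (binary_img.shape[0] // s) * s
        w = (binary_img.shape[1] // s) * s
        # Group the image into s-by-s boxes and mark occupied ones
        boxes = binary_img[:h, :w].reshape(h // s, s, w // s, s)
        counts.append(np.count_nonzero(boxes.any(axis=(1, 3))))
    slope, _ = np.polyfit(np.log(1.0 / np.asarray(scales)),
                          np.log(counts), 1)
    return slope
```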

André R. Backes, Odemir M. Bruno
Detection of a Hand Holding a Cellular Phone Using Multiple Image Features

Detection of a hand holding a cellular phone was developed to recognize whether someone is using a cellular phone while operating an automated teller machine (ATM). The purpose is to prevent money transfer fraud. Since a victim is told a bogus reason to transfer money and how to operate the machine through a cellular phone, detecting a working cellular phone is necessary.

However, detecting the cellular phone itself is unrealistic due to its variable colors and shapes. We therefore assumed that a user’s hand beside the face is holding a cellular phone, and decided to detect the hand instead.

The proposed method utilizes color, shape, and motion. Color and motion were used to compare the input to the face. Shape was used to compare the input to the standard hand pattern. The experimental result was a detection rate of 90.0% and a false detection rate of 3.2%, where 7,324 and 20,708 images were used respectively.

Hiroto Nagayoshi, Takashi Watanabe, Tatsuhiko Kagehiro, Hisao Ogata, Tsukasa Yasue, Hiroshi Sako
Object Detection by Estimating and Combining High-Level Features

Many successful object detection systems characterize object classes with a statistical profile over a large number of local features. We present an enhancement to this method that learns to assemble local features into features that capture more global properties such as body shape and color distribution. The system then learns to combine these estimated global features to improve object detection accuracy. In our approach, each candidate object detection from an off-the-shelf gradient-based detection system is transformed into a conditional random field. This CRF is used to extract a most likely object silhouette, which is then processed into features based on color and shape. Finally, we show that on the difficult Pascal VOC 2007 data set, detection rates can be improved by combining these global features with the local features from a state-of-the-art gradient based approach.

Geoffrey Levine, Gerald DeJong

Video Analysis and Processing

Video Event Classification Using Bag of Words and String Kernels

The recognition of events in videos is a relevant and challenging task of automatic semantic video analysis. At present, one of the most successful frameworks for object recognition tasks is the bag-of-words (BoW) approach. However, this approach does not model the temporal information of the video stream. In this paper we present a method to introduce temporal information within the BoW approach. Events are modeled as a sequence of histograms of visual features, computed from each frame using the traditional BoW model. The sequences are treated as strings where each histogram is considered as a character. Event classification of these sequences of variable size, depending on the length of the video clip, is performed using SVM classifiers with a string kernel that uses the Needleman-Wunsch edit distance. Experimental results on two datasets, soccer video and TRECVID 2005, demonstrate the validity of the proposed approach.
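
The Needleman-Wunsch alignment at the core of such a string kernel is standard dynamic programming; a minimal sketch over plain characters (in the setting above, each symbol would stand for a quantized per-frame BoW histogram):

```python
import numpy as np

def needleman_wunsch(a, b, match=1.0, mismatch=-1.0, gap=-1.0):
    """Global alignment score of two symbol sequences by dynamic
    programming over a (len(a)+1) x (len(b)+1) table."""
    n, m = len(a), len(b)
    D = np.zeros((n + 1, m + 1))
    D[:, 0] = gap * np.arange(n + 1)
    D[0, :] = gap * np.arange(m + 1)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            D[i, j] = max(D[i - 1, j - 1] + s,   # substitution / match
                          D[i - 1, j] + gap,     # deletion
                          D[i, j - 1] + gap)     # insertion
    return D[n, m]

print(needleman_wunsch("abcade", "abade"))  # similar sequences score high
```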

Lamberto Ballan, Marco Bertini, Alberto Del Bimbo, Giuseppe Serra
Video Sequences Association for People Re-identification across Multiple Non-overlapping Cameras

This paper presents a solution to the appearance-based people re-identification problem in a surveillance system comprising multiple cameras with different fields of view. We first utilize different color-based features, combined with several illuminant-invariant normalizations, in order to characterize the silhouettes in static frames. A graph-based approach, capable of learning the global structure of the manifold and preserving the properties of the original data in a lower-dimensional representation, is then introduced to reduce the effective working space and to realize the comparison of the video sequences. The global system was tested on a real dataset collected by two cameras installed on board a train. The experimental results show that the combination of color-based features, invariant normalization procedures and the graph-based approach leads to very satisfactory results.

Dung Nghi Truong Cong, Catherine Achard, Louahdi Khoudour, Lounis Douadi
Shadow Removal in Outdoor Video Sequences by Automatic Thresholding of Division Images

Several video-based applications, such as video surveillance, traffic monitoring, video annotation, etc., rely on the correct detection and tracking of moving objects within the observed scene. Even though several works have been proposed in the field of moving object detection, many of them do not consider the problem of segmenting real objects from their shadows. The shadow is considered part of the object, thus leading to possibly large errors in the subsequent steps of object localisation and tracking. In this paper we propose a shadow detection algorithm able to remove shadows from the blobs of moving objects, using division images and Expectation-Maximization histogram analysis. Experimental results prove that the use of the proposed method can significantly increase the performance of a video analysis system.
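
The division-image idea rests on a shadow darkening the background roughly multiplicatively. A minimal sketch with fixed ratio bounds; the method described above selects these thresholds automatically via Expectation-Maximization on the division-image histogram, so the constants here are assumptions for illustration:

```python
import numpy as np

def shadow_mask(frame, background, low=0.4, high=0.9, eps=1e-6):
    """A cast shadow scales the background down multiplicatively, so
    the per-pixel ratio frame/background falls in a band below 1;
    the fixed bounds here stand in for EM-selected thresholds."""
    ratio = frame.astype(np.float64) / (background.astype(np.float64) + eps)
    return (ratio > low) & (ratio < high)
```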

Srinivasa Rao Dammavalam, Claudio Piciarelli, Christian Micheloni, Gian Luca Foresti

Pattern Analysis and Classification

A New Generative Feature Set Based on Entropy Distance for Discriminative Classification

Score functions induced by generative models extract fixed-dimension feature vectors from data observations of different lengths by subsuming the process of data generation, projecting them into highly informative spaces called score spaces. In this way, standard discriminative classifiers such as support vector machines or logistic regressors have been shown to achieve higher performance than a solely generative or discriminative approach. In this paper, we present a novel score space that captures the generative process, encoding it in an entropic feature vector. In this way, both the uncertainty in the generative model learning step and the “local” compliance of data observations with respect to the generative process can be represented. The proposed score space is presented for hidden Markov models and mixtures of Gaussians, and is experimentally validated on standard benchmark datasets; moreover, it can be applied to any generative model. Results show that it achieves compelling classification accuracy.

Alessandro Perina, Marco Cristani, Umberto Castellani, Vittorio Murino
A Hybrid Approach Handling Imbalanced Datasets

Several binary classification problems exhibit imbalance in class distribution, influencing system learning. Indeed, traditional machine learning algorithms are biased towards the majority class, thus producing poor predictive accuracy over the minority one. To overcome this limitation, many approaches have been proposed to build artificially balanced training sets. Beyond their specific drawbacks, they achieve more balanced accuracy on each class at the cost of global accuracy. This paper first reviews the more recent methods coping with imbalanced datasets and then proposes a strategy overcoming their main drawbacks. It is based on an ensemble of classifiers trained on balanced subsets of the original imbalanced training set, working in conjunction with the classifier trained on the original imbalanced dataset. The performance of the method has been estimated on six public datasets, proving its effectiveness also in comparison with other approaches. It also offers the chance to modify the system behaviour according to the operating scenario.
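
A generic sketch of this kind of ensemble, with assumed details (undersampling to build the balanced subsets, majority voting, labels 0 = majority and 1 = minority); the actual combination rule in the paper may differ:

```python
import numpy as np
from sklearn.base import clone
from sklearn.tree import DecisionTreeClassifier

def balanced_ensemble(X, y, base=None, n_subsets=5, seed=0):
    """Train one classifier per balanced subset (minority class plus an
    equal-size random sample of the majority class), plus one classifier
    on the full imbalanced data."""
    base = base or DecisionTreeClassifier()
    rng = np.random.default_rng(seed)
    minority = np.flatnonzero(y == 1)
    majority = np.flatnonzero(y == 0)
    members = []
    for _ in range(n_subsets):
        sample = rng.choice(majority, size=len(minority), replace=False)
        idx = np.concatenate([minority, sample])
        members.append(clone(base).fit(X[idx], y[idx]))
    members.append(clone(base).fit(X, y))  # classifier on original data
    return members

def predict_majority_vote(members, X):
    votes = np.mean([m.predict(X) for m in members], axis=0)
    return (votes >= 0.5).astype(int)
```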

Paolo Soda
Nonlinear Embedded Map Projection for Dimensionality Reduction

We describe a dimensionality reduction method used to perform similarity search that is tested on document image retrieval applications. The approach is based on data point projection into a low dimensional space obtained by merging together the layers of a Growing Hierarchical Self Organizing Map (GHSOM) trained to model the distribution of objects to be indexed. The low dimensional space is defined by embedding the GHSOM sub-maps in the space defined by a non-linear mapping of neurons belonging to the first level map. The latter mapping is computed with the Sammon projection algorithm.

The dimensionality reduction is used in a similarity search framework whose aim is to efficiently retrieve similar objects on the basis of the distance among projected points corresponding to high dimensional feature vectors describing the indexed objects.

We compare the proposed method with other dimensionality reduction techniques by evaluating the retrieval performance on three datasets.

Simone Marinai, Emanuele Marino, Giovanni Soda

Learning

A Riemannian Self-Organizing Map

We generalize the classic self-organizing map (SOM) in flat Euclidean space (a linear manifold) to a Riemannian manifold. Both sequential and batch learning algorithms for the generalized SOM are presented. Compared with the classical SOM, the most novel feature of the generalized SOM is that it can learn the intrinsic topological neighborhood structure of the underlying Riemannian manifold that fits the input data. We compare the performance of the generalized SOM and the classical SOM using real 3-dimensional facial surface normal data. Experimental results show that the generalized SOM outperforms the classical SOM when the data lie on a curved Riemannian manifold.
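
For contrast, one sequential update step of the classical flat-Euclidean SOM that is being generalized here; the learning-rate and neighbourhood schedules are illustrative assumptions:

```python
import numpy as np

def som_step(weights, grid, x, t, lr0=0.5, sigma0=2.0, tau=200.0):
    """One sequential update: find the best-matching unit (BMU), then
    pull every prototype toward the input, weighted by a Gaussian
    neighbourhood kernel on the lattice and a decaying learning rate."""
    lr = lr0 * np.exp(-t / tau)
    sigma = sigma0 * np.exp(-t / tau)
    bmu = np.argmin(np.linalg.norm(weights - x, axis=1))
    lattice_d2 = np.sum((grid - grid[bmu]) ** 2, axis=1)
    h = np.exp(-lattice_d2 / (2.0 * sigma ** 2))
    weights += lr * h[:, None] * (x - weights)
    return weights

# 10x10 lattice of prototypes in 3D (e.g. surface-normal data)
grid = np.array([(i, j) for i in range(10) for j in range(10)], float)
weights = np.random.rand(100, 3)
weights = som_step(weights, grid, x=np.random.rand(3), t=0)
```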

Dongjun Yu, Edwin R. Hancock, William A. P. Smith
Towards a Theoretical Framework for Learning Multi-modal Patterns for Embodied Agents

Multi-modality is a fundamental feature that characterizes biological systems and lets them achieve high robustness in understanding skills while coping with uncertainty. Relatively recent studies showed that multi-modal learning is a potentially effective add-on to artificial systems, allowing the transfer of information from one modality to another. In this paper we propose a general architecture for jointly learning visual and motion patterns: by means of regression theory we model a mapping between the two sensorial modalities improving the performance of artificial perceptive systems. We present promising results on a case study of grasp classification in a controlled setting and discuss future developments.

Nicoletta Noceti, Barbara Caputo, Claudio Castellini, Luca Baldassarre, Annalisa Barla, Lorenzo Rosasco, Francesca Odone, Giulio Sandini
The Impact of Reliability Evaluation on a Semi-supervised Learning Approach

In self-training methods, unlabeled samples are first assigned a provisional label by the classifier, and then used to extend the training set of the classifier itself. For this latter step it is important to choose only the samples whose classification is likely to be correct, according to a suitably defined reliability measure.

In this paper we want to study to what extent the choice of a particular technique for evaluating the classification reliability can affect the learning performance. To this aim, we have compared five different reliability evaluators on four publicly available datasets, analyzing and discussing the obtained results.
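
A generic self-training loop of the kind studied, sketched with one possible reliability evaluator (the maximum posterior probability); the paper compares five such evaluators, so this choice is an assumption for illustration:

```python
import numpy as np
from sklearn.base import clone

def self_train(clf, X_lab, y_lab, X_unlab, reliability_thr=0.9, rounds=5):
    """Assign provisional labels to the unlabeled pool, keep only those
    whose reliability exceeds a threshold, extend the training set and
    refit. Requires a classifier exposing predict_proba."""
    X_l, y_l, pool = X_lab, y_lab, X_unlab
    model = clone(clf).fit(X_l, y_l)
    for _ in range(rounds):
        if len(pool) == 0:
            break
        proba = model.predict_proba(pool)
        reliable = proba.max(axis=1) >= reliability_thr
        if not reliable.any():
            break
        provisional = model.classes_[proba[reliable].argmax(axis=1)]
        X_l = np.vstack([X_l, pool[reliable]])
        y_l = np.concatenate([y_l, provisional])
        pool = pool[~reliable]
        model = clone(clf).fit(X_l, y_l)
    return model
```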

Pasquale Foggia, Gennaro Percannella, Carlo Sansone, Mario Vento

Graphs and Trees

On the Training Patterns Pruning for Optimum-Path Forest

The Optimum-Path Forest (OPF) classifier is a novel graph-based supervised pattern recognition technique that has been demonstrated to be superior to Artificial Neural Networks and similar to Support Vector Machines, but much faster. The OPF classifier reduces the problem of pattern recognition to the computation of an optimum-path forest in the feature space induced by a graph, creating discrete optimal partitions, which are optimum-path trees rooted at prototypes, i.e., key samples that compete among themselves trying to conquer the remaining samples. Some applications, such as medical specialist systems for image-based disease identification, need to be constantly re-trained with new instances (diagnostics) to achieve a better generalization of the problem, which requires large storage devices due to the large amount of generated data (millions of voxels). To this end, we present a pruning algorithm for the OPF classifier that identifies the most irrelevant samples and eliminates them from the training set, without compromising the classifier’s accuracy.

João P. Papa, Alexandre X. Falcão
Learning Class Specific Graph Prototypes

This paper describes how to construct a graph prototype model from a large corpus of multi-view images using local invariant features. We commence by representing each image with a graph, which is constructed from a group of selected SIFT features. We then propose a new pairwise clustering method based on a graph matching similarity measure. The positive example graphs of a specific class accompanied with a set of negative example graphs are clustered into one or more clusters, which minimize an entropy function. Each cluster is simplified into a tree structure composed of a series of irreducible graphs, and for each of which a node co-occurrence probability matrix is obtained. Finally, a recognition oriented class specific graph prototype (CSGP) is automatically generated from the given graph set. Experiments are performed on over 50K training images spanning ~500 objects and over 20K test images of 68 objects. This demonstrates the scalability and recognition performance of our model.

Shengping Xia, Edwin R. Hancock
Tree Covering within a Graph Kernel Framework for Shape Classification

Shape classification using graphs and skeletons usually involves edition processes in order to reduce the influence of structural noise. On the other hand, graph kernels provide a rich framework in which many classification algorithms may be applied to graphs. However, edit distances cannot be readily used within the kernel machine framework, as they generally lead to indefinite kernels. In this paper, we propose a graph kernel based on bags of paths and edit operations which remains positive definite with respect to the bags. The robustness of this kernel is based on a selection of the paths according to their relevance in the graph. Several experiments prove the efficiency of this approach compared to alternative kernels.

François-Xavier Dupé, Luc Brun

Applications

Vision-Based Markerless Gaming Interface

The paper proposes a novel human machine interface for gaming applications based on computer vision. The key idea is to allow the user to interact with the game by simply moving a hand-held consumer-grade camera. Detection of natural features in the incoming video stream avoids instrumenting the scene with optical markers while preserving real-time computation and accuracy. The paper also presents a prototype videogame developed as a proof of concept of our camera-based gaming interface. Thanks to recent advances in real-time extraction and matching of natural features from images on mobile platforms, our proposal holds the potential to enable a new generation of camera-controlled videogames for hand-held mobile devices.

Pietro Azzari, Luigi Di Stefano
Quality Assessment of the MPEG-4 Scalable Video CODEC

In this paper, the performance of the emerging MPEG-4 SVC CODEC is evaluated. In the first part, a brief introduction to quality assessment and the development of the MPEG-4 SVC CODEC is given. After that, the test methodologies used are described in detail, followed by an explanation of the actual test scenarios. The main part of this work concentrates on the performance analysis of the MPEG-4 SVC CODEC, both objective and subjective. Please note that this document is only a shortened version of the assessment; further experimental results can be found in the extended version available at the Computing Research Repository (CoRR).

Florian Niedermeier, Michael Niedermeier, Harald Kosch
An Automatic Method for Counting Annual Rings in Noisy Sawmill Images

The annual ring pattern of a log end face is related to the quality of the wood. We propose a method for computing the number of annual rings on a log end face imaged in sawmill production. The method is based on the grey-weighted polar distance transform and registration of detected rings from two different directions. The method was developed and evaluated on noisy images captured in on-line sawmill production at a Swedish sawmill during 2008, using an industrial colour camera. We have also evaluated the method using synthetic data with different ring widths, ring eccentricities, and noise levels.

Kristin Norell
A License Plate Extraction Algorithm Based on Edge Statistics and Region Growing

This paper presents a license plate extraction method for gray-scale images, based on a combination of edge statistics and a two-step seeded region growing algorithm. The proposed region growing algorithm uses a dual criterion based on edge density and gray-scale intensity affinity. The proposed method aims at achieving license plate segmentation that fits the real plate boundaries better than existing methods. The robustness of the method has been tested experimentally, including examples of low-quality or poorly preserved plates and commercial or freight transport vehicles.

Manuel Vargas, Sergio L. Toral, Federico Barrero, Francisco Cortés
Boat Speed Monitoring Using Artificial Vision

This paper describes a method to detect boats moving on a wide water surface, measure their speed, and extract statistics, using a single image stream taken from a grayscale camera. The approach is based on a background subtraction technique combined with classification and tracking to improve robustness; it provides stable detection even with sea waves and strong light reflections. The method returns correct speed values within ±5% in 97% of use cases. The algorithm has been integrated in a speed warning prototype system on the Burano island in Venice, monitoring a 250 m wide channel slice. Images are captured by a high-resolution camera and processed on site in real time. Processing results can be accessed remotely for monitoring purposes. The system has been up and running for more than two years.

Alberto Broggi, Pietro Cerri, Paolo Grisleri, Marco Paterlini
Theia: Multispectral Image Analysis and Archaeological Survey

Theia is a software framework for multispectral image analysis. The design is grounded on the object-oriented approach and a model combining parallel computation with selective data processing. Multispectral images from the archaeological site of Aquileia, Italy, have been used as the experimental testbed in order to assess the effectiveness and performance of the system; satisfactory results are reported, and are quite promising towards the use of the framework as a dynamic, interactive interface to real-time data exploration and processing.

Vito Roberto, Massimiliano Hofer
A Self-updating Multiexpert System for Face Identification

Multibiometric systems can solve a number of problems of single-biometry approaches. A source of flaws for present systems, both single-biometric and multibiometric, can be found in the lack of dynamic update of parameters, which does not allow them to adapt to changes in the working settings. They are generally calibrated once and for all, so that they are tuned and optimized with respect to standard conditions. In this work we investigate an architecture where single-biometry subsystems work in parallel, yet exchanging information at fixed points, according to the N-Cross Testing Protocol. In particular, the integrated subsystems work on the same biometric feature, the face in this case, yet exploiting different classifiers. Subsystems collaborate at a twofold level, both for returning a common answer and for tuning to changing operating conditions. Results demonstrate that component collaboration increases system accuracy and allows identifying unstable subsystems.

Andrea F. Abate, Maria De Marsico, Michele Nappi, Daniel Riccio
Characterisation of Retinal Feature Points Applied to a Biometric System

In this work a methodology for the classification of retinal feature points is applied to a biometric system. The system is based on the extraction of feature points, namely bifurcations and crossovers, as the biometric pattern. In order to compare a pattern with one from a known individual, a matching process takes place between the two point sets. That matching task is performed by finding the best geometric transform between the sets, i.e. the transform leading to the highest number of matched points. The goal is to reduce the number of explored transforms by introducing a prior characterisation of the feature points. This is achieved with a constraint preventing two differently classified points from matching. The empirical reduction of transforms is about 20%.

David Calvo, Marcos Ortega, Manuel G. Penedo, José Rouco, Beatriz Remeseiro
Coastline Detection from SAR Images by Level Set Model

In this paper we present an innovative and automatic procedure to extract the coastline from SAR (Synthetic Aperture Radar) images using the level set model. This model consists of a PDE (Partial Differential Equation) governing the evolution of a curve, corresponding to the zero level of a 3D function called the level set function, until the curve reaches the edge of the region to be segmented. A coastline is the boundary between land and sea masses. Detecting the coastline is of fundamental importance when monitoring natural phenomena such as tides, coastal erosion and the dynamics of glaciers. SAR images, however, present problems arising from speckle noise and from the strong signal returned by rough or slight seas: in the case of a heavy sea, the signal produces an intensity similar to that of land, making it difficult to distinguish the coastline.

Maria Mercede Cerimele, Luigi Cinque, Rossella Cossu, Roberta Galiffa
Wavelet-Based Feature Extraction for Handwritten Numerals

We present a novel preprocessing technique for handwritten numeral recognition that relies on the extraction of multiscale features to characterize the classes. These features are obtained by means of different continuous wavelet transforms, which behave as scale-dependent bandpass filters and give information on the local orientation of the strokes. First, a shape-preserving, smooth and smaller version of the digit is extracted. Second, a complementary feature vector is constructed that captures certain properties of the digits, such as orientation, gradients and curvature at different scales. The accuracy with which the selected features describe the original digits is assessed with a neural network classifier of the multilayer perceptron (MLP) type. The proposed method gives satisfactory results regarding dimensionality reduction as well as recognition rates on the testing sets of the CENPARMI and MNIST databases; the recognition rate is 92.60% for the CENPARMI database and 98.22% for the MNIST database.

Diego Romero, Ana Ruedin, Leticia Seijas
A Texture Based Shoe Retrieval System for Shoe Marks of Real Crime Scenes

Shoeprints found on the crime scene contain useful information for the investigator: being able to identify the make and model of the shoe that left the mark on the crime scene is important for the culprit identification. Semi-automatic and automatic systems have already been proposed in the literature to face the problem, however all previous works have dealt with synthetic cases, i.e. shoe marks which have not been taken from a real crime scene but are artificially generated with different noise adding techniques.

Here we propose a descriptor based on the Mahalanobis distance for the retrieval of shoeprint images. The performance of the proposed descriptor is tested on real crime scene shoe marks, and the results are promising.

Francesca Dardi, Federico Cervelli, Sergio Carrato
Encephalic NMR Tumor Diversification by Textural Interpretation

The novel technologies used in different application domains make it possible to obtain digital images with highly complex informative content. This meaningful information is expressed by the textural skin that covers the objects represented in the images, and it can be exploited to interpret the semantic meaning of the images themselves. This paper provides a mathematical characterization, based on texture analysis, of the craniopharyngioma pathology, distinguishing it from other kinds of primary cerebral tumors. From this characterization a prototype has been developed, which first identifies potential abnormal masses inside the cerebral tissue and subsequently classifies them as possible craniopharyngiomas.

Danilo Avola, Luigi Cinque
Pathnodes Integration of Standalone Particle Filters for People Tracking on Distributed Surveillance Systems

In this paper, we present a new approach to object tracking based on batteries of particle filters working in multicamera systems with non-overlapping fields of view. In each view the moving objects are tracked with independent particle filters; each filter exploits a likelihood function based on both color and motion information. The consistent labeling of people exiting from one camera's field of view and entering a neighboring one is obtained by sharing particle information for the initialization of new filtering trackers. The information exchange algorithm is based on path-nodes, a graph-based scene representation usually adopted in computer graphics. The approach has been tested in cases of simultaneous transitions, occlusions, and groups of people. Promising results, obtained using a real setup of non-overlapping cameras, are presented.

Roberto Vezzani, Davide Baltieri, Rita Cucchiara
On Improving the Efficiency of Eigenface Using a Novel Facial Feature Localization

Face recognition is the most popular non-intrusive biometric technique with numerous applications in commerce, security and surveillance. Despite its good potential, most of the face recognition methods in the literature are not practical due to the lack of robustness, slow recognition, and semi-manual localizations. In this paper, we improve the robustness of eigenface-based systems with respect to variations in illumination level, pose and background. We propose a new method for face cropping and alignment which is fully automated and we integrate this method in Eigenface algorithm for face recognition. We also investigate the effect of various preprocessing techniques and several distance metrics on the overall system performance. The evaluation of this method under single-sample and multi-sample recognition is presented. The results of our comprehensive experiments on two databases, FERET and JRFD, show a significant gain compared to basic Eigenface method and considerable improvement with respect to recognition accuracy when compared with previously reported results in the literature.

Aleksey Izmailov, Adam Krzyżak
Semiotic Design of a Hypermedia for Cultural Heritage

The paper proposes a semiotic model inspired by the narrative, structural framework of A. J. Greimas, applied to the design and analysis of a hypermedia. The framework is structured into four levels of signification and allows for a constructive process of semantic aggregation. We follow such a process in order to explore the conceptual organization of the model, and we construct a hypermedia regarding a collection of ancient mosaics from a Roman villa. Our results indicate that the proposed approach opens novel perspectives and looks promising towards the definition of semiotic methodologies for hypermedia design.

Vito Roberto, Elio Toppano
An Experimental Analysis of the Relationship between Biometric Template Update and the Doddington’s Zoo: A Case Study in Face Verification

The problem of biometric template representativeness has recently attracted much attention with the introduction of several template update methods. Automatic template update methods adapt themselves to the intra-class variations of the input data. However, it is possible to hypothesize that the effect of template updating may not be the same for all clients, due to the different characteristics of the clients present in the biometric database. The goal of this paper is to investigate this hypothesis by explicitly partitioning clients into different groups of the “Doddington’s zoo” as a function of their “intrinsic” characteristics, and studying the effect of the state-of-the-art template “self update” procedure on these different groups. Experimental evaluation on the Equinox database, with a case study on a face verification system based on the EBGM algorithm, shows strong evidence of non-uniform update effects on the different client classes and suggests modifying the update procedures according to the clients’ characteristics.

Ajita Rattani, Gian Luca Marcialis, Fabio Roli
Similarity Searches in Face Databases

In this paper the problem of similarity searches in face databases is addressed. An approach based on relevance feedback is proposed to iteratively improve the query result. The approach is suitable for both supervised and unsupervised contexts. The efficacy of the learning procedures is confirmed by the results obtained on publicly available face databases.

Annalisa Franco, Dario Maio
Vision-Based Motion Capture of Interacting Multiple People

Vision-based motion capture is getting popular for acquiring human motion information in various interactive applications. To enlarge its applicability, we have been developing a vision-based motion capture system which can estimate the postures of multiple people simultaneously using multiview image analysis. Our approach is divided into two phases: first, extraction, or segmentation, of each person in the input multiview images; then, single-person posture analysis applied to the segmented region of each person. The segmentation is realized in the voxel space, which is reconstructed by visual cone intersection of multiview silhouettes; a graph cut algorithm is employed to achieve optimal segmentation. Posture analysis follows a model-based approach, where a skeleton model of the human figure is matched with the multiview silhouettes using a particle filter and physical constraints on human body movement. Several experimental studies show that the proposed method acquires the postures of multiple people correctly and efficiently even when they touch each other.

Hiroaki Egashira, Atsushi Shimada, Daisaku Arita, Rin-ichiro Taniguchi
Probabilistic Corner Detection for Facial Feature Extraction

After more than 35 years of research, face processing is nowadays considered one of the most important applications of image analysis. It can be seen as a collection of problems (i.e., face detection, normalization, recognition and so on), each of which can be treated separately. Some face detection and face recognition techniques have reached a certain level of maturity; however, facial feature extraction still represents the bottleneck of the entire process. In this paper we present a novel facial feature extraction approach that could be used to normalize Viola-Jones detected faces so that they can be recognized by an appearance-based face recognition method. For each observed feature a prior distribution is computed and used as a boost map to filter the Harris corner detector response, producing more feature candidates in the interest region while discarding external values. Tests have been performed on both the AR and BioID databases using approximately 1750 faces, and the experimental results are very encouraging.

Edoardo Ardizzone, Marco La Cascia, Marco Morana
License Plate Detection and Character Recognition

In this paper we describe an approach based on an infrared camera and novel methods to detect license plates on the rear side of a vehicle in still images or video streams. Particular attention is paid to discovering the plate area by edge search on each side of the plate and reconstruction of its rectangular shape. The recognized plate area is rotated and adjusted for better character separation. A top-hat morphological operator is used to extract characters from the plate background. Each single character inside the plate area is separated even in the case of a tilted shape: the approach slices the plate vertically, following the character profile encountered along the vertical path. Pattern matching based on a modified Yule dissimilarity measure is used for character recognition. Performance on 200 images is discussed.

Fabio Caccia, Roberto Marmo, Luca Lombardi
Towards a Subject-Centered Analysis for Automated Video Surveillance

In a typical video surveillance framework, a single camera or a set of cameras monitor a scene in which human activities are carried out. In this paper, we propose a complementary framework where human activities can be analyzed from a subjective point of view. The idea is to represent the focus of attention of a person in the form of a 3D view frustum, and to insert it into a 3D representation of the scene. This leads to novel inferences and reasoning about the scene and the people acting in it. As a particular application of the proposed framework, we collect the information from the subjective view frusta in an Interest Map, i.e. a map that gathers, in an effective and intuitive way, which parts of the scene are observed more often in a defined time interval. The experimental results on standard benchmark data confirm the effectiveness of the proposed framework, encouraging further efforts for the development of novel applications in the same direction.

Michela Farenzena, Loris Bazzani, Vittorio Murino, Marco Cristani
Geometric-Aligned Cancelable Fingerprint Templates

A minutiae encryption algorithm based on geometric transformation of minutiae positions is proposed to generate cancelable fingerprint templates. A geometric transformation is used for alignment. A parameter-controlled minutiae encryption is performed within a local area to generate a cancelable minutiae template, and then all local encryption results are superimposed to form a protected template. The parameters controlling the minutiae encryption are generated independently of the geometrically aligned minutiae, which ensures solid non-invertibility compared to cancelable-template algorithms that use the to-be-encrypted minutiae information itself as parameters.

Bian Yang, Christoph Busch, Mohammad Derawi, Patrick Bours, Davrondzhon Gafurov
A Hybrid Approach to Land Cover Classification from Multi Spectral Images

This work is part of a wider project whose general objective is to develop a methodology for the automatic classification, based on CORINE land-cover (CLC) classes, of high-resolution multispectral IKONOS images. The specific objective of this paper is to describe a new methodology for producing readily exploitable results from automatic classification algorithms. Input data are basically constituted by multispectral images, integrated with textural and contextual measures. The output is an image with each pixel assigned to one of 15 classes at the second level of the CLC legend or left unclassified (often a better outcome than a classification error), plus a stability map that helps users separate the regions classified with high accuracy from those whose classification should be verified before being used.

Primo Zingaretti, Emanuele Frontoni, Eva Savina Malinverni, Adriano Mancini
Improving the Accuracy of a Score Fusion Approach Based on Likelihood Ratio in Multimodal Biometric Systems

Multimodal biometric systems integrate information from multiple sources to improve the performance of a typical unimodal biometric system. Among the possible information fusion approaches, those based on the fusion of match scores are the most commonly used. Recently, a framework for the optimal combination of match scores based on the likelihood ratio (LR) test has been presented. It models the distributions of genuine and impostor match scores as finite Gaussian mixture models. In this paper, we propose two strategies for improving the performance of the LR test. The first employs a voting strategy to circumvent the need for huge training datasets, while the second uses a sequential test to improve the classification accuracy on genuine users.

Experiments on the NIST multimodal database confirmed that the proposed strategies can outperform the standard LR test, especially when a multibiometric system must accept no impostors.
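
A minimal sketch of the baseline LR test (fit Gaussian mixtures to genuine and impostor score vectors, threshold the log-likelihood ratio); the component count, the threshold and the synthetic score data are illustrative assumptions:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def fit_log_lr(genuine, impostor, n_components=3):
    """Fit finite Gaussian mixtures to genuine and impostor match-score
    vectors and return a log-likelihood-ratio function; accept a probe
    when the log-LR exceeds a threshold chosen for the target FAR."""
    g = GaussianMixture(n_components=n_components).fit(genuine)
    i = GaussianMixture(n_components=n_components).fit(impostor)
    return lambda s: g.score_samples(s) - i.score_samples(s)

# Rows are samples, columns are matchers (e.g. face and fingerprint)
genuine = np.random.normal([0.8, 0.7], 0.1, size=(500, 2))
impostor = np.random.normal([0.3, 0.4], 0.1, size=(500, 2))
log_lr = fit_log_lr(genuine, impostor)
accept = log_lr(np.array([[0.75, 0.65]])) > 0.0   # illustrative threshold
```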

Emanuela Marasco, Carlo Sansone
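
A minimal sketch of the likelihood-ratio fusion idea described above, using scikit-learn Gaussian mixtures on synthetic two-matcher scores; the component counts, score distributions and threshold are illustrative assumptions, not the authors' settings:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Synthetic match scores from two matchers: rows = samples, cols = matchers.
genuine = rng.normal(loc=[0.8, 0.7], scale=0.10, size=(500, 2))
impostor = rng.normal(loc=[0.3, 0.4], scale=0.15, size=(500, 2))

# Model each class of scores with a finite Gaussian mixture.
gmm_gen = GaussianMixture(n_components=3, random_state=0).fit(genuine)
gmm_imp = GaussianMixture(n_components=3, random_state=0).fit(impostor)

def log_likelihood_ratio(scores):
    """log p(s | genuine) - log p(s | impostor) for each score vector."""
    return gmm_gen.score_samples(scores) - gmm_imp.score_samples(scores)

# Accept when the LR exceeds a threshold chosen for a target error rate.
test = np.vstack([genuine[:5], impostor[:5]])
print(log_likelihood_ratio(test) > 0.0)
```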
A 3D Scanner for Transparent Glass

Many practical tasks in industry, such as automatic inspection or robot vision, often require the scanning of three-dimensional shapes by use of non-contact techniques. However, few methods have been proposed to measure three-dimensional shapes of transparent objects because of the difficulty of dealing with transparency and specularity of the surface. This paper presents a 3D scanner for transparent glass objects based on Scanning From Heating (SFH), a new method that makes use of local surface heating and thermal imaging.

Gonen Eren, Olivier Aubreton, Fabrice Meriaudeau, L. A. Sanchez Secades, David Fofi, A. Teoman Naskali, Frederic Truchetet, Aytul Ercil
Red Eye Detection through Bag-of-Keypoints Classification

Red eye artifacts are a well-known problem in digital photography. Small compact devices and point-and-click usage, typical of non-professional photography, greatly increase the likelihood for red eyes to appear in acquired images. Automatic detection of red eyes is a very challenging task, due to the variability of the phenomenon and the general difficulty in reliably discerning the shape of eyes.

This paper presents a method for discriminating between red eyes and other objects in a set of red eye candidates. The proposed method performs feature-based image analysis and classification based solely on the bag-of-keypoints paradigm. Experiments involving different keypoint detectors/descriptors are performed. The achieved results are presented, as well as directions for future work.

Sebastiano Battiato, Mirko Guarnera, Tony Meccio, Giuseppe Messina
A New Large Urdu Database for Off-Line Handwriting Recognition

A new large Urdu handwriting database, which includes isolated digits, numeral strings with/without decimal points, five special symbols, 44 isolated characters, 57 Urdu words (mostly finance-related), and Urdu dates in different patterns, was designed at the Centre for Pattern Recognition and Machine Intelligence (CENPARMI). It is the first database for Urdu off-line handwriting recognition. It involves a large number of native Urdu speakers from different regions of the world. Moreover, the database comes in different formats – true color, gray level and binary. Experiments on Urdu digit recognition have been conducted, achieving an accuracy of 98.61%. Methodologies for image pre-processing, gradient feature extraction and classification using an SVM are described, and a detailed error analysis of the recognition results is presented.

Malik Waqas Sagheer, Chun Lei He, Nicola Nobile, Ching Y. Suen
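
The gradient-feature plus SVM pipeline the abstract describes can be sketched as follows; since the CENPARMI Urdu data is not bundled here, scikit-learn's small digits set stands in for it, and the HOG parameters are assumptions:

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from skimage.feature import hog

digits = load_digits()
# Simple gradient-orientation histogram per 8x8 digit image.
feats = np.array([
    hog(img, orientations=8, pixels_per_cell=(4, 4), cells_per_block=(1, 1))
    for img in digits.images
])
X_tr, X_te, y_tr, y_te = train_test_split(
    feats, digits.target, test_size=0.3, random_state=0)

clf = SVC(kernel="rbf", C=10, gamma="scale").fit(X_tr, y_tr)
print("test accuracy:", clf.score(X_te, y_te))
```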
Object Matching in Distributed Video Surveillance Systems by LDA-Based Appearance Descriptors

Establishing correspondences among object instances is still challenging in multi-camera surveillance systems, especially when the cameras’ fields of view are non-overlapping. Spatiotemporal constraints can help in solving the correspondence problem but still leave a wide margin of uncertainty. One way to reduce this uncertainty is to use appearance information about the moving objects in the site. In this paper we present the preliminary results of a new method that can capture salient appearance characteristics at each camera node in the network. A Latent Dirichlet Allocation (LDA) model is created and maintained at each node in the camera network. Each object is encoded in terms of the LDA bag-of-words model for appearance. The encoded appearance is then used to establish probable matching across cameras. Preliminary experiments are conducted on a dataset of 20 individuals and comparison against Madden’s I-MCHR is reported.

Liliana Lo Presti, Stan Sclaroff, Marco La Cascia
Analysis and Classification of Crithidia Luciliae Fluorescent Images

Autoantibody tests based on the Crithidia Luciliae (CL) substrate are the recommended method to detect Systemic Lupus Erythematosus (SLE), a serious illness classified as a disabling chronic disease. CL is a unicellular organism containing a strongly tangled mass of circular dsDNA, named the kinetoplast, whose fluorescence determines a positive test result. Conversely, the staining of other parts of the cell body is not a disease marker, thus representing false positive fluorescence. Such readings are subject to several issues limiting the reproducibility and reliability of the method, such as the photo-bleaching effect and inter-observer variability. Hence, Computer-Aided Diagnosis (CAD) tools can support physicians' decisions. In this paper we propose a system to classify CL wells based on a three-stage recognition approach, which classifies single cells, images and, finally, the well. The fusion of these different sources of information reduces the effect of misclassifications. The approach has been successfully tested on an annotated dataset, proving its feasibility.

Paolo Soda, Leonardo Onofri, Amelia Rigon, Giulio Iannello
Confidence Measures for Error Correction in Interactive Transcription of Handwritten Text

An effective approach to transcribing old text documents is to follow an interactive-predictive paradigm in which the system is guided by the human supervisor and the supervisor is assisted by the system, so as to complete the transcription task as efficiently as possible. In this paper, we focus on a particular system prototype called GIDOC, which can be seen as a first attempt to provide user-friendly, integrated support for interactive-predictive page layout analysis, text line detection and handwritten text transcription. More specifically, we focus on the handwriting recognition part of GIDOC, for which we propose the use of confidence measures to guide the human supervisor in locating possible system errors and deciding how to proceed. Empirical results are reported on two datasets, showing that a word error rate no larger than 10% can be achieved by checking only the 32% of words recognised with the lowest confidence.

Lionel Tarazón, Daniel Pérez, Nicolás Serrano, Vicent Alabau, Oriol Ramos Terrades, Alberto Sanchis, Alfons Juan
On the Quantitative Estimation of Short-Term Aging in Human Faces

Facial aging has been only partially studied in the past, and mostly in a qualitative way. This paper presents a novel approach to the estimation of facial aging aimed at the quantitative evaluation of the changes in facial appearance over time. In particular, the changes both in face shape and texture due to short-time aging are considered. The developed framework exploits the concept of “distinctiveness” of facial features and the temporal evolution of this measure. The analysis is performed both at a global and a local level to identify the features which are more stable over time.

Several experiments are performed on publicly available databases with image sequences densely sampled over a time span of several years. The reported results clearly show the potential of the methodology for a number of applications in biometric identification from human faces.

Marcos Ortega, Linda Brodo, Manuele Bicego, Massimo Tistarelli
3D Neural Model-Based Stopped Object Detection

In this paper we propose a system that is able to distinguish moving and stopped objects in digital image sequences taken from stationary cameras. Our approach is based on self-organization through artificial neural networks to construct a model of the scene background and a model of the scene foreground. It can handle scenes containing moving backgrounds or gradual illumination variations, helping to distinguish between moving and stopped foreground regions and leading to an initial segmentation of scene objects. Experimental results are presented for video sequences that represent typical situations critical for detecting vehicles stopped in no-parking areas, and are compared with those obtained by other existing approaches.

Lucia Maddalena, Alfredo Petrosino
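
A much-simplified illustration of the moving/stopped distinction (the paper itself uses self-organizing neural background/foreground models; this sketch only uses two running averages with different adaptation rates):

```python
import numpy as np

H, W = 4, 4
rng = np.random.default_rng(1)
bg_fast = np.full((H, W), 0.5)   # adapts quickly: absorbs stopped objects
bg_slow = np.full((H, W), 0.5)   # adapts slowly: keeps the empty scene

for t in range(300):
    frame = rng.normal(0.5, 0.01, (H, W))   # static scene with sensor noise
    if t > 100:
        frame[1, 1] = 1.0                   # an object stops at pixel (1, 1)
    bg_fast = 0.90 * bg_fast + 0.10 * frame
    bg_slow = 0.999 * bg_slow + 0.001 * frame

moving = np.abs(frame - bg_fast) > 0.2           # not yet absorbed by fast model
stopped = (np.abs(frame - bg_slow) > 0.2) & ~moving
print("stopped pixels:", np.argwhere(stopped))   # -> [[1 1]]
```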
A New Linguistic-Perceptual Event Model for Spatio-Temporal Event Detection and Personalized Retrieval of Sports Video

This paper proposes a new linguistic-perceptual event model tailored to spatio-temporal event detection and conceptual-visual personalized retrieval of sports video sequences. The major contributions of the proposed model are its hierarchical structure, the independence between the linguistic and perceptual parts, and the ability to capture temporal information of sports events. Thanks to these contributions, it is easy to upgrade model events from simple to complex levels, either by self-studying from inner knowledge or by being taught from plugged-in additional knowledge. Thus, the proposed model not only works well in poorly structured environments but is also able to adapt itself to new domains without the need (or with few modifications) for external re-programming, re-configuring and re-adjusting. Thorough experimental results demonstrate that events are modeled and detected with high accuracy and automation, and that users’ expectations of personalized retrieval are well satisfied.

Minh-Son Dao, Sharma Ishan Nath, Noboru Babaguichi
Towards Protein Interaction Analysis through Surface Labeling

The knowledge of the biological function of proteins would have great impact on the identification of novel drug targets, and on finding the molecular causes of diseases. Unfortunately, the experimental determination of protein function is a very expensive and time consuming process. As a consequence, the development of computational techniques to complement and guide the experimental process is a crucial and fundamental step for biological analysis.

The final goal of the activity presented here is to provide a method that allows the identification of sites of possible protein-protein and protein-ligand interaction on the basis of the geometrical and topological structure of protein surfaces. The goal is then to discover complementary regions (that is, regions with concave and convex segments that match each other) among different proteins. In particular, we consider the first step of this process: the segmentation of the protein surface into protuberances and inlets through the analysis of convexity and concavity. To this end, two approaches are described, with a comparative assessment in terms of accuracy and speed of execution.

Virginio Cantoni, Riccardo Gatti, Luca Lombardi
A Semi-automated Method for the Measurement of the Fetal Nuchal Translucency in Ultrasound Images

Nowadays the measurement of the nuchal translucency thickness is used as part of routine ultrasound scanning towards the end of the first trimester of pregnancy, for the screening of chromosomal defects such as trisomy 21. Currently, the measurement is performed manually by physicians. The measurement can take a long time to accomplish, needs to be performed by highly skilled operators, and is prone to errors. In this paper we present an algorithm that automatically detects the border of the nuchal translucency, once a region of interest has been manually identified. The algorithm is based on the minimisation of a cost function, and the optimisation is performed using the dynamic programming paradigm. The method we present overcomes several of the drawbacks of state-of-the-art algorithms.

Ezio Catanzariti, Giovanni Fusco, Francesco Isgrò, Salvatore Masecchia, Roberto Prevete, Matteo Santoro
Video Background Segmentation Using Adaptive Background Models

This paper proposes an adaptive background model which combines the advantages of both the Eigenbackground and pixel-based Gaussian models. The method handles illumination changes through the Eigenbackground, while the Gaussian models detect chroma changes and remove shadow pixels. An adaptive strategy is used to integrate the two models. A binary graph cut implements the foreground/background segmentation, with data and smoothness terms developed for this purpose. We validate our method on indoor videos and test it on a benchmark video. Experiments demonstrate our method’s efficiency.

Xiaoyu Wu, Yangsheng Wang, Jituo Li
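
A hedged sketch of the Eigenbackground half of the combination: PCA on a stack of background frames, with foreground detected as a large reconstruction residual. The per-pixel Gaussian shadow handling and the graph-cut segmentation are omitted, and the threshold suits only this synthetic example:

```python
import numpy as np

rng = np.random.default_rng(0)
H, W, N = 8, 8, 50
scene = rng.random((H, W))
# Training frames: the same scene under fluctuating global illumination.
frames = np.stack([scene * rng.uniform(0.8, 1.2) for _ in range(N)])
X = frames.reshape(N, -1)

mean = X.mean(axis=0)
U, S, Vt = np.linalg.svd(X - mean, full_matrices=False)
basis = Vt[:3]                        # top eigen-backgrounds

def residual(frame):
    v = frame.ravel() - mean
    recon = basis.T @ (basis @ v)     # project onto the eigenspace and back
    return np.abs(v - recon).reshape(H, W)

test = scene.copy()
test[2:4, 2:4] = 1.5                  # a foreground blob
print((residual(test) > 0.2).astype(int))   # threshold fits this toy example
```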

Shape Analysis

A New Algorithm for Polygonal Approximation Based on Ant Colony Optimization

In shape analysis a crucial step consists in extracting meaningful features from digital curves. Dominant points are points of curvature extrema on the curve that can suitably describe it both for visual perception and for recognition. In this paper we present a novel method that combines dominant point detection and ant colony optimization search. The excellent results are compared both to works using an optimal search approach and to works based on an exact approximation strategy.

Cecilia Di Ruberto, Andrea Morgera
Recognition of Occluded Shapes Using Size Functions

The robustness against occlusions and the ability to perform not only global matching, but also partial matching are investigated in computer vision in order to evaluate the performance of shape descriptors. In this paper we consider the size function shape descriptor, and we illustrate some results about size functions of occluded shapes. Theoretical results indicate that size functions are able to detect a partial matching between shapes by showing a common subset of cornerpoints. Experiments are presented which outline the potential of the proposed approach in recognition tasks in the presence of occlusions.

Barbara Di Fabio, Claudia Landi, Filippo Medri
Discrete Distortion for Surface Meshes

Discrete distortion for two- and three-dimensional combinatorial manifolds is a discrete alternative to Ricci curvature known for differentiable manifolds. Here, we show that distortion can be successfully used to estimate mean curvature at any point of a surface. We compare our approach with the continuous case and with a common discrete approximation of mean curvature, which depends on the area of the star of each vertex in the triangulated surface. This provides a new, area-independent, tool for curvature estimation and for morphological shape analysis. We illustrate our approach through experimental results showing the behavior of discrete distortion.

Mohammed Mostefa Mesmoudi, Leila De Floriani, Paola Magillo

Face Analysis

Webcam-Based Visual Gaze Estimation

In this paper we combine a state of the art eye center locator and a new eye corner locator into a system which estimates the visual gaze of a user in a controlled environment (e.g. sitting in front of a screen). In order to keep the computational costs to a minimum, the eye corner locator is built upon the same technology as the eye center locator, tweaked for the specific task. If high mapping precision is not a priority of the application, we claim that the system can achieve acceptable accuracy without the requirement of additional dedicated hardware. We believe that this could bring new gaze based methodologies for human-computer interaction into the mainstream.

Roberto Valenti, Jacopo Staiano, Nicu Sebe, Theo Gevers
Real-Time Probabilistic Tracking of Faces in Video

In this note we discuss how real-time face detection and tracking in video can be achieved by relying on a Bayesian approach realized in a multi-threaded architecture. To this end we propose a probabilistic interpretation of the output provided by a cascade of AdaBoost classifiers. Results show that such an integrated approach is appealing with respect to both robustness and computational efficiency.

Giuseppe Boccignone, Paola Campadelli, Alessandro Ferrari, Giuseppe Lipori
A Robust Iris Localization Model Based on Phase Congruency and Least Trimmed Squares Estimation

Iris localization is a crucial step in iris recognition. Previously proposed algorithms perform unsatisfactorily due to eyelash interference and variations in image brightness. To address these problems, we propose a robust iris position estimation algorithm based on phase congruency analysis and LTSE (Least Trimmed Squares Estimation). By using this robust regression method to fit iris edge points, we can mitigate the eyelash occlusion problem to a certain extent. The experimental results demonstrate the validity of the algorithm.

Lili Pan, Mei Xie, Tao Zheng, Jianli Ren

Medical Imaging

Shape-Based Classification of 3D Head Data

Craniofacial disorders are among the most common categories of birth defects worldwide, and are an important topic of biomedical research. In order to better understand these disorders and correlate them with genetic patterns and life outcomes, researchers need to quantify the craniofacial anatomy. In this paper we introduce several different craniofacial descriptors that are being used in research studies for two craniofacial disorders: the 22q11.2 deletion syndrome (a genetic disorder) and plagiocephaly/brachycephaly, disorders caused by pressure on the head. Experimental results show that our descriptors are promising for quantifying craniofacial shape.

Linda G. Shapiro, Katarzyna Wilamowska, Indriyati Atmosukarto, Jia Wu, Carrie Heike, Matthew Speltz, Michael Cunningham
Sudden Changes Detection in WCE Video

The direct visual inspection of WCE video by an expert is a tiring and costly activity, and it is a true bottleneck to the widespread application of this diagnostic technique. In this paper we apply the texton approach to characterize, with a numeric indicator, the sub-sequences of a WCE that show sharp changes and that are likely to represent relevant medical details. Experiments show that the proposed fully automatic technique may safely reduce the amount of frames that need further examination by up to 70%.

Giovanni Gallo, Eliana Granata, Giuseppe Scarpulla
Denoising of Digital Radiographic Images with Automatic Regularization Based on Total Variation

We report here a principled method for setting the regularization parameter in total variation filtering, based on the analysis of the distribution of gray levels in the noisy image. We also report the results of an experimental investigation of the application of this framework to very low photon count digital radiography, which shows the effectiveness of the method in denoising such images. Total variation regularization leads to a non-linear optimization problem that is solved here with a new-generation adaptive first order method. The results suggest further investigation of both the convergence criteria and the scheduling of the optimization parameters of this method.

Mirko Lucchese, N. Alberto Borghese
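
A minimal sketch of total-variation denoising with a data-driven regularization weight; the paper derives its parameter from the gray-level distribution of the noisy image, whereas the rule below (a multiple of a standard noise estimate) is only an assumed stand-in:

```python
import numpy as np
from skimage import data, util, restoration

noisy = util.random_noise(util.img_as_float(data.camera()), var=0.01)
sigma = restoration.estimate_sigma(noisy)        # rough noise-level estimate
denoised = restoration.denoise_tv_chambolle(noisy, weight=2.0 * sigma)
print("sigma:", sigma, "residual std:", np.std(noisy - denoised))
```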

Image Analysis and Pattern Recognition

A Real-Time Occlusion Aware Hardware Structure for Disparity Map Computation

Many machine vision applications deal with depth estimation in a scene. Disparity map recovery from a stereo image pair has been extensively studied by the computer vision community. Previous methods are mainly restricted to software-based techniques on general-purpose architectures, presenting relatively high execution times due to the computationally complex algorithms involved. In this paper a new hardware module suitable for real-time disparity map computation is realized. This enables a hardware-based occlusion-aware parallel-pipelined design, implemented on a single FPGA device with a typical operating frequency of 511 MHz. It provides accurate disparity map computation at a rate of 768 frames per second, given a stereo image pair with a disparity range of 80 pixels and a 640x480 pixel spatial resolution. The proposed method allows a fast disparity map computation module to be built, suitable for real-time stereo vision applications.

Christos Georgoulas, Ioannis Andreadis
A Graph-Based Approach for Shape Skeleton Analysis

This paper presents a novel methodology for shape characterization, where a shape skeleton is modeled as a dynamic graph, and degree measurements are computed to compose a set of shape descriptors. The proposed approach is evaluated in a classification experiment which considers a generic set of shapes. A comparison with traditional shape analysis methods, such as Fourier descriptors, Curvature, Zernike moments and Multi-scale Fractal Dimension, is also performed. Results show that the method is efficient for shape characterization tasks, in spite of the reduced amount of information present in the shape skeleton.

André R. Backes, Odemir M. Bruno
Reconnecting Broken Ridges in Fingerprint Images

In this paper, we present a new method for reconnecting broken ridges in fingerprint images. The method is based on the use of a discrete directional mask and on the standard deviation of the gray-levels to determine ridge direction. The obtained direction map is smoothed by counting the occurrences of the directions in a sufficiently large window. The fingerprint image is, then, binarized and thinned. Linking paths to connect broken ridges are generated by using a morphological transformation to guide the process.

Nadia Brancati, Maria Frucci, Gabriella Sanniti di Baja
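
The directional-mask idea can be illustrated as follows: for each pixel, gray levels are sampled along a set of discrete directions, and the direction with the smallest standard deviation is kept, since intensity varies least along a ridge. The window radius and the eight directions are assumptions for illustration:

```python
import numpy as np

def direction_map(img, radius=4, n_dirs=8):
    H, W = img.shape
    angles = np.pi * np.arange(n_dirs) / n_dirs
    pad = np.pad(img, radius, mode="edge")
    offs = np.arange(-radius, radius + 1)
    dmap = np.zeros((H, W), dtype=int)
    for y in range(H):
        for x in range(W):
            stds = []
            for a in angles:
                ys = (y + radius + np.round(offs * np.sin(a))).astype(int)
                xs = (x + radius + np.round(offs * np.cos(a))).astype(int)
                stds.append(pad[ys, xs].std())
            dmap[y, x] = int(np.argmin(stds))   # least variation = ridge direction
    return dmap

# Vertical ridges: intensity varies along x only, so index 4 (90 deg) dominates.
ridges = np.tile(np.sin(np.linspace(0, 6 * np.pi, 32)), (32, 1))
print(np.bincount(direction_map(ridges).ravel(), minlength=8))
```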
Optimum Sensors for ‘Chromaticity’ Constancy in the Pixel

In machine vision systems, recording the colour of an object is crucial for applications such as skin detection, and it will enhance applications including colour-based recognition and image retrieval. Unfortunately, almost none of the existing colour constancy algorithms have been designed to deal with the high dynamic ranges that can occur in external, naturally illuminated scenes. One algorithm that can deal with these scenes has been proposed by Finlayson and Drew. In this paper a method of assessing the performance of this algorithm, and equivalent algorithms, is proposed. The performance of the algorithm is then significantly improved by optimising the spectral response of the sensors used to obtain the data it requires. Since the resulting performance is comparable to that of the human visual system, it appears that this algorithm is capable of obtaining useful chromaticity information under highly varying illumination conditions.

Sivalogeswaran Ratnasingam, Steve Collins
An Improved Adaptive Smoothing Method

An improvement of Chen’s method is provided through the calculation of a more accurate H map. The H map holds each pixel’s contextual inhomogeneity value, reflecting its proximity to an edge feature, and a more accurate H value leads to a more accurate smoothing speed for the pixel. While experiments on 5 real images show slight improvements in SNR of our method over Chen’s, the edge-preserving capability has been enhanced, with low false alarm rates (FARs) for edge features extracted by applying the Sobel filter to the image. Furthermore, parameter values have been determined through an exhaustive search, resulting in the suggestions of h=0.4 and T=4 for practical applications where the original noise-free image is not available and/or no viewer is present to visually select the final smoothed image as the output.

Xin Hu, Hui Peng, Joseph Kesker, Xiang Cai, William G. Wee, Jing-Huei Lee
Objective Quality Assessment Measurement for Typhoon Cloud Image Enhancement

Various enhancement methods exist for satellite images; however, their visual quality is usually assessed by the human eye. This can result in misidentification and, in turn, in wrong predictions of the center and intensity of a typhoon. It is therefore necessary to find an objective measure to evaluate the visual quality of enhanced typhoon cloud images. To solve this problem, we give an objective assessment measure based on information content, contrast and peak signal-to-noise ratio. We design an experiment to validate the proposed measure using typhoon cloud images provided by the China Meteorological Administration and the China National Satellite Meteorological Center.

Changjiang Zhang, Juan Lu, Jinshan Wang
Fuzzy Smoothed Composition of Local Mapping Transformations for Non-rigid Image Registration

This paper presents a novel method for medical image registration. The global transformation is obtained by composing affine transformations, which are recovered locally from given landmarks. Transformations of adjacent regions are smoothed to avoid blocking artifacts, so that a unique continuous and differentiable global function is obtained. The composition is operated using a technique derived from fuzzy C-means clustering. The method was successfully tested on several datasets; results, both qualitative and quantitative, are shown. Comparisons with other methods are reported. Final considerations on the efficiency of the technique are given.

Edoardo Ardizzone, Roberto Gallea, Orazio Gambino, Roberto Pirrone
Many-to-Many Matching under the l1 Norm

The problem of object recognition can be formulated as matching feature sets of different objects. Segmentation errors and scale differences result in many-to-many matching of feature sets, rather than one-to-one. This paper extends a previous algorithm for many-to-many graph matching. The proposed work represents graphs, which correspond to objects, isometrically in geometric space under the l1 norm. Empirical evaluation of the algorithm on a set of recognition trials, including a comparison with the previous approach, demonstrates the efficacy of the overall framework.

M. Fatih Demirci, Yusuf Osmanlıoğlu
Evaluation of a Foreground Segmentation Algorithm for 3D Camera Sensors

Our interest focuses on the innovative time-of-flight (TOF) cameras. In this paper we present our approach to foreground segmentation, based on smart-seeded region growing. The seeding strategy makes use of the characteristic intensity signal provided by TOF cameras, and experimental measurements show that the growing produces pixel-wise segmentations of 82%–92% quality. Compared to background subtraction, our approach makes more explicit use of the unique capacity of TOF cameras to isolate foreground objects on the basis of their distance. Our work will find application in gate monitoring and passage surveillance.

Luca Bianchi, Piercarlo Dondi, Riccardo Gatti, Luca Lombardi, Paolo Lombardi
Image Quality Assessment Based on Multi-scale Geometric Analysis

A novel objective full-reference image quality assessment metric based on the Multi-scale Geometric Analysis (MGA) of the contourlet transform is proposed. The contourlet transform has excellent properties for image representation, such as multiresolution, localization and directionality, which are key characteristics of the human visual system. Utilizing the multiresolution and directionality of MGA, we extract the distortion of structural information across different vision scales and edge directions. The degradation of image quality is evaluated based on a defined energy of structural distortion. Performance experiments are carried out on a professional image quality database with five different distortion types. Compared with some state-of-the-art measures, the results demonstrate that the proposed method improves the accuracy and robustness of image quality prediction.

Mingna Liu, Xin Yang, Yanfeng Shang
Connected Component Labeling Techniques on Modern Architectures

In this paper we present an overview of the historical evolution of connected component labeling algorithms, in particular those applied to images stored in raster scan order. This brief survey aims at providing a comprehensive comparison of their performance on modern architectures, since the high availability of memory and the presence of caches make some solutions more suitable and faster. Moreover, we propose a new strategy for label propagation based on 2x2 blocks, which improves the performance of many existing algorithms. The tests are conducted on high resolution images obtained from digitized historical manuscripts, and a set of transformations is applied in order to show the algorithms' behavior at different image resolutions and with a varying number of labels.

Costantino Grana, Daniele Borghesani, Rita Cucchiara
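
For reference, a compact version of the classic two-pass, union-find labeling scheme that such surveys benchmark (the 2x2 block-propagation strategy proposed in the paper is not reproduced here):

```python
import numpy as np

def label(binary):
    H, W = binary.shape
    labels = np.zeros((H, W), dtype=int)
    parent = [0]                          # parent[i] = union-find parent of label i

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    nxt = 1
    for y in range(H):                    # first pass: provisional labels
        for x in range(W):
            if not binary[y, x]:
                continue
            up = labels[y - 1, x] if y else 0
            left = labels[y, x - 1] if x else 0
            if up == 0 and left == 0:
                parent.append(nxt)
                labels[y, x] = nxt
                nxt += 1
            else:
                cands = [l for l in (up, left) if l]
                m = min(find(l) for l in cands)
                labels[y, x] = m
                for l in cands:           # record label equivalences
                    parent[find(l)] = m
    flat = {}
    for y in range(H):                    # second pass: resolve equivalences
        for x in range(W):
            if labels[y, x]:
                r = find(labels[y, x])
                labels[y, x] = flat.setdefault(r, len(flat) + 1)
    return labels

img = np.array([[1, 0, 1], [1, 0, 1], [1, 1, 1]])
print(label(img))   # U-shaped region -> a single component
```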
Estimation of Object Position Based on Color and Shape Contextual Information

This paper presents a method to estimate the position of an object using contextual information. Although conventional methods use only shape contextual information, color contextual information is also effective for describing scenes. Thus we use both shape and color contextual information. To estimate the object position from contextual information alone, Support Vector Regression is used. We choose the Pyramid Match Kernel, which measures the similarity between histograms, because our contextual information is described as histograms. When a single kernel is applied to a feature vector consisting of both color and shape, the similarity of each feature is not used effectively. Thus, kernels are applied to color and shape independently, and the weighted sum of the outputs of both kernels is used. We confirm that the proposed method outperforms conventional methods.

Takashi Ishihara, Kazuhiro Hotta, Haruhisa Takahashi
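
A sketch of the weighted two-kernel combination described above, with histogram-intersection kernels standing in for the Pyramid Match Kernel and a precomputed-kernel SVR from scikit-learn; the data and the weight are synthetic assumptions:

```python
import numpy as np
from sklearn.svm import SVR

def intersection_kernel(A, B):
    # K[i, j] = sum_k min(A[i, k], B[j, k]) -- a valid PSD histogram kernel
    return np.minimum(A[:, None, :], B[None, :, :]).sum(axis=2)

rng = np.random.default_rng(0)
color = rng.random((60, 16)); color /= color.sum(1, keepdims=True)
shape = rng.random((60, 16)); shape /= shape.sum(1, keepdims=True)
y = 5 * color[:, 0] + 3 * shape[:, 3]        # toy regression target

w = 0.6                                      # weight between the two cues
K = (w * intersection_kernel(color, color)
     + (1 - w) * intersection_kernel(shape, shape))
model = SVR(kernel="precomputed").fit(K, y)
print("train R^2:", model.score(K, y))
```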
Hallucinating Faces: Global Linear Modal Based Super-Resolution and Position Based Residue Compensation

A learning-based face hallucination method is proposed in this paper for the reconstruction of a high-resolution face image from a low-resolution observation, based on a set of high- and low-resolution training image pairs. The proposed global linear modal based super-resolution estimates the optimal weights of all the low-resolution training images, and a high-resolution image is obtained by applying the estimated weights in the high-resolution space. Then, we propose a position-based local residue compensation algorithm to better recover subtle facial details. Experiments demonstrate that our method has advantages over several established methods.

Xiang Ma, Junping Zhang, Chun Qi
Joint Affine and Illumination Estimation Using Scale Manipulation Features

We present a novel image transform called Scale Manipulation Features (SMF). The transform calculates affine-invariant features of objects in a global manner and avoids any sort of edge detection. The transform can be used for the registration of affine-transformed images in the presence of non-homogeneous illumination changes, and for the estimation of the illumination changes. The computational load of the method is relatively low, since it is linear in the data size. In this paper we introduce the transform and demonstrate its applications to illumination compensation and to object registration in the presence of an affine geometric transformation and varying illumination.

Kobi Bentolila, Joseph M. Francos
Real-Time Descriptorless Feature Tracking

This paper presents a simple and efficient estimator of long-term sparse optical flow. It is supported by a novel approach to feature tracking, essentially based on global coherence of local movements. Expensive invariant appearance descriptors are not required: the locations of salient points in successive frames provide enough information to create a large number of accurate and stable tracking histories which remain alive for significantly long times. Hence, wide-baseline matching can be achieved both in extremely regular scenes and in cases in which corresponding points are photometrically very different. Our experiments show that this method is able to robustly maintain in real time hundreds of trajectories in long video sequences using a standard computer.

Antonio L. Rodríguez, Pedro E. López-de-Teruel, Alberto Ruiz
Adaptive Sharpening with Overshoot Control

This paper presents the Adaptive Sharpening with Overshoot Control (ASOC), an algorithm for digital image contrast enhancement. The ASOC exploits a properly defined band-pass filter in order to better discriminate the "uniform/not uniform" zones of the input (blurred) image. A more aggressive sharpening is then performed on the textured zones, whilst the homogeneous regions are preserved.

Antonio Buemi, Gaetano Santoro, Arcangelo Bruna, Mirko Guarnera
Applying Visual Object Categorization and Memory Colors for Automatic Color Constancy

This paper presents a framework for using high-level visual information to enhance the performance of automatic color constancy algorithms. The approach is based on recognizing special visual object categories, called here memory color categories, which have a relatively constant color (e.g. the sky). If such a category is found in an image, the initial white balance provided by a low-level color constancy algorithm can be adjusted so that the observed color of the category moves toward the desired color. The magnitude and direction of the adjustment are controlled by the learned characteristics of the particular category in the chromaticity space. The object categorization is performed using the bag-of-features method on raw camera data with reduced preprocessing and resolution. The proposed approach is demonstrated in experiments involving the standard gray-world and the state-of-the-art gray-edge color constancy methods. In both cases the introduced approach improves the performance of the original method.

Esa Rahtu, Jarno Nikkanen, Juho Kannala, Leena Lepistö, Janne Heikkilä
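
For context, the gray-world baseline that the framework adjusts can be written in a few lines (the memory-color categorization and the chromaticity-space correction are not shown):

```python
import numpy as np

def gray_world(img):
    """img: float RGB array in [0, 1] of shape (H, W, 3)."""
    channel_means = img.reshape(-1, 3).mean(axis=0)
    gains = channel_means.mean() / channel_means   # equalize channel averages
    return np.clip(img * gains, 0.0, 1.0)

rng = np.random.default_rng(0)
scene = rng.random((4, 4, 3)) * np.array([1.0, 0.8, 0.6])  # warm color cast
print(gray_world(scene).reshape(-1, 3).mean(axis=0))       # ~equal means
```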
Anomaly-Based Detection of IRC Botnets by Means of One-Class Support Vector Classifiers

The complexity of modern cyber attacks calls for detection and classification techniques more sophisticated than those based on the well-known signature detection approach. As a matter of fact, attackers try to deploy armies of controlled bots by infecting vulnerable hosts. Such bots are characterized by complex executable command sets, and take part in cooperative and coordinated attacks. Therefore, an effective detection technique should rely on a suitable model of both the envisaged networking scenario and the attacks targeting it.

We address the problem of detecting botnets by describing a behavioral model for a specific class of network users, together with a set of features that can be used to identify botnet-related activities. Tests performed using an anomaly-based detection scheme on a set of real network traffic traces confirmed the effectiveness of the proposed approach.

Claudio Mazzariello, Carlo Sansone
Detection of Duplicated Regions in Tampered Digital Images by Bit-Plane Analysis

In this paper we present a new method for searching for duplicated areas in a digital image. The goal is to detect whether an image has been tampered with by a copy-move process. Our method works within a convenient domain: the image to be analyzed is decomposed into its bit-plane representation. Then, for each bit-plane, blocks of bits are encoded with an ASCII code, and a sequence of strings is analyzed rather than the original bit-plane. The sequence is lexicographically sorted, similar groups of bits are extracted as candidate areas, and these are passed to the following plane to be processed. The output of the last plane indicates if, and where, the image has been altered.

Edoardo Ardizzone, Giuseppe Mazzola
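
A sketch of the bit-plane duplicate search: one bit plane is taken, fixed-size blocks are encoded as byte strings, the strings are sorted lexicographically, and identical neighbors are reported. The full method cascades candidates across planes; the block size and single-plane scope here are assumptions:

```python
import numpy as np

def duplicated_blocks(img, plane=7, bs=4):
    bits = (img >> plane) & 1                    # selected bit plane
    H, W = bits.shape
    entries = []
    for y in range(H - bs + 1):
        for x in range(W - bs + 1):
            entries.append((bits[y:y + bs, x:x + bs].tobytes(), (y, x)))
    entries.sort(key=lambda e: e[0])             # lexicographic sort
    return [(a[1], b[1]) for a, b in zip(entries, entries[1:]) if a[0] == b[0]]

img = np.random.default_rng(0).integers(0, 256, (16, 16), dtype=np.uint8)
img[8:12, 8:12] = img[0:4, 0:4]                  # simulate a copy-move
print(duplicated_blocks(img))                    # the (0, 0)-(8, 8) pair appears
```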
Color Features Performance Comparison for Image Retrieval

This paper proposes a comparison of color features for image retrieval. In particular, the UCID image database has been employed to compare the retrieval capabilities of different color descriptors. The set of descriptors comprises global and spatially related features, and the tests show that HSV-based global features provide the best performance under varying brightness and contrast settings.

Daniele Borghesani, Costantino Grana, Rita Cucchiara
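
A global HSV histogram with histogram-intersection matching, the kind of color descriptor such comparisons evaluate, can be sketched as follows (bin counts are assumed; the UCID evaluation protocol is not reproduced):

```python
import numpy as np
from skimage.color import rgb2hsv

def hsv_hist(img_rgb, bins=(8, 4, 4)):
    hsv = rgb2hsv(img_rgb)                       # all channels in [0, 1]
    h, _ = np.histogramdd(hsv.reshape(-1, 3), bins=bins,
                          range=((0, 1), (0, 1), (0, 1)))
    return h.ravel() / h.sum()

def intersection(h1, h2):
    return np.minimum(h1, h2).sum()              # 1.0 = identical distributions

rng = np.random.default_rng(0)
a = rng.random((32, 32, 3))
b = np.clip(a * 1.1, 0, 1)                       # mild brightness change
print("similarity:", intersection(hsv_hist(a), hsv_hist(b)))
```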
An Adaptive Technique for Accurate Feature Extraction from Regular and Irregular Image Data

We present a single multi-scale gradient-based feature extraction algorithm that can be applied directly to irregular or regular image data, and hence can be used on both range and intensity images. We illustrate the accuracy of this approach using the Figure of Merit evaluation technique on real images, demonstrating that applying this single approach to both range and intensity images is more accurate than the equivalent combination of applying a gradient operator, such as Sobel, to an intensity image and, separately, the scan-line approximation approach to range images.

Sonya Coleman, Shanmuglingam Suganthan, Bryan Scotney
Processing Hexagonal Images in a Virtual Environment

For many years the concept of using hexagonal pixels for image capture has been investigated, and several advantages of such an approach have been highlighted. Recently there has been a renewed interest in the use of hexagonal images, in representations of architectures for such images and in general hexagonal image processing. Therefore, we present multiscale hexagonal gradient operators, developed within the finite element framework, for use directly on hexagonal pixel-based images. We demonstrate these operators using two environments: a virtual hexagonal environment and the direct use of simulated hexagonal pixel-based images. In both scenarios, we evaluate the proposed operators and compare them with the use of standard image processing operators on typical square pixel-based images, demonstrating improved results in the case of simulated hexagonal pixel-based images.

Sonya Coleman, Bryan Scotney, Bryan Gardiner
Pixel Coverage Segmentation for Improved Feature Estimation

By utilizing intensity information available in images, partial coverage of pixels at object borders can be estimated. Such information can, in turn, provide more precise feature estimates. We present a pixel coverage segmentation method which assigns pixel values corresponding to the area of a pixel that is covered by the imaged object(s). Starting from any suitable crisp segmentation, we extract a one-pixel thin 4-connected boundary between the observed image components where a local linear mixture model is used for estimating fractional pixel coverage values. We evaluate the presented segmentation method, as well as its usefulness for subsequent precise feature estimation, on synthetic test objects with increasing levels of noise added. We conclude that for reasonable noise levels the presented method outperforms the achievable results of a perfect crisp segmentation. Finally, we illustrate the application of the suggested method on a real histological colour image.

Nataša Sladoje, Joakim Lindblad
A Generic Method of Line Matching for Central Imaging Systems under Short-Baseline Motion

Line matching across images taken by a central imaging system (perspective or catadioptric), with a focus on short baseline motion of the system, is proposed. The relationship between images of lines on the unitary sphere is studied, and a simple algorithm for matching lines is proposed, assuming the rotation of the system is known a priori or can be estimated from some correspondences in two views. Two methods are discussed for retrieving the rotation R in the case it is not known a priori. Experimental results on both synthetic and real images are also presented.

Saleh Mosaddegh, David Fofi, Pascal Vasseur
Reconstructing 3D Facial Shape Using Spherical Harmonics

It is now well established that 3D facial shape can be effectively and conveniently reconstructed using spherical harmonics. This paper extends the state-of-the-art by showing how to recover a 3D facial shape reconstruction using a spherical parameterization and minimizing a harmonic energy based on spherical medians. The solution is formulated as a linear system, and we propose an iterative residual fitting algorithm (LSQR-IRF) to solve it. We demonstrate the effectiveness of the method on range data.

Chengming Zou, Guanghui Zhao, Edwin R. Hancock
Retinex Combined with Total Variation for Image Illumination Normalization

This paper presents a method for the normalization of human facial images in arbitrary illumination conditions. The enhanced image is suitable for use as input to a face recognition system.

Luigi Cinque, Gabriele Morrone
A Color-Based Interest Operator

In this paper we propose a novel interest operator robust to photometric and geometric transformations. Our operator is closely related to the grayscale MSER, but it works on the HSV color space, as opposed to the most popular operators in the literature, which are intensity based. It combines a fine and a coarse overlapped quantization of the HSV color space to find maximally stable extremal regions on each of its components, and combines them into a final set of regions that are useful in images where intensity does not discriminate well. We evaluate the performance of our operator on two different applications: wide-baseline stereo matching and image annotation.

Marta Penas, Linda G. Shapiro
An Evaluation of Scale and Noise Sensitivity of Fibre Orientation Estimation in Volume Images

Fibre orientation influences many important properties of fibre-based materials, for example, strength and stiffness. Fibre orientation and the orientation anisotropy in paper and other wood fibre-based materials have previously been estimated using two-dimensional images. Recently, we presented a method for estimating the three-dimensional fibre orientation in volume images based on local orientation estimates. Here, we present an evaluation of the method with respect to scale and noise sensitivity. The evaluation is performed for both tubular and solid fibres. We also present a new method for automatic scale selection for solid fibres. The method is based on a segmentation of the fibres that also provides an estimate of the fibre dimension distribution in an image. The results show that the fibre orientation estimation performs well both in noisy images and at different scales. The presented results can be used as a guide to select appropriate parameters for the method when it is applied to real data. The applicability of the fibre orientation estimation to fibre-based materials with solid fibres is demonstrated for a volume image of a press felt acquired with X-ray microtomography.

Maria Axelsson
Real-Time Online Video Object Silhouette Extraction Using Graph Cuts on the GPU

Being able to find the silhouette of an object is a very important front-end processing step for many high-level computer vision techniques, such as Shape-from-Silhouette 3D reconstruction methods, object shape tracking, and pose estimation. Graph cuts have been proposed as a method for finding very accurate silhouettes which can be used as input to such high level techniques, but graph cuts are notoriously computation intensive and slow. Leading CPU implementations can extract a silhouette from a single QVGA image in 100 milliseconds, with performance dramatically decreasing with increased resolution. Recent GPU implementations have been able to achieve performance of 6 milliseconds per image by exploiting the intrinsic properties of the lattice graphs and the hardware model of the GPU. However, these methods are restricted to a subclass of lattice graphs and are not generally applicable. We propose a novel method for graph cuts on the GPU which places no limits on graph configuration and which is able to achieve comparable real-time performance in online video processing scenarios.

Zachary A. Garrett, Hideo Saito
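
The binary silhouette energy such methods minimize can be sketched on the CPU with the PyMaxflow library; the unary terms from a rough foreground probability and the 4-connected smoothness weight are assumptions for illustration, and this is not the paper's GPU implementation:

```python
import numpy as np
import maxflow                                   # pip install PyMaxflow

rng = np.random.default_rng(0)
prob_fg = np.full((20, 20), 0.1)
prob_fg[5:15, 5:15] = 0.9                        # likely-foreground square
prob_fg = np.clip(prob_fg + rng.normal(0, 0.15, prob_fg.shape), 1e-3, 1 - 1e-3)

g = maxflow.Graph[float]()
nodes = g.add_grid_nodes(prob_fg.shape)
g.add_grid_edges(nodes, 1.5)                     # 4-connected smoothness prior
# Source-edge capacity = cost of labeling background; sink = cost of foreground.
g.add_grid_tedges(nodes, -np.log(1 - prob_fg), -np.log(prob_fg))
g.maxflow()
silhouette = np.logical_not(g.get_grid_segments(nodes))  # source side = fg
print(silhouette.sum(), "foreground pixels")
```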
On a New Measure of Classifier Competence Applied to the Design of Multiclassifier Systems

This paper presents a new method for calculating the competence of a classifier in the feature space. The idea is based on relating the response of the classifier to the response obtained by random guessing. The measure of competence reflects this relation and rates the classifier with respect to random guessing in a continuous manner. Two multiclassifier systems, representing fusion and selection strategies, were developed using the proposed measure of competence. The performance of the multiclassifiers was evaluated using five benchmark databases from the UCI Machine Learning Repository and the Ludmila Kuncheva Collection. Classification results obtained for three simple fusion methods and one multiclassifier system with a selection strategy were used for comparison. The experimental results showed that, regardless of the strategy used by the multiclassifier system, the classification accuracy increased when the measure of competence was employed. The improvement was most significant for the simple fusion methods (sum, product and majority vote). For all databases, the two developed multiclassifier systems produced the best classification scores.

Tomasz Woloszynski, Marek Kurzynski
Multi-class Binary Symbol Classification with Circular Blurred Shape Models

Multi-class binary symbol classification requires the use of rich descriptors and robust classifiers. Shape representation is a difficult task because of several symbol distortions, such as occlusions, elastic deformations, gaps or noise. In this paper, we present the Circular Blurred Shape Model descriptor. This descriptor encodes the arrangement information of object parts in a correlogram structure. A prior blurring degree defines the level of distortion allowed to the symbol. Moreover, we learn the new feature space using a set of Adaboost classifiers, which are combined in the Error-Correcting Output Codes framework to deal with the multi-class categorization problem. The presented work has been validated over different multi-class data sets and compared to state-of-the-art descriptors, showing significant performance improvements.

Sergio Escalera, Alicia Fornés, Oriol Pujol, Petia Radeva
A Novel Recognition Approach for Sketch-Based Interfaces

Multimodal interfaces can be profitably used to serve the increasingly complex applications and services that support human activities in everyday life. In particular, sketch-based interfaces offer users an effortless and powerful communication way to represent concepts and/or commands on different devices. Developing a sketch-based interface for a specific application or service is a time-consuming operation that requires the re-engineering and/or re-designing of the whole recognizer framework. This paper describes a definitive framework that allows users to define any kind of sketch-based interface, using freehand drawing only. The definition of the interface and its recognition process are performed using our Sketch Modeling Language (SketchML).

Danilo Avola, Andrea Del Buono, Giorgio Gianforme, Stefano Paolozzi
Simultaneous Document Margin Removal and Skew Correction Based on Corner Detection in Projection Profiles

Document images obtained from scanners or photocopiers usually have a black margin which interferes with subsequent stages of page segmentation algorithms. Thus, the margins must be removed at the initial stage of a document processing application. This paper presents an algorithm which we have developed for document margin removal, based upon the detection of document corners from projection profiles. The algorithm does not make any restrictive assumptions regarding the input document image to be processed. It neither needs all four margins to be present nor needs the corners to be right angles. In the case of tilted documents, it is able to detect and correct the skew. In our experiments, the algorithm was successfully applied to all document images in our databases of French and Arabic document images, which contain more than two hundred images with different types of layouts, noise, and intensity levels.

M. Mehdi Haji, Tien D. Bui, Ching Y. Suen
Dense Two-Frame Stereo Correspondence by Self-organizing Neural Network

This work defines an extension of a competitive method for matching correspondences in stereoscopic image analysis. The method we extend was proposed by Venkatesh, Y.V. et al., where the authors modify a Self-Organizing Map by changing the neural weight updating phase in order to solve the correspondence problem within a two-frame area matching approach, producing dense disparity maps. In the present paper we extend the aforementioned method with refinements that lead to better results. Experimental studies were conducted to evaluate and compare the proposed solution.

Marco Vanetti, Ignazio Gallo, Elisabetta Binaghi
Towards a Linear Combination of Dichotomizers by Margin Maximization

When dealing with two-class problems, the combination of several dichotomizers is an established technique to improve classification performance. In this context the margin is considered a central concept, since several theoretical results show that improving the margin on the training set is beneficial for the generalization error of a classifier. In particular, this has been analyzed with reference to learning algorithms based on boosting, which aim to build strong classifiers through the combination of many weak classifiers. In this paper we experimentally verify whether margin maximization can also be beneficial when combining already trained classifiers. We employ an algorithm for evaluating the weights of a linear convex combination of dichotomizers so as to maximize the margin of the combination on the training set. Several experiments performed on publicly available data sets show that a combination based on margin maximization can be particularly effective when compared with other established fusion methods.

Claudio Marrocco, Mario Molinara, Maria Teresa Ricamato, Francesco Tortorella
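
Maximizing the minimum margin of a convex combination of trained dichotomizers can be posed as a linear program; the sketch below illustrates that formulation with scipy on synthetic classifier outputs, and is not necessarily the exact algorithm employed in the paper:

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
n, K = 40, 3
y = rng.choice([-1.0, 1.0], size=n)              # true labels
H = y[:, None] * rng.uniform(0.1, 1.0, (n, K))   # mostly-correct outputs
H[rng.random((n, K)) < 0.15] *= -1               # inject some errors

# Variables [w_1..w_K, t]: maximize t  s.t.  y_i * (H w)_i >= t,
# sum(w) = 1, w >= 0.  linprog minimizes, so the objective is -t.
c = np.r_[np.zeros(K), -1.0]
A_ub = np.c_[-(y[:, None] * H), np.ones(n)]      # t - y_i * (H w)_i <= 0
b_ub = np.zeros(n)
A_eq = np.r_[np.ones(K), 0.0][None, :]
res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
              bounds=[(0, None)] * K + [(None, None)])
print("weights:", res.x[:K], "min margin:", res.x[K])
```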
Backmatter
Metadata
Title
Image Analysis and Processing – ICIAP 2009
Edited by
Pasquale Foggia
Carlo Sansone
Mario Vento
Copyright Year
2009
Publisher
Springer Berlin Heidelberg
Electronic ISBN
978-3-642-04146-4
Print ISBN
978-3-642-04145-7
DOI
https://doi.org/10.1007/978-3-642-04146-4