2003 | Book

Pattern Recognition and Image Analysis

First Iberian Conference, IbPRIA 2003, Puerto de Andratx, Mallorca, Spain, June 4-6, 2003. Proceedings

Edited by: Francisco José Perales, Aurélio J. C. Campilho, Nicolás Pérez de la Blanca, Alberto Sanfeliu

Publisher: Springer Berlin Heidelberg

Book series: Lecture Notes in Computer Science

Table of Contents

Frontmatter

First Iberian Conference on Pattern Recognition and Image Analysis IbPRIA’2003

Solids Characterization Using Modeling Wave Structures

This paper introduces a characterization study of solids and 3D shapes based on the recent Modeling Wave (MW) topological organization. The MW establishes a complete n-connectivity relationship in the meshes that model 3D objects. Here the use of MW is extended. Through a new feature called Cone-Curvature, which originates from the MW concept, flexible and extended knowledge of the surrounding geometry is given for every point of the solid surface. Neither a local nor a global connectivity, but a half-connectivity, has been used to define a robust 3D similarity measure. The method presented has been successfully tested in our lab on range data for a wide variety of shapes. Consequently, extended research on 3D object clustering will be carried out in the near future.

Miguel Adan, Antonio Adan
A Probabilistic Model for the Cooperative Modular Neural Network

This paper presents a model for the probability of correct classification for the Cooperative Modular Neural Network (CMNN). The model enables the estimation of the performance of the CMNN using parameters obtained from the data set. The performance estimates for the experiments presented are quite accurate (less than 1% relative difference). We compare the CMNN with a multi-layer perceptron with an equal number of weights and conclude that the CMNN is preferred for complex problems. We also investigate the error introduced by one of the CMNN voting strategies.

Luís A. Alexandre, Aurélio Campilho, Mohamed Kamel
Robust Learning Algorithm for the Mixture of Experts

The Mixture of Experts model (ME) is a type of modular artificial neural network (MANN) whose architecture is composed of different kinds of networks that compete to learn different aspects of the problem. This model is used when the search space is stratified. The learning algorithm of the ME model consists of estimating the network parameters to achieve a desired performance. To estimate the parameters, some distributional assumptions are made, so the learning algorithm and, consequently, the parameters obtained depend on the distribution. But when the data are exposed to outliers, the assumption is no longer valid: the model is affected and is very sensitive to the data, as is shown in this work. We propose a robust learning estimator by means of a generalization of the maximum likelihood estimator called the M-estimator. Finally, a simulation study is shown in which the robust estimator performs better than the maximum likelihood estimator (MLE).

Héctor Allende, Romina Torres, Rodrigo Salas, Claudio Moraga
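
As an illustration of the M-estimator idea mentioned in the abstract above, the following minimal sketch contrasts the Gaussian maximum likelihood estimate of a location parameter (the sample mean) with a Huber M-estimate computed by iteratively reweighted averaging. It is not the authors' learning algorithm for the Mixture of Experts: the function name `huber_location`, the tuning constant `c = 1.345`, the MAD-based scale and the sample data are all assumptions chosen for illustration.

```python
from statistics import mean, median

def huber_location(xs, c=1.345, tol=1e-8, max_iter=100):
    """Huber M-estimate of location via iteratively reweighted averaging.

    Hypothetical helper for illustration; c and the MAD scale are assumptions.
    """
    xs = list(xs)
    mu = median(xs)
    # Robust scale: median absolute deviation, rescaled for Gaussian consistency.
    scale = 1.4826 * median([abs(x - mu) for x in xs])
    if scale == 0:
        return mu
    for _ in range(max_iter):
        weights = []
        for x in xs:
            r = abs(x - mu) / scale
            # Huber weights: 1 for small residuals, c/|r| for large ones,
            # so the influence of any single observation is bounded.
            weights.append(1.0 if r <= c else c / r)
        new_mu = sum(w * x for w, x in zip(weights, xs)) / sum(weights)
        converged = abs(new_mu - mu) < tol
        mu = new_mu
        if converged:
            break
    return mu

# Made-up sample: five values near 10 and one gross outlier.
data = [9.8, 10.1, 10.0, 9.9, 10.2, 50.0]
mle_estimate = mean(data)               # pulled towards the outlier
robust_estimate = huber_location(data)  # stays near the bulk of the data
print(mle_estimate, robust_estimate)
```

With a single gross outlier the sample mean moves far from the bulk of the data, whereas the down-weighted estimate stays close to it, which is the behaviour the abstract attributes to robust estimation.
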
A Robust and Effective Learning Algorithm for Feedforward Neural Networks Based on the Influence Function

The learning process of Feedforward Artificial Neural Networks relies on the data, so a robustness analysis of the parameter estimates of the model must be carried out because of the presence of outlying observations in the data. In this paper we seek robustness properties in the parameter estimates, in the sense that the influence of aberrant observations or outliers on the estimate is bounded, so that the neural network is able to model the bulk of the data. We also seek a trade-off between robustness and efficiency under a Gaussian model. An adaptive learning procedure that seeks both aspects is developed. Finally, we show some simulation results applied to the RESEX time series.

Héctor Allende, Rodrigo Salas, Claudio Moraga
Regularization of 3D Cylindrical Surfaces

In this paper we present a method for the regularization of 3D cylindrical surfaces. By a cylindrical surface we mean a 3D surface that can be expressed as a mapping S(l,θ) → R³, where (l,θ) represents a cylindrical parametrization of the 3D surface. We build an initial cylindrical parametrization of the surface and propose a new method to regularize such a cylindrical surface. This method takes into account the information supplied by the disparity maps computed between pairs of images to constrain the regularization of the set of 3D points. We propose a model based on an energy composed of two terms: an attachment term that minimizes the difference between the image coordinates and the disparity maps, and a second term that enables regularization by means of anisotropic diffusion. One interesting advantage of this approach is that we regularize the 3D surface by solving a two-dimensional minimization problem.

Luis Alvarez, Carmelo Cuenca, Javier Sánchez
Non-rigid Registration of Vessel Structures in IVUS Images

We present a registration and retrieval algorithm for medical images. Our algorithm is oriented in a general fashion towards gray-level medical images of non-rigid bodies such as coronary vessels, where object shape provides little information. We use rich descriptors based on both local and global (contextual) information, and at the same time we use a cooperative-iterative strategy in order to get a good set of correspondences as well as a good final transformation. We focus on a novel application of medical image registration: registration of IVUS, a promising technique for analyzing the coronary vessels.

Jaume Amores, Petia Radeva
Underwater Cable Tracking by Visual Feedback

Nowadays, the surveillance and inspection of underwater installations, such as power and telecommunication cables and pipelines, is carried out by trained operators who, from the surface, control a Remotely Operated Vehicle (ROV) with cameras mounted on it. This is a tedious, time-consuming and expensive task, prone to errors mainly because of loss of attention or fatigue of the operator and also because of the typically low quality of seabed images. The main concern of this study has been the development of a vision system to guide an Autonomous Underwater Vehicle (AUV) so that it can automatically detect and track an underwater power cable laid on the seabed. The system has been tested using sequences from a video tape obtained by means of an ROV during several tracking sessions of various real cables. The average success rate achieved is about 90% at a frame rate higher than 25 frames/second.

Javier Antich, Alberto Ortiz
A Hierarchical Clustering Strategy and Its Application to Proteomic Interaction Data

We describe a novel strategy of hierarchical clustering analysis, particularly useful to analyze proteomic interaction data. The logic behind this method is to use the information for all interactions among the elements of a set to evaluate the strength of the interaction of each pair of elements. Our procedure allows the characterization of protein complexes starting with partial data and the detection of "promiscuous" proteins that bias the results, generating false positive data. We demonstrate the usefulness of our strategy by analyzing a real case that involves 137 Saccharomyces cerevisiae proteins. Because most functional studies require the evaluation of similar data sets, our method has a wide range of applications and thus it can be established as a benchmark analysis for proteomic data.

Vicente Arnau, Ignacio Marín
A New Optimal Classifier Architecture to Avoid the Dimensionality Curse

In this paper we present the theoretical foundation for optimal classification using class-specific features and provide examples of its use. A new PDF projection theorem makes it possible to project probability density functions from a low-dimensional feature space back to the raw data space. An M-ary classifier is constructed by estimating the PDFs of class-specific features, then transforming each PDF back to the raw data space where they can be fairly compared. Although statistical sufficiency is not a requirement, the classifier thus constructed will become equivalent to the optimal Bayes classifier if the features meet sufficiency requirements individually for each class. This classifier is completely modular and avoids the dimensionality curse associated with large complex problems. By recursive application of the projection theorem, it is possible to analyze complex signal processing chains. It is possible to automate the feature and model selection process by direct comparison of log-likelihood values on the common raw data domain. Pre-tested modules are available for a wide range of features including linear functions of independent random variables, cepstrum, and MEL cepstrum.

Paul M. Baggenstoss
Learning from Imbalanced Sets through Resampling and Weighting

The problem of imbalanced training sets in supervised pattern recognition methods is receiving growing attention. An imbalanced training sample is one in which one class is represented by a large number of examples while the other is represented by only a few. It has been observed that this situation, which arises in several practical settings, may cause a significant deterioration of classification accuracy, in particular for patterns belonging to the less represented classes. In the present paper, we introduce a new approach to designing an instance-based classifier in such imbalanced environments.

R. Barandela, J. S. Sánchez, V. García, F. J. Ferri
Morphological Recognition of Olive Grove Patterns

This paper presents a methodology to segment olive groves in high spatial resolution remotely sensed images. The developed algorithms exploit the typical spatial patterns presented by this forest cover and are mainly based on mathematical morphology operators. The methodology consists of first identifying the olive groves and then recognizing their individual trees. It is tested with orthophotomaps from a region in central Portugal.

Teresa Barata, Pedro Pina
Combining Multi-variate Statistics and Dempster-Shafer Theory for Edge Detection in Multi-channel SAR Images

A new scheme for detecting edges in multi-channel SAR images is proposed. The method is applied to a set of two full-polarimetric SAR images, i.e. a P-band and an L-band image. The first step is a low-level edge detector based on multi-variate statistical hypothesis tests. As the spatial resolution of the two SAR bands is not the same, the test is applied to the polarimetric information for each band separately. The multi-variate statistical hypothesis test is used to decide whether an edge of a given orientation passes through the current point. The test is repeated for a discrete number of orientations. Eight orientations are used. The response for the different orientations of the scanning rectangles as well as for different bands is combined using a method based on Dempster-Shafer Theory. The proposed scheme was applied to a multi-channel E-SAR image and results are shown and evaluated.

D. Borghys, C. Perneel
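
The abstract above combines edge evidence from different orientations and bands with a method based on Dempster-Shafer Theory. As a hedged sketch of the general idea only, the code below applies Dempster's rule of combination to two mass functions over the frame {edge, no_edge}. The mass values, the band names used as variable names and the function `dempster_combine` are illustrative assumptions; the abstract does not describe how the authors derive masses from their hypothesis tests.

```python
from itertools import product

def dempster_combine(m1, m2):
    """Combine two mass functions (dict: frozenset -> mass) with Dempster's rule."""
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Mass assigned to contradictory hypotheses is treated as conflict.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; combination undefined")
    # Renormalise by the non-conflicting mass.
    return {k: v / (1.0 - conflict) for k, v in combined.items()}

EDGE = frozenset({"edge"})
NO_EDGE = frozenset({"no_edge"})
EITHER = EDGE | NO_EDGE  # ignorance: mass not committed to either hypothesis

# Made-up masses standing in for the evidence from two bands.
m_p_band = {EDGE: 0.6, NO_EDGE: 0.1, EITHER: 0.3}
m_l_band = {EDGE: 0.5, NO_EDGE: 0.2, EITHER: 0.3}

fused = dempster_combine(m_p_band, m_l_band)
print(fused[EDGE], fused[NO_EDGE], fused[EITHER])
```

Keeping an explicit ignorance term lets a band that is uninformative at a given pixel contribute little, rather than forcing it to vote for or against an edge.
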
High-Level Clothes Description Based on Colour-Texture and Structural Features

This work is part of a surveillance system where content-based image retrieval is done in terms of people's appearance. Given an image of a person, our work provides an automatic description of their clothing according to the colour, texture and structural composition of the garments. We present a two-stage process composed of image segmentation and a region-based interpretation. We segment an image by modelling it as an attributed graph and applying a hybrid method that follows a split-and-merge strategy. We propose the interpretation of five clothing combinations that are modelled in a graph structure in terms of region features. The interpretation is viewed as a graph matching with an associated cost between the segmentation and the clothing models. Finally, we have tested the process with a ground truth of one hundred images.

Agnés Borràs, Francesc Tous, Josep Lladós, Maria Vanrell
A New Method for Detection and Initial Pose Estimation Based on Mumford-Shah Segmentation Functional

In this paper we describe a new method for detection and initial pose estimation of a person in a human-computer interaction setting in an uncontrolled indoor environment. We use the Koepfler-Morel-Solimini mathematical formulation of the Mumford-Shah segmentation functional, adapted to color images. The idea is to obtain a system that detects the hands and face in a sequence of monocular or binocular images. The skin color is predefined and a procedure is parameterized to segment and recognize the homogeneous regions. In addition, we constrain our results so that the two hands and the face must be detected at the same time. We also use a biomechanical restriction to reach this initial estimation. To this end, the centroid of the blob is computed for every region. We explain the mathematical background of the segmentation and the region classification (hands, face, head and upper torso). Finally, we present some interesting results, and we implement the algorithm efficiently in order to obtain real-time results when processing standard video formats.

Jose Maria Buades Rubio, Manuel González Hidalgo, Francisco José Perales López
Tracking Heads Using Piecewise Planar Models

We present a procedure for tracking a rigid object based on a piecewise planar model, and show how it can be used to track a human face. The tracking is performed using a single incremental SSD-based tracker. The main feature of the approach presented is that it can track a rigid set of arbitrarily small patches, none of which could be tracked individually.

José M. Buenaposada, Enrique Muñoz, Luis Baumela
Support Vector Machines for Crop Classification Using Hyperspectral Data

In this communication, we propose the use of Support Vector Machines (SVM) for crop classification using hyperspectral images. SVM are benchmarked against well-known neural networks such as multilayer perceptrons (MLP), Radial Basis Functions (RBF) and Co-Active Neural Fuzzy Inference Systems (CANFIS). Models are analyzed in terms of efficiency and robustness, which is tested according to their suitability for real-time working conditions where a preprocessing stage is not possible. This can be simulated by considering models with and without a preprocessing stage. Four scenarios (128, 6, 3 and 2 bands) are thus evaluated. Several conclusions are drawn: (1) SVM yield better outcomes than neural networks; (2) training neural models is infeasible when working with high-dimensional input spaces; and (3) SVM perform similarly in the four classification scenarios, which indicates that noisy bands are successfully detected.

G. Camps-Valls, L. Gómez-Chova, J. Calpe-Maravilla, E. Soria-Olivas, J. D. Martín-Guerrero, J. Moreno
Vehicle License Plate Segmentation in Natural Images

A robust method for plate segmentation in a License Plate Recognition (LPR) system is presented, designed to work in a wide range of acquisition conditions, including unrestricted scene environments, light, perspective and camera-to-car distance. Although this novel text-region segmentation technique has been applied to a very specific problem, it is extensible to more general contexts, like difficult text segmentation tasks dealing with natural images. Extensive experimentation has been performed in order to estimate the best parameters for the task at hand, and the results obtained are presented.

Javier Cano, Juan-Carlos Pérez-Cortés
High-Accuracy Localization of an Underwater Robot in a Structured Environment Using Computer Vision

This paper presents a vision-based localization approach for an underwater robot in a structured environment. The system is based on a coded pattern placed on the bottom of a water tank and an onboard down-looking camera. Its main features are absolute and map-based localization, landmark detection and tracking, and real-time computation (12.5 Hz). The proposed system provides the three-dimensional position and orientation of the vehicle. The paper details the codification used in the pattern and the localization algorithm, which is illustrated with some images. Finally, the paper presents results on the accuracy of the system.

Marc Carreras, Pere Ridao, Joan Batlle, David Ribas
Determine the Composition of Honeybee Pollen by Texture Classification

There is interest in knowing the composition of honeybee pollen, which depends on the local flora surrounding the beehive, because of its nutritional value and therapeutic benefits. Currently, pollen composition is determined manually by an expert palynologist, who counts the proportion of pollen types by analyzing the pollen of the hive with an optical microscope. This procedure is too tedious and expensive for systematic application. We present an automatic methodology to discriminate pollen loads of various genera based on texture classification. The method consists of three steps: after selecting non-blurred regions of interest (ROIs) in the original image, a texture feature vector is calculated for each ROI, which is then used to discriminate between pollen types. A statistical evaluation of the algorithm is provided and discussed.

Pilar Carrión, Eva Cernadas, Juan F. Gálvez, Emilia Díaz-Losada
Automatic Word Codification for the RECONTRA Connectionist Translator

Previous work has shown that a simple recurrent neural model called RECONTRA is able to successfully approach simple text-to-text Machine Translation tasks in limited semantic domains. In order to deal with tasks of medium or large vocabularies, distributed representations of the lexicons are required in this translator. This paper shows a method for automatically extracting these distributed representations from perceptrons with output context.

Gustavo A. Casañ, M. Asunción Castaño
The ENCARA System for Face Detection and Normalization

Facial image processing is becoming widespread in human-computer applications, despite its complexity. High-level processes such as face recognition or gender determination rely on low-level routines that must effectively detect and normalize the faces that appear in the input image. In this paper, a face detection and normalization system is described. The approach taken is based on a cascade of fast, weak classifiers that together try to determine whether a frontal face is present in the image. The system is also able to obtain the position of facial features, such as mouth and eyes, and it operates in real time. Comprehensive experiments carried out with real video sequences show that the system is faster than other approaches and effective in detecting frontal faces.

M. Castrillón, O. Déniz, M. Hernández
Prediction and Discrimination of Pharmacological Activity by Using Artificial Neural Networks

The design of new medical drugs is a very complex process in which combinatorial chemistry techniques are used. For this reason, it is very useful to have tools to predict and to discriminate the pharmacological activity of a given molecular compound so that the laboratory experiments can be directed to those molecule groups in which there is a high probability of finding new compounds with the desired properties. This work presents an application of Artificial Neural Networks to the problem of discriminating and predicting pharmacological characteristics of a molecular compound from its topological properties. A large number of different configurations are tested, yielding very good performance.

M. J. Castro, W. Díaz, P. Aibar, J. L. Domínguez
A Spatio-temporal Filtering Approach to Motion Segmentation

In this paper, a new frequency-based approach to motion segmentation is presented. The proposed technique represents the sequence as a spatio-temporal volume, where a moving object corresponds to a three-dimensional object. In order to detect the “3D volumes” corresponding to significant motions, a new scheme based on band-pass filtering with a set of logGabor spatio-temporal filters is used. It is well known that one of the main problems of these approaches is that a filter response varies with the spatial orientation of the underlying signal. To solve this spatial dependency, the proposed model makes it possible to recombine information about a motion that has been separated into several filter responses because of its spatial structure. For this purpose, motions are detected as invariance in statistical structure across a range of spatio-temporal frequency bands. This technique is illustrated on real and simulated data sets, including sequences with occlusion and transparencies.

Jesús Chamorro-Martínez, J. Fdez-Valdivia, Javier Martinez-Baena
Annotated Image Retrieval System Based on Concepts and Visual Property

Semantic content implies more than the simple identification of objects. Techniques for content-based image retrieval are not yet mature enough to recognize visual semantics completely. Therefore, it is necessary to use captions or text information attached to photos in the content-based information access of visual data. However, keyword-based retrieval is limited to the level of syntactic pattern matching. In other words, dissimilarity computation among terms is usually done by string matching, not concept matching. In this paper, we present a solution for retrieving images semantically by means of the qualitative measurement of annotated keywords together with a spatial color distribution model.

Junho Choi, Miyoung Cho, Mijin Yoon, Kyungsu Kim, Pankoo Kim
Multimodal Attention System for an Interactive Robot

Social robots are receiving much interest in the robotics community. The most important goal for such robots lies in their interaction capabilities. An attention system is crucial, both as a filter to focus the robot’s perceptual resources and as a means of letting the observer know that the robot has intentionality. In this paper a simple but flexible and functional attentional model is described. The model, which has been implemented in an interactive robot currently under development, fuses both visual and auditory information extracted from the robot’s environment, and can incorporate knowledge-based influences on attention.

Oscar Déniz, Modesto Castrillón, Javier Lorenzo, Mario Hernández, Juan Méndez
New Wavelet-Based Invariant Shape Representation Functions

In this paper, new wavelet-based affine invariant functions for shape representation are derived. These functions are computed from the wavelet approximation coefficients of the shape boundary. The first function is computed by applying a single wavelet transform, whereas the second function is computed by applying two different wavelet transforms. All the previously derived affine invariant functions were based on wavelet detail coefficients, which are sensitive to noise in the finer scale levels. The proposed invariant functions are more stable and less sensitive to noise than the detail-based invariant functions.

Ibrahim El Rube, Maher Ahmed, Mohamed Kamel
A Feature Selection Wrapper for Mixtures

We propose a feature selection approach for clustering which extends Koller and Sahami’s mutual-information-based criterion to the unsupervised case. This is achieved with the help of a mixture-based model and the corresponding expectation-maximization algorithm. The result is a backward search scheme, able to sort the features by order of relevance. Finally, an MDL criterion is used to prune the sorted list of features, yielding a feature selection criterion. The proposed approach can be classified as a wrapper, since it wraps the mixture estimation algorithm in an outer layer that performs feature selection. Preliminary experimental results show that the proposed method has promising performance.

Mário A. T. Figueiredo, Anil K. Jain, Martin H. Law
Tracking People for Automatic Surveillance Applications

The authors present a simple but robust real-time algorithm that allows tracking of multiple objects in complex environments. In the first stage, the foreground segmentation uses luminance contrast, which reduces computation time by avoiding the use of colour information at this stage. Foreground pixels are then grouped into blobs by analysing X-Y histograms. Tracking is achieved by matching blobs from two consecutive frames using overlapping information from bounding boxes and a linear prediction of the centroid’s position. This method successfully handles blobs merging into groups and tracks them until they split again. Application in automatic surveillance is suggested by linking blob information, in terms of trajectories and positions, with the events to be detected. Some examples in transport environments are outlined.

Luis M. Fuentes, Sergio A. Velastin
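
The matching step described in the abstract above (bounding-box overlap between consecutive frames plus a linear prediction of the centroid) can be sketched roughly as follows. This is a simplified illustration, not the authors' algorithm: the data layout, the function names, the constant-velocity prediction and the rule "largest overlap first, then closest to the predicted centroid" are all assumptions.

```python
def overlap_area(a, b):
    """Area of intersection of two boxes given as (x_min, y_min, x_max, y_max)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(width, 0) * max(height, 0)

def centroid(box):
    return ((box[0] + box[2]) / 2.0, (box[1] + box[3]) / 2.0)

def predict_centroid(prev, curr):
    """Linear (constant-velocity) extrapolation of the next centroid position."""
    return (2 * curr[0] - prev[0], 2 * curr[1] - prev[1])

def match_blobs(tracks, detections):
    """Map each track id to the index of the best detection in the new frame.

    Several tracks may map to the same detection, which is how a merge of
    blobs into a group would show up in this simplified scheme.
    """
    matches = {}
    for track_id, track in tracks.items():
        px, py = predict_centroid(track["prev_centroid"], track["centroid"])
        best = None
        for j, box in enumerate(detections):
            area = overlap_area(track["box"], box)
            if area <= 0:
                continue  # only boxes that overlap the previous box are candidates
            cx, cy = centroid(box)
            dist2 = (cx - px) ** 2 + (cy - py) ** 2
            candidate = (-area, dist2, j)  # larger overlap first, then closer
            if best is None or candidate < best:
                best = candidate
        if best is not None:
            matches[track_id] = best[2]
    return matches

# Made-up example: two blobs moving towards each other along the x axis.
tracks = {
    1: {"box": (0, 0, 10, 10), "prev_centroid": (3, 5), "centroid": (5, 5)},
    2: {"box": (30, 0, 40, 10), "prev_centroid": (37, 5), "centroid": (35, 5)},
}
detections = [(2, 0, 12, 10), (28, 0, 38, 10)]
matches = match_blobs(tracks, detections)
print(matches)
```
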
A User Authentication Technic Using a Web Interaction Monitoring System

User authentication based on biometrics has explored both physiological and behavioral characteristics. We present a system, called Web Interaction Display and Monitoring (WIDAM), that captures a user's interaction on the web via a pointing device. This forms the basis of a new authentication system that uses behavioral information extracted from these interaction signals. The user interaction logs produced by WIDAM are presented to a sequential classifier, which applies statistical pattern recognition techniques to ascertain the identity of an individual; this constitutes the authentication system. The overall performance of the combined acquisition/authentication systems is measured by the global equal error rate, estimated from a test set. Preliminary results show that the new technique is a promising tool for user authentication, exhibiting performance comparable to other behavioural biometric techniques. By using standard human-computer interaction devices and enabling remote access to behavioural information, this system constitutes an inexpensive and practical approach to user authentication through the World Wide Web.

Hugo Gamboa, Ana Fred
Recovering Camera Motion in a Sequence of Underwater Images through Mosaicking

A procedure for automatic mosaic construction over long image sequences is presented. This mosaic is used by an underwater vehicle to estimate its motion with respect to the ocean floor. The system exploits texture cues to solve the correspondence problem. The dynamic selection of a reference image extracted from the mosaic improves motion estimation, bounding accumulated error. Experiments with real images are reported.

Rafael Garcia, Xevi Cufí, Viorela Ila
Best Achievable Compression Ratio for Lossy Image Coding

The trade-off between image fidelity and coding rate is reached with several techniques, but all of them require an ability to measure distortion. The problem is that finding a general enough measure of perceptual quality has proven to be an elusive goal. Here, we propose a novel technique for deriving an optimal compression ratio for lossy coding based on the relationship between information theory and the problem of testing hypotheses. As an example of the proposed technique, we analyze the effects of lossy compression at the best achievable compression ratio on the identification of breast cancer microcalcifications.

Jose A. García, Joaquin Fdez-Valdivia, Rosa Rodriguez-Sánchez, Xose R. Fdez-Vidal
Performance and Improvements of a Language Model Based on Stochastic Context-Free Grammars

This paper describes a hybrid language model defined as a combination of a word-based n-gram, which is used to capture the local relations between words, and a category-based SCFG with a word distribution into categories, which is defined to represent the long-term relations between these categories. Experiments on the UPenn Treebank corpus are reported. These experiments have been carried out in terms of the test set perplexity and the word error rate in a speech recognition experiment.

José García-Hernandez, Joan Andreu Sánchez, José Miguel Benedí
Segmentation of Curvilinear Objects Using a Watershed-Based Curve Adjacency Graph

This paper presents a general framework to segment curvilinear objects in 2D images. A pre-processing step relies on mathematical morphology to obtain a connected line which encloses curvilinear objects. Then, a graph is constructed from this line and a Markov Random Field is defined to perform object segmentation. Applications of our framework are numerous: they range from simple curve segmentation to complex road network extraction in satellite images.

Thierry Géraud
Automatic Keyframing of Human Actions for Computer Animation

This paper presents a novel human action model based on key-frames which is suitable for animation purposes. By defining an action as a sequence of time-ordered body posture configurations, we consider that the most characteristic postures (called key-frames) are enough for modeling such an action. As characteristic postures are found to correspond to low likelihood values, we build a human action eigenspace, called aSpace, which is used to estimate the likelihood value for each posture. Once the key-frames have been found automatically, they are used to build a human action model called p–action by means of interpolation between key-frames. This parameterized model represents the time evolution of the human body posture during a prototypical action, and it can be used for computer animation. As a result, realistic and smooth motion is achieved. Furthermore, realistic virtual sequences involving several actions can be automatically generated.

Jordi Gonzàlez, Javier Varona, F. Xavier Roca, Juan José Villanueva
Robust Line Matching and Estimate of Homographies Simultaneously

This paper addresses the robust matching of lines simultaneously with the computation of homographies between two views, when structure and motion are unknown. Using measures that are not viewpoint invariant, such as image-dependent parameters, yields many unmatched or wrongly matched features. The inclusion of projective transformations gives much better results with little computational overhead. We use line features, which can usually be extracted more accurately than points and can be used in cases of partial occlusion. In the first stage, the lines are matched to the weighted nearest neighbor using brightness-based and geometric-based image parameters. From them, robust homographies can be computed, which allows wrong matches to be rejected and additional matches to be grown in the final stage. Although lines and points are dual features for computing homographies, some problems related to data representation and normalization using lines are considered. Results show that the robust technique is stable and that its application is useful in many situations. We have used it for robot homing, and we also present automatic matching of lines in aerial images.

José J. Guerrero, Carlos Sagüés
Modeling High-Order Dependencies in Local Appearance Models

We propose a novel local appearance modeling method for object detection and recognition in cluttered scenes. The approach is based on the joint distribution of local feature vectors at multiple salient points and their factorization with Independent Component Analysis (ICA). The resulting densities are simple multiplicative distributions modeled through adaptive Gaussian mixture models. This leads to computationally tractable joint probability densities which can model high-order dependencies. Our technique has been initially tested with natural and cluttered scenes with some degree of occlusion, yielding promising results. We also propose a method to select a reduced set of learning samples that maintains the internal structure of an object, so that high-order dependencies can be used with a reduced computational load.

David Guillamet, Baback Moghaddam, Jordi Vitrià
An Experimental Evaluation of K-nn for Linear Transforms of Positive Data

We present an experimental evaluation of the subspaces obtained on positive data using the Principal Component Analysis (PCA), Non-negative Matrix Factorization (NMF) and Weighted Non-negative Matrix Factorization (WNMF) techniques in order to determine which technique provides a subspace that maintains the neighbourhood structure of the original space. Different distance metrics are used both in the original and the projected spaces in order to find which one is best suited to our data. Results demonstrate that for our positive data (color histograms) a good candidate that preserves the original neighbourhood is NMF in conjunction with the L1 distance metric when the χ2 metric is used in the original space. Since this is the most widely used distance metric for histogram representations, our initial results seem to be relevant.
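
As an illustration of the two metrics named in this abstract, the following minimal sketch compares two normalized histograms with the L1 distance and with one common symmetric form of the χ2 distance. The exact χ2 definition used in the paper is not given here, so the formula and the function names `l1_distance` and `chi2_distance` are assumptions.

```python
def l1_distance(p, q):
    """L1 (city-block) distance between two histograms of equal length."""
    return sum(abs(a - b) for a, b in zip(p, q))


def chi2_distance(p, q):
    """Symmetric chi-square distance: sum of (a - b)^2 / (a + b).

    Assumed form; bins that are empty in both histograms are skipped
    to avoid division by zero.
    """
    return sum((a - b) ** 2 / (a + b) for a, b in zip(p, q) if a + b > 0)


if __name__ == "__main__":
    h1 = [0.5, 0.5]
    h2 = [0.25, 0.75]
    print(l1_distance(h1, h2), chi2_distance(h1, h2))
```

Both functions take plain lists, so they can be applied unchanged to original histograms or to non-negative projected coefficients.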

David Guillamet, Jordi Vitrià
Wavelet Packet Image Coder Using Coefficients Partitioning for Remote Sensing Images

In this paper, a new embedded wavelet packet image coder algorithm is proposed that exploits the correlation between partitioned coefficients. The algorithm introduces parent-child relationships, based on relations between individual frequency sub-bands, to reduce image reconstruction error. Using these parent-child relationships, every coefficient is partitioned and encoded in the zerotree data structure. It is shown that the proposed wavelet packet image coder algorithm achieves lower bit rates than SPIHT. It also demonstrates higher PSNR at the same bit rate. The perfect rate control is compared with the conventional methods. These results show that the encoding and decoding processes of the proposed coder are simpler and more accurate than the conventional ones for texture images that include many mid- and high-frequency elements, such as aerial and satellite photographs. The experimental results suggest that the proposed method can be applied to real-time vision systems, on-line image processing and image fusion, which require smaller file size and better resolution.
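
The PSNR figure quoted in this abstract is the standard peak signal-to-noise ratio. A minimal sketch of how it is usually computed for 8-bit images follows; the flat-list pixel representation and the function name `psnr` are assumptions made for illustration and are not taken from the paper.

```python
import math


def psnr(original, reconstructed, peak=255):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf  # identical images: no reconstruction error
    return 10 * math.log10(peak ** 2 / mse)


if __name__ == "__main__":
    print(psnr([10, 20, 30, 40], [12, 18, 30, 41]))
```

A higher PSNR at the same bit rate means a smaller mean squared reconstruction error for the same compressed size.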

Su-Young Han, Seong-Yun Cho
Generation and Use of Synthetic Training Data in Cursive Handwriting Recognition

Three different methods for the synthetic generation of handwritten text are introduced. These methods are experimentally evaluated in the context of a cursive handwriting recognition task, using an HMM-based recognizer. In the experiments, the performance of a traditional recognizer, which is trained on data produced by human writers, is compared to a system that is trained on synthetic data only. Under the most elaborate synthetic handwriting generation model, a level of performance comparable to, or even slightly better than, the system trained on the writing of humans was observed.

Muriel Helmers, Horst Bunke
Video-Sensor for Detection and Tracking of Moving Objects

In this paper we present a complete chain of algorithms for detection and tracking of moving objects using a static camera. The system is based on robust difference of images for motion detection. However, the difference is not computed directly over the image frames, but over two robust frames which are continuously constructed by temporal median filtering on a set of the last grabbed images, which allows working with slow illumination changes. The system also includes a Kalman filter for tracking objects, which is employed in two ways: assisting the process of object detection and providing the object state that models its behaviour. These algorithms have given us a more robust method of detection, making it possible to handle occlusions, as can be seen in the experiments carried out with outdoor traffic scenes.
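
The idea of differencing two median-filtered frames rather than raw frames can be sketched as follows. This is only an illustration under assumed details: frames are small grayscale images stored as lists of rows, the window sizes and the threshold value are arbitrary, and the names `median_frame` and `motion_mask` are hypothetical.

```python
from statistics import median


def median_frame(frames):
    """Pixel-wise temporal median over a list of equally sized frames."""
    height, width = len(frames[0]), len(frames[0][0])
    return [[median(f[y][x] for f in frames) for x in range(width)]
            for y in range(height)]


def motion_mask(frame_a, frame_b, threshold=20):
    """True where the two robust frames differ by more than the threshold."""
    return [[abs(a - b) > threshold for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(frame_a, frame_b)]


if __name__ == "__main__":
    older = [[[10, 10]], [[10, 200]], [[12, 10]]]    # one noisy spike at pixel 1
    recent = [[[10, 200]], [[11, 210]], [[10, 205]]]  # object now covers pixel 1
    print(motion_mask(median_frame(older), median_frame(recent)))
```

The median suppresses the single-frame spike in the older window, so only the persistent change in the recent window is flagged.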

Elias Herrero, C. Orrite, A. Alcolea, A. Roy, José J. Guerrero, Carlos Sagüés
Estimation of Anti-bacterial Culture Activity from Digital Images

The purpose of this paper is to describe a method developed for estimating the diameter of circular disks resulting from anti-bacterial cultures, immersed in a bacterial gel, using digital images of the growth containers (Petri dishes). This methodology uses a chain of several image processing algorithms that are applied to the Petri dish images until a set of diameter estimates is produced. The experimental protocol is presented and the results are analyzed in terms of the best non-linear filtering method as a function of the diameter error estimate at the output of the system.

José Jasnau Caeiro, Armando Ventura
Robust Extraction of Vertices in Range Images by Constraining the Hough Transform

We describe a technique for extracting vertices from range images of cluttered box-like objects. Edge detection is performed and an edge map is acquired. Extraction of vertices is carried out using the edge map and comprises two steps: linear boundary detection in 3D and boundary grouping. In order to recover the four parameters of a 3D linear segment, we decompose the problem into two 2D subproblems, each recovering two line parameters. These subproblems are solved by means of the Hough Transform, constrained so that accurate and efficient propagation of the edge point localization error is achieved. Pairs of orthogonal boundaries are grouped to form a vertex. The orthogonality of a boundary pair is determined by a simple statistical test. Our strategy has many advantages, the most important of which are robustness, computational efficiency and accuracy, a combination that is not found in existing approaches.

Dimitrios Katsoulas
Comparison of Log-linear Models and Weighted Dissimilarity Measures

We compare two successful discriminative classification algorithms on three databases from the UCI and STATLOG repositories. The two approaches are the log-linear model for the class posterior probabilities and class-dependent weighted dissimilarity measures for nearest neighbor classifiers. The experiments show that the maximum entropy based log-linear classifier performs better for the equivalent of a single prototype. On the other hand, when multiple prototypes are used, the weighted dissimilarity measures outperform the log-linear approach. This result suggests an extension of the log-linear method to multiple prototypes.

Daniel Keysers, Roberto Paredes, Enrique Vidal, Hermann Ney
Motion Segmentation Using Distributed Genetic Algorithms

This paper presents a Bayesian framework for simultaneous motion segmentation and estimation using genetic algorithms (GAs). The segmentation label and motion field are modeled by Markov random fields (MRFs), and a MAP estimate is used to identify the optimal label and motion field. The motion segmentation and estimation problems are formalized as optimization problems of an energy function. The energy function is optimized by iterating motion segmentation and estimation using a genetic algorithm, which is robust and effective in dealing with combinatorial problems. The computation is distributed into chromosomes that evolve by distributed genetic algorithms (DGAs). Experimental results show that our proposed method estimates an accurate motion field and produces satisfactory label fields.

Eun Yi Kim, Se Hyun Park
Supervised Locally Linear Embedding Algorithm for Pattern Recognition

The dimensionality of the input data often far exceeds their intrinsic dimensionality. As a result, it may be difficult to recognize multidimensional data, especially if the number of samples in a dataset is not large. In addition, the more dimensions the data have, the longer the recognition time is. This leads to the necessity of performing dimensionality reduction before pattern recognition. Locally linear embedding (LLE) [5,6] is one of the methods intended for this task. In this paper, we investigate its extension, called supervised locally linear embedding (SLLE), which uses class labels of data points in their mapping into a low-dimensional space. An efficient eigendecomposition scheme for SLLE is derived. Two variants of SLLE are analyzed coupled with a k nearest neighbor classifier and tested on real-world images. Preliminary results demonstrate that both variants yield identical best accuracy, despite being conceptually different.

Olga Kouropteva, Oleg Okun, Matti Pietikäinen
Hash–Like Fractal Image Compression with Linear Execution Time

The main computational cost in Fractal Image Compression (FIC) comes from the required range-domain full block comparisons. In this work we propose a new algorithm for this comparison, in which actual full block comparison is preceded by a very fast hash-like search of those domains close to a given range block, resulting in a performance linear with respect to the number of pixels. Once the algorithm is detailed, its results will be compared against other state-of-the-art methods in FIC.

Kostadin Koroutchev, José R. Dorronsoro
Learning of Stochastic Context-Free Grammars by Means of Estimation Algorithms and Initial Treebank Grammars

In this paper we study the problem of learning Stochastic Context-Free Grammars (SCFGs) by means of estimation algorithms. In these algorithms, which are based on a gradient descent technique, the initial model plays an important role. Here we explore the use of initial SCFGs obtained from a treebank corpus. Experiments on the UPenn Treebank corpus are reported.

Diego Linares, Joan-Andreu Sánchez, José-Miguel Benedí, Francisco Torres
Computer-Aided Prostate Cancer Detection in Ultrasonographic Images

Prostate cancer is one of the most frequent cancers in men and a major cause of mortality in developed countries. Detection of prostate carcinoma at an early stage is crucial for successful treatment. In this paper, a method for the analysis of transrectal ultrasonography images aimed at computer-aided diagnosis of prostate cancer is presented. Although the task is extremely difficult due to a problem of imperfect supervision, we have obtained promising results indicating that valid information for diagnosis is present in the images. Two classifiers based on k-Nearest Neighbours and Hidden Markov Models are compared.

Rafael Llobet, Alejandro H. Toselli, Juan C. Perez-Cortes, Alfons Juan
Ultrasound to Magnetic Resonance Volume Registration for Brain Sinking Measurement

This paper addresses the registration of ultrasound scans and magnetic resonance (MR) volume datasets of the same patient. During a neurosurgical intervention, pre-operative MR images are often employed as a guide despite the fact that they do not show the actual state of the brain, which sometimes has sunk by up to 1 cm. By means of a standard ultrasound scanner and a tracker connected to a computer, it is feasible to build an updated picture of the brain on-line. We propose an algorithm which first composes the ultrasound volume of the brain and registers it to the MR volume. Next, it aligns individual B-scans into the MR volume, thus providing a measure of the deformation suffered.

David Lloret, Joan Serrat, Antonio M. López, Juan J. Villanueva
Segmentation of Range Images in a Quadtree

We apply a fast segmenter to planar range images. By segmenting normal vectors of estimated planes in a quadtree, we can analyze very noisy data at high tree levels and guarantee interactivity in visualizing underlying 3D scenes. Techniques to enhance data at the original spatial resolution are given. Results on the ABW range dataset are better than those of several other segmenters.

Robert E. Loke, Hans du Buf
Associative Memory for Early Detection of Breast Cancer

We present a new associative neural network design especially suited to the early detection of malignant lesions in breast cancer screening. It is a BAM in which we have made some changes to the functioning of its neurons, and for which we have developed an automatic selection algorithm for the prototypes used to calculate the thresholds of the neurons forming the input layer. The result is a structure that, while considerably reduced, is highly effective in identifying the images that indicate the presence of malignant tumours in screening for breast cancer. We endowed the network with a special pre-processing stage for the treatment of this kind of radiographic image. This pre-processing yields a more detailed analysis of possible signs of tumours.

Francisco J. López Aligué, Isabel Acevedo, Carlos Ga Orellana, Miguel Macías, Horacio G. Velasco
Bayesian SPECT Image Reconstruction with Scale Hyperparameter Estimation for Scalable Prior

In this work we propose a new method to estimate the scale hyperparameter for convex priors with scalable energy functions in Single Photon Emission Computed Tomography (SPECT) image reconstruction problems. Within the Bayesian paradigm, Evidence Analysis and circulant preconditioners are used to obtain the scale hyperparameter. The proposed method is tested on synthetic SPECT images using Generalized Gaussian Markov Random Fields (GGMRF) as scalable prior distributions.

Antonio López, Rafael Molina, Aggelos K. Katsaggelos
Reducing Training Sets by NCN-based Exploratory Procedures

In this paper, a new approach to training set size reduction is presented. This scheme basically consists of defining a small number of prototypes that represent all the original instances. Although the ultimate aim of the algorithm proposed here is to obtain a strongly reduced training set, the performance is empirically evaluated over nine real datasets by comparing not only the reduction rate but also the classification accuracy with those of other condensing techniques.

M. Lozano, José S. Sánchez, Filiberto Pla
Probabilistic Observation Models for Tracking Based on Optical Flow

In this paper, we present two new observation models based on optical flow information to track objects using particle filter algorithms. Although optical flow information enables us to know the displacement of objects present in a scene, it cannot be used directly to displace an object model since flow calculation techniques lack the necessary precision. In view of the fact that probabilistic tracking algorithms enable imprecise or incomplete information to be handled naturally, these models have been used as a natural means of incorporating flow information into the tracking.

Manuel J. Lucena, José M. Fuertes, Nicolas Perez de la Blanca, Antonio Garrido, Nicolás Ruiz
Simplified Texture Unit: A New Descriptor of the Local Texture in Gray-Level Images

In this work we propose a new descriptor of the local texture in gray-level images, named Simplified Texture Unit (STU). This descriptor is a version of the well-known Texture Unit descriptor (TU) [6] with a lower computational cost, both to compute and to use afterwards. We have carried out a comparative study of its capacity to describe the texture of a region against the capacity provided by the TU descriptor and two other versions of it, known as Local Binary Pattern (LBP) and Local Binary Pattern with Contrast (LBP/C) [11]. The results of the experiment indicate that the new descriptor performs better with small region sizes, which makes it suitable for unsupervised texture segmentation, since it could allow greater accuracy in locating the boundaries between textured regions.

Francisco J. Madrid-Cuevas, R. Medina Carnicer, M. Prieto Villegas, N. L. Fernández García, A. Carmona Poyato
Active Region Segmentation of Mammographic Masses Based on Texture, Contour and Shape Features

In this paper we propose a supervised method for the segmentation of masses in mammographic images. The algorithm starts with a pixel inside the mass, which has been manually selected by an expert radiologist. Based on the active region approach, an energy function is defined which integrates texture, contour and shape information. Then, pixels are added to or removed from the region by optimizing this function, which yields an accurate segmentation. Moreover, a texture feature selection process, performed before the segmentation, ensures a reliable subset of features. Experimental results support the validity of the proposed method.

Joan Martí, Jordi Freixenet, Xavier Muñoz, Arnau Oliver
Refining 3D Recovering by Carving through View Interpolation and Stereovision

In this paper, we present a method for obtaining accurate 3D models by merging the carving and stereo-matching algorithms. Multiple views of an object are taken from known camera poses. Object images, once segmented, are used to carve a rough 3D model of the object. View synthesis results are compared with real object views in order to validate the recovered model. When errors are detected, commonly due to occlusions and/or concavities, a fine stereo-matching algorithm is applied. The resulting depth map is used to update the inconsistent areas of the object model. Tests show the improvement achieved in the accuracy of the model.

Enric Martin, Joan Aranda, Antonio Martinez
Chromosome Classification Using Continuous Hidden Markov Models

Up-to-date results on the application of Markov models to chromosome analysis are presented. On the one hand, this means using continuous Hidden Markov Models (HMMs) instead of discrete models. On the other hand, this also means conducting empirical tests on the same large chromosome datasets that are currently used to evaluate state-of-the-art classifiers. It is shown that the use of continuous HMMs yields error rates that are very close to those provided by the most accurate classifiers.

César Martínez, Héctor García, Alfons Juan, Francisco Casacuberta
Generalized k-Medians Clustering for Strings

Clustering methods are used in pattern recognition to obtain natural groups from a data set in the framework of unsupervised learning as well as for obtaining clusters of data from a known class. In sets of strings, the concept of set median string can be extended to the (set) k-medians problem. The solution of the k-medians problem can be viewed as a clustering method, where each cluster is generated by each of the k strings of that solution. A concept which is related to set median string is the (generalized) median string, which is an NP-Hard problem. However, different algorithms have been proposed to find approximations to the (generalized) median string. We propose extending the (generalized) median string problem to k strings, resulting in the generalized k-medians problem, which can also be viewed as a clustering technique. This new technique is applied to a corpus of chromosomes represented by strings and compared to the conventional k-medians technique.
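
The set median string mentioned in this abstract is the member of a set that minimizes the sum of distances to all members. A minimal sketch follows, assuming the unit-cost Levenshtein edit distance as the string distance (the paper may use a different or weighted distance); the names `edit_distance` and `set_median` are hypothetical.

```python
def edit_distance(s, t):
    """Unit-cost Levenshtein distance via row-by-row dynamic programming."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        curr = [i]
        for j, ct in enumerate(t, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (cs != ct)))  # substitution or match
        prev = curr
    return prev[-1]


def set_median(strings):
    """Member of the set with the smallest summed distance to all members."""
    return min(strings, key=lambda c: sum(edit_distance(c, s) for s in strings))


if __name__ == "__main__":
    print(set_median(["abc", "abcd", "ab", "xyz"]))
```

The generalized median differs in that the candidate may be any string over the alphabet, not only a member of the set, which is what makes that problem hard.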

Carlos D. Martínez-Hinarejos, Alfons Juan, Francisco Casacuberta
A Quadtree-Based Unsupervised Segmentation Algorithm for Fruit Visual Inspection

Many segmentation techniques are available in the literature and some of them have been widely used in different application problems. Most of these segmentation techniques were motivated by specific application purposes. In this article we present the preliminary results of an unsupervised segmentation algorithm through a multiresolution method using color information for fruit inspection tasks. The use of a Quadtree structure simplifies the combination of a multiresolution approach with the chosen strategy for the segmentation process and speeds up the whole procedure. The algorithm has been tested in fruit images in order to segment the different zones of the fruit surface. Due to the unsupervised nature of the procedure, it can adapt to the huge variability of color and shape of regions in fruit inspection applications.

Adolfo Martínez Usó
Shape Deformation Models Using Non-uniform Objects in Multimedia Applications

In this paper we describe a system to generate interactively, with multimedia applications in mind, realistic simulations of 3D deformable objects. Physically based elastic deformation techniques are an important and challenging feature in applications where three-dimensional object interaction and behaviour are considered or explored. Also, in multimedia environments we need rapid computation of deformations with good visual realism. We present a prototype of a system for the animation and simulation of elastic objects in an interactive system and under real-time conditions. The approach makes use of the finite element method (FEM) and Elasticity Theory. Using picking node selection, the user can interactively apply forces to objects, causing their deformation. The deformations computed with our approach have a physical interpretation based on the mathematical model defined. In particular, we extend our original system to non-homogeneous regions. This property is very important for enabling simulations with materials that have different density functions. Finally, a set of results is presented which demonstrates this capability. All programs are written in C++ using object-oriented programming, VRML and Open Inventor tools.

Miquel Mascaró Portells, Arnau Mir, Francisco Perales
An Experimental Comparison of Dimensionality Reduction for Face Verification Methods

Two different approaches to dimensionality reduction are analysed and evaluated: Locally Linear Embedding (LLE) and a modification of Nonparametric Discriminant Analysis (NDA). Both are considered for use in a face verification problem, as a preliminary step to nearest neighbor classification. LLE focuses on reducing the dimensionality of the space by finding the nonlinear manifold underlying the data, while the goal of NDA is to find the most discriminative linear features of the input data that improve the classification rate (without making any prior assumption on the distribution).

David Masip, Jordi Vitrià
Bayesian Image Estimation from an Incomplete Set of Blurred, Undersampled Low Resolution Images

This paper deals with the problem of reconstructing a high-resolution image from an incomplete set of undersampled, blurred and noisy images shifted with subpixel displacement. We derive mathematical expressions for the calculation of the maximum a posteriori estimate of the high resolution image and the estimation of the parameters involved in the model. We also examine the role played by the prior model when this incomplete set of low resolution images is used. The performance of the method is tested experimentally.

Javier Mateos, Miguel Vega, Rafael Molina, Aggelos K. Katsaggelos
A Procedure for Biological Sensitive Pattern Matching in Protein Sequences

A procedure for fast pattern matching in protein sequences is presented. It uses a biological metric, based on substitution matrices such as PAM or BLOSUM, to compute the matching. Biologically sensitive pattern matching performs pattern detection according to the available empirical data about similarity and affinity relations between amino acids in protein sequences. Sequence alignment is a string matching procedure used in genomics; it includes insert/delete operators and dynamic programming techniques, and it provides more sophisticated results than other pattern matching procedures but at a higher computational cost. Heuristic procedures for local alignment, such as FASTA or BLAST, are used to reduce this cost. They are based on successive tasks; the first one uses a pattern matching procedure with very short sequences, also named k-tuples. This paper shows how, using the L1 metric, this matching task can be computed efficiently with SIMD instructions. To design this procedure, a table that maps the substitution matrices is needed. This table defines a representation of each amino acid residue in an n-dimensional space of as low a dimensionality as possible; this is accomplished by using Multidimensional Scaling techniques, which are used in Pattern Recognition and Machine Learning for dimensionality reduction. Based on the experimental tests, the proposed procedure provides a favorable ratio of cost to matching quality.

Juan Méndez, Antonio Falcón, Javier Lorenzo
An Algebra for the Treatment of Multivalued Information Systems

Descriptive knowledge about an Information System can be expressed in declarative form by means of a binary Boolean based language. This paper presents a contribution to the study of an arbitrary multivalued Information System structure by introducing an algebra (not binary) that allows the treatment of multiple valued data tables with systematic algebraic techniques. Elements |t_i| and ||t_p||, called arrays and co-arrays, are defined, and operations ~, ‡ and ∘ are described. The proposed methodology allows multivalued algebraic expressions describing a multivalued Information System (multivalued Object Attribute Table). Furthermore, the same Information System can be described by several distinct, but equivalent, algebraic expressions. Among these, the prime-ar expression is singled out. The usefulness of the described algebra to represent an Information System is shown.

Margaret Miró-Julià, Gabriel Fiol-Roig
Adaptive Learning for String Classification

A new LVQ-inspired adaptive method is introduced to optimize strings for the 1-NN classifier. The updating rule relies on the edit distance. Given an initial number of string prototypes and a training set, the algorithm builds supervised clusters by attaching training samples to prototypes. A prototype is then rewarded to bring it closer to the members of its cluster. To this end, the prototype is updated according to the most frequent edit operations resulting from edit distance computations to all members of its cluster. The process reorganizes training samples into new clusters and continues until the convergence of prototypes is achieved. A series of learning/classification experiments is presented which shows a better 1-NN performance of the new prototypes with respect to the initial ones, which were originally good for classification.

Ramón A. Mollineda, Enrique Vidal, Carlos Martínez-Hinarejos
Multiple Segmentation of Moving Objects by Quasi-simultaneous Parametric Motion Estimation

This paper presents a new framework for the motion segmentation and estimation task on sequences of two grey images without a priori information of the number of moving regions present in the sequence. The proposed algorithm combines temporal information, by using an accurate Generalized Least-Squares Motion Estimation process and spatial information by using an inlier/outlier classification process which classifies regions of pixels, in a first step, and the pixels directly, in a second step, into the different motion models present in the sequence. The performance of the algorithm has been tested on synthetic and real images with multiple objects undergoing different types of motion.

Raúl Montoliu, Filiberto Pla
Fusion of Color and Shape for Object Tracking under Varying Illumination

In this paper a new technique to perform tracking in cluttered scenarios with varying illumination conditions is presented. The robustness of the approach lies in the integration of appearance and structural information of the object. The fusion is done using the CONDENSATION algorithm, which formulates multiple hypotheses about the estimation of the object’s color distribution and validates them taking into account the contour information of the object.

Francesc Moreno-Noguer, Juan Andrade-Cetto, Alberto Sanfeliu
Extending Fast Nearest Neighbour Search Algorithms for Approximate k-NN Classification

The nearest neighbour (NN) and k-nearest neighbour (k-NN) classification rules have been widely used in pattern recognition due to their simplicity and good behaviour. Exhaustive nearest neighbour search can become impractical when facing large training sets, high dimensional data or expensive similarity measures. In recent years many NN search algorithms have been developed to overcome those problems, and many of them are based on traversing a data structure (usually a tree) and selecting several candidates until the nearest neighbour is found. In this paper we propose a new classification rule that makes use of those selected (and usually discarded) prototypes. Several fast and widely known NN search algorithms have been extended with this rule, obtaining classification results similar to those of a k-NN classifier without extra computational overhead.
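
For reference, the baseline k-NN rule that the proposed rule is compared against can be sketched as an exhaustive search followed by a majority vote. This is the plain textbook rule, not the paper's approximate rule; the Euclidean distance and the name `knn_classify` are assumptions.

```python
import math
from collections import Counter


def knn_classify(train, query, k=3):
    """Majority vote among the k training points closest to the query.

    `train` is a list of (point, label) pairs; distance is Euclidean.
    """
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    data = [((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"),
            ((5, 5), "b"), ((6, 5), "b")]
    print(knn_classify(data, (0.2, 0.2), k=3))
```

The exhaustive sort is the cost that fast search structures avoid; the abstract's rule reuses the candidates such structures visit on the way to the nearest neighbour.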

Francisco Moreno-Seco, Luisa Micó, Jose Oncina
ART-VENA: Retinal Vascular Caliber Measurement

The size of retinal vascular caliber in eye fundus images is a fundamental diagnostic parameter in the study of systemic vascular pathologies, such as arterial hypertension or arteriosclerosis. ART-VENA is a semiautomatic system to measure the retinal vascular caliber. From the medical point of view, its repeatability (coefficients of variation under 1.5%) makes it a reliable tool to objectify vascular changes which previously depended on the observer’s subjectivity.

Antonio Mosquera, Raquel Dosil, Victor Leborán, Fernando Pardo, Francisco Gómez-Ulla, Bashir Hayik, Antonio Pose, Marta Rodríguez
Use of Band Ratioing for Color Texture Classification

In recent years, many authors have begun to exploit the extra information provided by color images to solve many computer vision problems. Among these problems, we find the texture classification field, which traditionally has used grayscale images, primarily due to the high hardware and processing costs. In this paper, a new approach for enhancing classical texture analysis methods is presented. By means of the band ratioing technique, we can extend any feature extraction algorithm to take advantage of color information and achieve higher classification rates. To support this claim, three standard techniques have been selected: Gabor filters, Wavelets and Cooccurrence Matrices. For testing purposes, 30 color textures have been selected from the Vistex database. We perform a number of experiments on that texture set, combining different ways of adapting these algorithms to process color textures and extract features from them.
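
Band ratioing, in its usual remote-sensing sense, divides one spectral band by another pixel by pixel, producing a single-channel image to which a grayscale feature extractor can then be applied. The sketch below assumes that meaning; the small `eps` guard against division by zero and the name `band_ratio` are illustrative choices, not details from the paper.

```python
def band_ratio(band_a, band_b, eps=1e-6):
    """Pixel-wise ratio of two equally sized bands (lists of rows)."""
    return [[a / (b + eps) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(band_a, band_b)]


if __name__ == "__main__":
    red = [[100, 50], [30, 200]]
    green = [[50, 100], [30, 100]]
    print(band_ratio(red, green))
```

Because a uniform brightness change scales both bands by the same factor, the ratio image is largely insensitive to it.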

Rubén Muñiz, José Antonio Corrales
Does Independent Component Analysis Play a Role in Unmixing Hyperspectral Data?

Independent Component Analysis (ICA) has recently been proposed as a tool to unmix hyperspectral data. ICA is founded on two assumptions: i) the observed data vector is a linear mixture of the sources (abundance fractions); ii) sources are independent. Concerning hyperspectral data, the first assumption is valid whenever the constituent substances are surface distributed. The second assumption, however, is violated, since the sum of abundance fractions associated with each pixel is constant due to physical constraints in the data acquisition process. Thus, sources cannot be independent. This paper gives evidence that ICA, at least in its canonical form, is not suited to unmix hyperspectral data. We arrive at this conclusion by minimizing the mutual information of simulated hyperspectral mixtures. The hyperspectral data model includes signature variability, abundance perturbation, sensor Point Spread Function (PSF), abundance constraint and electronic noise. Mutual information computation is based on fitting mixtures of Gaussians to the observed data.
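
The step from "the sum of abundance fractions is constant" to "sources cannot be independent" can be made explicit with a short variance argument. The notation below (sources $s_1,\dots,s_p$ and constant $c$) is introduced here for illustration and is not taken from the paper.

```latex
% Assume the abundance fractions satisfy the sum constraint
\sum_{i=1}^{p} s_i = c .
% A constant has zero variance, so
0 = \operatorname{Var}\Big(\sum_{i=1}^{p} s_i\Big)
  = \sum_{i=1}^{p} \operatorname{Var}(s_i)
  + \sum_{i \neq j} \operatorname{Cov}(s_i, s_j) ,
% and therefore
\sum_{i \neq j} \operatorname{Cov}(s_i, s_j)
  = -\sum_{i=1}^{p} \operatorname{Var}(s_i) < 0
% whenever at least one source has nonzero variance.
```

Independent sources would have all pairwise covariances equal to zero, which contradicts the strictly negative sum, so the constraint alone rules out independence.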

José M. P. Nascimento, José M. B. Dias
Vertex Component Analysis: A Fast Algorithm to Extract Endmembers Spectra from Hyperspectral Data

Linear spectral mixture analysis, or linear unmixing, has proven to be a useful tool in hyperspectral remote sensing applications. It aims at estimating the number of reference substances, also called endmembers, their spectral signatures and abundance fractions, using only the observed data (mixed pixels). This paper presents a new method that performs unsupervised endmember extraction from hyperspectral data. The algorithm exploits a simple geometric fact: endmembers are vertices of a simplex. The algorithm complexity, measured in floating point operations, is O(n), where n is the sample size. The effectiveness of the proposed scheme is illustrated using simulated data.

José M. P. Nascimento, José M. B. Dias
On the Relationship between Classification Error Bounds and Training Criteria in Statistical Pattern Recognition

We present two novel bounds for the classification error that, at the same time, can be used as practical training criteria. Unlike the bounds reported in the literature so far, these novel bounds are based on a strict distinction between the true but unknown distribution and the model distribution, which is used in the decision rule. The two bounds we derive are the squared distance and the Kullback-Leibler distance, where in both cases the distance is computed between the true distribution and the model distribution. In terms of practical training criteria, these bounds result in the squared error criterion and the mutual information (or equivocation) criterion, respectively.

Hermann Ney
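
For readers unfamiliar with the two distances named in the abstract above, they can be written out as follows. This is only a notational sketch: the symbols p (true distribution) and q_θ (model distribution) are assumptions, as is the choice to write the distances over class posteriors at a fixed observation x. The exact form of the bounds is given in the paper and is not reproduced here.

```latex
% p(c | x): true but unknown class posterior; q_\theta(c | x): model posterior.
D_{\mathrm{sq}}(x) = \sum_{c} \bigl[ p(c \mid x) - q_{\theta}(c \mid x) \bigr]^{2},
\qquad
D_{\mathrm{KL}}(x) = \sum_{c} p(c \mid x) \, \log \frac{p(c \mid x)}{q_{\theta}(c \mid x)} .
```

Both quantities are zero exactly when the model distribution agrees with the true distribution at x, which is what makes them usable as training criteria.
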
Application of Multinomial Mixture Model to Text Classification

The goal of text document classification is to assign a new document to one of the predefined classes based on its contents. In this paper, a mixture of multinomial distributions is proposed as a model for class-conditional distributions in the document classification task. A bag-of-words approach to vector document representation is employed. It is shown that the accuracy of the Bayes document classifier can be improved by the proposed model in comparison with the Bayes classifiers based on the multivariate Bernoulli model, the multinomial model and the multivariate Bernoulli mixture model. Experimental results on the Reuters and the Newsgroups data sets indicate the effectiveness of the multinomial mixture model. Furthermore, an increase in classification accuracy is achieved for small training data sets when the multiclass Bhattacharyya distance is used instead of average mutual information as a feature selection criterion.

Jana Novovičová, Antonín Malík
An Empirical Comparison of Stack-Based Decoding Algorithms for Statistical Machine Translation

Unlike other heuristic search algorithms, stack-based decoders have been proved theoretically to guarantee the avoidance of search errors in the decoding phase of a statistical machine translation (SMT) system. The disadvantage of stack-based decoders is their high computational requirements. Therefore, to make the decoding problem feasible for SMT, some heuristic optimizations have to be performed. However, this yields unavoidable search errors. In this paper, we describe, study, and implement the state-of-the-art stack-based decoding algorithms for SMT, making an empirical comparison which focuses specifically on the optimization problems, computational time, and translation results. Results are also presented for two well-known tasks, the Tourist Task and the Hansards task.

Daniel Ortiz, Ismael García Varea, Francisco Casacuberta
Detection of Colour Channels Uncoupling for Curvature-Insensitive Segmentation

A segmentation method based on a physics-based model of image formation is presented in this paper. This model predicts that, in image areas of uniform reflectance, colour channels remain coupled in the sense that they are not free to take any intensity value, but depend on the values taken by the other colour channels. This coupling property is, however, broken at reflectance transition locations. Surface material changes (i.e. reflectance changes) can, thus, be found by looking for violations of the coupling properties. If edges are defined at points breaking the coupling and connected image areas not including edges are found, the resulting regions are guaranteed not to contain a material change. If, in addition, edges are added to the most similar adjacent region, a first partition of the image can be obtained. Finally, a merging stage is executed to remove the (probably low) degree of oversegmentation which can result.

Alberto Ortiz, Gabriel Oliver
Analyzing Periodic Motion Classification

In this paper, we present a new technique for separating different types of periodic motions in a video sequence. We consider motions to be different when they have different periodic patterns with one or many fundamental frequencies. We select the temporal Fourier Transform for each pixel to be the representation space for a sequence of images. The classification is performed using Non-Negative Matrix Factorization (NNMF) over the power spectra data set. The technique we present can be applied to a wide range of video sequence analysis applications, such as background subtraction with non-static backgrounds, object segmentation and classification. We point out that no registration technique is applied in the method that we introduce. Nevertheless, this method can be used as a cooperative tool for the existing techniques based on camera motion models (motion segmentation, layer classification, tracking of moving objects, etc.).

Xavier Orriols, Xavier Binefa
Finding Breaking Curves in 3D Surfaces

In this paper we present a recursive least squares technique for extracting the breaking curve of a 3D range open surface. Unlike methods based on differential operators, the algorithm we propose is robust to noise and is applied to unorganized point sets. No assumptions such as smoothness and/or continuity of the boundary's shape are made. The method we present deals with large amounts of data at a low computational cost, since no local computation is performed. The technique takes a global approach, which makes it more robust, faster and simpler than approaches based on individual points and their neighbours.

Xavier Orriols, Xavier Binefa
Robust Segmentation and Decoding of a Grid Pattern for Structured Light

This paper describes the implementation details of a coded structured light system useful for one-shot measurements of a surface. Since a single pattern is projected, the technique is useful for measuring moving surfaces. A pattern based on a grid structure is used. The main advantage of such a structure is that redundant codification is applied to the cross-points. Since both pattern axes are coded, decoding errors can be corrected thanks to the proposed algorithm. Moreover, not only the cross-points of the grid can be reconstructed but also the pixels belonging to vertical and horizontal slits. A description of the segmentation and decoding stage is given in order to exploit the advantages of the pattern codification.

Jordi Pagès, Joaquim Salvi, Carles Matabosch
A New Reference Point Detection Algorithm Based on Orientation Pattern Labeling in Fingerprint Images

Most fingerprint-based biometric systems establish a reference point in a fingerprint, then extract features based on this reference point. As such, the consistency and accuracy of the reference point location considerably affect the overall system performance. Accordingly, this paper presents an accurate and consistent reference point detection algorithm based on orientation pattern labeling. Experimental results demonstrate that the proposed method can produce better performance in terms of accuracy and speed than Poincaré index or sine map-based methods.

Chul-Hyun Park, Sang-Keun Oh, Dong-Min Kwak, Bum-Soo Kim, Young-Chul Song, Kil-Houm Park
Global Motion Estimation in Sprite Generation by Eliminating Local Object Motions

In this paper, a new feature point selection method for the global motion estimation (GME) in sprite generation is proposed. GME for the sprite generation presented in this paper consists of two stages: feature selection, and global motion estimation with selected blocks. First, local object motions are distinguished from the static background. Blocks with local motions are excluded from the subsequent procedure because local object motions would not be helpful to GME and are often even harmful to the exact motion estimation. Note that sprite generation mainly concerns the generation of the static background for a sequence of image frames. To identify local motions, conventional block-based motion estimation is performed for the blocks in the current frame. If a block has a residual error greater than a threshold, it is considered to contain an object with local motions and is excluded from the subsequent procedure. Note that a large residual error of a block implies a change in the shape of the object, so the block image could not be a part of the static background. The second stage extracts edges in the image, excluding the blocks selected in the first step, and these edges are used for GME. Experiments show the proposed algorithm performs faster in selected images than existing methods with improved objective/subjective quality.

Sungchan Park, Miyoung Kim, Gueesang Lee
Automatic Inspection of Wooden Pallets Using Contextual Segmentation Methods

This paper presents a comparative study of several well-known and thoroughly tested techniques for the segmentation of textured images, including two algorithms belonging to the adaptive Bayesian family of restoration and segmentation methods, a probabilistic relaxation process, and a novel approach based on the recently introduced concept of the frequency histogram of connected elements. The application domain chosen for comparison purposes is the problem of detecting very thin cracks (around 1 mm wide) in the wooden boards of used pallets, where a tricky balance between the crack detection and false alarm ratios must be guaranteed. After a brief description of each segmentation method and their respective application to the problem at hand, the paper discusses the comparative results, showing the excellent performance achieved with the frequency histogram of connected elements, which can be considered an attractive and versatile novel instrument for the analysis and recognition of textured images.

Miguel Ángel Patricio, Darío Maravall
Gabor Wavelets and Auto-organised Structures for Directional Primitive Extraction

This paper describes a computational framework developed for the extraction of low-level directional primitives present in an image, and their subsequent organization using the laws of perceptual grouping. The system is divided into two stages. The first one consists of the extraction of the direction of pixels in the image, through an efficient implementation of Gabor wavelet decomposition. The second one consists of the reduction of these high-dimensionality results by means of an auto-organized structure. For this second stage, three different auto-organized structures have been studied: self-organized maps (SOM), growing cell structures (GCS) and growing neural gas (GNG). Results have shown that GCS is the most appropriate structure in the context of this work.

Marta Penas, María J. Carreira, Manuel G. Penedo
A Colour Tracking Procedure for Low-Cost Face Desktop Applications

In this paper we present an environment for tracking a human face in a real video sequence. We describe the system and discuss the advantages and disadvantages of our approach. We focus mainly on locating the main features of the human face (eyes, eyebrows, nose and mouth). The tracking algorithm and the subsequent animation of the synthetic model must guarantee a real-time response without the need for any additional markers on the actor. Due to the complexity of the process, we make an initial selection of the facial attributes involved without any loss of efficiency or robustness. We define a probabilistic model of the facial skin area and track this region through the sequence of images. In parallel, we propose additional criteria for searching inside this tracked area for the main features of the human face (such as lips, eyes, eyebrows, nose, etc.). The tracking algorithm is based on an efficient implementation of the continuously adaptive mean shift procedure (CAMSHIFT), and this process is further improved by a second step of feature detection. In this paper we present only the overall process, the tracking background criteria and the lip detection procedure. The synthesis phase is outside the scope of this paper; we generate the facial animation parameters (FAP) as input to an MPEG-4 compliant facial animation engine (FAE). This system is designed as a computer interface for controlling commercial computer applications which include avatars or clones in real time.

F. J. Perales, R. Mas, M. Mascaró, P. Palmer, A. Igelmo, A. Ramírez
Local Motion Estimation from Stereo Image Sequences

This paper proposes a method for representing local temporal deformations of a 3D flexible surface in an orthogonal space from a sequence of stereo images. The approach uses a disparity space as the main space in order to represent all the 3D information. The local motions are estimated removing the rigid motions from the global motion in the disparity space. A robust algorithm based on the RANSAC approach is used to estimate the rigid motions through the image sequence. An incremental SVD algorithm is used to estimate the representation space of the local motions as data is received. The approach presented in this paper is valid for any type of camera.

N. Pérez de la Blanca, J. M. Fuertes, M. Lucena, A. Garrido
A Change Validation System for Aerial Images Based on a Probabilistic Latent Variable Model

Change detection is an important part of image interpretation and automated geographical data collection. In this paper, we show a quality control system for the verification of image changes detected by a human operator. The system is based on a probabilistic system that learns the operator's behaviour and tests the changes found. Maximum likelihood estimators for the model are presented and their derivation is shown. Computational results are given with real image data that show the performance of the system.

Fernando Pérez Nava
Petrographic Classification at the Macroscopic Scale Using a Mathematical Morphology Based Approach

A novel methodology for the automatic classification of the different textural classes that constitute a rock at the macroscopic scale is presented in this paper. The methodology starts with the segmentation of elementary textural units of the image followed by their classification, whose feature space partition results from the geometric modelling of the training sets. This approach uses mainly mathematical morphology operators and is tested with images of macroscopic polished surfaces of 14 types of Portuguese grey granites.

Pedro Pina, Teresa Barata
On the Efficiency of Luminance-Based Palette Reordering of Color-Quantized Images

Luminance-based palette reordering is often considered less efficient than other more complex approaches when it comes to improving the compression of color-indexed images. In this paper, we provide experimental evidence that, for color-quantized natural images, this may not always be the case. In fact, we show that, for dithered images with 128 colors or more, luminance-based reordering outperforms other more complex methods.

Armando J. Pinho, António J. R. Neves
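
As an illustration of what luminance-based palette reordering means in the abstract above, the sketch below sorts a palette by luminance and remaps the index image so that every pixel still decodes to the same colour. It is a minimal example under stated assumptions, not the authors' implementation: the BT.601 luma weights, the function names and the toy 4-colour image are all choices made here for illustration.

```python
def luminance(rgb):
    """Approximate luminance of an (R, G, B) triple using BT.601 luma weights."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def reorder_palette_by_luminance(palette, indices):
    """Sort the palette from dark to bright and remap the index image to match."""
    # order[new_index] = old_index
    order = sorted(range(len(palette)), key=lambda i: luminance(palette[i]))
    new_palette = [palette[i] for i in order]
    # remap[old_index] = new_index
    remap = {old: new for new, old in enumerate(order)}
    new_indices = [[remap[i] for i in row] for row in indices]
    return new_palette, new_indices


# Toy example: white, black, red and blue in arbitrary palette order.
palette = [(255, 255, 255), (0, 0, 0), (200, 30, 30), (30, 30, 200)]
image = [[0, 1, 2],
         [3, 2, 0]]
new_palette, new_image = reorder_palette_by_luminance(palette, image)
```

After reordering, neighbouring index values correspond to colours of similar brightness, which is the property that makes the index image smoother and easier for a lossless coder to predict.
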
Feature-Driven Recognition of Music Styles

In this paper the capability of using self-organising neural maps (SOM) as music style classifiers of musical fragments is studied. From MIDI files, the monophonic melody track is extracted and cut into fragments of equal length. From these sequences, melodic, harmonic, and rhythmic numerical descriptors are computed and presented to the SOM. Their performance is analysed in terms of separability in different music classes from the activations of the map, obtaining different degrees of success for classical and jazz music. This scheme has a number of applications like indexing and selecting musical databases or the evaluation of style-specific automatic composition systems.

Pedro J. Ponce de León, José M. Iñesta
Multiple Model Approach to Deformable Shape Tracking

This paper describes a new proposal for tracking deformable objects in video sequences using multiple shape models of heterogeneous dimensionality. These models are generated in an unsupervised manner from a training sequence, and used to estimate the shape of an object over time by means of a novel tracking framework. This framework is based on estimating the rigid and non-rigid shape transformations in two separate but related processes. The advantage of proceeding in this way is that the a priori knowledge contained in the learned models is better exploited, resulting in a more reliable tracking performance. The Condensation algorithm is used to estimate the rigid transformation of the shape, while the non-rigid shape deformation is determined by combining the response of several Kalman Filters. The proposal is evaluated by tracking a synthetic form and the silhouette of a pedestrian.

Daniel Ponsa, Xavier Roca
Pixel-Based Texture Classification by Integration of Multiple Texture Feature Evaluation Windows

A wide variety of texture feature extraction methods have been proposed for texture based image classification and segmentation. These methods are typically evaluated over windows of the same size, the latter being usually chosen for each particular method on an experimental basis. This paper shows that pixel-based texture classification can be significantly improved by evaluating a given texture method over multiple windows of different size and then by integrating the results through a classical Bayesian scheme. The proposed technique has been applied to well-known families of texture methods that are frequently utilized for feature extraction from textured images. Experiments show that the integration of multisized windows yields lower classification errors than when optimal single-sized windows are considered.

Doménec Puig, Miguel Angel García
Vessel Segmentation and Branching Detection Using an Adaptive Profile Kalman Filter in Retinal Blood Vessel Structure Analysis

This paper presents an improved tracking based method for retinal vessel segmentation that uses blood vessel morphology to adapt the tracking parameters. The method includes branching detection and avoidance methods. A bi-level threshold method, based on local vessel information, is used for segmentation. Tracking is based on Kalman filtering. The results are compared with existing ground truth. It is concluded that ground truth segmentation is not easily comparable.

Pedro Quelhas, James Boyce
Reconstruction of Quadrics from Two Polarization Views

This paper addresses the problem of reconstructing texture-less objects of quadric like shape. It is known that a quadric can be uniquely recovered from its apparent contours in three views. But, in the case of only two views the reconstruction is a one parameter family of quadrics. Polarization imaging provides additional geometric information compared to simple intensity based imaging. The polarization image encodes the projection of the surface normals onto the image and therefore provides constraints on the surface geometry. In this paper it is proven that two polarization views of a quadric contain sufficient information for a complete determination of its shape. The proof itself is constructive leading to a closed-form solution for the quadric. Additionally, an indirect algorithm is presented which uses both polarization and apparent contours. By experiments it is shown that the presented algorithm produces accurate reconstruction results.

Stefan Rahmann
Some Results about the Use of Tree/String Edit Distances in a Nearest Neighbour Classification Task

In pattern recognition there is a variety of applications where the patterns are classified using edit distance. In this paper we present some results comparing the use of tree and string edit distances in a handwritten character recognition task. Experiments with different numbers of classes and classifiers are reported.

Juan Ramón Rico-Juan, Luisa Micó
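
To make the string half of the comparison in the abstract above concrete, here is a minimal sketch of nearest neighbour classification with the classical Levenshtein string edit distance (unit cost for insertion, deletion and substitution). It is not the authors' setup: the unit costs, the prototype strings (imagined here as contour chain codes) and the function names are assumptions for illustration, and tree edit distance is not shown.

```python
def edit_distance(a, b):
    """Levenshtein distance between sequences a and b, computed row by row."""
    previous = list(range(len(b) + 1))  # distance from the empty prefix of a
    for i, symbol_a in enumerate(a, start=1):
        current = [i]
        for j, symbol_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (0 if symbol_a == symbol_b else 1)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


def nearest_neighbour_label(query, prototypes):
    """Return the label of the prototype string closest to the query."""
    closest = min(prototypes, key=lambda item: edit_distance(query, item[0]))
    return closest[1]


# Hypothetical labelled prototypes, e.g. chain codes of character contours.
prototypes = [("0011223", "A"), ("0707665", "B")]
label = nearest_neighbour_label("0011233", prototypes)
```

The query differs from the first prototype by a single substitution, so it is assigned that prototype's label.
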
Line Detection Using Ridgelets Transform for Graphic Symbol Representation

Retrieval and recognition of symbols in graphic images requires a good symbol representation, able to identify those features providing the most relevant information about the shape and visual appearance of symbols. In this work we introduce the Ridgelets transform because it can detect linear singularities in an image, which are the most important source of information in graphic images. Sparsity is one of the most important properties of the Ridgelets transform, and it permits us to extract a set of descriptors based on the angle and the distance to the origin of every straight line. We show how this representation can be normalized to make it invariant to translation, rotation and scaling of the symbol. We present some preliminary results showing the usefulness of this representation with a set of architectural symbols.

Oriol Ramos Terrades, Ernest Valveny
Tree-Structured Representation of Musical Information

The success of the Internet has filled the net with lots of symbolic representations of music works. Two kinds of problems arise for the user: content-based search of music and the identification of similar works. Both belong to the pattern recognition domain. In contrast to most of the existing approaches, we pose a non-linear representation of a melody, based on trees that express the metric and rhythm of music in a natural way. This representation provides a number of advantages: more musical significance, a more compact representation and others. Here we have worked on the comparison of melodies for identification.

David Rizo, José Manuel Iñesta, Francisco Moreno-Seco
Comparative Study of the Baum-Welch and Viterbi Training Algorithms Applied to Read and Spontaneous Speech Recognition

In this paper we compare the performance of acoustic HMMs obtained through Viterbi training with that of acoustic HMMs obtained through the Baum-Welch algorithm. We present recognition results for discrete and continuous HMMs, for read and spontaneous speech databases, acquired at 8 and 16 kHz. We also present results for a combination of Viterbi and Baum-Welch training, intended as a trade-off solution. Though Viterbi training yields a good performance in most cases, sometimes it leads to suboptimal models, especially when using discrete HMMs to model spontaneous speech. In these cases, Baum-Welch proves more robust than both Viterbi training and the combined approach, compensating for its high computational cost. The proposed combination of Viterbi and Baum-Welch only outperforms Viterbi training in the case of read speech at 8 kHz. Finally, when using continuous HMMs, Viterbi training proves as good as Baum-Welch at a much lower cost.

Luis Javier Rodríguez, Inés Torres
Skin Lesions Diagnosis Based on Fluorescence Image Processing: Simple Parameters Scanning

This paper studies the viability of a skin lesion diagnosis scheme based on fluorescence images. Three kinds of skin lesions are considered: actinic keratosis, basal cell carcinoma and psoriasis. A wide and diverse set of simple parameters has been extracted and their discrimination potential is evaluated through an automatic diagnosis scheme based on the k-nearest neighbours. We use a sequential scanning technique that automatically selects the most relevant parameters for the addressed problem.

E. Ros, M. M. Rodriguez, S. Mota, J. L. Bernier, I. Rojas, C. G. Puntonet, E. W. Lang
Incrementally Assessing Cluster Tendencies with a Maximum Variance Cluster Algorithm

This paper proposes a straightforward and efficient way to discover clustering tendencies in data using a recently introduced Maximum Variance Clustering algorithm. The approach shares the benefits of the plain clustering algorithm with regard to other approaches for clustering. Experiments using both synthetic and real data have been performed in order to evaluate the differences between the proposed methodology and the plain use of the Maximum Variance algorithm. According to the results obtained, the proposal constitutes an efficient and accurate alternative.

Krzysztof Rządca, Francesc J. Ferri
Musical Style Recognition by Detection of Compass

The present paper describes a method for the automatic classification of musical styles from the musical signal. This method can be used for searching and indexing purposes in musical databases, which constitutes one of the applications of MPEG-7. The method is based on applying HMM models to detect the compass in a way very similar to what is done for continuous speech recognition. The experiments demonstrate the suitability of HMMs for the recognition of musical styles.

F. J. Salcedo, J. E. Díaz-Verdejo, J. C. Segura
Subtexture Components for Texture Description

In this paper the problem of texture description for image browsing or annotation is approached. Previous works in this direction have proposed solutions that have proven to be limited due to the high degree of complexity natural textures can achieve. This problem is solved here by defining textures as a combination of several subtexture components, whose description is simpler since they only have one characteristic element. A computational method based on multiscale filtering with Laplacian of Gaussian is presented to identify the subtexture components of a texture, and a texture description based on the attributes of these subtexture components is given.

Anna Salvatella, Maria Vanrell, Ramon Baldrich
A 3D Ultrasound System for Medical Diagnosis

This paper presents a system for 3D ultrasound which aims to reconstruct a volume of interest from a set of ultrasound images. A Bayesian reconstruction algorithm has been recently proposed to perform this task. However, it is too slow to be useful in practice. This paper describes several techniques to improve the efficiency of the reconstruction procedure, based on multi-scale principles and on the expansion of the likelihood function in a Taylor series. This allows the use of sufficient statistics, which avoids processing all the images in each iteration and leads to a space-varying recursive filter designed according to the statistical properties of the data. Experimental results are provided to assess the performance of the proposed algorithms in medical diagnosis.

João Sanches, Jorge S. Marques, Fausto Pinto, Paulo J. Ferreira
Shot Segmentation Using a Coupled Markov Chains Representation of Video Contents

We present a shot segmentation method based on the representation of visual contents in video using a coupled Markov chains approach. This representation allows us to combine different image features and to keep information about all the images since the beginning of the shot, instead of simply comparing adjacent frames. We also define an adaptive detection threshold that depends on the distance measures that are obtained, instead of trying to find a fixed threshold. Results show that the combination of color and motion image features in the same representation provides a more robust detection of shot boundaries than using each feature separately.

Juan M. Sánchez, Xavier Binefa
Estimating Confidence Measures for Speech Recognition Verification Using a Smoothed Naive Bayes Model

Verification in speech recognition systems can be seen as a conventional pattern classification problem in which each hypothesized word is to be transformed into a feature vector and then classified as either correct or incorrect. Thus, our basic problems are to find appropriate pattern features and to design an accurate pattern classifier. In this paper, we present a new feature and a smoothed naive Bayes classification model. Experimental results are reported comparing the new feature with a set of well-known features. The best performance is obtained using the new feature in combination with Acoustic Stability.

Alberto Sanchis, Alfons Juan, Enrique Vidal
A New Wavelets Image Fusion Strategy

The aim of image fusion is to combine information from multiple images of the same scene. The result of image fusion is a new image, which is more suitable for human and machine perception or further image-processing tasks such as segmentation, feature extraction and object recognition. Different fusion methods have been proposed in the literature. This paper presents new methods based on the computation of local and global gradient. A comparative analysis is carried out against other existing strategies. The results are encouraging.

M. Santos, G. Pajares, M. Portela, J. M. de la Cruz
Quality Assessment of Manufactured Roof-Tiles Using Digital Sound Processing

This paper describes a method for automatically assessing the quality of manufactured roof-tiles using digital audio signal processing and pattern recognition techniques. A prototype system has been developed that is based on a mixed PC/DSP platform, where the real-time constraint is one of the key issues. The suitability of the classification process for implementation in an industrial environment is also addressed.

Vasco C. F. Santos, Miguel F. M. Sousa, Aníbal J. S. Ferreira
Graph Representations for Web Document Clustering

In this paper we describe clustering of web documents represented by graphs rather than vectors. We present a novel method for clustering graph-based data using the standard k-means algorithm and compare its performance to the conventional vector-model approach using cosine similarity. The proposed method is evaluated when using five different graph representations under two different clustering performance indices. The experiments are performed on two separate web document collections.

Adam Schenker, Mark Last, Horst Bunke, Abraham Kandel
Learning Decision Trees and Tree Automata for a Syntactic Pattern Recognition Task

Decision trees have been widely used for different tasks in artificial intelligence and data mining. Tree automata have been used in pattern recognition tasks to represent some features of objects to be classified. Here we propose a method that combines both approaches to solve a classical problem in pattern recognition, Optical Character Recognition. The method is organized in two stages: (1) we use a grammatical inference technique to represent some structural features of the characters, and (2) we obtain edit distances between characters in order to design a decision tree. The combination of both methods benefits from their individual characteristics and is formulated as a coherent unifying strategy.

José M. Sempere, Damián López
MOTRICO Project – Geometric Construction and Mesh Generation of Blood Vessels by Means of the Fusion of Angiograms and IVUS

The MOTRICO project plans the development of an advanced environment that will offer computer assistance for cardiac therapy and diagnosis, which would be useful in the hemodynamics units of those hospitals that have access to the instrumental techniques of Angiography and Intravascular Ultrasound. The work presented in this paper describes the modeling and finite element mesh generation of an anatomically realistic model of the human left coronary artery bifurcation. The computational geometric model has been developed on the basis of real anatomical information, and it has a coronary vessel segment developed on the basis of the information obtained by means of the fusion of angiograms and intravascular ultrasound (IVUS) images.

Francisco J. Seron, Elsa Garcia, Jorge del Pico
Skipping Fisher’s Criterion

A new version of Fisher’s discriminant analysis (FDA) is introduced in this paper. Our algorithm also searches for a reduced space in which patterns can be discriminated. However, no intermediate class separability criterion (such as Fisher’s mean distance divided by variance) is used whatsoever. Classification performance is optimized directly. Since no statistical hypotheses are made, the method is of general applicability. Our evolutionary approach for optimization makes the number of projections and classes independent of each other. Even different numbers of projections, not necessarily the means, can be used for each class. As a proof of concept, the UCI thyroid problem (three classes) is solved in one dimension instead of two with state-of-the-art performance, making use of only three of the 21 original features.

A. Sierra, A. Echeverría
New Class of Filters for Impulsive Noise Removal in Color Images

In this paper a novel approach to impulsive noise reduction in color images, based on nonparametric density estimation, is presented. The basic idea behind the new image filtering technique is the maximization of the similarities between pixels in a predefined filtering window. The new method is faster than the standard vector median filter and better preserves edges and fine image details. Simulation results show that the proposed method outperforms standard algorithms for impulsive noise reduction in color images.
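The standard vector median filter used as the baseline can be sketched as follows. This shows the baseline only, not the proposed filter, and it assumes a Euclidean distance and a window given as a list of color tuples.

```python
import math

def vector_median(window):
    """Return the pixel with the smallest summed Euclidean distance to the others."""
    def total_distance(p):
        return sum(math.dist(p, q) for q in window)
    # The output is always one of the input pixels, so no new colors are created.
    return min(window, key=total_distance)
```

Applied to `[(10, 10, 10), (12, 11, 9), (11, 10, 10), (255, 0, 0), (10, 12, 11)]`, the function returns `(11, 10, 10)`: the impulse `(255, 0, 0)` is far from every other pixel and is never selected.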

Bogdan Smolka
On the Nonparametric Impulsive Noise Reduction in Multichannel Images

This paper presents a new filtering scheme for the removal of impulsive noise in multichannel images. It is based on estimating the probability density function for image pixels in a filtering window by means of the kernel density estimation method. The filtering algorithm itself is based on the comparison of pixels with their neighborhood in a sliding filter window. The quality of noise suppression and detail preservation of the new filter is measured quantitatively in terms of the standard image quality criteria. The filtering results obtained with the new filter show its excellent ability to reduce noise while simultaneously preserving fine image details.

Bogdan Smolka
Robust Hash Functions for Visual Data: An Experimental Comparison

Robust hash functions for visual data need a feature extraction mechanism to rely on. We experimentally compare spatial and transform domain feature extraction techniques and identify the global DCT combined with the cryptographic hash function MD-5 to be suited for visual hashing. This scheme offers robustness against JPEG2000 and JPEG compression and qualitative sensitivity to intentional global and local image alterations.

Champskud J. Skrepth, Andreas Uhl
Error Concealment Using Discontinuity Features

When a compressed video bit-stream is transmitted over the Internet, packet loss causes error propagation in both the spatial and temporal domains, which in turn severely degrades image quality. In this paper, a new error concealment algorithm is proposed to repair damaged portions of the video frames at the receiver. The conventional Boundary Matching Algorithm (BMA) assumes that the pixels on the boundary of the missing block and of its neighboring blocks are very similar, but it does not consider edges crossing the boundary. In our approach, edges are detected across the boundary of the lost or erroneous block. Once the orientation of each edge is found, only the pixel differences along the expected edges across the boundary are measured, instead of the differences between all adjacent pixels on the boundary. The proposed approach therefore needs very few computations, and experiments show improved performance over the conventional BMA in both subjective and objective quality of video sequences.

NamRye Son, GueeSang Lee
Geometric Properties of the 3D Spine Curve

Through a 3D reconstruction of the human back surface using structured light techniques, we study the properties of the spine curve by means of a set of parameters related to measures commonly applied in medicine. In this way, descriptors for measuring the abnormalities in the projections of the front and sagittal planes can be computed. We build the spine curve in 3D and analyse the behaviour of the Frenet frame along the curve where the deformation processes of idiopathic scoliosis appear.

J. M. Sotoca, M. Buendía, J. M. Iñesta, F. J. Ferri
Oriented Matroids for Shape Representation and Indexing

In this paper a novel method for indexing views of 3D objects is presented. The topological properties of the regions of the segmented images of the objects are used to define an index based on oriented matroid theory. Oriented matroids, which are projective invariants, encode incidence relations and relative position of the elements of the image and give local and global topological information about their spatial distribution. This indexing technique is applied to 3D object hypothesis generation from single views to reduce the number of candidates in object recognition processes.

E. Staffetti, A. Grau, F. Serratosa, A. Sanfeliu
Combining Phrase-Based and Template-Based Alignment Models in Statistical Translation

In statistical machine translation, single-word based models have an important deficiency: they do not take contextual information into account for the translation decision. A possible solution, the phrase-based approach, consists of translating a sequence of words instead of a single word. We show how this approach obtains interesting results in some corpora. One shortcoming of the phrase-based alignment models is that they lack generalization capability in word reordering. A possible solution could be the template-based approach, which uses sequences of classes of words instead of sequences of words. We present a template-based alignment model that uses a Part Of Speech tagger for word classes. We also propose an improved model that combines both models. The basic idea is that if a sequence of words has been seen in training, the phrase-based model can be used; otherwise, the template-based model can be used. We present the results from different tasks.

Jesús Tomás, Francisco Casacuberta
The Log-polar Image Representation in Pattern Recognition Tasks

This paper is a review of works about the use of the log-polar image model for pattern recognition purposes. Particular attention is paid to the rotation- and scale-invariant pattern recognition problem, which is simplified by the log-polar mapping. In spite of this advantage, ordinary translations become a complicated image transform in the log-polar domain. Two approaches addressing the estimation of translation, rotation and scaling are compared. One of them, developed by the authors, takes advantage of the principles of the active vision paradigm.
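The simplification mentioned in the abstract can be seen in a minimal sketch of the mapping itself. The function name and the choice of mapping centre are assumptions for illustration: a scaling about the centre adds a constant to the log-radius, and a rotation about the centre adds a constant to the angle.

```python
import math

def to_log_polar(x, y, cx=0.0, cy=0.0):
    """Map Cartesian (x, y) to (log-radius, angle) about the centre (cx, cy)."""
    dx, dy = x - cx, y - cy
    # math.log raises ValueError at the centre itself, where the radius is 0.
    return math.log(math.hypot(dx, dy)), math.atan2(dy, dx)
```

Scaling the point (3, 4) by 2 shifts only the first coordinate, by log 2, and rotating it by 90 degrees to (-4, 3) shifts only the second, by π/2. A translation changes `dx` and `dy` for every point differently, which is why it does not reduce to a simple shift in this domain.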

V. Javier Traver, Filiberto Pla
Appearance Tracking for Video Surveillance

We present an algorithm which tracks multiple objects for video surveillance applications. This algorithm is based on a Bayesian framework and a particle filter. In order to use this method in practical applications, we define a statistical model of the object appearance to build a robust likelihood function. The tracking process is based only on image data; therefore, a previous step to learn the object shape and motion parameters is not necessary. Using the localization results, we can define a prior density which is used to initialize the algorithm. Finally, our method has been tested successfully on several sequences, and it is more accurate than classical filters.

Javier Varona, Jordi Gonzàlez, F. Xavier Roca, J. J. Villanueva
Sinusoidal Modelling with Complex Exponentials for Speech and Audio Signals

In this paper we propose a new approach based on energy-adaptive matching pursuits to improve sinusoidal modelling of speech and audio signals for coding and recognition purposes. To reduce the complexity of the algorithm, an over-complete dictionary composed of complex exponentials is used and an efficient implementation is presented. An analysis-synthesis windows scheme that avoids overlapping is proposed, too. Experimental results show evidence of the advantages of the proposed method for sinusoidal modelling of speech and audio signals compared to some others proposed in the literature.

P. Vera-Candeas, N. Ruiz-Reyes, D. Martinez-Muñoz, J. Curpian-Alonso, M. Rosa-Zurera, M. J. Lucena-Lopez
Smoothing Techniques for Tree-k-Grammar-Based Natural Language Modeling

In previous work, a new probabilistic context-free grammar (PCFG) model for natural language parsing, derived from a tree bank corpus, was introduced. The model estimates the probabilities according to a generalized k-grammar scheme for trees. It allows for faster parsing, considerably decreases the perplexity of the test samples and tends to give more structured and refined parses. However, it suffers from the problem of incomplete coverage. In this paper, we compare several smoothing techniques, such as backing-off or interpolation, that are used to avoid assigning zero probability to any sentence.
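The idea behind interpolation can be sketched on ordinary word bigrams. This is a simplified stand-in: the paper applies smoothing to tree k-grammars, and the fixed weight `lam` and the function name are assumptions made for illustration.

```python
from collections import Counter

def interpolated_bigram(tokens, lam=0.7):
    """Build P(word | prev) as a fixed mix of bigram and unigram estimates."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    total = len(tokens)

    def prob(prev, word):
        p_uni = unigrams[word] / total
        # Fall back to 0 for the bigram term when prev was never observed.
        p_bi = bigrams[(prev, word)] / unigrams[prev] if unigrams[prev] else 0.0
        return lam * p_bi + (1 - lam) * p_uni

    return prob
```

With training tokens `a b a c`, the unseen pair (`b`, `c`) still receives probability 0.3 × 0.25 = 0.075 from the unigram term, so a sentence containing it is not assigned zero probability. Words never seen in training would still need separate handling.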

Jose L. Verdú-Mas, Jorge Calera-Rubio, Rafael C. Carrasco
Colour Constancy Algorithm Based on the Minimization of the Distance between Colour Histograms

Colour is an important clue in many applications in machine vision and image processing.

Jaume Vergés-Llahí, Alberto Sanfeliu
Video Segmentation for Traffic Monitoring Tasks Based on Pixel-Level Snakes

In this paper we address a moving object segmentation technique for a video monitoring system. This is approached by means of active contours, which appear to be an efficient tool for spatio-temporal data analysis from 2D image sequences. In particular, we make use of a new active contour concept, pixel-level snakes, whose characteristics allow a high degree of control over the contour evolution and handle topological transformations with a low computational cost. The proposal is focused on traffic monitoring and incident detection systems.

D. L. Vilariño, D. Cabello, X. M. Pardo, V. M. Brea
Optimal Detection of Symmetry Axis in Digital Chest X-ray Images

We present a method for detecting the axis of bilateral symmetry in a digital chest X-ray image and subsequently measuring the degree of symmetry of the image. The detection is achieved by analysing rotated-reflected digital chest X-ray images and it is posed as a global optimization problem solved with a probabilistic genetic algorithm (PGA). The global search is initially based on natural peak orientation information related to the orientation of the symmetry axis. Only a few generations of the PGA are needed to achieve convergence for all the images in the database. This method is applied directly to the intensity input image and does not require any prior segmentation.

Carlos Vinhais, Aurélio Campilho
Low Complexity Motion Estimation Based on Spatio-temporal Correlations

Motion estimation techniques have been developed to remove the temporal redundancy contained in a sequence of images. However, the high computational complexity of the problem makes such techniques very difficult to apply to high-resolution applications in a real-time environment. For this reason, low complexity motion estimation algorithms are viable solutions. In this paper, we present an efficient algorithm that exploits temporally and spatially correlated motion information to define the search pattern and the location of the search starting point adaptively. Experiments show a speedup of up to 2 to 3 times over the Diamond Search (DS) and HEXagon-Based Search (HEXBS) algorithms, and an image quality improvement of up to 0.1 to 1 dB.

Hyo Sun Yoon, Guee Sang Lee
Set-Permutation-Occurrence Matrix Based Texture Segmentation

We have investigated a combination of statistical modelling and expectation maximisation for a texture based approach to the segmentation of mammographic images. Texture modelling is based on the implicit incorporation of spatial information through the introduction of a set-permutation-occurrence matrix. Statistical modelling is used for data generalisation and noise removal purposes. Expectation maximisation modelling of the spatial information in combination with the statistical modelling is evaluated. The resulting segmentations are used for automatic mammographic risk assessment.

Reyer Zwiggelaar, Lilian Blot, David Raba, Erika R. E. Denton
Semi-automatic Segmentation of the Prostate

A semi-automatic method has been developed which segments the prostate in slices of Magnetic Resonance Imaging (MRI) data. The developed approach exploits the characteristics of the anatomical shape of the prostate when represented in a polar transform space. Simple techniques, such as line detection and non-maximum suppression, are used to track the boundary of the prostate. The initial results, based on a small set of data, indicate a good correlation with expert based manual segmentation.

Reyer Zwiggelaar, Yanong Zhu, Stuart Williams
Entropy Vector Median Filter

In this paper, a new adaptive multichannel filter for the detection and removal of impulsive noise, bit errors and outliers in digital color images is presented. The proposed nonlinear filter takes advantage of the concept of local entropy contrast and of robust order-statistics theory. The new entropy based vector median is computationally attractive, robust over a wide range of impulsive noise corruption, and significantly improves the signal-detail preservation capability of the standard vector median filter.

Rastislav Lukac, Bogdan Smolka, Konstantinos N. Plataniotis, Anastasios N. Venetsanopoulos
That Elusive Diversity in Classifier Ensembles

Is “useful diversity” a myth? Many experiments and the little available theory on diversity in classifier ensembles are either inconclusive, too heavily assumption-bound or openly non-supportive of the intuition that diverse classifiers fare better than non-diverse ones. Although a rough general tendency was confirmed in our previous studies, no prominent link appeared between diversity of the ensemble and its accuracy. Diversity alone is a poor predictor of the ensemble accuracy. But there is no agreed definition of diversity to start with! Can we borrow a concept of diversity from biology? How can diversity, as far as we can define and measure it, be used to improve the ensemble? Here we argue that even without a clear-cut definition and theory behind it, studying diversity may prompt viable heuristic solutions. We look into some ways in which diversity can be used in analyzing, selecting or training the ensemble.
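To make "measuring diversity" concrete, here is a sketch of one simple pairwise measure from the ensemble literature, the disagreement between two classifiers. The abstract stresses that there is no agreed definition, so this is only one possible choice and not necessarily the one the paper favours; the function name and the boolean input format are assumptions.

```python
def disagreement(correct_a, correct_b):
    """Fraction of samples on which exactly one of two classifiers is correct."""
    if len(correct_a) != len(correct_b):
        raise ValueError("both classifiers must be scored on the same samples")
    differing = sum(a != b for a, b in zip(correct_a, correct_b))
    return differing / len(correct_a)
```

For correctness vectors `[True, True, False, False]` and `[True, False, True, False]` the disagreement is 0.5. A value of 0 means the two classifiers succeed and fail on exactly the same samples.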

Ludmila I. Kuncheva
Backmatter
Metadata
Title
Pattern Recognition and Image Analysis
Edited by
Francisco José Perales
Aurélio J. C. Campilho
Nicolás Pérez de la Blanca
Alberto Sanfeliu
Copyright year
2003
Publisher
Springer Berlin Heidelberg
Electronic ISBN
978-3-540-44871-6
Print ISBN
978-3-540-40217-6
DOI
https://doi.org/10.1007/b12122