2010 | Book

Advanced Concepts for Intelligent Vision Systems

12th International Conference, ACIVS 2010, Sydney, Australia, December 13-16, 2010, Proceedings, Part I

Edited by: Jacques Blanc-Talon, Don Bone, Wilfried Philips, Dan Popescu, Paul Scheunders

Publisher: Springer Berlin Heidelberg

Book series: Lecture Notes in Computer Science

About this book

This volume collects the papers accepted for presentation at the 12th International Conference on “Advanced Concepts for Intelligent Vision Systems” (ACIVS 2010). Following the first meeting in Baden-Baden (Germany) in 1999, which was part of a large multiconference, the ACIVS conference then developed into an independent scientific event and has ever since maintained the tradition of being a single-track conference. ACIVS 2010 attracted computer scientists from 29 different countries, mostly from Europe, Australia, and the USA, but also from Asia. Although ACIVS is a conference on all areas of image and video processing, submissions tend to gather within certain major fields of interest. This year 3D and depth processing and computer vision and surveillance were popular topics. Noteworthy is the growing number of papers related to theoretical developments. We would like to thank the invited speakers Mubarak Shah (University of Central Florida), Richard Kleihorst (VITO, Belgium), Richard Hartley (Australian National University), and David Suter (Adelaide University) for their valuable contributions.

Table of Contents

Frontmatter

Image Processing and Analysis

A Criterion of Noisy Images Quality

This work describes an objective criterion for estimating the quality of fine details in noisy images in the normalized equal color space. A comparison with the standard PSNR criterion is presented for noisy images.

Sergey V. Sai, Ilya S. Sai, Nikolay Yu. Sorokin
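
The criterion above is benchmarked against the standard PSNR. As background, a minimal PSNR computation in Python; the paper's own criterion, defined in a normalized equal-color space, is not reproduced here.

    import numpy as np

    def psnr(reference, noisy, peak=255.0):
        """Peak signal-to-noise ratio (dB) between a reference image and its
        noisy version; the baseline measure the paper compares against."""
        mse = np.mean((reference.astype(np.float64) - noisy.astype(np.float64)) ** 2)
        if mse == 0:
            return float("inf")
        return 10.0 * np.log10(peak ** 2 / mse)
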
Subjective Evaluation of Image Quality Measures for White Noise Distorted Images

Image quality assessment has diverse applications. A number of image quality measures have been proposed, but none has been proven to be a true representative of human perception of image quality. We have subjectively investigated spectral-distance-based and human-visual-system-based image quality measures for their effectiveness in representing human perception for images corrupted with white noise. Each of the 160 images with various degrees of white noise is subjectively evaluated by 50 human subjects, resulting in 8000 human judgments. On the basis of these evaluations, image-independent human perception values are calculated. The perception values are plotted against the spectral-distance-based and human-visual-system-based image quality measures. The performance of the quality measures is determined by graphical observation and polynomial curve fitting, with the Human Visual System Absolute norm performing best.

Atif Bin Mansoor, Adeel Anwar
Real-Time Retrieval of Near-Duplicate Fragments in Images and Video-Clips

Detection and localization of unspecified similar fragments in random images is one of the most challenging problems in CBVIR (classic techniques focusing on full-image or sub-image retrieval usually fail in such a problem). We propose a new method for near-duplicate image fragment matching using a topology-based framework. The method works on visual data only, i.e. no semantics or a priori knowledge is assumed. Near-duplicity of image fragments is modeled by topological constraints on sets of matched keypoints (instead of the geometric constraints typically used in image matching). The paper reports a time-efficient (i.e. capable of working in real time with a video input) implementation of the proposed method. The application can be run using a mid-range personal computer and a medium-quality video camera.

Andrzej Śluzek, Mariusz Paradowski
Toward the Detection of Urban Infrastructure’s Edge Shadows

In this paper, we propose a novel technique to detect the shadows cast by urban infrastructure, such as buildings, billboards, and traffic signs, using a sequence of images taken from a fixed camera. In our approach, we compute two different background models in parallel: one for the edges and one for the reflected light intensity. An algorithm is proposed to train the system to distinguish between moving edges in general and edges that belong to static objects, creating an edge background model. Then, during operation, a background intensity model allows us to separate moving from static objects. The edges included in the moving objects and those that belong to the edge background model are subtracted from the current image edges. The remaining edges are the ones cast by urban infrastructure. Our method is tested on a typical crossroad scene and the results show that the approach is sound and promising.

Cesar Isaza, Joaquin Salas, Bogdan Raducanu
Neural Image Thresholding Using SIFT: A Comparative Study

The task of image thresholding mainly classifies the image data into two regions, a necessary step in many image analysis and recognition applications. Different images, however, possess different characteristics, making thresholding by one single algorithm very difficult if not impossible. Hence, to optimally binarize a single image, one must usually try more than one threshold in order to obtain maximum segmentation accuracy. This approach can be very complex and time-consuming, especially when a large number of images should be segmented in real time. Generally the challenge arises because any thresholding method may perform well for a certain image class but not for all images. In this paper, a supervised neural network is used to “dynamically” threshold images by learning the suitable threshold for each image type. The thresholds generated by the neural network can be used to binarize the images in two different ways. In the first approach, the scale-invariant feature transform (SIFT) method is used to assign a number of key points to the whole image. In the second approach, SIFT is used to assign a number of key points within a rectangle around the region of interest. The results of each test are compared with the Otsu algorithm, active shape models (ASM), and the level set technique (LS). The neural network is trained using a set of features extracted from medical images randomly selected from a sample set and then tested using the remaining images. This process is repeated multiple times to verify the generalization ability of the network. The average segmentation accuracy is calculated by comparing every segmented image with the corresponding gold standard image.

Ahmed A. Othman, Hamid R. Tizhoosh
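
The abstract describes learning an image-specific threshold from SIFT-derived features. The sketch below is one plausible arrangement, not the authors' pipeline: the per-image feature vector (keypoint count plus response/size statistics) and the small scikit-learn regressor are assumptions made only for illustration.

    import cv2
    import numpy as np
    from sklearn.neural_network import MLPRegressor

    sift = cv2.SIFT_create()  # available in opencv-python >= 4.4

    def keypoint_features(gray):
        """Hypothetical per-image feature vector built from SIFT keypoints:
        keypoint count plus mean/std of keypoint responses and sizes."""
        kps = sift.detect(gray, None)
        if not kps:
            return np.zeros(5)
        resp = np.array([k.response for k in kps])
        size = np.array([k.size for k in kps])
        return np.array([len(kps), resp.mean(), resp.std(), size.mean(), size.std()])

    # Training data (images and expert thresholds) would come from the labelled sample set:
    # X = np.vstack([keypoint_features(img) for img in train_images])
    # net = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000).fit(X, train_thresholds)
    # t = net.predict(keypoint_features(test_image)[None, :])[0]
    # binary = (test_image >= t).astype(np.uint8) * 255
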
Statistical Rail Surface Classification Based on 2D and 2½D Image Analysis

We present an approach to high-resolution rail surface analysis combining 2D image texture classification and 2½D analysis of surface disruptions. Detailed analysis of images of rail surfaces is used to observe the condition of rails and, as a precaution, to avoid rail breaks and further damage. Single rails are observed by a color line scan camera at a high resolution of approximately 0.2 millimeters and under special illumination in order to enable 2½D image analysis. Gabor filter banks are used for 2D texture description and classes are modeled by Gaussian mixtures. A Bayesian classifier, which also incorporates background knowledge, is used to differentiate between surface texture classes. Classes which can be related to surface disruptions are derived from the analysis of the anti-correlation properties between two color channels. Images are illuminated by two light sources mounted at different positions and operating at different wavelengths. Results for data gathered in the Vienna metro system are presented.

Reinhold Huber-Mörk, Michael Nölle, Andreas Oberhauser, Edgar Fischmeister
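
The texture stage combines a Gabor filter bank with per-class Gaussian mixtures and a Bayesian decision. A minimal sketch of that generic pipeline follows; the filter-bank parameters, number of mixture components and priors are illustrative assumptions, not values from the paper.

    import cv2
    import numpy as np
    from sklearn.mixture import GaussianMixture

    def gabor_features(gray, ksize=21, sigma=4.0, lambd=10.0, gamma=0.5):
        """Per-pixel feature vectors from a small Gabor bank (4 orientations);
        all filter parameters here are illustrative."""
        responses = []
        for theta in np.arange(0, np.pi, np.pi / 4):
            kern = cv2.getGaborKernel((ksize, ksize), sigma, theta, lambd, gamma)
            responses.append(cv2.filter2D(gray.astype(np.float32), cv2.CV_32F, kern))
        return np.stack(responses, axis=-1).reshape(-1, len(responses))

    # One Gaussian mixture per surface class, fitted on labelled training pixels;
    # a pixel is then assigned to the class with the highest prior-weighted log-likelihood.
    # gmms = {c: GaussianMixture(n_components=3).fit(gabor_features(img_c)) for c, img_c in train.items()}
    # scores = {c: g.score_samples(gabor_features(test)) + np.log(prior[c]) for c, g in gmms.items()}
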
Salient-SIFT for Image Retrieval

Local descriptors have been widely explored and utilized in image retrieval because of their transformation invariance. In this paper, we propose an improved set of features extracted from local descriptors for more effective and efficient image retrieval. We propose a salient region selection method to detect a human's region of interest (hROI) in an image, which incorporates the Canny edge algorithm and the convex hull method into Itti's saliency model for obtaining hROIs. Our approach is a purely bottom-up process with better robustness. The salient region is used as a window to select the most distinctive features out of the Scale-Invariant Feature Transform (SIFT) features. Our proposed SIFT local descriptors are termed salient-SIFT features. Experimental results show that the salient-SIFT features can characterize human perception well and achieve better image retrieval performance than the original SIFT descriptors while the computational complexity is greatly reduced.

Zhen Liang, Hong Fu, Zheru Chi, Dagan Feng
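
Following the outline above, a simplified sketch that keeps only the SIFT keypoints lying inside the convex hull of the Canny edge points. The Itti saliency-model stage is omitted, so the hull here is only a rough stand-in for the salient region, and the Canny thresholds are assumed values.

    import cv2

    def salient_sift(gray):
        """Keep SIFT keypoints inside the convex hull of the Canny edge map
        (a simplified stand-in for the human region of interest)."""
        edges = cv2.Canny(gray, 100, 200)          # thresholds are illustrative
        pts = cv2.findNonZero(edges)               # N x 1 x 2 edge coordinates
        hull = cv2.convexHull(pts)
        sift = cv2.SIFT_create()
        kps, desc = sift.detectAndCompute(gray, None)
        keep = [i for i, k in enumerate(kps)
                if cv2.pointPolygonTest(hull, k.pt, False) >= 0]
        return [kps[i] for i in keep], desc[keep]
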
Combined Retrieval Strategies for Images with and without Distinct Objects

This paper presents the design of an all-season image retrieval system. The system handles images with and without distinct object(s) using different retrieval strategies. Firstly, based on the visual contrasts and spatial information of an image, a neural network is trained to pre-classify an image into the distinct-object or no-distinct-object category by using the Back Propagation Through Structure (BPTS) algorithm. In the second step, an image with distinct object(s) is processed by an attention-driven retrieval strategy emphasizing distinct objects. On the other hand, an image without distinct object(s) (e.g., a scenery image) is processed by a fusing-all retrieval strategy. An improved performance can be obtained by using this combined approach.

Hong Fu, Zheru Chi, Dagan Feng
Spectral Matching Functions and Ellipse Mappings in Search for More Uniform Chromaticity and Color Spaces

In this study, modifying the CIE xyz color matching functions was considered to achieve a more uniform chromaticity space. New color matching functions resulting both from non-negative tensor factorization and from optimization were combined with two ellipse mapping approaches. In both approaches the original MacAdam ellipses were mapped to the new space. The first mapping approach depended on the dominant wavelengths and the second one on the spectral information for the five points on the locus of each ellipse. Equal semiaxis lengths (a constant radius) and equal areas for the mapped MacAdam ellipses were the characteristics for the uniformity of the new chromaticity space. The new color matching functions were modelled with non-uniform rational B-splines (NURBS) and the optimization modified the independent parameters, namely the control points, of the NURBS. The cost function was based on the size and shape of the mapped MacAdam ellipses. NURBS were also utilized as a smoothing operator when the color matching functions were directly output from the optimization task. The results indicate that the modified color matching functions yield a more uniform chromaticity space. There still remains uncertainty about the ellipse mapping approaches and the formulation of the cost function in the optimization tasks.

Maryam Pahjehfouladgaran, Arto Kaarna
Anatomy-Based Registration of Isometrically Transformed Surfaces Using Geodesic Area Functionals

A novel method for registration of isometrically transformed surfaces is introduced. The isometric transformation is locally decomposed into a sequence of low-order transformations after manual analysis and partition of the template surface into its elementary parts. The proposed method employs geodesic moments, first, to find corresponding key points, and second, to generate matching regions for each of the object's parts. The local transformation is estimated using second-order moments of the corresponding regions. The operation of the method is demonstrated on the TOSCA dog object.

Boaz Vigdor, Joseph M. Francos
Trabecular Bone Anisotropy Characterization Using 1D Local Binary Patterns

This paper presents a new method to characterize the texture of gray level bone radiographic images. The technique is inspired by the Local Binary Pattern descriptor, which has classically been applied to two dimensional (2D) images. Our algorithm is a derived solution for the 1D projected fields of the 2D images. The method requires a series of preprocessing steps on the images. A clinical study is conducted on two populations of osteoporotic and control patients. The results show the ability of our technique to discriminate the two populations better than the classical LBP method. Moreover, they show that the structural organization of bone is more anisotropic for the osteoporotic cases than for the control cases, in accordance with the natural evolution of bone tissue linked to osteoporosis.

Lotfi Houam, Adel Hafiane, Rachid Jennane, Abdelhani Boukrouche, Eric Lespessailles

Segmentation and Edge Detection

Watershed Based Document Image Analysis

Document image analysis is used to segment and classify regions of a document image into categories such as text, graphic and background. In this paper we first review existing document image analysis approaches and discuss their limits. Then we adapt the well-known watershed segmentation in order to obtain a very fast and efficient classification. Finally, we compare our algorithm with three others, by running all the algorithms on a set of document images and comparing their results with a ground-truth segmentation designed by hand.

Results show that the proposed algorithm is the fastest and obtains the best quality scores.

Pasha Shadkami, Nicolas Bonnier
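
A generic marker-controlled watershed in scikit-image, to make the adapted operation concrete; the document-specific marker selection and the text/graphic/background classification are the paper's contribution and are not reproduced, and the gray-level cut-offs below are assumed values.

    import numpy as np
    from skimage.filters import sobel
    from skimage.segmentation import watershed

    def watershed_regions(gray):
        """Flood the gradient image from two crude marker sets:
        dark pixels (assumed ink) and bright pixels (assumed paper)."""
        gradient = sobel(gray.astype(float))
        markers = np.zeros_like(gray, dtype=np.int32)
        markers[gray < 50] = 1       # assumed ink marker
        markers[gray > 200] = 2      # assumed background marker
        return watershed(gradient, markers)
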
A Fast External Force Field for Parametric Active Contour Segmentation

Active contours or snakes are widely used for segmentation and tracking. We propose a new active contour model, which converges reliably even when the initialization is far from the object of interest. The proposed segmentation technique uses an external energy function where the energy slowly decreases in the vicinity of an edge. Based on this energy a new external force field is defined. Both the energy function and the force field are calculated using an efficient dual scan line algorithm. The proposed force field is tested on computational speed, its effect on the convergence speed of the active contour, and the segmentation result. The proposed method achieves segmentation results similar to the gradient vector flow and vector field convolution active contours, but the force field needs significantly less time to calculate.

Jonas De Vylder, Koen Douterloigne, Wilfried Philips
The Extraction of Venation from Leaf Images by Evolved Vein Classifiers and Ant Colony Algorithms

Leaf venation is an important source of data for research in comparative plant biology. This paper presents a method for evolving classifiers capable of extracting the venation from leaf images. Quantitative and qualitative analysis of the classifier produced is carried out. The results show that the method is capable of the extraction of near complete primary and secondary venations with relatively little noise. For comparison, a method using ant colony algorithms is also discussed.

James S. Cope, Paolo Remagnino, Sarah Barman, Paul Wilkin
Segmentation of Inter-neurons in Three Dimensional Brain Imagery

Segmentation of neural cells in three dimensional fluorescence microscopy images is a challenging image processing problem. In addition to being important to neurobiologists, accurate segmentation is a vital component of an automated image processing system. Due to the complexity of the data, particularly the extreme irregularity in neural cell shape, generic segmentation techniques do not perform well. This paper presents a novel segmentation technique for segmenting neural cells in three dimensional images. Accuracy rates of over 90% are reported on a data set of 100 images containing over 130 neural cells and subsequently validated using a novel data set of 64 neurons.

Gervase Tuxworth, Adrian Meedeniya, Michael Blumenstein
Noise-Robust Method for Image Segmentation

Segmentation of noisy images is one of the most challenging problems in image analysis, and any improvement of segmentation methods can highly influence the performance of many image processing applications. In automated image segmentation, fuzzy c-means (FCM) clustering has been widely used because of its ability to model uncertainty within the data, its applicability to multi-modal data and its fairly robust behaviour. However, the standard FCM algorithm does not consider any information about the spatial image context and is highly sensitive to noise and other imaging artefacts. Considering the above-mentioned problems, we developed a new FCM-based approach for noise-robust fuzzy clustering, which we present in this paper. In this new iterative algorithm we incorporate both spatial and feature space information into the similarity measure and the membership function. We consider that spatial information depends on the relative location and features of the neighbouring pixels. The performance of the proposed algorithm is tested on synthetic images with different noise levels and on real images. Experimental quantitative and qualitative segmentation results show that our method efficiently preserves the homogeneity of the regions and is more robust to noise than other FCM-based methods.

Ivana Despotović, Vedran Jelača, Ewout Vansteenkiste, Wilfried Philips
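
To make the idea of spatially informed fuzzy clustering concrete, here is a plain-numpy FCM on gray levels with a naive neighbourhood averaging of the memberships. The blending weight and the 3x3 averaging are assumptions for illustration only; the paper's similarity measure and membership function differ.

    import numpy as np
    from scipy.ndimage import uniform_filter

    def spatial_fcm(img, c=3, m=2.0, iters=30, alpha=0.5):
        """FCM on pixel intensities with a crude spatial term: memberships are
        blended with their 3x3 neighbourhood average (weight `alpha`).
        This only illustrates the idea, not the paper's exact formulation."""
        x = img.astype(np.float64).ravel()
        centers = np.linspace(x.min(), x.max(), c)
        for _ in range(iters):
            d = np.abs(x[:, None] - centers[None, :]) + 1e-9     # N x c distances
            u = 1.0 / (d ** (2.0 / (m - 1.0)))
            u /= u.sum(axis=1, keepdims=True)                    # memberships
            u_img = u.reshape(img.shape + (c,))
            u_s = uniform_filter(u_img, size=(3, 3, 1))          # neighbourhood mean
            u = ((1 - alpha) * u_img + alpha * u_s).reshape(-1, c)
            u /= u.sum(axis=1, keepdims=True)
            um = u ** m
            centers = (um * x[:, None]).sum(axis=0) / um.sum(axis=0)
        return u.argmax(axis=1).reshape(img.shape), centers
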
High Definition Feature Map for GVF Snake by Using Harris Function

In image segmentation the gradient vector flow snake model is widely used. For concave curvatures the snake model has good convergence capabilities, but poor contrast or saddle corner points may result in a loss of contour. We have introduced a new external force component and an optimal initial border, approaching the final boundary as closely as possible. We apply keypoints defined by corner functions and their corresponding scale to outline the envelope around the object. The Gradient Vector Flow (GVF) field is generated by the eigenvalues of the Harris matrix and/or the scale of the feature point. The GVF field is featured by new functions characterizing edginess and cornerness in one function. We have shown that the max(0, log[max(λ1, λ2)]) function fulfills the requirements for any active contour definition in case of difficult shapes and background conditions. This new GVF field has several advantages: smooth transitions are robustly taken into account, while sharp corners and contour scragginess can be perfectly detected.

Andrea Kovacs, Tamas Sziranyi
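
The feature map named in the abstract, max(0, log[max(λ1, λ2)]) with λ1, λ2 the eigenvalues of the Harris (structure tensor) matrix, can be computed directly; a minimal numpy/scipy sketch, where the smoothing scale is an assumed value.

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def harris_feature_map(gray, sigma=2.0):
        """max(0, log(max(lambda1, lambda2))) of the structure tensor at each
        pixel, as in the abstract; the smoothing scale sigma is illustrative."""
        gray = gray.astype(np.float64)
        Iy, Ix = np.gradient(gray)
        Sxx = gaussian_filter(Ix * Ix, sigma)
        Syy = gaussian_filter(Iy * Iy, sigma)
        Sxy = gaussian_filter(Ix * Iy, sigma)
        # eigenvalues of the 2x2 symmetric structure tensor, in closed form
        tr, det = Sxx + Syy, Sxx * Syy - Sxy ** 2
        disc = np.sqrt(np.maximum(tr ** 2 / 4.0 - det, 0.0))
        lam_max = tr / 2.0 + disc
        return np.maximum(0.0, np.log(np.maximum(lam_max, 1e-12)))
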
Adaptive Constructive Polynomial Fitting

To extract geometric primitives from edges, we use an incremental linear-time fitting algorithm, which is based on constructive polynomial fitting. In this work, we propose to determine the polynomial order by observing the regularity and the increase of the fitting cost. When using a fixed polynomial order, under- or even overfitting could occur. Second, due to a fixed threshold on the fitting cost, arbitrary endpoints are detected for the segments, which are unsuitable as feature points. We propose to allow a variable segment thickness by detecting discontinuities and irregularities in the fitting cost. Our method is evaluated on the MPEG-7 core experiment CE-Shape-1 database part B [1]. In the experimental results, the edges are approximated closely by polynomials of variable order. Furthermore, the polynomial segments have robust endpoints, which are suitable as feature points. When comparing adaptive constructive polynomial fitting (ACPF) to non-adaptive constructive polynomial fitting (NACPF), the average Hausdorff distance per segment decreases by 8.85% and the object recognition rate increases by 10.24%, while preserving simplicity and computational efficiency.

Francis Deboeverie, Kristof Teelen, Peter Veelaert, Wilfried Philips
Long-Range Inhibition in Reaction-Diffusion Algorithms Designed for Edge Detection and Stereo Disparity Detection

The present paper demonstrates the significance of long-range inhibition in reaction-diffusion algorithms designed for edge detection and stereo disparity detection. In early visual systems, long-range inhibition plays an important role in brightness perception. The most famous illusory perception due to long-range inhibition is the Mach bands effect, which is observed in the visual systems of animals and also in the human visual system. Long-range inhibition also appears in the computer vision algorithm utilising the difference of two Gaussian filters for edge detection. Based on evidence implying an analogy between brightness perception and stereo depth perception, several psychologists have suggested that such long-range inhibition works not only in brightness perception, but also in depth perception. We previously proposed biologically motivated reaction-diffusion algorithms designed for edge detection and stereo disparity detection. Here, we show through an experimental study that long-range inhibition also plays an important role in both of the reaction-diffusion algorithms. Results of the study provide a new idea for improving the performance of the reaction-diffusion stereo algorithm.

Atsushi Nomura, Makoto Ichikawa, Koichi Okada, Hidetoshi Miike
An Edge-Sensing Universal Demosaicing Algorithm

In this paper, we introduce an edge detection algorithm for mosaiced images which can be used to enhance generic demosaicing algorithms. The algorithm is based on pixel color differences in the horizontal, vertical and diagonal directions. By using our edge-detection technique to enhance the universal demosaicing algorithm of Lukac et al., experimental results show that the presence of color shifts and artefacts in demosaiced images is reduced. This is confirmed by both subjective and objective evaluation.

Alain Horé, Djemel Ziou
A New Perceptual Edge Detector in Color Images

In this paper we propose a new perceptual edge detector based on anisotropic linear filtering and local maximization. The novelty of this approach resides in the mixing of ideas coming both from perceptual grouping and directional recursive linear filtering. We obtain new edge operators enabling very precise detection of edge points which are involved in large structures. This detector has been tested successfully on various image types presenting difficult problems for classical edge detection methods.

Philippe Montesinos, Baptiste Magnier
Combining Geometric Edge Detectors for Feature Detection

We propose a novel framework for the analysis and modeling of discrete edge filters, based on the notion of signed rays. This framework will allow us to easily deduce the geometric and localization properties of a family of first-order filters, and use this information to design custom filter banks for specific applications. As an example, a set of angle-selective corner detectors is constructed for the detection of buildings in video sequences. This clearly illustrates the merit of the theory for solving practical recognition problems.

Michaël Heyvaert, David Van Hamme, Jonas Coppens, Peter Veelaert
Canny Edge Detection Using Bilateral Filter on Real Hexagonal Structure

Edge detection plays an important role in image processing. This paper presents a Canny edge detection method based on bilateral filtering, which achieves better performance than single Gaussian filtering. In this form of filtering, both the spatial closeness and the intensity similarity of pixels are considered in order to preserve important visual cues provided by edges and to reduce the sharpness of transitions in intensity values as well. In addition, the edge detection method proposed in this paper operates on sampled images represented on a real hexagonal structure. Due to the compact and circular nature of the hexagonal lattice, a better quality edge map is obtained on the hexagonal structure than with common edge detection on a square structure. Experimental results using the proposed methods also exhibit faster detection on the hexagonal structure.

Xiangjian He, Daming Wei, Kin-Man Lam, Jianmin Li, Lin Wang, Wenjing Jia, Qiang Wu
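
The square-lattice part of the method, bilateral smoothing in place of Canny's Gaussian pre-filter, is straightforward with OpenCV. The hexagonal resampling is the paper's contribution and is omitted here, and all parameter values below are illustrative.

    import cv2

    def bilateral_canny(gray, d=9, sigma_color=75, sigma_space=75, t1=50, t2=150):
        """Bilateral filtering before Canny: noise is reduced while edge
        transitions are preserved; thresholds and filter parameters are
        illustrative, not values from the paper."""
        smoothed = cv2.bilateralFilter(gray, d, sigma_color, sigma_space)
        return cv2.Canny(smoothed, t1, t2)
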
Automated Segmentation of Endoscopic Images Based on Local Shape-Adaptive Filtering and Color Descriptors

This paper presents a novel technique for automatic segmentation of wireless capsule endoscopic images. The main contribution resides in the integration of three computational blocks: 1) a local polynomial approximation algorithm which finds a locally-adapted neighborhood of each pixel; 2) color texture analysis which describes each pixel by a vector of numerical attributes that reflect the characteristics of the pixel's local neighborhood; and 3) cluster analysis (k-means) for grouping pixels into homogeneous regions based on their color information. The proposed approach leads to a robust segmentation procedure which produces fine segments well matched to the image contents.

Artur Klepaczko, Piotr Szczypiński

3D and Depth

Dense Stereo Matching from Separated Views of Wide-Baseline Images

In this paper, we present a dense stereo matching algorithm for multiple wide-baseline images with separated views. The algorithm utilizes a coarse-to-fine strategy to propagate the sparse feature matching to dense stereo for image pixels. First, the images are segmented into non-overlapping homogeneous partitions. Then, in the coarse step, the initial disparity map is estimated by assigning the sparse feature correspondences, where the spatial location of these features is incorporated with the over-segmentation. The initial occlusion status is obtained by a cross-checking test. Finally, the stereo maps are refined by the proposed discontinuity-preserving regularization algorithm, which directly couples the disparity and occlusion labeling. The experimental results on real data sets of challenging samples, including wide-baseline image pairs with both identical and different scales, demonstrate the good subjective performance of the proposed method.

Qian Zhang, King Ngi Ngan
Modeling Wavelet Coefficients for Wavelet Subdivision Transforms of 3D Meshes

In this paper, a Laplacian Mixture (LM) model is proposed to accurately approximate the observed histogram of the wavelet coefficients produced by lifting-based subdivision wavelet transforms. On average, the proposed mixture model gives a better histogram fit for both normal and non-normal meshes compared to the traditionally used Generalized Gaussian (GG) distributions. Exact closed-form expressions for the rate and the distortion of the LM probability density function quantized using a generic embedded deadzone scalar quantizer (EDSQ) are derived, without making high-rate assumptions. Experimental evaluations carried out on a set of 3D meshes reveal that, on average, the D-R function for the LM model closely follows and gives a better indication of the experimental D-R curve than that of the competing GG model. Optimal embedded quantization for the proposed LM model is experimentally determined. In this sense, it is concluded that the classical Successive Approximation Quantization (SAQ) is an acceptable, but in general not an optimal, embedded quantization solution in wavelet-based scalable coding of 3D meshes.

Shahid M. Satti, Leon Denis, Adrian Munteanu, Jan Cornelis, Peter Schelkens
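
For reference, the general form of a zero-mean K-component Laplacian mixture density of the kind fitted to wavelet-coefficient histograms; the paper's exact parameterisation and its closed-form rate-distortion expressions are not reproduced here.

    p(x) = \sum_{k=1}^{K} \pi_k \, \frac{1}{2 b_k} \exp\!\left(-\frac{|x|}{b_k}\right),
    \qquad \pi_k \ge 0, \quad \sum_{k=1}^{K} \pi_k = 1 .
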
3D Surface Reconstruction Using Structured Circular Light Patterns

Reconstructing a 3D surface in ℝ³ from a 2D image in ℝ² has been a widely studied issue as well as one of the most important problems in image processing. In this paper, we propose a novel approach to reconstructing 3D coordinates of a surface from a 2D image taken by a camera using projected circular light patterns. Known information (i.e. intrinsic and extrinsic parameters of the camera, the structure of the circular patterns, a fixed optical center of the camera and the location of the reference plane of the surface) provides a mathematical model for surface reconstruction. The reconstruction is based on a geometrical relationship between a given pattern projected onto a 3D surface and a pattern captured in a 2D image plane from a viewpoint. This paper chiefly deals with a mathematical proof of concept for the reconstruction problem.

Deokwoo Lee, Hamid Krim
Computing Saliency Map from Spatial Information in Point Cloud Data

Saliency detection in 2D and 3D images has been extensively used in many computer vision applications such as obstacle detection, object recognition and segmentation. In this paper we present a new saliency detection method which exploits the spatial irregularities in an environment. A Time-of-Flight (TOF) camera is used to obtain 3D points that represent the available spatial information in an environment. Two separate saliency maps are calculated by employing local surface properties (LSP) in different scales and the distance between the points and the camera. Initially, residuals representing the irregularities are obtained by fitting planar patches to the 3D points in different scales. Then, residuals within the spatial scales are combined and a saliency map in which the points with high residual values represent non-trivial regions of the surfaces is generated. Also, another saliency map is generated by using the proximity of each point in the point cloud data. Finally, two saliency maps are integrated by averaging and a master saliency map is generated.

Oytun Akman, Pieter Jonker
A Practical Approach for Calibration of Omnidirectional Stereo Cameras

This paper presents a calibration method of an omnidirectional stereo camera (ODSC) for the purpose of long distance measurement. Existing calibration methods can be used for calibration of an ODSC, but they may be applicable either to calibration of the ODSC with a small baseline or to individual calibration of its two cameras independently. In practice, it is difficult to calibrate the ODSC with a large baseline. A calibration test pattern, which is simultaneously captured by the two cameras of an ODSC system, appears very small in at least one of the cameras. Nevertheless, the baseline should be large enough for long distance measurement to ensure consistency of the estimated distance. In this paper, therefore, we propose a calibration method of the ODSC with a large baseline and verify its feasibility by presenting the experimental results of its distance estimation.

Kang-San Lee, Hyun-Soo Kang, Hamid Gholamhosseini
Surface Reconstruction of Wear in Carpets by Using a Wavelet Edge Detector

Carpet manufacturers have wear labels assigned to their products by human experts who evaluate carpet samples subjected to accelerated wear in a test device. There is considerable industrial and academic interest in going from human to automated evaluation, which should be less cumbersome and more objective. In this paper, we present image analysis research on videos of carpet surfaces scanned with a 3D laser. The purpose is to obtain good depth images for an automated system that should have a high percentage of correct assessments for a wide variety of carpets. The innovation is the use of a wavelet edge detector to obtain a more continuously defined surface shape. The evaluation is based on how well the algorithms allow a good linear ranking and a good discriminance of consecutive wear labels. The results show an improved linear ranking for most carpet types; for two carpet types the results are quite significant.

Sergio A. Orjuela Vargas, Benhur Ortiz Jaramillo, Simon De Meulemeester, Julio C. Garcia Alvarez, Filip Rooms, Aleksandra Pižurica, Wilfried Philips
Augmented Reality with Human Body Interaction Based on Monocular 3D Pose Estimation

We present an augmented reality interface with markerless human body interaction. It consists of 3D motion capture of the human body and the processing of 3D human poses for augmented reality applications. A monocular camera is used to acquire the images of the user’s motion for 3D pose estimation. In the proposed technique, a graphical 3D human model is first constructed. Its projection on a virtual image plane is then used to match the silhouettes obtained from the image sequence. By iteratively adjusting the 3D pose of the graphical 3D model with the physical and anatomic constraints of the human motion, the human pose and the associated 3D motion parameters can be uniquely identified. The obtained 3D pose information is then transferred to the reality processing subsystem and used to achieve the marker-free interaction in the augmented environment. Experimental results are presented using a head mounted display.

Huei-Yung Lin, Ting-Wen Chen
Fusing Large Volumes of Range and Image Data for Accurate Description of Realistic 3D Scenes

Hand-held time-of-flight laser scene scanners provide very large volumes of 3D range (coordinate) and optical (colour) measurements for modelling visible surfaces of real 3D scenes. To obtain an accurate model, the measurement errors resulting e.g. in gaps, uncertain edges and small details have to be detected and corrected. This paper discusses possibilities of using multiple calibrated scene images, collected simultaneously with the range data, for obtaining a more complete and accurate scene model. Experiments show that the proposed approach eliminates a number of range errors, while it may still fail on intricate disjoint surfaces that can be met in practice.

Yuk Hin Chan, Patrice Delmas, Georgy Gimel’farb, Robert Valkenburg
Design of a Real-Time Embedded Stereo Smart Camera

This paper describes the architecture of a new smart vision system called BiSeeMos. This smart camera is designed for stereo vision purposes and the implementation of a simple dense stereo vision algorithm. The architecture has been designed for dedicated parallel algorithms using a high-performance FPGA. This chip provides the user with useful features for vision processing such as integrated RAM blocks, embedded multipliers, phase-locked loops and plenty of logic elements. In this paper, a description of our architecture and a comparison with other works is given. A dense stereo vision algorithm has been implemented on the platform using the Census method.

Frantz Pelissier, François Berry
Optimal Trajectory Space Finding for Nonrigid Structure from Motion

The deformation in nonrigid structure from motion can be modeled either in the shape domain or in the time domain. Here, we view the deformation in the time domain, model the trajectory of each 3D point as a linear combination of trajectory bases, and present a novel method to automatically find the trajectory bases based on an orthographic camera assumption. In this paper, a linear relation is explicitly derived between the 2D projected trajectory and the 3D trajectory bases. With this formulation, an approximation is formulated for finding 3D trajectory bases, which casts the search for trajectory bases into a problem of finding eigenvectors. Using the approximated trajectory bases as a starting point, an EM-like algorithm is proposed which refines the trajectory bases and the corresponding coefficients. The proposed method demonstrates satisfactory results on both synthetic and real data.

Yuanqi Su, Yuehu Liu, Yang Yang
Fast Depth Saliency from Stereo for Region-Based Artificial Visual Attention

Depth is an important feature channel for natural vision organisms that helps in focusing attention on important locations of the viewed scene. Artificial visual attention systems require a fast estimation of depth to construct a saliency map based upon distance from the vision system. Recent studies on depth perception in biological vision indicate that disparity is computed using object detection in the brain. The proposed method exploits these studies and determines the shift that objects go through in the stereo frames using data regarding their borders. This enables efficient creation of depth saliency map for artificial visual attention. Results of the proposed model have shown success in selecting those locations from stereo scenes that are salient for human perception in terms of depth.

Muhammad Zaheer Aziz, Bärbel Mertsching

Algorithms and Optimisations

A Caustic Approach of Panoramic Image Analysis

In this article, the problem of blur in a panoramic image from a catadioptric camera is analyzed through the determination of the virtual image. This determination is done first with an approximate method, and second through the caustic approach. This leads us to a general caustic approach to panoramic image analysis, where the equations of virtual images are given. Finally, we give some direct applications of our analysis, such as depth of field (blur) or image resolution.

Siyuan Zhang, Emmanuel Zenou
Projection Selection Algorithms for Discrete Tomography

In this paper we study how the choice of projection angles affects the quality of the discrete tomographic reconstruction of an object. We supply four different strategies for selecting projection angle sets and compare them by conducting experiments on a set of software phantoms. We also discuss some consequences of our observations. Furthermore, we introduce a possible application of the proposed angle selection algorithms.

László Varga, Péter Balázs, Antal Nagy
Fast Mean Shift Algorithm Based on Discretisation and Interpolation

A fast mean shift algorithm for processing image data is presented. Although it is based on the known basic principles of the original mean shift method, it improves the computational speed substantially. It is assumed that the spatial image coordinates and the range coordinates can be discretised by introducing a regular grid. Firstly, the algorithm precomputes the values of the shifts at the grid points. The mean shift iterations are then carried out by making use of the grid values and trilinear interpolation. In the paper, it is shown that this can be done effectively. Measured by the order of complexity, the values at all grid points can be precomputed in the time that is required, in the original method, for computing only one mean shift iteration for all image points. The interpolation step is computationally inexpensive. Experimental results confirming the theoretical expectations are presented. The method requires the use of the step kernel for computing the shifts (corresponding to the Epanechnikov kernel for estimating the densities) and images with only a single value at each pixel.

Eduard Sojka, Jan Gaura, Tomáš Fabián, Michal Krumnikl
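
To illustrate the precompute-then-interpolate idea in its simplest form, the sketch below runs mean shift on gray values only (a 1-D, range-only simplification with a flat/step kernel): the update is tabulated once for all 256 levels and pixels are then iterated by linear interpolation in that table. The full method of the paper works in the joint spatial-range domain with trilinear interpolation; the bandwidth below is an assumed value.

    import numpy as np

    def range_mean_shift(img, bandwidth=16, iters=10):
        """Range-only (gray-value) mean shift with a flat (step) kernel.
        The update m(v) is precomputed for all 256 gray levels, then every
        pixel is iterated via linear interpolation in that table; a 1-D
        simplification of the grid-plus-interpolation idea."""
        hist, _ = np.histogram(img, bins=256, range=(0, 256))
        levels = np.arange(256, dtype=np.float64)
        table = np.empty(256)
        for v in range(256):
            lo, hi = max(0, v - bandwidth), min(256, v + bandwidth + 1)
            w = hist[lo:hi]
            table[v] = (w * levels[lo:hi]).sum() / max(w.sum(), 1)  # window mean
        x = img.astype(np.float64)
        for _ in range(iters):
            x = np.interp(x, levels, table)   # move each value toward its mode
        return x
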
Learning to Adapt: A Method for Automatic Tuning of Algorithm Parameters

Most computer vision algorithms have parameters that must be hand-selected using expert knowledge. These parameters hinder the use of large computer vision systems in real-world applications. In this work, a method is presented for automatically and continuously tuning the parameters of algorithms in a real-time modular vision system. In the training phase, a human expert teaches the system how to adapt the algorithm parameters based on training data. During operation, the system measures features from the inputs and outputs of each module and decides how to modify the parameters. Rather than learning good parameter values in absolute terms, incremental changes are modelled based on relationships between algorithm inputs and outputs. These increments are continuously applied online so that parameters stabilise to suitable values. The method is demonstrated on a three-module people-tracking system for video surveillance.

Jamie Sherrah
Pseudo-morphological Image Diffusion Using the Counter-Harmonic Paradigm

Relationships between linear and morphological scale-spaces have been considered in various previous works. The aim of this paper is to study how to generalize diffusion-based approaches in order to introduce nonlinear filters whose effects mimic morphological dilation and erosion. A methodology based on the counter-harmonic mean is adopted here. Details of the numerical implementation are discussed and results are provided to illustrate the behaviour of the various studied cases: isotropic, nonlinear and coherence-enhanced diffusion. We also rediscover the classical link between the Gaussian scale-space and dilation/erosion scale-spaces based on quadratic structuring functions.

Jesús Angulo
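
The counter-harmonic mean filter the paper builds on replaces the local mean by sum(f^(P+1)) / sum(f^P) over a window: large positive P behaves like a dilation, large negative P like an erosion, and P = 0 gives ordinary linear smoothing. A minimal sketch follows; the window size and epsilon are assumed values.

    import numpy as np
    from scipy.ndimage import uniform_filter

    def counter_harmonic(img, p, size=5, eps=1e-12):
        """Counter-harmonic mean over a size x size window:
        sum(f^(p+1)) / sum(f^p).  Large positive p mimics dilation,
        large negative p mimics erosion; p = 0 is the ordinary mean."""
        f = img.astype(np.float64) + eps      # keep the powers well defined
        num = uniform_filter(f ** (p + 1), size)
        den = uniform_filter(f ** p, size)
        return num / den
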
Non-maximum Suppression Using Fewer than Two Comparisons per Pixel

Non-Maximum Suppression (NMS) is the task of finding all local maxima in an image. This is often solved using gray-scale image dilation, which requires at least 6 comparisons per pixel in 2-D. We present two solutions that use fewer than 2 comparisons per pixel with little memory overhead. The first algorithm locates 1-D peaks along the image’s scan-line and compares each of these peaks against its 2-D neighborhood in a spiral scan order. The second algorithm selects local maximum candidates from the maxima of non-overlapping blocks of one-fourth the neighborhood size. Both algorithms run considerably faster than current best methods in the literature when applied to feature point detection. Matlab code of the proposed algorithms is provided for evaluation purposes.

Tuan Q. Pham
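
A sketch of the block-based variant described above: candidates are the maxima of non-overlapping blocks of roughly one-fourth the neighbourhood area, and only those candidates are verified against their full (2n+1) x (2n+1) neighbourhood. The bookkeeping is simplified and ties are ignored; this is not the paper's optimised implementation.

    import numpy as np

    def nms_block(img, n):
        """Local maxima of `img` over (2n+1) x (2n+1) neighbourhoods.
        Candidates come from non-overlapping (n+1) x (n+1) blocks and are
        then checked against the full neighbourhood."""
        H, W = img.shape
        peaks = []
        for by in range(0, H, n + 1):
            for bx in range(0, W, n + 1):
                block = img[by:by + n + 1, bx:bx + n + 1]
                dy, dx = np.unravel_index(np.argmax(block), block.shape)
                y, x = by + dy, bx + dx
                y0, y1 = max(0, y - n), min(H, y + n + 1)
                x0, x1 = max(0, x - n), min(W, x + n + 1)
                if img[y, x] == img[y0:y1, x0:x1].max():   # verify candidate
                    peaks.append((y, x))
        return peaks
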
Hit-or-Miss Transform in Multivariate Images

The Hit-or-Miss transform (HMT) is a well-known morphological operator for template matching in binary images. A novel approach to the HMT for multivariate images is introduced in this paper. The generic framework is a generalization of the binary case based on an h-supervised ordering formulation which leads to reduced orderings. In particular, in this paper we focus on the application of the HMT to target detection in high-resolution images. The visual results of the experiments show the performance of the proposed approach.

Santiago Velasco-Forero, Jesús Angulo
Topological SLAM Using Omnidirectional Images: Merging Feature Detectors and Graph-Matching

Image feature extraction and matching is useful in many areas of robotics such as object and scene recognition, autonomous navigation, SLAM and so on. This paper describes a new approach to the problem of matching features and its application to scene recognition and topological SLAM. For that purpose we propose a prior image segmentation into regions in order to group the extracted features into graphs, so that each graph defines a single region of the image. This image segmentation considers that the left part of the image is the continuation of the right part. The matching process takes into account both the features and the structure (graph) using the GTM algorithm. Then, using this method of comparing images, we propose an algorithm for constructing topological maps. During the experimentation phase we test the robustness of the method and its ability to construct topological maps. We have also introduced a new hysteresis behavior in order to solve some problems found in the construction of the graph.

Anna Romero, Miguel Cazorla
Constraint Optimisation for Robust Image Matching with Inhomogeneous Photometric Variations and Affine Noise

While modelling spatially uniform or low-order polynomial contrast and offset changes is mostly a solved problem, there has been limited progress on models which could represent highly inhomogeneous photometric variations. A recent quadratic programming (QP) based matching method allows for almost arbitrary photometric deviations. However, this QP-based approach is deficient in one substantial respect: it can only assume that images are aligned geometrically, as it knows nothing about geometry in general. This paper improves on the QP-based framework by extending it to include a robust rigid registration layer, thus increasing both its generality and practical utility. The proposed method shows up to 4 times improvement in the quadratic matching score over a current state-of-the-art benchmark.

Al Shorin, Georgy Gimel’farb, Patrice Delmas, Patricia Riddle
Backmatter
Metadata
Title
Advanced Concepts for Intelligent Vision Systems
Edited by
Jacques Blanc-Talon
Don Bone
Wilfried Philips
Dan Popescu
Paul Scheunders
Copyright year
2010
Publisher
Springer Berlin Heidelberg
Electronic ISBN
978-3-642-17688-3
Print ISBN
978-3-642-17687-6
DOI
https://doi.org/10.1007/978-3-642-17688-3