
About this book

It is with great pleasure that we present the proceedings of the 6th International Symposium on Visual Computing (ISVC 2010), which was held in Las Vegas, Nevada. ISVC provides a common umbrella for the four main areas of visual computing including vision, graphics, visualization, and virtual reality. The goal is to provide a forum for researchers, scientists, engineers, and practitioners throughout the world to present their latest research findings, ideas, developments, and applications in the broader area of visual computing. This year, the program consisted of 14 oral sessions, one poster session, 7 special tracks, and 6 keynote presentations. The response to the call for papers was very good; we received over 300 submissions for the main symposium from which we accepted 93 papers for oral presentation and 73 papers for poster presentation. Special track papers were solicited separately through the Organizing and Program Committees of each track. A total of 44 papers were accepted for oral presentation and 6 papers for poster presentation in the special tracks.



Poster Session

Lunar Image Classification for Terrain Detection

Terrain detection and classification are critical elements for NASA mission preparations and landing site selection. In this paper, we have investigated several image features and classifiers for lunar terrain classification. The proposed histogram of gradient orientation effectively discerns the characteristics of various terrain types. We further develop an open-source Lunar Image Labeling Toolkit to facilitate future research in planetary science. Experimental results show that the proposed system achieves 95% accuracy of classification evaluated on a dataset of 931 lunar image patches from NASA Apollo missions.

Heng-Tze Cheng, Feng-Tso Sun, Senaka Buthpitiya, Ying Zhang, Ara V. Nefian
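
As a rough illustration of the kind of feature named in the abstract above, the following minimal sketch computes a magnitude-weighted histogram of gradient orientations for a small grey-level patch. The function name, the central-difference gradient, the folding of orientations into [0, pi) and the bin count are all assumptions made for this example; the abstract does not give the authors' exact formulation.

```python
import math

def gradient_orientation_histogram(img, bins=8):
    """Magnitude-weighted histogram of unsigned gradient orientations.

    img is a list of rows of grey levels; only interior pixels vote.
    """
    hist = [0.0] * bins
    for y in range(1, len(img) - 1):
        for x in range(1, len(img[0]) - 1):
            gx = (img[y][x + 1] - img[y][x - 1]) / 2.0  # central differences
            gy = (img[y + 1][x] - img[y - 1][x]) / 2.0
            mag = math.hypot(gx, gy)
            if mag == 0.0:
                continue  # flat pixel: no orientation to vote for
            angle = math.atan2(gy, gx) % math.pi  # fold into [0, pi)
            idx = min(int(angle / math.pi * bins), bins - 1)
            hist[idx] += mag
    total = sum(hist)
    return [h / total for h in hist] if total else hist

# Two synthetic patches: intensity increasing along x, and along y.
ramp_x = [[float(x) for x in range(5)] for _ in range(5)]
ramp_y = [[float(y) for _ in range(5)] for y in range(5)]
hist_x = gradient_orientation_histogram(ramp_x)
hist_y = gradient_orientation_histogram(ramp_y)
```

The two ramps put all of their weight into different bins, which is the property a classifier would exploit to separate terrain with different dominant edge directions.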

Surface Modeling of the Corpus Callosum from MRI Scans

In this paper, the Bezier curve and surface are used to model the shape of the Corpus Callosum (CC) region from T1-weighted clinical MRI scans. We derive a closed form solution for the Bezier coefficients in 2D and 3D Euclidean spaces. The coefficients of the models are used for reconstruction of the CC contours and surfaces with varying degrees of accuracy, and constitute a basis for discrimination between populations and for ways to enhance elastic registration of the CC. The discrimination ability of the Bezier curves and surfaces is evaluated against the Fourier Descriptors (FD) and Spherical Harmonics (SH) approaches.

Ahmed Farag, Shireen Elhabian, Mostafa Abdelrahman, James Graham, Aly Farag, Dongqing Chen, Manuel F. Casanova
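
To make the reconstruction step in the abstract above concrete, here is a minimal sketch of how a set of Bezier coefficients (control points) yields points on a curve through the Bernstein basis. The function names and the example control points are hypothetical, and the closed form fitting step derived in the paper is not reproduced here.

```python
from math import comb

def bezier_point(control_points, t):
    """Evaluate a Bezier curve at t in [0, 1] using the Bernstein basis."""
    n = len(control_points) - 1
    dims = len(control_points[0])
    point = [0.0] * dims
    for i, cp in enumerate(control_points):
        # Bernstein polynomial B_{i,n}(t)
        weight = comb(n, i) * (t ** i) * ((1 - t) ** (n - i))
        for d in range(dims):
            point[d] += weight * cp[d]
    return tuple(point)

def sample_curve(control_points, samples=50):
    """Reconstruct a polyline approximation of the curve."""
    return [bezier_point(control_points, k / (samples - 1)) for k in range(samples)]

# Illustrative quadratic curve in 2D (control points are made up).
ctrl = [(0.0, 0.0), (1.0, 2.0), (2.0, 0.0)]
mid = bezier_point(ctrl, 0.5)
```

The same function works for 3D control points, since the dimension is taken from the first control point.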

Track Detection for Autonomous Trains

This paper presents a way to efficiently use lane detection techniques - known from driver assistance systems - to assist in obstacle detection for autonomous trains. On the one hand, there are several properties that can be exploited to improve conventional lane detection algorithms when used for railway applications. The heavily changing visual appearance of the tracks is compensated by very effective geometric constraints. On the other hand, there are additional challenges that are less problematic in classical lane detection applications. This work is part of a sensor system for an autonomous train application that aims at creating an environmentally friendly public transportation system.

Michael Gschwandtner, Wolfgang Pree, Andreas Uhl

Local Descriptors for Document Layout Analysis

This paper presents a technique for layout analysis of historical document images based on local descriptors. The considered layout elements are regions of regular text and elements having a decorative meaning such as headlines and initials. The proposed technique exploits the differences in the local properties of the layout elements. For this purpose, an approach drawing its inspiration from state-of-the-art object recognition methodologies, namely Scale Invariant Feature Transform (SIFT) descriptors, is proposed. The scale of the interest points is used for localization. The results show that the method is able to locate regular text in ancient manuscripts. The detection rate of decorative elements is not as high as for regular text but already yields promising results.

Angelika Garz, Markus Diem, Robert Sablatnig

CT Image Segmentation Using Structural Analysis

We propose a segmentation method for blurred and low-resolution CT images that focuses on physical properties. The basic idea of our research is simple: two objects can be easily separated in areas of structural weakness. Given CT images of an object, we assign a physical property such as Young’s modulus to each voxel and create functional images (e.g., von Mises strain at the voxel). We then remove the voxel with the largest value in the functional image, and these steps are repeated until the input model is decomposed into multiple parts. This simple and unique approach provides various advantages over conventional segmentation methods, including precision and noise robustness. This paper also demonstrates the efficiency of our approach using the results of various types of CT images, including biological representations and those of engineering objects.

Hiroyuki Hishida, Takashi Michikawa, Yutaka Ohtake, Hiromasa Suzuki, Satoshi Oota

Phase Space for Face Pose Estimation

Face pose estimation from standard imagery remains a complex computer vision problem that requires identifying the primary modes of variance directly corresponding to pose variation, while ignoring variance due to face identity and other noise factors. Conventional methods either fail to extract the salient pose defining features, or require complex embedding operations. We propose a new method for pose estimation that exploits oriented Phase Congruency (PC) features and Canonical Correlation Analysis (CCA) to define a latent pose-sensitive subspace. The oriented PC features serve to mitigate illumination and identity features present in the imagery, while highlighting alignment and pose features necessary for estimation. The proposed system is tested using the Pointing’04 face database and is shown to provide better estimation accuracy than similar methods including Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA), and conventional CCA.

Jacob Foytik, Vijayan K. Asari, R. Cortland Tompkins, Menatoallah Youssef

Contour Based Shape Retrieval

This paper presents a contour-based indexing and retrieval method for content-based image/video retrieval applications. It is based on extracting closed contours, smoothing the contour, indexing with a variation of BK-trees, and using a turning function metric for data comparison. The method is very lightweight, fast and robust, the goal being to retain close to real-time speeds so that it is applicable in practice. We provide evaluation data showing that the method performs well and fast, and is suitable for inclusion into content-based retrieval systems as a descriptor for recognition of in-frame objects and shapes.

Levente Kovács
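
The turning function mentioned in the abstract above can be sketched as follows: for a closed polygonal contour it records the accumulated heading of each edge against the normalised arc length at which that edge starts. This is only an illustration under simple assumptions (a polygon given as a vertex list, function name chosen here); the contour smoothing, the BK-tree index and the exact metric of the paper are not shown.

```python
import math

def turning_function(polygon):
    """Return [(s, theta)]: cumulative heading theta of each edge, starting
    at normalised arc length s, for a closed polygon given as (x, y) vertices."""
    n = len(polygon)
    edges = [(polygon[(i + 1) % n][0] - polygon[i][0],
              polygon[(i + 1) % n][1] - polygon[i][1]) for i in range(n)]
    lengths = [math.hypot(dx, dy) for dx, dy in edges]
    perimeter = sum(lengths)
    theta = math.atan2(edges[0][1], edges[0][0])  # heading of the first edge
    prev = theta
    steps, s = [], 0.0
    for (dx, dy), length in zip(edges, lengths):
        heading = math.atan2(dy, dx)
        # Signed turn between consecutive edges, wrapped into [-pi, pi).
        turn = (heading - prev + math.pi) % (2 * math.pi) - math.pi
        theta += turn
        steps.append((s, theta))
        prev = heading
        s += length / perimeter
    return steps

# Unit square traversed counter-clockwise: four quarter turns.
square_steps = turning_function([(0, 0), (1, 0), (1, 1), (0, 1)])
```

Two shapes can then be compared by a distance between their step functions, which is what makes the representation usable as a metric for tree-based indexing.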

Illumination Normalization for Robust Face Recognition Using Discrete Wavelet Transform

In this paper, we introduce an illumination normalization approach in the frequency domain that uses the Discrete Wavelet Transform (DWT) as a transformation function to suppress illumination variations and simultaneously amplify facial features such as the eyeballs, eyebrows, nose, and mouth. The basic ideas are: 1) transform a face image from the spatial domain into the frequency domain and obtain two major components separately, the approximation coefficients (low frequency) and the detail coefficients (high frequency); 2) remove total variation in the image by adopting the Total Variation Quotient Image (TVQI) or Logarithmic Total Variation (LTV); 3) amplify facial features, which are the key to face classification, by adopting Gaussian derivatives and morphological operators, respectively. The efficiency of the proposed approach is evaluated on a public face database, Yale Face Database B, and its extended version, Extended Yale Face Database B. Our experimental results demonstrate that the proposed approach achieves a high recognition rate even though only a single image per person was used as the training set.

Amnart Petpon, Sanun Srisuk
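
The first step listed in the abstract above, splitting a signal into low-frequency approximation and high-frequency detail coefficients, can be illustrated with one level of a Haar DWT on a 1D sequence. The Haar wavelet and the function names are assumptions for this sketch; the abstract does not say which wavelet the authors use, and a face image would need the 2D transform (rows, then columns).

```python
import math

def haar_dwt(signal):
    """One level of the orthonormal Haar DWT of an even-length sequence.

    Returns (approximation, detail) coefficient lists.
    """
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail

def haar_idwt(approx, detail):
    """Invert haar_dwt: rebuild the original samples from both bands."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out

row = [4.0, 4.0, 2.0, 2.0]
approx, detail = haar_dwt(row)
restored = haar_idwt(approx, detail)
```

Because the two bands are kept separately, each can be processed on its own (for example, normalising the approximation band and emphasising the detail band) before the inverse transform is applied.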

Feature-Based Lung Nodule Classification

Model-based detection and classification of nodules are two major steps in CAD systems design and evaluation. This paper examines feature-based nodule description for the purpose of classification in low dose CT scanning. After candidate nodules are detected, a process of classification of these nodules into types is needed. The SURF and the LBP descriptors are used to generate the features that describe the texture of common lung nodules. These features were optimized and the resultant set was used for classification of lung nodules into four categories: juxta-pleural, well-circumscribed, vascularized and pleural-tail, based on the extracted information. Experimental results illustrate the efficiency of using multi-resolution feature descriptors, such as the SURF and LBP algorithms, in lung nodule classification.

Amal Farag, Asem Ali, James Graham, Shireen Elhabian, Aly Farag, Robert Falk

Multiple-object Tracking in Cluttered and Crowded Public Spaces

This paper addresses the problem of tracking moving objects of variable appearance in challenging scenes rich with features and texture. Reliable tracking is of pivotal importance in surveillance applications. It is made particularly difficult by the nature of objects encountered in such scenes: these too change in appearance and scale, and are often articulated (e.g. humans). We propose a method which uses fast motion detection and segmentation as a constraint for both building appearance models and their robust propagation (matching) in time. The appearance model is based on sets of local appearances automatically clustered using spatio-kinetic similarity, and is updated with each new appearance seen. This integration of all seen appearances of a tracked object makes it extremely resilient to errors caused by occlusion and the lack of permanence due to low data quality, appearance change or background clutter. These theoretical strengths of our algorithm are empirically demonstrated on two hours of video footage of a busy city marketplace.

Rhys Martin, Ognjen Arandjelović

Compliant Interframe Coding for Motion-JPEG2000

This publication proposes a strategy to enhance the compression performance of Motion-JPEG2000 by reducing interframe redundancies. This is achieved by a new frame structure that allows redundancies to be removed and the original stream to be reconstructed without loss. The introduced advancement is able to increase the compression efficiency and keeps other beneficial properties of the traditional codec, such as high-quality frame-wise access and scalability. All of the required operations are accomplished in the JPEG2000 domain by examining and handling still-encoded content. Thus, the approach requires little computing power. Our results demonstrate that the strategy is at least as efficient as the original codec, even for less suited video content.

René Rosenbaum, Heidrun Schumann

EVP-Based Multiple-View Triangulation

This paper addresses multiple-view triangulation by proposing a new method based on eigenvalue problems (EVPs), which belong to the class of convex programming. The proposed method provides a candidate for the sought 3D point and a straightforward condition for establishing its optimality, which also yields a guaranteed range for the optimal cost of the triangulation problem in case of non-optimality. The proposed method is illustrated through some well-known examples with real data, for which the provided candidate 3D point is always optimal. These examples also show that the computational time of the proposed method is indeed small and competitive with existing approaches.

G. Chesi, Y. S. Hung

An Improved Shape Matching Algorithm for Deformable Objects Using a Global Image Feature

We propose an improved shape matching algorithm that extends the work of Felzenszwalb [3]. In this approach, we use triangular meshes to represent deformable objects and use dynamic programming to find the optimal mapping from the source image to the target image which minimizes a new energy function. Our energy function includes a new cost term that takes into account the center of mass of an image. This term is invariant to translation, rotation, and uniform scaling. We also improve the dynamic programming method proposed in [3] using the center of mass of an image. Experimental results on the Brown dataset show a 7.8 % higher recognition rate when compared with Felzenszwalb’s algorithm.

Jibum Kim, Suzanne M. Shontz

Multi-scale Topo-morphometric Opening of Arteries and Veins: An Evaluative Study via Pulmonary CT Imaging

Distinguishing pulmonary arterial and venous (A/V) trees via in vivo imaging is essential for quantification of vascular geometry useful to diagnose several pulmonary diseases. A multi-scale topo-morphologic opening algorithm has recently been introduced for separating A/V trees via non-contrast CT imaging. The method starts with two sets of seeds, one for each of the A/V trees, and combines fuzzy distance transform, fuzzy connectivity, and morphologic reconstruction, leading to locally-adaptive multi-scale opening of two mutually fused structures. Here, we present results of a comprehensive validation study assessing both reproducibility and accuracy of the method. Accuracy of the method is examined using both mathematical phantoms and CT images of contrast-separated pulmonary A/V casting of a pig’s lung. Reproducibility of the method is evaluated using multi-user A/V separations of patients’ CT pulmonary data and contrast-enhanced CT data of a pig’s lung at different volumes. The qualitative and quantitative results are very promising.

Zhiyun Gao, Colin Holtze, Randall Grout, Milan Sonka, Eric Hoffman, Punam K. Saha

Video Event Detection as Matching of Spatiotemporal Projection

Detection of events and actions in video entails substantial processing of very large, even open-ended, video streams. Video data presents a unique challenge for the information retrieval community because it is hard to find a way to properly represent video events. We propose a novel approach to analyze temporal aspects of video data. We consider the video data as a sequence of images that form a 3-dimensional spatiotemporal structure, and multiview orthographic projection is performed to transform the video data into 2-dimensional representations. The projected views allow a unique way to represent video events, and we apply template matching using color moments to detect video events.

Dong-Jun Park, David Eichmann

PixelLaser: Computing Range from Monocular Texture

The impressive advances in robotic spatial reasoning over the past decade have relied primarily on rich sensory data provided by laser range finders. Relative to cameras, however, lasers are heavy, bulky, power-hungry, and expensive. This work proposes and evaluates an image-segmentation pipeline that produces range scans from ordinary webcameras. Starting with a nearest-neighbor classification of image patches, we investigate the tradeoffs in accuracy, resolution, calibration, and speed that come from estimating range-to-obstacles using only single images. Experiments atop the low-cost iRobot Create platform demonstrate the accessibility and power of this pixel-based alternative to laser scans.

N. Lesperance, M. Leece, S. Matsumoto, M. Korbel, K. Lei, Z. Dodds

A Spatio-Spectral Algorithm for Robust and Scalable Object Tracking in Videos

In this work we propose a mechanism which looks at processing the low-level visual information present in video frames and prepares mid-level tracking trajectories of objects of interest within the video. The main component of the proposed framework takes detected objects as inputs and generates their appearance models, maintains them and tracks these individuals within the video. The proposed object tracking algorithm is also capable of detecting the possibility of collision between the object trajectories and resolving it without losing their models.

Alireza Tavakkoli, Mircea Nicolescu, George Bebis

Driving Fatigue Detection Using Active Shape Models

Driver fatigue is a major cause of traffic accidents. Fatigue detection systems based on computer vision have great potential because they are non-invasive. The major challenges that arise are fast movements of the eyes and mouth, changes in pose, and lighting variations. In this paper, an Active Shape Model is presented for detecting facial features extracted from the parametric model Candide-3. We describe the characterization methodology based on the parametric model. We also quantitatively evaluate the accuracy of feature detection and of the estimation of the parameters associated with fatigue, analyzing its robustness to variations in pose and to local variations in the regions of interest. The model and the characterization methodology proved effective, detecting fatigue in 100% of the cases.

Hernán García, Augusto Salazar, Damián Alvarez, Álvaro Orozco

Outlier Removal in Stereo Reconstruction of Orbital Images

NASA has recently been building 3-dimensional models of the moon based on photos taken from orbiting satellites and the Apollo missions. One issue with the stereo reconstruction is the handling of “outliers”, or areas with rapid and unexpected change in the data. Outliers may be introduced by issues such as shadows on the surface, areas with low amounts of surface detail, or flaws in the camera systems. These errors may result in elevation spikes which cause the model to differ significantly from accurate ground truth. We are seeking to remove outliers from reconstructions by using a pair of filters which target the characteristics of these outliers. The first filter will use edge detection to filter areas with low detail and the second filter will remove areas in the disparity map which differ too much from their surrounding neighbors.

Marvin Smith, Ara Nefian
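
The second filter described in the abstract above, removing disparity values that differ too much from their neighbours, could look roughly like the sketch below. The 3x3 neighbourhood, the use of the median, the threshold parameter and the convention of marking rejected pixels with None are all assumptions for illustration; the abstract does not specify them.

```python
from statistics import median

def remove_disparity_outliers(disparity, max_diff):
    """Mark pixels as invalid (None) when they deviate from the median of
    their valid 8-neighbours by more than max_diff."""
    rows, cols = len(disparity), len(disparity[0])
    cleaned = [row[:] for row in disparity]
    for y in range(rows):
        for x in range(cols):
            if disparity[y][x] is None:
                continue  # already invalid
            neighbours = [disparity[j][i]
                          for j in range(max(0, y - 1), min(rows, y + 2))
                          for i in range(max(0, x - 1), min(cols, x + 2))
                          if (j, i) != (y, x) and disparity[j][i] is not None]
            if neighbours and abs(disparity[y][x] - median(neighbours)) > max_diff:
                cleaned[y][x] = None
    return cleaned

# A flat disparity patch with a single spike in the middle.
patch = [[10.0, 10.0, 10.0],
         [10.0, 50.0, 10.0],
         [10.0, 10.0, 10.0]]
filtered = remove_disparity_outliers(patch, max_diff=5.0)
```

Comparing against the median rather than the mean keeps a single spike from dragging its neighbours over the threshold as well.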

Random Sampling Nonlinear Optimization for Camera Self-calibration with Modeling of Intrinsic Parameter Space

This paper presents a framework for random sampling nonlinear optimization for camera self-calibration with modeling of the camera intrinsic parameter space. The focal length is modeled using a Gaussian distribution derived from the results of the Kruppa equations, while the optical center is modeled based on the assumption that the optical center is close to the image center but deviates from it due to some manufacturing imprecision. This model enables us to narrow the search range of parameter space and therefore reduce the computation cost. In addition, a random sampling strategy is utilized in order to avoid local optima, where the samples are drawn according to this model. Experimental results are presented to show the effectiveness of the proposed nonlinear optimization algorithm, even in the under-constrained case involving only two frames.

Houman Rastgar, Eric Dubois, Liang Zhang

Facial Fraud Discrimination Using Detection and Classification

This paper proposes facial fraud discrimination using facial feature detection and classification based on AdaBoost and a neural network. The proposed method detects the face, the two eyes, and the mouth with the AdaBoost detector. To classify detection results as either normal or abnormal eyes and mouths, we use a neural network. Using these results, we calculate the fraction of face images that contain normal eyes and mouths. These fractions are used for facial fraud detection by setting a threshold based on the cumulative distribution function of the Binomial distribution. The FRR and FAR of eye discrimination of our algorithm are 0.0486 and 0.0152, respectively. The FRR and FAR of mouth discrimination of our algorithm are 0.0702 and 0.0299, respectively.

Inho Choi, Daijin Kim
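
One way to read the thresholding step in the abstract above is sketched below: if each of n face images independently shows a normal eye (or mouth) with probability p for a genuine face, the Binomial distribution gives the chance of seeing fewer than k normal detections, and k can be chosen so that this chance stays below a tolerated false rejection level. The decision rule, the function names and the parameter alpha are assumptions made for this example, not the authors' stated procedure.

```python
from math import comb

def binomial_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    if k < 0:
        return 0.0
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i)
               for i in range(min(k, n) + 1))

def min_normal_count(n, p_normal, alpha):
    """Largest k such that P(X < k) <= alpha, i.e. a genuine sequence of n
    images falls below the threshold with probability at most alpha."""
    k = 0
    while k < n and binomial_cdf(k, n, p_normal) <= alpha:
        k += 1
    return k

# Toy numbers: 2 images, each normal with probability 0.5, tolerance 0.3.
threshold = min_normal_count(2, 0.5, 0.3)
```

A sequence would then be flagged as suspicious when its count of normal detections is below the returned threshold.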

Segmentation of Abdominal Organs Incorporating Prior Knowledge in Small Animal CT

For quantification of drug delivery in small animals, measuring biochemical changes in abdominal organs based on functional images is essential. However, in those images, the object boundaries are not clear enough to determine the shape and position of the organs. Even when the structural information is compensated using an image registration technique, delineation of organs is difficult and time-consuming. We therefore propose an automatic procedure for delineation of organs in mouse PET images with the aid of an atlas as a priori anatomical information. Prior information is given by the voxel label number. The CT used to construct the atlas is transformed to match the mouse CT to be segmented. For each label, the corresponding voxels represent the same organ. Then, the mouse CT-PET pairs are aligned to identify the organ area in PET. After all images are aligned and fused with each other, both structural and functional information can be observed simultaneously for several organs.

SooMin Song, Myoung-Hee Kim

Method of Interest Points Characterization Based C-HOG Local Descriptor

This article proposes an approach to the detection and description of interest points based on C-HOG. The study of two interest point local descriptor methods, SIFT and SURF, allows us to understand their construction and extract their various advantages (invariances, speeds, repeatability). Our goal is to couple these advantages to create a new system (detector and descriptor). The latter must be as invariant as possible to image transformations (rotations, scales, viewpoints). We have to find a compromise between a good matching rate and the number of points matched. All the detector and descriptor parameters (orientations, thresholds, analysis pattern, parameters) are also detailed in this article.

Manuel Grand-brochier, Christophe Tilmant, Michel Dhome

Stereo-Based Object Segmentation Combining Spatio-Temporal Information

In the case of cluttered backgrounds or low quality video input, automatic video object segmentation based on spatial-temporal information is still a problem without a general solution. A new approach is introduced in this work to deal with this problem by using depth information. The proposed approach obtains the initial object masks based on a depth density image and motion segmentation. The object boundaries are obtained by updating the object masks using a simultaneous combination of multiple cues, including spatial location, colour, depth and motion, within a maximum likelihood method. The experimental results show that this method is effective and produces good output in cluttered backgrounds.

Yingdong Ma, Qian Chen

Fast Motion Estimation Based on Search Range Adjustment Using Neighboring MVDs

In this paper, we propose a new adaptive search range motion estimation method for H.264/AVC where search ranges are adjusted by the probabilities of motion vector differences (MVDs). The MVDs are modeled as a discrete Laplace distribution, and its parameter is estimated by the maximum likelihood estimator. The MVDs of neighboring blocks are employed as the samples for the estimation. With the estimated distribution, the search ranges which include the correct MVDs with a prefixed probability are analytically determined. Since the proposed method handles the search ranges instead of search point sampling patterns, it provides a very flexible and hardware-friendly approach to motion estimation. Experimental results show that it is very similar to the optimal method (full search algorithm) in PSNR but gives a significant reduction in computational complexity.

Hyun-Soo Kang, Jae-Hyeung Park
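
The estimation and range selection described in the abstract above can be sketched for one MVD component as follows. The sketch assumes the parametrisation P(X = x) = (1 - p) / (1 + p) * p^|x| of the discrete Laplace distribution, for which the maximum likelihood estimate is p = (sqrt(1 + m^2) - 1) / m with m the mean absolute MVD, and P(|X| <= k) = 1 - 2 p^(k+1) / (1 + p). The parametrisation, the function names and the cap max_range are assumptions of this example; the paper may define these differently.

```python
import math

def fit_discrete_laplace(mvds):
    """Maximum likelihood estimate of p from neighbouring MVD samples."""
    m = sum(abs(v) for v in mvds) / len(mvds)
    if m == 0:
        return 0.0  # all samples zero: distribution collapses onto 0
    return (math.sqrt(1.0 + m * m) - 1.0) / m

def coverage(k, p):
    """P(|X| <= k) under the assumed discrete Laplace model."""
    return 1.0 - 2.0 * p ** (k + 1) / (1.0 + p)

def search_range(mvds, prob, max_range=32):
    """Smallest range k (in samples) that contains the MVD with probability
    at least prob, capped at max_range."""
    p = fit_discrete_laplace(mvds)
    k = 0
    while k < max_range and coverage(k, p) < prob:
        k += 1
    return k

# Hypothetical horizontal MVDs taken from neighbouring blocks.
neighbour_mvds = [0, 1, -1, 2, 0, -2]
range_95 = search_range(neighbour_mvds, 0.95)
```

Blocks whose neighbours have small MVDs get a narrow range, while blocks in regions of irregular motion keep a wide one, which is where the reduction in computation would come from.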

Towards Computational Understanding of Skill Levels in Simulation-Based Surgical Training via Automatic Video Analysis

Analysis of motion expertise is an important problem in many domains including sports and surgery. In recent years, surgical simulation has emerged at the forefront of new technologies for improving the education and training of surgical residents. In simulation-based surgical training, a key task is to rate the performance of the operator, which is currently done by senior surgeons. This is deemed a costly practice, and researchers have been working towards building automated systems to achieve computational understanding of surgical skills, largely through analysis of motion data captured by video or data of other modalities. This paper presents our study on understanding a fundamental issue in building such automated systems: how visual features computed from videos capturing surgical actions may be related to the motion expertise of the operator. Utilizing domain-specific knowledge, we propose algorithms for detecting visual features that support understanding the skill of the operator. A set of video streams captured from resident surgeons in two local hospitals was employed in our analysis. The experiments revealed useful observations on potential correlations between computable visual features and the motion expertise of the subjects, hence leading to insights into how to build an automatic system for solving the problem of expertise evaluation.

Qiang Zhang, Baoxin Li

Biomedical Image Retrieval in a Fuzzy Feature Space with Affine Region Detection and Vector Quantization of a Scale-Invariant Descriptor

This paper presents an approach to biomedical image retrieval by detecting affine covariant regions and representing them with an invariant fuzzy feature space. These regions refer to a set of pixels or interest points which change covariantly with a class of transformations, such as affinity. A vector descriptor based on the Scale-Invariant Feature Transform (SIFT) is computed from the intensity pattern within the region. These features are then vector quantized to build a codebook of keypoints. By mapping the interest points extracted from one image to the keypoints in the codebook, their occurrences are counted and the resulting histogram is called the “bag of keypoints” for that image. Images are finally represented in fuzzy feature space by spreading each region’s membership values through a global fuzzy membership function to all the keypoints in the codebook. The proposed feature extraction and representation scheme is not only invariant to affine transformations but also robust against quantization errors. A systematic evaluation of retrieval results on a heterogeneous medical image collection has shown around 15-20% improvement in precision at different recall levels for the proposed fuzzy feature-based representation when compared to individual color, texture, edge, and keypoint-based features.

Md Mahmudur Rahman, Sameer K. Antani, George R. Thoma
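
The counting step behind the “bag of keypoints” in the abstract above can be illustrated with a hard nearest-codeword assignment. The tiny 2D descriptors and codebook are made up for the example (real SIFT descriptors have 128 dimensions), the function names are hypothetical, and the fuzzy membership spreading that the paper adds on top is not shown.

```python
def nearest_codeword(descriptor, codebook):
    """Index of the codebook entry closest to descriptor (squared Euclidean)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda k: sq_dist(descriptor, codebook[k]))

def bag_of_keypoints(descriptors, codebook):
    """Histogram of how often each codeword is the nearest one."""
    hist = [0] * len(codebook)
    for d in descriptors:
        hist[nearest_codeword(d, codebook)] += 1
    return hist

# Toy codebook with two keypoints and three region descriptors.
codebook = [(0.0, 0.0), (10.0, 10.0)]
descriptors = [(1.0, 0.0), (9.0, 9.0), (0.0, 2.0)]
histogram = bag_of_keypoints(descriptors, codebook)
```

With hard assignment a descriptor lying almost midway between two codewords contributes to only one bin, which is the quantization sensitivity that a soft, fuzzy assignment is meant to reduce.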

Model Distribution Dependant Complexity Estimation on Textures

In this work, a method for estimating the complexity of a textured image is presented. The method allows changes in stationarity to be detected by means of the complexity with respect to a given model set (distribution dependent). The detection is done in a way that also allows textured images to be classified according to the complexity of the whole texture. When different models are used to model data, the more complex model is expected to fit it better because of its higher degree of freedom. Thus, a naturally arising penalization of the model complexity is used in a Bayesian context. Here, a nested model scheme is used to improve the robustness and efficiency of the implementation. Although MRF models are used for the sake of clarity, the procedure is not tied to a particular distribution.

Agustin Mailing, Tomás Crivelli, Bruno Cernuschi-Frías

Integrating Multiple Uncalibrated Views for Human 3D Pose Estimation

We address the problem of how human pose in 3D can be estimated from video data. The use of multiple views has the potential of tackling self-occlusion of the human subject in any particular view, as well as of estimating the human pose more precisely. We propose a scheme that allows multiple views to be put together naturally for determining human pose, so that hypotheses of the body parts in each view can be pruned away efficiently through a consistency check over all the views. The scheme relates the different views through a linear combination-like expression of all the image data, which captures the rigidity of the human subject in 3D. The scheme requires thorough calibration of neither the cameras themselves nor the camera inter-geometry. A formulation is also introduced that expresses the multi-view scheme, as well as other constraints, in the pose estimation problem. A belief propagation approach is used to reach a final human pose under the formulation. Experimental results on in-house captured image data as well as publicly available benchmark datasets are shown to illustrate the performance of the system.

Zibin Wang, Ronald Chung

A Novel Histogram-Based Feature Representation and Its Application in Sport Players Classification

Automatic sport team discrimination, that is, the correct assignment of each player to the respective team, is a fundamental step in high-level analysis of sport video sequences. In this work we propose a novel set of features based on a variation of classic color histograms called Positional Histograms: these features try to overcome the main drawbacks of classic histograms, first of all the lack of any relation between the spectral and spatial contents of the image. The basic idea is to extract histograms as a function of the position of points in the image, with the goal of maintaining a relationship between the color distribution and the position: this is necessary because the actors in a play field often dress in a similar way, with just a different distribution of the same colors across the silhouettes. Further, different unsupervised classifiers and different feature sets are jointly evaluated with the goal of investigating the feasibility of unsupervised techniques in sport video analysis.

Paolo Spagnolo, Pier Luigi Mazzeo, Marco Leo, Tiziana D’Orazio
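
The idea of tying the histogram to position, as described in the abstract above, can be sketched by computing one histogram per spatial cell of the image. The regular grid, the single grey-level channel, the bin count and the function name are assumptions for this illustration; the paper works on colour and may partition the silhouette differently.

```python
def positional_histograms(img, grid=(2, 2), bins=4, max_value=256):
    """Return hists[cell_row][cell_col][bin]: one histogram per grid cell.

    img is a list of rows of integer levels in [0, max_value).
    """
    rows, cols = len(img), len(img[0])
    grid_rows, grid_cols = grid
    hists = [[[0] * bins for _ in range(grid_cols)] for _ in range(grid_rows)]
    for y in range(rows):
        for x in range(cols):
            cell_y = min(y * grid_rows // rows, grid_rows - 1)
            cell_x = min(x * grid_cols // cols, grid_cols - 1)
            b = min(img[y][x] * bins // max_value, bins - 1)
            hists[cell_y][cell_x][b] += 1
    return hists

# Same levels overall, opposite vertical arrangement (e.g. shirt vs shorts).
kit_a = [[0, 0], [255, 255]]
kit_b = [[255, 255], [0, 0]]
hists_a = positional_histograms(kit_a, grid=(2, 1), bins=2)
hists_b = positional_histograms(kit_b, grid=(2, 1), bins=2)
```

A single global histogram would be identical for both toy images, whereas the per-cell histograms differ, which is the situation the abstract describes for teams wearing the same colours in a different arrangement.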

Facial Expression Recognition Using Facial Features and Manifold Learning

This paper explores robust facial expression recognition techniques based on the underlying low dimensional manifolds embedded in facial images of varying expression. Faces are automatically detected and facial features are extracted, normalized and mapped onto a low dimensional projection surface using Locality Preserving Projections. Alternatively, processed image pixels are used for manifold construction. Classification models robustly estimate expression from the low dimensional projections in manifold space. This method performs robustly in natural settings, enabling more engaging human computer interfaces.

Raymond Ptucha, Andreas Savakis

Blurring Mean-Shift with a Restricted Data-Set Modification for Applications in Image Processing

A new mean-shift technique, blurring mean-shift with a restricted dataset modification, is presented. It is mainly intended for applications in image processing since, in this case, the coordinates of the points entering the mean-shift procedure can naturally be split into two parts that are treated in different ways: the spatial part (geometrical position in the image) and the range part (colour/brightness). The basic principle is similar to that of the blurring mean-shift algorithm. In contrast to it, the changes of the dataset are restricted only to the range values (colour/brightness); the spatial parts do not change. The points that are processed during computation may be viewed as points of a certain image that evolves during the iterations. We show that the process converges. As a result, an image is obtained with areas of constant colour/brightness, which can be exploited for image filtering and segmentation. The geodesic as well as the Euclidean distance can be used. The results of testing are presented, showing that the algorithm is useful.

Eduard Sojka, Jan Gaura, Štepán Šrubař, Tomáš Fabián, Michal Krumnikl
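The restriction described above, where only range values move while spatial positions stay fixed, can be sketched on a 1D signal. This is a minimal illustration assuming a Gaussian kernel with Euclidean distance and hypothetical bandwidth parameters; it is not the authors' implementation.

```python
import math


def restricted_blurring_mean_shift(values, spatial_bw=2.0, range_bw=10.0, iterations=20):
    """Blurring mean-shift on a 1D signal that updates range values only.

    The index of each sample is its (fixed) spatial coordinate. The kernel and
    both bandwidths are assumptions chosen for illustration.
    """
    current = [float(v) for v in values]
    n = len(current)
    for _ in range(iterations):
        updated = []
        for i in range(n):
            weight_sum = 0.0
            weighted_values = 0.0
            for j in range(n):
                ds = (i - j) / spatial_bw                  # spatial part: never changes
                dr = (current[i] - current[j]) / range_bw  # range part: evolves
                w = math.exp(-0.5 * (ds * ds + dr * dr))
                weight_sum += w
                weighted_values += w * current[j]
            updated.append(weighted_values / weight_sum)
        # Blurring variant: the whole dataset is replaced after each sweep.
        current = updated
    return current


signal = [10, 12, 11, 9, 100, 102, 101, 99]
filtered = restricted_blurring_mean_shift(signal)
```

On this step-like signal the two halves each flatten toward a nearly constant value while the jump between them is kept, matching the "areas of constant colour/brightness" mentioned in the abstract.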

Detecting Straight Line Segments Using a Triangular Neighborhood

A novel straight line segment detection method is proposed in this paper, based on the theory of mapping straight line segment neighborhoods between the image and the HT spaces and on the geometrical analysis of the HT butterfly wings. This paper makes full use of the information in the butterfly wings to detect the segments, i.e. it detects segments by matching their butterfly wings. Because the butterfly changes its shape and orientation according to the segment parameters, this paper derives an approximation of the butterfly wings with triangles by moving and/or flipping the segments to the position that minimizes the approximation error. This movement reduces the computation and precision loss introduced by the butterfly distortions, because straight-sided triangular regions can be used to obtain the parameters of segments. Compared to existing methods that detect segments using HT data, the proposed method utilizes more information around the butterfly center, and hence is more effective, especially when it is used to detect collinear segments. The experiments verify the performance of the proposed method.

Shengzhi Du, Chunling Tu, Barend Jacobus van Wyk

Size Distribution Estimation of Stone Fragments via Digital Image Processing

Precise statistics play a key role in the management of systems and processes. For instance, knowing the size distribution of stone fragments in a mining factory makes it possible to choose a suitable sieve diameter or to design a better crusher, and hence to optimize the production line. This paper describes and compares three image-based techniques that statistically estimate stone size distribution. The techniques are watershed, granulometry, and area boundary. Results show that in many stone mining factories, because the stones share an identical texture, granulometry is a good replacement for methods based on edge detection. An important point about granulometry is that its results are qualitative: it cannot determine the exact number of stone fragments, but it distinguishes the size distribution of objects very well in real images, including objects with different textures, disparity, and overlap.

Mohammad Salehizadeh, Mohammad T. Sadeghi

Image Enhancement by Median Filters in Algebraic Reconstruction Methods: An Experimental Study

Algebraic methods for image reconstruction provide good solutions even if only a few projections are available. However, they can create noisy images if the number of iterations or the computational time is limited. In this paper, we show how to decrease the effect of noise by using median filters during the iterations. We present an extensive study in which filters of different sizes are applied at various stages of the reconstruction process. Our test images are also of different structural complexity. Our study concentrates on the ART reconstruction method and its discrete variant, DART.

Norbert Hantos, Péter Balázs
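As an illustration of the filtering step only, the following sketch applies a square median filter to an intermediate image stored as a list of rows. The ART/DART update itself is not shown, and the function name, window handling at the borders, and default radius are assumptions made for this example.

```python
from statistics import median


def median_filter(image, radius=1):
    """Apply a (2*radius+1)-square median filter; windows are clipped at borders."""
    height, width = len(image), len(image[0])
    filtered = []
    for y in range(height):
        row = []
        for x in range(width):
            window = [
                image[yy][xx]
                for yy in range(max(0, y - radius), min(height, y + radius + 1))
                for xx in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            row.append(median(window))
        filtered.append(row)
    return filtered


# A flat intermediate reconstruction with one noisy pixel.
noisy = [[0, 0, 0], [0, 9, 0], [0, 0, 0]]
smoothed = median_filter(noisy)
```

In an iterative scheme such a function would be called on the current estimate between reconstruction sweeps; how often and with which window size is exactly what the study varies.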

3D Curvature-Based Shape Descriptors for Face Segmentation: An Anatomical-Based Analysis

The behavior of six curvature-based 3D shape descriptors, computed on the surface of 3D face models, is studied. The set of descriptors includes the two principal curvatures, the Mean and Gaussian curvatures, Shape Index, and Curvedness. Instead of defining clusters of vertices based on the value of a given primitive surface feature, a face template composed of 28 anatomical regions is used to segment the models and to extract the location of different landmarks and fiducial points. Vertices are grouped by: vertices themselves, region, and region boundaries. The aim of this study is to analyze the discriminant capacity of each descriptor to characterize regions and to identify key points on the facial surface. The experiment includes testing with data from synthetic face models and 3D face range images. In the results, the values, distributions, and relevance indexes of each set of vertices were analyzed.

Augusto Salazar, Alexander Cerón, Flavio Prieto

Computational Hemodynamics in Intracranial Vessels Reconstructed from Biplane Angiograms

Recent works in neurology have explored ways to obtain a better understanding of blood flow circulation in the brain, with the ultimate goal of improving the treatment of cerebrovascular diseases such as strokes, stenosis, and aneurysms. In this paper, we propose a framework to reconstruct three-dimensional (3D) models of intracerebral vessels from biplane angiograms. The reconstructed vessel geometries are then used to perform computational fluid dynamics (CFD) simulations. A key component of our framework is to perform such a reconstruction by incorporating user interaction to identify the centerline of the vessels in each view. Then the vessel profile is estimated automatically at each point along the centerlines, and an optimization procedure refines the 3D model using epipolar constraints and back-projection in the original angiograms. Finally, the 3D model of the vessels is used as the domain in which the wall shear stress (WSS) and velocity vectors are estimated from a blood flow model that follows the Navier-Stokes equations for an incompressible Newtonian fluid. Visualization of hemodynamic parameters is illustrated on two stroke patients.

Fabien Scalzo, Qing Hao, Alan M. Walczak, Xiao Hu, Yiemeng Hoi, Kenneth R. Hoffmann, David S. Liebeskind

Object Distance Estimation Based on Stereo Vision and Color Segmentation with Region Matching

The human visual system relies on stereo vision to determine object distance in the 3D world. It achieves this by first locating the objects, then matching the corresponding objects seen by the left and right eyes, and finally using triangulation to estimate the object distance. Inspired by the same concept, this paper presents a depth estimation method based on stereo vision and color segmentation with region matching in the CIE Lab color space. First, an automatic seeded region growing approach is proposed for color segmentation in a perceptually uniform color space. Then a color region matching method is applied to the segmented regions. Finally, a 3D reprojection method is employed to calculate depth distances. Experimental results are included to validate the proposed concept for object distance estimation.

Guangming Xiong, Xin Li, Junqiang Xi, Spencer G. Fowers, Huiyan Chen
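The triangulation step mentioned in the abstract can be illustrated with the standard relation for a rectified stereo pair, Z = f * B / d. The sketch below assumes a focal length in pixels, a baseline in meters, and a disparity in pixels; the function name and the sample numbers are hypothetical, and the paper's own 3D reprojection procedure may differ.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth of a matched region for a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_px * baseline_m / disparity_px


# Hypothetical rig: 700 px focal length, 10 cm baseline.
# A region whose centroids differ by 35 px between the left and right views:
distance_m = depth_from_disparity(focal_px=700.0, baseline_m=0.1, disparity_px=35.0)
```

Because depth is inversely proportional to disparity, distant objects with small disparities are the most sensitive to matching errors, which is one reason region-level matching can be attractive.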

Multiscale Information Fusion by Graph Cut through Convex Optimization

This paper proposes a novel method for global continuous optimization of the maximum a posteriori (MAP) estimate during a wavelet-domain hidden Markov tree-based (WHMT) multiscale information fusion process. We start by calculating the multiscale classification likelihoods of the wavelet coefficients with the expectation-maximization (EM) algorithm. An energy function is then generated by combining a boundary term estimated from the classification likelihoods with a regional term obtained from both pixel information and approximation coefficients. By minimizing this energy with graph cut via convex optimization, objects are segmented accurately from the images in a global optimization sense. A performance measure for tobacco leaf inspection is used to evaluate our algorithm; the localization accuracy for weak boundaries obtained by fusing multiscale information via convex optimization is encouraging.

Yinhui Zhang, Yunsheng Zhang, Zifen He

A Fast Level Set-Like Algorithm for Region-Based Active Contours

Implicit active contours are widely employed in image processing and related areas. Their implementation using the level set framework brings several advantages over parametric snakes. In particular, parameterization independence, topological flexibility, and straightforward extension into higher dimensions have led to their popularity. On the other hand, the numerical solution of the associated partial differential equations (PDEs) is very time-consuming, especially for large 3D images. In this paper, we modify a fast level set-like algorithm by Nilsson and Heyden [14], intended for tracking gradient-based active contours, in order to obtain a fast algorithm for tracking region-based active contours driven by the Chan-Vese model. The potential of the proposed algorithm and its comparison with two other fast methods minimizing the Chan-Vese model are demonstrated on both synthetic and real image data.

Martin Maška, Pavel Matula, Ondřej Daněk, Michal Kozubek

A Novel Hardware Architecture for Rapid Object Detection Based on Adaboost Algorithm

This paper proposes a novel hardware architecture for rapid object detection based on the Adaboost learning algorithm with Haar-like features as weak classifiers. A 24x24 pipelined integral image array is introduced to reduce calculation time and to eliminate the huge hardware resource consumption of integral image calculation and storage. An expansion of the integral image array is also proposed to increase parallelism at a low cost in hardware resources. These methods result in an optimized detection process. We further implemented the process on a Xilinx XUP Virtex II Pro FPGA board and achieved an accuracy of 91.3% and a speed of 80 fps at a clock rate of 100 MHz for 352x288 CIF images.

Tinghui Wang, Feng Zhao, Jiang Wan, Yongxin Zhu
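For readers unfamiliar with the integral image that the architecture pipelines in hardware, a minimal software sketch follows. It shows the usual summed-area table and the four-lookup rectangle sum that Haar-like features rely on; the function names and the padded table layout are choices made for this example and say nothing about the FPGA design itself.

```python
def integral_image(image):
    """Summed-area table with one extra row and column of zeros.

    table[y][x] holds the sum of image[0:y][0:x].
    """
    height, width = len(image), len(image[0])
    table = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0
        for x in range(width):
            row_sum += image[y][x]
            table[y + 1][x + 1] = table[y][x + 1] + row_sum
    return table


def rect_sum(table, top, left, bottom, right):
    """Sum of rows top..bottom-1 and columns left..right-1 using four lookups."""
    return table[bottom][right] - table[top][right] - table[bottom][left] + table[top][left]


pixels = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
table = integral_image(pixels)
lower_right_block = rect_sum(table, 1, 1, 3, 3)  # covers 5, 6, 8, 9
```

A two-rectangle Haar-like feature is then just the difference of two such rectangle sums, so its cost does not depend on the rectangle size.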

Using Perceptual Color Contrast for Color Image Processing

Many grayscale image processing techniques, such as edge and feature detection and template matching, require the computation of image gradients and intensity differences. These computations in grayscale are very much like measuring the color difference between two colors. The goal of this work is to determine an efficient method to represent color difference so that many existing grayscale image processing techniques that require the computation of intensity differences and image gradients can be adapted for color without significantly increasing the amount of data to process and without significantly altering the grayscale-based algorithms. In this paper, several perceptual color contrast measurement formulas are evaluated to determine the most applicable metric for color difference representation. Well-known edge and feature detection algorithms using color contrast are implemented to demonstrate the feasibility of the approach.

Guangming Xiong, Dah-Jye Lee, Spencer G. Fowers, Jianwei Gong, Huiyan Chen

GPU Acceleration of Robust Point Matching

Robust Point Matching (RPM) is a common image registration algorithm, yet its large computational complexity prohibits registering large point sets in a timely manner. With recent advances in General Purpose Graphical Processing Units (GPGPUs), commodity hardware is capable of greatly reducing the execution time of RPM when non-rigidly aligning thousands of data points. In this paper, we identify areas where parallelism can be exploited in the RPM algorithm, and investigate a GPU-based approach to accelerate the implementation. Other common RPM implementations are compared with our solution. Experiments on synthetic and real data sets show that our approach achieves close to linear speed-up with respect to total computational power over the widely used Matlab implementation. Our tests indicate that utilizing our implementation on current state of the art GPU technology would enable the use of vastly greater point set sizes.

Chad Mourning, Scott Nykl, Huihui Xu, David Chelberg, Jundong Liu

A Wavelet-Based Face Recognition System Using Partial Information

This paper aims to integrate part-based feature extractors, namely Non-negative Matrix Factorization (NMF), Local NMF, and Spatially Confined NMF, in the wavelet frequency domain. The wavelet transform, with its approximate decomposition, is used to reduce noise and produce a representation in the low-frequency domain, making the facial images insensitive to facial expression and small occlusions. Images covering 75% of the full face are used for training and testing, since a previous study reported that they contain sufficient information. Our experiments on [...] Database demonstrate that feature extractors in the wavelet frequency domain perform better than those without any filters. The optimum result is obtained for SFNMF of [...] = 60 with a Symlet orthonormal wavelet filter of order 2 at the second decomposition level. The recognition rate is 98%.

H. F. Neo, C. C. Teo, Andrew B. J. Teoh

A Study of Hierarchical Correlation Clustering for Scientific Volume Data

Correlation study is at the heart of time-varying multivariate volume data analysis and visualization. In this paper, we study hierarchical clustering of volumetric samples based on the similarity of their correlation relation. Samples are selected from a time-varying multivariate climate data set according to knowledge provided by the domain experts. We present three different hierarchical clustering methods based on quality threshold, k-means, and random walks, to investigate the correlation relation with varying levels of detail. In conjunction with qualitative clustering results integrated with volume rendering, we leverage parallel coordinates to show quantitative correlation information for a complete visualization. We also evaluate the three hierarchical clustering methods in terms of quality and performance.

Yi Gu, Chaoli Wang

Subversion Statistics Sifter

We present Subversion Statistics Sifter, a visualisation and statistics system for exploring the structure and evolution of data contained in Subversion repositories with respect to both developer activity and source code changes. We support a variety of visualisation techniques, including statistical graphics and graph plots. We exploit the inherent hierarchical structure of software archives to support users of the tool in navigation and orientation tasks and to allow them to obtain insight from the data on different levels of granularity such as directories, files, or even down to single lines of code. The tool is targeted towards large, tiled displays driven by graphics clusters; therefore, distant corresponding views are highlighted by a rubber-banding technique. The system is built on a .NET and WPF basis that exploits data binding and theming of common controls. Following this principle, the system can easily be extended by additional visualisation techniques. We illustrate the usefulness of Subversion Statistics Sifter by case studies of student lab and open source software projects.

Christoph Müller, Guido Reina, Michael Burch, Daniel Weiskopf

A Lossy/Lossless Coding Algorithm Using Histogram

In this paper we propose a compression method that is both lossy and lossless. The decision on how much of the image is compressed in a lossy manner and how much in a lossless manner depends on information obtained from the histogram of the image. Another important decision parameter is the bit rate of the image. Together these two parameters enable us to divide the image into two parts for different types of compression. The results show that the difference between original and decompressed images is visually negligible. The PSNR and SSIM are comparable to JPEG...

Sunil Bhooshan, Shipra Sharma
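The PSNR figure used to compare against JPEG follows the standard definition, PSNR = 10 log10(MAX^2 / MSE). The sketch below assumes 8-bit samples passed as flat sequences; the function name and the sample values are illustrative and are not data from the paper.

```python
import math


def psnr(original, decoded, max_value=255):
    """Peak signal-to-noise ratio in dB between two equally long pixel sequences."""
    if len(original) != len(decoded) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)
    if mse == 0:
        return math.inf  # identical images, e.g. a fully lossless region
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical pixels: a losslessly coded run and a lossy run with small errors.
lossless_part = psnr([12, 40, 200], [12, 40, 200])
lossy_part = psnr([100, 100, 100, 100], [101, 99, 100, 100])
```

A region coded losslessly has zero error and therefore infinite PSNR, so in a mixed scheme the overall PSNR is determined by the lossy region alone.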

Stereo Matching in Mean Shift Attractor Space

In this paper, we present a novel method for improving the speed and accuracy of the initial disparity estimation of the stereo matching algorithms. These algorithms are widely investigated, but fast and precise estimation of a disparity map still remains a challenging problem. Recent top ranking stereo matching algorithms usually utilize a window-based approach and mean shift based clustering. We propose an algorithm inspired by a top-down approach exploiting these two steps.

By using the mean shift algorithm, we transform the input images into the attractor space and then perform the matching on the attractor sets. In contrast to the state-of-the-art algorithms, where matching is done on the basis of pixel intensities, grouped according to the results of mean shift algorithm, we perform the matching between the attractor sets of both input images. In this way we are able to acquire fast disparity estimates for whole segments.

Michal Krumnikl

Undecimated Wavelet Transform-Based Image Interpolation

In this paper, a new wavelet-based image interpolation algorithm is developed for magnifying image details so that the visibility of tiny features in a digital image is improved. The algorithm takes the low-resolution (LR) image as the low-pass filtered subband of an unknown wavelet-transformed high-resolution (HR) image. Then an initial HR image of twice the size of the LR image is estimated using zero padding of the details. The HR image is transformed via the undecimated wavelet transform (UWT), resulting in four subbands, three of which are related to the high-frequency components of the image. In the UWT domain, the LL subband is replaced with the initially estimated HR image, and the final HR image is obtained by applying the inverse UWT. Experiments conducted with both gray-level and color images show the superiority of the proposed algorithm over state-of-the-art interpolation methods.

Numan Unaldi, Vijayan K. Asari

The Influence of Multimodal 3D Visualizations on Learning Acquisition

The present research addressed a critical barrier constantly facing developers and instructors involved in interactive web-based teaching. Participants were assigned to different multimodal training conditions (visual-, auditory-, audiovisual-modality, or no training) to learn and free recall a list of 14 terms associated with construction of a wood-frame house. The audiovisual- and visual-modality training conditions displayed comparable accuracy rates, while the auditory-modality training condition revealed lower accuracy, and the no-training condition exhibited little or no learning acquisition. The process of simultaneously exposing learners to interactive dynamic visualizations and prompting them to attend to information through the pragmatic use of audio cues reduced memory load, and in turn facilitated memory recall. Findings provided constructive feedback on the efficacy and usability of three-dimensional (3D) dynamic visualizations in web-based distance education, and implications for future development of human-computer user interfaces to optimize engineering design effectiveness.

Phuong T. Do, John R. Moreland, Dennis P. Korchek

Visualizing Gene Co-expression as Google Maps

We visualize gene co-regulation patterns by creating 2D embeddings from microarray data corresponding to complete gene sets from the mouse genome, across large numbers of cell types. We use Google Maps and client-side graphics to disseminate such pre-rendered visualizations with a small but intuitive set of interactions. We conduct an anecdotal evaluation with domain specialists and demonstrate that biologists appreciate this approach because it facilitates low-overhead access to readily analyzable perspectives of unfamiliar datasets and because it offers a convenient way of disseminating large datasets in visual form.

Radu Jianu, David H. Laidlaw

A New Approach for Lighting Effect Rendering

We propose a new approach to render different lighting effects on an image. Artists often use different stylizations for different lighting effects in the same image. However, existing work only tries to extract shadows without distinguishing their type. We introduce a lighting map to describe the different lighting effects and propose six non-photorealistic renderings based on artistic styles ranging from chiaroscuro to comics. Given an input image, we are able to automatically generate a lighting map, which may be modified by the user to specify the types of shadow or light effects. Our model is flexible and specifically designed to help users, even amateurs, to semi-automatically stylize the different kinds of light effects in an image. It is designed to be integrated into an image editing tool.

Catherine Sauvaget, Vincent Boyer

SemaTime - Timeline Visualization of Time-Dependent Relations and Semantics

Timeline-based visualizations arrange time-dependent entities along a time axis and are used in many different domains, such as digital libraries, criminal investigation, and medical information systems, to support users in understanding chronological structures. Through the use of semantic technologies, the information is categorized in a domain-specific, hierarchical schema and specified by semantic relations. Commonly, semantic relations in timeline visualizations are depicted by interconnecting entities with a directed edge. However, semantic relations may change in the course of time. In this paper we introduce a new timeline visualization for time-dependent semantics called SemaTime that offers a hierarchical categorization of time-dependent entities, including navigation and filtering features. We also present a novel concept for visualizing time-dependent relations that allows the illustration of time-varying semantic relations and affords an easily understandable visualization of complex, time-dependent interrelations.

Christian Stab, Kawa Nazemi, Dieter W. Fellner

Comics Stylizations of 3D Scenes Using GPU

We propose a new comics stylization model based on very efficient depth map generation. It is designed to render large scenes with multiple objects, as well as single objects, in real time through a complete GPU implementation. 3D comics stylizations are generally view-dependent and only use the camera field of view to render the scene. In all cases, the depth of objects is computed according to the near and far planes, although these planes bear almost no relation to the actual depth range of the objects present in the scene. Our model solves this problem by computing the minimal and maximal values in the depth map. Moreover, it reproduces and improves on comics stylizations proposed for 2D images. Results show that our model is suitable for different kinds of 3D scenes and can produce various comics stylizations.

Jordane Suarez, Farès Belhadj, Vincent Boyer

Discovering Novelty in Gene Data: From Sequential Patterns to Visualization

Data mining techniques allow users to discover novelty in huge amounts of data. Frequent pattern methods have proved to be efficient, but the extracted patterns are often too numerous and thus difficult for end-users to analyse. In this paper, we focus on sequential pattern mining and propose a new visualization system, which aims at helping end-users to analyse extracted knowledge and to highlight the novelty according to referenced biological document databases. Our system is based on two visualization techniques: clouds and solar systems. We show that these techniques are very helpful for identifying associations and hierarchical relationships between patterns among related documents. Sequential patterns extracted from gene data using our system were successfully evaluated by two biology laboratories working on Alzheimer's disease and cancer.

Arnaud Sallaberry, Nicolas Pecheur, Sandra Bringay, Mathieu Roche, Maguelonne Teisseire

A Differential-Geometrical Framework for Color Image Quality Measures

We propose a differential-geometrical framework for color Image Quality Measures (IQMs). Our approach is based on the definition of a relevant image distortion measure in a Riemannian way. To do this, we use the concept of geodesic distance and apply the theoretical setting to exhibit closed forms for all the differential geometric attributes of two well-known color spaces: the Helmholtz and Stiles manifolds. With these formulæ, we generalize some useful IQMs from the Euclidean framework to the Riemannian one. Finally, we present some experiments performed on real images, gradually distorted by different kinds of noise, and conclude that the Riemannian IQMs are meaningful and relevant.

Mourad Zéraï, Olfa Triki

Three Dimensional Reconstruction Using Vertical Constraints from a Photograph

Conventionally, two photographs are necessary to reconstruct a three-dimensional scene. Reconstructing three dimensions from a single photograph is an ill-posed problem, in that a correct solution cannot be determined uniquely. Nevertheless, a human perceives depth even when looking at a single photograph, because humans have prior knowledge. In this paper we propose a technique to reconstruct three dimensions from a single photograph. As prior knowledge, we use rectangles and perpendicular relations between quadrangles. We reconstruct three dimensions from a single photograph and demonstrate the effectiveness of the technique.

Satoru Morita

A Framework for Visual and Haptic Collaboration in Shared Virtual Spaces

We propose a framework for visual and haptic collaboration in X3D/VRML shared virtual spaces. In this collaborative framework, two pipelines (visual and haptic) complement each other to provide a simple and efficient solution to problems requiring collaboration in shared virtual spaces on the web. We consider shared objects defined as virtual objects with their visual and physical properties rendered synchronously on each client computer. We introduce virtual tools, which are shared objects associated with interactive and haptic devices. We implemented the proposed ideas as a server-client framework with a dedicated viewer. We discuss two implementation frameworks based on the strong and thin server concepts.

Lei Wei, Alexei Sourin, Herbert Stocker

Design and Costs Estimation of Electrical Substations Based on Three-Dimensional Building Blocks

Substation design is a fundamental engineering component in power network construction. The benefits of having adequate tools and design systems are related mainly to cost savings, reduction of construction problems, and faster throughput of projects. In this paper we propose an approach based on three-dimensional building blocks to construct virtual substations. The building blocks can become 3D standards for advanced engineering, automated drawing, data extraction, and reusability of designs. These substation designs can therefore improve the quality and reliability of the design process. Virtual substations can also support decisions about construction site selection and help with community and government acceptance. Finally, 3D visualization and walkthroughs can be used to improve the construction, commissioning, operation, and maintenance of distribution and transmission electrical substations.

Eduardo Islas Pérez, Jessica Bahena Rada, Jesus Romero Lima, Mirna Molina Marín

Generating Shaded Image with Lighting Using Image Fusion Space

We generate an image lit by many light sources by fusing images each lit by a few light sources. In general, lighting obeys the principle of superposition: an image with two light sources can be generated by adding the pixel values of two images with one light source each. However, this simple addition is difficult when the camera's internal parameters change, as with a single-lens reflex camera with automatic sensitivity control. It is also difficult to generate the image using traditional fusion methods. We introduce the image fusion space, which describes the relationship between traditional fusion methods. We propose a method that selects the fusion method for each pixel from the image fusion space. The fused image is generated by searching the image fusion space for the fusion method and parameters suited to each divided region. We show the effectiveness of the approach by generating fused images with multiple light sources in a general environment.

Satoru Morita

Automatic Detection of Morphologically Distinct Objects in Biomedical Images Using Second Generation Wavelets and Multiple Marked Point Process

Automatically analyzing morphology of biological objects such as cells, nuclei, and vessels is important for medicine and biology. However, detecting individual biological objects is challenging because biomedical images tend to have a complex structure composed of many morphologically distinct objects and unclear object boundaries. In this paper, we present a novel approach to automatically detect individual objects in biomedical images using a multiple marked point process, in which points are the positions of the objects and marks are their geometric attributes. With this model, we can consider both prior knowledge of the structure of the objects and observed data of an image in object detection. Our proposed method also uses the second generation wavelets-based edge-preserving image smoothing technique to cope with unclear boundaries of biological objects. The experimental results show the effectiveness of our method.

Hiroshi Hatsuda

Imaging-Based Computation of the Dynamics of Pelvic Floor Deformation and Strain Visualization Analysis

Understanding the visco-elastic and contractile properties of pelvic floor muscles and their characteristics in deforming soft tissues is important for predicting their functionality. Ultrasound imaging provides a noninvasive way to visualize the temporal sequence of the displacement and deformation of abdominal structures, and this paper provides a conceptual framework for dynamic bio-imaging analysis. The objective of this study is thus to measure the displacement, deformation, and strain generated by the reflex activation of pelvic floor muscles. The deformation analysis is computed using the temporal and spatial components of the urethral profile, reflecting the deformation and strain generated by voluntary and reflex activity.

Christos E. Constantinou, Linda McLean, Ellen Kuhl, Bertha Chen

Exploiting Multiple Cameras for Environmental Pathlets

We present a novel multi-camera framework to extract reliable pathlets [1] from tracking data. The proposed approach weights tracks based on their spatial and orientation similarity to simultaneous tracks observed in other camera views. The weighted tracks are used to build a Markovian state space of the environment and Spectral Clustering is employed to extract pathlets from a state-wise similarity matrix. We present experimental results on five multi-camera datasets collected under varying weather conditions and compare with pathlets extracted from individual camera views and three other multi-camera algorithms.

Kevin Streib, James W. Davis

On Supervised Human Activity Analysis for Structured Environments

We consider the problem of developing an automated visual solution for detecting human activities within industrial environments. This has been performed using an overhead view. This view was chosen over more conventional oblique views as it does not suffer from occlusion but still retains powerful cues about the activity of individuals. A simple blob tracker has been used to track the most significant moving parts, i.e. human beings. The output of the tracking stage was manually labelled into four distinct categories (walking, carrying, handling, and standing still), which taken together form the basic building blocks of a higher-level workflow description. These were used to train a decision tree using one subset of the data. A separate training set is used to learn the patterns in the activity sequences with Hidden Markov Models (HMMs). On independent testing, the HMMs are applied to analyse and modify the sequence of activities predicted by the decision tree.

Banafshe Arbab-Zavar, Imed Bouchrika, John N. Carter, Mark S. Nixon

Human Behavior Analysis at a Point of Sale

This paper presents a method that analyzes human behavior in a shopping setting. Several actions are detected, and we are especially interested in detecting interactions between customers and products. This paper first presents our application context and the advantages and constraints of a shopping setting. Then we present and evaluate several methods for human behavior understanding. Human actions are represented with the Motion History Image (MHI), Accumulated Motion Image (AMI), Local Motion Context (LMC), and Interaction Context (IC). Then we use Support Vector Machines (SVMs) to classify actions. Finally, we combine the LMC and IC descriptors in a real-time system that recognizes human behaviors while shopping, in order to enhance digital media impact at the point of sale.

R. Sicre, H. Nicolas

