
About this Book

The two-volume set CCIS 483 and CCIS 484 constitutes the refereed proceedings of the 6th Chinese Conference on Pattern Recognition, CCPR 2014, held in Changsha, China, in November 2014. The 112 revised full papers presented in two volumes were carefully reviewed and selected from 225 submissions. The papers are organized in topical sections on fundamentals of pattern recognition; feature extraction and classification; computer vision; image processing and analysis; video processing and analysis; biometric and action recognition; biomedical image analysis; document and speech analysis; pattern recognition applications.



Section I: Fundamentals of Pattern Recognition

A Nonlinear Classifier Based on Factorization Machines Model

Polynomial Classifier (PC) is a powerful nonlinear classification method that has been widely used in many pattern recognition problems. Despite its high classification accuracy, its computational cost for both training and testing is polynomial in the dimensionality of the input data, which makes it unsuitable for large-scale problems. In this work, based on the idea of factorization machines (FMs), we propose an efficient classification method that approximates PC by performing a low-rank approximation of the coefficient matrix of PC. Our method largely preserves the accuracy of PC while having only linear computational complexity in the data dimensionality. We conduct extensive experiments to show the effectiveness of our method.

XiaoLong Liu, YanMing Zhang, ChengLin Liu
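
The speed-up the abstract claims comes from a standard FM identity: with a rank-k factorization V of the quadratic coefficient matrix, the pairwise term can be evaluated in O(kd) instead of O(d^2). The sketch below (illustrative names, not the authors' implementation) demonstrates the identity and checks it against a naive double loop.

```python
import numpy as np

def fm_score(x, w0, w, V):
    """FM decision score with the linear-time pairwise term.

    sum_{i<j} <V[i], V[j]> x_i x_j is rewritten as
    0.5 * sum_f [ (sum_i V[i,f] x_i)^2 - sum_i V[i,f]^2 x_i^2 ],
    which costs O(k*d) instead of the O(d^2) of a full polynomial classifier.
    """
    linear = w0 + w @ x
    s = V.T @ x                   # shape (k,): per-factor weighted sums
    s2 = (V ** 2).T @ (x ** 2)    # shape (k,): per-factor squared sums
    pairwise = 0.5 * np.sum(s * s - s2)
    return linear + pairwise

def fm_score_naive(x, w0, w, V):
    """Reference O(d^2) evaluation of the same model, for checking."""
    d = x.shape[0]
    total = w0 + w @ x
    for i in range(d):
        for j in range(i + 1, d):
            total += (V[i] @ V[j]) * x[i] * x[j]
    return total

rng = np.random.default_rng(0)
d, k = 20, 4
x = rng.normal(size=d)
w0, w, V = 0.1, rng.normal(size=d), rng.normal(size=(d, k))
fast, slow = fm_score(x, w0, w, V), fm_score_naive(x, w0, w, V)
```

The two evaluations agree to floating-point precision, which is the sense in which the low-rank model keeps PC's form while dropping its cost.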

Training Deep Belief Network with Sparse Hidden Units

In this paper, we propose a framework to train the Restricted Boltzmann Machine (RBM), the basic building block of the Deep Belief Network (DBN). By introducing a sparsity constraint into the Contrastive Divergence (CD) algorithm, we trained RBMs with better performance than the off-the-shelf model on the MNIST handwritten digit data set. The sparse model suffers slightly from saturation; however, by using a trade-off coefficient, the saturation problem can be mitigated effectively. To our knowledge, this is the first time a sparsity constraint has been introduced on the hidden units of an RBM.

Zhen Hu, Wenzheng Hu, Changshui Zhang
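
A minimal sketch of the idea, under the assumption that the sparsity term penalizes the deviation of each hidden unit's mean activation from a target rate (the hyperparameter names `rho` and `lam`, the trade-off coefficient, are illustrative, not the paper's notation):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cd1_sparse_update(W, bv, bh, v0, lr=0.05, rho=0.05, lam=0.1, rng=None):
    """One CD-1 step for a binary RBM with a sparsity penalty on hidden units.

    The term lam * (mean hidden activation - rho) nudges each hidden unit's
    average activation toward the target rho; lam plays the role of the
    trade-off coefficient mentioned in the abstract.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    h0 = sigmoid(v0 @ W + bh)                 # positive-phase hidden probs
    h0_s = (rng.random(h0.shape) < h0).astype(float)
    v1 = sigmoid(h0_s @ W.T + bv)             # one-step reconstruction
    h1 = sigmoid(v1 @ W + bh)
    dW = v0.T @ h0 - v1.T @ h1                # contrastive-divergence gradient
    sparsity = h0.mean(axis=0) - rho          # penalty gradient per hidden unit
    W += lr * (dW / v0.shape[0] - lam * sparsity)
    bh += lr * (h0.mean(axis=0) - h1.mean(axis=0) - lam * sparsity)
    bv += lr * (v0.mean(axis=0) - v1.mean(axis=0))
    return W, bv, bh, h0.mean()

rng = np.random.default_rng(0)
W = 0.01 * rng.normal(size=(6, 4))
bv, bh = np.zeros(6), np.zeros(4)
v0 = (rng.random((20, 6)) < 0.5).astype(float)   # a toy binary mini-batch
W, bv, bh, mean_act = cd1_sparse_update(W, bv, bh, v0, rng=rng)
```

Repeating this update drives the average hidden activation toward `rho`, which is what produces sparse hidden representations.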

The Research of Matching Area Selection Criterion for Gravity Gradient Aided Navigation

Matching area selection is the basis of gravity gradient aided navigation. In this paper, a criterion for gravity gradient matching area selection is proposed based on gravity gradient tensor matching location, feature extraction and analysis. Matching position experiments on the gravity gradient tensor map are performed with a sliding window, and the optimal matching areas on the gravity gradient tensor maps are found. By extracting gravity gradient feature parameters for the optimal matching areas and analyzing the impact of these feature parameters on matching accuracy, a criterion for gravity gradient matching area selection is obtained. Using the proposed matching area selection criterion and the mean absolute deviation (MAD) matching algorithm, gravity gradient aided positioning simulations show that matching navigation in the adaptation area is markedly superior to that in the non-adaptation area: the position error is less than one grid cell, and the matching rate is greater than 90%.

KaiHan Li, Ling Xiong, Long Cheng, Jie Ma

A Manifold Learning Fusion Algorithm Based on Distance and Angle Preservation

Each manifold learning algorithm has its own advantages and applicable situations, so how to select the best result among them is an important question. To this end, a manifold learning fusion algorithm is proposed that selects the best of multiple results yielded by different manifold learning algorithms according to a criterion equation. Moreover, a local optimization technique is used to refine the embedded result. By combining the advantages of classical manifold learning algorithms, which preserve certain properties effectively, with better preservation of distance and angle, our algorithm can yield satisfactory results on almost all kinds of manifolds. The effectiveness and stability of the algorithm are further confirmed by experiments.

Yanchun Gu, Defeng Zhang, Zhengming Ma, Guo Niu

Application of Modified Teaching-Learning Algorithm in Coordination Optimization of TCSC and SVC

Due to the interaction among FACTS devices, coordinated control of multiple FACTS devices is a pressing topic. A multi-objective optimization problem is formulated in this paper, and a modified teaching-learning algorithm (MTLA) is presented to coordinate the Thyristor Controlled Series Capacitor (TCSC), the Static Var Compensator (SVC) and the power angle difference damping characteristics of generators. The optimal controller parameters are found to improve the coordinated control. Compared with the basic TLA, MTLA applies a new learner phase to avoid entrapment in local optima, followed by a locked-device phase that improves the convergence rate. Meanwhile, several meta-heuristic techniques are used to search for and store Pareto-optimal solutions of the controller parameters. The proposed algorithm is validated on the IEEE 4-machine 11-bus system.

Liwu Xiao, Qianlong Zhu, Canbing Li, Yijia Cao, Yi Tan, Lijuan Li

Multi-task Sparse Gaussian Processes with Improved Multi-task Sparsity Regularization

Gaussian processes are a popular and effective Bayesian method for classification and regression. Generating sparse Gaussian processes is a hot research topic, since Gaussian processes have to face the problem of cubic time complexity with respect to the size of the training set. Inspired by the idea of multi-task learning, we believe that simultaneously selecting subsets of multiple Gaussian processes will be more suitable than selecting them separately. In this paper, we propose an improved multi-task sparsity regularizer which can effectively regularize the subset selection of multiple tasks for multi-task sparse Gaussian processes. In particular, based on the multi-task sparsity regularizer proposed in [12], we perform two improvements: 1) replacing a subset of points with a rough global structure when measuring the global consistency of one point; 2) performing normalization on each dimension of every data set before sparsification. We combine the regularizer with two methods to demonstrate its effectiveness. Experimental results on four real data sets show its superiority.

Jiang Zhu, Shiliang Sun

Short-Term Load Forecasting of LSSVM Based on Improved PSO Algorithm

Forecasting precision directly affects the reliability, economy and quality of power supply in a power system. An improved particle swarm optimizer (IPSO) is applied to the least squares support vector machine (LSSVM) algorithm, optimizing the initialization parameters and improving the accuracy of short-term load forecasting. Historical data from a real grid are used to build a short-term load forecasting model based on the optimized algorithm; the data comprehensively account for meteorology, weather, date type and other factors that influence the load. Compared with the plain LSSVM algorithm and the standard PSO-LSSVM, the empirical results show that the IPSO-LSSVM model is superior in convergence behavior, prediction accuracy and speed. The IPSO not only improves the accuracy of load forecasting but also frees LSSVM from heavy reliance on empirically or randomly selected parameters.

Qianhui Gong, Wenjun Lu, Wenlong Gong, Xueting Wang
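
The LSSVM part of this pipeline is attractive because training reduces to one linear system rather than a QP. The sketch below shows that dual system under standard LSSVM notation; the sine curve merely stands in for a load series, and all names are illustrative.

```python
import numpy as np

def lssvm_train(K, y, gamma=1000.0):
    """LSSVM dual solution: solve the bordered linear system
    [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y],
    which replaces the quadratic program of a standard SVM."""
    n = K.shape[0]
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = K + np.eye(n) / gamma
    sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
    return sol[0], sol[1:]            # bias b, dual coefficients alpha

def rbf(X, Z, s=0.2):
    """Gaussian kernel on 1-D inputs."""
    return np.exp(-(X[:, None] - Z[None, :]) ** 2 / (2 * s ** 2))

# Toy 1-D regression: a sine wave stands in for the historical load curve.
X = np.linspace(0.0, 1.0, 8)
y = np.sin(2 * np.pi * X)
K = rbf(X, X)
b, alpha = lssvm_train(K, y)
pred = K @ alpha + b                  # in-sample fit
```

The role of PSO/IPSO in the paper is then to tune `gamma` and the kernel width `s`, which this sketch leaves fixed.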

Blob Detection with the Determinant of the Hessian

This study detects image blobs and estimates their parameters using the determinant of the Hessian operator. To investigate differential detectors quantitatively, a mathematical function was used to represent blobs and to solve for their parameters, including position, width, length, contrast, offset and orientation, in closed form. The proposed parameter estimates are both novel and highly accurate. Sub-pixel localization and interpolation further improve the accuracy, and noise is suppressed using the neighbors of each feature. The method was tested on various types of synthesized blobs and on real-world images, where it detected fewer duplicated features. Experiments showed that the proposed method outperforms other methods.

Xiaopeng Xu
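
The core response map is easy to sketch: smooth the image at scale sigma, form the Hessian from second differences, and take its scale-normalized determinant. This is a generic determinant-of-Hessian detector, not the paper's closed-form parameter solver, and it uses only NumPy.

```python
import numpy as np

def gaussian_kernel(sigma, radius):
    ax = np.arange(-radius, radius + 1)
    g = np.exp(-(ax ** 2) / (2 * sigma ** 2))
    return g / g.sum()

def smooth(img, sigma):
    """Separable Gaussian smoothing with reflective padding."""
    r = int(3 * sigma)
    k = gaussian_kernel(sigma, r)
    pad = np.pad(img, r, mode="reflect")
    tmp = np.apply_along_axis(lambda m: np.convolve(m, k, mode="valid"), 0, pad)
    return np.apply_along_axis(lambda m: np.convolve(m, k, mode="valid"), 1, tmp)

def det_hessian(img, sigma=2.0):
    """Scale-normalized determinant-of-Hessian response map."""
    L = smooth(img.astype(float), sigma)
    Ly, Lx = np.gradient(L)           # first derivatives (rows, cols)
    Lyy, Lyx = np.gradient(Ly)        # second derivatives
    Lxy, Lxx = np.gradient(Lx)
    return sigma ** 4 * (Lxx * Lyy - Lxy * Lyx)

# A single bright Gaussian blob: the response should peak near its center.
yy, xx = np.mgrid[0:64, 0:64]
blob = np.exp(-((yy - 32) ** 2 + (xx - 32) ** 2) / (2 * 4.0 ** 2))
resp = det_hessian(blob, sigma=4.0)
peak = np.unravel_index(np.argmax(resp), resp.shape)
```

Sub-pixel refinement, as in the paper, would then fit a quadratic around `peak` instead of stopping at the integer argmax.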

A Study on Layer Connection Strategies in Stacked Convolutional Deep Belief Networks

This paper presents a study of layer connections in stacked convolutional networks. To this end, three layer connection types, namely diverging connection, neighboring connection and full connection, are compared in convolutional deep belief networks (CDBN). The results show that the proposed full connection achieves better performance and lower time and space cost than the other two strategies in nearly all conditions. Full connection combines the features obtained from lower layers well and yields more representative higher-layer features.

Lei Guo, Shijie Li, Xin Niu, Yong Dou

Fault Diagnosis for Distribution Networks Based on Fuzzy Information Fusion

To realize fault pattern recognition in distribution networks with a low degree of automation, a fault diagnosis method based on fuzzy information fusion is presented. To improve the efficiency of fault diagnosis on important branch lines, the membership function is improved, based on the hierarchical feeder model, by fusing equipment alarm information from important branch lines with customer telephone complaint information. Besides, to reduce the adverse effects of manually setting the threshold too high or too low, the threshold range is partitioned by random sampling of historical data, yielding the statistical probability of each threshold interval. Moreover, an evaluation parameter is defined to revise the diagnosis results so that the algorithm can diagnose multiple faults, solving the problem that faults at multiple nodes are misjudged as a fault at their common upstream node. Finally, a distribution line is taken as an example to verify the validity of the algorithm for both single and multiple failures.

Fangrong Wu, Minfang Peng, Mingjun Qi, Liang Zhu, Hua Leng, Yi Su, Qiang Zhong, Hu Tan

Kernel-Distance Target Alignment

The success of kernel methods depends on the kernel, so the choice of a kernel and the proper setting of its parameters are of crucial importance. Learning a kernel from data requires evaluation measures to assess the kernel's quality. In this paper, we propose a new measure named kernel-distance target alignment (KDTA). The measure retains the properties of the state-of-the-art evaluation measures, kernel target alignment (KTA) and the feature space-based kernel matrix evaluation measure (FSM), while overcoming their limitations. Comparative experiments indicate that the new measure is a good indicator of the superiority of a kernel and can find better parameters for the RBF kernel.

Peiyan Wang, Cai Dongfeng

A LLE-Based HMM Applied to the Prediction of Kiln Coal Feeding Trend

As the data collected in a rotary kiln are highly nonlinear, linear feature extraction transforms such as PCA, ICA and LPP are not ideal, whereas manifold learning performs well on high-dimensional nonlinear data. A new Hidden Markov Model (HMM) method combined with Locally Linear Embedding (LLE) is put forward to predict the coal feeding trend. Firstly, LLE-HMM performs a nonlinear feature transform on the sample data by LLE; then the feature data are quantized into observation symbols, and finally an HMM is established to predict the coal feeding trend. In simulations on sample data from the rotary kiln production process, and compared with PCA-HMM, ICA-HMM and LPP-HMM, LLE-HMM shows higher measurement accuracy and better tracking performance, satisfying the requirements of coal feeding prediction.

Liu Yunlong, Zhang Xiaogang

Improved Margin Sampling for Active Learning

Active learning is a learning mechanism that can actively query the user for labels. The goal of an active learning algorithm is to build an effective training set by selecting the most informative samples, improving the efficiency of the model within limited time and resources. In this paper, we focus on a state-of-the-art active learning method, SVM-based margin sampling. However, margin sampling does not consider the distribution and the structural space connectivity among the unlabeled data when several examples are chosen simultaneously, which may lead to oversampling in dense regions. To overcome this shortcoming, we propose an improved margin sampling method that applies the manifold-preserving graph reduction algorithm to the original margin sampling method. Experimental results on multiple data sets demonstrate that our method obtains better classification performance than the original margin sampling.

Jin Zhou, Shiliang Sun
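
The two-stage idea can be sketched in a few lines: first thin out dense regions, then apply the usual smallest-|margin| criterion to the survivors. The greedy distance-based reduction below is only an illustration of the manifold-preserving idea, not the exact algorithm of the paper.

```python
import numpy as np

def graph_reduce(X, eps):
    """Greedy reduction sketch: keep a point only if it lies at least eps
    from every point kept so far, so a dense clump contributes a single
    representative."""
    kept = []
    for i in range(X.shape[0]):
        if all(np.linalg.norm(X[i] - X[j]) >= eps for j in kept):
            kept.append(i)
    return np.array(kept)

def improved_margin_sample(X, decision_values, k, eps=0.5):
    """Margin sampling on the reduced candidate set: among the kept points,
    pick the k with the smallest |decision value|, i.e. those closest to
    the SVM decision boundary."""
    cand = graph_reduce(X, eps)
    return cand[np.argsort(np.abs(decision_values[cand]))][:k]

# Five unlabeled points, three of them nearly duplicated near the boundary:
# plain margin sampling would pick two near-duplicates, the reduced version
# picks one representative plus a genuinely different point.
X = np.array([[0.0, 0.0], [0.01, 0.0], [0.0, 0.01], [2.0, 0.0], [0.0, 3.0]])
scores = np.array([0.05, 0.04, 0.06, -0.5, 1.2])
picked = improved_margin_sample(X, scores, 2, eps=0.5)
```

Here `scores` plays the role of an SVM's decision-function values on the unlabeled pool.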

Research on the Ant Colony Optimization Fuzzy Neural Network Control Algorithm for ABS

As the convergence rate of the conventional fuzzy neural network control (FNC) algorithm for a vehicle anti-lock braking system (ABS) is slow, an improved ant colony optimization fuzzy neural network control (ACO-FNC) algorithm for ABS is proposed, with the slip rate as the controlled variable. A simulation model of a single-wheel ABS is established. Comparison of the results shows that the ACO-FNC algorithm outperforms the conventional FNC algorithm in convergence speed, slip ratio control quality and braking distance.

Changping Wang, Ling Wang

Section II: Feature Extraction and Classification

Schatten p-Norm Based Matrix Regression Model for Image Classification

Nuclear norm minimization problems for finding the minimum-rank matrix have been well studied in many areas. The Schatten p-norm is an extension of the nuclear norm and of the rank function, and different values of p provide flexible choices for different applications. Departing from the rank viewpoint, we use the Schatten p-norm to characterize the error matrix between the occluded face image and its ground truth. Thus, a Schatten p-norm based matrix regression model is presented, and a general framework for solving the Schatten p-norm minimization problem with an added regularization term is solved by the alternating direction method of multipliers (ADMM). The experiments on image classification and face reconstruction show that our algorithm is more effective and efficient, and can thus act as a fast solver for the matrix regression problem.

Lei Luo, Jian Yang, Jinhui Chen, Yicheng Gao
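
The Schatten p-norm itself is simply the l_p norm of a matrix's singular values, which makes the "extension of the nuclear norm" claim easy to verify numerically:

```python
import numpy as np

def schatten_norm(M, p):
    """Schatten p-norm: the l_p norm of the singular values of M.
    p = 1 recovers the nuclear norm (sum of singular values) and
    p = 2 recovers the Frobenius norm."""
    s = np.linalg.svd(M, compute_uv=False)
    return float(np.sum(s ** p) ** (1.0 / p))

# For a diagonal matrix the singular values are just |3| and |4|.
M = np.diag([3.0, 4.0])
nuc = schatten_norm(M, 1)   # nuclear norm: 3 + 4
fro = schatten_norm(M, 2)   # Frobenius norm: sqrt(9 + 16)
```

As p decreases toward 0, the sum of s**p counts the nonzero singular values, which is why small p acts as a surrogate for rank in the regression model above.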

Hyperspectral Image Classification by Exploiting the Spectral-Spatial Correlations in the Sparse Coefficients

This paper proposes a novel hyperspectral image (HSI) classification method based on a sparse model, which incorporates the spectral and spatial information of the sparse coefficients. Firstly, a sparse dictionary is built from the training samples and the sparse coefficients are obtained through sparse representation. Secondly, a probability map for each class is established by summing the sparse coefficients of that class. Thirdly, mean filtering is applied to each probability map to exploit the spatial information. Finally, we compare the probability maps to find the maximum probability for each pixel and thus determine its class label. Experimental results demonstrate the effectiveness of the proposed method.

Dan Liu, Shutao Li, Leyuan Fang

Spectral-Spatial Hyperspectral Image Classification Using Superpixel and Extreme Learning Machines

We propose an efficient framework for hyperspectral image (HSI) classification based on superpixels and extreme learning machines (ELMs). A superpixel can be regarded as a small region consisting of a number of pixels with similar spectral characteristics. The framework utilizes superpixels to exploit spatial information, which improves classification accuracy. Specifically, we first adopt an efficient segmentation algorithm to divide the HSI into many superpixels. Then, the spatial feature of each superpixel is extracted by computing the mean of the spectral pixels within it; this mean feature combines the spatial and spectral information of the superpixel. Finally, an ELM is used to classify each mean feature and determine the class label of each superpixel. Experiments on two real HSIs demonstrate the outstanding performance of the proposed method in terms of classification accuracy and computational efficiency.

Wuhui Duan, Shutao Li, Leyuan Fang
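
The mean-feature step is a straightforward label-wise average over the cube; a NumPy-only sketch (the segmentation itself, e.g. SLIC, is assumed to have produced the integer label map):

```python
import numpy as np

def superpixel_mean_features(cube, labels):
    """Average the spectra of all pixels sharing a superpixel label.

    cube:   (H, W, B) hyperspectral cube
    labels: (H, W) integer superpixel map from any segmentation
    returns an (n_superpixels, B) matrix of mean spectra."""
    H, W, B = cube.shape
    flat = cube.reshape(-1, B)
    lab = labels.ravel()
    n = int(lab.max()) + 1
    sums = np.zeros((n, B))
    np.add.at(sums, lab, flat)                      # per-label spectral sums
    counts = np.bincount(lab, minlength=n)[:, None]  # pixels per superpixel
    return sums / counts

# Toy 2x2 image with 2 bands and two superpixels (top row vs bottom row).
cube = np.arange(8.0).reshape(2, 2, 2)
labels = np.array([[0, 0], [1, 1]])
feats = superpixel_mean_features(cube, labels)
```

Each row of `feats` is then one training/testing sample for the ELM classifier.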

Visual Tracking with Weighted Online Feature Selection

Most tracking-by-detection algorithms adopt an online learning classifier to separate targets from the surrounding background. These methods use a sliding window to extract candidate samples from the local region surrounding the previous object location in the current frame. The trained classifier is then applied to these samples, and the sample with the maximum classifier score is taken as the new object location. However, during classifier training, noisy samples may often be included, thereby causing visual drift. The online discriminative feature selection (ODFS) method has recently been introduced into tracking algorithms and can alleviate drift to some extent. However, the ODFS tracker may detect a less accurate candidate sample because it does not take sample importance into consideration during feature selection. In this paper, we present a novel weighted online discriminative feature selection (WODFS) tracker, which integrates each sample's contribution into the optimization procedure when selecting features: the method optimizes the objective function in the steepest ascent direction with respect to the weighted positive samples and in the steepest descent direction with respect to the negative ones. The selected features therefore directly couple their scores with the contribution of the samples, resulting in a more robust and stable tracker. Numerous experiments on challenging sequences demonstrate the superiority of the proposed algorithm.

Yu Tang, Zhigang Ling, Jiancheng Li, Lu Bai

A System of Image Aesthetic Classification and Evaluation Using Cloud Computing

Image aesthetic analysis is a new direction in computer vision whose purpose is to simulate the visual perception and aesthetic criteria of human beings to assess the aesthetic value of a given image. Nowadays, due to the popularization of smartphones with built-in cameras, functions like automatic image management and aesthetic guidance on mobile devices are valuable and in great demand. To bridge the gap between the large amount of computation required and the hardware limitations of mobile devices, an image aesthetic classification and evaluation system using cloud computing is built in this paper. The time-consuming parts, such as feature extraction and machine learning algorithms, are deployed on virtual machines in the cloud server, while the simple parts, such as the user interface, are left to the client. In addition, to make full use of the cloud server, a parallel-processing strategy for feature extraction is employed in the system. The results show that our approach achieves promising accuracy and correlates well with subjective human aesthetic evaluation, and with the help of cloud computing the system is more efficient and easier to use on mobile devices.

Weining Wang, Jiancong Liu, Weijian Zhao, Jiachang Li

Image Feature Extraction via Graph Embedding Regularized Projective Non-negative Matrix Factorization

Non-negative matrix factorization (NMF) has been widely used in image processing and pattern recognition. Unfortunately, NMF considers neither the geometrical structure nor the discriminative information of the data, which can make it unsuitable for classification tasks. In addition, NMF only calculates the coefficient matrix of the training data, and how to obtain the coefficient vector of a new test sample remains unclear. In this paper, we propose a novel graph embedding regularized projective non-negative matrix factorization (GEPNMF) method to address these problems. By introducing a graph embedding regularization term, the learned subspace preserves the local geometrical structure of the data while maximizing the margins between different classes. We derive a multiplicative update rule (MUR) to iteratively solve the objective function of GEPNMF and prove its convergence in theory. Experimental results on the ORL and CMU PIE databases demonstrate the effectiveness of GEPNMF.

Haishun Du, Qingpu Hu, Xudong Zhang, Yandong Hou

Sparse Manifold Preserving for Hyperspectral Image Classification

Graph embedding (GE) algorithms have been widely applied for dimensionality reduction (DR) of hyperspectral images (HSI). However, a major challenge of GE is how to select the neighborhood size and define the affinity weights. In this paper, we propose a new sparse manifold learning method, called sparse manifold preserving (SMP), for HSI classification. It constructs the affinity weights from sparse coefficients, which preserves the global sparsity and manifold structure of the HSI data, while requiring no model parameters for the similarity graph. Experiments on the PaviaU HSI data set demonstrate the effectiveness of the presented SMP algorithm.

Hong Huang, Fulin Luo, Jiamin Liu, Zezhong Ma

Hyperspectral Image Classification Using Local Collaborative Representation

In this paper, a new local collaborative representation-based method is proposed for hyperspectral image classification. First, significant atoms are selected to represent the neighbors of each pixel by the collaborative representation algorithm, which replaces the l1-norm with the l2-norm to reduce the representation cost. Then, the query pixel is modeled as a linear combination of these selected active atoms belonging to different classes, and the final classification is carried out based on the contribution of each class to the query pixel and its local neighbors. Experimental results on real hyperspectral images confirm the effectiveness and accuracy of the proposed method.

Yishu Peng, Yunhui Yan, Wenjie Zhu, Jiuliang Zhao
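
With an l2 regularizer, collaborative representation has a closed-form coding vector, which is what makes it cheap; a generic collaborative-representation classifier (not this paper's local variant, and with illustrative names) looks like:

```python
import numpy as np

def crc_classify(D, labels, y, lam=0.01):
    """Collaborative-representation classification sketch.

    The l2-regularized code has the closed form
        a = (D^T D + lam * I)^{-1} D^T y,
    so no iterative sparse solver is needed; the query y is assigned to the
    class whose atoms give the smallest reconstruction residual."""
    a = np.linalg.solve(D.T @ D + lam * np.eye(D.shape[1]), D.T @ y)
    labels = np.asarray(labels)
    residuals = {}
    for c in set(labels.tolist()):
        mask = labels == c
        residuals[c] = np.linalg.norm(y - D[:, mask] @ a[mask])
    return min(residuals, key=residuals.get)

# Toy dictionary: two classes along nearly orthogonal directions.
D = np.array([[1.0, 0.9, 0.0, 0.1],
              [0.0, 0.1, 1.0, 0.9]])
D /= np.linalg.norm(D, axis=0)        # unit-norm atoms
labels = [0, 0, 1, 1]
pred = crc_classify(D, labels, np.array([0.95, 0.05]))
```

A query lying along the class-0 direction is reconstructed almost entirely by class-0 atoms, so its class-0 residual is the smallest.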

Simplified Constraints Rank-SVM for Multi-label Classification

In this paper, we propose a Simplified Constraints Rank-SVM (SCRank-SVM) for multi-label classification, based on the well-established Rank-SVM algorithm. Motivated by the characteristics of the application, we remove the bias term b and modify the decision boundary. Due to the absence of the term b, SCRank-SVM has milder optimization constraints and therefore a better solution space than Rank-SVM. Experimental results on five datasets show that the proposed algorithm is a strong candidate for multi-label classification, compared with four existing state-of-the-art multi-label algorithms on four indicative measures.

Jiarong Wang, Jun Feng, Xia Sun, Su-Shing Chen, Bo Chen

Semi-supervised Image Classification Learning Based on Random Feature Subspace

Image classification is a well-known classical problem in multimedia content analysis. In this paper, a semi-supervised image classification framework based on random feature subspaces is presented. Firstly, color spatial distribution entropy is introduced to represent the color spatial information, and texture features are extracted using Gabor filters. Then, random subspaces are dynamically generated from the mixed feature vector as different views. Finally, three classifiers are trained on the classified images and the tri-training algorithm is applied to classify sample images. Experimental results demonstrate the effectiveness and robustness of the proposed system.

Liu Li, Zhang Huaxiang, Hu Xiaojun, Sun Feifei

An Improved Multi-label Classification Ensemble Learning Algorithm

This paper proposes an improved multi-label classification ensemble learning algorithm based on minimizing the weighted error of mistaken and missed labels. The new algorithm aims to avoid local optima by redefining the weak classifiers, and it considers label correlations under the precondition that the error decreases as the number of weak classifiers increases. Two improvements are proposed: one introduces combination coefficients when combining weak classifiers; the other smooths the weak classifiers' outputs to avoid local optima. We discuss the basis of these modifications and verify their effectiveness. The experimental results show that the improved algorithms are effective and less prone to overfitting.

Zhongliang Fu, Lili Wang, Danpu Zhang

An Improved Sparse Representation De-noising for Keeping Structural Features

Considering that current image de-noising methods may lose structural features, this paper proposes an improved sparse representation based method that adopts histogram structural similarity. When the initial over-complete dictionary is applied in the sparse decomposition, the similarity factor replaces the reconstruction error as the fidelity term. The orthogonal matching pursuit (OMP) algorithm is used to reconstruct the denoised image. The experimental results show that the proposed method provides better PSNR and HSSIM results than the wavelet transform, the K-SVD algorithm and the method presented in [10], while structural features are preserved effectively.

Zhi Cui
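
The OMP reconstruction step referred to above is a short greedy loop; a minimal sketch of standard OMP (not the paper's modified fidelity term) on a tiny dictionary:

```python
import numpy as np

def omp(D, y, k):
    """Orthogonal matching pursuit: greedily select the atom most correlated
    with the current residual, then re-fit a least-squares solution on the
    whole selected support."""
    residual, support = y.astype(float).copy(), []
    coef = np.zeros(0)
    for _ in range(k):
        support.append(int(np.argmax(np.abs(D.T @ residual))))
        A = D[:, support]
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)  # re-fit on full support
        residual = y - A @ coef
    x = np.zeros(D.shape[1])
    x[support] = coef
    return x

# Tiny dictionary: two canonical atoms plus a correlated third one.
D = np.array([[1.0, 0.0, 1.0 / np.sqrt(2)],
              [0.0, 1.0, 1.0 / np.sqrt(2)]])
y = np.array([2.0, -1.0])
x_hat = omp(D, y, 2)       # recovers the 2-sparse code on atoms 0 and 1
```

In the de-noising pipeline, `D` would be the over-complete dictionary and `y` an image patch; the denoised patch is `D @ x_hat`.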

A SVM Method Trained by Improved Particle Swarm Optimization for Image Classification

As an important classification method, SVM has been widely used in many fields, but choosing favorable SVM parameters remains a problem. To optimize the parameters and increase the accuracy of SVM, this paper proposes an improved quantum-behaved particle swarm optimization algorithm based on a mutation operator (MQPSO). The new operator enhances the global search ability of the particles. We test the MQPSO-trained SVM on image classification problems; the results show that our algorithm is quite stable and achieves higher accuracy.

Qifeng Qian, Hao Gao, Baoyun Wang

Sparsity Based Feature Extraction for Kernel Minimum Squared Error

Kernel minimum squared error (KMSE) is well known for its effectiveness and simplicity, yet it suffers from poor efficiency when the number of training examples is large. Besides, most previous fast algorithms based on KMSE only consider classification problems with balanced data, whereas in the real world imbalanced data are common. In this paper, we propose a sparsity-based weighted model for feature selection in KMSE. With our model, the computational burden of feature extraction is largely alleviated, and the model can also cope with the class imbalance problem. Experimental results on several benchmark datasets indicate the effectiveness and efficiency of our method.

Jiang Jiang, Xi Chen, Haitao Gan, Nong Sang

Section III: Computer Vision

Saliency Detection Based on Spread Pattern and Manifold Ranking

In this paper, we propose a novel approach to detect visual saliency based on spread pattern and manifold ranking. We first construct a close-loop graph model with image superpixels as nodes. The saliency of each node is defined by its relevance to given queries according to the graph-based manifold ranking technique. Unlike existing methods, which choose a few background and foreground queries in a two-stage scheme, we propose to treat each node as a potential foreground query by assigning to it an initial ranking score based on its spread pattern property. The new concept of spread pattern represents how the ranking score of one node is propagated to the whole graph. An accurate query map is generated accordingly, which is then used to produce the final saliency map with manifold ranking. Our method is computationally efficient and outperforms the state-of-the-art methods.

Yan Huang, Keren Fu, Lixiu Yao, Qiang Wu, Jie Yang
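
Graph-based manifold ranking has a convenient closed form, which is the computational core shared by such saliency methods. The sketch below shows only that core (the superpixel graph and spread-pattern weighting are omitted; a four-node chain stands in for the graph, and `alpha` is the usual propagation parameter):

```python
import numpy as np

def manifold_rank(W, y, alpha=0.5):
    """Closed-form graph-based manifold ranking.

    With the symmetrically normalized affinity S = D^{-1/2} W D^{-1/2},
    the optimal ranking scores are f* = (I - alpha * S)^{-1} y, where y
    marks the query nodes."""
    d = W.sum(axis=1)
    S = W / np.sqrt(np.outer(d, d))    # elementwise D^{-1/2} W D^{-1/2}
    return np.linalg.solve(np.eye(W.shape[0]) - alpha * S, y)

# Chain graph 0-1-2-3 with the query on node 0: ranking scores should
# decay with graph distance from the query.
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
f = manifold_rank(W, np.array([1.0, 0.0, 0.0, 0.0]))
```

In the saliency setting, each superpixel's score in `y` would come from its spread-pattern property rather than a single hand-picked query.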

A Structural Constraint Based Dual Camera Model

The combination of a fixed camera and a PTZ (Pan Tilt Zoom) camera is a technique for capturing high-definition target images in large-scale scenes. The challenge of a dual camera model is calculating the PTZ parameters. In this paper, a structural constraint based dual camera model is proposed, which simplifies the calculation of the PTZ parameters (pan angle, tilt angle and zoom ratio). The advantage of the proposed approach is that the model parameters are computed off-line just once, and the cameras do not require recalibration while working. Furthermore, a focusable dual camera system has been developed to track targets of interest on-line and acquire their high-definition images. The proposed approach is compared with three other typical algorithms, and the implemented dual-camera system is applied to pedestrian detection in natural scenes to obtain high-definition images. The simulation tests and real-scene experiments prove the effectiveness of the proposed approach, and the developed system achieves the desired effect.

Xinzhao Li, Yuehu Liu, Shaozhuo Zhai, Zhichao Cui

Hough Voting with Distinctive Mid-Level Parts for Object Detection

This paper presents an efficient method for object detection in natural scenes, accomplished via a generalized Hough transform of distinctive mid-level parts. These parts are more meaningful than low-level patches such as lines or corners and are able to cover the key structures of an object. We collect the initial sets of parts by k-means clustering in WHO space and train an LDA model for every cluster. The codebooks are generated by applying the trained detectors to discover parts in all positive training images and storing their spatial distributions relative to the object center. When detecting in a new image, an energy map is formed from the votes of every codebook entry and is used to predict the location of the object. Experimental results show the effectiveness of the proposed scheme.

Xiaoqin Kuang, Nong Sang, Feifei Chen, Runmin Wang, Changxin Gao

A Segmentation Based Change Detection Method for High Resolution Remote Sensing Image

This paper proposes a segmentation based change detection method for high resolution remote sensing images. Firstly, one of the multi-temporal images is segmented by a new image segmentation algorithm in which the particle swarm optimization (PSO) algorithm is adopted to obtain the optimal segmentation results. Secondly, the same segmentation mask is used to extract image regions from the other temporal image. Thirdly, spectral, shape, texture and vegetation index features are extracted from the image regions to identify the changed regions. The performance of the proposed method is assessed by comparison with four other widely used change detection methods on two data sets of multi-temporal ZiYuan-3 (ZY-3) high resolution remote sensing images. Experimental results show that accurate image regions and satisfactory change areas are acquired by the proposed method.

Lin Wu, Zhaoxiang Zhang, Yunhong Wang, Qingjie Liu

Eye Localization Based on Multi-Channel Correlation Filter Bank

Accurate eye localization plays a key role in many face analysis applications. In this paper, we propose a novel eye localization framework with a group of trained filter arrays called a multi-channel correlation filter bank (MCCFB). Each filter array in the bank suits a different face condition, so combining these filter arrays can locate eyes more precisely under variable poses, appearances and illuminations than a single filter or filter array. To demonstrate the performance of our strategy, MCCFB is compared to other eye localization methods; experimental results show the superiority of our method in detection ratio, localization accuracy and robustness.

Rui Yang, Shiming Ge, Kaixuan Xie, Shuixian Chen

Person Re-identification by Cascade-Iterative Ranking

State-of-the-art methods for person re-identification usually match the probe set against the candidates in the gallery set without a reversed matching process. Such one-way methods may not obtain the correct correspondence because they use only the limited information in the gallery set. In this study, we propose a novel bidirectional framework for person re-identification called cascade-iterative ranking. The framework consists of iterative ranking and a cascade strategy. The iterative ranking repeatedly adjusts the rank order of candidates by sorting with the bidirectional distance we propose. This distance exploits the bidirectional rank-order information hidden in both sets to obtain more robust results than one-way methods. Multiple features are then integrated by a cascade strategy, which relieves cases where the effects of different features offset each other. Experimental results on the VIPeR and ETHZ datasets verify the effectiveness of our method.

Xiangyu Wang, Feng Chen, Yu Liu
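
The core bidirectional idea above, scoring a gallery candidate by its rank in both matching directions, can be sketched from a probe-gallery distance matrix. This is a simple rank-sum illustration, not the authors' exact cascade-iterative formulation:

```python
import numpy as np

def bidirectional_rank(dist):
    """Re-rank gallery candidates using both matching directions.

    dist: (n_probe, n_gallery) distance matrix.
    Returns, per probe, gallery indices sorted by combined rank.
    """
    # forward: rank of each gallery item from the probe's point of view
    fwd = dist.argsort(1).argsort(1)
    # reverse: rank of each probe from the gallery item's point of view
    rev = dist.argsort(0).argsort(0)
    # smaller rank sum = stronger mutual match
    return (fwd + rev).argsort(1)
```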

Stereo Camera Based Real-Time Local Path-Planning for Mobile Robots

This paper presents a framework for local path-planning for a stereo-camera-equipped mobile robot with real-time local free-road detection in unknown indoor environments. The aim of the proposed framework is to produce an optimized local path using 3D point cloud data, from which a globally optimized trajectory can be generated in an unknown indoor scene by finding a series of sub-goal points from the start point. The framework combines free-road detection, a variant of the rapidly-exploring random tree for path planning, and reactive obstacle-avoidance behaviors. The free-road and obstacle information computed from the 3D point cloud is used for path-planning and quick obstacle avoidance. We make use of the precise relative position obtained by the sensor to solve the navigation problem efficiently, without building a global map. The experiments show that the proposed framework achieves satisfactory performance in local navigation and path-planning.

Huanqing Yang, Jianhua Zhang, Shenyong Chen

A Tracking Method with Structural Local Mean and Local Standard Deviation Appearance Model

Aiming at the problems of illumination variation and partial occlusion in object tracking, a structural local mean and local standard deviation appearance model is proposed. The object image is divided into blocks; in each block, the local mean and local standard deviation are calculated and composed into a feature vector. To weaken the effect of partial occlusion, an adaptive weight is assigned to each feature component. Naive Bayes theory is applied to track the object in the affine transform space. The experimental results demonstrate that the proposed tracking method performs favorably against several state-of-the-art methods.

Dawei Yang, Yang Cong, Yandong Tang, Yulian Li
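
The block-wise mean/standard-deviation descriptor described above can be sketched as follows (a plain grid split over a grayscale patch; the adaptive per-component weighting is omitted, and the function name is ours):

```python
import numpy as np

def block_mean_std(img, grid=(4, 4)):
    """Split a grayscale patch into grid blocks and stack (mean, std)
    of each block into one structural appearance feature vector."""
    h, w = img.shape
    bh, bw = h // grid[0], w // grid[1]
    feats = []
    for i in range(grid[0]):
        for j in range(grid[1]):
            block = img[i * bh:(i + 1) * bh, j * bw:(j + 1) * bw]
            feats.extend([block.mean(), block.std()])
    return np.asarray(feats)
```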

Hough-RANSAC: A Fast and Robust Method for Rejecting Mismatches

This paper proposes a novel method, Hough-RANSAC, for rejecting mismatches in image registration. Many well-known algorithms for rejecting mismatches, such as the Least Median of Squares regression algorithm (LMedS) and the Random Sample Consensus algorithm (RANSAC), perform poorly when the percentage of mismatches exceeds 50%. Compared with these two algorithms, Hough-RANSAC guarantees both runtime performance and accuracy, even when the percentage of correct matches falls well below 20%.

Hongxia Gao, Jianhe Xie, Yueming Hu, Ze Yang
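
For reference, the sample-and-score loop of the plain RANSAC baseline discussed above can be sketched on a toy line-fitting instance. Real mismatch rejection in registration would fit an affine or homography model instead of a 2-D line; this simplified version only conveys the consensus idea:

```python
import numpy as np

def ransac_line(pts, iters=200, tol=0.1, seed=0):
    """Plain RANSAC: fit y = a*x + b to 2-D points, returning the
    boolean mask of the largest consensus (inlier) set found."""
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(pts), bool)
    for _ in range(iters):
        i, j = rng.choice(len(pts), 2, replace=False)
        (x1, y1), (x2, y2) = pts[i], pts[j]
        if x1 == x2:                      # vertical sample, skip
            continue
        a = (y2 - y1) / (x2 - x1)
        b = y1 - a * x1
        # score the model by how many points it explains within tol
        inliers = np.abs(pts[:, 1] - (a * pts[:, 0] + b)) < tol
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    return best_inliers
```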

Partial Static Objects Based Scan Registration on the Campus

Scan registration plays a critical role in mapping and localization for Autonomous Ground Vehicles (AGV). This paper addresses the alignment problem by exploiting only common static objects, rather than whole point clouds or entire patches, in campus environments. In particular, we use instances of classes such as trees, street lamps and poles within the whole scene. The distinct advantage is that this cuts the number of pairwise points down to a very low level. A binary Support Vector Machine (SVM) is trained to classify the segmented patches as foreground or background according to features extracted at the object level. The Iterative Closest Point (ICP) approach is applied only to the foreground objects, given an initial guess from GPS. Experiments show that our method is real-time and robust even when the GPS signal suddenly shifts or becomes invalid in sheltered environments.

Chongyang Wei, Shuangyin Shang, Tao Wu, Hao Fu
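
The ICP alignment step described above can be sketched in 2-D: iteratively match nearest neighbours, then solve the best rigid transform in closed form via SVD (the Kabsch method). The SVM foreground filtering and the GPS initial guess are omitted here:

```python
import numpy as np

def icp_2d(src, dst, iters=20):
    """Minimal point-to-point ICP in 2-D.
    Returns (R, t) such that src @ R.T + t approximately aligns to dst."""
    src = src.copy()
    R_total, t_total = np.eye(2), np.zeros(2)
    for _ in range(iters):
        # match every src point to its nearest dst point
        d = ((src[:, None, :] - dst[None, :, :]) ** 2).sum(-1)
        matched = dst[d.argmin(1)]
        # closed-form rigid transform between matched sets (Kabsch/SVD)
        mu_s, mu_d = src.mean(0), matched.mean(0)
        H = (src - mu_s).T @ (matched - mu_d)
        U, _, Vt = np.linalg.svd(H)
        R = Vt.T @ U.T
        if np.linalg.det(R) < 0:          # guard against reflections
            Vt[-1] *= -1
            R = Vt.T @ U.T
        t = mu_d - R @ mu_s
        src = src @ R.T + t               # apply and accumulate
        R_total, t_total = R @ R_total, R @ t_total + t
    return R_total, t_total
```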

Quasi-Orthorectified Panorama Generation Based on Affine Model from Terrain UAV Images

In this paper we present a new panorama generation method based on an affine model. The images used for panorama generation are captured by an Unmanned Aerial Vehicle (UAV). We focus on terrain data, which contain few tall buildings. In our method a Best-First Affine Model is used to generate the panorama, with the affine parameters solved by a locally optimized RANSAC. The image stitching process is fully automatic. Compared with existing methods, the panorama generated by ours is quasi-orthorectified and free from visible distortions.

Yuchong Li

Shape Recognition by Combining Contour and Skeleton into a Mid-Level Representation

Contour and skeleton are two mainstream representations for shape recognition in the literature. It has been shown that the two representations convey complementary information; however, combining them in a natural way is nontrivial, as they are generally abstracted by different structures (closed string versus graph). This paper addresses the shape recognition problem by combining contour and skeleton into a mid-level shape representation. To form a mid-level representation for shape contours, a recent method named Bag of Contour Fragments (BCF) is adopted, while for the skeleton a new mid-level representation named Bag of Skeleton Paths (BSP) is proposed, formed by pooling the skeleton codes obtained by encoding the skeleton paths connecting pairs of end points in the skeleton. Finally, a compact shape feature vector is formed by concatenating BCF with BSP and fed into a linear SVM classifier to recognize the shape. Although the concatenation is simple, the SVM classifier automatically learns the weights of the contour and skeleton features to offer discriminative power. Encouraging experimental results demonstrate that the proposed shape representation is effective for shape classification and achieves state-of-the-art performance on several standard shape benchmarks.

Wei Shen, Xinggang Wang, Cong Yao, Xiang Bai
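
The final step above, concatenating the contour and skeleton feature blocks and letting a linear classifier weight them jointly, can be sketched as follows. A closed-form ridge-regularized linear classifier stands in for the paper's linear SVM, and all names are illustrative:

```python
import numpy as np

def concat_and_train(Xc, Xs, y, lam=1e-3):
    """Concatenate contour (BCF-like) and skeleton (BSP-like) feature
    blocks and fit a ridge-regularized linear classifier on labels y."""
    X = np.hstack([Xc, Xs])                      # simple concatenation
    X1 = np.hstack([X, np.ones((len(X), 1))])    # append a bias column
    # closed-form ridge solution; the weights over both blocks are
    # learned jointly, balancing contour vs. skeleton automatically
    w = np.linalg.solve(X1.T @ X1 + lam * np.eye(X1.shape[1]), X1.T @ y)
    return w

def predict(w, Xc, Xs):
    X1 = np.hstack([Xc, Xs, np.ones((len(Xc), 1))])
    return np.sign(X1 @ w)
```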

Visual Texture Perception with Feature Learning Models and Deep Architectures

Texture is an important property of images and a key component of human visual perception. In this work, based on several feature learning models and deep architectures, we study the visual texture perception problem, which helps in understanding both the impact of texture itself and the basic mechanisms of human visual systems. Through a series of psychophysical experiments, we find that 12 perceptual features are significant for describing texture images with regard to human perception. Hence, we represent each texture image with a 12-dimensional vector corresponding to the values of the 12 perceptual features. To improve the learnability of existing feature learning models, we propose a set of deep architectures to learn compact representations of the texture perceptual features. Extensive experiments on texture image classification demonstrate the effectiveness of both the feature learning models and the deep architectures. In particular, the advantage of deep architectures over existing feature learning models is shown.

Yuchen Zheng, Guoqiang Zhong, Jun Liu, Xiaoxu Cai, Junyu Dong

Objects Detection Method by Learning Lifted Wavelet Filters

A fast object detection method is proposed, based on variance-maximization learning of lifting dyadic wavelet filters. First, we derive a difference equation from two kinds of lifting high-pass components of a target image. The difference equation is an approximation of an inverse problem of an elliptic equation, which includes the free parameters of the lifting filter. Since this discrete inverse problem is ill-conditioned, the free parameters are learned using the least squares method and a regularization method. Object detection is performed by applying the learned lifting filter to a query image.

Aireti Abulikemu, Aliya Yushan, Turghunjan Abdukirim Turki, Abdurusul Osman

A Fast Straight-Line Growing Algorithm for Sheet-Counting with Stacked-Paper Images

The measurement of stacked-sheet quantity is an essential step in packaging and printing production, and its counting accuracy has a direct impact on the economic efficiency of related companies. With its non-contact, non-destructive and real-time measurement merits, the machine vision method has been widely applied to quality control for high-end printing products. In this paper, we aim to circumvent the fringe detection problem in stacked-sheet images by introducing a level-line-guided line-segment growing algorithm. A high-accuracy measurement of stack quantity can then be realized through improved precision and completeness of fringe identification. Our work mainly consists of three parts: 1) a unidirectional gradient operator is adopted to eliminate multiple responses on a single fringe; 2) the gradient magnitude and level-line direction are combined to improve the growth of line support regions in noisy environments; 3) to completely identify each sheet fringe, a connected-component analysis algorithm is integrated to remedy local gaps in line detection. The performance of our algorithm has been verified in experiments on large quantities of various kinds of printed paper. The long-term measurement error is shown to be less than 0.75‰, which is sufficient to meet the requirements of factory applications.

ZhenXiao Gang, Yang Shuo, Changyan Xiao

Automatic Labanotation Generation Based on Human Motion Capture Data

As a kind of dance notation, Labanotation has been adopted extensively as an analysis and recording system for performed dances. This article aims to generate Labanotation automatically from human motion capture data stored in BVH (Bio-vision Hierarchy) files. First, we convert the motion capture data into a position format. Then we analyze motions separately according to whether they are supporting motions or not. Using the obtained Primary Motion Segments, a sequence of coded descriptions of Labanotation, the Labanotation Data (LND), is built. Finally, the Labanotation is drawn and stored correctly on the basis of the LND.

Hao Guo, Zhenjiang Miao, Feiyue Zhu, Gang Zhang, Song Li

Self-organizing Map-Based Object Tracking with Saliency Map and K-Means Segmentation

In this paper, a new method is presented for long-term object tracking in surveillance videos. The developed method combines surrounding image sampling, a saliency map, a self-organizing map neural network, k-means segmentation and a similarity measurement. The saliency map provides valuable information to reduce over-segmentation. The surrounding image sampling always extracts regions close to the centroid of the most recently tracked target. The self-organizing map quantizes the image samples into a topological space; it compresses information while preserving the most important topological and metric relationships of the primary features. The k-means algorithm then generates a segmentation based on the output of the self-organizing map. According to the segmentation results of the new frame and the first frame, a similarity measurement is used to find the image sample most similar to the object specified in the first frame, and thus the object position in the new frame. We apply the developed method to track objects in real-world surveillance videos; computer simulations indicate that the proposed approach produces better results than a direct approach.

Yuanping Zhang, Yuanyan Tang, Bin Fang, Zhaowei Shang, C. Y. Suen
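
The self-organizing map quantization step described above can be illustrated with a tiny 1-D SOM: each training sample pulls its best-matching node and, more weakly, that node's topological neighbours. This is a generic textbook SOM, not the authors' exact network:

```python
import numpy as np

def train_som(X, n_nodes=4, iters=200, lr=0.5, sigma=1.0, seed=0):
    """Tiny 1-D self-organizing map: quantize feature samples X into a
    topologically ordered set of node weight vectors."""
    rng = np.random.default_rng(seed)
    W = X[rng.choice(len(X), n_nodes, replace=False)].astype(float)
    idx = np.arange(n_nodes)
    for t in range(iters):
        x = X[rng.integers(len(X))]
        bmu = np.argmin(((W - x) ** 2).sum(1))      # best-matching unit
        # Gaussian neighbourhood pulls nodes near the BMU toward x
        h = np.exp(-((idx - bmu) ** 2) / (2 * sigma ** 2))
        W += lr * (1 - t / iters) * h[:, None] * (x - W)
    return W
```

In the tracking pipeline, the node weights `W` would then be grouped by k-means to produce the segmentation.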

Superpixel-Based Global Optimization Method for Stereo Disparity Estimation

We propose a novel superpixel-based global optimization method for stereo matching in this paper. Compared with pixel-based global optimization methods, the matching accuracy of our method is significantly improved. To improve the accuracy of the initial matching cost, we develop an adaptive matching window integrating shape and size information to build the data term. To enforce the soft constraint of planar disparity distribution, a superpixel-based plane-fitting method is introduced to obtain the initial disparity planes. We present a global optimization framework with a data term and a pixel-based smoothness term to refine the disparity results. Experimental results on the Middlebury Stereo Datasets show that our method outperforms some state-of-the-art pixel-based global optimization approaches both quantitatively and qualitatively.

Haiqiang Jin, Sheng Liu, Shaobo Zhang, Gaoxuan Ying
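
The plane-fitting step above, modelling the disparity over one superpixel as a plane d = a·x + b·y + c, can be sketched as a least-squares fit. Robust variants (e.g. RANSAC-based fitting) are omitted for brevity, and the function name is ours:

```python
import numpy as np

def fit_disparity_plane(xs, ys, ds):
    """Least-squares fit of d = a*x + b*y + c over the pixels of one
    superpixel. xs, ys: pixel coordinates; ds: initial disparities.
    Returns the plane coefficients (a, b, c)."""
    A = np.stack([xs, ys, np.ones_like(xs)], 1).astype(float)
    coeffs, *_ = np.linalg.lstsq(A, ds, rcond=None)
    return coeffs
```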

Two-Stage Saliency Detection Based on Continuous CRF and Sparse Coding

State-of-the-art saliency detection methods based on contrast priors pay little attention to region smoothness constraints. This paper proposes a two-stage saliency detection method in which a smoothness prior is explicitly incorporated into a continuous Conditional Random Field (CRF). In stage one, we construct a continuous CRF based on the sparse codes of perceptual features at all locations and minimize the energy of the CRF to obtain discrimination maps. In stage two, we train a discriminative machine and learn the saliency maps from the discrimination maps, aiming to take human attention priors into consideration. Our experiments on MSRA-1000 show that the new method compares favorably with state-of-the-art methods.

Qiyang Zhao, Weibo Li, Fan Wang, Baolin Yin

Continuous Energy Minimization Based Multi-target Tracking

This paper proposes a novel method for multi-target tracking that simultaneously takes into account continuous energy minimization and discriminative appearance models. Specifically, the observation, appearance, exclusion, dynamic, trajectory persistence and trajectory regulation models are first adopted to construct an objective function for each tracking trajectory; then the gradient descent method is used to obtain an approximate minimum of the constructed objective function at every moment, yielding the number and status of the tracked targets; finally, a continuous-energy-minimization-based intelligent extrapolation method is utilized to achieve the final continuous and smooth tracking trajectories. Experimental results on the PETS 2009/2010 benchmark and the TUD-Stadtmitte video database demonstrate the effectiveness and efficiency of the proposed scheme.

Zhe Shi, Songhao Zhu, Wei Sun, Baoyun Wang
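
The minimization engine referred to above is plain gradient descent on the trajectory energy; its skeleton is simply the following, with the energy terms and their gradients supplied by the tracker (the callable `grad` here is a stand-in for that combined gradient):

```python
import numpy as np

def gradient_descent(grad, x0, lr=0.1, iters=200):
    """Plain gradient descent: repeatedly step against the gradient of
    the (here externally supplied) energy function."""
    x = np.asarray(x0, float).copy()
    for _ in range(iters):
        x -= lr * grad(x)
    return x
```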

