
2016 | Book


Pattern Recognition

7th Chinese Conference, CCPR 2016, Chengdu, China, November 5-7, 2016, Proceedings, Part I

Edited by: Tieniu Tan, Xuelong Li, Xilin Chen, Jie Zhou, Jian Yang, Hong Cheng

Publisher: Springer Singapore

Book series: Communications in Computer and Information Science

Included in: Springer Professional "Wirtschaft+Technik", Springer Professional "Technik", Springer Professional "Wirtschaft"


About this book

The two-volume set CCIS 662 and CCIS 663 constitutes the refereed proceedings of the 7th Chinese Conference on Pattern Recognition, CCPR 2016, held in Chengdu, China, in November 2016. The 121 revised papers presented in two volumes were carefully reviewed and selected from 199 submissions. The papers are organized in topical sections on robotics; computer vision; basic theory of pattern recognition; image and video processing; speech and language; emotion recognition.

Table of contents

Frontmatter

Robotics

Frontmatter

Constrained Spectral Clustering on Face Annotation System

Face clustering is a common feature in face annotation systems such as intelligent photo albums and photo management systems. However, unsupervised clustering algorithms perform poorly, so researchers have turned to constrained clustering algorithms that take user interactions as constraints. Most often these are pairwise constraints in the form of Must-Link or Cannot-Link, which can be easily integrated into spectral clustering algorithms. In this paper, we propose a design for a face annotation system that generates more informative constraints and makes better use of them with constrained spectral clustering. We evaluate the system on a lab dataset and a real-life dataset, and the results demonstrate the effectiveness of our method.

Jiajie Han, Jiani Hu, Weihong Deng
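
As a rough illustration of how Must-Link and Cannot-Link constraints can be folded into spectral clustering, the sketch below overwrites entries of an affinity matrix before any spectral step is run: Must-Link pairs receive the maximum affinity and Cannot-Link pairs receive zero. This is one widely used encoding and not necessarily the one used in the paper; the function name `apply_constraints` and the toy affinity values are assumptions made for illustration.

```python
def apply_constraints(affinity, must_link, cannot_link):
    """Return a copy of a symmetric affinity matrix with pairwise constraints applied."""
    constrained = [row[:] for row in affinity]  # leave the input untouched
    for i, j in must_link:
        # Must-Link: force the pair to look maximally similar.
        constrained[i][j] = constrained[j][i] = 1.0
    for i, j in cannot_link:
        # Cannot-Link: force the pair to look completely dissimilar.
        constrained[i][j] = constrained[j][i] = 0.0
    return constrained


# Toy affinities between four face images (hypothetical values).
affinity = [
    [1.0, 0.6, 0.5, 0.1],
    [0.6, 1.0, 0.4, 0.2],
    [0.5, 0.4, 1.0, 0.7],
    [0.1, 0.2, 0.7, 1.0],
]
constrained = apply_constraints(affinity, must_link=[(0, 1)], cannot_link=[(0, 2)])
```

The constrained matrix would then be passed to an ordinary spectral clustering routine in place of the original affinities.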

Axial-Decoupled Indoor Positioning Based on Location Fingerprints

Indoor positioning using location fingerprints, which are received signal strength (RSS) values from wireless access points (APs), has become a hot research topic during the last few years. Traditional fingerprinting localization methods based on pattern classification suffer from a high computational burden and require a large number of classifiers to determine the object location. To handle this problem, axial-decoupled indoor positioning based on location fingerprints is proposed in this paper. The purpose is to reduce the decision complexity while keeping localization accuracy by computing the position on the X- and Y-axes independently. First, the framework of axial-decoupled indoor positioning using location fingerprints is given. Then, the training and decision processes of the proposed axial-decoupled indoor positioning are described in detail. Finally, pattern classifiers including the least squares support vector machine (LS-SVM), support vector machine (SVM) and traditional k-nearest neighbors (K-NN) are adopted and embedded in the proposed framework. Experimental results illustrate the effectiveness of the proposed axial-decoupled positioning method.

Wei Yanhua, Zhou Yan, Wang Dongli, Wang Xianbing
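
The axial-decoupling idea can be sketched with a plain k-nearest-neighbors classifier: instead of one classifier over every grid cell, the X index and the Y index are classified separately from the same RSS vector. This is a minimal sketch under assumptions, not the paper's implementation; the function names, the two-AP fingerprints and the 2 x 2 grid are all hypothetical.

```python
from collections import Counter
import math


def knn_label(train, query, k):
    """Majority label among the k training fingerprints closest to the query."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def locate(fingerprints, query, k=3):
    """Classify the X and Y grid indices independently, then combine them."""
    x_train = [(rss, x) for rss, (x, _) in fingerprints]
    y_train = [(rss, y) for rss, (_, y) in fingerprints]
    return knn_label(x_train, query, k), knn_label(y_train, query, k)


# Hypothetical fingerprints: (RSS from two APs in dBm, (x index, y index)).
fingerprints = [
    ((-40, -40), (0, 0)), ((-42, -41), (0, 0)),
    ((-70, -40), (1, 0)), ((-68, -42), (1, 0)),
    ((-40, -70), (0, 1)), ((-41, -69), (0, 1)),
    ((-70, -70), (1, 1)), ((-69, -71), (1, 1)),
]
position = locate(fingerprints, query=(-67, -43))
```

On an m x n grid this decoupling deals with m + n class labels rather than m * n, which is one way to read the reduction in decision complexity that the abstract describes.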

AdaUK-Means: An Ensemble Boosting Clustering Algorithm on Uncertain Objects

This paper considers the problem of clustering uncertain objects whose locations are uncertain and described by probability density functions (pdf). Though K-means has been extended to UK-means for handling uncertain data, most existing works focus only on improving the efficiency of UK-means, and the clustering quality of UK-means is rarely considered. Existing works also assume that all objects have the same weight. However, objects that are far from their cluster representatives should not have the same weights as objects that are close to them. Thus, we propose AdaUK-means, which groups uncertain objects while taking their weights into account. In AdaUK-means, the weights of objects are adjusted with Adaboost based on the correlation between objects. If a pair of objects is must-link but grouped into different clusters, the weights of the objects are increased. In our ensemble model, AdaUK-means is run several times, and the objects are then assigned by a voting process. Finally, extensive experiments demonstrate that AdaUK-means performs better than UK-means on both synthetic and real data sets.

Lei Xu, Qinghua Hu, Xisheng Zhang, Yanshuo Chen, Changrui Liao

A Vehicle Trajectory Analysis Approach Based on the Rigid Constraints of Object in 3-D Space

A reliable and effective trajectory similarity metric is one of the key factors in vehicle trajectory clustering. This paper proposes a trajectory clustering algorithm based on the rigid constraints of vehicles in 3-D space, which clusters vehicle trajectories effectively and precisely using a new 3-D trajectory similarity metric. Based on two key procedures, camera calibration and the reconstruction of 2-D trajectories in 3-D space, and using the kinematic properties of a moving rigid vehicle body, a useful principle is found: the heights of the trajectories are linearly related. A further useful observation is that, when candidate heights are enumerated, the heights of two trajectories with a displacement difference form a planar surface in 3-D space. The experimental results show that the trajectories are very stable and reliable for clustering and event detection when their relative positions are reconstructed in the 3-D world coordinate system.

Wen Jiang, Zhang Zhaoyang, Song Huansheng, Pang Fenglan

Robust Features of Finger Regions Based Hand Gesture Recognition Using Kinect Sensor

Thanks to the emergence of commercial depth cameras, e.g., Kinect, hand gesture recognition has attracted great attention in recent years. In this context, we present a novel Kinect-based hand gesture recognition system which focuses on the features of finger regions. A hand cropping approach is proposed to extract the useful finger regions from a noisy hand image, obtained by Kinect, that includes the palm, wrist and arm. Furthermore, an original dissimilarity metric, called Balanced Finger Earth Movers Distance (BFEMD), is used to classify hand gestures along with a hierarchical recognition strategy. Finally, recognition experiments on 12 popular gestures illustrate the effectiveness of the proposed system, and the results show that it achieves high recognition accuracy at high speed.

Fengyan Wang, Zengfu Wang

Circular Object Detection in Polar Coordinates for 2D LIDAR Data

This paper presents a new circular object detection method for two-dimensional (2D) lidar data, based on geometric properties and polynomial fitting in polar coordinates rather than in Cartesian coordinates. The algorithm has three steps. First, a simple and fast segmentation method is proposed. Then, according to the properties of a circle, five robust and effective features are defined for each segment in the natural lidar coordinates. Finally, these features are normalized and fed into a Support Vector Machine (SVM) to detect the target circular object. Three videos containing 1330 frames of data are manually labeled and used to test the performance of the proposed algorithm. The best accuracy is 99.79 % and the execution time is lower than 16.93 ms. Experimental results demonstrate that circular objects can be detected efficiently and accurately by the proposed method.

Xianen Zhou, Yaonan Wang, Qing Zhu, Zhiqiang Miao

Intensity Estimation of the Real-World Facial Expression

Affective computing, or automatic affect sensing, has aroused extensive interest among researchers in machine learning and pattern recognition. Most previous research has focused on face detection and emotion recognition, while our research explores facial expression intensity estimation, which is more concerned with the dynamic changes on a face. The CK+ database and the Real-world Affective Face Database (RAF-DB) are used to implement and test the algorithms in this paper. To address intensity estimation, classification and ranking algorithms are used for training and testing intensity levels. The performance of five different feature representations is evaluated using the accuracy obtained from the classification approach. The best feature representation is then used as input to the subsequent ranking model. Learning to Rank techniques from information retrieval are used to handle intensity ranking: RankSVM and RankBoost serve as frameworks to estimate ranking scores based on sequences of images. The scoring results are evaluated with metrics used in information retrieval. The algorithms used in the research are organized and compared to obtain an optimal model for the ranking task.

Yan Gao, Shan Li, Weihong Deng
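
The ranking formulation can be illustrated with the standard pairwise reduction behind RankSVM: every ordered pair of frames in a sequence becomes a difference vector with a +1 or -1 label, and a linear classifier trained on those differences yields a scoring function. The sketch below assumes that later frames in a sequence show higher expression intensity; that assumption, the function name and the two-dimensional toy features are illustrative only and are not taken from the paper.

```python
def pairwise_differences(sequence):
    """Turn an intensity-ordered sequence of feature vectors into labeled difference vectors."""
    pairs = []
    for i in range(len(sequence)):
        for j in range(i + 1, len(sequence)):
            # Frame j is assumed to be more intense than frame i.
            diff = [b - a for a, b in zip(sequence[i], sequence[j])]
            pairs.append((diff, 1))
            pairs.append(([-d for d in diff], -1))  # mirrored pair keeps classes balanced
    return pairs


# Three frames of one sequence with hypothetical 2-D features.
frames = [[0.0, 1.0], [0.5, 1.5], [2.0, 1.0]]
pairs = pairwise_differences(frames)
```

Any binary linear classifier fitted to `pairs` gives a weight vector whose dot product with a frame's features can serve as its ranking score.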

An Emotional Text-Driven 3D Visual Pronunciation System for Mandarin Chinese

This paper proposes an emotional text-driven 3D visual pronunciation system for Mandarin Chinese. Firstly, based on an articulatory speech corpus collected by Electro-Magnetic Articulography (EMA), the articulatory features are trained with a Hidden Markov model (HMM), and fully context-dependent modeling is taken into account by making full use of the rich linguistic features. Secondly, considering that emotion is more remarkably adjusted in the articulatory domain owing to the independence in the manipulation of articulators, the differences between articulatory movements in different emotions are investigated. Thirdly, emotional speech is generated by adjusting speech parameters such as fundamental frequency (F0), duration and intensity, based on Praat. Then, while the generated emotional speech is played, the corresponding articulatory movements are simultaneously synthesized by the HMM prediction rules and used to drive the head mesh model along with the emotional speech. The experiments demonstrate that the system can synthesize accurate animation of the articulators, synchronized with emotional speech, at the phoneme level.

Lingyun Yu, Changwei Luo, Jun Yu

Interactive Banknotes Recognition for the Visual Impaired With Wearable Assistive Devices

In this paper, we develop a new system, named WVIAS (Wearable Vision Impaired Assistive System), that uses camera-based computer vision technology to recognize banknotes in natural scenes, with the aim of helping visually impaired people. WVIAS is made up of two main parts. At the front, a micro camera, set on the glasses or mounted on a helmet, acquires the video sequence. At the back, a high-performance portable computer runs the processing algorithm. To make the system robust to a variety of conditions during recognition, including occlusion, rotation, scaling, cluttered background, illumination change, viewpoint variation, and worn or wrinkled banknotes, we propose a method that uses finger pointing as the HCI to indicate the potential target area, which we call the region of interest (ROI). We can then sharply reduce the processing time by using the ROI in place of the original image, combined with the effective ORB feature. The HCI-based framework is effective in collecting more class-specific information and robust in dealing with partial occlusion and viewpoint changes. To verify the robustness and generalizability of the proposed approach, we have collected a large dataset of banknotes in natural scenes. The proposed algorithm improves the mean average precision from 20.3 % to 61.6 %. The experimental results show the effectiveness of our proposal on both the static natural-scene dataset and the dynamic video sequences.

Dian Huang, Hong Cheng, Lu Yang

Spontaneous Smile Recognition for Interest Detection

“Interest” is a critical bridge between cognitive and affective issues in learning. Students' interest has a great impact on learning performance. Hence, it is necessary to detect students' interest and make them more engaged in the learning process for productive learning. Students' interest can be detected based on facial expression recognition, e.g., smile recognition. However, various head poses, different illumination, occlusion and low image resolution make smile recognition difficult. In this paper, a conditional random forest based approach is proposed to recognize spontaneous smiles in natural environments. First, image patches are extracted within the eye and mouth regions instead of the whole face to improve robustness and efficiency. Then, the conditional random forest based approach is presented to learn the relations between image patches and the smile/non-smile features conditional on head poses. Furthermore, a K-means based voting method is introduced to improve the discrimination capability of the approach. Experiments have been carried out with different spontaneous facial expression databases. The encouraging results suggest a strong potential for interest detection in natural environments.

Zhenzhen Luo, Leyuan Liu, Jingying Chen, Yuanyuan Liu, Zhiming Su

Road Extraction Based on Direction Consistency Segmentation

A common strategy for road extraction from remote sensing images is classification based on spectral information. However, because different objects often have similar spectral characteristics, classification results usually contain many interference regions which do not correspond to any road entity. To solve this problem, a road extraction method based on direction consistency segmentation is proposed in this paper. Considering that road regions in binary road classification images usually have consistent local directions, pixels with similar main directions are merged into objects. After acquiring these objects, geometric measurements such as LFI (Linear Feature Index) and region area are calculated, and a segment-linking algorithm is used to recognize and extract road objects among them. Various test images are used to verify the effectiveness of this method, and comparison experiments are performed between the proposed binary image processing method and two existing methods. Experimental results show that this method has advantages in accuracy, computational efficiency and stability, and that it can be used to extract road regions in remote sensing images at different resolutions.

Lei Ding, Qimiao Yang, Jun Lu, Junfeng Xu, Jintao Yu

Fingertip in the Eye: An Attention-Based Method for Real-Time Hand Tracking and Fingertip Detection in Egocentric Videos

Hand and fingertip tracking is a crucial part of egocentric vision interaction, and it remains a challenging problem due to factors such as dynamic environments and hand deformation. We propose a convolutional neural network (CNN) based method for real-time, accurate hand tracking and fingertip detection in RGB sequences captured by an egocentric mobile camera. Firstly, we build a large-scale dataset, Ego-Finger, containing many scenarios and human-labeled ground truth. Secondly, we propose a two-stage CNN pipeline, i.e., the human vision inspired Attention-based Hand Tracker (AHT) and the hand physical constrained Multi-Points Fingertip Detector (MFD). Compared with state-of-the-art methods, the proposed method achieves very promising results in real time.

Xiaorui Liu, Yichao Huang, Xin Zhang, Lianwen Jin

Multiple-Classifiers Based Hand Gesture Recognition

Gesture recognition technology is important in the field of human-computer interaction (HCI), but vision-based gesture recognition is sensitive to the environment. We propose a multiple-classifier based gesture recognition algorithm that recognizes ten kinds of gestures. The algorithm obtains the size and direction of the gesture from a hand tracking algorithm that is capable of segmentation and provides a rough outline of the tracked hand. Based on this information, we rotate the image to obtain an upright image of the hand gesture. Then we extract the HOG feature of the upright hand image and use multiple classifiers to classify the gesture. The algorithm achieves a better recognition rate whether the background is complex or similar in color to the skin.

Simin Li, Zihan Ni, Nong Sang

Recognition of Social Touch Gestures Using 3D Convolutional Neural Networks

This paper investigates deep learning approaches to social touch gesture recognition. Several types of neural network architectures are studied with a comprehensive experiment design. First, a recurrent neural network using long short-term memory (LSTM) is adopted for modeling the gesture sequence. However, with both handcrafted features using geometric moments and features extracted by a convolutional neural network (CNN), LSTM cannot achieve satisfactory performance. Therefore, we propose to use a 3D CNN to model a fixed-length touch gesture sequence. Experimental results show that the 3D CNN approach achieves a recognition accuracy of 76.1 % on the human-animal affective robot touch (HAART) database in the Recognition of Social Touch Gestures Challenge 2015, which significantly outperforms the best submitted system of the challenge, with a recognition accuracy of 70.9 %.

Nan Zhou, Jun Du

Object Property Identification Using Uncertain Robot Manipulator

This paper presents a learning control algorithm for identifying object properties with an uncertain robot manipulator. On the one hand, for a robot system with unknown (or immeasurable) parameters, the manipulator's dynamic properties are uncertain, so we use an adaptive parameter learning law to estimate the practically unknown dynamics. On the other hand, a reference model is specified to be followed. In order to identify the geometry and elasticity of the interacting object, the reference point and feedforward force in the reference model are adapted in each trial. Because the updating of the reference model uses the estimated parameters, the learning law is designed to guarantee the convergence of parameter estimation in finite time (FT). Simulation studies demonstrate the effectiveness of our proposed method.

Kunxia Huang, Chenguang Yang, Hong Cheng

Computer Vision

Frontmatter

Pose-Invariant Face Recognition Based on a Flexible Camera Calibration

In this paper, we present a flexible camera calibration for pose normalization to achieve pose-invariant face recognition. The accuracy of calibration is easily affected by landmark detection errors and by the varying shapes of different faces and expressions. By jointly using RANSAC and unique facial characteristics, we explore a flexible calibration method that achieves more accurate camera calibration and pose normalization for face images. Our proposed method is able to eliminate noisy facial landmarks and retain the ones which best match the undeformable 3D face model. The experimental results show that our method improves the accuracy of pose-invariant face recognition, especially for faces with unsatisfactory landmark detection, variant shapes, and exaggerated expressions.

Xiaohu Shao, Cheng Cheng, Yanfei Liu, Xiangdong Zhou

Pedestrian Detection Aided by Deep Learning Attributes Task

Deep learning methods have achieved great success in pedestrian detection owing to their ability to learn discriminative features at the pixel level. However, most of the popular methods use the deep structure only as a single feature extractor (one attribute), which may confuse positive samples with hard negative samples. To address this ambiguity, this work jointly learns three different attributes: parts, deformation and similarity. This paper proposes a new deep network which jointly optimizes the three attributes and formulates them as a binary classification task. Extensive experiments show that the proposed method outperforms competing methods on the challenging Caltech and ETH benchmarks.

Chao Qiu, Yinhui Zhang, Jieqiong Wang, Zifen He

Crowd Collectiveness Measure via Path Integral Descriptor

Measuring crowd collectiveness has attracted a great deal of attention in recent years. We adopt the path integral descriptor idea to measure the collectiveness of a crowd system. A new path integral descriptor is proposed using an exponential generating function to avoid parameter setting. Several good properties of the proposed path integral descriptor are demonstrated in this paper. The proposed path integral descriptor of a set is regarded as the collectiveness measure of the set, which can be a moving system such as a human crowd, a sheep herd and so on. The self-driven particle (SDP) model and the crowd motion database are used to test the ability of the proposed method to measure collectiveness.

Wei-Ya Ren, Guo-Hui Li, Yun-Xiang Ling

Robust Face Frontalization in Unconstrained Images

The goal of face frontalization is to recover frontal-facing views of faces appearing in single unconstrained images. Previous works mainly focus on how to obtain the frontal-facing views effectively, but they ignore the influence of occlusion in the face images. To overcome this problem, this paper presents a novel but simple scheme for robust face frontalization with only a single 3D model. We employ the same scheme as T. Hassner's work to render the non-frontal view as a frontal view and to estimate the invisible (self-occluded) region. Subsequently, for occlusion detection, we compute the differences between the local patches around each fixed facial feature point in the average face (male or female average face) and in the test image. Finally, we combine the proposed local face symmetry strategy with Poisson image editing to fill the invisible and occluded regions. Experimental results demonstrate the advantages of the proposed method over the previous work.

Yuhan Zhang, Jianjun Qian, Jian Yang

Research on the Stability of Plantar Pressure Under Normal Walking Condition

Plantar pressure has some application to identity recognition in the pattern recognition field. Plantar pressure identification is scientifically sound only if each person's plantar pressure is stable and individual, but the stability of plantar pressure has not been verified. In public security practice, footprints can be used for personal identification, but there is no underlying scientific principle to serve as theoretical support. The collected data are processed by computer program to show that the plantar pressure data of a normal person are basically stable. MATLAB is used to process the peak data tables of dynamic plantar pressure acquired by the FOOTSCAN plantar pressure measurement and analysis system, and a statistical correlation algorithm is applied to analyze the stability of plantar pressure quantitatively. The variance of each person's plantar pressure feature points is basically stable. Plantar pressure data are fairly stable under normal walking conditions, so they can be used for further study and application.

Ding Han, Tang Yunqi, Guo Wei

Convolutional Neural Networks with Neural Cascade Classifier for Pedestrian Detection

The combination of traditional methods (e.g., ACF) and Convolutional Neural Networks (CNNs) has achieved great success in pedestrian detection. Despite its effectiveness, the design of this approach is intricate. In this paper, we present an end-to-end network based on Faster R-CNN and a neural cascade classifier for pedestrian detection. Unlike Faster R-CNN, which only makes use of the last convolutional layer, we use features from multiple layers and feed them to a neural cascade classifier. Such an architecture favors more low-level features and implements a hard negative mining process in the network. Both of these factors are important in pedestrian detection. The neural cascade classifier is jointly trained with the Faster R-CNN in our unified network. The proposed network achieves performance comparable to the state of the art on the Caltech pedestrian dataset with a more concise framework and faster processing speed. Meanwhile, the detection results obtained by our method are tighter and more accurate.

Bei Tong, Bin Fan, Fuchao Wu

Adaptive Multi-Metric Fusion for Person Re-identification

Person re-identification, which aims at recognizing a person of interest across spatially disjoint camera views, is still a challenging task. Plenty of approaches have emerged in recent years, and some of them achieve good matching results. Given a probe image, we observe that the ranking results generated by different approaches differ from each other. Considering that these conventional methods are all reasonable, we propose an Adaptive Multi-Metric Fusion (AMMF) method which fuses the existing ranking results with query-specific weights. Experiments on two challenging databases, VIPeR and ETHZ, demonstrate that the proposed method achieves a further performance improvement.

Penglin Li, Mengxue Liu, Yun Gu, Lixiu Yao, Jie Yang

Face Detection Using Hierarchical Fully Convolutional Networks

Face detection in unconstrained environments is a challenging problem. Recent studies show that deep convolutional networks (DCNs) have achieved outstanding performance on this task, but most of them have multiple stages (e.g., region proposal, classification), which are complex and time-consuming in practice. In this paper, we propose a fully convolutional network (FCN) framework which can be trained straightforwardly in an end-to-end manner. In our network, hierarchical feature layers with different resolutions are used to detect faces at different scales. For each hierarchical layer, a specific set of default boxes with different aspect ratios and scales is associated with each map cell. At prediction time, the network generates confidence scores for the default boxes and produces offsets to the default boxes to obtain better bounding boxes of faces. The predictions of each hierarchical layer are combined into the final detection result. Experimental results on the AFW and FDDB datasets confirm the effectiveness of our method.

Jiang-Jing Lv, You-Ji Feng, Xiang-Dong Zhou, Xi Zhou
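
To make the notion of default boxes concrete, the sketch below lays out one set of boxes per cell of a square feature map, in the style popularized by single-shot detectors. It is an assumption-laden illustration rather than the paper's configuration: the function name, the map size, the scale and the aspect ratios are all hypothetical, and box width and height are derived from scale and ratio so that every box at one layer has the same area.

```python
import math


def default_boxes(map_size, scale, aspect_ratios):
    """Boxes (cx, cy, w, h) in normalized [0, 1] coordinates, one set per map cell."""
    boxes = []
    for row in range(map_size):
        for col in range(map_size):
            # Center each box set on the middle of its feature-map cell.
            cx = (col + 0.5) / map_size
            cy = (row + 0.5) / map_size
            for ratio in aspect_ratios:
                w = scale * math.sqrt(ratio)  # ratio = w / h
                h = scale / math.sqrt(ratio)
                boxes.append((cx, cy, w, h))
    return boxes


# A hypothetical 4 x 4 layer with two aspect ratios per cell.
boxes = default_boxes(map_size=4, scale=0.2, aspect_ratios=[1.0, 0.5])
```

A coarser layer would use a smaller `map_size` with a larger `scale`, which is how layers of different resolution can cover faces of different sizes; the network then predicts a confidence score and an offset for each of these boxes.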

Depth Supporting Semantic Segmentation via Deep Neural Markov Random Field

Semantic segmentation is of great importance to various vision applications. Depth information plays an important role in the human visual system in helping people obtain meaningful segmentation results, but it is not well considered by most existing segmentation methods. In this paper, we address the problem of semantic segmentation by incorporating depth information via a deep neural Markov Random Field. In our method, the color image and its corresponding depth map are first fed to a convolutional neural network. Then, deconvolution is performed on the network output to obtain the pixelwise prediction in terms of the probability of labels assigned to pixels. Finally, the dense prediction is used to design the unary and pairwise terms, which are determined by pixel coordinates, color and depth. Experiments are conducted on several public datasets to illustrate the effectiveness of the proposed method. On the PASCAL VOC 2011 test dataset, experimental results show that our method obtains accurate results when compared with the ground truth. On the PASCAL VOC 2012 dataset and the NYUDv2 dataset, the proposed method obtains competitive results.

Wen Su, Zengfu Wang

Omega-Shape Feature Learning for Robust Human Detection

In most top-view surveillance scenes, people's head-shoulder Omega-like shapes have proven to be good cues for human detection because they show little pose variation and are robust to partial occlusion. In this paper, we focus on learning Omega-shape features with improved discriminative ability in human detection. Orthogonal non-negative matrix factorization (ONMF) is introduced to model the local semantic parts of Omega-like shapes, which is much more robust to noise corruption and local occlusion. Because ONMF allows only additive, not subtractive, combinations, it can suppress some background clutter well. Furthermore, we introduce metric learning into the SVM decision framework, where object/non-object classification is performed within a learned feasible metric space. Experimental results on a number of challenging datasets demonstrate the effectiveness and robustness of the proposed human detection method.

Pengfei Liu, Xue Zhou, Shibin Cai

Video Object Detection and Segmentation Based on Proposal Boxes

In this paper, we propose a new method to detect and segment foreground objects in video automatically. Given a video sequence, our method begins by generating proposal bounding boxes in each frame, according to both static and motion cues. The boxes are used to detect the primary object in the sequence. We score each box by its likelihood of containing a foreground object, connect boxes in adjacent frames and calculate the similarity between them. A layered Directed Acyclic Graph is constructed to select the object box in each frame. With the help of the object boxes, we model the motion and appearance of the object. Motion cues and appearance cues are combined in an energy minimization framework to obtain a coherent foreground object segmentation over the whole video. Our method achieves results comparable with state-of-the-art works on a challenging benchmark dataset.

Xiaodi Zhang, Zhiguo Cao, Yang Xiao, Furong Zhao

Landmark Selecting on 2D Shapes for Constructing Point Distribution Model

A new method of selecting landmarks on 2D shapes represented by centripetal Catmull-Rom splines is proposed in this paper. Firstly, a mean shape is generated from the training set, and landmarks on the mean shape are extracted based on curvature and arc-length information. Then the corresponding landmarks on each shape are obtained by projecting the mean shape back to each sample using the non-rigid registration method Coherent Point Drift. Experiments show that the automatically generated landmarks are more accurate than manually annotated landmarks when used in segmentation.

Xianghu Ji, Lili Wang, Pan Tao, Zhongliang Fu

Dual Camera Based Feature for Face Spoofing Detection

This paper presents a fused feature using dual cameras for face spoofing detection. The feature takes full advantage of input image pairs in terms of texture and depth. It consists of two parts: a 2D component and a 3D component. For the former, we propose an algorithm based on image similarity to combine every pair of input images into one gray-level image, from which the 2D feature is extracted. For the latter, we describe the point cloud obtained by stereo reconstruction algorithms using the point feature histograms (PFH) method. The concatenation of the 2D and 3D features is used to represent the input image pair. Experiments on a self-collected dataset demonstrate the competitive performance and potential of the proposed feature.

Xudong Sun, Lei Huang, Changping Liu

Structured Degradation Model for Object Tracking in Non-uniform Degraded Videos

Structure information is currently a hot topic in computer vision. While many researchers have applied structure information in their methods, few have used degradation information in their algorithms. However, degradation itself contains important information. In this paper, we introduce a Structured Degradation Model with degradation assessment of the target to solve the tracking problem. To track the target in non-uniformly degraded video, autocorrelation is used to generate the direction map and Tenengrad is used to extract the degradation degree of each target part. In our Structured Degradation Model, an undirected graph of the target is generated to track the target. The nodes of the graph are the target parts and the edges are the interactions between the parts. Experimental results show that our method performs well, especially for object tracking in degraded video.

Yuan Feng, Sheng Liu, ShaoBo Zhang
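
The abstract above names Tenengrad as the measure used to extract the degradation degree of each target part. As a rough illustration of that measure alone (not of the authors' tracker), the sketch below computes a Tenengrad score as the mean squared Sobel gradient magnitude over the interior pixels of a grayscale patch. The function name `tenengrad`, the averaging, and the absence of a gradient threshold are assumptions made for this example; the paper may define the score differently.

```python
def tenengrad(image):
    """Mean squared Sobel gradient magnitude of a grayscale patch.

    `image` is a list of equal-length rows of numbers. Border pixels are
    skipped so that the 3x3 Sobel window always fits (an assumption made
    for simplicity).
    """
    height, width = len(image), len(image[0])
    if height < 3 or width < 3:
        raise ValueError("image must be at least 3x3")
    total = 0.0
    for r in range(1, height - 1):
        for c in range(1, width - 1):
            # Horizontal Sobel response: right column minus left column.
            gx = (image[r - 1][c + 1] + 2 * image[r][c + 1] + image[r + 1][c + 1]
                  - image[r - 1][c - 1] - 2 * image[r][c - 1] - image[r + 1][c - 1])
            # Vertical Sobel response: bottom row minus top row.
            gy = (image[r + 1][c - 1] + 2 * image[r + 1][c] + image[r + 1][c + 1]
                  - image[r - 1][c - 1] - 2 * image[r - 1][c] - image[r - 1][c + 1])
            total += gx * gx + gy * gy
    return total / ((height - 2) * (width - 2))


# A sharp step edge scores higher than the same edge spread over a ramp.
sharp_patch = [[0, 0, 0, 255, 255, 255] for _ in range(6)]
blurred_patch = [[0, 51, 102, 153, 204, 255] for _ in range(6)]
print(tenengrad(sharp_patch) > tenengrad(blurred_patch))
```

A lower score on a target part would indicate stronger blur, which is one plausible way to rank parts by degradation.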

An Improved Background Subtraction Method Based on ViBe

The classic ViBe method has the shortcoming that it may detect a “Ghosting” area when the initial frame contains a moving target or when a target starts moving from a stationary position. In this paper, the Ghosting phenomenon is investigated and an improved background subtraction method based on ViBe is proposed. The proposed method provides an enhanced pixel classification mechanism and background update mechanism, and it melts Ghosting areas significantly faster than the classic ViBe method. The experimental results show that the proposed method performs well in static background scenes and has a low computational cost, so it can be used in real-time supervisory control systems.

Botao He, Shaohua Yu

Novel Face Hallucination Through Patch Position Based Multiple Regressors Fusion

The task of face hallucination is to estimate a high-resolution (HR) face image from a given low-resolution (LR) one through a learning-based approach. In this paper, a novel face hallucination method based on local regression learning is proposed. The proposed framework has two phases. In the training phase, after the training samples are separated into several clusters at each face position, the Partial Least Squares (PLS) method is used to project the original space onto a uniform manifold feature space, and multiple linear regressions are learned in each cluster. In the prediction phase, once the cluster of the LR patch is determined, the corresponding learned regression function is used to estimate the HR patch. Furthermore, a multi-regressor fusion model and an HR-induced clustering strategy are proposed to further improve the reconstruction quality. Experimental results show that the proposed method achieves very competitive performance compared with other leading algorithms, with low complexity.

Changkai Jiao, Zongliang Gan, Lina Qi, Changhong Chen, Feng Liu

Gait Retrieval: A Deep Hashing Method for People Retrieval in Video

Automated surveillance systems are required for state-of-the-art security. Every day, networks of cameras generate a very large set of data, which makes recognition and identification tasks harder. In this paper, we present a new problem called Gait Retrieval in order to address the challenge of large-scale surveillance data. We are interested in retrieving similar videos based on human gait. Gait is the most important biometric for long-distance human identification. We also propose a solution to the Gait Retrieval problem using gait biometrics. The solution is based on the deep hashing technique, which learns a hash function that preserves the similarities between images with the same label. A deep hash function with a convolutional neural network learns features and maps them to hash codes. Images with similar appearance should have similar hash codes. Training samples are arranged in batches of triplets. Our proposed method outperforms traditional methods by a good margin.

Muhammad Rauf, Yongzhen Huang, Liang Wang

Background Subtraction Based on Superpixels Under Multi-scale in Complex Scenes

Background subtraction in complex scenes is a challenging problem in computer vision. Most existing algorithms analyze the variation in pixels or regions for background subtraction. Unfortunately, these works ignore the neighborhood information or similarity among pixels and do not work well in complex scenes. To solve this problem, a novel background subtraction method based on SuperPixels under Multi-Scale (SPMS) is proposed. In SPMS, the foreground consists of superpixels with a foreground or background label, which is decided by the statistics of their variation. The variation in superpixels is robust to noise and environmental changes, which endows SPMS with the ability to work in extreme environments such as adverse weather and dynamic scenes. Finally, summarizing the foregrounds under multiple scales improves the accuracy of the proposed approach. Experiments on standard benchmarks demonstrate encouraging performance of the proposed approach in comparison with several state-of-the-art algorithms.

Chenqiu Zhao, Tingting Zhang, Qianying Huang, Xiaohong Zhang, Dan Yang, Yinq Qu, Sheng Huang

Direct Discriminant Analysis Using Volterra Kernels for Face Recognition

Based on non-linear Volterra kernel mapping and direct discriminant analysis (DD-Volterra), a novel face recognition algorithm is proposed. First, the original image is segmented into specific sub-blocks, and a functional mapping is sought using truncated Volterra kernels. Next, simultaneous diagonalization is used to obtain the optimal Volterra kernel projection matrix. This matrix discards useless information that exists in the null space of the inter-class scatter and preserves discriminative information that exists in the null space of the intra-class scatter. Finally, in testing, each block of the test image is classified separately, and a voting strategy and a nearest neighbor classifier are used for classification. Experiments show that the proposed DD-Volterra method performs better because it is more effective than Volterrafaces at the facial feature extraction stage.

Guang Feng, Hengjian Li, Jiwen Dong, Jiashu Zhang

An Improved MEEM Tracker via Adaptive Binary Feature Encoding

We propose an adaptive binary feature encoding method to improve the tracking performance of the MEEM-tracker on real surveillance videos by enhancing its ability to distinguish the target object from the background. The adaptive binary feature encoding method converts the source image data into binary features by calculating online encoding parameters, such as the quantization number and quantization thresholds of each feature channel, from the current image data. The quantization number is calculated from the dissimilarity between the target region and the surrounding region, and the quantization thresholds are determined by the feature clusters of each channel using the K-Means method. Our improved MEEM-tracker (IPMEEM) restores the online encoding parameters to produce distinguishing binary feature vectors in the current training and tracking procedure. In the experiments, our tracker achieves better overall performance on a surveillance dataset of 12 newly collected and labeled sequences under challenging conditions such as “low contrast” and “low resolution”. We show that our tracker is more robust on real surveillance videos.

Yang Liu, Yuehuan Wang, Jun Wang

Real-Time Object Tracking Using Dynamic Measurement Matrix

Object tracking has attracted a lot of attention over the past decades. Features carry the primary information about an object; however, fixed feature extraction methods cause the features to lose their representativeness. In this paper, we propose a novel robust single-object tracking approach that uses a Dynamic Measurement Matrix to extract dynamic features. In particular, we employ the dynamic measurement matrix to adaptively extract features for discriminating the object from the background, so that the features are more clearly representative. In addition, our approach is a tracking-by-detection approach based on a Naive Bayes Classifier with online updating. Compared to traditional approaches, we not only use the Naive Bayes Classifier to classify samples but also exploit the nature of this classifier to weigh each compressive feature unit, and these weights are used to update the measurement matrix. The proposed approach runs in real time and is robust to pose variation, illumination change and occlusion. Furthermore, both quantitative and qualitative experimental results show that our approach has more stable and superior performance.

Jing Li, Hong Cheng, Runzhou Wang, Lu Yang

A Novel Discriminative Weighted Pooling Feature for Multi-view Face Detection

Finding discriminative features is crucial for building a high-performance object detection system, as it affects both detection speed and accuracy. In this paper, we propose a novel discriminative weighted pooling feature based on multiple channel maps for multi-view face detection. The color and shape statistics of face structure can be utilized to enhance the discriminative ability of the box filter, which is generalized from the square channel filter. The discriminative information can be obtained with LDA and the imbalance-embedding LDA method, which is superior to the baseline box filter. The experimental results on the FDDB dataset show that our proposed method has some advantages in accuracy or speed when compared with many other state-of-the-art methods.

Shiwei Shi, Jifeng Shen, Xin Zuo, Wankou Yang

Non-rigid 3D Model Retrieval Based on Weighted Bags-of-Phrases and LDA

This paper presents an improved BOP model, called weighted bags-of-phrases (W-BOP), and its application in non-rigid 3D shape retrieval. The W-BOP model uses a Gaussian weighting function and the ring-based neighborhood of the 3D points to construct the spatial arrangement model of visual words. Compared with the BOP model, it can describe the 3D model in more detail. Compared with the SS-BOW model and the BOFG model, it does not need a feature detection step and has higher computational efficiency. To further improve the retrieval performance, the LDA algorithm is used to reduce the dimension of the W-BOP descriptor. Extensive experiments have validated the effectiveness of the designed W-BOP model and the LDA-based non-rigid 3D model retrieval method.

Hui Zeng, Huijuan Wang, Siqi Li, Wei Zeng

Adaptively Weighted Structure Preserved Projections for Face Recognition

In this paper, a new algorithm named Adaptively Weighted Structure Preserved Projections (Aw-SPP) is proposed for face recognition. Firstly, the configural structure relationship of sub-images in each face image is preserved in Aw-SPP. Then, an adaptive non-negative weight vector is introduced to take different contributions of various sub-pattern sets into account, which combines the Laplacian matrices obtained by different sub-pattern sets. Simultaneously, a Laplacian penalty constraint is also incorporated to preserve the intrinsic 2D structure of each sub-image. Finally, the procedures of feature extraction and non-negative weight vector learning are integrated into a unified framework. Moreover, an efficient iterative algorithm is designed to optimize our objective function. To validate the feasibility and effectiveness of the proposed approach, extensive experiments are conducted on three face databases (Extended YaleB, CMU PIE and AR). Experimental results demonstrate that the proposed Aw-SPP outperforms some other state of the art algorithms.

Yugen Yi, Wei Zhou, Yanjiao Shi, Guoliang Luo, Jianzhong Wang

A Color Model Based Fire Flame Detection System

Fire flame detection using color information is an important problem for public security and has many applications in computer vision and other domains. Color model based methods for fire flame detection have many advantages over conventional methods, such as simplicity, feasibility and understandability. In order to improve the performance of video-based fire flame detection, we propose an effective color model based method for fire flame detection and build a corresponding fire flame detection system. First, candidate fire flame regions are detected using chromatic and dynamic measurements. Second, the fire flame regions are determined based on the area of the candidate regions. Finally, the fire flame detection system sounds an alarm when the number of successive fire frames exceeds a threshold. Experimental results show the effectiveness of our system on various fire-detection tasks in real-world environments.

Qing Lu, Jun Yu, Zengfu Wang

Robust Object Tracking Based on Collaborative Model via L2-Norm Minimization

Tracking algorithms based on sparse representation have a relatively large computational cost. We therefore propose a robust object tracking algorithm based on a sparse collaborative model that exploits both holistic templates and local representations to account for drastic appearance changes. The model is solved by $$ l_{2} $$-norm minimization in a Bayesian inference framework, which proves to be effective and efficient. During tracking, the positive and negative templates of the discriminative model, together with the coefficients of the generative model, are updated in a timely manner to provide strong adaptability and robust discrimination. In the discriminative module, we introduce an effective method to compute a confidence value that assigns more weight to the foreground than to the background. To speed up the tracking algorithm, a variance particle filter algorithm is proposed to avoid the computational load of particles with low similarity. Experiments on several challenging video sequences demonstrate that the proposed tracker is robust to challenges such as illumination change, background clutter and partial occlusion, and performs favorably against state-of-the-art algorithms.

Xiaokang Qiao, Kaile Su, Zhonglong Zheng, Huawen Liu, Xiaowei He

Online Adaptive Multiple Appearances Model for Long-Term Tracking

How to build a good appearance descriptor for the tracked target is a basic challenge for robust long-term tracking. In recent research, many tracking methods have focused on building a single online appearance model and updating it with special visual features and learning methods. However, one appearance model is not enough to describe the appearance of the target with historical information in long-term tracking tasks. In this paper, we propose an online adaptive multiple appearances model to improve performance. Building appearance model sets based on the Dirichlet Process Mixture Model (DPMM) allows different appearance representations of the tracked target to be grouped dynamically and in an unsupervised way. Despite the DPMM’s appealing properties, it is characterized by computationally intensive inference procedures, which are often based on Gibbs samplers. Gibbs samplers, however, are not suitable for tracking because of their high time cost. We propose an online Bayesian learning algorithm that reliably and efficiently learns a DPMM from scratch through sequential approximation in a streaming fashion, so that it can adapt to new tracking targets. Experiments on multiple challenging public benchmark datasets demonstrate that the proposed tracking algorithm performs 22 % better than the state of the art.

Shuo Tang, Longfei Zhang, Xiangwei Tan, Jiali Yan, Gangyi Ding

Multi-stream Deep Networks for Person to Person Violence Detection in Videos

Violence detection in videos has numerous applications, ranging from parental control and child protection to multimedia filtering and retrieval. A number of approaches have been proposed to detect vital clues for violent actions, most of which employ trajectory-based action recognition techniques. However, these methods can only model general characteristics of human actions and thus cannot capture the specific high-order information of violent actions well. Therefore, they are not suitable for detecting violence, which is typically intense and correlated with specific scenes. In this paper, we propose a novel framework, i.e., multi-stream deep convolutional neural networks, for person-to-person violence detection in videos. In addition to conventional spatial and temporal streams, we develop an acceleration stream to capture the important intense information usually involved in violent actions. Moreover, a simple and effective score-level fusion strategy is proposed to integrate multi-stream information. We demonstrate the effectiveness of our method on a typical violence dataset, and extensive experimental results show its superiority over state-of-the-art methods.

Zhihong Dong, Jie Qin, Yunhong Wang

Basic Theory of Pattern Recognition

Frontmatter

1-Norm Projection Twin Support Vector Machine

In this paper, we propose a novel feature selection method which can suppress the input features automatically. We first introduce a Tikhonov regularization term to the objective function of projection twin support vector machine (PTSVM). Then we convert it to a linear programming (LP) problem by replacing all the 2-norm terms in the objective function with 1-norm ones. Then we construct an unconstrained convex programming problem according to the exterior penalty (EP) theory. Finally, we solve the EP problems by using a fast generalized Newton algorithm. In order to improve performance, we apply a recursive algorithm to generate multiple projection axes for each class. To disclose the feasibility and effectiveness of our method, we conduct some experiments on UCI and Binary Alphadigits data sets.

Rui Yan, Qiaolin Ye, Dong Zhang, Ning Ye, Xiaoqian Li

Consistent Model Combination of Lasso via Regularization Path

It is well known that model combination can improve the prediction performance of regression models. We investigate the model combination of Lasso with the regularization path in this paper. We first define the prediction risk of the Lasso estimator and prove that the Lasso regularization path contains at least one prediction-consistent estimator. Then we establish the prediction consistency of convex combinations of Lasso estimators, which gives the mathematical justification for model combination of Lasso on the regularization path. Using the inherent piecewise linearity of the Lasso regularization path, we construct the initial candidate model set and then select the models for combination with the Occam’s Window method. Finally, we carry out the combination on the selected models using Bayesian model averaging. Theoretical analysis and experimental results suggest the feasibility of the proposed method.

Mei Wang, Yingqi Sun, Erlong Yang, Kaoping Song

Using Feature Correlation Measurement to Improve the Kernel Minimum Squared Error Algorithm

The kernel minimum squared error (KMSE) algorithm is computationally inefficient when applied to large datasets. In this paper, we propose IKMSE, an algorithm that improves the computational efficiency of KMSE by using just a part of the training set, the key nodes, since a certain linear combination of key nodes in the feature space can be used to approximate the discriminant vector. Our algorithm includes three steps. The first step is to measure the correlation between the column vectors in the kernel matrix, known as the feature correlation, using the cosine distance between them. The second step is to determine the key nodes using the following requirement: two arbitrary column vectors of the kernel matrix that correspond to the key nodes should have a small cosine distance value. In the third step, we use the key nodes to construct the KMSE model and classify the testing samples. There are usually far fewer key nodes than training samples, which is the source of the efficiency of feature extraction in our method. Experimental results show that our improved method has low computational complexity as well as high classification accuracy.

Zizhu Fan, Zuoyong Li

Robust Supervised Hashing

Hashing methods for large-scale image retrieval have attracted extensive attention. These methods can be roughly categorized as supervised and unsupervised. Unsupervised hashing methods mainly search for a projection matrix of the original data that preserves the Euclidean distance similarity, while supervised hashing methods aim to preserve the label similarity. However, most hashing methods propose a complicated objective function and search for optimized or relaxed solutions, and some methods take a long time to train a good binary code. This paper does not focus on formulating a complex solution like previous state-of-the-art methods. In contrast, to the best of our knowledge, we are the first to propose a simple objective function for supervised hashing. We also devise a novel solution that uses a maximum and equal Hamming distance code to construct the label information. This method achieves accuracy comparable to that of state-of-the-art supervised hashing methods.

Tongtong Yuan, Weihong Deng

Joint Learning of Distance Metric and Kernel Classifier via Multiple Kernel Learning

Both multiple kernel learning (MKL) and support vector metric learning (SVML) were developed to adaptively learn a kernel function from training data, and both have proved effective in many challenging applications. Many MKL formulations are based on either the max-margin or the radius-margin principle, which is consistent in spirit with the optimization principle of between/within-class distances adopted in SVML. This motivates us to investigate their connection and develop a novel model for jointly learning a distance metric and a kernel classifier. In this paper, we provide a new parameterization scheme for incorporating the squared Mahalanobis distance into the Gaussian RBF kernel, and formulate kernel learning in a GMKL framework. Moreover, radius information is incorporated as a supplement for considering the within-class distance in the feature space. We demonstrate the effectiveness of the proposed algorithm on several benchmark datasets of varying sizes and difficulties. Experimental results show that the proposed algorithm achieves classification accuracies competitive with both state-of-the-art metric learning models and representative kernel learning models.

Weiqi Zhang, Zifei Yan, Hongzhi Zhang, Wangmeng Zuo

Semi-supervised Sparse Subspace Clustering on Symmetric Positive Definite Manifolds

The covariance descriptor, which is a symmetric positive definite (SPD) matrix, has recently attracted considerable attention in computer vision. However, handling its non-linearity in semi-supervised learning is not a trivial issue. To this end, this paper proposes semi-supervised sparse subspace clustering on SPD manifolds, which considers the intrinsic geometric structure within the manifold-valued data. Experimental results on two databases show that our method provides better clustering solutions than state-of-the-art approaches, thanks to the incorporation of Riemannian geometry structure.

Ming Yin, Xiaozhao Fang, Shengli Xie

Combination of Multiple Classifier Using Feature Space Partitioning

Combining multiple classifiers has been considered as an approach to improving classification performance. The popular diversification approach named local specialization is based on simultaneously partitioning the feature space and assigning a compound classifier to each of the subspaces. This paper presents a novel feature space partitioning algorithm for the combination of multiple classifiers. The proposed method uses a pairwise measure to obtain the diversity between classifiers and selects complementary classifiers to obtain pseudo labels. Based on the pseudo labels, it splits the feature space into constituents and selects the best classifier committee from the pool of available classifiers. The partitioning and selection take place simultaneously as part of a compound optimization process aimed at maximizing system performance. Evolutionary methods are used to find the optimal solution. The experimental results show the effectiveness and efficiency of the proposed method.

Xia Yingju, Hou Cuiqin, Sun Jun

On Tightening the M-Best MAP Bounds

We consider the problem of finding the M assignments with the maximum probabilities (or equivalently, the M-best MAP assignments) on a probabilistic graphical model. The covering graph approximation method provides an upper bound on each of the true M-best MAP costs. However, the tightness of these bounds is closely related to how to split the parameters of the duplicate nodes. We propose a monotonic algorithm to tighten the M-best MAP bounds by finding the optimal splitting of these parameters. Experimental results on synthetic and real problems show that our algorithm provides much tighter bounds than those provided by uniformly splitting the parameters.

Qiang Cheng, Li Chen, Yuanjian Xing, Yuhao Yang

A Data Cleaning Method and Its Application for Earthen Site Data Monitored by WSN

This paper focuses on a data cleaning method to denoise and detect outliers in earthen site monitoring data collected with a wireless sensor network (WSN). A data cleaning method named DC_ESVS is proposed, which is based on the temporal and spatial characteristics of WSN monitoring data. Using the cubic exponential smoothing algorithm and a voting strategy, it denoises the earthen site monitoring data and detects outliers based on a decision rule. We conduct various experiments on the WSN monitoring dataset of the Xi’an Tang Hanguangmen city wall site to show the detection accuracy of the presented method. Experimental results on another dataset, the monitoring data of the Ming Great Wall in Shaanxi, also show good performance of the proposed method.

Yun Xiao, Xuanhong Wang, Xin Wang, Pengfei Xu, Xiaojiang Chen, Dingyi Fang, Baoying Liu
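
The abstract above combines cubic exponential smoothing with a voting strategy and a decision rule. As a minimal sketch of the temporal part only, the example below uses Brown's triple (cubic) exponential smoothing to forecast each reading one step ahead and flags a reading as an outlier when it deviates from the forecast by more than a fixed threshold. The function name `clean_series`, the values of `alpha` and `threshold`, and the choice to replace a flagged reading with its forecast are assumptions for illustration; the spatial voting step of DC_ESVS is not modeled here.

```python
def clean_series(readings, alpha=0.5, threshold=3.0):
    """Flag and replace outliers using one-step-ahead forecasts from
    Brown's triple exponential smoothing.

    Returns (cleaned, outlier_indices). `alpha` and `threshold` are
    illustrative values, not taken from the paper.
    """
    if not readings:
        raise ValueError("readings must not be empty")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")

    # Initialise all three smoothed series with the first reading.
    s1 = s2 = s3 = float(readings[0])
    cleaned = [float(readings[0])]
    outliers = []

    for t in range(1, len(readings)):
        # Level, trend and curvature terms of Brown's quadratic model.
        a = 3.0 * s1 - 3.0 * s2 + s3
        b = (alpha / (2.0 * (1.0 - alpha) ** 2)) * (
            (6.0 - 5.0 * alpha) * s1
            - (10.0 - 8.0 * alpha) * s2
            + (4.0 - 3.0 * alpha) * s3
        )
        c = (alpha ** 2 / (1.0 - alpha) ** 2) * (s1 - 2.0 * s2 + s3)
        forecast = a + b + 0.5 * c  # forecast for horizon m = 1

        x = float(readings[t])
        if abs(x - forecast) > threshold:
            outliers.append(t)
            x = forecast  # assumption: substitute the forecast for the outlier
        cleaned.append(x)

        # Update the three smoothed series with the (possibly replaced) value.
        s1 = alpha * x + (1.0 - alpha) * s1
        s2 = alpha * s1 + (1.0 - alpha) * s2
        s3 = alpha * s2 + (1.0 - alpha) * s3

    return cleaned, outliers


readings = [20.0] * 5 + [35.0] + [20.0] * 5
cleaned, outliers = clean_series(readings)
print(outliers)  # [5]
```

Updating the smoothing state with the replaced value rather than the raw reading keeps a single spike from contaminating the following forecasts.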

Analysis of Convergence Properties for Gath-Geva Clustering Using Jacobian Matrix

The GG algorithm is one of the most frequently used clustering algorithms. Just as for FCM and other clustering algorithms, the coincident clustering result (all membership functions are equal to 1/c, where c is the cluster number, and all cluster centers are the mass center of the data set) is a fixed point of the GG clustering algorithm. In this paper, we reveal the relation between the stable fixed points of the GG algorithm and the data sets using Jacobian matrix analysis. Unlike Hessian matrix analysis, Jacobian matrix analysis can be applied to clustering algorithms that have only update equations. We demonstrate that the coincident clustering result is not a stable fixed point of the GG clustering algorithm and discuss the relationship between the convergence rate of the clustering algorithm and the fuzziness index m. Experimental results verify the effectiveness of our theoretical results.

Chaomurilige Wang, Jian Yu, Jie Zhu

Subspace Clustering by Capped Norm

Subspace clustering, as an important clustering problem, has drawn much attention in recent years. State-of-the-art methods generally try to design an efficient model to regularize the coefficient matrix while ignoring the influence of the noise model on subspace clustering. However, real data are always contaminated by noise, and the corresponding subspace structures are likely to be corrupted. In order to solve this problem, we propose a novel subspace clustering algorithm that employs the capped $$l_1$$ norm to deal with the noise. Consequently, noise terms with large errors can be penalized by the proposed method, which makes it more robust to noise. Furthermore, the grouping effect of our method is theoretically proved, which means that highly correlated points can be grouped together. Finally, the experimental results on two real databases show that our method outperforms state-of-the-art methods.

Quanmao Lu, Xuelong Li, Yongsheng Dong, Dacheng Tao
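
The abstract above does not spell out how the capped norm enters the model. For orientation, the sketch below shows the capped $$l_1$$ norm in its commonly used elementwise form, where each absolute error is clipped at a cap so that no single large error dominates the total. The function name `capped_l1` and the cap value are assumptions for illustration, and the paper may apply the cap per sample or per column instead.

```python
def capped_l1(errors, cap):
    """Capped l1 norm: sum of min(|e|, cap) over all entries.

    This elementwise form is a common definition; it is an assumption
    that the paper uses exactly this variant.
    """
    if cap <= 0:
        raise ValueError("cap must be positive")
    return sum(min(abs(e), cap) for e in errors)


residuals = [0.1, -0.2, 50.0]             # the last entry is a gross error
plain_l1 = sum(abs(e) for e in residuals)
print(plain_l1 > capped_l1(residuals, cap=1.0))  # True: the gross error is clipped
```

With the cap in place, the gross error contributes at most `cap` to the loss, whereas under the plain $$l_1$$ norm it would contribute its full magnitude.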

Clique-Based Locally Consistent Latent Space Clustering for Community Detection

Community structure is one of the most important properties of complex networks and a key to understanding and exploring real-world networks. One popular technique for community detection is matrix-based algorithms. However, existing matrix-based community detection models, such as nonnegative matrix factorization, spectral clustering and their variants, fit the data in a Euclidean space and ignore the local consistency information, which is crucial when discovering communities. In this paper, we propose a novel latent space clustering framework for community detection, which incorporates clique-based local consistency into the original objective functions to penalize the latent space dissimilarity of the nodes within a clique. We evaluate the proposed methods on both synthetic and real-world networks, and experimental results show that our approaches significantly improve the accuracy of community detection and outperform state-of-the-art methods, especially on networks with unclear structures.

Zhuanlian Ding, Dengdi Sun, Xingyi Zhang, Bin Luo

An Improved Self-adaptive Regularization Method for Mixed Multiplicative and Additive Noise Reduction

The noise in micro focus X-ray images is complicated, with a low signal-to-noise ratio (SNR), and can be described as mixed multiplicative and additive noise. Nevertheless, self-adaptive regularization methods for smoothing such mixed noise remain scarce. Thus, this paper proposes an improved self-adaptive regularization method to reduce the mixed multiplicative and additive noise in micro focus X-ray images. A novel scheme that adaptively selects the regularization operator and regularization parameter based on local variance is presented, in which a $$ \boldsymbol{p} $$-Laplace function with self-adaptive $$ \boldsymbol{p} $$ is used as the regularization operator and the regularization parameter is designed according to a barrier function. Experimental results demonstrate that the proposed method achieves a better balance between noise reduction and edge preservation, which effectively improves the denoising quality.

Ziling Wu, Hongxia Gao, Ge Ma, Lixuan Wu

The GEPSVM Classifier Based on L1-Norm Distance Metric

The proximal support vector machine via generalized eigenvalues (GEPSVM) is an excellent classifier for binary classification problems. However, GEPSVM measures the distance from a point to the plane by the L2-norm, whose squaring operation emphasizes the role of outliers. To address this, we propose a robust and effective GEPSVM classifier based on the L1-norm distance metric, referred to as L1-GEPSVM. The optimization goal is to minimize the intra-class distance dispersion and maximize the inter-class distance dispersion simultaneously. The L1-norm distance is often considered a simple and powerful way to reduce the impact of outliers, which improves the generalization ability and flexibility of the model. In addition, we design an effective iterative algorithm to solve the L1-norm optimization problems, which is easy to implement and whose convergence to a reasonable local optimum is theoretically ensured. Thus, the classification performance of L1-GEPSVM is more robust. Finally, the feasibility and effectiveness of L1-GEPSVM are demonstrated by extensive experimental results on both UCI datasets and artificial datasets.

He Yan, Qiaolin Ye, Ying’an Liu, Tian’an Zhang

Kernel Learning with Hilbert-Schmidt Independence Criterion

Measures of statistical independence between random variables have been successfully applied in many learning tasks, such as independent component analysis, feature selection and clustering. The success is based on the fact that many existing learning tasks can be cast into problems of dependence maximization (or minimization). Motivated by this, we introduce a unifying view of kernel learning with the Hilbert-Schmidt independence criterion (HSIC) which is a kernel method for measuring the statistical dependence between random variables. The key idea is that good kernels should maximize the statistical dependence, measured by the HSIC, between the kernels and the class labels. As a special case of kernel learning, we also propose an effective Gaussian kernel optimization method for classification by maximizing the HSIC, where the spherical kernel is considered. The proposed approach is demonstrated with several popular UCI machine learning benchmark examples.

Tinghua Wang, Wei Li, Xianwen He
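
The key idea in the abstract above is that a good kernel maximizes the HSIC between the kernel and the class labels. The sketch below illustrates this with the standard biased empirical estimator, HSIC = tr(KHLH) / (n - 1)^2, where H is the centering matrix, K is a Gaussian Gram matrix on the inputs and L is a Gram matrix on the labels. The scalar inputs, the delta kernel on labels, the kernel width, and all function names are assumptions for this example rather than details taken from the paper.

```python
import math


def gaussian_gram(xs, sigma=1.0):
    """Gaussian (RBF) Gram matrix for a list of scalar inputs."""
    return [[math.exp(-((a - b) ** 2) / (2.0 * sigma ** 2)) for b in xs] for a in xs]


def label_gram(labels):
    """Delta kernel on labels: 1 if two labels match, else 0."""
    return [[1.0 if a == b else 0.0 for b in labels] for a in labels]


def center(gram):
    """Double-center a Gram matrix, i.e. compute H K H with H = I - (1/n) 1 1^T."""
    n = len(gram)
    row_means = [sum(row) / n for row in gram]
    col_means = [sum(gram[i][j] for i in range(n)) / n for j in range(n)]
    grand_mean = sum(row_means) / n
    return [[gram[i][j] - row_means[i] - col_means[j] + grand_mean
             for j in range(n)] for i in range(n)]


def hsic(k, l):
    """Biased empirical HSIC: tr(K H L H) / (n - 1)^2 for symmetric K and L."""
    n = len(k)
    kc = center(k)
    return sum(kc[i][j] * l[i][j] for i in range(n) for j in range(n)) / (n - 1) ** 2


xs = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
k = gaussian_gram(xs, sigma=1.0)
aligned = hsic(k, label_gram([0, 0, 0, 1, 1, 1]))     # labels follow the two clusters
misaligned = hsic(k, label_gram([0, 1, 0, 1, 0, 1]))  # labels cut across the clusters
print(aligned > misaligned)  # True
```

A kernel optimization method in this spirit would adjust the kernel parameter, here `sigma`, to increase the HSIC value computed against the true labels.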

Key Course Selection in Academic Warning with Sparse Regression

Many colleges and universities are paying more attention to academic warning, which warns the large number of students who have unsatisfactory academic performance. Academic warning has become a new part of teaching management regulations, but at present it lacks unified and scientific standards. This paper addresses the current setting of academic warning with two well-known methods, lasso and $$\ell _1$$-norm support vector regression with the $$\epsilon $$-insensitive loss function, which can select key courses based on the failed credits in one semester. The experiments are carried out on academic warning datasets that we collected, which contain incomplete data. We impute the missing values with the one-nearest-neighbor method. The experimental results show that sparse regression is effective for colleges and universities to remind students of key courses.

Min Yin, Xijiong Xie, Shiliang Sun

The Necessary and Sufficient Conditions for the Existence of the Optimal Solution of Trace Ratio Problems

Many dimensionality reduction problems can be formulated in a trace ratio form, i.e. $\operatorname{argmax}_{\mathbf{W}} Tr(\mathbf{W}^T \mathbf{S}_p \mathbf{W}) / Tr(\mathbf{W}^T \mathbf{S}_t \mathbf{W})$, where $\mathbf{S}_p$ and $\mathbf{S}_t$ represent the (dis)similarity between data, $\mathbf{W}$ is the projection matrix, and $Tr(\cdot)$ is the trace of a matrix. Some representative algorithms of this category include principal component analysis (PCA), linear discriminant analysis (LDA) and marginal Fisher analysis (MFA). Previous research focuses on how to solve the trace ratio problems with either (generalized) eigenvalue decomposition or iterative algorithms. In this paper, we analyze an algorithm that transforms the trace ratio problems into a series of trace difference problems, i.e. $\operatorname{argmax}_{\mathbf{W}} Tr[\mathbf{W}^T (\mathbf{S}_p - \lambda \mathbf{S}_t) \mathbf{W}]$, and propose the necessary and sufficient conditions for the existence of the optimal solution of trace ratio problems. The correctness of this theoretical result is proved. To evaluate the applied algorithm, we tested it on three face recognition applications. Experimental results demonstrate its convergence and effectiveness.

Guoqiang Zhong, Xiao Ling
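
The reduction from a trace ratio to a series of trace difference problems mentioned in the abstract above is often written as the following iteration. This is a sketch of the commonly used formulation, not the paper's own derivation or its stated conditions. It assumes, beyond what the abstract says, that $\mathbf{W}$ has $d$ orthonormal columns and that $Tr(\mathbf{W}^T \mathbf{S}_t \mathbf{W}) > 0$ for every feasible $\mathbf{W}$; the symbols $f$, $\lambda_k$, $\mathbf{W}_k$ and $d$ are introduced here for illustration only.

```latex
% Assumptions (not from the abstract): W^T W = I, and Tr(W^T S_t W) > 0.
\begin{align*}
f(\lambda) &= \max_{\mathbf{W}^T \mathbf{W} = \mathbf{I}}
              Tr[\mathbf{W}^T (\mathbf{S}_p - \lambda \mathbf{S}_t) \mathbf{W}]
  && \text{(trace difference problem for a fixed } \lambda\text{)} \\
\mathbf{W}_k &= \operatorname{argmax}_{\mathbf{W}^T \mathbf{W} = \mathbf{I}}
              Tr[\mathbf{W}^T (\mathbf{S}_p - \lambda_k \mathbf{S}_t) \mathbf{W}]
  && \text{(top } d \text{ eigenvectors of } \mathbf{S}_p - \lambda_k \mathbf{S}_t\text{)} \\
\lambda_{k+1} &= \frac{Tr(\mathbf{W}_k^T \mathbf{S}_p \mathbf{W}_k)}
                      {Tr(\mathbf{W}_k^T \mathbf{S}_t \mathbf{W}_k)}
  && \text{(update the ratio estimate)}
\end{align*}
```

Under these assumptions $f$ is non-increasing in $\lambda$, and a value $\lambda^*$ with $f(\lambda^*) = 0$ equals the maximal trace ratio, so the iteration stops once $f(\lambda_k)$ is numerically zero.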

Robust Multi-label Feature Selection with Missing Labels

With the fast development of social networks, high dimensionality is becoming an intractable problem in many machine learning and computer vision tasks. This phenomenon also exists in the field of multi-label classification. So far, many supervised or semi-supervised multi-label feature selection methods have been proposed to reduce the feature dimension of training samples. However, almost all existing feature selection works focus on multi-label learning with complete labels. In fact, labels are very expensive to obtain, and the training instances usually have an incomplete or partial set of labels (some labels are randomly missing). Very few researchers pay attention to the problem of multi-label feature selection with missing labels. In this paper, we propose a robust model to solve the above problem. We recover the missing labels with a linear regression model and select the most discriminative feature subsets simultaneously. The effective $l_{2,p}$-norm ($0 < p \le 1$) regularization is imposed on the feature selection matrix. The iterative reweighted least squares (IRLS) algorithm is used to solve the optimization problem. To verify the effectiveness of the proposed method, we conduct experiments on five benchmark datasets. Experimental results show that our method has superior performance over the state-of-the-art algorithms.

Qian Xu, Pengfei Zhu, Qinghua Hu, Changqing Zhang
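
To make the regularizer in the abstract above concrete, the sketch below computes $\|W\|_{2,p}^p = \sum_i \|w_i\|_2^p$ over the rows $w_i$ of a feature selection matrix, together with the diagonal reweighting that an IRLS scheme typically uses for this term. It is an assumption-laden illustration, not the authors' algorithm: it assumes that each row of $W$ corresponds to one feature, that features are ranked by row norm, and that the surrogate weights are $D_{ii} = (p/2)(\|w_i\|_2^2 + \epsilon)^{(p-2)/2}$ with a small smoothing constant. All function names and the example matrix are hypothetical.

```python
import math

def row_norms(W):
    """Euclidean norm of every row of W (a list of lists)."""
    return [math.sqrt(sum(v * v for v in row)) for row in W]

def l2p_norm_p(W, p):
    """Return ||W||_{2,p}^p = sum_i ||w_i||_2^p for 0 < p <= 1."""
    if not 0 < p <= 1:
        raise ValueError("p must satisfy 0 < p <= 1")
    return sum(r ** p for r in row_norms(W))

def irls_weights(W, p, eps=1e-8):
    """Diagonal IRLS weights D_ii = (p/2) * (||w_i||^2 + eps)^((p-2)/2).

    eps is a smoothing constant (an assumption) that keeps the weight finite
    for all-zero rows; rows with small norm receive large weights, which
    pushes them further toward zero in the next least-squares step.
    """
    return [(p / 2) * (r * r + eps) ** ((p - 2) / 2) for r in row_norms(W)]

def rank_features(W):
    """Feature indices sorted by decreasing row norm (largest = most relevant)."""
    norms = row_norms(W)
    return sorted(range(len(W)), key=lambda i: norms[i], reverse=True)

# Hypothetical 3-feature, 2-label coefficient matrix.
W_EXAMPLE = [[3.0, 4.0], [0.0, 0.0], [6.0, 8.0]]

if __name__ == "__main__":
    print(l2p_norm_p(W_EXAMPLE, 1.0))
    print(irls_weights(W_EXAMPLE, 0.5))
    print(rank_features(W_EXAMPLE))
```

With $p = 1$ the quantity reduces to the familiar $l_{2,1}$-norm; smaller $p$ penalizes small nonzero rows more heavily relative to large ones, which is why it tends to give sparser row selections.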

Fractional Orthogonal Fourier-Mellin Moments for Pattern Recognition

In this paper, we generalize the orthogonal Fourier-Mellin moments (OFMMs) to the fractional orthogonal Fourier-Mellin moments (FOFMMs), which are based on fractional radial polynomials. We propose a new method to construct FOFMMs by using a continuous parameter $t$ ($t > 0$). The fractional radial polynomials of FOFMMs have the same number of zeros as those of OFMMs of the same degree. However, the zeros of the FOFMM polynomials are more uniformly distributed than those of OFMMs, and the first zero is closer to the origin. A recursive method is also given to reduce computation time and improve numerical stability. Experimental results show that the proposed FOFMMs have better performance.

Huaqing Zhang, Zongmin Li, Yujie Liu

Backmatter

Title: Pattern Recognition
Edited by: Tieniu Tan, Xuelong Li, Xilin Chen, Jie Zhou, Jian Yang, Hong Cheng
Publisher: Springer Singapore
Electronic ISBN: 978-981-10-3002-4
Print ISBN: 978-981-10-3001-7
DOI: https://doi.org/10.1007/978-981-10-3002-4

