About this book

The LNCS volume 10996 constitutes the proceedings of the 13th Chinese Conference on Biometric Recognition, held in Urumchi, China, in August 2018.

The 79 full papers and 67 poster papers presented were carefully reviewed and selected from 112 submissions. The papers cover a wide range of topics, such as biometrics, speech recognition, activity recognition and understanding, online handwriting recognition, system forensics, multi-factor authentication, and graphical and visual passwords.



Hand-Based Biometrics


Dorsal Hand Vein Recognition Method Based on Multi-bit Planes Optimization

With the development of technology, improving the accuracy of dorsal hand vein recognition has become a focus of current research. To address this problem, this paper proposes a dorsal hand vein image recognition method based on multiple bit planes and a deep learning network. Using multiple bit planes exploits not only the gray-level information of the images but also the intrinsic relationships between their bit planes. In addition, bit planes carrying less information are removed according to the Euclidean distance, forming a new bit-plane sequence that improves recognition accuracy. The algorithm is tested on a real dorsal hand vein database, where the recognition accuracy exceeds 99%, demonstrating its effectiveness.

Haoxuan Li, Yiding Wang, Xiaochen Jiang
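
The bit-plane idea in the abstract above can be illustrated with a minimal sketch. It only shows how an 8-bit grayscale image splits into binary planes and how dropping planes changes the reconstruction. The toy pixel values and the function names are assumptions for illustration, and the Euclidean-distance criterion the authors use to discard planes is not reproduced, since the abstract does not specify it.

```python
def bit_planes(image, bits=8):
    """Split a grayscale image (rows of ints in [0, 2**bits)) into binary planes.

    planes[k] holds bit k of every pixel; k = 0 is the least significant bit.
    """
    return [[[(pixel >> k) & 1 for pixel in row] for row in image]
            for k in range(bits)]


def recombine(planes, keep):
    """Rebuild an image using only the bit planes whose indices are in `keep`."""
    height, width = len(planes[0]), len(planes[0][0])
    return [[sum(planes[k][r][c] << k for k in keep) for c in range(width)]
            for r in range(height)]


image = [[200, 13], [97, 255]]               # toy 2x2 image, hypothetical values
planes = bit_planes(image)
full = recombine(planes, keep=range(8))      # all planes: lossless
top4 = recombine(planes, keep=range(4, 8))   # four lowest planes dropped
```

Keeping all eight planes reproduces the image exactly, while keeping only the four most significant planes quantizes each pixel to a multiple of 16.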

A Novel Finger-Knuckle-Print Recognition Based on Batch-Normalized CNN

Traditional feature extraction methods, such as Gabor filters and competitive coding, have been widely used in finger-knuckle-print (FKP) recognition. However, these methods rely on manually designed features, which may not achieve satisfactory results on FKP images. To solve this problem, a novel batch-normalized Convolutional Neural Network (CNN) architecture with data augmentation is proposed for FKP recognition. First, a batch-normalized CNN is designed specifically for FKP recognition. Then, random histogram equalization is adopted as data augmentation for training the CNN. Meanwhile, batch normalization is adopted to avoid overfitting during network training. Extensive experiments on the PolyU FKP database show that, compared with traditional feature extraction methods, the proposed method not only extracts more discriminative features but also improves the accuracy of FKP recognition.

Yikui Zhai, He Cao, Lu Cao, Hui Ma, Junyin Gan, Junying Zeng, Vincenzo Piuri, Fabio Scotti, Wenbo Deng, Yihang Zhi, Jinxin Wang

A New Hand Shape Recognition Algorithm Based on Delaunay Triangulation

In this paper, we present a new hand shape recognition algorithm based on Delaunay triangulation. When hand shape images are collected with non-contact acquisition equipment, the degree to which the fingers are stretched may deform the finger root contour, which leads to unstable central axis and width features. We therefore propose a more robust, non-parametric algorithm for extracting the finger central axis using Delaunay triangulation. The algorithm achieves a recognition rate of 99.89% on our database, with a mean feature extraction time of 0.09 s.

Fu Liu, Shoukun Jiang, Bing Kang, Tao Hou

Finger Vein Recognition Based on Weighted Graph Structural Feature Encoding

Finger-vein recognition performance is usually sensitive to illumination and pose variation. Finding a suitable feature representation is therefore important for improving finger-vein recognition. In this paper, we propose a novel feature encoding method based on local graph structure (LGS), which improves the matching accuracy of features. To account for variations in the running direction of veins, oriented Gabor filters are first used to enhance the venous regions. Then, a symmetric cross-weighted local graph structure (SCW-LGS) is proposed to locally represent the gradient relationships among the pixels in a neighborhood of the Gabor-enhanced images. Based on SCW-LGS, a multi-orientation feature encoding method is developed for vein network feature representation. Experimental results show that the proposed approach achieves better performance than state-of-the-art approaches to finger-vein recognition.

Shuyi Li, Haigang Zhang, Guimin Jia, Jinfeng Yang

Fingerprint Pore Extraction Using Convolutional Neural Networks and Logical Operation

Sweat pores have been shown to be discriminative and have been used successfully for automatic fingerprint recognition. Extracting pores precisely is crucial for achieving high recognition accuracy. To extract pores accurately and robustly, we propose a novel coarse-to-fine detection method based on convolutional neural networks (CNNs) and logical operations. More specifically, pore candidates are first coarsely estimated using logical operations; the coarse candidates are then judged by well-trained CNN models; finally, precise pore locations are refined by logical and morphological operations. Experimental results on a public dataset show that the proposed method outperforms the other state-of-the-art methods compared.

Yuanhao Zhao, Feng Liu, Linlin Shen

Palmprint Recognition Using Siamese Network

Recently, palmprint representations that combine different descriptors with deep neural networks have achieved strong recognition performance. In this paper, we propose a novel method for end-to-end palmprint recognition using a Siamese network. In our network, two parameter-sharing VGG-16 networks extract the convolutional features of two input palmprint images, and the top network directly computes the similarity of the two palmprints from these features. The method performs well on the PolyU dataset, achieving an Equal Error Rate (EER) of 0.2819%. To test the robustness of the proposed algorithm, we collected a palmprint dataset called XJTU in a practical daily environment. On XJTU, the EER of our method is 4.559%, which highlights the potential of palmprints for personal identification systems.

Dexing Zhong, Yuan Yang, Xuefeng Du
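
Since the abstract above reports its results as an Equal Error Rate, a small worked example of that metric may help. This is a minimal sketch under stated assumptions: the score lists are invented, scores are similarities (higher means more alike), and the EER is approximated by sweeping the observed scores as thresholds rather than by interpolating the error curves. It is not the authors' evaluation code.

```python
def equal_error_rate(genuine, impostor):
    """Approximate the EER from similarity scores (higher = more similar).

    Every observed score is tried as a threshold; the operating point where
    the false accept rate and false reject rate are closest is returned.
    """
    best_gap, eer = None, None
    for t in sorted(set(genuine) | set(impostor)):
        far = sum(s >= t for s in impostor) / len(impostor)  # impostors accepted
        frr = sum(s < t for s in genuine) / len(genuine)     # genuine pairs rejected
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


genuine = [0.9, 0.8, 0.75, 0.6, 0.4]    # hypothetical same-palm scores
impostor = [0.7, 0.5, 0.3, 0.2, 0.1]    # hypothetical different-palm scores
eer = equal_error_rate(genuine, impostor)
```

With these toy scores the two error rates meet at a threshold of 0.6, where one of five impostor pairs is accepted and one of five genuine pairs is rejected.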

A Cylinder Code-Based Partial Fingerprint Matching Algorithm for Small Fingerprint Scanners

To address the difficulty of partial fingerprint matching caused by the very small fingerprint sensors on mobile terminals, this paper presents a Cylinder Code-based partial fingerprint matching algorithm. The algorithm is inspired by the Minutia Cylinder Code (MCC) structure and keeps the characteristics of the original MCC structure while reducing data redundancy. In addition, ridge points are added to the algorithm to compensate for the feature loss caused by the small size of the sensors. The proposed algorithm is tested on the FVC2002 database and compared with four well-known matching algorithms. The results show that the proposed method has excellent overall performance and can be applied to lightweight architectures, which the other algorithms cannot match.

Xiangwen Kong, Yumeng Wang, Rongsheng Wang, Changlong Jin, Hakil Kim

Optimal Parameter Selection for 3D Palmprint Acquisition System

3D palmprint recognition systems have been widely studied in recent years, and more and more 3D palmprint feature extraction and matching methods have been proposed. However, most existing image acquisition systems are based on commercial equipment, which is expensive and bulky, offers higher precision than needed, and takes a long time to generate 3D data. Moreover, those systems are not designed specifically for palmprints, and most of their parameters are not suitable for 3D palmprint acquisition. These shortcomings have seriously hindered the application of 3D palmprint identification. In this paper, we develop a new scheme for tuning the initial system parameters to balance the tradeoff among device cost, volume, and data generation time. The samples collected by our proposed device demonstrate its effectiveness and advantages. The system is easy to implement and will promote the application of 3D palmprints.

Xu Liang, Gang Wu, Yan He, Nan Luo

Gesture, Gait and Action


Residual Gating Fusion Network for Human Action Recognition

Most recent works leverage the Two-Stream framework to model spatiotemporal information for video action recognition and achieve remarkable performance. In this paper, we propose a novel convolutional architecture, called Residual Gating Fusion Network (RGFN), to improve their performance by fully exploring the spatiotemporal information in residual signals. To further exploit the local details of low-level layers, we introduce Multi-Scale Convolution Fusion (MSCF) to implement spatiotemporal fusion at multiple levels. Since RGFN is an end-to-end network, it can be trained on various kinds of video datasets and applied to other video analysis tasks. We evaluate RGFN on two standard benchmarks, UCF101 and HMDB51, and analyze the design of the convolutional network. Experimental results demonstrate the advantages of RGFN, which achieves state-of-the-art performance.

Junxuan Zhang, Haifeng Hu

Study on Human Body Action Recognition

A novel human body action recognition method based on Kinect is proposed. First, key frames are extracted from the original data using a quaternion-based key frame extraction technique. Second, a moving pose feature based on the motion information of each joint point is constructed from the skeleton information of each key frame. Combined with the key frames, online continuous action segmentation is then implemented using a boundary detection method. Finally, the feature is encoded by a Fisher vector and fed to a linear SVM classifier to complete the action recognition. Experiments on the public MSR Action3D dataset and on the dataset collected for this paper show that the proposed method achieves good recognition results.

Dong Yin, Yu-Qing Miao, Kang Qiu, An Wang

Multi-view Gait Recognition Method Based on RBF Network

Gait is an important biometric for human identification, but view variation seriously affects the accuracy of gait recognition. Existing methods for multi-view gait-based identification mainly focus on transforming the features of one view into another view, which may be unsuitable for real applications. In this paper, we propose a multi-view gait recognition method based on an RBF network that employs a single view-invariant model. First, gait features are extracted by computing the gait individual image (GII), which better captures the discriminative information for cross-view gait recognition. Then, a joint model is constructed and projected with the DLDA algorithm to obtain a projection matrix. Finally, the projected eigenvectors are classified by the RBF network. Experiments on the CASIA-B database validate the proposed method. The experimental results show that our method performs better than state-of-the-art multi-view methods.

Yaru Qiu, Yonghong Song

Video Emotion Recognition Using Local Enhanced Motion History Image and CNN-RNN Networks

This paper focuses on recognizing facial expressions in video sequences and proposes a local-with-global method based on a local enhanced motion history image (LEMHI) and CNN-RNN networks. On the one hand, the traditional motion history image method is improved by using detected facial landmarks as attention areas to boost local values in the difference image calculation, so that the actions of crucial facial units are captured effectively; the generated LEMHI is then fed into a CNN for categorization. On the other hand, a CNN-LSTM model is used as a global feature extractor and classifier for video emotion recognition. Finally, a weighted summation strategy with random search is used as the late-fusion scheme for the final prediction. Experiments on the AFEW, CK+ and MMI datasets using a subject-independent validation scheme demonstrate that the integrated framework achieves better performance than state-of-the-art methods.

Haowen Wang, Guoxiang Zhou, Min Hu, Xiaohua Wang
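
The abstract above builds on the motion history image, so a sketch of the classic update rule may be useful for orientation. It sets a pixel to a maximum value `tau` wherever the frame difference exceeds a threshold and lets the value decay by one elsewhere. The landmark-based local enhancement that defines LEMHI is not reproduced here; the frames, `tau`, and `threshold` are hypothetical values chosen for illustration.

```python
def update_mhi(mhi, prev_frame, frame, tau=3, threshold=20):
    """One update step of a classic motion history image.

    Pixels whose intensity changed by more than `threshold` are set to `tau`;
    all other pixels decay by one step, never dropping below zero.
    """
    updated = []
    for r, row in enumerate(frame):
        new_row = []
        for c, value in enumerate(row):
            moved = abs(value - prev_frame[r][c]) > threshold
            new_row.append(tau if moved else max(0, mhi[r][c] - 1))
        updated.append(new_row)
    return updated


# A bright dot moving left to right across a 1x3 "video" (toy data).
frames = [
    [[0, 0, 0]],
    [[255, 0, 0]],
    [[0, 255, 0]],
    [[0, 0, 255]],
]
mhi = [[0, 0, 0]]
for prev, cur in zip(frames, frames[1:]):
    mhi = update_mhi(mhi, prev, cur)
```

After the last frame, the pixel that stopped changing earliest holds the lowest value, so the image encodes how recently motion occurred at each location.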

A Video Surveillance System Based on Gait Recognition

Gait recognition is a biometric technology with unique advantages over conventional ones, and its wide application is yet to come. The proposed system applies gait recognition to existing video camera networks, converting them into powerful surveillance systems. It provides an efficient way of searching through accumulated videos, saving human reviewers from tedious and inefficient work. The system also allows scenes from different cameras to be processed in parallel, so that equipment at different locations can be coordinated, greatly improving the efficiency of searching for and tracing persons of interest. The system has been adopted by a policing department and has shown outstanding robustness and effectiveness.

Dexin Zhang, Haoxiang Zhang

Plantar Pressure Data Based Gait Recognition by Using Long Short-Term Memory Network

As a kind of continuous time series, plantar pressure data contains rich temporal contact information that has not been fully utilized in existing gait recognition methods. In this paper, we propose a new gait recognition method based on plantar pressure data and a Long Short-Term Memory (LSTM) network. Through normalization and dimensionality reduction, the raw pressure data is converted to feature tensors. We then feed the feature tensors to the LSTM network to perform classification. We collected data from 93 subjects of different age groups, with 10 sets of pressure data per subject. The experimental results show that our LSTM network achieves high classification accuracy and performs better than a CNN model and many traditional methods.

Xiaopeng Li, Yuqing He, Xiaodian Zhang, Qian Zhao

Improving Gait Recognition with 3D Pose Estimation

Gait has become an attractive biometric feature for human identification in recent decades. Variations in view, clothing, carried objects and other conditions remain challenges for gait recognition. Model-based methods are one possible solution. In this paper, 3D poses estimated from 2D images are used as the features for gait recognition, so that gait is described by the motion of the human body joints. In addition, the 3D pose handles view variation better than the 2D pose, which the experimental results in the paper confirm. To improve the recognition rates, LSTM and CNNs are employed to extract temporal and spatial features. Compared with other model-based methods, the proposed one achieves much better performance and is comparable with appearance-based ones. The experimental results show that the proposed 3D pose-based method has unique advantages under large view variation. It has great potential as pose estimation develops further.

Weizhi An, Rijun Liao, Shiqi Yu, Yongzhen Huang, Pong C. Yuen

Cross-Cascading Regression for Simultaneous Head Pose Estimation and Facial Landmark Detection

Head pose estimation and facial landmark localization are crucial problems with many applications. We propose a cross-cascading regression network that simultaneously performs head pose estimation and facial landmark detection by integrating the information embedded in both head poses and facial landmarks. The network consists of two sub-models, one responsible for head pose estimation and the other for facial landmark localization, and a convolutional layer (the channel unification layer) that enables the feature maps generated by the two sub-models to communicate. Specifically, we adopt an integral operation for both pose and landmark coordinate regression, using the expectation instead of the maximum value to estimate head pose and locate facial landmarks. Extensive experiments demonstrate that our approach achieves state-of-the-art performance on the challenging AFLW dataset.

Wei Zhang, Hongwen Zhang, Qi Li, Fei Liu, Zhenan Sun, Xin Li, Xinxin Wan
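
The contrast between taking the maximum and taking the expectation, mentioned in the abstract above, can be shown on a one-dimensional heatmap. The sketch below is an illustration of the general idea only, assuming a softmax normalization over raw scores; the function name and the toy scores are hypothetical, and the network's actual formulation is not given in the abstract.

```python
import math


def soft_argmax(heatmap_logits):
    """Return the expected bin index under a softmax over 1D heatmap scores."""
    peak = max(heatmap_logits)
    weights = [math.exp(v - peak) for v in heatmap_logits]  # numerically stable
    total = sum(weights)
    return sum(i * w for i, w in enumerate(weights)) / total


logits = [0.0, 2.0, 2.0, 0.0]        # two equally strong neighbouring bins
hard = logits.index(max(logits))     # arg-max snaps to the first of the two bins
soft = soft_argmax(logits)           # expectation falls between them
```

The arg-max can only return an integer bin and is not differentiable, whereas the expectation yields a sub-bin estimate (here the midpoint of the two strong bins) and can be trained end to end.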

Real Time Violence Detection Based on Deep Spatio-Temporal Features

Typical manually selected features are insufficient for reliably detecting violent actions. In this paper, we present a violence detection model based on a bi-channel convolutional neural network (CNN) and the support vector machine (SVM). The major contributions are twofold: (1) we feed the original frames and the differential images into the proposed bi-channel CNN to obtain appearance features and motion features, respectively; (2) linear SVMs are adopted to classify the features, and a label fusion approach is proposed to improve detection performance by integrating the appearance and motion information. We compare the proposed model with several state-of-the-art methods on two datasets. The results are promising, and the proposed method achieves real-time performance at 30 fps.

Qing Xia, Ping Zhang, JingJing Wang, Ming Tian, Chun Fei

Selecting the Effective Regions for Gait Recognition by Sparse Representation

In gait recognition, variations in clothing and carrying conditions can change the human body shape greatly. The gait features extracted from human body images are therefore strongly affected, and performance decreases drastically. In this paper, we propose a gait recognition method that is more robust to these variations. The main idea is to select effective regions by sparse representation: if a region can be represented by features from gait data without variations, the region is not occluded by other objects. Experimental results on a large gait dataset show that the proposed method achieves high recognition rates and even outperforms some deep learning-based methods.

Jiaqi Tan, Jiawei Wang, Shiqi Yu

A Method of Personnel Location Based on Monocular Camera in Complex Terrain

This article proposes a method based on a monocular camera for locating people in complex terrain. The coordinates of a person in 3D space are derived from the person’s image coordinates and the feature points in the model of the complex terrain. First, using the monocular camera, its internal parameters, and the image coordinate system, combined with some reference points in the three-dimensional world coordinate system, a three-dimensional point cloud of the complex terrain is obtained. The 3D model of the terrain is then built from triangles generated by the region growing method. Second, the TensorFlow object detection model is used to detect people in the video frames. The midpoint of the lower edge of the rectangle that marks a person in the image is taken as that person’s image coordinate point. The person’s 3D coordinates are then obtained by combining these image coordinates with the 3D coordinates of the feature points in the established model. In this way, people can be located in complex terrain with a monocular camera.

Yanqiong Liu, Gang Shi, Qing Cui, Yuhong Sheng, Guoqun Liu

Feature Extraction and Classification Theory


LPPNet: A Learning Network for Image Feature Extraction and Classification

PCANet is a very simple learning network for image classification. Inspired by PCANet, we propose a new learning network, referred to as LPPNet, for image feature extraction and classification. Unlike PCANet, LPPNet takes the class information and the local geometric structure of the data into account simultaneously. In LPPNet, locality preserving projections (LPP) are first employed to learn filters, and then binary hashing and block histograms are used for indexing and pooling. Experimental results on several image datasets verify the effectiveness and robustness of LPPNet for image feature extraction and classification.

Guodong Li, Haishun Du, Meihong Xiao, Sheng Wang

Design of Multimodal Biometric Information Management System Based on Commercial Systems

In recent years, biometric technology has moved beyond its establishment phase and maintains good growth momentum. With ongoing social transformation, public safety has increasingly become a focus. Biometric technology can effectively prevent infringement, obtain criminal evidence and maintain public safety. Many standards related to biometric identification in the public security area are about to be implemented, which will give biometric identification better development opportunities. However, a unimodal biometric may not meet the requirements of public security, especially for criminal cases in the civilian law enforcement environment. Unimodal biometrics have been found to show inherent drawbacks in universality and accuracy. Hence, this paper proposes the design of a multimodal biometric information management system (MBIMS) that creates a collaborative platform by acquiring biometric data from multiple commercial systems, defines the data flow API, and applies the prototype system successfully in the field of public security.

Wei-Jian Zhu, Chuan-Zhi Zhuang, Jing-Wei Liu, Ming Huang

Supervised Group Sparse Representation via Intra-class Low-Rank Constraint

Group sparse representation (GSR), which uses the group structure of training samples as prior knowledge, has attracted extensive attention. However, GSR represents a new test sample by minimizing the reconstruction error over the original input features, which may not yield discriminative reconstruction coefficients because the original features may be redundant or noisy. To obtain a more discriminative data representation, we propose a novel supervised group sparse representation via intra-class low-rank constraint (GSRILC). Instead of representing the target by the original input features, GSRILC uses compact projection features in a new subspace for data reconstruction. Concretely, GSRILC projects data sharing the same class into a new subspace and imposes a low-rank constraint on the intra-class projections, which ensures that samples within the same class have a low-rank structure. In this way, small intra-class distances and large inter-class distances can be achieved. To illustrate the effectiveness of the proposal, we conduct experiments on the Extended Yale B and CMU PIE databases, and the results show the superiority of GSRILC.

Peipei Kang, Xiaozhao Fang, Wei Zhang, Shaohua Teng, Lunke Fei, Yong Xu, Yubao Zheng

Partial Multi-view Clustering via Auto-Weighting Similarity Completion

With the development of data collection techniques, multi-view clustering (MVC) has become an emerging research direction for improving clustering performance. However, most MVC methods assume that the objects are observed in all views. As a result, existing MVC methods may not achieve satisfactory performance when some views are incomplete. In this paper, we propose a new MVC method, called partial multi-view clustering via auto-weighting similarity completion (PMVC-ASC). The major contribution lies in jointly learning the consensus similarity matrix, exploring the complementary information among multiple distinct feature sets, quantifying the contribution of each view, and splitting the similarity graph into several informative submatrices, each corresponding to one cluster. The learning process is modeled as a joint minimization problem, and the corresponding optimization algorithm is given. A series of experiments on real-world datasets demonstrates the superiority of PMVC-ASC over state-of-the-art methods.

Chen Min, Miaomiao Cheng, Jian Yu, Liping Jing

Phase Retrieval by the Inverse Power Method

Phase retrieval aims to recover signals from phaseless linear measurements. The most efficient methods for this problem are nonconvex gradient approaches, which, however, generally need a carefully constructed initial guess to ensure successful reconstruction. The inverse power method is proposed to provide a more accurate initialization. Numerical experiments illustrate that the proposed method is more accurate than other initialization methods. We further demonstrate that applying the initialization method iteratively yields an even better estimate.

Qi Luo, Hongxia Wang, Jianyun Chen
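
For readers unfamiliar with the numerical tool named in the abstract above, the following is a generic sketch of inverse power iteration, which converges to the eigenvector whose eigenvalue has the smallest magnitude. It is not the authors' initialization scheme: how the matrix is built from phaseless measurements is not described in the abstract, so a small hypothetical symmetric matrix stands in, and the helper names are assumptions.

```python
import math


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]   # augmented matrix copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def inverse_power_iteration(a, start, steps=50):
    """Approximate the eigenvector of `a` with the smallest-magnitude eigenvalue."""
    v = start[:]
    for _ in range(steps):
        v = solve(a, v)                       # applies the inverse without forming it
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]             # renormalize to unit length
    return v


a = [[2.0, 1.0], [1.0, 2.0]]                  # toy matrix with eigenvalues 1 and 3
v = inverse_power_iteration(a, [1.0, 0.0])
```

Each step solves a linear system instead of multiplying by the matrix, so the component along the smallest eigenvalue is amplified; for the toy matrix the iterate converges to the unit vector along (1, -1).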

Robust Discriminative Principal Component Analysis

Least squares regression (LSR) and principal component analysis (PCA) are two representative dimensionality reduction algorithms in the field of machine learning. In this paper, we propose a novel method to jointly learn projections from the subspaces derived from modified LSR and PCA. To implement simultaneous feature learning, we design a novel joint regression learning model by imposing two orthogonal constraints. The learned projections can therefore preserve both the minimum reconstruction error and the discriminative information in the low-dimensional subspaces. In addition, since traditional LSR and PCA are sensitive to outliers, we use the robust L2,1-norm as the metric of the loss function to improve the model’s robustness. A simple iterative algorithm is proposed to solve the resulting problem. Experiments on face databases show the promising performance of our method.

Xiangxi Xu, Zhihui Lai, Yudong Chen, Heng Kong
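
The robustness argument for the L2,1-norm in the abstract above can be made concrete with a small numeric comparison. The sketch assumes the common row-wise convention (the sum of the Euclidean norms of the rows, one row per sample); the abstract does not state which convention the paper uses, and the residual matrix is invented for illustration.

```python
import math


def l21_norm(matrix):
    """Sum of the Euclidean norms of the rows of `matrix`."""
    return sum(math.sqrt(sum(x * x for x in row)) for row in matrix)


def squared_frobenius(matrix):
    """Sum of squared entries, the loss used by ordinary least squares."""
    return sum(x * x for row in matrix for x in row)


# Per-sample residuals; the last row mimics an outlier (hypothetical values).
residual = [[3.0, 4.0], [0.0, 0.0], [30.0, 40.0]]
robust_loss = l21_norm(residual)            # 5 + 0 + 50
squared_loss = squared_frobenius(residual)  # 25 + 0 + 2500
```

The outlier row accounts for 50 of 55 under the L2,1-norm but 2500 of 2525 under the squared loss, because each sample's residual enters linearly rather than quadratically.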

Guided Learning: A New Paradigm for Multi-task Classification

A prevailing problem in many machine learning tasks is that the training and test data have different distributions (non-i.i.d.). Previous methods for solving this problem, known as Transfer Learning (TL) or Domain Adaptation (DA), are one-stage models. In this paper, we propose a new, simple but effective paradigm, Guided Learning (GL), for multi-stage progressive training. This new paradigm is motivated by the “tutor guides student” mode of human learning. Further, under the GL framework, a Guided Subspace Learning (GSL) method is proposed for domain disparity reduction, which aims to learn an optimal, invariant and discriminative subspace through the guided learning strategy. Extensive experiments on various databases show that our method outperforms many state-of-the-art TL/DA methods.

Jingru Fu, Lei Zhang, Bob Zhang, Wei Jia

An Image Fusion Algorithm Based on Modified Regional Consistency and Similarity Weighting

We propose an image fusion algorithm based on modified regional consistency and similarity weighting to fuse two strictly registered multi-focus images of the same scene. The algorithm decomposes each source image with the shift-invariant discrete wavelet transform (SIDWT) to obtain the high-frequency components and the low-frequency component. Regional energy consistency is used for high-frequency fusion. The saliency map of the multi-focus images is calculated with the spectral residual (SR) and combined with the similarity weighting method to fuse the low-frequency coefficients. The simulation results show that the improved algorithm is an effective image fusion algorithm. Visually, the fused image preserves details and reduces blur. Compared with fusion algorithms based on regional consistency and similarity weighting, its objective evaluation indicators, such as standard deviation and mutual information, are also improved.

Tingting Yang, Peiyu Fang



Discriminative Weighted Low-Rank Collaborative Representation Classifier for Robust Face Recognition

Recently, low-rank collaborative representation classification (LCRC) has been shown to perform well under controlled conditions. However, this algorithm treats every singular value in the nuclear norm equally, which limits its ability and flexibility in dealing with practical problems. Moreover, training and test samples may be corrupted by occlusion or disguise, which can reduce the face recognition rate. This paper presents a novel robust face recognition method based on discriminative weighted low-rank collaborative representation (WDLCRC). Building on LCRC, we add a structural inconsistency constraint and assign different weights to the singular values by adaptively weighting the nuclear norm. Experiments show that the recognition rate of WDLCRC on the AR and CMU PIE databases is higher than that of the SRC, CRC and LCRC algorithms.

Xielian Hou, Caikou Chen, Shengwei Zhou, Jingshan Li

Face Expression Recognition Using Gabor Features and a Novel Weber Local Descriptor

This paper presents a novel fusion approach for facial expression recognition. The novelty of this paper lies in two points. (i) Gabor wavelets are introduced for image representation; they describe the local spatial scale characteristics and orientation selectivity of image textures well, and Gabor features are robust to variations due to illumination and noise. Furthermore, we reduce the dimensionality of the Gabor feature vector to reduce computation cost and improve the discriminative power of feature extraction. (ii) The paper proposes the Multi-orientation Symmetric Local Graph Structure (MSLGS) to calculate a feature value that replaces the differential excitation of the Weber Local Descriptor (WLD) and captures more discriminative local image details. The orientation component of the original WLD is also extended with more gradient directions, so that it describes spatial structure information more precisely. Comparative experimental results show that the algorithm achieves superior performance with high accuracy.

Jucheng Yang, Meng Li, Lingchao Zhang, Shujie Han, Xiaojing Wang, Jie Wang

Face Synthesis for Eyeglass-Robust Face Recognition

In face recognition applications, eyeglasses can significantly degrade recognition accuracy. A feasible remedy is to collect large-scale face images with eyeglasses for training deep learning methods. However, it is difficult to collect images of the same identity both with and without glasses, which makes it difficult to optimize the intra-class variations caused by eyeglasses. In this paper, we propose to address this problem through virtual synthesis. High-fidelity face images with eyeglasses are synthesized from a 3D face model and 3D eyeglasses. Models based on deep learning methods are then trained on the synthesized eyeglass face dataset, achieving better performance than previous ones. Experiments on a real face database validate the effectiveness of our synthesized data for improving eyeglass face recognition performance.

Jianzhu Guo, Xiangyu Zhu, Zhen Lei, Stan Z. Li

Single Shot Attention-Based Face Detector

Although face detection has taken a big step forward with the development of anchor-based face detectors, detecting faces at different scales effectively remains an issue. To solve this problem, we present a one-stage face detector, named Single Shot Attention-Based Face Detector (AFD), which detects multi-scale faces, especially small faces, accurately and with high efficiency. Specifically, AFD consists of two inter-connected modules, namely the attention proposal module (APM) and the face detection module (FDM). The former generates the attention region and coarsely refines the anchors. The latter takes the output of the APM as input and further improves the detection results. We obtain state-of-the-art results on common face detection benchmarks, i.e. FDDB and WIDER FACE, and can run at 20 FPS on an NVIDIA Titan X (Pascal) for VGA-resolution images.

Chubin Zhuang, Shifeng Zhang, Xiangyu Zhu, Zhen Lei, Stan Z. Li

Local Directional Amplitude Feature for Illumination Normalization with Application to Face Recognition

Face recognition under varying illumination conditions has been one of the major research topics in the development of face recognition systems. In this paper, we analyze the strengths and weaknesses of different types of approaches and design an illumination-robust feature that combines directional and amplitude information as a solution to the problem. We first extract and process the direction and amplitude information of the pixel changes, and then fuse them into a comprehensive feature. We conducted experiments on the CMU-PIE and Extended Yale B databases, and all the results show the effectiveness of our approach.

Chitung Yip, Haifeng Hu, Zhihong Chen

Facial Expression Bilinear Encoding Model

Facial expressions are generated by contractions of facial muscles. The contractions lead to variations in the appearance of facial parts. It has been shown that features from different facial parts can improve the accuracy of facial expression recognition. In this paper, we propose a bilinear encoding model for facial expression recognition. Our system takes still facial expression images as input and employs bilinear convolutional networks to capture features in the appearance of facial parts. It detects crucial facial parts and extracts the appearance features simultaneously with end-to-end learning. To verify the performance of our system, we conducted experiments on two popular expression databases: CK+ and Oulu-CASIA. The experimental results show that the proposed method achieves comparable or better performance than state-of-the-art methods for facial expression recognition.

Haifeng Zhang, Wen Su, Zengfu Wang

Face Clustering Utilizing Scalable Sparse Subspace Clustering and the Image Gradient Feature Descriptor

Face clustering is an important topic in computer vision. It aims to group facial images that belong to the same person. Spectral clustering-based algorithms are often used for accurate face clustering. However, a big occlusion matrix is usually needed to deal with the noise and sparse outlying terms, which makes the sparse coding process computationally expensive. Thus spectral clustering-based algorithms are difficult to extend to large-scale datasets. In this paper, we use the image gradient feature descriptor and a scalable Sparse Subspace Clustering algorithm for large-scale, high-accuracy face clustering. With the image gradient feature descriptor, the scalable Sparse Subspace Clustering algorithm can be used on large-scale face datasets without sacrificing clustering performance. Experimental results show that our algorithm is robust to illumination and occlusion, and achieves a relatively high clustering accuracy on the Extended Yale B and AR datasets.

Mingkang Liu, Qi Li, Zhenan Sun, Qiyao Deng

Fusing Multiple Deep Features for Face Anti-spoofing

With the growing deployment of face recognition systems in recent years, face anti-spoofing has become increasingly important, due to the increasing number of spoofing attacks via printed photos or replayed videos. Motivated by the powerful representation ability of deep learning, in this paper we propose to use CNNs (Convolutional Neural Networks) to learn multiple deep features from different cues of the face images for anti-spoofing. We integrate temporal features, color-based features and patch-based local features for spoof detection. We evaluate our approach extensively on publicly available databases, namely CASIA FASD, REPLAY-MOBILE and OULU-NPU. The experimental results show that our approach can achieve much better performance than state-of-the-art methods. Specifically, it achieves an EER (Equal Error Rate) of 2.22% on CASIA FASD, an ACER (Average Classification Error Rate) of 3.2% on OULU-NPU (protocol 1) and an ACER of 0.00% on the REPLAY-MOBILE database.

Yan Tang, Xing Wang, Xi Jia, Linlin Shen
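
The Equal Error Rate quoted in the abstract above is the operating point at which the false acceptance rate and the false rejection rate coincide. The following is a minimal sketch of how such a figure can be computed from classifier scores; the function name and the convention that a higher score means "more likely genuine" are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def equal_error_rate(genuine_scores, spoof_scores):
    """Return (eer, threshold) at the threshold where FAR and FRR are closest.

    Assumed convention: a sample is accepted as genuine when score >= threshold.
    """
    genuine = np.asarray(genuine_scores, dtype=float)
    spoof = np.asarray(spoof_scores, dtype=float)
    best_gap, best_eer, best_threshold = np.inf, None, None
    # Every observed score is a candidate threshold.
    for t in np.unique(np.concatenate([genuine, spoof])):
        far = np.mean(spoof >= t)    # spoof samples wrongly accepted
        frr = np.mean(genuine < t)   # genuine samples wrongly rejected
        if abs(far - frr) < best_gap:
            best_gap = abs(far - frr)
            best_eer = (far + frr) / 2.0
            best_threshold = t
    return float(best_eer), float(best_threshold)

# Toy scores, invented for the example.
eer, threshold = equal_error_rate([0.9, 0.8, 0.7, 0.4], [0.6, 0.3, 0.2, 0.1])
print(eer, threshold)
```

With finite score sets the two error rates rarely meet exactly, so the sketch reports their average at the closest threshold; interpolating between thresholds is a common alternative.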

Sensitive Information of Deep Learning Based Face Anti-spoofing Algorithms

Face anti-spoofing based on deep learning has recently achieved good accuracy. However, a deep learning model has no explicit mathematical representation, so it is not clear how the model works effectively. In this paper, we estimate the regions of the face image to which deep learning based anti-spoofing algorithms are sensitive. We first generate adversarial examples with two different gradient-based methods. Then we analyze the distribution of the gradients and perturbations on the adversarial examples. Next we obtain the sensitive regions and evaluate their contribution to classification performance. The analysis of the sensitive regions shows that CNN-based anti-spoofing algorithms are sensitive to richly detailed regions and to illumination. These observations are helpful for designing an effective face anti-spoofing algorithm.

Yukun Ma, Lifang Wu, Meng Jian

Weighted Softmax Loss for Face Recognition via Cosine Distance

Softmax loss is commonly used to train convolutional neural networks (CNNs), but it treats all samples equally. Focal loss focuses on training hard samples and uses the predicted probability to measure whether a sample is easy or hard. In this paper, we use the cosine distance between features and their corresponding class centers as a weight and propose a weighted softmax loss (called C-Softmax). Unlike focal loss, we give greater weight to easy samples. Experimental results show that the proposed C-Softmax loss can train many well-known models such as ResNet, ResNeXt, DenseNet and Inception V3, and that it performs better than softmax loss and focal loss.

Hu Zhang, Xianliang Wang, Zhixiang He
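
The abstract above does not give the exact C-Softmax formula, so the following is only a sketch of the general idea: per-sample cross-entropy scaled by a weight derived from the cosine similarity between each feature and its class center. The mapping `0.5 * (1 + cos)` and all names are assumptions chosen for illustration; they make samples that lie close to their center (easy samples) receive the larger weight.

```python
import numpy as np

def cosine_weighted_softmax_loss(logits, features, centers, labels):
    """Cross-entropy weighted by feature-to-center cosine similarity.

    The weight 0.5 * (1 + cos) is a hypothetical choice, not the paper's formula.
    """
    logits = np.asarray(logits, dtype=float)
    features = np.asarray(features, dtype=float)
    centers = np.asarray(centers, dtype=float)
    labels = np.asarray(labels, dtype=int)

    # Numerically stable log-softmax.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    cross_entropy = -log_probs[np.arange(len(labels)), labels]

    # Cosine similarity between each feature and the center of its own class.
    own_centers = centers[labels]
    norms = np.linalg.norm(features, axis=1) * np.linalg.norm(own_centers, axis=1)
    cosine = (features * own_centers).sum(axis=1) / (norms + 1e-12)

    weights = 0.5 * (1.0 + cosine)  # in [0, 1]; easy samples get more weight
    return float(np.mean(weights * cross_entropy)), weights

# Toy batch: the first feature sits on its center, the second is orthogonal to it.
loss, weights = cosine_weighted_softmax_loss(
    logits=[[0.0, 0.0], [0.0, 0.0]],
    features=[[1.0, 0.0], [0.0, 1.0]],
    centers=[[1.0, 0.0], [-1.0, 0.0]],
    labels=[0, 1],
)
print(loss, weights)
```

In a training loop the class centers would have to be maintained alongside the network, for example as running means of the features per class; that bookkeeping is omitted here.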

Improving Large Pose Face Alignment by Regressing 2D and 3D Landmarks Simultaneously and Visibility Refinement

This paper proposes an improved method for large pose face alignment. Unlike existing methods, the proposed method regresses both 2D and 3D coordinates of facial landmarks simultaneously. It first computes a coarse estimation of the landmarks via a shape regression network (SRN) whose input is only the input image. It then refines the landmarks with another SRN whose input consists of three components: the transformed image, the visible landmark heatmap and the feature map from the first SRN. These components are constructed by a transformation module based on the current estimates of 3D and 2D landmarks. By effectively exploring the 3D property of faces for constraining 2D landmarks and refining their visibility, the proposed method can better align faces under large poses. Extensive experiments on three public databases demonstrate the superiority of the proposed method in large pose face alignment.

Xu Luo, Pengfei Li, Fuxuan Chen, Qijun Zhao

RGB-D Face Recognition: A Comparative Study of Representative Fusion Schemes

RGB-D face recognition (FR) has drawn increasing attention in recent years with the advances in RGB-D sensing technologies and the decrease in sensor prices. While a number of multi-modality fusion methods are available in face recognition, there is no established conclusion on how RGB and depth should be fused. We provide a comparative study of four representative fusion schemes in RGB-D face recognition, covering signal-level, feature-level and score-level fusion, and a hybrid fusion we designed for RGB-D face recognition. The proposed method achieves state-of-the-art performance on two large RGB-D datasets. A number of insights are provided based on the experimental evaluations.

Jiyun Cui, Hu Han, Shiguang Shan, Xilin Chen

An ICA-Based Other-Race Effect Elimination for Facial Expression Recognition

The other-race effect significantly affects the performance of multi-race facial expression recognition. Though this phenomenon has been noticed by psychologists and computer vision researchers for decades, little work has been done to eliminate its influence. This work proposes an ICA-based other-race effect elimination method for 3D facial expression recognition. First, local depth features are extracted from 3D face point clouds, and independent component analysis is used to project the features into a subspace in which the feature components are mutually independent. Second, a mutual information based feature selection method is adopted to determine race-sensitive features. Finally, the features remaining after the race-sensitive information is eliminated are used for facial expression recognition. The proposed method is evaluated on the BU-3DFE database, and the results reveal that it is effective at eliminating the other-race effect and can improve multi-race facial expression recognition performance.

Mingliang Xue, Xiaodong Duan, Wanquan Liu, Yuehai Wang

ClusterFace: Clustering-Driven Deep Face Recognition

In recent years, image-based 2D face recognition has achieved human-level performance thanks to the breakthrough of the deep learning paradigm. However, almost all existing deep face recognition methods depend on millions of labeled 2D face images from different individuals for supervised deep learning. In this case, face labelling becomes the pain point of deep face recognition. To solve this issue, we propose a novel clustering-driven unsupervised deep face recognition framework, namely ClusterFace. In particular, our framework first assumes that we already have a well-trained deep face model and a large number of face images without any labels. Then, all these face images are represented by this deep face model and clustered without supervision using a chosen clustering algorithm. Finally, these clustering-based face labelling results are employed to train a new deep CNN model for face recognition. Experimental results demonstrate that the proposed framework with a simple Mini-batch K-Means clustering algorithm can achieve surprising state-of-the-art performance (99.41%) on the LFW dataset. We also present an intuitive explanation of why our framework performs well and demonstrate its robustness to the choice of the number of clusters and the amount of unlabeled face images.

Lingjiang Xie, Cuican Yu, Huibin Li, Jihua Zhu

Sketch Synthesized Face Recognition with Deep Learning Models

Sketch face recognition is of great significance in criminal investigation, Internet search and management. In this paper, we explore the feature representation of sketch synthesized face images with several deep learning models. To match heterogeneous images, we propose a modified face synthesis technique that combines sketches and face templates into a human face portrait. Through experiments, we investigate how the degree of synthesis affects face recognition. Several state-of-the-art Deep Neural Network (DNN) models for face recognition are transferred to extract features from sketch synthesized face images. Experiments show that the proposed synthesis method works effectively with the DNN models in sketch face recognition.

Wei Shao, Zhicheng Chen, Guangben Lu, Xiaokang Tu, Yuchun Fang

Face Anti-spoofing to 3D Masks by Combining Texture and Geometry Features

Anti-spoofing has become more important in face recognition systems. This paper proposes a novel approach to resist 3D face mask attacks, which jointly uses texture and shape features. Unlike existing methods, which require depth information from extra equipment, we reconstruct geometry cues from RGB images through a 3D Morphable Model. Hand-crafted features as well as deep ones are then extracted to comprehensively represent texture and shape differences between real and fake faces, and are finally fused for decision making. The experiments are carried out on the 3D-MAD dataset and the competitive results indicate the effectiveness of the approach.

Yan Wang, Song Chen, Weixin Li, Di Huang, Yuhong Wang

An Illumination Augmentation Approach for Robust Face Recognition

Deep learning has achieved great success in face recognition and significantly improved the performance of the existing face recognition systems. However, the performance of deep network-based methods degrades dramatically when the training data is insufficient to cover the intra-class variations, e.g., illumination. To solve this problem, we propose an illumination augmentation approach to augment the training set by constructing new training images with additional illumination components. The proposed approach first utilizes an external benchmark to generate several illumination templates. Then we combine the generated templates with the training images to simulate different illumination conditions. Finally, we conduct color correction by using the singular value decomposition (SVD) algorithm to confirm that the color of the augmented image is consistent with the input image. Experimental results demonstrate that the proposed illumination augmentation approach is effective for improving the performance of the existing deep networks.

Zhanxiang Feng, Xiaohua Xie, Jianhuang Lai, Rui Huang

Robust Face Recognition with Deeply Normalized Depth Images

Depth information has been proven useful for face recognition. However, existing depth-image-based face recognition methods still suffer from noisy depth values and varying poses and expressions. In this paper, we propose a novel method for normalizing facial depth images to frontal pose and neutral expression and extracting robust features from the normalized depth images. The method is implemented via two deep convolutional neural networks (DCNN), a normalization network ($$Net_{N}$$) and a feature extraction network ($$Net_{F}$$). Given a facial depth image, $$Net_{N}$$ first converts it to an HHA image, from which the 3D face is reconstructed via a DCNN. $$Net_{N}$$ then generates a pose-and-expression normalized (PEN) depth image from the reconstructed 3D face. The PEN depth image is finally passed to $$Net_{F}$$, which extracts a robust feature representation via another DCNN for face recognition. Our preliminary evaluation results demonstrate the superiority of the proposed method in recognizing faces of arbitrary poses and expressions with depth images.

Ziqing Feng, Qijun Zhao

MobileFaceNets: Efficient CNNs for Accurate Real-Time Face Verification on Mobile Devices

We present a class of extremely efficient CNN models, MobileFaceNets, which use fewer than 1 million parameters and are specifically tailored for high-accuracy real-time face verification on mobile and embedded devices. We first give a simple analysis of the weakness of common mobile networks for face verification. This weakness is overcome by our specifically designed MobileFaceNets. Under the same experimental conditions, our MobileFaceNets achieve significantly superior accuracy as well as more than 2 times actual speedup over MobileNetV2. After being trained with ArcFace loss on the refined MS-Celeb-1M, our single MobileFaceNet of 4.0 MB size achieves 99.55% accuracy on LFW and 92.59% TAR@FAR1e-6 on MegaFace, which is even comparable to state-of-the-art big CNN models of hundreds of MB in size. The fastest of the MobileFaceNets has an actual inference time of 18 ms on a mobile phone. For face verification, MobileFaceNets achieve significantly improved efficiency over previous state-of-the-art mobile CNNs.

Sheng Chen, Yang Liu, Xiang Gao, Zhen Han

Eye-Based Biometrics


Gabor Filtering and Adaptive Optimization Neural Network for Iris Double Recognition

The iris image is greatly affected by the collection environment, so the outputs of different iris categories in a distance-based recognition algorithm may be similar. A neural network recognition algorithm can make the results more distinguishable, but the same neural network structure performs very differently on different iris libraries. Both problems may reduce the accuracy of iris recognition. This paper proposes an iris double recognition algorithm based on Gabor filtering and an adaptively optimized neural network. Gabor filtering is used to extract iris features. Hamming distance is used to eliminate most of the non-matching categories in the first recognition. A BP neural network whose connection weights are optimized by an immune particle swarm optimization algorithm is used for the second recognition. Comparisons with many algorithms on different iris libraries show that the proposed algorithm can effectively improve iris recognition accuracy.

Shuai Liu, Yuanning Liu, Xiaodong Zhu, Zhen Liu, Guang Huo, Tong Ding, Kuo Zhang
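
The first recognition stage described above screens candidates by Hamming distance. As a rough sketch of that idea, the code below compares binary iris codes and keeps only the enrolled categories whose distance to the probe falls under a threshold; the 8-bit codes, the labels and the threshold value 0.35 are all hypothetical, and occlusion masks are left out for brevity.

```python
import numpy as np

def hamming_distance(code_a, code_b):
    """Fraction of bits that differ between two equal-length binary codes."""
    a = np.asarray(code_a, dtype=bool)
    b = np.asarray(code_b, dtype=bool)
    return float(np.mean(a ^ b))

def shortlist(probe, gallery, threshold=0.35):
    """Keep the gallery labels whose code is within `threshold` of the probe.

    `threshold` is an illustrative value, not one reported in the paper.
    """
    return [label for label, code in gallery.items()
            if hamming_distance(probe, code) <= threshold]

# Toy 8-bit codes; real iris codes are far longer.
probe = [1, 0, 1, 1, 0, 0, 1, 0]
gallery = {
    "A": [1, 0, 1, 1, 0, 0, 1, 1],  # differs from the probe in one bit
    "B": [0, 1, 0, 0, 1, 1, 0, 1],  # bitwise complement of the probe
}
print(shortlist(probe, gallery))
```

Only the categories that survive this screening would be passed to the second-stage classifier, which is what keeps the more expensive step small.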

Efficient Method for Locating Optic Disc in Diabetic Retinopathy Images

Diabetic retinopathy has no obvious symptoms at an early stage, so the best time for treatment is often missed. We apply image processing techniques to the early diagnosis of diabetic retinopathy and present an efficient method of locating the optic disc (OD) in fundus images. We first normalize the images in color, brightness, and exposure distribution to weaken the interference of pigment differences, uneven brightness and low contrast. We then extract the regions of interest (ROI) by convolving the fundus images with a binary mask template, which eliminates the influence of the background on the accuracy of locating the OD and reduces the amount of computation. Next, we convert the ROI into three grayscale images, of which the one from the G channel is selected to locate the OD since it has the highest contrast and retains the most original information. Finally, we create a universal template of the optic disc for diabetic retinopathy images and design a fast method of locating the OD in fundus images based on this template. The similarity between the OD template and the overlaid patch in the fundus image is computed by means of correlation matching or standard correlation matching, and the position with maximal similarity is regarded as the center of the OD. Experimental results demonstrate that our method is efficient and has some prospect of clinical application.

Aili Han, Anran Yang, Feilin Han
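
The template-based localization step above slides an OD template over the image and takes the position of maximal similarity as the center. The sketch below illustrates that step with zero-mean normalized cross-correlation as the similarity measure; whether this corresponds to the paper's "standard correlation matching" is an assumption, and the brute-force loop is written for clarity rather than speed.

```python
import numpy as np

def locate_by_template(image, template):
    """Return ((row, col), score) of the best-matching patch center.

    Similarity is zero-mean normalized cross-correlation (an assumed choice).
    """
    image = np.asarray(image, dtype=float)
    template = np.asarray(template, dtype=float)
    th, tw = template.shape
    t = template - template.mean()
    t_norm = np.sqrt((t ** 2).sum())
    best_score, best_center = -np.inf, None
    for r in range(image.shape[0] - th + 1):
        for c in range(image.shape[1] - tw + 1):
            patch = image[r:r + th, c:c + tw]
            p = patch - patch.mean()
            denom = np.sqrt((p ** 2).sum()) * t_norm
            if denom == 0:
                continue  # flat patch: correlation is undefined, skip it
            score = (p * t).sum() / denom
            if score > best_score:
                best_score = score
                best_center = (r + th // 2, c + tw // 2)
    return best_center, best_score

# Toy example: a bright blob pasted into an otherwise dark image.
template = np.array([[0, 1, 0], [1, 2, 1], [0, 1, 0]], dtype=float)
image = np.zeros((12, 12))
image[5:8, 4:7] = template
print(locate_by_template(image, template))
```

For full-resolution fundus images a library routine or an FFT-based correlation would normally replace the double loop.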

Research on Security of Public Security Iris Application

Starting from the security problems encountered in public security iris applications, this paper gives a comprehensive analysis of these problems from two aspects: iris acquisition and iris verification. Drawing on current academic research results and practical application requirements, it uses iris template encryption, original image encryption, transmission encryption, equipment authentication and other technical means to construct a security system that protects public security iris applications in an all-round way. Finally, the application of the research results in actual use is introduced.

Li Li, Shengguang Li, Shiwei Zhao, Lin Tan

Hybrid Fusion Framework for Iris Recognition Systems

Due to its advantages of uniqueness, convenience and contactless operation, iris recognition is widely deployed for automatic identity authentication. Instead of a single signature, multiple templates are registered in real-world applications to diversify the gallery samples, resulting in a greatly enhanced user experience. In this paper, we exploit the connections among the multiple registration data and aim to give a more comprehensive decision based on them. A novel hybrid fusion framework is proposed to fuse information in groups at the feature and score levels. Specifically, the gallery samples are first divided into groups to balance the abundance and the robustness of information. Afterwards, hierarchical fusion is performed on the groups, which is in effect a procedure of information mapping and reducing. The experimental results demonstrate the effectiveness and generalization ability of the proposed hybrid fusion framework.

He Zhang, Jing Liu, Zhiguo Zeng, Qianli Zhou, Shengguang Li, Xingguang Li, Hui Zhang

Design of a Long Distance Zoom Lens for Iris Recognition

This paper presents a zoom lens for iris recognition that addresses the narrow working range imposed by a fixed iris image size. The lens is designed to offer a wide and long working range and high-resolution images with the same iris size at different working distances. The resulting 2.6× long distance zoom lens has a focal length of 35~91 mm, a working distance from 750 mm to 1910 mm, less than 0.5% distortion, a working F-number of 3.5, an image diameter of 9.5 mm, and an operating spectrum of 780 nm~890 nm. At 166 lp/mm, the lens can capture iris images over the whole field of view with MTF > 0.3.

Xiaoyu Lv, Wenzhe Liao, Kaijun Yi, Junxiong Gao
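
The figures quoted in the abstract above can be sanity-checked with a thin-lens idealization. This is a back-of-the-envelope sketch only: it assumes a single thin lens and treats the stated working distance as the object distance u, neither of which is stated in the abstract. Under those assumptions the lateral magnification at the two ends of the zoom range is nearly identical, which is consistent with the claim of a constant iris image size.

```latex
% Thin-lens idealization (an assumption, not the paper's optical model):
% lateral magnification for focal length f and object distance u.
\[
  m = \frac{f}{u - f}
\]
% Short end (f = 35 mm, u = 750 mm) and long end (f = 91 mm, u = 1910 mm):
\[
  m_{\text{wide}} = \frac{35}{750 - 35} \approx 0.0490, \qquad
  m_{\text{tele}} = \frac{91}{1910 - 91} \approx 0.0500, \qquad
  \frac{f_{\text{tele}}}{f_{\text{wide}}} = \frac{91}{35} = 2.6
\]
```

The two magnifications differ by about 2%, and the focal length ratio reproduces the stated 2.6× zoom.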

Efficient Near-Infrared Eye Detection Utilizing Appearance Features

Eye detection is a critical problem for iris recognition, face recognition and other applications. However, unconstrained scenes pose many challenges for eye detection, such as occlusion, rotation, blur and complex backgrounds. In this paper, we propose a novel eye detection algorithm for near-infrared images. We put forward four factors, namely IVSF, PLG, DRDF and IOSF, to represent eye region features. The method is mainly composed of two steps. First, candidate positions are generated. Second, a multi-strategy fusion method is designed to confirm the final eye position. The experimental results demonstrate that the proposed algorithm is accurate and fast compared with some existing methods.

Qi Wang, Ying Lian, Ting Sun, Yuna Chu, Xiangde Zhang

Attention Detection by Learning Hierarchy Feature Fusion on Eye Movement

Detecting a person's concentration state from eye movement information is now a popular research topic in computer vision, especially for driver fatigue detection and advertising analysis. In this paper we analyze how eye movement styles relate to a person's concentration state while watching different video clips. We propose a novel method that fuses features of eye event data and raw eye movement data to detect attention. First, we use a logistic regression algorithm to construct a new feature from the eye movement event data, and use wavelet and approximate entropy algorithms to construct a new feature from the raw eye movement data. Second, we train an attention detection model using these merged features. To avoid the problems caused by insufficient samples, a cross-validation method is used to train the model and ensure its accuracy. Our model achieves a satisfying 95.25% accuracy.

Bing Liu, Peilin Jiang, Fei Wang, Xuetao Zhang, Haifan Hao, Shanglin Bai

Emerging Biometrics


An Efficient 3D Ear Recognition System Based on Indexing

We propose a system for time-efficient 3D ear biometrics. The system is composed of two primary components, namely: (1) an ear shape-based index; and (2) categorization using the index. We built an index tree by using the shape feature computed from measures of circularity, rectangularity, ellipticity, and triangularity, based on ear segmentation results and then perform a nearest neighbor search to obtain a gallery of ear images that are closest in shape to the probe subjects. For the categorization component, separate index trees are built out of the gallery of ear images by using a reduced depth feature space for each image. We utilize an indexing technique to perform a range query in a reduced depth feature space for ears that are closest in shape to the probe subject. Experiments on the benchmark database demonstrate that the proposed approach is more efficient compared to the state-of-the-art 3D ear biometric system.

Qinping Zhu, Zhichun Mu
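
The shape-based index above relies on measures such as circularity and rectangularity computed from the ear segmentation. As a minimal sketch, the code below computes two of them for a polygonal contour and ranks gallery shapes by descriptor distance. The formulas used (circularity as 4πA/P², rectangularity as area over the axis-aligned bounding box) are common textbook definitions and may differ from the paper's; ellipticity and triangularity are omitted, and a linear scan stands in for the index tree.

```python
import numpy as np

def shape_descriptor(contour):
    """Return [circularity, rectangularity] for a closed polygon of (x, y) points."""
    pts = np.asarray(contour, dtype=float)
    x, y = pts[:, 0], pts[:, 1]
    # Shoelace formula for the enclosed area.
    area = 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))
    # Sum of edge lengths, including the closing edge.
    perimeter = np.linalg.norm(pts - np.roll(pts, -1, axis=0), axis=1).sum()
    circularity = 4.0 * np.pi * area / perimeter ** 2
    box_area = (x.max() - x.min()) * (y.max() - y.min())
    rectangularity = area / box_area
    return np.array([circularity, rectangularity])

def nearest_shapes(probe_contour, gallery, k=1):
    """Rank gallery labels by Euclidean distance between shape descriptors."""
    probe = shape_descriptor(probe_contour)
    ranked = sorted(
        gallery,
        key=lambda name: np.linalg.norm(shape_descriptor(gallery[name]) - probe),
    )
    return ranked[:k]

# Toy contours, invented for the example.
gallery = {
    "square": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "strip": [(0, 0), (10, 0), (10, 1), (0, 1)],
}
print(nearest_shapes([(0, 0), (1, 0), (1, 1), (0, 1)], gallery))
```

Both measures are scale invariant, so the unit-square probe matches the larger gallery square exactly.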

Actual Radiation Patterns-Oriented Non-deterministic Optical Wireless Channel Characterization

In optical wireless communication channel characterization, the Monte Carlo-based non-deterministic modeling scheme has been widely adopted due to its simplicity and efficiency. Currently, this scheme can only characterize optical sources with a Lambertian radiation pattern. To model practical non-Lambertian source radiation patterns flexibly, a modified version of this stochastic modeling scheme is systematically proposed in this paper. Numerical results show that the modified scheme can represent various non-Lambertian sources with a symmetric radiation pattern (e.g. LUXEON® Rebel LEDs from Lumileds Philips) with high accuracy and at significantly reduced computational complexity. Compared with the existing deterministic modeling scheme, the power deviation ratio of the impulse response is less than 0.60% for the second-order reflection and less than 13.8% for the third-order reflection.

Jupeng Ding, Chih-Lin I, Ruiyue Xie, Huicheng Lai, Chi Zhang

Detection of the Toe-off Feature of Planar Shoeprint Based on CNN

In Chinese forensic science, a planar footprint can provide police officers with a lot of information for criminal investigation, such as sex, age and gait. The toe-off feature is an important feature of a planar shoeprint, which can indicate the gait pattern of the walker. However, the toe-off features of planar shoeprints are still analyzed manually by criminal investigators, which is inefficient and subjective. In this research, a novel algorithm for the automatic detection of the toe-off feature is developed. We define a shoeprint with the crescent feature in the toe-off region as a positive sample and one without it as a negative sample, and use a CNN to distinguish them. To conduct the research, we photograph planar shoeprints following crime scene photography practice. After performing some pre-processing steps on these pictures, we set up a planar shoeprint database. Experimental results show that the proposed method achieves a detection accuracy of 97.0% on our planar shoeprint database.

Xiangyu Meng, Yunqi Tang, Wei Guo

Identification of the Normal/Abnormal Heart Sounds Based on Energy Features and Xgboost

This paper puts forward a method for identifying normal and abnormal heart sounds. The wavelet packet energy features of the heart sounds were extracted from a large database of 1136 recordings, and the xgboost algorithm was used as the classifier. The feature importance was also evaluated and analyzed. The top 3, 6, 9 and 12 features were used to classify the heart sounds. Experimental results showed that the proposed algorithm can identify normal and abnormal heart sounds effectively. The result obtained with the top 9 features was as good as that obtained with all features, which can reduce the computation by almost half.

Ting Li, Xing-rong Chen, Hong Tang, Xiao-ke Xu

Muscle Synergy Analysis for Stand-Squat and Squat-Stand Tasks with sEMG Signals

Human walking is a composite movement of the musculoskeletal system of the lower limbs. The interaction mechanism of the different muscle groups in a combined action is of great importance. To this end, this paper studies motion model decomposition and muscle synergy under the stand-squat and squat-stand tasks. First, the envelopes were extracted from the acquired and de-noised surface electromyography (sEMG) signals. Second, the non-negative matrix factorization (NMF) algorithm was used to decompose the signals into four synergistic modules and the corresponding activation coefficients under the two tasks. Finally, the relationship between the muscle synergy and the lower limb movement was discussed for normal and fatigued subjects. The results show that the muscle participation of each synergistic module is consistent with its physiological function, and that the muscle synergies differ somewhat between normal and fatigue states. This work can help to understand the control strategies of the nervous system in lower extremity movement and has some significance for the evaluation of limb rehabilitation.

Chao Chen, Farong Gao, Chunling Sun, Qiuxuan Wu
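
The decomposition step above factors the sEMG envelope matrix into synergy modules and activation coefficients with NMF. The sketch below uses the standard multiplicative update rules for the Frobenius-norm objective; the abstract does not say which NMF variant the authors used, and the matrix sizes, iteration count and synthetic data are placeholders. Only the rank of four is taken from the abstract.

```python
import numpy as np

def nmf(V, rank, iters=200, seed=0, eps=1e-9):
    """Factor a non-negative matrix V (muscles x time) as W @ H.

    W (muscles x rank) holds the synergy modules and H (rank x time) their
    activation coefficients. Uses multiplicative updates for the Frobenius loss.
    """
    V = np.asarray(V, dtype=float)
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + eps
    H = rng.random((rank, V.shape[1])) + eps
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Synthetic "envelopes": 8 muscles, 200 time samples, built from 4 hidden synergies.
rng = np.random.default_rng(1)
V = rng.random((8, 4)) @ rng.random((4, 200))
W, H = nmf(V, rank=4)
relative_error = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(W.shape, H.shape, relative_error)
```

Each column of `W` can be read as the relative participation of the muscles in one synergy, and the matching row of `H` as how strongly that synergy is recruited over time.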

ECG Based Biometric by Superposition Matrix in Unrestricted Status

This paper proposes an Electrocardiogram (ECG) feature extraction method for biometrics. It relies on an ECG superposition number matrix built from several single-heartbeat ECG recordings. The aim of the study was to find stable features of the ECG signal under unrestricted status for biometrics. Through matrix segmentation and similarity comparison, the stable feature distribution was obtained and stable feature sets were constructed. ECG data from 13 volunteers in different states were collected with a self-made ECG device over a period of half a year; ECG data from 28 healthy individuals in a calm state were also collected; in addition, ECG data from 14 subjects in MIT-BIH were included in the study. In the experiments, the average True Positive Rate (TPR) reached 83.21%, 83.93% and 80% on the MIT data set, the calm-state ECG data set and the different-state ECG data set, respectively. It was also found that as the number of single heartbeats used to build the superposition matrix increased, the stable features of a person's ECG were gradually revealed, which helped ECG-based biometrics effectively.

Gang Zheng, Xiaoxia Sun, Shengzhen Ji, Min Dai, Ying Sun

Ear Alignment Based on Convolutional Neural Network

The capability of biometric systems has recently been dramatically improved by the emergence of deep learning. In ear verification, the accuracy is often lower than expected because of pose variation and occlusion. In this paper, we propose a novel ear alignment approach. According to the morphological and geometric characteristics of the ear, we define six key points on the ear, three located in the inner ear region and three located on the outer contour of the ear. To detect these key points on the ear image automatically, we train a cascaded convolutional neural network using our newly released USTB-Helloear database. The test ear image is then aligned by a radiological transformation that minimizes the mean square error of the six key points between the test image and the template image. Experimental results show that ear alignment improves the accuracy of the ear verification system.

Li Yuan, Haonan Zhao, Yi Zhang, Zeyu Wu

Evaluation of Outdoor Visible Light Communications Links Using Actual LED Street Luminaries

In existing indoor and outdoor visible light communications (VLC) works, the LED luminaries are almost all modeled with the conventional Lambertian beam pattern. However, to provide energy-efficient illumination and control light pollution, the beam patterns of commercially available LED street luminaries are almost all designed for specific applications and can only be profiled by a more sophisticated mathematical representation. In this paper, for the first time to the best of our knowledge, the link coverage of outdoor VLC with actual LED street luminaries is modeled and evaluated. Numerical results show that, compared to the conventional Lambertian case, the beam pattern of a typical actual LED street luminary is better at concentrating the emitted power in the served cell with a sharp cell edge, naturally limits the inter-cell interference to a quite low level and provides highly uniform link quality within each cell.

Jupeng Ding, Chih-Lin I, Chi Zhang, Baoshan Yu, Huicheng Lai

Readback Error Classification of Radiotelephony Communication Based on Convolutional Neural Network

The readback errors of radiotelephony communication result in serious potential risk to the air transportation safety. Therefore, it is essential to establish a proper model to identify and also to classify the readback errors automatically so as to improve the flight safety. In this paper, a new scheme, which has two channels to process the instructions and the readbacks (I-R pairs) respectively based on one-layer convolutional neural network (CNN), is proposed for the readback error classification. The semantics of the I-R pairs are learned by the one-layer CNN encoder. Then, the classification decision is made according to a matching vector of the I-R pairs. A new method of input is also tested. Extensive experiments have been conducted and the results show that the proposed scheme is effective for automatic readback error classification and the average classification accuracy on a Chinese civil radiotelephony communication dataset is up to 95.44%.

Fangyuan Cheng, Guimin Jia, Jinfeng Yang, Dan Li

Determination of Sex Discriminant Function Analysis in Chinese Human Skulls

Introduction: Human identity and sex determination are crucial for forensic investigations. The human skull is a useful tool for identification in natural disasters and criminal investigations. Aim: To use stepwise Fisher and logistic regression to build multivariate linear discriminant functions for sex determination of Uighur adult skulls from Turpan, Xinjiang. Methods: CT equipment was used to acquire and reconstruct 267 (114 male and 153 female) three-dimensional skull models. Sixteen measurement indicators were measured and computed. Stepwise Fisher and logistic regression were performed to build the sex discriminant functions, and leave-one-out cross validation was used to evaluate accuracy. Results: For fifteen of the measurement indicators, the male average was larger than the female average; for only one indicator was the male average smaller. All indicators except two (X7 and X13) showed a significant difference between the sexes ($$p < 0.01$$). With the sex discriminant function consisting of four indicators (X1, X4, X10, X11), built with the stepwise Fisher method, the accuracy was 86.8% for males and 86.2% for females. With the sex discriminant function consisting of five indicators (X4, X6, X12, X15, X16), built with the Logistic method, the accuracy was 89.4% for males and 90.2% for females. For the sex discriminant function built from incomplete skulls with only the frontal and mandibular parts, the stepwise Fisher method gave an accuracy of 67.9% for males and 69.1% for females, and the Logistic method gave 68.7% for males and 70.4% for females. Conclusion: By combining computer software with machine learning classification algorithms, sex discrimination can be realized for both complete and incomplete skulls. For sex identification in the Uighur population, the Logistic regression method is better than the stepwise Fisher method.

Wen Yang, Xiaoning Liu, Fei Zhu, Guohua Geng, Kang Li

Fast and Robust Detection of Anatomical Landmarks Using Cascaded 3D Convolutional Networks Guided by Linear Square Regression

Detecting anatomical landmarks in structural magnetic resonance imaging (MRI) is an important computer-aided medical technique. However, for the detection of some brain anatomical landmarks, linear or non-linear registration with skull stripping across subjects is usually unavoidable. In this paper, we propose a novel method. Starting from the original MRI data, a series of 3D convolutional neural networks (cascaded 3D-CNNs) is adopted to iteratively update the predicted landmarks. Specifically, the landmarks predicted by each 3D-CNN model are used to estimate the corresponding linear transformation matrix by linear square regression, which is very different from traditional registration methods. The estimated matrix is then used to transform the original image into a new image for the next 3D-CNN model. With these cascaded 3D-CNNs and linear square regression, we finally achieve both registration and landmark detection.

Zi-Rui Wang, Bao-Cai Yin, Jun Du, Cong Liu, Xiaodong Tao, Guoping Hu

An Automated Brain Tumor Segmentation Framework Using Multimodal MRI

An automated region of interest (ROI) segmentation framework is proposed for edema detection and brain tumor segmentation from brain magnetic resonance images (MRI). To further improve accuracy, the framework uses multimodal MRI data. It consists of three main stages. First, a clustering algorithm and morphological operations are used to detect the abnormal tissue, i.e., edema, so that the level set method can be initialized automatically. Then, an edge-based level set method that incorporates regional information is used to segment the edema from Fluid Attenuated Inversion Recovery (FLAIR) MRI. Finally, the brain tumor segmentation is obtained by applying the clustering method, a filling algorithm, and a morphological opening operation to T1 contrast-enhanced (T1c) MRI. The experiments are carried out on MRI slices in two modalities from 8 real patients, with matching ground truth for the edema and tumor. Experimental results demonstrate the effectiveness of our algorithm.

Haifeng Zhao, Shuhai Chen, Shaojie Zhang, Siqi Wang

Video-Based Pig Recognition with Feature-Integrated Transfer Learning

Automatic detection and recognition of animals has long been a popular topic. It can be used in different areas, such as ecosystem protection, the farming industry, and the insurance industry. Currently, there is still no robust and efficient method for this problem. Deep neural networks, a rapidly developing technology, have shown great power in image processing but suffer from slow training. Recently, transfer learning has become popular because it avoids training the network from scratch, which significantly speeds up training. In this paper, we focus on the pig recognition contest organized by a Chinese finance company. Training a VGG-19 network on all frames results in an accuracy lower than 60% in the prediction step. Through experiments, we find that a key to enhancing the accuracy of the video-based pig recognition task is to select the frames carefully with a certain algorithm. To take advantage of the strengths of different network architectures, we apply a feature integration method that combines the deep neural networks DPN131, InceptionV3, and Xception. We then use the integrated features to train on the labelled dataset, which consists of frames extracted from the videos of 30 pigs. The resulting model achieves a prediction accuracy of 96.41%. Experiments show that the best performance of our proposed methods exceeds that of all classic deep neural networks trained from scratch.

Jianzong Wang, Aozhi Liu, Jing Xiao

Integrating Multi-scale Gene Features for Cancer Diagnosis

Cancer is one of the major diseases that threaten human life. The advancement of high-throughput sequencing technology provides a way to accurately diagnose cancer and reveal its pathogenesis at the molecular level. In this study, we integrated differentially expressed genes and differential DNA methylation patterns, and applied multiple machine learning methods to conduct cancer diagnosis. The experimental results show that the performance of cancer diagnosis can be significantly improved with the integrated multi-scale gene features at the RNA and epigenetic levels. The AUC of the classifier can be increased by 7.4% with multi-scale gene features compared with using only differentially expressed genes, which verifies the effectiveness of integrating multi-scale gene features for cancer diagnosis.

Peng Hang, Mengjun Shi, Quan Long, Hui Li, Haifeng Zhao, Meng Ma

Behavioral Biometrics


A Novel Multiple Distances Based Dynamic Time Warping Method for Online Signature Verification

In this paper, a novel Multiple Distances Based Dynamic Time Warping (MDB-DTW) method is proposed for signature verification. To obtain more discriminative and complementary information, we take into account multiple distance measurements on the Euclidean-distance-based DTW path. In addition, two classifiers (an SVM-based classifier and a PCA-based classifier) are adopted to fuse the useful information and remove the noise from the multiple-dissimilarity vector space. Comprehensive experiments were conducted on three publicly accessible datasets, MCYT-100, SUSIG and SVC-task2, yielding EERs of 1.87%, 1.28% and 6.32% respectively, which further demonstrates the robustness and effectiveness of the proposed MDB-DTW method.

Xinyi Lu, Yuxun Fang, Qiuxia Wu, Junhong Zhao, Wenxiong Kang
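
The idea of measuring several distances along a single warping path can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `dtw_path` and `multi_distance_features` are hypothetical, and the Manhattan and Chebyshev distances are chosen here only as examples of additional measurements, since the abstract does not name the distances used.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_path(seq_a, seq_b):
    """Return (total_cost, path) of the Euclidean-distance DTW alignment.

    path is a list of (i, j) index pairs running from (0, 0) to the
    last elements of both sequences.
    """
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # acc[i][j] = minimal accumulated cost aligning seq_a[:i] with seq_b[:j]
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = euclidean(seq_a[i - 1], seq_b[j - 1])
            acc[i][j] = cost + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])

    # Backtrack from the end to recover the warping path.
    i, j = n, m
    path = []
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (acc[i - 1][j - 1], i - 1, j - 1),
            (acc[i - 1][j], i - 1, j),
            (acc[i][j - 1], i, j - 1),
        ]
        _, i, j = min(candidates)
    path.reverse()
    return acc[n][m], path


def multi_distance_features(seq_a, seq_b):
    """Average several distances (assumed set) over the Euclidean DTW path."""
    _, path = dtw_path(seq_a, seq_b)
    sums = {"euclidean": 0.0, "manhattan": 0.0, "chebyshev": 0.0}
    for i, j in path:
        diffs = [abs(x - y) for x, y in zip(seq_a[i], seq_b[j])]
        sums["euclidean"] += math.sqrt(sum(d * d for d in diffs))
        sums["manhattan"] += sum(diffs)
        sums["chebyshev"] += max(diffs)
    return [sums[k] / len(path) for k in ("euclidean", "manhattan", "chebyshev")]
```

The resulting dissimilarity vector is the kind of input that a downstream classifier, such as the SVM-based or PCA-based classifier mentioned in the abstract, could consume.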

The Detection of Beard Behavior of Taxi Drivers Based on Traffic Surveillance Video

This paper presents a method for the automatic detection of taxi drivers' beard behavior. First, the Haar-Adaboost algorithm is used to locate the special window area of the taxi. Second, the image is preprocessed by a multi-scale retina enhancement algorithm on the image value channel; face feature points and the chin area are then extracted from the window area, and the beard pixels are segmented by a skin tone and gray threshold method. Finally, the method was tested on real traffic surveillance video, and its validity was demonstrated.

Zuyun Wang, Xunping Huang, Kebin Jia, Pengyu Liu, Zhonghua Sun

Robust Recognition Algorithm for Fall Down Behavior

Detecting fall down behavior is meaningful work in public video surveillance and smart home care, as this behavior is often caused by accident and usually has serious consequences. However, the uncertainty of individual behavior, the differences between cameras, and the complexity of real application scenes make the task very hard. In this paper, a robust fall down behavior recognition algorithm is proposed based on spatial and temporal analysis of the Key Area of Human Body (KAHB). First, a modified ViBe method is applied to extract the motion area. Then, a pre-trained human body classifier combined with histogram tracking is used to locate the KAHB and extract its normalized spatial and temporal features. Finally, an SVM classifier is employed to detect the fall down behavior.

Wei Yan, Jianbin Xie, Peiqin Li, Tong Liu

Multi-source Interactive Behavior Analysis for Continuous User Authentication on Smartphones

Analyzing smartphone users’ behavioral characteristics to recognize their identities has received growing interest from security and biometric researchers. Existing smartphone authentication methods usually provide one-time identity verification in some specific applications, but the authenticated user is still subject to masquerader attacks or session hijacking. This paper presents a novel smartphone authentication approach that analyzes multi-source user-machine usage behavior (i.e., power consumption, physical sensors, and touchscreen interactions) and can continuously verify the presence of a smartphone user. Extensive experiments show that our authentication approach achieves relatively high accuracy, with an equal-error rate of 5.5%. The approach can also be seamlessly integrated with existing authentication methods; it needs no additional hardware and is transparent to users.

Xiaozi Liu, Chao Shen, Yufei Chen
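
The equal-error rate reported above is the operating point at which the false acceptance rate (FAR) equals the false rejection rate (FRR). A minimal sketch of how it can be approximated from match scores follows. The function name `equal_error_rate` is hypothetical, and the code assumes similarity scores where a higher value means "more likely genuine", which is a convention chosen for this example rather than something stated in the abstract.

```python
def equal_error_rate(genuine, impostor):
    """Approximate the EER from two lists of similarity scores.

    Every observed score is tried as a decision threshold; a sample is
    accepted when its score is >= the threshold. The EER is read off at
    the threshold where FAR and FRR are closest.
    """
    best_gap, eer = None, None
    for threshold in sorted(set(genuine) | set(impostor)):
        # FAR: fraction of impostor attempts that would be accepted.
        far = sum(s >= threshold for s in impostor) / len(impostor)
        # FRR: fraction of genuine attempts that would be rejected.
        frr = sum(s < threshold for s in genuine) / len(genuine)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer
```

With few scores the two rates rarely cross exactly, which is why the sketch averages FAR and FRR at the closest threshold instead of interpolating.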

Character-Based N-gram Model for Uyghur Text Retrieval

Uyghur is a low-resource language, but Uyghur Information Retrieval (IR) has become increasingly important. Although there are related research results and stem-based Uyghur IR systems, it remains difficult to obtain high-performance retrieval results due to the limitations of existing stemming methods. In this paper, we propose a character-based N-gram model and a corresponding smoothing algorithm for Uyghur IR. A full-text IR system based on the character N-gram model is developed using the open-source tool Lucene. A series of experiments and comparative analyses is conducted. Experimental results show that the proposed method performs better than conventional Uyghur IR systems.

Turdi Tohti, Lirui Xu, Jimmy Huang, Winira Musajan, Askar Hamdulla
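
A character N-gram index avoids stemming because a stem and its suffixed forms share most of their N-grams. The sketch below illustrates this with standalone Python rather than Lucene. The function names are hypothetical, add-one (Laplace) smoothing is used only as a stand-in because the abstract does not specify the smoothing algorithm, and the Latin-script strings in the usage note are illustrative placeholders, not corpus data.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Overlapping character n-grams of each whitespace-separated word."""
    grams = []
    for word in text.split():
        if len(word) < n:
            grams.append(word)  # keep short words whole
        else:
            grams.extend(word[i:i + n] for i in range(len(word) - n + 1))
    return grams


def score(query, document, vocab_size, n=3):
    """Log-likelihood of the query n-grams under an add-one smoothed
    unigram model of the document's n-grams (higher is better)."""
    counts = Counter(char_ngrams(document, n))
    total = sum(counts.values())
    return sum(
        math.log((counts[g] + 1) / (total + vocab_size))
        for g in char_ngrams(query, n)
    )


def rank(query, documents, n=3):
    """Return document indices ordered from best to worst match."""
    vocab = set(char_ngrams(query, n))
    for doc in documents:
        vocab.update(char_ngrams(doc, n))
    scored = [(score(query, doc, len(vocab), n), idx)
              for idx, doc in enumerate(documents)]
    return [idx for _, idx in sorted(scored, reverse=True)]
```

For example, `rank("kitab", ["mektep oqughuchi", "kitablar kitabxana"])` places the second document first, because the query shares the trigrams `kit`, `ita`, and `tab` with both suffixed words even though no stemmer was applied.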

Multi-task Network Learning Representation Features of Attributes and Identity for Person Re-identification

Person re-identification (re-ID) has become increasingly popular due to its significance in practical applications. Most available methods for person re-ID focus on verification and recognition of the person identity and pay attention mainly to the appearance details of the person. In this paper, we propose a multi-task network architecture to learn powerful representation features of attributes and identity for person re-ID. First, we utilize semantic descriptors of attributes such as gender and clothing details to learn representation features effectively. Second, we employ joint supervision of softmax loss and center loss for person identification to obtain deep features with inter-class dispersion and intra-class compactness. Finally, we use a convolutional neural network (CNN) and a multi-task learning strategy to integrate the person attributes and identity to complete the classification tasks for person re-ID. Experiments are conducted on Market1501 and DukeMTMC-reID to verify the efficiency of our method.

Junqian Wang, Mengsi Lyu
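
Joint supervision of softmax loss and center loss can be illustrated numerically: the softmax term separates classes, while the center term pulls each feature toward the center of its own class. The sketch below is an assumption-laden illustration, not the authors' network. The function names, the batch layout, the averaging over the batch, and the balancing weight of 0.01 are all hypothetical choices made for the example.

```python
import math


def softmax_loss(logits, label):
    """Cross-entropy of one sample computed from raw class scores."""
    peak = max(logits)  # subtract the maximum for numerical stability
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum - logits[label]


def center_loss(feature, center):
    """Half the squared Euclidean distance between a feature and its class center."""
    return 0.5 * sum((f - c) ** 2 for f, c in zip(feature, center))


def joint_loss(batch, centers, weight=0.01):
    """Mean over the batch of softmax loss + weight * center loss.

    batch:   list of (feature, logits, label) tuples
    centers: dict mapping each label to its class-center vector
    weight:  balancing factor between the two terms (assumed value)
    """
    total = 0.0
    for feature, logits, label in batch:
        total += softmax_loss(logits, label)
        total += weight * center_loss(feature, centers[label])
    return total / len(batch)
```

In training, the class centers would themselves be updated as features change; that update step is omitted here to keep the sketch focused on the loss value.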

BoVW Based Feature Selection for Uyghur Offline Signature Verification

Offline signature verification is an important research direction in the field of biometrics. This paper proposes a BoVW method based on the feature selection algorithm MRMR for offline signature verification. Feature vectors were formed by extracting visual word features, which were obtained by building a bag of visual words from the signature samples. To improve the relevance between feature vectors and categories and reduce the redundancy between features, the Maximum Relevance and Minimum Redundancy algorithm was used to select features from the visual word feature vectors. The algorithm can find the optimal feature subset. The experiments were conducted using 1200 samples from our Uyghur signature database, and comparison experiments were carried out on 2640 samples selected from the CEDAR database. An ORR of 93.81% was obtained on the Uyghur signatures and 95.38% on the Latin signatures from the CEDAR database. The experimental results indicate the efficiency of the proposed method.

Shu-Jing Zhang, Mahpirat, Yunus Aysa, Kurban Ubul
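
The Maximum Relevance and Minimum Redundancy criterion can be sketched as a greedy loop: each step picks the feature whose mutual information with the class label is high and whose average mutual information with the already selected features is low. The code below is a minimal illustration, not the authors' pipeline. It assumes features that are already discrete (visual word histograms would need to be binned first), uses the difference form of the criterion, and the function names are hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    # p(x,y) * log(p(x,y) / (p(x) p(y))), written with raw counts
    return sum(
        (c / n) * math.log(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def mrmr(features, labels, k):
    """Greedy mRMR selection.

    features: list of columns, one list of discrete values per feature
    labels:   class label of each sample
    k:        number of features to select
    Returns the indices of the selected features in selection order.
    """
    relevance = [mutual_information(col, labels) for col in features]
    selected = [max(range(len(features)), key=lambda i: relevance[i])]
    while len(selected) < k:
        best, best_score = None, None
        for i in range(len(features)):
            if i in selected:
                continue
            redundancy = sum(
                mutual_information(features[i], features[j]) for j in selected
            ) / len(selected)
            candidate = relevance[i] - redundancy
            if best_score is None or candidate > best_score:
                best, best_score = i, candidate
        selected.append(best)
    return selected
```

A duplicated feature is as relevant as the original but fully redundant with it, so the loop skips it in favor of a feature that carries new information about the label.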

Research on the Methods for Extracting the Sensitive Uyghur Text-Images for Digital Forensics

With the continuous development of filtering technology for text information, many criminal offenders have used image editing software to produce harmful Uyghur text involving religious extremism and terrorism. To recognize Uyghur text-images effectively, a scheme for recognizing printed Uyghur is put forward, based on features extracted by histogram of oriented gradients (HOG) and a multilayer perceptron (MLP) neural network. First, the Uyghur text-images are preprocessed to obtain binary images after noise elimination. Next, text lines are segmented by the horizontal projection integral method, and words and characters are segmented by the vertical projection integral method to obtain independent characters. Then, the features of the characters are extracted by HOG. Finally, the characters are recognized by the trained MLP neural network classifier according to the HOG features. The experimental results show that the proposed method recognizes Uyghur characters accurately.

Yasen Aizezi, Anniwaer Jiamali, Ruxianguli Abdurixiti, Kurban Ubul
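
The projection-based segmentation step can be sketched on a small binary image: row sums (the horizontal projection) locate text lines, and column sums within one line (the vertical projection) locate separate ink blobs. This is a simplified illustration with hypothetical function names. It splits only at completely empty rows or columns, which is an assumption of the sketch; connected cursive script would need a more careful splitting rule than the abstract describes here.

```python
def runs(profile):
    """Return (start, end) spans of consecutive non-zero entries; end is exclusive."""
    spans, start = [], None
    for idx, value in enumerate(profile):
        if value and start is None:
            start = idx
        elif not value and start is not None:
            spans.append((start, idx))
            start = None
    if start is not None:
        spans.append((start, len(profile)))
    return spans


def segment_lines(image):
    """Split a binary image (1 = ink, 0 = background) into text lines via row sums."""
    horizontal = [sum(row) for row in image]
    return runs(horizontal)


def segment_columns(image, top, bottom):
    """Split the rows top..bottom-1 of the image into ink blobs via column sums."""
    width = len(image[0])
    vertical = [sum(image[r][c] for r in range(top, bottom)) for c in range(width)]
    return runs(vertical)
```

Each `(top, bottom)` span returned by `segment_lines` can be passed to `segment_columns`, and the resulting crops would then go to a feature extractor such as HOG.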

A Study on the Printed Uyghur Script Recognition Technique Using Word Visual Features

This paper proposes a recognition technique that applies a combination of image processing and pattern recognition to the visual features of individual words. Uyghur script is naturally cursive, and its characters have uneven widths. Precisely cutting Uyghur words into characters in image format is therefore difficult. To avoid this problem, we use word models instead of character models. In addition, this technique does not need a large number of training samples: prepared text samples are converted to image samples, which are used to construct individual word models.

Halimulati Meimaiti

Multilingual Offline Handwritten Signature Recognition Based on Statistical Features

Signature recognition is an identity authentication method widely used in fields such as finance, the judiciary, banking, and insurance, and as a behavioral trait it plays an important role in society. To improve the accuracy of multilingual offline handwritten signature recognition, this paper proposes high-dimensional statistical feature extraction methods for multilingual signature samples. The signature image is first preprocessed. Then, 128-dimensional local center point features and 112-dimensional ETDT features are extracted from the mixed (English, Chinese and Uyghur) signatures, and a high-dimensional feature vector is formed by combining these two features. Finally, different distance-based metric learning methods are used to train and recognize the multilingual signatures. Recognition rates of 91.50%, 95.75% and 97.50% were obtained on the mixed Chinese-English, Chinese-Uyghur, and English-Uyghur signature datasets respectively. The experimental results indicate that the proposed algorithm can identify mixed signatures effectively and is suitable for identifying multilingual handwritten signatures.

Kurban Ubul, Xiao-li Wang, Ahat Yimin, Shu-jing Zhang, Tuergen Yibulayin

HMM-Based Off-Line Uyghur Signature Recognition

As a new biometric feature, the signature is convenient, reliable, and non-invasive, so signature recognition has been accepted by people. It is widely used in fields such as commerce, finance, the judiciary, and insurance, so offline signature recognition has important theoretical significance and practical value. In this paper, an offline signature recognition system based on Hidden Markov Models is established, using DCT features extracted from offline signatures. This method takes all the fonts in the offline signature image as a whole, uses image processing techniques to segment the entire font area, and then calculates the number of pixels in each font part. The whole is modeled by a Hidden Markov Model, the best state chain is obtained using Viterbi segmentation, and the EM algorithm is used to train the model. On 2000 Uyghur signatures from 100 different people and 1000 English signatures from 50 different people, the highest recognition rates were 99.5% and 97.5% respectively. The experimental results show that the Hidden Markov Model can accurately describe the characteristics of Uyghur signatures.

Long-Fei Mo, Hornisa Mamat, Mutallip Mamut, Alimjan Aysa, Kurban Ubul

