
2019 | Book

Biometric Recognition

14th Chinese Conference, CCBR 2019, Zhuzhou, China, October 12–13, 2019, Proceedings


About this book

The LNCS volume 11818 constitutes the proceedings of the 14th Chinese Conference on Biometric Recognition, held in Zhuzhou, China, in October 2019.

The 56 papers presented in this book were carefully reviewed and selected from 74 submissions. The papers cover a wide range of topics such as face recognition and analysis; hand-based biometrics; eye-based biometrics; gesture, gait, and action; emerging biometrics; feature extraction and classification theory; and behavioral biometrics.

Table of Contents

Frontmatter

Hand-Based Biometrics

Frontmatter
Local Discriminative Direction Extraction for Palmprint Recognition

Direction features serve as one of the most important features of palmprints, and a number of direction-based palmprint recognition methods have been proposed. However, most existing direction-based methods extract the dominant direction features, which are possibly not the most discriminative features due to the influence of the neighboring directions. In this paper, we present a straightforward example to show that a direction with a large neighboring direction response difference (NDRD) is more stable and thus more robust and discriminative. Inspired by that, we propose a new feature descriptor that extracts multiple direction features with competitive NDRDs for palmprint recognition. Extensive experiments conducted on three widely used palmprint databases, including the PolyU, IITD and CASIA databases, demonstrate the effectiveness of the proposed method.
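The selection rule can be sketched in a few lines. This is an illustrative reading of NDRD (the paper's exact filter bank and scoring may differ): for each pixel, compare every direction's filter response with its two angular neighbors and keep the directions whose neighboring response differences are largest.

```python
import numpy as np

def ndrd(responses):
    """Neighboring direction response difference for each direction.

    `responses` holds one pixel's filtering response for each of n
    evenly spaced directions; the direction index wraps around.
    """
    left = np.roll(responses, 1)
    right = np.roll(responses, -1)
    return np.abs(responses - left) + np.abs(responses - right)

def top_directions(responses, k=2):
    """Pick the k directions with the most competitive NDRDs."""
    return np.argsort(ndrd(responses))[::-1][:k]

# Toy example: responses of one pixel to 6 directional filters.
r = np.array([0.9, 0.2, 0.3, 0.8, 0.7, 0.75])
print(top_directions(r, k=2))  # the two most competitive directions
```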

Zhanchun Qiu, Lunke Fei, Shaohua Teng, Wei Zhang, Dongning Liu, Yan Hou, Wei Jia
Fingerprint Presentation Attack Detection via Analyzing Fingerprint Pairs

With the ever-growing deployment of fingerprint recognition systems, presentation attack detection has become the new bottleneck. In order to make full use of the difference in materials between fake and real fingerprints, we propose to utilize two images of a finger for classification. A pair of fingerprints is first aligned using a deformable registration algorithm and then fed into MobileNet-v2 networks to perform the classification. Experimental results on the public dataset LivDet 2011 show that the performance of the proposed approach is promising and prove the effectiveness of fusing two fingerprints rather than using them separately.

Meng Zhang, Jianjiang Feng, Jie Zhou
Finger Vein Recognition Based on Double-Orientation Coding Histogram

Many handcrafted finger vein recognition algorithms have been proposed in recent years, and orientation coding-based finger vein recognition is of great research significance. In this paper, we propose a double-orientation coding (DOC) method for finger vein recognition that represents the direction of vein texture using two orientation values. To strengthen the discrimination ability and robustness of the direction description, we further convert the DOC into the double-orientation coding histogram (DOCH). Since the DOCH alone cannot represent vein information adequately, we fuse it with LBP scores. Finally, we propose a weighted score fusion strategy to improve recognition performance, which integrates the DOCH score and the LBP score using the chi-square distance and an SVM, respectively. Experimental results on two public databases (i.e., the MMCBNU_6000 and FV-USM databases) demonstrate the effectiveness of our method for finger vein recognition, which achieves EERs of 0.55% and 0.16%, respectively.
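The matching side of this pipeline can be sketched briefly. The chi-square distance for comparing histograms is standard; the fixed fusion weight below is a placeholder for illustration only, since the paper learns the combination with an SVM.

```python
import numpy as np

def chi_square(h1, h2, eps=1e-10):
    """Chi-square distance between two histograms (e.g., DOCH or LBP)."""
    return 0.5 * np.sum((h1 - h2) ** 2 / (h1 + h2 + eps))

def fused_score(doch_dist, lbp_dist, w=0.7):
    """Weighted fusion of the two matching distances.

    w is an illustrative constant; in the paper the LBP branch is
    scored by an SVM and the combination is learned, not hand-set.
    """
    return w * doch_dist + (1.0 - w) * lbp_dist

h_probe = np.array([0.2, 0.5, 0.3])
h_gallery = np.array([0.25, 0.45, 0.30])
print(fused_score(chi_square(h_probe, h_gallery), 0.1))
```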

Yuting Lu, Mo Tu, Hao Wang, Junhong Zhao, Wenxiong Kang
Fingerprint Classification Based on Lightweight Neural Networks

Fast and accurate fingerprint classification is very important in large-scale fingerprint identification systems. At present, fingerprint classification models suffer from problems such as complicated operations, large numbers of parameters, and massive data requirements. In this paper, we present a lightweight neural network for automatic feature extraction and classification of fingerprint images. The fingerprint Region of Interest (ROI) image is taken as the input of the network and fused with the shallow feature map to obtain accurate ridge trend information from the shallow layers. Transfer learning and fingerprint directional field maps are combined to pre-train the lightweight network; the parameters of the network are then optimized and experimentally verified. Experimental results show that fusing the fingerprint ROI into the deep features improves the fingerprint classification effect, and that transferring the lightweight network model reduces the network's requirements on target-domain data and improves classification performance on small-sample fingerprint images.

Junying Gan, Ling Qi, Zhenfeng Bai, Li Xiang
3D Fingerprint Gender Classification Using Deep Learning

Optical Coherence Tomography (OCT) is a high-resolution imaging technology that provides a 3D representation of the fingertip skin. This paper investigates, for the first time, gender classification using such 3D fingerprints. Different from current fingerprint gender classification methods, the raw multiple longitudinal (X-Z) fingertip images of one finger can be used directly instead of features extracted from fingerprints, and the model can be trained effectively even when the training dataset is relatively small. Experimental results show that the best accuracy of 80.7% is achieved by classifying the left forefinger on a small database of 59 persons. Meanwhile, with the same data size and method, the classification accuracy based on 3D fingerprints is much higher than that based on 2D fingerprints: the highest accuracy is increased by 46.8%, and the average accuracy is increased by 26.5%.

Haozhe Liu, Wentian Zhang, Feng Liu, Yong Qi
A Novel Method for Finger Vein Recognition

Benefiting from the strong feature expression ability of convolutional neural networks (CNNs), finger vein recognition systems using CNNs currently show good performance. However, these systems usually adopt such large networks or complex step-by-step processes that they cannot be deployed on hardware platforms with limited computing power and small memory. To address this limitation, this research proposes a finger vein recognition network based on difference images and 3C images with cascade fine-tuning. First, a difference image is obtained by a difference operation on an image pair from an authentic or imposter matching, and a 2C image is acquired by treating the image pair as a two-channel image; a 3C image is then obtained by channel-wise connection of the difference image and the 2C image. Next, a SqueezeNet pre-trained on ImageNet that receives the 3C image as input is fine-tuned and the best fine-tuning manner is determined. Finally, a cascade fine-tuning framework is designed to integrate the difference images and 3C images. The SqueezeNet cascade fine-tuned from the pre-trained weights is 5.63 MB in size, and the corresponding equal error rates (EERs) on the MMCBNU_6000 and SDUMLA-HMT datasets are 1.889% and 4.906%, respectively. The experimental results show that the proposed method achieves not only high recognition accuracy but also a compact network.
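The construction of the three inputs reduces to a few channel operations; a minimal sketch, assuming single-channel images of equal size in channel-first layout:

```python
import numpy as np

def build_inputs(img_a, img_b):
    """Build the three network inputs from a matching pair.

    img_a, img_b: single-channel finger vein images of shape (H, W).
    Returns (difference image, 2C image, 3C image).
    """
    diff = np.abs(img_a - img_b)[None]       # (1, H, W): difference image
    two_c = np.stack([img_a, img_b])         # (2, H, W): pair as two channels
    three_c = np.concatenate([diff, two_c])  # (3, H, W): channel connection
    return diff, two_c, three_c
```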

Junying Zeng, Yao Chen, Chuanbo Qin, Fan Wang, Junying Gan, Yikui Zhai, Boyuan Zhu
Rolled Fingerprint Mosaicking Algorithm Based on Block Scale

In the mosaicking process of rolled fingerprints, there is a certain positional offset between adjacent frames of fingerprint images due to the large elastic deformation of the fingertip, which results in a significant mosaicking gap in the rolled fingerprint. To address this problem, a block-scale-based rolled fingerprint mosaicking algorithm is proposed in this paper. Firstly, the fingerprint image is divided into many small blocks, and the center areas are extracted by binary search. Then the corresponding lines of the segmented images are mosaicked, and all lines are optimized according to the sequence of frames captured during rolling. Finally, all line sequences are combined into a complete rolled fingerprint. The experimental results show that the proposed algorithm can effectively eliminate the mosaicking gap and the positional deviations of misalignment and reprint, and improve the accuracy of fingerprint recognition.

Yongliang Zhang, Minghua Gao, Xiaosi Zhan, Yifan Wu, Shengyi Pan
Study and Realization of Partial Fingerprint Mosaicking Technology for Mobile Devices

To obtain a large fingerprint template from several partial fingerprints, we propose a novel partial fingerprint mosaicking scheme, which includes a coarse-to-fine alignment stage and a fusion stage, to build a synthetic template from multiple partial fingerprints. In the alignment stage, a central patch is first determined after pair-wise patch matching; the remaining patches are then coarsely aligned to the central patch; finally, the locations of the patches are optimized globally. In the feature fusion stage, not only minutiae but also ridge sampling points are fused to increase the information in the synthetic template. The proposed algorithm is tested on two datasets of FVC2002, and the experimental results show that it achieves a great fingerprint mosaicking performance.

Yumeng Wang, Xiangwen Kong, Rongsheng Wang, Changlong Jin, Hakil Kim

Gesture, Gait and Action

Frontmatter
Multiscale Temporal Network for Video-Based Gait Recognition

Gait is a kind of advanced feature for human identification at a distance, and it contains rich temporal information. In this paper, an innovative gait recognition model, the Multiscale Temporal Network (MSTN), is designed to extract discriminative features at multiple scales in the temporal domain. MSTN builds a temporal pyramid from four different temporal resolutions, which means the human body motion can be described from coarse to fine by the four pathways in the network. The method is verified on a popular dataset, CASIA-B. The experimental results show that the proposed MSTN noticeably improves the recognition rate while remaining a straightforward and effective solution, and that there is great potential in gait feature extraction from the temporal domain.
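One way to read the four-resolution pyramid is plain temporal subsampling of the frame sequence; this is an assumption for illustration (the network's pathways could equally use temporal pooling):

```python
import numpy as np

def temporal_pyramid(frames, levels=4):
    """Build a coarse-to-fine temporal pyramid of a frame sequence.

    frames: array of shape (T, H, W, C). Level i keeps every 2**i-th
    frame, so levels=4 yields four temporal resolutions, one per pathway.
    """
    return [frames[:: 2 ** i] for i in range(levels)]

seq = np.zeros((16, 64, 44, 1))  # e.g., 16 silhouette frames
print([len(level) for level in temporal_pyramid(seq)])
```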

Xinhui Wu, Shiqi Yu, Yongzhen Huang
Global and Local Spatial-Attention Network for Isolated Gesture Recognition

In this paper, we focus on isolated gesture recognition from RGB-D videos. Our main idea is to design an algorithm that can extract global and local information from multi-modality inputs. To this end, we propose a novel attention-based method with 3D convolutional neural networks (CNNs) for isolated gesture recognition. It includes two parts. The first is a global and local spatial-attention network (GLSANet), which takes into account both the global information that focuses on the context of the frame and the local information that focuses on the hand/arm actions of the person, to extract efficient features from multi-modality inputs simultaneously. The second part is an adaptive model fusion strategy to fuse the predicted probabilities from multi-modality inputs. Experiments demonstrate that the proposed method achieves state-of-the-art performance on the IsoGD dataset.

Qi Yuan, Jun Wan, Chi Lin, Yunan Li, Qiguang Miao, Stan Z. Li, Lihua Wang, Yunxiang Lu
Authentication System Design Based on Dynamic Hand Gesture

Due to biometric immutability, an authentication system that depends on irrevocable biometric data (faces and fingerprints) is vulnerable to vicious attacks. Gestures, as short actions that contain static and dynamic behavioral information, are gradually replacing traditional biometrics. Compared to body gestures, hand gestures are more flexible and do not require the user's entire body to appear in front of the camera. However, most existing feature extraction algorithms rely on the key points of a hand in motion or on image analysis of a static hand gesture, making the authentication less real-time and less effective in the real world. To alleviate these problems, we propose a user authentication system based on dynamic hand gestures that jointly models the silhouette and skeletal properties of moving hands. Our system obtains an average 0.105% false acceptance rate (FAR) and an average 3.40% false rejection rate (FRR) on the public Dynamic Hand Gesture 14/28 dataset.

Chang Liu, Wenxiong Kang, Linpu Fang, Ningxin Liang

Feature Extraction and Classification Theory

Frontmatter
Structure Feature Learning: Constructing Functional Connectivity Network for Alzheimer’s Disease Identification and Analysis

The functional connectivity network, as a simplified representation of functional interactions, has been widely used for disease diagnosis and classification, especially for Alzheimer's disease (AD). Although many methods for functional connectivity network construction have been developed, they rarely adopt anatomical prior knowledge when constructing functional brain networks. However, in the neuroscience field, it is widely believed that brain anatomical structure determines brain function. Thus, integrating anatomical structure information into the functional brain network representation is significant for disease diagnosis; ignoring this prior knowledge may discard neuroscience information that is important both for interpreting the data and for diagnosis. In this paper, we propose a novel framework for constructing the functional connectivity network for AD classification and functional connectivity analysis. The experimental results demonstrate that the proposed method not only improves the classification performance but also identifies altered functional connectivity.

Qinghua Zhao, Zakir Ali, Jianfeng Lu, Hichem Metmer
Weakly Supervised Learning of Image Emotion Analysis Based on Cross-spatial Pooling

Convolutional neural networks (CNNs) simulate the structure and function of the nervous system based on biological characteristics, and have been used to understand the emotions that images convey. Most existing studies of emotion analysis have focused only on image emotion classification, and few have paid attention to the relevant regions that evoke emotions. In this paper, we solve the issues of image emotion classification and emotional region localization in a unified weakly supervised deep learning framework. We train a fully convolutional network, followed by our proposed cross-spatial pooling strategy, to generate an emotional activation map (EAM), which represents the relevant region that could evoke emotion in an image and is labelled only with an image-level annotation. Extensive experiments demonstrate that our proposed method achieves the best performance in both classification accuracy and emotional region localization.
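The EAM generation is in the spirit of a class activation map. The sketch below uses the classic weighted-sum-of-feature-maps formulation purely for illustration; the paper's cross-spatial pooling operator differs in its details.

```python
import numpy as np

def emotion_activation_map(feature_maps, class_weights):
    """CAM-style activation map for the predicted emotion class.

    feature_maps: (K, H, W) features from the last conv layer.
    class_weights: (K,) classifier weights of the predicted emotion.
    Returns an (H, W) map normalized to [0, 1] for visualization.
    """
    eam = np.tensordot(class_weights, feature_maps, axes=1)  # (H, W)
    eam -= eam.min()
    if eam.max() > 0:
        eam /= eam.max()
    return eam
```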

Guoqin Peng, Dan Xu
Embarrassingly Easy Zero-Shot Image Recognition

Zero-shot learning (ZSL) aims to transfer knowledge from seen image categories to unseen ones by leveraging semantic information. It is generally assumed that the seen and unseen classes share a common semantic space. A number of methods design a common space to accomplish the projection between image and class embeddings by learning a compatibility function, making up sample pairs to train the objective function. Considering the drawbacks of previous compatibility functions, we design a new compatibility function in this paper. Different from previous compatibility patterns, our proposed compatibility function is more discriminative by employing label vectors, which can measure the similarity between the projected image features and all seen class prototypes. Extensive experiments on four benchmark datasets show the effectiveness of our proposed approach.

Wenli Song, Lei Zhang, Jingru Fu
On the Generalization of GAN Image Forensics

Recently, GAN-generated face images have become more and more realistic and of such high quality that they are even hard for human eyes to detect. On the other hand, the forensics community keeps developing methods to detect these generated fake images and tries to ensure the credibility of visual content. Although researchers have developed some methods to detect generated images, few of them explore the important problem of the generalization ability of forensics models. As new types of GANs are emerging fast, the ability of forensics models to generalize to new types of GAN images is an essential and very challenging research topic. In this paper, we explore this problem and propose to use preprocessed images to train a forensic CNN model. By applying similar image-level preprocessing to both real and fake images, unstable low-level noise cues are destroyed, and the forensics model is forced to learn more intrinsic features to classify generated and real face images. Our experimental results prove the effectiveness of the proposed method.
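The preprocessing idea can be illustrated with any smoothing operation applied identically to real and generated images before training; a 3 × 3 mean filter is one plausible choice (the abstract does not commit to this exact operation, so treat it as an assumption):

```python
import numpy as np

def box_blur(img, k=3):
    """k x k mean filter with edge-replicate padding.

    Applied identically to real and GAN-generated images, this kind of
    smoothing suppresses the unstable low-level noise cues, pushing a
    forensic CNN toward more intrinsic features.
    """
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.zeros_like(img, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)
```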

Xinsheng Xuan, Bo Peng, Wei Wang, Jing Dong

Face

Frontmatter
Deep Residual Equivariant Mapping for Multi-angle Face Recognition

Face recognition has attracted a lot of attention, and plenty of valuable methods have been proposed during the past decades. However, because it is hard to learn geometrically invariant representations, existing face recognition methods still perform relatively poorly on multi-angle face recognition. In this paper, we hypothesize that there is an inherent mapping between frontal and non-frontal faces, and that non-frontal face representations can be converted into frontal face representations by an equivariant mapping. To carry out the mapping, we propose a Multi-Angle Deep Residual Equivariant Mapping (MADREM) block which adaptively maps the non-frontal face representation to the frontal face representation. The MADREM block can be regarded as carrying out face alignment and face normalization in the feature space, and the residual equivariant mapping enhances the discriminative power of the face representations. Finally, we achieve an accuracy of 99.78% on the LFW dataset and 94.25% on the CFP-FP dataset based on the proposed multiscale convolution and residual equivariant mapping block.

Wei Liu, Lintai Wu, Yong Xu, Dan Wang
The Impact of Data Correlation on Identification of Computer-Generated Face Images

Traditional image forensics methods can accurately identify forged pictures generated by splicing, tampering, etc., but most cannot identify forged pictures generated by GAN models. In this paper, we specifically explore the identification of forged faces created with GAN models. Our target is to analyze the effect of data correlation on the identification of computer-generated face images. We mainly test on fake face datasets generated by StyleGAN and DCGAN, each divided into two experimental control groups. We use convolutional neural network models such as ResNet-18, VGG, and GoogLeNet to perform classification experiments on the control groups. The results show that the models used in this paper can accurately distinguish real faces from forged faces generated with GANs, and the validation analysis shows that data correlation has a low influence on the identification of forged faces with specific models.

Taifeng Tan, Xin Wang, Yuchun Fang, Wei Zhang
Face Image Deblurring Based on Iterative Spiral Optimization

A motion-blurred image is caused by the relative motion between the target and the capturing device during the exposure time. It is difficult to analyze the face information of a motion-blurred face image, so motion deblurring is needed. However, existing algorithms cannot deal well with the diversity of motion blur kernels. To address this, this paper proposes an iterative spiral optimization algorithm for blind motion deblurring. The algorithm makes the blurred image spirally approximate the sharp image by calling the deblurring generator multiple times. Experiments show that the algorithm can effectively restore motion-blurred images with diverse blur kernels in an approximately natural state, and improve the visual effect of the image.

Yukun Ma, Yaowen Xu, Lifang Wu, Tao Xu, Xin Zhao, Lei Cai
AdaptiveNet: Toward an Efficient Face Alignment Algorithm

Face alignment, a challenging task in computer vision, has witnessed tremendous improvement on the 300W benchmark. However, state-of-the-art algorithms suffer from computational expense and therefore cannot be applied in real time. In this paper, we propose a time-efficient face alignment algorithm while maintaining sufficient accuracy. Specifically, we adopt MobileNet-V2 as our backbone architecture to deal with easy samples, accompanied by a ResNet branch to handle hard examples. This combination leads to a low-latency and yet agreeable-performance design, as our extensive experiments show.

Xiehe Huang, Weihong Deng
Cross-Dimension Transfer Learning for Video-Based Facial Expression Recognition

Dynamic Facial Expression Recognition (FER) in videos is currently a topic of broad concern. Considering that 3-dimensional convolutional networks (3D ConvNets) have recently demonstrated poor performance on this task, we propose a simple yet effective approach to solve this problem with limited emotion data, which we call cross-dimension transfer learning (CTL). By transferring parameters learned from 2D ConvNets into 3D, the network can be initialized reasonably, making it possible to avoid training 3D ConvNets from scratch. We introduce several transfer strategies, and experimental results show that CTL brings considerable improvement to 3D ConvNets: compared with training from scratch, recognition accuracy on AFEW (Acted Facial Emotion in the Wild) improves by 12.79%. We further extend our method to the CK+ (Extended Cohn-Kanade) dataset, and the classification performance shows the generalization ability of our approach.

Kailun Zhong, Yi Li, Li Fang, Ping Chen
Exploring Shape Deformation in 2D Images for Facial Expression Recognition

Facial expression recognition (FER) using 2D images has developed rapidly in the past decade. However, existing 2D-based FER methods seldom consider the impact of identity factors, and do not utilize shape features, which have been proven to be an effective complement to texture features. Built upon the latest 3D face reconstruction methods, this paper proposes to generate an expression-induced shape deformation map (ESDM) from the 3D face reconstructed from the input 2D face image, and then extract a shape feature from the ESDM using a deep network. The shape feature is then combined with the texture feature of the input 2D face image, resulting in a fused feature, based on which the expression of the input 2D face image is recognized using a softmax classifier. Evaluation experiments on the BU-3DFE, MMI and CK+ databases show that our proposed shape feature effectively improves 2D-based FER accuracy, and our method using the fused feature achieves state-of-the-art accuracy.

Jie Li, Zhengxi Liu, Qijun Zhao
Facial Attractiveness Prediction by Deep Adaptive Label Distribution Learning

One of the biggest challenges in facial attractiveness prediction is the lack of reliably labeled training data: it is very hard to apply a well-defined concept to describe the attractiveness of a face. In fact, facial attractiveness prediction is a label ambiguity problem. To solve it, we propose a novel deep architecture called Deep Adaptive Label Distribution Learning (DALDL). Different from previous works, we use a discrete label distribution over possible ratings rather than a single label to supervise the learning process, and update the label distribution automatically during training. Our approach provides a better description of facial attractiveness, and experiments show that DALDL achieves results better than or comparable to the state-of-the-art methods.
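Supervising with a rating distribution rather than a single label typically means minimizing a divergence between the target and predicted distributions, and predicting attractiveness as the expectation over discrete ratings. A minimal sketch of that idea (the adaptive updating of the label distribution is omitted):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def kl_loss(pred_logits, label_dist, eps=1e-12):
    """KL divergence from the predicted rating distribution to the
    target one -- the usual label distribution learning objective."""
    p = softmax(pred_logits)
    return float(np.sum(label_dist * np.log((label_dist + eps) / (p + eps))))

def expected_score(pred_logits, ratings):
    """Final attractiveness score = expectation over discrete ratings."""
    return float(np.dot(softmax(pred_logits), ratings))
```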

Luyan Chen, Weihong Deng
LWFD: A Simple Light-Weight Network for Face Detection

In the latest object detection approaches, whether one-stage or two-stage, computationally complex CNNs are used to achieve better detection performance. But this greatly limits the deployment platforms (GPU only), so we propose a simple light-weight network for face detection, based on a well-performing light-weight backbone, that can run on a CPU or ARM. Our approach combines a light-weight network with a simple but effective detection framework and a hyperparameter to control the number of channels. This gives our model a smaller size, faster speed, and better accuracy. LWFD can perform CNN inference on mobile devices, and the 1.0x model runs at 90 ms on a 2.4 GHz CPU with an f-score of 89% on the FDDB dataset.

Huan Liang, Jiani Hu, Weihong Deng
Dairy Cow Tiny Face Recognition Based on Convolutional Neural Networks

In practical applications of cow face recognition, the accuracy is often lower than expected because of the camera's low resolution and position. In this paper, we aim to develop and pilot a method for improving recognition accuracy and recovering identity information by generating cow faces close to the real identity. Specifically, our network architecture consists of two parts: a super-resolution network for recovering a high-resolution cow face from a low-resolution one, and a face recognition network; the super-resolution network is cascaded with the recognition network. An alternating training strategy is introduced to ensure the stability of the training process. The cow face dataset was collected by us and contains 85200 dairy cow face images from 1000 subjects. Experimental evaluations demonstrate the superiority of the proposed method, which achieves 94.92% recognition accuracy on small-size (12 × 14) cow faces.

Zehao Yang, Hao Xiong, Xiaolang Chen, Hanxing Liu, Yingjie Kuang, Yuefang Gao
Reconstructed Face Recognition

Computer-aided craniofacial reconstruction technology has very important applications in the field of criminal investigation. But reconstructing a face from a skull is not the end of the work: the reconstructed face needs to be automatically identified in a missing-population photo database. This paper proposes a reconstructed face recognition method based on deep learning. We trained a weighted-fusion deep network for feature extraction, built two different neural network models for reconstructed face verification, and used KNN for reconstructed face recognition. The experiments use 166 sets of data. In reconstructed face verification, the accuracy of the Pseudo-Siamese neural network is 98.33%; in reconstructed face recognition, its Top-1 accuracy is 99.57%. Experiments show that the proposed method can effectively improve the accuracy of reconstructed face recognition.

Xiaoning Liu, Shanghao Zhao, Shixiong Wang, Yunpeng Jing, Jun Feng
A Two-Stage Method for Assessing Facial Paralysis Severity by Fusing Multiple Classifiers

Facial paralysis is a disease in which the face cannot perform normal movements on the affected side. This paper proposes a novel two-stage method for automatically assessing the severity of facial paralysis in a coarse-to-fine manner. In the first stage, the method coarsely determines whether the query face has severe or mild facial paralysis by analyzing the symmetry of the face under neutral expression and the appearance of the closed eye on the affected side. In the second stage, a face with severe facial paralysis is further classified into two levels by analyzing the motion feature when showing teeth, while a face with mild facial paralysis is classified into four levels by analyzing the motion features when showing teeth and raising eyebrows. In both stages, support vector machines (SVMs) classify the face into different severity levels based on different features, and the final assessment is obtained by fusing the results of the multiple SVMs. Evaluation experiments on a database collected by ourselves obtain promising results and prove the effectiveness of fusing multiple classifiers that are based on different features.

Pengfei Li, Shune Tan, Xiurong Zhou, Sicen Yan, Qijun Zhao, Jicheng Zhang, Zejun Lv
Latent Spatial Features Based on Generative Adversarial Networks for Face Anti-spoofing

With the wide deployment of face recognition systems, many face attacks, such as print attacks, video attacks and 3D face masks, have emerged. Face anti-spoofing is therefore very important to protect face recognition systems from attack. This paper proposes a generative adversarial network with skip connections for face anti-spoofing. First, we obtain the latent spatial features of faces by training the generative adversarial network to reconstruct both real and spoof faces; second, we use convolutional neural networks to detect spoofing faces. The proposed method is evaluated on three public databases, and the results show that our approach achieves as high as 98% accuracy on both the CASIA-FASD and REPLAY-ATTACK databases.

Jingtian Xia, Yan Tang, Xi Jia, Linlin Shen, Zhihui Lai
Similarity Measurement Between Reconstructed 3D Face and 2D Face Based on Deep Learning

Craniofacial reconstruction technology is very important in the field of criminal investigation. But reconstructing a face from a skull is not the end of the work: the key technology is the similarity measurement between the reconstructed 3D face and a 2D face image. It can be used not only to retrieve the most similar face image from a missing-population database but also to evaluate reconstruction methods. We built a 3D reconstructed face dataset, trained a deep face feature extraction model and built a neural network for similarity measurement. First, the reconstructed 3D face and the 2D face image are preprocessed; second, a deep network is designed for similarity measurement; finally, the proposed model is tested. The accuracy of the similarity between the two kinds of face images was 96.67%. Experiments show that the proposed neural network model can effectively measure the similarity between the two kinds of face images.

Shanghao Zhao, Xiaoning Liu, Shixiong Wang, Yunpeng Jing, Jun Feng
Real-Time Face Occlusion Recognition Algorithm Based on Feature Fusion

Real-time face occlusion recognition is an important computer vision problem, especially in the public safety field. To construct a real-time face occlusion recognition system, this paper first establishes a large occluded face database. It then proposes a face occlusion recognition algorithm based on the fusion of the histogram of oriented gradients (HOG) and local binary patterns (LBP); the experimental results show that the occluded-face recall rate and the unobstructed-face recall rate are 92.03% and 93.58% respectively, at about 12.26 ms per image. Finally, taking the time factor into account, this paper establishes a lightweight deep neural network based on AlexNet with an occluded-face recall rate of 91.79% and an unobstructed-face recall rate of 91.42%, at approximately 22.92 ms. The experimental results show that the face occlusion recognition method based on HOG+LBP features not only improves the recognition rate of occluded faces but also reduces the time complexity, illustrating the effectiveness of the algorithm.

Xiangde Zhang, Bin Zheng, Yuanjie Li, Lianping Yang
Joint Face Detection and Alignment Using Focal Loss-Based Multi-task Convolutional Neural Networks

To learn more features of hard negative face examples, MTCNN increases the proportion of negative to positive examples. However, the training data of MTCNN suffer from an extreme foreground-background class imbalance, which results in a large number of well-classified negative examples overwhelming the detector during training. To solve this problem, we introduce focal loss into MTCNN to improve face classification performance. In addition, batch normalization is adopted to speed up training and to prevent the network performance degradation caused by overfitting. The experimental results demonstrate that the proposed method outperforms MTCNN on face classification and achieves good overall performance on facial landmark localization.

Rongsheng Wang, Jinzhao Tian, Changlong Jin
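The focal loss that the abstract above introduces into MTCNN has a standard closed form (Lin et al.); a minimal numpy sketch of the binary case, independent of the paper's actual implementation:

```python
import numpy as np

def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Binary focal loss: the (1 - p_t)^gamma factor down-weights
    well-classified examples so hard negatives dominate training."""
    p_t = p if y == 1 else 1.0 - p            # probability assigned to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)

# An easy negative (p near 0) contributes almost nothing,
# while a hard negative (p near 1) keeps a large loss.
easy = focal_loss(0.05, 0)
hard = focal_loss(0.95, 0)
```

This is why the detector is no longer overwhelmed by the many easy negatives: their individual losses are suppressed by orders of magnitude.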
A Face Recognition Workflow Based Upon Similarity Measurement

To combine different feature extraction methods, in this paper we propose a new method using similarity calculation between multiple features. We regard face recognition as a maximum-a-posteriori (MAP) problem, and the dependency between different features is defined by a Markov chain. We construct a matching similarity function T that helps us find a better matching image. Experiments were conducted on the AR database, and the results show that our recognition rate is higher and is especially robust to small occlusions and noise.

Yigan Li, Zhaohui Wang
106-Point Facial Landmark Localization with Mobile Networks Based on Regression

Sparse facial landmark localization offers limited precision for face reconstruction, while denser landmarks can better depict the structure of facial components. In this paper, a pipeline for detecting 106-point facial landmarks with regression is proposed. Based on the convergence and practical application of multi-point regression, we design MobileNetV2-FL and VGG16-FL. In addition, an effective data preprocessing strategy and several training tricks, such as the Online Hard Example Mining algorithm and the Wing loss, are applied. Experimental results show that the proposed method has a lower failure rate and is an effective and robust facial landmark localization method.

Xiangyang Zhai, Yuqing He, Qian Zhao, Yutong Ding

Eye-Based Biometrics

Frontmatter
Long Range Pupil Location Algorithm Based on the Improved Circle Fitting Method

Since the pupil shows abnormal contraction or dilation after drug use, long-range pupil diameter detection can be used for the rapid screening of drug users. In this paper, a long-range pupil location algorithm based on an improved circle fitting method is proposed. First, an improved histogram-based binarization method is used to obtain candidate pupil contours, and non-pupil contours are filtered out according to their shape and size. Then, the ROI is extracted from the original image according to the pupil contours, and the facula contours are extracted and filled based on features such as gray value, size, and position. Finally, an improved circle fitting method is used to obtain an accurate pupil position. Experimental results on the CASIA-IrisV4-Distance database show that, compared with existing pupil detection algorithms, the proposed algorithm locates the pupil more accurately, with a smaller average deviation of center and radius.

Yongliang Zhang, Xiaoxiao Qian, Zhongsu Luo, Keyi Zhu, Minjun Yu
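The abstract above does not spell out the fitting equations; as background, a circle can be fitted to candidate pupil-contour pixels with the classical Kåsa algebraic least-squares method. This is a plausible baseline, not necessarily the paper's improved variant:

```python
import numpy as np

def fit_circle(xs, ys):
    """Kasa least-squares circle fit: solve x^2 + y^2 = a*x + b*y + c
    for (a, b, c); center = (a/2, b/2), radius = sqrt(c + a^2/4 + b^2/4)."""
    A = np.column_stack([xs, ys, np.ones_like(xs)])
    rhs = xs ** 2 + ys ** 2
    a, b, c = np.linalg.lstsq(A, rhs, rcond=None)[0]
    cx, cy = a / 2.0, b / 2.0
    return cx, cy, np.sqrt(c + cx ** 2 + cy ** 2)

# Noise-free contour pixels on a circle with center (10, 20) and radius 4
t = np.linspace(0, 2 * np.pi, 50, endpoint=False)
cx, cy, r = fit_circle(10 + 4 * np.cos(t), 20 + 4 * np.sin(t))
```

Because the problem is linear in (a, b, c), the fit needs no initial guess, which is why algebraic fits are a common starting point before geometric refinement.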
Multi-source Heterogeneous Iris Recognition Using Locality Preserving Projection

Multi-source heterogeneous iris recognition (MSH-IR) has become one of the most challenging open issues. Iris recognition depends heavily on the acquisition device, which causes large intra-class variations and an ever-growing volume of duplicate iris data. This paper proposes applying the locality preserving projection (LPP) algorithm, based on manifold learning, as a framework for MSH-IR. By looking for similar internal structures in the iris texture, MSH-IR is performed by measuring similarity. The novelty of the solution is that the LPP algorithm is used to establish the neighborhood structure of similar feature points in the iris texture, the similarity between structures is measured after mapping to a low-dimensional space, and the SVM algorithm is used to find the optimal classification hyperplane in that space to classify multi-source heterogeneous iris images. The experiments are based on the JLU-MultiDev iris database, and the results demonstrate the effectiveness of the LPP dimensionality reduction algorithm for MSH-IR.

Guang Huo, Qi Zhang, Huan Guo, Wenyu Li, Yangrui Zhang
Iris Recognition Based on Adaptive Optimization Log-Gabor Filter and RBF Neural Network

To improve the universality and accuracy of one-to-one iris recognition, this paper proposes an iris recognition algorithm based on an adaptively optimized Log-Gabor filter and an RBF neural network. Iris amplitude features are extracted with a Log-Gabor filter whose parameters are optimized using a selection mutation operator and particle swarm optimization. Principal component analysis (PCA) is then used to reduce the dimensionality, thereby reducing noise and redundancy. The Euclidean distances between iris amplitude features are calculated, and an RBF neural network is built for iris recognition. Compared with other iris recognition algorithms on the JLU-6.0 and CASIA-Iris-Interval iris libraries, the proposed algorithm achieves a higher recognition rate and an ROC curve closer to the coordinate axes, showing good stability and robustness.

Qixian Zhang, Xiaodong Zhu, Yuanning Liu, Guang Huo, Guangyu Wang, Shuai Liu, Tong Ding, Kuo Zhang, Kiese Diangebeni Reagan, Chaoqun Wang
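As a reference for the PCA dimensionality-reduction step described above (the filter optimization itself is specific to the paper), projecting feature vectors onto their top principal components can be sketched via SVD; the array sizes below are illustrative:

```python
import numpy as np

def pca_reduce(X, k):
    """Project the rows of X (samples x features) onto the top-k
    principal components of the centered data."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by singular value
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T

rng = np.random.RandomState(0)
X = rng.randn(100, 64)        # stand-in for Log-Gabor amplitude features
Z = pca_reduce(X, 8)          # each sample is now an 8-dimensional vector
```

Reducing the dimension before the distance computation and the RBF network is what removes most of the noise and redundancy the abstract mentions.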
Retinal Vessel Segmentation Method Based on Improved Deep U-Net

The automatic segmentation of retinal vessels plays an important role in the early screening of eye diseases. However, vessels are difficult to segment in pathological retinal images. Hence, we propose deep U-net, a new retinal vessel segmentation method based on an improved U-shaped fully convolutional neural network. The method uses not only local features learned by the shallow convolutional layers but also abstract features learned by the deep convolutional layers. To improve the segmentation accuracy for thin vessels, we apply Gaussian matched filtering to the U-net. Batch normalization layers are added to the network, which increases the speed of convergence. In the training phase, a new sample amplification method called translation-reflection is proposed to increase the proportion of blood vessels in the training images. Experimental results show that the proposed method yields better retinal vessel segmentation than other recent methods in terms of the SE, SP, Acc, Ppv, and AUC evaluation metrics.

Yiheng Cai, Yuanyuan Li, Xurong Gao, Yajun Guo
Multi-pyramid Optimized Mask R-CNN for Iris Detection and Segmentation

Iris segmentation is an indispensable stage of the iris recognition pipeline. Traditional segmentation methods lack robustness, and segmentation methods based on an FCN run very slowly. Therefore, in this paper, we propose an iris detection and segmentation model based on a multi-pyramid optimized Mask R-CNN. It is mainly realized by expanding the segmentation features and fusing the segmentation features obtained from the feature pyramid. This enhances the expressiveness of the segmentation features and improves iris segmentation performance. Experiments were conducted on two public datasets, UBIRIS.v2 and CASIA.IrisV4-distance. The results show that the proposed model achieves better results than state-of-the-art methods in the literature.

Huanwei Liang, Zilong Chen, Hui Zhang, Jing Liu, Xingguang Li, Lihu Xiao, Zhaofeng He
Constrained Sequence Iris Quality Evaluation Based on Causal Relationship Decision Reasoning

To select as many usable irises as possible for recognition using the same indicators, a quality evaluation algorithm for constrained sequence iris images is proposed in this paper. With the other indicators held at ideal values, a variety of iris quality indicators are defined from the perspectives of sharpness, the nature of the iris region, and the degree of offset. According to the causal relationships among the quality indicators, the order of indicator evaluation is adjusted to form a quality decision reasoning process. Experiments using the JLU iris library of Jilin University indicate that the algorithm can effectively improve the survival rate of usable irises in sequence iris images and plays an active role in improving iris recognition accuracy.

Liu Shuai, Liu Yuanning, Zhu Xiaodong, Zhang Hao, Huo Guang, Wang Guangyu, Cui Jingwei, Li Xinlong, Wu Zukang, Dong Zhiyi
Iris Image Super Resolution Based on GANs with Adversarial Triplets

Iris recognition is a safe and reliable biometric technology in common use today. However, due to the limitations of equipment and environment in a variety of application scenarios, the captured iris image may be of low quality and insufficiently clear. In recent years, many attempts have been made to apply neural networks to iris image enhancement. Inspired by SRGAN, this paper introduces the adversarial idea into the triplet network and proposes a novel iris image super-resolution architecture. With the triplet loss, the network keeps reducing intra-class distances and expanding inter-class distances during iris image reconstruction. Experiments on several of CASIA's benchmark iris image datasets yield promising results. This architecture contributes to enhancing iris images for recognition.

Xiao Wang, Hui Zhang, Jing Liu, Lihu Xiao, Zhaofeng He, Liang Liu, Pengrui Duan
SDItg-Diff: Noisy Iris Localization Based on Statistical Denoising

It is quite challenging to localize noisy iris. In order to improve the stability and accuracy of noisy iris localization, this paper presents a statistical denoising integral difference operator (SDItg-Diff). Firstly, we use the Itg-Diff operator to produce several candidate boundaries with large Itg-Diff values. Then, the Pauta criterion is used to exclude the severe outlier pixels on each candidate boundary and the SDItg-Diff indicator is calculated after noise removal. The boundary with the max SDItg-Diff indicator is taken as the final localization boundary. The experimental result shows that, compared with the Itg-Diff operator, the proposed method can achieve more stable localization on noisy iris images.

Xiangde Zhang, Runan Zhou, Xiangyue Meng, Qi Wang
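The Pauta criterion used above is the classical 3-sigma rule; a minimal sketch of excluding outlier boundary pixels by a scalar statistic (using radial distance here is an illustrative assumption, not taken from the paper):

```python
import numpy as np

def pauta_filter(values):
    """Pauta (3-sigma) criterion: keep samples within three standard
    deviations of the mean; drop severe outliers."""
    mu, sigma = values.mean(), values.std()
    return values[np.abs(values - mu) <= 3.0 * sigma]

# Radial distances of candidate boundary pixels, with one gross outlier
r = np.concatenate([np.full(99, 50.0), [200.0]])
clean = pauta_filter(r)
```

Recomputing the difference indicator only over the surviving pixels is what makes the SDItg-Diff score robust to eyelash and highlight noise on the boundary.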
End to End Robust Recognition Method for Iris Using a Dense Deep Convolutional Neural Network

Many iris recognition algorithms have been proposed in academia. Because iris images are often degraded by illumination, blur, and occlusion, iris recognition has not been widely adopted in daily life, and higher robustness is required of iris recognition algorithms. Hence, this paper proposes an end-to-end dense deep convolutional neural network (DDNet) for iris recognition. DDNet uses a deeper network structure and takes segmented images as input without prior preprocessing or other conventional image processing techniques. The performance of DDNet is tested on CASIA-Iris-V3 and IITD, covering 138 and 224 different subjects respectively. Experimental results show that DDNet is robust across different parameter settings and that its performance exceeds that of most existing algorithms.

Ying Chen, Zhuang Zeng, Fei Hu

Emerging Biometrics

Frontmatter
X-Ray Image with Prohibited Items Synthesis Based on Generative Adversarial Network

Using deep learning to assist people in recognizing prohibited items in X-ray images is crucial to improving the quality of security inspections. However, such methods require a lot of data, and data collection usually takes much time and effort. In this paper, we propose a method to synthesize X-ray images to support the training of prohibited-item detectors. The proposed framework is built on Generative Adversarial Networks (GAN) with multiple discriminators, trying to synthesize realistic X-ray prohibited items and learn the background context simultaneously. In addition, a guided filter is introduced for detail preservation. The experimental results show that our model can smoothly synthesize prohibited items onto background images. To quantitatively evaluate our approach, we add the generated samples to the training data of the Single Shot MultiBox Detector (SSD) and show that the synthetic images improve the detector's performance.

Tengfei Zhao, Haigang Zhang, Yutao Zhang, Jinfeng Yang
A Deep Learning Approach to Web Bot Detection Using Mouse Behavioral Biometrics

Web bots are automated scripts that perform online tasks like humans. Abuse of bot technology poses various threats to the security of websites. Recently, mouse dynamics has been applied to bot detection by analyzing whether recorded mouse operations are consistent with human operational patterns. In this paper, we introduce a deep neural network approach to bot detection. We propose a new representation method for mouse movement data, which converts every mouse movement into an image containing its spatial and kinematic information. This representation makes it possible to use CNN models to automate feature learning from mouse movement data. Experimental results demonstrate that our method detects 96.2% of bots with statistical attack capability, while traditional detection methods using hand-crafted features or RNNs detect less than 30% of them.

Ang Wei, Yuxuan Zhao, Zhongmin Cai
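The representation method above converts each mouse movement into an image; one plausible simplification of the spatial part (the paper also encodes kinematic information, which is omitted here) is plain trajectory rasterization:

```python
import numpy as np

def trajectory_to_image(points, size=28):
    """Rasterize a sequence of (x, y) mouse coordinates into a
    size x size binary occupancy image."""
    pts = np.asarray(points, dtype=float)
    pts = pts - pts.min(axis=0)          # shift trajectory to the origin
    span = pts.max(axis=0)
    span[span == 0] = 1.0                # avoid division by zero on flat axes
    pts = pts / span * (size - 1)        # scale into the image grid
    img = np.zeros((size, size))
    for x, y in pts.astype(int):
        img[y, x] = 1.0
    return img

img = trajectory_to_image([(0, 0), (5, 3), (10, 10), (20, 15)])
```

Once movements are images, off-the-shelf CNN classifiers can learn human-vs-bot patterns without hand-crafted trajectory features.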
Multi-task Deep Learning for Child Gender and Age Determination on Hand Radiographs

In anthropology, and especially in medico-legal or forensic investigation, determining the gender and age of subjects is typically a preliminary and compulsory obligation. State-of-the-art methods for gender determination use the dimensions of bones around the skull and pelvis, whereas age is determined from the degree to which bones have grown, for instance dental eruption, epiphyseal fusion, tooth mineralization, and diaphyseal length. In this paper, we propose a convolutional neural network model with multi-task learning to determine gender and age from left-hand radiographs. The model performs well by determining gender and age simultaneously. The results indicate a relationship between gender and age: with an increase in age, gender-related features become more prominent. The phalanges and metacarpals are the most significant parts of the hand for gender detection within a given age group and for age detection given gender, respectively. To our knowledge, our method is the first to determine the gender and age of children simultaneously.

Mumtaz A. Kaloi, Xiaosen Wang, Kun He
Shoe Pattern Recognition: A Benchmark

In this paper, we propose a benchmark for shoe recognition based on convolutional neural networks. To meet training and testing needs, we also set up a shoe database containing 50 pairs of shoes and 160,231 images. The Caffe framework is applied in combination with different network models to train and test the shoe image data and to select the best network model, and the similarity between different shoe pictures is estimated for shoe verification. In addition, misrecognized images are analyzed and robustness tests are performed. The experimental results show that the proposed method achieves good performance, with an accuracy of 95.31%, and provides a new way of approaching shoe recognition.

Mengjing Yang, Heng Jiang, Yunqi Tang
Learning Discriminative Representation for ECG Biometrics Based on Multi-Scale 1D-PDV

ECG has drawn increasing attention in biometrics and has achieved great success compared with other biological characteristics. However, owing to poor signal quality, ECG cannot yet satisfy the requirements of mobile applications. In this paper, we learn a discriminative representation for ECG biometrics based on a multi-scale 1D-PDV feature. First, we choose the PDV as the base feature and convert it into a one-dimensional, multi-scale form for ECG biometrics. Second, our method learns a mapping that projects the multi-scale 1D-PDV onto a low-dimensional feature vector and captures the discriminative information of the ECG. Each feature vector is then pooled over a codebook and represented as a histogram feature. Finally, we apply principal component analysis (PCA) to reduce the histogram feature dimension and compute the matching score with cosine similarity. We evaluate our method on two public databases, and the results show that it achieves superior performance to other existing methods.

Yanwen Sun, Gongping Yang, Yuwen Huang, Kuikui Wang, Yilong Yin
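The final matching step above is a plain cosine similarity between the PCA-reduced histogram features; a minimal sketch (the feature values below are made up for illustration):

```python
import numpy as np

def cosine_score(a, b):
    """Matching score between two feature vectors as cosine similarity."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

enrolled = np.array([1.0, 2.0, 3.0])
genuine = cosine_score(enrolled, np.array([1.1, 2.1, 2.9]))   # same subject
impostor = cosine_score(enrolled, np.array([3.0, 0.1, 0.2]))  # different subject
```

A decision threshold on this score then separates genuine matches from impostors.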
Off-Line Handwritten Signature Recognition Based on Discrete Curvelet Transform

To improve offline handwritten signature recognition, a method based on the discrete curvelet transform is proposed. First, the necessary preprocessing of offline handwritten signatures is carried out, including grayscale conversion, binarization, and smoothing for denoising. The preprocessed signature image is subjected to the curvelet transform to obtain real-valued curvelet coefficients in a cell matrix, from which 82-dimensional energy features are extracted. These are combined with the multi-scale block local binary pattern (MBLBP) computed on the cell matrix of the discrete curvelet transform to form a new signature feature, and an SVM classifier is used for training and classification. In experiments on two databases, Uyghur and Kirgiz, the highest accuracies were 97.95% and 97.42%, respectively. The experimental results show that the proposed method achieves better accuracy in offline handwritten signature recognition.

Long-Fei Mo, Mahpirat, Ya-Li Zhu, Hornisa Mamat, Kurban Ubul
Research on Automatic Classification Method of Footwear Under Low Resolution Condition

From the shoe prints left by suspects at a crime scene, the specific type of shoes worn can be inferred, and that type can then be searched for in the surveillance footage around the scene; this is a common investigative technique used by public security organs. However, the technique is poorly automated, and in most cases the shoes in video surveillance are small and blurry. This paper proposes an automatic footwear classification method for pedestrians in low-resolution video surveillance. A footwear database of 149,199 images has been constructed, and, based on convolutional neural networks, a network model suitable for automatic footwear classification is designed. The experimental results show that the accuracy of the automatic footwear classification network in the test stage reaches 98.47%.

Heng Jiang, Mengjing Yang, Zhongliang Mi, Yunqi Tang

Behavioral Biometrics

Frontmatter
Low-Resolution Person Re-identification by a Discriminative Resolution-Invariant Network

Person re-identification (re-id) must tackle the problem of changing resolutions, because pedestrian images from surveillance systems and public datasets suffer from low resolution (LR-REID), including low quality and blurry textures, which makes it difficult to extract identity information across resolutions. Most existing re-id models are trained on high-resolution (HR) images and perform poorly when applied directly to low-resolution images. In this paper, we propose a novel Discriminative Resolution-invariant Network (DRINet) to explore a subspace where LR and HR features are highly correlated, so that discriminant features can be extracted in this shared feature space. First, we adopt ResNet as the backbone and impose the softmax loss together with the triplet loss to learn distinguishing features. Second, we impose a KL-divergence loss on the backbone features to minimize the discrepancies between LR and HR features. Finally, we integrate a sparse auto-encoder (SAE) structure to find a subspace that is robust to resolution variations. Experimental results verify the effectiveness of DRINet in improving LR-REID performance and its superiority over state-of-the-art methods.

Tongtong Guo, Jianhuang Lai, Zhanxiang Feng, Zeyu Chen, Xiaohua Xie, Weishi Zheng
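The KL-divergence loss imposed above between LR and HR backbone features can be sketched on softmax-normalized feature vectors (the softmax normalization is an assumption for illustration; the paper may define the distributions differently):

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D vector."""
    e = np.exp(z - z.max())
    return e / e.sum()

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two discrete distributions; zero when p == q."""
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

hr = softmax(np.array([2.0, 1.0, 0.5]))   # high-resolution feature of a person
lr = softmax(np.array([1.9, 1.1, 0.4]))   # low-resolution feature, same person
gap = kl_divergence(hr, lr)               # the quantity minimized during training
```

Driving this gap toward zero is what aligns the LR and HR feature distributions in the shared subspace.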
DHML: Deep Heterogeneous Metric Learning for VIS-NIR Person Re-identification

Narrowing the modality gap between the visible and near-infrared domains in person re-identification (VIS-NIR Re-ID) is a challenging problem. In this paper, we propose deep heterogeneous metric learning (DHML) for VIS-NIR Re-ID. Our method explicitly learns a specific projection transformation for each modality. Furthermore, we design a heterogeneous metric module (HeMM) and embed it in the deep neural network for end-to-end training. HeMM provides supervisory information to the network, essentially eliminating the cross-modal gap in the feature extraction stage rather than applying a post-transformation to the extracted features. We conduct experiments on the SYSU-MM01 dataset, the largest existing VIS-NIR Re-ID dataset. Our method achieves state-of-the-art performance and outperforms existing approaches by a large margin.

Quan Zhang, Haijie Cheng, Jianhuang Lai, Xiaohua Xie
Teager Energy Operator Based Features with x-vector for Replay Attack Detection

Audio replay attacks pose a great threat to Automatic Speaker Verification (ASV) systems. In this paper, we propose a set of features based on the Teager Energy Operator, together with a slightly modified x-vector system, to detect replay attacks. The proposed methods are tested on the ASVspoof 2017 corpus. When using a GMM with the proposed features, our best system achieves an EER of 6.13% on the dev set and 15.53% on the eval set, while the EER of the baseline system (GMM with CQCC) is 30.60% on the eval set. When combined with the modified x-vector, the best EER further drops to 5.57% on the dev subset and 14.21% on the eval subset.

Zhenchuan Zhang, Liming Zhou, Yingchun Yang, Zhaohui Wu
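The discrete Teager Energy Operator underlying the features above has a simple closed form, psi[n] = x[n]^2 - x[n-1]*x[n+1]; for a pure sinusoid of amplitude A and digital frequency w it evaluates to the constant A^2*sin(w)^2 at every sample:

```python
import numpy as np

def teager_energy(x):
    """Discrete Teager Energy Operator: psi[n] = x[n]^2 - x[n-1] * x[n+1].
    Output is two samples shorter than the input."""
    return x[1:-1] ** 2 - x[:-2] * x[2:]

# For x[n] = A * cos(w * n), the operator equals A^2 * sin(w)^2 everywhere.
n = np.arange(1000)
psi = teager_energy(2.0 * np.cos(0.3 * n))
```

This sensitivity to both amplitude and instantaneous frequency is what makes TEO-based features useful for spotting the channel artifacts introduced by replayed audio.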
Video Human Behaviour Recognition Based on Improved SVM_KNN for Traceability of Planting Industry

Multivariate data acquisition is a difficult problem in the traceability of the planting industry. Video-based human behaviour recognition technology can automatically identify various human behaviours in the process of crop planting and realize automatic data collection. This paper proposes a feature extraction method based on the three-dimensional skeleton of the human body and an improved SVM_KNN method to classify human behaviour and realize multi-target, video-based human behaviour recognition. The experimental results show that the proposed method can effectively identify different human behaviours in crop planting.

Wei Ni, Quhua Rao, Dingti Luo
Application of Unscented Kalman Filter in Tracking of Video Moving Target

The tracking of a moving target in video is essentially a state-variable estimation problem. The Kalman filter is one of the classical estimators widely used in the field of state estimation, but in video moving-target tracking systems the classical Kalman filter suffers from low tracking accuracy and filter divergence. To improve the tracking effect, an unscented Kalman filter algorithm is used to track the moving target in a video sequence. The unscented Kalman filter is compared with the Kalman filter for video moving-target tracking in Matlab simulations. The results show that the unscented Kalman filter tracks the moving target more accurately than the Kalman filter.

Qin Guo, Cuixia Zeng, Zhizhao Jiang, Xiaotong Hu, Xiaofei Deng
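The core difference between the unscented and the classical Kalman filter is the unscented transform: the state Gaussian is represented by 2n+1 sigma points that are propagated through the nonlinearity instead of being linearized. A minimal sketch of sigma-point generation (weights and the full predict/update loop are omitted, and kappa is one common scaling choice):

```python
import numpy as np

def sigma_points(mean, cov, kappa=1.0):
    """Generate the 2n+1 sigma points of an n-dimensional Gaussian from
    the columns of the Cholesky factor of (n + kappa) * cov."""
    n = len(mean)
    L = np.linalg.cholesky((n + kappa) * cov)
    pts = [mean]
    for i in range(n):
        pts.append(mean + L[:, i])
        pts.append(mean - L[:, i])
    return np.array(pts)

pts = sigma_points(np.array([1.0, 2.0]), np.eye(2))
```

Because the symmetric pairs cancel, the sigma-point set preserves the original mean exactly, which is why the unscented transform captures nonlinear motion models more faithfully than the Kalman filter's first-order linearization.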
Similarity Scores Based Re-classification for Open-Set Person Re-identification

In this paper, we propose a new similarity-scores-based re-classification method for open-set person re-identification, which exploits information among the top-n most similar matching candidates in the gallery set. Moreover, to let the cross-view quadratic discriminant analysis metric learning method effectively learn both the projection matrix and the metric kernel with open-set data, we introduce an additional regularization factor to adjust the covariance matrix of the obtained subspace. Our experiments on the challenging OPeRID v1.0 database show that our approach improves the Rank-1 recognition rate at 1% FAR by 8.86%, and by 10.51% with re-ranking.

Hongsheng Wang, Yang Yang, Shengcai Liao, Dong Cao, Zhen Lei
The GMM and I-Vector Systems Based on Spoofing Algorithms for Speaker Spoofing Detection

Automatic Speaker Verification (ASV) systems are more vulnerable to attacks, such as speech synthesis, voice conversion, and replay, than other biometric systems. In this paper, two frameworks (Gaussian mixture model based and i-vector based) are used to detect a variety of specific attack types. Three scoring methods (probabilistic normalization, linear regression, and support vector machine) are used for the Gaussian Mixture Model (GMM), and three different classifiers (cosine distance, probabilistic linear discriminant analysis, support vector machine) are used for the i-vector system. Furthermore, a cosine classifier based on the i-vector system using the three scoring methods is proposed in this paper. Experiments on the ASVspoof 2019 logical access scenario show that the GMM classifier with Support Vector Machine (SVM) scoring based on the different spoofing algorithms obtains the best performance on the evaluation set, with an EER of 7.03%. SVM scoring is also useful for improving the i-vector system based on the different spoofing algorithms.

Hui Tang, Zhenchun Lei, Zhongying Huang, Hailin Gan, Kun Yu, Yingen Yang
Feature Enhancement for Joint Human and Head Detection

Human and head detection have rapidly improved with the development of deep convolutional neural networks. However, these two detection tasks are often studied separately, without taking advantage of the relationship between the human body and the head. In this paper, we present a new two-stage detection framework, Joint Enhancement Detection (JED), to simultaneously detect humans and heads based on enhanced features. Specifically, JED contains two newly added modules: the Body Enhancement Module (BEM) and the Head Enhancement Module (HEM). The former enhances the features used for human detection, while the latter enhances the features used for head detection. With these enhanced features in a joint framework, the proposed method detects humans and heads simultaneously and efficiently. We verify the effectiveness of the proposed method on the CrowdHuman dataset and achieve better performance than the baseline method for both human and head detection.

Yongming Zhang, Shifeng Zhang, Chubin Zhuang, Zhen Lei
Backmatter
Metadata
Title
Biometric Recognition
Edited by
Zhenan Sun
Ran He
Jianjiang Feng
Shiguang Shan
Zhenhua Guo
Copyright Year
2019
Electronic ISBN
978-3-030-31456-9
Print ISBN
978-3-030-31455-2
DOI
https://doi.org/10.1007/978-3-030-31456-9