27-04-2018 | Issue 3/2019

Enhancing Face Recognition from Massive Weakly Labeled Data of New Domains
- Journal: Neural Processing Letters, Issue 3/2019
Important notes
This work is supported by NSF of China under Grants 61672548, U1611461, and the Guangzhou Science and Technology Program, China, under Grant 201510010165.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Abstract
Training data are critical in face recognition systems, but labeling a large-scale dataset for a particular domain requires considerable manpower. Without a dataset related to the target face recognition domain, existing public datasets alone cannot yield a strong face recognition model. In this paper, we propose a semi-supervised method that automatically constructs, from massive weakly labeled data, a strong dataset on which a model can be trained to achieve better performance on the target domain. In the case of Asian face recognition, a well-trained VRCN model by CASIA, which achieves 98.63% on LFW and 91.76% on YTF, achieves only an 88.53% recognition rate on our test dataset of Asian faces. We collect 530,560 weakly labeled Asian face images of 7962 identities and obtain a cleaned dataset of 285,933 images. A model trained on the cleaned dataset with the VRCN network and the same training strategy achieves a 95.33% recognition rate on the Asian face test dataset (an improvement of 6.8 percentage points).
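The figures quoted in the abstract can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of the paper: the constant and function names are hypothetical, and the relative error reduction is a derived quantity that the abstract does not report.

```python
# Figures taken directly from the abstract.
BASELINE_ACC = 88.53      # % recognition rate of the baseline VRCN model on the Asian test set
CLEANED_ACC = 95.33       # % recognition rate after training on the cleaned dataset
RAW_IMAGES = 530_560      # weakly labeled images collected
RAW_IDENTITIES = 7962     # identities in the weakly labeled collection
CLEANED_IMAGES = 285_933  # images remaining after cleaning


def summarize():
    """Return derived statistics (hypothetical helper, for illustration only)."""
    gain_points = CLEANED_ACC - BASELINE_ACC
    baseline_err = 100.0 - BASELINE_ACC
    cleaned_err = 100.0 - CLEANED_ACC
    return {
        # Absolute improvement in percentage points.
        "gain_points": gain_points,
        # Fraction of the baseline error removed (derived, not stated in the abstract).
        "relative_error_reduction": (baseline_err - cleaned_err) / baseline_err,
        # Fraction of the collected images kept by the cleaning step.
        "retained_fraction": CLEANED_IMAGES / RAW_IMAGES,
        # Average images per identity before cleaning.
        "raw_images_per_identity": RAW_IMAGES / RAW_IDENTITIES,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:.4f}")
```

Under these numbers, the cleaning step keeps roughly 54% of the collected images, and the 6.8-point gain corresponds to removing about 59% of the baseline error on the Asian face test set.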