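Frontalization morphing field learning for large pose face recognition
Abstract
Objective Face recognition is widely deployed, yet recognition under large pose variation remains an unsolved problem. Existing methods either extract pose-robust features or frontalize the face. Mainstream face frontalization methods fall into two groups, 2D regression-based generation and 3D model-based morphing: the former produces relatively natural and realistic faces but introduces extra noise that distorts image information; the latter preserves the original facial structure, but its generation process relies on a physical model and is therefore less natural and flexible. To combine the strengths of the 2D and 3D approaches, this paper proposes a face frontalization method based on a coarse-to-fine morphing field. Method The morphing field is learned by a deep network through 2D regression and encodes the semantic-level correspondences between pixels of face images under different views, so a non-frontal face image can be frontalized in a 3D-like manner. The method therefore has both the flexibility of 2D frontalization and the fidelity of 3D frontalization. Following a stepwise, progressive strategy, a coarse-to-fine morphing field learning framework is proposed to obtain a more accurate and robust morphing field. Result Large pose face recognition experiments are used to verify the effectiveness of the method. On four datasets, MultiPIE (multi pose, illumination, expressions), LFW (labeled faces in the wild), CFP (celebrities in frontal-profile in the wild), and IJB-A (intelligence advanced research projects activity Janus benchmark-A), the method achieves higher face recognition accuracy than existing methods. Conclusion The proposed face frontalization method based on coarse-to-fine morphing field learning combines the advantages of 2D and 3D face frontalization methods, makes the learning of frontalized faces more flexible and accurate, and preserves more of the identity information that benefits recognition.
Keywords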
Large pose face recognition with morphing field learning
Hu Lanqing1,2, Kan Meina1,2, Shan Shiguang1,2, Chen Xilin1,2 (1. Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China; 2.
Abstract
Objective Face recognition remains challenging under large variations in pose, expression, aging, lighting, and occlusion. Among these factors, pose variation tends to cause large non-planar face transformations. To address pose variation, previous methods mainly attempt either to extract pose-invariant features or to frontalize non-frontal faces. Frontalization methods ease discriminative feature learning by eliminating the pose variation. There are two main kinds of face frontalization methods: 2D and 3D. 2D methods can generate more natural frontal faces but may lose facial structural information, which is the key factor in identity discrimination. 3D methods preserve facial structural information well but are less flexible. In summary, both 3D and 2D methods lose information in the frontalized faces, especially under large pose variations, for example through invisible pixels in the 3D morphable model or pixel distortion in 2D generative methods. Method We propose a novel coarse-to-fine morphing field network (CFMF-Net) that combines 2D and 3D face transformation methods: it frontalizes a non-frontal face image with a coarse-to-fine optimized morphing field that shifts each pixel. By combining the flexibility of 2D learning-based methods with the structure preservation of 3D morphable model-based methods, the proposed morphing field learning makes the learning process easier and reduces the probability of over-fitting. First, a coarse morphing field is learned to capture the major structural variation of a single face image. Then, a residual module extracts facial information to refine the coarse morphing field; its output is concatenated with the coarse morphing field to generate the final fine morphing field for the input face image. The overall framework regresses pixel correspondences rather than pixel values.
This design ensures that all pixels in the frontalized face image are taken from the input non-frontal image, which greatly reduces information distortion. The identity information of the input non-frontal face image is therefore well preserved, with favorable visual results, which in turn facilitates the subsequent face recognition task. To obtain a more accurate morphing field, the design combines coarse-to-fine morphing field learning, which ensures the robustness of the learned field, with a residual complementing branch. Result To verify the effectiveness of the proposed method, extensive experiments are carried out on the multi pose, illumination, expressions (MultiPIE), labeled faces in the wild (LFW), celebrities in frontal-profile in the wild (CFP), and intelligence advanced research projects activity Janus benchmark-A (IJB-A) datasets, and the results are compared with those of other face transformation methods. Among these test sets, MultiPIE, CFP, and IJB-A all cover full pose variation; IJB-A additionally contains other complicated variations such as low resolution and occlusion. The experiments follow the same training and testing protocol as previous works, i.e., training with both original and frontalized face images. For a fair comparison, the commonly used LightCNN-29 is adopted as the recognition model. Our method outperforms related works on the large pose testing protocols of MultiPIE and CFP and achieves comparable performance on LFW and IJB-A. The visualization results also show that our method preserves identity information well, and the ablation study supports the feasibility of the coarse-to-fine framework in CFMF-Net. In summary, the recognition accuracies and visualization results demonstrate that the proposed CFMF-Net generates frontalized faces that preserve identity information and achieves higher large pose face recognition accuracy.
Conclusion The proposed coarse-to-fine morphing field learning framework frontalizes face images by shifting pixels, which provides both flexible learnability and identity information preservation. The flexible learnability allows the network to optimize the face frontalization objective without predefined 3D transformation rules, which improves accuracy. Moreover, because a morphing field is learned for each pixel, the output frontal face is composed only of pixels shifted from the input image, which reduces information loss. In addition, the coarse-to-fine design and the residual architecture further improve the robustness and accuracy of the results.
Keywords
large pose face recognition; face frontalization; morphing field learning; coarse-to-fine learning; fully convolutional network
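
The pixel-shifting idea in the Method paragraph can be illustrated with a minimal sketch. It is not the CFMF-Net implementation: the function names (`bilinear_sample`, `warp`, `refine`), the use of bilinear sampling with border clamping, and the additive combination of a coarse field with a residual correction are all assumptions made for illustration (the abstract states only that the residual module's output is concatenated with the coarse field to produce the fine field). The fields are hand-written here, whereas in the paper they are regressed by a network. What the sketch does show is that every output pixel is read from the input image, so no pixel values are generated from scratch.

```python
import math


def bilinear_sample(image, x, y):
    """Sample a grayscale image (list of rows) at a real-valued location.

    Coordinates are clamped to the image border (an assumption of this sketch).
    """
    h, w = len(image), len(image[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    ax, ay = x - x0, y - y0
    top = (1 - ax) * image[y0][x0] + ax * image[y0][x1]
    bottom = (1 - ax) * image[y1][x0] + ax * image[y1][x1]
    return (1 - ay) * top + ay * bottom


def warp(image, field):
    """Apply a morphing field: field[y][x] = (dx, dy) means that output
    pixel (x, y) is read from input location (x + dx, y + dy)."""
    return [
        [bilinear_sample(image, x + dx, y + dy) for x, (dx, dy) in enumerate(row)]
        for y, row in enumerate(field)
    ]


def refine(coarse, residual):
    """Hypothetical refinement step: add a residual correction to the
    coarse field, offset by offset, to obtain the fine field."""
    return [
        [(cx + rx, cy + ry) for (cx, cy), (rx, ry) in zip(crow, rrow)]
        for crow, rrow in zip(coarse, residual)
    ]


# Toy 2x3 "image" and hand-written fields (a network would predict these).
image = [[10, 20, 30],
         [40, 50, 60]]
coarse = [[(1.0, 0.0)] * 3 for _ in range(2)]      # read one pixel to the right
residual = [[(-0.5, 0.0)] * 3 for _ in range(2)]   # small correction
fine = refine(coarse, residual)

coarse_result = warp(image, coarse)  # [[20, 30, 30], [50, 60, 60]]
fine_result = warp(image, fine)      # [[15, 25, 30], [45, 55, 60]]
```

Because the output is sampled from the input, every value in `fine_result` lies within the range of the input pixel values, which is the sense in which such a warp avoids introducing new pixel content.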