A review of human face forgery and forgery-detection technologies
Cao Shenhao1, Liu Xiaohui2, Mao Xiuqing3, Zou Qin1 (1. School of Computer Science, Wuhan University, Wuhan 430072, China; 2. National Computer Network and Information Security Management Center, Beijing 100029, China; 3. School of Cryptographic Engineering, Information Engineering University, Zhengzhou 450001, China) Abstract
The malicious use of face forgery technology not only infringes on citizens' portrait and reputation rights but also threatens national political and economic security. Research on detection techniques for forged face images and videos is therefore of great practical significance and value. Building on a summary of the key technologies and research progress in face forgery and forged-face detection, this paper analyzes the limitations of existing forgery and detection techniques. On the forgery side, the work covers two categories: generating entirely new faces with generative adversarial techniques and editing existing faces. It reviews the development of generative adversarial networks for face image generation, focuses on the face-swapping and face-reenactment branches of face editing, and examines existing work in depth from the perspectives of network structure, generality, and the realism of the generated results. On the detection side, methods are divided by media carrier into forged face image detection and forged face video detection. The paper first introduces forged-image detection techniques that exploit features such as differences in statistical distributions, splicing residue, and local defects; then, according to the type of forgery features extracted, it classifies forged-video detection into methods based on inter-frame information, intra-frame information, and physiological signals, and elaborates on feature extraction schemes, network design characteristics, and application scenarios. Finally, the paper analyzes the shortcomings of current face forgery and forged-face detection technologies, offers feasible suggestions for improvement, and discusses future research directions.
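To make the GAN-based face generation route concrete, the following is a minimal DCGAN-style generator/discriminator sketch in PyTorch. It is illustrative only: the layer widths and the 64×64 output size are assumptions for demonstration and do not correspond to any specific model surveyed in this paper (e.g., the StyleGAN family is far more elaborate).

```python
# Minimal DCGAN-style sketch of GAN-based face image synthesis (illustrative only).
# All layer sizes and the 64x64 resolution are assumptions, not a surveyed model.
import torch
import torch.nn as nn

class Generator(nn.Module):
    def __init__(self, z_dim: int = 100, base: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(z_dim, base * 8, 4, 1, 0, bias=False),      # 1x1 -> 4x4
            nn.BatchNorm2d(base * 8), nn.ReLU(True),
            nn.ConvTranspose2d(base * 8, base * 4, 4, 2, 1, bias=False),   # 8x8
            nn.BatchNorm2d(base * 4), nn.ReLU(True),
            nn.ConvTranspose2d(base * 4, base * 2, 4, 2, 1, bias=False),   # 16x16
            nn.BatchNorm2d(base * 2), nn.ReLU(True),
            nn.ConvTranspose2d(base * 2, base, 4, 2, 1, bias=False),       # 32x32
            nn.BatchNorm2d(base), nn.ReLU(True),
            nn.ConvTranspose2d(base, 3, 4, 2, 1, bias=False),              # 64x64 RGB face
            nn.Tanh(),
        )

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        return self.net(z.view(z.size(0), -1, 1, 1))

class Discriminator(nn.Module):
    def __init__(self, base: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, base, 4, 2, 1), nn.LeakyReLU(0.2, True),             # 32x32
            nn.Conv2d(base, base * 2, 4, 2, 1), nn.LeakyReLU(0.2, True),      # 16x16
            nn.Conv2d(base * 2, base * 4, 4, 2, 1), nn.LeakyReLU(0.2, True),  # 8x8
            nn.Conv2d(base * 4, base * 8, 4, 2, 1), nn.LeakyReLU(0.2, True),  # 4x4
            nn.Conv2d(base * 8, 1, 4, 1, 0),                                  # real/fake logit
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x).view(-1)

if __name__ == "__main__":
    g, d = Generator(), Discriminator()
    fake = g(torch.randn(2, 100))       # two synthetic 3x64x64 faces from random noise
    print(fake.shape, d(fake).shape)    # torch.Size([2, 3, 64, 64]) torch.Size([2])
```

The adversarial game between these two networks, trained with a suitable loss, is the common core of the generation methods discussed below; the surveyed systems mainly differ in how they structure the generator and constrain the latent space.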
Keywords
A review of human face forgery and forgery-detection technologies
Cao Shenhao1, Liu Xiaohui2, Mao Xiuqing3, Zou Qin1 (1. School of Computer Science, Wuhan University, Wuhan 430072, China; 2. National Computer Network and Information Security Management Center, Beijing 100029, China; 3. School of Cryptographic Engineering, Information Engineering University, Zhengzhou 450001, China) Abstract
Face image synthesis is one of the most important sub-topics in image synthesis. Deep learning methods such as generative adversarial networks (GANs) and autoencoders now enable the generation of facial images that are indistinguishable to the human eye. The illegal use of face forgery technology damages citizens' portrait and reputation rights and threatens national political and economic security. Building on a summary of the key technologies and a critical review of face forgery and forged-face detection, this survey analyzes the limitations of current forgery and detection technologies and is intended to serve as a reference for subsequent research on fake-face detection. Our analysis is organized as follows.

1) Face forgery technologies fall into two categories: generating entirely new faces with generative adversarial techniques, and editing existing faces. The review first traces the development of generative adversarial networks and their application to face image generation, shows face images produced at different stages of this development, and notes that GANs make it possible to generate fake face images with high resolution, realistic appearance, diverse styles, and fine details. It then introduces face editing technologies such as face swapping and face reenactment, together with their current open-source implementations, from the perspectives of network structure, versatility, and the authenticity of the generated images. In particular, both face swapping and face reenactment decompose a face into an appearance space and an attribute space, design different network structures and loss functions to transfer the targeted features, and employ a generative adversarial network to improve the realism of the generated results.

2) Fake-face detection technologies, according to the media carrier, can be divided into fake face image detection and fake face video detection. The review first details methods that use statistical distribution differences, splicing residue, local defects, and other features to identify fake face images produced directly by generative adversarial networks or by face editing. Next, according to the type of forged features extracted, fake face video detection is classified into methods based on inter-frame information, intra-frame information, and physiological signals; their feature extraction schemes, network architectures, and application scenarios are described in detail. Current fake image detection mainly uses convolutional neural networks to extract forgery features and to localize and detect forged regions simultaneously, while fake video detection mainly combines convolutional neural networks with recurrent neural networks to extract forgery features within and across frames. The public datasets for fake-face detection are then compiled, and the results of multiple detection methods on these datasets are compared.

3) The summary and outlook analyze the weaknesses of current face forgery and forged-face detection technologies and give feasible directions for improvement.
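As an illustration of the CNN-plus-RNN detection design described in point 2), the sketch below pairs a per-frame ResNet-18 feature extractor with an LSTM over the frame sequence. The backbone choice, hidden size, and clip length are assumptions for illustration, not the exact architectures of the surveyed detectors.

```python
# Illustrative CNN + RNN pipeline for fake-video detection: per-frame CNN features
# followed by temporal modeling with an LSTM. Sizes and backbone are assumptions.
import torch
import torch.nn as nn
from torchvision import models

class CnnRnnDetector(nn.Module):
    def __init__(self, hidden: int = 256):
        super().__init__()
        backbone = models.resnet18()                                # per-frame feature extractor
        self.cnn = nn.Sequential(*list(backbone.children())[:-1])   # drop the final fc layer
        self.rnn = nn.LSTM(input_size=512, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)                            # real / fake logit

    def forward(self, clips: torch.Tensor) -> torch.Tensor:
        # clips: (batch, frames, 3, 224, 224)
        b, t = clips.shape[:2]
        feats = self.cnn(clips.flatten(0, 1)).flatten(1)   # (b*t, 512) intra-frame features
        feats = feats.view(b, t, -1)                        # (b, t, 512) frame sequence
        _, (h, _) = self.rnn(feats)                         # inter-frame aggregation
        return self.head(h[-1]).squeeze(-1)                 # (b,) per-clip logits

if __name__ == "__main__":
    model = CnnRnnDetector()
    logits = model(torch.randn(2, 8, 3, 224, 224))  # two clips of 8 frames each
    print(logits.shape)                             # torch.Size([2])
```

The convolutional branch captures intra-frame forgery traces, while the recurrent branch captures inter-frame inconsistencies, mirroring the division of forged-video detection methods discussed above.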
Current face video forgery technology mainly modifies the face region locally and has the following defects: individual frames contain forgery traces, such as blurred profile views and missing texture details in facial parts; the correlation between frames is not considered, so the generated frames are inconsistent, with frame jumps and large displacements of facial landmarks between consecutive frames; and the generated face videos lack normal biological signals, such as blinking and micro-expressions. Current forgery-detection technologies generalize poorly to real-world scenes and are not robust to image and video compression; detectors trained on high-resolution datasets do not transfer well to low-resolution images and videos, and detection struggles to keep pace with the continuous upgrading and evolution of forgery techniques. Feasible improvements are suggested for both sides. For forgery, incorporating facial landmark position information into the network when generating videos could improve the coherence of the generated video. For detection, forgery features from the spatial and frequency domains can be fused during feature extraction, and 3D convolution and metric learning can be used to shape feature distributions that separate forged faces from genuine ones. Face forgery is developing toward few-shot learning, strong versatility, and high fidelity, while forged-face detection is moving toward high versatility, strong robustness to compression, few-shot learning, and efficient computation.
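A minimal sketch of the suggested spatial/frequency fusion is given below, assuming a simple two-branch network whose second branch sees the log-amplitude FFT spectrum of the face. The branch architectures and feature dimensions are illustrative assumptions, not a prescribed design.

```python
# Sketch of spatial + frequency feature fusion for forgery detection:
# one branch sees the RGB face, the other its log-amplitude spectrum.
import torch
import torch.nn as nn

def log_spectrum(x: torch.Tensor) -> torch.Tensor:
    # x: (b, 3, h, w) -> per-channel log-amplitude of the centered 2-D FFT
    spec = torch.fft.fftshift(torch.fft.fft2(x), dim=(-2, -1))
    return torch.log1p(spec.abs())

def small_cnn(out_dim: int = 128) -> nn.Sequential:
    # tiny illustrative branch; real detectors would use deeper backbones
    return nn.Sequential(
        nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(True),
        nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(True),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        nn.Linear(64, out_dim), nn.ReLU(True),
    )

class DualDomainDetector(nn.Module):
    def __init__(self):
        super().__init__()
        self.spatial = small_cnn()     # spatial-domain forgery features
        self.frequency = small_cnn()   # frequency-domain forgery features
        self.head = nn.Linear(256, 1)  # fused features -> real/fake logit

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        fused = torch.cat([self.spatial(x), self.frequency(log_spectrum(x))], dim=1)
        return self.head(fused).squeeze(-1)

if __name__ == "__main__":
    model = DualDomainDetector()
    print(model(torch.randn(2, 3, 128, 128)).shape)  # torch.Size([2])
```

The fused 256-dimensional representation could further be trained with a metric-learning objective (e.g., a triplet or contrastive loss) so that genuine and forged faces form well-separated clusters, in line with the improvement direction noted above.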
Keywords