A large-scale dataset for face inpainting tampering detection

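Li Wei1,2,3, Huang Tianqiang1,2,3, Huang Liqing1,2,3, Zheng Aokun1,2, Xu Chao1,2 (1. College of Computer and Network Space Security, Fujian Normal University, Fuzhou 350117; 2. Fujian Provincial Engineering Research Center for Public Service Big Data Mining and Application, Fuzhou 350117; 3. Digital Fujian Big Data Security Technology Institute, Fuzhou 350117)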

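Abstract
Objective With the continued development of computer vision and the maturing of deep learning, image synthesis methods have enriched people's lives. However, maliciously tampered images used to spread false information can cause great harm to society and make people doubt the authenticity of digital content in image media. Face editing, a common image tampering technique, forges faces by modifying facial features. Image inpainting is one of the techniques commonly used for face editing, and its use for facial forgery likewise disrupts people's lives. To provide data support for research on detecting this kind of tampering, this paper builds a large-scale dataset for face inpainting tampering detection.
Method Source datasets of different quality are selected (the high-quality face image dataset CelebA-HQ and the low-quality face video dataset FF++), and the facial feature regions are segmented with an image segmentation method. Two deep-network-based inpainting methods, CTSDG (image inpainting via conditional texture and structure dual generation) and RFR (recurrent feature reasoning for image inpainting), together with one traditional inpainting method, SC (struct completion), are then used to generate a large-scale inpainted image dataset totaling 600 000 images.
Result Experimental results show that, for images generated from the FF++ dataset, detection accuracy drops by 15% with the baseline detection network ResNet-50 and by 5% with Xception-Net. Detection accuracy also differs considerably across facial parts; the eye region has the lowest detection accuracy, at 0.91. Generalization experiments show that data generated from the same source dataset exhibit some generalization between inpainted images of different facial parts, whereas datasets built from different source data show almost no generalization. The dataset can therefore also provide data for studying the generalization of inpainted images across different datasets, different inpainting methods, and different facial parts.
Conclusion Tampering based on image inpainting can fool tampering detectors to some extent, so research on detecting it has practical significance. The large inpainting-based face tampering dataset provided here offers a new data source for the field, enriches data diversity, and provides a strong benchmark for in-depth research on this type of face tampering and its detection. The dataset is openly available at https://pan.baidu.com/s/1-9HIBya9X-geNDe5zcJldw?pwd=thli.
Keywords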
Large-scale datasets for facial tampering detection with inpainting techniques

Li Wei1,2,3, Huang Tianqiang1,2,3, Huang Liqing1,2,3, Zheng Aokun1,2, Xu Chao1,2 (1. College of Computer and Network Space Security, Fujian Normal University, Fuzhou 350117, China; 2. Fujian Provincial Engineering Research Center for Public Service Big Data Mining and Application, Fuzhou 350117, China; 3. Digital Fujian Big Data Security Technology Institute, Fuzhou 350117, China)

Abstract
Objective DeepFake technology, which emerged as deep learning techniques matured, primarily uses neural networks to create synthetic faces. It has enriched people's lives as computer vision has advanced: it has transformed the film industry by generating striking visuals and reducing production costs, and in the gaming industry it has facilitated the creation of smooth and realistic animation effects. However, the malicious use of image manipulation to spread false information poses significant risks to society and casts doubt on the authenticity of digital content in visual media. Forgery techniques fall into four main categories: face reenactment, face replacement, face editing, and face synthesis. Face editing, a commonly used image manipulation method, forges faces by modifying facial features. Image inpainting, one of the techniques commonly used in face editing, uses the known content of an image to fill in missing areas so that the restored image matches human perception as closely as possible. In facial forgery, image inpainting is primarily used for identity falsification, in which facial features are altered to achieve the effect of replacing a face. The use of image inpainting for facial manipulation likewise disrupts people's lives. To support research on detection methods for such manipulations, this paper builds a large-scale dataset for face manipulation detection based on inpainting techniques.
Method This paper focuses on image tampering detection and uses two classic datasets: the high-quality CelebA-HQ dataset, comprising 25 000 high-resolution (1 024×1 024 pixels) celebrity face images, and the low-quality FF++ dataset, consisting of 15 000 face images extracted from video frames. On the basis of these two datasets, facial feature regions (eyebrows, eyes, nose, mouth, and the entire facial area) are segmented using image segmentation methods. Corresponding mask images are created, and the segmented facial regions are obscured directly on the original image. Two deep neural network-based inpainting methods, image inpainting via conditional texture and structure dual generation (CTSDG) and recurrent feature reasoning for image inpainting (RFR), are employed along with a traditional inpainting method, struct completion (SC). The deep neural network methods require mask images that indicate the areas to inpaint, whereas the traditional method can inpaint the segmented facial feature images directly. The facial regions are inpainted with these three methods, resulting in a large-scale dataset of 600 000 images. The dataset incorporates diverse pre-processing techniques and inpainting methods, and includes images of different quality and with different inpainted facial regions. It can be used for training and testing in related detection tasks and establishes a benchmark dataset for future studies of face tampering detection.
Result We present comparative experiments conducted on the generated dataset. For images derived from the FF++ dataset, detection accuracy decreases by 15% under the ResNet-50 benchmark detection network and by 5% under the Xception-Net network. Detection accuracy also varies significantly among facial regions, with the lowest accuracy, 0.91, recorded for the eye region. Generalization experiments suggest that inpainted images from the same source dataset exhibit a certain degree of generalization across different facial regions. In contrast, minimal generalization is observed among datasets created from different source data. Consequently, this dataset also serves as research data for studying the generalization of inpainted images across different facial regions. Visualization tools show that the detection network focuses on the inpainted facial features, confirming that it attends to the manipulated facial regions. This work provides new research perspectives for methods of detecting inpainting-based manipulations.
Conclusion Tampering based on image inpainting can deceive conventional tampering detectors to a certain extent, so research on detection methods for this type of tampering is of practical significance. The provided large-scale face tampering dataset, based on inpainting techniques, covers high- and low-quality images, three distinct inpainting methods, and various facial features. It offers a new source of data for research in this field, increases data diversity, and provides benchmark data for further exploration of inpainting-related forgeries. Given the scarcity of relevant datasets in this domain, we propose this dataset as a benchmark for image inpainting tampering detection. It supports research on detection methods as well as studies of how well such methods generalize, and it fills a gap in the available datasets. The benchmark includes a large-scale inpainted image dataset totaling 600 000 images. The quality of the dataset is evaluated in terms of accuracy on manipulation detection networks, generalizability across different inpainting networks and facial regions, and data visualization.
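The masking step described in the Method paragraph (segment a facial region, create a mask image, and obscure that region on the original image) can be illustrated with a minimal sketch. This is not the authors' code: the label ids in `REGION_LABELS`, the function names `make_mask` and `obscure`, and the white fill value are all hypothetical, and a real pipeline would take the label map from a face segmentation model rather than a hand-written list.

```python
# Hypothetical label ids for illustration only; a real face segmentation
# model defines its own label scheme.
REGION_LABELS = {
    "eyebrows": {1},
    "eyes": {2},
    "nose": {3},
    "mouth": {4},
}


def make_mask(label_map, region):
    """Return a binary mask (1 = pixel to inpaint) for one facial region."""
    wanted = REGION_LABELS[region]
    return [[1 if label in wanted else 0 for label in row] for row in label_map]


def obscure(image, mask, fill=255):
    """Return a copy of a grayscale image with masked pixels set to `fill`."""
    return [
        [fill if m else pixel for pixel, m in zip(img_row, mask_row)]
        for img_row, mask_row in zip(image, mask)
    ]


if __name__ == "__main__":
    # Toy 3x4 segmentation output and grayscale image.
    label_map = [
        [0, 2, 2, 0],
        [0, 3, 3, 0],
        [0, 4, 4, 0],
    ]
    image = [
        [10, 20, 30, 40],
        [50, 60, 70, 80],
        [90, 100, 110, 120],
    ]
    mask = make_mask(label_map, "eyes")
    print(mask)
    print(obscure(image, mask))
```

Under these assumptions, the mask and the obscured image together form the kind of input pair that a mask-guided inpainting network expects, while a method that works directly on the obscured image would need only the second output.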
Keywords
