A survey of passive forensics and anti-forensics techniques for GAN-generated images
Abstract
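Generative adversarial networks (GANs) have developed rapidly and have been applied successfully to image generation, image editing and related tasks. However, if these techniques are used to forge identities or fabricate news, they create serious security risks. Researchers in multimedia forensics have proposed a variety of passive forensics and anti-forensics methods for GAN-generated images, but a systematic survey of this work is still lacking. To address this, this paper first describes the research background and significance of the field and then analyzes the differences between natural image acquisition and GAN image generation. Building on this theoretical basis, it introduces existing passive forensics techniques for GAN-generated images in detail, including GAN-generated image detection algorithms, GAN model identification algorithms and other related forensics problems. It also introduces GAN-based anti-forensics techniques for different application scenarios. Finally, it analyzes experimentally the challenges that current passive forensics techniques for GAN-generated images face. From the theoretical and experimental analysis of existing techniques, the paper draws the following conclusions. At present, passive forensics techniques for GAN-generated images have formed distinct technical routes in the spatial and frequency domains and handle the relevant forensics problems well in simple scenarios. GAN-based anti-forensics techniques can already hide common forensic traces effectively. However, research in this field still has many limitations: 1) forensics and anti-forensics techniques lack interpretability; 2) forensics techniques have weak robustness and generalization; 3) anti-forensics techniques lack the ability to resist analysis jointly across multiple feature domains. These problems and challenges require further in-depth research.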
Keywords
Overview of passive forensics and anti-forensics techniques for GAN-generated image
He Peisong1, Li Weichuang1, Zhang Jingyuan1, Wang Hongxia1, Jiang Xinghao2
(1. School of Cyber Science and Engineering, Sichuan University, Chengdu 610065, China; 2. School of Cyber Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240, China)
Abstract
Generative adversarial networks (GANs) have advanced multimedia techniques such as image generation and image editing. However, abusing GANs to create fake identities and fake news can cause severe security risks and poses a great threat to the integrity and authenticity of digital images. Researchers in multimedia forensics have proposed a variety of passive forensics and anti-forensics techniques for GAN-generated images and have achieved initial results. This article systematically reviews the latest passive forensics and anti-forensics techniques for GAN-generated images. First, it presents the research background and significance of forensics and anti-forensics for GAN-generated images: this research can provide new theory for protecting the integrity and authenticity of digital images against recent deep-learning-based image generation and editing techniques, and for assessing the reliability of current forensic algorithms. It describes the network structures and training strategies of several representative unconditional and conditional GANs used in recent image generation and editing techniques, in which the visual defects of early GAN models have been progressively eliminated. Next, it analyzes the differences between natural image acquisition and GAN image generation, including the signal processing operations involved. The abnormal traces in GAN images are explained in terms of color components, texture characteristics and global content. On this theoretical foundation, the article introduces current passive forensics techniques for GAN-generated images in detail, covering GAN-generated image detection algorithms, GAN model identification algorithms and other related forensics issues.
According to the type of forensic clue, GAN-generated image detection algorithms fall into two categories: those based on spatial information and those based on frequency-domain information. The article introduces the feature extraction and classification methods applied to forensic traces, including methods based on hand-crafted features and methods based on convolutional neural networks (CNNs). In particular, it traces the development of CNN-based detection algorithms, including preprocessing informed by prior knowledge of forensic traces, advanced network structures and other training strategies. Experimental results show that existing methods achieve promising detection performance in simple forensic scenarios. However, when test samples have undergone post-processing or were generated by unseen GAN models, detection performance degrades dramatically. The article then presents GAN model identification algorithms, which can likewise be divided into spatial and frequency-domain methods. It also describes other related forensics issues, such as DeepFake detection, in which GAN-based image generation can be regarded as one part of the deep-learning-based video forgery pipeline. In addition, the article introduces GAN-based anti-forensics techniques, which comprise white-box and black-box anti-forensics (attack) methods according to the information the attack requires. Existing methods mainly address two anti-forensics scenarios: source identification and detection of image editing. Finally, several representative algorithms based on spatial and frequency-domain information are selected for performance comparison. Open-access datasets of GAN-generated and pristine images are used to construct the training and test samples.
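As a rough illustration of the frequency-domain clues mentioned above, the sketch below measures how much of an image's spectral energy lies at high frequencies. It is a toy, pure-Python example and not any of the surveyed detectors: the function names `dft2_magnitude` and `high_freq_ratio`, the 0.25 cycles/pixel cutoff and the synthetic checkerboard pattern standing in for a periodic generation artifact are all assumptions made for illustration.

```python
import cmath
import math


def dft2_magnitude(img):
    """Return |F(u, v)| for a small 2-D grayscale image given as a list of rows."""
    h, w = len(img), len(img[0])
    # Separable transform: 1-D DFT along each row, then along each column.
    rows = [[sum(img[y][x] * cmath.exp(-2j * math.pi * u * x / w) for x in range(w))
             for u in range(w)] for y in range(h)]
    return [[abs(sum(rows[y][u] * cmath.exp(-2j * math.pi * v * y / h) for y in range(h)))
             for u in range(w)] for v in range(h)]


def high_freq_ratio(img, cutoff=0.25):
    """Fraction of non-DC spectral energy at or above `cutoff` cycles/pixel."""
    h, w = len(img), len(img[0])
    mag = dft2_magnitude(img)
    high = total = 0.0
    for v in range(h):
        for u in range(w):
            if u == 0 and v == 0:
                continue  # skip the DC term (mean brightness)
            # Fold indices so that frequencies lie in [0, 0.5] cycles/pixel.
            fu = min(u, w - u) / w
            fv = min(v, h - v) / h
            energy = mag[v][u] ** 2
            total += energy
            if max(fu, fv) >= cutoff:
                high += energy
    return high / total if total else 0.0


# Toy 8x8 images (assumed data): a smooth pattern, and the same pattern with a
# faint checkerboard added to mimic a periodic high-frequency artifact.
smooth = [[128 + 50 * math.cos(2 * math.pi * x / 8) for x in range(8)] for y in range(8)]
artifact = [[smooth[y][x] + 10 * (-1) ** (x + y) for x in range(8)] for y in range(8)]

print(high_freq_ratio(smooth))    # close to 0
print(high_freq_ratio(artifact))  # noticeably larger
```

A scalar like this could serve as one hand-crafted feature for a classifier. A practical system would use a fast FFT routine on full-size images and richer features than a single ratio.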
The experiments investigate the challenges facing passive forensics of GAN-generated images, including robustness against post-processing operations and generalization to unknown GAN models. The influence of different preprocessing operations on input images, including resizing and cropping, is also investigated. Moreover, a potential adversarial situation in practical applications is considered, in which GAN-generated image detection algorithms are applied to images produced by GAN-based anti-forensics. Experimental results show that current passive forensics methods fail to expose unseen anti-forensic attacks.
At present, passive forensics techniques for GAN-generated images have established various technical routes in both the spatial domain and the frequency domain, and they can handle GAN-generated image detection and GAN model identification in simple scenarios. In addition, GAN-based anti-forensics techniques can effectively hide common forensic traces. However, based on our analysis of related work and our experiments, we consider research in this field to be at an early stage, with many unsolved issues. 1) Insufficient interpretability of forensics and anti-forensics techniques. It is hard to analyze which kind of information (local or global; texture or color) plays the more important role in identifying GAN-generated images, while the security of current GAN-based anti-forensics techniques lacks theoretical support. 2) Weak robustness and generalization of forensics techniques. It is worth exploring other detection frameworks, such as anomaly detection, which may cope better with continuously updated GAN models and unknown post-processing operations in practical applications. 3) Lack of anti-forensics capability across multiple feature domains. The design of network structures, loss functions and training strategies for anti-forensics techniques that also hide newly introduced traces should be explored more carefully.
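The anomaly detection direction named in item 2) can be sketched as a one-class model fitted on features of real images only, so that no samples from any particular GAN are needed at training time. The code below is a minimal illustration under stated assumptions, not a method from the surveyed literature: the helper names, the two-dimensional feature vectors, their values and the threshold of 3.0 are all hypothetical.

```python
import math
import statistics


def fit_real_model(real_features):
    """Estimate per-dimension mean and standard deviation from real images only."""
    columns = list(zip(*real_features))
    means = [statistics.fmean(c) for c in columns]
    stds = [statistics.stdev(c) for c in columns]
    return means, stds


def anomaly_score(feature, model, eps=1e-12):
    """Distance from the real-image statistics, in standard deviations
    (a Mahalanobis distance with a diagonal covariance)."""
    means, stds = model
    return math.sqrt(sum(((x - m) / (s + eps)) ** 2
                         for x, m, s in zip(feature, means, stds)))


def is_suspicious(feature, model, threshold=3.0):
    """Flag a sample whose features lie far from those of real images."""
    return anomaly_score(feature, model) > threshold


# Hypothetical feature vectors extracted from real (pristine) images.
real = [[0.10, 1.0], [0.12, 1.1], [0.09, 0.9], [0.11, 1.0]]
model = fit_real_model(real)

print(is_suspicious([0.105, 1.0], model))  # False: close to the real statistics
print(is_suspicious([0.50, 1.0], model))   # True: first feature is far out of range
```

Because the model never sees generated images, it makes no assumption about which GAN produced a test sample. The trade-off is that it depends entirely on how well the chosen features separate real images from everything else.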
Keywords
digital image forensics; anti-forensics; generative adversarial network (GAN); convolutional neural network (CNN); image generation