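Sparse adversarial patch attack under a QR code mask
Abstract
Objective Traditional adversarial attacks based on adversarial patches usually concentrate a large amount of perturbation in the masked region of an image. Generating imperceptible perturbations is very difficult with such methods, and to a human observer an adversarial patch is merely redundant dense noise, which greatly reduces its deceptiveness. By contrast, QR codes are widely used in images and can themselves carry additional information, so they are more deceptive when used as adversarial patches. Against this background, this paper proposes an adversarial patch attack method based on a QR code mask. Method We first obtain the target model's prediction for the input image and, to improve the efficiency of the non-targeted attack, set a pseudo-target label. We compute gradient noise that moves away from the original label while moving toward the pseudo-target label, and construct a mask that confines the perturbation noise to the colored regions of the QR code. In addition, we use an Lp-Box-based alternating direction method of multipliers (ADMM) algorithm to optimize the sparsity of the added perturbation points, so that a high attack success rate is achieved without the information originally carried by the QR code being destroyed by dense, high-magnitude perturbations. The result is an adversarial patch that humans do not notice. Result Comparative experiments were conducted on the ImageNet dataset with the Inception-v3 and ResNet-50 (residual networks-50) models. The results show that, in the non-targeted attack scenario, the attack success rate of the proposed method is 8.6%, 14.6%, and 4.6% higher than that of the L∞-based fast gradient sign method (FGSM), DeepFool, and projected gradient descent (PGD), respectively. The sparsity of the adversarial perturbation (L0) and the perturbation magnitude under the L2, L1, and L∞ norms also achieve excellent results compared with current representative attack methods. For the quantitative similarity between adversarial examples and the original images, compared with FGSM, the proposed method improves the peak signal-to-noise ratio (PSNR) by 4.82 dB and the erreur relative globale adimensionnelle de synthèse (ERGAS) by 576.3, and achieves genuinely concealed noise in the visual results. Moreover, when facing several advanced defense algorithms, the proposed method still maintains a 100% attack success rate, showing high robustness. Conclusion The proposed adversarial patch attack based on a QR code mask is more plausible in real-world attack scenarios. It also uses a sparsity algorithm to protect the information carried by the QR code itself, thereby generating more deceptive adversarial examples and offering a new direction for research on highly concealed adversarial patches.
Keywords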
Sparse adversarial patch attack based on QR code mask
Ye Yixuan1, Du Xia1, Chen Si1, Zhu Shunzhi1, Yan Yan2
(1. School of Computer and Information Engineering, Xiamen University of Technology, Xiamen 361024, China; 2. School of Informatics, Xiamen University, Xiamen 361005, China)
Abstract
Objective Convolutional neural networks (CNNs) and other deep networks have revolutionized computer vision, particularly image recognition, leading to significant advances in various visual tasks. Recent studies have demonstrated that the performance of deep neural networks is significantly compromised in the presence of adversarial examples: maliciously crafted inputs can cause a notable decline in the accuracy and reliability of deep learning models. Traditional adversarial attacks based on adversarial patches tend to concentrate a large amount of perturbation in the masked regions of an image. However, crafting imperceptible perturbations for patch attacks is highly challenging. Adversarial patches consist solely of noise and are visually redundant, serving no practical purpose in the image. To address this issue, this paper proposes a novel approach called quick response (QR) code-based sparse adversarial patch attack. A QR code is a square symbol consisting of alternating dark and light modules and is widely used in images. It uses a specialized encoding technique to store meaningful information. Using QR codes as adversarial patches not only inherits the robustness of traditional adversarial patches but also increases the likelihood of evading suspicion. A crucial detail is that global perturbations can disrupt the integrity of the information stored in the QR code. Particularly when attacking robust images, excessive superimposed perturbations can significantly affect the white background of the QR code, ultimately rendering the generated adversarial QR code unscannable and preventing its successful detection and decoding. We therefore aim to preserve the integrity of the QR code by limiting the amount of noise.
Inspired by sparse attacks, we integrate the QR code patch with sparse attack techniques to control the sparsity of adversarial perturbations. By doing so, our proposed method limits the number of noise points, minimizing the influence of noise on the QR code pixels and keeping the encoded information readable. Furthermore, our approach achieves strong attack performance while maintaining a degree of imperceptibility. Method Building upon the aforementioned analysis, our proposed method follows a step-by-step approach. First, we gather the prediction information of the target model on the input image. Next, we calculate the gradient that steers the prediction result away from the category with the highest probability. Simultaneously, we create a mask to confine the perturbation noise within the colored area of the QR code, thereby preserving the original information. Taking inspiration from recent advances, we employ the Lp-Box alternating direction method of multipliers (ADMM) algorithm to optimize the sparsity of the added perturbation points. This optimization aims to ensure that the original information carried by the QR code remains intact while the attack stays effective. By mitigating the impact of densely added high-distortion points, our approach balances a high attack success rate against preserving the inherent recognizability of QR codes. The final result is an adversarial patch that remains imperceptible to human observers. Result Experiments were conducted on the Inception-v3 and ResNet-50 models using the ImageNet dataset. Our method was compared against representative adversarial attacks in non-targeted scenarios in terms of attack success rate and the Lp-norm of the perturbation.
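The masking and sparsity steps described in the Method can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a toy linear classifier (so the gradient is available in closed form), chooses the pseudo-target as the runner-up class (the abstract does not say how the pseudo-target label is set), and replaces the Lp-Box ADMM optimization with a simple top-k selection of the largest masked gradient entries. All function names and parameters (`predict`, `pick_pseudo_target`, `masked_sparse_step`, `eps`, `k`) are hypothetical.

```python
def predict(W, x):
    """Logits of a toy linear classifier (one weight row per class)."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def argmax(values):
    return max(range(len(values)), key=lambda i: values[i])


def pick_pseudo_target(z, true_label):
    """Assumption: the pseudo-target is the best-scoring class other than the true one."""
    others = [c for c in range(len(z)) if c != true_label]
    return max(others, key=lambda c: z[c])


def masked_sparse_step(W, x, mask, true_label, eps, k):
    """One sign-gradient step restricted to masked pixels, changing at most k of them."""
    z = predict(W, x)
    target = pick_pseudo_target(z, true_label)
    # Gradient of the margin z[true] - z[target] with respect to x (linear model).
    grad = [wy - wt for wy, wt in zip(W[true_label], W[target])]
    # The mask is 1 on dark (colored) QR modules and 0 elsewhere.
    masked = [g * m for g, m in zip(grad, mask)]
    # Top-k selection stands in for the Lp-Box ADMM sparsity optimization.
    order = sorted(range(len(x)), key=lambda i: abs(masked[i]), reverse=True)
    keep = [i for i in order[:k] if masked[i] != 0]
    x_adv = list(x)
    for i in keep:
        step = -eps if masked[i] > 0 else eps  # move against the margin gradient
        x_adv[i] = min(1.0, max(0.0, x[i] + step))  # stay in the valid pixel range
    return x_adv, target


# Toy data: 3 classes, 4 "pixels"; pixel 0 is a light module and must stay untouched.
W = [[1.0, 0.5, -0.2, 0.1], [0.2, -0.3, 0.9, 0.4], [-0.5, 0.1, 0.3, 0.2]]
x = [0.8, 0.6, 0.3, 0.5]
mask = [0, 1, 1, 1]
x_adv, target = masked_sparse_step(W, x, mask, true_label=0, eps=0.5, k=2)
print(argmax(predict(W, x)), argmax(predict(W, x_adv)))
```

In this toy setup only two masked pixels change and the predicted class moves from the original label to the pseudo-target. A real attack would obtain the gradient by backpropagation through the target network and iterate the step.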
To assess the similarity between adversarial examples and the original image, we used several similarity measures, namely peak signal-to-noise ratio (PSNR), erreur relative globale adimensionnelle de synthèse (ERGAS), structural similarity index measure (SSIM), and spectral angle mapping (SAM), and compared the resulting scores with those of other attacks. We also evaluated the robustness of our attack after applying several defense algorithms as pre-processing steps. In addition, we investigated the impact of different QR code sizes on the attack success rate and on the Lp-norm of the perturbation. Experimental results demonstrate that our approach achieves a balance between attack success rate and imperceptible noise in non-targeted scenarios. The adversarial examples generated by our method exhibit the smallest L0 norm of perturbation among all the methods. Although our method does not always achieve the best similarity scores, visual results show that our crafted adversarial noise is difficult to perceive. Moreover, even after pre-processing with various defense methods, our method continues to outperform other attacks. In the ablation study on QR code sizes for non-targeted attacks, reducing the QR code size from 55×55 pixels to 50×50 pixels led to a 3.8% decrease in the attack success rate, whereas increasing the size to 60×60 pixels resulted in a 2.7% improvement compared with 55×55 pixels. Increasing the size further to 65×65 pixels led to a 1.1% decrease compared with 60×60 pixels, while increasing it to 70×70 pixels resulted in a 6.4% improvement compared with 65×65 pixels.
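For readers unfamiliar with the perturbation and similarity measures named above, the following minimal sketch computes the L0, L1, L2, and L∞ norms of a perturbation and the PSNR between an original and an adversarial image, using their standard definitions. It assumes images flattened to lists of floats in [0, 1]; the helper names `lp_norms` and `psnr`, the `tol` threshold, and the `peak` parameter are illustrative and not taken from the paper.

```python
import math


def lp_norms(x, x_adv, tol=1e-12):
    """Norms of the perturbation x_adv - x for flattened images."""
    d = [a - b for a, b in zip(x_adv, x)]
    return {
        "L0": sum(1 for v in d if abs(v) > tol),      # number of changed pixels
        "L1": sum(abs(v) for v in d),                 # total absolute change
        "L2": math.sqrt(sum(v * v for v in d)),       # Euclidean magnitude
        "Linf": max((abs(v) for v in d), default=0.0) # largest single change
    }


def psnr(x, x_adv, peak=1.0):
    """PSNR in dB: 10 * log10(peak^2 / MSE); infinite for identical images."""
    mse = sum((a - b) ** 2 for a, b in zip(x, x_adv)) / len(x)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / mse)


original = [0.0, 0.0, 0.0, 0.0]
adversarial = [0.5, 0.0, 0.0, 0.5]
print(lp_norms(original, adversarial))
print(psnr(original, adversarial))
```

A sparse perturbation such as this one has a small L0 norm even when individual changes are large, which is why L0 and L∞ are reported separately.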
With regard to the Lp-norm of perturbations, we found a positive correlation between the L1-norm and the number of perturbation points, whereas the L2-norm and L0-norm perturbations exhibited a negative correlation with the number of perturbed points. Conclusion The proposed QR code-based adversarial patch attack is more plausible in real attack scenarios. By using a sparsity algorithm, we preserve the information carried by the QR code itself and thus generate more deceptive adversarial examples. This approach offers a new direction for research on highly imperceptible adversarial patches.
Keywords