End-to-end robust image watermarking algorithm for screen shooting

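Wu Jiayi, Li Xiaomeng, Qin Chuan (School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093)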

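Abstract
Objective In research on screen-shooting robust image watermarking, the main challenge is to improve the robustness of the algorithm while preserving the visual quality of the watermarked image. To this end, an end-to-end network framework based on deep learning is proposed for embedding and extracting robust watermarks. Method In this framework, a noise layer that includes Moiré patterns is designed to simulate the distortion caused by real screen-shooting noise. Through training, the network learns to resist screen-shooting noise, which strengthens the robustness of the watermarked images it generates. A just noticeable distortion (JND) loss function is also introduced. It supervises the perceptual difference between the JND coefficient map of the image and the residual map that carries the watermark information, so that the embedding strength of the robust watermark is controlled adaptively and the visual quality of the generated watermarked images improves. In addition, two automatic image-region localization methods are proposed. The first handles the separation of foreground and background in a captured photo, that is, locating and rectifying the watermarked image region. The second handles decoding after the watermarked image has undergone a digital cropping attack. Result Experimental results show that introducing the JND loss function improves the visual quality of the watermarked images, with the average peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) reaching 30.9371 dB and 0.9424, respectively. After the Moiré noise simulation layer is added, the bit error rate of the proposed algorithm drops by 1% to 3%, which shows that the algorithm can resist screen-shooting noise. Moreover, embedding an anti-cropping template in the R channel of the image lets the algorithm withstand fairly severe digital cropping attacks. The computational complexity of the algorithm is low: the total time for embedding, localization, and extraction on a single image is less than 0.1 s, which meets the real-time requirements of practical application scenarios. Conclusion The embedding capacity of the algorithm and the visual quality of the generated watermarked images are satisfactory, and its robustness under different shooting distances, angles, and capturing and displaying devices is better than that of previously reported mainstream algorithms.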
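The two kinds of figures quoted in the results, PSNR in dB and bit error rate, follow standard definitions. As a minimal sketch, assuming 8-bit grayscale images stored as nested lists and watermarks stored as lists of 0/1 bits, they could be computed as below. The function names `psnr` and `bit_error_rate` are illustrative and are not taken from the paper.

```python
import math


def psnr(original, distorted, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized 2-D images."""
    flat_o = [p for row in original for p in row]
    flat_d = [p for row in distorted for p in row]
    if len(flat_o) != len(flat_d) or not flat_o:
        raise ValueError("images must be non-empty and of equal size")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(flat_o, flat_d)) / len(flat_o)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


def bit_error_rate(sent, received):
    """Fraction of watermark bits that differ after extraction."""
    if len(sent) != len(received) or not sent:
        raise ValueError("bit sequences must be non-empty and of equal length")
    errors = sum(s != r for s, r in zip(sent, received))
    return errors / len(sent)
```

A higher PSNR means the watermarked image is closer to the original, and a lower bit error rate means more of the embedded message survives the channel.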
Keywords
Screen-shooting robust watermarking with end-to-end neural network

Wu Jiayi, Li Xiaomeng, Qin Chuan(School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China)

Abstract
Objective With the rapid development of the Internet and imaging devices, the security of digital image storage and file sharing has become an important concern. Robust watermarking techniques can be used to address this concern. The general idea of these techniques is to embed watermark information, such as copyright labels and user identification, imperceptibly into the image to be protected and then extract the watermark from the watermarked image even after it has undergone attacks. The two most important properties of robust watermarking are robustness and the visual quality of the watermarked image. Therefore, the watermarked image should be robust against different kinds of attacks and show satisfactory visual quality. As a typical robust watermarking technique, screen-shooting robust watermarking can resist the noise introduced during the screen-shooting procedure. In other words, watermark information can still be accurately extracted from the watermarked image after screen shooting. Method In this paper, we propose an end-to-end network framework based on deep learning for screen-shooting robust watermarking. In this framework, a screen-shooting noise layer that includes a Moiré pattern simulation is introduced to simulate the noise within the screen-shooting channel, so that, through training, the network learns to resist the realistic noise of the screen-shooting procedure. To further improve the visual quality of the generated watermarked image, we define and introduce a just noticeable distortion (JND) loss function that controls the strength of the residual image containing the watermark information by supervising the perceptual difference between the JND map of the original image and the residual image. We also propose two automatic localization methods for watermarked images. 
The first method locates the watermarked image in a screen-shooting scenario, where the captured photo may contain not only the image displayed on the screen but also background content, which can corrupt watermark extraction at the decoding end and render the result useless. To address this problem, this paper proposes a region localization method that combines deep learning with traditional image processing. This method assumes that the image region from which the watermark is to be extracted accounts for most of the pixels in the captured photo and that the background color is relatively uniform with no abrupt changes. Under these assumptions, locating the image region that contains the watermark reduces to a foreground extraction problem. The second method applies to watermarked images under digital attack. A watermarking algorithm should be robust not only to the screen-shooting process but also to attacks in the digital environment, such as image filtering, noise addition, and digital cropping. While the vast majority of digital attacks can be treated as equivalent to the noise introduced by the screen-shooting process, digital cropping cannot be regarded as a kind of screen-shooting noise. For this reason, this paper introduces an anti-cropping region localization method based on symmetric noise templates. This method divides the image into four sub-images, namely, top-left, bottom-left, top-right, and bottom-right. A two-channel watermark residual map is generated and embedded in the green and blue channels so that one image carries four copies of the same watermark information. Additionally, a symmetric noise template is embedded in the red channel for anti-cropping localization. 
Even when the watermarked image suffers a cropping attack, the watermark information can still be accurately extracted as long as more than 1/4 of the image area remains. Result Experimental results show that introducing the JND loss function improves the visual quality of the watermarked images, with the average peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) reaching 30.9371 dB and 0.9424, respectively. After the Moiré noise simulation layer is added, the bit error rate of the proposed scheme is reduced by 1% to 3%, which demonstrates the ability of this scheme to resist the noise generated by screen shooting. This scheme also effectively resists strong cropping attacks by embedding the anti-cropping template into the R channel of the image. The total running time for embedding, localization, and extraction on a single image is less than 0.1 s, which is suitable for deployment in application scenarios with real-time requirements. The performance of the proposed algorithm is also compared with that of state-of-the-art screen-shooting robust watermarking algorithms across various experimental settings, including screen-shooting and digital attack settings. The bit error rate comparison shows that the simulated screen-shooting noise gives the network a high level of robustness against actual screen-shooting noise and that the proposed localization method equips it to withstand specific digital cropping attacks. Conclusion This paper proposes an end-to-end embedding-extraction network for robust watermarking against screen shooting. In this network, a Moiré noise simulation layer and a JND loss function module are introduced to enhance the robustness and visual quality of the watermarked images generated by the network. We also design two watermark localization methods to address two realistic scenarios, namely, screen shooting and digital cropping. 
Our experimental results demonstrate that the proposed scheme achieves a satisfactory embedding capacity and visual quality of the generated watermarked image and that its robustness under different shooting distances, angles, and capturing/displaying devices is better than that of some state-of-the-art schemes. In future research, we aim to investigate the decoding of watermarks when only a portion of the screen image is captured, which is a more intricate process than mere digital cropping, and to improve the visual quality of watermarked images in scenarios with high embedding capacity.
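The redundancy behind the anti-cropping idea described in the Method part can be illustrated with a small sketch. It assumes that the four copies are laid out as a plain periodic 2 x 2 tiling of one residual block and that the crop offset is already known. The abstract does not state whether the copies are repeated or mirrored, and in the proposed scheme the offset would be estimated from the red-channel template, so both points are assumptions. The names `tile_2x2`, `crop`, and `recover_tile` are hypothetical.

```python
def tile_2x2(tile):
    """Repeat an h x w block twice in each direction, giving four identical copies."""
    return [row * 2 for row in tile + tile]


def crop(img, top, left, height, width):
    """Cut a height x width window whose top-left corner is at (top, left)."""
    return [row[left:left + width] for row in img[top:top + height]]


def recover_tile(window, top, left):
    """Rebuild the original block from a block-sized window of the tiled image.

    Because the tiling is periodic, a window the size of one block is a
    cyclic shift of that block, so undoing the shift by the (assumed known)
    crop offset restores it.
    """
    h, w = len(window), len(window[0])
    return [
        [window[(a - top) % h][(b - left) % w] for b in range(w)]
        for a in range(h)
    ]


# Usage: a 2 x 3 residual block survives a crop that keeps 1/4 of the area.
residual = [[1, 2, 3], [4, 5, 6]]
tiled = tile_2x2(residual)          # 4 x 6 image with four copies
window = crop(tiled, 1, 2, 2, 3)    # window straddling all four copies
restored = recover_tile(window, 1, 2)
```

In this toy setting a window with the size of one sub-image is enough to restore the block wherever it is cut, which matches the intuition behind the stated 1/4-area condition. It does not model the template-based localization itself.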
Keywords
