Screen-shooting resilient watermarking method based on least dependent hiding
Abstract
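Objective Existing screen-shooting watermarking methods cannot effectively balance three criteria: computational complexity, the quality of the watermarked image, and watermark robustness. They also rely widely on perspective distortion correction as a preprocessing step, which greatly limits the practical commercial use of screen-shooting watermarks. Building on a redesigned noise layer, this paper proposes a screen-shooting resilient watermark that hides the watermark message with the least dependence on the cover image, keeping the watermark's dependence on the cover image to a minimum.
Method To ensure embedding efficiency, the encoding network in the dependent deep hiding framework is greatly simplified, which achieves the least dependence on the cover image and substantially reduces computational complexity. To compensate for the loss of extraction capability caused by the reduced network depth, the Sobel operator is added to introduce the edge information of the cover image. A scaling attack operation is added to the noise layer, which makes it possible to remove the perspective distortion correction preprocessing that limits where screen-shooting watermarks can be applied, further widening the application range. To train the network for screen-shooting robustness, the noise layer is redefined: its original structure is improved, and the image disturbance types and their parameters are selected at random, so that the input data of the decoding network is better balanced and more diverse.
Result In comparison experiments with three other methods on the DIV2K (DIVerse 2K) dataset, the proposed method achieves the highest PSNR (peak signal-to-noise ratio) and SSIM (structural similarity index measure), improving on the second-ranked universal deep hiding method by 12 dB in PSNR and 0.006 in SSIM. Both with and without attacks, the proposed method maintains high ACC (accuracy) and F1 scores; under attack, it improves the F1 score by 0.262 over the second-ranked StegaStamp (steganography stamp) method. Compared with the existing noise layer under the same network framework, the proposed algorithm improves ACC by 0.124 and F1 by 0.284 without attacks, and improves ACC by 0.316 and F1 by 0.524 with attacks, giving more accurate watermark extraction.
Conclusion The proposed algorithm achieves better image quality and watermark robustness, removes the constraint of perspective distortion correction, and widens the application range of screen-shooting watermarks.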
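The random selection of disturbance types and parameters described in the Method part can be illustrated with a minimal sketch. It is not the authors' noise layer: the three disturbances (brightness shift, Gaussian noise, nearest-neighbour rescaling), their parameter ranges, and all function names are assumptions chosen for illustration, and images are plain nested lists rather than PyTorch tensors.

```python
import random

def adjust_brightness(img, delta):
    """Shift every pixel by delta and clamp to [0, 1]."""
    return [[min(1.0, max(0.0, p + delta)) for p in row] for row in img]

def add_gaussian_noise(img, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation sigma, clamped to [0, 1]."""
    return [[min(1.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in row] for row in img]

def rescale_nearest(img, factor):
    """Nearest-neighbour resize by `factor`, a stand-in for the scaling attack."""
    h, w = len(img), len(img[0])
    new_h, new_w = max(1, round(h * factor)), max(1, round(w * factor))
    return [[img[min(h - 1, int(y / factor))][min(w - 1, int(x / factor))]
             for x in range(new_w)]
            for y in range(new_h)]

def random_noise_layer(img, rng):
    """Pick one disturbance type uniformly at random, then draw its parameter at random.

    The uniform choice over types is what keeps the distorted samples balanced;
    the parameter ranges below are hypothetical.
    """
    kind = rng.choice(["brightness", "gaussian", "scale"])
    if kind == "brightness":
        return kind, adjust_brightness(img, rng.uniform(-0.3, 0.3))
    if kind == "gaussian":
        return kind, add_gaussian_noise(img, rng.uniform(0.0, 0.1), rng)
    return kind, rescale_nearest(img, rng.uniform(0.5, 1.5))

# Usage: distort a small flat grey image with a seeded generator.
rng = random.Random(0)
cover = [[0.5] * 4 for _ in range(4)]
kind, distorted = random_noise_layer(cover, rng)
```

Because the scaling branch changes the image size, a decoder trained on such samples sees inputs at several scales. This is the property the abstract relies on when it drops perspective distortion correction.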
Keywords
LDH: least dependent hiding for screen-shooting resilient watermarking
Song Jiawei, Liu Chunxiao, Zhang Xinyi (School of Computer Science and Technology, Zhejiang Gongshang University, Hangzhou 310018, China)
Abstract
Objective With the rapid development of internet and communication technology, remote desktop techniques allow confidential information and the screen that displays it to be separated in space. However, this also creates security risks for confidential information because of illegal screen shooting. How can illegal screen shooting be prevented and the related responsibility identified? Adding a robust watermark and revealing the message hidden in the shot image is the preferred approach. By photographing files displayed on a screen, a leaker can record information efficiently and at high quality. Such photos not only record the useful information but also largely destroy any watermark signal the image may carry, which makes the leak concealed and difficult to trace. Screen-shooting watermarking is therefore a challenging subject in digital watermarking. In screen shooting, the information displayed on the screen is received through camera capture and postprocessing operations, so it travels from the screen to the camera through an optical channel that involves optical changes, digital-analog conversion, image scaling, and image distortion. Four main kinds of methods address this subject, namely key-point-based, template-based, frequency-domain-based, and deep neural network (DNN)-based methods. Both traditional and DNN-based methods offer some solutions, but none of them balances computational complexity, image quality, and watermark robustness. The calculation of key points in key-point-based methods is too time-consuming for practical use. Template-based methods often change the cover images substantially, which degrades image quality. Watermarks generated by frequency-domain-based methods have poor robustness and can be easily destroyed. Almost all methods must correct and resize the warped image to its original size before the watermark extraction stage, which is the main reason why the watermarks in these methods cannot achieve robustness to clipping and scaling in practice. To solve these problems, the least dependent hiding (LDH) method for screen-shooting resilient watermarking is proposed, which considers computational complexity, image quality, and robustness together. The decoder-based reveal network only needs to disclose the watermark message from the corresponding location of the container image, which guarantees the semantic consistency of the reveal network and the embedding network. The embedded watermark, such as a user name, time, and IP address, can be extracted under the screen-shooting attack or other attacks. To imitate the information loss in screen shooting, an improved noise layer is designed for training the model.
Method First, the watermark embedding network in the dependent deep hiding (DDH) framework is greatly simplified, and the Sobel operator is added to introduce the edge information of the cover image. The scaling attack operation is added to the noise layer, and the perspective distortion correction preprocessing is removed because it limits the application range of screen-shooting resilient watermarking. The existing noise layer is redefined so that the image disturbance types are randomly selected and the parameters of each disturbance type are randomly varied, which increases the balance and diversity of the training data of the reveal network. An investigation of previous DNN-based methods reveals that their watermark residuals visually approximate the edges of the cover images, so a strong correlation exists between the edges of the cover images and the invisibility of the watermark. To improve robustness and reduce computational complexity, the edge map of the cover image extracted by the Sobel operator is concatenated with the feature map of the watermark. The watermark embedding network is divided into two parts according to whether the cover image is used in the convolution, because the part that does not involve the cover image can be calculated in advance in practice. Second, the existing noise layer is modified to simulate the image scaling operation in screen shooting, so the widely used perspective distortion correction can be dropped. Following the class-balance principle, a new noise layer design is proposed in which random decision modules are added to make the data augmentation stronger than that of the original image disturbances. During training, learned perceptual image patch similarity (LPIPS) loss, L2 loss, and structural similarity index measure (SSIM) loss constrain the visual similarity of the cover image and the container image, while information entropy loss and weighted cross entropy loss are used to reconstruct the watermark in the form of a single-channel binary image. LDH is implemented, trained, and tested in PyTorch on an NVIDIA GeForce 2080 Ti GPU and an Intel Core i7-9700 3.00 GHz CPU. The whole neural network is optimized by the Adam optimizer. The initial learning rate is set to 1e-3 and is reduced by 90% every 20 epochs. In training, the input image resolution is 256 × 256 and the batch size is 2. A model pretrained without geometric transformation in the noise layer is used to initialize the model.
Result Experimental results show the proposed noise layer is more effective than the three latest methods on the DIVerse 2K (DIV2K) dataset. The proposed method achieves the highest peak signal-to-noise ratio (PSNR) and SSIM index, improving PSNR by 12 dB and SSIM by 0.006 compared with the second-best method, universal deep hiding (UDH), if no image attacks are applied. Moreover, it ranks second in accuracy and F1 index if no image attacks are applied. Compared with the same network framework using the noise layer proposed by previous work, our algorithm achieves better indicators and higher watermark extraction accuracy both with and without image attacks, which shows that the proposed noise layer helps training and improves the accuracy and robustness of watermark extraction. The watermark can be extracted from screen-shot images taken at distances from 10 cm to more than 50 cm, with a high extraction success rate at usual shooting distances.
Conclusion In this paper, the least dependent hiding method for screen-shooting resilient watermarking is proposed, which comprehensively balances computational complexity, image quality, and robustness. An effective noise layer improvement is also designed, which helps our algorithm perform better in image quality and watermark robustness. The proposed algorithm offers high embedding efficiency, high robustness, and high transparency, which gives it a wider application range than the existing methods.
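The edge-map step in the Method part (a Sobel edge map of the cover image concatenated with the watermark feature map) can be sketched in plain Python. This is an illustration under stated assumptions, not the authors' network code: the use of gradient magnitude, the replicated border pixels, and the helper names `sobel_edge_map` and `concat_channels` are all hypothetical, and a real implementation would operate on PyTorch tensors.

```python
def sobel_edge_map(gray):
    """Return the Sobel gradient magnitude of a 2-D grayscale image (list of rows)."""
    h, w = len(gray), len(gray[0])
    kx = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # horizontal gradient kernel
    ky = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # vertical gradient kernel

    def px(y, x):
        # Assumption: replicate border pixels for out-of-range coordinates.
        return gray[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]

    edges = []
    for y in range(h):
        row = []
        for x in range(w):
            gx = sum(kx[j][i] * px(y + j - 1, x + i - 1)
                     for j in range(3) for i in range(3))
            gy = sum(ky[j][i] * px(y + j - 1, x + i - 1)
                     for j in range(3) for i in range(3))
            row.append((gx * gx + gy * gy) ** 0.5)
        edges.append(row)
    return edges

def concat_channels(*maps):
    """Stack equally sized 2-D maps channel-first, like a channel-wise concatenation."""
    h, w = len(maps[0]), len(maps[0][0])
    for m in maps:
        if len(m) != h or any(len(r) != w for r in m):
            raise ValueError("all maps must share the same height and width")
    return [[row[:] for row in m] for m in maps]

# Usage: a cover image with a vertical step edge and a placeholder watermark feature map.
cover = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
watermark_features = [[0.0] * 4 for _ in range(4)]
stacked = concat_channels(sobel_edge_map(cover), watermark_features)
```

The edge map responds only around the step between the second and third columns. That matches the observation in the abstract that watermark residuals tend to follow the edges of the cover image.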
Keywords