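An automatic generation method for Chinese watermark font libraries
Abstract
Objective Document watermarking is an information hiding technique for tracing the source of leaked documents. Traditional font-library-based document watermarking schemes require the font library to be produced manually, which severely limits how efficiently watermarks can be used. This paper therefore designs a robust document watermarking scheme based on an automatically generated font library. Method The method consists of an end-to-end automatic font generation network with an encoder-decoder structure, a character selection embedder and a neural network extractor; it automatically generates the deformed font library and then embeds and extracts the watermark. To resist distortions that may arise during transmission, a differentiable noise layer is inserted between the encoder and the decoder to simulate the distortion process, so that the watermarking model acquires the corresponding robustness. Result The method embeds 252 bits of watermark information into a real document containing 252 Chinese characters and is compared with other document watermarking methods in visual quality and robustness. The results show that, relative to existing character-feature-based Chinese document watermarking methods, the peak signal to noise ratio (PSNR), structural similarity (SSIM) and subjective quality score of the proposed method improve by 11.68 dB, 0.08 and 5.8%, respectively, indicating better visual quality. In the digital channel transmission scenario, the method performs roughly on par with other methods; in the print-scan scenario, its watermark extraction rate at font sizes three, four, small four and five improves by 2.4%, 3.07%, 1.34% and 0.02%, respectively, and it also performs well when printing and scanning resolutions are mismatched, indicating higher robustness against print-scanning. Conclusion Compared with Chinese character watermarking based on manually designed font libraries, the method makes full use of the geometric features of characters and generates the font library automatically, reducing the complexity of Chinese document watermarking schemes.
Keywords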
Automatic generation of Chinese document watermarking fonts
Sun Shan1,2, Zhang Weiming1,2, Fang Han1,2, Yu Nenghai1,2
(1. School of Cyber Security, University of Science and Technology of China, Hefei 230027, China; 2.
Abstract
Objective With the rapid growth in the number of digital documents, copyright protection has become a research hotspot. To protect document copyright and locate the source of leaked documents, document watermarking has attracted wide attention. Protection is achieved by adding invisible digital watermark information (e.g., device number or date) to the document. Once a watermarked document is leaked, the source of the leak can be located by extracting the watermark, which makes document leakage traceable. Watermarking also acts as a deterrent that reduces the occurrence of leaks. Current document watermarking methods fall into five categories: document structure based methods, natural language processing based methods, grid pattern based methods, image based methods and font based methods. Among them, font based methods offer the best robustness and transparency. Their main idea is to encode the watermark information in characteristics of the font (e.g., size, shape or brightness) while keeping the modified font visually consistent with the original one, so that robustness, transparency, capacity and integrity can be achieved simultaneously. However, existing font based methods require the modification features to be designed manually and cannot generate new fonts automatically. For the Chinese writing system, which contains a large number of characters, this incurs a heavy labor cost and severely reduces efficiency.
To overcome these drawbacks, this paper proposes a robust document watermarking scheme based on automatic font generation. Method The framework consists of an end-to-end automatic font generation network with an encoder-decoder structure, a character selection embedder and a neural network based extractor. The font generation network automatically generates the deformed font library, which is then used to embed the watermark. A differentiable noise layer is inserted between the encoder and the decoder to simulate the distortion process, so that the encoder learns better features for creating the new font and the decoder is trained to adapt to such distortions. A combined noise layer is designed to simulate the common distortions in digital transmission channels (e.g., screenshots, scaling, Gaussian noise and JPEG compression). The whole font generation network consists of four parts: encoder, noise layer, decoder and adversarial recognizer. The encoder receives the watermark information and the carrier character image and generates the encoded character image. The noise layer adds noise to the encoded image to produce the noisy image. In particular, several of six simulated noise layers (identity mapping, scaling, translation, rotation, Gaussian noise and Gaussian blur) are randomly selected to form the combined noise layer at each iteration. The decoder receives the noisy image and outputs the corresponding watermark label. The adversarial recognizer tries to determine whether the current image is a carrier character image or an encoded one, which helps improve the visual quality of the generated font.
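The random composition of noise layers described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the parameter ranges (`max_shift`, `max_deg`, `sigma`, `max_layers`) and the nearest-neighbour resampling are all assumptions made for illustration, and the operations here are not differentiable, whereas training would require differentiable versions in a deep learning framework.

```python
import numpy as np

# All names and parameter ranges below are hypothetical, chosen for illustration.

def identity(img, rng):
    return img

def scaling(img, rng):
    # Downscale by a random factor, then resize back (nearest neighbour).
    h, w = img.shape
    f = rng.uniform(0.5, 1.0)
    sh, sw = max(1, int(h * f)), max(1, int(w * f))
    small = img[np.ix_(np.arange(sh) * h // sh, np.arange(sw) * w // sw)]
    return small[np.ix_(np.arange(h) * sh // h, np.arange(w) * sw // w)]

def translation(img, rng, max_shift=2):
    # Shift by a few pixels and fill the uncovered border with white (1.0).
    h, w = img.shape
    dy, dx = (int(v) for v in rng.integers(-max_shift, max_shift + 1, size=2))
    src = img[max(0, -dy):h - max(0, dy), max(0, -dx):w - max(0, dx)]
    out = np.ones_like(img)
    out[max(0, dy):max(0, dy) + src.shape[0],
        max(0, dx):max(0, dx) + src.shape[1]] = src
    return out

def rotation(img, rng, max_deg=3.0):
    # Small rotation about the image centre via inverse mapping.
    h, w = img.shape
    t = np.deg2rad(rng.uniform(-max_deg, max_deg))
    yy, xx = np.mgrid[0:h, 0:w]
    cy, cx = (h - 1) / 2, (w - 1) / 2
    sx = np.rint(np.cos(t) * (xx - cx) + np.sin(t) * (yy - cy) + cx).astype(int)
    sy = np.rint(-np.sin(t) * (xx - cx) + np.cos(t) * (yy - cy) + cy).astype(int)
    valid = (sx >= 0) & (sx < w) & (sy >= 0) & (sy < h)
    out = np.ones_like(img)
    out[valid] = img[sy[valid], sx[valid]]
    return out

def gaussian_noise(img, rng, sigma=0.05):
    return np.clip(img + rng.normal(0.0, sigma, img.shape), 0.0, 1.0)

def gaussian_blur(img, rng, sigma=1.0, radius=2):
    # Separable Gaussian blur with edge padding.
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x ** 2 / (2 * sigma ** 2))
    k /= k.sum()
    padded = np.pad(img, radius, mode="edge")
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="valid"), 1, padded)
    cols = np.apply_along_axis(lambda c: np.convolve(c, k, mode="valid"), 0, rows)
    return np.clip(cols, 0.0, 1.0)

LAYERS = [identity, scaling, translation, rotation, gaussian_noise, gaussian_blur]

def combined_noise_layer(img, rng, max_layers=3):
    # Pick a random subset of the six layers and apply them in sequence.
    k = int(rng.integers(1, max_layers + 1))
    for i in rng.choice(len(LAYERS), size=k, replace=False):
        img = LAYERS[i](img, rng)
    return img

# Usage: distort a toy "glyph" (a dark bar on a white background).
rng = np.random.default_rng(0)
glyph = np.ones((32, 32))
glyph[8:24, 14:18] = 0.0
noisy = combined_noise_layer(glyph, rng)
```

Drawing a fresh subset at every call mirrors the per-iteration selection in the text; how many layers are combined at once is not stated in the abstract, so `max_layers` is a free parameter of this sketch.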
The encoder provides training samples for the extractor to ensure better extraction performance, and the extractor guides the encoder toward generating better character images. Working together, the two modules give the generated font higher visual quality and stronger robustness. The trained font generation network produces the watermarked font library when fed an original font library and different watermark signals. Each character in the font library can carry different perturbations, each of which decodes to a different watermark signal. In the watermark embedding stage, the character variant in the codebook that corresponds to the current watermark information is therefore selected to replace the current character in the input document. In this way the watermark information is embedded across the whole original document to produce the watermarked document. In the extraction stage, after the distorted watermarked document has been received over a digital channel, it is first divided into single-character images by character segmentation. Each character image is then sent to the decoder trained in the font generation stage, which extracts the watermark information embedded in that character. In the print-scan scenario, the characters in the document undergo a digital-to-analog (D-A) conversion and an analog-to-digital (A-D) conversion, and the image quality is greatly reduced. Moreover, this process involves a variety of image attacks that cannot be accurately simulated by differentiable distortions, so robustness against print-scanning distortions must be treated as a separate target.
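The codebook-based embedding and per-character extraction described above can be sketched as follows, assuming one bit per character (consistent with 252 bits in 252 characters). Everything in this sketch is hypothetical: `build_toy_library` fabricates random glyph images in place of the generated font library, and `toy_decoder` uses nearest-template matching purely as a stand-in for the trained neural decoder.

```python
import numpy as np

def build_toy_library(chars, rng, size=16):
    """Hypothetical stand-in for the generated font library:
    maps (character, bit) to a glyph image with a bit-dependent perturbation."""
    library = {}
    for ch in chars:
        base = 0.2 + 0.6 * (rng.random((size, size)) > 0.5)
        pert = 0.1 * rng.standard_normal((size, size))
        library[(ch, 0)] = np.clip(base - pert, 0.0, 1.0)
        library[(ch, 1)] = np.clip(base + pert, 0.0, 1.0)
    return library

def embed(text, bits, library):
    """Replace each character with the glyph variant that encodes its bit."""
    if len(text) != len(bits):
        raise ValueError("this sketch assumes exactly one bit per character")
    return [library[(ch, b)] for ch, b in zip(text, bits)]

def toy_decoder(glyph, library):
    """Stand-in for the neural decoder: return the bit of the nearest template."""
    key = min(library, key=lambda k: float(np.sum((glyph - library[k]) ** 2)))
    return key[1]

def extract(glyphs, library):
    """Decode one bit from each segmented character image."""
    return [toy_decoder(g, library) for g in glyphs]

# Usage: embed six bits into six characters, distort lightly, then extract.
rng = np.random.default_rng(1)
library = build_toy_library("水印字库", rng)
bits = [1, 0, 1, 1, 0, 0]
glyphs = embed("水印字库水印", bits, library)
noisy = [g + 0.02 * rng.standard_normal(g.shape) for g in glyphs]
recovered = extract(noisy, library)
```

Character segmentation is omitted here; the sketch starts from already separated glyph images.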
To achieve robustness against print-scanning distortions, a fine-tuning scheme for the extractor is proposed, which trains the extractor to adapt to these distortions. Specifically, the font generation model is fixed as a pre-trained network, and a set of documents is embedded with watermarks using the pre-trained embedder. These documents are then actually printed and scanned to produce a library of distorted images. Using each distorted image and its watermark, the decoder is fine-tuned to adapt to the distorted features, which yields robustness against print-scanning distortion. Result The scheme embeds 252 bits of watermark into a real document containing 252 Chinese characters and is compared with other document watermarking methods in visual quality and robustness. The results show that the peak signal to noise ratio (PSNR), structural similarity (SSIM) and subjective quality score of the proposed scheme are higher than those of existing character-feature-based Chinese document watermarking methods by 11.68 dB, 0.08 and 5.8%, respectively, demonstrating its better visual quality. In terms of robustness, the watermark extraction rate of the scheme is 100% under screenshot and scaling, and its performance under JPEG compression and Gaussian noise is approximately equivalent to that of other methods. In the print-scan scenario, the watermark extraction rates of the scheme at font sizes three, four, small four and five are improved by 2.4%, 3.07%, 1.34% and 0.02%, respectively.
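For reference, two of the reported metrics can be computed as in the sketch below. It uses the standard definition of PSNR and assumes that the watermark extraction rate means the fraction of correctly recovered bits; the abstract does not state the exact evaluation protocol, so the function names and the pixel range `max_value` are assumptions.

```python
import numpy as np

def psnr(reference, test, max_value=1.0):
    """Standard PSNR in dB between two images with pixel range [0, max_value]."""
    reference = np.asarray(reference, dtype=float)
    test = np.asarray(test, dtype=float)
    mse = np.mean((reference - test) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_value ** 2 / mse))

def extraction_rate(embedded_bits, extracted_bits):
    """Assumed definition: fraction of watermark bits recovered correctly."""
    embedded = np.asarray(embedded_bits)
    extracted = np.asarray(extracted_bits)
    if embedded.shape != extracted.shape:
        raise ValueError("bit sequences must have the same length")
    return float(np.mean(embedded == extracted))

# Usage: compare an original glyph with a uniformly brightened copy.
original = np.zeros((16, 16))
modified = np.full((16, 16), 0.1)
quality_db = psnr(original, modified)
rate = extraction_rate([1, 0, 1, 1], [1, 0, 0, 1])
```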
Good performance is also achieved when printing and scanning qualities are mismatched, which indicates that the scheme is more robust against print-scanning distortions. Conclusion Compared with existing Chinese character watermarking methods based on manually designed font libraries, the proposed scheme automatically generates a watermarked Chinese font library that is visually similar to the target font, which effectively reduces the complexity of font generation. The experimental results show that the proposed document watermarking scheme offers better visual quality and embedding capacity. In addition, the scheme remains strongly robust over both the digital editing channel and the print-scanning channel. However, it is not yet suitable for the print-shooting and screen-shooting processes. Future research will focus on designing robust document watermarking schemes for these two scenarios.
Keywords
document watermarking; deep learning; Chinese font generation; anti digital distortion; anti print-scanning distortion