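Image inpainting with foreground-background semantic decoupling

Ye Xueyi, Sui Mingcong, Xue Zhiquan, Wang Jiaxin, Chen Huahua (School of Communication Engineering, Hangzhou Dianzi University)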

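Abstract
Objective Keeping an image semantically consistent before and after repair is one of the basic rules of image inpainting research. However, current whole-image inpainting methods ignore the semantic differences between the foreground and background of an image, which leads to blurred boundaries and semantic confusion after repair. This paper therefore proposes an image inpainting method based on semantically decoupling the foreground and background. Method The method comprises three stages: semantic inpainting, foreground inpainting, and overall inpainting. First, the damaged semantic label map is repaired. Second, the repaired semantic map is used to decouple the foreground and background regions of the damaged image, and the damaged foreground region is fed to the foreground inpainting module for repair. Finally, the repaired foreground region is embedded into the damaged image, which is then fed to the overall inpainting module to complete the overall repair and the fusion of foreground and background. Result Comparative experiments against other current image inpainting methods show that, on the public CelebA-HQ and Cityscapes datasets, the method performs better on learned perceptual image patch similarity (LPIPS), peak signal-to-noise ratio (PSNR), and structural similarity (SSIM). Relative to the best average values of the compared methods, on CelebA-HQ the LPIPS decreases by 8.86%, the SSIM increases by 1.10%, and the mean PSNR of the proposed method reaches 27.09 dB; on Cityscapes the LPIPS decreases by 4.62%, the SSIM increases by 0.45%, and the mean PSNR of the proposed method reaches 27.31 dB. In addition, the ablation data show that each stage of the algorithm is necessary and effective. Conclusion By semantically decoupling the foreground and background and completing the repair progressively through a three-stage pipeline, the method effectively reduces semantic confusion and boundary blurring. The repaired images have clear foreground-background boundaries and a more plausible color style.
Keywords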
Image inpainting with foreground-background semantic decoupling

Ye Xueyi, Sui Mingcong, Xue Zhiquan, Wang Jiaxin, Chen Huahua(School of Communication Engineering,Hangzhou Dianzi University)

Abstract
Objective Image inpainting infers and repairs damaged or missing regions of an image from its known content. It originated with artists restoring damaged paintings or photographs to bring them as close as possible to the original. The technique is now widely applied in fields such as cultural heritage preservation, image editing, and medical image processing. Image inpainting has moved from traditional to modern methods. Traditional methods usually handle small regions with simple structure and texture well, but they often fail to give satisfactory results for large missing regions with complex structure and texture. With the rise of the big data era, deep learning methods such as generative adversarial networks (GANs) have developed rapidly and significantly improved inpainting quality. By learning from large amounts of data, deep learning models capture the semantic information of images better than traditional algorithms, which improves the accuracy and efficiency of repair. However, current methods usually repair the image as a whole. From a semantic perspective, the foreground and background differ significantly, and treating them in the same way may lead to problems such as blurred edges and structural deformation, giving unsatisfactory results. To address this issue, we propose a new image inpainting framework that uses semantic label maps to separate the foreground and background for repair.
Method The method consists of three modules: a semantic inpainting module, a foreground inpainting module, and an overall inpainting module.
The semantic inpainting module repairs the damaged semantic map, which then guides the semantic decoupling of the foreground and background regions. In the semantic inpainting stage, the missing parts of the semantic label map are repaired, which enriches the semantic information of the missing region. The foreground mask is then extracted from the repaired semantic map to obtain accurate boundary and shape information for the foreground region. In the foreground inpainting stage, the foreground region of the damaged image is extracted with the foreground mask, and the foreground inpainting module restores its texture and fills the missing region. The foreground usually contains the key information in an image, so repairing it first yields more accurate and detailed foreground objects and semantic information. The repaired foreground region is then embedded into the damaged image. Finally, the result is fed to the overall inpainting module, which performs two tasks: repairing the background region of the damaged image and fusing the foreground with the background. The overall inpainting module repairs the entire image using the context of the foreground, which maintains consistency and smoothness and further improves the repair of the foreground region. We train the three stages with joint loss functions. The semantic inpainting module uses an adversarial loss and a semantic distribution loss to improve the accuracy of semantic inpainting. The foreground inpainting and overall inpainting modules add a perceptual loss, a style loss, and a global loss to these losses.
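The decoupling and embedding steps described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the toy label ids, and in particular `fill_holes_with_mean`, which stands in for the learned foreground inpainting network, are assumptions made for illustration only.

```python
import numpy as np


def foreground_mask(label_map, foreground_ids):
    """Return a boolean H x W mask that is True on foreground classes."""
    return np.isin(label_map, list(foreground_ids))


def extract_foreground(image, fg_mask):
    """Keep foreground pixels of an H x W x C image and zero the background."""
    return image * fg_mask[..., None]


def fill_holes_with_mean(fg_image, fg_mask, hole_mask):
    """Placeholder for the foreground inpainting network (NOT the paper's model):
    fills missing foreground pixels with the mean of the known foreground pixels."""
    known = fg_mask & ~hole_mask
    missing = fg_mask & hole_mask
    repaired = fg_image.copy()
    repaired[missing] = fg_image[known].mean(axis=0)
    return repaired


def embed_foreground(damaged, repaired_fg, fg_mask):
    """Paste the repaired foreground into the damaged image; background is untouched."""
    return np.where(fg_mask[..., None], repaired_fg, damaged)


# Toy data: class 1 is foreground, class 0 is background (hypothetical label ids).
label_map = np.zeros((4, 4), dtype=np.int64)
label_map[1:3, 1:3] = 1
image = np.full((4, 4, 3), 0.2, dtype=np.float32)
image[1:3, 1:3] = 0.8                       # bright foreground object
hole_mask = np.zeros((4, 4), dtype=bool)
hole_mask[1, 1] = True                      # one missing foreground pixel
hole_mask[0, 0] = True                      # one missing background pixel
damaged = image * ~hole_mask[..., None]     # missing pixels are set to zero

fg_mask = foreground_mask(label_map, {1})   # assumes the label map is already repaired
fg_only = extract_foreground(damaged, fg_mask)
repaired_fg = fill_holes_with_mean(fg_only, fg_mask, hole_mask)
stage2 = embed_foreground(damaged, repaired_fg, fg_mask)
```

In this sketch the foreground hole at `(1, 1)` is filled before the image is reassembled, while the background hole at `(0, 0)` is still empty in `stage2`; in the described pipeline that remaining region is what the overall inpainting module would repair.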
Specifically, the perceptual loss ensures that the repaired result is perceptually close to the original image; the style loss reduces the checkerboard artifacts caused by transposed convolution layers; and the global loss ensures that the repaired result has coherent structure and content across the entire image. Together, these loss functions allow the method to generate more realistic and natural images while maintaining high-quality inpainting results.
Result Comparative experiments with other current image inpainting methods show that our approach outperforms them in learned perceptual image patch similarity (LPIPS), peak signal-to-noise ratio (PSNR), and structural similarity index measure (SSIM) on the public CelebA-HQ and Cityscapes datasets. Compared with the best average values of the baseline methods, on the CelebA-HQ dataset the LPIPS decreases by 8.86%, the SSIM improves by 1.10%, and the average PSNR is 27.09 dB; on the Cityscapes dataset the LPIPS decreases by 4.62%, the SSIM improves by 0.45%, and the average PSNR is 27.31 dB. Furthermore, ablation experiments confirm the necessity and effectiveness of each component of the algorithm.
Conclusion The method decouples the semantics of the foreground and background and uses a three-stage pipeline to complete the inpainting step by step, which effectively reduces semantic confusion and boundary blurring. The repaired images have clear foreground-background boundaries and a more plausible color style.
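For readers unfamiliar with the PSNR values reported above, the following minimal sketch shows the standard definition, 10·log10(MAX² / MSE). The function name `psnr` and the assumption that pixel values lie in [0, 1] are illustrative; the abstract does not state the exact evaluation protocol used.

```python
import numpy as np


def psnr(reference, restored, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape.

    max_value is the largest possible pixel value (1.0 for images scaled to [0, 1],
    255 for 8-bit images).
    """
    diff = reference.astype(np.float64) - restored.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_value ** 2 / mse)


# Toy check: a uniform error of 0.1 gives MSE = 0.01, so PSNR is about 20 dB.
reference = np.zeros((8, 8), dtype=np.float64)
restored = reference + 0.1
value = psnr(reference, restored)
```

Higher PSNR means lower pixel-wise error, whereas LPIPS is a distance, so a decrease in LPIPS indicates an improvement.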
Keywords
