Spatial aware channel attention guided high dynamic image reconstruction
Abstract
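Objective A high dynamic range (HDR) image can be reconstructed effectively by fusing a set of low dynamic range (LDR) images captured at different exposures. However, background shift and motion of the photographed subject between the LDR images introduce ghosting into the reconstructed HDR image. Attention-based HDR reconstruction methods are effective to some extent, but because they do not fully exploit the interrelationship between the spatial and channel dimensions of the features, they perform well only when objects move slightly. When objects in the scene undergo large motion, these methods still leave room for improvement. To this end, we propose a spatial-aware channel attention guided multi-scale HDR image reconstruction network for ghost suppression and detail recovery. Method We propose a novel spatial-aware channel attention mechanism (SACAM). While mining the contextual relationship among channels, it extracts global information and salient information along the feature channel dimension to further strengthen the spatial relationship of the features. This helps to highlight the importance of beneficial information in both the spatial and channel dimensions, achieving ghost suppression and enhancement of the effective information in the features. In addition, we design a multi-scale information reconstruction module (MIM). This module helps to enlarge the receptive field of the network, strengthens the salient information in the spatial dimension of the features, and makes full use of the contextual semantic information of features at different scales to reconstruct the final HDR image. Result On the Kalantari test set, the PSNR-L (peak signal-to-noise ratio, linear domain) and SSIM-L (structural similarity, linear domain) of our method are 41.1013 and 0.9865, respectively, and its PSNR-μ (peak signal-to-noise ratio, tonemapped domain) and SSIM-μ (structural similarity, tonemapped domain) are 43.4136 and 0.9902, respectively. On the Sen and Tursun datasets, our method reconstructs the scene structure fairly realistically, recovers image details clearly, and effectively avoids ghosting. Conclusion The proposed spatial-aware channel attention guided multi-scale HDR image reconstruction network effectively mines the information in the features that is beneficial to the reconstructed image, improves the ability of the network to recover detail, and achieves satisfactory HDR reconstruction results on multiple datasets.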
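As a rough illustration of the two PSNR variants reported in the results, the sketch below computes PSNR on linear HDR values (PSNR-L) and on tone-mapped values (PSNR-μ). It assumes pixel values normalized to [0, 1] and a μ-law tone-mapping curve with μ = 5000, a common choice in the HDR literature that this abstract does not itself specify. All function names are hypothetical and the code is not the authors' evaluation script.

```python
import math

MU = 5000.0  # assumed compression parameter; not stated in the abstract


def mu_law(h, mu=MU):
    """Tone-map a linear HDR value in [0, 1] with the mu-law curve."""
    return math.log(1.0 + mu * h) / math.log(1.0 + mu)


def psnr(pred, ref, peak=1.0):
    """PSNR in dB between two equal-length sequences of pixel values."""
    if len(pred) != len(ref) or not pred:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((p - r) ** 2 for p, r in zip(pred, ref)) / len(pred)
    if mse == 0.0:
        return math.inf  # identical inputs
    return 10.0 * math.log10(peak ** 2 / mse)


def psnr_l(pred, ref):
    """PSNR computed directly on linear HDR values."""
    return psnr(pred, ref)


def psnr_mu(pred, ref, mu=MU):
    """PSNR computed after mu-law tone mapping of both images."""
    return psnr([mu_law(p, mu) for p in pred], [mu_law(r, mu) for r in ref])
```

Because the μ-law curve expands dark values, the same absolute error in a dark region lowers PSNR-μ far more than PSNR-L, which is why the two scores are usually reported side by side.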
Keywords
Spatial aware channel attention guided high dynamic image reconstruction
Tang Lingfeng, Huang Huan, Zhang Yafei, Li Fan (Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China)
Abstract
Objective High dynamic range (HDR) imaging is widely used in modern imaging devices. Limited by the performance of the imaging sensor, a single photograph can capture information only within a limited dynamic range. An HDR image can be reconstructed effectively by fusing a group of low dynamic range (LDR) images captured at different exposure levels. However, real scenes are shot with camera shake and moving objects, so the LDR images at different exposures are not strictly pixel-aligned in space, and the fused HDR result is prone to artifacts that greatly reduce image quality. Although attention-based HDR reconstruction methods improve image quality to some extent, they achieve good results only when objects move slightly, because they do not fully exploit the interrelationship between the spatial and channel dimensions of the features. When large foreground motion occurs in the scene, these methods still leave considerable room for improvement. It is therefore important to improve the ability of the network to eliminate artifacts and restore details in saturated regions. To this end, we develop a multi-scale HDR image reconstruction network guided by spatial-aware channel attention. Method The medium-exposure LDR image is used as the reference image, and the remaining images are used as non-reference images. During HDR reconstruction, it is necessary to make full use of the effective complementary information in the non-reference images to extend the dynamic range of the fused image, while suppressing their invalid information to prevent artifacts and saturation from being introduced. To improve the ability of the network to eliminate artifacts and restore the details of saturated areas, we propose a spatial-aware channel attention mechanism (SACAM) and a multi-scale information reconstruction module (MIM).
While mining the channel context, SACAM further strengthens the spatial relationship of the features by extracting global information and salient information along the feature channel dimension. This highlights the importance of useful information in both the spatial and channel dimensions, and realizes ghost suppression and enhancement of the effective information in the features. The MIM helps to enlarge the receptive field of the network, strengthens the salient information in the spatial dimension of the features, and makes full use of the contextual semantic information of features at different scales to reconstruct the final HDR image. Result Experiments are carried out on three public HDR datasets: the Kalantari, Sen, and Tursun datasets. Our method obtains better visual quality and higher objective evaluation scores than the compared methods. Specifically, 1) on the Kalantari dataset, our PSNR-L and SSIM-L are 41.1013 and 0.9865, respectively, our PSNR-μ and SSIM-μ are 43.4136 and 0.9902, respectively, and our HDR-VDP-2 score is 64.9853. To verify the generalization performance of each method, we also compare the results on the unlabeled Sen and Tursun datasets. 2) On the Sen dataset, our method not only suppresses ghosts effectively but also recovers clearer image details. 3) On the Tursun dataset, our method reconstructs the scene structure more realistically and avoids artifacts effectively. In addition, an ablation study confirms the effectiveness of the proposed components. Conclusion We propose a spatial-aware channel attention guided multi-scale HDR reconstruction network (SCAMNet). The spatial-aware channel attention mechanism and the multi-scale information reconstruction module are integrated into one framework, which effectively addresses both the artifacts caused by object motion and the recovery of details in saturated regions.
To enhance the information in the features that is useful for the reconstructed image, the spatial-aware channel attention mechanism establishes the relationship between features in the spatial and channel dimensions. The multi-scale information reconstruction module makes full use of the contextual semantic relationship among features at different scales to further mine the useful information in the input images and reconstruct the HDR image. The effectiveness of our method is verified both qualitatively and quantitatively.
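The idea of a channel attention that is made aware of spatial structure can be sketched as follows. This is only a minimal illustration under stated assumptions, not the SACAM architecture, whose layers the abstract does not describe. Here the global (mean) and salient (max) statistics along the channel dimension form a spatial map, that map weights the pooling that produces each channel descriptor, and a sigmoid gate rescales the channels. The fixed sums and sigmoids stand in for learned convolutions and fully connected layers, and every name is hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def spatial_aware_channel_attention(feat):
    """feat: list of C channels, each an H x W nested list of floats.

    Returns (re-weighted features, per-channel weights).
    """
    c, h, w = len(feat), len(feat[0]), len(feat[0][0])

    # Step 1: per-pixel global (mean) and salient (max) statistics taken
    # across channels, fused into one spatial map.
    # Assumption: a plain sum + sigmoid replaces a learned convolution.
    spatial = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            values = [feat[k][i][j] for k in range(c)]
            spatial[i][j] = sigmoid(sum(values) / c + max(values))

    # Step 2: spatially weighted pooling gives one descriptor per channel.
    # Assumption: a sigmoid replaces the learned fully connected layers.
    total = sum(sum(row) for row in spatial)  # > 0 since sigmoid > 0
    weights = []
    for k in range(c):
        pooled = sum(spatial[i][j] * feat[k][i][j]
                     for i in range(h) for j in range(w)) / total
        weights.append(sigmoid(pooled))

    # Step 3: rescale every channel by its attention weight.
    out = [[[weights[k] * feat[k][i][j] for j in range(w)]
            for i in range(h)] for k in range(c)]
    return out, weights
```

In a trained network the two stand-in steps would be learned, so that channels and locations carrying misaligned or saturated content could receive low weights; the fixed functions here only show how spatial statistics can steer the channel weighting.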
Keywords