Hyperspectral image denoising based on Transformer and channel-mixing parallel convolution
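Hu Shuai1,2, Gao Feng1,2, Gong Zhuoran1, Tao Shengen1,2, ShangGuan Xinyu1,2, Dong Junyu1,2 (1. School of Computer Science and Technology, Ocean University of China, Qingdao 266100; 2. Sanya Oceanographic Institution, Ocean University of China, Sanya 572025)
Abstract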
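Objective Hyperspectral images are easily contaminated by noise because of equipment and environmental factors, which reduces image visibility and analysis accuracy. Hyperspectral image denoising has therefore become a research hotspot in remote sensing image processing worldwide. Current hyperspectral image denoising methods face two main difficulties: 1) insufficient use of the global information in features, because methods based on convolutional neural networks (CNNs) are limited by the size of the convolution kernel and struggle to capture global feature information; 2) structural differences between CNNs and Transformers, which make the two hard to fuse, so that a suitable feature interaction scheme is needed to balance local and global feature extraction.
Method To address these problems, this paper proposes a hyperspectral image denoising model based on Transformer and channel-mixing parallel convolution. The model comprises three modules: a channel-mixing feature extraction module, a global enhancement module based on block downsampling, and an adaptive bidirectional feature fusion module. Through the interaction of these three modules, the model combines global and local feature information, handles the noise and texture differences between regions, and improves the recovery of spatial detail.
Result The proposed method was compared with five mainstream methods on two datasets. On the Pavia dataset under Gaussian noise of different intensities, the peak signal-to-noise ratio (PSNR) improved by up to 0.4 dB over the second-best model. On the ICVL dataset under various mixed-noise settings, PSNR improved by up to 2.18 dB over the second-best model. The visualized denoising results also reflect the strong performance of the proposed model.
Conclusion The proposed method denoises well under all tested noise settings and significantly outperforms current mainstream methods. It effectively removes noise from hyperspectral images while retaining their rich texture information.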
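The PSNR figures reported above can be read against the standard definition of the metric. The following is a minimal sketch in plain Python, not the authors' evaluation code. The function names, the `peak=1.0` default (pixel values scaled to [0, 1]), and the choice to average PSNR over spectral bands are assumptions made for illustration.

```python
import math


def psnr(clean, noisy, peak=1.0):
    """PSNR in dB between two equal-length sequences of pixel values.

    `peak` is the maximum possible pixel value (assumed 1.0 here).
    """
    if len(clean) != len(noisy) or len(clean) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between the reference and the test signal.
    mse = sum((c - n) ** 2 for c, n in zip(clean, noisy)) / len(clean)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(peak ** 2 / mse)


def mean_band_psnr(clean_bands, noisy_bands, peak=1.0):
    """Average per-band PSNR; each band is a flattened list of pixels.

    Averaging over bands is an assumed convention, not taken from the paper.
    """
    scores = [psnr(c, n, peak) for c, n in zip(clean_bands, noisy_bands)]
    return sum(scores) / len(scores)
```

For a single PSNR value at a fixed peak, a gain of Δ dB corresponds to an MSE ratio of 10^(-Δ/10), so 0.4 dB is roughly 9% lower mean squared error and 2.18 dB is roughly 40% lower.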
Keywords
Parallel channel shuffling and Transformer-based denoising for hyperspectral images
Hu Shuai1,2, Gao Feng1,2, Gong Zhuoran1, Tao Shengen1,2, ShangGuan Xinyu1,2, Dong Junyu1,2 (1. School of Computer Science and Technology, Ocean University of China, Qingdao 266100, China; 2. Sanya Oceanographic Institution, Ocean University of China, Sanya 572025, China)
Abstract
Objective With the increasing availability and advancement of hyperspectral imaging technology, hyperspectral images have become an invaluable resource in various fields, including agriculture, environmental monitoring, and remote sensing. However, these images are prone to noise contamination, which can significantly degrade their quality and hinder accurate analysis and interpretation. As a result, denoising hyperspectral images has become a crucial task in remote sensing image processing, attracting significant attention from researchers worldwide. The challenges associated with denoising hyperspectral images are multifaceted. First, the inherent characteristics of hyperspectral data, such as high dimensionality and complex spectral information, pose significant difficulties for traditional denoising approaches. Noise can obscure valuable information embedded within the spectral bands, so advanced denoising techniques are needed that restore the original signal while preserving rich texture and spatial details. Furthermore, deep learning techniques, particularly convolutional neural networks (CNNs), have revolutionized image processing, including denoising tasks, and CNN-based approaches have shown promising results in denoising various types of images. For hyperspectral data, however, traditional CNN architectures are limited in capturing the global contextual information necessary for accurate denoising. The fixed-size receptive fields of CNNs restrict their ability to exploit the spatial and spectral correlations present in hyperspectral images, thereby reducing their overall denoising performance. To overcome these limitations, recent research has explored the integration of Transformers, which were originally designed for natural language processing tasks, into computer vision, including hyperspectral image denoising. Transformers can capture long-range dependencies and global contextual information, making them an attractive alternative to CNNs for denoising tasks. However, directly applying Transformer-based models to hyperspectral data requires careful consideration of the specific challenges posed by the unique characteristics of hyperspectral images.
Method In this study, we propose a novel denoising model for hyperspectral images that combines the strengths of Transformers and parallel convolution operations. Our model comprises three key modules: a channel shuffling module, a block downsampling global enhancement module, and an adaptive bidirectional feature fusion module. These modules work together to address the challenges encountered in denoising hyperspectral images. The channel shuffling module exploits the inter-channel relationships within hyperspectral data by incorporating channel-mixing operations. By fusing information across different spectral channels, the module enhances the representation power of the network and enables more comprehensive feature extraction. This approach addresses the limitation of traditional CNN-based methods in fully utilizing the global information available in hyperspectral images, ultimately improving the model’s denoising performance. In the block downsampling global enhancement module, we use a block downsampling strategy to capture global contextual information. By reducing the spatial resolution of the input hyperspectral image, the module enlarges the receptive fields, allowing the model to incorporate larger-scale information during the denoising process. This mechanism enhances the model’s understanding of the overall structure of the image, facilitating more effective noise suppression and accurate restoration of spatial details. The adaptive bidirectional feature fusion module is designed to strike a balance between local and global feature extraction, leveraging the complementary strengths of CNNs and Transformers. This module introduces a mechanism for adaptively fusing features from local and global contexts, enabling the model to combine local details with global information. By considering the intricate relationship between spatial and spectral features, our proposed approach improves the denoising performance and preserves the rich texture information inherent in hyperspectral images.
Result To evaluate the effectiveness of our proposed model, extensive experiments were conducted on publicly available hyperspectral image datasets, including ICVL and Pavia. Experimental results demonstrated the superior denoising performance of our approach compared with that of current state-of-the-art methods. Our model consistently outperformed existing techniques in various noise scenarios, effectively removing noise while preserving the fine spatial details and rich texture information of hyperspectral images. The experimental evaluation involved quantitative metrics such as peak signal-to-noise ratio (PSNR), structural similarity index (SSIM), and spectral angle mapper (SAM). Our proposed model achieved significantly higher PSNR values and SSIM scores compared with the baseline methods, indicating improved denoising accuracy and visual quality of the restored images. In addition, the SAM values obtained using our model were consistently lower, indicating higher spectral similarity. Moreover, we conducted a comprehensive analysis of the computational efficiency of our model. With the increasing volume and complexity of hyperspectral data, developing denoising methods that are computationally efficient without sacrificing performance is crucial. Our proposed model demonstrated competitive computational efficiency, making it practical for real-world applications that involve large-scale hyperspectral image processing.
Conclusion The success of our denoising model can be attributed to the combination of the Transformer-based architecture and the channel-mixing parallel convolution operations. The Transformer module enables effective capture of global contextual information, facilitating better understanding of the relationships between spectral bands and spatial features. By incorporating channel-mixing operations, our model exploits the inter-channel correlations and enhances the discriminative power of feature extraction, resulting in improved denoising performance. Furthermore, our model’s ability to handle diverse noise scenarios and maintain image quality can be attributed to the adaptive bidirectional feature fusion module. This module combines local and global features, enabling effective noise suppression while preserving the fine details and texture information specific to different regions of the hyperspectral images. The adaptability of the feature fusion mechanism supports robust denoising performance across various noise levels and image characteristics. In conclusion, this study presents a novel denoising model for hyperspectral images based on the integration of Transformers and channel-mixing parallel convolution. The proposed model addresses the limitations of traditional approaches in utilizing global information and captures the complex spatial-spectral correlations inherent in hyperspectral data. Experimental results demonstrate its superior denoising performance compared with that of state-of-the-art methods, with improved accuracy and preservation of fine details and texture information. The model’s computational efficiency further enhances its practicality for real-world applications. Future research directions may include exploring additional mechanisms for adaptive feature fusion and investigating the model’s performance on other hyperspectral image processing tasks such as classification and segmentation.
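The SAM metric named in the abstract measures the angle between the reference spectrum and the restored spectrum at each pixel, so lower values mean closer spectral shapes. The following is a minimal sketch of the standard definition in plain Python, not the authors' evaluation code. The function names, the `eps` stabilizer, and reporting the result in degrees are assumptions made for illustration.

```python
import math


def spectral_angle(ref, est, eps=1e-12):
    """Angle in degrees between two spectra (one value per band)."""
    dot = sum(r * e for r, e in zip(ref, est))
    norm = math.sqrt(sum(r * r for r in ref)) * math.sqrt(sum(e * e for e in est))
    # eps avoids division by zero for all-zero spectra (assumed convention).
    cos = dot / (norm + eps)
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cos = max(-1.0, min(1.0, cos))
    return math.degrees(math.acos(cos))


def mean_sam(ref_pixels, est_pixels):
    """Mean spectral angle over pixels; each pixel is a list of band values."""
    angles = [spectral_angle(r, e) for r, e in zip(ref_pixels, est_pixels)]
    return sum(angles) / len(angles)
```

Because the angle ignores the overall magnitude of a spectrum, a restored pixel that is a scaled copy of the reference scores close to zero. This is why SAM is usually reported alongside PSNR and SSIM rather than in place of them.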
Keywords