Adaptive modality-fusion dual-encoder network for MRI brain tumor segmentation
Abstract
Objective Assessing tumor malignancy is a challenging task in clinical diagnosis. Magnetic resonance images of brain tumors vary widely in shape and size, and tumor boundaries are blurred, which makes tumor segmentation difficult. To effectively assist clinicians in tumor assessment and diagnosis and to improve brain tumor segmentation accuracy, we propose D3D-Net (double 3D network), a dual-encoder segmentation network with adaptive modality fusion. Method The proposed network uses multiple encoders with a dedicated feature-fusion strategy. Two-level encoders fully extract image features from different modality combinations, and in the encoding stage a targeted fusion strategy integrates the feature information from the upper and lower sub-encoders while removing redundant features. In addition, dilated multi-fiber modules in both the encoder and decoder capture multi-scale image features without additional computational cost, and attention gates are introduced to preserve fine details. Results D3D-Net was trained and tested on the BraTS2018 (brain tumor segmentation 2018), BraTS2019, and BraTS2020 datasets, and ablation experiments were conducted. On BraTS2018, the average Dice scores for the enhancing tumor, whole tumor, and tumor core improved by 3.6%, 1.0%, and 11.5% over 3D U-Net, and by 2.2%, 0.2%, and 0.1% over DMF-Net (dilated multi-fiber network). On BraTS2019, the average Dice scores for the enhancing tumor, whole tumor, and tumor core improved by 2.2%, 0.6%, and 7.1% over 3D U-Net. On BraTS2020, they improved by 2.5%, 1.9%, and 2.2% over 3D U-Net. Conclusion The proposed dual-encoder fusion network fully fuses multi-modal features and can effectively segment small tumor regions.
Adaptive modal fusion dual encoder MRI brain tumor segmentation network
Zhang Yihan, Bai Zhengyao, You Yilin, Li Zekai (School of Information Science and Engineering, Yunnan University, Kunming 650500, China)
Abstract
Objective Accurate segmentation of brain tumors is a challenging clinical diagnosis task, especially when assessing the degree of malignancy. Magnetic resonance imaging (MRI) of brain tumors exhibits various shapes and sizes, and the accurate segmentation of small tumors plays a crucial role in achieving accurate assessment results. However, because of the significant variability in the shape and size of brain tumors and their fuzzy boundaries, tumor segmentation remains challenging. In this paper, we propose a multi-modal MRI brain tumor image segmentation network, named D3D-Net, based on a dual-encoder fusion architecture to improve segmentation accuracy. The performance of the proposed network is evaluated on the BraTS2018, BraTS2019, and BraTS2020 datasets. Method The proposed network utilizes multiple encoders and a feature fusion strategy. The network incorporates dual-layer encoders to thoroughly extract image features from various modality combinations, thereby enhancing segmentation accuracy. In the encoding phase, a targeted fusion strategy fully integrates the feature information from the upper and lower sub-encoders, effectively eliminating redundant features. Additionally, the encoding-decoding process employs a dilated multi-fiber module to capture multiscale image features without incurring additional computational cost. Furthermore, an attention gate is introduced to preserve fine-grained details. We conducted experiments on the BraTS2018, BraTS2019, and BraTS2020 datasets, including ablation and comparative experiments. We used the BraTS2018 training dataset, which consists of the magnetic resonance images of 210 high-grade glioma (HGG) and 75 low-grade glioma (LGG) patients. The validation dataset contains 66 cases. The BraTS2019 dataset adds 49 HGG cases and 1 LGG case on top of the BraTS2018 dataset. Specifically, BraTS2018 is an open dataset that was released for the 2018 Brain Tumor Segmentation Challenge.
The dataset contains multi-modal magnetic resonance images of HGG and LGG patients, including T1-weighted, T1-weighted contrast-enhanced, T2-weighted, and fluid-attenuated inversion recovery (FLAIR) image sequences, all of which are MRI sequences used to image the brain. T1-weighted MRI scans emphasize the contrast between different tissues on the basis of the relaxation time of the hydrogen atoms in the brain. In T1-weighted images, the cerebrospinal fluid appears dark, while the white matter appears bright. This type of scan is often used to detect structural abnormalities in the brain, such as tumors, and to assess brain atrophy. T1-weighted contrast-enhanced MRI scans involve the injection of a contrast agent into the bloodstream to improve the visualization of certain types of brain lesions. This type of scan is particularly useful in detecting tumors because the contrast agent tends to accumulate in abnormal tissues. T2-weighted MRI scans emphasize the contrast between different tissues on the basis of the water content in the brain. In T2-weighted images, the cerebrospinal fluid appears bright, while the white matter appears dark. This type of scan is often used to detect areas of brain edema or inflammation. FLAIR MRI scans are similar to T2-weighted images but with the suppression of signals from the cerebrospinal fluid. This type of scan is particularly useful in detecting abnormalities that may be difficult to visualize with other types of scans, such as small areas of brain edema or lesions in the posterior fossa. The dataset is divided into two subsets: the training dataset (285 cases, comprising 210 HGG and 75 LGG patients) and the validation dataset (66 cases). Result The proposed D3D-Net exhibits superior performance compared with the baseline 3D U-Net and DMF-Net models.
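The attention gate mentioned in the Method section can be illustrated with a minimal additive-attention sketch. This is not the authors' implementation; it is a generic per-voxel attention gate in NumPy, with all weight matrices (`W_x`, `W_g`, `psi`) randomly initialized purely for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def attention_gate(x, g, W_x, W_g, psi):
    """Additive attention gate producing a scalar weight per voxel.

    x   : skip-connection features, shape (C_x, N)  -- N flattened voxels
    g   : gating features from the coarser decoder level, shape (C_g, N)
    W_x : (C_int, C_x) linear projection of x
    W_g : (C_int, C_g) linear projection of g
    psi : (1, C_int) projection to one attention logit per voxel
    """
    q = np.maximum(W_x @ x + W_g @ g, 0.0)   # additive attention + ReLU
    alpha = sigmoid(psi @ q)                  # (1, N) weights in (0, 1)
    return x * alpha                          # reweighted skip features

# toy example with hypothetical channel sizes
rng = np.random.default_rng(0)
C_x, C_g, C_int, N = 4, 6, 3, 5
x = rng.standard_normal((C_x, N))
g = rng.standard_normal((C_g, N))
out = attention_gate(x, g,
                     rng.standard_normal((C_int, C_x)),
                     rng.standard_normal((C_int, C_g)),
                     rng.standard_normal((1, C_int)))
print(out.shape)  # same shape as x: (4, 5)
```

Because the gate multiplies skip features by weights in (0, 1), uninformative regions are suppressed while salient detail passes to the decoder.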
Specifically, on the BraTS2018 dataset, D3D-Net achieves high average Dice coefficients of 79.7%, 89.5%, and 83.3% for enhancing tumor, whole tumor, and tumor core segmentation, respectively. This result shows the effectiveness of the proposed network in accurately segmenting brain tumors of different sizes and shapes. D3D-Net also demonstrates an improvement in segmentation accuracy over the 3D U-Net and DMF-Net models. In particular, compared with the 3D U-Net model, D3D-Net shows a significant improvement of 3.6%, 1.0%, and 11.5% in enhancing tumor, whole tumor, and tumor core segmentation, respectively. Compared with the DMF-Net model, D3D-Net demonstrates improvements of 2.2%, 0.2%, and 0.1% in the same segmentation tasks. On the BraTS2019 dataset, D3D-Net also achieves high accuracy, with average Dice coefficients of 89.6%, 91.4%, and 92.7% for enhancing tumor, whole tumor, and tumor core segmentation, respectively. The improvements over the 3D U-Net model are 2.2%, 0.6%, and 7.1%, respectively, for enhancing tumor, whole tumor, and tumor core segmentation. These results suggest that the proposed D3D-Net is an effective and accurate approach for segmenting brain tumors of different sizes and shapes. The network's superior performance compared with the 3D U-Net and DMF-Net models indicates that the dual-encoder fusion architecture, which fully integrates multi-modal features, is crucial for accurate segmentation. Moreover, the high accuracy achieved by D3D-Net on both the BraTS2018 and BraTS2019 datasets demonstrates the robustness of the proposed method and its potential to aid in the accurate assessment of brain tumors, ultimately improving clinical diagnosis. On the BraTS2020 dataset, the average Dice values for the enhancing tumor, whole tumor, and tumor core increased by 2.5%, 1.9%, and 2.2%, respectively, compared with those of 3D U-Net. Conclusion The proposed dual-encoder fusion network, D3D-Net, demonstrates promising performance in accurately segmenting brain tumors from MRI images. The network can improve the accuracy of brain tumor segmentation, aid in the accurate assessment of brain tumors, and ultimately improve clinical diagnosis. The proposed network has the potential to become a valuable tool for radiologists and medical practitioners in the field of neuro-oncology.
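The Dice coefficient used throughout the evaluation above measures overlap between a predicted and a reference mask. A minimal sketch of the standard binary formulation (not the paper's evaluation code) is:

```python
import numpy as np

def dice(pred, target, eps=1e-7):
    """Dice similarity coefficient between two binary masks:
    2*|P ∩ T| / (|P| + |T|), with eps guarding the empty-mask case."""
    pred = np.asarray(pred).astype(bool)
    target = np.asarray(target).astype(bool)
    inter = np.logical_and(pred, target).sum()
    return (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

# toy example: two overlapping 1-D masks with 2 voxels in common
p = np.array([1, 1, 0, 0, 1])
t = np.array([1, 0, 0, 1, 1])
print(round(dice(p, t), 3))  # 2*2 / (3+3) = 0.667
```

In the BraTS setting this score is computed separately for each region (enhancing tumor, whole tumor, tumor core) and averaged over cases.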
Keywords
brain tumor segmentation; multimodal fusion; dual encoder; magnetic resonance imaging (MRI); attention gate