Multi-scale fusion-enhanced ultrasound elastic images segmentation for mediastinal lymph node

Zhou Qi; Yang Hang; Tian Chuangeng; Tang Lu; Hui Yu

Published: 2024-03-15
DOI: 10.11834/jig.230324
2024 | Volume 29 | Number 3

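Zhou Qi¹, Yang Hang¹,², Tian Chuangeng³, Tang Lu¹, Hui Yu¹ (1. School of Medical Imaging, Xuzhou Medical University, Xuzhou 221004; 2. School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221116; 3. School of Information Engineering (School of Big Data), Xuzhou University of Technology, Xuzhou 221018)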

Abstract

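Objective Bronchial ultrasound elastography carries rich channel-level semantic information. Accurately segmenting mediastinal lymph nodes is important for diagnosing whether lung cancer has metastasized and for cancer staging and treatment. Research on ultrasound elastic image segmentation remains limited, and the relationships among image channel features have not been fully exploited. This paper therefore proposes an attention-based multi-scale fusion enhanced ultrasound elastic images segmentation network for mediastinal lymph node (AMFE-UNet), built on U-Net. Method First, because the images provide both position and channel information about mediastinal lymph nodes, a dense convolutional network (DenseNet) is used as the model encoder. Second, a multi-scale fusion-enhanced decoder is designed by combining an attention mechanism with dilated convolutions, so that the boundaries and textures of the nodules are modeled at multiple scales and ranges. Finally, skip connections are designed with a selective kernel network so that the intermediate features of the encoder are fully fused with the output features of the decoder. Depending on whether the decoder features are fused numerically or along the channel dimension, AMFE-UNet is divided into two subtypes, A and B. Result Comparative experiments and validation were conducted on an ultrasound elastic image dataset. AMFE-UNet achieves an average Dice coefficient of 86.593%, an improvement of 1.986% over U-Net. Compared with the other models, AMFE-UNet A is best on Dice, precision, and specificity, and AMFE-UNet B is best on intersection over union, sensitivity, and Hausdorff distance. Ablation experiments and visualization analysis show that the proposed improvements yield clear gains. Conclusion This paper designs the encoder of the segmentation model with a dense convolutional network and uses a channel attention mechanism to optimize the feature recovery and connection processes of the model, achieving good mediastinal lymph node segmentation in ultrasound elastic images with high clinical application value. Code: https://github.com/Philo-github/AMFE-UNet.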

Keywords

ultrasound elastography (UE); mediastinal lymph nodes; instance segmentation; U-Net; channel attention mechanism

Multi-scale fusion-enhanced ultrasound elastic images segmentation for mediastinal lymph node

Zhou Qi¹, Yang Hang¹,², Tian Chuangeng³, Tang Lu¹, Hui Yu¹ (1. School of Medical Imaging, Xuzhou Medical University, Xuzhou 221004, China; 2. School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221116, China; 3. School of Information Engineering (School of Big Data), Xuzhou University of Technology, Xuzhou 221018, China)

Abstract

Objective Ultrasound elastography enables non-invasive diagnosis of lesion tissue by analyzing differences in hardness among body tissues, and it is increasingly used in the diagnosis of many diseases. In bronchial ultrasound elastography, accurately segmenting mediastinal lymph nodes is important for determining whether lung cancer has metastasized and plays an important role in the subsequent staging and diagnosis of cancer. Manual segmentation by radiologists is time-consuming, and research on automated segmentation specifically for ultrasound elastic images is limited. Deep learning-based assisted segmentation methods have therefore attracted considerable attention. Although ultrasound elastic images can provide some guidance for segmenting regions of interest, they also obscure texture information in those regions, which makes segmentation challenging. Existing research has focused primarily on the encoder structure of the model, particularly on incorporating different pre-trained models to accommodate the three-channel data format of ultrasound elastic images. However, limited research has been conducted on the intermediate features obtained by the encoder and decoder structures, resulting in less precise segmentation results. Therefore, this study proposes a network for mediastinal lymph node segmentation, called the attention-based multi-scale fusion enhanced ultrasound elastic images segmentation network for mediastinal lymph node (AMFE-UNet). Method First, a pre-trained dense convolutional network (DenseNet) with dense connections is introduced into the U-Net architecture to extract channel and position information from ultrasound elastic images. Second, to model the boundaries and textures of the nodules at different scales and scopes, the decoder module is enhanced with efficient channel attention (ECA) and dilated convolutions.
Three dilated convolution branches and one pooling branch are set up in each decoder module. Different combinations of the results from these branches give the following four decoder structures. 1) Decoder-A: Results from each branch are added and passed through an ECA module. 2) Decoder-B: Results from each branch are concatenated along the channel dimension and passed through an ECA module. 3) Decoder-C: Each branch is equipped with an ECA module, and results from each branch are concatenated along the channel dimension. 4) Decoder-D: Results from each branch are densely connected and passed through an ECA module. Lastly, a selective kernel network (SK-Net) is used to enhance the fusion of features obtained from the encoder and decoder, ensuring a more comprehensive integration. In the experiments, the proposed models are implemented using Python 3.7 and PyTorch 1.12. The image processing workstation is equipped with an Intel i9-13900K CPU and two NVIDIA RTX 4090 GPUs, each with 24 GB of memory. The initial parameters of the model are obtained using the default initialization method in PyTorch. The Adam optimizer is used to update the network parameters. The learning rate is initially set to 0.0001, with a weight decay coefficient of 0.1, and is decayed every 90 iterations. The Dice coefficient is used as the loss function, and the model is trained for 190 epochs. Result The experiment is performed on a collected dataset of bronchial ultrasound elastic images with six-fold cross-validation. The evaluation metrics include the Dice coefficient, sensitivity, specificity, precision, intersection over union (IoU), 95th-percentile Hausdorff distance (HD95), parameter count, and GFLOPs. The first five metrics range between 0 and 1; a higher value indicates better segmentation performance. HD95 does not have a specified range, and a lower value indicates better segmentation performance.
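The five overlap metrics listed above can all be derived from the pixel-level confusion counts of a predicted mask against a reference mask. The following is a minimal sketch under stated assumptions, not the authors' evaluation code: masks are taken to be flat binary lists, the function names are hypothetical, and undefined ratios are returned as 0.0 purely as a convention for this example. HD95 is a boundary-distance metric and is not covered here.

```python
def confusion_counts(pred, target):
    """Count TP, FP, FN, TN for two equally sized flat binary masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    tp = sum(1 for p, t in zip(pred, target) if p and t)
    fp = sum(1 for p, t in zip(pred, target) if p and not t)
    fn = sum(1 for p, t in zip(pred, target) if not p and t)
    tn = len(pred) - tp - fp - fn
    return tp, fp, fn, tn


def safe_div(num, den):
    """Return num / den, or 0.0 when the ratio is undefined (sketch convention)."""
    return num / den if den else 0.0


def segmentation_metrics(pred, target):
    """Dice, IoU, sensitivity, specificity and precision, each in [0, 1]."""
    tp, fp, fn, tn = confusion_counts(pred, target)
    return {
        "dice": safe_div(2 * tp, 2 * tp + fp + fn),
        "iou": safe_div(tp, tp + fp + fn),
        "sensitivity": safe_div(tp, tp + fn),
        "specificity": safe_div(tn, tn + fp),
        "precision": safe_div(tp, tp + fp),
    }


# Toy masks: 2 true positives, 1 false positive, 1 false negative, 4 true negatives.
pred = [1, 1, 1, 0, 0, 0, 0, 0]
target = [1, 1, 0, 1, 0, 0, 0, 0]
metrics = segmentation_metrics(pred, target)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

When the Dice coefficient is used as a training loss, a common choice is to minimize one minus a differentiable (soft) version of the same ratio; whether the model here uses exactly that form is not stated in the abstract.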
The ablation experiments show that the proposed skip connection structure and decoder structure each improve the model. The model using SK-Net as skip connections has slightly lower sensitivity than Dense-UNet but is better on the remaining five metrics. The four models using the multi-scale fusion-enhanced decoder outperform Dense-UNet by 0.4% to 0.9% in Dice coefficient and by up to 2% in precision. Two final models were designed according to the ablation experiments: AMFE-UNet A and AMFE-UNet B. AMFE-UNet is compared with a variety of models, including U-Net, Att-UNet, Seg-Net, DeepLabV3+, Trans-UNet, U-Net++, BPAT-UNet, CTO, and ACE-Net. The Dice coefficient of AMFE-UNet is 86.59% on average, an improvement of 1.983% compared with U-Net. AMFE-UNet A is optimal in terms of Dice coefficient, precision, and specificity, while AMFE-UNet B is optimal in terms of sensitivity, IoU, and HD95. The class activation maps show that AMFE-UNet achieves better segmentation sensitivity and completeness by focusing on the content of the region at the lower levels of the network and on the boundaries of the region at the higher levels. The other networks focus only on the content of the region and are ineffective at segmenting its boundaries. The training and testing loss curves indicate that AMFE-UNet B converges faster and segments better than AMFE-UNet A. Conclusion Extensive experiments demonstrate that AMFE-UNet, which incorporates attention mechanisms, segments ultrasound elastic images effectively, which is significant for future research on multichannel medical images. The code is available at https://github.com/Philo-github/AMFE-UNet.
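To make the difference between additive fusion (Decoder-A) and channel-wise concatenation (Decoder-B) concrete, the sketch below fuses four branch outputs both ways and applies an ECA-style channel reweighting. It is an illustrative NumPy approximation written for this summary, not code from the linked repository: the branch outputs are random placeholders rather than real dilated-convolution or pooling results, the function names are hypothetical, and a fixed uniform kernel stands in for the learned 1D convolution of ECA.

```python
import numpy as np


def eca(x, kernel_size=3):
    """ECA-style channel attention on a (C, H, W) feature map.

    Global average pooling gives one descriptor per channel; a 1D
    convolution across neighbouring channels followed by a sigmoid gives
    per-channel weights that rescale the input. The uniform kernel is a
    placeholder for learned weights (assumption of this sketch).
    """
    descriptor = x.mean(axis=(1, 2))                      # shape (C,)
    kernel = np.full(kernel_size, 1.0 / kernel_size)      # placeholder weights
    mixed = np.convolve(descriptor, kernel, mode="same")  # shape (C,)
    weights = 1.0 / (1.0 + np.exp(-mixed))                # sigmoid, in (0, 1)
    return x * weights[:, None, None]


def fuse_add(branches):
    """Decoder-A style: element-wise sum of branch outputs, then ECA."""
    return eca(np.sum(branches, axis=0))


def fuse_concat(branches):
    """Decoder-B style: concatenation along the channel axis, then ECA."""
    return eca(np.concatenate(branches, axis=0))


rng = np.random.default_rng(0)
# Four hypothetical branch outputs (three dilated convolutions, one pooling),
# each with 8 channels on a 16 x 16 grid.
branches = [rng.standard_normal((8, 16, 16)) for _ in range(4)]

out_a = fuse_add(branches)     # channel count unchanged
out_b = fuse_concat(branches)  # channel count multiplied by the branch count
print(out_a.shape, out_b.shape)
```

Additive fusion keeps the channel count fixed, whereas concatenation keeps every branch's channels separate so that the attention weights can favour individual branches, at the cost of a wider feature map.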

Keywords

ultrasound elastography (UE); mediastinal lymph nodes; instance segmentation; U-Net; channel attention mechanism
