Transformer-network-based segmentation of COVID-19 lung CT images

Fan Shenglan, Bai Zhengyao, Lu Qianjie, Zhou Xue (School of Information Science and Engineering, Yunnan University, Kunming 650500)

Abstract
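Objective Lesions in lung CT (computed tomography) images of COVID-19 (corona virus disease 2019) patients are multi-scale and irregular in shape. Because convolutional layers lack long-range dependencies, semantic segmentation methods based on convolutional neural networks (CNNs) pay insufficient attention to false negatives in lesions and suffer from low sensitivity and high specificity. To address the multi-scale nature of COVID-19 lesions, we exploit the Transformer's strong ability to capture global context and propose a Transformer network for segmenting lung CT images of COVID-19 patients: COVID-TransNet. Method The network uses Swin Transformer as its backbone. In the encoder, a linear feed-forward module with a residual connection and layer normalization (LN) is proposed to adjust the channel dimension of the feature maps, and axial attention modules replace the skip connections to strengthen the network's attention to global information. In the decoder, a new feature fusion module refines local information level by level during upsampling, multi-level prediction is adopted for deep supervision, and Swin Transformer modules finally decode the feature maps at each decoder level. Result On the COVID-19 CT segmentation dataset, the network achieves a Dice coefficient of 0.789, a sensitivity of 0.807, a specificity of 0.960, and a mean absolute error of 0.055. Compared with Semi-Inf-Net, the first three improve by 5%, 8.2%, and 0.9%, respectively, and the mean absolute error decreases by 0.9%, reaching state-of-the-art performance. Conclusion The Transformer-based COVID-19 CT image segmentation network improves the segmentation accuracy of COVID-19 lesions and effectively addresses the low sensitivity and high specificity of CNN methods.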
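The four reported metrics (Dice coefficient, sensitivity, specificity, mean absolute error) can be illustrated with a minimal NumPy sketch. This is not the authors' evaluation code: the function name `segmentation_metrics`, the 0.5 binarization threshold, and the small `eps` guard against division by zero are assumptions made here for illustration.

```python
import numpy as np


def segmentation_metrics(pred_prob, gt_mask, threshold=0.5, eps=1e-8):
    """Compute Dice, sensitivity, specificity, and MAE for one binary mask.

    pred_prob: predicted lesion probabilities in [0, 1].
    gt_mask:   ground-truth mask (nonzero = lesion).
    threshold: assumed cut-off for binarizing the prediction (hypothetical).
    """
    pred_prob = np.asarray(pred_prob, dtype=np.float64)
    gt = np.asarray(gt_mask).astype(bool)
    pred = pred_prob >= threshold

    tp = np.sum(pred & gt)    # lesion pixels correctly detected
    tn = np.sum(~pred & ~gt)  # background pixels correctly rejected
    fp = np.sum(pred & ~gt)   # background predicted as lesion
    fn = np.sum(~pred & gt)   # lesion pixels that were missed

    return {
        "dice": 2 * tp / (2 * tp + fp + fn + eps),
        "sensitivity": tp / (tp + fn + eps),
        "specificity": tn / (tn + fp + eps),
        # MAE compares the raw probability map with the mask, pixel by pixel.
        "mae": float(np.mean(np.abs(pred_prob - gt.astype(np.float64)))),
    }
```

Sensitivity is the only one of these that is penalized solely by missed lesion pixels (false negatives), which is why a method that under-segments lesions can still show high specificity.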
Keywords
A Transformer network based CT image segmentation for COVID-19-derived lung disease

Fan Shenglan, Bai Zhengyao, Lu Qianjie, Zhou Xue (School of Information Science and Engineering, Yunnan University, Kunming 650500, China)

Abstract
Objective Screening for corona virus disease 2019 (COVID-19) currently relies mainly on reverse transcription-polymerase chain reaction (RT-PCR), which suffers from low sensitivity and is time-consuming. To improve diagnostic accuracy and reduce the workload, chest X-ray (CXR) and computed tomography (CT) imaging have become two key techniques for screening COVID-19 patients. However, these methods have limitations of their own: visual interpretation depends on the clinician's experience, and reading CT scans takes considerable time. To diagnose COVID-19 patients rapidly, deep learning has been applied to segment and identify lesion regions in patients' CT images. Most semantic segmentation methods are built on convolutional neural networks (CNNs). COVID-19 lesions are multi-scale and irregular in shape, and the limited receptive field of a CNN makes it difficult to capture the complete information. As a result, CNN-based semantic segmentation methods pay insufficient attention to false negatives when handling such lesions and suffer from low sensitivity and high specificity. Method First, Swin Transformer is used as the backbone, and the outputs of its second, fourth, eighth, and twelfth Swin Transformer modules are extracted to produce feature maps at four scales. Because Transformers need large amounts of training data, transfer learning is adopted and the backbone uses weights pretrained on ImageNet. Second, a linear feed-forward module with a residual connection and layer normalization (LN) is developed to adjust the channel dimension of the feature maps, and an axial attention module is applied to strengthen the network's attention to global information.
The fully connected layer of the linear feed-forward module operates only along the channel dimension, and the self-attention of the axial attention module is computed only locally, so the computational cost barely increases. Finally, in the decoder, to improve the segmentation accuracy at lesion edges, a structure is developed that refines local information step by step, and a multi-level prediction method is used for deep supervision. Swin Transformer modules decode the feature maps at every level of the decoder, which helps the network learn to refine local information gradually. Result On the COVID-19 CT segmentation dataset, without data augmentation, the Dice coefficient is 0.789, the sensitivity is 0.807, the specificity is 0.960, and the mean absolute error (MAE) is 0.055. Compared with Semi-Inf-Net, the first three improve by 5%, 8.2%, and 0.9%, respectively, and the MAE decreases by 0.9%. Ablation experiments verify that each module improves the segmentation accuracy. Generalization is verified on 638 slices of the COVID-19 infection segmentation dataset, where the Dice coefficient is 0.704, the sensitivity is 0.807, and the specificity is 0.960. Compared with Semi-Inf-Net, these improve by 10.7%, 0.1%, and 1.3%, respectively. Conclusion A Transformer-based network is applied to segment COVID-19 CT images. With its pure-Transformer structure, the proposed network handles both local and global information effectively. It improves the segmentation accuracy of COVID-19 lesions and, to a certain extent, alleviates the low sensitivity and high specificity of traditional CNNs. Experiments demonstrate that COVID-TransNet generalizes well and segments with high accuracy.
This helps clinicians diagnose COVID-19 patients efficiently.
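The axial attention mentioned in the Method part restricts self-attention to one spatial axis at a time. The NumPy sketch below illustrates that general idea only; it is not the COVID-TransNet module. The single-head form, the height-then-width order, the absence of positional encodings and residual connections, and the names `attend_along_axis` and `axial_attention` are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attend_along_axis(x, wq, wk, wv, axis):
    """Single-head self-attention along one spatial axis of x with shape (H, W, C).

    axis=0 attends within each column (over height),
    axis=1 attends within each row (over width).
    """
    # Treat the attended axis as the sequence dimension: (batch, length, C).
    seq = np.moveaxis(x, axis, 1)
    q, k, v = seq @ wq, seq @ wk, seq @ wv
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1])
    out = softmax(scores, axis=-1) @ v
    # Restore the original (H, W, C) layout.
    return np.moveaxis(out, 1, axis)


def axial_attention(x, wq, wk, wv):
    """Apply attention over height, then over width (weights shared for brevity)."""
    y = attend_along_axis(x, wq, wk, wv, axis=0)
    return attend_along_axis(y, wq, wk, wv, axis=1)
```

For an H x W feature map, full self-attention compares every pair of positions, roughly (HW)^2 scores, whereas the two axial passes need about HW(H + W), which is one way to keep attention affordable on high-resolution feature maps.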
Keywords
