Polyp segmentation in gastrointestinal endoscopy images using a dual encoder-decoder architecture

Wei Tianqi, Xiao Zhiyong (School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214122, China)

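Abstract (translated from the Chinese version)
Objective Gastrointestinal endoscopy has long been regarded as the gold standard for detecting and preventing colorectal cancer, but current clinical examinations still carry a certain probability of missed diagnosis. Deep-learning-based segmentation of gastrointestinal endoscopy images can help physicians assess precancerous lesions accurately, which benefits both diagnosis and interventional treatment. Improving the accuracy of target segmentation nevertheless remains challenging, and to address this problem we propose an algorithm based on a double-layer encoder-decoder structure. Method The algorithm consists of an upstream and a downstream network. The upstream network is trained to produce an attention weight map, which guides the feature maps in the decoding process of the downstream network so that the segmentation model focuses more on the target region. A subspace channel attention structure is proposed to extract cross-channel information at multiple resolutions in the skip connections, which effectively refines segmentation edges. A residual structure is added to the final output to prevent network degradation. Result The algorithm was tested on the public datasets CVC-ClinicDB (Colonoscopy Videos Challenge-ClinicDataBase) and Kvasir-Capsule, using the Dice similarity coefficient (DSC), mean intersection over union (mIoU), precision and recall as evaluation metrics. The DSC on the two datasets reached 94.22% and 96.02%, respectively. The two datasets were further mixed to test the robustness of the algorithm on cross-device images, where the DSC improved by 17% to 20%. Without post-processing, and compared with other state-of-the-art (SOTA) models, the algorithm improved on U-Net by 1.64%, 1.41% and 2.54% in DSC, mIoU and recall, respectively, and on ResUNet++ by 2.23% and 9.87% in DSC and recall, respectively. Compared with SFA (selective feature aggregation network), PraNet and TransFuse, it also shows significant improvements on the above metrics. Conclusion The proposed algorithm effectively improves medical image segmentation and achieves higher accuracy on small-target and edge segmentation.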
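The four evaluation metrics named above can be illustrated with a minimal sketch for a single pair of binary masks. This is not the authors' evaluation code: the function names are hypothetical, masks are assumed to be flat sequences of 0/1 values, and mIoU is assumed here to be the mean of the foreground and background IoU (other conventions, such as averaging IoU over images, also exist).

```python
def confusion_counts(pred, target):
    """Count true/false positives and negatives for two flat 0/1 masks."""
    tp = fp = fn = tn = 0
    for p, t in zip(pred, target):
        if p and t:
            tp += 1
        elif p and not t:
            fp += 1
        elif not p and t:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def segmentation_metrics(pred, target, eps=1e-7):
    """Return DSC, mIoU, precision and recall, each in [0, 1].

    eps only guards against division by zero on empty masks.
    mIoU is taken as the mean of foreground and background IoU
    (an assumption; the abstract does not state the convention).
    """
    tp, fp, fn, tn = confusion_counts(pred, target)
    dsc = 2 * tp / (2 * tp + fp + fn + eps)
    iou_foreground = tp / (tp + fp + fn + eps)
    iou_background = tn / (tn + fp + fn + eps)
    return {
        "dsc": dsc,
        "miou": (iou_foreground + iou_background) / 2,
        "precision": tp / (tp + fp + eps),
        "recall": tp / (tp + fn + eps),
    }


if __name__ == "__main__":
    # Toy 2x2 masks flattened to length 4.
    print(segmentation_metrics([1, 1, 0, 0], [1, 0, 1, 0]))
```

In practice the masks would come from thresholding the network output, and the scores would be averaged over the test set.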
Keywords
Polyp segmentation in gastroscopic images using a dual encoder-decoder architecture

Wei Tianqi, Xiao Zhiyong(School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214122, China)

Abstract
Objective Adenomatous polyps are an early manifestation of colorectal cancer, and early intervention is an effective way to prevent the disease. Gastroscopy is regarded as the “gold standard” for detecting and preventing colorectal cancer, yet clinical examinations still carry a certain probability of missed diagnosis. Deep-learning-based segmentation of gastrointestinal endoscopy images can help assess precancerous lesions efficiently, which benefits both diagnosis and clinical intervention. Intestinal polyps are typically small and round with blurred edges, which greatly increases the difficulty of semantic segmentation. We therefore develop an improved algorithm based on a double-layer encoder-decoder structure. Method Our algorithm comprises an upstream and a downstream network. The attention weight map generated by training the upstream network is merged into the decoding part of the downstream network. 1) To make the network focus on the target area of the image, the generated attention map guides the feature maps during decoding. The model thus ignores background areas and pays more attention to the regions to be segmented, which markedly helps small-target recognition in semantic segmentation. 2) Edge extraction is also addressed. Because the intestinal wall and the polyp mucosa look similar, the edges of the segmentation target are blurred, so the edge extraction ability of the model must be strengthened to obtain more accurate results. To improve segmentation of polyp boundaries, subspace channel attention is integrated into the skip connections of the downstream network to extract cross-channel information at multiple resolutions and refine the edges. Unlike the convolution operation, this relies on a self-attention mechanism. 
Its ability to model long-range dependencies gives visual models a receptive field that is in principle unlimited. However, traditional attention mechanisms add a large computational overhead. To refine edges, we introduce a lightweight subspace channel attention mechanism that divides the feature maps into subspaces, reduces the amount of computation, learns attention for multiple feature groups, and fuses the resulting attention feature maps. We conduct tests on the public datasets Colonoscopy Videos Challenge-ClinicDataBase (CVC-ClinicDB) and Kvasir-Capsule. The CVC-ClinicDB dataset contains image data of intestinal polyps collected by conventional colonoscopy, 612 pictures in total, while the Kvasir-Capsule dataset contains image data of polyps collected by capsule gastroscopy, 55 pictures in total. A large gap in imaging has to be bridged although the same kind of target is collected. To further demonstrate the robustness of the algorithm, we also test it on the ultrasound nerve segmentation dataset, which has 5,633 ultrasound images of the brachial plexus taken by an imaging surgeon. All images are resized to 224×224 pixels, randomly shuffled, and divided into training, validation and test sets in a 6:2:2 ratio; training is run on a single GTX 1080Ti GPU. Our network is implemented in PyTorch. In the experiments, the binary cross-entropy (BCE) loss and the Dice loss are mixed proportionally to construct a new loss function, which performs better for binary semantic segmentation. The Adam optimizer is used, with an initial learning rate of 0.0003 and learning rate decay. Result The Dice similarity coefficient (DSC), mean intersection over union (mIoU), precision and recall are used as quantitative evaluation metrics, and all of them lie between 0 and 1. 
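The higher each metric is, the better the segmentation performance of the model. The experimental results show that the DSC of our model on the CVC-ClinicDB and Kvasir-Capsule datasets reaches 94.22% and 96.02%, respectively. Compared with U-Net, our DSC, mIoU, precision and recall increase by 1.89%, 2.42%, 1.04% and 1.87% on the CVC-ClinicDB dataset and by 1.06%, 1.9%, 0.4% and 1.58% on the Kvasir-Capsule dataset. The robustness of our algorithm on cross-device images is tested further by mixing the two datasets, where the DSC increases by 17% to 20%. Compared with U-Net, the DSC of our model increases by 16.73% on the CVC-KC dataset (trained on CVC-ClinicDB and tested on Kvasir-Capsule), and it also improves on the KC-CVC dataset (trained on Kvasir-Capsule and tested on CVC-ClinicDB). Conclusion We propose an attention segmentation model with a dual encoder-decoder architecture. Our algorithm effectively improves medical image segmentation and achieves higher accuracy on small-target and edge segmentation, which can help improve colorectal cancer screening strategies.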
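The proportional mix of BCE loss and Dice loss mentioned in the Method part can be sketched as follows. This is a plain-Python illustration under stated assumptions, not the authors' PyTorch implementation: the mixing weight `alpha`, the smoothing constant `smooth` and all function names are hypothetical, since the abstract does not give the actual ratio.

```python
import math


def bce_loss(probs, targets, eps=1e-7):
    """Mean binary cross-entropy over flat lists of probabilities and 0/1 labels."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(probs)


def dice_loss(probs, targets, smooth=1.0):
    """Soft Dice loss: 1 minus the smoothed Dice overlap of prediction and label."""
    intersection = sum(p * t for p, t in zip(probs, targets))
    denominator = sum(probs) + sum(targets)
    return 1.0 - (2.0 * intersection + smooth) / (denominator + smooth)


def mixed_loss(probs, targets, alpha=0.5):
    """Weighted sum of BCE and Dice loss.

    alpha is a hypothetical mixing weight; the source only says the two
    losses are mixed proportionally.
    """
    return alpha * bce_loss(probs, targets) + (1.0 - alpha) * dice_loss(probs, targets)


if __name__ == "__main__":
    good = mixed_loss([0.9, 0.1, 0.8], [1, 0, 1])
    bad = mixed_loss([0.2, 0.7, 0.3], [1, 0, 1])
    print(good, bad)  # the closer prediction should give the smaller loss
```

The BCE term penalizes each pixel independently, while the Dice term scores region overlap, which is one common motivation for combining them when the target occupies a small part of the image.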
Keywords
