Nuclear cataract classification based on a multi-region fusion attention network model
Zhang Xiaoqing1, Xiao Zunjie1, Risa Higashita1,2, Chen Wan3, Hu Yan1, Yuan Jin3, Liu Jiang1,4,5 (1. Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen 518055, China; 2. TOMEY Corporation, Nagoya 451-0051, Japan; 3. Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou 510060, China; 4. Cixi Institute of Biomedical Engineering, Ningbo Institute of Materials Technology and Engineering, Chinese Academy of Sciences, Ningbo 315201, China; 5. Guangdong Provincial Key Laboratory of Brain-inspired Intelligent Computation, Shenzhen 518055, China) Abstract
Objective Nuclear cataract is a leading ophthalmic cause of blindness and visual impairment, and early intervention and cataract surgery can effectively improve patients' vision and quality of life. Anterior segment optical coherence tomography (AS-OCT) images capture cataract opacity information in a non-contact, objective, and fast manner. Clinical studies have found a strong correlation and high repeatability between nuclear cataract severity and pixel-level features of the nucleus region, such as the mean density, in AS-OCT images. However, automatic nuclear cataract classification based on AS-OCT images has rarely been studied, and the classification results still leave considerable room for improvement. To this end, this paper proposes a novel multi-region fusion attention network (MRA-Net) to accurately classify nuclear cataract severity levels in AS-OCT images. Method In the proposed model, we design a multi-region fusion attention (MRA) block that fuses feature representations from different nucleus regions to enhance classification results; in addition, we examine how splitting the AS-OCT image dataset at the participant level versus the eye level affects nuclear cataract classification results. Result On a self-built AS-OCT image dataset, the proposed model achieves an overall classification accuracy of 87.78%, at least 1% higher than the compared methods. Results across ten classification algorithms show that the eye-level dataset split yields better classification results than the participant-level split, with maximum improvements of 4.03% in F1 score and 8% in Kappa. Conclusion The proposed model accounts for the differences in feature distribution across regions of a feature map, making nuclear cataract classification more accurate. The results of the two dataset splitting methods show that, because the two eyes of the same participant have similar nuclear cataract severity, AS-OCT cataract image datasets should be split at the participant level.
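As a concrete illustration of the Method, the following is a minimal PyTorch sketch of a multi-region fusion attention block that pools channel descriptors from several regions of a feature map, fuses them by summation, and applies a softmax over channels to emphasize salient channels. The number of regions, the band-wise region partition, the bottleneck reduction ratio, and the final rescaling are illustrative assumptions, not the paper's exact design.

```python
# Minimal sketch of a multi-region fusion attention (MRA) block in PyTorch.
# Region count, region partitioning, and the reduction ratio are assumptions.
import torch
import torch.nn as nn


class MRABlock(nn.Module):
    def __init__(self, channels: int, num_regions: int = 3, reduction: int = 16):
        super().__init__()
        self.num_regions = num_regions
        self.pool = nn.AdaptiveAvgPool2d(1)
        # One small bottleneck MLP per region to describe its channel statistics.
        self.region_fcs = nn.ModuleList([
            nn.Sequential(
                nn.Linear(channels, channels // reduction),
                nn.ReLU(inplace=True),
                nn.Linear(channels // reduction, channels),
            )
            for _ in range(num_regions)
        ])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        # Split the feature map into horizontal bands as stand-ins for
        # different nucleus regions (an assumption for illustration).
        regions = torch.chunk(x, self.num_regions, dim=2)
        fused = 0
        for region, fc in zip(regions, self.region_fcs):
            desc = self.pool(region).flatten(1)  # (b, c) channel descriptor
            fused = fused + fc(desc)             # summation-based fusion
        # Softmax over channels highlights salient channels and suppresses redundant ones.
        attn = torch.softmax(fused, dim=1).view(b, c, 1, 1)
        # Rescale so the attention weights average to 1 before reweighting the input.
        return x * (attn * c)


if __name__ == "__main__":
    block = MRABlock(channels=64)
    feat = torch.randn(2, 64, 56, 56)
    print(block(feat).shape)  # torch.Size([2, 64, 56, 56])
```

In this sketch the block is residual-friendly: its output has the same shape as its input, so it can be inserted after a convolutional stage in a residual module.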
Keywords
nuclear cataract classification; anterior segment optical coherence tomography (AS-OCT) image; multi-region fusion attention block; deep learning; nucleus region
Nuclear cataract classification based on multi-region fusion attention network model
Zhang Xiaoqing1, Xiao Zunjie1, Risa Higashita1,2, Chen Wan3, Hu Yan1, Yuan Jin3, Liu Jiang1,4,5(1.Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen 518055, China;2.TOMEY Corporation, Nagoya 451-0051, Japan;3.Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou 510060, China;4.Cixi Institute of Biomedical Engineering, Ningbo Institute of Materials Technology and Engineering, Chinese Academy of Sciences, Ningbo 315201, China;5.Guangdong Provincial Key Laboratory of Brain-inspired Intelligent Computation, Shenzhen 518055, China) Abstract
Objective Cataracts are a primary cause of human blindness and vision impairment. Early intervention and cataract surgery can effectively improve the vision and quality of life of cataract patients. Anterior segment optical coherence tomography (AS-OCT) images can capture cataract opacity information in a non-contact, objective, and fast manner. Compared with other ophthalmic images such as fundus images, AS-OCT images clearly capture the nucleus region, which is essential for nuclear cataract (NC) diagnosis. Clinical studies have identified a strong correlation and high repeatability between the average density of the nucleus region and NC severity levels in AS-OCT images. Moreover, clinical work has also suggested correlations between different nucleus regions and NC severity levels. These studies provide a clinical reference for automatic AS-OCT image-based NC classification. However, automatic NC classification based on AS-OCT images has rarely been studied, and there is considerable room for improvement in NC classification performance on AS-OCT images. Method Motivated by this clinical research on NC, this paper proposes an efficient multi-region fusion attention network (MRA-Net) model that infuses clinical prior knowledge, aiming to classify nuclear cataract severity levels on AS-OCT images accurately. In MRA-Net, we construct a multi-region fusion attention (MRA) block that fuses feature representations from different nucleus regions to enhance the overall classification performance; it adopts a summation operation to fuse information from different regions and applies a softmax function to focus on salient channels and suppress redundant ones. Because residual connections can alleviate the gradient vanishing issue, the MRA block is embedded into a stack of Residual-MRA modules to build MRA-Net. Moreover, we test the impact of two dataset splitting methods on NC classification results, participant-based splitting and eye-based splitting, an issue easily overlooked by previous works. For training, the original AS-OCT images are resized to 224×224 pixels as network inputs, the batch size is set to 16, the stochastic gradient descent (SGD) optimizer is used with default settings, and the number of training epochs is set to 100. Result Our analysis demonstrates that the proposed MRA-Net achieves 87.78% accuracy on a clinical AS-OCT image dataset, a 1% improvement over the squeeze-and-excitation network (SENet). We also conduct comparison experiments, using ResNet as the backbone network, to verify that the summation operation works better than concatenation in the MRA block. The results of the two dataset splitting methods also show that ten classification methods, including MRA-Net and SENet, obtain better classification results on the eye-based dataset than on the participant-based dataset; for example, the largest improvements in F1 and Kappa are 4.03% and 8%, respectively. Conclusion Our MRA-Net considers the differences in feature distribution across regions of a feature map and incorporates clinical priors into the network architecture design. MRA-Net achieves superior classification performance and outperforms advanced methods.
The classification results of the two dataset splitting methods on the AS-OCT image dataset also indicate that, because the two eyes of the same participant have similar nuclear cataract severity, the dataset should be split at the participant level rather than the eye level, which ensures that both eyes of each participant fall into the same training or testing set. Overall, our MRA-Net has the potential to serve as a computer-aided diagnosis tool to assist clinicians in diagnosing cataracts.
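To make the two dataset splitting strategies concrete, the following is a small scikit-learn sketch that keeps all images sharing a group identifier on the same side of the split, using the participant ID for participant-level splitting and a participant-plus-eye ID for eye-level splitting. The record fields, file names, and split ratio are hypothetical placeholders, not the paper's actual data format.

```python
# Sketch of participant-level vs. eye-level splitting for an AS-OCT dataset.
# The records below are illustrative placeholders only.
from sklearn.model_selection import GroupShuffleSplit

records = [
    # (image_path, participant_id, eye, severity_label)
    ("p001_od.png", "p001", "OD", 2),
    ("p001_os.png", "p001", "OS", 2),
    ("p002_od.png", "p002", "OD", 0),
    ("p002_os.png", "p002", "OS", 1),
    ("p003_od.png", "p003", "OD", 3),
    ("p003_os.png", "p003", "OS", 3),
]
labels = [r[3] for r in records]


def split(groups, test_size=1 / 3, seed=0):
    """Group-aware split: all images sharing a group ID stay in one subset."""
    splitter = GroupShuffleSplit(n_splits=1, test_size=test_size, random_state=seed)
    train_idx, test_idx = next(splitter.split(records, labels, groups=groups))
    return train_idx, test_idx


# Participant-level split (recommended): both eyes of a participant stay together.
train_p, test_p = split(groups=[r[1] for r in records])
# Eye-level split: each eye is an independent sample, so the two eyes of one
# participant may end up in different subsets.
train_e, test_e = split(groups=[f"{r[1]}_{r[2]}" for r in records])

print("participant-level test:", [records[i][0] for i in test_p])
print("eye-level test:", [records[i][0] for i in test_e])
```

Because the two eyes of one participant tend to share a severity grade, the eye-level split can leak near-duplicate information into the test set, which is consistent with the higher scores reported for the eye-based dataset above.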
Keywords
nuclear cataract classification; anterior segment optical coherence tomography (AS-OCT) image; multi-region fusion attention block; deep learning; nucleus region