Stroke detection algorithm combining a vision Transformer and multi-feature fusion
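Zhao Chenqi1, Wang Huahu2, Zhao Juanjuan1, Ji Lunwen3, Wang Qida1, Li Huizhi4, Zhao Zijuan1 (1. College of Information and Computer, Taiyuan University of Technology, Jinzhong 030600; 2. Guanghua School of Management, Peking University, Beijing 100871; 3. The Journal Center, Taiyuan University of Technology, Taiyuan 030024; 4. Shanxi Huihu Health Science and Technology Company with Limited Liability, Taiyuan 030032)
Abstract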
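Objective Acute ischemic stroke is the most common type of stroke and is characterized by high incidence, high mortality, and high disability rates. Inconspicuous symptoms before onset, abrupt onset, and the narrow time window for thrombolytic therapy make it a high-risk disease in clinical practice. Inspection diagnosis in traditional Chinese medicine (TCM) can diagnose and predict a patient's condition early in the course of a disease by observing changes in the patient's form, color, qi, and spirit, serving the goal of "treating disease before it arises". Combining it with artificial intelligence can address its lack of objective and quantitative evaluation criteria. Therefore, using the face and hand images employed in TCM inspection diagnosis, and making full use of the color and texture features of the two kinds of images as well as the relational features between them, this paper proposes an auxiliary diagnosis method for acute ischemic stroke based on a sequential self-attention network. Method Regions of interest are extracted from the face and hand images at the shangen (root of the nose) and the thenar eminence. The YCbCr color space and the gray-level co-occurrence matrix are used to extract color and texture features from the region images. The color and texture features are fused and combined with the features of the original image, and the resulting feature maps are fed as a sequence into a Transformer model to further learn high-level spatial features and attention features. The model output is passed to a multilayer perceptron to detect acute ischemic stroke. Result Experiments on a collected dataset of acute ischemic stroke patients show that the proposed method based on the sequential self-attention network achieves an accuracy of 83.57%, delivering high performance with considerable advantages in speed and portability. Conclusion The method adopts end-to-end learning and can effectively mitigate the effect of disparities in medical resources on clinical diagnosis. It offers guidance for the preliminary assessment of a patient's disease and provides a new idea and a new method for diagnosing acute ischemic stroke.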
Keywords
Cerebral stroke detection algorithm based on vision Transformer and multi-feature fusion
Zhao Chenqi1, Wang Huahu2, Zhao Juanjuan1, Ji Lunwen3, Wang Qida1, Li Huizhi4, Zhao Zijuan1 (1. College of Information and Computer, Taiyuan University of Technology, Jinzhong 030600, China; 2. Guanghua School of Management, Peking University, Beijing 100871, China; 3. The Journal Center, Taiyuan University of Technology, Taiyuan 030024, China; 4. Shanxi Huihu Health Science and Technology Company with Limited Liability, Taiyuan 030032, China)
Abstract
Objective Cerebral ischemic stroke is the most common type of cerebral stroke and is characterized by high morbidity, mortality, and disability rates. The lack of obvious symptoms before onset, the rapid onset, and the narrow time window for thrombolytic therapy make it a high-risk disease in clinical practice. Although initial progress has been made in cerebral stroke prevention and treatment, stroke remains a significant cause of disability and death in adults. According to survey data, approximately 75% of stroke patients have varying degrees of functional impairment and lose the ability to work, which places a heavy burden on families and society. With the accelerated aging and urbanization of society, the prevalence of unhealthy lifestyles, and widespread exposure to cerebrovascular disease risk factors, the disease burden of stroke has greatly increased, with rapid growth in low-income groups, marked gender and geographical differences, and a trend toward younger patients. Effective ways to reduce disability and mortality rates are therefore needed, and early diagnosis of cerebral stroke is important. Modern medicine offers many methods for diagnosing cerebral stroke, but the processes are relatively complex. In addition, some tests have drawbacks, and the disease is hard to detect in its early stages, so diagnosis requires advanced equipment and experienced clinicians. Improving the accuracy of early diagnosis of cerebral stroke has become an important research topic in computer-aided medical diagnosis. The characteristics and advantages of traditional Chinese medicine (TCM) are valuable in the contemporary medical system, especially inspection diagnosis, which is the most important component of TCM diagnosis.
Chinese medicine diagnosis is an objective and accurate empirical medicine that has formed and developed gradually through long-term medical practice, has been clinically proven, and has extremely rich connotations. Building on the basic principles of TCM inspection diagnosis, diagnosis can be improved by applying modern scientific knowledge and methods in practice. This approach not only provides strong evidence for early diagnosis and treatment but also has practical significance in saving medical resources, reducing the medical burden on patients, and alleviating the harm caused by cerebral stroke. Method First, features are extracted from images of the patient's face and hands. Because color features are easily affected by lighting, the chroma components of the YCbCr color space are used to reduce the effect of luminance. The most important texture features are the length, depth, and thickness of the textures in the images, and the gray-level co-occurrence matrix (GLCM) is used to extract them effectively. Then, a dual Transformer joint classification model is designed to learn higher-order spatial features from the original image and attention features from the different extracted features. The different Transformer modules are cascaded, and a multilayer perceptron is used for image classification. The method thus considers not only the color and texture features of the image but also its spatial features. Because color and texture change continuously between different regions of an image, this paper uses the Transformer to extract attention features between regions and thereby improve the performance of the diagnostic model. In addition, the detection model is trained end-to-end.
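The color and texture extraction described above can be illustrated with a minimal pure-Python sketch. It rests on several assumptions that the abstract does not confirm: the full-range BT.601 (JPEG-style) YCbCr conversion, a single horizontal GLCM offset, the three GLCM statistics chosen (contrast, energy, homogeneity), and fusion by simple concatenation. All function names are hypothetical and this is not the authors' implementation.

```python
def rgb_to_ycbcr(r, g, b):
    """Full-range BT.601 conversion for 8-bit RGB (assumed variant)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def mean_chroma(pixels):
    """Average Cb and Cr over (r, g, b) pixels; luminance Y is discarded."""
    cb_sum = cr_sum = 0.0
    count = 0
    for r, g, b in pixels:
        _, cb, cr = rgb_to_ycbcr(r, g, b)
        cb_sum += cb
        cr_sum += cr
        count += 1
    if count == 0:
        raise ValueError("no pixels given")
    return cb_sum / count, cr_sum / count


def glcm(gray, levels, dx=1, dy=0):
    """Normalized co-occurrence matrix for one pixel offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(gray), len(gray[0])
    total = 0
    for y in range(rows):
        for x in range(cols):
            ny, nx = y + dy, x + dx
            if 0 <= ny < rows and 0 <= nx < cols:
                counts[gray[y][x]][gray[ny][nx]] += 1
                total += 1
    if total == 0:
        raise ValueError("offset leaves no valid pixel pairs")
    return [[c / total for c in row] for row in counts]


def glcm_features(p):
    """Contrast, energy and homogeneity of a normalized GLCM."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    return {
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        "energy": sum(v * v for _, _, v in cells),
        "homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
    }


def fuse_features(chroma, texture):
    """Concatenate color and texture descriptors (assumed fusion rule)."""
    return [chroma[0], chroma[1],
            texture["contrast"], texture["energy"], texture["homogeneity"]]
```

Here `gray` is a list of rows of integer gray levels in the range `0..levels-1`. In practice a region of interest would be quantized to a small number of levels before the matrix is computed, and several offsets would usually be averaged.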
During training, the batch size is set to 4, the learning rate to 1E-5, and the maximum number of epochs to 100. The experiments use an NVIDIA TITAN Xp GPU, and the dataset is divided equally into five groups for five-fold cross-validation. The average accuracy over all folds is taken as the final result of the experiment. Result In detecting cerebral ischemic stroke, the models using only color features (YCbCr) and only texture features (GLCM) achieved accuracies of 79.40% and 80.46%, respectively, on the dataset, whereas the model fusing color and texture features achieved 83.53%, roughly 3 to 4 percentage points higher than the models without feature fusion. With a Transformer model, color features and texture features each improve classification accuracy, and fusing them improves detection accuracy further. With color and texture features fused, classification accuracy drops by approximately 2% when only a single Transformer module is used. This finding shows that features from different body parts play different roles in the final detection, and that the differences between the same features from different parts are easily lost when the features are fused in one Transformer module. The dual Transformer joint classification model uses color, texture, spatial, and attention features, and combining these features effectively improves the performance of the model. In addition, the average accuracy of the proposed model on the dataset exceeds the experimental results of related classification models. Conclusion In this paper, we propose an end-to-end joint classification and detection method based on dual Transformer modules. High-quality data are obtained using the YCbCr color space and GLCM, which accelerates model convergence.
In addition, we extract feature information from the patient's face and hand images. More importantly, a self-attention mechanism learns the associations between features and assigns weights to them, which strengthens the model's learning capability and improves its performance. The proposed model has a good diagnostic effect, and automated assisted diagnosis reduces the influence of subjective factors. The approach is valuable for research on auxiliary diagnosis of cerebral ischemic stroke, provides a reference for clinicians' diagnostic decisions, and offers patients a new method for effective self-screening.
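The evaluation protocol stated in the Method part (five equal groups, accuracy averaged over the folds) can be sketched as follows. The batch size, learning rate, and maximum epoch count come from the abstract. The seeded shuffle, the function names, and the `train_and_score` callback (a stand-in for training and testing the model on one fold) are hypothetical, since the abstract does not say how the groups were formed.

```python
import random

# Hyperparameters as stated in the abstract.
TRAIN_CONFIG = {"batch_size": 4, "learning_rate": 1e-5, "max_epochs": 100}


def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle sample indices and deal them into k folds.

    Fold sizes are equal when n_samples is a multiple of k and otherwise
    differ by at most one. The shuffle is an assumption.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(n_samples, train_and_score, k=5, seed=0):
    """Run k-fold cross-validation and return (mean accuracy, per-fold scores).

    train_and_score(train_idx, test_idx, config) must train a fresh model on
    train_idx and return its accuracy on test_idx.
    """
    folds = k_fold_indices(n_samples, k, seed)
    scores = []
    for i, test_idx in enumerate(folds):
        # Every fold except the i-th is used for training.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        scores.append(train_and_score(train_idx, test_idx, TRAIN_CONFIG))
    return sum(scores) / k, scores
```

Each sample is used for testing exactly once, so the reported figure is the mean of five held-out accuracies rather than the score of a single split.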
Keywords
inspection diagnosis of traditional Chinese medicine; feature extraction; feature fusion; end-to-end; Transformer