Unified weakly supervised personalized federated image segmentation via similarity-aware distillation

Pan Jianshan; Lin Li; Wu Jiewei; Liu Yixiang; Chen Xiaohua; Lin Qiyou; Huang Jianye; Tang Xiaoying

Published: 2024-03-15
Abstract views: 1100
Full-text downloads: 908
DOI: 10.11834/jig.230295
2024 | Volume 29 | Number 3

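Pan Jianshan^1, Lin Li^2,3,4, Wu Jiewei^2,4, Liu Yixiang^2, Chen Xiaohua^1, Lin Qiyou^1, Huang Jianye^3, Tang Xiaoying^2,4 (1. Shenzhen Public Credit Center, Shenzhen 518000; 2. Department of Electronic and Electrical Engineering, Southern University of Science and Technology, Shenzhen 518000; 3. Department of Electrical and Electronic Engineering, University of Hong Kong, Hong Kong 999077; 4. Jiaxing Research Institute, Southern University of Science and Technology, Jiaxing 314000)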

Abstract (translated from the Chinese)

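Objective: Federated learning allows multiple institutions to collaboratively train powerful deep models without violating data privacy or security. Most existing federated paradigms suffer performance degradation when handling differing data distributions across centers, and federated paradigms under weak supervision have rarely been studied, particularly when sites annotate their data with different forms of sparse labels. To address this problem, we propose a unified weakly supervised personalized federated learning framework based on site-distribution similarity-aware knowledge distillation (unified weakly supervised personalized federated image segmentation via similarity-aware distillation, pFedWSD) to cope with multi-center differences in both data distribution and annotation. Method: The proposed pFedWSD trains a personalized model for each site through cyclic knowledge distillation and comprises two stages: dynamic cyclic common knowledge accumulation, and personalization. In the first stage, the performance of each site's model is dynamically ranked in an uncertainty-aware manner in every training round, and common knowledge is accumulated through cyclic knowledge distillation. In the second stage, inter-site similarity is measured from the statistics of the batch normalization layers, a teacher model is aggregated for each site accordingly, and knowledge distillation is performed. For weak supervision, a training objective combining gated conditional random field loss and tree energy loss is introduced to produce more accurate pseudo-label supervision signals. Result: On two tasks, optic disc and cup segmentation in fundus images and retinal foveal avascular zone segmentation, pFedWSD outperforms a variety of centralized and personalized federated methods in both Dice coefficient and HD95 (95% Hausdorff distance). The Dice coefficients on the two tasks are 90.38% and 93.12%, improvements of 1.67% and 6.56% over the comparatively advanced methods FedAP (federated learning with adaptive batchnorm for personalized healthcare) and FedALA (adaptive local aggregation for personalized federated learning), respectively, and the performance approaches that of a model obtained by fully supervised centralized training. Conclusion: The proposed weakly supervised personalized federated learning framework effectively unifies data with different forms of sparse annotation and trains personalized models for sites with different data distributions, significantly improving segmentation performance at every site.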
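The second-stage idea of building a per-site teacher from batch normalization (BN) statistics can be illustrated with a minimal sketch. The abstract states only that inter-site similarity is measured from BN statistics; the cosine similarity, the softmax weighting, the `temperature` parameter, and every function name below are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def bn_signature(bn_stats):
    """Flatten one site's BN statistics, a list of (mean, var) arrays, into a vector."""
    return np.concatenate([np.concatenate([mean, var]) for mean, var in bn_stats])

def similarity_weights(signatures, temperature=1.0):
    """Softmax over pairwise cosine similarities; row i holds site i's weights.

    Assumption: cosine similarity stands in for whatever metric the paper uses.
    """
    sigs = np.stack(signatures)
    unit = sigs / np.linalg.norm(sigs, axis=1, keepdims=True)
    logits = (unit @ unit.T) / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    w = np.exp(logits)
    return w / w.sum(axis=1, keepdims=True)

def aggregate_teachers(site_params, weights):
    """Teacher parameters for site i are sum_j weights[i, j] * site_params[j]."""
    return weights @ np.stack(site_params)

# Toy example: sites 0 and 1 have similar BN statistics, site 2 differs.
stats = [
    [(np.array([0.0, 0.0]), np.array([1.0, 1.0]))],
    [(np.array([0.1, 0.0]), np.array([1.0, 1.0]))],
    [(np.array([5.0, 5.0]), np.array([1.0, 1.0]))],
]
weights = similarity_weights([bn_signature(s) for s in stats], temperature=0.1)

# Hypothetical flattened model parameters, one vector per site.
params = [np.array([1.0, 2.0]), np.array([1.1, 2.1]), np.array([9.0, -3.0])]
teachers = aggregate_teachers(params, weights)
```

In this toy setting, site 0's teacher draws more heavily on site 1 than on site 2, which is the qualitative behavior a similarity-aware aggregation is meant to produce. Each site would then distill from its own teacher during local training.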

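Keywords

similarity-aware knowledge distillation; weakly supervised learning; personalized federated learning; medical image segmentation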

pFedWSD: unified weakly supervised personalized federated image segmentation via similarity-aware distillation

Pan Jianshan^1, Lin Li^2,3,4, Wu Jiewei^2,4, Liu Yixiang^2, Chen Xiaohua^1, Lin Qiyou^1, Huang Jianye^3, Tang Xiaoying^2,4 (1. Shenzhen Public Credit Center, Shenzhen 518000, China; 2. Department of Electronic and Electrical Engineering, Southern University of Science and Technology, Shenzhen 518000, China; 3. Department of Electrical and Electronic Engineering, University of Hong Kong, Hong Kong 999077, China; 4. Jiaxing Research Institute, Southern University of Science and Technology, Jiaxing 314000, China)

Abstract

Objective: Federated learning (FL) allows multiple healthcare institutions to collaboratively train a powerful deep learning model without compromising data privacy and security (i.e., without centralizing data). However, employing a single model to accommodate the diverse data distributions of different sites is extremely challenging, and existing approaches commonly suffer performance degradation when large distribution gaps exist across sites. Additionally, previous works paid little attention to FL under weak supervision, especially under different forms of sparse annotation (i.e., point-, bounding-box-, scribble-, and block-wise annotations). Weakly supervised FL is clinically practical but challenging. To address this issue, we propose a unified weakly supervised personalized FL framework named pFedWSD, which targets medical image segmentation and is based on similarity-aware knowledge distillation across multiple sites. We aim to accommodate the domain gaps and annotation drifts across sites and to enhance the segmentation performance of the model for each site.

Method: The proposed pFedWSD trains a personalized model for each site via cyclic knowledge distillation and consists of two stages: uncertainty-aware dynamic and cyclic common knowledge accumulation, and similarity-aware personalization. In the first stage, during each training round, the performance of each site's model is dynamically ranked in an uncertainty-aware manner, and common knowledge is accumulated in the form of cyclic knowledge distillation. In the second stage, the similarity between sites is measured from the statistics of the batch normalization layers, and the site models are aggregated accordingly to obtain a teacher model for each site, which is then used for knowledge distillation. For weakly supervised learning, a combination of partial cross-entropy loss, gated conditional random field (CRF) loss, and tree energy loss is employed. Specifically, the partial cross-entropy loss supervises the annotated regions, ensuring informative guidance. The tree energy loss establishes pairwise affinities on the basis of the high- and low-level semantic spatial structures preserved for the same object. Together with the model's predictions, these affinities generate soft pseudo-labels for the unlabeled regions. Through continuous online training and refinement, the model's predictions and the resulting pseudo-annotations gradually improve. Furthermore, the gated CRF loss serves as a regularization term that curbs the excessive expansion or contraction of the target regions' pseudo-labels that may arise when the tree energy loss is used alone. This design consolidates diverse sparsely annotated data for training and generates additional pseudo proposals online, attaining strong segmentation performance without supplementary supervised data, iterative optimization, or time-intensive post-processing. To the best of our knowledge, pFedWSD is a pioneering weakly supervised personalized federated learning approach for medical image segmentation, implemented under heterogeneous annotation settings on multiple client devices.

Result: We create two datasets (from multiple publicly available datasets), each with five subsets serving as five different sites, for optic disc/cup (OD/OC) segmentation and retinal foveal avascular zone (FAZ) segmentation, respectively. Quantitative and qualitative experimental results show that pFedWSD outperforms representative state-of-the-art (SOTA) centralized and personalized FL methods in terms of Dice coefficient and HD95. pFedWSD achieves an average Dice coefficient of 90.38% on the OD/OC segmentation task, an improvement of 1.67% over the previous best-performing method. Moreover, it differs by only 0.58% from local training under full supervision and by 1.23% from centralized training under full supervision. On the FAZ segmentation task, the proposed method achieves an average Dice coefficient of 93.12%, an improvement of 6.56% over the previous state-of-the-art method, and differs by 0.5% from local training under full supervision and by 0.86% from centralized training under full supervision.

Conclusion: The proposed weakly supervised personalized FL framework (pFedWSD) can effectively unify different forms of sparsely labeled data and train personalized models that adapt well to different data distributions, with superior segmentation performance. pFedWSD achieves the best performance among the compared methods on both the OD/OC and FAZ segmentation tasks across datasets from multiple centers, with overall performance closely approaching that of local or centralized training using fully supervised labels. Extensive ablation experiments demonstrate the importance and efficacy of each stage in pFedWSD and of each component in the weakly supervised composite objective. Moreover, through site-ablation experiments, we analyze the contribution of each site to the federation, providing guidance for medical institutions regarding the appropriate data volume and sparse annotation form in federated learning. Future research directions include further reducing the communication and computation overhead and integrating universal large-model training paradigms, such as prompt learning, to improve both the generalization performance of the framework and its adaptive personalization capacity toward diverse data distributions.
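Two quantities named in the abstract, the partial cross-entropy loss on annotated regions and the Dice coefficient used for evaluation, can be sketched as follows. This is a NumPy illustration under stated assumptions: the `IGNORE` label value for unannotated pixels, the array layouts, and the function names are hypothetical, and the gated CRF and tree energy terms of the composite objective are not shown.

```python
import numpy as np

IGNORE = 255  # hypothetical label value marking unannotated pixels

def partial_cross_entropy(probs, labels, ignore=IGNORE, eps=1e-8):
    """Mean cross-entropy over annotated pixels only.

    probs:  (C, H, W) per-class probabilities (e.g. softmax outputs).
    labels: (H, W) integer class labels, with `ignore` on unlabeled pixels.
    """
    mask = labels != ignore
    if not mask.any():
        return 0.0  # no annotated pixels contribute to the loss
    safe = np.where(mask, labels, 0)  # placeholder class on ignored pixels
    picked = np.take_along_axis(probs, safe[None], axis=0)[0]  # (H, W)
    return float(-np.log(picked[mask] + eps).mean())

def dice(pred, target, eps=1e-8):
    """Dice coefficient 2|A∩B| / (|A| + |B|) for two boolean masks."""
    inter = np.logical_and(pred, target).sum()
    return float(2.0 * inter / (pred.sum() + target.sum() + eps))

# Toy 2x2 example with two classes; only two pixels carry a sparse label.
fg = np.array([[0.9, 0.2], [0.5, 0.5]])
probs = np.stack([1.0 - fg, fg])            # class 0 = background, 1 = foreground
labels = np.array([[1, IGNORE], [0, IGNORE]])
loss = partial_cross_entropy(probs, labels)

pred = np.array([[True, True], [False, False]])
target = np.array([[True, False], [False, False]])
score = dice(pred, target)
```

Only the two labeled pixels enter the loss, so a scribble, point, block, or bounding-box annotation can be handled by the same function once it is rasterized into a label map with `IGNORE` elsewhere.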
