生成对抗式网络及其医学影像应用研究综述
摘 要
生成对抗式网络(generative adversarial network,GAN)由负责学习数据分布的生成器和负责鉴别样本真伪的判别器构成,二者在相互对抗过程中互相学习逐渐变强。该网络模型使深度学习方法可以自动学习损失函数,减少了对专家知识的依赖,已经广泛应用于自然图像处理领域,对解决医学影像处理的相关瓶颈问题亦具有巨大应用前景。本文旨在找到生成对抗式网络与医学影像领域面临挑战的结合点,通过分析已有工作对未来研究方向进行展望,为该领域研究提供参考。1)阐述了生成对抗式网络的基本原理,从任务拆分、条件约束以及图像到图像的翻译等角度对其衍生模型进行分析回顾;2)对生成对抗式网络在医学影像领域中的数据增广、模态迁移、图像分割以及去噪等方面的应用进行回顾,分析各方法的优缺点与适用范围;3)对现有图像生成质量评估方法进行小结;4)总结生成对抗式网络在医学影像领域的研究进展,并结合该领域问题特性,指出现有理论应用存在的不足与改进方向。生成对抗式网络提出以来,理论不断完善,在医学影像的处理应用中也取得了长足发展,但仍然存在一些亟待解决的问题,包括3维数据合成、几何结构合理性保持、无标记和未配对数据使用以及多模态数据交叉应用等。
关键词
A review of generative adversarial networks and the application in medical image
Zhang Yinglin1, Hu Yan1, Risa Higashita1,2, Liu Jiang1(1.Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen 518055, China;2.TOMEY Corporation, Nagoya 451-0051, Japan) Abstract
The generative adversarial network (GAN) consists of a generator based on the data distribution learning and an identified sample's authenticity discriminator. They learn from each other gradually in the process of confrontation. The network enables the deep learning method to learn the loss function automatically and reduces expertise dependence. It has been widely used in natural image processing, and it is also a promising solution for related problems in medical image processing field. This paper aims to bridge the gap between GAN and specific medical field problems and point out the future improvement directions. First, the basic principle of GAN is issued. Secondly, we review the latest medical images research on data augmentation, modality migration, image segmentation, and denoising; analyze the advantage and disadvantage of each method and the scope of application. Next, the current quality assessment is summarized. At the end, the research development, issue, and future improvement direction of GAN on medical image are summarized. GAN theoretical study focus on three aspects of task splitting, introducing conditional constraints and image-to-image translation, which effectively improved the quality of the synthesized image, increased the resolution, and allowed more manipulation across the image synthesis process. However, there are some challenges as mentioned below:1) Generate high-quality, high-resolution, and diverse images under large-scale complex data sets. 2) The manipulation of synthesized image attributes at different levels and different granularities. 3) The lack of paired training data and the guarantee of image translation quality and diversity. GAN application study in data augmentation, modality migration, image segmentation, and denoising of medical images has been widely analyzed. 1) The network model based on the Pix2pix basic framework can synthesize additional high-quality and high-resolution samples and improve the segmentation and classification performance based on data augmentation effectively. However, there are still problems such as insufficient synthetic sample diversity, basic biological structures maintenance difficulty, and limited 3D image synthesis capabilities. 2) The network model based on the CycleGAN basic framework does not require paired training images. It has been extensively analyzed in modality migration, but may lose the basic structure information. The current research on structure preservation in modality migration limits in the fusion of information, such as edges and segmentation. 3) Both the generator and the discriminator can be fused with the current segmentation model to improve the performance of the segmentation model. The generator can synthesize additional data, and the discriminator can guide model training from a high-level semantic level and make full use of unlabeled data. However, current research mainly focuses on single-modality image segmentation. 4) GAN application in image denoising can reconstruct normal-dose images from low-dose images, reducing the radiation impact suffered by patients. The critical issues of GAN in medical image processing are presented as follows:1) Most medical image data is three-dimensional, such as MRI (magnetic resonance imaging) and CT (computed tomography), etc. The improvement of the synthesis quality and resolution of the three-dimensional data is a critical issue. 2) The difficulty in ensuring the diversity of synthesized data while keeping its basic geometric structure's rationality. 3) The question on how to make full use of unlabeled and unpaired data to generate high-quality, high-resolution, and diverse images. 4) The improvement of algorithms' cross-modality generalization performance, and the effective migration of different modality data. Future research should focus on the issues as following:1) To optimize network architecture, objective function, and training methods for 3D data synthesis, improving model training stability, quality, resolution, and diversity of 3D synthesized images. 2) To further promote the prior geometric knowledge integration with GAN. 3) To take full advantage of the GAN's weak supervision characteristics. 4) To extract invariant features via attribute decoupling for good generalization performance and achieve attribute control at different levels, granularities, and needs in the process of modality migration. To conclude, ever since GAN was proposed, its theory has been continuously improved. A considerable evolution in medical image applications has been sorted out, such as data augmentation, modality migration, image segmentation, and denoising. Some challenging issues are still waiting to be resolved, including three-dimensional data synthesis, geometric structure rationality maintenance, unlabeled and unpaired data usage, and multi-modality data application.
Keywords
generative adversarial network (GAN) medical image deep learning data augmentation modality migration image segmentation image denoising
|