面向水下图像目标检测的退化特征增强算法

钱晓琪1, 刘伟峰2, 张敬3, 曹洋4(1.杭州电子科技大学自动化学院, 杭州 310018;2.陕西科技大学电气与控制工程学院, 西安 710021;3.悉尼大学计算机科学学院, 悉尼 2006, 澳大利亚;4.中国科学技术大学信息科学技术学院自动化系, 合肥 230026)

摘 要
目的 基于清晰图像训练的深度神经网络检测模型因为成像差异导致的域偏移问题使其难以直接泛化到水下场景。为了有效解决清晰图像和水下图像的特征偏移问题,提出一种即插即用的特征增强模块(feature de-drifting module Unet,FDM-Unet)。方法 首先提出一种基于成像模型的水下图像合成方法,从真实水下图像中估计色偏颜色和亮度,从清晰图像估计得到场景深度信息,根据改进的光照散射模型将清晰图像合成为具有真实感的水下图像。然后,借鉴U-Net结构,设计了一个轻量的特征增强模块FDM-Unet。在清晰图像和对应的合成水下图像对上,采用常见的清晰图像上预训练的检测器,提取它们对应的浅层特征,将水下图像对应的退化浅层特征输入FDM-Unet进行增强,并将增强之后的特征与清晰图像对应的特征计算均方误差(mean-square error,MSE)损失,从而监督FDM-Unet进行训练。最后,将训练好的FDM-Unet直接插入上述预训练的检测器的浅层位置,不需要对网络进行重新训练或微调,即可直接用于水下图像目标检测。结果 实验结果表明,FDM-Unet在PASCAL VOC 2007(pattern analysis,statistical modeling and computational learning visual object classes 2007)合成水下图像测试集上,针对YOLO v3(you only look once v3)和SSD (single shot multibox detector)预训练检测器,检测精度mAP (mean average precision)分别提高了8.58%和7.71%;在真实水下数据集URPC19(underwater robot professional contest 19)上,使用不同比例的数据进行微调,相比YOLO v3和SSD,mAP分别提高了4.4%~10.6%和3.9%~10.7%。结论 本文提出的特征增强模块FDM-Unet以增加极小的参数量和计算量为代价,不仅能直接提升预训练检测器在合成水下图像上的检测精度,也能提升在真实水下图像上微调后的检测精度。
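A minimal sketch of the underwater image synthesis step described above, assuming the standard light scattering formulation I = J·t + B·(1 − t) with transmission t = exp(−β·d); the per-channel attenuation `beta`, the color-cast background light `background`, and the depth map `depth` are placeholders for the quantities the paper estimates from real underwater images and a depth estimator, and the paper's improved model may differ in detail:

```python
import numpy as np

def synthesize_underwater(clear, depth, background, beta):
    """Synthesize an underwater image from a clear image via the
    light scattering model I = J * t + B * (1 - t), t = exp(-beta * d).

    clear:      (H, W, 3) clear image J, values in [0, 1]
    depth:      (H, W) estimated scene depth d
    background: (3,) color-cast / veiling light B, estimated from real underwater images
    beta:       (3,) per-channel attenuation coefficient
    """
    # Per-channel transmission map, shape (H, W, 3)
    t = np.exp(-depth[..., None] * beta[None, None, :])
    # Direct attenuated signal plus back-scattered veiling light
    return clear * t + background[None, None, :] * (1.0 - t)
```

With zero depth the output equals the clear image (no degradation); as depth grows, every pixel converges to the background light, reproducing the color cast and contrast loss of deep-water imagery.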
关键词

Degraded feature enhancement algorithm for underwater image object detection

Qian Xiaoqi1, Liu Weifeng2, Zhang Jing3, Cao Yang4(1.School of Automation, Hangzhou Dianzi University, Hangzhou 310018, China;2.School of Electrical and Control Engineering, Shaanxi University of Science and Technology, Xi'an 710021, China;3.School of Computer Science, The University of Sydney, Sydney 2006, Australia;4.Department of Automation, School of Information Science and Technology, University of Science and Technology of China, Hefei 230026, China)

Abstract
Objective Underwater object detection aims to localize and recognize objects in underwater scenes. It is essential for widespread applications in oceanography, underwater navigation, and fish farming. Current deep convolutional neural network (DCNN) based object detectors are trained on large-scale datasets such as pattern analysis, statistical modeling and computational learning visual object classes 2007 (PASCAL VOC 2007) and Microsoft common objects in context (MS COCO), which ignore image degradation. For underwater imagery, two degradation-related issues must be addressed: 1) underwater detection datasets are scarce, which limits detection accuracy and inevitably leads to overfitting of deep neural network models; 2) underwater images suffer from low contrast, texture distortion, and blur caused by the complicated underwater environment and illumination conditions, which further limits the accuracy of detection algorithms. In practice, image augmentation can alleviate the data-scarcity problem, but it brings only limited improvement for deep models on small datasets. Another feasible solution is to restore (enhance) underwater images toward clear images, mainly with deep learning methods, to improve visibility and contrast and reduce color cast. However, because ground-truth clear images are unavailable, such enhancement models are usually trained on synthetic datasets, so their effectiveness largely depends on the quality of the synthetic images. Moreover, since training a high-accuracy detector from scratch for underwater scenes is difficult, exploiting detectors pre-trained on clear images is attractive; yet such models are hard to generalize directly to underwater scenes because of the domain shift caused by imaging differences.
We develop a plug-and-play feature enhancement module that effectively addresses the domain shift between clear and underwater images by restoring the features of underwater images extracted by the shallow layers of the network, so that a detection network trained on clear images can be applied directly to underwater image object detection. Method First, we propose an underwater image synthesis method based on an improved light scattering model for underwater imaging: it estimates the color cast and luminance from real underwater images, estimates the scene depth of a clear image, and combines them to render a realistic underwater version of the clear image. Second, we design a lightweight feature enhancement module named feature de-drifting module Unet (FDM-Unet), which borrows the U-Net structure. Third, on pairs of clear images and their synthetic underwater counterparts, we use common detectors pre-trained on clear images (e.g., you only look once v3 (YOLO v3) and single shot multibox detector (SSD)) to extract the corresponding shallow features. The degraded shallow features of the underwater image are fed into FDM-Unet for de-drifting, and the mean-square error (MSE) loss between the enhanced features and the features of the clear image supervises the training of FDM-Unet. Finally, the trained FDM-Unet is plugged into the shallow layers of the pre-trained detector; without any re-training or fine-tuning, the detector can then directly handle underwater image object detection. Result The experimental results show that FDM-Unet improves detection accuracy by 8.58% and 7.71% mean average precision (mAP) on the PASCAL VOC 2007 synthetic underwater image test set for the pre-trained detectors YOLO v3 and SSD, respectively.
In addition, on the real underwater dataset underwater robot professional contest 19 (URPC19), when fine-tuning with different proportions of the data, FDM-Unet improves detection accuracy by 4.4%~10.6% mAP and 3.9%~10.7% mAP over the vanilla detectors YOLO v3 and SSD, respectively. Conclusion The proposed FDM-Unet serves as a plug-and-play module at the cost of a very small increase in parameters and computation. It substantially improves the detection accuracy of pre-trained detectors on synthetic underwater images without re-training or fine-tuning the detection model. Fine-tuning experiments on real underwater images further show that FDM-Unet improves detection performance over the baselines, demonstrating that the benefit carries over from synthetic to real underwater imagery.
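The supervision described in the Method can be sketched as follows. This is a toy illustration of the training signal only, not the paper's implementation: `fdm` stands in for the real FDM-Unet (a lightweight U-Net), and the feature maps would come from a pre-trained detector's shallow layers rather than raw arrays:

```python
import numpy as np

def mse_loss(enhanced, clear_feat):
    """MSE between the de-drifted (enhanced) feature map and the
    shallow feature map of the corresponding clear image; this is
    the loss that supervises FDM-Unet training."""
    return float(np.mean((enhanced - clear_feat) ** 2))

def fdm(degraded_feat, correction):
    """Toy stand-in for FDM-Unet: it maps a degraded shallow feature
    toward the clear-image feature; here a simple additive correction
    illustrates the training target."""
    return degraded_feat + correction
```

After training, the module is inserted between the detector's shallow layers and the rest of the network, so the downstream layers see features that look as if they came from a clear image.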
Keywords
