Foggy image enhancement for cross-domain object detection
Abstract
Objective Images captured by outdoor surveillance in foggy or hazy weather suffer from reduced sharpness and weakened target saliency. When natural scene statistics features related to human visual quality and target category semantic features related to detection accuracy are extracted from such images, they differ markedly from the features extracted from clear images. To improve image quality and, in the absence of annotated object detection data for foggy weather, to improve cross-domain detection performance, this paper combines traditional methods with deep learning and proposes an unsupervised prior hybrid feature-level image enhancement network. Method The traditional prior proposed in this paper forms a fog prior module, followed by a feature-level enhancement network module that takes the descattered image as input and uses pixel-domain and feature-domain losses to enhance scene statistics features and the appearance features related to target category semantics. This hybrid network breaks through the limitation that traditional pixel-level enhancement methods can hardly represent abstract features, overcomes the weakness that adversarial transfer networks can hardly measure the feature-space distribution gap between non-overlapping image domains accurately, and reduces the dependence of recognition algorithms on annotated images captured in low-visibility weather; it simultaneously improves the overall visual perceptual quality of foggy images and the recognizability of local targets. Result Experiments on two real foggy image datasets, the real-world task-driven testing set (RTTS) and the Foggy Driving Dense autonomous driving dataset, compare our method with five recent scattering removal methods. Relative to the second-best algorithm on each metric, our results improve the gradient ratio metric R by 50.83% on average, the perceptual quality metric integrated local natural image quality evaluator (IL-NIQE) by 6.33% on average, the cross-domain detection metric mean average precision (MAP) by 6.40% on average, and the mean recall by 7.79% on average. The results show that our method outperforms the compared methods in both visual quality and target recognizability; moreover, it processes high-definition video at 50 frames/s and needs no annotated data, so it offers higher practical value for surveillance systems. Conclusion The proposed method simultaneously meets the needs of human viewing and cross-domain object detection by recognition algorithms for video captured in foggy weather, and thus has strong practical significance.
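As a rough, illustrative sketch only (not the paper's method): the fog prior module builds on an extended atmospheric scattering model of the form I = J·t + A·(1 − t). The Python/NumPy code below inverts the standard form of that model with a generic dark-channel-style transmission estimate; the paper instead uses a regional gradient constrained prior and additionally decomposes uneven illumination, so every function name and constant here is an assumption.

    # Minimal sketch of inverting the standard atmospheric scattering model
    # I = J * t + A * (1 - t). The transmission estimate below is a generic
    # dark-channel-style placeholder, NOT the paper's regional gradient
    # constrained prior; names and constants are illustrative assumptions.
    import numpy as np
    from scipy.ndimage import minimum_filter

    def defog(I, omega=0.95, t0=0.1, win=15):
        """I: H x W x 3 float image in [0, 1]; returns descattered estimate J."""
        dark = minimum_filter(I.min(axis=2), size=win)                   # dark channel
        A = I.reshape(-1, 3)[dark.ravel().argsort()[-10:]].max(axis=0)   # airlight estimate
        t = 1.0 - omega * minimum_filter((I / A).min(axis=2), size=win)  # transmission
        t = np.clip(t, t0, 1.0)[..., None]                               # avoid division blow-up
        return np.clip((I - A) / t + A, 0.0, 1.0)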
Foggy image enhancement for cross-domain object detection
Guo Qiang1, Pu Shiliang2, Zhang Shifeng2, Li Bo1 (1. Beijing Key Laboratory of Digital Media, Beihang University, Beijing 100191, China; 2. Hangzhou Hikvision Digital Technology Company, Hangzhou 310000, China)
Abstract
Objective Images acquired in fog, mist, and damp weather are degraded by atmospheric scattering, which hampers both human observation and target analysis by intelligent detection systems. The scattering of reflected light reduces image contrast as scene depth increases, and uneven sky illumination further limits visibility. These two factors attenuate and blur the weak textures in foggy images. The degradation alters pixel-wise statistical distributions such as saturation and Weber contrast, and changes inter-pixel statistics such as target contour intensity. As a result, the natural scene statistics (NSS) features related to visual perception quality and the target category semantic (TCS) features related to detection accuracy extracted from a foggy image differ significantly from those of a clear image. Traditional image restoration methods can build a defogging mapping from the conventional scattering model to improve image contrast, but they struggle to remove severe scattering from image features. Deep learning based enhancement methods remove scattering well on images close to the distribution of their training data, yet they generalize poorly to real foggy images, often producing dense artifacts, because real degradation is more complex than the degradation in synthetic foggy images. Recent work improves generalization with weakly supervised techniques, but a large domain gap between real and synthetic foggy images remains. Moreover, when optimizing image features, current deep learning methods find it hard to balance visual quality against machine perception quality measured via image classification or object detection. To combine the strengths of prior-based and deep learning based methods, we propose an unsupervised prior hybrid network for feature enhancement that serves both detection and object analysis. Method Our network enhances NSS and TCS features. First, a prior-based fog inversion module removes the atmospheric scattering effect and restores the uneven illumination in foggy images. It builds on an extended atmospheric scattering model and a regional gradient constrained prior for transmission estimation and illumination decomposition. Then, a feature enhancement module based on a conditional generative adversarial network (CGAN) takes the defogged image as input. The generator uses six residual blocks with instance normalization layers and a long skip connection to translate defogged images toward the clear-image domain. The network has three discriminators. The style and feature discriminators, each with five convolution layers and LeakyReLU activations, distinguish the "defogged" image style from the "clear" style and drive the generator adversarially through CGAN losses at the pixel level and the feature level. Besides the CGAN losses, which further remove residual scattering degradation from defogged images, the generator is trained with a content loss that constrains detail distortion during image translation.
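For concreteness, a minimal PyTorch sketch of the generator and one of the discriminators described above follows. The six residual blocks with instance normalization, the long skip connection, and the five-convolution LeakyReLU discriminator match the description; the channel widths, kernel sizes, and strides are assumptions, not the paper's exact configuration.

    # Sketch of the generator (six residual blocks, instance norm, one long
    # skip connection) and a five-convolution LeakyReLU discriminator.
    # Channel widths, kernel sizes, and strides are illustrative guesses.
    import torch
    import torch.nn as nn

    class ResBlock(nn.Module):
        def __init__(self, ch):
            super().__init__()
            self.body = nn.Sequential(
                nn.Conv2d(ch, ch, 3, padding=1), nn.InstanceNorm2d(ch), nn.ReLU(True),
                nn.Conv2d(ch, ch, 3, padding=1), nn.InstanceNorm2d(ch))
        def forward(self, x):
            return x + self.body(x)        # local residual connection

    class Generator(nn.Module):
        def __init__(self, ch=64):
            super().__init__()
            self.head = nn.Conv2d(3, ch, 7, padding=3)
            self.blocks = nn.Sequential(*[ResBlock(ch) for _ in range(6)])
            self.tail = nn.Conv2d(ch, 3, 7, padding=3)
        def forward(self, x):
            # long skip connection: the network learns a residual correction
            return x + self.tail(self.blocks(self.head(x)))

    class Discriminator(nn.Module):        # five conv layers with LeakyReLU
        def __init__(self, ch=64):
            super().__init__()
            layers, c = [], 3
            for i in range(4):
                layers += [nn.Conv2d(c, ch * 2**i, 4, stride=2, padding=1),
                           nn.LeakyReLU(0.2, True)]
                c = ch * 2**i
            layers += [nn.Conv2d(c, 1, 4, padding=1)]   # fifth conv: patch logits
            self.body = nn.Sequential(*layers)
        def forward(self, x):
            return self.body(x)

    y = Generator()(torch.randn(1, 3, 170, 270))   # patch size from the paper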
Moreover, we analyze the domain differences between defogged and clear images at the target feature level, and use a target cumulative vector loss based on a target semantic discriminator to guide the refinement of target outlines in the defogged image. The feature enhancement module thus enhances the contrast- and brightness-related NSS features and improves the TCS features related to target clarity. The recovered features are constrained jointly by the CGAN losses and the content loss, reflecting the information differences between image features and image pixels. Our network connects the traditional method with a convolutional neural network (CNN) module and obtains the enhanced result through scattering removal followed by feature enhancement. This design reduces the dependence of feature learning on synthetic paired training data and mitigates the instability of unpaired adversarial learning in image translation. The representation of abstract features is also improved by this direction-definite, fine-grained feature learning scheme. The traditional enhancement module is tuned for the best defogged result via parameter adjustment. The feature enhancement module is trained with the adaptive moment estimation (Adam) optimizer for 250 epochs, with momentum parameters of 0.5 and 0.999 and a learning rate of 2E-4. Unpaired 270×170 patches, randomly cropped from 2 000 defogged real-world images and 2 000 clear images, are fed to the generator and discriminators. Training and testing are carried out in PyTorch on an x86 computer with a Core i7 3.0 GHz processor, 64 GB of RAM, and an NVIDIA 2080 Ti GPU. Result We compare our method with 5 state-of-the-art enhancement methods, including 2 traditional approaches and 3 deep learning methods, on 2 public real foggy image datasets: the real-world task-driven testing set (RTTS) and the Foggy Driving Dense dataset. RTTS contains 4 322 foggy or dim images, and Foggy Driving Dense contains 21 dense-fog images collected online. The quantitative evaluation uses image quality metrics and detection metrics. The image quality metrics are the enhanced gradient ratio R and the blind image quality analyzer integrated local natural image quality evaluator (IL-NIQE); the detection metrics are mean average precision (MAP) and recall. More enhanced results of each method are shown for qualitative comparison in the experimental section. On RTTS and Foggy Driving Dense, compared with the method ranking second on each metric, our method improves the mean R value by 50.83%, the mean IL-NIQE value by 6.33%, the MAP value by 6.40%, and the mean recall value by 7.79%. Qualitatively, the enhanced results of the proposed method are much closer to clear images in color, brightness, and contrast. The experiments show that our network improves both the visual quality and the machine perception of foggy images captured in bad weather, at a speed above 50 frames/s per megapixel. Conclusion Our unsupervised prior hybrid network integrates traditional restoration methods and deep learning based enhancement models for multi-level feature enhancement, achieving the enhancement of both NSS and TCS features.
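A condensed training-loop sketch using the stated hyperparameters (Adam with momenta 0.5 and 0.999, learning rate 2E-4, 250 epochs, unpaired 270×170 patches) follows. It reuses the Generator and Discriminator classes sketched above; the adversarial-plus-L1 objective and the dummy loader are simplified placeholders for the paper's full CGAN, content, and target cumulative vector losses and its real unpaired data pipeline.

    # Condensed training sketch reusing the Generator and Discriminator above.
    # The adversarial + L1 terms are simplified placeholders for the paper's
    # CGAN, content, and target cumulative vector losses; the dummy loader
    # stands in for unpaired real defogged / clear patch sampling.
    import torch
    import torch.nn.functional as F

    g, d = Generator(), Discriminator()
    opt_g = torch.optim.Adam(g.parameters(), lr=2e-4, betas=(0.5, 0.999))
    opt_d = torch.optim.Adam(d.parameters(), lr=2e-4, betas=(0.5, 0.999))
    bce = torch.nn.BCEWithLogitsLoss()
    loader = [(torch.rand(4, 3, 170, 270), torch.rand(4, 3, 170, 270))]  # dummy unpaired batch

    for epoch in range(250):                  # 250 epochs as stated above
        for defogged, clear in loader:
            fake = g(defogged)
            # discriminator step: real clear patches vs. generated patches
            real_logit, fake_logit = d(clear), d(fake.detach())
            d_loss = bce(real_logit, torch.ones_like(real_logit)) + \
                     bce(fake_logit, torch.zeros_like(fake_logit))
            opt_d.zero_grad(); d_loss.backward(); opt_d.step()
            # generator step: fool the discriminator, keep content close to input
            adv_logit = d(fake)
            g_loss = bce(adv_logit, torch.ones_like(adv_logit)) + \
                     10.0 * F.l1_loss(fake, defogged)
            opt_g.zero_grad(); g_loss.backward(); opt_g.step()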
Our experiments demonstrate that our method outperforms existing methods on real foggy images in terms of image quality and object detectability for intelligent detection systems.
Keywords
foggy image dehazing; feature enhancement; prior hybrid network; unsupervised learning; image domain translation