Light field image refocusing based on conditional generative adversarial networks

Xie Ningyu, Ding Yuyang, Li Mingyue, Liu Yuan, Lyu Ruimin, Yan Tao (School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214122, China)

Abstract
Objective Traditional refocusing algorithms based on superposing sub-aperture views suffer from severe aliasing, while refocusing methods based on light field reconstruction are computationally expensive and difficult to improve further. To address this, we design and implement a novel and efficient end-to-end light field image refocusing algorithm based on a conditional generative adversarial network. Method First, a disparity map is computed from the input light field image, and the required circle of confusion (COC) map is derived from it. The central sub-aperture view of the light field is then defocus-rendered according to the COC map, finally producing a refocused image whose focal plane and depth of field correspond to the COC map. Result The proposed algorithm is evaluated against related algorithms on our synthetic dataset and a real-world dataset, demonstrating that it generates high-quality refocused images. Quantitative analysis using peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) shows that, compared with traditional refocusing algorithms, our algorithm improves the average PSNR by 1.82 dB and the average SSIM by 0.02, and compared with an algorithm that also uses COC maps but relies on anisotropic filtering, it improves the average PSNR by 7.92 dB and the average SSIM by 0.08. Conclusion Our algorithm can generate a disparity map from the input light field image according to the required focal plane and depth of field, and derive the corresponding COC map. The proposed conditional generative adversarial network then performs defocus rendering of the input central sub-aperture view conditioned on different COC maps, producing the corresponding refocused images. Compared with previous algorithms, ours eliminates aliasing, improves the defocus effect, and significantly reduces the computational cost.
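The abstract does not give the exact COC model used to turn a disparity map into a blur-radius map. The sketch below assumes a common formulation in which the blur radius grows with a pixel's disparity offset from the focal plane, with an aperture-like scale controlling the depth of field; the function name and parameters are illustrative only.

import numpy as np

def coc_map(disparity, focus_disparity, aperture_scale):
    # Hypothetical COC model (assumption, not the authors' formula):
    # blur radius proportional to how far each pixel's disparity lies
    # from the disparity of the chosen focal plane. aperture_scale plays
    # the role of the aperture: larger values give a shallower depth of
    # field, i.e., blur grows faster away from the focal plane.
    return aperture_scale * np.abs(disparity - focus_disparity)

Under this assumption, sweeping focus_disparity moves the focal plane and sweeping aperture_scale widens or narrows the depth of field, which is how a family of COC maps for the same scene could be produced.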
Keywords
Light field image refocusing based on conditional generative adversarial networks

Xie Ningyu, Ding Yuyang, Li Mingyue, Liu Yuan, Lyu Ruimin, Yan Tao (School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214122, China)

Abstract
Objective Light field images capture rich spatial and angular information and are widely used in computer vision applications. Exploiting this information to adjust the focal plane and depth of field of an image can significantly improve its visual effect. Current methods fall into two categories. The first category increases the angular resolution of a light field image via light field reconstruction, since the aliasing in superposition-based refocusing is caused by the disparity between the sub-aperture views of the light field image. These methods incur high computational costs and may introduce color errors or other artifacts; moreover, they can only improve refocusing quality under the original focal plane and depth of field. The second category applies various filters guided by a circle of confusion (COC) map to defocus-render the central sub-aperture view and produce a bokeh rendering effect. This category has low computational cost and can control both the focal plane and the depth of field, but it yields only a rough defocus effect. Deep convolutional neural networks (DCNNs) have shown clear advantages in bokeh rendering. To this end, we propose a novel conditional generative adversarial network (conditional GAN) for bokeh rendering. Method Our method takes a light field image as input and consists of three stages. First, a disparity map is estimated from the input light field image, and COC maps with different focal planes and depths of field are computed from it. The COC map and the central sub-aperture view of the light field image are fed into the generator of the conditional GAN. Next, the generator processes the two inputs with two four-layer encoders, fuses the features extracted by the two encoders, and passes the fused features through four consecutive residual modules. Finally, the generated refocused image is fed into the discriminator, which judges whether the refocused image corresponds to the COC map. To enhance the high-frequency details of the refocused/rendered image, we adopt a pre-trained Visual Geometry Group 16-layer (VGG-16) network to compute a style loss and a perceptual loss; an L1 loss is used for the generator, and the discriminator adopts a cross-entropy loss. Blender is used to adjust the positions of the focal planes and depths of field and to render the corresponding light field images, and a digital single-lens reflex (DSLR) camera plug-in for Blender renders the corresponding refocused images as ground truth. Our network is implemented on the Keras framework; the input and output sizes of the network are both 512×512×3. The network is trained on a Titan XP GPU card for 3 500 epochs with an initial learning rate of 0.000 2; training took about 28 hours.
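The abstract fixes only the coarse generator architecture: two four-layer encoders, fusion of the two feature streams, four consecutive residual modules, and 512×512×3 input and output. Below is a minimal Keras sketch under exactly those constraints; the filter counts, activations, and decoder design are assumptions, not the authors' published configuration.

from tensorflow import keras
from tensorflow.keras import layers

def encoder_stage(x, filters):
    # one encoder layer: stride-2 convolution halves the spatial resolution
    x = layers.Conv2D(filters, 3, strides=2, padding="same")(x)
    return layers.LeakyReLU(0.2)(x)

def residual_module(x, filters=512):
    # plain residual block operating on the fused features
    y = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    y = layers.Conv2D(filters, 3, padding="same")(y)
    return layers.Add()([x, y])

def build_generator():
    view = keras.Input((512, 512, 3))  # central sub-aperture view
    coc = keras.Input((512, 512, 1))   # conditioning COC map

    # two four-layer encoders, one per input (filter counts assumed)
    a, b = view, coc
    for f in (64, 128, 256, 256):
        a = encoder_stage(a, f)
        b = encoder_stage(b, f)

    x = layers.Concatenate()([a, b])   # fuse the two feature streams
    for _ in range(4):                 # four consecutive residual modules
        x = residual_module(x)

    # decoder mirroring the encoders back to 512x512 (design assumed)
    for f in (256, 128, 64, 32):
        x = layers.Conv2DTranspose(f, 3, strides=2, padding="same")(x)
        x = layers.LeakyReLU(0.2)(x)
    out = layers.Conv2D(3, 3, padding="same", activation="tanh")(x)
    return keras.Model([view, coc], out, name="refocus_generator")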
Result We compare our algorithm with related methods on our synthetic dataset and a real-world dataset, including current refocusing algorithms, three different light field reconstruction algorithms, and a defocusing algorithm that applies anisotropic filtering guided by a COC map. Peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) are used for quantitative evaluation. Qualitatively, the proposed network produces refocused images whose focal planes and depths of field follow the input COC maps. Quantitatively, compared with traditional refocusing algorithms, our algorithm improves the average PSNR by 1.82 dB and the average SSIM by 0.02; compared with the method that uses a COC map with anisotropic filtering, it improves the average PSNR by 7.92 dB and the average SSIM by 0.08. The reconstruction/super-resolution based methods achieve poor PSNR values because of the chromatic aberration of their generated sub-aperture views. Conclusion Our algorithm can generate a disparity map from the input light field image and derive the COC map corresponding to the required refocusing plane and depth of field. The proposed conditional generative adversarial network then performs bokeh rendering of the central sub-aperture view conditioned on different COC maps to produce the corresponding refocused images.
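As an illustration of the training objective named in Method (L1 plus VGG-16 style and perceptual losses), here is a minimal sketch. The VGG-16 layer choices and loss weights are assumptions, and the VGG input preprocessing is omitted for brevity.

import tensorflow as tf
from tensorflow import keras

# VGG-16 feature extractor for the perceptual and style terms.
# The layer selection here is an assumption, not the authors' choice.
vgg = keras.applications.VGG16(include_top=False, weights="imagenet")
features = keras.Model(
    vgg.input,
    [vgg.get_layer(n).output for n in ("block2_conv2", "block3_conv3")])

def gram_matrix(x):
    # Gram matrix of a feature map, the basis of the style loss
    b, h, w, c = tf.unstack(tf.shape(x))
    f = tf.reshape(x, [b, h * w, c])
    return tf.matmul(f, f, transpose_a=True) / tf.cast(h * w * c, tf.float32)

def generator_loss(y_true, y_pred, w_perc=0.1, w_style=10.0):
    # loss weights are placeholders; the abstract does not publish them
    l1 = tf.reduce_mean(tf.abs(y_true - y_pred))
    ft, fp = features(y_true), features(y_pred)
    perceptual = tf.add_n(
        [tf.reduce_mean(tf.abs(a - b)) for a, b in zip(ft, fp)])
    style = tf.add_n(
        [tf.reduce_mean(tf.abs(gram_matrix(a) - gram_matrix(b)))
         for a, b in zip(ft, fp)])
    return l1 + w_perc * perceptual + w_style * style

On the adversarial side, this would be paired with the usual binary cross-entropy terms (e.g., keras.losses.BinaryCrossentropy) applied to the discriminator's judgments of real versus generated image-COC pairs.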
Keywords
