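A survey of advances in image-based three-dimensional reconstruction in the context of deep learning
Abstract
Three-dimensional (3D) reconstruction is the process of reconstructing a 3D model of an object from one or more two-dimensional images and applying texture mapping to that model. It yields a colored, textured 3D model that can be observed from any viewpoint, and it is an important research direction in computer vision. Traditional 3D reconstruction methods usually require a large number of input images and proceed through several steps, including camera parameter estimation, dense point cloud reconstruction, surface reconstruction, and texture mapping. In recent years, image-based 3D reconstruction with deep learning has attracted wide attention and has shown superior performance and promising prospects. This paper gives a comprehensive survey of the technical methods, evaluation methods, and datasets for image-based 3D reconstruction in the context of deep learning. We first classify 3D reconstruction: by the representation of the 3D model, methods can be divided into voxel-based, point cloud-based, and mesh-based reconstruction; by the type of input image, they can be divided into single-image and multi-image reconstruction. We then introduce the different categories of methods and summarize them in terms of input, 3D model representation, model texture and color, the type of ground truth used by the reconstruction network, and characteristics. We also summarize the commonly used datasets and experimental comparisons for deep learning-based image 3D reconstruction. Finally, we summarize the open problems in the field and future research directions.
Keywords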
The growth of image-related three dimensional reconstruction techniques in deep learning-driven era: a critical summary
Yang Hang, Chen Rui, An Shipeng, Wei Hao, Zhang Heng (Tianjin Key Laboratory of Imaging and Sensing Microelectronics Technology, School of Microelectronics, Tianjin University, Tianjin 300072, China)
Abstract
Image-based three-dimensional (3D) reconstruction is the process of reconstructing a 3D model from a single image or from multi-view images. It yields a 3D model with color and texture that can be observed from any viewpoint. Traditional 3D reconstruction methods often require a large number of images and involve multiple steps, such as sparse point cloud reconstruction, camera parameter estimation, dense point cloud reconstruction, surface reconstruction, and texture mapping. In recent years, deep learning-based image 3D reconstruction has attracted considerable attention, yet the existing literature focuses mainly on traditional image-based methods or on the reconstruction of specific object types. A critical summary of image-based 3D reconstruction in the deep learning context is therefore needed. We summarize recent progress in deep learning-based 3D reconstruction from images. First, 3D reconstruction is introduced from two aspects: traditional methods and deep learning-based methods. Three kinds of 3D model are considered: the voxel model, the point cloud model, and the mesh model. A voxel is a small cube in 3D space, the 3D counterpart of a pixel; a mesh is a polyhedral structure composed of triangles, used to approximate the surface of complex objects; a point cloud is a set of points in a coordinate system, carrying information such as 3D coordinates, color, and classification. For the voxel model, the two-dimensional convolution used in image analysis extends easily to 3D space, but reconstructing a voxel model usually requires a large amount of memory: the memory and computation requirements of voxel-based methods grow cubically with the resolution of the voxel model.
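The cubic scaling noted for voxel models can be illustrated with a back-of-the-envelope calculation. The following is a minimal sketch, not taken from any of the surveyed methods; the function names, the 4 bytes per stored value, and the point count of 2048 are assumptions chosen purely for illustration.

```python
def voxel_grid_bytes(resolution: int, bytes_per_voxel: int = 4) -> int:
    """Memory for a dense cubic grid with `resolution` voxels per side.

    Assumes one value (e.g. an occupancy probability) of
    `bytes_per_voxel` bytes is stored for every voxel.
    """
    return resolution ** 3 * bytes_per_voxel


def point_cloud_bytes(num_points: int, bytes_per_coord: int = 4) -> int:
    """Memory for a point cloud storing only x, y, z per point (assumed layout)."""
    return num_points * 3 * bytes_per_coord


if __name__ == "__main__":
    # Doubling the side length multiplies the voxel memory by 2**3 = 8.
    for r in (32, 64, 128, 256):
        print(f"{r}^3 voxel grid: {voxel_grid_bytes(r) / 2**20:.3f} MiB")

    # A hypothetical 2048-point cloud, shown only for scale.
    print(f"2048-point cloud: {point_cloud_bytes(2048) / 2**20:.3f} MiB")
```

Under these assumptions, each doubling of the grid resolution multiplies the memory by eight, whereas the point cloud cost grows only linearly with the number of points.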
Point cloud-based shape reconstruction produces smoother shapes and uses less memory than voxel-based methods. Compared with the voxel and point cloud models, the mesh model describes the object surface more completely. Next, we classify image-based 3D reconstruction methods from two aspects: the representation of the 3D model and the type of input image. By input type, we divide the existing methods into two categories: single-image and multi-view. For single-image 3D reconstruction, we divide the methods into three categories according to the model representation: voxel-based, point cloud-based, and mesh-based. For multi-view 3D reconstruction, we divide the methods into two categories: voxel-based and mesh-based. Existing image-based 3D reconstruction methods are then introduced in detail and critically summarized in terms of the input of the method, the 3D model representation, model texture and color, the ground truth, and the properties of the reconstruction network. The experiments are analyzed from three aspects: evaluation methods, datasets, and method comparison. For the experiments, current 3D reconstruction datasets are introduced, e.g., the repository of shapes represented by 3D CAD models (ShapeNet) dataset, the pattern analysis, statistical modeling and computational learning (PASCAL) 3D+ dataset, the 3D CAD model (ModelNet) dataset, the database for 3D object recognition (ObjectNet3D) dataset, the benchmark of diverse image-shape pairs with pixel-level 2D-3D alignment (Pix3D) dataset, the Danmarks Tekniske Universitet (DTU) dataset, the New York University (NYU) depth dataset, and the Karlsruhe Institute of Technology and Toyota Technological Institute at Chicago (KITTI) dataset. The ShapeNet dataset is selected as the benchmark for comparison, and the pros and cons of the existing methods are analyzed. Finally, we discuss future research directions for image-based 3D reconstruction and summarize its open challenges and potential from five aspects: the generalization ability of 3D reconstruction methods; the fineness of the reconstruction; the combination of 3D reconstruction with segmentation and recognition methods; the texture mapping of the 3D model; and the evaluation system for 3D reconstruction.
Keywords