A survey of 3D point cloud completion algorithms based on deep learning

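Hu Fuyuan1, Li Chenlu1, Zhou Tao2, Cheng Hongfu3, Gu Minming4 (1. School of Electronic and Information Engineering, Soochow University of Science and Technology; 2. School of Computer Science and Engineering, North Minzu University; 3. Suzhou Humble Administrator's Garden Administration Office (Suzhou Garden Museum); 4. School of Electronic Information and Engineering, Soochow University of Science and Technology)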

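Abstract
Point clouds have become the main representation in 3D vision because of their rich expressive capacity. In practice, however, captured point cloud data are often sparse or incomplete for a variety of reasons, which severely hampers downstream processing. Point cloud completion algorithms aim to reconstruct a complete point cloud model from incomplete point cloud data, and they underpin research in 3D reconstruction, object detection, and shape classification. Deep-learning-based point cloud completion has gradually become a research hotspot in the 3D point cloud field, but challenges in model structure, accuracy, and efficiency are holding back its development. This paper systematically reviews point cloud completion algorithms in the deep learning setting. First, according to the network input modality, the algorithms are divided into two categories: single-modality methods and multi-modality methods. Next, according to the 3D data representation, single-modality methods are divided into three categories: voxel-based, view-based, and point-based methods. Classical and recent methods are systematically analyzed and summarized, and further classified and compared in light of popular models such as generative adversarial networks (GAN) and the Transformer, with an assessment of the method characteristics and network performance of completion algorithms under each kind of model. Multi-modality methods are then analyzed in terms of practical applications, and algorithm performance is compared in combination with methods such as diffusion models. The paper then summarizes the datasets and evaluation criteria commonly used in point cloud completion and, using several evaluation criteria, compares the performance of existing deep-learning-based completion algorithms on real datasets and multiple synthetic datasets. Finally, based on the strengths and weaknesses of each category, it outlines future developments and research trends for point cloud completion in deep learning, providing a reference for researchers working on completion algorithms in 3D vision.
Keywords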
A survey on point cloud completion algorithms for deep learning

(1. School of Electronic and Information Engineering, Soochow University of Science and Technology; 2. School of Computer Science and Engineering, North Minzu University; 3. Suzhou Humble Administrator's Garden Administration Office (Suzhou Garden Museum))

Abstract
Point clouds have become the main representation in three-dimensional vision because of their rich expressive capacity. However, point cloud data collected in practice are often sparse or incomplete owing to the characteristics of the measured object, the performance of the measuring instrument, and environmental and human factors, which seriously affects subsequent point cloud processing. Point cloud completion algorithms aim to reconstruct a complete point cloud model from incomplete point cloud data, and they are an important research basis for 3D reconstruction, object detection, and shape classification. With the rapid development of deep learning, its efficient feature extraction and strong data processing capabilities have led to wide use in 3D point cloud algorithms. Point cloud completion algorithms based on deep learning have gradually become a research hotspot in the 3D point cloud field. However, challenges in model structure, accuracy, and efficiency are hindering their development, for example the loss of key structural information, the difficulty of fine-grained reconstruction, and the inefficiency of the algorithm models. This paper systematically reviews point cloud completion algorithms in the context of deep learning. First, according to the network input modality, point cloud completion algorithms are divided into two categories: single-modality methods and multi-modality methods. Then, according to the representation of 3D data, single-modality methods are divided into three categories: voxel-based methods, view-based methods, and point-based methods. Classical and recent methods are systematically analyzed and summarized, and the method characteristics and network performance of point cloud completion algorithms under various models are reviewed.
Next, multi-modality methods are analyzed in terms of practical applications, and algorithm performance is compared in combination with diffusion models and other methods. The paper then summarizes the datasets and evaluation criteria commonly used in point cloud completion tasks and, using several evaluation criteria, compares the performance of existing deep-learning-based point cloud completion algorithms on real and synthetic datasets. Finally, based on the advantages and disadvantages of each category, future developments and research trends of point cloud completion algorithms in deep learning are proposed. The main findings are as follows. Since the concept of point cloud completion was proposed in 2018, most single-modality methods have used point-based completion and have been optimized with popular models such as generative adversarial networks (GAN), the Transformer model, and the Mamba model. Multi-modality methods have developed rapidly since they were proposed in 2021, especially after diffusion models were applied to point cloud completion, which genuinely realized multimodal input and output. Many researchers have explored multimodal information fusion at the feature level to improve the accuracy of completion algorithms. This work also provides an updated theoretical basis for multi-vehicle cooperative intelligent perception in robotics and autonomous driving. Completion based on multi-modality methods is also the future development trend of point cloud completion algorithms.
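The evaluation criteria mentioned above are not named in this abstract. As an illustration only, one metric widely used to score point cloud completion results is the Chamfer distance; the sketch below assumes the squared-distance, mean-normalized variant, and the function name `chamfer_distance` and the toy point sets are hypothetical rather than taken from the paper.

```python
from math import dist


def chamfer_distance(p, q):
    """Symmetric Chamfer distance between two non-empty point sets.

    Assumed variant: mean of squared nearest-neighbour distances,
    summed over both directions (p -> q and q -> p).
    """
    def one_way(a, b):
        # For every point in a, squared distance to its nearest point in b.
        return sum(min(dist(x, y) ** 2 for y in b) for x in a) / len(a)

    return one_way(p, q) + one_way(q, p)


# Toy example: a "partial" cloud that misses one point of the "complete" cloud.
partial = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
complete = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
print(chamfer_distance(partial, complete))
```

This brute-force version costs O(|P||Q|) distance evaluations, which is acceptable for small examples but not for dense clouds; practical implementations typically use spatial indexing or GPU batching.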
This comprehensive survey of deep-learning-based point cloud completion algorithms finds that current research has improved feature extraction and model generation for point cloud data to a certain extent, but the following research difficulties remain. 1) Features and fine granularity: at present, most algorithms focus on making full use of structural information to predict and generate fine-grained, more complete point cloud shapes. Fusing geometric structure and attribute information on the basis of the point cloud data structure, so as to enrich high-quality point cloud generation, remains of great research significance. 2) Multimodal data fusion: point cloud data are usually fused with other sensor data, such as RGB images and depth images, to obtain more comprehensive information. How to improve multimodal feature extraction and fusion, and how to fuse multimodal data intelligently to improve the accuracy and robustness of point cloud completion, will be difficulties for future research. In the future, point cloud completion is expected to connect all modalities, from text and images to point clouds, so that arbitrary inputs and outputs become genuinely possible. 3) Data augmentation and diversity: large models for point clouds will be a hot research topic. How to improve the generalization ability and data diversity of point cloud completion algorithms across scenarios through data augmentation or model diffusion will also become a difficulty in the point cloud field. 4) Real-time performance and interactivity: real-time requirements limit the development of point cloud completion algorithms in applications such as autonomous driving and robotics.
The high complexity of the algorithms, the difficulty of fusing multimodal feature information, and the difficulty of processing large-scale data make the models inefficient, resulting in poor real-time performance. An open question is how to reduce the size of the data through preprocessing and downsampling and then choose a relatively lightweight model structure, such as the Mamba model, to improve model efficiency. At the same time, rapidly adjusting and optimizing high-quality point cloud completion results according to user interaction information will also be a challenge for future development. This systematic review of point cloud completion algorithms in the context of deep learning provides a useful reference for researchers working on completion algorithms in 3D vision.
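The abstract mentions reducing data size through downsampling but does not say which scheme is meant. A minimal sketch of one common choice, voxel-grid downsampling, is given below under that assumption; the function name `voxel_downsample`, the `voxel_size` parameter, and the sample cloud are hypothetical.

```python
from collections import defaultdict
from math import floor


def voxel_downsample(points, voxel_size):
    """Replace all points that fall in the same cubic voxel by their centroid."""
    if voxel_size <= 0:
        raise ValueError("voxel_size must be positive")

    # Group points by the integer index of the voxel that contains them.
    buckets = defaultdict(list)
    for p in points:
        key = tuple(floor(c / voxel_size) for c in p)
        buckets[key].append(p)

    # One centroid per occupied voxel.
    return [
        tuple(sum(axis) / len(group) for axis in zip(*group))
        for group in buckets.values()
    ]


cloud = [(0.1, 0.1, 0.1), (0.3, 0.1, 0.1), (1.2, 0.0, 0.0), (1.4, 0.0, 0.0)]
reduced = voxel_downsample(cloud, voxel_size=1.0)
print(len(cloud), "->", len(reduced))
```

A larger `voxel_size` gives a smaller cloud at the cost of geometric detail, so the value would have to be tuned against the accuracy a completion model needs.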
Keywords
