A point cloud compression network combining a dense residual structure and multi-scale pruning

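Zhu Wei1,2, Zhang Yuhang1, Ying Yue1, Zheng Yayu1,2, He Defeng1,2 (1. College of Information Technology, Zhejiang University of Technology, Hangzhou 310023; 2. United Key Laboratory of Embedded System of Zhejiang Province, Hangzhou 310023)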

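Abstract
Objective Point clouds are an important form of 3D data representation and have been applied in autonomous driving, virtual reality, 3D measurement, and other fields. Because point clouds have high resolution, transmitting them consumes a large amount of network bandwidth and storage resources, which seriously hinders their wider adoption. To address this, building on the deep-learning point cloud autoencoder compression framework, a point cloud compression network combining a dense residual structure and multi-scale pruning is proposed, which efficiently compresses both the geometry and the color information of point clouds. Method To address the sparsity of point clouds and the insufficient resolution of conventional voxel-grid representations, sparse tensors are adopted to represent point clouds, and sparse convolution and submanifold convolution replace regular convolution for feature extraction. To capture dependencies among high-dimensional information during compression, a dense residual structure and a channel attention mechanism are introduced into the feature extraction module. To compensate for feature loss during sampling and to reduce the dynamic memory footprint of model training, the autoencoder adopts a multi-scale progressive structure, with a pruning layer added after the up-sampling layer at each scale of the decoder. To extend the applicability of the network, a geometry-based point cloud color compression method is designed to preserve the global color features of the point cloud. Result For geometry compression, the proposed network is compared with five other methods on three datasets: MVUB (Microsoft voxelized upper bodies), 8iVFB (8i voxelized full bodies), and Owlii (Owlii dynamic human mesh sequence dataset). Relative to V-PCC (video-based point cloud compression), the point cloud compression standard proposed by MPEG (Moving Picture Experts Group), the BD-Rate (Bjøntegaard delta rate) gains are 41%, 54%, and 33%, respectively. The encoding runtime of the proposed network is comparable to that of G-PCC (geometry-based point cloud compression) and only 2.8% of that of V-PCC. For color compression, the proposed network outperforms the octree-based color compression method in G-PCC in YUV-PSNR (YUV peak signal-to-noise ratio) at low bit rates. Conclusion The proposed network outperforms mainstream point cloud compression methods in both geometry and color compression and preserves more of the original point cloud information at lower bit rates.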
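The sparse tensor representation named in the Method paragraph can be illustrated with a minimal NumPy sketch. It stores only the occupied voxels as a coordinate list plus one feature row per voxel, which is the idea behind a coordinate (COO) layout. The function name `voxelize_to_coo`, the `voxel_size` parameter, and the choice to average features that fall into the same voxel are assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np

def voxelize_to_coo(points, features, voxel_size=1.0):
    """Quantize points to integer voxel coordinates and merge duplicates.

    Returns (coords, feats):
      coords: (M, 3) int64 array, one row per occupied voxel
      feats:  (M, C) float array, mean feature of the points in each voxel
    Averaging is an illustrative choice, not the paper's stated rule.
    """
    points = np.asarray(points, dtype=np.float64)
    features = np.asarray(features, dtype=np.float64)

    # Integer voxel index of every input point.
    coords = np.floor(points / voxel_size).astype(np.int64)

    # Keep each occupied voxel once; `inverse` maps points to voxel rows.
    uniq, inverse = np.unique(coords, axis=0, return_inverse=True)
    inverse = inverse.reshape(-1)  # shape differs across NumPy versions

    # Accumulate features per voxel, then divide by the point count.
    sums = np.zeros((len(uniq), features.shape[1]))
    np.add.at(sums, inverse, features)
    counts = np.bincount(inverse, minlength=len(uniq))
    return uniq, sums / counts[:, None]
```

For three points of which two share a voxel, the result has two coordinate rows, whereas a dense grid covering the same extent would allocate every cell whether or not it is occupied.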
Keywords
A dense residual structure and multi-scale pruning-relevant point cloud compression network

Zhu Wei1,2, Zhang Yuhang1, Ying Yue1, Zheng Yayu1,2, He Defeng1,2(1.College of Information Technology, Zhejiang University of Technology, Hangzhou 310023, China;2.United Key Laboratory of Embedded System of Zhejiang Province, Hangzhou 310023, China)

Abstract
Objective In recent years, point clouds have been widely used in autonomous driving, virtual reality, 3D measurement, and related domains. To some extent, point clouds provide a denser and more realistic representation of 3D data than meshes. Because of their high resolution, point clouds are usually very large: a dense point cloud contains millions of points along with complex attribute information. Transmitting and storing point clouds is therefore challenging, since it consumes a large amount of network bandwidth and storage resources, and point cloud compression methods with a high compression ratio and low distortion are needed. Method First, point clouds are represented as sparse tensors in coordinate (COO) format instead of voxels. Sparse convolution (SC) and sub-manifold sparse convolution (SSC) replace regular convolution. SSC preserves feature sparsity during feature extraction and strengthens the network's ability to extract local features, while SC has a larger receptive field that compensates for the limited receptive field of SSC. Second, because point clouds are sparse and unorganized in space, their channel-wise information is likely to be more effective than spatial information. By combining channel attention with the dense residual network, which performs well in image super-resolution, we construct a three-dimensional dense residual module with channel attention (3D-RDB-CA). This module captures cross-channel features of high-dimensional information and improves compression performance. Furthermore, existing point cloud compression networks reconstruct high-resolution point clouds from low-resolution features through multiple deconvolution layers, but stacked deconvolution layers may produce a checkerboard effect. 
Therefore, to mitigate this effect and reduce the dynamic memory footprint during compression, a pruning layer is added after the up-sampling layer at each scale of the decoder. Using side information saved during encoding, this layer prunes feature points that contribute little to reconstruction accuracy, which reduces the dynamic memory used in model training and speeds up convergence. Finally, a point cloud color compression scheme based on geometric information is designed to extend the applicability of the compression network. Result For geometry compression, three conventional point cloud compression algorithms (G-PCC (octree), G-PCC (trisoup), and V-PCC) and two deep-learning-based point cloud compression algorithms (pcc_geo_cnn_v2 and learned_pcgc) are compared with the proposed network. The peak signal-to-noise ratio (PSNR) is computed from the point-to-point error (D1 PSNR) and the point-to-plane error (D2 PSNR), and the corresponding rate-distortion curves are plotted. For G-PCC and V-PCC, the bit rate range and corresponding parameters are configured according to the MPEG common test conditions (CTC). Finally, with the proposed network as the baseline, the BD-Rate and BD-PSNR of the other methods are calculated from D1 PSNR and D2 PSNR over the corresponding bit rate range. Compared with V-PCC, the point cloud compression standard proposed by MPEG, the BD-Rate gains exceed 41%, 54%, and 33% on the three datasets, respectively. The encoding runtime of the proposed network is comparable to that of G-PCC and only 2.8% of that of V-PCC. For color compression, G-PCC (octree) serves as the baseline. By setting different octree bit depths, quantization ratios, and color quality levels, the color compression distortion is obtained at different bit rates. 
The YUV-PSNR of the two methods is computed at corresponding bit rates and geometric distortion, and rate-distortion curves are plotted to compare their compression performance. The experiments show that the proposed network achieves better YUV-PSNR at low bit rates than the octree-based color compression method in G-PCC. Conclusion The proposed network shows great potential in geometry and color compression and preserves more of the original point cloud information at a lower bit rate. It can also facilitate deep-learning-based point cloud compression in applications that involve geometry and color compression.
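The D1 (point-to-point) PSNR used in the Result paragraph can be sketched as follows. This is a brute-force illustration rather than the evaluation tool used in the paper. It assumes the convention commonly used for voxelized geometry: the squared nearest-neighbor distance is averaged in each direction, the larger of the two means is taken, and PSNR is computed as 10·log10(3·peak² / MSE), where `peak` is the largest coordinate value (for example 1023 for 10-bit content). The function names and that convention are assumptions, and the all-pairs distance matrix only suits small clouds.

```python
import numpy as np

def nn_mse(src, dst):
    """Mean squared distance from each point in src to its nearest point in dst."""
    # (N, M) matrix of squared distances; brute force, O(N * M) memory.
    d2 = ((src[:, None, :] - dst[None, :, :]) ** 2).sum(axis=2)
    return d2.min(axis=1).mean()

def d1_psnr(ref, rec, peak):
    """Symmetric point-to-point PSNR between a reference and a reconstruction.

    Assumed convention: PSNR = 10 * log10(3 * peak^2 / max(MSE_ab, MSE_ba)).
    """
    ref = np.asarray(ref, dtype=np.float64)
    rec = np.asarray(rec, dtype=np.float64)
    mse = max(nn_mse(ref, rec), nn_mse(rec, ref))
    if mse == 0.0:
        return float("inf")  # identical geometry
    return 10.0 * np.log10(3.0 * peak ** 2 / mse)
```

Taking the maximum over both directions penalizes missing points as well as spurious ones, which a one-directional error would not do.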
Keywords
