Vectorized spherical convolutional network for recognizing 3D mesh models with unknown rotation

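Zhang Qiang1, Zhao Jieyu1,2, Chen Hao1 (1. College of Electrical Engineering and Computer Science, Ningbo University, Ningbo 315211; 2. Key Laboratory of Zhejiang Province in Mobile Network Application Technology, Ningbo 315211)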

Abstract
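Objective 3D object classification is a fundamental problem in computer vision, and rotation of 3D objects poses a major challenge for it. In addition, features of irregular 3D mesh models are hard to extract with traditional 2D convolutional networks. To address both problems, we propose a classification method based on a vectorized spherical convolutional network for recognizing 3D mesh models with unknown rotation. Method Vectorized neurons serve as the basic neurons of the network, and a new convolution between vector layers is proposed. First, the 3D model is normalized and mapped onto the unit sphere to obtain a spherical signal representation. Then, a vectorized classification network and a reconstruction network learn equivariant features of the 3D model. Finally, the classification network classifies the 3D model. Result Based on ablation experiments, the final model uses the proposed spherical convolution modules and vector convolution layer, and adds the reconstruction module during training. The original non-rotated (NR) datasets are arbitrarily rotated (AR), and classification tasks are set up under three train/test strategies, NR/AR, AR/AR, and NR/NR, where the NR/AR task measures the model's ability to recognize unknown rotations. On the rigid dataset ModelNet40, the method improves on the classification method based on the spherical convolutional neural network (SCNN) by 7.7%, 1.8%, and 3.1% on the three tasks, respectively. To verify the advantage of the method on non-rigid 3D mesh objects, on the non-rigid dataset SHREC15 (shape retrieval contest 2015) it improves on SCNN by 8.8%, 4.5%, and 5.0% on the three tasks, respectively. Conclusion This paper proposes an approach for applying vectorized networks to 3D object classification. Ray casting yields features distributed on the sphere, which can then be processed with a unified spherical convolution operator; a spherical residual module is designed to avoid vanishing gradients; and vectorized neurons, together with a convolution between vector layers, preserve the equivariance of the network, making recognition of arbitrarily rotated 3D models more accurate.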
Keywords
A vectorized spherical convolutional network for recognizing 3D mesh models with unknown rotation

Zhang Qiang1, Zhao Jieyu1,2, Chen Hao1(1.College of Electrical Engineering and Computer Science, Ningbo University, Ningbo 315211, China;2.Key Laboratory of Zhejiang Province in Mobile Network Application Technology, Ningbo 315211, China)

Abstract
Objective 3D meshes represent an object's surface with triangles that carry spatial information, and they describe surface information better than other representations such as voxels or point clouds. Two problems remain open in 3D shape analysis on mesh representations: 1) the irregular data structure of mesh models makes feature extraction with traditional 2D convolutional networks difficult, and 2) 3D rotation makes object recognition difficult. Convolutional neural networks (CNNs) have advanced rapidly in 2D vision tasks such as classification, segmentation, and detection, as well as in applications on 3D objects. Current CNN-based 3D mesh classification follows two directions: 1) converting the 3D object to 2D images and applying 2D CNN methods, and 2) designing convolution methods that operate directly on 3D mesh data. However, recognizing rotated objects remains difficult because the pooling operation in traditional CNNs lacks equivariance. Two kinds of networks address the lack of rotation equivariance: vectorized networks and equivariant networks. The vectorized network, known as the capsule network, has shown potential for learning spatial transformations on 2D images, but, constrained by its convolution method, it has yet to be applied to 3D meshes. To apply the vectorized neural network to 3D mesh data while preserving rotation equivariance, we develop a 3D mesh classification method based on a vectorized spherical neural network. Method Our method consists of three steps. First, the 3D mesh model is preprocessed into signals on the sphere. We normalize the 3D mesh into a unit sphere and obtain the spherical signals on the unit sphere by ray casting. The resulting spherical signals are a nearly equivalent representation of the 3D shape and can be further processed by spherical convolution methods.
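The preprocessing step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the mesh is centered at its vertex centroid, that rays start on the unit sphere and point toward the center, that the signal is the distance to the first hit, and that a miss is filled with 1.0. The function names and the equiangular grid are hypothetical choices; the intersection test is the standard Möller-Trumbore algorithm.

```python
import numpy as np

def normalize_to_unit_sphere(vertices):
    """Center the mesh at its vertex centroid and scale it into the unit ball."""
    centered = vertices - vertices.mean(axis=0)
    return centered / np.linalg.norm(centered, axis=1).max()

def sphere_directions(n_theta, n_phi):
    """Unit vectors on an equiangular (theta, phi) grid; the grid is an assumption."""
    theta = (np.arange(n_theta) + 0.5) * np.pi / n_theta  # polar angle
    phi = np.arange(n_phi) * 2.0 * np.pi / n_phi          # azimuth
    t, p = np.meshgrid(theta, phi, indexing="ij")
    return np.stack([np.sin(t) * np.cos(p), np.sin(t) * np.sin(p), np.cos(t)], axis=-1)

def ray_cast_signal(vertices, faces, directions, eps=1e-9, miss_value=1.0):
    """Distance from each sphere point to the first mesh hit along the inward ray."""
    dirs = directions.reshape(-1, 3)
    origins = dirs[:, None, :]           # (R, 1, 3) ray origins on the sphere
    rays = -dirs[:, None, :]             # (R, 1, 3) rays point toward the center
    v0 = vertices[faces[:, 0]][None]     # (1, F, 3)
    e1 = vertices[faces[:, 1]][None] - v0
    e2 = vertices[faces[:, 2]][None] - v0
    # Moeller-Trumbore ray/triangle intersection, vectorized over rays and faces.
    p = np.cross(rays, e2)
    det = np.sum(e1 * p, axis=-1)
    valid = np.abs(det) > eps            # skip rays parallel to the triangle
    inv = 1.0 / np.where(valid, det, 1.0)
    s = origins - v0
    u = np.sum(s * p, axis=-1) * inv
    q = np.cross(s, e1)
    v = np.sum(rays * q, axis=-1) * inv
    t = np.sum(e2 * q, axis=-1) * inv
    hit = valid & (u >= -eps) & (v >= -eps) & (u + v <= 1 + eps) & (t >= -eps)
    first = np.where(hit, t, np.inf).min(axis=1)
    # Rays that hit nothing get miss_value (an arbitrary fill chosen here).
    first = np.where(np.isfinite(first), np.clip(first, 0.0, None), miss_value)
    return first.reshape(directions.shape[:-1])

# Toy mesh: an octahedron, shifted and scaled to exercise the normalization.
octa_vertices = np.array([[1, 0, 0], [-1, 0, 0], [0, 1, 0],
                          [0, -1, 0], [0, 0, 1], [0, 0, -1]], dtype=float)
octa_faces = np.array([[0, 2, 4], [2, 1, 4], [1, 3, 4], [3, 0, 4],
                       [2, 0, 5], [1, 2, 5], [3, 1, 5], [0, 3, 5]])
mesh = normalize_to_unit_sphere(3.0 * octa_vertices + 5.0)
signal = ray_cast_signal(mesh, octa_faces, sphere_directions(4, 8))
```

The broadcast over all ray/face pairs keeps the sketch short but costs O(rays × faces) memory, so a real pipeline would likely use a spatial acceleration structure instead.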
Converting the 3D mesh to spherical signals serves two purposes: 1) to use the equivariant spherical convolution operators defined on spherical signals, and 2) to design vectorized neurons in a consistent way. Second, an autoencoder-structured model learns features of the spherical signal. The model is composed of two sub-networks: i) a vectorized spherical convolutional neural network (VSCNN) that encodes equivariant features and classifies the 3D object, and ii) a multilayer perceptron decoder that decodes the extracted features back to the spherical signal. The VSCNN is built from two kinds of spherical residual convolution block and a vector convolution layer. To train deeper networks and resist overfitting, we develop two spherical convolution modules, the S2 convolution block and the SO(3) convolution block, whose output forms the primary vectorized neurons. The vector convolution layer learns high-level vectorized features from the lower layer, transforming the primary vectorized neurons into high-level ones. A deep vectorized network can be constructed by stacking vector convolution layers. To ensure that rotation-equivariant spherical vector neurons are learned during convolution, we use the SO(3) convolution operator to predict the high-level neurons. The VSCNN and the reconstruction network are trained simultaneously. Third, the VSCNN classifies the 3D object. For validation, only the VSCNN is used to predict the category of the 3D model. Result We verify the effectiveness of the proposed method on two 3D datasets, ModelNet40 and SHREC15. The model is trained on the non-rotated (NR) and arbitrarily rotated (AR) training sets and tested on the non-rotated and rotated test sets, which demonstrates the robustness of the model to rotation.
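One way to derive an arbitrarily rotated (AR) copy from a non-rotated (NR) model is sketched below. The abstract does not say how the rotations are sampled, so the uniform sampling over SO(3) via a normalized Gaussian quaternion is an assumption, and the helper names `random_rotation` and `make_ar_copy` are hypothetical.

```python
import numpy as np

def random_rotation(rng):
    """Sample a rotation matrix uniformly from SO(3) via a random unit quaternion."""
    q = rng.normal(size=4)
    w, x, y, z = q / np.linalg.norm(q)
    # Standard unit-quaternion to rotation-matrix conversion.
    return np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z),     2 * (x * z + w * y)],
        [2 * (x * y + w * z),     1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y),     2 * (y * z + w * x),     1 - 2 * (x * x + y * y)],
    ])

def make_ar_copy(vertices, rng):
    """Rotate an (N, 3) vertex array by one random rotation; faces stay unchanged."""
    rotation = random_rotation(rng)
    return vertices @ rotation.T, rotation

rng = np.random.default_rng(0)
nr_vertices = rng.normal(size=(10, 3))           # stand-in for a normalized mesh
ar_vertices, rotation = make_ar_copy(nr_vertices, rng)
```

Under the NR/AR strategy, a model would be trained on vertex sets like `nr_vertices` and tested on sets like `ar_vertices`, so any accuracy it keeps comes from the network rather than from rotation augmentation.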
On the rigid dataset ModelNet40, the accuracy on targets with unknown rotation reaches 85.2%, surpassing the baseline method by 7.7%. The comparative analysis shows that our method surpasses most multi-view and point cloud methods, which use other 3D data representations. The NR/NR result also compares favorably with the benchmarks. In addition, to evaluate recognition of non-rigid 3D mesh targets, we carry out a rotated classification experiment on the non-rigid dataset SHREC15, where the accuracy reaches 90.4%, surpassing the baseline method by 8.8%. Conclusion We develop a 3D object classification method for rotated meshes. Vectorized neurons and the equivariant vector convolution layer improve robustness to 3D rotation. The method recognizes rotated 3D models without rotation augmentation, which shows the ability of vectorized networks to learn transformations.
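If the reported gains are read as absolute percentage points (an assumption; the abstract does not say whether they are absolute or relative), the baseline accuracies they imply on the unknown-rotation task follow by subtraction:

```latex
\begin{align*}
\text{ModelNet40: } & 85.2\% - 7.7\% = 77.5\% \\
\text{SHREC15: }    & 90.4\% - 8.8\% = 81.6\%
\end{align*}
```

These implied baseline values are derived here for orientation only and are not stated in the abstract.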
Keywords
