Visualized all-scale shape representation and recognition
Abstract
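Objective The representation and recognition of the shape features of visual objects is an important problem in the image field. In practical applications, interference factors such as viewpoint change, deformation, occlusion, and noise lower recognition accuracy, and big data scenarios require algorithms with high learning efficiency. To address these problems, this paper proposes a visualized all-scale shape representation method. Method Invariant shape features are extracted from the shape contour at all scales in the scale space, yielding the all-scale features of the shape. All the extracted features are compactly represented as a single color image, which gives a visualized representation of the shape features. The color image representing the shape features is fed into a two-stream convolutional network model to perform shape classification and retrieval. Result Qualitative experiments that add disturbances such as rotation, occlusion, and noise to the original shapes verify that the proposed method is invariant to rotation and scaling and robust to articulated transformation, occlusion, and noise. In quantitative experiments on shape classification and shape retrieval on general datasets, the accuracy obtained exceeds that of the compared algorithms on every dataset. On the MPEG-7 dataset the accuracy reaches 99.57%, whereas the best result of the compared algorithms is 98.84%. On the articulated and projective transformation datasets the method reaches 100% recognition accuracy on both, whereas the best results of the compared algorithms are 89.75% and 95%, respectively. Conclusion The proposed visualized all-scale shape representation method compactly expresses all the shape information in one color image. The convolutional model learns both the shape feature relations among contour points and the shape feature relations among different scales. The method maintains high recognition accuracy under disturbances such as viewpoint change, partial occlusion, articulated deformation, and noise. It can be applied to object recognition where image acquisition suffers from many disturbances and to infrared or depth images, and it is suitable for recognition tasks in big data scenarios.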
Keywords
Visualized all-scale shape representation and recognition
Min Ruipeng1, Li Yifan1, Huang Yao1, Yang Jianyu1, Zhong Baojiang2
(1. School of Rail Transportation, Soochow University, Suzhou 215100, China; 2. School of Computer Science and Technology, Soochow University, Suzhou 215100, China)
Abstract
Objective The feature representation of shape contours plays an important role in shape recognition and retrieval, which are important problems in pattern recognition and image processing. As big data application scenarios multiply, deep learning methods are widely used to process large numbers of images because they learn effectively. Deep learning methods such as the popular convolutional neural network for image classification require an image representation of shape features. Representing the shape features of an object contour as an image, rather than as a series of feature values, is therefore desirable. Moreover, because many kinds of cameras and sensors are used for image and video capture, disturbances such as viewpoint variation, scaling, partial occlusion, articulation, projective transformation, and noise are unavoidable. These disturbances degrade image and video quality and, consequently, the accuracy of subsequent object recognition and retrieval. To solve these problems, a visualized all-scale shape representation and recognition method is proposed in this work. In our method, the representation of shape features can be learned by widely used deep learning models, which makes it effective for recognition and retrieval in big data application scenarios. The proposed method is also robust to various disturbances and noise. Method First, three kinds of invariant shape features, namely, the area feature, the arc length feature, and the central distance feature, are extracted from the shape contour. These are invariant features that capture different aspects of the shape at different dimensions, and they are normalized to the size of the shape in the image.
Because these three features can be extracted at different scales with respect to the shape, the features at all scales in the scale space are extracted to obtain sufficient shape information and fully represent the shape. All the features in the scale space are then compactly represented by a single color image. In this image representation, the R, G, and B channels represent the three kinds of invariant shape features, and each feature value is represented as a color value. In each channel, the x axis corresponds to the sequence of contour points, whereas the y axis corresponds to the scales. Because the shape is represented by a color image, a convolutional neural network is designed to learn the shape features from it. To learn as much shape information as possible, both the original shape image and the color image representation are used as input to the convolutional model. The model is thus designed with two convolutional streams, one for the original image and one for the color image. In this way, the deep learning method can effectively learn the shape features to perform shape classification and retrieval. Result In the experimental evaluation, both qualitative and quantitative experiments are conducted. The qualitative experiments test the robustness of the proposed method to various disturbances, including rotation, scale variation, partial occlusion, articulated deformation, and noise. Each kind of disturbance is added to the shape image, and the resulting color image representation is compared with that of the original shape image. The results validate that the proposed method is invariant to rotation and scaling and robust to articulated deformation, partial occlusion, and noise. Furthermore, quantitative experiments on shape recognition and retrieval are conducted on benchmark datasets.
The recognition and retrieval accuracy of the proposed method is tested on general datasets, including the MPEG-7 dataset and the Animal dataset, and its performance under disturbances is evaluated on the articulated shape dataset and the projective shape dataset. The accuracy is compared with that of other state-of-the-art methods. Our method outperforms all the compared methods in shape recognition and retrieval accuracy on all the datasets, which verifies that the proposed shape representation is effective for shape recognition and retrieval. The accuracy of our method is 99.57% on the MPEG-7 dataset; that is, it correctly classifies nearly all the shapes. Moreover, on the articulated and projective datasets, our method achieves 100% recognition accuracy, which greatly outperforms state-of-the-art methods. These evaluations verify that the proposed method maintains high accuracy in shape recognition and retrieval under different kinds of disturbances. Conclusion In this paper, a visualized all-scale shape representation method is proposed for shape recognition and retrieval. Different kinds of invariant shape features are extracted at all the scales in the scale space, so that as much shape information as possible is captured. The color image compactly represents the extracted shape features and makes them visible. Furthermore, this color image representation allows the effectiveness of deep learning methods to be exploited for feature learning and shape classification. The proposed two-stream convolutional neural network can fully learn the shape features from the color image representation and the original binary shape image.
By learning from the color image representation, the network learns not only the shape context along the shape contour, along the x axis of the color image, but also the relations of shape features among different scales, along the y axis. The proposed method is robust to various disturbances and noise and maintains high recognition accuracy under viewpoint variation, nonlinear deformation, partial occlusion, and articulated deformation. It can therefore be used in complex environments. Because the shape images are binary images, which can be easily obtained from depth maps, the method can also be used for object recognition and retrieval from infrared and depth images. The classification engine is based on a deep learning model and is therefore also suitable for recognition tasks in big data applications.
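The layout of the color image described in the Method part (one column per contour point, one row per scale, one channel per feature) can be illustrated with a minimal sketch. The abstract does not define the three features, so the window-based area, arc length, and central distance proxies below are illustrative assumptions and not the definitions used in the paper. The function name `all_scale_image`, the use of a window of k + 1 neighboring contour points as the "scale", and the normalization by total area and perimeter are likewise hypothetical choices made only to show how the features could be packed into R, G, and B.

```python
import math


def _shoelace(points):
    """Signed area of the closed polygon through the given (x, y) points."""
    n = len(points)
    total = 0.0
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]
        total += x0 * y1 - x1 * y0
    return total / 2.0


def _path_length(points):
    """Length of the open polyline through the given points."""
    return sum(math.dist(points[i], points[i + 1])
               for i in range(len(points) - 1))


def all_scale_image(contour, num_scales):
    """Return image[scale][point] = (r, g, b), each channel in 0..255.

    ASSUMPTION: row k uses a window of k + 1 contour points on each side
    of the current point as a stand-in for the paper's scale parameter.
    ASSUMPTION: the three channel values are simple proxies, normalized
    by the total area or perimeter so they do not depend on shape size.
    """
    n = len(contour)
    if num_scales < 1 or 2 * num_scales >= n:
        raise ValueError("num_scales must satisfy 1 <= num_scales < n / 2")
    total_area = abs(_shoelace(contour))
    perimeter = _path_length(list(contour) + list(contour[:1]))
    if total_area == 0 or perimeter == 0:
        raise ValueError("contour is degenerate")

    image = []
    for k in range(num_scales):
        w = k + 1
        row = []
        for i in range(n):
            # Contour points from i - w to i + w, wrapping around the closed contour.
            window = [contour[(i + j) % n] for j in range(-w, w + 1)]
            # R: area between the contour segment and its chord (proxy).
            area = abs(_shoelace(window)) / total_area
            # G: length of the contour segment relative to the perimeter (proxy).
            arc = _path_length(window) / perimeter
            # B: distance from the point to the centroid of the window (proxy).
            cx = sum(p[0] for p in window) / len(window)
            cy = sum(p[1] for p in window) / len(window)
            dist = math.dist(contour[i], (cx, cy)) / perimeter
            # Map each feature value to an 8-bit color value, clamped at 255.
            row.append(tuple(min(255, round(255 * v)) for v in (area, arc, dist)))
        image.append(row)
    return image
```

Calling `all_scale_image(contour, num_scales)` on a closed contour given as a list of (x, y) points returns `num_scales` rows of `len(contour)` pixels. Because every proxy is a ratio of lengths or areas measured on the shape itself, rotating or uniformly scaling the contour should leave the image essentially unchanged, which mirrors the invariance the abstract claims for the actual features.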
Keywords