Small object detection on the sea surface using a lightweight SSD model improved with an attention mechanism

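Jia Kexin1,2, Ma Zhenghua1, Zhu Rong2, Li Yonggang2 (1. College of Computer and Artificial Intelligence, Aliyun School of Big Data, Changzhou University, Changzhou 213000; 2. College of Mathematics, Physics and Information Engineering, Jiaxing University, Jiaxing 314001)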

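Abstract
Objective Images used for sea-surface object detection contain mostly small objects, whereas deep-learning-based object detection methods usually design their models for general-purpose object datasets and perform poorly on small objects in an image. When a general object detection model is applied to sea-surface images, small objects are often missed, and the effectiveness of some dedicated small-object detection models on sea-surface objects remains to be verified. To address this, a lightweight SSD model for sea-surface object detection is proposed on the basis of the standard SSD (single shot multiBox detector) object detection model combined with Xception depthwise separable convolution. Method On the basis of the standard SSD object detection model, the VGG-16 (Visual Geometry Group network-16) backbone is replaced with a depthwise separable convolution feature extraction network based on Xception, and the detection performance of different networks is compared through controlled-variable experiments. A lightweight attention mechanism module is introduced into the exit flow layer and the Conv1 layer of the feature extraction network to improve detection accuracy, and the result is compared with models that introduce the module into other layers. The attention-improved lightweight SSD model and several other models are then tested separately on the small objects and normal-sized objects in the sea-surface object detection dataset. Result Several groups of comparative experiments were carried out to verify the effectiveness of the proposed model. The results show that making the model lightweight weakens its feature representation ability and therefore affects detection accuracy. Compared with the standard SSD object detection model, the proposed model reduces the number of parameters by 16.26% and the floating-point operations by 15.65%, while the average precision for buoys increases by 1.1%, the miss rate decreases by 3%, and the mean average precision (mAP) increases by 0.51%. At the same time, the average precision for boats is maintained and their miss rate does not increase. The proposed model also performs well when tested on the small objects in the dataset. Conclusion The proposed sea-surface small-object detection model compresses the model while maintaining detection speed and detection accuracy, achieving a lightweight network. It also lowers the miss rate for small objects and can effectively detect small objects on the sea surface.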
Keywords
Attention-mechanism-based light single shot multiBox detector modelling improvement for small object detection on the sea surface

Jia Kexin1,2, Ma Zhenghua1, Zhu Rong2, Li Yonggang2(1.College of Computer and Artificial Intelligence, Aliyun School of Big Data, Changzhou University, Changzhou 213000, China;2.College of Mathematics Physics and Information, Jiaxing University, Jiaxing 314001, China)

Abstract
Objective Object detection on the sea surface plays a key role in the development and utilization of marine resources. The marine environment is complex and changeable, and it contains many kinds of objects. Because of safety and obstacle-avoidance considerations during image capture, most objects in sea-surface detection images are small or medium-sized, which places higher demands on accurate detection. Although some traditional object detection methods achieve good results, they still suffer from low detection accuracy and slow detection speed. With the rapid development of deep learning theory, the feature extraction capability of deep learning models has matured, and these models are now widely used in object detection. Compared with traditional methods, deep-learning-based object detection offers advantages in both speed and accuracy. However, such methods usually improve accuracy by building deeper networks, so the resulting models tend to have a very large number of parameters, which slows detection. Most well-performing detection networks can run only on high-performance graphics processing units (GPUs) and therefore require equipment with greater computing power, while compressing a model also degrades its detection accuracy. In addition, deep-learning-based detection models are typically designed for general object datasets and perform poorly on small objects in an image. When a general object detection model is applied to sea-surface images, small objects are often missed, and the effectiveness of some dedicated small-object detection models on sea-surface objects remains to be verified.
Method The original data come from the marine obstacle detection dataset 2 (MODD 2), which consists mainly of boats, buoys, and other sea-surface objects; 5 050 of its images are used. To construct the sea-surface object dataset, the boats and buoys are annotated with the labeling tool LabelImg and stored in the format of the visual object classes 2007 (VOC2007) dataset. First, on the basis of the standard single shot multiBox detector (SSD) object detection model, the Visual Geometry Group network-16 (VGG-16) backbone is replaced with a depthwise separable convolution feature extraction network based on Xception. The detection performance of different network models is compared through controlled-variable experiments covering the VGG-16-based SSD, the MobileNet-based SSD, and the Xception-based SSD. During training, each input image is scaled to a 300×300 pixel RGB image and then normalized. Training starts from an Xception model pre-trained on the common objects in context (COCO) dataset. Next, the SSD + Xception object detection model is adopted as the lightweight SSD model based on the Xception feature extraction network. A lightweight attention mechanism module is introduced into the exit flow layer and the Conv1 layer of the feature extraction network to improve detection accuracy, and the result is compared with models that place the module in other layers. The models are evaluated by the number of parameters (params), the number of floating-point operations (FLOPs), the number of images processed per second (frames per second, FPS), the precision rate, the miss rate, and the mean average precision (mAP).
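The saving that comes from swapping a standard convolution for a depthwise separable one can be illustrated with a small calculation. The following is a minimal sketch for a single layer, not the authors' code: the function names and the layer shape (3×3 kernel, 128 to 256 channels, 38×38 output map) are hypothetical, and biases are ignored.

```python
def standard_conv_cost(c_in, c_out, k, h_out, w_out):
    """Return (weights, multiply-adds) of a k x k standard convolution, bias ignored."""
    params = k * k * c_in * c_out
    macs = params * h_out * w_out
    return params, macs


def depthwise_separable_cost(c_in, c_out, k, h_out, w_out):
    """Return (weights, multiply-adds) of a k x k depthwise conv plus a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in      # one k x k filter per input channel
    pointwise = c_in * c_out      # 1 x 1 convolution that mixes channels
    params = depthwise + pointwise
    macs = params * h_out * w_out
    return params, macs


# Hypothetical layer: 3 x 3 kernel, 128 -> 256 channels, 38 x 38 output map.
std_params, std_macs = standard_conv_cost(128, 256, 3, 38, 38)
sep_params, sep_macs = depthwise_separable_cost(128, 256, 3, 38, 38)
ratio = sep_params / std_params   # equals 1 / c_out + 1 / k**2
print(std_params, sep_params, round(ratio, 4))
```

The per-layer ratio is much smaller than the whole-model reductions reported in this abstract, since a complete detector also contains layers and detection heads that this single-layer calculation does not cover.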
Finally, the attention-improved lightweight SSD object detection model and several other models are tested separately on the small objects and normal-sized objects in the sea-surface object detection dataset.
Result Several groups of comparative experiments are conducted to verify the effectiveness of the model. First, the parameters and floating-point operations of each model are compared, and the reasons why the networks become lightweight are analyzed. The analysis shows that reducing the number of Mul and Add operations and speeding up memory reads and writes make the network lightweight. However, compressing the model weakens the network's feature representation ability and thus affects detection accuracy to some extent. The SSD model with MobileNet as the feature extraction network is the lightest, but its detection accuracy suffers the most, with the mAP reduced by 2.28%. The SSD + Xception object detection model is therefore chosen as the lightweight SSD sea-surface object detection model based on the Xception feature extraction network. It removes only some of the Mul and Add operations, reducing the number of parameters by 19.01% and the floating-point operations by 18.40%. While cutting parameters and floating-point operations, it preserves the feature representation ability of the model, keeps the number of images processed per second essentially unchanged, and loses less detection accuracy, so the network is made lightweight while a given level of detection accuracy is maintained. To improve the detection accuracy of the lightweight SSD sea-surface object detection model, a lightweight attention mechanism module is introduced into the lower layers of the model to focus on salient or relevant information, which helps express the semantic feature information of small objects.
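The abstract does not say which lightweight attention module is used. As one plausible illustration only, the sketch below implements a squeeze-and-excitation-style channel attention step in plain Python: global average pooling per channel, a small bottleneck with ReLU, sigmoid gates, and channel-wise rescaling. The function name, the tiny feature map, and the hand-picked weights are all hypothetical and are not trained values from the paper.

```python
import math


def channel_attention(feature_map, w_reduce, w_expand):
    """Reweight the channels of a C x H x W nested-list feature map (SE-style sketch)."""
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feature_map]
    # Excitation: bottleneck layer with ReLU, then expansion with sigmoid gating.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w_reduce]
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden)))) for row in w_expand]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [[[g * v for v in row] for row in ch] for ch, g in zip(feature_map, gates)]
    return scaled, gates


# Hypothetical 2-channel 2 x 2 feature map and hand-picked weights.
fmap = [[[1.0, 1.0], [1.0, 1.0]], [[0.0, 2.0], [2.0, 0.0]]]
w_reduce = [[0.5, 0.5]]        # 2 channels -> 1 hidden unit
w_expand = [[1.0], [-1.0]]     # 1 hidden unit -> 2 channel gates
scaled, gates = channel_attention(fmap, w_reduce, w_expand)
print(gates)
```

A module of this kind adds only two small weight matrices per insertion point, which would be in line with the parameter reduction dropping slightly from 19.01% to 16.26% once attention is added, although the abstract does not break down where the extra parameters come from.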
Compared with the standard SSD object detection model, the proposed model increases the average precision for buoys by 1.1%, reduces the miss rate by 3%, and increases the mAP by 0.51%, while reducing the number of parameters by 16.26% and the floating-point operations by 15.65%. At the same time, the average precision for boats is maintained, and their miss rate does not increase. The proposed model also performs well when tested on the small objects in the dataset.
Conclusion The proposed small-object detection model for sea-surface images compresses the network while maintaining detection speed and detection accuracy, achieving a lightweight network. It also reduces the miss rate for small objects and can detect small objects on the sea surface effectively.
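The per-class average precision, miss rate, and mAP figures quoted in the results can be computed along the following lines. This is a hedged sketch: the abstract does not state which AP variant or miss-rate definition the authors used, so the code assumes the 11-point interpolated AP associated with VOC2007 and takes the miss rate as the fraction of ground-truth objects left undetected. The detection lists are made-up examples, not data from the paper.

```python
def miss_rate(true_positives, false_negatives):
    """Fraction of ground-truth objects that were not detected (1 - recall)."""
    return false_negatives / (true_positives + false_negatives)


def average_precision_11pt(is_true_positive, num_ground_truth):
    """VOC2007-style 11-point AP; detections must be sorted by descending confidence."""
    precisions, recalls = [], []
    tp = 0
    for rank, hit in enumerate(is_true_positive, start=1):
        tp += 1 if hit else 0
        precisions.append(tp / rank)
        recalls.append(tp / num_ground_truth)
    total = 0.0
    for i in range(11):
        threshold = i / 10
        # Interpolated precision: best precision at any recall >= threshold.
        candidates = [p for p, r in zip(precisions, recalls) if r >= threshold]
        total += max(candidates) if candidates else 0.0
    return total / 11


# Hypothetical detections for two classes (True = matched a ground-truth box).
ap_boat = average_precision_11pt([True, True, False, True], num_ground_truth=4)
ap_buoy = average_precision_11pt([True, False, True, False], num_ground_truth=4)
mean_ap = (ap_boat + ap_buoy) / 2   # mAP is the mean of the per-class APs
print(ap_boat, ap_buoy, mean_ap, miss_rate(true_positives=3, false_negatives=1))
```

Matching detections to ground-truth boxes (for example by an intersection-over-union threshold) is assumed to have happened before these functions are called.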
Keywords
