Rotating object detection for outdoor helicopter blades fusing multiple features and self-attention

Xu Feilong, Xiong Bangshu, Ou Qiaofeng, Yu Lei, Rao Zhibo (Nanchang Hangkong University)

Abstract
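Objective Blade motion parameters are important indicators throughout helicopter design and production. When traditional visual measurement methods are applied directly outdoors, complex lighting and backgrounds can prevent the blade region from being found, so accurate measurement is not possible. To address this, this paper proposes a rotating object detector that fuses multiple features and self-attention (fusion multi-feature and self-attention rotating detector, FMSA-RD). Method First, to address the insufficient feature extraction ability and the redundancy of YOLOv5s (you only look once version 5 small), more effective multi-feature extraction and fusion modules are designed in the backbone network. They combine feature information from different times, positions, and scales to improve the detection accuracy for outdoor blades, and some unneeded convolution layers are removed to reduce the modules' structural parameters. Second, the multi-head self-attention mechanism is fused with the CSP (cross stage partial convolution) bottleneck structure to integrate global information and suppress interference from complex outdoor lighting and backgrounds. Finally, the SKEWIOU skew intersection-over-union loss and an angle loss are introduced to improve the loss function and further raise blade detection accuracy. Result Several groups of comparison experiments were run on a self-made outdoor helicopter blade dataset, a self-made outdoor simulated blade dataset, and the public dataset DOTA-v1.0 (dataset for object detection in aerial images-version 1.0). Compared with the baseline YOLOv5s object detection network, the proposed model improves mean average precision (mAP) by 6.6% and 12.8%, respectively, and improves speed (frames per second, FPS) by 21.8% and 47.7%, respectively. Conclusion The rotating object detection model designed in this paper improves both the accuracy and the speed of blade detection under complex outdoor lighting and backgrounds.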
Keywords
Fusion of multi-feature and self-attention for rotating target detection of outdoor helicopter blades

Xu Feilong, Xiong Bangshu, Ou Qiaofeng, Yu Lei, Rao Zhibo (Nanchang Hangkong University)

Abstract
Objective The motion parameters of helicopter rotor blades include the flapping angle, lead-lag angle, twist angle, and coning angle. They provide an important basis for rotor structure design, for the design of the hub's upper and lower limit blocks, and for blade load design, and they must be measured in ground tests before rotorcraft certification and helicopter flight tests. Traditional visual measurement of rotor blade motion parameters works well in indoor wind tunnel environments. Against complex outdoor backgrounds, however, the rotor blades often cannot be detected in the image, so the parameters cannot be measured accurately. Unlike indoor environments, outdoor scenes have complex lighting that varies with season, weather, time of day, and lighting direction, as well as varying sky and background conditions. This interference weakens the features of the rotor blades and makes it difficult to locate them accurately. Deep learning is a mainstream approach to object detection, and designing a deep learning model that enhances the target features of rotor blades while reducing the interference of complex backgrounds is a major challenge. In this paper, an angle prediction branch is added to the YOLOv5s network structure, and a rotating object detector that fuses multiple features and self-attention (FMSA-RD) is proposed for detecting outdoor helicopter rotor blades. Method Firstly, the FMSA-RD model improves the C3 module used for feature extraction in YOLOv5s by adding multiple shortcuts. The improved feature extraction module, called convolution five (C5), extracts features by fusing local features from different positions, which simplifies the network structure while maintaining its feature extraction ability.
Specifically, C5 replaces the BottleNeck module in C3 with two 3×3 convolution kernels, which avoids the additional overhead of multiple BottleNeck modules and enlarges the receptive field of the convolution layer feature map. Increasing the number of convolution layers does not necessarily lead to better-optimized parameters and may cause gradient divergence and network degradation. Therefore, C5 adds shortcut branches to its three main convolution layers to avoid accumulating useless parameters and to extract feature information from different positions. Secondly, the multi-feature fusion module, spatial pyramid pooling cross stage partial fast tiny (SPPFCSPT), enhances the ability to fuse image features at different scales. This module uses a block-merging method in which multiple serial 5×5 MaxPool layers extract four different receptive fields, improving detection accuracy while keeping the model lightweight. Then, because CNN structures are weak at relating global features, the B3 module is designed to improve global feature extraction by combining the multi-head self-attention mechanism of the Transformer with the CSP bottleneck structure, which suppresses the influence of the complex outdoor backgrounds behind the rotor blades. Finally, a skew intersection-over-union (SKEWIOU) loss and an angle loss are introduced to improve the loss function and further enhance the accuracy of blade detection. Result Experiments were conducted on a self-made outdoor helicopter rotor blade dataset, a self-made outdoor simulated blade dataset, and the public dataset DOTA-v1.0 for training and validation. The self-made outdoor rotor blade dataset contains 3878 images, which were randomly divided into training, testing, and validation sets in a 7:2:1 ratio. FMSA-RD was compared with mainstream horizontal and rotated detection models: RetinaNet, FCOS, YOLOv5s, YOLOv6s, YOLOv7-tiny, CenterRot, FAB+DRB+CRB, H2RBox, and R3Det.
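The serial pooling described for the SPPFCSPT module can be illustrated with a minimal sketch. It is not the authors' implementation: it works in one dimension for brevity (max pooling is separable, so the same argument applies per axis to 5×5 windows), it assumes stride 1 with border clipping, and it assumes that the four receptive fields are the unpooled input plus three serial pooling stages. The function names are hypothetical.

```python
def max_pool_1d(values, kernel):
    """Stride-1 max pooling. The window is clipped at the borders,
    which is equivalent to padding with negative infinity."""
    half = kernel // 2
    n = len(values)
    return [max(values[max(0, i - half):min(n, i + half + 1)]) for i in range(n)]


def serial_pool_branches(values, kernel=5, stages=3):
    """Return the input followed by the output of each serial pooling stage.
    Assumption: three stages plus the input give the four branches."""
    branches = [list(values)]
    for _ in range(stages):
        branches.append(max_pool_1d(branches[-1], kernel))
    return branches


def effective_window(kernel, stage):
    """Window width covered after `stage` serial stride-1 pools of size `kernel`."""
    return stage * (kernel - 1) + 1


signal = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8, 9, 7, 9, 3]
branches = serial_pool_branches(signal)
windows = [effective_window(5, stage) for stage in range(4)]

# Each serial branch matches a single pool with the larger effective window.
for stage, window in enumerate(windows):
    assert branches[stage] == max_pool_1d(signal, window)
```

Under these assumptions the four branches cover windows of 1, 5, 9, and 13 elements, so reusing one small kernel serially yields the larger receptive fields without separate large-kernel pooling layers.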
The experimental results show that our method achieves an average detection accuracy of 98.5% at 110.5 frames per second (FPS). Comparison experiments on the self-made outdoor blade dataset show that the mAP of FMSA-RD is higher than that of RetinaNet by 14.1%, FCOS by 7.8%, YOLOv5s by 6.6%, YOLOv6s by 3.2%, YOLOv7-tiny by 3.9%, CenterRot by 3.0%, FAB+DRB+CRB by 3.1%, H2RBox by 2.3%, and R3Det by 4.2%. The public dataset DOTA contains 2806 remote sensing images with a resolution of 800×800, covering scene types such as cities, industrial areas, buildings, and roads. The comparison on this dataset, against mainstream rotating object detection models, verifies the generalization ability of the FMSA-RD network. On the self-made outdoor simulated blade dataset, the data captured in the morning and at noon are used as the training set, and the data captured at night are used as the validation set. The experiments show that FMSA-RD has low computational complexity, high detection accuracy, and good generalization ability, making it suitable for different scenarios and environments. Conclusion FMSA-RD reduces model complexity while integrating local feature information from different positions, which suppresses interference from complex background noise. Fusing features at different scales improves blade detection accuracy. Fusing the self-attention mechanism extracts global information and distinguishes blades without circular markers, so the model detects high-aspect-ratio blades accurately against complex backgrounds while using fewer parameters.
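The skew intersection-over-union that the SKEWIOU loss is named after can be sketched as follows. This is only an illustration of the geometric quantity, not the loss used in FMSA-RD: the abstract does not state how the term enters the loss, and the box format (centre x, centre y, width, height, angle in radians), the function names, and the use of Sutherland-Hodgman clipping are all assumptions made here.

```python
import math


def rect_corners(cx, cy, w, h, angle):
    """Corners of a rotated rectangle in counter-clockwise order."""
    c, s = math.cos(angle), math.sin(angle)
    offsets = ((-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2))
    return [(cx + dx * c - dy * s, cy + dx * s + dy * c) for dx, dy in offsets]


def polygon_area(points):
    """Shoelace formula; returns 0.0 for an empty polygon."""
    total = 0.0
    for i, (x1, y1) in enumerate(points):
        x2, y2 = points[(i + 1) % len(points)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def clip_polygon(subject, clip):
    """Sutherland-Hodgman clipping of `subject` by a convex CCW polygon `clip`."""

    def inside(p, a, b):
        # True when p lies on or to the left of the directed edge a -> b.
        return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0]) >= 0.0

    def intersection(p, q, a, b):
        # Point where segment p-q crosses the line through a and b.
        dx1, dy1 = q[0] - p[0], q[1] - p[1]
        dx2, dy2 = b[0] - a[0], b[1] - a[1]
        t = ((a[0] - p[0]) * dy2 - (a[1] - p[1]) * dx2) / (dx1 * dy2 - dy1 * dx2)
        return (p[0] + t * dx1, p[1] + t * dy1)

    output = list(subject)
    for i in range(len(clip)):
        if not output:
            break  # nothing left to clip: the polygons do not overlap
        a, b = clip[i], clip[(i + 1) % len(clip)]
        current, output = output, []
        prev = current[-1]
        for point in current:
            if inside(point, a, b):
                if not inside(prev, a, b):
                    output.append(intersection(prev, point, a, b))
                output.append(point)
            elif inside(prev, a, b):
                output.append(intersection(prev, point, a, b))
            prev = point
    return output


def skew_iou(rect_a, rect_b):
    """IoU of two rotated rectangles given as (cx, cy, w, h, angle_radians)."""
    inter = polygon_area(clip_polygon(rect_corners(*rect_a), rect_corners(*rect_b)))
    union = rect_a[2] * rect_a[3] + rect_b[2] * rect_b[3] - inter
    return inter / union if union > 0 else 0.0


# A 4x2 box against the same box rotated by 90 degrees.
print(skew_iou((0, 0, 4, 2, 0.0), (0, 0, 4, 2, math.pi / 2)))
```

For a 4×2 box and the same box rotated by 90 degrees, the overlap is a 2×2 square, so the value is close to 1/3. This shows why a high-aspect-ratio target such as a blade is sensitive to angle error, and why an angle-aware overlap measure is a natural fit for it.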
Keywords
