Rotated object detection network combining key points and guide vectors
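She Haodong1,2,3, Zhao Liangjin1,3 (1. Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100190; 2. School of Electronic, Electrical and Communication Engineering, University of Chinese Academy of Sciences, Beijing 100049; 3. Key Laboratory of Network Information System Technology, Chinese Academy of Sciences, Beijing 100091)
Abstract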
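Objective Object detection is one of the important research directions in intelligent remote sensing interpretation, yet most object detection algorithms struggle to detect densely arranged rotated objects with high precision. This paper proposes an object detection algorithm based on key point and guide vector prediction that achieves high-precision rotated object detection while also representing the heading of each object.
Method First, a new way of modeling rotated objects is proposed: detection is decomposed into parameter regression of the center point, the head vertex, the guide vector, and the object width, so that the representation fits the detected object more closely. Second, a rotated elliptical Gaussian kernel is designed to better fit the shape of remote sensing objects and thereby improve key point prediction accuracy. Finally, by predicting the guide vector that points from the center point to the head vertex, the center point and head vertex of the same object are matched, producing an accurate rotated rectangular detection box with direction.
Result Experiments on the HRSC (high-resolution ship collections) dataset of large-aspect-ratio ship objects show that the proposed algorithm obtains better detection results than other mainstream object detection algorithms, with average precision of 90.78% and 97.85% under the VOC 2007 (visual object classes) and VOC 2012 metrics, respectively. On the UCAS-AOD (UCAS-high resolution aerial object detection dataset) dataset of small-aspect-ratio aircraft objects, it reaches an average precision of 98.81%. The experimental results demonstrate the feasibility and effectiveness of the proposed algorithm.
Conclusion The proposed algorithm uses elliptical Gaussian kernels to compute the center point and head vertex and designs guide vectors to constrain the point matching relationship, achieving direction-aware detection of rotated objects.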
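As an illustration of the modeling described above, the following Python sketch pairs center points with head vertices through guide vectors and then decodes an oriented box from a matched pair. It is not the authors' implementation. It assumes the head vertex is the midpoint of the box's front edge, so the box length is twice the center-to-head distance; the function names, the `max_dist` threshold, and the nearest-neighbor matching rule are all hypothetical.

```python
import math


def match_heads(centers, guides, heads, max_dist=5.0):
    """Pair each center with the detected head vertex nearest to center + guide.

    centers, heads: lists of (x, y) key points; guides: one (dx, dy) per center.
    max_dist is an assumed rejection threshold in pixels.
    """
    pairs = []
    for (cx, cy), (gx, gy) in zip(centers, guides):
        px, py = cx + gx, cy + gy  # where the guide vector expects the head
        best = min(heads,
                   key=lambda h: math.hypot(h[0] - px, h[1] - py),
                   default=None)
        if best is not None and math.hypot(best[0] - px, best[1] - py) <= max_dist:
            pairs.append(((cx, cy), best))
    return pairs


def decode_box(center, head, width):
    """Return (corners, heading) of the oriented box for one matched pair.

    Assumes the head vertex lies at the midpoint of the front edge.
    heading is the angle of the center-to-head direction in radians.
    """
    cx, cy = center
    hx, hy = head
    dx, dy = hx - cx, hy - cy
    half_len = math.hypot(dx, dy)
    if half_len == 0:
        raise ValueError("center and head vertex coincide")
    ux, uy = dx / half_len, dy / half_len  # unit vector along the heading
    nx, ny = -uy, ux                       # unit normal to the heading
    hw = width / 2
    tx, ty = 2 * cx - hx, 2 * cy - hy      # midpoint of the rear edge
    corners = [
        (hx + hw * nx, hy + hw * ny),
        (hx - hw * nx, hy - hw * ny),
        (tx - hw * nx, ty - hw * ny),
        (tx + hw * nx, ty + hw * ny),
    ]
    return corners, math.atan2(dy, dx)
```

Note that this simple rule lets two centers claim the same head vertex; a one-to-one assignment would need an extra step that the abstract does not describe.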
Rotating target detection network that combines key points and guide vectors
She Haodong1,2,3, Zhao Liangjin1,3 (1. Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100190, China; 2. School of Electronic, Electrical and Communication Engineering, University of Chinese Academy of Sciences, Beijing 100049, China; 3. Key Laboratory of Network Information System Technology, Chinese Academy of Sciences, Beijing 100091, China)
Abstract
Objective Optical remote sensing images objectively and accurately record surface features and are widely used in the investigation, monitoring, analysis, and forecasting of resources, the environment, disasters, regions, and cities. The primary task of object detection in optical remote sensing images is to locate and classify objects in the input images, a task with important research and application value in Earth observation. Traditional remote sensing object detection algorithms rely on manually designed features. Such features are limited: they consume considerable human and material resources, yet they generalize poorly and leave little room for improving accuracy. With the rapid development of deep learning in recent years, deep learning-based algorithms have achieved good results in optical image object detection. In contrast with objects in natural scenes, objects in optical remote sensing images are rigid, and most of them carry key information such as direction. The horizontal rectangular boxes used for natural scenes cause problems in optical remote sensing object detection, such as excessive background area, overlap between adjacent boxes, and loss of object motion information. To detect objects in optical remote sensing images more accurately, a rotated rectangular box that fits the object contour is a more suitable choice. Detecting rotated remote sensing objects by locating key points is one of the current mainstream approaches. However, because remote sensing objects are often densely arranged, key point-based algorithms tend to suffer from overlapping adjacent key points and inaccurate key point detection. To address these key point regression problems, this study proposes an improved rotated elliptical Gaussian kernel together with a vector-guided point pair matching module, which achieve high-precision rotated object detection through accurate prediction and matching of object center points and head vertices.
Method Unlike general feature extraction networks, an hourglass network fuses high-level features rich in semantic information with low-level features rich in spatial information. The resulting high-resolution feature map supports precise localization of key points. The circular Gaussian kernel used to regress key points in natural scenes has two problems in remote sensing object detection: the kernel radius is uncertain, and the kernels of densely arranged objects overlap. The rotated elliptical Gaussian kernel proposed in this study addresses both problems. The long and short axes of the kernel are determined by the length and width of the object's rotated rectangular box, and the angle of the long axis equals the angle of the object. This kernel fits the object shape more closely and yields better key point regression. The model is built around two key points of each object, the center point and the head vertex, and a point pair matching module based on guide vectors is proposed to pair the center point and head vertex of the same object exactly.
Result The model is evaluated on the public HRSC2016 and UCAS-AOD datasets. HRSC2016 has 436 training images, 181 validation images, and 444 test images, with image sizes ranging from 300 × 300 to 1 500 × 900. UCAS-AOD has images of size 1 280 × 659, comprising 1 000 aircraft images and 510 vehicle images that contain 7 482 aircraft objects and 7 114 vehicle objects. The HRSC annotations contain the head vertices. The annotations of the aircraft category in UCAS-AOD contain the orientation angle of each object, so the head vertices of aircraft can be calculated. In the experiments, images of various sizes were cropped and scaled to 640 × 640 resolution before being fed into the network. Four Nvidia RTX 2080Ti graphics cards were used, with a batch size of eight images per card and an initial learning rate of 0.01. The optimizer was stochastic gradient descent with a momentum factor of 0.9. Before training, the dataset was augmented through flipping and rotation. Recall, precision, and average precision are used as the evaluation metrics. The results on the HRSC dataset of large-aspect-ratio ship objects show that the proposed algorithm achieves better detection results than other mainstream object detection algorithms, with an average precision of 90.78% (VOC 2007) and 97.85% (VOC 2012), and its precision-recall curves are also better than those of the other algorithms.
Conclusion The experimental results show that the rotated object detection model combining key points and guide vectors is effective. The rotated elliptical Gaussian kernel achieves more accurate key point regression, and the guide vector-based point pair matching module accurately matches center points and head vertices, improving the detection of rotated objects.
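The rotated elliptical Gaussian kernel described in the Method part can be sketched as follows. This is a minimal pure-Python illustration under stated assumptions, not the paper's code: the abstract only says that the kernel axes follow the box length and width and that the long axis follows the object angle, so the mapping from box extent to standard deviation (the hypothetical `scale` parameter, here extent / 6) and the function name are assumptions.

```python
import math


def rotated_gaussian_heatmap(height, width, center, box_len, box_wid, theta,
                             scale=6.0):
    """Render one key point as a rotated elliptical Gaussian on a heatmap.

    center: (x, y) of the key point; theta: object angle in radians,
    measured from the x axis to the long axis of the box.
    scale: assumed divisor turning box extent into a standard deviation.
    """
    cx, cy = center
    sig_l = box_len / scale  # spread along the long axis
    sig_w = box_wid / scale  # spread along the short axis
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    heat = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            dx, dy = x - cx, y - cy
            u = dx * cos_t + dy * sin_t    # offset along the long axis
            v = -dx * sin_t + dy * cos_t   # offset along the short axis
            heat[y][x] = math.exp(-(u * u / (2 * sig_l ** 2)
                                    + v * v / (2 * sig_w ** 2)))
    return heat
```

Compared with a circular kernel, the response decays slowly along the object's long axis and quickly across it, which keeps the kernels of objects packed side by side from overlapping as much.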
Keywords
object detection; deep learning; rotating elliptic Gaussian kernel; guidance vectors; oriented detection