Improved capsule network for detecting and recognizing engineering vehicles in aerial images

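Zhong Yingchun1, Zheng Haiyang1, Zhang Wenxiang1, Wang Bo2, Luo Zhiyong3 (1. School of Automation, Guangdong University of Technology, Guangzhou 510006; 2. Department of Electro-Mechanical Engineering, Guangdong Machinery Technician College, Guangzhou 510450; 3. Guangzhou Ufly Technology Co., Ltd., Guangzhou 510630)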

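Abstract
Objective Using unmanned aerial vehicle (UAV) inspection to recognize engineering vehicles in aerial images is important for reducing power safety accidents. Classical pattern recognition methods based on manually extracted features, and deep learning algorithms such as YOLOv5 (you only look once v5), suffer from low recognition accuracy and large model parameter scale when recognizing engineering vehicles in aerial images from UAV power inspection. To address these problems, an improved capsule network is proposed for recognizing engineering vehicles in aerial images. Method A multi-layer densely connected method is used to improve the original capsule network structure so that more features of the engineering vehicles are extracted from the images; the dynamic routing method of the capsule network is improved to strengthen its anti-interference capability; and the influence of the number of network layers and of the key parameters in the dynamic routing algorithm on recognition accuracy is explored to find the parameters that give the highest accuracy. Result The experimental results show that: 1) Among the models compared, the proposed method reaches a mean average precision (mAP) of 94.56%, clearly higher than the other methods. 2) The number of network layers strongly affects recognition accuracy, but the relationship between the two is not monotonic and linear. In this application scenario, the 5-layer capsule network has the highest recognition accuracy; moreover, whether or not the dynamic routing algorithm is improved does not change how recognition accuracy varies with the number of layers. 3) Increasing the number of capsule network layers reduces recognition efficiency but does not noticeably increase the parameter scale, and the parameter scale has no obvious correlation with mAP. Conclusion The proposed method achieves high recognition accuracy with a small parameter scale, laying a foundation for on-board target recognition by UAVs.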
Keywords
Improved capsule network method for engineering vehicles detection and recognition in aerial images

Zhong Yingchun1, Zheng Haiyang1, Zhang Wenxiang1, Wang Bo2, Luo Zhiyong3(1.School of Automation, Guangdong University of Technology, Guangzhou 510006, China;2.Department of Electro-Mechanical Engineering, Guangdong Machinery Technician College, Guangzhou 510450, China;3.Guangzhou Ufly Technology Co., Ltd., Guangzhou 510630, China)

Abstract
Objective The construction of electrical power lines, especially high-voltage lines, plays an important role in urban development. Engineering vehicles, including excavators and wheeled cranes, are used on construction sites. When an engineering vehicle works near a high-voltage power line, its bucket or boom may enter the high-voltage breakdown range when lifted, which can easily cause accidents such as short-circuit breakdowns. It is therefore necessary to detect engineering vehicles working near high-voltage power lines. Multi-rotor unmanned aerial vehicles (UAVs) are widely used to acquire large numbers of aerial images for power line inspection, and the engineering vehicles in these images are usually identified manually. Classical pattern recognition methods and some deep learning models such as you only look once version 5 (YOLOv5) are inefficient and inaccurate when recognizing engineering vehicles in the acquired aerial images. Classical pattern recognition methods require features to be extracted manually. Some deep learning models have a large parameter scale and a complex network structure, and are not accurate enough when the training set is small. To solve these problems, we propose an improved capsule network model for recognizing engineering vehicles in aerial images. The improvement covers two aspects: the network structure of the capsule network model and the dynamic routing algorithm of the capsule network. Method First, we built an image dataset containing 1 890 aerial images in total. The dataset was then split into a training set and a testing set at a ratio of 4:1.
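As a rough illustration of the 4:1 split described above, the following minimal sketch divides 1 890 items into training and testing sets. The abstract does not say how the split was performed, so the seeded random shuffle, the function name `split_dataset` and the file names are assumptions made purely for illustration.

```python
import random


def split_dataset(items, train_parts=4, test_parts=1, seed=0):
    """Shuffle items and split them at a train_parts:test_parts ratio.

    The shuffle-based split is an assumption; the abstract only states the ratio.
    """
    items = list(items)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(items)
    n_train = len(items) * train_parts // (train_parts + test_parts)
    return items[:n_train], items[n_train:]


# Hypothetical file names standing in for the 1 890 aerial images.
image_ids = [f"img_{i:04d}.jpg" for i in range(1890)]
train_set, test_set = split_dataset(image_ids)
print(len(train_set), len(test_set))  # 1512 378
```

With 1 890 images, a 4:1 ratio gives 1 512 training images and 378 testing images.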
Next, we improved the network structure of the capsule network with a multi-layer densely connected method to extract more features of the engineering vehicles from the images; we call this improved model No.1. The multi-layer densely connected capsule network has 3, 5 or 7 layers. Third, we improved the dynamic routing method of the capsule network by replacing the softmax function with the leaky-softmax function to strengthen the anti-interference performance of the capsule network; we call this improved model No.2. The model that combines the multi-layer densely connected network with the leaky-softmax function is improved model No.3. Fourth, we examined how several key parameters affect these models: the number of layers in the capsule network, and the routing coefficient and squeeze coefficient in the dynamic routing algorithm. Result The first group of experiments validates whether the two improvements are effective. We compared the mean average precision (mAP) of the original capsule network model with that of improved models No.1, No.2 and No.3. All models use the 3-layer densely connected capsule network. The experimental results show that the mAP of improved model No.1 is 91.70% and that of improved model No.2 is 90.01%, which are 2.21 and 0.54 percentage points higher, respectively, than the original capsule network. Improved model No.3 raises the recognition accuracy further, with an mAP of 92.10%. The second group of experiments clarifies how the number of network layers influences the mAP of these models. The results show that the number of network layers strongly influences the mAP. When the number of layers is small, the mAP increases as layers are added. After the mAP reaches its peak, it tends to decrease as the number of layers continues to increase.
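To make the leaky-softmax replacement used in improved model No.2 more concrete, here is a minimal sketch. The abstract does not define the function, so this assumes one common formulation: an extra "leak" logit fixed at zero is appended before normalization and dropped afterwards, so the coupling coefficients of one lower-level capsule may sum to less than 1. The function names and the example logits are illustrative only.

```python
import math


def softmax(logits):
    """Standard softmax; subtracting the maximum keeps exp() numerically stable."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def leaky_softmax(logits, leak=0.0):
    """Softmax over the logits plus one extra leak logit, with the leak entry dropped.

    Assumed formulation: the leak absorbs probability mass, so a capsule whose
    logits are all low is not forced to route its full output to some parent.
    """
    extended = [leak] + list(logits)
    return softmax(extended)[1:]


# Routing logits from one lower-level capsule to three parent capsules.
weak_evidence = [-4.0, -4.0, -4.0]
print(sum(softmax(weak_evidence)))        # the plain softmax still distributes all of its mass
print(sum(leaky_softmax(weak_evidence)))  # most of the mass goes to the leak instead
```

Under this assumption, a capsule with weak evidence for every parent contributes little to any of them, which is one way such a change could reduce sensitivity to clutter.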
The relationship between the number of network layers and the mAP is therefore neither monotonic nor linear. In our application, the 5-layer capsule network achieves the highest mAP. In addition, improving the dynamic routing algorithm does not change how the mAP varies with the number of layers. Furthermore, the efficiency of all improved models decreases markedly as the number of capsule network layers increases. The parameter volume of the improved models does not change noticeably, and it shows no obvious correlation with recognition precision. The third group of experiments searches for the optimal model with appropriate routing and squeeze coefficients. The results show that the mAP of the 5-layer densely connected capsule network reaches 94.56% when the routing coefficient is 5 and the squeeze coefficient is 1, an increase of 5.07 percentage points over the original capsule network. Meanwhile, the parameter volume of this optimal model is close to that of the original model. The optimal model therefore combines a high mAP with a small parameter volume. The fourth group of experiments compares the optimal model with other models. The results show that our optimal model achieves a higher mAP than the classical pattern recognition model and the YOLOv5x model, with a smaller parameter volume. Conclusion We used two approaches to improve the capsule network model for recognizing engineering vehicles in UAV aerial images. Our experiments show that the improved model has a small parameter volume and good recognition precision, which is significant for on-board target recognition by UAVs.
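For readers unfamiliar with the metric, the sketch below shows one simple way to compute a mean average precision: average precision (AP) is computed per class from ranked predictions, then averaged across classes. The abstract does not specify the evaluation protocol (for example any matching threshold or interpolation), so this non-interpolated AP is an assumption, and the scores and labels are toy values, not results from the paper.

```python
def average_precision(scores, labels):
    """Non-interpolated AP: mean of the precision at each rank holding a true positive."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    n_pos = sum(label for _, label in ranked)
    if n_pos == 0:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / n_pos


def mean_average_precision(per_class):
    """Average the per-class AP values; per_class maps class name to (scores, labels)."""
    aps = [average_precision(s, l) for s, l in per_class.values()]
    return sum(aps) / len(aps)


# Toy predictions for the two vehicle classes named in the abstract.
toy = {
    "excavator": ([0.9, 0.8, 0.7], [1, 0, 1]),
    "wheeled_crane": ([0.9, 0.8, 0.7], [1, 1, 0]),
}
print(mean_average_precision(toy))
```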
Keywords
