Research progress of unmanned mobile vision technology for complex dynamic scenes
Zhang Yanning, Wang Haoyu, Yan Qingsen, Yang Jiaqi, Liu Ting, Fu Mengqin, Wu Peng, Zhang Lei (Northwestern Polytechnical University)
Abstract
With the continual expansion of human activity and the sustained development of national interests, new-domain, new-quality unmanned systems have become the commanding height of strategic technological competition among the world's major powers and a key force for winning the future. Unmanned mobile vision technology is one of the core means by which unmanned systems help humans thoroughly perceive and understand the physical world: it aims to accurately perceive and understand complex dynamic scenes and target characteristics from visual data captured by unmanned mobile platforms. Deep neural networks, with their powerful nonlinear fitting and discriminative capabilities, have become the benchmark models of unmanned mobile vision technology. In practical applications, however, unmanned systems typically face complex and dynamic imaging environments, high-speed maneuvering and camouflaged, adversarial imaging targets, and diverse imaging task requirements. These conditions sharply degrade the imaging quality of deep-neural-network-based unmanned mobile vision models and significantly reduce the accuracy of scene reconstruction and interpretation as well as target recognition and analysis, severely constraining both the perception and interpretation capabilities and the application prospects of unmanned systems in complex dynamic scenes. To address this challenge, this paper examines in depth the state of development of unmanned mobile vision technology for complex dynamic scenes through five key technologies: image enhancement, 3D reconstruction, scene segmentation, object detection and recognition, and anomaly detection and behavior analysis. For each technology, it introduces the basic research ideas and current progress, analyzes the strengths and weaknesses of typical algorithms, examines the problems and challenges that remain, and outlines future research directions, laying a foundation for the long-term development and deployment of unmanned mobile vision technology for complex dynamic scenes.
Keywords
unmanned mobile vision; complex dynamic scenes; image enhancement; 3D reconstruction; scene segmentation; object detection; anomaly detection
Research progress of unmanned mobile vision technology for complex dynamic scenes
Zhang Yanning, Wang Haoyu, Yan Qingsen, Yang Jiaqi, Liu Ting, Fu Mengqin, Wu Peng, Zhang Lei (Northwestern Polytechnical University)
Abstract
In today's era of advancing automation and intelligence, unmanned systems are rapidly emerging as a new focal point of strategic technological competition among the world's major powers. These new-domain, new-quality unmanned systems are not only key to supporting national security and strategic interests but also a core force driving future technological innovation and application; they are redefining the boundaries of national security and the substance of strategic advantage. As a key component of unmanned systems, unmanned mobile vision technology shows immense potential in helping humans understand the physical world in depth. Its progress not only equips unmanned systems with richer and more precise perceptual capabilities but also offers new perspectives for observing, analyzing, and ultimately mastering a complex and ever-changing physical environment. In the early stages of the field, researchers relied mainly on traditional machine learning methods built on hand-crafted features, which depended on the experience and knowledge of domain experts. Feature descriptors such as the Scale-Invariant Feature Transform (SIFT) and the Histogram of Oriented Gradients (HOG), for instance, played significant roles in image matching and object detection. Although such traditional visual analysis methods retain value in specific situations, their reliance on manual feature engineering and expert knowledge limits both the efficiency and the accuracy of analysis. With the rise of deep neural networks, unmanned mobile vision technology has undergone revolutionary progress: through automatic feature extraction and hierarchical architectures, deep networks learn feature representations ranging from simple to complex.
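To make the hand-crafted-feature era concrete, the sketch below computes a simplified HOG-style descriptor: magnitude-weighted orientation histograms over non-overlapping cells. It is a minimal illustration only, assuming a grayscale image with sides divisible by the cell size; real HOG additionally normalizes over overlapping blocks of cells and uses soft bin assignment.

```python
import numpy as np

def hog_descriptor(image, cell_size=8, n_bins=9):
    """Simplified HOG sketch: one orientation histogram per cell.

    image: 2-D float array (grayscale), sides divisible by cell_size.
    Returns an array of shape (cells_y, cells_x, n_bins), with each
    cell's histogram L2-normalised (full HOG normalises over blocks).
    """
    # Finite-difference gradients; np.gradient returns (d/dy, d/dx) for 2-D input.
    gy, gx = np.gradient(image.astype(float))
    magnitude = np.hypot(gx, gy)
    # Unsigned gradient orientation in degrees, folded into [0, 180).
    orientation = np.rad2deg(np.arctan2(gy, gx)) % 180.0

    h, w = image.shape
    cells_y, cells_x = h // cell_size, w // cell_size
    hist = np.zeros((cells_y, cells_x, n_bins))
    bin_width = 180.0 / n_bins

    for cy in range(cells_y):
        for cx in range(cells_x):
            sl = (slice(cy * cell_size, (cy + 1) * cell_size),
                  slice(cx * cell_size, (cx + 1) * cell_size))
            bins = (orientation[sl] / bin_width).astype(int) % n_bins
            # Magnitude-weighted orientation histogram for this cell.
            hist[cy, cx] = np.bincount(bins.ravel(),
                                       weights=magnitude[sl].ravel(),
                                       minlength=n_bins)

    # Per-cell L2 normalisation for some robustness to illumination.
    norms = np.linalg.norm(hist, axis=-1, keepdims=True)
    return hist / np.maximum(norms, 1e-12)
```

Because the descriptor is built from local gradient statistics rather than raw intensities, it is the kind of expert-designed representation that deep networks later replaced with learned features.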
They can therefore capture local image features while also understanding and interpreting higher-level semantic information, which greatly enhances the fitting and discriminative capabilities of the models and yields advantages that traditional methods cannot match, making deep neural networks the benchmark models for unmanned mobile vision technology. In practical applications, however, unmanned systems often face complex, diverse, and dynamically changing scenarios that pose great challenges to deep learning. First, the complexity and dynamics of the imaging environment are issues unmanned systems must confront: drastic changes in illumination, uncertain weather conditions, and interference from other moving objects in the scene can all degrade image quality and thereby affect subsequent processing and analysis. Second, the high-speed maneuvering, camouflage, and concealment behaviors of imaging targets place higher demands on unmanned mobile vision systems: rapid target motion makes stable tracking difficult, while camouflage and concealment make detection harder. Together, these factors cause the accuracy of scene reconstruction and interpretation, and of target recognition and analysis, in deep-neural-network-based unmanned mobile vision models to decline significantly. In addition, the diversity of imaging tasks brings its own challenges: different tasks may require different visual processing strategies and analysis methods, so the system needs sufficient flexibility and adaptability to meet varied task demands. Current deep neural network models, however, are typically optimized for specific tasks at design time, and their adaptability to diverse tasks is limited.
The uncertainty and unpredictability of environmental factors impose extremely demanding requirements on unmanned mobile vision technology: it must provide precise perception and in-depth analysis so as to supply decision support for automated systems, enabling them to respond quickly and accurately to environmental changes and improving system efficiency and reliability. In response to the visual challenges that unmanned systems face in complex dynamic scenes, this article examines in depth the current state of the technology, focusing on five key technical areas: image enhancement, 3D reconstruction, scene segmentation, object detection, and anomaly detection. Image enhancement is the first step in improving the quality of visual data; by improving the contrast, clarity, and color of images, it provides more reliable input for subsequent analysis and processing and thus strengthens the performance of unmanned systems under diverse environmental conditions. 3D reconstruction recovers three-dimensional structure from two-dimensional images, enabling unmanned systems to understand the depth and spatial layout of a scene and improving their comprehension of and adaptability to complex environments. Scene segmentation divides an image into multiple semantically meaningful regions or objects, providing a basis for precise environmental perception and target recognition. Object detection, a core task in unmanned mobile vision, enables the system to locate and recognize specific targets in images or videos. Anomaly detection focuses on identifying abnormal objects or events in the scene, giving unmanned systems the ability to identify and respond to potential threats in a timely manner.
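As a concrete instance of the contrast-improvement step described above, the sketch below implements classic global histogram equalisation in plain NumPy. It is only a minimal illustration of the principle, assuming an 8-bit grayscale image; practical enhancement pipelines for unmanned platforms typically use locally adaptive variants (e.g. CLAHE) or learned enhancement networks.

```python
import numpy as np

def equalize_histogram(image, n_levels=256):
    """Global histogram equalisation for an 8-bit grayscale image.

    Remaps intensities through the normalised cumulative histogram so
    that the grey levels actually used are spread over the full
    [0, n_levels - 1] range, increasing global contrast.
    """
    hist = np.bincount(image.ravel(), minlength=n_levels)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]  # CDF value of the darkest occupied level
    # Classic mapping: darkest used level -> 0, brightest -> n_levels - 1.
    lut = np.round((cdf - cdf_min) / max(image.size - cdf_min, 1)
                   * (n_levels - 1))
    return lut.clip(0, n_levels - 1).astype(np.uint8)[image]
```

Applied to a low-contrast frame whose intensities occupy a narrow band, the lookup table stretches that band across the full dynamic range, which is why equalisation is a common preprocessing step before segmentation or detection.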
For each of these key technologies, this article explores the underlying research ideas, the current state of the art, and the advantages and disadvantages of typical algorithms, analyzing their performance in practical applications. The integration and coordinated operation of these technologies have significantly enhanced the visual perception capabilities of unmanned systems in complex dynamic scenes, enabling them to perform tasks more intelligently and autonomously. Although existing research has made notable progress, unmanned mobile vision technology still faces many problems in practical applications in such scenes. This review aims to provide a comprehensive perspective by systematically surveying and analyzing the latest research progress in unmanned mobile vision technology for complex dynamic scenes, and by examining the advantages and limitations of the key tasks above in practice. In addition, the article discusses the gaps and challenges in current research and proposes possible future research directions. Through in-depth exploration of these directions, unmanned mobile vision technology will continue to advance, providing more powerful and flexible solutions to the challenges of complex dynamic scenes and laying a solid foundation for the long-term development and practical application of unmanned systems in automation and intelligence.
Keywords
unmanned mobile vision; complex dynamic scenes; image enhancement; 3D reconstruction; scene segmentation; object detection; anomaly detection