Research progress on group re-identification
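Zhang Quan1, Lai Jianhuang1,2,3,4, Xie Xiaohua1,3, Chen Hongxu1 (1. School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510006; 2. Guangzhou Xinhua University, Guangzhou 510520; 3. Guangdong Province Key Laboratory of Information Security Technology, Guangzhou 510006; 4. Key Laboratory of Video and Image Intelligent Analysis and Application Technology, Ministry of Public Security, Guangzhou 510006)
Abstract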
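Group re-identification (GReID) aims to correctly associate group images that contain the same members across a camera network with non-overlapping fields of view. It is an important extension of the traditional pedestrian re-identification task and has significant research value and application prospects in security surveillance scenarios. The unique challenge of GReID lies in modeling the variations in the number and layout of group members while extracting stable and robust feature representations. In recent years, GReID has attracted wide attention from researchers and has developed rapidly. This paper comprehensively reviews the research progress of GReID. It first briefly introduces the research background of the field and summarizes the basic concepts, datasets, and related techniques. On this basis, it describes a variety of GReID algorithms in detail and compares the performance of state-of-the-art algorithms on multiple datasets. Finally, it discusses the outlook for the task. Overall, compared with pedestrian re-identification, existing GReID methods perform poorly on the specific challenges of concrete scenarios, and further exploration is needed in both data collection and method design. In addition, existing GReID research is not closely connected with other vision tasks; how to coordinate multiple tasks to meet more industry needs and accelerate industrial deployment requires joint consideration and effort from academia and industry.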
Keywords
A summary on group re-identification
Zhang Quan1, Lai Jianhuang1,2,3,4, Xie Xiaohua1,3, Chen Hongxu1 (1. School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510006, China; 2. Guangzhou Xinhua University, Guangzhou 510520, China; 3. Guangdong Province Key Laboratory of Information Security Technology, Guangzhou 510006, China; 4. Key Laboratory of Video and Image Intelligent Analysis and Application Technology, Ministry of Public Security, Guangzhou 510006, China)
Abstract
Group re-identification (GReID) aims to associate images of the same small group of pedestrians across multiple cameras with non-overlapping views. The key challenge of GReID is to model the variations in the number and layout of intra-group members so that stable and robust feature representations can be extracted. This paper critically reviews the development of GReID. First, we review the research background, including the basic concepts, related technologies, and datasets. In public security surveillance, GReID can help monitor and prevent group-based crimes such as the kidnapping and trafficking of women and children. When a target pedestrian is severely occluded or even absent, GReID can leverage the appearance features of the pedestrian's companions as additional prior information for recognition. Specifically, a group in GReID consists of 2 to 8 members, and two group images are considered to show the same group when the intersection-over-union (IoU) of their members is greater than 60%. Then, a variety of GReID algorithms are introduced and compared in detail. The existing works can be categorized from three perspectives: 1) data, 2) method, and 3) label. From the data perspective, the existing methods can be divided into real image-based, synthetic image-based, and real video-based methods. Real image-based methods focus on datasets collected from real surveillance scenarios, such as CUHK-SYSU Group (CSG), RoadGroup, and iLIDS-MCTS. These datasets collect several images of each group from different camera views and provide the location and identity information of each member. This supervision information can be used to design discriminative group feature representations.
However, collecting and labeling real group datasets is still more challenging than for traditional pedestrian re-identification datasets, because a consistent group identity must be judged across group images despite member variations and layout variations. Synthetic image-based methods rely on datasets built from 3D synthetic images. This type of dataset can generate massive numbers of group images with high-quality labels efficiently and effectively, and these methods use the massive synthetic data to improve model performance on real datasets. Video-based datasets provide several consecutive frames for each group from surveillance videos, so researchers can extract group features according to the potential spatio-temporal or intra-group relationships. From the method perspective, the existing works can be mainly divided into traditional methods and deep learning methods. Traditional methods design group descriptors and extract group features based on human experience. However, because they depend heavily on expert prior knowledge, they cannot describe and generalize to all possible situations in group images. Emerging deep learning-based methods benefit from large numbers of data samples, so the model can construct representations of group images automatically, and the discrimination and robustness of deep models have improved significantly. Deep learning-based methods can be divided into 1) feature learning-based, 2) metric learning-based, and 3) generative adversarial network (GAN)-based methods. Deep feature learning-based methods aim to design a discriminative network structure or a discriminative feature learning strategy. The extracted features should accurately reflect the group identity of the input images and be robust to occlusion, illumination, and variations in the number and layout of intra-group members. Metric learning-based methods focus on designing a similarity criterion between two group images.
The goal is for two images of the same group class to obtain a high similarity under the designed criterion even when they differ greatly. To address the small size of existing datasets, GAN-based methods attempt to expand the dataset scale of the GReID task by style transfer of samples from related pedestrian re-identification datasets. From the label perspective, the existing methods can be categorized into supervised and unsupervised methods. Supervised learning-based methods tend to be more competitive because the group labels or the member labels are used throughout the training process. Because no labels are provided in unsupervised learning, such methods can often learn similarity only for local areas of the group images, and clustering methods can be designed to extract the feature representations of the same group class. To sum up, 1) existing GReID methods still perform poorly on the specific challenges of concrete scenarios and need further development in both data collection and method design; 2) GReID is still not closely connected with other related visual tasks. Therefore, how to coordinate multiple tasks to meet more industry needs and accelerate industrial deployment requires joint effort from academia and industry. Furthermore, ethical issues related to data privacy policy need to be addressed for both virtual data and real data in the future.
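The group identity criterion stated in the abstract (two group images show the same group when the IoU of their members exceeds 60%) can be illustrated with a minimal Python sketch. It assumes that each group image is annotated with a collection of member identity labels and that the IoU is computed over those identity sets; the function names `member_iou` and `same_group` are hypothetical and do not come from the paper.

```python
def member_iou(group_a, group_b):
    """Intersection-over-union of two collections of member identity labels."""
    a, b = set(group_a), set(group_b)
    union = a | b
    if not union:
        # Two empty groups share no members; treat the IoU as zero.
        return 0.0
    return len(a & b) / len(union)


def same_group(group_a, group_b, threshold=0.6):
    """Assumed reading of the criterion: same group if member IoU > threshold."""
    return member_iou(group_a, group_b) > threshold


if __name__ == "__main__":
    # One member leaves the group between two camera views: IoU = 3 / 4.
    print(same_group([1, 2, 3, 4], [1, 2, 3]))
    # One member is replaced by a stranger: IoU = 3 / 5, not above 60%.
    print(same_group([1, 2, 3, 4], [1, 2, 3, 5]))
```

Under this reading, the criterion tolerates a member leaving or joining a sufficiently large group, which is why GReID methods must handle variations in the number of members.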
Keywords
group re-identification (GReID); pedestrian re-identification; synthesis data; deep learning; feature learning; metric learning; Transformer