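Vehicle re-identification with cross-domain joint learning and shared subspace metric

Wang Qi1, Xue Xinyuan2, Min Weidong1,3,4, Wang Sheng2, Gai Di1, Han Qing1 (1. School of Mathematics and Computer Science, Nanchang University, Nanchang 330031; 2. School of Software, Nanchang University, Nanchang 330047; 3. Institute of Metaverse, Nanchang University, Nanchang 330031; 4. Jiangxi Key Laboratory of Smart City, Nanchang 330031)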

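Abstract
Objective Existing cross-domain re-identification tasks commonly suffer from a large domain bias between the source and target domains and from poor clustering quality. In addition, cross-domain models that focus too heavily on generalization to the target domain permanently forget the knowledge learned in the source domain. To overcome these challenges, a vehicle re-identification method based on cross-domain joint learning and a shared subspace metric is proposed. Method Within cross-domain joint learning, a cross-confidence soft clustering is designed to establish the inter-domain correlation between the source and target domains, and the supervision generated by the soft clustering results is used to retain old knowledge and generalize to new knowledge. A salient-aware attention mechanism is proposed to extract the salient features of vehicles. The original and salient features are mapped into a shared subspace, and shared metric factors are obtained from the Jaccard distance between their respective global and local features. The global and local pseudo-labels are smoothed according to the shared metric factors, which encourages the model to learn more discriminative features. Result Experimental comparisons with recent methods are conducted on three public vehicle re-identification datasets: VeRi-776 (vehicle re-identification-776 dataset), VehicleID (large-scale vehicle re-identification dataset), and VeRi-Wild (vehicle re-identification dataset in the wild), with rank-1 accuracy (Rank-1) and mean average precision (mAP) as the evaluation metrics. In the cross-domain tasks VeRi-776→VeRi-Wild, VeRi-Wild→VeRi-776, VeRi-776→VehicleID, and VehicleID→VeRi-776, the proposed method achieves Rank-1 accuracies of 42.40%, 41.70%, 56.40%, and 61.90% and mAP values of 22.50%, 23.10%, 41.50%, and 49.10% in the target domain, respectively. In retaining old knowledge from the source domain, it achieves Rank-1 accuracies of 84.60%, 84.00%, 77.10%, and 67.00% and mAP values of 55.80%, 44.80%, 46.50%, and 30.70%, respectively. Conclusion Compared with unsupervised domain adaptation and unsupervised hybrid-domain methods, the proposed method effectively alleviates the problem of large domain bias while accumulating cross-domain knowledge, thereby improving vehicle re-identification performance.
Keywords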
Cross-domain joint learning and shared subspace metric for vehicle re-identification

Wang Qi1, Xue Xinyuan2, Min Weidong1,3,4, Wang Sheng2, Gai Di1, Han Qing1(1.School of Mathematics and Computer Science, Nanchang University, Nanchang 330031, China;2.School of Software, Nanchang University, Nanchang 330047, China;3.Institute of Metaverse, Nanchang University, Nanchang 330031, China;4.Jiangxi Key Laboratory of Smart City, Nanchang 330031, China)

Abstract
Objective Vehicle re-identification (Re-ID) uses computer vision to determine whether a specific target vehicle appears in an image or video sequence, and it is generally regarded as a subproblem of image retrieval. Vehicle Re-ID can be used to monitor specific abandoned vehicles and to help prevent hit-and-run driving, and it is widely applied in intelligent surveillance and transportation. Previous methods mainly focused on supervised training in a single domain. When a Re-ID model that performs well in a single domain is transferred to a new, unlabeled domain for testing, retrieval accuracy decreases significantly. To reduce the cost of manually annotating massive surveillance data, researchers have proposed many cross-domain Re-ID methods. This study aims to transfer a supervised Re-ID model trained on the labeled source domain to the unlabeled target domain for clustering. The entire transfer process iterates and updates the model parameters without supervision, thereby reducing manual annotation costs. However, existing cross-domain Re-ID tasks face two main challenges. On the one hand, existing cross-domain Re-ID methods focus too much on target-domain performance and often neglect the knowledge previously learned in the source domain, which causes catastrophic forgetting of that old knowledge. On the other hand, the large deviation between the source and target domains, which arises mainly from significant differences in data distribution and domain attributes, directly affects the generalization ability of the Re-ID model. Hence, a vehicle Re-ID method based on cross-domain joint learning and a shared subspace metric is proposed to overcome these challenges. Method First, a cross-confidence soft cluster is designed in cross-domain joint learning to establish the inter-domain correlation between the source and target domains.
The cross-confidence soft cluster introduces prior knowledge of the source domain data into the target domain by calculating the confidence level of the cross mean, and it performs soft clustering jointly so that prior knowledge of the source domain is effectively integrated with new knowledge of the target domain. The training data are re-labeled with pseudo-labels based on the cross-mean confidence of each class of source domain data. The supervised information generated by the soft clustering results is then used to preserve old knowledge and generalize to new knowledge. Next, a salient-aware attention mechanism is proposed to obtain the salient features of vehicles. The salient-aware attention module is embedded into the reference network to improve the Re-ID model’s ability to identify salient regions of vehicles in the channel and spatial dimensions. The representation of salient-region features is enhanced by calculating channel and spatial weight factors. For the channel weight factor, a convolution with a kernel size of 1 compresses the channel dimensions of the feature matrix, and the importance of each channel in the feature matrix is calculated in a self-aware manner. In addition, global average pooling is applied to the feature matrix to prevent the loss of channel spatial information when the channel dimensions are compressed. A further refined channel-style attention is then jointly inferred from the channel self-attention and the channel-by-channel spatial information. The original and salient features are mapped into a shared subspace, and the shared metric factors are obtained through the Jaccard distance of their respective global and local regions. Finally, the shared metric factor is used to smooth the global and local pseudo-labels based on the results of cross-confidence soft clustering, which further alleviates the label noise caused by domain bias.
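The Jaccard-based smoothing step can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything below is an assumption made for illustration: neighbour sets are taken from k-nearest neighbours under Euclidean distance, the shared metric factor is set to one minus the Jaccard distance between the global and local neighbour sets of a sample, and smoothing blends a one-hot pseudo-label with a uniform distribution. The function names `knn_sets`, `jaccard_distance`, `shared_metric_factors`, and `smooth_pseudo_labels` are hypothetical and do not come from the paper.

```python
from typing import List, Set

import numpy as np


def knn_sets(features: np.ndarray, k: int) -> List[Set[int]]:
    """Index set of the k nearest neighbours of each row (self excluded)."""
    features = np.asarray(features, dtype=float)
    diff = features[:, None, :] - features[None, :, :]
    dist = np.sum(diff ** 2, axis=-1)      # pairwise squared Euclidean distance
    np.fill_diagonal(dist, np.inf)         # a sample is not its own neighbour
    nearest = np.argsort(dist, axis=1)[:, :k]
    return [set(row.tolist()) for row in nearest]


def jaccard_distance(a: Set[int], b: Set[int]) -> float:
    """1 - |a ∩ b| / |a ∪ b|; two empty sets are treated as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def shared_metric_factors(global_feats: np.ndarray,
                          local_feats: np.ndarray,
                          k: int = 2) -> np.ndarray:
    """Assumed factor: agreement between global and local neighbourhoods."""
    g_sets = knn_sets(global_feats, k)
    l_sets = knn_sets(local_feats, k)
    return np.array([1.0 - jaccard_distance(g, l)
                     for g, l in zip(g_sets, l_sets)])


def smooth_pseudo_labels(labels: np.ndarray,
                         factors: np.ndarray,
                         num_classes: int) -> np.ndarray:
    """Blend one-hot pseudo-labels with a uniform distribution.

    A factor of 1 keeps the hard label; a factor of 0 gives a uniform label.
    """
    one_hot = np.eye(num_classes)[labels]
    uniform = np.full((len(labels), num_classes), 1.0 / num_classes)
    weight = factors[:, None]
    return weight * one_hot + (1.0 - weight) * uniform
```

Under these assumptions, a sample whose global and local neighbourhoods agree keeps a confident pseudo-label, while a sample whose neighbourhoods disagree receives a softer target, which is one plausible way to reduce the influence of noisy labels.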
Smoothing the pseudo-labels in this way enables the model to learn more discriminative features. The proposed method is implemented with Python 3.7 and PyTorch 1.6.0 on Ubuntu 18.04 with CUDA 11.2. The hardware comprises an Intel(R) Xeon(R) Silver 4210 CPU @ 2.20 GHz, a Tesla V100 graphics card with 32 GB of graphics memory, and 64 GB of RAM. Training uses ResNet-50 as the baseline model, and input images are uniformly cropped to 224×124 pixels. Training runs for 50 epochs with a batch size of 64. A model pre-trained on ImageNet is used for initialization, and the initial learning rate is set to 0.00035. Stochastic gradient descent (SGD) is used to optimize the model weights. Result Experimental comparisons with recent methods are conducted on three public vehicle Re-ID datasets: the vehicle Re-ID-776 dataset (VeRi-776), the large-scale vehicle Re-ID dataset (VehicleID), and the vehicle Re-ID dataset in the wild (VeRi-Wild). Rank-1 accuracy (Rank-1) and mean average precision (mAP) are used as evaluation metrics. In the cross-domain tasks VeRi-776→VeRi-Wild, VeRi-Wild→VeRi-776, VeRi-776→VehicleID, and VehicleID→VeRi-776, the proposed method achieves Rank-1 accuracies of 42.40%, 41.70%, 56.40%, and 61.90% in the target domain, respectively, with corresponding mAP values of 22.50%, 23.10%, 41.50%, and 49.10%. In retaining old knowledge from the source domain, the method achieves Rank-1 accuracies of 84.60%, 84.00%, 77.10%, and 67.00%, respectively, with corresponding mAP values of 55.80%, 44.80%, 46.50%, and 30.70%.
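For readers unfamiliar with the two metrics, the following is a minimal sketch of how Rank-1 and mAP are commonly computed from a query-by-gallery distance matrix. It is not the evaluation code used in the study: the function name `evaluate_rank1_map` is hypothetical, and the same-camera filtering that standard Re-ID protocols usually apply is omitted for brevity.

```python
import numpy as np


def evaluate_rank1_map(dist, query_ids, gallery_ids):
    """Return (Rank-1, mAP) for a distance matrix of shape (queries, gallery).

    Simplified protocol (assumption): no camera-based filtering is applied,
    and queries with no matching gallery identity are skipped.
    """
    dist = np.asarray(dist, dtype=float)
    query_ids = np.asarray(query_ids)
    gallery_ids = np.asarray(gallery_ids)

    rank1_hits = []
    average_precisions = []
    for q in range(dist.shape[0]):
        order = np.argsort(dist[q])                 # nearest gallery item first
        matches = gallery_ids[order] == query_ids[q]
        if not matches.any():
            continue
        rank1_hits.append(float(matches[0]))        # is the top result correct?
        hit_ranks = np.flatnonzero(matches) + 1     # 1-based ranks of true matches
        precisions = np.arange(1, len(hit_ranks) + 1) / hit_ranks
        average_precisions.append(precisions.mean())

    if not rank1_hits:
        raise ValueError("no query has a matching identity in the gallery")
    return float(np.mean(rank1_hits)), float(np.mean(average_precisions))


# Toy example: two queries against a gallery of three images.
toy_dist = [[0.1, 0.9, 0.5],
            [0.2, 0.8, 0.3]]
rank1, mean_ap = evaluate_rank1_map(toy_dist, [1, 2], [1, 2, 1])
```

In the toy example the first query retrieves both of its matches at the top of the list, whereas the second query finds its only match at rank 3, so Rank-1 is 0.5 and mAP is (1 + 1/3) / 2.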
In addition, a series of experiments further demonstrates the robustness of the proposed method in cross-domain tasks, including ablation comparisons of different modules, comparisons of different training methods, a comparison of outliers and visualization of attention maps, comparisons of rank lists, and a comparison of t-distributed stochastic neighbor embedding (t-SNE) visualizations. Conclusion Compared with unsupervised domain adaptation and unsupervised hybrid-domain methods, the proposed method effectively alleviates the problem of large domain deviation while accumulating cross-domain knowledge, thereby improving the performance of vehicle Re-ID.
Keywords
