Region-attention-guided dual-path iris completion
Abstract
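Objective Iris recognition is a stable and reliable biometric technology, but interference during image acquisition often leaves the iris partially occluded, for example by light spots or by the upper and lower eyelids. Such occlusions cause a loss of iris information, which directly degrades recognition accuracy, and they also reduce the accuracy of preprocessing steps such as localization and segmentation, which degrades recognition accuracy indirectly. To address these problems, this paper proposes a region-attention-guided dual-path iris completion network. By filling in the pixels of the occluded region, the network significantly reduces the impact of occlusion on iris image preprocessing and recognition and thereby improves recognition performance. Method A Transformer-based encoder and a convolutional neural network (CNN)-based encoder extract iris features. A fusion module lets the features from the two encoders interact and combines them, and a region attention mechanism processes the low-level and high-level features separately. Finally, a decoder upsamples the processed features to restore the occluded region and generate a complete image. Result The method is tested on the CASIA (Institute of Automation, Chinese Academy of Sciences) iris dataset. In terms of iris recognition performance, with a fixed occlusion size of 64×64 pixels, the completed images reach a true accept rate (TAR) of 63% at a false accept rate (FAR) of 0.1%, compared with only 19.2% for the occluded images, an improvement of 43.8 percentage points. Conclusion The proposed region-attention-guided dual-path iris completion network effectively combines the global modeling capability of the Transformer with the local modeling capability of the CNN and applies a region attention mechanism designed for occlusion. It completes the occluded iris regions and thereby further improves iris recognition performance.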
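The fixed-size occlusion setting reported above can be illustrated with a minimal sketch. It assumes a grayscale image stored as nested lists, the 160×160 resolution stated in the English abstract, and a white fill value of 255 to mimic a light spot. The function name `occlude`, the square position, and the random toy image are hypothetical and are not taken from the paper's code.

```python
import random


def occlude(image, top, left, size=64, fill=255):
    """Return (occluded copy, binary mask) with a size x size square set to `fill`.

    Assumption: `image` is a list of rows of grayscale values in [0, 255].
    """
    height, width = len(image), len(image[0])
    if not (0 <= top <= height - size and 0 <= left <= width - size):
        raise ValueError("occlusion square must lie inside the image")
    occluded = [row[:] for row in image]          # copy, leave the input untouched
    mask = [[0] * width for _ in range(height)]   # 1 marks occluded pixels
    for r in range(top, top + size):
        for c in range(left, left + size):
            occluded[r][c] = fill
            mask[r][c] = 1
    return occluded, mask


# Toy stand-in for an iris image (random values, for illustration only).
rng = random.Random(0)
image = [[rng.randrange(256) for _ in range(160)] for _ in range(160)]
occluded, mask = occlude(image, top=48, left=48)
occluded_fraction = sum(map(sum, mask)) / (160 * 160)
print(f"occluded fraction: {occluded_fraction:.2f}")
```

Under these assumptions a 64×64 square covers 4096 of the 25600 pixels, that is, 16% of the image. The binary mask is the kind of signal a region attention mechanism could use to tell occluded pixels from visible ones, although how the paper's network consumes it is not specified here.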
Keywords
Region attention mechanism based dual human iris completion technology
Zhang Zhili1, Zhang Hui2, Wang Jia1, Xia Yufeng1, Liu Liang1, Li Peipei1, He Zhaofeng1
(1. Beijing University of Posts and Telecommunications, Beijing 100876, China; 2. Beijing IrisKing Tech Co., Ltd., Beijing 100084, China)
Abstract
Objective Iris recognition achieves high accuracy on most benchmark databases. However, iris images captured in practice are often degraded by occlusions from light spots and the upper and lower eyelids, which lowers the quality of both iris segmentation and recognition. Recent advances in deep learning have driven great progress in image completion. Nevertheless, iris image completion remains challenging when the corrupted regions are large and the texture and structural patterns are complex, because most convolutional neural networks (CNNs) extract local features well but struggle to capture global cues. Transformer architectures have recently been introduced to visual tasks. Through self-attention and multi-layer perceptron (MLP) structures, the visual transformer models complex spatial transforms and long-distance feature dependencies to form global representations. However, visual transformers tend to ignore local feature details, which reduces the discriminability between background and foreground. In short, convolution operations are good at extracting local features but weak at capturing global representations, whereas cascaded self-attention modules capture long-distance feature dependencies but lose local feature details. We propose a region attention mechanism based dual iris completion network, which uses a bilateral guided aggregation layer to fuse convolutional local features with transformer-based global representations in an interactive manner. By completing the missing iris information, the network significantly reduces the impact of the occluded region on iris image pre-processing and recognition and thus improves recognition capability. Method The proposed region attention mechanism based dual iris completion network contains a Transformer encoder and a CNN encoder.
Specifically, we use the Transformer encoder and the CNN encoder to extract the global and local features of the iris image, respectively. To make better use of the extracted global and local features, a fusion network integrates the global modeling capability of the Transformer with the local modeling capability of the CNN. This preserves both kinds of features, improves the quality of the repaired iris images, and maintains their global and local consistency. Furthermore, we propose a region attention module to complete the occluded regions efficiently. Beyond the pixel-level image reconstruction constraints, an identity preserving constraint is designed to ensure identity consistency between the input and the completed image. Our method is implemented in the PyTorch framework and evaluated on the CASIA (Institute of Automation, Chinese Academy of Sciences) iris dataset. We use peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) as the evaluation metrics for generation quality, and iris recognition performance as the evaluation metric for identity preservation. PSNR is computed from the pixel-by-pixel error between the generated image and the ground truth and serves as an objective measure of image fidelity. SSIM estimates the holistic similarity between two images. Result Extensive experiments on the CASIA iris dataset demonstrate, both qualitatively and quantitatively, that our method generates visually plausible iris completion results while preserving identity. Furthermore, we have performed experiments on images with the same type of occlusion. Images for training and testing are set to a resolution of 160×160 pixels for fair comparison.
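The PSNR metric named above follows the standard definition PSNR = 10·log10(MAX² / MSE), where MSE is the mean squared pixel error and MAX is the largest possible pixel value. The following is a minimal sketch of that definition, assuming 8-bit grayscale images stored as nested lists. The function name `psnr` and the 2×2 toy images are hypothetical, and the paper's own evaluation code may differ.

```python
import math


def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized grayscale images."""
    if len(reference) != len(test) or any(
        len(a) != len(b) for a, b in zip(reference, test)
    ):
        raise ValueError("images must have the same shape")
    pixel_count = sum(len(row) for row in reference)
    squared_error = sum(
        (a - b) ** 2
        for row_a, row_b in zip(reference, test)
        for a, b in zip(row_a, row_b)
    )
    mse = squared_error / pixel_count
    if mse == 0:
        return math.inf  # identical images: the ratio is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy 2x2 images, invented for illustration: every pixel differs by 1, so MSE = 1.
reference = [[100, 120], [140, 160]]
restored = [[101, 119], [141, 159]]
value = psnr(reference, restored)
print(f"PSNR: {value:.2f} dB")
```

With an MSE of 1 the result is 20·log10(255), about 48.13 dB. Higher values mean the completed image is closer to the ground truth pixel by pixel, whereas SSIM complements this with a structure-oriented comparison.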
The qualitative results show that our repaired results perform well in terms of region retention and global consistency compared with the other three methods. The quantitative comparisons use two metrics. For the repaired results of different occlusion types, our method achieves the best PSNR and SSIM, indicating better restoration of the occluded iris images and better consistency of the repaired results. To verify the effectiveness of the method in improving the accuracy of iris segmentation, we use white occlusion to simulate light spot occlusion. The segmentation results of the repaired images are more accurate than those of the occluded images. Specifically, with an occlusion size of 64×64 pixels, our method achieves a true accept rate (TAR) of 63% at a false accept rate (FAR) of 0.1%, which exceeds the occluded-image baseline by 43.8 percentage points. Ablation studies demonstrate the effectiveness of each component of our network structure. Conclusion We propose a region attention mechanism based dual iris completion network, which uses a Transformer and a CNN to extract both the global topology and the local details of iris images. A fusion network is employed to fuse the global and local features. A region attention module and an identity preserving loss are also introduced to guide the completion task. Extensive quantitative and qualitative results on the CASIA iris dataset demonstrate the effectiveness of our iris completion method.
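The recognition figure above, TAR at 0.1% FAR, is the fraction of genuine comparisons accepted at a threshold that lets through at most 0.1% of impostor comparisons. The following is a minimal sketch of how such a figure can be computed, assuming similarity scores where higher means a more likely match. The function name `tar_at_far` and all scores are hypothetical illustrations and are not the paper's data or matcher.

```python
def tar_at_far(genuine, impostor, far_target):
    """Return (TAR, threshold) for the accept rule `score > threshold`.

    The threshold is chosen so that the false accept rate on `impostor`
    does not exceed `far_target`. Scores are similarities (assumption).
    """
    if not genuine or not impostor:
        raise ValueError("both score lists must be non-empty")
    ranked = sorted(impostor, reverse=True)
    # Largest number of impostor comparisons that may be accepted.
    allowed = int(far_target * len(ranked) + 1e-9)
    if allowed >= len(ranked):
        threshold = float("-inf")  # every comparison is accepted
    else:
        # Only scores strictly above the (allowed + 1)-th highest impostor
        # score pass, so at most `allowed` impostors are accepted.
        threshold = ranked[allowed]
    accepted = sum(1 for score in genuine if score > threshold)
    return accepted / len(genuine), threshold


# Toy scores, invented for illustration only.
impostor_scores = [i / 1000 for i in range(1000)]  # 0.000 ... 0.999
genuine_scores = [0.9985, 0.9995, 0.5, 0.9999]
tar, threshold = tar_at_far(genuine_scores, impostor_scores, far_target=0.001)
print(f"TAR at 0.1% FAR: {tar:.2%} (threshold {threshold})")
```

If the matcher outputs distances, where a lower value means a better match, the same idea applies with the ranking and the comparison reversed. Computing this figure once on occluded images and once on completed images gives the kind of before-and-after comparison reported above.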
Keywords
iris inpainting; iris recognition; iris segmentation; Transformer; convolutional neural network (CNN); attention