A Symmetrical Encoder-Decoder Network with Transformer for Noise-Robust Iris Segmentation

Abstract: Iris images captured in less-constrained environments are susceptible to noise such as specular reflections, eyelash and hair occlusions, and motion and defocus blur, which makes it difficult to accurately segment the valid iris region. To address this problem, a symmetrical encoder-decoder network with Transformer is proposed for noise-robust iris segmentation. First, Swin Transformer is employed as the encoder: the sequence of input image patches is fed into hierarchical Transformer modules, which model long-range dependencies between pixels through the self-attention mechanism and enhance the interaction of contextual information. Second, a Transformer decoder symmetrical with the encoder is constructed to decode the extracted high-order context features over multiple layers; skip connections fuse the multi-scale encoder features with the up-sampled decoded features, reducing the loss of spatial position information caused by down-sampling. Finally, supervised learning is applied to the output of every decoder stage, improving the quality of the features extracted at different scales. Comparative experiments are carried out on three publicly available, noisy near-infrared (NIR) and visible-light (VIS) iris datasets, namely NICE.I, CASIA.v4-distance, and MICHE-I. The results show that the proposed method outperforms several benchmark methods, including traditional methods, convolutional neural network-based methods, and existing Transformer-based methods, on the E1, E2, F1, and MIOU evaluation metrics, and it is particularly effective at reducing the interference of noise. An iris recognition experiment on the CASIA.v4-distance dataset further shows that the proposed method effectively improves iris recognition performance, indicating good application potential.
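The abstract describes three architectural ingredients: a Swin Transformer encoder, a symmetrical Transformer decoder with skip connections and multi-scale fusion, and deep supervision on every decoder stage. The PyTorch sketch below illustrates only this overall structure and is not the authors' implementation: plain global self-attention stages (nn.TransformerEncoderLayer) stand in for Swin's windowed-attention blocks, and all stage depths, channel widths, and the strided-convolution down/up-sampling are illustrative assumptions.

```python
# Minimal sketch of a symmetric Transformer encoder-decoder for segmentation with
# skip connections and per-stage deep supervision. NOT the paper's implementation:
# plain self-attention stages stand in for Swin blocks; sizes are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class Stage(nn.Module):
    """One Transformer stage operating on a (B, C, H, W) feature map."""

    def __init__(self, dim, depth=2, heads=4):
        super().__init__()
        layer = nn.TransformerEncoderLayer(
            d_model=dim, nhead=heads, dim_feedforward=dim * 4,
            batch_first=True, norm_first=True)
        self.blocks = nn.TransformerEncoder(layer, num_layers=depth)

    def forward(self, x):
        b, c, h, w = x.shape
        tokens = x.flatten(2).transpose(1, 2)          # (B, H*W, C)
        tokens = self.blocks(tokens)                   # self-attention over all patches
        return tokens.transpose(1, 2).reshape(b, c, h, w)


class SymmetricTransformerSeg(nn.Module):
    def __init__(self, in_ch=3, dims=(32, 64, 128, 256), num_classes=1):
        super().__init__()
        self.patch_embed = nn.Conv2d(in_ch, dims[0], kernel_size=4, stride=4)
        # Encoder: Transformer stage + strided-conv "patch merging" between stages.
        self.enc = nn.ModuleList(Stage(d) for d in dims)
        self.down = nn.ModuleList(
            nn.Conv2d(dims[i], dims[i + 1], 2, stride=2) for i in range(len(dims) - 1))
        # Decoder: mirror of the encoder, with transposed-conv "patch expanding".
        self.up = nn.ModuleList(
            nn.ConvTranspose2d(dims[i + 1], dims[i], 2, stride=2) for i in range(len(dims) - 1))
        self.fuse = nn.ModuleList(
            nn.Conv2d(2 * dims[i], dims[i], 1) for i in range(len(dims) - 1))
        self.dec = nn.ModuleList(Stage(d) for d in dims[:-1])
        # One prediction head per decoder stage for deep (multi-scale) supervision.
        self.heads = nn.ModuleList(nn.Conv2d(d, num_classes, 1) for d in dims[:-1])

    def forward(self, x):
        size = x.shape[-2:]
        x = self.patch_embed(x)
        skips = []
        for i, stage in enumerate(self.enc):
            x = stage(x)
            if i < len(self.down):
                skips.append(x)                        # keep for the skip connection
                x = self.down[i](x)
        outputs = []
        for i in reversed(range(len(self.dec))):
            x = self.up[i](x)                          # upsample decoded features
            x = self.fuse[i](torch.cat([x, skips[i]], dim=1))  # multi-scale fusion
            x = self.dec[i](x)
            # Side output, resized to the input resolution and supervised in training.
            outputs.append(F.interpolate(self.heads[i](x), size=size,
                                         mode="bilinear", align_corners=False))
        return outputs                                 # last element is the finest mask
```

For a 224x224 input the model returns one predicted mask per decoder stage; during training each mask would be compared against the ground-truth iris mask (for example with a cross-entropy or Dice loss) to realize the multi-scale supervision described above.

The metrics named in the abstract (E1, E2, F1, MIOU) are commonly defined as follows for binary iris masks, with E1 and E2 following the NICE-challenge style (overall pixel error rate, and the mean of false-positive and false-negative rates); whether the paper uses exactly these formulas is an assumption.

```python
# Common definitions of the segmentation metrics named in the abstract.
# Inputs are binary NumPy arrays of equal shape (1 = iris pixel).
import numpy as np


def iris_segmentation_metrics(pred, gt):
    pred, gt = pred.astype(bool), gt.astype(bool)
    tp = np.logical_and(pred, gt).sum()
    fp = np.logical_and(pred, ~gt).sum()
    fn = np.logical_and(~pred, gt).sum()
    tn = np.logical_and(~pred, ~gt).sum()

    e1 = (fp + fn) / pred.size                      # proportion of disagreeing pixels
    fpr = fp / max(fp + tn, 1)
    fnr = fn / max(fn + tp, 1)
    e2 = 0.5 * (fpr + fnr)                          # balances type-I and type-II errors
    f1 = 2 * tp / max(2 * tp + fp + fn, 1)          # Dice / F1 on the iris class
    iou_iris = tp / max(tp + fp + fn, 1)
    iou_bg = tn / max(tn + fp + fn, 1)
    miou = 0.5 * (iou_iris + iou_bg)                # mean IoU over iris and background
    return {"E1": e1, "E2": e2, "F1": f1, "MIOU": miou}
```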

     
