Lightweight Fundus Image Segmentation Network Combining Structured Convolution and Dual Attention Mechanism


    Abstract: Automatic segmentation of fundus blood vessel images plays an important role in the computer-aided diagnosis of various ophthalmic diseases. To address the difficulty of segmenting fundus vascular images caused by vessel-scale differences and image noise, the limited receptive field of features obtained by deep learning methods that use single-scale convolution, and the excessive complexity of existing methods, a lightweight fundus image segmentation network combining structured convolution and a dual attention mechanism is proposed. Through an encoder-decoder design with an enhanced encoder and a reduced number of downsampling operations and feature depth, a lightweight network with only 0.63M parameters is realized. In the encoding stage, a structured convolution method is proposed that effectively avoids overfitting during training and improves the network's ability to capture differentiated vascular features. In the decoding stage, a dual attention mechanism based on spatial and channel attention is adopted, so that the network attends more closely to the contextual and geometric spatial information of vascular features while suppressing interference from noise such as lesions. Experiments on the DRIVE, CHASE_DB1, and STARE datasets show that the segmentation accuracy is 96.92%, 97.57%, and 97.51%, the sensitivity is 83.68%, 84.99%, and 84.87%, and the area under the receiver operating characteristic curve (AUC) is 98.67%, 99.05%, and 99.02%, respectively. Cross-training on the DRIVE and STARE datasets validates the generalization ability of the network.
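    The abstract does not give the exact form of the dual attention module, so the following is only a minimal illustrative sketch of the general spatial-and-channel attention idea it describes (sigmoid gating and the channel-then-spatial ordering are assumptions, not the authors' implementation):

    ```python
    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def channel_attention(x):
        # x: (C, H, W). Squeeze each channel to a scalar by global average
        # pooling, gate it with a sigmoid, and reweight the channels.
        s = x.mean(axis=(1, 2))            # (C,)
        w = sigmoid(s)                     # per-channel weights in (0, 1)
        return x * w[:, None, None]

    def spatial_attention(x):
        # x: (C, H, W). Pool across channels to get one saliency value per
        # spatial location, then reweight every location of every channel.
        m = x.mean(axis=0)                 # (H, W)
        w = sigmoid(m)                     # per-location weights in (0, 1)
        return x * w[None, :, :]

    def dual_attention(x):
        # Channel attention followed by spatial attention (order assumed).
        return spatial_attention(channel_attention(x))

    feat = np.random.randn(8, 16, 16)      # toy decoder feature map
    out = dual_attention(feat)
    print(out.shape)                       # (8, 16, 16)
    ```

    Because both gates lie in (0, 1), the module can only attenuate features, which matches the abstract's goal of suppressing lesion-like noise while preserving salient vessel responses.
    
    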
