Abstract:
A retinal vessel segmentation algorithm that fuses CLTransformer with cross-scale attention is proposed to address problems in existing methods such as mis-segmentation of the optic disc, blurred texture of the main vessels, and breaks in microvascular branches. First, a lightweight residual encoder-decoder module is designed to extract coarse-grained vessel texture features. Second, a multi-scale feature selection module is employed at the encoder-decoder connection to fuse coarse-grained features across levels. Third, a cross-layer transformer module is added at the bottom of the network to cross-fuse deep semantic information and refine vessel contours. Finally, a fusion loss function is used to supervise network training. Experiments on the DRIVE, STARE, and CHASE_DB1 datasets yield accuracies of 97.10%, 97.66%, and 97.62%, specificities of 98.64%, 99.03%, and 98.72%, and F1 scores of 83.05%, 84.07%, and 81.18%, respectively. Overall, the proposed algorithm outperforms most state-of-the-art methods.
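For readers unfamiliar with the reported metrics, the following is a minimal sketch of how accuracy, specificity, and F1 score are conventionally computed from a binary vessel mask and its ground truth. It uses the standard confusion-matrix definitions and is not taken from the paper's code; the function name `segmentation_metrics` and the flattened-list input format are assumptions made for illustration, and the paper's evaluation protocol (for example, thresholding or restriction to a field-of-view mask) may differ.

```python
from typing import Dict, Sequence


def segmentation_metrics(pred: Sequence[int], truth: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, specificity, and F1 for flattened binary masks.

    Hypothetical helper: 1 marks a vessel pixel, 0 marks background.
    """
    if len(pred) != len(truth):
        raise ValueError("pred and truth must have the same length")

    # Confusion-matrix counts over all pixels.
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    tn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 0)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)

    total = tp + tn + fp + fn
    # Guard each ratio against an empty denominator.
    accuracy = (tp + tn) / total if total else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0

    return {"accuracy": accuracy, "specificity": specificity, "f1": f1}


# Toy example with 8 pixels (not data from the paper).
pred = [1, 1, 0, 0, 0, 0, 1, 0]
truth = [1, 0, 0, 0, 0, 0, 1, 1]
metrics = segmentation_metrics(pred, truth)
```

Because background pixels far outnumber vessel pixels in fundus images, accuracy and specificity tend to be high for any reasonable segmentation, which is consistent with the F1 scores above being noticeably lower than the other two metrics.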