Self-supervised Learning for Multi-modal Medical Image Segmentation
Abstract
Self-supervised learning (SSL) can capture generic knowledge about diverse concepts and is therefore beneficial for various downstream image analysis tasks. However, existing SSL methods do not fully exploit multi-modality, an essential characteristic of medical images, when designing efficient and effective proxy tasks. Moreover, most of them learn task-agnostic representations from heterogeneous data and consequently yield only marginal improvements. In this paper, we propose a novel self-supervised learning method for multi-modal medical image segmentation, termed SLeM, which learns better feature representations for downstream segmentation. We introduce a multi-modal classification task that facilitates rich representation learning from multiple imaging modalities; the learned representations can then be fine-tuned on diverse downstream tasks. To address the diverse shapes of target objects across different periods, we propose a new context fusing block (CFB) that extracts features for tumors of various sizes. Finally, we transfer the learned representations to the downstream multi-modal medical image segmentation task via simple fine-tuning, which significantly improves performance. Comprehensive experiments demonstrate that the proposed SLeM outperforms state-of-the-art methods on the BraTS 2019 and CHAOS datasets.