Synthetic Computed Tomography Generation via Cycle Bidirectional-Transformer
Abstract: Magnetic resonance imaging (MRI)-guided radiotherapy allows treatment plans to be adjusted in real time according to the tumor and the risk to organs, and it relies on pseudo-computed tomography (pseudo-CT) images generated from MRI. Current pseudo-CT generation techniques are mainly based on generative adversarial networks, which update network parameters with a pixel-level loss during training; this makes them prone to mode collapse and yields unstable pseudo-CT images. To generate pseudo-CT accurately from magnetic resonance (MR) images, this paper proposes a medical image synthesis method, the Cycle Bidirectional Transformer, which combines the context sensitivity of vision Transformers with the inductive bias of convolutional operators. In the encoding and prediction stage, the Cycle Bidirectional Transformer represents images with a codebook obtained from U-Net encoding and uses non-autoregressive encoding and vector quantization to shorten the length of the generated codebook, producing images that are locally realistic and globally consistent. Normalized mutual information is used as the loss function, and a cycle consistency loss is added to address the data mismatch problem. A series of experiments on the cranial MRI datasets TCGA-GBM and CPTAC-GBM validates the effectiveness of the proposed method for image generation. The method achieves a mean absolute error (MAE), peak signal-to-noise ratio (PSNR), and structural similarity index (SSIM) of 86.3 HU, 25.96 dB, and 0.897, respectively, and outperforms the comparison methods.
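To make the reported figures easier to interpret, the following is a minimal sketch of how MAE (in HU) and PSNR (in dB) are commonly computed between a real CT and a pseudo-CT. It is not the authors' evaluation code: the function names, the toy voxel values, and in particular the `data_range` used for PSNR are assumptions, since the abstract does not state which intensity range was used. SSIM is omitted because it requires a windowed computation.

```python
import math
from typing import Sequence


def mean_absolute_error(reference: Sequence[float], synthetic: Sequence[float]) -> float:
    """MAE between two equally sized voxel lists, in the input units (HU here)."""
    if len(reference) != len(synthetic) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(r - s) for r, s in zip(reference, synthetic)) / len(reference)


def psnr(reference: Sequence[float], synthetic: Sequence[float], data_range: float) -> float:
    """PSNR in dB, defined as 10 * log10(data_range**2 / MSE)."""
    if len(reference) != len(synthetic) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - s) ** 2 for r, s in zip(reference, synthetic)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(data_range ** 2 / mse)


# Toy voxel intensities in HU (made-up numbers, not data from the paper).
real_ct = [-1000.0, 0.0, 40.0, 1000.0]
pseudo_ct = [-900.0, 20.0, 40.0, 880.0]

# data_range=2000.0 is an assumed dynamic range (-1000 HU to 1000 HU).
mae_hu = mean_absolute_error(real_ct, pseudo_ct)
psnr_db = psnr(real_ct, pseudo_ct, data_range=2000.0)
print(f"MAE = {mae_hu:.1f} HU, PSNR = {psnr_db:.2f} dB")
```

Because PSNR depends directly on the chosen `data_range`, PSNR values from different studies are comparable only when the same intensity range and normalization are used.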