Multi-Scale Adaptive Transformer-Based Image Dehazing
Abstract: To address the limitations of traditional dehazing methods in complex haze scenarios, namely insufficient global modeling capacity, loss of local details, and poor adaptability to non-uniform haze, this paper proposes an adaptive Transformer-based image dehazing model. By integrating multi-scale feature fusion with a dynamic gated attention mechanism, the model balances global context modeling with local detail restoration. Specifically, a multi-scale Transformer encoder is designed that incorporates dilated convolutions to expand the receptive field and enhance high-frequency feature extraction. A Dynamic Gated Attention Module (DGAM) is then introduced, employing a dual-branch channel-spatial calibration strategy to adaptively fuse multi-scale features and improve robustness to non-uniform haze distributions. In addition, a hybrid loss function is constructed that combines physical-model constraints with a perceptual loss to jointly optimize reconstruction quality. Experiments on multiple public datasets and a custom-built dataset demonstrate that the proposed method outperforms existing mainstream approaches in image quality, particularly in recovering details under non-uniform haze. Visualization results further confirm the model's advantages in color fidelity and edge sharpness, offering strong support for intelligent perception systems in low-visibility environments.
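The abstract does not give the internals of the DGAM, but a dual-branch channel-spatial calibration with a learned gate could be sketched as follows. This is a hypothetical PyTorch sketch, not the paper's implementation: the squeeze-and-excitation-style channel branch, the 7x7 spatial-gate convolution, and the scalar mixing weight `alpha` are all illustrative assumptions.

```python
import torch
import torch.nn as nn

class DGAM(nn.Module):
    """Hypothetical sketch of a Dynamic Gated Attention Module:
    a channel branch and a spatial branch each calibrate the input,
    and a learned scalar gate mixes the two results."""

    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        # Channel branch: squeeze-and-excitation-style per-channel gate
        self.channel_gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        # Spatial branch: per-pixel gate from pooled channel statistics
        self.spatial_gate = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=7, padding=3),
            nn.Sigmoid(),
        )
        # Learned mixing weight between the two calibrated branches
        self.alpha = nn.Parameter(torch.tensor(0.5))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        ch = x * self.channel_gate(x)                       # channel-calibrated
        pooled = torch.cat([x.mean(dim=1, keepdim=True),
                            x.amax(dim=1, keepdim=True)], dim=1)
        sp = x * self.spatial_gate(pooled)                  # spatially calibrated
        a = torch.sigmoid(self.alpha)                       # dynamic gate in (0, 1)
        return a * ch + (1 - a) * sp
```

In a multi-scale encoder, one such module per scale (or one applied after cross-scale concatenation) would let the network reweight features where haze density varies across the image.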
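The hybrid loss combines a physical-model constraint with a perceptual term. One common way to impose the physical constraint is through the atmospheric scattering model, I(x) = J(x)t(x) + A(1 - t(x)): re-synthesize the hazy input from the dehazed estimate and penalize the mismatch. The sketch below assumes the network also predicts a transmission map and atmospheric light, and the loss weights and helper names (`hybrid_loss`, `perc_feat`) are illustrative, not taken from the paper.

```python
import torch
import torch.nn.functional as F

def hybrid_loss(pred_clean, gt_clean, hazy, trans, airlight,
                w_rec=1.0, w_phys=0.1, w_perc=0.05, perc_feat=None):
    """Hypothetical hybrid dehazing loss sketch.

    pred_clean / gt_clean: predicted and ground-truth haze-free images
    hazy:                  the original hazy input
    trans, airlight:       predicted transmission map and atmospheric light
    perc_feat:             optional frozen feature extractor (e.g. VGG features)
    Weights are illustrative placeholders.
    """
    # Pixel-level reconstruction term
    l_rec = F.l1_loss(pred_clean, gt_clean)

    # Physical consistency: re-synthesize haze via I = J*t + A*(1 - t)
    rehazed = pred_clean * trans + airlight * (1 - trans)
    l_phys = F.l1_loss(rehazed, hazy)

    # Perceptual term on features from a fixed pretrained extractor
    if perc_feat is not None:
        l_perc = F.l1_loss(perc_feat(pred_clean), perc_feat(gt_clean))
    else:
        l_perc = pred_clean.new_tensor(0.0)

    return w_rec * l_rec + w_phys * l_phys + w_perc * l_perc
```

The physical term regularizes the network toward solutions consistent with the scattering model, which is particularly helpful where the haze is non-uniform and a purely pixel-wise loss under-constrains the transmission estimate.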