DL-Diff: Advancing Low-Light Video Enhancement through Dark-to-Light Diffusion Models

Abstract: To address insufficient detail recovery and the lack of spatiotemporal consistency in low-light video enhancement (LLVE), this paper proposes DL-Diff, an LLVE method based on a dark-to-light diffusion model. First, a base model is built on a pre-trained latent diffusion model, recasting LLVE as a conditional video-to-video generation task. Second, two cooperating components are designed: a restoration component that maps low-light video to normal-light video, and a temporal component that keeps consecutive frames temporally consistent. In addition, a multi-stage training pipeline optimizes the network parameters step by step, progressively improving restoration quality. Extensive experiments were conducted on the paired DID and SDSD datasets. Quantitatively, DL-Diff achieves the best or second-best result on every metric. On DID in particular, it reaches an FID of 41.29 and an LPIPS of 0.17 (spatial metrics), and an AB(Var) of 25.40 and an MABD of 0.08 (temporal metrics), all better than the compared LLVE methods. Qualitative results likewise show that DL-Diff generates normal-light videos that are both spatially aligned and temporally continuous, with better visual quality than the compared methods.
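The temporal metrics named in the abstract, AB(Var) and MABD, are brightness-based, but the abstract does not define them. As a rough illustration of why such statistics capture flicker, here is a minimal sketch under simplified, assumed definitions: the variance of per-frame mean brightness, and the mean absolute change in brightness between consecutive frames. The function names and toy frames are hypothetical, and these are not the paper's exact formulas.

```python
from statistics import mean, pvariance


def frame_brightness(frame):
    """Mean pixel intensity of one grayscale frame, given as a list of rows."""
    return mean(p for row in frame for p in row)


def brightness_stats(video):
    """Return two simplified temporal statistics for a list of frames.

    Assumed definitions (illustrative only, not the paper's):
      1. variance of the per-frame mean brightness
      2. mean absolute brightness change between consecutive frames
    """
    levels = [frame_brightness(f) for f in video]
    diffs = [abs(b - a) for a, b in zip(levels, levels[1:])]
    return pvariance(levels), mean(diffs)


# Toy 2x2 grayscale videos: one steady, one flickering.
steady = [[[100, 100], [100, 100]] for _ in range(3)]
flicker = [
    [[100, 100], [100, 100]],
    [[140, 140], [140, 140]],
    [[100, 100], [100, 100]],
]

steady_var, steady_diff = brightness_stats(steady)
flicker_var, flicker_diff = brightness_stats(flicker)
```

Under these assumed definitions the steady clip scores zero on both statistics, while the flickering clip scores higher on both. That is the intuition behind treating lower values as better temporal consistency.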

     
