DL-Diff: Advancing Low-Light Video Enhancement through Dark-to-Light Diffusion Models
Graphical Abstract
Abstract
To address the challenges of insufficient detail recovery and spatiotemporal inconsistency in low-light video enhancement (LLVE), this thesis proposes DL-Diff, a novel dark-to-light diffusion model. The framework first formulates LLVE as a conditional video-to-video generation task built on a pre-trained latent diffusion model. It then incorporates two synergistic components: (1) a Restoration Component that learns the mapping from the low-light to the normal-light domain, and (2) a Temporal Component that maintains inter-frame consistency. Furthermore, this thesis develops a multi-stage training strategy that progressively optimizes the network parameters for improved video quality. Extensive experiments on the paired DID and SDSD datasets demonstrate the superior performance of DL-Diff. On the DID benchmark, the method achieves state-of-the-art results with spatial metrics of 41.29 (FID) and 0.17 (LPIPS), along with temporal metrics of 25.40 (AB(Var)) and 0.08 (MABD), outperforming existing LLVE approaches. Qualitative evaluations confirm that DL-Diff generates visually pleasing videos with improved spatial alignment and temporal coherence compared with competing methods.
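To make the component structure summarized above concrete, the following is a minimal, hypothetical PyTorch sketch of how a restoration module and a temporal module might be composed around a conditional latent denoiser. It is an illustration under assumed shapes and layer choices, not the authors' implementation: the class names (RestorationComponent, TemporalComponent, DLDiffSketch), the residual convolutional blocks, and the single-convolution stand-in for the pre-trained latent diffusion denoiser are all assumptions.

```python
import torch
import torch.nn as nn


class RestorationComponent(nn.Module):
    """Hypothetical module: nudges low-light conditioning latents toward the normal-light domain."""
    def __init__(self, channels: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.SiLU(),
            nn.Conv2d(channels, channels, 3, padding=1),
        )

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        # Residual correction of per-frame latents, shape (batch*frames, channels, H, W).
        return z + self.net(z)


class TemporalComponent(nn.Module):
    """Hypothetical module: convolution along the frame axis to encourage inter-frame consistency."""
    def __init__(self, channels: int = 64):
        super().__init__()
        self.temporal = nn.Conv3d(channels, channels, kernel_size=(3, 1, 1), padding=(1, 0, 0))

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        # z: (batch, channels, frames, H, W); residual temporal smoothing.
        return z + self.temporal(z)


class DLDiffSketch(nn.Module):
    """Toy composition: condition a per-frame denoiser on restored low-light latents,
    then apply a temporal module across frames. The denoiser here is a placeholder
    for a pre-trained latent diffusion U-Net."""
    def __init__(self, channels: int = 64):
        super().__init__()
        self.restoration = RestorationComponent(channels)
        self.temporal = TemporalComponent(channels)
        self.denoiser = nn.Conv2d(channels * 2, channels, 3, padding=1)  # stand-in only

    def forward(self, noisy_latents: torch.Tensor, lowlight_latents: torch.Tensor) -> torch.Tensor:
        # Both inputs: (batch, channels, frames, H, W).
        b, c, t, h, w = noisy_latents.shape
        cond = self.restoration(lowlight_latents.transpose(1, 2).reshape(b * t, c, h, w))
        x = noisy_latents.transpose(1, 2).reshape(b * t, c, h, w)
        eps = self.denoiser(torch.cat([x, cond], dim=1))        # per-frame noise prediction
        eps = eps.reshape(b, t, c, h, w).transpose(1, 2)        # back to (b, c, t, h, w)
        return self.temporal(eps)                               # inter-frame smoothing


if __name__ == "__main__":
    model = DLDiffSketch()
    noisy = torch.randn(1, 64, 4, 32, 32)      # 4 frames of noisy latents
    lowlight = torch.randn(1, 64, 4, 32, 32)   # paired low-light conditioning latents
    print(model(noisy, lowlight).shape)         # torch.Size([1, 64, 4, 32, 32])
```

In such a layout, a multi-stage training schedule of the kind the abstract mentions could, for example, first train the restoration path, then the temporal path, and finally fine-tune them jointly; the specific staging used by DL-Diff is described in the thesis body, not in this sketch.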