Ding Dandan, Wu Xilin, Tong Junchao, Yao Zhengwei, Pan Zhigeng. Multi-Frame Video Enhancement Using Virtual Frame Synthesized in Time Domain[J]. Journal of Computer-Aided Design & Computer Graphics, 2020, 32(5): 780-786. DOI: 10.3724/SP.J.1089.2020.17952

Multi-Frame Video Enhancement Using Virtual Frame Synthesized in Time Domain

  • Abstract: Neural-network-based video enhancement can effectively reduce compression artifacts, improving both the objective and subjective quality of compressed video. State-of-the-art methods usually adopt a single-frame, spatial-only enhancement strategy. However, video frames are also highly correlated in the temporal domain, meaning that temporally neighboring reconstructed frames can provide useful information for enhancing the current frame; this information has not yet been fully exploited for video enhancement. To that end, this paper proposes a spatial-temporal enhancement method for reconstructed video that introduces a virtual frame in the time domain. We first employ an adaptive network to predict a virtual frame of the current frame from its preceding and succeeding reconstructed frames; this virtual frame carries abundant temporal information. Since the current frame is also highly correlated in the spatial domain, the two sources of information can be combined for further enhancement. We therefore develop an enhancement network, structured in a progressive fusion manner, that fuses the virtual frame with the current frame. Experimental results show that, under the random access configuration, the proposed method obtains an average PSNR gain of 0.38 dB over the H.265/HEVC anchor and 0.06 dB over the single-frame strategy. It also outperforms the state-of-the-art multi-frame quality enhancement network (MFQE) by 0.26 dB PSNR while using only 12.2% of MFQE's parameters. In addition, the proposed method significantly improves the subjective quality of compressed videos.
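The abstract describes a two-stage pipeline: an adaptive network first synthesizes a virtual frame from the neighboring reconstructed frames, and a progressive fusion network then combines that virtual frame with the decoded current frame to produce the enhanced output. The PyTorch sketch below illustrates this data flow only; the module names (VirtualFrameNet, ProgressiveFusionNet), layer counts, and channel widths are illustrative assumptions, not the architecture reported in the paper.

```python
# Minimal sketch of the two-stage enhancement pipeline described in the
# abstract. All layer configurations here are assumptions for illustration;
# the paper's actual adaptive prediction and progressive fusion networks
# are more elaborate.
import torch
import torch.nn as nn


class VirtualFrameNet(nn.Module):
    """Predicts a 'virtual' current frame from the previous and next
    reconstructed frames (the temporal information)."""

    def __init__(self, channels: int = 32):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(2, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, 1, 3, padding=1),
        )

    def forward(self, prev_frame: torch.Tensor, next_frame: torch.Tensor) -> torch.Tensor:
        # Stack the two neighboring frames along the channel axis.
        return self.body(torch.cat([prev_frame, next_frame], dim=1))


class ProgressiveFusionNet(nn.Module):
    """Fuses the virtual frame with the decoded current frame in several
    stages, then adds a learned residual to the current frame."""

    def __init__(self, channels: int = 32, stages: int = 3):
        super().__init__()
        self.head = nn.Conv2d(2, channels, 3, padding=1)
        self.stages = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(channels, channels, 3, padding=1),
                nn.ReLU(inplace=True),
            )
            for _ in range(stages)
        ])
        self.tail = nn.Conv2d(channels, 1, 3, padding=1)

    def forward(self, current_frame: torch.Tensor, virtual_frame: torch.Tensor) -> torch.Tensor:
        x = self.head(torch.cat([current_frame, virtual_frame], dim=1))
        for stage in self.stages:
            x = x + stage(x)  # progressive refinement with skip connections
        return current_frame + self.tail(x)  # residual enhancement


if __name__ == "__main__":
    # Toy single-channel (luma) frames: batch of 1, 64x64 pixels.
    prev_f, cur_f, next_f = (torch.rand(1, 1, 64, 64) for _ in range(3))
    virtual = VirtualFrameNet()(prev_f, next_f)
    enhanced = ProgressiveFusionNet()(cur_f, virtual)
    print(enhanced.shape)  # torch.Size([1, 1, 64, 64])
```

Predicting a residual on top of the current frame, as in the last line of ProgressiveFusionNet, is a common design choice for compression-artifact reduction: the network only needs to learn the correction, which also helps explain how a fusion model can stay small relative to MFQE.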
