Citation: Yu Bing, Ding Youdong, Xie Zhifeng, Huang Dongjin, Ma Lizhuang. Temporal-Spatial Generative Adversarial Networks for Video Inpainting[J]. Journal of Computer-Aided Design & Computer Graphics, 2020, 32(5): 769-779. DOI: 10.3724/SP.J.1089.2020.17962


Temporal-Spatial Generative Adversarial Networks for Video Inpainting

Abstract: Existing video inpainting methods may fail to yield semantically continuous results. To address this problem, we propose an inpainting method based on temporal-spatial generative adversarial networks that comprises two network models: a single-frame inpainting model and a sequence inpainting model. The single-frame inpainting model, consisting of a single-frame stacked generator and a spatial discriminator, achieves high-quality spatial completion of the corrupted start frame. Building on this, the sequence inpainting model, consisting of a sequence stacked generator and a temporal-spatial discriminator, completes the corrupted subsequent frames with temporal-spatial consistency. Experimental results on the UCF-101 and FaceForensics datasets show that our method greatly improves the temporal and spatial coherence of the inpainted video; compared with the benchmark method, it performs better in peak signal-to-noise ratio, structural similarity, learned perceptual image patch similarity, and stability error.
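The two-stage pipeline described in the abstract can be illustrated with a minimal PyTorch-style sketch: a single-frame model completes the start frame, and a sequence model completes each subsequent frame conditioned on the previously completed one. The names (FrameGenerator, SequenceGenerator, inpaint_video) and the toy convolutional layers below are assumptions for illustration only; they do not reproduce the paper's stacked generators, discriminators, or training losses.

# Minimal sketch of the two-stage inpainting pipeline (illustrative assumptions,
# not the paper's actual architectures or training procedure).
import torch
import torch.nn as nn

class FrameGenerator(nn.Module):
    """Toy single-frame generator: fills the masked regions of one frame."""
    def __init__(self, channels=3):
        super().__init__()
        # Input: masked frame concatenated with its binary mask.
        self.net = nn.Sequential(
            nn.Conv2d(channels + 1, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, channels, 3, padding=1), nn.Tanh(),
        )

    def forward(self, frame, mask):
        return self.net(torch.cat([frame, mask], dim=1))

class SequenceGenerator(nn.Module):
    """Toy sequence generator: completes the current frame conditioned on the
    previously completed frame, which encourages temporal consistency."""
    def __init__(self, channels=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(2 * channels + 1, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, channels, 3, padding=1), nn.Tanh(),
        )

    def forward(self, frame, mask, prev_completed):
        return self.net(torch.cat([frame, mask, prev_completed], dim=1))

def inpaint_video(frames, masks, frame_gen, seq_gen):
    """frames: (T, B, C, H, W) masked video; masks: (T, B, 1, H, W), 1 = missing.
    The start frame goes through the single-frame model; later frames go through
    the sequence model conditioned on the previous output."""
    completed = [frame_gen(frames[0], masks[0])]
    for t in range(1, frames.shape[0]):
        completed.append(seq_gen(frames[t], masks[t], completed[-1]))
    return torch.stack(completed)

if __name__ == "__main__":
    T, B, C, H, W = 4, 1, 3, 64, 64
    frames = torch.rand(T, B, C, H, W)
    masks = (torch.rand(T, B, 1, H, W) > 0.7).float()
    out = inpaint_video(frames * (1 - masks), masks, FrameGenerator(), SequenceGenerator())
    print(out.shape)  # torch.Size([4, 1, 3, 64, 64])

In the paper's adversarial setup, the spatial discriminator would judge individual completed frames, while the temporal-spatial discriminator would judge completed frame sequences; neither is implemented in this sketch.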

