Image Harmonization Method Based on Deep Forgery Detection
Abstract: In computer vision and augmented reality, fusing a foreground object into a background scene so that the composite image appears harmonious is an important and challenging task. Most mainstream harmonization methods adjust the appearance of the foreground to make it visually consistent with the background. However, most of these methods are limited to improving the convolutional layers of the harmonization network, and they judge whether an image is harmonious either by relatively subjective human visual perception or by a global reconstruction error, which leaves limited room for improving the harmonization result. Building on existing harmonization work, this paper proposes a forgery detection network for image harmonization that uses deep learning to identify whether an image is a composite. Through the generative adversarial mechanism, the output of the forgery detection network serves as the judgment criterion, and the network is combined with an existing encoder-decoder harmonization network to form a GAN model. The two networks compete with each other until the forgery detection network can no longer recognize the harmonized reconstruction as a composite image.
To further constrain channel correlation and illumination consistency during harmonization, this paper adds an image difference module and an image illumination module to the forgery detection network. The difference module computes difference information from the internal channels of the image. The illumination module uses an encoder network to extract features and separate them into local and global features, then obtains the illumination information of the image through feature fusion and linear prediction. Finally, a feature extraction and fusion module combines all of this information to judge whether the image is harmonious. Qualitative and quantitative experiments on the public benchmark dataset iHarmony show that the proposed method achieves better MSE and PSNR values than existing methods, demonstrating strong performance on the image harmonization task.
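The adversarial setup and the evaluation metrics named in the abstract can be illustrated with a minimal sketch. The abstract does not specify its loss functions, so the binary cross-entropy terms below are the standard GAN formulation swapped in for illustration. The function names, the weight `adv_weight`, and the convention that the detector outputs the probability that an image is real (not composite) are all assumptions, not the paper's implementation. MSE and PSNR follow their usual definitions.

```python
import math


def bce(prob, label, eps=1e-12):
    """Binary cross-entropy for a single predicted probability."""
    p = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(p) + (1.0 - label) * math.log(1.0 - p))


def mse(a, b):
    """Mean squared error between two equal-length flat pixel lists."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def psnr(a, b, max_value=255.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(MAX^2 / MSE)."""
    err = mse(a, b)
    if err == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / err)


def detector_loss(p_real, p_harmonized):
    """Hypothetical detector objective: label real images 1 and
    harmonized composites 0."""
    return bce(p_real, 1.0) + bce(p_harmonized, 0.0)


def harmonizer_loss(harmonized, target, p_harmonized, adv_weight=0.01):
    """Hypothetical harmonizer objective: reconstruction error plus an
    adversarial term that rewards fooling the detector.
    adv_weight is an assumed hyperparameter."""
    return mse(harmonized, target) + adv_weight * bce(p_harmonized, 1.0)
```

In this sketch the harmonizer's loss falls as the detector assigns a higher "real" probability to the harmonized output, which is the competition the abstract describes. A real system would compute these terms on tensors in a deep learning framework and alternate updates between the two networks.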