A Style Transfer Model for Thangka Images Based on Fusion of Feature Position Coding and Error Correction

  • Abstract: Thangka is a distinctive art form in Tibetan culture, and re-creating thangka images can help people understand the characteristics of the style and promote the transmission and protection of this cultural heritage. Images generated by current thangka style transfer techniques suffer from artifacts, local blurring, and poorly handled detail. To address these problems, we propose FPC-EI, a thangka style transfer model that fuses feature position encoding with error correction. Two-dimensional relative position encoding captures the positional correspondence between the content sequence and the style sequence, so that image features can be aligned and matched. A coordinate attention module builds on this encoding to strengthen the encoder's ability to capture texture detail in both the thangka and the content images. A dense deep back-projection network repeatedly up-samples and down-samples through its sampling-layer modules to correct errors and guide the model toward higher-quality output. Experiments with models trained on a self-built thangka dataset and an arbitrary image dataset show that, compared with style transfer models such as AdaIN, ArtFlow, and AdaAttN, FPC-EI improves SSIM by 32% on average and reduces LPIPS and MSE by 7.3% and 2.7% on average, respectively, and that it effectively preserves content structure, feature fusion, and local image detail.
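The error-correction idea behind back-projection can be illustrated with a minimal sketch. It is not the FPC-EI implementation: the abstract does not describe the layers, and deep back-projection networks normally learn their up- and down-sampling operators as convolutional layers. Here the operators are fixed and one-dimensional (midpoint interpolation and pair averaging), and every function name is hypothetical. The sketch shows only the feedback step: up-sample, project back down, measure the error against the low-resolution input, and add the up-sampled error to the estimate.

```python
def upsample(signal):
    """Double the length: keep each sample and insert the midpoint to its right neighbour."""
    out = []
    for i, value in enumerate(signal):
        nxt = signal[i + 1] if i + 1 < len(signal) else value  # replicate the last sample
        out.extend([value, (value + nxt) / 2])
    return out


def downsample(signal):
    """Halve the length by averaging adjacent pairs (assumes an even length)."""
    return [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]


def up_projection(low):
    """One back-projection step with fixed operators (an assumption; real networks learn them)."""
    high = upsample(low)                                # initial high-resolution estimate
    reprojected = downsample(high)                      # project the estimate back down
    error = [l - r for l, r in zip(low, reprojected)]   # low-resolution reconstruction error
    corrected = [h + e for h, e in zip(high, upsample(error))]  # feed the error back
    return high, corrected


def reprojection_error(low, high):
    """Largest absolute difference between the input and the down-sampled estimate."""
    return max(abs(l - r) for l, r in zip(low, downsample(high)))


low = [0.0, 4.0, 8.0, 0.0]
naive, corrected = up_projection(low)
print(reprojection_error(low, naive))      # 2.0
print(reprojection_error(low, corrected))  # 0.75
```

With these fixed operators, one correction step reduces the reprojection error from 2.0 to 0.75 on this input. Stacking several such up- and down-projection units is the general pattern that back-projection networks follow.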

     
