Abstract:
Thangka is a unique art form in Tibetan culture, and the derivative creation of thangka images can help people understand the stylistic characteristics of thangka and promote the inheritance and preservation of this cultural heritage. Images generated by current thangka style transfer techniques suffer from artifacts, local blurring, and poor handling of detail. To address these problems, we propose FPC-EI, a thangka style transfer model that integrates feature position encoding and error correction. Two-dimensional relative position encoding is used to obtain the positional correspondence between the content sequences and the style sequences, which allows the image features to be aligned and matched. A coordinate attention module strengthens the encoder's ability to capture the texture details of the thangka and content images. A dense deep back-projection network uses repeatedly alternating up-sampling and down-sampling layers to correct errors and guide the model toward higher-quality generated images. Experiments with models trained on a self-constructed thangka dataset and an arbitrary-image dataset show that, compared with style transfer models such as AdaIN, ArtFlow, and AdaAttN, FPC-EI improves the SSIM index by 32% on average and reduces the LPIPS and MSE indexes by 7.3% and 2.7% on average, respectively. FPC-EI effectively preserves the content structure, feature fusion, and local details of the image.
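To make the idea of two-dimensional relative position encoding more concrete, the following is a minimal sketch of one common formulation, in which every pair of patches on an H x W grid is mapped to an index determined only by their row and column offsets, and that index selects a learnable bias. This is an illustrative assumption, not the FPC-EI implementation: the function names `relative_position_index` and `relative_position_bias`, the grid sizes, and the bias table layout are all hypothetical, and the abstract does not specify how FPC-EI relates content and style positions.

```python
import numpy as np

def relative_position_index(height, width):
    """Map each pair of grid positions to one index per (dy, dx) offset.

    Hypothetical helper for illustration; not taken from the FPC-EI paper.
    Returns an integer array of shape (H*W, H*W).
    """
    # Row and column coordinate of every patch, flattened in row-major order.
    ys, xs = np.meshgrid(np.arange(height), np.arange(width), indexing="ij")
    coords = np.stack([ys.ravel(), xs.ravel()])        # shape (2, N)

    # Pairwise offsets: rel[:, i, j] = coords[:, i] - coords[:, j].
    rel = coords[:, :, None] - coords[:, None, :]      # shape (2, N, N)

    # Shift offsets so they start at zero instead of -(H-1) and -(W-1).
    rel[0] += height - 1
    rel[1] += width - 1

    # Combine the two offsets into a single index in [0, (2H-1)*(2W-1)).
    return rel[0] * (2 * width - 1) + rel[1]

def relative_position_bias(table, height, width):
    """Look up a bias for every pair of positions.

    `table` is assumed to have (2H-1)*(2W-1) rows, one per distinct offset;
    in a trained model it would be a learnable parameter.
    """
    return table[relative_position_index(height, width)]
```

In an attention layer, a bias looked up this way would typically be added to the query-key scores before the softmax, so that pairs of patches with the same spatial offset share the same positional term.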