Abstract:
Thangka is a distinctive art form in Tibetan culture, and the secondary creation of Thangka images can help people understand the characteristics of the Thangka style and promote the inheritance and preservation of this culture. However, current Thangka image style transfer techniques still have shortcomings when generating new images, such as artifacts, blurred local regions, and poor handling of detail information. To address these problems, a Thangka style transfer model (FPC-EI) that incorporates feature position encoding and error correction is proposed; it exploits the Transformer's ability to capture long-range dependencies among image features to achieve Thangka style transfer. First, an encoder that highlights feature position information is designed; it uses two-dimensional relative position encoding to obtain the positional correspondence between content sequences and style sequences, thereby aligning and matching image features. Second, a coordinate attention module is added to the encoder, which, together with the two-dimensional relative position encoding, enhances the model's ability to capture the texture details of the Thangka and content images. Finally, to improve the quality of the generated images, a dense deep back-projection network is added, which performs error correction through a module of alternating up-sampling and down-sampling layers. Experimental results show that the proposed method outperforms style transfer models such as AdaIN and WCT in preserving content structure, fusing features, and rendering local image details.
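To make the idea of two-dimensional relative position encoding concrete, the following is a minimal sketch of one common way to index relative offsets between tokens on an image grid. It is an illustration under stated assumptions, not the FPC-EI implementation: the function name `relative_position_index` and the flattening scheme (a lookup table with `(2H-1)*(2W-1)` entries) are hypothetical choices, and the abstract does not specify which scheme the model uses.

```python
def relative_position_index(height, width):
    """Map every pair of grid tokens to an index into a relative-offset table.

    Assumption (not from the paper): offsets (dy, dx) are shifted to be
    non-negative and flattened row-major into a table of size
    (2*height - 1) * (2*width - 1).
    """
    # Enumerate token coordinates in row-major order.
    coords = [(y, x) for y in range(height) for x in range(width)]
    n = len(coords)
    index = [[0] * n for _ in range(n)]
    for i, (yi, xi) in enumerate(coords):
        for j, (yj, xj) in enumerate(coords):
            # Shift offsets from [-(H-1), H-1] to [0, 2H-2] (same for width).
            dy = yi - yj + (height - 1)
            dx = xi - xj + (width - 1)
            # Flatten the 2D offset into a single table index.
            index[i][j] = dy * (2 * width - 1) + dx
    return index


if __name__ == "__main__":
    for row in relative_position_index(2, 2):
        print(row)
```

In a Transformer encoder, an index of this kind would typically select a learned bias that is added to the attention logits, so that pairs of tokens with the same spatial offset share the same positional term regardless of their absolute location.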