Cai Jianghai, Huang Chengquan, Shunxia Wang, Guiyan Yang, Senyan Luo, Lihua Zhou. Generative Visual Image Understanding Based on Disentangled Representation Learning[J]. Journal of Computer-Aided Design & Computer Graphics. DOI: 10.3724/SP.J.1089.2024-00003

Generative Visual Image Understanding Based on Disentangled Representation Learning

  • Learning interpretable visual image representations that reveal the factors of variation in images is an active research topic in computer vision. Many existing disentanglement methods discover these factors and learn disentangled representations by adding an extra regularization term. However, this usually creates an imbalance between disentanglement and generative quality, which impairs visual image understanding. To address this issue, a generative visual image understanding method based on disentangled representation learning is proposed, built on interpretable variations in images. First, a pre-trained Glow model is used to obtain the latent representations of target images. Second, a learning strategy based on image variation is constructed on the latent representations to obtain interpretable candidate traversal directions. Finally, a contrast module is designed from a contrastive learning perspective to simulate image variations along the interpretable candidate traversal directions and then extract disentangled representations. Experiments on the popular disentanglement datasets Shapes3D, MPI3D, Anime, MNIST and Cars3D show improved results; on the Cars3D dataset, the MIG, DCI, FactorVAE score and β-VAE score metrics reach 0.16, 0.27, 0.89 and 0.98, respectively, verifying the effectiveness and feasibility of the proposed method.
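The two ideas named in the abstract, traversing a latent code along a candidate direction and contrasting the resulting variations, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `traverse` and `info_nce`, the use of cosine similarity, the InfoNCE-style loss, and the temperature value are all assumptions chosen for illustration, and the latent vectors are plain NumPy arrays standing in for latents that a flow model such as Glow would produce.

```python
import numpy as np

def traverse(z, direction, steps):
    """Shift latent code z along a unit-normalised direction.

    Returns an array of shape (len(steps), len(z)), one shifted code per step.
    Hypothetical helper: the paper's traversal procedure may differ.
    """
    d = direction / np.linalg.norm(direction)
    return np.stack([z + a * d for a in steps])

def info_nce(anchor, positive, negatives, temperature=0.5):
    """InfoNCE-style contrastive loss on cosine similarities (an assumed choice).

    The loss is small when the anchor is more similar to the positive
    than to any of the negatives.
    """
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

    logits = np.array(
        [cos(anchor, positive)] + [cos(anchor, n) for n in negatives]
    ) / temperature
    logits -= logits.max()  # subtract the max for numerical stability
    return float(-logits[0] + np.log(np.exp(logits).sum()))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    z1, z2 = rng.normal(size=8), rng.normal(size=8)  # stand-in latent codes
    directions = np.eye(8)[:2]                       # two toy candidate directions

    # Variations (shifted code minus original) induced by each direction.
    v_a = traverse(z1, directions[0], [1.0])[0] - z1
    v_b = traverse(z2, directions[0], [2.0])[0] - z2  # same direction, other image
    v_c = traverse(z1, directions[1], [1.0])[0] - z1  # different direction

    # Variations from the same direction act as the positive pair.
    print(info_nce(v_a, v_b, [v_c]))
```

In this toy setting, variations produced by the same direction are treated as a positive pair and variations from other directions as negatives, so minimising the loss would push candidate directions toward distinct, consistent image changes.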