Citation: Qianqian Wang, Zihao Zhang, Hongxu Jiang, Wei Feng, Quanxue Gao, Licheng Jiao. Deep Multimodal Clustering Based on Self-supervised Information Entropy Learning[J]. Journal of Computer-Aided Design & Computer Graphics. DOI: 10.3724/SP.J.1089.null.2023-00624

Deep Multimodal Clustering Based on Self-supervised Information Entropy Learning

  • Abstract: To keep the clustering spaces of multiple modalities consistent and to remove irrelevant information within each modality, a deep multimodal clustering algorithm based on self-supervised information entropy learning is proposed. First, a multimodal convolutional autoencoder is trained with a reconstruction task to obtain low-dimensional latent features. Next, deep embedding is used to learn an ideal common clustering space for all modalities; this space serves as the label that, in a self-supervised manner, constrains each modality's clustering subspace to approach the ideal one, so that the latent features of every modality follow similar distributions. Finally, drawing on information entropy theory, the mutual information between the labels and each modality's latent features is constrained, which preserves the correlation between modalities while reducing redundancy within each modality. Experiments are conducted on the Fashion-MNIST, COIL-20, FRGC, YTF, RGB-D, and Noisy-MNIST benchmark datasets. The results show that the proposed algorithm outperforms the compared algorithms on the ACC and NMI clustering metrics; on Fashion-MNIST in particular, ACC is 2.2 percentage points higher than that of the advanced StSNE algorithm. Ablation experiments and parameter analysis demonstrate the soundness and robustness of the proposed algorithm.
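
The self-supervised step described in the abstract can be illustrated with a small sketch. The abstract does not give the loss formulas, so the code below assumes the widely used deep embedded clustering (DEC) formulation: a Student's t soft assignment, a sharpened target distribution, and a KL divergence between the common target and each modality's assignment. The toy data, the shared `centers`, the averaging used to build `z_common`, and all function names are hypothetical choices made for illustration; the mutual-information constraint and the autoencoder are not shown.

```python
import math

def soft_assign(z, centers, alpha=1.0):
    """Student's t soft assignment of each latent point to each cluster center."""
    q = []
    for zi in z:
        row = []
        for mu in centers:
            d2 = sum((a - b) ** 2 for a, b in zip(zi, mu))
            row.append((1.0 + d2 / alpha) ** (-(alpha + 1.0) / 2.0))
        s = sum(row)
        q.append([v / s for v in row])
    return q

def target_distribution(q):
    """Sharpened target: square q, divide by soft cluster frequency, renormalize rows."""
    k = len(q[0])
    f = [sum(row[j] for row in q) for j in range(k)]
    p = []
    for row in q:
        w = [row[j] ** 2 / f[j] for j in range(k)]
        s = sum(w)
        p.append([v / s for v in w])
    return p

def kl_divergence(p, q, eps=1e-12):
    """Mean over samples of KL(p_i || q_i)."""
    total = 0.0
    for pr, qr in zip(p, q):
        total += sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(pr, qr))
    return total / len(p)

# Toy latent features for two hypothetical modalities (4 samples, 2 dimensions).
z_view1 = [[0.0, 0.1], [0.2, 0.0], [2.0, 2.1], [2.2, 1.9]]
z_view2 = [[0.1, 0.0], [0.0, 0.3], [1.8, 2.0], [2.1, 2.2]]
centers = [[0.0, 0.0], [2.0, 2.0]]

# Assumption: the common space is approximated by averaging the modality features.
z_common = [[(a + b) / 2 for a, b in zip(r1, r2)] for r1, r2 in zip(z_view1, z_view2)]
p_common = target_distribution(soft_assign(z_common, centers))

# The common target acts as the label that each modality's assignment is pulled toward.
loss = sum(kl_divergence(p_common, soft_assign(z, centers)) for z in (z_view1, z_view2))
print(f"self-supervised KL loss: {loss:.4f}")
```

In a real training loop this loss would be minimized by gradient descent over the encoder weights and cluster centers, typically alongside the reconstruction loss; the sketch only evaluates it once on fixed features.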

     
