
Spatial-Temporal Feature Fusion for Clothed Human Reconstruction from Monocular Video

  • Abstract: To address incomplete shape structure, inaccurate prediction of local details, and unsmooth motion in clothed human reconstruction from monocular video, a spatial-temporal feature fusion method for clothed human reconstruction is proposed. First, a temporal deformation is defined in the deformation field between the observation space and the canonical space, so that the contextual information between consecutive frames is represented and enhanced as temporal features. Second, the temporal features guide the learning of fine-grained spatial features, yielding global point-level features and pixel-aligned features of the clothed human body. Finally, a self-attention-based spatial-temporal feature fusion module fuses the global point-level features and the pixel-aligned features with the temporal information, and a neural radiance field that combines the fused features with canonical-space coordinates is constructed to reconstruct an accurate clothed human. Novel-view and novel-pose experiments on the ZJU-MoCap dataset show that the total peak signal-to-noise ratio (PSNR) of the proposed method is 190.96 dB and 184.03 dB, respectively, which is 10.62 dB and 2.45 dB higher than the comparison methods. The method improves the accuracy of clothed human reconstruction and generates clothed human models with reasonable shapes, rich clothing textures, and smooth limbs. Qualitative experiments on text-driven models show that the method also generalizes to some degree.
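The PSNR figures above are reported as totals. As a rough illustration of how such a figure could be computed, the sketch below implements the standard PSNR definition, 10·log10(MAX² / MSE), and sums it over several image pairs. The function names `psnr` and `total_psnr`, the flat-list image representation, and the assumption that the total is a plain sum over evaluated sequences are all illustrative choices; the abstract does not specify the evaluation protocol.

```python
import math


def psnr(rendered, reference, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two flat sequences of pixel values.

    Hypothetical helper: images are assumed to be flattened to lists of floats
    in [0, max_value].
    """
    if len(rendered) != len(reference) or len(rendered) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixel values.
    mse = sum((a - b) ** 2 for a, b in zip(rendered, reference)) / len(rendered)
    if mse == 0.0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def total_psnr(pairs):
    """Sum of PSNR values over (rendered, reference) pairs.

    Assumption: a reported "total PSNR" is a plain sum over evaluated sequences.
    """
    return sum(psnr(rendered, reference) for rendered, reference in pairs)
```

Because PSNR is logarithmic in the mean squared error, a gain of a few dB summed over several sequences corresponds to a noticeable reduction in per-pixel error, though a per-sequence breakdown is needed to interpret a total precisely.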

     
