Texture Completion Method Based on Structure Guidance and Dynamic Feature Fusion

赵娅; 朱婉珍; 贾迪; 姚文达; 江流洋

doi:10.3724/SP.J.1089.2025-00205

Abstract

To address the texture loss caused by head pose variation and self-occlusion in 3D face reconstruction, and to fuse structural and texture information effectively for better completion quality, this paper proposes a texture completion method based on structure guidance and dynamic feature fusion. First, a left-right facial symmetry constraint is introduced to map the texture visible under a non-frontal pose into a symmetric texture space, and dilated convolutions are used to extract multi-scale contextual semantic features. Second, a dual-branch generation network based on gated convolution is constructed to encode facial geometric structure and local texture features separately, providing key structural guidance for texture generation. Third, a dynamic feature fusion module incorporating a local cross-attention mechanism is designed; it establishes semantic associations between structural and texture features within local regions and adaptively adjusts the fusion weights according to regional features, strengthening the guidance that structural information provides to texture generation. Finally, a multi-scale discriminator network is built along three dimensions, namely global texture integrity, edge structure plausibility, and local detail continuity, to strengthen the discriminative constraints on the generated results. Experiments on the CelebA and FFHQ datasets show that the proposed method reaches a structural similarity index (SSIM) of 0.81 and a peak signal-to-noise ratio (PSNR) of 29.56 dB, improvements of approximately 3.8% and 5.2% in SSIM and PSNR, respectively, and that it delivers stable texture completion quality across multi-pose scenarios. Texture mapping results on the FLAME model exhibit more realistic lighting effects and detail, verifying the method's effectiveness and robustness in 3D face reconstruction and rendering tasks.
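The symmetry prior and the PSNR metric named in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes a grayscale UV texture that is symmetric about its vertical center line, copies each unobserved texel from its mirrored counterpart, and leaves the learned completion stages (gated convolution, cross-attention fusion, discriminators) out entirely. The names `mirror_fill` and `psnr`, and the toy data, are hypothetical.

```python
import math


def mirror_fill(texture, visible):
    """Fill unobserved texels from their horizontally mirrored counterparts.

    texture: 2D list of grayscale values (rows of equal length).
    visible: 2D list of booleans, True where the texel was observed.
    Returns (filled_texture, filled_mask); texels unobserved on both
    sides stay at 0 and remain marked as missing.
    """
    filled, mask = [], []
    for row, vis in zip(texture, visible):
        width = len(row)
        new_row, new_vis = [], []
        for x in range(width):
            mx = width - 1 - x  # column of the mirrored texel
            if vis[x]:
                new_row.append(row[x])
                new_vis.append(True)
            elif vis[mx]:
                new_row.append(row[mx])
                new_vis.append(True)
            else:
                new_row.append(0)
                new_vis.append(False)
        filled.append(new_row)
        mask.append(new_vis)
    return filled, mask


def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(peak^2 / MSE)."""
    ref = [v for row in reference for v in row]
    est = [v for row in estimate for v in row]
    mse = sum((a - b) ** 2 for a, b in zip(ref, est)) / len(ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


# Toy example: a perfectly symmetric texture with only the left half observed.
ground_truth = [[10, 20, 20, 10],
                [30, 40, 40, 30]]
visible = [[True, True, False, False],
           [True, True, False, False]]
observed = [[v if seen else 0 for v, seen in zip(row, vis)]
            for row, vis in zip(ground_truth, visible)]

filled, filled_mask = mirror_fill(observed, visible)
psnr_before = psnr(ground_truth, observed)
psnr_after = psnr(ground_truth, filled)
```

On this idealized input the mirrored fill recovers the ground truth exactly, so `psnr_after` is infinite. Real faces are only approximately symmetric and lighting is rarely mirrored, which is presumably why the method treats symmetry as a constraint feeding a generative network and not as the final answer.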
