Multi-level 3D Facial Expression Editing Based on a Multi-branch Spatially Varying Convolutional Network
Abstract: To address the cumbersome model manipulation, low realism of generated expressions, and lack of detail in 3D facial expression editing, this paper proposes a multi-level 3D facial expression editing method based on a multi-branch spatially varying convolutional network. Given the displacements of control points on the facial model, the method generates new expressions that are realistic, natural, and rich in detail. First, the facial mesh passes through the network's high-frequency region segmentation module, which identifies the high-frequency regions. Then, the entire facial mesh and the control point constraints are fed to the coarse editing module to generate the basic expression; at the same time, the high-frequency regions and the control point constraints are fed to the fine editing module to generate rich expression details. Finally, the basic expression and the expression details are fused to obtain the new expression. This hierarchical processing allows the network to generate fine details at a relatively fast speed. Introducing spatially varying convolution allows the spatial features of irregular meshes to be captured better, which improves the accuracy of expression editing. A spatiotemporal correlation criterion that combines the spatial proximity and the motion consistency of vertices is proposed, and the K-Means clustering algorithm is improved accordingly, which greatly improves the rationality and accuracy of automatic high-frequency region segmentation. A new loss function that combines vertex position and vertex normal constraints is constructed, which effectively improves the accuracy of the whole network. The proposed method was used to edit expressions on four character models. Compared with other methods, it generates highly realistic expressions with rich details according to the user's editing requirements.
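To make the idea of a loss that combines vertex position and vertex normal constraints more concrete, here is a minimal NumPy sketch. The abstract does not give the actual formula, so everything here is an assumption made for illustration: the squared-distance position term, the cosine-based normal term, the area-weighted normal computation, the weight `normal_weight`, and the function names `vertex_normals` and `position_normal_loss` are all hypothetical and are not taken from the paper.

```python
import numpy as np


def vertex_normals(vertices, faces):
    """Area-weighted unit normals per vertex of a triangle mesh (illustrative helper)."""
    vertices = np.asarray(vertices, dtype=float)
    faces = np.asarray(faces, dtype=int)
    tri = vertices[faces]  # shape (F, 3, 3)
    # The cross product of two edges has length 2 * triangle area,
    # so summing it per vertex yields an area-weighted average direction.
    face_n = np.cross(tri[:, 1] - tri[:, 0], tri[:, 2] - tri[:, 0])
    normals = np.zeros_like(vertices)
    for corner in range(3):
        np.add.at(normals, faces[:, corner], face_n)
    lengths = np.linalg.norm(normals, axis=1, keepdims=True)
    return normals / np.maximum(lengths, 1e-12)


def position_normal_loss(pred, target, faces, normal_weight=0.1):
    """Assumed form: mean squared position error + weighted (1 - cosine) of normals."""
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    position_term = np.mean(np.sum((pred - target) ** 2, axis=1))
    cosine = np.sum(vertex_normals(pred, faces) * vertex_normals(target, faces), axis=1)
    normal_term = np.mean(1.0 - cosine)
    return position_term + normal_weight * normal_term


# Toy mesh: a unit square in the z = 0 plane, split into two triangles.
square = np.array([[0, 0, 0], [1, 0, 0], [1, 1, 0], [0, 1, 0]], dtype=float)
square_faces = np.array([[0, 1, 2], [0, 2, 3]])

# Lifting one corner changes both vertex positions and surface orientation.
bent = square.copy()
bent[2, 2] = 0.5
bent_loss = position_normal_loss(bent, square, square_faces)
```

In this sketch a rigid translation of the mesh is penalized only by the position term, while a deformation that bends the surface is also penalized by the normal term. That is one plausible reason to add a normal constraint when fine surface detail such as wrinkles matters, although the paper's own motivation and weighting may differ.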