Abstract: 3D head reconstruction is one of the fundamental techniques for building the metaverse, and it also has broad applications in film and cartoon creation, game design, and intelligent education. Compared with modeling 3D heads manually by artists or capturing head scans with 3D scanners or stereo imaging systems, reconstructing 3D heads from a single image is far more economical and practical. However, single-image 3D head reconstruction is an ill-posed problem, and existing methods often suffer from low fidelity, a lack of detail, and poor generalization. To this end, this paper proposes a single-image 3D head reconstruction method based on a normal-enhanced implicit function. First, a deep network is designed to estimate a head normal map from a single image. Second, the surface of the 3D head model is defined as the level set of an implicit function, and an end-to-end deep neural network is built that extracts visual features from the input image and the estimated normal map and predicts whether or not a 3D point lies on the head surface. Our method achieves Chamfer distances of 0.7696 mm and 1.3080 mm on the FCH (FaceScape, CoMA, HeadSpace) and FaceVerse datasets respectively, outperforming other single-image 3D head reconstruction methods by a large margin. Experimental results on images collected from the Internet further show that our method reconstructs 3D head models with high fidelity and rich details from single images of subjects of different ages, ethnicities, genders, and facial expressions, demonstrating strong generalization capability.
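For concreteness, the sketch below illustrates one way such a normal-enhanced, pixel-aligned implicit function could be realized in PyTorch. The two-stream encoder, feature dimensions, and the class name NormalEnhancedImplicitFunction are illustrative assumptions for this sketch, not the paper's exact architecture.

```python
# A minimal sketch of a pixel-aligned, normal-enhanced implicit function.
# ASSUMPTION: a PIFu-style formulation; encoders, sizes, and names are
# illustrative, not the authors' actual network.
import torch
import torch.nn as nn
import torch.nn.functional as F

class NormalEnhancedImplicitFunction(nn.Module):
    def __init__(self, feat_dim=256):
        super().__init__()
        # Two hypothetical CNN encoders: one for the RGB image, one for the
        # estimated normal map (both 3-channel inputs).
        self.image_encoder = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, feat_dim, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.normal_encoder = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, feat_dim, 3, stride=2, padding=1), nn.ReLU(),
        )
        # MLP maps (image feature, normal feature, point depth) -> occupancy.
        self.mlp = nn.Sequential(
            nn.Linear(2 * feat_dim + 1, 256), nn.ReLU(),
            nn.Linear(256, 1),
        )

    def forward(self, image, normal_map, points, uv):
        """image, normal_map: (B, 3, H, W); points: (B, N, 3) 3D queries;
        uv: (B, N, 2) projections of the points onto the image, in [-1, 1]."""
        f_img = self.image_encoder(image)        # (B, C, h, w)
        f_nrm = self.normal_encoder(normal_map)  # (B, C, h, w)
        grid = uv.unsqueeze(2)                   # (B, N, 1, 2) for grid_sample
        # Bilinearly sample pixel-aligned features at each projected point.
        phi_img = F.grid_sample(f_img, grid, align_corners=True).squeeze(-1)  # (B, C, N)
        phi_nrm = F.grid_sample(f_nrm, grid, align_corners=True).squeeze(-1)  # (B, C, N)
        z = points[..., 2:3]                     # depth of each query point
        feat = torch.cat([phi_img.transpose(1, 2), phi_nrm.transpose(1, 2), z], dim=-1)
        # Sigmoid occupancy; the head surface is the 0.5 level set.
        return torch.sigmoid(self.mlp(feat)).squeeze(-1)  # (B, N)
```

At inference time, such a function would be evaluated on a dense 3D grid and the head surface extracted as the 0.5 level set, e.g. with Marching Cubes, as is standard for implicit-function reconstruction.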