Abstract:
Automatic generation of 3D content is important for virtual reality, games, film and television, and other fields. To address the difficulty of generating high-quality 3D objects efficiently and controllably from natural language instructions, a method for procedural generation and evaluation of 3D assets based on a large language model is proposed. First, the user's natural language description is translated into script instructions for the InfiniGen library, which generate a 3D model that satisfies the semantic requirements; this realizes a cross-modal transformation from text to code to 3D object. Next, a CLIP-based screening and optimization mechanism evaluates and scores the generated results. Finally, the 3D model is selected automatically according to its score. Experimental results show that, compared with a baseline without CLIP filtering, the proposed method significantly improves both the quality of the generated 3D models and their consistency with the input text, and achieves higher CLIP matching scores on multi-class 3D object generation tasks. These results demonstrate the feasibility of combining large language models with procedural 3D generation and provide an effective path toward controllable cross-modal generation of 3D content.
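As a minimal illustration of the score-based selection step, the sketch below ranks candidate 3D models by the cosine similarity between a text embedding and an embedding of each candidate's rendered image, then keeps the highest-scoring one. It assumes the embeddings have already been produced by a CLIP text encoder and image encoder; the function names `cosine_similarity` and `select_best_candidate`, the candidate labels, and the toy vectors are hypothetical and are not taken from the paper's implementation.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_best_candidate(
    text_embedding: List[float],
    candidate_embeddings: Dict[str, List[float]],
) -> Tuple[str, float]:
    """Return the name and score of the candidate whose embedding best matches the text.

    Assumption: each value in candidate_embeddings is a CLIP image embedding
    of a rendering of one generated 3D model (hypothetical setup).
    """
    if not candidate_embeddings:
        raise ValueError("at least one candidate is required")
    scores = {
        name: cosine_similarity(text_embedding, embedding)
        for name, embedding in candidate_embeddings.items()
    }
    best_name = max(scores, key=scores.get)
    return best_name, scores[best_name]


# Toy 3-dimensional vectors standing in for real CLIP embeddings.
text_vec = [1.0, 0.0, 0.0]
candidates = {
    "candidate_a": [0.9, 0.1, 0.0],
    "candidate_b": [0.0, 1.0, 0.0],
    "candidate_c": [-1.0, 0.0, 0.0],
}
best_name, best_score = select_best_candidate(text_vec, candidates)
print(best_name, round(best_score, 4))
```

In a full pipeline the toy vectors would be replaced by embeddings of the input prompt and of renderings of each generated model, and a score threshold could additionally be used to reject all candidates when none matches the text well.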