Staged Gaze Prediction in Virtual Scene Interaction Tasks

李国安; 刘俊辰; 汪淼

doi:10.3724/SP.J.1089.2023-00333

Abstract

Gaze prediction methods are algorithmic models that predict a user's current gaze direction from various kinds of user information. Existing methods for gaze prediction in virtual scenes typically rely on generalized models and leave considerable room for improvement on specific interaction tasks. This paper addresses an interaction task in which users walk through a virtual scene to find, approach, and touch objects. We first construct the first dataset for this task scenario: it contains four types of parameter sequences (gaze point, object, headset, and controller) together with recorded videos, captured while 21 users each performed five interaction tasks in three interactive scenes. We then divide the user's interaction process into three stages: finding the target object, locking onto the target object, and approaching the target object. For each stage we conduct a correlation analysis and select the parameter set most strongly correlated with gaze as the network input for training. Validated on the self-constructed dataset, the proposed method achieves a gaze prediction error of 2.60°, compared with 3.31° for the current state-of-the-art method, a 21.45% reduction in prediction error that improves gaze prediction accuracy for this task scenario.
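The stage-wise correlation step and the reported error reduction can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the correlation measure, so Pearson correlation is an assumption here, and the sample layout, the stage labels, the parameter names (`head_yaw`, `hand_x`, `gaze_x`), and the toy values are all hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a constant signal has no linear relation to gaze
    return cov / math.sqrt(vx * vy)


def select_features_per_stage(samples, gaze_key, top_k):
    """For each stage, keep the top_k parameters ranked by |correlation| with gaze.

    samples: list of dicts with a "stage" label, a gaze value under gaze_key,
    and numeric candidate parameters (hypothetical layout, not from the paper).
    """
    by_stage = {}
    for sample in samples:
        by_stage.setdefault(sample["stage"], []).append(sample)

    selected = {}
    for stage, rows in by_stage.items():
        gaze = [r[gaze_key] for r in rows]
        candidates = [k for k in rows[0] if k not in ("stage", gaze_key)]
        scores = {k: abs(pearson([r[k] for r in rows], gaze)) for k in candidates}
        ranked = sorted(candidates, key=lambda k: scores[k], reverse=True)
        selected[stage] = ranked[:top_k]
    return selected


def relative_reduction(baseline_error, new_error):
    """Percentage by which new_error is lower than baseline_error."""
    return (baseline_error - new_error) / baseline_error * 100.0


# Toy data (invented): head yaw tracks gaze while finding, hand position while approaching.
samples = [
    {"stage": "find", "gaze_x": 0.0, "head_yaw": 0.1, "hand_x": 5.0},
    {"stage": "find", "gaze_x": 1.0, "head_yaw": 1.1, "hand_x": 3.0},
    {"stage": "find", "gaze_x": 2.0, "head_yaw": 1.9, "hand_x": 5.0},
    {"stage": "approach", "gaze_x": 0.0, "head_yaw": 2.0, "hand_x": 0.0},
    {"stage": "approach", "gaze_x": 1.0, "head_yaw": 0.0, "hand_x": 1.1},
    {"stage": "approach", "gaze_x": 2.0, "head_yaw": 2.0, "hand_x": 2.0},
]

per_stage = select_features_per_stage(samples, "gaze_x", top_k=1)
reduction = relative_reduction(3.31, 2.60)  # errors in degrees, from the abstract
```

With the two errors from the abstract, `relative_reduction(3.31, 2.60)` evaluates to about 21.45, matching the reported figure. Selecting a separate parameter set for each stage, rather than one set for the whole task, is what lets a stage-specific model ignore signals that only matter in other stages.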
