Miao Yongwei, Chen Jiahui, Zhang Xinjie, Ma Wenjuan, Sun Shusen. Efficient 3D Object Detection of Indoor Scenes Based on RGB-D Video Stream[J]. Journal of Computer-Aided Design & Computer Graphics, 2021, 33(7): 1015-1025. DOI: 10.3724/SP.J.1089.2021.18630

Efficient 3D Object Detection of Indoor Scenes Based on RGB-D Video Stream

  • In indoor object detection, complex input scenes often suffer from defects such as incomplete RGB-D scan data or mutual occlusion between objects. Moreover, because a single RGB-D image or point cloud captures only part of an indoor scene, it is difficult to detect all 3D objects at once. To overcome this issue and to improve the low efficiency of indoor object detection, an efficient 3D object detection method is proposed that takes RGB-D video streams as input. First, RGB-D video streams of different indoor environments are acquired with a Kinect camera, yielding continuous RGB frames and the corresponding point cloud data. Second, a hash function is used to extract content-sensitive key frames from the continuous RGB frames, and semantic relationships between objects are constructed according to the type and number of 3D objects contained in adjacent key frames, ensuring that different objects appear in each key frame. Then, the 3D objects in the extracted key frames are detected using VoteNet, and the detection results for the remaining frames are estimated from the relative pose between adjacent frames using quaternion spherical linear interpolation. In this way, efficient 3D object detection is achieved for every frame in the RGB-D video stream. With the key-frame detection network trained on the SUN RGB-D dataset, the proposed method produces accurate detections, and its overall detection time is greatly reduced compared with a frame-by-frame VoteNet detection scheme. Experimental results demonstrate that the proposed method is effective and efficient.
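
The two lightweight steps in the pipeline above, hash-based key-frame selection and quaternion spherical linear interpolation (slerp), can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say which hash function is used, so a simple average hash stands in for it, and the names `average_hash`, `hamming`, `select_key_frames`, `slerp`, and the `threshold` parameter are all hypothetical. Frames are assumed to be small grayscale grids (lists of rows), and quaternions are unit-length `(w, x, y, z)` tuples.

```python
import math


def average_hash(gray):
    """Bit tuple with 1 where a pixel is at or above the frame's mean intensity.

    Assumption: a plain average hash; the paper's actual hash is not specified here.
    """
    pixels = [p for row in gray for p in row]
    mean = sum(pixels) / len(pixels)
    return tuple(1 if p >= mean else 0 for p in pixels)


def hamming(a, b):
    """Number of positions at which two equal-length bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))


def select_key_frames(frames, threshold):
    """Return indices of key frames.

    Frame 0 is always kept; a later frame becomes a key frame when its hash
    differs from the previous key frame's hash by more than `threshold` bits.
    """
    keys = [0]
    last = average_hash(frames[0])
    for i in range(1, len(frames)):
        h = average_hash(frames[i])
        if hamming(h, last) > threshold:
            keys.append(i)
            last = h
    return keys


def slerp(q0, q1, t):
    """Spherical linear interpolation between unit quaternions (w, x, y, z)."""
    dot = sum(a * b for a, b in zip(q0, q1))
    if dot < 0.0:
        # q and -q encode the same rotation; flip to take the shorter arc.
        q1 = tuple(-c for c in q1)
        dot = -dot
    if dot > 0.9995:
        # Nearly identical rotations: fall back to linear interpolation.
        q = tuple(a + t * (b - a) for a, b in zip(q0, q1))
    else:
        theta = math.acos(dot)
        s0 = math.sin((1.0 - t) * theta) / math.sin(theta)
        s1 = math.sin(t * theta) / math.sin(theta)
        q = tuple(s0 * a + s1 * b for a, b in zip(q0, q1))
    norm = math.sqrt(sum(c * c for c in q))
    return tuple(c / norm for c in q)
```

In a pipeline of this shape, the expensive detector would run only on the indices returned by `select_key_frames`, and the pose for an in-between frame would be estimated with `slerp` using `t` equal to that frame's fractional position between its two neighbouring key frames.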