Citation: Song Yuanjie, Wang Lu, Meng Xiangxu. An Accelerated Ray Tracing Algorithm for the Intel® Xeon Phi™ Architectures[J]. Journal of Computer-Aided Design & Computer Graphics, 2015, 27(12): 2313-2322.

An Accelerated Ray Tracing Algorithm for the Intel® Xeon Phi™ Architectures

Abstract: Ray-scene intersection is computationally expensive and makes ray tracing slow in photorealistic rendering. To address this, the paper presents a parallel ray tracing acceleration method based on the Intel Many Integrated Core (MIC) architecture. In the scene preprocessing stage, the method builds a bounding volume hierarchy (BVH) with a branching factor of 4, chosen to suit the MIC hardware. In the ray tracing stage, the host CPU controls the overall pipeline: it uses an optimized multithreaded scheduling strategy to dispatch ray-BVH intersection work to the MIC coprocessor and transfers data between CPU and MIC asynchronously, so that the computing power of both host and coprocessor is used. On the coprocessor, a parallel intersection algorithm exploits the MIC's wide SIMD units by vectorizing the test of one ray against the 4 child nodes of a BVH node, so that all 4 ray-box intersections are computed at once. Experimental results show that, compared with the native CPU implementation, the method is 2 to 4 times faster in the ray intersection stage, and the rendering speed of the overall ray tracing pipeline also improves significantly.
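The idea of testing one ray against the 4 child boxes of a BVH node can be illustrated with a minimal sketch. This is not the authors' implementation: the names `Node4`, `Ray`, `make_ray`, and `intersect4` are hypothetical, the structure-of-arrays layout is an assumption, and the four lanes are processed by a plain loop that a vectorizing compiler may map to SIMD instructions, rather than by MIC-specific intrinsics. The test itself is the standard slab method.

```cpp
#include <algorithm>
#include <array>
#include <cassert>

// Hypothetical 4-wide BVH node: the bounds of the four child boxes are
// stored as structure-of-arrays, so each bound is 4 contiguous floats.
struct Node4 {
    std::array<float, 4> min_x, min_y, min_z;
    std::array<float, 4> max_x, max_y, max_z;
};

// Ray with a precomputed reciprocal direction and a valid t interval.
struct Ray {
    float ox, oy, oz;
    float inv_dx, inv_dy, inv_dz;
    float t_min, t_max;
};

// Builds a ray starting at t = 0. A zero direction component gives an
// infinite reciprocal (IEEE 754); this sketch does not handle the corner
// case of an origin lying exactly on a slab plane of such an axis.
Ray make_ray(float ox, float oy, float oz,
             float dx, float dy, float dz, float t_max) {
    assert(t_max >= 0.0f);
    return Ray{ox, oy, oz, 1.0f / dx, 1.0f / dy, 1.0f / dz, 0.0f, t_max};
}

// Slab test of one ray against all four child boxes of a node.
// Returns a bitmask with bit i set if child i is hit, and writes the
// entry distance of each child to t_entry[i].
unsigned intersect4(const Ray& r, const Node4& n,
                    std::array<float, 4>& t_entry) {
    unsigned mask = 0;
    for (int i = 0; i < 4; ++i) {  // the 4 lanes are independent
        const float tx0 = (n.min_x[i] - r.ox) * r.inv_dx;
        const float tx1 = (n.max_x[i] - r.ox) * r.inv_dx;
        const float ty0 = (n.min_y[i] - r.oy) * r.inv_dy;
        const float ty1 = (n.max_y[i] - r.oy) * r.inv_dy;
        const float tz0 = (n.min_z[i] - r.oz) * r.inv_dz;
        const float tz1 = (n.max_z[i] - r.oz) * r.inv_dz;

        // Latest entry and earliest exit over the three slabs,
        // clipped to the ray's own [t_min, t_max] interval.
        const float t_near = std::max({std::min(tx0, tx1), std::min(ty0, ty1),
                                       std::min(tz0, tz1), r.t_min});
        const float t_far = std::min({std::max(tx0, tx1), std::max(ty0, ty1),
                                      std::max(tz0, tz1), r.t_max});

        t_entry[i] = t_near;
        if (t_near <= t_far) mask |= 1u << i;
    }
    return mask;
}
```

A traversal loop would call `intersect4` once per visited node and push the children whose bits are set, typically ordered by `t_entry`. Keeping the lanes independent is what lets the four tests run as one vector operation on wide SIMD hardware.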

     
