片上P/G网求解算法及其GPU上的并行化

唐亮; 潘月斗; 王嘉琪; 骆祖莹

片上P/G网求解算法及其GPU上的并行化

Algorithm Parallelization of on-Die Power/Ground Network Solving Based on GPU Parallel Computing

摘要

摘要: 为了得到片上电源线/地线网络(P/G网)快速而准确的求解算法,根据结构化供电网的局部性效应,重新分析了连续过松弛迭代法(SOR)和变向隐含迭代法(ADI)在P/G网中的求解效率及并行性,提出了利于GPU加速的并行算法:G_RBSOR和G_ADI.它们均采用规则的数据结构,以利于GPU并行读写数据,并采用合并归约来并行计算迭代结束标志位.为了避免GPU计算的数据冲突,G_RBSOR算法采用棋盘格方式对电路节点进行红黑分类,并对红黑节点进行交错松弛.实验结果表明,在不损失精度的前提下,与各自对应的CPU串行算法相比,G_RBSOR和G_ADI算法均取得了超过50倍的加速效果;与高效的P/G分析串行求解算法ICCG相比,也取得了超过5倍的加速效果.

Abstract: In order to study fast and accurate algorithms for power/ground network(P/G network)analyses,based on the locality effect of structure P/G networks,this work rethinks the efficiency and parallelism of successive over relaxation(SOR) algorithm and alternating direction implicit(ADI)algorithm.And then it proposes the optimized GPU-friendly parallel algorithms:G_RBSOR and G_ADI.The algorithms both use the regular data structure to facilitate GPU parallel data reading/writing.And they both use the merging reduction technique for GPU parallel computing to fast calculate the iteration-ending flags,too.Furthermore,in order to avoid the data collision in GPU parallel calculating,G_RBSOR uses the checkerboard strategy to classify all P/G network nodes into red and black groups and then,relax red nodes and black nodes step-by-step.Experimental results show that without any precision penalty,G_RBSOR and G_ADI algorithms can achieve more than 50 X speedup over their serial CPU counterparts.In comparison with the efficient serial algorithm ICCG,both can also achieve more than 5X speedup.

HTML全文

参考文献(0)

施引文献

资源附件(0)