Template-Based Adaptive Training Acceleration Framework for Deep Learning Algorithms
-
Abstract
Field-programmable gate arrays (FPGAs) are often used to accelerate the training phase of deep learning algorithms, but achieving satisfactory execution performance usually requires a long development cycle and substantial hardware design expertise. To address this challenge, this paper proposes an adaptive acceleration framework for deep learning algorithms. We investigate the application scale, parallel scheduling strategy, resource usage, and functional scalability. Building on CPU-FPGA heterogeneous acceleration templates, an adaptive model compiler customizes the accelerator according to the algorithm's complexity and the available hardware resources. The proposed hardware/software co-design framework adapts effectively to different FPGA hardware resources and supports the fast evolution of deep learning algorithms. Taking a graph neural network as an example, it achieves a 7x to 41x performance improvement over a general-purpose CPU platform.
-
-