Abstract:
The restarted generalized minimal residual method (GMRES) is a popular iterative method for solving linear systems, valued for its fast convergence and good stability. This paper implements a parallel GMRES on the GPU using CUDA. In particular, the sparse matrix-vector multiplication is optimized with coalesced memory access and shared memory, which significantly improves performance. We tested the parallel GMRES on a GeForce GTX 260 GPU and compared its performance with that of a traditional GMRES on an Intel Core 2 Quad Q9400 CPU at 2.66 GHz and an Intel Core i7 920 CPU at 2.67 GHz; on average, it achieved speed-ups of 40 times and 20 times, respectively.
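For orientation, the method named in the abstract can be sketched serially. The following is a minimal CPU reference in Python, not the paper's CUDA implementation. It assumes the matrix is stored in compressed sparse row (CSR) form, and the names `csr_matvec` and `gmres_restarted`, the restart length `m`, and the tolerance are illustrative assumptions. The row loop in `csr_matvec` is the sparse matrix-vector product that the abstract describes as the main optimization target on the GPU.

```python
import math

def csr_matvec(n, row_ptr, col_idx, vals, x):
    """y = A x for A in CSR form; every row is an independent dot product."""
    y = [0.0] * n
    for i in range(n):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += vals[k] * x[col_idx[k]]
        y[i] = acc
    return y

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gmres_restarted(n, row_ptr, col_idx, vals, b, m=20, tol=1e-10, max_restarts=50):
    """Restarted GMRES(m) with modified Gram-Schmidt and Givens rotations.
    Assumes A is nonsingular; starts from x = 0."""
    x = [0.0] * n
    b_norm = math.sqrt(dot(b, b)) or 1.0
    for _ in range(max_restarts):
        ax = csr_matvec(n, row_ptr, col_idx, vals, x)
        r = [bi - ai for bi, ai in zip(b, ax)]
        beta = math.sqrt(dot(r, r))
        if beta / b_norm < tol:
            break
        V = [[ri / beta for ri in r]]   # orthonormal Krylov basis
        H = []                          # H[j] holds column j of the rotated Hessenberg matrix
        cs, sn = [], []                 # Givens rotation coefficients
        g = [beta]                      # rotated right-hand side of the least-squares problem
        for j in range(m):
            w = csr_matvec(n, row_ptr, col_idx, vals, V[j])
            h = []
            for i in range(j + 1):      # modified Gram-Schmidt orthogonalization
                hij = dot(w, V[i])
                h.append(hij)
                w = [wk - hij * vk for wk, vk in zip(w, V[i])]
            h_next = math.sqrt(dot(w, w))
            h.append(h_next)
            for i in range(j):          # apply the earlier rotations to the new column
                t = cs[i] * h[i] + sn[i] * h[i + 1]
                h[i + 1] = -sn[i] * h[i] + cs[i] * h[i + 1]
                h[i] = t
            d = math.hypot(h[j], h[j + 1])
            c, s = (1.0, 0.0) if d == 0.0 else (h[j] / d, h[j + 1] / d)
            cs.append(c)
            sn.append(s)
            h[j], h[j + 1] = d, 0.0     # the new rotation zeroes the subdiagonal entry
            g.append(-s * g[j])
            g[j] = c * g[j]
            H.append(h)
            # |g[j+1]| equals the current residual norm
            if abs(g[j + 1]) / b_norm < tol or h_next == 0.0:
                break
            V.append([wk / h_next for wk in w])
        k = len(H)
        y = [0.0] * k
        for i in range(k - 1, -1, -1):  # back substitution on the triangular system
            acc = g[i] - sum(H[col][i] * y[col] for col in range(i + 1, k))
            y[i] = acc / H[i][i]
        for i in range(k):              # update the iterate, then restart
            x = [xk + y[i] * vk for xk, vk in zip(x, V[i])]
    return x
```

Each outer pass builds at most `m` Krylov vectors before restarting, which bounds memory use. In this sketch the matrix-vector products and the dot products account for most of the arithmetic, and both consist of independent per-row or per-element operations.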