AdELL: An Adaptive Warp-Balancing ELL Format for Efficient Sparse Matrix-Vector Multiplication on GPUs. The sparse matrix-vector multiplication (SpMV) is a fundamental computational kernel used in science and engineering. As a result, the performance of a large number of applications depends on the efficiency of the SpMV. This kernel is, in fact, a bandwidth-limited operation and poses a challenge for optimization when the matrix has an irregular structure. The literature on implementing SpMV on throughput-oriented many core processors is extensive and mostly focuses on matrix formats, proposing different ideas to adapt matrix sparsity to the underlying architecture. In this paper, we propose a novel ELL-based matrix format called Adaptive ELL (AdELL) to improve the state-of-the-art of the SpMV on Graphic Processing Units (GPUs). The AdELL format is based on the idea of distributing working threads to rows according to their computational load, creating balanced hardware-level blocks (warps) that take full advantage of the vectorized execution on Streaming Multiprocessors (SMs). The AdELL data structure is created using a novel warp-balancing heuristic designed to smooth the workload among warps without the need of tuning any parameters. AdELL provides an efficient warp-level synchronization (as opposed to block-level) but can also use atomic operations to distribute very skewed rows over multiple warps. Moreover, we introduce a loop unrolling heuristic that optimizes the SpMV performance by selecting the best unrolling factor based on the warp workload. We tested the proposed AdELL sparse format on a set of conventional benchmarks from heterogeneous application domains. The results show substantial and consistent performance improvements for double-precision calculations, outperforming the state-of-the-art ensemble framework clSpMV. We could observe speedup peaks up to 1.94 and a 25% (geometric) average improvement, which can be potentially increased to 43% introducing a simple 1x2 blocking strategy.
References in zbMATH (referenced in 1 article )
Showing result 1 of 1.
- Filippone, Salvatore; Cardellini, Valeria; Barbieri, Davide; Fanfarillo, Alessandro: Sparse matrix-vector multiplication on GPGPUs (2017)