WarpDrive

WarpDrive: Massively Parallel Hashing on Multi-GPU Nodes. Hash maps are among the most versatile data structures in computer science because of their compact data layout and expected constant time complexity for insertion and querying. However, associated memory access patterns during the probing phase are highly irregular resulting in strongly memory-bound implementations. Massively parallel accelerators such as CUDA-enabled GPUs may overcome this limitation by virtue of their fast video memory featuring almost one TB/s bandwidth in comparison to main memory modules of state-of-the-art CPUs with less than 100 GB/s. Unfortunately, the size of hash maps supported by existing single-GPU hashing implementations is restricted by the limited amount of available video RAM. Hence, hash map construction and querying that scales across multiple GPUs is urgently needed in order to support structured storage of bigger datasets at high speeds. In this paper, we introduce WarpDrive - a scalable, distributed single-node multi-GPU implementation for the construction and querying of billions of key-value pairs. We propose a novel subwarp-based probing scheme featuring coalesced memory access over consecutive memory regions in order to mitigate the high latency of irregular access patterns. Our implementation achieves 1.4 billion insertions per second in single-GPU mode for a load factor of 0.95 thereby outperforming the GPU-cuckoo implementation of the CUDPP library by a factor of 2.8 on a P100. Furthermore, we present transparent scaling to multiple GPUs within the same node with up to 4.3 billion operations per second for high load factors on four P100 GPUs connected by NVLink technology. WarpDrive is free software and can be downloaded at https://github.com/sleeepyjack/warpdrive.

Keywords for this software

Anything in here will be replaced on browsers that support the canvas element


References in zbMATH (referenced in 1 article )

Showing result 1 of 1.
Sorted by year (citations)

  1. Daniel Jünger; Robin Kobus; André Müller; Christian Hundt, Kai Xu; Weiguo Liu; Bertil Schmidt: WarpCore: A Library for fast Hash Tables on GPUs (2020) arXiv