Kokkos

The KOKKOS package was developed primarily by Christian Trott (Sandia) with contributions of various styles by others, including Sikandar Mashayak (UIUC), Stan Moore (Sandia), and Ray Shan (Sandia). The underlying Kokkos library was written primarily by Carter Edwards, Christian Trott, and Dan Sunderland (all Sandia). The KOKKOS package contains versions of pair, fix, and atom styles that use data structures and macros provided by the Kokkos library, which is included with LAMMPS in lib/kokkos. The Kokkos library is part of Trilinos and can also be downloaded from Github. Kokkos is a templated C++ library that provides two key abstractions for an application like LAMMPS. First, it allows a single implementation of an application kernel (e.g. a pair style) to run efficiently on different kinds of hardware, such as a GPU, Intel Phi, or many-core CPU. The Kokkos library also provides data abstractions to adjust (at compile time) the memory layout of basic data structures like 2d and 3d arrays and allow the transparent utilization of special hardware load and store operations. Such data structures are used in LAMMPS to store atom coordinates or forces or neighbor lists. The layout is chosen to optimize performance on different platforms. Again this functionality is hidden from the developer, and does not affect how the kernel is coded. These abstractions are set at build time, when LAMMPS is compiled with the KOKKOS package installed. All Kokkos operations occur within the context of an individual MPI task running on a single node of the machine. The total number of MPI tasks used by LAMMPS (one or multiple per compute node) is set in the usual manner via the mpirun or mpiexec commands, and is independent of Kokkos. Kokkos currently provides support for 3 modes of execution (per MPI task). These are OpenMP (for many-core CPUs), Cuda (for NVIDIA GPUs), and OpenMP (for Intel Phi). Note that the KOKKOS package supports running on the Phi in native mode, not offload mode like the USER-INTEL package supports. You choose the mode at build time to produce an executable compatible with specific hardware.