kdtree++ OpenCL Peridynamics solver: OpenCL implementation of a high performance 3D peridynamic model on graphics accelerators. Parallel processing is one of the major trends in the computational mechanics community. Due to inherent limitations in processor design, manufacturers have shifted towards multi- and many-core architectures. Graphics processing units (GPUs) are becoming increasingly popular owing to their wide availability and processing power, as well as the maturity of development tools and community experience. In this research we describe a rather general approach to implementing a 3D Peridynamics model in OpenCL on a GPU platform. Peridynamics is a non-local continuum theory for describing the behavior of materials, used especially when damage and crack nucleation or propagation are of interest. The steps taken to develop an OpenMP code from the serial one, as well as a comparison between the OpenCL and OpenMP codes, are provided. Optimization techniques and their effects on the performance of the code are described. The implementations are tested on several 3D benchmarks with hundreds of thousands to millions of nodes. The behavior of the codes in terms of being memory or compute bound is analyzed. In all test cases reported, the OpenCL implementation consistently outperforms the serial and OpenMP ones and paves the way for the development of high performance Peridynamics codes.
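
For orientation, the non-local character of Peridynamics mentioned above can be illustrated with the commonly cited bond-based equation of motion and its usual meshfree (nodal) discretization. This is a sketch under assumptions: the abstract does not say which peridynamic formulation the solver uses, and the symbols below (density \rho, displacement \mathbf{u}, pairwise force function \mathbf{f}, body force \mathbf{b}, neighborhood \mathcal{H} of a point within a finite horizon, nodal volume V_j) follow the standard literature convention rather than this document.

```latex
% Continuum form: the force at x is an integral over its neighborhood H_x,
% rather than the divergence of a stress tensor as in classical theory.
\rho(\mathbf{x})\,\ddot{\mathbf{u}}(\mathbf{x},t)
  = \int_{\mathcal{H}_{\mathbf{x}}}
      \mathbf{f}\bigl(\mathbf{u}(\mathbf{x}',t)-\mathbf{u}(\mathbf{x},t),\;
                      \mathbf{x}'-\mathbf{x}\bigr)\,dV_{\mathbf{x}'}
    + \mathbf{b}(\mathbf{x},t)

% Nodal discretization (assumed): the integral becomes a sum over the
% neighbors j of node i, each weighted by its volume V_j.
\rho_i\,\ddot{\mathbf{u}}_i
  = \sum_{j\in\mathcal{H}_i}
      \mathbf{f}\bigl(\mathbf{u}_j-\mathbf{u}_i,\;\mathbf{x}_j-\mathbf{x}_i\bigr)\,V_j
    + \mathbf{b}_i
```

In a discretization of this kind, each node's update is a sum over its own neighbor list and can be evaluated independently of the other nodes. That independence is what makes such models natural candidates for the OpenMP and OpenCL parallelization the abstract describes.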