AccFFT: A library for distributed-memory 3-D FFT on CPU and GPU architectures

AccFFT is a new massively parallel FFT library for CPU and GPU architectures. The fast Fourier transform (FFT) is a fundamental algorithm whose modern form was introduced by Cooley and Tukey, and many libraries have been built on this simple yet brilliant algorithm. In recent years, the growth of distributed parallel computing has created a need for a fast parallel FFT library that scales to thousands of processors. AccFFT is designed specifically to achieve maximum performance and scalability on both CPUs and GPUs. It uses a series of novel algorithms to reduce the communication costs inherent in distributed FFTs.
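
For readers unfamiliar with the Cooley-Tukey idea mentioned above, the following is a minimal single-process sketch of the radix-2 decimation-in-time variant, checked against a naive O(n^2) DFT. It is illustrative only and is not AccFFT's implementation or API; the function names `fft_radix2` and `dft_naive` are hypothetical, and the input length is assumed to be a power of two.

```python
import cmath


def fft_radix2(x):
    """Recursive radix-2 Cooley-Tukey FFT (illustrative sketch, not AccFFT code).

    Assumes len(x) is a power of two.
    """
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(x[0])]
    # Split into even- and odd-indexed samples and transform each half.
    even = fft_radix2(x[0::2])
    odd = fft_radix2(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        # Twiddle factor combines the two half-size transforms.
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def dft_naive(x):
    """Direct O(n^2) DFT, used only as a reference for comparison."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


if __name__ == "__main__":
    signal = [1, 2, 3, 4, 0, -1, -2, -3]
    fast = fft_radix2(signal)
    slow = dft_naive(signal)
    print(max(abs(a - b) for a, b in zip(fast, slow)))
```

The recursion halves the problem at each level, which gives the O(n log n) cost. A 3-D transform applies 1-D transforms of this kind along each axis; in a distributed setting the data along at least one axis is spread across processes, which is where the communication costs arise.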