The parallel PM N-body code PMFAST is cost-effective and memory-efficient. PMFAST is based on a two-level mesh gravity solver where the gravitational forces are separated into long and short range components. The decomposition scheme minimizes communication costs and allows tolerance for slow networks. The code approaches optimality in several dimensions. The force computations are local and exploit highly optimized vendor FFT libraries. It features minimal memory overhead, with the particle positions and velocities being the main cost. The code features support for distributed and shared memory parallelization through the use of MPI and OpenMP, respectively.
The current release version uses two grid levels on a slab decomposition, with periodic boundary conditions for cosmological applications. Open boundary conditions could be added with little computational overhead. Timing information and results from a recent cosmological production run of the code using a 3712^3 mesh with 6.4 x 10^9 particles are available.