Miao Guo, Yong Tao Pei, et al.
WCITS 2011
We show teraflop performance of the fully featured ab initio molecular dynamics code CPMD on an IBM pSeries 690 cluster. A mixed approach, combining coarse-grained distributed-memory parallelism via the MPI library with fine-grained shared-memory parallelism via OpenMP directives, is used to map the algorithms optimally onto the available hardware. The top performance achieved is ≈20% of peak performance, with an estimated parallel efficiency of ≈45% on 1024 processors for a system of 1000 atoms. The main limiting factor for parallel efficiency was found to be the latency of the interconnect. © 2005 Elsevier B.V. All rights reserved.
Vijay K. Naik, Sanjeev K. Setia, et al.
Journal of Parallel and Distributed Computing
Annina Riedhauser, Viacheslav Snigirev, et al.
CLEO 2023
Hannaneh Hajishirzi, Julia Hockenmaier, et al.
UAI 2011