Yicong Zhu, Changnian Han, et al.
Computer Physics Communications
QCDOC is a massively parallel supercomputer with tens of thousands of nodes distributed on a six-dimensional torus network. The 6D structure of the network provides the communication resources needed by many communication-intensive applications. In this paper, we present a parallel algorithm for the three-dimensional Fast Fourier Transform and its implementation on a 4096-node QCDOC prototype. Two techniques are used to increase its parallel performance: simultaneous multi-dimensional communication and overlapping of communication with computation. Benchmarking experiments suggest that 3D FFTs of size 128 × 128 × 128 scale well on such platforms up to 4096 nodes. Our performance results suggest stronger scalability on QCDOC than on the IBM BlueGene/L supercomputer. © 2007 Elsevier B.V. All rights reserved.
Ziji Zhang, Georgios Kementzidis, et al.
Computer Physics Communications
Paul Solomon, Brian A. Bryce, et al.
E3S 2013
Razvan A. Nistor, Glenn Martyna, et al.
Physical Review B - CMMP