Power and performance optimization at the system level
Valentina Salapura, Randy Bickford, et al.
CF 2005
The architecture of the BlueGene/L massively parallel supercomputer is described. Each computing node consists of a single compute ASIC plus 256 MB of external memory. The compute ASIC integrates two 700 MHz PowerPC 440 integer CPU cores, two 2.8 Gflops floating-point units, 4 MB of embedded DRAM used as cache, a memory controller for the external memory, six 1.4 Gbit/s bidirectional ports for a 3D torus network connection, three 2.8 Gbit/s bidirectional ports for connecting to a global tree network, and a Gigabit Ethernet port for I/O. A total of 65,536 such nodes are connected into a 3D torus with a geometry of 32×32×64. The total peak performance of the system is 360 Teraflops and the total amount of memory is 16 terabytes.
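The system totals quoted in the abstract follow from the per-node figures, and the six torus ports correspond to one link in each direction along each of the three axes. The following is a minimal sketch of that arithmetic, not code from the paper: the function and constant names are hypothetical, decimal units are assumed for flops, and binary units (1 TiB = 1024 × 1024 MiB) are assumed for memory. Under these assumptions the peak comes to about 367 Tflops, so the abstract's 360 Teraflops is presumably a rounded figure.

```python
# Per-node figures taken from the abstract; the names are illustrative only.
TORUS_DIMS = (32, 32, 64)   # 3D torus geometry
FPUS_PER_NODE = 2           # floating-point units per compute ASIC
GFLOPS_PER_FPU = 2.8        # peak Gflops per floating-point unit
MEM_PER_NODE_MB = 256       # external memory per node


def node_count(dims=TORUS_DIMS):
    """Number of nodes in a torus with the given dimensions."""
    x, y, z = dims
    return x * y * z


def peak_teraflops(dims=TORUS_DIMS):
    """Aggregate peak performance in Tflops (decimal units assumed)."""
    return node_count(dims) * FPUS_PER_NODE * GFLOPS_PER_FPU / 1000


def total_memory_tib(dims=TORUS_DIMS):
    """Aggregate external memory in TiB (binary units assumed)."""
    return node_count(dims) * MEM_PER_NODE_MB / (1024 * 1024)


def torus_neighbors(coord, dims=TORUS_DIMS):
    """Return the six nearest neighbors of a node, with wraparound.

    Each axis contributes two neighbors (-1 and +1), which matches the
    six torus ports per node described in the abstract.
    """
    neighbors = []
    for axis in range(3):
        for step in (-1, 1):
            n = list(coord)
            n[axis] = (n[axis] + step) % dims[axis]
            neighbors.append(tuple(n))
    return neighbors
```

Calling `torus_neighbors((0, 0, 0))` returns coordinates that include `(31, 0, 0)` and `(0, 0, 63)`, which shows the wraparound links that distinguish a torus from a plain mesh.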
Alan Gara, Matthias A. Blumrich, et al.
IBM J. Res. Dev
Mark E. Giampapa, Ralph Bellofatto, et al.
IBM J. Res. Dev
Pavlos Vranas, Gyan Bhanot, et al.
ACM/IEEE SC 2006