David Eppstein, Zvi Galil, et al.
Journal of the ACM
We give an optimal algorithm for broadcasting on SIMD hypercubes without the hardware support for indirect addressing; i.e., an indirect memory reference takes as many cycles as the number of different memory locations (offsets) accessed by all active processors. With communication restricted to one dimension at a time, our algorithm broadcasts m elements in m + n - 1 communication steps and 2m direct memory references in an n-cube. The well-known algorithm based on the spanning binomial tree needs mn communication steps and 2m direct memory references. A previous algorithm based on the n edge-disjoint spanning trees takes m + n - 1 communication steps but needs at least 2 mn direct memory references. We then give a reduction algorithm on SIMD hypercubes, which is optimal in terms of the communication times and memory references and nearly optimal (about a factor of n (n - 1) when m ≫ n) with respect to arithmetic time. We also generalize our algorithms to the case in which communication is restricted to k dimensions at a time, for any integer k in the range of 1 {slanted equal to or less-than} k ≤ n, with their optimality preserved. © 1991.
David Eppstein, Zvi Galil, et al.
Journal of the ACM
Robert Farrell, Rajarshi Das, et al.
AAAI-SS 2010
Annina Riedhauser, Viacheslav Snigirev, et al.
CLEO 2023
Hong Guan, Saif Masood, et al.
SoCC 2023