Ben Fei, Jinbai Liu
IEEE Transactions on Neural Networks
An Orthogonal Access Multiprocessing system allows a multiplicity of processors to access distinct rows or columns of a rectangular array of data elements concurrently. The resulting tightly coupled system is feasible with current technology and has been suggested for VLSI as a "reduced mesh." In this paper we introduce the architecture and concentrate on its application to a number of basic vector and numerical computations. We prove that the machine's performance is within a factor of 3 of that of any other system with the same number of processors. Matrix multiplication, LU decomposition, polynomial evaluation, and solutions to linear systems and partial differential equations all show a speedup of O(n) for an n-processor system. The flexibility in the choice of the number of PEs makes the architecture a strong competitor in the world of special-purpose parallel systems. © 1989.
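The row-or-column access pattern described in the abstract can be illustrated with a small sequential simulation. This is only a sketch under stated assumptions, not the paper's algorithm: the class `OrthogonalMemory`, the function `matvec`, and the square n x n layout with one processor per row or column are hypothetical names and choices made for illustration. Each list element computed in `matvec` stands in for the work one processor would do concurrently, so the n^2 multiplications take roughly n steps per processor, which is the kind of O(n) speedup the abstract refers to.

```python
from typing import List

Matrix = List[List[float]]


class OrthogonalMemory:
    """Hypothetical model of an n x n data array shared by n processors."""

    def __init__(self, data: Matrix) -> None:
        n = len(data)
        if any(len(row) != n for row in data):
            raise ValueError("expected a square array")
        self.data = data
        self.n = n

    def row(self, pe: int) -> List[float]:
        # Row mode: processor `pe` reads row `pe`; no two processors share a row.
        return list(self.data[pe])

    def column(self, pe: int) -> List[float]:
        # Column mode: processor `pe` reads column `pe`; no two share a column.
        return [self.data[r][pe] for r in range(self.n)]


def matvec(mem: OrthogonalMemory, x: List[float], mode: str = "row") -> List[float]:
    """Compute A @ x in row mode, or transpose(A) @ x in column mode."""
    if mode not in ("row", "column"):
        raise ValueError("mode must be 'row' or 'column'")
    access = mem.row if mode == "row" else mem.column
    # Each element below models one processor forming its own dot product.
    return [sum(a * b for a, b in zip(access(pe), x)) for pe in range(mem.n)]


mem = OrthogonalMemory([[1.0, 2.0], [3.0, 4.0]])
print(matvec(mem, [1.0, 1.0]))            # row mode: [3.0, 7.0]
print(matvec(mem, [1.0, 1.0], "column"))  # column mode: [4.0, 6.0]
```

Switching `mode` changes only which slice each simulated processor reads, which is the point of the orthogonal access scheme: the same stored array serves both A and its transpose without moving data.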
Rei Odaira, Jose G. Castanos, et al.
IISWC 2013
Joxan Jaffar
Journal of the ACM
Jihun Yun, Peng Zheng, et al.
ICML 2019