Efficient computation of convolutions on the IBM 3090 VF
Abstract
The discrete linear convolution is a fundamental operation in many signal processing applications. With the advent of new vector architecture machines, such as the IBM 3090 Vector Facility, the throughput of scalar convolution algorithms can be improved significantly. This paper presents a vector block approach to the realization of linear convolutions, based on the Fourier method. It is shown that there exists an optimal block size which maximizes the computational efficiency of the proposed algorithm. The new scheme offers reduced initialization time and improved runtime and memory cache performance relative to the existing algorithm. A preliminary benchmark, using a Fortran-based code, indicates that the effective throughput is increased by a factor of 1.5 to 3. Further improvement can be gained if the proposed scheme is integrated with the code of the fast Fourier transform available on this machine.
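To illustrate the general idea of block convolution by the Fourier method, the following is a minimal sketch in Python. It is not the paper's vectorized Fortran implementation: the overlap-add blocking, the radix-2 FFT, and the names `fft`, `direct_convolve`, `block_fft_convolve` and `block_size` are all assumptions introduced here for illustration.

```python
import cmath


def fft(a, invert=False):
    """Recursive radix-2 FFT (unnormalized); len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out


def direct_convolve(x, h):
    """Reference linear convolution computed directly from the definition."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def block_fft_convolve(x, h, block_size):
    """Linear convolution of x with h, processing x in blocks (overlap-add).

    block_size is the number of input samples per block (hypothetical
    parameter; the paper's optimal value depends on the machine).
    """
    m = len(h)
    # Smallest power-of-two transform length that holds one block's output.
    n_fft = 1
    while n_fft < block_size + m - 1:
        n_fft *= 2
    # Transform the filter once and reuse it for every block.
    H = fft([complex(v) for v in h] + [0j] * (n_fft - m))
    y = [0.0] * (len(x) + m - 1)
    for start in range(0, len(x), block_size):
        block = x[start:start + block_size]
        X = fft([complex(v) for v in block] + [0j] * (n_fft - len(block)))
        seg = fft([a * b for a, b in zip(X, H)], invert=True)
        # Add this block's output, overlapping the tail of the previous one.
        for i in range(len(block) + m - 1):
            y[start + i] += seg[i].real / n_fft
    return y
```

The block size sets a trade-off in this sketch: small blocks spend most of each transform on zero padding and overlap, while large blocks require longer transforms and a larger working set. That trade-off is one plausible reason an optimal block size exists, although the analysis in the paper itself may rest on different considerations.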