QUKU: A dual-layer reconfigurable architecture
Abstract
A new architecture, QUKU, is proposed for implementing stream-based algorithms on FPGAs, which combines the advantages of FPGA and Coarse Grain Reconfigurable Arrays (CGRAs). QUKU consists of a dynamically reconfigurable, coarse-grain Processing Element (PE) array with an associated softcore processor providing system support. At a coarse-grain, the PE array can be reconfigured on a cycle-by-cycle basis to change the PE functionality similarly to that in a conventional CGRA. At a fine-grain, the whole FPGA can be reconfigured statically to implement a completely different PE array that serves the target application in a better way. Advantages of the fine-grain reconfiguration include individually customized PEs, adaptable numeric format support and customizable interconnect network. A prototype CAD tool framework is also developed which facilitates programming the QUKU architecture. An example application consisting of two different image detectors is implemented to demonstrate the advantages of QUKU. QUKU provides up to 140 times speedup and 40 times improvement in area-time product compared to an implementation running on an FPGA-based softcore. The area-time product for QUKU is around 16% lower than that of a custom circuit based implementation on the same FPGA. The per-PE customization provides an area-time saving of approximately 31% compared to a homogeneous 4 × 4 array of PEs for the same application. The experimental results demonstrate that a dual layered reconfigurable architecture provides significant potential benefits in terms of flexibility, area and processing efficiency over existing reconfigurable computing architectures for DSP. © 2013 ACM.