Geoffrey Burr, Sidney Tsai, et al.
IEDM 2023
Hardware acceleration of deep learning using analog non-volatile memory (NVM) requires large arrays with high device yield, high-accuracy multiply-accumulate (MAC) operations, and routing frameworks for implementing arbitrary deep neural network (DNN) topologies. In this article, we present a 14-nm test chip for analog AI inference; it contains multiple arrays of phase-change memory (PCM) devices, each array capable of storing 512 × 512 unique DNN weights and executing massively parallel MAC operations at the location of the data. DNN excitations are transported across the chip using a duration representation on a parallel and reconfigurable 2-D mesh. To accurately transfer inference models to the chip, we describe a closed-loop tuning (CLT) algorithm that programs the four PCM conductances in each weight, achieving <3% average weight error. A row-wise programming scheme and associated circuitry allow us to execute CLT on up to 512 weights concurrently. We show that the test chip can achieve near-software-equivalent accuracy on two different DNNs. We demonstrate tile-to-tile transport with a fully on-chip two-layer network for MNIST (accuracy degradation 0.6%) and show resilience to error propagation across long sequences (up to 10,000 characters) with a recurrent long short-term memory (LSTM) network, implementing off-chip activation and vector-vector operations to generate the recurrent inputs used in the next on-chip MAC.
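The abstract states that each weight is encoded in four PCM conductances and programmed iteratively until the average weight error falls below ~3%. A minimal numerical sketch of such a closed-loop tuning loop is below. The specific encoding W = F·(Gp − Gm) + (gp − gm) with a significance factor F, the conductance limits, and the noise magnitudes are all assumptions for illustration; the paper only states that four conductances encode each weight and that programming is closed-loop and row-wise.

```python
import numpy as np

F = 3.0  # assumed significance factor between the two conductance pairs

def encode(G):
    # Read back the weight encoded by four conductances [Gp, Gm, gp, gm]
    # (assumed encoding: significant pair scaled by F, plus a fine pair).
    Gp, Gm, gp, gm = G
    return F * (Gp - Gm) + (gp - gm)

def closed_loop_tune(target, rng, max_iters=100, tol=0.03):
    """Iteratively program four conductances until the read-back weight
    is within ~tol (relative) of `target`; writes are modeled as noisy,
    which is why the loop must re-read and correct (closed loop)."""
    G = np.zeros(4)  # conductances normalized to [0, 1]
    for _ in range(max_iters):
        err = target - encode(G)
        if abs(err) <= tol * max(abs(target), 0.1):
            break  # within tolerance -> stop programming this weight
        # Coarse correction on the significant pair (sign picks Gp vs Gm);
        # the write lands with Gaussian "programming noise".
        delta = err / F + rng.normal(0.0, 0.01)
        if delta >= 0:
            G[0] = np.clip(G[0] + delta, 0.0, 1.0)
        else:
            G[1] = np.clip(G[1] - delta, 0.0, 1.0)
        # Fine trim on the less-significant pair, also noisy.
        fine = target - encode(G) + rng.normal(0.0, 0.005)
        if fine >= 0:
            G[2] = np.clip(G[2] + fine, 0.0, 1.0)
        else:
            G[3] = np.clip(G[3] - fine, 0.0, 1.0)
    return encode(G)

# Program one 512-weight row's worth of random targets, mimicking the
# row-wise parallelism described in the abstract (here done serially).
rng = np.random.default_rng(0)
targets = rng.uniform(-2.0, 2.0, 512)
programmed = np.array([closed_loop_tune(t, rng) for t in targets])
mean_err = np.mean(np.abs(programmed - targets))
```

The point of the sketch is the structure, not the numbers: because each write is stochastic, a single open-loop write cannot hit the target, whereas read-verify-correct iterations drive the residual error down to the noise floor.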
Daniele Ielmini, Stefano Ambrogio
Nanotechnology
Pritish Narayanan, Geoffrey W. Burr, et al.
IEEE JETCAS
Geoffrey W. Burr, Stefano Ambrogio, et al.
CSTIC 2019