Neuro-inspired computing: From resistive memory to optics
Charles Mackin, Pritish Narayanan, et al.
CLEO/Europe-EQEC 2019
A Lite Bidirectional Encoder Representations from Transformers (ALBERT) model is demonstrated on an analog inference chip fabricated at the 14 nm node with phase-change memory. The 7.1 million unique analog weights shared across 12 layers are mapped to a single chip and accurately programmed into the conductances of 28.3 million devices, for this first analog hardware demonstration of a meaningfully large Transformer model. The implemented model achieved near iso-accuracy on the General Language Understanding Evaluation (GLUE) benchmark of seven tasks, despite the presence of weight-programming errors, hardware imperfections, readout noise, and error propagation. The average hardware accuracy was only 1.8% below that of the floating-point reference, with several tasks at full iso-accuracy. Careful fine-tuning of model weights using hardware-aware techniques contributes an average hardware accuracy improvement of 4.4%. Accuracy loss due to conductance drift (measured to be roughly 5% over 30 days) was reduced to less than 1% with a recalibration-based “drift compensation” technique.
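The abstract does not spell out how the recalibration-based drift compensation works, so the following is only a minimal sketch of one common form of the idea, not the authors' method. It assumes a power-law drift model G(t) = G(t0) * (t / t0)^(-nu), a single global scale factor obtained from an all-ones calibration input, and purely positive conductances; the function names, array sizes, and drift exponents are all hypothetical.

```python
import numpy as np

def drift(g0, nu, t, t0=1.0):
    """Assumed power-law drift model: G(t) = G(t0) * (t / t0) ** (-nu)."""
    return g0 * (t / t0) ** (-nu)

def compensation_factor(g_ref, g_now):
    """Global scale factor from an all-ones calibration input (hypothetical helper).

    The summed array response right after programming is divided by the
    summed response measured later; multiplying later outputs by this
    ratio undoes the average conductance decay.
    """
    x_cal = np.ones(g_ref.shape[0])
    return float((x_cal @ g_ref).sum() / (x_cal @ g_now).sum())

def relative_error(y, y_ref):
    """Relative L2 error of an output vector against its reference."""
    return float(np.linalg.norm(y - y_ref) / np.linalg.norm(y_ref))

def run_demo(seed=0, days=30.0):
    rng = np.random.default_rng(seed)
    # Illustrative values only: a 64x32 array of positive conductances
    # with device-to-device variation in the drift exponent nu.
    g0 = rng.uniform(0.1, 1.0, size=(64, 32))
    nu = rng.normal(0.05, 0.01, size=g0.shape)
    g_t = drift(g0, nu, t=days * 86400.0)  # conductances after `days` days

    x = rng.uniform(0.0, 1.0, size=64)     # an arbitrary input vector
    y_ref = x @ g0                         # output right after programming
    y_raw = x @ g_t                        # output after drift, uncorrected
    alpha = compensation_factor(g0, g_t)
    y_comp = alpha * y_raw                 # output after rescaling

    return alpha, relative_error(y_raw, y_ref), relative_error(y_comp, y_ref)
```

Calling `run_demo()` returns the scale factor together with the uncompensated and compensated relative errors. Because a single factor only corrects the mean decay, the residual error in this toy model comes from device-to-device variation in `nu`.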
Geoffrey Burr, Pritish Narayanan, et al.
VLSI Technology 2023
Alessandro Fumarola, Pritish Narayanan, et al.
ICRC 2016
Stefano Ambrogio, M. Gallot, et al.
IEDM 2019