Divya Taneja, Jonathan Grenier, et al.
ECTC 2024
The need to repeatedly shuttle synaptic weight values between memory and processing units has been a key source of energy inefficiency in hardware implementations of artificial neural networks. Analog in-memory computing (AIMC) with spatially instantiated synaptic weights holds high promise to overcome this challenge, by performing matrix-vector multiplications directly within the network weights stored on a chip to execute an inference workload. In this talk, I will first present our latest multi-core AIMC chip in 14-nm complementary metal–oxide–semiconductor (CMOS) technology with backend-integrated phase-change memory (PCM). The fully integrated chip features 64 AIMC cores of 256x256 unit cells each, interconnected via an on-chip communication network. Experimental inference results on ResNet and LSTM networks will be presented, with all the computations associated with the weight layers and the activation functions implemented on-chip. Then, I will present our open-source toolkit (https://aihw-composer.draco.res.ibm.com/) to simulate inference and training of neural networks with AIMC. Finally, I will present our latest architectural solutions to increase the weight capacity of AIMC chips towards supporting large language models, as well as alternative solutions suited for low-power edge computing applications.
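The core operation described above, a matrix-vector multiplication performed in place on stored weights, can be illustrated with a minimal conceptual sketch. This is not the chip's actual behavior or the toolkit's API; it simply models one hypothetical 256x256 AIMC core whose PCM conductances add read noise to each multiply, which is the main non-ideality such simulators account for:

```python
import numpy as np

rng = np.random.default_rng(0)

def analog_mvm(weights, x, noise_std_frac=0.02):
    """Matrix-vector multiply with additive Gaussian read noise on the
    weights, a toy stand-in for one AIMC core's analog computation.
    noise_std_frac is an assumed noise level, not a measured PCM value."""
    noise_std = noise_std_frac * np.abs(weights).max()
    noisy_weights = weights + rng.normal(0.0, noise_std, size=weights.shape)
    # The physical crossbar computes y = W @ x in the analog domain;
    # here we emulate that with the noisy weight matrix.
    return noisy_weights @ x

# One 256x256 core holding synaptic weights, and an input activation vector.
W = 0.1 * rng.standard_normal((256, 256))
x = rng.standard_normal(256)

y_ideal = W @ x
y_analog = analog_mvm(W, x)
rel_error = np.linalg.norm(y_analog - y_ideal) / np.linalg.norm(y_ideal)
```

The sketch shows why inference with AIMC needs noise-aware simulation: the analog result deviates slightly from the ideal digital product, and toolkits like the one linked above let users quantify the impact of such deviations on network accuracy.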
Juan Miguel De Haro, Rubén Cano, et al.
IPDPS 2022
David Stutz, Nandhini Chandramoorthy, et al.
MLSys 2021
Stefano Ambrogio
MRS Spring Meeting 2022