Irem Boybat-Kara
IEDM 2023
Analog Non-Volatile Memory-based accelerators offer high-throughput and energy-efficient Multiply-Accumulate operations for the large Fully-Connected layers that dominate Transformer-based Large Language Models (LLMs). We describe recent chip demonstrations and architectural efforts, quantify the unique benefits of Fully- (rather than Partially-) Weight-Stationary systems, and discuss factors affecting the latency of token processing and generation.
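As a rough illustration of the fully-weight-stationary idea described in the abstract, the sketch below (not from the paper) models a fully-connected layer whose weights are programmed once into an analog crossbar tile and then reused for every token, so only activations move per Multiply-Accumulate pass. The class name AnalogTile and its methods are illustrative assumptions, not an actual API.

```python
# Minimal NumPy sketch of a fully-weight-stationary MAC tile (illustrative only).
import numpy as np

class AnalogTile:
    """Models one NVM crossbar holding a fully-connected layer's weights."""

    def __init__(self, weights: np.ndarray):
        # Weights are programmed once into the crossbar and stay stationary.
        self.weights = weights.copy()

    def mac(self, activations: np.ndarray) -> np.ndarray:
        # One multiply-accumulate pass: summing along the crossbar bit lines
        # corresponds to dot products of weight rows with the input vector.
        return self.weights @ activations

# Usage: a 4096x4096 fully-connected layer processing tokens one at a time.
rng = np.random.default_rng(0)
tile = AnalogTile(rng.standard_normal((4096, 4096)).astype(np.float32))

for _ in range(8):  # stream of tokens during generation
    x = rng.standard_normal(4096).astype(np.float32)
    y = tile.mac(x)  # no per-token weight fetch from off-chip memory
```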
Laura Bégon-Lours, Mattia Halter, et al.
MRS Spring Meeting 2023
Ying Zhou, Gi-Joon Nam, et al.
DAC 2023
Corey Liam Lammie, Julian Büchel, et al.
ISCAS 2025