Olivier Maher, N. Harnack, et al.
DRC 2023
Analog Non-Volatile Memory-based accelerators offer high-throughput and energy-efficient Multiply-Accumulate operations for the large Fully-Connected layers that dominate Transformer-based Large Language Models (LLMs). We describe recent chip-demo and architectural efforts, quantify the unique benefits of Fully- (rather than Partially-) Weight-Stationary systems, and discuss factors affecting latency of token-processing and generation.
Thomas Lesueur, David Danovitch, et al.
ECTC 2025
Tommaso Stecconi, Roberto Guido, et al.
Advanced Electronic Materials
Max Bloomfield, Amogh Wasti, et al.
ITherm 2025