R.M. Shelby, J. Hoffnagle, et al.
Optics Letters
Analog Non-Volatile Memory-based accelerators offer high-throughput and energy-efficient Multiply-Accumulate operations for the large Fully-Connected layers that dominate Transformer-based Large Language Models (LLMs). We describe recent chip-demo and architectural efforts, quantify the unique benefits of Fully- (rather than Partially-) Weight-Stationary systems, and discuss factors affecting the latency of token processing and generation.
B. Rajendran, M.H. Lee, et al.
VLSI Technology 2008
J. Amet, F.I. Baida, et al.
Photonics and Nanostructures - Fundamentals and Applications
M.-P. Bernal, G.W. Burr, et al.
CLEO 1998