Publication
IEEE Micro
Paper

ML-HW Co-Design of Noise-Robust TinyML Models and Always-On Analog Compute-in-Memory Edge Accelerator

View publication

Abstract

Always-on TinyML perception tasks in Internet of Things applications require very high energy efficiency. Analog compute-in-memory (CiM) using nonvolatile memory (NVM) promises high energy efficiency and self-contained on-chip model storage. However, analog CiM introduces new practical challenges, including conductance drift, read/write noise, fixed analog-to-digital (ADC) converter gain, etc. These must be addressed to achieve models that can be deployed on analog CiM with acceptable accuracy loss. This article describes AnalogNets: TinyML models for the popular always-on tasks of keyword spotting (KWS) and visual wake word (VWW). The model architectures are specifically designed for analog CiM, and we detail a comprehensive training methodology, to retain accuracy in the face of analog nonidealities, and low-precision data converters at inference time. We also describe AON-CiM, a programmable, minimal-area phase-change memory (PCM) analog CiM accelerator, with a layer-serial approach to remove the cost of complex interconnects associated with a fully pipelined design. We evaluate the AnalogNets on a calibrated simulator, as well as real hardware, and find that accuracy degradation is limited to 0.8%/1.2% after 24 h of PCM drift (8 bits) for KWS/VWW. AnalogNets running on the 14-nm AON-CiM accelerator demonstrate 8.55/26.55/56.67 and 4.34/12.64/25.2 TOPS/W for KWS and VWWs with 8-/6-/4-bit activations, respectively.