Efficient AI System Design with Cross-Layer Approximate Computing. Swagath Venkataramani, Xiao Sun, et al. 2020. Proceedings of the IEEE.
A 3.0 TFLOPS 0.62V Scalable Processor Core for High Compute Utilization AI Training and Inference. Jinwook Oh, Sae Kyu Lee, et al. 2020. VLSI Circuits 2020.
DyVEDeep: Dynamic Variable Effort Deep Neural Networks. Sanjay Ganapathy, Swagath Venkataramani, et al. 2020. ACM TECS.
Hybrid 8-bit floating point (HFP8) training and inference for deep neural networks. Xiao Sun, Jungwook Choi, et al. 2019. NeurIPS 2019.
Memory and Interconnect Optimizations for Peta-Scale Deep Learning Systems. Swagath Venkataramani, Vijayalakshmi Srinivasan, et al. 2019. HiPC 2019.
Performance-driven Programming of Multi-TFLOP Deep Learning Accelerators. Swagath Venkataramani, Jungwook Choi, et al. 2019. IISWC 2019.
DeepTools: Compiler and Execution Runtime Extensions for RaPiD AI Accelerator. Swagath Venkataramani, Jungwook Choi, et al. 2019. IEEE Micro.
Dynamic Spike Bundling for Energy-Efficient Spiking Neural Networks. Sarada Krithivasan, Sanchari Sen, et al. 2019. ISLPED 2019.
BiScaled-DNN: Quantizing long-tailed datastructures with two scale factors for deep neural networks. Shubham Jain, Swagath Venkataramani, et al. 2019. DAC 2019.
SparCE: Sparsity Aware General-Purpose Core Extensions to Accelerate Deep Neural Networks. Sanchari Sen, Shubham Jain, et al. 2019. IEEE TC.