Swagath Venkataramani

Title

Principal Research Scientist, AIU Architecture and Compilers

Publications

Efficient AI System Design with Cross-Layer Approximate Computing
- - Swagath Venkataramani
  - Xiao Sun
  - et al.
- 2020
- Proceedings of the IEEE
A 3.0 TFLOPS 0.62V Scalable Processor Core for High Compute Utilization AI Training and Inference
- - Jinwook Oh
  - Sae Kyu Lee
  - et al.
- 2020
- VLSI Circuits 2020
DyVEDeep: Dynamic Variable Effort Deep Neural Networks
- - Sanjay Ganapathy
  - Swagath Venkataramani
  - et al.
- 2020
- ACM TECS
Hybrid 8-bit floating point (HFP8) training and inference for deep neural networks
- - Xiao Sun
  - Jungwook Choi
  - et al.
- 2019
- NeurIPS 2019
Memory and Interconnect Optimizations for Peta-Scale Deep Learning Systems
- - Swagath Venkataramani
  - Vijayalakshmi Srinivasan
  - et al.
- 2019
- HiPC 2019
Performance-driven Programming of Multi-TFLOP Deep Learning Accelerators
- - Swagath Venkataramani
  - Jungwook Choi
  - et al.
- 2019
- IISWC 2019
DeepTools: Compiler and Execution Runtime Extensions for RaPiD AI Accelerator
- - Swagath Venkataramani
  - Jungwook Choi
  - et al.
- 2019
- IEEE Micro
Dynamic Spike Bundling for Energy-Efficient Spiking Neural Networks
- - Sarada Krithivasan
  - Sanchari Sen
  - et al.
- 2019
- ISLPED 2019
BiScaled-DNN: Quantizing long-tailed datastructures with two scale factors for deep neural networks
- - Shubham Jain
  - Swagath Venkataramani
  - et al.
- 2019
- DAC 2019
SparCE: Sparsity Aware General-Purpose Core Extensions to Accelerate Deep Neural Networks
- - Sanchari Sen
  - Shubham Jain
  - et al.
- 2019
- IEEE TC

Top collaborators

Alberto Mannari

Software Developer

Matthew Ziegler

Principal Research Scientist

Xiaodong Cui

Principal Research Scientist

Prasanth Chatarasi

Staff Research Scientist, AIU Accelerator Compilers and Architecture

Patents

Single Function To Perform Combined Matrix Multiplication And Bias Add Operations

Method To Map Convolutional Layers Of Deep Neural Network On A Plurality Of Processing Elements With SIMD Execution Units, Private Memories, And Connected As A 2D Systolic Processor Array

Hybrid Data-model Parallelism For Efficient Deep Learning

Multichannel Memory To Augment Local Memory

Low Precision Deep Neural Network Enabled By Compensation Instructions