Eliminating Redundancy: Ultra-compact Code Generation for Programmable Dataflow Accelerators. Prasanth Chatarasi, Alex Gatea, et al. 2026. CGO 2026.
Breaking the HBM Bit Cost Barrier: Domain-Specific ECC for AI Inference Infrastructure. Rui Xie, Asad Ul Haq, et al. 2025. IEEE Computer Architecture Letters.
MixTrain: accelerating DNN training via input mixing. Sarada Krithivasan, Sanchari Sen, et al. 2024. Frontiers in Artificial Intelligence.
A Software-Assisted Peak Current Regulation Scheme to Improve Power-Limited Inference Performance in a 5nm AI SoC. Monodeep Kar, Joel Silberman, et al. 2024. ISSCC 2024.
Power-Limited Inference Performance Optimization Using a Software-Assisted Peak Current Regulation Scheme in a 5-nm AI SoC. Monodeep Kar, Joel Silberman, et al. 2024. IEEE Journal of Solid-State Circuits.
DNNDaSher: A Compiler Framework for Dataflow Compatible End-to-End Acceleration on IBM AIU. Sanchari Sen, Shubham Jain, et al. 2024. IEEE Micro.
Approximate computing and the efficient machine learning expedition. Jörg Henkel, Hai Li, et al. 2022. ICCAD 2022.
OnSRAM: Efficient Inter-Node On-Chip Scratchpad Management in Deep Learning Accelerators. Subhankar Pal, Swagath Venkataramani, et al. 2022. Transactions on Embedded Computing Systems.
Accelerating Inference and Language Model Fusion of Recurrent Neural Network Transducers via End-to-End 4-bit Quantization. Andrea Fasoli, Chia-Yu Chen, et al. 2022. INTERSPEECH 2022.
27 Nov 2023, US11831467, Programmable Multicast Protocol For Ring-topology Based Artificial Intelligence Systems
06 Nov 2023, US11810340, System And Method For Consensus-based Representation And Error Checking For Neural Networks
11 May 2023, CNZL202010150294.1, Programmable Data Delivery To A System Of Shared Processing Elements With Shared Memory
09 Jan 2023, US11551054, System-aware Selective Quantization For Performance Optimized Distributed Deep Learning