Yunhua Fang, Rui Xie, et al. "Accelerating LLM Inference via Dynamic KV Cache Placement in Heterogeneous Memory System." IEEE Computer Architecture Letters, 2025.
Rui Xie, Asad Ul Haq, et al. "Breaking the HBM Bit Cost Barrier: Domain-Specific ECC for AI Inference Infrastructure." IEEE Computer Architecture Letters, 2025.