Weichao Mao, Haoran Qiu, et al.
NeurIPS 2023
We present the novel micro-architectural features, supported by an innovative and novel pre-silicon methodology in the design of POWER10. The resulting projected energy efficiency boost over POWER9 is 2.6x at core level (for SPECint) and up to 3x at socket level. In addition, a new feature supporting inline AI acceleration was added to the POWER ISA and incorporated into the POWER10 processor core design. The resulting boost in SIMD/AI socket performance is projected to be up to 10x for FP32 and 21x for INT8 models of ResNet-50 and BERT-Large. In this paper, we describe the novel methodology deployed and used not only to obtain these efficiency boosts for traditional workloads, but also to infuse AI/ML/HPC capability directly into the POWER10 core.
Weichao Mao, Haoran Qiu, et al.
NeurIPS 2023
Yue Zhu, Chen Wang, et al.
MASCOTS 2024
Haoran Qiu, Weichao Mao, et al.
USENIX ATC 2023
Apoorve Mohan, Matthew Sheard
NVIDIA GTC 2022