An integrated simdization framework using virtual vectors
- Peng Wu
- Alexandre E. Eichenberger
- et al.
- 2005
- ICS 2005
Design and development of high-performance computing systems.
I work as a Principal Research Staff Member in the Z Research group at the IBM T.J. Watson Research Center. My research interests focus on the interaction between compiler technology and micro-architecture design.
My most recent work focuses on accelerating Deep Neural Networks on CPUs as well as on custom AI hardware accelerators such as the IBM Telum dedicated on-chip accelerator for AI inference. I am the lead of the open-source ONNX-MLIR project, which aims to lower ONNX neural network models into optimized code using the MLIR infrastructure as well as the LLVM optimizing backend. I also represent IBM on the ONNX steering committee.
My prior work includes OpenMP, GPU acceleration, multi-threading, and SIMD code generation.