TEMPI: An Interposed MPI Library with a Canonical Representation of CUDA-aware Datatypes. Carl Pearson, Kun Wu, et al. 2021. HPDC 2021.