Large graph convolutional network training with GPU-oriented data communication architecture. Seung Won Min, Kun Wu, et al. 2021. VLDB.
TEMPI: An Interposed MPI Library with a Canonical Representation of CUDA-aware Datatypes. Carl Pearson, Kun Wu, et al. 2021. HPDC 2021.