Characterizing Training Performance and Energy for Foundation Models and Image Classifiers on Multi-Instance GPUs. Connor Espenshade, Rachel Peng, et al. 2024. EuroMLSys 2024.
Towards Pareto Optimal Throughput in Small Language Model Serving. Pol G. Recasens, Yue Zhu, et al. 2024. EuroMLSys 2024.