Slow and stale gradients can win the race: Error-Runtime trade-offs in distributed SGD. Sanghamitra Dutta, Gauri Joshi, et al. 2018. AISTATS 2018.