Differentially Private Stochastic Coordinate Descent
Abstract
Stochastic coordinate descent (SCD) is considered an appealing alternative to the classical stochastic gradient descent (SGD) algorithm: SCD converges faster in certain applications and does not require any hyperparameter tuning. Yet, SCD poses a fundamental privacy problem. Whereas updates in SGD operate on a single model vector, making it possible to add controlled noise and hide critical information about individuals, SCD crucially relies on keeping auxiliary information in memory during training, which constitutes a major privacy leak. In this paper we ask whether we can make SCD private without sacrificing its benefits, and we show that the answer is positive. Driven by the insight that, under independent noise addition, the consistency of the auxiliary information holds in expectation, we present DP-SCD, the first differentially private stochastic coordinate descent algorithm. We analyze our new method theoretically and argue that decoupling and parallelizing coordinate updates is essential for its utility. On the empirical side, we demonstrate competitive performance against the popular differentially private SGD alternative (DP-SGD) while requiring significantly less hyperparameter tuning.
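To make the central insight concrete, the following is a minimal numerical sketch, not the paper's DP-SCD algorithm: it assumes an SDCA-style auxiliary vector v = A @ alpha and independent zero-mean Gaussian noise added to both the model coordinates and the auxiliary vector; all names (A, alpha, v, sigma, n_trials) are illustrative assumptions.

```python
# Illustrative sketch (assumed setup, not the paper's method): with independent
# zero-mean noise on the coordinate updates and on the auxiliary vector v = A @ alpha,
# the consistency v == A @ alpha breaks per run but holds in expectation.
import numpy as np

rng = np.random.default_rng(0)
n, d = 20, 5
A = rng.normal(size=(n, d))          # data matrix (assumed)
sigma = 0.5                          # noise scale standing in for the DP noise
n_trials = 20_000

v_runs, Aalpha_runs = [], []
for _ in range(n_trials):
    alpha = np.zeros(d)
    v = np.zeros(n)                  # auxiliary vector, ideally v = A @ alpha
    for j in range(d):               # one pass of coordinate updates
        delta = 1.0                  # some fixed noiseless coordinate step
        alpha[j] += delta + sigma * rng.normal()           # noisy model update
        v += delta * A[:, j] + sigma * rng.normal(size=n)  # noisy auxiliary update
    v_runs.append(v)
    Aalpha_runs.append(A @ alpha)

v_mean = np.mean(v_runs, axis=0)
Aalpha_mean = np.mean(Aalpha_runs, axis=0)
print("per-run gap:     ", np.linalg.norm(v_runs[0] - Aalpha_runs[0]))  # clearly nonzero
print("gap of the means:", np.linalg.norm(v_mean - Aalpha_mean))        # close to zero
```

In this toy setting the noisy auxiliary vector drifts away from A @ alpha within any single run, yet the averages over many runs agree, which is the sense in which consistency "holds in expectation" as stated above.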