Guaranteed Convergence of Training Convolutional Neural Networks via Accelerated Gradient Descent
Abstract
In this paper, we study the regression problem of training a one-hidden-layer non-overlapping convolutional neural network (ConvNN) with the rectified linear unit (ReLU) activation function. Given a set of training data consisting of inputs (feature vectors) and outputs (labels), the outputs are assumed to be generated from a ConvNN with unknown weights, and our goal is to recover the ground-truth weights by solving a non-convex optimization problem whose objective function is the empirical loss. We prove that if the inputs follow a Gaussian distribution, then this optimization problem can be solved by the accelerated gradient descent (AGD) algorithm with a well-designed initial point and sufficiently many samples, and that the iterates of the AGD algorithm converge linearly to the ground-truth weights.
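For concreteness, the following is a minimal sketch of the setting described above: labels generated by a planted one-hidden-layer non-overlapping ConvNN with a shared ReLU filter applied to non-overlapping patches of Gaussian inputs, and a Nesterov-style accelerated gradient descent loop on the empirical squared loss. The patch size, step size, momentum parameter, and random initialization here are illustrative assumptions, not the paper's prescribed initialization or constants.

import numpy as np

def forward(w, X, k):
    # Average of ReLU responses of the shared filter w over k non-overlapping patches.
    n, d = X.shape
    patches = X.reshape(n, k, d // k)
    return np.maximum(patches @ w, 0.0).mean(axis=1)

def loss_and_grad(w, X, y, k):
    # Empirical squared loss and its (sub)gradient with respect to the shared filter w.
    n, d = X.shape
    patches = X.reshape(n, k, d // k)
    pre = patches @ w                              # (n, k) pre-activations
    pred = np.maximum(pre, 0.0).mean(axis=1)
    resid = pred - y
    active = (pre > 0).astype(float)               # ReLU subgradient indicator
    grad = np.einsum('i,ij,ijp->p', resid, active, patches) / (n * k)
    return 0.5 * np.mean(resid ** 2), grad

def agd(X, y, k, eta=0.1, beta=0.9, iters=500, seed=0):
    # Nesterov-style accelerated gradient descent; the random initial point below is a
    # placeholder, whereas the paper relies on a carefully designed initialization.
    rng = np.random.default_rng(seed)
    p = X.shape[1] // k
    w = rng.normal(scale=0.1, size=p)
    w_prev = w.copy()
    for _ in range(iters):
        v = w + beta * (w - w_prev)                # momentum extrapolation
        _, g = loss_and_grad(v, X, y, k)
        w_prev, w = w, v - eta * g                 # gradient step at the extrapolated point
    return w

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    n, k, p = 2000, 4, 8
    w_star = rng.normal(size=p)                    # unknown ground-truth filter
    X = rng.normal(size=(n, k * p))                # Gaussian inputs, as assumed above
    y = forward(w_star, X, k)                      # labels from the planted ConvNN
    w_hat = agd(X, y, k)
    print("relative error:", np.linalg.norm(w_hat - w_star) / np.linalg.norm(w_star))

The printed relative error only illustrates the recovery objective; the linear convergence guarantee in the paper is established under its specific initialization and sample-size conditions, not for this generic setup.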