Abstract
This paper investigates how to accelerate the convergence of distributed optimization algorithms on nonconvex problems. We propose a distributed primal-dual stochastic gradient descent (SGD) algorithm equipped with the Powerball method to accelerate convergence. We show that the proposed algorithm achieves a linear speedup in the convergence rate for general smooth (possibly nonconvex) cost functions. We demonstrate the efficiency of the proposed algorithm through numerical experiments in which we train two-layer fully connected neural networks and convolutional neural networks on the MNIST dataset, comparing against state-of-the-art distributed SGD algorithms and centralized SGD algorithms.
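To illustrate the core idea, the Powerball method transforms each gradient coordinate as sign(g)|g|^γ with γ in (0, 1) before it enters the update. The following is a minimal single-worker sketch of an SGD step with this transform; the function names, the step size, and the choice γ = 0.6 are illustrative assumptions, and the distributed primal-dual coordination between workers described in the paper is omitted.

```python
import numpy as np

def powerball(grad, gamma=0.6):
    """Element-wise Powerball transform: sign(g) * |g|**gamma.
    gamma in (0, 1); gamma = 1 recovers the plain gradient."""
    return np.sign(grad) * np.abs(grad) ** gamma

def sgd_powerball_step(params, grad, lr=0.05, gamma=0.6):
    """One SGD step where the stochastic gradient is passed through
    the Powerball transform before the parameter update."""
    return params - lr * powerball(grad, gamma)

# Toy usage: minimize f(x) = 0.5 * ||x||^2, whose gradient is x,
# using noisy (stochastic) gradient evaluations.
x = np.array([2.0, -3.0, 0.5])
for _ in range(100):
    noisy_grad = x + 0.01 * np.random.randn(*x.shape)
    x = sgd_powerball_step(x, noisy_grad)
print(x)  # should be close to the minimizer at the origin
```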
Publication
In 2021 IEEE Symposium Series on Computational Intelligence