KAUST Research Workshop on Optimization and Big Data
Stochastic gradient descent (SGD) methods have recently found wide application in large-scale data analysis, especially in machine learning. These methods are particularly attractive for processing online streaming data because they scan through the dataset only once yet still generate solutions of acceptable accuracy. However, classical SGD methods are known to be ineffective for processing streaming data distributed over multi-agent network systems (e.g., sensor and social networks), mainly because of the high communication costs they incur.
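The single-pass property mentioned above can be illustrated with a minimal sketch (not the authors' algorithm): each streamed sample is used for exactly one gradient step and then discarded. The function name `streaming_sgd` and the least-squares loss are illustrative choices, not taken from the talk.

```python
def streaming_sgd(stream, dim, lr=0.01):
    """Single-pass SGD sketch: each (x, y) sample is seen exactly once.

    Uses the squared loss 0.5 * (w.x - y)^2 purely for illustration.
    """
    w = [0.0] * dim
    for x, y in stream:
        pred = sum(wi * xi for wi, xi in zip(w, x))
        # stochastic gradient of 0.5 * (pred - y)^2 with respect to w
        grad = [(pred - y) * xi for xi in x]
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w

# Synthetic noise-free stream with y = 2 * x, so w should approach 2.0.
stream = [([float(i % 10)], 2.0 * (i % 10)) for i in range(1000)]
w = streaming_sgd(stream, dim=1, lr=0.01)
```

Because each sample is processed once and forgotten, memory usage is constant in the stream length, which is what makes SGD attractive for online data.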
In this talk, we present a new class of SGD methods, referred to as stochastic decentralized communication sliding methods, which can significantly reduce these communication costs for decentralized stochastic optimization and machine learning. We show that these methods can skip inter-node communications while performing SGD iterations. As a result, they require substantially fewer communication rounds than existing decentralized SGD methods, while the total number of required stochastic (sub)gradient computations remains comparable to the optimal bounds achieved by classical centralized SGD-type methods.
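The idea of skipping inter-node communications can be sketched schematically: each node performs several local SGD steps between communication rounds, then averages its iterate with its neighbors. This is only an assumed illustration of the communication-skipping pattern, not the authors' communication sliding algorithm; `sample_grad`, `local_steps`, and the gossip-averaging step are all hypothetical choices made for this sketch.

```python
def decentralized_sgd_sliding(n_nodes, neighbors, sample_grad, dim,
                              rounds=50, local_steps=5, lr=0.05):
    """Schematic sketch of decentralized SGD with skipped communications.

    Each node takes `local_steps` stochastic gradient steps with no
    communication, then one gossip-averaging round with its neighbors.
    `sample_grad(i, w)` returns a stochastic (sub)gradient at w for node i.
    """
    w = [[0.0] * dim for _ in range(n_nodes)]
    for _ in range(rounds):                 # one communication per round
        for i in range(n_nodes):
            for _ in range(local_steps):    # communication skipped here
                g = sample_grad(i, w[i])
                w[i] = [wi - lr * gi for wi, gi in zip(w[i], g)]
        # gossip averaging with neighbors: the single communication round
        new_w = []
        for i in range(n_nodes):
            group = [w[i]] + [w[j] for j in neighbors[i]]
            new_w.append([sum(col) / len(group) for col in zip(*group)])
        w = new_w
    return w
```

With `local_steps` gradient steps per communication round, the number of communication rounds is reduced by that factor relative to one-step-per-round decentralized SGD, while the total gradient-computation count is unchanged, which is the trade-off the talk's complexity bounds make precise.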
This talk is based on joint work with Soomin Lee and Yi Zhou.