KAUST Research Workshop on Optimization and Big Data
Georgia Institute of Technology
Guanghui (George) Lan has been the A. Russell Chandler III Early Career Professor and Associate Professor in the H. Milton Stewart School of Industrial and Systems Engineering at the Georgia Institute of Technology since January 2016. Before that, he served as a faculty member in the Department of Industrial and Systems Engineering at the University of Florida from 2009 to 2015, after earning his Ph.D. degree from the Georgia Institute of Technology in August 2009. His main research interests lie in optimization and machine learning. His research has been supported by the National Science Foundation and the Office of Naval Research. His academic honors include first place in the INFORMS Computing Society Student Paper Competition (2008), second place in the INFORMS George Nicholson Paper Competition (2008), finalist for the Mathematical Optimization Society Tucker Prize (2012), first place in the INFORMS Junior Faculty Interest Group (JFIG) Paper Competition (2012), and the National Science Foundation CAREER Award (2013). Dr. Lan serves as an associate editor for several leading optimization journals, including Mathematical Programming, SIAM Journal on Optimization, and Computational Optimization and Applications.
Stochastic gradient descent (SGD) methods have recently found wide application in large-scale data analysis, especially in machine learning. These methods are attractive for processing online streaming data because they scan through the dataset only once yet still generate solutions with acceptable accuracy. However, classical SGD methods are known to be ineffective for processing streaming data distributed over multi-agent network systems (e.g., sensor and social networks), mainly because of the high communication costs they incur.
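As a rough illustration of the single-pass behavior described above, the following sketch runs SGD over a data stream, touching each sample exactly once. It is not the algorithm discussed in the talk; the least-squares loss, step size, and synthetic data generator are assumptions chosen only to make the example self-contained.

```python
# Illustrative single-pass SGD on a data stream (least-squares loss).
# The loss, step size, and data generator below are assumptions for illustration.
import numpy as np

def sgd_single_pass(stream, dim, step=0.01):
    """One pass over streaming (x, y) pairs; each sample is used exactly once."""
    w = np.zeros(dim)
    for x, y in stream:
        grad = (x @ w - y) * x      # stochastic gradient of 0.5 * (x'w - y)^2
        w -= step * grad            # SGD update
    return w

# Example usage with a synthetic stream of 10,000 samples.
rng = np.random.default_rng(0)
w_true = rng.normal(size=5)
stream = ((x, x @ w_true + 0.1 * rng.normal())
          for x in (rng.normal(size=5) for _ in range(10_000)))
w_hat = sgd_single_pass(stream, dim=5)
```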
In this talk, we present a new class of SGD methods, referred to as stochastic decentralized communication sliding methods, which can significantly reduce the aforementioned communication costs for decentralized stochastic optimization and machine learning. We show that these methods can skip inter-node communication while performing SGD iterations. As a result, they require substantially fewer communication rounds than existing decentralized SGD methods, while the total number of required stochastic (sub)gradient computations remains comparable to the optimal bounds achieved by classical centralized SGD-type methods.
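To convey the flavor of skipping inter-node communication, the sketch below has each agent perform several local SGD steps between neighbor-averaging (gossip) rounds. This is a generic local-SGD-with-gossip illustration rather than the stochastic decentralized communication sliding method itself; the mixing matrix W, the number of local steps, and the step size are assumptions introduced for the example.

```python
# A minimal sketch of skipping communication in decentralized SGD:
# each agent takes several local stochastic gradient steps, then one
# gossip-averaging round with its neighbors. Not the authors' algorithm;
# W, local_steps, and step are illustrative assumptions.
import numpy as np

def decentralized_local_sgd(grads, W, dim, rounds=50, local_steps=10, step=0.01):
    """grads[i](w) returns a stochastic gradient of agent i's local objective;
    W is a doubly stochastic mixing matrix encoding the network topology."""
    n = len(grads)
    X = np.zeros((n, dim))                    # one iterate per agent
    for _ in range(rounds):                   # communication rounds
        for i in range(n):
            for _ in range(local_steps):      # local SGD steps, no communication
                X[i] -= step * grads[i](X[i])
        X = W @ X                             # one round of neighbor averaging
    return X.mean(axis=0)

# Example usage: three agents on a ring, each with a noisy quadratic objective.
rng = np.random.default_rng(1)
targets = [rng.normal(size=4) for _ in range(3)]
grads = [lambda w, t=t: (w - t) + 0.01 * rng.normal(size=4) for t in targets]
W = np.array([[0.5, 0.25, 0.25],
              [0.25, 0.5, 0.25],
              [0.25, 0.25, 0.5]])
w_avg = decentralized_local_sgd(grads, W, dim=4)
```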
This talk is based on a joint work with Soomin Lee and Yi Zhou.