KAUST Research Workshop on Optimization and Big Data
We discuss recent directions in optimization algorithms used for training machine learning systems, such as generalized linear models (regression, classification) and deep learning. For distributed optimization across many machines, as well as for integrated compute devices with widely varying compute and memory capacities (such as GPUs paired with regular compute nodes), we present ideas from convex optimization that greatly accelerate training. In particular, we employ importance sampling methods and primal-dual gap techniques.
Joint work with Celestine Dünner, Thomas Parnell, Anant Raj and Sebastian Stich.