KAUST Research Workshop on Optimization and Big Data
Katya Scheinberg is the Harvey E. Wagner Endowed Chair Professor in the Industrial and Systems Engineering Department at Lehigh University. She attended Moscow University for her undergraduate studies in applied mathematics and then moved to New York, where she received her PhD in operations research from Columbia University. After receiving her doctorate, she worked at the IBM T.J. Watson Research Center as a research staff member for over a decade before joining Lehigh in 2010. Katya's main research areas concern developing practical algorithms, and their theoretical analysis, for various problems in continuous optimization, such as convex optimization, derivative-free optimization, machine learning, and quadratic programming. Since 2000 she has focused on large-scale optimization methods for big data and machine learning applications. In 2015, jointly with Andy Conn and Luis Vicente, she received the Lagrange Prize, awarded jointly by SIAM and MOS. Katya is the editor-in-chief of the SIAM-MOS Optimization book series and an associate editor of Mathematical Programming and SIAM Journal on Optimization.
The predictive quality of most machine learning models is measured by the expected prediction error or the so-called Area Under the Curve (AUC). However, these functions are not used directly in empirical loss minimization, because their empirical approximations are nonconvex and discontinuous and, more importantly, have zero derivative almost everywhere. Instead, surrogate loss functions are used, such as the logistic loss. In this work, we show that in the case of linear predictors, and under the assumption that the data is normally distributed, the expected error and the expected AUC are not only smooth but have well-defined derivatives, which depend on the first and second moments of the distribution. We show that these derivatives can be approximated and used in empirical risk minimization, thus yielding gradient-based optimization methods for direct optimization of the prediction error and the AUC. Moreover, the proposed algorithm has no dependence on the size of the dataset, unlike logistic regression and all other well-known empirical risk minimization techniques.
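To illustrate the smoothness claim (this is only a sketch of the underlying idea, not the authors' algorithm), consider the simplest one-dimensional case with hypothetical class-conditional Gaussians: if each class's scores under a linear predictor are normally distributed, then the expected error and the expected AUC have closed forms in terms of the normal CDF, which is smooth in the predictor's parameters. All means, variances, and predictor weights below are illustrative assumptions:

```python
import math
import random

def phi(z):
    # Standard normal CDF via the error function
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Hypothetical 1-D setup: class 0 ~ N(mu0, s0^2), class 1 ~ N(mu1, s1^2),
# linear predictor score(x) = w*x + b, predict class 1 if score > 0.
mu0, s0 = -1.0, 1.0
mu1, s1 = 2.0, 1.5
w, b = 1.0, -0.5

# Expected error under equal priors: misclassify class 0 when score > 0,
# class 1 when score <= 0.  Each term is a smooth function of (w, b),
# depending only on the first and second moments of the data.
err0 = 1.0 - phi(-(w * mu0 + b) / (abs(w) * s0))
err1 = phi(-(w * mu1 + b) / (abs(w) * s1))
expected_error = 0.5 * (err0 + err1)

# Expected AUC for Gaussian classes (w > 0): P(score1 > score0)
# = Phi( w*(mu1 - mu0) / sqrt(w^2 * (s0^2 + s1^2)) ), again smooth in w.
expected_auc = phi(w * (mu1 - mu0) / math.sqrt(w * w * (s0 * s0 + s1 * s1)))

# Monte Carlo check of the closed forms against empirical counts,
# which are discontinuous step functions of (w, b) with zero derivative a.e.
random.seed(0)
n = 200_000
x0 = [random.gauss(mu0, s0) for _ in range(n)]
x1 = [random.gauss(mu1, s1) for _ in range(n)]
mc_error = 0.5 * (sum(w * x + b > 0 for x in x0) / n
                  + sum(w * x + b <= 0 for x in x1) / n)

print(f"closed-form error {expected_error:.4f}, Monte Carlo {mc_error:.4f}")
print(f"closed-form AUC   {expected_auc:.4f}")
```

Because the closed forms are smooth, one can differentiate them with respect to the predictor parameters and run gradient descent directly on the expected error or AUC, whereas the empirical counts admit no useful gradient.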