KAUST Research Workshop on Optimization and Big Data
The predictive quality of most machine learning models is measured by expected prediction error or the so-called Area Under the Curve (AUC). However, these functions are not used in empirical loss minimization, because their empirical approximations are nonconvex and discontinuous and, more importantly, have zero derivative almost everywhere. Instead, other loss functions are used, such as the logistic loss. In this work, we show that for linear predictors, and under the assumption that the data is normally distributed, the expected error and the expected AUC are not only smooth but also have well-defined derivatives, which depend on the first and second moments of the distribution. We show that these derivatives can be approximated and used in empirical risk minimization, and we thus propose a gradient-based optimization method for direct optimization of prediction error and AUC. Moreover, the proposed algorithm has no dependence on the size of the dataset, unlike logistic regression and all other well-known empirical risk minimization techniques.
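To make the dependence on the first and second moments concrete, here is a minimal sketch and not the authors' algorithm. It assumes a binary problem in which each class is Gaussian, a linear predictor sign(w·x + b), and independent positive and negative samples for the AUC. Under those assumptions w·x is itself Gaussian, so both quantities reduce to the standard normal CDF evaluated at ratios of class means and covariances. All names (`expected_error`, `expected_auc`, `mu_pos`, `cov_pos`, `p_pos`, and so on) are hypothetical and chosen for illustration.

```python
import math
import numpy as np


def std_normal_cdf(z):
    """Standard normal CDF, written with math.erf to avoid extra dependencies."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_error(w, b, mu_pos, cov_pos, mu_neg, cov_neg, p_pos):
    """Expected 0-1 error of sign(w.x + b) when each class is Gaussian.

    Assumption: x | y=+1 ~ N(mu_pos, cov_pos) and x | y=-1 ~ N(mu_neg, cov_neg),
    with p_pos the prior probability of the positive class.
    """
    s_pos = math.sqrt(w @ cov_pos @ w)  # std. dev. of w.x for positives
    s_neg = math.sqrt(w @ cov_neg @ w)  # std. dev. of w.x for negatives
    miss_pos = std_normal_cdf(-(w @ mu_pos + b) / s_pos)  # P(w.x + b <= 0 | y=+1)
    miss_neg = std_normal_cdf((w @ mu_neg + b) / s_neg)   # P(w.x + b >= 0 | y=-1)
    return p_pos * miss_pos + (1.0 - p_pos) * miss_neg


def expected_auc(w, mu_pos, cov_pos, mu_neg, cov_neg):
    """P(w.x_pos > w.x_neg) for independent Gaussian draws from the two classes."""
    s = math.sqrt(w @ (cov_pos + cov_neg) @ w)
    return std_normal_cdf(w @ (mu_pos - mu_neg) / s)
```

Both functions are smooth in `w` wherever the variances are positive, and they touch the data only through class means and covariances. In practice these would be estimated once from a sample, after which each evaluation costs the same regardless of how many points produced the estimates.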