Nonparametric Regression, Confidence Regions and Regularization

In this paper we offer a unified approach to the problem of nonparametric regression on the unit interval. It is based on a universal, honest and non-asymptotic confidence region which is defined by a set of linear inequalities involving the values of the functions at the design points. Interest will typically centre on the simplest functions in the confidence region, where simplicity can be defined in terms of shape (number of local extreme values, intervals of convexity/concavity), smoothness (bounds on derivatives), or a combination of both. Once some form of regularization has been decided upon, the confidence region can be used to provide honest non-asymptotic confidence bounds which are less informative but conceptually much simpler. Although the procedure makes no attempt to minimize any loss function such as the MISE, the resulting estimates attain optimal rates of convergence in the supremum norm for both shape and smoothness regularization. We show that rates of convergence can be misleading even for samples of size n = 10^6 and propose a different form of asymptotics which allows model complexity to increase with sample size.
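The abstract does not spell out the linear inequalities; as a hedged illustration only, assuming a multiresolution-type construction of the kind associated with this line of work, with observations y_i = f(t_i) + epsilon_i at design points t_1, ..., t_n, noise level sigma, a family of intervals I_n and a tuning constant tau, the confidence region could take the form

\[
  \mathcal{A}_n \;=\; \Big\{ g \;:\; \max_{I \in \mathcal{I}_n} \frac{1}{\sqrt{|I|}} \Big| \sum_{t_i \in I} \big( y_i - g(t_i) \big) \Big| \;\le\; \sigma \sqrt{\tau \log n} \Big\}.
\]

In a region of this kind each constraint is linear in the values g(t_1), ..., g(t_n), which is what makes regularization over the region (for example, minimizing the number of local extreme values subject to membership) a tractable optimization problem; the exact family of intervals and threshold are assumptions here, not taken from the abstract.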
