Document

Local Minimax Learning of Approximately Polynomial Functions

File

DagSemProc.06201.3.pdf
• Filesize: 215 kB
• 12 pages

Cite As

Lee Jones and Konstantin Rybnikov. Local Minimax Learning of Approximately Polynomial Functions. In Combinatorial and Algorithmic Foundations of Pattern and Association Discovery. Dagstuhl Seminar Proceedings, Volume 6201, pp. 1-12, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2007)
https://doi.org/10.4230/DagSemProc.06201.3

Abstract

Suppose we have a number of noisy measurements of an unknown real-valued function \$f\$ near point of interest \$mathbf{x}_0 in mathbb{R}^d\$. Suppose also that nothing can be assumed about the noise distribution, except for zero mean and bounded covariance matrix. We want to estimate \$f\$ at \$mathbf{x=x}_0\$ using a general linear parametric family \$f(mathbf{x};mathbf{a}) = a_0 h_0 (mathbf{x}) ++ a_q h_q (mathbf{x})\$, where \$mathbf{a} in mathbb{R}^q\$ and \$h_i\$'s are bounded functions on a neighborhood \$B\$ of \$mathbf{x}_0\$ which contains all points of measurement. Typically, \$B\$ is a Euclidean ball or cube in \$mathbb{R}^d\$ (more generally, a ball in an \$l_p\$-norm). In the case when the \$h_i\$'s are polynomial functions in \$x_1,ldots,x_d\$ the model is called locally-polynomial. In particular, if the \$h_i\$'s form a basis of the linear space of polynomials of degree at most two, the model is called locally-quadratic (if the degree is at most three, the model is locally-cubic, etc.). Often, there is information, which is called context, about the function \$f\$ (restricted to \$B\$ ) available, such as that it takes values in a known interval, or that it satisfies a Lipschitz condition. The theory of local minimax estimation with context for locally-polynomial models and approximately locally polynomial models has been recently initiated by Jones. In the case of local linearity and a bound on the change of \$f\$ on \$B\$, where \$B\$ is a ball, the solution for squared error loss is in the form of ridge regression, where the ridge parameter is identified; hence, minimax justification for ridge regression is given together with explicit best error bounds. The analysis of polynomial models of degree above 1 leads to interesting and difficult questions in real algebraic geometry and non-linear optimization. We show that in the case when \$f\$ is a probability function, the optimal (in the minimax sense) estimator is effectively computable (with any given precision), thanks to Tarski's elimination principle.
Keywords
• Local learning
• statistical learning
• estimator
• minimax
• convex optimization
• quantifier elimination
• semialgebraic
• ridge regression
• polynomial

Metrics

• Access Statistics
• Total Accesses (updated on a weekly basis)
0