Optimal design for smooth supersaturated models

doi:10.1016/j.jspi.2013.11.014

Journal of Statistical Planning and Inference

Volume 154, November 2014, Pages 3-11

https://doi.org/10.1016/j.jspi.2013.11.014

Abstract

Smooth supersaturated models are interpolation models in which the underlying model size, and typically the degree, is higher than would normally be used in statistics, but where the extra degrees of freedom are used to make the model smooth using a standard second derivative measure of smoothness. Here, the solution is derived from a closed-form quadratic programme, leading to tractable matrix representations. This representation aids considerably in the choice of optimal knots in the interpolation case and in the optimal design when the SSM is used as a way of obtaining kernels, but where the statistical problem is set up separately. Some examples are given in one and two dimensions.

Introduction

The basic idea of smooth supersaturated models (SSM), on which this paper is founded, appears in Bates et al. (in press) and follows a few years of development (an arXiv version has been available since 2009), particularly in the context of computer experiments. In the present paper a theory of optimal experimental design for SSM is developed. Insofar as a high-order SSM can be considered an approximation to a multidimensional spline, a solution to the optimal design problem for SSM gives an approximate solution to optimal design for splines, which in high dimensions has not been much studied; but see Studden and VanArman (1969) and Dette et al. (2011) for some work in the area.

As with splines there is the problem of the choice of knots. We shall explain how optimal design and optimal choice of knots arise as two different problems and suggest solutions to each.

Let $x = (x_{1}, \dots, x_{k})$ be a general point in $R^{k}$. A monomial is defined by a non-negative integer vector $α = (α_{1}, \dots, α_{k})$: $x^{α} = x_{1}^{α_{1}} \cdots x_{k}^{α_{k}}.$ Following the experimental design avenue of algebraic statistics (Pistone et al., 2001), it is clear that for observations over any design $D_{n}$ with n points in $R^{k}$ there is at least one exact polynomial interpolator. Specifically, let a design be defined as a set of distinct points $D = \{x^{(1)}, \dots, x^{(n)}\}$ in $R^{k}$. A general polynomial model can be written as $η (x) = \sum_{α \in M} θ_{α} x^{α},$ for some set M of distinct index vectors α. The algebraic theory shows that there is always a set of indices M for which we have an exact interpolation of a set of observations $y = \{y_{1}, \dots, y_{n}\}$ at design points $x^{(1)}, \dots, x^{(n)}$ respectively, and for which the size of M is n: $| M | = n$. Moreover, there is a method of finding M based on Gröbner bases, and M has a hierarchical structure: if $α \in M$ then $β \in M$ for any $β \leq α$, where $\leq$ is the usual entrywise ordering. We speak informally of “the model M”. The algebraic method is the starting point, or at least a theoretical underpinning, for SSM.
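The hierarchical property of M and the monomial notation can be illustrated with a minimal Python sketch. The function names `is_hierarchical` and `monomial`, and the two example index sets, are hypothetical and chosen only for illustration; the Gröbner basis construction of M itself is not attempted here.

```python
from itertools import product


def is_hierarchical(M):
    """Return True if, for every alpha in M, each beta <= alpha (entrywise) is in M."""
    index_set = set(M)
    for alpha in index_set:
        # Enumerate every beta with 0 <= beta_i <= alpha_i.
        for beta in product(*(range(a + 1) for a in alpha)):
            if beta not in index_set:
                return False
    return True


def monomial(x, alpha):
    """Evaluate x^alpha = x_1^alpha_1 * ... * x_k^alpha_k."""
    value = 1.0
    for xi, ai in zip(x, alpha):
        value *= xi ** ai
    return value


# Illustrative index sets in k = 2 dimensions (assumed, not from the paper).
M_ok = [(0, 0), (1, 0), (0, 1), (1, 1)]   # terms 1, x1, x2, x1*x2
M_bad = [(0, 0), (2, 0)]                  # x1^2 is present but x1 is missing
```

Here `M_ok` passes the check, while `M_bad` fails because the index (1, 0) lies below (2, 0) in the entrywise ordering but is absent.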

A supersaturated polynomial model is one in which the number of parameters, p, is larger than that suggested by the size of the design, that is, the number of observations n. In present-day terminology we may say that this is a “p bigger than n problem”. However, the SSM approach is a little different. Initially we increase the size of the model, $| M |$, so that $| M | > n$. In statistical terminology this leaves $| M | - n$ “free” degrees of freedom, which we use to increase the smoothness of the model, in a well-defined sense, while still interpolating the original data set y.

Section snippets

The SSM construction

We start with a data set y over a design $D_{n}$ with n points (D for short). The data y is given as a column vector of size n. Write the vector of model terms as $f (x) = {(x^{α} : α \in M)}^{T}$ so that $η (x) = f {(x)}^{T} θ,$ where θ is the vector of coefficients for the monomials in f(x), in a suitable order according to the elements of M. Denote the number of model terms as $| M | = N$ and assume that $N > n$. Let the region of interest be $Ω \subset R^{k}$, which we call the “integration region”. Our measure of smoothness (Bates et al., 2003) is $ϕ (M, Ω) = \int_{Ω} \sum_{1}$
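The construction can be sketched in one dimension. Because the smoothness formula is cut off in this snippet, the sketch below assumes the standard second-derivative measure mentioned in the abstract, here the integral of $η''(x)^{2}$ over $Ω = [0, 1]$, and assumes that the quadratic programme is the equality-constrained problem of minimising $θ^{T} K θ$ subject to exact interpolation. The monomial basis $1, x, \dots, x^{N-1}$, the function names and the example data are illustrative choices, not taken from the paper. Exact rational arithmetic is used because monomial bases give poorly conditioned matrices.

```python
from fractions import Fraction


def solve(A, b):
    """Solve A x = b exactly by Gauss-Jordan elimination over the rationals."""
    n = len(A)
    aug = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        # Find a row with a nonzero entry in this column and move it into place.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


def smoothness_matrix(N):
    """K[a][b] = integral over [0, 1] of (x^a)'' (x^b)''; zero when a < 2 or b < 2."""
    K = [[Fraction(0)] * N for _ in range(N)]
    for a in range(2, N):
        for b in range(2, N):
            K[a][b] = Fraction(a * (a - 1) * b * (b - 1), a + b - 3)
    return K


def fit_ssm(points, y, N):
    """Minimise theta^T K theta subject to X theta = y (assumed form of the SSM)."""
    n = len(points)
    X = [[Fraction(x) ** a for a in range(N)] for x in points]
    K = smoothness_matrix(N)
    # Stationarity rows: 2 K theta + X^T lambda = 0.
    top = [[2 * K[a][b] for b in range(N)] + [X[i][a] for i in range(n)]
           for a in range(N)]
    # Interpolation rows: X theta = y.
    bottom = [X[i] + [Fraction(0)] * n for i in range(n)]
    rhs = [Fraction(0)] * N + [Fraction(v) for v in y]
    theta = solve(top + bottom, rhs)[:N]
    phi = sum(theta[a] * K[a][b] * theta[b] for a in range(N) for b in range(N))
    return theta, phi


# Illustrative data: three points on [0, 1], first a saturated fit (N = n = 3),
# then a supersaturated one (N = 6).
knots = [Fraction(0), Fraction(1, 2), Fraction(1)]
y = [0, 1, 0]
theta3, phi3 = fit_ssm(knots, y, 3)
theta6, phi6 = fit_ssm(knots, y, 6)
```

With these data the saturated quadratic has smoothness value 64, and enlarging the basis cannot increase the value because the models are nested. Under the assumed measure, no interpolating function can go below the natural cubic spline value, which is 48 for these data.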

Large bases and splines

The optimality criterion ϕ is precisely that for which splines, particularly cubic beta splines and thin plate splines, are known to be optimal in the class of functions with bounded, continuous derivatives of given order. It ought to be true that as the size of the basis, given by $| M |$, gets large in an appropriate way, we tend to splines. We state a somewhat more general definition than is available in the literature and refer to Dupont and Scott (1980).

Definition 5

Let Ω be a bounded Lebesgue integrable

Orthogonal polynomials and computation

The computations required for smooth supersaturated models can be easily implemented when the number of factors is small, say fewer than ten, and the number of observations and terms is not huge, say in the region of a few hundred observations and a few hundred extra terms. However, as the number of factors increases, computations may slow considerably. Computing the matrix K involves summation over k² pairs of factors, and in each case a matrix of size N×N is to be computed. Some efficiency

Optimal knots

If one considers the interpolation discussed here as a method which can be applied to any data set y, at the selected knot design D, then there is a sense in which D should be independent of the actual (true) underlying process yielding the data.

For a model given by M, and given that $ϕ^{*} = y^{T} Qy$, we may consider simple measures on the matrix Q. Consider for motivation the case when the vector of observations y has a multivariate distribution with mean vector μ and covariance matrix $Σ$. Then the
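A standard identity that is relevant here is the expectation of a quadratic form. The sentence above is cut off in this snippet, so whether the paper uses exactly this expression is an assumption; the identity itself holds for any random vector y with mean μ and covariance Σ and any fixed matrix Q, with no normality required:

```latex
\mathrm{E}\,(y^{T} Q y) = \mu^{T} Q \mu + \operatorname{tr}(Q \Sigma)
```

In the special case $μ = 0$ and $Σ = σ^{2} I$ this reduces to $σ^{2} \operatorname{tr}(Q)$, one example of a simple measure on Q that does not depend on the observed data.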

Optimal design

As discussed briefly above, we shall study the pure optimal design problem for the kernel basis given by the SSM basis: $g (x) = {(g_{1} (x), \dots, g_{n} (x))}^{T}$. For this we take the classical approach. We assume that we have a design d with m points: $d = \{z^{(i)}, i = 1, \dots, m\}$. The model for observation $Y_{i}$, taken at point $z^{(i)}$, is $Y_{i} = g {(z^{(i)})}^{T} β + ε_{i}.$ Here β is the vector of parameters and the errors $ε_{i}$ are independent and normally distributed with zero mean and variance σ². To study design we need the design matrix $Z = (g_{j} (z^{(i)})).$
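A design matrix of this form can be used to compare candidate designs, as in the minimal sketch below. Two assumptions are made loudly: the criterion is taken to be D-optimality, that is, the determinant of $Z^{T} Z$ (the snippet says only that the classical approach is taken), and the basis $g(z) = (1, z, z^{2})$ is a hypothetical stand-in, since the actual SSM kernels are not shown here. All function names are illustrative.

```python
from fractions import Fraction


def design_matrix(design, basis):
    """Z[i][j] = g_j(z^(i)) for each design point z^(i) and basis function g_j."""
    return [[Fraction(g(z)) for g in basis] for z in design]


def information_matrix(Z):
    """Z^T Z, proportional to the information matrix under i.i.d. errors."""
    cols = len(Z[0])
    return [[sum(row[a] * row[b] for row in Z) for b in range(cols)]
            for a in range(cols)]


def determinant(A):
    """Exact determinant by Gaussian elimination with row swaps."""
    A = [row[:] for row in A]
    n = len(A)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if A[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)          # singular matrix
        if pivot != col:
            A[col], A[pivot] = A[pivot], A[col]
            det = -det                  # a row swap flips the sign
        det *= A[col][col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            A[r] = [v - factor * p for v, p in zip(A[r], A[col])]
    return det


# Hypothetical stand-in basis; the SSM kernels g_1, ..., g_n would replace it.
basis = [lambda z: 1, lambda z: z, lambda z: z * z]

# Two candidate three-point designs on [0, 1].
d1 = [Fraction(0), Fraction(1, 2), Fraction(1)]
d2 = [Fraction(0), Fraction(1, 4), Fraction(1)]
crit1 = determinant(information_matrix(design_matrix(d1, basis)))
crit2 = determinant(information_matrix(design_matrix(d2, basis)))
```

For this stand-in basis the equally spaced design `d1` gives the larger determinant, so it would be preferred under the assumed D-criterion.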

Discussion

An issue not covered in this paper is the fact that the integration region can be quite general (provided the integral exists), and one could also change the measure of integration in Eq. (4), all without changing the basic theory. This is implied also by the general definition of a Duchon spline given in Section 3; see Duchon (1976). We could also have given a version for general measures. Thus one can see the paper as an approach to solving spline-like problems of a quite general nature,

Acknowledgments

The third and fourth authors acknowledge support under EPSRC UK Grants EP/D048893/1 and EP/K036106/1 and EPSRC PhD funding for the first author. The fourth author acknowledges a Leverhulme emeritus fellowship.

References (12)

Bates, R., et al., 2003. A global selection procedure for polynomial interpolators. Technometrics.
Bates, R., Maruri-Aguilar, H., Wynn, H.P. Smooth supersaturated models. J. Stat. Comput. Simul.,...
Bien, J., et al., 2013. A lasso for hierarchical interactions. Ann. Stat.
Dette, H., Melas, V.B., Pepelyshev, A., 2011. Optimal design for smoothing splines. Ann. Inst. Stat. Math. 63,...
Duchon, J., 1976. Splines minimizing rotation invariant semi-norms in Sobolev spaces. In: Lecture Notes in...
Dupont, T., et al., 1980. Polynomial approximation of functions in Sobolev spaces. Math. Comput.

There are more references available in the full text version of this article.
