1984 | OriginalPaper | Chapter

Prediction Performance and the Number of Variables in Multivariate Linear Regression

Author: Ton Steerneman

Published in: Misspecification Analysis

Publisher: Springer Berlin Heidelberg

Included in: Professional Book Archive

The multivariate linear regression model is considered for prediction purposes. It is assumed that there are, in principle, infinitely many regressors, ordered by decreasing importance. The conditional variance of the regressand given the first p regressors is denoted by ω_p². A natural measure of the performance of regression predictors is the mean squared prediction error MSEP(n,p), where n is the sample size and p the number of variables. If the sequence {ω_p²} is known, p is chosen to minimize MSEP(n,p). Typically this gives an optimal p* much smaller than n; in other words, it pays to delete variables when coefficients have to be estimated. In practice MSEP(n,p) is unknown, but it can be estimated without bias by msep(n,p), the so-called S_p-criterion. It is frequently suggested to choose the number of variables p̂ that minimizes msep(n,p). This rule is asymptotically optimal in the sense that msep(n, p̂)/MSEP(n, p*) → 1 in probability as n → ∞. It will be shown that many other variable-selection procedures share this property. Asymptotic optimality is therefore not very compelling by itself, and minimization of the S_p-criterion needs further justification.
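
The trade-off described in the abstract can be illustrated with a small sketch. The abstract does not state a formula for MSEP(n,p), so the closed form below is an assumption: for zero-mean, jointly normal regressors and no intercept, the mean squared prediction error of least squares is ω_p²(n − 1)/(n − p − 1), and RSS_p/(n − p) is unbiased for ω_p². The chapter's own definitions may differ (for example, by including an intercept). The function names and the variance sequence are hypothetical.

```python
# Illustrative sketch only. The closed form is an assumption (zero-mean,
# jointly normal regressors, no intercept); it is not quoted from the chapter.


def msep(n, p, omega2_p):
    """Assumed MSEP(n, p) = omega_p^2 * (n - 1) / (n - p - 1), for 0 <= p < n - 1."""
    if not 0 <= p < n - 1:
        raise ValueError("need 0 <= p < n - 1")
    return omega2_p * (n - 1) / (n - p - 1)


def sp_estimate(rss_p, n, p):
    """Estimate of MSEP(n, p) from the residual sum of squares RSS_p.

    Under the same assumption, RSS_p / (n - p) is unbiased for omega_p^2,
    so plugging it into the formula above gives an unbiased estimate.
    """
    if not 0 <= p < n - 1:
        raise ValueError("need 0 <= p < n - 1")
    return rss_p / (n - p) * (n - 1) / (n - p - 1)


def optimal_p(n, omega2):
    """Return the p that minimizes the assumed MSEP(n, p) when omega2 is known."""
    candidates = range(min(len(omega2), n - 1))
    return min(candidates, key=lambda p: msep(n, p, omega2[p]))


# Hypothetical variance sequence: omega_p^2 = 1 + 4 * 0.5**p, decreasing toward 1.
n = 20
omega2 = [1 + 4 * 0.5 ** p for p in range(n - 1)]
best_p = optimal_p(n, omega2)
print(best_p, msep(n, best_p, omega2[best_p]))
```

With this hypothetical sequence and n = 20, the minimizer is p = 5: each further regressor lowers ω_p² by less than the estimation penalty (n − 1)/(n − p − 1) adds, which is the sense in which deleting variables pays.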
