1984 | OriginalPaper | Chapter
Prediction Performance and the Number of Variables in Multivariate Linear Regression
Author: Ton Steerneman
Published in: Misspecification Analysis
Publisher: Springer Berlin Heidelberg
Included in: Professional Book Archive
The multivariate linear regression model is considered for prediction purposes. It is assumed that there are, in principle, infinitely many regressors, ordered by decreasing importance. The conditional variance of the regressand given the first p regressors is denoted by ω_p². A natural measure of the performance of regression predictors is the mean squared prediction error MSEP(n, p), where n is the sample size and p the number of variables. If the sequence {ω_p²} is known, then p is chosen such that MSEP(n, p) is minimized. Typically, this gives an optimal p* much smaller than n; in other words, it pays to delete variables when coefficients have to be estimated. In practice the quantity MSEP(n, p) is unknown, but it can be estimated without bias by msep(n, p), the so-called Sp-criterion. It is frequently suggested to choose the number of variables p̂ such that msep(n, p) is minimized. This rule is asymptotically optimal in the sense that msep(n, p̂)/MSEP(n, p*) → 1 in probability as n → ∞. It will be shown that many other variable-selection procedures share this property. Asymptotic optimality is therefore not very compelling by itself, and minimization of the Sp-criterion needs more justification.
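The trade-off described in the abstract can be illustrated with a minimal sketch. The abstract does not state the explicit form of MSEP(n, p), so the formulas below are an assumption, not necessarily the chapter's: for zero-mean, jointly normal regressors and a model without intercept, the standard result is MSEP(n, p) = ω_p² (n − 1)/(n − p − 1), and RSS_p/(n − p) is an unbiased estimate of ω_p². All function names and the example sequence ω_p² = 1 + 2^(−p) are hypothetical.

```python
from typing import Callable, Sequence


def msep_theoretical(n: int, p: int, omega2_p: float) -> float:
    """MSEP(n, p) under the ASSUMED setting: zero-mean Gaussian regressors,
    no intercept, so that MSEP(n, p) = omega_p^2 * (n - 1) / (n - p - 1)."""
    if n - p - 1 <= 0:
        raise ValueError("need n - p - 1 > 0")
    return omega2_p * (n - 1) / (n - p - 1)


def optimal_p(n: int, omega2: Callable[[int], float], p_max: int) -> int:
    """p* minimizing the theoretical MSEP(n, p) over p = 0, ..., p_max,
    for a known sequence omega2(p)."""
    return min(range(p_max + 1),
               key=lambda p: msep_theoretical(n, p, omega2(p)))


def msep_hat(rss_p: float, n: int, p: int) -> float:
    """Sp-type estimate of MSEP(n, p) from the residual sum of squares.
    Assumed form: RSS_p / (n - p) estimates omega_p^2 without bias and is
    scaled by the same factor (n - 1) / (n - p - 1) as above."""
    if n - p - 1 <= 0:
        raise ValueError("need n - p - 1 > 0")
    return rss_p / (n - p) * (n - 1) / (n - p - 1)


def select_p(n: int, rss_by_p: Sequence[float]) -> int:
    """p-hat minimizing msep_hat, where rss_by_p[p] is the RSS obtained
    with the first p regressors."""
    return min(range(len(rss_by_p)),
               key=lambda p: msep_hat(rss_by_p[p], n, p))


# Hypothetical example: conditional variances decaying towards 1.
def example_omega2(p: int) -> float:
    return 1.0 + 2.0 ** (-p)


p_star = optimal_p(n=50, omega2=example_omega2, p_max=40)
```

In this example the optimal p* is far smaller than n = 50: beyond a few regressors, the reduction in ω_p² no longer compensates for the cost of estimating one more coefficient.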