Prediction and confidence intervals for nonlinear measurement error models without identifiability information

https://doi.org/10.1016/S0167-7152(02)00141-4

Abstract

A major difficulty in applying a measurement error model is that one is required to have additional information in order to identify the model. In this paper, we show that there are cases in nonlinear measurement error models where it is not necessary to have additional information to construct prediction intervals for the future dependent variable Y and confidence intervals for the conditional expectation E(Y|X) where X is the future observable independent variable. In particular, we consider two nonlinear models, the exponential and loglinear models. By applying pseudo-likelihood estimation of variance functions in the weighted least squares method, we construct theoretically justifiable prediction and confidence intervals in these two models. Some simulation results which show that the proposed intervals perform well are also provided.

Introduction

A measurement error model is one in which the explanatory variable is not observed exactly. Instead it is observed with error, a situation that arises often in applications. The problem has a long history, dating back to Adcock (1878), and it remains important in both theory and application. See Carroll et al. (1995), Fuller (1987), and Cheng and Van Ness (1999) and the references cited therein.

Error in the explanatory variable, however, causes serious difficulties in theory and in application because of unidentifiability. A model is said to be unidentifiable when two or more distinct sets of parameter values yield the same distribution of the observations. Consequently, given the observations, or even the exact distribution of the observations, it is impossible to tell which set of parameters actually governs the data. Statisticians who apply measurement error models therefore usually need some additional information, for example that some of the parameters involved are known or estimable. In a few situations these parameters are known. In others, especially when validation data are available, some parameters can be estimated from the validation data. In many situations, however, no such information is available and the model remains unidentifiable. This seems to make measurement error models useless in those situations.
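As a concrete illustration of unidentifiability, the sketch below uses the simplest case, a linear normal measurement error model Y = b0 + b1*U + eps, X = U + delta. This choice of model and all parameter values are assumptions made for illustration only. Under joint normality the law of (X, Y) is fixed by five moments, while the model has six parameters, so two different parameter sets can produce exactly the same observable distribution.

```python
import math

def observable_moments(m_u, var_u, var_delta, b0, b1, var_eps):
    """First and second moments of (X, Y) when Y = b0 + b1*U + eps and X = U + delta."""
    return (
        m_u,                        # E(X)
        b0 + b1 * m_u,              # E(Y)
        var_u + var_delta,          # Var(X)
        b1 * var_u,                 # Cov(X, Y)
        b1 ** 2 * var_u + var_eps,  # Var(Y)
    )

# Two different, purely illustrative parameter sets (hypothetical values).
params_a = dict(m_u=1.0, var_u=2.0, var_delta=1.0, b0=0.5, b1=1.0, var_eps=1.0)
params_b = dict(m_u=1.0, var_u=1.6, var_delta=1.4, b0=0.25, b1=1.25, var_eps=0.5)

moments_a = observable_moments(**params_a)
moments_b = observable_moments(**params_b)

# Under joint normality these five moments determine the law of (X, Y),
# so equal moments mean the data cannot distinguish the two parameter sets.
same_distribution = all(math.isclose(a, b) for a, b in zip(moments_a, moments_b))
print(same_distribution)
```

Both parameter sets are valid (all variances are positive), yet no amount of data on (X, Y) alone could separate them.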

In the linear measurement error model, it is known that if the objective is prediction, it is not necessary to adjust for measurement errors (see Fuller, 1987, p. 74). The same rationale holds in the probit measurement error model (Buzas and Stefanski, 1996). Hence, in these two models we can directly construct statistical intervals for a future observation or its mean as if we had observed the true explanatory variable (i.e., treating the surrogate variable as the true explanatory variable), with no need to modify the models to adjust for measurement errors. In this paper we investigate two further nonlinear measurement error models, the exponential and loglinear models. We find that no identifiability information is needed if the goal is prediction, but the models must be modified to adjust for measurement errors. We show that, after this modification, one can use pseudo-likelihood estimation of variance functions in the weighted least squares method to construct a prediction interval for $Y_{n+1}$ and a confidence interval for $E(Y_{n+1} \mid X_{n+1})$, the conditional expectation of the future observation $Y_{n+1}$ given $X_{n+1}$.

In conclusion, unidentifiable measurement error models can be useful if the goal is prediction. In some models (for example, linear and probit models), one does not need to adjust for measurement errors. In other models (such as exponential and loglinear models), some modification to the model is required to adjust for measurement errors. Notably, in many situations no additional information is needed for prediction.

Section snippets

General approach

Assume a measurement error model where we observe $Y_i$ and $X_i$ satisfying
$$Y_i \mid U_i, X_i \sim \text{p.d.f. } g(u_i, \Theta)$$
(i.e., the conditional p.d.f. of $Y_i$ given $U_i$ and $X_i$ is $g(u_i, \Theta)$, where $\Theta$ is the unknown vector of parameters) and
$$X_i = U_i + \delta_i.$$

In this paper, we mainly focus on univariate $X_i$. However, the idea can be generalized to multivariate $X_i$ in a straightforward manner. We assume that $U_i$ and $\delta_i$ are independently normally distributed,
$$U_i \sim N(m_U, \sigma_U^2) \quad \text{and} \quad \delta_i \sim N(0, \sigma_\delta^2).$$
Consequently, the conditional p.d.f. of
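Under the normality assumptions just stated, a standard bivariate-normal calculation gives the conditional law of $U_i$ given $X_i$. The derivation below is a sketch of that textbook result. The definition of $r$ as the reliability ratio is an assumption here, inferred from how $r$ is used in the model sections that follow.

```latex
\begin{aligned}
r &= \frac{\sigma_U^2}{\sigma_U^2 + \sigma_\delta^2}, \\
U_i \mid X_i &\sim N\bigl((1-r)\,m_U + r X_i,\; r\,\sigma_\delta^2\bigr), \\
E\bigl(e^{t U_i} \mid X_i\bigr) &= \exp\Bigl(t\bigl[(1-r)\,m_U + r X_i\bigr] + \tfrac{1}{2}\,t^2\, r\,\sigma_\delta^2\Bigr).
\end{aligned}
```

The last line is the normal moment generating function evaluated at $t$. Setting $t = b_1$ is what produces conditional means of exponential form in $X_i$.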

Exponential model

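Consider
$$Y_i = b_0 e^{b_1 U_i} + \varepsilon_i, \qquad X_i = U_i + \delta_i, \qquad 1 \le i \le n,$$
where $U_i$ and $\delta_i$ are distributed as in (2.1) and $\varepsilon_i$ has a distribution with mean 0 and variance $\sigma_\varepsilon^2$. Assume that $U_i$ and $\delta_i$ are independent of $\varepsilon_i$. By (2.2), we can write $Y_i$ in the form of (2.3),
$$Y_i = E(b_0 e^{b_1 U_i} + \varepsilon_i \mid X_i) + b_0 e^{b_1 U_i} + \varepsilon_i - E(b_0 e^{b_1 U_i} + \varepsilon_i \mid X_i) = f(X_i, \beta) + \varepsilon_i^*,$$
where
$$f(X_i, \beta) = \beta_0 e^{\beta_1 X_i}, \quad \beta = (\beta_0, \beta_1), \quad \beta_0 = b_0 e^{b_1 (1-r) m_U + b_1^2 r \sigma_\delta^2 / 2}, \quad \beta_1 = r b_1, \quad \varepsilon_i^* = b_0 e^{b_1 U_i} - \beta_0 e^{\beta_1 X_i} + \varepsilon_i.$$
Since $X_i$ and $\varepsilon_i^*$ are uncorrelated, we can proceed as if we were dealing with an ordinary exponential regression model. However, by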
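The claim that the composite error in the regression of $Y_i$ on $X_i$ has mean zero and is uncorrelated with $X_i$ can be checked numerically. The following is a minimal simulation sketch, not the paper's procedure. All parameter values are hypothetical, and $r$ is taken to be $\sigma_U^2/(\sigma_U^2+\sigma_\delta^2)$, which is an assumption about the notation.

```python
import math
import random

random.seed(0)

# Illustrative (hypothetical) true parameters; in practice they are unknown.
b0, b1 = 1.0, 0.5
m_u, sd_u, sd_delta, sd_eps = 0.0, 1.0, 0.5, 0.2

# Assumed reliability ratio and the induced parameters of
# E(Y | X) = beta0 * exp(beta1 * X).
r = sd_u**2 / (sd_u**2 + sd_delta**2)
beta1 = r * b1
beta0 = b0 * math.exp(b1 * (1 - r) * m_u + b1**2 * r * sd_delta**2 / 2)

n = 200_000
xs, resid = [], []
for _ in range(n):
    u = random.gauss(m_u, sd_u)                             # unobserved true covariate
    x = u + random.gauss(0.0, sd_delta)                     # observed surrogate
    y = b0 * math.exp(b1 * u) + random.gauss(0.0, sd_eps)   # response
    xs.append(x)
    resid.append(y - beta0 * math.exp(beta1 * x))           # composite error

mean_x = sum(xs) / n
mean_resid = sum(resid) / n
cov_x_resid = sum((x - mean_x) * (e - mean_resid) for x, e in zip(xs, resid)) / n

print(f"mean of composite error: {mean_resid:.4f}")
print(f"cov(X, composite error): {cov_x_resid:.4f}")
```

Both printed quantities should be close to zero up to Monte Carlo error. Note that only $\beta_0$ and $\beta_1$ enter the regression on $X_i$, which is why prediction does not require separating $b_0$, $b_1$ and the error variances.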

Loglinear model

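In this section, we consider the model
$$Y_i \mid U_i \sim \text{Poisson}(e^{b_0 + b_1 U_i}), \qquad X_i = U_i + \delta_i, \qquad 1 \le i \le n,$$
where $U_i$ and $\delta_i$ are independently normally distributed as in (2.1). By (2.2), we can write $Y_i$ in the form of (2.3),
$$Y_i = E(Y_i \mid X_i) + Y_i - E(Y_i \mid X_i) = f(X_i, \Theta) + \varepsilon_i^*,$$
where $f(X_i, \Theta) = e^{\beta_0 + \beta_1 X_i}$, $\beta_0 = b_0 + b_1 (1-r) m_U + b_1^2 r \sigma_\delta^2 / 2$, $\beta_1 = r b_1$, and $\varepsilon_i^* = Y_i - e^{\beta_0 + \beta_1 X_i}$. Again, $X_i$ and $\varepsilon_i^*$ are uncorrelated. Note that although the conditional expectation of $Y_i$ given $X_i$ has the same form as that of $E(Y_i \mid U_i)$, the conditional distribution of $Y_i$ given $X_i$ is not a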
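The simulation sketch below checks the conditional mean numerically and also compares the spread of $Y_i$ around that mean with what a Poisson law would give. The variance expression used, $f + (e^{s^2} - 1) f^2$ with $s^2 = b_1^2 r \sigma_\delta^2$, comes from the law of total variance under the stated normal assumptions. It is offered as an illustration and is not claimed to be the variance function used in the paper. Parameter values, the helper `poisson`, and the definition of $r$ as $\sigma_U^2/(\sigma_U^2+\sigma_\delta^2)$ are all assumptions.

```python
import math
import random

random.seed(1)

def poisson(lam):
    """Knuth's multiplication sampler; adequate for the moderate means used here."""
    threshold, k, p = math.exp(-lam), 0, random.random()
    while p > threshold:
        k += 1
        p *= random.random()
    return k

# Illustrative (hypothetical) true parameters.
b0, b1 = 0.5, 0.8
m_u, sd_u, sd_delta = 0.0, 1.0, 0.5

r = sd_u**2 / (sd_u**2 + sd_delta**2)          # assumed reliability ratio
beta0 = b0 + b1 * (1 - r) * m_u + b1**2 * r * sd_delta**2 / 2
beta1 = r * b1
s2 = b1**2 * r * sd_delta**2                   # Var(b1 * U | X)

n = 100_000
sum_resid = sum_excess = sum_pred = 0.0
for _ in range(n):
    u = random.gauss(m_u, sd_u)
    x = u + random.gauss(0.0, sd_delta)        # observed surrogate
    y = poisson(math.exp(b0 + b1 * u))         # count response
    f = math.exp(beta0 + beta1 * x)            # E(Y | X)
    sum_resid += y - f
    sum_excess += (y - f) ** 2 - f             # averages to zero if Var(Y | X) = f
    sum_pred += (math.exp(s2) - 1) * f**2      # extra variance from unobserved U

mean_resid = sum_resid / n
excess = sum_excess / n
predicted = sum_pred / n

print(f"mean residual:             {mean_resid:.4f}")
print(f"excess over Poisson:       {excess:.4f}")
print(f"predicted excess variance: {predicted:.4f}")
```

A mean residual near zero supports the stated form of $E(Y_i \mid X_i)$, while a clearly positive excess shows that the variance given $X_i$ grows faster than the mean. This is one reason a variance function has to be estimated before weighted least squares can be applied.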

Conclusion

In this paper, we discuss how to construct valid statistical intervals in two nonlinear measurement error models without additional information. The problem is important since additional information is often unavailable in practice. By using pseudo-likelihood estimation of variance functions in the weighted least squares method, this is possible if the target is the future response variable $Y_{n+1}$ or the conditional mean of $Y_{n+1}$ given $X_{n+1}$, where $X_{n+1}$ is the observed surrogate variable

References (6)

  • R.J. Adcock. Note on the method of least squares. Analyst (1878)
  • J.S. Buzas et al. Instrumental variable estimation in generalized linear measurement error models. J. Amer. Statist. Assoc. (1996)
  • R. Carroll et al. Transformation and Weighting in Regression (1987)
There are more references available in the full text version of this article.
