Identification and estimation using heteroscedasticity without instruments: The binary endogenous regressor case
Introduction
Linear regression models containing endogenous regressors are generally identified using outside information such as exogenous instruments, or by parametric distribution assumptions. Some papers obtain identification without outside instruments by exploiting heteroscedasticity, including Rigobon (2003), Klein and Vella (2010), Lewbel (2012), and Prono (2014). See also Lewbel (2017).
Some authors, including Emran et al. (2014) and Hoang et al. (2014), have questioned whether the Lewbel (2012) estimator can be used when the endogenous regressor is binary. Others, including Le Moglie et al. (2015), have applied the Lewbel (2012) estimator with a binary endogenous variable, though without verifying whether its assumptions hold.
Examples of such applications would include Diff-in-Diff models with endogenous fixed effects, or models with binary endogenous treatment indicators. Binary endogenous regressors are a natural case to consider in part because they imply that the instrument equation will automatically have heteroscedastic errors, which is one of the requirements of the estimator.
This paper shows the validity of the Lewbel (2012) estimator when an endogenous regressor is binary. For example, the estimator might be applied to estimate a (homogeneous) treatment effect when the binary treatment is not randomly assigned and exogenous instruments are not available. However, the sufficient conditions given here do impose strong restrictions on the error term of the model.
The model and estimator
Assume a sample of observations of endogenous variables and , and a vector of exogenous covariates . We wish to estimate and the vector in the model where the errors and may be correlated. As in Lewbel (2012), we also consider the more general case where for some nonlinear, possibly unknown function .
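For reference, the triangular system described above can be written out as follows. The symbols are assumptions in the style of Lewbel (2012), not notation confirmed by this text: $Y_1$ is the outcome, $Y_2$ the endogenous regressor, $X$ the vector of exogenous covariates, and $\varepsilon_1, \varepsilon_2$ the possibly correlated errors.

```latex
\begin{align*}
Y_1 &= X'\beta + Y_2\gamma + \varepsilon_1, \\
Y_2 &= X'\alpha + \varepsilon_2, \qquad E(X\varepsilon_1) = 0, \quad E(X\varepsilon_2) = 0 .
\end{align*}
% More general second equation, with g possibly nonlinear and unknown:
\[
Y_2 = g(X) + \varepsilon_2 .
\]
```

In this notation the parameters of interest are $\gamma$ and $\beta$, and endogeneity arises because $\operatorname{Cov}(\varepsilon_1, \varepsilon_2)$ need not be zero.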
Standard instrumental variables estimation depends on having an element of that appears in the equation but not in the equation, and uses
A binary endogenous regressor
Suppose that is binary. Then is a linear probability model. But we also wish to allow for more general models, so let where . Here is possibly nonlinear and possibly unknown. For example, if satisfies a probit or logit model, then where is the cumulative normal or logistic distribution function. Also included are nonparametric models, where is estimated by a nonparametric regression of on .
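The heteroscedasticity induced by a binary regressor can be checked directly. As a sketch, write $Y_2$ for the binary regressor, $p(X) = E(Y_2 \mid X)$, and $\varepsilon_2 = Y_2 - p(X)$ (symbols assumed here for illustration). Conditional on $X$, the error equals $1 - p(X)$ with probability $p(X)$ and $-p(X)$ otherwise, so:

```latex
\begin{align*}
E(\varepsilon_2 \mid X) &= p(X)\{1 - p(X)\} - \{1 - p(X)\}\,p(X) = 0, \\
\operatorname{Var}(\varepsilon_2 \mid X) &= p(X)\{1 - p(X)\}^2 + \{1 - p(X)\}\,p(X)^2
  = p(X)\{1 - p(X)\} .
\end{align*}
```

The conditional variance therefore varies with $X$ whenever $p(X)\{1 - p(X)\}$ is not constant, which holds for a probit, logit, or linear probability specification with a nonzero slope except in knife-edge cases.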
If the equation is a linear probability model,
Conclusions and caveats
Theorem 1 shows that the assumptions required to apply the Lewbel (2012) estimator can be satisfied when the endogenous regressor is binary, and a supplemental appendix to this paper provides a different way to satisfy these assumptions when both the outcome and the endogenous regressor are binary. So, e.g., the Stata module IVREG2H by Baum and Schaffer (2012) can be used without change when just the endogenous regressor is binary, or when both the outcome and the endogenous regressor are binary.
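A minimal sketch of how the estimator could be coded by hand for a single scalar covariate, assuming the standard Lewbel (2012) construction: regress the endogenous regressor on the covariate, form the generated instrument as the demeaned covariate times the first-stage residual, and use it in a just-identified IV regression. This is not the IVREG2H implementation. The function names and the simulated design are hypothetical, and in the simulated data the outcome error is independent of the regressor, so the run only illustrates the mechanics rather than the endogenous case.

```python
import math
import random


def residualize(v, x):
    """Residuals from an OLS regression of v on a constant and scalar x."""
    n = len(v)
    x_bar = sum(x) / n
    v_bar = sum(v) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxv = sum((xi - x_bar) * (vi - v_bar) for xi, vi in zip(x, v))
    slope = sxv / sxx
    return [(vi - v_bar) - slope * (xi - x_bar) for xi, vi in zip(x, v)]


def generated_instrument(y2, x):
    """Heteroscedasticity-based instrument: (x - mean(x)) * first-stage residual."""
    e2_hat = residualize(y2, x)  # linear first stage for the endogenous regressor
    x_bar = sum(x) / len(x)
    return [(xi - x_bar) * ei for xi, ei in zip(x, e2_hat)]


def lewbel_iv(y1, y2, x):
    """Just-identified IV estimate of the coefficient on y2.

    Controls are a constant and x; the excluded instrument is the generated
    one. Partialling the controls out of the instrument gives the usual
    ratio form of the IV estimator.
    """
    z_tilde = residualize(generated_instrument(y2, x), x)
    numerator = sum(zi * yi for zi, yi in zip(z_tilde, y1))
    denominator = sum(zi * di for zi, di in zip(z_tilde, y2))
    return numerator / denominator


# Hypothetical design: binary regressor from a logit in x, true coefficient 2.0.
rng = random.Random(0)
n = 5000
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
y2 = [1.0 if rng.random() < 1.0 / (1.0 + math.exp(1.0 - 1.5 * xi)) else 0.0
      for xi in x]
y1 = [0.5 + 1.0 * xi + 2.0 * di + rng.gauss(0.0, 1.0) for xi, di in zip(x, y2)]

print("estimated coefficient on the binary regressor:", lewbel_iv(y1, y2, x))
```

With several covariates the same three steps apply, but the final step would be a matrix-based two-stage least squares or GMM routine using all generated instruments.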
A drawback of these results is that there are no obvious behavioral models that directly imply Assumption A2. This
References (18)
- et al. A note on the closed-form identification of regression models with a mismeasured binary regressor. Statist. Probab. Lett. (2008)
- et al. Nonparametric identification of regression models containing a misclassified dichotomous regressor without instruments. Econom. Lett. (2008)
- Endogenous regressor binary choice models without instruments, with an application to migration. Econom. Lett. (2010)
- et al. Non-farm activity, household expenditure, and poverty reduction in rural Vietnam: 2002–2008. World Dev. (2014)
- et al. Estimating a class of triangular simultaneous equations models without exclusion restrictions. J. Econometrics (2010)
- et al. Is it just a matter of personality? On the role of subjective well-being in childbearing behavior. J. Econ. Behav. Organ. (2015)
- et al. IVREG2H: Stata module to perform instrumental variables estimation using heteroskedasticity-based instruments.
- et al. Assessing the frontiers of ultrapoverty reduction: Evidence from challenging the frontiers of poverty reduction/targeting the ultra-poor, an innovative program in Bangladesh. Econ. Dev. Cult. Change (2014)
- et al. Two-step GMM estimation of the errors-in-variables model using high-order moments. Econometric Theory (2002)