Endogeneity bias and growth regressions
Introduction
The problem of endogeneity bias is a central concern in any comparative study of economic growth. Reverse causality and omitted variable bias are the two most common sources of endogeneity bias to which critics of growth regressions refer. These sources of bias are surely present in most, if not all, least-squares estimates of the determinants of growth, yet the quantitative magnitude of this bias has not been assessed. In most instances we care about the sign, rough magnitude and statistical significance of the estimated coefficient on a growth determinant, rather than a precise effect. If the quantitative magnitude of endogeneity bias is small, then the endless (and probably futile) quest for the perfect instrument may be misplaced, particularly because instrumental variables estimation is fraught with problems of its own, such as weak instruments and doubts about the validity of the instruments. In the small-sample studies typical of growth regressions, we cannot ignore the efficiency of our estimators either. Using Monte Carlo simulations, this paper takes a first step toward evaluating the magnitude of endogeneity bias under various assumptions about the extent of endogeneity itself.
A very simple univariate example may serve to illustrate our goal. In a simple cross-sectional regression equation of the form

$$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i,$$

the expected value of the OLS estimate of the main coefficient of interest $\beta_1$ can be written as:

$$E[\hat{\beta}_1] = \beta_1 + \frac{\operatorname{cov}(x_i, \varepsilon_i)}{\operatorname{var}(x_i)}.$$

This equation shows that a necessary condition for an unbiased estimator is zero covariance between the regressor and the residual term. If this condition does not hold, the extent of the resulting endogeneity bias depends on the magnitude of the regressor's covariance with the residual term relative to the regressor's variance. The variance of the regressor is observable from the data. The question we ask is: by varying the covariance between the residual term and the regressor, for a given variance of the regressor, how much bias do we generate on $\hat{\beta}_1$?
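As a rough numerical illustration of this univariate case, the sketch below draws a regressor and an error term with a chosen covariance and compares the OLS slope with the textbook value of the true slope plus cov(x, ε)/var(x). The function name, the normality assumption, the intercept of 1.0 and all parameter values are hypothetical choices made for the illustration; they are not taken from the paper.

```python
import numpy as np

def simulate_ols_slope(beta1, cov_x_eps, var_x=1.0, var_eps=1.0,
                       n=200_000, seed=0):
    """Return the OLS slope from one large simulated cross-section.

    (x, eps) are drawn jointly normal with covariance cov_x_eps.
    Normality and the parameter defaults are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    cov = np.array([[var_x, cov_x_eps],
                    [cov_x_eps, var_eps]])
    x, eps = rng.multivariate_normal([0.0, 0.0], cov, size=n).T
    y = 1.0 + beta1 * x + eps  # the intercept value is arbitrary
    # Univariate OLS slope: sample cov(x, y) / sample var(x)
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)

if __name__ == "__main__":
    beta1, var_x = 0.5, 1.0
    for c in (0.0, 0.1, 0.3):
        slope = simulate_ols_slope(beta1, c, var_x=var_x)
        print(f"cov={c:.1f}  slope={slope:.3f}  "
              f"beta1 + cov/var = {beta1 + c / var_x:.3f}")
```

With a sample this large, the estimated slope should sit close to the predicted value, so the gap between the slope and the true coefficient grows in proportion to the imposed covariance.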
The specific application that we consider is growth regressions. We start from a data-generating process based on the augmented Solow (1956) model. We chose this model largely because of its empirical tractability, and also because it serves as a theoretical basis for a wide body of empirical work on the determinants of growth (see for instance Barro and Sala-i-Martin, 2003). The empirical literature that has attempted to estimate the parameters of the Solow model has used a wide variety of panel data estimators. These estimators take different approaches to the empirical issues created by the estimation of the model. In particular, some use a GMM approach with instrumental variables designed to address the potential problem of endogeneity bias in the regressions. As we note below, all of these estimators have some disadvantages, and the relative magnitude of the errors they generate is not easily assessed using econometric theory. Consequently, we use Monte Carlo techniques to assess the average absolute bias of several of the estimators from this literature in the presence of endogeneity bias. Our method is similar to that used in Hauk and Wacziarg (2009). However, this paper extends their analysis in several dimensions. Whereas that paper focused on measurement error in the regressors, this study focuses on the properties of the residual term. We generate simulated data based on moments observed from real data and fix the parameters of the Solow model to values commonly used in the theoretical literature. Also, using the latest data from version 8.1 of the Penn World Tables, we are able to extend our dataset by one extra decade. We generate an error term that is allowed to covary to a specified degree with the steady-state determinants of income.
We then run regressions using various panel data estimators on this simulated data, and compare the average of the absolute biases of the estimates obtained over many runs to the known, “true” parameters of the model.
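The general shape of such an exercise can be sketched in a few lines. The example below is a deliberately simplified stand-in, not the paper's procedure: it uses a plain cross-section rather than a panel, assumes standard-normal regressors that are uncorrelated with each other, and uses hypothetical coefficient values. It imposes a chosen correlation between each regressor and the error term, estimates the model by OLS over many runs, and averages the absolute deviation of each estimate from its known true value.

```python
import numpy as np

def average_absolute_bias(true_beta, corr_x_eps, n_obs=100, n_runs=500, seed=0):
    """Average |beta_hat - beta| per coefficient over n_runs simulated samples.

    corr_x_eps[k] is the imposed correlation between regressor k and the error.
    Assumption for this sketch: regressors and error are jointly normal with
    unit variances, and the regressors are mutually uncorrelated.
    """
    rng = np.random.default_rng(seed)
    true_beta = np.asarray(true_beta, dtype=float)
    r = np.asarray(corr_x_eps, dtype=float)
    k = true_beta.size

    # Joint covariance of (x_1, ..., x_k, eps)
    sigma = np.eye(k + 1)
    sigma[:k, k] = r
    sigma[k, :k] = r
    if np.linalg.eigvalsh(sigma).min() < -1e-12:
        raise ValueError("requested correlations are not jointly feasible")

    total = np.zeros(k)
    for _ in range(n_runs):
        draws = rng.multivariate_normal(np.zeros(k + 1), sigma, size=n_obs)
        X, eps = draws[:, :k], draws[:, k]
        y = X @ true_beta + eps
        X1 = np.column_stack([np.ones(n_obs), X])        # add an intercept
        beta_hat = np.linalg.lstsq(X1, y, rcond=None)[0][1:]
        total += np.abs(beta_hat - true_beta)
    return total / n_runs

if __name__ == "__main__":
    beta = [0.5, -0.5]                                   # hypothetical values
    print("exogenous :", average_absolute_bias(beta, [0.0, 0.0]))
    print("endogenous:", average_absolute_bias(beta, [0.4, 0.0]))
```

Even with exogenous regressors the average absolute bias is not zero, because it also picks up sampling variability in small samples; this is one reason the efficiency of an estimator matters alongside its bias.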
The panel data estimators that we focus on in this paper are the fixed effects estimator (henceforth, FE), the between estimator (BE), the “Mankiw et al. (1992)” estimator (MRW), the random effects estimator (RE), the Arellano and Bond GMM estimator (AB), and the Blundell and Bond system GMM estimator (BB). We also consider the effects of differing assumptions about country-level heterogeneity across our simulations.
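To make the distinction between within-country and cross-country variation concrete, the following sketch computes the FE (within) and BE (between) slopes for a single-regressor static panel. It is a textbook illustration under assumptions chosen for brevity: there is no lagged dependent variable, the country effect is correlated with the regressor only through the regressor's country mean, and the function names and parameter values are hypothetical. It does not reproduce the paper's data-generating process or its results.

```python
import numpy as np

def within_and_between(y, x, ids):
    """Return (FE slope, BE slope) for a balanced or unbalanced panel.

    y, x : 1-D arrays of stacked observations
    ids  : integer unit labels 0..N-1, one per observation
    """
    n_units = int(ids.max()) + 1
    counts = np.bincount(ids, minlength=n_units)
    x_bar = np.bincount(ids, weights=x, minlength=n_units) / counts
    y_bar = np.bincount(ids, weights=y, minlength=n_units) / counts

    # FE (within): subtract each unit's own mean, then OLS through the origin
    xw, yw = x - x_bar[ids], y - y_bar[ids]
    fe = (xw @ yw) / (xw @ xw)

    # BE (between): OLS of unit means on unit means
    xb, yb = x_bar - x_bar.mean(), y_bar - y_bar.mean()
    be = (xb @ yb) / (xb @ xb)
    return fe, be

def simulate_panel(beta=0.5, n_units=100, n_periods=5, corr_mu_x=0.5, seed=0):
    """Static panel whose country effect mu is correlated with x (assumed DGP)."""
    rng = np.random.default_rng(seed)
    ids = np.repeat(np.arange(n_units), n_periods)
    mu = rng.normal(size=n_units)
    # The country mean of x loads on mu, which induces corr(x, mu) != 0
    x_mean = corr_mu_x * mu + np.sqrt(1 - corr_mu_x**2) * rng.normal(size=n_units)
    x = x_mean[ids] + rng.normal(size=ids.size)
    y = beta * x + mu[ids] + rng.normal(size=ids.size)
    return y, x, ids

if __name__ == "__main__":
    y, x, ids = simulate_panel()
    fe, be = within_and_between(y, x, ids)
    print(f"FE slope = {fe:.3f}, BE slope = {be:.3f}, true beta = 0.5")
```

In this particular setup only the country effect is correlated with the regressor, so demeaning removes the source of bias for FE while BE absorbs it. Once the idiosyncratic error itself covaries with the regressors, or a lagged dependent variable is included, that ranking need not hold.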
Our findings suggest that the BE and RE estimators (i.e. the “cross-sectional” estimators that use cross-country variation to identify their estimated parameters) perform better than the other estimators, if our goal is to minimize the average absolute bias across all of the estimated coefficients. We come to this conclusion after examining the performance of the estimators listed above across a wide sample of mathematically possible assumptions about the extent of the correlation between the steady-state determinants of the Solow growth model and the unobserved residual terms from growth regressions, and within several sub-samples of the Monte Carlo simulations that we conduct. However, as much of this literature has focused primarily on estimating the rate at which countries converge to a growth steady state, we also look at how well the various estimators estimate just the coefficient on lagged income, which tells us the rate of convergence. Here we find that the various “within-country” estimators (FE, AB and BB) perform better than their cross-sectional counterparts. Even here, though, the two GMM estimators generally do not perform noticeably better than FE. Because no one estimator clearly dominates the others across all simulations, we conclude that the “best” estimator to use in growth regressions will depend on the question that is being asked and, to some extent, on the assumed level of regressor endogeneity.
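One way to read "mathematically possible" here is that the regressors and the residual must share a valid joint correlation matrix. The sketch below checks that condition under the assumption that the regressors' own correlation matrix is positive definite: a vector r of regressor-residual correlations is feasible only if r'R⁻¹r ≤ 1, where R is the correlation matrix of the regressors. The function names and the example matrix are hypothetical and are not drawn from the paper.

```python
import numpy as np

def implied_r_squared(R_xx, r_xe):
    """Quadratic form r' R^{-1} r: the share of residual variance that the
    regressors would have to explain under the requested correlations."""
    R = np.asarray(R_xx, dtype=float)
    r = np.asarray(r_xe, dtype=float)
    return float(r @ np.linalg.solve(R, r))

def is_feasible(R_xx, r_xe, tol=1e-10):
    """True if some joint distribution has these correlations.

    Assumes R_xx is a positive definite correlation matrix.
    """
    return implied_r_squared(R_xx, r_xe) <= 1.0 + tol

if __name__ == "__main__":
    # Two regressors that are strongly correlated with each other
    R = [[1.0, 0.9],
         [0.9, 1.0]]
    print(is_feasible(R, [0.5, 0.5]))    # same-signed correlations
    print(is_feasible(R, [0.5, -0.5]))   # opposite-signed correlations
```

The second case shows why the feasible set is smaller than it first appears: a residual cannot be moderately positively correlated with one regressor and moderately negatively correlated with another when those two regressors move almost together.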
This paper is structured as follows: Section 2 briefly discusses theoretical considerations related to the methodology of growth regressions. Section 3 presents our basic simulation methodology. Section 4 discusses our results, Section 5 presents a robustness check using differing assumptions about country-level heterogeneity, Section 6 presents a robustness check on sample size variation, and Section 7 concludes.
Growth regressions and the Solow model
The theoretical basis for our data-generating process is the Solow (1956) model. We choose it, not because of any prior beliefs that it is a particularly compelling model of growth, but because it is tractable and constitutes arguably the only strict theoretical basis for the specific functional forms often estimated in the vast cross-country growth literature. This model is also well-suited for generating simulated data. Mankiw et al. (1992) and Islam (1995) showed that the Solow growth model can
Simulation methodology
We take as our starting point the Solow model of economic growth, as transformed into a linear regression model by Mankiw et al. (1992) and Islam (1995). We collect the data typically used in such regressions: PPP-adjusted per-capita GDP ($\log y_{it}$ in the regressions), investment as a share of total GDP and population growth
Table 1, Row 1 – no correlation between steady-state determinants and residuals
We begin our analysis of the results by looking at Table 1, in which the correlations between the residuals and all three steady-state determinants are fixed at specified levels. The results of our baseline case, with no correlation between the residuals
Extensions: changing assumptions about country-level heterogeneity
As noted above, we have assumed thus far that the correlations between the country fixed-effects term and the regressors used in our data-generating process are 50% as large as those predicted by a fixed-effects regression on the real-world data. Similarly, we assume that the variance of the fixed-effects term is 50% as large as that predicted by the same regression. The assumption that the “real” correlations and variance are smaller than those predicted in a fixed-effects regression is
Extensions: changing the sample size
One of the more difficult problems to address in the growth regressions literature is that researchers only have a limited number of observations to use in their empirical analyses. As of this writing, there are 195 independent countries in the world, and not all of them have the relevant economic data going back several decades needed to merit their inclusion as an observation in a growth regression. In particular, the simulations described in this article use as the cross-sectional
Conclusion
This paper has examined the role that endogenous regressors play in growth regressions based on the Solow growth model. We have derived a method of running Monte Carlo simulations that generates data to match the real-world moments of observable growth data while allowing us to impose arbitrary correlation levels between the steady-state determinants of the Solow model and the unobservable residual term.
As a result of our simulations, we have concluded that the BE and RE estimators (especially
References (28)
- et al. Another look at the instrumental variables estimation of error-components models. J. Econom. (1995)
- et al. A new data set of educational attainment in the world, 1950–2010. J. Dev. Econ. (2013)
- et al. Initial conditions and moment restrictions in dynamic panel data models. J. Econom. (1998)
- et al. Growth econometrics.
- The transition from stagnation to growth: unified growth theory.
- et al. Financial intermediation and growth: causality and causes. J. Monet. Econ. (2000)
- Arellano, M., 2003. Modelling Optimal Instrumental Variables for Dynamic Panel Data. CEMFI Working Paper no....
- et al. Some tests of specification for panel data: Monte Carlo evidence and an application to employment questions. Rev. Econ. Stud. (1991)
- Convergence and modernization. Econ. J. (2015)
- et al. Economic Growth. (2003)
- Is growth exogenous? Taking Mankiw, Romer and Weil seriously. NBER Macroecon. Annu.
- Reopening the convergence debate: a new look at cross-country growth empirics. J. Econ. Growth
- The marginal product of capital. Q. J. Econ.