Long-horizon regressions: theoretical results and applications

https://doi.org/10.1016/S0304-405X(03)00065-5

Abstract

I use asymptotic arguments to show that the t-statistics in long-horizon regressions do not converge to well-defined distributions. In some cases, moreover, the ordinary least squares estimator is not consistent and the R² is an inadequate measure of the goodness of fit. These findings can partially explain the tendency of long-horizon regressions to find “significant” results where previous short-term approaches find none. I propose a rescaled t-statistic, whose asymptotic distribution is easy to simulate, and revisit some of the long-horizon evidence on return predictability and on the Fisher effect.

Introduction

Interest in long-horizon regressions has been increasing, because studies using long-horizon variables seem to find significant results where previous “short-term” approaches have failed. For example, Fama and French (1988), Campbell and Shiller (1987, 1988), Mishkin (1990, 1992), Boudoukh and Richardson (1993), and Fisher and Seater (1993), all studies with long-run variables, have received a lot of attention in finance and economics. In those papers, the long-horizon variable is a rolling sum of the original series. In the literature, it is heuristically argued that long-run regressions produce more accurate results by strengthening the signal coming from the data while eliminating the noise. Whether the focus is on excess returns/dividend yields, the Fisher effect, or neutrality of money, the striking results produced by such studies prompted me to scrutinize the appropriateness of the econometric methods.

In this paper, I show that long-horizon regressions will always produce “significant” results, whether or not there is a structural relation between the underlying variables. To understand this conclusion, notice that in a rolling summation of series integrated of order zero (or I(0)), the new long-horizon variable behaves asymptotically as a series integrated of order one (or I(1)). Such persistent stochastic behavior will be observed whenever the regressor, the regressand, or both are obtained by summing over a nontrivial fraction of the sample. Based on this insight, I use the Functional Central Limit Theorem (FCLT) to analyze the distributions of statistics from long-run regressions commonly used in economics and finance. I find that, in addition to incorrect testing, overlapping sums of the original series might lead to inconsistent estimators and to a coefficient of determination, R², that does not converge to one in probability. These results are reminiscent of, but not analogous to, the results in Granger and Newbold (1974) and Phillips (1986, 1991) and those recently discussed by Ferson et al. (2003) in forecasting excess returns. The similarity lies in finding a spurious correlation between persistent variables when they are in fact statistically independent. However, there are two major differences. First, in long-horizon regressions, the rolling summation alters the stochastic order of the variables, resulting in unorthodox limiting distributions of the slope estimator, its t-statistic, and the R². Second, and more importantly, even if there is an underlying relationship between variables, the t-statistic will tend to reject it. In other words, estimation and testing using long-horizon variables cannot be carried out using the usual regression methods.
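The persistence induced by rolling summation can be seen in a small numerical sketch. The setup below is an assumption made only for illustration (Gaussian white noise, T = 2000, a horizon equal to 10% of the sample, and the helper names `rolling_sum` and `autocorr1`); it is not taken from the paper. It compares the first-order sample autocorrelation of an i.i.d. series with that of its overlapping sums.

```python
import random

def rolling_sum(x, k):
    """Overlapping sums: out[t] = x[t] + ... + x[t+k-1]."""
    out = [sum(x[:k])]
    for t in range(1, len(x) - k + 1):
        # Slide the window one step: drop x[t-1], add x[t+k-1].
        out.append(out[-1] - x[t - 1] + x[t + k - 1])
    return out

def autocorr1(x):
    """First-order sample autocorrelation."""
    n = len(x)
    m = sum(x) / n
    num = sum((x[t] - m) * (x[t - 1] - m) for t in range(1, n))
    den = sum((v - m) ** 2 for v in x)
    return num / den

random.seed(0)
T = 2000          # assumed sample size
k = 200           # assumed horizon: 10% of the sample
eps = [random.gauss(0.0, 1.0) for _ in range(T)]  # i.i.d., hence I(0)
z = rolling_sum(eps, k)

rho_raw = autocorr1(eps)
rho_sum = autocorr1(z)
print(f"white noise: {rho_raw:.3f}, rolling sum: {rho_sum:.3f}")
```

The original series shows essentially no serial correlation, while its rolling sum is almost perfectly autocorrelated at lag one, which is the sense in which the summed variable looks like a highly persistent process in a finite sample.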

I provide a simple guide on how to conduct estimation and inference using long-horizon regressions. Based on previous empirical studies, I classify such regressions into four cases. The proposed classification emerges naturally from a consideration of null hypotheses and the persistence of the regressors. It allows a systematic analysis of the small-sample properties of long-horizon regressions. Those properties are analyzed by using the FCLT to derive accurate approximations of the small-sample distributions of the OLS estimator of the slope coefficient, its t-statistic, and the coefficient of determination R². The estimators from some regressions, frequently used in empirical work, are not consistent. Moreover, the t-statistics from all considered regressions do not converge to well-defined distributions, thus calling into question the conclusions from studies that use long-run variables. The analytical results yield exact rates of convergence or divergence and permit us to modify the statistics in order to conduct correctly sized tests.

I propose a rescaled t-statistic, t/√T, for testing long-horizon regressions. Its asymptotic distribution, although non-normal, is easy to simulate. The results are quite general and applicable whenever long-horizon regressions are employed. The rescaled t-statistic is easy to use and can be computed with existing computer routines. Using arguments similar to those in Richardson and Stock (1989) and Viceira (1997), I show that its asymptotic distribution can be approximated with a relatively small sample. Phillips (1986) uses similar arguments to show that, in the context of spurious regressions, the t/√T statistic converges to a well-defined distribution. For a good, if somewhat dated, overview of this literature, see Stock (1994).
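A minimal sketch of how such a rescaled statistic might be computed from a standard regression output is given below. It assumes a bivariate OLS regression with a constant, a t-statistic for the null of a zero slope, and division by the square root of the sample size T; the function names `ols_t_stat` and `rescaled_t` are hypothetical. Critical values would still have to come from the simulated limiting distribution, which this sketch does not provide.

```python
import math

def ols_t_stat(y, x):
    """OLS of y on a constant and x; returns (slope, conventional t-statistic)."""
    n = len(y)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((v - mx) ** 2 for v in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    beta = sxy / sxx
    alpha = my - beta * mx
    ssr = sum((b - alpha - beta * a) ** 2 for a, b in zip(x, y))
    se = math.sqrt(ssr / (n - 2) / sxx)  # usual OLS standard error of the slope
    return beta, beta / se

def rescaled_t(y, x, T):
    """Conventional t-statistic divided by sqrt(T); T is an assumed sample size."""
    _, t = ols_t_stat(y, x)
    return t / math.sqrt(T)

# Toy data, chosen only to exercise the functions.
y = [1.0, 3.0, 2.0, 4.0]
x = [0.0, 1.0, 2.0, 3.0]
beta, t = ols_t_stat(y, x)
print(beta, t, rescaled_t(y, x, T=len(y)))
```

Only the last step differs from a conventional regression routine, which is consistent with the remark that the statistic can be obtained from existing software.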

I use the derived analytical expressions to explain the empirical and simulation results obtained by previous authors, including the excess returns/dividend yield regressions in Fama and French (1988), and the Fisher effect tests in Boudoukh and Richardson (1993) and Mishkin (1992). For example, I address the interesting question of whether long-horizon regressions have greater power to detect deviations from the null than do short-horizon regressions, or whether the significant results are a mere product of size distortion. This question, indirectly discussed in Hodrick (1992), Mishkin (1990, 1992), Goetzmann and Jorion (1993), and Campbell (2001), is posed explicitly in Campbell et al. (1997). Some Monte Carlo simulations suggest power gains (Hodrick, 1992), and others show size distortions (Goetzmann and Jorion, 1993; Nelson and Kim, 1993), but a definite, analytic answer has not yet been provided. I show that the significant results from long-horizon regressions are due to incorrect critical values. However, if appropriately rescaled, tests of long-horizon regressions have somewhat better power against alternatives than their short-horizon analogues. Another implication of my analysis is that a significant R² in such regressions cannot be interpreted as an indication of a good fit.

I use the FCLT because it has been shown to provide a very good approximation of the finite-sample distributions of interest. Richardson and Stock (1989) use a similar methodology, but they consider only univariate regressions. Their results can be viewed as a special case within my framework. Viceira (1997) uses FCLT asymptotics to analyze various tests for structural breaks. In recent work, Lanne (2002) and Torous et al. (2002) also consider a particular case of long-horizon regressions, representing a special case in my framework. In this paper, I offer an exhaustive and coherent treatment of regressions in which the overlap in the observations is a nontrivial fraction of the sample size. The complete analysis allows me to compare various ways of running long-horizon regressions.

This analytical framework is broad enough to accommodate forecasting variables that, although persistent, do not have an exact unit root. Such a generalization is important in practice since many of the predictors, although highly serially correlated, must nevertheless be stationary. Also, given the sensitivity of unit-root tests to model misspecifications, it is often hard to say whether a process is truly stationary (Schwert, 1987, 1989). The cost of this generalization is the introduction of a nuisance parameter that measures deviations from the exact unit root case. I discuss three ways of dealing with this nuisance parameter.

This paper is not a condemnation of long-horizon studies. My aim is to put inference and testing using long-run regressions on a firm basis and not to rely exclusively on simulation methods. The conclusions from Monte Carlo or bootstrap studies are limited to the case study at hand and fail to yield general insights applicable to other cases. In contrast, my analysis provides general guidelines on how to test long-horizon relations. Many researchers are aware that normal asymptotic approximations are not adequate when using overlapping variables. The poor approximations are usually attributed to serial correlation in the error terms. However, Monte Carlo simulations show that even after correcting for serially correlated errors, using Hansen and Hodrick (1980) or Newey and West (1987) standard errors, the small-sample distributions of the estimators and the t-statistics are very different from the asymptotic normal distribution (Mishkin, 1992; Goetzmann and Jorion, 1993).

The paper is structured as follows. Section 2 presents the various ways of specifying long-horizon regressions that have commonly been used in the empirical literature. Section 3 provides the main theoretical results. Testing and inference are analyzed using asymptotic methods. In Section 4, I conduct simulations to illustrate the analytical results. Section 5 applies the conclusions in Section 3 to the excess returns/dividend yield equations of Fama and French (1988) and to the long-run Fisher effect, as tested in Boudoukh and Richardson (1993) and Mishkin (1992). Section 6 concludes.

The model

The underlying data-generating processes are

$$Y_{t+1} = \alpha + \beta X_t + \varepsilon_{1,t+1}, \qquad (1)$$

$$(1 - \phi L)\, b(L)\, X_{t+1} = \mu + \varepsilon_{2,t+1}. \qquad (2)$$

The variable $X_t$ is represented as an autoregressive process, whose highest root, $\phi$, is conveniently factored out, and $b(L) = b_0 + b_1 L + b_2 L^2 + \cdots + b_p L^p$ is invertible. Let $\phi = 1 + c/T$, where the parameter $c$ measures deviations from the unit root in a decreasing (at rate $T$) neighborhood of 1. The unit-root case corresponds to $c = 0$. This parameterization allows us to examine highly persistent regressors, such as the
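To make the parameterization concrete, the following sketch simulates one path from a simplified version of the model. Several choices are assumptions made purely for illustration and are not taken from the text: b(L) = 1, μ = 0, independent standard normal innovations, and an initial value X_0 = 0. The function name `simulate_model` is hypothetical.

```python
import random

def simulate_model(T, alpha, beta, c, seed=0):
    """Simulate Y_{t+1} = alpha + beta*X_t + e1 and X_{t+1} = phi*X_t + e2.

    Assumptions (illustrative only): b(L) = 1, mu = 0, independent N(0, 1)
    innovations, X_0 = 0. The largest root is local to unity: phi = 1 + c/T.
    """
    rng = random.Random(seed)
    phi = 1.0 + c / T
    x = [0.0]   # X_0, ..., X_T
    y = []      # Y_1, ..., Y_T
    for _ in range(T):
        e1 = rng.gauss(0.0, 1.0)
        e2 = rng.gauss(0.0, 1.0)
        y.append(alpha + beta * x[-1] + e1)
        x.append(phi * x[-1] + e2)
    return y, x

# c = 0 gives an exact unit root; c < 0 gives a persistent but stationary regressor.
y, x = simulate_model(T=100, alpha=0.0, beta=0.0, c=-5.0)
print(len(y), len(x))
```

With T = 100 and c = -5 the autoregressive root is 0.95, so the same value of c maps to a root closer to one as the sample grows.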

Theoretical results

In this section, I present the analytical results. In addition to stating the theorems, I also provide informal discussions to clarify their implications and applications. Proofs are in Appendix A.

The assumptions and additional notation are summarized here for convenience.

Assumptions. In model (1)–(2),

  1. $Z_t^k = \sum_{i=0}^{k-1} Y_{t+i}$ and $Q_t^k = \sum_{i=0}^{k-1} X_{t+i}$.

  2. The overlap is a fixed fraction of the sample size, that is, $k = [\lambda T]$, where $\lambda$ is fixed between 0 and 1, and $[\cdot]$ denotes the greatest-integer (floor) operator.

  3. $\phi = 1 +$
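The construction of the long-horizon variables in the first two items can be sketched as follows. The helper names `horizon` and `long_horizon` are hypothetical, and the numerical values are chosen only for illustration.

```python
import math

def horizon(lam, T):
    """k = [lambda * T]: integer part of a fixed fraction of the sample size."""
    return math.floor(lam * T)

def long_horizon(series, k):
    """Overlapping k-period sums: out[t] = series[t] + ... + series[t+k-1]."""
    return [sum(series[t:t + k]) for t in range(len(series) - k + 1)]

# With lambda = 0.1 and T = 100 the horizon is k = 10 periods.
k = horizon(0.1, 100)

# Toy series: Z plays the role of the summed Y, Q of the summed X.
Y = [1, 2, 3, 4, 5]
X = [2, 0, 1, 0, 2]
Z = long_horizon(Y, 2)   # [3, 5, 7, 9]
Q = long_horizon(X, 2)   # [2, 1, 1, 2]
print(k, Z, Q)
```

A sample of T observations yields T − k + 1 overlapping sums, and adjacent sums share k − 1 terms, which is the source of the overlap in the observations.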

Simulations

The theorems in the previous section provide an asymptotic approximation of the distributions of β̂, t/√T, and R². It is well known that rescaled partial sums converge quickly to their limiting distributions (Stock, 1994). In other words, the asymptotic distributions provide a very accurate approximation of the small-sample distributions even for samples of (say) 100 observations. Here, I conduct Monte Carlo simulations to illustrate some of the points made in the previous section. First, I
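The flavor of such an experiment can be conveyed with a small Monte Carlo sketch. It is not the simulation design of this section: the setup (two independent Gaussian white-noise series, both replaced by overlapping sums, T = 200, λ = 0.2, 200 replications, a nominal 5% normal critical value of 1.96) and all function names are assumptions made for illustration.

```python
import math
import random

def rolling_sum(x, k):
    """Overlapping k-period sums of x."""
    return [sum(x[t:t + k]) for t in range(len(x) - k + 1)]

def t_stat(y, x):
    """Conventional OLS t-statistic for the slope in y = a + b*x + u."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((v - mx) ** 2 for v in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    beta = sxy / sxx
    ssr = sum((b - my - beta * (a - mx)) ** 2 for a, b in zip(x, y))
    return beta / math.sqrt(ssr / (n - 2) / sxx)

def monte_carlo(T=200, lam=0.2, reps=200, seed=1):
    rng = random.Random(seed)
    k = math.floor(lam * T)
    t_values = []
    for _ in range(reps):
        y = [rng.gauss(0.0, 1.0) for _ in range(T)]
        x = [rng.gauss(0.0, 1.0) for _ in range(T)]  # independent of y
        t_values.append(t_stat(rolling_sum(y, k), rolling_sum(x, k)))
    # Share of replications in which the usual normal critical value rejects.
    reject = sum(abs(t) > 1.96 for t in t_values) / reps
    # Sorted |t| / sqrt(T), from which empirical quantiles can be read off.
    rescaled = sorted(abs(t) / math.sqrt(T) for t in t_values)
    return reject, rescaled

reject, rescaled = monte_carlo()
q95 = rescaled[int(0.95 * len(rescaled))]
print(f"share of |t| > 1.96 under independence: {reject:.2f}")
print(f"simulated 95% quantile of |t|/sqrt(T): {q95:.3f}")
```

Although the two underlying series are independent, the conventional test rejects far more often than its nominal 5% level in this setup, whereas the simulated quantiles of the rescaled statistic are the kind of critical values a correctly sized test would use.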

Long-horizon predictability of excess returns using dividend yields

The predictability of excess returns, labeled as one of the “new facts in finance” by Cochrane (1999), is so widely accepted in the profession that it has generated a new wave of models (e.g., Barberis, 2000; Brennan et al., 1997; Campbell and Viceira, 1999; Liu, 1999) that try to analyze the implications of return predictability on portfolio decisions. Fama and French (1988) and Campbell and Shiller (1988) first argued that, unlike short-horizon returns, long-horizon returns can be predicted

Conclusion

I analyze four ways of conducting long-horizon regressions that have frequently been used in empirical finance and macroeconomics. The least squares estimator of the slope coefficient, its t-statistic, and the R² have non-standard asymptotic properties. I reach several conclusions. First, the coefficient is not always consistently estimated. For reliable estimates, one must specify regressions by aggregating both the regressor and the regressand (cases 2 and 4), i.e., running one long-horizon

References (43)

  • N. Barberis. Investing for the long run when returns are predictable. Journal of Finance (2000)
  • J. Boudoukh et al. Stock returns and inflation: a long-horizon perspective. American Economic Review (1993)
  • J. Campbell et al. Cointegration and tests of present value models. Journal of Political Economy (1987)
  • J. Campbell et al. The dividend-price ratio and expectations of future dividends and discount factors. Review of Financial Studies (1988)
  • J. Campbell et al. Consumption and portfolio decisions when expected returns are time varying. Quarterly Journal of Economics (1999)
  • J. Campbell et al. The Econometrics of Financial Markets (1997)
  • C. Cavanagh et al. Inference in models with nearly integrated regressors. Econometric Theory (1995)
  • Cochrane, J., 1999. New Facts in Finance, Unpublished working paper, University of...
  • Ferson, W., Sarkissian, S., Simin, T., 2003. Spurious regressions in financial economics? Journal of Finance,...
  • M. Fisher et al. Long-run neutrality and superneutrality in an ARIMA framework. American Economic Review (1993)
  • W. Goetzmann et al. Testing the predictive power of dividend yields. Journal of Finance (1993)

I have benefited from detailed comments by Mark Watson, Yacine Ait-Sahalia, Sergei Sarkissian, and an anonymous referee, and from discussions with Maria Gonzalez, Pedro Santa-Clara, Walter Torous, Ivo Welch, and participants at the 2000 Western Finance Association meetings in Sun Valley, Idaho. All remaining errors are my own.