About this Book

Since Efron's profound paper on the bootstrap, an enormous amount of effort has been spent on the development of bootstrap, jackknife, and other resampling methods. The primary goal of these computer-intensive methods has been to provide statistical tools that work in complex situations without imposing unrealistic or unverifiable assumptions about the data generating mechanism. The primary goal of this book is to lay some of the foundation for subsampling methodology and related methods.

Table of Contents

Frontmatter

Basic Theory

Frontmatter

1. Bootstrap Sampling Distributions

Abstract
The bootstrap was discovered by Efron (1979), who coined the name. In this chapter, the bootstrap is developed as a general method to approximate the sampling distribution of a statistic, a pivot, or a root (defined below), in order to construct confidence regions for a parameter of interest. The use of the bootstrap to approximate a null distribution in the construction of hypothesis tests is also considered. Much of the theoretical foundation of the bootstrap is laid out in Bickel and Freedman (1981), Singh (1981), and Beran (1984). The development begins by focusing on the independent, identically distributed (i.i.d.) case.
Dimitris N. Politis, Joseph P. Romano, Michael Wolf
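To fix ideas, here is a minimal sketch of the bootstrap approximation described above, assuming the statistic is the sample mean and the root is √n(θ̂ₙ − θ); the function name, tuning values, and the normal test data are illustrative choices, not the book's.

```python
import numpy as np

def bootstrap_root_distribution(x, n_boot=2000, seed=None):
    """Approximate the sampling distribution of the root
    sqrt(n) * (mean(X*) - mean(X)) by resampling with replacement.
    Illustrative sketch: the sample mean stands in for a general statistic."""
    rng = np.random.default_rng(seed)
    n = len(x)
    theta_hat = x.mean()
    roots = np.empty(n_boot)
    for i in range(n_boot):
        resample = rng.choice(x, size=n, replace=True)
        roots[i] = np.sqrt(n) * (resample.mean() - theta_hat)
    return roots

# Equal-tailed 95% confidence interval for the mean from the root quantiles:
x = np.random.default_rng(0).normal(loc=1.0, size=100)
roots = bootstrap_root_distribution(x, seed=1)
lo, hi = np.quantile(roots, [0.025, 0.975])
ci = (x.mean() - hi / np.sqrt(len(x)), x.mean() - lo / np.sqrt(len(x)))
```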

2. Subsampling in the I.I.D. Case

Abstract
In this chapter, a general theory for the construction of confidence intervals or regions is presented. Much of what is presented is extracted from Politis and Romano (1992c, 1994b). The basic idea is to approximate the sampling distribution of a statistic based on the values of the statistic computed over smaller subsets of the data. For example, in the case where the data are n observations that are independent and identically distributed, a statistic is computed based on the entire data set and is then recomputed over all subsets of size b. Implicit is the notion of a statistic sequence, so that the statistic is defined for samples of size n and b. These recomputed values of the statistic are suitably normalized to approximate the true sampling distribution.
Dimitris N. Politis, Joseph P. Romano, Michael Wolf
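A minimal sketch of this recipe, again taking the sample mean as the statistic with normalization √b. Since enumerating all C(n, b) subsets is usually infeasible, the sketch draws random subsets without replacement, a stochastic approximation; the function name and defaults are illustrative.

```python
import numpy as np

def subsample_distribution_iid(x, b, n_subsets=2000, seed=None):
    """Approximate the law of sqrt(n)*(theta_hat_n - theta) by the values
    sqrt(b)*(theta_hat_b - theta_hat_n) over random size-b subsets.
    Illustrative sketch: the sample mean stands in for a general statistic."""
    rng = np.random.default_rng(seed)
    n = len(x)
    theta_hat = x.mean()
    vals = np.empty(n_subsets)
    for i in range(n_subsets):
        idx = rng.choice(n, size=b, replace=False)  # a subset, not a resample
        vals[i] = np.sqrt(b) * (x[idx].mean() - theta_hat)
    return vals
```

Confidence intervals then follow exactly as in the bootstrap sketch above, with the subsampling quantiles taking the place of the bootstrap quantiles.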

3. Subsampling for Stationary Time Series

Abstract
It is well known that inference methods for i.i.d. data or, more generally, independent data are simply not consistent when the underlying sequence is dependent. Therefore, the resampling and subsampling methods discussed in the previous chapters need to be modified to be applicable with time series data.
Dimitris N. Politis, Joseph P. Romano, Michael Wolf
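For dependent data, the natural modification, sketched below under the assumption that the statistic is the sample mean, is to recompute the statistic only over the n − b + 1 contiguous (overlapping) blocks of length b, so that the dependence structure within each block is preserved; the helper name is illustrative.

```python
import numpy as np

def block_subsample_distribution(x, b):
    """Recompute the centered, normalized sample mean over all
    n - b + 1 contiguous blocks of length b.
    Illustrative sketch: the sample mean stands in for a general statistic."""
    x = np.asarray(x, dtype=float)
    theta_hat = x.mean()
    # every contiguous block of length b, as rows of an (n - b + 1, b) view
    blocks = np.lib.stride_tricks.sliding_window_view(x, b)
    return np.sqrt(b) * (blocks.mean(axis=1) - theta_hat)
```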

4. Subsampling for Nonstationary Time Series

Abstract
Stationary time series are very convenient to work with from a mathematical point of view, but the assumption of stationarity is often violated when modeling real-life data. To mention only two examples, many economic time series exhibit seasonal fluctuations, while stock return data typically show time-dependent variability. The goal of this chapter is to demonstrate that the subsampling method is by no means restricted to stationary series. We will provide sufficient conditions under which asymptotically correct inference can be made even in the presence of nonstationarity. In outline and style, this chapter follows the previous one very closely. In particular, the subsampling methodology for nonstationary observations will be identical to the one for stationary observations. Many of the results derived under stationarity will be restated and reproven under weaker conditions. Much of what is presented is taken from Politis, Romano, and Wolf (1997).
Dimitris N. Politis, Joseph P. Romano, Michael Wolf

5. Subsampling for Random Fields

Abstract
Suppose {X(t), t ∈ Gᵈ} is a random field in d dimensions, with d ∈ ℤ⁺; that is, {X(t), t ∈ Gᵈ} is a collection of random variables X(t) taking values in a state space S, defined on a probability space (Ω, A, P), and indexed by the variable t ∈ Gᵈ. Throughout this chapter, G will stand for either the set of real numbers ℝ or the set of integers ℤ; thus, the random field {X(t)} is allowed to “run” in either continuous or discrete “time.” Similarly, G⁺ will denote ℝ⁺ or ℤ⁺ (the sets of positive real numbers and integers, respectively) according to whether G = ℝ or G = ℤ.
Dimitris N. Politis, Joseph P. Romano, Michael Wolf
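As a concrete special case, suppose G = ℤ and d = 2, with the field observed on an n₁ × n₂ rectangle. A minimal sketch of subsampling the field mean over all contiguous b₁ × b₂ subblocks might look as follows; the function name and the choice of statistic are illustrative assumptions.

```python
import numpy as np

def field_block_subsample(field, b1, b2):
    """Recompute the centered, normalized mean of a 2-d field over
    all contiguous b1 x b2 rectangular subblocks.
    Illustrative sketch: the field mean stands in for a general statistic."""
    field = np.asarray(field, dtype=float)
    theta_hat = field.mean()
    # view of shape (n1 - b1 + 1, n2 - b2 + 1, b1, b2)
    blocks = np.lib.stride_tricks.sliding_window_view(field, (b1, b2))
    means = blocks.mean(axis=(2, 3))
    return np.sqrt(b1 * b2) * (means - theta_hat).ravel()
```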

6. Subsampling Marked Point Processes

Abstract
In this chapter, we assume that {X(t), t ∈ ℝᵈ} is a homogeneous random field in d dimensions, with d ∈ ℤ⁺; that is, {X(t), t ∈ ℝᵈ} is a collection of random variables X(t) taking values in an arbitrary state space S, and indexed by the continuous variable t ∈ ℝᵈ. However, for reasons that will become apparent shortly, the probability law of the random field {X(t), t ∈ ℝᵈ} will be denoted by P_X (and not P) throughout this chapter.
Dimitris N. Politis, Joseph P. Romano, Michael Wolf

7. Confidence Sets for General Parameters

Abstract
Let X₁, …, Xₙ denote a realization of a stationary time series. Suppose the infinite-dimensional distribution of the infinite sequence is denoted by P. The problem we consider is inference for a parameter θ(P). The focus of the present chapter is the case when the parameter space Θ is a metric space. The reason for considering such generality is to be able to consider the case when the parameter of interest is an unknown function, such as the marginal distribution of the process or the spectral distribution function of the process. Here, we need to extend the arguments of previous chapters to cover the more general case.
Dimitris N. Politis, Joseph P. Romano, Michael Wolf

Extensions, Practical Issues, and Applications

Frontmatter

8. Subsampling with Unknown Convergence Rate

Abstract
Let X₁, …, Xₙ be an observed stretch of a (strictly) stationary, strong mixing sequence of random variables {X_t, t ∈ ℤ} taking values in an arbitrary sample space S; the probability measure generating the observations is denoted by P. The strong mixing condition means that the sequence \( \alpha_X(k) = \sup_{A,B} |P(A \cap B) - P(A)P(B)| \) tends to zero as k tends to infinity, where A and B are events in the σ-algebras generated by {X_t, t < 0} and {X_t, t ≥ k}, respectively; the case where X₁, …, Xₙ are independent, identically distributed (i.i.d.) is an important special case of the general scenario.
Dimitris N. Politis, Joseph P. Romano, Michael Wolf
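One way to picture the chapter's theme: if the statistic converges at an unknown rate τₙ = n^δ, the spread of the unnormalized subsample values θ̂_b − θ̂ₙ scales like b^(−δ), so δ can be read off a regression of log spread on log b. The sketch below, which assumes the statistic is the sample mean and measures spread by the interquartile range, is only an illustration of this idea; the book's actual estimator differs in its details.

```python
import numpy as np

def estimate_rate_exponent(x, b_grid):
    """Estimate delta in tau_n = n^delta from the slope of
    log IQR(theta_hat_b - theta_hat_n) against log b.
    Illustrative sketch, not the book's estimator."""
    x = np.asarray(x, dtype=float)
    theta_hat = x.mean()
    log_b, log_iqr = [], []
    for b in b_grid:
        blocks = np.lib.stride_tricks.sliding_window_view(x, b)
        vals = blocks.mean(axis=1) - theta_hat      # note: no tau_b scaling
        q25, q75 = np.quantile(vals, [0.25, 0.75])
        log_b.append(np.log(b))
        log_iqr.append(np.log(q75 - q25))
    slope = np.polyfit(log_b, log_iqr, 1)[0]
    return -slope   # IQR ~ const * b^(-delta), so the slope is -delta

# For i.i.d. data with finite variance, the estimate should be near 0.5:
x = np.random.default_rng(0).normal(size=5000)
delta_hat = estimate_rate_exponent(x, np.arange(50, 500, 50))
```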

9. Choice of the Block Size

Abstract
The main practical problem in applying the subsampling method lies in choosing the block size b. This problem is shared by all blocking methods, such as the moving blocks bootstrap or Carlstein’s (1986) variance estimator (see Sections 3.8 and 3.9). The asymptotic conditions, at least for first-order theory, are usually b → ∞ and b/n → 0 as n → ∞. Although any choice of b satisfying these conditions yields the required consistency of subsampling methods, the two asymptotic conditions give little guidance on how to choose b when faced with a finite sample. The aim of this chapter is to provide some guidelines for this task. It turns out that the optimal choice of b depends on the purpose for which subsampling is used.
Dimitris N. Politis, Joseph P. Romano, Michael Wolf
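As one illustration of such a guideline, the sketch below implements a minimum-volatility-style heuristic: compute a subsampling quantile over a grid of candidate block sizes and pick the b at which that quantile is most stable across neighboring candidates. The statistic (the sample mean), the quantile level, and the window width are all placeholder choices, not the book's prescriptions.

```python
import numpy as np

def minimum_volatility_block_size(x, b_grid, level=0.95, half_window=2):
    """Pick the block size b whose subsampling quantile varies least
    across neighboring candidate values of b.
    Illustrative sketch of a minimum-volatility-style rule."""
    x = np.asarray(x, dtype=float)
    theta_hat = x.mean()
    quantiles = []
    for b in b_grid:
        blocks = np.lib.stride_tricks.sliding_window_view(x, b)
        vals = np.sqrt(b) * (blocks.mean(axis=1) - theta_hat)
        quantiles.append(np.quantile(vals, level))
    q = np.asarray(quantiles)
    # local standard deviation of the quantile around each candidate b
    vol = np.array([q[max(0, i - half_window): i + half_window + 1].std()
                    for i in range(len(q))])
    return b_grid[int(np.argmin(vol))]

# Example call on synthetic data:
x = np.random.default_rng(0).normal(size=500)
b_star = minimum_volatility_block_size(x, np.arange(20, 120, 5))
```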

10. Extrapolation, Interpolation, and Higher-Order Accuracy

Abstract
In this chapter, we consider \( \underline{X}_n = (X_1, \ldots, X_n) \) to be an observed stretch of a stationary, strong mixing sequence of real-valued random variables {X_t, t ∈ ℤ}. The probability measure generating the observations is again denoted by P. As mentioned in Appendix A, the strong mixing condition amounts to \( \alpha_X(k) = \sup_{A,B} |P(A \cap B) - P(A)P(B)| \to 0 \) as k tends to infinity, where A and B are events in the σ-algebras generated by {X_t, t < 0} and {X_t, t ≥ k}, respectively. The case where X₁, …, Xₙ are independent, identically distributed (i.i.d.) will be treated here as an important special case where \( \alpha_X(k) = 0 \) for all k > 0.
Dimitris N. Politis, Joseph P. Romano, Michael Wolf

11. Subsampling the Mean with Heavy Tails

Abstract
It has been two decades since Efron (1979) introduced the bootstrap procedure for estimating sampling distributions of statistics based on independent and identically distributed (i.i.d.) observations. While the bootstrap has enjoyed tremendous success and has led to something like a revolution in the field of statistics, it is known to fail in a number of examples. One well-known example is the case of the mean when the observations are heavy-tailed. If the observations are i.i.d. according to a distribution in the domain of attraction of a stable law with index α < 2 (see Feller, 1971), then the sample mean, appropriately normalized, converges in distribution to a stable law. However, Athreya (1987) showed that the bootstrap version of the normalized mean has a random limiting distribution, implying inconsistency of the bootstrap. An alternative proof of Athreya’s result was presented by Knight (1989). Kinateder (1992) gave an invariance principle for symmetric heavy-tailed observations. It has been realized that taking a smaller bootstrap sample size can restore consistency of the bootstrap, but knowledge of the tail index of the limiting law is needed (see Section 1.3 and Athreya, 1985, and Arcones, 1990; also see Wu, Carlstein, and Cambanis, 1993, and Arcones and Giné, 1989).
Dimitris N. Politis, Joseph P. Romano, Michael Wolf
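To make the setting concrete, the toy sketch below subsamples a studentized (self-normalized) mean, which has a nondegenerate limit even under infinite variance, so that blocks of size b ≪ n can be used without knowing the tail index. This is only an illustration in the spirit of the chapter; the data-generating choice (a Pareto tail with α = 1.5) and all constants are placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
n, b = 5000, 200
# Tail index alpha = 1.5 < 2: finite mean, infinite variance
x = rng.pareto(1.5, size=n)

# Self-normalized root: the t-statistic of the mean remains
# nondegenerate in the limit even though the variance is infinite.
theta_hat = x.mean()
blocks = np.lib.stride_tricks.sliding_window_view(x, b)
t_vals = (np.sqrt(b) * (blocks.mean(axis=1) - theta_hat)
          / blocks.std(axis=1, ddof=1))

# Equal-tailed 95% interval for the mean from subsampling quantiles:
lo, hi = np.quantile(t_vals, [0.025, 0.975])
s_n = x.std(ddof=1)
ci = (theta_hat - hi * s_n / np.sqrt(n), theta_hat - lo * s_n / np.sqrt(n))
```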

12. Subsampling the Autoregressive Parameter

Abstract
This chapter is concerned with making inference for ρ in the simple AR(1) model
$$ X_{t} = \mu + \rho X_{t-1} + \epsilon_{t}, $$
(12.1)
where {ε_t} is a strictly stationary white noise innovation sequence and ρ ∈ (−1, 1]. It is well known that if |ρ| < 1, the series {X_t} is stationary, while ρ = 1 corresponds to the unit root case.
Dimitris N. Politis, Joseph P. Romano, Michael Wolf
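In the stationary case |ρ| < 1, a bare-bones version of subsampling the least-squares estimator of ρ over contiguous blocks can be sketched as follows. The √n normalization assumed here fails at the unit root ρ = 1, where the convergence rate changes; names, the simulated model, and constants are illustrative.

```python
import numpy as np

def ar1_ls(x):
    """Least-squares estimate of rho in X_t = mu + rho * X_{t-1} + eps_t."""
    y = x[1:]
    z = np.column_stack([np.ones(len(x) - 1), x[:-1]])
    return np.linalg.lstsq(z, y, rcond=None)[0][1]

rng = np.random.default_rng(0)
n, b, rho = 1000, 50, 0.8
x = np.empty(n)
x[0] = rng.normal()
for t in range(1, n):                       # simulate a stationary AR(1)
    x[t] = 0.2 + rho * x[t - 1] + rng.normal()

# Recompute the estimator over all contiguous blocks of length b:
rho_hat = ar1_ls(x)
blocks = np.lib.stride_tricks.sliding_window_view(x, b)
vals = np.sqrt(b) * np.array([ar1_ls(blk) - rho_hat for blk in blocks])

# Equal-tailed 95% subsampling interval for rho (valid for |rho| < 1):
lo, hi = np.quantile(vals, [0.025, 0.975])
ci = (rho_hat - hi / np.sqrt(n), rho_hat - lo / np.sqrt(n))
```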

13. Subsampling Stock Returns

Abstract
There has been considerable debate in the recent finance literature over whether stock returns are predictable. A number of studies appear to provide empirical support for the use of the current dividend-price ratio, or dividend yield, as a measure of expected stock returns (see, for example, Rozeff, 1984; Campbell and Shiller, 1988b; Fama and French, 1988; Hodrick, 1992; and Nelson and Kim, 1993). The problem with such studies is that stock return regressions face several statistical difficulties, among them strong dependency structures and biases in the estimation of regression coefficients. These problems tend to make findings against the no-predictability hypothesis appear more significant than they really are. Having recognized this, Goetzmann and Jorion (1993) argue that previous findings might be spurious and are largely due to the poor small-sample performance of commonly used inference methods. They employ a bootstrap approach and conclude that there is no strong evidence that dividend yields can be used to forecast stock returns. One should note, however, that their approach is not shown to be backed by theoretical properties. Moreover, it requires a good deal of custom-tailoring to the specific situation at hand; other scenarios would call for a different tailoring.
Dimitris N. Politis, Joseph P. Romano, Michael Wolf

Backmatter

Further Information