Abstract
Panel data of interest here consist of a moderate or relatively large number of panels, while each panel contains only a small number of observations. This paper establishes testing procedures to detect a possible common change in the means of the panels. To this end, we consider a ratio-type test statistic and derive its asymptotic distribution under the no-change null hypothesis. Moreover, we prove the consistency of the test under the alternative. The main advantage of this approach is that the variance of the observations has to be neither known nor estimated. On the other hand, the correlation structure still has to be calculated. To overcome this issue, a bootstrap technique is proposed, yielding a completely data-driven approach without any tuning parameters. The validity of the bootstrap algorithm is shown. As a by-product of the developed tests, we introduce a common break point estimate and prove its consistency. The results are illustrated through a simulation study. An application of the procedure to actuarial data is presented.
References
Andrews DWK (1991) Heteroskedasticity and autocorrelation consistent covariance matrix estimation. Econometrica 59(3):817–858
Bai J (2010) Common breaks in means and variances for panel data. J Econom 157(1):78–92
Billingsley P (1986) Probability and measure, 2nd edn. Wiley, New York
Chan J, Horváth L, Hušková M (2013) Change-point detection in panel data. J Stat Plan Inference 143(5):955–970
Chen Z, Tian Z (2014) Ratio tests for variance change in nonparametric regression. Stat J Theor Appl Stat 48(1):1–16
Csörgő M, Horváth L (1997) Limit theorems in change-point analysis. Wiley, Chichester
Horváth L, Horváth Z, Hušková M (2009) Ratio tests for change point detection. In: Balakrishnan N, Peña EA, Silvapulle MJ (eds) Beyond parametrics in interdisciplinary research: Festschrift in honor of professor Pranab K. Sen, vol 1. IMS Collections, Beachwood, pp 293–304
Horváth L, Hušková M (2012) Change-point detection in panel data. J Time Ser Anal 33(4):631–648
Hušková M, Kirch C (2012) Bootstrapping sequential change-point tests for linear regression. Metrika 75(5):673–708
Hušková M, Kirch C, Prášková Z, Steinebach J (2008) On the detection of changes in autoregressive time series, II. Resampling procedures. J Stat Plan Inference 138(6):1697–1721
Katz ML (1963) Note on the Berry–Esseen theorem. Ann Math Stat 34(3):1107–1108
Lindner AM (2009) Stationarity, mixing, distributional properties and moments of GARCH(p, q)-processes. In: Andersen TG, Davis RA, Kreiss JP, Mikosch T (eds) Handbook of financial time series. Springer, Berlin, pp 481–496
Liu Y, Zou C, Zhang R (2008) Empirical likelihood ratio test for a change-point in linear regression model. Commun Stat Theory Methods 37(16):2551–2563
Madurkayová B (2011) Ratio type statistics for detection of changes in mean. Acta Univ Carol Math Phys 52(1):47–58
Meyers GG, Shi P (2011) Loss reserving data pulled from NAIC Schedule P. http://www.casact.org/research/index.cfm?fa=loss_reserves_data. [Online; Updated September 01, 2011; Accessed June 10, 2014]
Pešta M, Hudecová Š (2012) Asymptotic consistency and inconsistency of the chain ladder. Insur Math Econ 51(2):472–479
Acknowledgments
The authors thank two anonymous referees and the Associate Editor for the suggestions that improved this paper. This paper was written with the support of the Czech Science Foundation Project GAČR No. P201/13/12994P.
Appendices
Appendix 1: Supporting theorems
Suppose that \(\{\varvec{\xi }_n\}_{n=1}^{\infty }\) is a sequence of random variables/vectors defined on a probability space \((\varOmega ,\mathcal {F},\mathsf {P})\). A bootstrap version of \(\varvec{\xi }\equiv [\varvec{\xi }_1,\ldots ,\varvec{\xi }_n]^{\top }\) is its (randomly) resampled sequence with replacement, denoted by \(\varvec{\xi }^*\equiv [\varvec{\xi }_1^*,\ldots ,\varvec{\xi }_n^*]^{\top }\), of the same length, where for each \(i\in \{1,\ldots ,n\}\) it holds that \(\mathsf {P}_{\varvec{\xi }}^*[\varvec{\xi }_i^*=\varvec{\xi }_j]\equiv \mathsf {P}[\varvec{\xi }_i^*=\varvec{\xi }_j|\varvec{\xi }] =1/n,\,j=1,\ldots ,n\). In the sequel, \(\mathsf {P}_{\varvec{\xi }}^*\) denotes the conditional probability given \({\varvec{\xi }}\). So, \(\varvec{\xi }_i^*\) has a discrete uniform distribution on \(\{\varvec{\xi }_1,\ldots ,\varvec{\xi }_n\}\) for every \(i=1,\ldots ,n\). The conditional expectation and variance given \({\varvec{\xi }}\) are denoted by \(\mathsf {E}_{\mathsf {P}_{\varvec{\xi }}^*}\) and \(\mathsf {Var}\,_{\mathsf {P}_{\varvec{\xi }}^*}\), respectively.
If a statistic has an approximate normal distribution, one may be interested in the asymptotic comparison of the bootstrap distribution with the original one. A tool for assessing such an approximate closeness can be a bootstrap central limit theorem for triangular arrays.
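As an aside (not part of the paper), the resampling scheme defined above can be sketched in a few lines; the function name `bootstrap_panels` and the toy array are our own illustration:

```python
import numpy as np

def bootstrap_panels(xi, rng):
    """Resample the n rows (panels) of xi with replacement.

    Mirrors the definition above: each bootstrapped panel xi*_i is drawn
    uniformly from {xi_1, ..., xi_n}, independently across i.
    """
    n = xi.shape[0]
    idx = rng.integers(0, n, size=n)   # iid uniform indices on {0, ..., n-1}
    return xi[idx]

rng = np.random.default_rng(0)
xi = np.arange(12.0).reshape(4, 3)     # n = 4 panels with T = 3 observations each
xi_star = bootstrap_panels(xi, rng)
print(xi_star.shape)                   # same dimensions as the original sample
```

Note that whole panels are resampled, so the dependence structure within each panel is preserved.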
Theorem 6
(Bootstrap CLT for triangular arrays) Let \(\{\xi _{n,k_n}\}_{n=1}^{\infty }\) be a triangular array of zero mean random variables on the same probability space such that the elements of the vector \([\xi _{n,1},\ldots ,\xi _{n,k_n}]^{\top }\) are iid for every \(n\in \mathbb {N}\) satisfying
and \(k_n\rightarrow \infty \) as \(n\rightarrow \infty \). Suppose that \(\varvec{\xi }^*\equiv [\xi _{n,1}^*,\ldots ,\xi _{n,k_n}^*]^{\top }\) is the bootstrapped version of \(\varvec{\xi }\equiv [\xi _{n,1},\ldots ,\xi _{n,k_n}]^{\top }\) and denote
If
then
Theorem 7
(Bootstrap multivariate CLT for triangular arrays) Let \(\{\varvec{\xi }_{n,k_n}\}_{n=1}^{\infty }\) be a triangular array of zero mean \(q\)-dimensional random vectors on the same probability space such that the elements of the vector sequence \(\{\varvec{\xi }_{n,1},\ldots ,\varvec{\xi }_{n,k_n}\}\) are iid for every \(n\in \mathbb {N}\) satisfying
where \(\varvec{\xi }_{n,1}\equiv [\xi _{n,1}^{(1)},\ldots ,\xi _{n,1}^{(q)}]^{\top }\in \mathbb {R}^q,\,n\in \mathbb {N}\) and \(k_n\rightarrow \infty \) as \(n\rightarrow \infty \). Assume that \(\varvec{\varXi }^*\equiv [\varvec{\xi }_{n,1}^*,\ldots ,\varvec{\xi }_{n,k_n}^*]^{\top }\) is the bootstrapped version of \(\varvec{\varXi }\equiv [\varvec{\xi }_{n,1},\ldots ,\varvec{\xi }_{n,k_n}]^{\top }\). Denote
If
then
Appendix 2: Proofs
Proof (of Theorem 1)
Let us define
Using the multivariate Lindeberg–Lévy CLT for a sequence of \(T\)-dimensional iid random vectors \(\{[\sum _{s=1}^1\varepsilon _{i,s},\ldots , \sum _{s=1}^T\varepsilon _{i,s}]^{\top }\}_{i\in \mathbb {N}}\), we have under \(H_0\)
since \(\mathsf {Var}\,[\sum _{s=1}^1\varepsilon _{1,s},\ldots ,\sum _{s=1}^T\varepsilon _{1,s}]^{\top }=\varvec{\varLambda }\). Indeed, the \(t\)-th diagonal element of the covariance matrix \(\varvec{\varLambda }\) is
and the upper off-diagonal element on position \((t,v)\) is
Moreover, let us define the reverse analogue to \(U_N(t)\), i.e.,
Hence,
and, consequently,
Using the Cramér–Wold device, we end up with
\(\square \)
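For the reader's convenience, the Cramér–Wold device invoked in the last step reduces multivariate convergence in distribution to that of all univariate projections (here with dimension \(q=T\)):

```latex
\[
  \mathbf{X}_N \xrightarrow[N\to\infty]{\mathscr{D}} \mathbf{X}
  \quad\Longleftrightarrow\quad
  \mathbf{t}^{\top}\mathbf{X}_N \xrightarrow[N\to\infty]{\mathscr{D}} \mathbf{t}^{\top}\mathbf{X}
  \quad \text{for every fixed } \mathbf{t}\in\mathbb{R}^{T}.
\]
```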
Proof (of Theorem 2)
Let \(t=\tau +1\). Then, under alternative \(H_1\)
where \(\bar{\varepsilon }_{i,\tau +1}=\frac{1}{\tau +1}\sum _{v=1}^{\tau +1}\varepsilon _{i,v}\).
Since there is no change after \(\tau +1\) and \(\tau \le T-3\), Theorem 1 yields
\(\square \)
Proof (of Theorem 3)
Let us define \(S_N^{(i)}(t):=\frac{1}{t}\sum _{s=1}^t(Y_{i,s}-\bar{Y}_{i,t})^2\) and, consequently, \(S_N(t):=\frac{1}{N}\sum _{i=1}^N S_N^{(i)}(t)\). Then,
where \(\bar{\varepsilon }_{i,t}=\frac{1}{t}\sum _{s=1}^t\varepsilon _{i,s}\). By the definition of the cumulative autocorrelation function, we have for \(2\le t\le \tau \)
In the other case when \(t>\tau \), one can calculate
Realize that \(S_N^{(i)}(t)-\mathsf {E}S_N^{(i)}(t)\) are independent with zero mean for fixed \(t\) and \(i=1,\ldots ,N\). Due to Assumption C2, for \(2\le t\le \tau \) it holds
where \(C_1(t,\sigma )>0\) is some constant not depending on \(N\). If \(t>\tau \), then
where \(C_j(t,\tau ,\sigma )>0\) does not depend on \(N\) for \(j=2,3,4\).
The Chebyshev inequality provides \(S_N(t)-\mathsf {E}S_N(t)=\mathcal {O}_{\mathsf {P}}\left( \sqrt{\mathsf {Var}\,S_N(t)}\right) \) as \(N\rightarrow \infty \). According to Assumption C1 and the Cauchy–Schwarz inequality, we have
Since the index set \(\{1,\ldots ,T\}\) is finite and \(\tau \) is finite as well, then
where \(K_j(\sigma )>0\) are constants not depending on \(N\) for \(j=1,2,3,4\). Thus, we also have uniform stochastic boundedness, i.e.,
Adding and subtracting, one has
The above inequality holds for each \(t\in \{2,\ldots ,T\}\) and, particularly, it holds for \(\widehat{\tau }_N\). Note that \(\widehat{\tau }_N=\arg \max _tS_N(t)\). Hence, \(S_N(\tau )-S_N(\widehat{\tau }_N)\le 0\). Therefore,
If \(\widehat{\tau }_N>\tau \), then the left-hand side of (9) is \(\mathcal {O}_{\mathsf {P}}(1)\) as \(N\rightarrow \infty \), but the right-hand side is unbounded because of Assumption C1. So, if \(\widehat{\tau }_N\le \tau \), then
which yields due to the monotonicity of \(r(t)/t^2\) that \(\mathsf {P}[\widehat{\tau }_N=\tau ]\rightarrow 1\) as \(N\rightarrow \infty \). \(\square \)
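To make the objects in this proof concrete, here is a small numerical sketch (ours, not from the paper) of \(S_N^{(i)}(t)\), \(S_N(t)\), and the break point estimate \(\widehat{\tau }_N=\arg \max _t S_N(t)\); the toy panel and the function names are illustrative assumptions:

```python
import numpy as np

def S_N(Y, t):
    """S_N(t) = (1/N) sum_i (1/t) sum_{s<=t} (Y_{i,s} - Ybar_{i,t})^2."""
    head = Y[:, :t]                                  # first t observations of each panel
    centered = head - head.mean(axis=1, keepdims=True)
    return np.mean(np.sum(centered**2, axis=1) / t)

def tau_hat(Y):
    """Common break point estimate: argmax of S_N(t) over t = 2, ..., T."""
    T = Y.shape[1]
    return max(range(2, T + 1), key=lambda t: S_N(Y, t))

Y = np.array([[0.0, 2.0, 4.0]])                      # N = 1 panel, T = 3
print(S_N(Y, 2), S_N(Y, 3))                          # 1.0 and 8/3
```

A hand check of the toy panel: for \(t=2\), \(\bar{Y}_{1,2}=1\) and \(S_N(2)=\frac{1}{2}(1+1)=1\); for \(t=3\), \(\bar{Y}_{1,3}=2\) and \(S_N(3)=\frac{1}{3}(4+0+4)=8/3\).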
Proof (of Theorem 4)
Let us define \(\widehat{\epsilon }_{i,t}:=\sigma ^{-1}\sum _{s=1}^t\widehat{e}_{i,s}\), \(\widehat{\epsilon }_{i,t}^*:=\sigma ^{-1}\sum _{s=1}^t\widehat{e}_{i,s}^*\),
and
Realize that \(\widehat{\epsilon }_{i,t}\) depends on \(\widehat{\tau }_N\) and, hence, on \(N\). Thus, \(\widehat{\epsilon }_{i,t}\equiv \widehat{\epsilon }_{i,t}(N)\). Since Assumption C2 holds, the bootstrap multivariate CLT for triangular arrays (Theorem 7), applied to the \(T\)-dimensional vectors \(\varvec{\xi }_{N,i}=[\widehat{\epsilon }_{i,1}(N),\ldots ,\widehat{\epsilon }_{i,T}(N)]^{\top }\) with \(k_N=N\), gives
where \(\varvec{\varGamma }_N=\mathsf {Var}\,[\widehat{\epsilon }_{i,1},\ldots , \widehat{\epsilon }_{i,T}]^{\top }\).
Now, it is sufficient to realize that \([\widehat{U}_N(1),\ldots ,\widehat{U}_N(T)]^{\top }\) has an approximate multivariate normal distribution with zero mean and covariance matrix \(\varvec{\varGamma }=\lim _{N\rightarrow \infty }\varvec{\varGamma }_N\). Using the law of total variance,
Since \(\lim _{N\rightarrow \infty }\mathsf {P}[\widehat{\tau }_N=\tau ]=1\) and \(\mathsf {E}[\widehat{e}_{i,t}|\widehat{\tau }_N=\tau ]=0\), then
Similarly with the covariance, i.e., after applying the law of total covariance, we have
Note that
where
Taking into account the definitions of \(r(t)\), \(R(t,v)\), and \(S(t,v,d)\) together with some simple algebra, we obtain that \(\mathsf {Var}\,[\widehat{\epsilon }_{i,t}|\widehat{\tau }_N=\tau ]=\gamma _{t,t}(\tau )\) and \(\mathsf {Cov}\,\left( \widehat{\epsilon }_{i,t},\widehat{\epsilon }_{i,v}|\widehat{\tau }_N=\tau \right) =\gamma _{t,v}(\tau )\) for \(t<v\), where the elements \(\gamma _{t,t}(\tau )\) and \(\gamma _{t,v}(\tau )\) are as in the statement of Theorem 4.
Then the sum in the numerator of \(\mathcal {R}_N^*(T)\) can alternatively be rewritten as
Concerning the denominator of \(\mathcal {R}_N^*(T)\), one needs to perform a calculation similar to that in the proof of Theorem 1 with \(V_N(t)\), i.e., to define \(\widehat{V}_N(t)\) and \(\widehat{V}_N^*(t)\) analogously to \(\widehat{U}_N(t)\) and \(\widehat{U}_N^*(t)\), just as \(V_N(t)\) relates to \(U_N(t)\). Applying the Cramér–Wold theorem completes the proof. \(\square \)
Proof (of Theorem 5)
Recall the notation from the proof of Theorem 4. Under \(H_0\), B2, and C2 it holds
Then in view of (4),
\(\square \)
Proof (of Theorem 6)
The Lyapunov condition (Billingsley 1986, p. 371) for the triangular array of random variables \(\{\xi _{n,k_n}\}_{n=1}^{\infty }\) is satisfied due to (5) and (6), i.e., for \(\omega =2\):
Therefore, the CLT for \(\{\xi _{n,k_n}\}_{n=1}^{\infty }\) holds and
Now, to prove the theorem, it suffices to show the following three statements:
(i) \(\sup _{x\in \mathbb {R}}\left| \mathsf {P}_{\varvec{\xi }}^*\left[ \frac{\sqrt{k_n}}{\sqrt{\mathsf {Var}\,_{\mathsf {P}_{\varvec{\xi }}^*}\xi _{n,1}^*}}\left( \bar{\xi }_n^*-\mathsf {E}_{\mathsf {P}_{\varvec{\xi }}^*}\bar{\xi }_n^*\right) \le x\right] -\int _{-\infty }^x\frac{1}{\sqrt{2\pi }}\exp \left\{ -\frac{t^2}{2}\right\} \text{ d }t\right| \xrightarrow [n\rightarrow \infty ]{\mathsf {P}}0\);
(ii) \(\mathsf {Var}\,_{\mathsf {P}_{\varvec{\xi }}^*}\xi _{n,1}^*-\varsigma _n^2\xrightarrow [n\rightarrow \infty ]{\mathsf {P}}0\);
(iii) \(\mathsf {E}_{\mathsf {P}_{\varvec{\xi }}^*}\bar{\xi }_n^*=\bar{\xi }_n,\, [\mathsf {P}]-a.s.\)
Proving (iii) is trivial, because \(\mathsf {E}_{\mathsf {P}_{\varvec{\xi }}^*}\bar{\xi }_n^*=\mathsf {E}_{\mathsf {P}_{\varvec{\xi }}^*}\xi _{n,1}^*=k_n^{-1}\sum _{i=1}^{k_n}\xi _{n,i}=\bar{\xi }_n,\, [\mathsf {P}]\)-\(a.s.\)
Let us calculate the conditional variance of the bootstrapped variable \(\xi _{n,1}^*\): \(\mathsf {Var}\,_{\mathsf {P}_{\varvec{\xi }}^*}\xi _{n,1}^*=\mathsf {E}_{\mathsf {P}_{\varvec{\xi }}^*}\xi _{n,1}^{*2}-(\mathsf {E}_{\mathsf {P}_{\varvec{\xi }}^*}\xi _{n,1}^*)^2=k_n^{-1}\sum _{i=1}^{k_n}\xi _{n,i}^2-\left( k_n^{-1}\sum _{i=1}^{k_n}\xi _{n,i}\right) ^2,\, [\mathsf {P}]\)-\(a.s.\) The weak law of large numbers together with (5) provides
and
The last result of the WLLN is true, because (5) implies
Thus (ii) is proved.
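The plug-in identities used in (ii) and (iii) follow because \(\xi _{n,1}^*\) is uniform on the observed sample, so its conditional moments are plain empirical moments. A toy numerical check (ours):

```python
import numpy as np

xi = np.array([1.0, 2.0, 3.0, 6.0])       # observed sample; we condition on it

# xi*_1 is uniform on {xi_1, ..., xi_n}, hence:
E_star = xi.mean()                        # E*[xi*_1]  = sample mean
Var_star = np.mean(xi**2) - xi.mean()**2  # Var*[xi*_1] = mean of squares - squared mean

print(E_star, Var_star)                   # 3.0 3.5
```

Here \((1+4+9+36)/4 = 12.5\) and \(12.5 - 3^2 = 3.5\), confirming the formula for \(\mathsf {Var}\,_{\mathsf {P}_{\varvec{\xi }}^*}\xi _{n,1}^*\) above.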
The Berry–Esseen–Katz theorem (see Katz 1963) with \(g(x)=|x|^{\epsilon },\,\epsilon >0\), applied to the bootstrapped sequence of iid (with respect to \(\mathsf {P}_{\varvec{\xi }}^*\)) random variables \(\{\xi _{n,i}^*\}_{i=1}^{k_n}\), results in
for all \(n\in \mathbb {N}\), where \(C>0\) is an absolute constant.
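In particular, with the choice \(g(x)=|x|^{\epsilon }\), \(0<\epsilon \le 1\), the bound (10) should take the following form (our reconstruction from the Katz bound; \(\varPhi \) denotes the standard normal cdf):

```latex
\[
  \sup_{x\in\mathbb{R}}\left|
    \mathsf{P}^{*}_{\boldsymbol{\xi}}\!\left[
      \frac{\sqrt{k_n}}{\sqrt{\mathsf{Var}_{\mathsf{P}^{*}_{\boldsymbol{\xi}}}\xi^{*}_{n,1}}}
      \left(\bar{\xi}^{*}_{n}-\mathsf{E}_{\mathsf{P}^{*}_{\boldsymbol{\xi}}}\bar{\xi}^{*}_{n}\right)\le x
    \right]-\varPhi(x)\right|
  \le
  \frac{C\,\mathsf{E}_{\mathsf{P}^{*}_{\boldsymbol{\xi}}}
        \left|\xi^{*}_{n,1}-\mathsf{E}_{\mathsf{P}^{*}_{\boldsymbol{\xi}}}\xi^{*}_{n,1}\right|^{2+\epsilon}}
       {\left(\mathsf{Var}_{\mathsf{P}^{*}_{\boldsymbol{\xi}}}\xi^{*}_{n,1}\right)^{1+\epsilon/2}\,k_n^{\epsilon/2}}.
\]
```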
The Jensen and Minkowski inequalities provide an upper bound for the numerator on the right-hand side of (10):
The right-hand side of the previously derived upper bound is uniformly bounded in probability \(\mathsf {P}\), because of Markov’s inequality and (5). Indeed, for fixed \(\eta >0\)
and
Since \(\mathsf {E}_{\mathsf {P}_{\varvec{\xi }}^*}|\xi _{n,1}^*-\mathsf {E}_{\mathsf {P}_{\varvec{\xi }}^*}\xi _{n,1}^*|^{2+\epsilon }\) is bounded in probability \(\mathsf {P}\) uniformly over \(n\) and the denominator of the right-hand side of (10) is uniformly bounded away from zero due to (6), the left-hand side of (10) converges in probability \(\mathsf {P}\) to zero as \(n\) tends to infinity. So, (i) is proved as well. \(\square \)
Proof (of Theorem 7)
According to the Cramér–Wold theorem, it is sufficient to verify that all assumptions of the one-dimensional bootstrap CLT for triangular arrays (Theorem 6) hold for any linear combination of the elements of the random vector \(\varvec{\xi }_{n,1},\,n\in \mathbb {N}\).
For arbitrary fixed \(\mathbf {t}\in \mathbb {R}^q\) using the Jensen inequality, we get
Hence, assumption (7) implies assumption (5) for the random variables \(\{\mathbf {t}^{\top }\varvec{\xi }_{n,k_n}\}_{n\in \mathbb {N}}\).
Similarly, assumption (8) implies assumption (6) for such an arbitrary linear combination, i.e., positive definiteness of the matrix \(\varvec{\varGamma }\) yields
\(\square \)
Cite this article
Peštová, B., Pešta, M. Testing structural changes in panel data with small fixed panel size and bootstrap. Metrika 78, 665–689 (2015). https://doi.org/10.1007/s00184-014-0522-8