This paper focuses on detecting multiple change points for linear processes under negatively super-additive dependence (NSD). We propose a CUSUM-type method for the multiple variance change model and establish the weak convergence rate of the change point estimator. To carry out this method, we give a multiple variance-change iterative (MVCI) algorithm. Additionally, some simulations are implemented to substantiate the validity of the CUSUM-type method. Comparison with some of the best available methods indicates that the CUSUM-type change point estimation is computationally competitive and superior in terms of the mean squared error (MSE).
Notes
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
1 Introduction
As a common feature of ‘big data’, change points arise in many areas such as signal processing (Basseville [1]), finance (Chen and Gupta [2]), ecology (Hawkins [3]), disease outbreak watch (Sparks et al. [4]), and neuroscience (Ratnam et al. [5]; Lena et al. [6]) and have been much investigated in the last few decades. To detect a change point and estimate its location, a number of approaches have emerged, including least squares (LS, Bai [7]), the Bayesian method (Fearnhead [8]), maximum likelihood (Zou et al. [9]), and some nonparametric methods (Matteson and James [10]; Haynes et al. [11]). The cumulative sum (CUSUM) method, based on the LS estimation, is particularly attractive for detecting a variance change in a sequence because it avoids assumptions about the underlying error distribution function and is simple to compute (Gombay et al. [12]). For independent sequences, Gombay et al. [12] constructed the CUSUM statistic to detect and estimate the change of variance. Wang and Wang [13] used the CUSUM test to detect the variance change in a linear process with long memory errors. Zhao et al. [14] considered the ratio test for variance change in a linear process. Qin et al. [15] investigated the strong convergence rate of the CUSUM estimator of the variance change in linear processes.
However, most of the references above assume that a sequence contains only one change point, which is a serious restriction in practical problems. For multiple change point detection, Inclán and Tiao [16] employed the cumulative sums of squares to detect multiple changes of variance in uncorrelated sequences. Lavielle [17] obtained the convergence rate for multiple change detection for strongly mixing and strongly dependent processes. Li and Zhao [18] gave the convergence rate for multiple change point estimation of moving-average processes. More recently, Haynes et al. [11] proposed a computationally efficient nonparametric approach for change point detection, and Laurentiu et al. [19] offered a Bayesian loss-based approach to the change point problem. But both of them require information about the underlying error distribution function, which may increase the computational complexity.
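To make the cumulative-sums-of-squares idea concrete, the following minimal Python sketch locates a single variance change by maximizing the centered statistic \(| C_{k}/C_{n} - k/n |\), in the spirit of Inclán and Tiao [16], where \(C_{k}\) is the cumulative sum of squared deviations. It is an illustration under simplifying assumptions (one change point, centering by the sample mean) and is not the estimator studied in this paper; the function name `cusum_of_squares_change` is hypothetical.

```python
def cusum_of_squares_change(y):
    """Return (k_hat, d_max), where k_hat maximizes |C_k / C_n - k / n|.

    C_k is the cumulative sum of squared deviations from the sample mean.
    Centering by the sample mean is an assumption made for this sketch.
    """
    n = len(y)
    mean = sum(y) / n
    squares = [(v - mean) ** 2 for v in y]
    total = sum(squares)
    if total == 0:
        raise ValueError("constant series: no variance to compare")
    best_k, best_d, running = 0, -1.0, 0.0
    for k in range(1, n):              # candidate change after observation k
        running += squares[k - 1]
        d = abs(running / total - k / n)
        if d > best_d:
            best_k, best_d = k, d
    return best_k, best_d


# Deterministic toy series: the scale jumps from 1 to 3 after t = 100.
y = [1.0, -1.0] * 50 + [3.0, -3.0] * 50
k_hat, d_max = cusum_of_squares_change(y)
```

On this toy series the statistic peaks exactly at the scale jump, because the share of the total sum of squares accumulated before the jump lags furthest behind the share of observations at that point.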
In this contribution, we consider the following multiple variance change model:
$$ {{Y_{t}} = \mu + {\sigma _{i}} {e_{t}},\quad t_{i-1}^{*} < t \le t_{i} ^{*}, 1 \le i \le r + 1,} \tag{1} $$
where r is the known number of change points, μ and \({\sigma _{i}}\) (\(1 \le i \le r + 1\)) are parameters, \(t_{i}^{*} \), \(1 \le i \le r\), are the true change locations with \(t_{0}^{*} = 0\), \(t_{r + 1}^{*} = n\), and \(t_{i}^{*} = [ {{\tau _{i}^{*}}n} ]\), where \([x]\) denotes the integer part of x, \(\boldsymbol{\tau } = ( {\tau _{1}^{*} ,\tau _{2}^{*} , \ldots ,\tau _{r}^{*}} )\) are the change points, and \({e_{t}}\) is a linear process given by
$$ {e_{t}} = \sum_{j = 0}^{\infty } {{a_{j}}{\varepsilon _{t - j}}}, \tag{2} $$
where \(\{ {a_{j}} \}\) is a sequence of real numbers satisfying \(\sum_{j = 0}^{\infty }{a_{j}^{2}} < \infty \), and \(\{ {{\varepsilon _{m}},m \in \mathbb{Z}} \}\) are stationary random variables.
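As an illustration of model (1), the sketch below generates a series with piecewise-constant scale driven by a linear process. It rests on several assumptions that are not part of the model: the linear process is taken in the one-sided moving-average form \({e_{t}} = \sum_{j \ge 0} {a_{j}}{\varepsilon _{t - j}}\) and truncated at finitely many lags, and the innovations are i.i.d. standard normal (independence being a special case of NSD). The function names are hypothetical; the parameter values are those used later in the simulation section.

```python
import random


def linear_process(n, coeffs, rng):
    """e_t = sum_j coeffs[j] * eps_{t-j}, truncated to len(coeffs) lags."""
    q = len(coeffs)
    eps = [rng.gauss(0.0, 1.0) for _ in range(n + q - 1)]
    # eps[q - 1 + t - j] plays the role of eps_{t-j} for t = 0, ..., n - 1.
    return [sum(coeffs[j] * eps[q - 1 + t - j] for j in range(q))
            for t in range(n)]


def variance_change_series(n, mu, sigmas, change_locs, coeffs, seed=0):
    """Y_t = mu + sigma_i * e_t on segment i, with t_1^* < ... < t_r^*."""
    if len(sigmas) != len(change_locs) + 1:
        raise ValueError("need one sigma per segment (r + 1 in total)")
    rng = random.Random(seed)
    e = linear_process(n, coeffs, rng)
    bounds = list(change_locs) + [n]
    y, seg = [], 0
    for t in range(1, n + 1):          # t is 1-based, as in the text
        while t > bounds[seg]:         # move on once t passes t_i^*
            seg += 1
        y.append(mu + sigmas[seg] * e[t - 1])
    return y


coeffs = [2.0 ** -j for j in range(30)]    # a_j = 2^{-j}, truncated at 30 lags
y = variance_change_series(500, 0.0, [2, 4, 8, 4, 2],
                           [100, 200, 300, 400], coeffs)
```

Truncating at 30 lags discards coefficients smaller than \(2^{-29}\), which is negligible at this sample size; a slower-decaying \(\{ {a_{j}} \}\) would call for more lags.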
Under independence or dependence assumptions on \(\{{{\varepsilon _{m}},m \in \mathbb{Z}} \}\), the convergence rates of the single change point estimators have been established for the linear processes (2). We refer to Bai [7] and Qin et al. [15] for the independence case, to Li and Zhao [18] for linear negative quadrant dependence, and to Wang and Wang [13] for long range dependence. In this article, we consider the multiple variance change model in which \(\{ {{\varepsilon _{m}},m \in \mathbb{Z}} \}\) are negatively super-additive dependent (NSD), a notion whose definition is based on super-additive functions.
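A function \(\phi : \mathbb{R}^{n} \to \mathbb{R}\) is called super-additive if \(\phi ( {\boldsymbol{x} \vee \boldsymbol{y}} ) + \phi ( {\boldsymbol{x} \wedge \boldsymbol{y}} ) \ge \phi ( \boldsymbol{x} ) + \phi ( \boldsymbol{y} )\) for all \(\boldsymbol{x}, \boldsymbol{y} \in \mathbb{R}^{n}\), where ∨ and ∧ denote the componentwise maximum and minimum, respectively. A random vector \(( {{X_{1}},{X_{2}}, \ldots ,{X_{n}}} )\) is said to be NSD if
$$ E\phi ( {{X_{1}},{X_{2}}, \ldots ,{X_{n}}} ) \le E\phi \bigl( {X_{1}^{*} ,X_{2}^{*} , \ldots ,X_{n}^{*} } \bigr), \tag{3} $$
where \(\{ {X_{m}^{*},1 \le m \le n} \}\) are independent random variables such that \(X_{m}^{*}\) has the same marginal distribution as \(X_{m}\) for each m, and ϕ is a super-additive function such that the expectations in (3) exist.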
A sequence of random variables \(( {{X_{1}},{X_{2}}, \ldots ,{X_{n}}, \ldots } )\) is called NSD if, for all \(n \ge 1\), \(( {{X_{1}},{X_{2}}, \ldots ,{X_{n}}} )\) is NSD.
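As a worked example of the definition (a standard fact, stated here under the usual lattice definition of super-additivity, \(\phi ( {\boldsymbol{x} \vee \boldsymbol{y}} ) + \phi ( {\boldsymbol{x} \wedge \boldsymbol{y}} ) \ge \phi ( \boldsymbol{x} ) + \phi ( \boldsymbol{y} )\)), the product \(\phi ( {x_{1},x_{2}} ) = x_{1}x_{2}\) is super-additive. The defining inequality is symmetric in \(\boldsymbol{x}\) and \(\boldsymbol{y}\), so we may take \(x_{1} \ge y_{1}\). If also \(x_{2} \ge y_{2}\), then \(\boldsymbol{x} \vee \boldsymbol{y} = \boldsymbol{x}\) and \(\boldsymbol{x} \wedge \boldsymbol{y} = \boldsymbol{y}\), and equality holds. If instead \(x_{2} < y_{2}\), then \(\boldsymbol{x} \vee \boldsymbol{y} = ( {x_{1},y_{2}} )\), \(\boldsymbol{x} \wedge \boldsymbol{y} = ( {y_{1},x_{2}} )\), and
$$ x_{1}y_{2} + y_{1}x_{2} - x_{1}x_{2} - y_{1}y_{2} = ( {x_{1} - y_{1}} ) ( {y_{2} - x_{2}} ) \ge 0. $$
Consequently, for an NSD pair \(( {X_{1},X_{2}} )\) with finite second moments, comparing with independent copies gives
$$ E ( {X_{1}X_{2}} ) \le E \bigl( {X_{1}^{*} X_{2}^{*} } \bigr) = E ( {X_{1}} )E ( {X_{2}} ), $$
that is, NSD random variables are non-positively correlated.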
NSD has received considerable attention since it includes the well-known negative association (see Christofides and Vaggelatou [22]). Eghbal et al. [23] explored the strong law of large numbers and the rate of convergence for NSD sequences under the existence of high order moments. Shen et al. [24] and Wu et al. [25] obtained the almost sure and complete convergence, respectively, for NSD random variables. Wang et al. [26] investigated the complete convergence, and Yu et al. [27] established the central limit theorem for weighted sums of NSD random variables. Moreover, NSD samples have been introduced to various models. For example, under NSD errors, Yu et al. [27] considered the M-test problem of regression parameters in a linear model; Wang et al. [28] studied the strong consistency and weak consistency of the LS estimators in an EV regression model; and Yu et al. [29] obtained the convergence rates of the wavelet thresholding estimators in a nonparametric regression model.
The aim of this study is to detect multiple change points for linear processes under NSD. We propose the CUSUM-type change point estimator in model (1) and establish the weak convergence rate of the estimator with the mean parameter μ estimated by its LS estimator. Moreover, some simulations are implemented in R to compare the CUSUM-type estimator with several existing methods. The results indicate that the CUSUM-type change point estimator is broadly comparable with those obtained by the typical methods.
The remainder of this paper is organized as follows. In Sect. 2, we describe the CUSUM-type multiple change point estimation and give the weak convergence rate of this estimator. Also, we give a multiple variance-change iterative (MVCI) algorithm to evaluate the estimator. In Sect. 3, some simulations are presented to show the performances of the estimator. Finally, the proofs of the main results are given in Sect. 4.
2 Estimation and main results
Let \({\tilde{Y}_{j}} = {Y_{j}} - {\hat{\mu }_{n}}\), where \({\hat{\mu } _{n}} = \frac{1}{n}\sum_{t = 1}^{n} {{Y_{t}}} \) is the LS estimator of the mean μ. Assume that
Denote \(\hat{\boldsymbol{\tau }}^{{\delta _{n}}} = {{\hat{\boldsymbol{t}} ^{{\delta _{n}}}} / n}\), the CUSUM-type multiple change point estimator is given by
Conditions (A1) and (A2) are easily satisfied (see Yu et al. [29]). Condition (A3) is often applied to obtain the convergence rate of a change point estimator (e.g., Qin et al. [15]; Shi et al. [30]). Condition (A4) is weaker than that of Bai [7], which requires \(\sum_{j = 0}^{\infty }{j| {{a_{j}}} |} < \infty \). Furthermore, condition (A4) implies that \(\sum_{j = 0}^{\infty }{a_{j}^{2}} < \infty \) and \(\sum_{j = 0}^{\infty }{a_{j}^{4}} < \infty \).
Theorem 1
Assume that conditions (A1)–(A4) hold. Then, for all \(1 \le j \le r\), we have
When the mean μ is known, Qin et al. [15] established the strong convergence of the CUSUM estimator. It is obvious that Theorem 1 is still true when μ is known, and we will give the following corollary without proof.
Corollary 1
If the mean is known (say \(\mu = 0\)) and conditions (A1)–(A4) hold, then the conclusion of Theorem 1 remains valid.
Under assumptions (A1)–(A4), we can further establish the convergence rate of the CUSUM-type multiple change point estimator \(\hat{\boldsymbol{\tau }}^{{\delta _{n}}}\).
Theorem 2
Let \(M(n)\) be a sequence of natural numbers with \(M(n)\rightarrow \infty \). Then, under the conditions of Theorem 1, we further have
To implement the CUSUM-type multiple change-point method, we also give the multiple variance-change iterative (MVCI) algorithm based on Qin et al. [15] and Shi et al. [30] as follows:
Step 2. Set \(i = 1\), \(m = 0\), and \(l = [ {n{\delta _{n}}} ]\). Divide the sample into L subintervals \({I_{j}}\) with the equal interval length l.
Step 3. For each subinterval \({I_{j}}\), \(j = 1,2, \ldots ,L\), find \(\hat{t}_{j}^{ ( i )} = \arg \max _{t \in ( {1 + m, m + l} )} R ( {{t_{j}}} )\).
Step 4. Compute the set \(\Delta = \{ {R ( {\hat{t} _{j}^{ ( i )}} )} \}\), and select r change locations which correspond to r maximum values of \(R ( {\hat{t} _{j}^{ ( i )}} )\) in the set Δ.
Step 5. For the selected r change locations \(\hat{t}_{j} ^{ ( i )}\), \(j = 1, \ldots ,r\), find \(\hat{t}_{j}^{ ( {i + 1} )} = \arg \max _{t \in ( {\hat{t}_{j}^{ ( i )} - 2M ( l ), \hat{t}_{j}^{ ( i )} + 2M ( l )} )} R ( {{t_{j}}} )\).
Step 6. Set \(l = 4M ( l )\) and \(m = \hat{t}_{j} ^{ ( i )} - 2M ( l )\).
Step 7. If \({ \Vert { \hat{\boldsymbol{t}} ^{ ( {i + 1} )} - \hat{\boldsymbol{t}}^{ ( i )}} \Vert _{ \infty }} < \eta \), then proceed to Step 8; otherwise set \(i=i+1\) and go back to Step 3.
Step 8. \({ \hat{\boldsymbol{t}}_{\mathrm{MVCI}}} = \hat{\boldsymbol{t}}^{ ( i )}\) and \({\hat{\tau }_{ \mathrm{MVCI}}} = {{ \hat{\boldsymbol{t}}^{ ( i )}} / n}\).
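The block-wise search of Steps 2 to 4 and the local refinement of Step 5 can be sketched as follows. This is a simplified illustration rather than a full implementation of the MVCI algorithm: the statistic is passed in as a precomputed list `stat` (a hypothetical stand-in for the values of \(R ( \cdot )\)), the iteration and stopping rule of Steps 6 to 8 are omitted, and the function names are assumptions made for the example.

```python
def blockwise_candidates(stat, block_len, r):
    """Steps 2-4: per-block argmax of stat, then keep the r largest maxima.

    stat[t - 1] holds the value of the statistic at location t = 1, ..., n.
    """
    n = len(stat)
    winners = []
    for start in range(0, n, block_len):            # blocks I_1, I_2, ...
        block = range(start, min(start + block_len, n))
        t_best = max(block, key=lambda t: stat[t])
        winners.append((stat[t_best], t_best + 1))  # store 1-based location
    winners.sort(reverse=True)                      # largest statistic first
    return sorted(loc for _, loc in winners[:r])


def refine(stat, loc, half_width):
    """Step 5: re-maximize stat over locations loc - half_width .. loc + half_width."""
    lo = max(0, loc - 1 - half_width)
    hi = min(len(stat), loc + half_width)
    return max(range(lo, hi), key=lambda t: stat[t]) + 1


# Toy statistic with peaks at locations 2, 7, 10 and 13.
stat = [0, 1, 0, 0, 0, 0, 5, 0, 0, 3, 0, 0, 2, 0, 0, 0]
candidates = blockwise_candidates(stat, block_len=4, r=2)
refined = [refine(stat, loc, half_width=2) for loc in candidates]
```

One limitation of this sketch is that a change lying near a block boundary can produce large maxima in two adjacent blocks; it does not attempt to merge such duplicates.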
3 Simulation studies
We present a set of simulation studies to illustrate the effectiveness of the CUSUM-type MVCI algorithm using R packages. Additionally, we implement some available competitors, including segment neighborhood (SN), pruned exact linear time (PELT), binary segmentation (BS), and wild binary segmentation (WBS), to compare with the performance of the MVCI algorithm.
In model (1), we take \(r = 4\), \(\mu = 0\), \({\sigma _{1}} = 2\), \({\sigma _{2}} = 4\), \({\sigma _{3}} = 8\), \({\sigma _{4}} = 4\), \({\sigma _{5}} = 2\), and suppose that the true change locations are \(t_{1}^{*} = 100\), \(t_{2}^{*} = 200\), \(t_{3}^{*} = 300\), \(t_{4}^{*} = 400\). We model the NSD sequence \(\{ {{\varepsilon _{m}},m \in \mathbb{Z}} \}\) as a multivariate mixture of normal distributions with joint distribution \(N ( {0,0,1,4; -0.5} )\). The sample size is \(n = 500\), and the coefficients are \({a_{j}} = {2^{ - j}}\), \(j \ge 0\). Figure 1 displays the simulated sequence of \({Y_{t}}\), \(1 \le t \le 500\), and the true change locations.
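For readers who wish to reproduce the innovations, the sketch below draws pairs from a bivariate normal distribution. It assumes that \(N ( {0,0,1,4; -0.5} )\) is to be read as zero means, variances 1 and 4, and correlation −0.5, and it does not attempt to reproduce how the pairs are arranged into the sequence \(\{ {\varepsilon _{m}} \}\), which the text does not specify. The function name is hypothetical.

```python
import math
import random


def bivariate_normal_pair(rng, var1=1.0, var2=4.0, rho=-0.5):
    """Draw (x1, x2) with zero means, variances var1, var2, correlation rho."""
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    x1 = math.sqrt(var1) * z1
    # Cholesky-type construction: x2 loads on z1 with weight rho.
    x2 = math.sqrt(var2) * (rho * z1 + math.sqrt(1.0 - rho ** 2) * z2)
    return x1, x2


rng = random.Random(1)
pairs = [bivariate_normal_pair(rng) for _ in range(20000)]
m1 = sum(p[0] for p in pairs) / len(pairs)
m2 = sum(p[1] for p in pairs) / len(pairs)
sample_cov = sum((a - m1) * (b - m2) for a, b in pairs) / (len(pairs) - 1)
```

Under this reading the theoretical covariance is \(-0.5 \times 1 \times 2 = -1\), so `sample_cov` should be close to −1. Jointly normal variables with non-positive correlations are negatively associated, which is why such pairs are a convenient source of NSD innovations.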
To carry out SN (Auger and Lawrence [31]), PELT (Killick et al. [32]), and BS (Killick and Eckley [33]), we use the penalized likelihood method, which can be implemented with the changepoint package in R (Killick [34]). As to WBS (Killick and Eckley [33]), we apply the package wbsts (Korkas and Fryzlewicz [35]) to our model with the threshold \({\lambda _{n}} = C\sqrt{2} {\log ^{{1 / 2}}}n\), where \(C = 1\). We take the parameter \({\delta _{n}} = {n^{ {{ - 1} / 2}}}\) in the MVCI algorithm. The mean squared error (MSE) of the CUSUM-type variance change point estimator of \(\tau ^{*}\) is defined as \(\mathrm{MSE} = \frac{1}{r}\sum_{i = 1}^{r} {{{ ( {\hat{\tau }_{i} - \tau _{i}^{*}} )} ^{2}}} \), and the performances of the above methods are described in Table 1 (all of the simulations are run for 100 replicates).
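For orientation, the tuning constants above evaluate as follows for \(n = 500\), assuming that log denotes the natural logarithm (the text does not state the base):
$$ {\lambda _{n}} = \sqrt{2\log 500} \approx 3.53,\qquad {\delta _{n}} = 500^{ - 1/2} \approx 0.0447,\qquad l = [ {n{\delta _{n}}} ] = [ 22.36 ] = 22, $$
so the initial subintervals in Step 2 of the MVCI algorithm contain 22 observations each.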
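Table 1 Comparison of the MVCI algorithm with the SN, PELT, BS, and WBS methods

| \(\tau _{i}^{*}\) | n | SN | PELT | BS | WBS | MVCI |
|---|---|---|---|---|---|---|
| 0.2 | 500 | 0.222 | 0.222 | 0.218 | 0.210 | 0.210 |
| 0.4 | 500 | 0.410 | 0.398 | 0.392 | 0.406 | 0.396 |
| 0.6 | 500 | 0.590 | 0.590 | 0.594 | 0.590 | 0.592 |
| 0.8 | 500 | 0.784 | 0.792 | 0.796 | 0.792 | 0.790 |
| MSE | | \(23.5 \times 10^{-5}\) | \(16.3 \times 10^{-5}\) | \(11.0 \times 10^{-5}\) | \(7.5 \times 10^{-5}\) | \(7.0 \times 10^{-5}\) |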
Table 1 presents the change point estimates and the MSEs of the MVCI, SN, PELT, BS, and WBS methods. Generally, the first change point is overestimated and the remaining change points are underestimated. For the sample size \(n=500\), all of the methods estimate the change points effectively, but the MVCI method is superior in terms of the MSE. This also indicates that the CUSUM-type variance-change method is computationally competitive with some of the best change point estimation methods.
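The MSE row of Table 1 can be checked directly from the tabulated estimates with the formula \(\mathrm{MSE} = \frac{1}{r}\sum_{i = 1}^{r} {{{ ( {\hat{\tau }_{i} - \tau _{i}^{*}} )} ^{2}}} \). The short sketch below does this; the dictionary simply transcribes the table, and the helper name `mse` is an assumption made for the example.

```python
def mse(estimates, truth):
    """Mean squared error (1 / r) * sum_i (tau_hat_i - tau_i^*)^2."""
    return sum((e - t) ** 2 for e, t in zip(estimates, truth)) / len(truth)


truth = [0.2, 0.4, 0.6, 0.8]
table1 = {
    "SN":   [0.222, 0.410, 0.590, 0.784],
    "PELT": [0.222, 0.398, 0.590, 0.792],
    "BS":   [0.218, 0.392, 0.594, 0.796],
    "WBS":  [0.210, 0.406, 0.590, 0.792],
    "MVCI": [0.210, 0.396, 0.592, 0.790],
}
mse_by_method = {name: mse(est, truth) for name, est in table1.items()}
```

The computed values agree with the tabulated MSEs (for example, \(7.0 \times 10^{-5}\) for MVCI and \(23.5 \times 10^{-5}\) for SN), which suggests that the reported MSE is evaluated at the tabulated estimates.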
4 Proof of the theorems
Throughout the proof, let C be a general positive constant and \({c_{0}},{c_{1}},{c_{2}},{C_{0}},{C_{1}}, \ldots ,{C_{4}}\) be some positive constants. Denote \({x^{+} } = xI ( {x \ge 0} )\) and \({x^{-} } = - xI ( {x < 0} )\). In the following we will state some lemmas which are needed.
Suppose that \(\{ {{X_{m}},m \ge 1} \}\) is an NSD random sequence satisfying conditions (A1)–(A2), and \(\{ {{a_{m}},1 \le m \le n,n \ge 1} \}\) is a sequence of real numbers satisfying \(\sum_{m = 1}^{\infty }{a_{m}^{2}} < \infty \). Then
Suppose that \({e_{t}}\) is a linear process under an NSD random sequence satisfying conditions (A1)–(A4), and let \(\sigma _{\varepsilon }^{2} = E ( {e_{t}^{2}} )\). Then
Decomposing \(\varepsilon _{t}\) as \(\varepsilon _{t} = \varepsilon _{t}^{+} - \varepsilon _{t}^{-} \), from properties (P2) and (P3) in Lemma 1, one can see that \(\varepsilon _{t}^{+} \), \(\varepsilon _{t}^{-} \), \({ ( {\varepsilon _{t}^{-} } )^{2}}\) and \({ ( {\varepsilon _{t} ^{+} } )^{2}}\) are NSD random sequences. From formula (5), we have \(E ( {XY} ) \leq E ( X )E ( Y )\), then
Note that \(ER ( {{t_{i}}} )\) is increasing for \({t_{i}} \le t_{i}^{*} \) and decreasing for \({t_{i}} \ge t_{i}^{*} \); thus the maximum of \(ER ( {{t_{i}}} )\) is
where \(\delta = {{{C_{0}}\varepsilon } / 2}\) is an arbitrarily small positive number. According to the definition of \(ER ( {{t_{i}}} )\), one can see that
Since Eq. (12) can be proved similarly as (11), we only consider Eq. (11), thus the proof of Theorem 1 is finished by taking \(k = {t_{i}} - {t_{i - 1}}\) in Lemma 5. □
Let θ be a constant in the interval \(( {0,1} )\). Denote \(D_{n,r}^{M ( n )} = \{ t \in A_{n,r}^{{\delta _{n}}}, n\theta > {{ \Vert {t - {t^{*} }} \Vert }_{\infty }} > M ( n ) \}\). By Theorem 1, we have
Without loss of generality, we assume that \({\delta _{0}} < 0\). In view of the fact that \(| x | \ge | y |\) is equivalent to (i) \(x - y \ge 0\) and \(x + y \ge 0\), or (ii) \(x - y \le 0\) and \(x + y \le 0\), then
In view of \(n{\delta _{n}} \to \infty \) and \(M ( n ) \to \infty \), Lemma 5 yields
$$ {Q_{i}} \to 0,\quad i = 1,2,3,4. $$
Thus \({T_{1}} \to 0\). We can treat \({T_{2}}\) analogously as \({T_{1}}\), hence \({T_{2}} \to 0\).
To complete the proof of Theorem 2, it is sufficient to show \({T_{3}} \to 0\). Since \(R ( {{t_{i}}} ) + R ( {t_{i} ^{*} } ) \le 0\) implies that \(R ( {{t_{i}}} ) - ER ( {{t_{i}}} ) + R ( {t_{i}^{*} } ) - ER ( {t_{i}^{*} } ) \le - ER ( {{t_{i}}} ) - ER ( {t_{i}^{*} } ) \le - ER ( {t_{i}^{*} } )\), we obtain
Thus \({T_{3}} \to 0\). This completes the proof of Theorem 2. □
5 Conclusions
In this study, we consider the multiple variance change model and develop a CUSUM-type methodology for change point estimation. We assume that the errors form a linear process under NSD. The weak convergence rate of the change point estimator has been established. Recently, Qin et al. [15] and Shi et al. [30] concentrated on the strong convergence of the CUSUM-type estimator, and we believe that the estimator proposed in this paper also has the strong convergence property. Additionally, change point estimation with an unknown number of change points is an interesting topic, which we leave for future work.
Acknowledgements
The authors would like to thank everyone for help.
Availability of data and materials
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.