On maximum likelihood estimators for a threshold autoregression

doi:10.1016/S0378-3758(98)00113-X

Journal of Statistical Planning and Inference

Volume 75, Issue 1, 15 November 1998, Pages 21-46

https://doi.org/10.1016/S0378-3758(98)00113-X

Abstract

For a stationary ergodic self-exciting threshold autoregressive model with a single threshold parameter, Chan (1993) obtained the consistency and limiting distribution of the least-squares estimator of the underlying true parameters. In this paper, we derive similar results for the maximum likelihood estimators of the same model under some regularity conditions on the error density, which need not be Gaussian.

Introduction

In the past five decades, linear time-series models have dominated the development of time-series analysis. But many applied fields such as electronics, oceanography, hydrology, ecology, marine engineering, medical engineering, solar astrophysics, and physics have revealed a common connecting theme: piecewise linearity. Tong (1977) mentioned the usefulness of a time series that is piecewise linear in the past variables and in the parameters. Later, Tong (1978a, 1978b, 1980) developed these models further in a systematic way for modeling discrete time-series data. He argued that various phenomena such as limit cycles, jump resonance, harmonic distortion, modulation effects and chaos can be modeled by discrete time series that are piecewise linear. He called these models the self-exciting threshold autoregressive (SETAR) models. See Tong (1983, 1990) for a comprehensive introduction to general SETAR models.

This paper is concerned with the large-sample behavior of maximum likelihood estimators in a two-regime SETAR model, called SETAR(2;p,p), defined as follows: $X_i = h(\mathbf{X}_{i-1}, \boldsymbol{\theta}) + \varepsilon_i, \quad i \geq 1, \qquad (1)$ for some $\boldsymbol{\theta} = (\boldsymbol{\theta}_1', \boldsymbol{\theta}_2', r, d)' \in \mathbb{R}^{2p+3} \times \{1, 2, \ldots, p\}$, where $\mathbf{X}_{i-1} = (X_{i-1}, \ldots, X_{i-p})'$, $\boldsymbol{\theta}_j = (\theta_{0j}, \theta_{1j}, \ldots, \theta_{pj})' \in \mathbb{R}^{p+1}$, $j = 1, 2$, and for $\mathbf{x} \in \mathbb{R}^{p}$, $h(\mathbf{x}, \boldsymbol{\theta}) = \left(\theta_{01} + \sum_{k=1}^{p} \theta_{k1} x_k\right) I(x_d \leq r) + \left(\theta_{02} + \sum_{k=1}^{p} \theta_{k2} x_k\right) I(x_d > r).$ The errors $\{\varepsilon_i\}$ in Eq. (1) are independent and identically distributed random variables with mean zero and finite nonzero variance, and $\varepsilon_i$ is independent of $X_{i-1}, X_{i-2}, \ldots$ for each $i \geq 1$. The parameter $r$, the location of the change of the autoregressive function $h$, is called the threshold. The time delay $d$ is called the delay parameter.
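The definition of model (1) can be made concrete with a short simulation. The following Python sketch is only an illustration under stated assumptions: the function names `h` and `simulate_setar`, the burn-in length, the zero start values, the Gaussian noise, and the numerical parameter values are all hypothetical choices and do not come from the paper.

```python
import random

def h(x, theta1, theta2, r, d):
    """Autoregressive function h(x, theta) of model (1).

    x      : lagged values (X_{i-1}, ..., X_{i-p}), so x[d - 1] is X_{i-d}
    theta1 : (theta_01, theta_11, ..., theta_p1), used when x_d <= r
    theta2 : (theta_02, theta_12, ..., theta_p2), used when x_d > r
    """
    coef = theta1 if x[d - 1] <= r else theta2
    return coef[0] + sum(c * xk for c, xk in zip(coef[1:], x))

def simulate_setar(n, theta1, theta2, r, d, noise, burn_in=200, seed=0):
    """Simulate n observations; noise(rng) draws one error epsilon_i."""
    rng = random.Random(seed)
    p = len(theta1) - 1
    lags = [0.0] * p  # arbitrary start values, discarded with the burn-in
    path = []
    for _ in range(n + burn_in):
        x_new = h(lags, theta1, theta2, r, d) + noise(rng)
        path.append(x_new)
        lags = [x_new] + lags[:-1]  # shift the lag vector by one step
    return path[burn_in:]

# Hypothetical SETAR(2;1,1) example with standard normal errors.
series = simulate_setar(
    n=500, theta1=(0.5, 0.6), theta2=(-0.5, -0.4), r=0.0, d=1,
    noise=lambda rng: rng.gauss(0.0, 1.0),
)
```

Passing the error sampler as `noise` keeps the sketch usable with non-Gaussian errors, which is the setting the paper is interested in.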

In this paper, we assume that the time series in model (1) is stationary and ergodic. Detailed discussions of stationarity and ergodicity can be found in Chan et al. (1985), hereafter CPTW (1985), and Chan and Tong (1985).

For the case where the threshold has only a finite number of possible values and the errors are Gaussian, Tong (1983) constructed maximum likelihood estimators of the unknown parameters by using the Akaike Information Criterion (Akaike, 1973). If the threshold $r$ is known, CPTW (1985) obtained the consistency and asymptotic normality of the least-squares estimators of the coefficient parameter $\boldsymbol{\theta}_c = (\boldsymbol{\theta}_1', \boldsymbol{\theta}_2')'$ under some regularity conditions for p=1. In practice, however, the threshold parameter $r$ is unknown and can take infinitely many values in $\mathbb{R}$. In this case, Petruccelli (1986) proved that the conditional least-squares estimator (CLSE) of $\boldsymbol{\theta}$ is strongly consistent for the SETAR(2;1,1) model.

The present paper is motivated by Chan (1993), who developed the strong consistency and limiting distribution of the CLSE in the SETAR(2;p,p) model (1). It derives the asymptotics of the maximum likelihood estimator (MLE) of the underlying parameter $\boldsymbol{\theta}$ in model (1) when the errors have a density $f$, not necessarily Gaussian. Unlike in the popular AR model, the likelihood function of the SETAR(2;p,p) model is not continuous in the threshold parameter in general. Thus, the routine method of computing the maximum likelihood estimator cannot be adopted. Instead, Section 2 discusses the maximum likelihood method to obtain the MLE $\hat{\boldsymbol{\theta}}_n = (\hat{\boldsymbol{\theta}}_{cn}', \hat{r}_n, \hat{d}_n)'$ of $\boldsymbol{\theta} = (\boldsymbol{\theta}_c', r, d)'$. Section 3 describes the assumptions and obtains the strong consistency of the MLE of the true parameter $\boldsymbol{\theta}$. Section 4 shows the n-consistency of the threshold estimator under the assumption of discontinuity of $h$ at $r$. Section 5 obtains the uniform asymptotic normality of the coefficient parameter estimator $\hat{\boldsymbol{\theta}}_{cn}$ over a bounded interval, along with some byproduct results. In Section 6, as a consequence of the n-consistency of $\hat{r}_n$, a suitably normalized sequence of log-likelihood processes $\{\hat{l}_n\}$ is shown to be approximated by a sequence of simpler processes which describe the log-likelihood under a known coefficient parameter $\boldsymbol{\theta}_c$. Through the latter processes, we obtain the limiting distribution of the standardized maximum likelihood estimator as the left endpoint of a random interval on which a superposition of independent compound Poisson processes attains a minimum.
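Because the likelihood is discontinuous in the threshold, one common computational device is to profile it: fix a candidate threshold, maximize over the coefficients, and then search over the candidates. The Python sketch below illustrates this idea only in the Gaussian special case with p = d = 1, where maximizing the likelihood over the coefficients for a fixed threshold reduces to least squares within each regime. It is not the paper's procedure; the names `ols` and `profile_fit`, the trimming fraction, and the simulated parameter values are assumptions made for illustration.

```python
import random

def ols(xs, ys):
    """Least-squares intercept, slope and residual sum of squares."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    return intercept, slope, rss

def profile_fit(series, trim=0.15):
    """Grid search over candidate thresholds for a SETAR(2;1,1) model.

    Candidates are the observed lagged values between the trim and
    1 - trim sample quantiles, so both regimes keep enough data.
    """
    lagged, current = series[:-1], series[1:]
    candidates = sorted(lagged)
    lo, hi = int(trim * len(candidates)), int((1 - trim) * len(candidates))
    best = None
    for s in candidates[lo:hi]:
        low = [(x, y) for x, y in zip(lagged, current) if x <= s]
        high = [(x, y) for x, y in zip(lagged, current) if x > s]
        a0, a1, rss_low = ols(*zip(*low))
        b0, b1, rss_high = ols(*zip(*high))
        rss = rss_low + rss_high
        # Strict inequality keeps the smallest minimizing candidate.
        if best is None or rss < best[0]:
            best = (rss, s, (a0, a1), (b0, b1))
    return best[1:]

# Hypothetical data: threshold 0, a jump of size 2 in h at the threshold.
rng = random.Random(1)
x, series = 0.0, []
for _ in range(600):
    x = (1.0 + 0.5 * x if x <= 0.0 else -1.0 - 0.5 * x) + rng.gauss(0.0, 1.0)
    series.append(x)

r_hat, coef_low, coef_high = profile_fit(series)
```

For a non-Gaussian density $f$, the inner least-squares step would have to be replaced by a numerical maximization of $\sum \ln f$ over the coefficients; the outer search over candidate thresholds stays the same.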

Notation. Throughout the paper, the symbol $\boldsymbol{\theta}$ denotes the fixed unknown underlying parameter, the function $f$ is the p.d.f. of $\varepsilon_1$, and $F$ denotes the distribution function corresponding to $f$. The expectation under $\boldsymbol{\theta}$ is denoted by $E$.

Weak convergence is denoted by $\Rightarrow$. A (random) sequence that converges to zero (in probability) is denoted by $o(1)$ ($o_P(1)$), while $O(1)$ ($O_P(1)$) means that it is bounded (in probability). The multivariate normal distribution with mean zero and covariance matrix $\Gamma$ is denoted by $N(0, \Gamma)$. Let $\mathbb{R}$ be the real line $(-\infty, \infty)$ and $\bar{\mathbb{R}} = \mathbb{R} \cup \{-\infty, \infty\}$; the set $\bar{\mathbb{R}}$ is compact under the metric $d(\cdot, \cdot)$ defined by $d(x, y) = |\arctan x - \arctan y|$. A function $\phi$ is said to be Lip(1) if there exists $L \geq 0$ such that $|\phi(x) - \phi(y)| \leq L|x - y|$ for all $x, y \in \mathbb{R}$. For any event $A$, the complement of $A$ is denoted by $A^c$ and the indicator function is denoted by $I(A)$. Throughout, the capital letter $C$ and the symbols $\gamma_i$, $i = 1, 2, \ldots$, stand for absolute constants that can have different values in different places. The notation $\mathbf{x}'\mathbf{y}$ stands for the inner product of the vectors $\mathbf{x}$ and $\mathbf{y}$. For any matrix $M = (m_{ij})$, $\|M\| = \sum_{i,j} |m_{ij}|$, and $\det(M)$ and $\mathrm{adj}(M)$ stand for the determinant and adjoint matrix of $M$, respectively. Vectors of dimension more than one are denoted by boldface letters. The index $i$ in a summation varies from 1 to $n$ unless specified otherwise.
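The arctangent metric in the notation above is what makes the extended real line a bounded, and in fact compact, space. A small Python check, given only as an illustration (the function name `d` mirrors the metric symbol and is not code from the paper), shows that the two infinite points are at finite distance and that large finite values are close to $+\infty$.

```python
import math

def d(x, y):
    """Metric d(x, y) = |arctan x - arctan y| on the extended real line."""
    return abs(math.atan(x) - math.atan(y))

# Distance between the two points at infinity: the diameter of the space, pi.
diameter = d(-math.inf, math.inf)

# A large finite point is close to +infinity under this metric.
gap = d(1e6, math.inf)
```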

Section snippets

The maximum likelihood estimation

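We begin with the definition of the maximum likelihood estimators of the unknown underlying parameter $\boldsymbol{\theta}$ in model (1). Throughout we assume that the true parameter $\boldsymbol{\theta}$ is an interior point of the parameter space $\mathbb{R}^{2p+2} \times \bar{\mathbb{R}} \times \{1, 2, \ldots, p\}$. There exists a compact subset $K$ of $\mathbb{R}^{2p+2}$ such that $\boldsymbol{\theta}$ is an interior point of $K \times \bar{\mathbb{R}} \times \{1, 2, \ldots, p\}$.

Denote $\Omega = K \times \bar{\mathbb{R}} \times \{1, 2, \ldots, p\}$; then $\Omega$ is a compact set. Let $\boldsymbol{\vartheta} = (\boldsymbol{\alpha}', \boldsymbol{\beta}', s, q)'$ be any point in $\Omega$. Let $\mathbf{X}_i = (X_i, \ldots, X_{i-p+1})'$; then $\{\mathbf{X}_i\}$ is a Markov chain. Let $g_{\boldsymbol{\vartheta}}(\mathbf{X}_0)$ be the initial density of $\mathbf{X}_0$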

Assumptions and strong consistency

We begin this section by listing the needed assumptions on the density f of the error ε₁ and the underlying process.

(C1) $f$ is absolutely continuous and positive everywhere on $\mathbb{R}$. With the a.e. derivative $\dot{f}$, let $\phi = \dot{f}/f$ and $I(f) = \int \phi^{2}(x) f(x)\,dx < \infty$.

(C2) $\phi$ is Lip(1).

(C3) $\phi$ is differentiable and the derivative $\dot{\phi}$ is Lip(1).

(C4) $E|\varepsilon_1|^{4} < \infty$.
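As an illustration that is not taken from the paper, conditions (C1)–(C4) can be checked by hand for one familiar non-Gaussian density, the standard logistic. The choice of this density is an assumption made only to show what the conditions require.

```latex
f(x) = \frac{e^{-x}}{(1 + e^{-x})^{2}}, \qquad
\phi(x) = \frac{\dot{f}(x)}{f(x)} = -\tanh\!\left(\frac{x}{2}\right), \qquad
\dot{\phi}(x) = -\frac{1}{2}\,\mathrm{sech}^{2}\!\left(\frac{x}{2}\right).

% (C1): f is smooth and positive on the real line, and since
% \tanh(x/2) = 2F(x) - 1 with F(\varepsilon_1) uniform on (0, 1),
I(f) = \int \tanh^{2}\!\left(\frac{x}{2}\right) f(x)\,dx
     = E\,(2U - 1)^{2} = \frac{1}{3} < \infty, \qquad U \sim \mathrm{Uniform}(0, 1).

% (C2): |\dot{\phi}(x)| \le 1/2, so \phi is Lip(1) with L = 1/2.
% (C3): \ddot{\phi}(x) = \tfrac{1}{2}\,\mathrm{sech}^{2}(x/2)\tanh(x/2) is bounded,
%       so \dot{\phi} is Lip(1).
% (C4): the logistic tails decay exponentially, and
E|\varepsilon_1|^{4} = \frac{7\pi^{4}}{15} < \infty.
```

The Gaussian density also satisfies the four conditions, with $\phi(x) = -x/\sigma^{2}$ and $I(f) = 1/\sigma^{2}$.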

To derive the n-consistency and the limiting distribution of the threshold estimator, we need to make the following model assumptions:

(M1) The threshold r in $R$ is the

n-consistency of the threshold estimator

From now on we will invoke conditions (M1) and (M2). The discontinuity of $h$ at $r$ yields a stronger result about the estimator $\hat{r}_n$ of the threshold $r$, namely the n-consistency of $\hat{r}_n$.

Theorem 2. Suppose conditions (C1)–(C4), (M1) and (M2) hold. Then $|n(\hat{r}_n - r)| = O_P(1).$

First assume p=d=1. We begin with some notation. Let $J: \mathbb{R}^{2} \to \mathbb{R}$ and $p(x) = E J(x, \varepsilon_1)$, $p_1(x) = E|J(x, \varepsilon_1)|$, $p_2(x) = E J^{2}(x, \varepsilon_1)$, $x \in \mathbb{R}$. For $u \geq 0$, define $G(u) = E\,I(r < X_0 \leq r + u)$, $G_n(u) = \frac{1}{n} \sum I(r < X_{i-1} \leq r + u)$, and $R_n(u) = \frac{1}{n} \sum J(X_{i-1}, \varepsilon_i) I(r < X_{i-1} \leq r + u).$ $r_n(u) = \frac{1}{n} \sum p(X_{i-1}) I(r < X_{i-1}$

Asymptotic normality

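We now consider the limiting distribution of $\hat{\boldsymbol{\theta}}_{cn}$. Recall that $\psi(\mathbf{X}_{i-1}, \varepsilon_i, \boldsymbol{\vartheta}) = \ln \frac{f(\varepsilon_i + h(\mathbf{X}_{i-1}, \boldsymbol{\theta}) - h(\mathbf{X}_{i-1}, \boldsymbol{\vartheta}))}{f(\varepsilon_i)}$, $\boldsymbol{\vartheta} \in \Omega$, and the log-likelihood ratio function is $l_n(\boldsymbol{\vartheta}) = \frac{1}{n} \sum \psi(\mathbf{X}_{i-1}, \varepsilon_i, \boldsymbol{\vartheta})$, $\boldsymbol{\vartheta} \in \Omega.$ In the definition of the MLE $\hat{\boldsymbol{\theta}}_n$, the first 2p+2 components of the parameter point in $\Omega$ are treated separately from the last component, and we have proved that $\hat{r}_n$ is n-consistent. Thus, we need some results on $\boldsymbol{\vartheta}_{cn}(s)$ that hold uniformly in $s$ over the interval $[r - B/n, r + B/n]$ for some $B$, $0 < B < \infty$; these are given in the following theorem.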

Limiting distribution of the threshold estimator

In this section, we first discuss the limiting behavior for a sequence of normalized profile log-likelihood processes. Then we obtain the limiting distribution of the standardized maximum likelihood estimator of the threshold parameter.

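Recall that $l_n(\boldsymbol{\vartheta}_c, s) = \frac{1}{n} \sum \ln \frac{f(X_i - h_s(\mathbf{X}_{i-1}, \boldsymbol{\vartheta}_c))}{f(\varepsilon_i)}$, $(\boldsymbol{\vartheta}_c', s)' \in \Omega.$ For $t \in \mathbb{R}$, a sequence of normalized profile log-likelihood processes is $\hat{l}_n(t) = -2n\left[l_n(\boldsymbol{\vartheta}_{cn}(r + t/n), r + t/n) - l_n(\boldsymbol{\vartheta}_{cn}(r), r)\right].$ Observe that in view of Theorem 3, $\boldsymbol{\vartheta}_{cn}(r + t/n)$ is an approximation of $\boldsymbol{\theta}_c$ uniformly in t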

Acknowledgements

I would sincerely like to express my deep gratitude to my thesis advisor, Professor Hira L. Koul, for all his expert guidance, numerous useful suggestions and continuous encouragement. I also thank an anonymous referee for many useful comments and suggestions.

References (21)

Akaike, H., 1973. Information theory and an extension of the maximum likelihood principle. In: Petrov, B.N., Csaki, F....
Chan, K.S., et al., 1985. On the use of the deterministic Lyapunov function for the ergodicity of stochastic difference equations. Adv. Appl. Probab.
Chan, K.S., et al., 1985. A multiple-threshold AR(1) model. J. Appl. Probab.
Chan, K.S., 1993. Consistency and limiting distribution of the least squares estimator of a threshold autoregressive model. Ann. Statist.
Hall, P., et al., 1980. Martingale Limit Theory and its Applications.
Huber, P.J., 1967. The behavior of maximum likelihood estimates under nonstandard conditions. In: Proc. 5th Berkeley...
Jacod, J., Mémin, J., 1980. Sur la convergence des semimartingales vers un processus a accroissements independents....
Jacod, J., Shiryaev, A.N., 1987. Limit Theorems for Stochastic Processes. Springer, Berlin, Ch....
Koul, H.L., 1992. Weighted empiricals and linear models. IMS Lecture Notes-Monograph Ser....
Koul, H.L., Schick, A., 1995. Efficient estimation in nonlinear time series models. Manuscript. MSU Tech. Report...

There are more references available in the full text version of this article.

Cited by (30)

Threshold autoregressive models for interval-valued time series data
2018, Journal of Econometrics
Modeling and forecasting symbolic data, especially interval-valued time series (ITS) data, has received considerable attention in statistics and related fields. The core of available methods on ITS analysis is based on various applications of conventional linear modeling. However, few works have considered possible nonlinearities in ITS data. In this paper, we propose a new class of threshold autoregressive interval (TARI) models for ITS data. By matching the interval model with interval observations, we develop a minimum-distance estimation method for TARI models, and establish the asymptotic theory for the proposed estimators. We show that the threshold parameter estimator is T-consistent and follows an asymptotic compound Poisson process as the sample size $T \to \infty$ . And the estimators for other TARI model parameters are root-T consistent and asymptotically normal. Simulation studies show that the proposed TARI model provides more accurate out-of-sample forecasts than the existing center–radius self-exciting threshold (CR-SETAR) model for ITS data in the literature. Empirical applications to the S&P 500 Price Index document significant asymmetric reactions of the stock markets in Japan, U.K. and France to shocks from the U.S. stock market and that incorporating this asymmetric effect yield better out-of-sample forecasts than a variety of popular models available in the literature.
On the least squares estimation of multiple-regime threshold autoregressive models
2012, Journal of Econometrics
Citation Excerpt :
In the econometric literature, there are other methods to conduct inference for TAR models, for example, the sequential estimation procedure in Gonzalo and Pitarakis (2002), the subsampling in Gonzalo and Wolf (2005), etc. More earlier related results on the LSE for TAR models can be found in Petruccelli (1986), Chan and Tsay (1998), Qian (1998), Tsay (1998), Caner and Hansen (2001), etc. At the same time, probabilistic structures of TAR models were studied intensively by Chan et al. (1985), Chan and Tong (1985), Chen and Tsay (1991), Brockwell et al. (1992), Liu and Susko (1992), An and Huang (1996), Ling (1999), Cline and Pu (2004) and so on.
This paper studies the least squares estimator (LSE) of the multiple-regime threshold autoregressive (TAR) model and establishes its asymptotic theory. It is shown that the LSE is strongly consistent. When the autoregressive function is discontinuous over each threshold, the estimated thresholds are $n$ -consistent and asymptotically independent, each of which converges weakly to the smallest minimizer of a one-dimensional two-sided compound Poisson process. The remaining parameters are $\sqrt{n}$ -consistent and asymptotically normal. The theory of Chan (1993) is revisited and a numerical approach is proposed to simulate the limiting distribution of the estimated threshold via simulating a related compound Poisson process. Based on the numerical result, one can construct a confidence interval for the unknown threshold. This issue is not straightforward and has remained as an open problem since the publication of Chan (1993). This paper provides not only a solution to this long-standing open problem, but also provides methodological contributions to threshold models. Simulation studies are conducted to assess the performance of the LSE in finite samples. The results are illustrated with an application to the quarterly U.S. real GNP data over the period 1947–2009.
A note on the consistency of a robust estimator for threshold autoregressive processes
2009, Statistics and Probability Letters
The method of conditional least squares is commonly used for estimating threshold autoregressive parameters, and its consistency was derived by Chan [Chan, K.S., 1993. Consistency and limiting distribution of the least squares estimator of a threshold autoregressive model. Annals of Statistics 21, 520–533]. In this note we consider a general class of robust estimators for threshold autoregressive models, and under some regularity conditions and a proper choice of the weight function, the consistency is demonstrated.
Asymptotic theory on the least squares estimation of threshold moving-average models
2013, Econometric Theory
A Novel Double-Banded-Threshold Mixture Autoregressive Model
2024, SSRN
Optimal model averaging based on leave-h-out forward-validation for threshold autoregressive models
2023, Stat

¹: Research was partly supported by the NSF grant: DMS-94 02904.
