On maximum likelihood estimators for a threshold autoregression¹

https://doi.org/10.1016/S0378-3758(98)00113-X

Abstract

For a stationary ergodic self-exciting threshold autoregressive model with a single threshold parameter, Chan (1993) obtained the consistency and limiting distribution of the least-squares estimator of the underlying true parameters. In this paper, we derive similar results for the maximum likelihood estimators in the same model under some regularity conditions on the error density, which need not be Gaussian.

Introduction

In the past five decades, linear time-series models have dominated the development of time-series analysis. But many applied fields such as electronics, oceanography, hydrology, ecology, marine engineering, medical engineering, solar astrophysics, and physics have revealed a common connecting theme: piecewise linearity. Tong (1977) mentioned the usefulness of a time series that is piecewise linear in the past variables and in the parameters. Later, Tong (1978a, 1978b, 1980) developed these models further in a systematic way for modeling discrete time-series data. He argued that various phenomena such as limit cycles, jump resonance, harmonic distortion, modulation effects and chaos can be modeled by discrete time series that are piecewise linear. He called these models the self-exciting threshold autoregressive (SETAR) models. See Tong (1983, 1990) for a comprehensive introduction to general SETAR models.

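This paper is concerned with the large sample behaviors of maximum likelihood estimators in a two regime SETAR model, called SETAR(2;p,p), defined as follows:

$$X_i = h(\mathbf{X}_{i-1},\theta) + \varepsilon_i, \qquad i \geq 1, \tag{1}$$

for some $\theta=(\theta_1',\theta_2',r,d)'\in\mathbb{R}^{2p+3}\times\{1,2,\dots,p\}$, where $\mathbf{X}_{i-1}=(X_{i-1},\dots,X_{i-p})'$, $\theta_j=(\theta_{0j},\theta_{1j},\dots,\theta_{pj})'\in\mathbb{R}^{p+1}$, $j=1,2$, and for $x\in\mathbb{R}^p$,

$$h(x,\theta)=\Big(\theta_{01}+\sum_{k=1}^{p}\theta_{k1}x_k\Big)I(x_d\leq r)+\Big(\theta_{02}+\sum_{k=1}^{p}\theta_{k2}x_k\Big)I(x_d>r).$$

The errors $\{\varepsilon_i\}$ in Eq. (1) are independent and identically distributed random variables with mean zero and finite nonzero variance, and $\varepsilon_i$ is independent of $X_{i-1},X_{i-2},\dots$, $i\geq 1$. The parameter r, the location of the change of the autoregressive function h, is called the threshold. The time delay d is called the delay parameter.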
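To make the definition of model (1) concrete, the following minimal Python sketch evaluates h and simulates a SETAR(2;p,p) path. It is an illustration only: the function names, the burn-in length, and the choice of standard Gaussian errors are assumptions made here, not part of the paper, which allows a general error density f.

```python
import numpy as np

def h(x, theta1, theta2, r, d):
    """Piecewise-linear autoregressive function h(x, theta) of model (1).

    x      : lagged values (X_{i-1}, ..., X_{i-p}), length p
    theta1 : (theta_01, ..., theta_p1), used when x_d <= r
    theta2 : (theta_02, ..., theta_p2), used when x_d > r
    r, d   : threshold and delay (d is 1-based, as in the text)
    """
    x = np.asarray(x, dtype=float)
    theta = np.asarray(theta1 if x[d - 1] <= r else theta2, dtype=float)
    return theta[0] + np.dot(theta[1:], x)

def simulate_setar(n, theta1, theta2, r, d, rng, burn_in=200):
    """Simulate n observations; errors are N(0, 1) (an assumption of this sketch)."""
    p = len(theta1) - 1
    total = n + burn_in + p
    path = np.zeros(total)
    eps = rng.standard_normal(total)
    for i in range(p, total):
        lags = path[i - p:i][::-1]  # (X_{i-1}, ..., X_{i-p})
        path[i] = h(lags, theta1, theta2, r, d) + eps[i]
    return path[-n:]  # discard the burn-in segment

# Hypothetical parameter values for a SETAR(2;1,1) example.
rng = np.random.default_rng(0)
x = simulate_setar(500, theta1=[1.0, 0.5], theta2=[-1.0, 0.5], r=0.0, d=1, rng=rng)
```

The burn-in is a practical device for starting the chain near its stationary regime; whether a given parameter choice is actually stationary and ergodic has to be checked separately.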

In this paper, we assume that the time series in model (1) is stationary and ergodic. Detailed discussions of stationarity and ergodicity can be found in Chan et al. (1985), referred to below as CPTW (1985), and in Chan and Tong (1985).

For the case where the threshold has only a finite number of possible values and the errors are Gaussian, Tong (1983) constructed maximum likelihood estimators of the unknown parameters by using the Akaike Information Criterion (Akaike, 1973). If the threshold r is known, CPTW (1985) obtained the consistency and asymptotic normality of the least-squares estimators of the coefficient parameter θc=(θ1′,θ2′)′ under some regularity conditions for p=1. But, in practice, the threshold parameter r is unknown and can take infinitely many values in R. In this case, Petruccelli (1986) proved that the conditional least-squares estimator (CLSE) of θ is strongly consistent for the SETAR(2;1,1) model.

The present paper is motivated by Chan (1993), who established the strong consistency and limiting distribution of the CLSE in the SETAR(2;p,p) model (1). It derives the asymptotics of the maximum likelihood estimator (MLE) of the underlying parameter θ in model (1) when the errors have a density f, not necessarily Gaussian. Unlike in the popular AR model, the likelihood function of the SETAR(2;p,p) model is in general not continuous in the threshold parameter. Thus, the routine method of computing maximum likelihood estimators cannot be adopted. Instead, Section 2 discusses the maximum likelihood method used to obtain the MLE θ̂n=(θ̂cn′,r̂n,d̂n)′ of θ=(θc′,r,d)′. Section 3 describes the assumptions and obtains the strong consistency of the MLE of the true parameter θ. Section 4 shows the n-consistency of the threshold estimator under the assumption of discontinuity of h at r. Section 5 obtains the uniform asymptotic normality of the coefficient parameter estimator θ̂cn over a bounded interval, together with some byproduct results. In Section 6, as a consequence of the n-consistency of r̂n, a suitably normalized sequence of log-likelihood processes {l̂n} is shown to be approximated by a sequence of simpler processes that describe the log-likelihood under a known coefficient parameter θc. Through the latter processes, we obtain the limiting distribution of the standardized maximum likelihood estimator as the left endpoint of a random interval on which a superposition of independent compound Poisson processes attains a minimum.

Notation. Throughout the paper, the symbol θ is the fixed unknown underlying parameter, the function f is the p.d.f. of ε1 and F denotes the distribution function corresponding to f. The expectation under θ is denoted by E.

Weak convergence is denoted by ⇒. A (random) sequence that goes to zero (in probability) is denoted by o(1) (oP(1)), while O(1) (OP(1)) means that it is bounded (in probability). The multivariate normal distribution with mean zero and covariance matrix Γ is denoted by N(0,Γ). Let R be the real line (−∞,∞) and R̄=R∪{−∞,∞}; the set R̄ is compact under the metric d(·,·) defined by d(x,y)=|arctan x−arctan y|. A function ϕ is said to be Lip(1) if there exists L⩾0 such that |ϕ(x)−ϕ(y)|⩽L|x−y| for all x,y∈R. For any event A, the complement of A is denoted by A^c and its indicator function by I(A). Throughout, the capital letter C and the symbols γi, i=1,2,…, stand for absolute constants, which can have different values in different places. The notation xy stands for the inner product of vectors x and y. For any matrix M=(mij), ||M||=∑i,j|mij|, and det(M) and adj(M) stand for the determinant and adjoint matrix of M, respectively. Vectors of dimension more than one are denoted by boldface letters. The index i in the summation varies from 1 to n unless specified otherwise.


The maximum likelihood estimation

We begin with the definition of the maximum likelihood estimators of the unknown underlying parameter θ in model (1). Throughout we assume that the true parameter θ is an interior point of the parameter space $\mathbb{R}^{2p+2}\times\bar{\mathbb{R}}\times\{1,2,\dots,p\}$, and that there exists a compact subset K of $\mathbb{R}^{2p+2}$ such that θ is an interior point of $K\times\bar{\mathbb{R}}\times\{1,2,\dots,p\}$.

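Denote $\Omega=K\times\bar{\mathbb{R}}\times\{1,2,\dots,p\}$; then Ω is a compact set. Let $\vartheta=(\alpha',\beta',s,q)'$ be any point in Ω. Let $\mathbf{X}_i=(X_i,\dots,X_{i-p+1})'$; then $\{\mathbf{X}_i\}$ is a Markov chain. Let $g_\vartheta(\mathbf{X}_0)$ be the initial density of $\mathbf{X}_0$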

Assumptions and strong consistency

We begin this section by listing the needed assumptions on the density f of the error ε1 and the underlying process.

(C1) f is absolutely continuous and positive everywhere on R. With the a.e. derivative $\dot f$, let $\phi=\dot f/f$ and $I(f)=\int\phi^2(x)f(x)\,dx<\infty$.

(C2) ϕ is Lip(1).

(C3) ϕ is differentiable and the derivative ϕ̇ is Lip(1).

(C4) $E|\varepsilon_1|^4<\infty$.
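As an illustration that is not taken from the paper, conditions (C1)–(C4) can be checked by hand for the Gaussian density with mean zero and variance $\sigma^2$ (the symbol $\sigma$ is introduced here only for this example):

```latex
\begin{align*}
f(x) &= \frac{1}{\sqrt{2\pi}\,\sigma}\exp\!\Big(-\frac{x^2}{2\sigma^2}\Big) > 0
      \quad\text{for all } x\in\mathbb{R},\\
\phi(x) &= \frac{\dot f(x)}{f(x)} = -\frac{x}{\sigma^2},
      \qquad |\phi(x)-\phi(y)| = \frac{1}{\sigma^2}\,|x-y|,\\
I(f) &= \int \phi^2(x) f(x)\,dx = \frac{1}{\sigma^4}\,E\varepsilon_1^2 = \frac{1}{\sigma^2} < \infty,\\
\dot\phi(x) &= -\frac{1}{\sigma^2}
      \quad\text{(constant, hence Lip(1) with } L = 0\text{)},\\
E|\varepsilon_1|^4 &= 3\sigma^4 < \infty .
\end{align*}
```

So (C1) and (C2) hold with Lipschitz constant $1/\sigma^2$, (C3) holds trivially, and (C4) follows from the Gaussian fourth moment. The conditions are therefore not vacuous, and the Gaussian case is included as a special case.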

To derive the n-consistency and the limiting distribution of the threshold estimator, we need to make the following model assumptions:

(M1) The threshold r in R is the

n-consistency of the threshold estimator

From now on we will invoke conditions (M1) and (M2). The discontinuity of h at r will give a stronger result about the estimator r̂n of the threshold r, i.e., the n-consistency of r̂n.


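Theorem 2. Suppose conditions (C1)–(C4), (M1) and (M2) hold. Then

$$|n(\hat r_n-r)|=O_P(1).$$

First assume p=d=1. We will begin with some notation. Let $J:\mathbb{R}^2\to\mathbb{R}$ and

$$p(x)=EJ(x,\varepsilon_1),\qquad p_1(x)=E|J(x,\varepsilon_1)|,\qquad p_2(x)=EJ^2(x,\varepsilon_1),\qquad x\in\mathbb{R}.$$

For $u\geq 0$, define

$$G(u)=EI(r<X_0\leq r+u),\qquad G_n(u)=\frac{1}{n}\sum I(r<X_{i-1}\leq r+u),$$

and

$$R_n(u)=\frac{1}{n}\sum J(X_{i-1},\varepsilon_i)\,I(r<X_{i-1}\leq r+u).$$

rn(u)=1np(Xi−1)I(r<Xi−1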

Asymptotic normality

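We now consider the limiting distribution of θ̂cn. Recall that

$$\psi(\mathbf{X}_{i-1},\varepsilon_i,\vartheta)=\ln\frac{f\big(\varepsilon_i+h(\mathbf{X}_{i-1},\theta)-h(\mathbf{X}_{i-1},\vartheta)\big)}{f(\varepsilon_i)},\qquad\vartheta\in\Omega,$$

and the log-likelihood ratio function is

$$l_n(\vartheta)=\frac{1}{n}\sum\psi(\mathbf{X}_{i-1},\varepsilon_i,\vartheta),\qquad\vartheta\in\Omega.$$

In the definition of the MLE θ̂n, the first 2p+2 components of the parameter point in Ω are treated separately from the last component, and we have proved that r̂n is n-consistent. Thus, we need some results on $\vartheta_{cn}(s)$ that hold uniformly in s in the interval $[r-B/n,\,r+B/n]$ for some B, 0<B<∞; these are given in the following theorem.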
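The log-likelihood ratio function can be evaluated numerically in a simulation setting, where the true parameter (and hence each error εi) is known. The sketch below does this for p=d=1 only; the function name, the argument layout (coefficients ordered as intercept and slope of regime 1, then of regime 2), and the use of the number of available transitions as n are assumptions of this illustration, not the paper's notation.

```python
import numpy as np

def log_lik_ratio(x, theta_c, r, vartheta_c, s, log_f):
    """Evaluate l_n(vartheta) for a SETAR(2;1,1) series x (p = d = 1).

    theta_c, r    : true coefficients and threshold (known only in simulations)
    vartheta_c, s : candidate coefficients and threshold
    log_f         : vectorised log-density of the errors
    """
    x = np.asarray(x, dtype=float)
    lag, cur = x[:-1], x[1:]

    def h(coef, thr):
        a1, b1, a2, b2 = coef
        return np.where(lag <= thr, a1 + b1 * lag, a2 + b2 * lag)

    eps = cur - h(theta_c, r)       # true errors eps_i
    resid = cur - h(vartheta_c, s)  # eps_i + h(X_{i-1}, theta) - h(X_{i-1}, vartheta)
    return np.mean(log_f(resid) - log_f(eps))

# Example log-density: standard Gaussian (an assumption; any density under (C1)-(C4) could be used).
def gaussian_log_f(u):
    return -0.5 * u**2 - 0.5 * np.log(2.0 * np.pi)
```

By construction the function returns exactly zero at the true parameter, which is a quick sanity check when experimenting with other candidate values.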

Limiting distribution of the threshold estimator

In this section, we first discuss the limiting behavior for a sequence of normalized profile log-likelihood processes. Then we obtain the limiting distribution of the standardized maximum likelihood estimator of the threshold parameter.

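Recall that

$$l_n(\vartheta_c,s)=\frac{1}{n}\sum\ln\frac{f\big(X_i-h_s(\mathbf{X}_{i-1},\vartheta_c)\big)}{f(\varepsilon_i)},\qquad(\vartheta_c',s)'\in\Omega.$$

For $t\in\mathbb{R}$, a sequence of normalized profile log-likelihood processes is

$$\hat l_n(t)=-2n\big[l_n(\vartheta_{cn}(r+t/n),\,r+t/n)-l_n(\vartheta_{cn}(r),\,r)\big].$$

Observe that in view of Theorem 3, $\vartheta_{cn}(r+t/n)$ is an approximation of $\theta_c$ uniformly in t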
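The normalized profile process defined above can be sketched numerically under a simplifying assumption that is not the paper's general setting: for p=d=1 and standard Gaussian errors, the coefficient MLE at a fixed threshold s reduces to ordinary least squares within each regime. The f(εi) terms cancel in the difference, so only the fitted residuals are needed. All function names here are hypothetical.

```python
import numpy as np

def fit_coefficients(x, s):
    """Per-regime least squares at fixed threshold s.

    Under Gaussian errors (an assumption of this sketch) this is the
    coefficient MLE for fixed s. Both regimes must be non-empty.
    """
    lag, cur = x[:-1], x[1:]
    coef = []
    for mask in (lag <= s, lag > s):
        design = np.column_stack([np.ones(mask.sum()), lag[mask]])
        beta, *_ = np.linalg.lstsq(design, cur[mask], rcond=None)
        coef.extend(beta)
    return np.array(coef)  # (a1, b1, a2, b2)

def sum_log_f(x, s):
    """Sum of Gaussian log-densities of the residuals fitted at threshold s."""
    lag, cur = x[:-1], x[1:]
    a1, b1, a2, b2 = fit_coefficients(x, s)
    fitted = np.where(lag <= s, a1 + b1 * lag, a2 + b2 * lag)
    resid = cur - fitted
    return np.sum(-0.5 * resid**2 - 0.5 * np.log(2.0 * np.pi))

def profile_process(x, r, t):
    """-2n [l_n(coef(r + t/n), r + t/n) - l_n(coef(r), r)], with n = len(x) - 1."""
    x = np.asarray(x, dtype=float)
    n = len(x) - 1
    return -2.0 * (sum_log_f(x, r + t / n) - sum_log_f(x, r))
```

In this Gaussian special case the process equals the difference of residual sums of squares at thresholds r+t/n and r, and it is piecewise constant in t: it changes only when r+t/n crosses an observed lagged value.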

Acknowledgements

I would sincerely like to express my deep gratitude to my thesis advisor, Professor Hira L. Koul, for all his expert guidance, numerous useful suggestions and continuous encouragement. I also thank an anonymous referee for many useful comments and suggestions.

References (21)

  • Akaike, H., 1973. Information theory and an extension of the maximum likelihood principle. In: Petrov, B.N., Csaki, F....
  • Chan, K.S., et al., 1985. On the use of the deterministic Lyapunov function for the ergodicity of stochastic difference equations. Adv. Appl. Probab.
  • Chan, K.S., et al., 1985. A multiple-threshold AR(1) model. J. Appl. Probab.
  • Chan, K.S., 1993. Consistency and limiting distribution of the least squares estimator of a threshold autoregressive model. Ann. Statist.
  • Hall, P., et al., 1980. Martingale Limit Theory and its Applications.
  • Huber, P.J., 1967. The behavior of maximum likelihood estimates under nonstandard conditions. In: Proc. 5th Berkeley...
  • Jacod, J., Mémin, J., 1980. Sur la convergence des semimartingales vers un processus a accroissements independents....
  • Jacod, J., Shiryaev, A.N., 1987. Limit Theorems for Stochastic Processes. Springer, Berlin, Ch....
  • Koul, H.L., 1992. Weighted empiricals and linear models. IMS Lecture Notes-Monograph Ser....
  • Koul, H.L., Schick, A., 1995. Efficient estimation in nonlinear time series models. Manuscript. MSU Tech. Report...
There are more references available in the full text version of this article.

¹ Research was partly supported by the NSF grant: DMS-94 02904.
