Abstract
The marginal distribution of count data processes rarely follows a simple Poisson model in practice. Instead, one commonly observes deviations such as overdispersion or zero inflation. To express the extent of such deviations from a Poisson model, one can compute an appropriately defined dispersion index or zero index. In this article, we develop several tests based on such indexes, including joint tests based on a combination of indexes. The asymptotic distribution of the resulting test statistics under the null hypothesis of a Poisson INAR(1) model is derived, and the finite-sample performance of the resulting tests is analyzed. Real data examples illustrate the application of these tests in practice.
Change history
12 September 2019
Unfortunately, due to a technical error, the articles published in issues 60:2 and 60:3 received incorrect pagination. Please find here the corrected Tables of Contents. We apologize to the authors of the articles and the readers.
References
Al-Osh MA, Alzaid AA (1987) First-order integer-valued autoregressive (INAR(1)) process. J Time Ser Anal 8(3):261–275
Alzaid AA, Al-Osh MA (1988) First-order integer-valued autoregressive process: distributional and regression properties. Stat Neerl 42(1):53–61
Barreto-Souza W (2015) Zero-modified geometric INAR(1) process for modelling count time series with deflation or inflation of zeros. J Time Ser Anal 36(6):839–852
Böhning D (1994) A note on a test for Poisson overdispersion. Biometrika 81(2):418–419
Cochran WG (1954) Some methods for strengthening the common \(\chi ^2\) tests. Biometrics 10(4):417–451
David HA (1985) Bias of \(S^2\) under dependence. Am Stat 39(3):201
Douglas JB (1980) Analysis with standard contagious distributions. International Co-operative Publishing House, Fairland
Fisher RA (1950) The significance of deviations from expectation in a Poisson series. Biometrics 6:17–24
Freeland RK, McCabe BPM (2004) Forecasting discrete valued low count time series. Int J Forecast 20(3):427–434
Ibragimov I (1962) Some limit theorems for stationary processes. Theory Probab Its Appl 7(4):349–382
Jazi MA, Jones G, Lai C-D (2012) First-order integer valued AR processes with zero inflated Poisson innovations. J Time Ser Anal 33:954–963
Johnson NL, Kemp AW, Kotz S (2005) Univariate discrete distributions. Wiley series in probability and statistics. Wiley, New York
Jung RC, Ronning G, Tremayne AR (2005) Estimation in conditional first order autoregression with discrete support. Stat Pap 46:195–224
Maiti R, Biswas A, Guha A, Ong SH (2014) Modelling and coherent forecasting of zero-inflated count time series. Stat Modell 14(5):375–398
McKenzie E (1985) Some simple models for discrete variate time series. Water Resour Bull 21(4):645–650
Meintanis S, Karlis D (2014) Validation tests for the innovation distribution in INAR time series models. Comput Stat 29(5):1221–1241
Moriña D, Puig P, Ríos J, Vilella A, Trilla A (2011) A statistical model for hospital admissions caused by seasonal diseases. Stat Med 30(26):3125–3136
Nastić AS, Ristić MM, Miletić Ilić AV (2017) A geometric time series model with an alternative dependent Bernoulli counting series. Commun Stat-Theory Methods 46(2):770–785
Park Y, Kim H-Y (2012) Diagnostic checks for integer-valued autoregressive models using expected residuals. Stat Pap 53(4):951–970
Puig P, Valero J (2006) Count data distributions: some characterizations with applications. J Am Stat Assoc 101(473):332–340
Puig P, Valero J (2007) Characterization of count data distributions involving additivity and binomial subsampling. Bernoulli 13(2):544–555
Pujol M, Barquinero JF, Puig P, Puig R, Caballín MR, Barrios L (2014) A new model of biodosimetry to integrate low and high doses. PLoS ONE 9(12):1–19
Rao CR, Chakravarti IM (1956) Some small sample tests of significance for a Poisson distribution. Biometrics 12(3):264–282
Sáez-Castillo AJ, Conde-Sánchez A (2015) Detecting over- and under-dispersion in zero inflated data with the hyper-Poisson regression model. Stat Pap. doi:10.1007/s00362-015-0683-1
Schweer S (2015) On the time-reversibility of integer-valued autoregressive processes of general order. In: Steland A et al (eds) Stochastic models, statistics and their applications. Springer proceedings in mathematics and statistics, vol 122. Springer, Wroclaw, pp 169–177
Schweer S, Weiß CH (2014) Compound Poisson INAR(1) processes: stochastic properties and testing for overdispersion. Comput Stat Data Anal 77:267–284
Scotto MG, Weiß CH, Gouveia S (2015) Thinning-based models in the analysis of integer-valued time series: a review. Stat Modell 15(6):590–618
Steutel FW, van Harn K (1979) Discrete analogues of self-decomposability and stability. Ann Probab 7(5):893–899
van den Broek J (1995) A score test for zero inflation in a Poisson distribution. Biometrics 51(2):738–743
Weiß CH (2008) Thinning operations for modelling time series of counts—a survey. Adv Stat Anal 92(3):319–341
Weiß CH (2013) Integer-valued autoregressive models for counts showing underdispersion. J Appl Stat 40(9):1931–1948
Weiß CH, Schweer S (2015) Detecting overdispersion in INARCH(1) processes. Stat Neerl 69(3):281–297
Zheng H, Basawa IV, Datta S (2007) First-order random coefficient integer-valued autoregressive processes. J Stat Plan Inference 137:212–229
Zhu F (2012) Zero-inflated Poisson and negative binomial integer-valued GARCH models. J Stat Plan Inference 142(4):826–839
Acknowledgements
The authors are grateful to two referees for useful comments on an earlier draft of this article. The authors would also like to thank Tobias Möller, Helmut Schmidt University Hamburg, for making them aware of the Regensburg time series studied in Sect. 5.2. The third author was funded by the Grants MTM2012-31118 and MTM2015-69493-R from the Spanish Ministry of Economy and Competitiveness.
Additional information
Dedicated to the memory of Bent Jørgensen (1954–2015).
Appendices
Appendix 1: The ZIB-ZIP-INAR(1) model for zero-inflation
The basic INAR(1) model from Definition 1 has been generalized in several ways by replacing the binomial thinning operator by another type of thinning operation, see the surveys by Weiß (2008) and Scotto et al. (2015). To inflate the number of zeros being observed for a Poisson INAR(1) process, we combine the idea of using zero-inflated innovations (Jazi et al. 2012) with the idea of using an appropriately modified type of thinning operation: the zero-inflated binomial thinning (ZIB thinning) operator. It is defined by a conditional ZIB-distribution (instead of a conditional binomial distribution) via
The latter guarantees that \(\frac{\alpha }{1-\omega }<1\). Note that \(\omega =0\) just corresponds to the usual binomial thinning operator, while negative values for \(\omega \) (and hence some kind of “zero-deflated binomial thinning”) are impossible as long as N is allowed to take arbitrarily large values from \(\mathbb N_0\). The parametrization in (17) is chosen such that \({\mathbb {E}}[\alpha \circ _{\omega } N\,|\,N]\) still equals \(\alpha \cdot N\), while \(\mathrm{Var}[\alpha \circ _{\omega } N\,|\,N]=\alpha (1-\frac{\alpha }{1-\omega })\,N + \alpha ^2\,\frac{\omega }{1-\omega }\,N^2\).
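The conditional moments stated above can be checked numerically. The following is a minimal sketch, assuming that (17) defines \(\alpha \circ _{\omega } N\) given \(N=n\) as a zero-inflated binomial distribution with n trials, success probability \(\frac{\alpha }{1-\omega }\) and zero-inflation parameter \(\omega \); the function names and the parameter values are illustrative choices, not taken from the article.

```python
import math

def zib_thinning_pmf(n, alpha, omega):
    """Conditional pmf of alpha o_omega N given N = n (assumed ZIB form)."""
    pi = alpha / (1.0 - omega)  # success probability of the binomial part
    pmf = [(1.0 - omega) * math.comb(n, k) * pi**k * (1.0 - pi)**(n - k)
           for k in range(n + 1)]
    pmf[0] += omega  # additional point mass at zero
    return pmf

def mean_and_variance(pmf):
    """Mean and variance of a distribution on {0, ..., len(pmf) - 1}."""
    mean = sum(k * p for k, p in enumerate(pmf))
    second = sum(k * k * p for k, p in enumerate(pmf))
    return mean, second - mean**2

# Illustrative values; alpha < 1 - omega is required.
n, alpha, omega = 7, 0.4, 0.25
mean, var = mean_and_variance(zib_thinning_pmf(n, alpha, omega))

# Closed-form conditional moments as stated in the text.
mean_formula = alpha * n
var_formula = (alpha * (1.0 - alpha / (1.0 - omega)) * n
               + alpha**2 * omega / (1.0 - omega) * n**2)
```

Under these assumptions, the exact moments of the pmf and the closed-form expressions agree up to floating-point error.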
It should be noted that ZIB-thinning is just that instance of random coefficient thinning (Zheng et al. 2007), where the thinning parameter equals 0 with probability \(\omega \), and \(\frac{\alpha }{1-\omega }\) with probability \(1-\omega \). The same operator (and, hence, also the same model as below) was recently also considered by Nastić et al. (2017), although not motivated by the issue of zero inflation but by obtaining some kind of “dependent binomial thinning”. Therefore, these authors refer to the operator ‘\(\circ _{\omega }\)’ as the “alternative generalized binomial thinning operator”, but we shall prefer the name “ZIB thinning” here.
The modified type of INAR(1) model to be considered for our power analyses in Sect. 4, referred to as the ZIB-ZIP-INAR(1) model, is now defined by the recursion
This differs from Nastić et al. (2017), who mainly focus on the case of a geometric marginal distribution. It immediately follows (see also Zheng et al. 2007; Nastić et al. 2017) that \({\mathbb {E}}[X_t]=\mu \) and \(\rho (k)=\alpha ^k\), like for the usual INAR(1) model, while
The transition probabilities are computed as
Further marginal properties like the zero probability \(p_0\) can be computed numerically by utilizing the Markov property: choosing M sufficiently large and defining \(\tilde{{\mathbf {P}}}:=(p_{i|j})_{i,j=0,\ldots ,M}\), the marginal probabilities \((p_0,\ldots ,p_M)^{\top }\) are approximated by the solution of the eigenvalue problem \(\tilde{{\mathbf {P}}}\,\tilde{{\varvec{p}}}=\tilde{{\varvec{p}}}\).
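The numerical approach described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the recursion is taken to be \(X_t=\alpha \circ _{\omega } X_{t-1}+\epsilon _t\), the innovations are taken to be zero-inflated Poisson with hypothetical parameters `lam` and `omega_eps`, and the eigenvalue problem is solved by plain power iteration on the truncated matrix.

```python
import math

def zib_thinning_pmf(n, alpha, omega):
    """Conditional pmf of alpha o_omega N given N = n (assumed ZIB form)."""
    pi = alpha / (1.0 - omega)
    pmf = [(1.0 - omega) * math.comb(n, k) * pi**k * (1.0 - pi)**(n - k)
           for k in range(n + 1)]
    pmf[0] += omega
    return pmf

def zip_pmf_table(M, lam, omega_eps):
    """ZIP innovation pmf on {0, ..., M} (hypothetical parametrization)."""
    pmf = [(1.0 - omega_eps) * math.exp(-lam) * lam**k / math.factorial(k)
           for k in range(M + 1)]
    pmf[0] += omega_eps
    return pmf

def transition_matrix(M, alpha, omega, lam, omega_eps):
    """Truncated matrix with entries P[i][j] = P(X_t = i | X_{t-1} = j)."""
    innov = zip_pmf_table(M, lam, omega_eps)
    P = [[0.0] * (M + 1) for _ in range(M + 1)]
    for j in range(M + 1):
        thin = zib_thinning_pmf(j, alpha, omega)
        for i in range(M + 1):
            # Convolution of the thinning pmf with the innovation pmf.
            P[i][j] = sum(thin[k] * innov[i - k] for k in range(min(i, j) + 1))
    return P

def stationary_distribution(P, iterations=300):
    """Approximate solution of P p = p by normalized power iteration."""
    m = len(P)
    p = [1.0 / m] * m
    for _ in range(iterations):
        q = [sum(P[i][j] * p[j] for j in range(m)) for i in range(m)]
        total = sum(q)
        p = [v / total for v in q]
    return p

# Poisson INAR(1) special case (omega = omega_eps = 0) with mean 1 / (1 - 0.5) = 2.
p_poi = stationary_distribution(transition_matrix(40, 0.5, 0.0, 1.0, 0.0))
# A zero-inflated configuration with innovation mean 0.9 and marginal mean 1.8.
p_zi = stationary_distribution(transition_matrix(40, 0.5, 0.2, 1.0, 0.1))
```

In the Poisson special case, `p_poi[0]` should reproduce \(e^{-\mu }\), which gives a simple check that the truncation level M is large enough. A dedicated eigenvalue solver could replace the power iteration.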
Appendix 2: Proofs
Appendix 2.1: Proof of Theorem 3.1.1
For the computation of the joint moments, the cases \(k=0\) and \(k>0\) have to be considered. As we will see, the joint moment formula for \(k>0\) also holds for \(k=0\) when calculating \(\sigma _{11}, \sigma _{12}, \sigma _{13}\).
For \(\sigma _{11}\), we compute
So it follows that
For \(\sigma _{12}\), it follows that
Hence,
Note that if the process were not time-reversible (t.r.), we would have to compute \({\mathbb {E}}[Z_{t,1} \cdot Z_{t-k,2}]\) separately.
Before computing \(\sigma _{13}\), we derive
So it follows that
The remaining entries of \({\varvec{\Sigma }}\) have already been calculated in Lemma A.5.1 in Schweer and Weiß (2014), so the proof of Theorem 3.1.1 is complete.
Appendix 2.2: Proof of Theorem 3.2.1
We apply the Delta method to Theorem 3.1.1 and the function \({\varvec{g}}:\mathbb {R}^3\rightarrow \mathbb {R}^2\) with components
where the dispersion function \(f_{d }\) satisfies \(f_{d }(\mu , \mu (1+\mu ))=1\) for all \(\mu \). The Jacobian of \({\varvec{g}}\), evaluated at \((e^{-\mu }, \mu , \mu (1+\mu ))\), then has the form
Hence, we obtain
where
Considering that all \(\sigma _{ij}\) from Theorem 3.1.1 except \(\sigma _{11}\) have a similar structure, we can further simplify
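Finally, consider the particular case of the dispersion index (10), i.e., the case where \(f_{d }(x_2,x_3) := \frac{x_3}{x_2}-x_2\). Then
\[ \frac{\partial f_{d }}{\partial x_2} = -\frac{x_3}{x_2^2}-1, \qquad \frac{\partial f_{d }}{\partial x_3} = \frac{1}{x_2}. \]
Hence, for the above Jacobian of \({\varvec{g}}\), evaluated at \((e^{-\mu }, \mu , \mu (1+\mu ))\), we obtain
\[ d_{22} = -\frac{1+2\mu }{\mu }, \qquad d_{23} = \frac{1}{\mu }. \]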
As a result, \(d_{22} + (1+2\mu )\, d_{23} = 0\) such that the expressions B, C for the covariance matrix further simplify to
Inserting \(d_{23}\ =\ 1/\mu \) completes the proof of Theorem 3.2.1. Note that the expression for C was already shown in Schweer and Weiß (2014).
Appendix 2.3: Proof of Proposition 1
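The asymptotic behavior of the particular combination of \(\hat{I}_{\mathrm{PV}}\) or \(\hat{I}_{\mathrm{vdB}}\) with \(\hat{I}_{\mathrm{d}}\) is a direct consequence of Theorem 3.2.1. For the case \(\hat{I}_{\mathrm{PV}}\), to obtain the expressions for \(d_{11}, d_{12}\), we have to use the function \(f_{\mathrm{z}}\) in Theorem 3.2.1 given by \(f_{\mathrm{z}}(x_1,x_2) := 1+\frac{\ln {x_1}}{x_2}\), with partial derivatives
\[ \frac{\partial f_{\mathrm{z}}}{\partial x_1} = \frac{1}{x_1 x_2}, \qquad \frac{\partial f_{\mathrm{z}}}{\partial x_2} = -\frac{\ln {x_1}}{x_2^2}. \]
Evaluating at \((e^{-\mu }, \mu )\), we obtain
\[ d_{11} = \frac{e^{\mu }}{\mu }, \qquad d_{12} = \frac{1}{\mu }. \]
Similarly, for the case \(\hat{I}_{\mathrm{vdB}}\), we have to use \(f_{\mathrm{z}}(x_1,x_2) := x_1e^{x_2}-1\), with partial derivatives
\[ \frac{\partial f_{\mathrm{z}}}{\partial x_1} = e^{x_2}, \qquad \frac{\partial f_{\mathrm{z}}}{\partial x_2} = x_1e^{x_2}. \]
Again, evaluating at \((e^{-\mu }, \mu )\), we obtain
\[ d_{11} = e^{\mu }, \qquad d_{12} = 1. \]
Inserting \(d_{11}\) and \(d_{12}\) into Theorem 3.2.1, the results follow immediately.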
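To make the index functions used in Appendices 2.2 and 2.3 concrete, the following sketch evaluates them at sample quantities. It assumes plug-in estimation, i.e., \(x_1\), \(x_2\), \(x_3\) are replaced by the relative frequency of zeros, the sample mean and the sample second moment; the function name `sample_indexes` and the toy data are hypothetical.

```python
import math

def sample_indexes(x):
    """Plug-in dispersion and zero indexes for a count series x.

    Uses f_d(x2, x3) = x3/x2 - x2, the PV form f_z = 1 + ln(x1)/x2
    and the vdB form f_z = x1*exp(x2) - 1, with x1 the zero frequency,
    x2 the sample mean and x3 the sample second moment.
    """
    T = len(x)
    p0_hat = sum(1 for v in x if v == 0) / T
    mean = sum(x) / T
    second_moment = sum(v * v for v in x) / T
    if mean == 0:
        raise ValueError("all observations are zero; indexes are undefined")
    i_d = second_moment / mean - mean
    # ln(0) is undefined; report -inf if the series contains no zeros.
    i_pv = 1.0 + math.log(p0_hat) / mean if p0_hat > 0 else -math.inf
    i_vdb = p0_hat * math.exp(mean) - 1.0
    return {"I_d": i_d, "I_PV": i_pv, "I_vdB": i_vdb}

# Toy series: mean 1, second moment 2, three zeros out of eight observations.
result = sample_indexes([0, 1, 2, 1, 0, 3, 1, 0])
```

For a Poisson marginal distribution, the population values are \(I_{\mathrm{d}}=1\) and \(I_{\mathrm{PV}}=I_{\mathrm{vdB}}=0\), so values of the sample indexes far from these benchmarks point towards overdispersion or zero inflation.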
Appendix 2.4: Proof of Proposition 2
The expectation of \(\hat{I}_{\mathrm{z}}\) is determined by applying a second-order Taylor expansion to \(\hat{I}_{\mathrm{z}}\). We obtain \({\mathbb {E}}[\hat{I}_{\mathrm{z}}-I_{\mathrm{z}}]\approx {\mathbb {E}}[\frac{1}{2}{\varvec{Y}}_T^{'} {\mathbf {H}}_{f_{\mathrm{z}}}{\varvec{Y}}_T]\), with \({\varvec{Y}}_T=\frac{1}{\sqrt{T}}\sum _{t=1}^T{\varvec{Z}}_t\) satisfying \({\mathbb {E}}[{\varvec{Y}}_T]={\varvec{0}}\), and therefore
The results for \(\hat{I}_{\mathrm{PV}}\) and \(\hat{I}_{\mathrm{vdB}}\) follow by direct calculations, taking into account that for \(I_{\mathrm{PV}}\) the Hessian matrix of \(f_{\mathrm{z}}\) is
and for \(I_{\mathrm{vdB}}\) the Hessian matrix of \(f_{\mathrm{z}}\) is
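For orientation, differentiating the two zero-index functions from Appendix 2.3 twice gives the Hessians below. They are derived here from the stated definitions of \(f_{\mathrm{z}}\) and are not transcribed from the original displays; evaluating them at \((x_1,x_2)=(e^{-\mu },\mu )\) is an assumption about the intended evaluation point.

```latex
% I_PV: f_z(x_1, x_2) = 1 + \ln(x_1) / x_2
\[
\mathbf{H}_{f_{\mathrm{z}}}(x_1,x_2) =
\begin{pmatrix}
-\frac{1}{x_1^2 x_2} & -\frac{1}{x_1 x_2^2} \\
-\frac{1}{x_1 x_2^2} & \frac{2\ln x_1}{x_2^3}
\end{pmatrix},
\qquad
\mathbf{H}_{f_{\mathrm{z}}}(e^{-\mu},\mu) =
\begin{pmatrix}
-\frac{e^{2\mu}}{\mu} & -\frac{e^{\mu}}{\mu^2} \\
-\frac{e^{\mu}}{\mu^2} & -\frac{2}{\mu^2}
\end{pmatrix}
\]

% I_vdB: f_z(x_1, x_2) = x_1 e^{x_2} - 1
\[
\mathbf{H}_{f_{\mathrm{z}}}(x_1,x_2) =
\begin{pmatrix}
0 & e^{x_2} \\
e^{x_2} & x_1 e^{x_2}
\end{pmatrix},
\qquad
\mathbf{H}_{f_{\mathrm{z}}}(e^{-\mu},\mu) =
\begin{pmatrix}
0 & e^{\mu} \\
e^{\mu} & 1
\end{pmatrix}
\]
```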
Cite this article
Weiß, C.H., Homburg, A. & Puig, P. Testing for zero inflation and overdispersion in INAR(1) models. Stat Papers 60, 823–848 (2019). https://doi.org/10.1007/s00362-016-0851-y
DOI: https://doi.org/10.1007/s00362-016-0851-y