Information measures for generalized gamma family
Introduction
The generalized gamma (GG) distribution offers a flexible family in the variety of shapes and hazard functions it provides for modeling duration. It was introduced by Stacy (1962). Difficulties with convergence of algorithms for maximum likelihood estimation (Hager and Bain, 1970) inhibited applications of the GG model. Prentice (1974) resolved the convergence problem using a nonlinear transformation of the GG model. However, despite its long history and growing use in various applications, the GG family has been remarkably absent from the information theoretic literature. Thus far, a maximum entropy (ME) derivation of GG is given in Kapur (1989), where it is referred to as the generalized Weibull distribution, and only recently has the entropy of GG appeared in the context of flexible families of distributions (Nadarajah and Zografos, 2003). The GG family has not been included in information studies such as the existing ME distributional fitting of the parametric families (see, e.g., Soofi and Retzer, 2002 and references therein), the discrimination information statistics analysis of the parametric families (Alwan et al., 1998), and the entropy orderings of the parametric families (Ebrahimi et al., 1999). The main objective of this paper is to fill this void and integrate the GG family into the information theoretic literature. For this purpose, we develop information criteria for discriminating between the GG and its subfamilies and for assessing the fit of GG to the data. We also present Bayesian inference about the discrimination and the fit.
Analysis of duration data is increasingly used in various areas of economics and related fields (Kiefer, 1988). In labor economics, examples include studies of the duration of unemployment (Lancaster, 1979, Kiefer, 1984, McDonald and Butler, 1987, Yamaguchi, 1992), turnover in the labor market (Kiefer et al., 1985), length of contract (Gronberg, 1994), and duration of strikes (Jaggia, 1991). Examples in other areas include studies of firm survival (Audretsch and Mahmood, 1995), the time that firms spend under Chapter 11 (Orbe et al., 2002), the time that a property is on the market (Genesove and Mayer, 1997), duration of schooling in higher education (Diaz, 1999), duration of stages of oilfield exploration (Favero et al., 1994), household interpurchase time (Vakratsas and Bass, 2002), interpurchase time in financial markets (Allenby et al., 1999), and the length of time that new movies stay on screens (Blumenthal, 1988).
Distributions that are used in duration analysis in economics include the exponential (Kiefer, 1984, Diebold and Rudebusch, 1990), lognormal (Eckstein and Wolpin, 1995), gamma (Lancaster, 1979), and Weibull (Favero et al., 1994). The GG family, which encompasses the exponential, gamma, and Weibull as subfamilies and the lognormal as a limiting distribution, has been used in economics by Jaggia (1991), Yamaguchi (1992), and Allenby et al. (1999). Some authors (e.g., Jaggia, 1991, Allenby et al., 1999) have argued that the flexibility of GG makes it suitable for duration analysis, while others have used simpler models and avoided the estimation difficulties caused by the complexity of the GG parameter structure. Obviously, there would be no need to endure the costs associated with the application of a complex model if the data do not discriminate between the GG and members of its subfamilies, or if the fit of a simpler model to the data is as good as that of the more complex GG. The question therefore is: Do the data necessitate use of a GG model? From the information theoretic perspective, this question is addressed by deriving probability models based on partial information in the form of a set of constraints, measuring the incremental information content of additional constraints, and thereby assessing the compatibility of models with the data. The information measures presented in this paper offer tools, with an axiomatic basis and intuitive appeal, for GG as a general class of duration models.
The paper is organized as follows. Section 2 discusses information properties of the GG family and presents several discrimination information measures for the GG and its subfamilies. Section 3 gives entropy representations of the likelihood statistic, AIC, and BIC measures. Section 4 discusses Bayesian inference about the GG parameters and discrimination information measures. Section 5 presents an information index of fit of the GG model to the histogram and Bayesian inference about the fit. Section 6 illustrates application of the information criteria to the duration of unemployment and the duration of CEO tenure. Section 7 gives some brief concluding remarks.
Section snippets
Information properties of GG family
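The probability density function of the GG distribution, $GG(\alpha, \tau, \lambda)$, is
$$f(y \mid \alpha, \tau, \lambda) = \frac{\tau}{\lambda^{\alpha\tau}\,\Gamma(\alpha)}\, y^{\alpha\tau - 1}\, e^{-(y/\lambda)^{\tau}}, \qquad y \ge 0,\quad \alpha, \tau, \lambda > 0,$$
where $\Gamma(\cdot)$ is the gamma function, $\alpha$ and $\tau$ are shape parameters, and $\lambda$ is the scale parameter.
The GG family is flexible in that it includes several well-known models as subfamilies (see Johnson et al., 1994). The subfamilies of GG thus far considered in the literature are exponential ($\alpha = \tau = 1$), gamma for $\tau = 1$, and Weibull for $\alpha = 1$. The lognormal distribution is also obtained as a limiting distribution.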
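The nesting of the subfamilies can be checked numerically. The following is a minimal sketch, assuming the Stacy-type parameterization with shape parameters `alpha` and `tau` and scale `lam`; the function names are illustrative and are not taken from the paper.

```python
import math

def gg_pdf(y, alpha, tau, lam):
    """GG density for y > 0 under the assumed (alpha, tau, lam) parameterization."""
    log_f = (math.log(tau) - alpha * tau * math.log(lam) - math.lgamma(alpha)
             + (alpha * tau - 1.0) * math.log(y) - (y / lam) ** tau)
    return math.exp(log_f)

def exponential_pdf(y, lam):
    """Exponential density with scale (mean) lam."""
    return math.exp(-y / lam) / lam

def gamma_pdf(y, alpha, lam):
    """Gamma density with shape alpha and scale lam."""
    log_f = ((alpha - 1.0) * math.log(y) - y / lam
             - math.lgamma(alpha) - alpha * math.log(lam))
    return math.exp(log_f)

def weibull_pdf(y, tau, lam):
    """Weibull density with shape tau and scale lam."""
    return (tau / lam) * (y / lam) ** (tau - 1.0) * math.exp(-((y / lam) ** tau))

y = 1.7  # arbitrary evaluation point
print(gg_pdf(y, 1.0, 1.0, 2.0), exponential_pdf(y, 2.0))   # alpha = tau = 1
print(gg_pdf(y, 2.5, 1.0, 2.0), gamma_pdf(y, 2.5, 2.0))    # tau = 1
print(gg_pdf(y, 1.0, 0.7, 2.0), weibull_pdf(y, 0.7, 2.0))  # alpha = 1
```

Each printed pair should agree up to floating-point error, which reflects the parameter restrictions that define the three subfamilies.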
Likelihood-based measures
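The likelihood function based on a set of observations $y_1, \ldots, y_n$ from $GG(\alpha, \tau, \lambda)$ is
$$L(\alpha, \tau, \lambda) = \left[\frac{\tau}{\lambda^{\alpha\tau}\,\Gamma(\alpha)}\right]^{n} \exp\left\{ n\left[(\alpha\tau - 1)\,\overline{\log y} - \lambda^{-\tau}\,\overline{y^{\tau}}\right]\right\},$$
where $\overline{\log y} = \frac{1}{n}\sum_{i=1}^{n} \log y_i$ and $\overline{y^{\tau}} = \frac{1}{n}\sum_{i=1}^{n} y_i^{\tau}$.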
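As a minimal sketch of how this likelihood can be evaluated, the code below assumes the GG density $\tau\, y^{\alpha\tau-1} e^{-(y/\lambda)^{\tau}} / (\lambda^{\alpha\tau}\Gamma(\alpha))$ and shows that the log-likelihood depends on the data only through the sample means of $\log y_i$ and $y_i^{\tau}$. The function names, the toy data, and the parameter values are hypothetical and serve only as illustration.

```python
import math

def gg_loglik(ys, alpha, tau, lam):
    """Log-likelihood as a direct sum of log-densities (assumed parameterization)."""
    const = math.log(tau) - alpha * tau * math.log(lam) - math.lgamma(alpha)
    return sum(const + (alpha * tau - 1.0) * math.log(y) - (y / lam) ** tau
               for y in ys)

def gg_loglik_from_means(n, mean_log_y, mean_y_tau, alpha, tau, lam):
    """Same quantity written in terms of the two sample means."""
    const = math.log(tau) - alpha * tau * math.log(lam) - math.lgamma(alpha)
    return n * (const + (alpha * tau - 1.0) * mean_log_y - mean_y_tau / lam ** tau)

def lam_given_shapes(mean_y_tau, alpha, tau):
    """Scale that zeroes the score in lam for fixed alpha and tau:
    lam**tau = mean(y**tau) / alpha."""
    return (mean_y_tau / alpha) ** (1.0 / tau)

ys = [0.5, 1.2, 2.3, 3.1, 0.9]       # toy durations, not real data
alpha, tau, lam = 1.5, 0.8, 2.0      # arbitrary illustrative values
n = len(ys)
mean_log_y = sum(math.log(y) for y in ys) / n
mean_y_tau = sum(y ** tau for y in ys) / n

print(gg_loglik(ys, alpha, tau, lam))
print(gg_loglik_from_means(n, mean_log_y, mean_y_tau, alpha, tau, lam))
print(lam_given_shapes(mean_y_tau, alpha, tau))
```

The first two printed values should coincide up to rounding. The third is the scale value that maximizes the log-likelihood when the two shape parameters are held fixed, obtained by setting the derivative with respect to the scale to zero.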
The likelihood equations for the derivatives of the log-likelihood function with respect to and are the two moment equations (6) and (7) with and . These equations give , where is given by (8) with the MLE estimates , , and of the
Bayesian inference for discrimination information
Given data $y_1, \ldots, y_n$, discrimination information statistics for the GG family are obtained by estimating the Kullback–Leibler functions presented in the preceding section. We may estimate the discrimination information measures by estimating the parameters using maximum likelihood, the method of moments, the generalized method of moments, or Bayesian procedures. These estimates of information provide descriptive statistics which are useful diagnostic measures for quantifying data information
Bayesian inference for maximum entropy index
Maximum entropy fit indices and tests are constructed based on properties of the parametric family of the model. Consider the distributions in the moment class $\Omega_\theta$ defined in (5). If $f^*$ is the ME model in $\Omega_\theta$, then for any $f \in \Omega_\theta$, by the information distinguishability (ID) relation (Soofi et al., 1995), we have
$$K(f : f^*) = H(f^*) - H(f).$$
That is, the discrepancy between the ME distribution and any other distribution in $\Omega_\theta$ is given by the difference between the entropies of the two models.
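The relation can be illustrated numerically in a simpler setting than the moment class of the paper. The sketch below assumes, purely for illustration, the class of densities on $y \ge 0$ with a fixed mean `theta`, for which the ME model is the exponential with entropy $1 + \log\theta$, and takes a gamma density with shape 2 and the same mean as the competing member. It writes $K$ for the Kullback–Leibler discrimination information and $H$ for the Shannon entropy; the quadrature routine, function names, and numerical values are assumptions of this example.

```python
import math

def integrate(g, a, b, n=100_000):
    """Composite trapezoid rule on [a, b] with n subintervals."""
    h = (b - a) / n
    total = 0.5 * (g(a) + g(b))
    for i in range(1, n):
        total += g(a + i * h)
    return total * h

theta = 3.0          # common mean defining the illustrative moment class
scale = theta / 2.0  # gamma(shape=2, scale) then has mean theta

def f_star(y):
    """ME density in the class: exponential with mean theta."""
    return math.exp(-y / theta) / theta

def f(y):
    """Competing member of the class: gamma with shape 2 and mean theta."""
    return y * math.exp(-y / scale) / scale ** 2

def kl_integrand(y):
    fy = f(y)
    return 0.0 if fy == 0.0 else fy * math.log(fy / f_star(y))

def entropy_integrand(y):
    fy = f(y)
    return 0.0 if fy == 0.0 else -fy * math.log(fy)

upper = 40.0 * theta                      # tail mass beyond this is negligible
K = integrate(kl_integrand, 0.0, upper)   # K(f : f*)
H_f = integrate(entropy_integrand, 0.0, upper)
H_star = 1.0 + math.log(theta)            # closed-form entropy of the exponential

print(K, H_star - H_f)
```

The two printed numbers should agree to within the quadrature error, so the entropy gap between the ME model and the competitor equals the discrimination information between them.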
Given observations
Examples
We illustrate applications of the discrimination information measures and ME fit indices using two data sets. The first data set pertains to unemployment duration, drawn from the Bureau of Labor Statistics 2001. We studied unemployment data for females and males in rural and urban areas, and will report the results for female workers in the urban areas. The results of information analyses for other categories were all remarkably similar to those reported here. The second data set pertains to
Concluding remarks
This paper took the first major step toward closing the gap between the growing presence of the GG model in the duration analysis literature and its remarkable absence from information studies. We presented some information properties of the GG distribution and showed that its flexibility leads to an assortment of information measures for the family. These information functions provide insights and can serve various data analysis purposes such as MDI modeling and data transformation. We gave entropy
References (39)
- Alwan et al. (1998). Information theoretic framework for process control. European Journal of Operational Research.
- Ebrahimi et al. (1999). Ordering univariate distributions by entropy and variance. Journal of Econometrics.
- Mazzuchi et al. (2000). Computations of maximum entropy Dirichlet for modeling lifetime data. Computational Statistics and Data Analysis.
- Nadarajah and Zografos (2003). Formulas for Rényi information and related measures for univariate distributions. Information Sciences.
- Soofi and Retzer (2002). Information indices: unification and applications. Journal of Econometrics.
- Allenby et al. (1999). A dynamic model of purchase timing with application to direct marketing. Journal of the American Statistical Association.
- Audretsch and Mahmood (1995). New firm survival: new results using a hazard function. Review of Economics and Statistics.
- Bernardo and Rueda (2002). Bayesian hypothesis testing: a reference approach. International Statistical Review.
- Blumenthal (1988). Auctions with constrained information: blind bidding for motion pictures. The Review of Economics and Statistics.
- Carota et al. (1996). Diagnostic measures for model criticism. Journal of the American Statistical Association.
- Diaz (1999). Extended stay at university: an application of multinomial logit and duration models. Applied Economics.
- Diebold and Rudebusch (1990). A nonparametric investigation of duration dependence in the American business cycle. Journal of Political Economy.
- Eckstein and Wolpin (1995). Duration to first job and the return to schooling: estimates from a search-matching model. Review of Economic Studies.
- Favero et al. (1994). A duration model of irreversible oil investment: theory and empirical evidence. Journal of Applied Econometrics.
- Genesove and Mayer (1997). Equity and time to sale in the real estate market. American Economic Review.
- Gilks and Wild (1992). Adaptive rejection sampling for Gibbs sampling. Applied Statistics.
- Gronberg and Reed (1994). Estimating workers' marginal willingness to pay for job attributes using duration data. Journal of Human Resources.
- Hager and Bain (1970). Inferential procedures for the generalized gamma distribution. Journal of the American Statistical Association.
- Hall and Morton (1993). On the estimation of entropy. Annals of the Institute of Statistical Mathematics.
1 Currently at Rapp Collins Worldwide, 1660 North Westridge Circle, Irving, TX 75062.