1994 | OriginalPaper | Book chapter

A new criterion for selecting models from partially observed data

Written by: Hidetoshi Shimodaira

Published in: Selecting Models from Data

Publisher: Springer New York

A new criterion, PDIO (predictive divergence for indirect observation models), is proposed for selecting statistical models from partially observed data. PDIO is devised for “indirect observation models”, in which observations are available only indirectly through random variables; that is, some underlying hidden structure is assumed to generate the manifest variables. Examples include unsupervised learning recognition systems, clustering, latent structure analysis, mixture distribution models, missing data, and noisy observations, or more generally any model whose maximum likelihood estimator is based on the EM (expectation-maximization) algorithm. PDIO is a natural extension of AIC (Akaike’s information criterion), and the two criteria are equivalent when direct observations are available. Both criteria are expressed as the sum of two terms: the first represents the goodness of fit of the model to the observed data, and the second represents the model complexity. The goodness-of-fit terms are the same in both criteria, but the complexity terms differ. The complexity term is a function of the model structure and the number of samples, and is added to take into account the reliability of the observed data. In PDIO, a mean fluctuation of the estimated true distribution is used as the model complexity. From the information-geometric point of view, the complexity term of PDIO therefore reflects the relation between the “model manifold” and the “observed manifold”, whereas in AIC it reduces to the number of parameters. PDIO is unique in dealing with the unobservable underlying structure “positively.” In this paper, the generalized expression of PDIO is given in terms of two Fisher information matrices. An approximate method for computing PDIO from EM iterates is also presented. Computer simulations demonstrate how the criterion works.
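The two-term structure described in the abstract can be sketched in a few lines of Python. The AIC formula is the standard one. The abstract does not state the PDIO formula, so the trace-based penalty below, 2 tr(I_complete I_observed^{-1}), is an assumption: it is one form that uses two Fisher information matrices and reduces to the AIC penalty when the two matrices coincide. The function names and the example matrices are hypothetical and chosen only for illustration.

```python
def aic(log_likelihood, num_params):
    """Standard AIC: -2 * maximized log-likelihood + 2 * number of parameters."""
    return -2.0 * log_likelihood + 2.0 * num_params


def trace_of_ratio(info_complete, info_observed):
    """Return tr(info_complete @ inverse(info_observed)) for small square matrices.

    Solves info_observed * X = info_complete by Gauss-Jordan elimination with
    partial pivoting, then uses tr(B^{-1} A) = tr(A B^{-1}).
    """
    n = len(info_observed)
    # Augmented matrix [B | A], with B = info_observed and A = info_complete.
    aug = [
        [float(v) for v in info_observed[i]] + [float(v) for v in info_complete[i]]
        for i in range(n)
    ]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("observed-data information matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    # The right half of the augmented matrix now holds X = B^{-1} A.
    return sum(aug[i][n + i] for i in range(n))


def trace_penalized_criterion(log_likelihood, info_complete, info_observed):
    """ASSUMED form (not given in the abstract): -2 log L + 2 tr(I_c I_o^{-1})."""
    penalty = 2.0 * trace_of_ratio(info_complete, info_observed)
    return -2.0 * log_likelihood + penalty


# Hypothetical numbers: a two-parameter model with maximized log-likelihood -10.
log_lik = -10.0
info_obs = [[2.0, 0.0], [0.0, 4.0]]   # information in the observed data
info_full = [[4.0, 0.0], [0.0, 4.0]]  # information if the hidden part were observed

direct = trace_penalized_criterion(log_lik, info_obs, info_obs)
indirect = trace_penalized_criterion(log_lik, info_full, info_obs)
print(aic(log_lik, 2), direct, indirect)
```

With identical matrices the penalty equals twice the number of parameters, so the value matches AIC. When the observed data carry less information than the complete data, the penalty grows, which is in line with the abstract's remark that the complexity term accounts for the reliability of the observed data.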

Metadata
Title
A new criterion for selecting models from partially observed data
Written by
Hidetoshi Shimodaira
Copyright year
1994
Publisher
Springer New York
DOI
https://doi.org/10.1007/978-1-4612-2660-4_3