1994 | Original Paper | Book Chapter
A new criterion for selecting models from partially observed data
Author: Hidetoshi Shimodaira
Published in: Selecting Models from Data
Publisher: Springer New York
Included in: Professional Book Archive
A new criterion, PDIO (predictive divergence for indirect observation models), is proposed for selecting statistical models from partially observed data. PDIO is devised for “indirect observation models”, in which observations are available only indirectly through random variables; that is, some underlying hidden structure is assumed to generate the manifest variables. Examples include unsupervised-learning recognition systems, clustering, latent structure analysis, mixture distribution models, missing data, and noisy observations, as well as any model whose maximum likelihood estimator is computed with the EM (expectation-maximization) algorithm. PDIO is a natural extension of AIC (Akaike’s information criterion), and the two criteria are equivalent when direct observations are available. Both criteria are expressed as the sum of two terms: the first represents the goodness of fit of the model to the observed data, and the second represents the model complexity. The goodness-of-fit terms are the same in both criteria, but the complexity terms differ. The complexity term is a function of the model structure and the number of samples, and it is added to account for the reliability of the observed data. In PDIO, a mean fluctuation of the estimated true distribution is used as the model complexity. From the information-geometric point of view, the complexity term of PDIO therefore reflects the relationship between the “model manifold” and the “observed manifold”, whereas in AIC it reduces to the number of parameters. PDIO is distinctive in dealing with the unobservable underlying structure “positively.” In this paper, the general expression of PDIO is given in terms of two Fisher information matrices. An approximate method for computing PDIO from EM iterates is also presented. Computer simulations demonstrate how the criterion works.
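The two-term structure described in the abstract can be illustrated with the familiar AIC case, where the complexity term is simply the number of free parameters. The sketch below is a minimal illustration under stated assumptions, not the chapter's method: the function names (`gaussian_log_likelihood`, `aic`), the toy data, and the two candidate Gaussian models are all hypothetical, and the data are taken to be directly observed. The PDIO complexity term is not sketched, since the abstract does not state it explicitly.

```python
import math

def gaussian_log_likelihood(data, mean, var):
    """Log-likelihood of i.i.d. data under a normal distribution N(mean, var)."""
    n = len(data)
    sq_err = sum((x - mean) ** 2 for x in data)
    return -0.5 * n * math.log(2.0 * math.pi * var) - sq_err / (2.0 * var)

def aic(log_likelihood, num_params):
    """AIC = goodness-of-fit term (-2 log L) + complexity term (2 * parameters)."""
    return -2.0 * log_likelihood + 2.0 * num_params

# Toy, fully observed sample (illustrative values only, not from the chapter).
data = [0.8, 1.9, 2.4, 3.1, 1.3, 2.7, 2.0, 1.6]
n = len(data)

# Model A (assumed): mean fixed at 0, variance estimated -> 1 free parameter.
var_a = sum(x ** 2 for x in data) / n
aic_a = aic(gaussian_log_likelihood(data, 0.0, var_a), num_params=1)

# Model B (assumed): mean and variance both estimated -> 2 free parameters.
mean_b = sum(data) / n
var_b = sum((x - mean_b) ** 2 for x in data) / n
aic_b = aic(gaussian_log_likelihood(data, mean_b, var_b), num_params=2)

# The model with the smaller criterion value is selected.
best = "B" if aic_b < aic_a else "A"
print(f"AIC(A) = {aic_a:.2f}, AIC(B) = {aic_b:.2f}, selected: {best}")
```

Model B pays a larger complexity penalty but fits the sample much better, so it attains the smaller AIC. According to the abstract, PDIO keeps the same goodness-of-fit term and replaces only the parameter-count penalty.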