
2010 | Book


Complex Data Modeling and Computationally Intensive Statistical Methods

Editors: Pietro Mantovan, Piercesare Secchi

Publisher: Springer Milan

Book Series: Contributions to Statistics

Part of: Springer Professional "Wirtschaft+Technik", Springer Professional "Technik", Springer Professional "Wirtschaft"

Frontmatter

Space-time texture analysis in thermal infrared imaging for classification of Raynaud’s Phenomenon

Abstract

This paper proposes a supervised classification approach for the differential diagnosis of Raynaud’s Phenomenon on the basis of functional infrared imaging (IR) data. The segmentation and registration of IR images are briefly discussed and two texture analysis techniques are introduced in a spatio-temporal framework to deal with the feature extraction problem. The classification of data from healthy subjects and from patients suffering from primary and secondary Raynaud’s Phenomenon is performed by using Stepwise Linear Discriminant Analysis (LDA) on a large number of features extracted from the images. The results of the proposed methodology are shown and discussed for a temporal sequence of images related to 44 subjects.

Graziano Aretusi, Lara Fontanella, Luigi Ippoliti, Arcangelo Merla

Mixed-effects modelling of Kevlar fibre failure times through Bayesian non-parametrics

Abstract

We examine the accelerated failure time model for univariate data with right censoring, with application to failure times of Kevlar fibres grouped by spool and subject to different stress levels. We propose a semi-parametric model in which the error distribution is a shape-scale mixture of Weibull densities, the mixing measure being a normalised generalised gamma measure. We obtain posterior estimates of the regression parameter, as well as credibility intervals for the predictive distributions and their quantiles, by including the posterior distribution of the random mixing probability in the MCMC scheme. The number of components in the non-parametric mixture can be interpreted as the number of groups; it has a prior distribution induced by the non-parametric model and is inferred from the data. Compared to previous results, we obtain narrower interval estimates of the quantiles of the predictive survival function. Other diagnostic plots, such as predictive tail probabilities and Bayesian residuals, show good agreement between the model and the data.

Raffaele Argiento, Alessandra Guglielmi, Antonio Pievatolo

Space filling and locally optimal designs for Gaussian Universal Kriging

Abstract

Computer simulations are often used to replace physical experiments aimed at exploring the complex relationships between input and output variables. Computer experiments have several advantages over real ones. However, when the response function is complex, simulation runs may be very expensive and/or time-consuming, and a possible solution consists of approximating the simulator by a suitable stochastic metamodel that is simpler and much faster to run. Several metamodel techniques have been suggested in the literature, and one of the most popular is the Kriging methodology. In this paper we study the optimal design problem for the Universal Kriging metamodel with respect to different approaches, related to prediction, information gain and estimation. We also give further justifications for, and some criticism of, the adoption of space filling designs, based on both theoretical results and numerical evidence.

Alessandro Baldi Antognini, Maroussa Zagoraiou

Exploitation, integration and statistical analysis of the Public Health Database and STEMI Archive in the Lombardia region

Abstract

We describe the nature and aims of the Strategic Program “Exploitation, integration and study of current and future health databases in Lombardia for Acute Myocardial Infarction”. The main goal of the Programme is the construction and statistical analysis of data coming from the integration of complex clinical and administrative databases concerning patients with Acute Coronary Syndromes treated in the Lombardia region. Clinical data sets arise from observational studies about specific diseases, while administrative data arise from standardised and on-going procedures of data collection. The linkage between clinical and administrative databases enables the Lombardia region to create an efficient global system for collecting and storing integrated longitudinal data, to check them, to guarantee their quality and to study them from a statistical perspective.

Pietro Barbieri, Niccolò Grieco, Francesca Ieva, Anna Maria Paganoni, Piercesare Secchi

Bootstrap algorithms for variance estimation in πPS sampling

Abstract

The problem of bootstrapping the estimator’s variance under a probability proportional to size design is examined. Focusing on the Horvitz-Thompson estimator, three πPS-bootstrap algorithms are introduced with the purpose of both simplifying available procedures and of improving efficiency. Results from a simulation study using both natural and artificial data are presented in order to empirically investigate the properties of the provided bootstrap variance estimators.

Alessandro Barbiero, Fulvia Mecatti
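
For readers unfamiliar with the estimator named in the abstract above, the following is a minimal sketch of the Horvitz-Thompson estimator of a population total under inclusion probabilities proportional to size. It is not the authors' bootstrap algorithm: the function names, the toy data, and the use of Poisson sampling (which gives a random sample size, unlike a fixed-size πPS design) are all assumptions made for illustration.

```python
import random

def inclusion_probabilities(sizes, n):
    """First-order inclusion probabilities pi_i = n * x_i / sum(x)."""
    total = sum(sizes)
    probs = [n * x / total for x in sizes]
    if any(p > 1 for p in probs):
        raise ValueError("some unit is too large for expected sample size n")
    return probs

def poisson_sample(probs, rng):
    """Assumed design: include unit i independently with probability probs[i]."""
    return [i for i, p in enumerate(probs) if rng.random() < p]

def ht_total(y_sample, pi_sample):
    """Horvitz-Thompson estimate: sum of y_i / pi_i over the sampled units."""
    return sum(y / p for y, p in zip(y_sample, pi_sample))

# Hypothetical toy population: size measure x and study variable y.
rng = random.Random(0)
sizes = [10, 20, 30, 40]
y = [12, 18, 33, 41]
probs = inclusion_probabilities(sizes, n=2)
sample = poisson_sample(probs, rng)
estimate = ht_total([y[i] for i in sample], [probs[i] for i in sample])
```

A bootstrap variance estimate would repeat a resampling step many times and take the variance of the resulting estimates; how that resampling is constructed for πPS designs is the subject of the chapter and is not reproduced here.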

Fast Bayesian functional data analysis of basal body temperature

Abstract

In many clinical settings, it is of interest to monitor a bio-marker over time for a patient in order to identify or predict clinically important features. For example, in reproductive studies that involve basal body temperature, a low point, a high point or sudden changes in the trajectory have important clinical significance in determining the day of ovulation or in causing dysfunctional cycles. It is common to have patient databases with a huge quantity of data in which each patient's information consists of cycles with sparse observations. If the main interest is to make predictions, it is crucial to borrow information across cycles and among patients. In this paper, we propose the use of fast and efficient algorithms that rely on sparseness-favouring hierarchical priors for P-spline basis coefficients to aid estimation of functional trajectories. Using the basal body temperature data, we present an application of the Relevance Vector Machine method that generates sparse functional linear and linear mixed models that can be used to rapidly estimate individual-specific and population average functions.

James M. Ciera

A parametric Markov chain to model age- and state-dependent wear processes

Abstract

Many technological systems are subjected, during their operating life, to a gradual wear process which, in the long run, may cause failure. In the literature, statisticians and engineers have almost always modeled wear processes by independent increments models, which imply that future wear is assumed to depend, at most, on the system’s age. In many cases it seems more realistic and appropriate to adopt stochastic models which assume that factors other than age affect wear. Indeed, wear models which can (also) account for the dependence on the system’s state have been previously proposed in the literature [1, 3, 11, 13]. Many of the abovementioned models have a very complex structure that prevents their application to the kind of data that are usually available. Therefore, in this paper, a new simple parametric Markov chain wear model is proposed, in which the transition probabilities between process states depend on both the current age and the current wear level of the system. An application based on a real data set referring to the wear process of the cylinder liners of heavy-duty diesel engines for marine propulsion is analysed and discussed.

Massimiliano Giorgio, Maurizio Guida, Gianpaolo Pulcini
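
To make concrete the idea of transition probabilities that depend on both age and current wear, here is a small simulation sketch. The functional form of `step_probability` and all parameter values are hypothetical and chosen only for illustration; they are not the parametric model proposed in the chapter.

```python
import random

def step_probability(age, wear, base=0.05, age_effect=0.01, wear_effect=0.1):
    """Hypothetical probability that wear grows by one unit in the next step.

    It increases with both the system's age and its current wear level,
    so the increments are neither independent nor purely age-driven.
    """
    p = base * (1 + age_effect * age) * (1 + wear_effect * wear)
    return min(p, 1.0)

def simulate_wear(n_steps, rng):
    """Simulate one wear path of a discrete-time, discrete-state Markov chain."""
    wear = 0
    path = [wear]
    for age in range(n_steps):
        if rng.random() < step_probability(age, wear):
            wear += 1
        path.append(wear)
    return path

path = simulate_wear(200, random.Random(1))
```

Setting `wear_effect=0` in this sketch recovers a chain whose increments depend on age only, which is the independent-increments case the abstract contrasts against.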

Case studies in Bayesian computation using INLA

Abstract

Latent Gaussian models are a common construct in statistical applications where a latent Gaussian field, indirectly observed through data, is used to model, for instance, time and space dependence or the smooth effect of covariates. Many well-known statistical models, such as smoothing-spline models, space time models, semiparametric regression, spatial and spatio-temporal models, log-Gaussian Cox models, and geostatistical models are latent Gaussian models. Integrated Nested Laplace Approximation (INLA) is a new approach to implement Bayesian inference for such models. It provides approximations of the posterior marginals of the latent variables which are both very accurate and extremely fast to compute. Moreover, INLA treats latent Gaussian models in a general way, thus allowing for a great deal of automation in the inferential procedure. The inla programme, bundled in the R library INLA, is a prototype of such a black box for inference on latent Gaussian models, which is both flexible and user-friendly. It is meant, hopefully, to make latent Gaussian models applicable, useful and appealing for a larger class of users.

Sara Martino, Håvard Rue

A graphical models approach for comparing gene sets

Abstract

Recently, a great effort in microarray data analysis has been directed towards the study of the so-called gene sets. A gene set is defined by genes that are, somehow, functionally related. For example, genes appearing in a known biological pathway naturally define a gene set. Gene sets are usually identified from a priori biological knowledge. Nowadays, many bioinformatics resources store this kind of knowledge (see, for example, the Kyoto Encyclopedia of Genes and Genomes, among others). In this paper we exploit a multivariate approach, based on graphical models, to deal with gene sets defined by pathways. Given a sample of microarray data corresponding to two experimental conditions and a pathway linking some of the genes, we investigate whether the strength of the relations induced by the functional links changes between the two experimental conditions.

M. Sofia Massa, Monica Chiogna, Chiara Romualdi

Predictive densities and prediction limits based on predictive likelihoods

Abstract

The notion of predictive likelihood stems from the fact that in the prediction problem there are two unknown quantities to deal with: the future observation and the model parameter. Since, according to the likelihood principle, all the evidence is contained in the joint likelihood function, a predictive likelihood for the future observation is obtained by eliminating the nuisance quantity, namely the unknown model parameter. This paper focuses on the profile predictive likelihood and on some modified versions obtained by mimicking the solutions proposed to improve the profile (parametric) likelihood. These predictive likelihoods are evaluated by studying how well they generate prediction intervals. In particular, we find that, at least in some specific applications, these solutions usually improve on those based on the plug-in procedure. However, the associated predictive densities and prediction limits do not correspond to the optimal frequentist solutions already described in the literature.

Paolo Vidoni

Computer-intensive conditional inference

Abstract

Conditional inference is a fundamental part of statistical theory. However, exact conditional inference is often awkward, leading to the desire for methods which offer accurate approximations. Such a methodology is provided by small-sample likelihood asymptotics. We argue in this paper that simple, simulation-based methods also offer accurate approximations to exact conditional inference in multiparameter exponential family and ancillary statistic settings. Bootstrap simulation of the marginal distribution of an appropriate statistic provides a conceptually simple and highly effective alternative to analytic procedures of approximate conditional inference.

G. Alastair Young, Thomas J. DiCiccio

Monte Carlo simulation methods for reliability estimation and failure prognostics

Abstract

Monte Carlo Simulation (MCS) offers a powerful means for modeling the stochastic failure behaviour of engineered structures, systems and components (SSC). This paper summarises current work on advanced MCS methods for reliability estimation and failure prognostics.

Enrico Zio
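
As a basic illustration of Monte Carlo reliability estimation, the sketch below estimates the probability that a two-component series system fails before a mission time and compares it with the closed-form value. The exponential lifetimes, the failure rates, and the function names are assumptions chosen so that an exact answer exists; this is crude Monte Carlo, not one of the advanced methods surveyed in the chapter.

```python
import math
import random

def simulate_series_failure(rate_a, rate_b, mission_time, rng):
    """One run: a series system fails as soon as either component fails."""
    life_a = rng.expovariate(rate_a)
    life_b = rng.expovariate(rate_b)
    return min(life_a, life_b) < mission_time

def estimate_failure_probability(rate_a, rate_b, mission_time, n_runs, seed=0):
    """Crude Monte Carlo estimate: fraction of runs that fail before mission_time."""
    rng = random.Random(seed)
    failures = sum(
        simulate_series_failure(rate_a, rate_b, mission_time, rng)
        for _ in range(n_runs)
    )
    return failures / n_runs

# Hypothetical failure rates (per hour) and mission time (hours).
rate_a, rate_b, mission_time = 0.002, 0.005, 100.0
estimate = estimate_failure_probability(rate_a, rate_b, mission_time, n_runs=20000)

# For independent exponential lifetimes the exact value is available for comparison.
exact = 1.0 - math.exp(-(rate_a + rate_b) * mission_time)
```

When failure is rare, crude sampling of this kind needs a very large number of runs to observe enough failures, which is one motivation for more advanced simulation schemes.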

Title: Complex Data Modeling and Computationally Intensive Statistical Methods
Editors: Pietro Mantovan, Piercesare Secchi
Publisher: Springer Milan
Electronic ISBN: 978-88-470-1386-5
Print ISBN: 978-88-470-1385-8
DOI: https://doi.org/10.1007/978-88-470-1386-5
