
2007 | Book

Statistical Methods for Biostatistics and Related Fields

written by: Professor Wolfgang Härdle, Professor Yuichi Mori, Professor Philippe Vieu

Publisher: Springer Berlin Heidelberg

About this book

Biostatistics is one of the scientific fields in which recent developments have been especially important. It is also strongly related to other scientific disciplines involving statistical methodology. The aim of this book is to cover a wide range of recent statistical methods used by scientists in biostatistics as well as in related fields such as chemometrics, environmetrics and geophysics.

The contributed papers, written by internationally recognized researchers, present various statistical methodologies together with a selection of their main mathematical properties and their applications in real case studies, making this book of interest to a wide audience of researchers and students in statistics.

Each method is accompanied by interactive and automatic XploRe routines, available online, which allow readers to reproduce the proposed examples or to apply the methods to their own real data sets. This book will therefore also be of special interest to practitioners.

Table of contents

Frontmatter

Biostatistics

Frontmatter
1. Discriminant Analysis Based on Continuous and Discrete Variables
Avner Bar-Hen, Jean-Jacques Daudin
2. Longitudinal Data Analysis with Linear Regression
Jörg Breitung, Rémy Slama, Axel Werwatz
3. A Kernel Method Used for the Analysis of Replicated Micro-array Experiments
Abstract
Microarrays are part of a new class of biotechnologies which allow the monitoring of expression levels of thousands of genes simultaneously. In microarray data analysis, the comparison of gene expression profiles with respect to different conditions and the selection of biologically interesting genes are crucial tasks. Multivariate statistical methods have been applied to analyze these large data sets. To identify genes with altered expression under two experimental conditions, we describe in this chapter a new nonparametric statistical approach. Specifically, we propose estimating the distributions of a t-type statistic and its null statistic, using kernel methods. A comparison of these two distributions by means of a likelihood ratio test can identify genes with significantly changed expressions. A method for the calculation of the cut-off point and the acceptance region is also derived. This methodology is applied to a leukemia data set containing expression levels of 7129 genes. The corresponding results are compared to the traditional t-test and the normal mixture model.
Ali Gannoun, Benoît Liquet, Jérôme Saracco, Wolfgang Urfer
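The abstract above describes estimating the distributions of a t-type statistic and of its null statistic with kernel methods and then comparing the two. A minimal sketch of that idea follows, assuming a Gaussian kernel, a fixed bandwidth and made-up toy values; the function names, the data and the plain density ratio are illustrative assumptions, not the chapter's actual statistic, bandwidth choice or cut-off derivation.

```python
import math

def gaussian_kde(sample, bandwidth):
    """Return a function x -> Gaussian kernel density estimate built from `sample`."""
    n = len(sample)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in sample)

    return density

# Hypothetical toy values (not from the leukemia data set):
# observed t-type statistics and statistics computed under the null.
observed = [-0.3, 0.1, 0.4, -0.2, 3.1, 0.0, -2.9, 0.2]
null = [-0.4, 0.2, 0.3, -0.1, 0.5, 0.0, -0.5, 0.1]

f_obs = gaussian_kde(observed, bandwidth=0.5)   # assumed fixed bandwidth
f_null = gaussian_kde(null, bandwidth=0.5)

def null_to_observed_ratio(z):
    """Ratio of the estimated null density to the estimated observed density at z."""
    return f_null(z) / f_obs(z)

# A small ratio suggests the value is unlikely under the null distribution.
for z in (0.0, 3.1):
    print(z, null_to_observed_ratio(z))
```

In this toy setting, values far in the tails of the observed sample get a ratio close to zero, while values near the centre do not; where to place the cut-off is exactly what the chapter derives and is not reproduced here.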
4. Kernel Estimates of Hazard Functions for Biomedical Data Sets
Ivana Horová, Jiří Zelinka
5. Partially Linear Models
Wolfgang Härdle, Hua Liang
6. Analysis of Contingency Tables
Masahiro Kuroda
7. Identifying Coexpressed Genes
Abstract
Some gene expression data contain outliers and noise because of experimental error. In clustering, outliers and noise can result in false positives and false negatives. This motivates us to develop a weighting method that adjusts the expression data so that the effect of outliers and noise decreases, thereby reducing false positives and false negatives in clustering.
In this paper, we describe the weighting adjustment method and apply it to a yeast cell cycle data set. Based on the adjusted yeast cell cycle expression data, the hierarchical clustering method with a correlation coefficient measure performs better than that based on standardized expression data. The clustering method based on the adjusted data can group some functionally related genes together and yields higher quality clusters.
Qihua Wang
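Hierarchical clustering with a correlation coefficient measure, as mentioned in the abstract above, can be sketched as follows. This is only an illustration under assumptions: average linkage, the dissimilarity 1 minus the Pearson correlation, and four made-up expression profiles. The chapter's weighting adjustment is not described in the abstract and is not shown.

```python
import math

def correlation(a, b):
    """Pearson correlation coefficient of two equally long profiles."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

def average_linkage(profiles, k):
    """Merge the two closest clusters (distance 1 - correlation) until k remain."""
    clusters = [[i] for i in range(len(profiles))]

    def dist(c1, c2):
        # Average pairwise dissimilarity between members of the two clusters
        pairs = [(i, j) for i in c1 for j in c2]
        return sum(1.0 - correlation(profiles[i], profiles[j]) for i, j in pairs) / len(pairs)

    while len(clusters) > k:
        a, b = min(
            ((a, b) for a in range(len(clusters)) for b in range(a + 1, len(clusters))),
            key=lambda ab: dist(clusters[ab[0]], clusters[ab[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]

# Hypothetical profiles: two rising "genes" and two falling ones.
profiles = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.5],
    [4.0, 3.0, 2.0, 1.0],
    [9.0, 6.0, 4.0, 1.0],
]
groups = average_linkage(profiles, k=2)
print(groups)
```

Because correlation ignores the overall level and scale of a profile, the rising profiles end up in one group and the falling profiles in the other, even though their magnitudes differ.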
8. Bootstrap Methods for Testing Interactions in GAMs
Javier Roca-Pardiñas, Carmen Cadarso-Suárez, Wenceslao González-Manteiga
9. Survival Trees
Carmela Cappelli, Heping Zhang
10. A Semiparametric Approach to Estimate Reference Curves for Biophysical Properties of the Skin
Abstract
Reference curves that take one covariable, such as age, into account are often required in medicine, but simple, systematic and efficient statistical methods for constructing them are lacking. Classical methods are based on parametric fitting (polynomial curves). In this chapter, we describe a new methodology for the estimation of reference curves for data sets, based on nonparametric estimation of conditional quantiles. The derived method should be applicable to all clinical or, more generally, biological variables that are measured on a continuous quantitative scale. To avoid the curse of dimensionality when the covariate is multidimensional, a new semiparametric approach is proposed. This procedure combines a dimension-reduction step (based on sliced inverse regression) with a kernel estimation step for the conditional quantiles. The usefulness of this semiparametric estimation procedure is illustrated on a simulated data set and on a real data set collected in order to establish reference curves for biophysical properties of the skin of healthy French women.
Saracco Jérôme, Gannoun Ali, Guinot Christiane, Liquet Benoît
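The kernel estimation of conditional quantiles mentioned in the abstract above can be illustrated for a single covariate. The sketch below assumes one common construction: a kernel-weighted empirical distribution function that is inverted at the level alpha. The Gaussian kernel, the bandwidth and the toy data are assumptions, and the dimension-reduction step based on sliced inverse regression is not shown.

```python
import math

def conditional_quantile(xs, ys, x0, alpha, bandwidth):
    """Kernel estimate of the alpha-quantile of Y given X = x0 (one covariate)."""
    # Gaussian kernel weights: observations with x close to x0 count more
    weights = [math.exp(-0.5 * ((x - x0) / bandwidth) ** 2) for x in xs]
    total = sum(weights)
    # Walk through the responses in increasing order and accumulate the
    # normalized weights; this is a weighted empirical distribution function.
    cumulative = 0.0
    for y, w in sorted(zip(ys, weights)):
        cumulative += w / total
        if cumulative >= alpha:
            return y
    return max(ys)

# Hypothetical toy data: a skin measurement observed at ages 20 and 60.
ages = [20, 20, 20, 20, 20, 60, 60, 60, 60, 60]
values = [1, 2, 3, 4, 5, 11, 12, 13, 14, 15]

median_at_20 = conditional_quantile(ages, values, x0=20, alpha=0.5, bandwidth=5.0)
median_at_60 = conditional_quantile(ages, values, x0=60, alpha=0.5, bandwidth=5.0)
print(median_at_20, median_at_60)
```

Evaluating the function on a grid of ages for several levels, for example 0.05, 0.5 and 0.95, would trace out a set of reference curves.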
11. Survival Analysis
Makoto Tomita

Related Sciences

Frontmatter
12. Ozone Pollution Forecasting Using Conditional Mean and Conditional Quantiles with Functional Covariates
Hervé Cardot, Christophe Crambes, Pascal Sarda
13. Nonparametric Functional Methods: New Tools for Chemometric Analysis
13.5 Concluding Comments
In this contribution, we have shown how spectrometric data can be successfully analysed by treating them as curve data and applying recent nonparametric methodology for curve data. Note, however, that all the statistical background is presented in a general way (and not only for spectrometric data). Similarly, the XploRe quantlets that we provide can be used directly in any other applied setting involving curve data. For brevity, and because it was not the purpose here, we presented only the results given by the nonparametric functional methodology, without any comparison with alternative methods (but relevant references on these points are given throughout the contribution).
Also for brevity, we presented just two statistical problems (namely regression from curve data and curve discrimination) among the several problems that can be treated by nonparametric functional methods (on this point too, our contribution contains several references to other problems that could be attacked similarly). We chose these two problems for two reasons: first, they are highly relevant to many applied studies involving curve analysis; second, their theoretical and practical importance has led to the emergence of different automated computer procedures.
Frédéric Ferraty, Aldo Goia, Philippe Vieu
14. Variable Selection in Principal Component Analysis
Abstract
While there exist several criteria by which to select a reasonable subset of variables in the context of PCA, we introduce herein variable selection using, among others, the criteria of the modified PCA (M.PCA) of Tanaka and Mori (1997).
In order to perform such variable selection via XploRe, the quantlib vaspca, which reads all the necessary quantlets for selection, is first called, and then the quantlet mpca is run using a number of selection parameters.
In the first four sections we present brief explanations of variable selection in PCA, an outline of M.PCA and the flows of four selection procedures, based mainly on Tanaka and Mori (1997), Mori (1997), Mori, Tarumi and Tanaka (1998) and Iizuka et al. (2002a). In the last two sections, we illustrate the quantlet mpca and its performance with two numerical examples.
Yuichi Mori, Masaya Iizuka, Tomoyuki Tarumi, Yutaka Tanaka
15. Spatial Statistics
Pavel Čížek, Wolfgang Härdle, Jürgen Symanzik
16. Functional Data Analysis
Abstract
In many fields of applied statistics, the object of interest depends on some continuous parameter, e.g. continuous time. Typical examples in biostatistics are growth curves or temperature measurements. Although, for technical reasons, we can measure temperature only at discrete intervals, it is clear that temperature is a continuous process. Temperature during one year is a function with argument “time”. By collecting one-year temperature functions for several years or for different weather stations, we obtain a bunch (sample) of functions: a functional data set. The questions arising in the statistical analysis of functional data are basically identical to those in the standard statistical analysis of univariate or multivariate objects. From the theoretical point of view, the design of a stochastic model for functional data and the statistical analysis of a functional data set can often be taken one-to-one from conventional multivariate analysis. In fact, the first method for dealing with functional data is to discretize them and perform a standard multivariate analysis on the resulting random vectors. The aim of this chapter is to introduce functional data analysis (FDA) and to discuss the practical usage and implementation of FDA methods.
This chapter is organized as follows: Section 16.1 defines the basic mathematical and statistical framework for FDA, and Section 16.2 introduces the most popular implementation of functional data analysis, the functional basis expansion. In Section 16.4 we present the basic theory of functional principal components and smoothed functional principal components, together with a practical application to the temperature data set of the Canadian weather stations.
Michal Benko
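The abstract above notes that a first way to handle functional data is to discretize the curves and run a standard multivariate analysis on the resulting vectors. A minimal sketch of that route follows, assuming principal component analysis as the multivariate method, power iteration to find the leading component, and made-up monthly temperature curves. The basis expansion and smoothing treated in the chapter are not shown, and all names and data here are hypothetical.

```python
import math

def first_principal_component(curves, iterations=100):
    """Mean curve and leading covariance eigenvector of discretized curves."""
    n, p = len(curves), len(curves[0])
    mean = [sum(c[j] for c in curves) / n for j in range(p)]
    centered = [[c[j] - mean[j] for j in range(p)] for c in curves]
    # Arbitrary non-constant starting vector for the power iteration
    v = [float(j + 1) for j in range(p)]
    for _ in range(iterations):
        # Covariance-vector product computed as X^T (X v) / (n - 1)
        scores = [sum(row[j] * v[j] for j in range(p)) for row in centered]
        w = [sum(centered[i][j] * scores[i] for i in range(n)) / (n - 1)
             for j in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return mean, v

# Hypothetical data: five "stations" share one seasonal shape, sampled monthly,
# and differ only by a constant temperature shift.
months = range(12)
shifts = [-2.0, -1.0, 0.0, 1.0, 2.0]
curves = [[5.0 + s - 10.0 * math.cos(2.0 * math.pi * t / 12.0) for t in months]
          for s in shifts]

mean_curve, component = first_principal_component(curves)
print([round(x, 3) for x in component])
```

Since the toy stations differ only in level, the leading component is a flat curve; real temperature data would typically show additional seasonal modes of variation.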
17. Analysis of Failure Time with Microearthquakes Applications
Graciela Estévez-Pérez, Alejandro Quintela del Rio
18. Polychotomous Regression: Application to Landcover Prediction
Frédéric Ferraty, Martin Paegelow, Pascal Sarda
19. The Application of Fuzzy Clustering to Satellite Images Data
Hizir Sofyan, Muzailin Affan, Khaled Bawahidi
Backmatter
Metadata
Title
Statistical Methods for Biostatistics and Related Fields
Written by
Professor Wolfgang Härdle
Professor Yuichi Mori
Professor Philippe Vieu
Copyright year
2007
Publisher
Springer Berlin Heidelberg
Electronic ISBN
978-3-540-32691-5
Print ISBN
978-3-540-32690-8
DOI
https://doi.org/10.1007/978-3-540-32691-5