2001 | Book | 2nd edition

Multivariate Statistical Modelling Based on Generalized Linear Models

Authors: Ludwig Fahrmeir, Gerhard Tutz

Publisher: Springer New York

Book series: Springer Series in Statistics

Included in: Professional Book Archive

About this book

Since our first edition of this book, many developments in statistical modelling based on generalized linear models have been published, and our primary aim is to bring the book up to date. Naturally, the choice of these recent developments reflects our own teaching and research interests. The new organization parallels that of the first edition. We try to motivate and illustrate concepts with examples using real data, and most data sets are available on http://www.stat.uni-muenchen.de/welcome_e.html, with a link to data archive. We could not treat all recent developments in the main text, and in such cases we point to references at the end of each chapter. Many changes will be found in several sections, especially with those connected to Bayesian concepts. For example, the treatment of marginal models in Chapter 3 is now current and state-of-the-art. The coverage of nonparametric and semiparametric generalized regression in Chapter 5 is completely rewritten with a shift of emphasis to linear bases, as well as new sections on local smoothing approaches and Bayesian inference. Chapter 6 now incorporates developments in parametric modelling of both time series and longitudinal data. Additionally, random effect models in Chapter 7 now cover nonparametric maximum likelihood and a new section on fully Bayesian approaches. The modifications and extensions in Chapter 8 reflect the rapid development in state space and hidden Markov models.

Table of Contents

Frontmatter

1. Introduction

Abstract

Classical statistical models for regression, time series, and longitudinal data analysis are generally useful in situations where data are approximately Gaussian and can be explained by some linear structure. These models are easy to interpret and the methods are theoretically well understood and investigated. However, the underlying assumptions may be too stringent and applications of the methods may be misleading in situations where data are clearly non-normal, such as categorical or counted data. Statistical modelling aims at providing more flexible model-based tools for data analysis.

Ludwig Fahrmeir, Gerhard Tutz

2. Modelling and Analysis of Cross-Sectional Data: A Review of Univariate Generalized Linear Models

Abstract

The material in this chapter provides an introduction to univariate generalized linear models and serves as a basis for the following chapters, which contain extensions, e.g., to multivariate, nonparametric, random effects, or dynamic models. It is not intended to replace a deeper study of detailed expositions like that in McCullagh & Nelder (1989) or, with focus on the GLIM package, that in Aitkin, Anderson, Francis & Hinde (1989). Shorter introductions are given, e.g., by Dobson (1989), Firth (1991), and Fahrmeir, Hamerle & Tutz (1996). Collett (1991) focuses on modelling of binary data and discusses practical aspects in more detail.

Ludwig Fahrmeir, Gerhard Tutz

3. Models for Multicategorical Responses: Multivariate Extensions of Generalized Linear Models

Abstract

In this chapter the concept of generalized linear models is extended to the case of a vector-valued response variable. Consider Example 2.1, where we were interested in the effect of risk factors and antibiotics on infection following birth by caesarian section. In this example the response was binary, distinguishing only between occurrence and nonoccurrence of infection, and thereby ignoring that the data originally provided information on the type of infection (type I or II) as well. It is possible, however, to use this information by introducing a response variable with three categories (no infection, infection type I, infection type II). Naturally, these categories cannot be treated as a unidimensional response. We have to introduce a (dummy) variable for each category, thus obtaining a multivariate response variable. Therefore, link and response functions for the influence term will be vector-valued functions in this chapter. The focus is on multicategorical response variables and multinomial models. Variables of this type are often called polychotomous, the possible values are called categories. Extension to other multivariate exponential family densities is possible but not considered in this text.

Ludwig Fahrmeir, Gerhard Tutz
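
The dummy coding described in the Chapter 3 abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `dummy_code` and `multinomial_logit_response` are hypothetical, the labels are made up (they are not the caesarian data set), and the multinomial logit with the last category as reference is one common choice of vector-valued response function, not necessarily the one the chapter emphasizes.

```python
import numpy as np

# Three response categories as named in the abstract.
CATEGORIES = ["no infection", "infection type I", "infection type II"]

def dummy_code(labels, categories=CATEGORIES):
    """Return an (n, k) 0/1 matrix with one indicator column per category."""
    index = {c: j for j, c in enumerate(categories)}
    y = np.zeros((len(labels), len(categories)), dtype=int)
    for row, label in enumerate(labels):
        y[row, index[label]] = 1
    return y

def multinomial_logit_response(eta):
    """Map k-1 linear predictors to k category probabilities.

    Assumption: the last category is the reference, with predictor fixed at 0.
    """
    eta = np.asarray(eta, dtype=float)
    full = np.concatenate([eta, np.zeros(eta.shape[:-1] + (1,))], axis=-1)
    full = full - full.max(axis=-1, keepdims=True)  # numerical stability
    expo = np.exp(full)
    return expo / expo.sum(axis=-1, keepdims=True)

# Made-up labels, for illustration only.
labels = ["no infection", "infection type II", "infection type I", "no infection"]
Y = dummy_code(labels)                           # each row has exactly one 1
probs = multinomial_logit_response([0.0, 0.0])   # equal predictors, equal probabilities
```

Because each row of `Y` sums to one, one indicator column is redundant, which is why only k-1 linear predictors are needed in the response function.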

4. Selecting and Checking Models

Abstract

Fitting data by a certain generalized linear model means choosing appropriate forms for the predictor, the link function, and the exponential family or variance function. In the previous chapters Pearson's χ², the deviance and, in the multinomial case, the power-divergence family were introduced as general goodness-of-fit statistics. This chapter considers more specific tools to select and check models. Section 4.1 deals with variable selection, i.e., which variables should be included in the linear predictor. Diagnostic methods based on the hat matrix and on residuals are described in Section 4.2, and Section 4.3 covers general misspecification tests, such as Hausman-type tests and tests for nonnested models. We do not treat tests for specific directions, such as testing the correct form of the link function by embedding it in a broader parametric class of link functions. A survey of tests of this type is contained in Chapter 11.4 of McCullagh & Nelder (1989). In addition to the methods of this chapter, nonparametric approaches, as in Chapter 5, may also be used to check the adequacy of certain parametric forms.

Ludwig Fahrmeir, Gerhard Tutz
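
As a rough illustration of the two goodness-of-fit statistics named in the Chapter 4 abstract, the sketch below computes Pearson's χ² and the deviance for the Poisson case, given observed counts `y` and fitted means `mu`. The Poisson family, the function names, and the toy numbers are assumptions chosen for brevity. The fitted means are taken as given rather than estimated.

```python
import numpy as np

def pearson_chi2_poisson(y, mu):
    """Pearson statistic: sum of (y - mu)^2 / V(mu), with V(mu) = mu for Poisson."""
    y = np.asarray(y, dtype=float)
    mu = np.asarray(mu, dtype=float)
    return float(np.sum((y - mu) ** 2 / mu))

def deviance_poisson(y, mu):
    """Poisson deviance: 2 * sum( y*log(y/mu) - (y - mu) ), with 0*log(0) taken as 0."""
    y = np.asarray(y, dtype=float)
    mu = np.asarray(mu, dtype=float)
    ratio = np.where(y > 0, y / mu, 1.0)  # log(1) = 0 handles the y == 0 terms
    return float(2.0 * np.sum(y * np.log(ratio) - (y - mu)))

# Toy counts and fitted means (hypothetical values, not from the book).
y_obs = [0, 2]
mu_hat = [1.0, 1.0]
x2 = pearson_chi2_poisson(y_obs, mu_hat)
dev = deviance_poisson(y_obs, mu_hat)
```

Both statistics are zero when the fitted means reproduce the observations exactly and grow as the fit worsens.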

5. Semi- and Nonparametric Approaches to Regression Analysis

Abstract

In this chapter we give developments that lead beyond the framework of parametric models. Instead of assuming a functional form that specifies how explanatory variables determine dependent variables, the functional form is mostly assumed to be in some way smooth, and the data are allowed to determine the appropriate functional form under weak restrictions.

Ludwig Fahrmeir, Gerhard Tutz

6. Fixed Parameter Models for Time Series and Longitudinal Data

Abstract

The methods of the preceding chapters are mainly appropriate for modelling and analyzing a broad class of non-normal cross-sectional data. Extensions to time-dependent data are possible in a variety of ways. Time series are repeated observations (y_t, x_t) on a response variable y of primary interest and on a vector of covariates taken at times t = 1, ..., T. Discrete time longitudinal or panel data are repeated observations (y_it, x_it) taken for units i = 1, ..., n at times t = 1, ..., T_i. The restriction to integral times is made to simplify notation but is not necessary for most of the approaches. Longitudinal data may be viewed as a cross section of individual time series, reducing to a single time series for n = 1, or as a sequence of cross-sectional observations where units are identifiable over time. If a comparably small number of longer time series is observed, models and methods will be similar to those for single time series. If, however, many short time series have been observed, models, and often the scientific objective, can be different.

Ludwig Fahrmeir, Gerhard Tutz
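
The longitudinal layout in the Chapter 6 abstract, observations (y_it, x_it) for units i = 1, ..., n with unit-specific lengths T_i, can be made concrete with a small sketch. The record format `(i, t, y, x)`, the helper name `group_by_unit`, and the values are hypothetical and serve only to show the two views of the same data.

```python
from collections import defaultdict

def group_by_unit(records):
    """Turn long-format records (i, t, y, x) into one time series per unit i."""
    series = defaultdict(list)
    for i, t, y, x in records:
        series[i].append((t, y, x))
    # Sort each unit's observations by time t only.
    return {i: sorted(obs, key=lambda o: o[0]) for i, obs in series.items()}

# Made-up panel: unit 1 has T_1 = 3 observations, unit 2 has T_2 = 2.
records = [
    (1, 2, 0, [1.0, 0.5]),
    (1, 1, 1, [1.0, 0.2]),
    (2, 1, 0, [1.0, 0.9]),
    (1, 3, 1, [1.0, 0.7]),
    (2, 2, 1, [1.0, 1.1]),
]
panel = group_by_unit(records)
lengths = {i: len(obs) for i, obs in panel.items()}  # the T_i
```

With a single unit the dictionary holds one ordinary time series, which matches the n = 1 case mentioned in the abstract.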

7. Random Effects Models

Abstract

This chapter is concerned with random effects models for analyzing nonnormal data that are assumed to be clustered or correlated. The clustering may be due to repeated measurements over time, as in longitudinal studies, or to subsampling the primary sampling units, as in cross-sectional studies. In each of these cases the data consist of repeated observations (y_it, x_it), t = 1, ..., T_i, for each individual or unit i = 1, ..., n, where y denotes a response variable of primary interest and x a vector of covariates. Typical examples include panel data, where the cluster-specific data

$$\left( y_i, x_i \right) = \left( y_{i1}, \ldots, y_{iT_i}, x_{i1}, \ldots, x_{iT_i} \right)$$

correspond to a time series of length T_i, or large-scale health studies, where (y_i, x_i) represents the data of a primary sampling unit, say a hospital or a geographical region.

Ludwig Fahrmeir, Gerhard Tutz

8. State Space and Hidden Markov Models

Abstract

This chapter surveys state space and hidden Markov modelling approaches for analyzing time series or longitudinal data, spatial data, and spatiotemporal data. Responses are generally non-Gaussian, in particular, categorical, counted or nonnegative. State space and hidden Markov models have the common feature that they relate responses to unobserved “states” or “parameters” by an observation model. The states, which may represent, e.g., an unobserved temporal or spatial trend or time-varying covariate effects, are assumed to follow a latent or “hidden” Markov model.

Ludwig Fahrmeir, Gerhard Tutz
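
The structure described in the Chapter 8 abstract, an observation model linking non-Gaussian responses to latent states that follow a Markov model, can be illustrated with a short simulation. The specific choices here are assumptions made for the sketch: a Gaussian random walk for the hidden state, Poisson counts with a log link for the observations, and the hypothetical function name `simulate_poisson_state_space`. Only simulation is shown, not filtering or smoothing.

```python
import numpy as np

def simulate_poisson_state_space(T, sigma=0.1, alpha0=0.0, seed=0):
    """Simulate T steps of a simple state space model for count data.

    Transition (hidden Markov) model: alpha_t = alpha_{t-1} + N(0, sigma^2)
    Observation model:                y_t | alpha_t ~ Poisson(exp(alpha_t))
    """
    rng = np.random.default_rng(seed)
    alpha = np.empty(T)
    prev = alpha0
    for t in range(T):
        prev = prev + rng.normal(0.0, sigma)  # state depends only on its predecessor
        alpha[t] = prev
    y = rng.poisson(np.exp(alpha))            # counts driven by the latent trend
    return alpha, y

alpha, y = simulate_poisson_state_space(T=50, sigma=0.2, seed=1)
```

In an analysis only `y` would be observed, and the task would be to recover the unobserved trend `alpha` from it.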

9. Survival Models

Abstract

In recent years the analysis of survival time, lifetime, or failure time data has received considerable attention. The methodology applies in medical trials, where survival is of primary interest, and in reliability experiments, where failure time is the duration of interest. We will mostly refer to survival time although, in principle, situations where the time until the occurrence of some event is of interest are considered.

Ludwig Fahrmeir, Gerhard Tutz

Backmatter

Title: Multivariate Statistical Modelling Based on Generalized Linear Models
Authors: Ludwig Fahrmeir, Gerhard Tutz
Publisher: Springer New York
Electronic ISBN: 978-1-4757-3454-6
Print ISBN: 978-1-4419-2900-6
DOI: https://doi.org/10.1007/978-1-4757-3454-6

Springer Professional
