
2010 | Book

Regression with Linear Predictors

Authors: Per Kragh Andersen, Lene Theil Skovgaard

Publisher: Springer New York

Book series: Statistics for Biology and Health


About this book

This is a book about regression analysis, that is, the situation in statistics where the distribution of a response (or outcome) variable is related to explanatory variables (or covariates). This is an extremely common situation in the application of statistical methods in many fields, and linear regression, logistic regression, and Cox proportional hazards regression are frequently used for quantitative, binary, and survival time outcome variables, respectively. Several books on these topics have appeared and for that reason one may well ask why we embark on writing still another book on regression. We have two main reasons for doing this:

1. First, we want to highlight similarities among linear, logistic, proportional hazards, and other regression models that include a linear predictor. These models are often treated entirely separately in texts in spite of the fact that all operations on the models dealing with the linear predictor are precisely the same, including handling of categorical and quantitative covariates, testing for linearity and studying interactions.
2. Second, we want to emphasize that, for any type of outcome variable, multiple regression models are composed of simple building blocks that are added together in the linear predictor: that is, t-tests, one-way analyses of variance and simple linear regressions for quantitative outcomes, 2×2, 2×(k+1) tables and simple logistic regressions for binary outcomes, and 2- and (k+1)-sample logrank tests and simple Cox regressions for survival data. This has two consequences. All these simple and well known methods can be considered as special cases of the regression models. On the other hand, the effect of a single explanatory variable in a multiple regression model can be interpreted in a way similar to that obtained in the simple analysis, however, now valid only for the other explanatory variables in the model “held fixed”.
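The shared role of the linear predictor across the three model types can be written out as a short sketch. The intercept a and coefficients b follow the notation of the Chapter 5 abstract; everything else (the symbols LP, p, h, and the covariate index p) is an assumption introduced here for illustration, and the book's own notation may differ.

$$
\begin{aligned}
\mathrm{LP}_i &= b_1 x_{i1} + \cdots + b_p x_{ip} \\
\text{linear regression:}\quad & E(y_i) = a + \mathrm{LP}_i \\
\text{logistic regression:}\quad & \log\frac{p_i}{1 - p_i} = a + \mathrm{LP}_i, \qquad p_i = P(y_i = 1) \\
\text{Cox regression:}\quad & h_i(t) = h_0(t)\,\exp(\mathrm{LP}_i)
\end{aligned}
$$

In the Cox model the intercept is absorbed into the baseline hazard h_0(t), which is why no separate a appears there.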

Table of contents

Frontmatter
1. Introduction
Abstract
Suppose we are studying blood pressure in humans based on a random sample from a specific population, say, inhabitants of some larger city. The very first step in such a study may be to get a summary of the level and variation of blood pressure, subject to criteria such as ethnicity, gender or age. The purpose of studying blood pressure may be to establish normal references to serve as future guidelines for when to start treatment for either too high or too low a blood pressure.
Per Kragh Andersen, Lene Theil Skovgaard
2. Statistical models
Abstract
Research begins with theories and hypotheses. For instance, it may have been noted that many people experiencing a stroke tend to have high blood pressure. You get the idea that medication aimed at lowering blood pressure may therefore also reduce the risk of a stroke. However plausible this sounds, it requires an investigation (a study) to confirm or reject this suspicion. For instance, the apparent connection between high blood pressure and stroke may be simply due to the fact that people experience stroke at an older age where blood pressure is also increased and hence that lowering of blood pressure has no effect on the risk of a stroke because it does not change the age of the individual.
Per Kragh Andersen, Lene Theil Skovgaard
3. One categorical covariate
Abstract
In this chapter, we discuss one of the two building blocks of regression models, namely models including only a single categorical covariate. This means that we compare groups, such as treatments, countries, stature groups based on body mass index, diet types, age groups, and so on.
Per Kragh Andersen, Lene Theil Skovgaard
4. One quantitative covariate
Abstract
In this chapter we study models with a single quantitative covariate, that is, a covariate measured on a numerical scale. Some quantitative variables are continuous, meaning that they can take on any value (in principle infinitely many but in practice at least “many” values) in some interval, finite or infinite. Typical examples could be age and body mass index. The number of fever episodes for a pregnant woman is not a continuous variable, but still obviously quantitative. Ordered categorical variables, such as (underweight, normal weight or overweight) can also be thought of as quantitative variables, if each category can be assigned a meaningful score.
Per Kragh Andersen, Lene Theil Skovgaard
5. Multiple regression, the linear predictor
Abstract
In the previous two chapters we studied regression models where the linear predictor depended on a single explanatory variable, x. In Chapter 3, x was categorical and for a binary variable (Section 3.1) with values g0,g1 we added
$$ b\,I(x_i = g_1) $$
to the intercept a, whereas in general, for a variable with k + 1 levels (Section 3.2) we added instead the expression
$$ b_1 I(x_i = g_1) + b_2 I(x_i = g_2) + \cdots + b_k I(x_i = g_k), $$
with dummy variables for all categories except the reference category (xi = 0).
Per Kragh Andersen, Lene Theil Skovgaard
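The dummy-variable construction in the Chapter 5 abstract can be sketched in a few lines of Python. This is a minimal illustration and not code from the book: the function names, the diet-group labels and the coefficient values are all hypothetical, and the first entry of `levels` is assumed to play the role of the reference category.

```python
def dummy_row(x_i, levels):
    """Indicators I(x_i = g_1), ..., I(x_i = g_k) for one subject.

    levels[0] is treated as the reference category and gets no indicator.
    """
    return [1.0 if x_i == g else 0.0 for g in levels[1:]]


def linear_predictor(x_i, levels, a, b):
    """Intercept a plus b_1*I(x_i = g_1) + ... + b_k*I(x_i = g_k)."""
    return a + sum(b_j * d_j for b_j, d_j in zip(b, dummy_row(x_i, levels)))


# Hypothetical example: three diet groups, "A" is the reference (k = 2).
levels = ["A", "B", "C"]
a, b = 1.0, [0.5, -2.0]
lp = {g: linear_predictor(g, levels, a, b) for g in levels}
print(lp)
```

A subject in the reference group gets only the intercept, while each other group adds exactly one coefficient, so every b_j is a contrast against the reference category.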
6. Model building: From purpose to conclusion
Abstract
To investigate a scientific question, data are needed. Sometimes, data may already be available, but in many situations new data have to be collected because the question is concerned with a new procedure or treatment or requires new covariates to be considered for a previously studied phenomenon.
Per Kragh Andersen, Lene Theil Skovgaard
7. Alternative outcome types and link functions
Abstract
In previous chapters we have focused on quantitative data with a linear mean, binary data with a logistic link, and proportional hazards models for survival data. In this chapter we first study two “new” datatypes. The first is multinomial data (Section 7.1) where the outcome is a categorical variable with more than two levels. This includes both the case where the categories are ordered (ordinal data, Section 7.1.1) and where they are unordered (nominal data, Section 7.1.2). The second type is counts (Section 7.2) and we study models based on both the Poisson and the Binomial distribution.
Per Kragh Andersen, Lene Theil Skovgaard
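How a link function connects the linear predictor to the mean of the outcome can be illustrated with a small Python sketch. The abstract names the linear mean and the logistic link; the log link for Poisson counts is the conventional choice and is assumed here, since the abstract does not say which link the chapter uses. All function and dictionary names are hypothetical.

```python
import math


def inverse_identity(lp):
    """Quantitative outcome with a linear mean: the mean equals the linear predictor."""
    return lp


def inverse_logit(lp):
    """Binary outcome with a logistic link: maps the linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-lp))


def inverse_log(lp):
    """Count outcome with a log link (assumed here): maps the linear predictor to a positive mean."""
    return math.exp(lp)


# Hypothetical lookup table from link name to its inverse.
INVERSE_LINKS = {
    "identity": inverse_identity,
    "logit": inverse_logit,
    "log": inverse_log,
}


def mean_from_linear_predictor(lp, link):
    """Apply the inverse of the named link to a linear predictor value."""
    return INVERSE_LINKS[link](lp)


for name in INVERSE_LINKS:
    print(name, mean_from_linear_predictor(0.0, name))
```

The linear predictor itself is built the same way in every case; only the final transformation changes with the outcome type.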
8. Further topics
Abstract
In this final chapter we briefly mention a number of topics related to the general class of regression models with a linear predictor. This chapter is mainly meant as a precaution because some of the assumptions made throughout earlier chapters are now relaxed. Section 8.1 discusses the situation where responses are multivariate, often as a consequence of having several response variables observed in the same subjects. In such cases, the assumption of independence between all the responses needs to be relaxed, because observations from the same subject tend to be more alike than observations from different subjects and this intrasubject correlation must be accounted for to obtain valid inference.
Per Kragh Andersen, Lene Theil Skovgaard
Backmatter
Metadata
Title
Regression with Linear Predictors
Authors
Per Kragh Andersen
Lene Theil Skovgaard
Copyright year
2010
Publisher
Springer New York
Electronic ISBN
978-1-4419-7170-8
Print ISBN
978-1-4419-7169-2
DOI
https://doi.org/10.1007/978-1-4419-7170-8