2014 | Book

Multistate Analysis of Life Histories with R

About this book

This book provides an introduction to multistate event history analysis. Multistate analysis extends survival analysis, which considers a single terminal event (endpoint) and studies the time to that event. Multistate models focus on life histories or trajectories, conceptualized as sequences of states and sequences of transitions between states. Life histories are modeled as realizations of continuous-time Markov processes. The model parameters, transition rates, are estimated from data on event counts and populations at risk, using the statistical theory of counting processes.

The Comprehensive R Archive Network (CRAN) includes several packages for multistate modeling. This book is about Biograph. The package is designed to (a) enhance exploratory analysis of life histories and (b) make multistate modeling accessible. The package incorporates utilities that connect to several packages for multistate modeling, including survival, eha, Epi, mvna, etm, mstate, msm, and TraMineR for sequence analysis. The book is a ‘hands-on’ presentation of Biograph and the packages listed. It is written from the perspective of the user. To help the user master the techniques and the software, a single data set is used to illustrate the methods and software. It is a subsample of the German Life History Survey, which was also used by Blossfeld and Rohwer in their popular textbook on event history modeling. Another data set, the Netherlands Family and Fertility Survey, is used to illustrate how Biograph can assist in answering questions on life paths of cohorts and individuals.

The book is suitable as a textbook for graduate courses on event history analysis and introductory courses on competing risks and multistate models. It may also be used as a self-study book. The R code used in the book is available online.

Frans Willekens is affiliated with the Max Planck Institute for Demographic Research (MPIDR) in Rostock, Germany. He is Emeritus Professor of Demography at the University of Groningen, an Honorary Fellow of the Netherlands Interdisciplinary Demographic Institute (NIDI) in The Hague, and a Research Associate of the International Institute for Applied Systems Analysis (IIASA), Laxenburg, Austria. He is a member of the Royal Netherlands Academy of Arts and Sciences (KNAW). He has contributed to the modeling and simulation of life histories, mainly in the context of population forecasting.

Table of Contents

Frontmatter
Chapter 1. Introduction
Abstract
In this book, a particular class of models is considered: multistate models. Multistate models are ideally suited to model life histories. At a given instant, an individual has a set of attributes, such as marital status, employment status, living arrangement, health status and place of residence. In multistate analysis, a person with a given set of attributes is said to occupy a given state, and persons with the same attributes occupy the same state. When an attribute changes, the person moves to a different state. Most personal attributes change in the life course, implying transitions between states. Marriage, marriage dissolution, birth of a child, job change, migration, onset of disability and death are events that imply a transition between states. The set of possible states is the state space. The state variable is the state an individual occupies at a given time or age. If individuals are combined in cohorts or populations, the state variable is the number of individuals in a state at a given time or age. The life course is operationalised as a sequence of states and transitions between states. Two types of states are distinguished: states that can be entered and left (transient states) and states that can be entered but not left (absorbing states). Age is not a personal attribute; it is a time scale. Different time scales may be used to measure time to transition, calendar time and age being the most common time measurements.
Frans Willekens
Chapter 2. Life Histories: Real and Synthetic
Abstract
Life history data are generally incomplete. Usually, they do not cover for each individual in the study the entire life span or the life segment of interest. If data are collected retrospectively, observation ends at interview date, and no information is available on events and experiences after the date. Data collected prospectively are incomplete because events and other experiences are recorded during a limited period of time only. To deal with data limitations, models are introduced. The model that is considered in this chapter describes life histories. The model is based on the premise that life histories are realisations of a continuous-time Markov process. A Markov process is a stochastic process that describes a system with multiple states and transitions between the states. The time at which a transition occurs is random but the distribution of the time to transition is known. In the continuous-time Markov process, the transition time has an exponential distribution. The rate of transition out of the current state (exit rate) is the parameter of the exponential distribution. It depends on the current state only and is independent of the history of the stochastic process. In a system with multiple states, an individual who leaves the current state may enter one of several states. In competing risks models, states in the state space are viewed as competing destinations and transition rates are destination-specific. The Markov process is a first-order process: the destination state depends on the current state only and is independent of states occupied previously.
Frans Willekens
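
The process described in the Chapter 2 abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not code from the book or from Biograph: the state names, the rate values and the function `simulate_life_history` are all hypothetical. The waiting time in a state is drawn from an exponential distribution whose parameter is the exit rate (the sum of the destination-specific rates), and the destination is chosen with probability proportional to its rate, which is the competing risks view.

```python
import random

# Hypothetical transition rates per year (illustrative values, not from the book).
# Keys are origin states; values map destination states to destination-specific rates.
RATES = {
    "single": {"married": 0.08, "dead": 0.01},
    "married": {"divorced": 0.02, "dead": 0.01},
    "divorced": {"married": 0.05, "dead": 0.01},
    "dead": {},  # absorbing state: it can be entered but not left
}


def simulate_life_history(start_state, max_time, rng):
    """Return a list of (entry_time, state) pairs for one simulated trajectory."""
    time, state = 0.0, start_state
    history = [(time, state)]
    while RATES[state]:  # stop once an absorbing state is reached
        destinations = RATES[state]
        # The exit rate depends on the current state only (Markov property).
        exit_rate = sum(destinations.values())
        time += rng.expovariate(exit_rate)  # exponential waiting time
        if time >= max_time:
            break  # observation ends before the next transition (censoring)
        # Competing destinations: pick one with probability rate / exit_rate.
        state = rng.choices(list(destinations), weights=list(destinations.values()))[0]
        history.append((time, state))
    return history


rng = random.Random(42)
history = simulate_life_history("single", 80.0, rng)
for entry_time, state in history:
    print(f"{entry_time:6.2f}  {state}")
```

Each run yields one synthetic life history; repeating the call many times would give a synthetic cohort whose state occupancies can be tabulated by age.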
Chapter 3. The Biograph Object
Abstract
A Biograph object is a data frame of individual life history data. All information on the life history of a person is stored in a single record. That data format is known as wide format. The wide format differs from the long format, in which information on an episode or a transition is contained in a separate record. The structure of the data frame is described in Sect. 3.2 of this chapter. In a Biograph object, life history data are organised chronologically starting with the first reported state of the life course and the first transition. That data structure is consistent with the life course as a sequence of events and a sequence of states. Converting raw data from surveys, registers or follow-up studies into a Biograph object can be cumbersome. Most surveys are not organised from a life history perspective but from a life domain perspective. In Sect. 3.3, I describe how to convert the GLHS data into a Biograph object. The GLHS data structure is only one of the many possible data structures. It is not possible to develop a single conversion method for all known data structures. More on how to create a Biograph object and several examples may be found in Annex A. Data analysis may require some operations on the data before the analysis can start. In multistate modelling, that may involve the removal from the data of transitions to the same state (intrastate transitions). In the GLHS data, job changes are intrastate transitions. Other operations may change the observation window. Biograph includes functions to change the observation window. One function selects transitions between two time points (calendar time) or between two ages. Another function delineates an observation window by two events or by an event and the survey date. These operations change the structure of the data. Data restructuring is the subject of Sect. 3.4. In Sect. 3.5 of this chapter, I review formats of life history data and list functions that may be used to convert one data format into another. A description of life histories includes information on states occupied during the period of observation, on transitions between states and on the dates of transition. Different ways exist to report dates. Most people use calendar dates, but dates may also be measured as time elapsed since a reference date or a reference event. The Century Month Code (CMC), used in the GLHS, measures dates as the number of months elapsed since 1 January 1900. Other surveys may express dates differently. Section 3.5 covers different date formats and R functions, including Biograph functions, for converting one date format into another.
Frans Willekens
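
Two ideas from the Chapter 3 abstract lend themselves to a small sketch: the Century Month Code and the difference between wide and long format. The Python below is an illustration only. The record layout (`start_cmc`, `states`, `transition_cmc`, `end_cmc`) and the function names are hypothetical and do not reproduce the Biograph data frame or its functions, and the date values are made up. The CMC conversion assumes the common convention that January 1900 has code 1.

```python
def year_month_to_cmc(year, month):
    """Century Month Code, assuming January 1900 is coded 1."""
    return (year - 1900) * 12 + month


def cmc_to_year_month(cmc):
    """Inverse of year_month_to_cmc under the same convention."""
    return 1900 + (cmc - 1) // 12, (cmc - 1) % 12 + 1


# Hypothetical wide record: one record holds the whole life history of a person.
wide_record = {
    "id": 1,
    "start_cmc": 555,              # start of observation
    "end_cmc": 982,                # end of observation (e.g. interview date)
    "states": ["N", "J", "N"],     # states occupied, in chronological order
    "transition_cmc": [593, 639],  # dates of the transitions between them
}


def wide_to_long(record):
    """Split one wide record into one record per episode (long format)."""
    boundaries = [record["start_cmc"], *record["transition_cmc"], record["end_cmc"]]
    return [
        {
            "id": record["id"],
            "state": state,
            "entry_cmc": boundaries[i],
            "exit_cmc": boundaries[i + 1],
        }
        for i, state in enumerate(record["states"])
    ]


episodes = wide_to_long(wide_record)
for episode in episodes:
    print(episode, cmc_to_year_month(episode["entry_cmc"]))
```

The long format repeats the person identifier on every episode, which is the layout most episode-based estimation routines expect; the wide format keeps the chronological sequence of a person together in one place.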
Chapter 4. Exploratory Data Analysis
Abstract
Biograph contains several functions for exploratory transition data analysis. In this chapter I describe the functions and the objects they generate.
Frans Willekens
Chapter 5. Visualisation of Life Histories
Abstract
Data visualisation is the graphic presentation of data to reveal complex information at a glance (Steele and Iliinsky 2010). The challenge is to map data to a visual display that reveals the range of values of variables and relations between variables. Visualisation of data can be an effective introduction to formal statistical modelling. Ages at marriage may be displayed as points in a scatter plot to assess the distribution of ages and to identify outliers. The marriage duration of a person may be displayed as a line connecting age at marriage and current age or age at marriage dissolution. The end point may be marked if the marriage has been dissolved and not marked if the marriage is intact at the end of the observation period. Visualisation of life histories poses particular challenges. The first is conceptual. The life history is a multistage process of development in which stages create a basis for subsequent stages. In this book the life course is conceptualised as sequences of states and sequences of events. In each domain of life, a state and event sequence can be identified. A second challenge is embedding. The life course is embedded in a historical context, and the visualisation should reveal how developmental processes vary in time. That requires at least two time scales: age and calendar time. The Lexis diagram, named after the demographer Wilhelm Lexis (1837–1914), meets that challenge. Each line in a Lexis diagram represents the follow-up of a single individual from entry to exit on two time scales: age and calendar time. The Lexis diagram is widely used and has inspired improved visualisations of life histories. Some of that research is reviewed in the brief historical note in Sect. 5.1. A third challenge is to reveal significant information at a glance. The graph should convey essential information and highlight the unexpected.
Frans Willekens
Chapter 6. Statistical Packages for Multistate Life History Analysis
Abstract
The Comprehensive R Archive Network (CRAN) (http://cran.r-project.org/) has a number of statistical packages for multistate analysis of event histories (multistate survival analysis). These packages focus on statistical inference, i.e. the estimation of transition rates and transition probabilities from empirical data. In this chapter, the following packages are covered: survival by Therneau and Lumley, eha by Broström, mvna and etm by Allignol et al., mstate by Putter et al. and msm by Jackson. For an up-to-date overview of packages for survival analysis, the reader is referred to the CRAN Task View on Survival Analysis, maintained by Allignol and Latouche. The Task View has a section on multistate models. For a review of methods for estimating multistate models, the reader is referred to Chap. 2 and, for a more extensive treatment, to Aalen et al. (2008, in particular Chap. 3), Beyersmann et al. (2012), and a special issue of the Journal of Statistical Software (January 2011), edited by Putter. For recent advances in demography, see Willekens and Putter (2014). In essence, the method consists of counting transitions (events) and numbers of persons at risk of a transition just before the transition occurs or in the observation interval. The chapter consists of five sections, in addition to the introduction. Section 6.1 describes the survival package, Sect. 6.2 the eha package, Sect. 6.3 the mvna and etm packages, Sect. 6.4 the mstate package and Sect. 6.5 the msm package.
Frans Willekens
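
The counting idea in the Chapter 6 abstract can be illustrated with the simplest estimator of its kind, the occurrence-exposure rate: the number of transitions from one state to another divided by the total time spent at risk in the origin state. The sketch below is plain Python under stated assumptions and does not use or imitate the API of any of the R packages named above. The episode data are made up, the function name is hypothetical, and the rate is assumed constant over the observation interval.

```python
from collections import defaultdict

# Hypothetical episodes: (origin state, entry time, exit time, destination).
# A destination of None means the episode is censored (no transition observed).
episodes = [
    ("N", 0.0, 2.0, "J"),
    ("J", 2.0, 5.0, "N"),
    ("N", 5.0, 6.0, None),
    ("N", 0.0, 4.0, "J"),
    ("J", 4.0, 9.0, None),
]


def occurrence_exposure_rates(episodes):
    """Estimate transition rates as event counts divided by exposure time."""
    events = defaultdict(int)      # transitions counted per (origin, destination)
    exposure = defaultdict(float)  # time at risk accumulated per origin state
    for origin, entry, exit_time, destination in episodes:
        exposure[origin] += exit_time - entry  # censored episodes add exposure too
        if destination is not None:
            events[(origin, destination)] += 1
    return {pair: count / exposure[pair[0]] for pair, count in events.items()}


rates = occurrence_exposure_rates(episodes)
for (origin, destination), rate in sorted(rates.items()):
    print(f"{origin} -> {destination}: {rate:.3f} per unit of time")
```

Censored episodes contribute exposure but no event, which is why dropping them would bias the rates upward.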
Chapter 7. The Multistate Life Table
Abstract
The multistate life table is a method developed in demography to describe the mortality and mobility experience of a cohort, a group of people born in the same period. The multistate life table is an extension of the life table, which describes the mortality experience. The life table was first developed in the seventeenth century by John Graunt. Graunt was interested in estimating probabilities of survival from observations on deaths. The life table is an established method in demography (see, e.g. Preston et al. 2001). In the 1970s Andrei Rogers extended the life table to include migrations between regions in addition to mortality (Rogers 1975). It soon became clear that regions may be replaced by states and interregional migrations by transitions between states. That resulted in the multistate life table and the wider field of multistate demography (Land and Rogers 1982). Today the multistate life table is used to describe life histories from birth to death. In this chapter I present functions for estimating multistate life table indicators. Age is the duration variable used throughout the chapter. The age intervals considered are of 1-year length.
Frans Willekens
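
As a point of reference for the Chapter 7 abstract, the ordinary (single-state) life table that the multistate life table extends can be sketched as follows. This is a simplified illustration, not the book's functions: the death rates are invented, the function name is hypothetical, and the rate is assumed constant within each 1-year age interval, so that the probability of surviving an interval with rate m is exp(-m).

```python
import math

# Hypothetical death rates for three consecutive 1-year age intervals.
death_rates = [0.01, 0.02, 0.05]


def life_table(rates, radix=1.0):
    """Return survivors at exact ages and person-years lived per interval."""
    survivors = [radix]
    person_years = []
    for rate in rates:
        alive_at_start = survivors[-1]
        alive_at_end = alive_at_start * math.exp(-rate)  # constant rate in interval
        survivors.append(alive_at_end)
        # With a constant rate, deaths = rate * person-years lived in the interval.
        person_years.append((alive_at_start - alive_at_end) / rate)
    return survivors, person_years


survivors, person_years = life_table(death_rates)
for age, alive in enumerate(survivors):
    print(f"age {age}: proportion surviving {alive:.4f}")
print(f"expected years lived in the first {len(death_rates)} years: {sum(person_years):.4f}")
```

In the multistate case the scalar rate becomes a matrix of transition rates and the survival probability a matrix of transition probabilities, but the bookkeeping by age interval follows the same pattern.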
Chapter 8. Application to the Netherlands Family and Fertility Survey
Abstract
The aim of this chapter is to illustrate Biograph with data from the Netherlands Family and Fertility Survey of 1998 (Onderzoek Gezinsvorming 1998 or NLOG98). Statistics Netherlands organised the survey to collect information on partnerships, marriage and family. In this chapter Biograph is used to study pathways to first birth. What life paths do women in the Netherlands follow between leaving the parental home and motherhood? Some leave the parental home for marriage and have a child soon after marriage. Most women have a different pathway, however. The trajectory women follow determines to a large extent the age at which they become a mother. Differences in pathways can be associated with background characteristics. Three covariates are considered: religious denomination (kerk), level of education (educ) and birth cohort (cohort).
Frans Willekens
Chapter 9. Summary
Abstract
As life unfolds people move between states and enter new stages of life. The path taken depends on personal characteristics, early life experiences, context and chance. The life course can be represented as a sequence of states and modelled as a multistate process, governed by transition rates. In this book, the continuous-time Markov process is used to model life histories. Transition rates may depend on covariates.
Frans Willekens
Backmatter
Metadata
Title
Multistate Analysis of Life Histories with R
Author
Frans Willekens
Copyright Year
2014
Electronic ISBN
978-3-319-08383-4
Print ISBN
978-3-319-08382-7
DOI
https://doi.org/10.1007/978-3-319-08383-4
