
About this Book

The aim of this book is to present a survey of the many ways in which the statistical package GLIM may be used to model and analyze stochastic processes. Its emphasis is on using GLIM interactively to apply statistical techniques, and examples are drawn from a wide range of applications including medicine, biology, and the social sciences. It is based on the author's many years of teaching courses along these lines to both undergraduate and graduate students. The author assumes that readers have a reasonably strong background in statistics such as might be gained from undergraduate courses and that they are also familiar with the basic workings of GLIM. Topics covered include: the analysis of survival data, regression and fitting distributions, time series analysis (including both the time and frequency domains), repeated measurements, and generalized linear models.

Table of Contents

Frontmatter

Chapter 1. Normal Theory Models and Some Extensions

Abstract
One of the most widely used tools in all of statistics is linear regression. This is often misnamed least squares regression, but least squares estimation refers to a deterministic process whereby the best straight line is fitted through a series of points. In statistical analysis, the interpretation is quite different, although the technical calculations remain the same. Normal theory linear regression carries the assumption that the response variable has a normal or Gaussian distribution:
$$ f(y;\mu,\sigma^2) = \exp[-(y-\mu)^2/(2\sigma^2)]/\sqrt{2\pi\sigma^2} $$
(1.1)
The mean of this distribution changes in some deterministic way with the values of the explanatory variable(s), e.g.
$$ \mu_i = \beta_0 + \sum_j \beta_j X_{ij} $$
(1.2)
while the variance remains constant. Then, the regression equation specifies how the mean of the distribution changes for each value of the explanatory variable(s); individual observations will be dispersed about the mean with the given variance. This is illustrated in Figure 1.1.
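The book works in GLIM itself; purely as an illustrative sketch of equations (1.1) and (1.2), the following Python fragment (invented data and parameter values) fits a normal-theory regression by least squares, which under the normality assumption coincides with maximum likelihood:

```python
import numpy as np

# Simulated data: the mean follows mu_i = beta0 + beta1 * x_i (equation 1.2),
# and responses are normal about that mean with constant variance (equation 1.1).
rng = np.random.default_rng(1)
x = np.linspace(0.0, 10.0, 50)
y = 2.0 + 0.5 * x + rng.normal(scale=1.0, size=x.size)

# Least-squares estimates of beta0, beta1.
X = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# Individual observations are dispersed about the fitted mean; estimate
# the (constant) variance from the residuals.
residuals = y - X @ beta
sigma2 = residuals @ residuals / (x.size - 2)
print(beta, sigma2)
```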
James K. Lindsey

Chapter 2. Markov Chains

Abstract
Perhaps the simplest type of stochastic process occurs when one of only two possible outcomes or states is recorded at each successive equal interval of time. Often, this may be the presence or absence of an event. One way to look at such data is to study how the outcome observed at a given time depends on what previously occurred or the probability of transition from one state to the other. For example, if the outcome depends only on that immediately preceding it, we have a first order Markov chain. Such models can easily be analysed as multi-dimensional contingency tables and will be considered in this chapter. If the presence of the event is not too frequent, other approaches include studying either the numbers of events occurring in fixed intervals of time or the time between successive events. We shall look at these in the next chapter.
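A minimal sketch of this idea, in Python rather than GLIM and with an invented binary series, tabulates each outcome against the one immediately preceding it (a 2×2 contingency table) and row-normalises to estimate the first-order transition matrix:

```python
import numpy as np

# Hypothetical binary series (1 = event present, 0 = absent) at equal intervals.
series = [0, 0, 1, 0, 1, 1, 0, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 0, 1]

# Cross-tabulate each outcome against the one immediately preceding it:
# rows index the previous state, columns the current state.
counts = np.zeros((2, 2))
for prev, curr in zip(series[:-1], series[1:]):
    counts[prev, curr] += 1

# Row-normalising the contingency table gives the estimated
# first-order transition probabilities.
transition = counts / counts.sum(axis=1, keepdims=True)
print(transition)
```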
James K. Lindsey

Chapter 3. Point and Renewal Processes

Abstract
In the previous chapter, we saw one way to analyze a series of events in time, by conditioning on what happened before. Each event occurs in a given constant period of time. In fact, these periods were assumed small enough so that only one event, however defined, can occur in each period. Such data can also be studied in two further ways. If we group the data into longer periods, we can look at the rate of occurrence of the events. However, in such a case, we lose information since we also have available the actual times between successive events. Often, such varying intervals of time can be considered to be independently and identically distributed. In such a case, we have a renewal process. This term comes from industry where certain machines or parts must be replaced or renewed at varying intervals of time. But models for renewal processes have much wider application.
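As a sketch of the renewal idea (Python, with hypothetical event times), the gaps between successive events are treated as i.i.d. draws from one interval distribution; under the simplest, exponential, choice the fitted rate is just the reciprocal of the mean gap:

```python
import numpy as np

# Hypothetical event times (e.g. successive part replacements on a machine).
event_times = np.array([0.0, 1.2, 3.5, 4.1, 7.0, 8.8, 12.3, 13.0])

# A renewal process treats the intervals between events as i.i.d.
gaps = np.diff(event_times)

# Under an exponential interval distribution (the special case in which the
# renewal process is a Poisson process), the MLE of the rate is 1 / mean gap.
rate = 1.0 / gaps.mean()
print(gaps, rate)
```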
James K. Lindsey

Chapter 4. Survival Curves

Abstract
In the two previous chapters, we have studied series of events occurring to one or more units (mines, a hospital, voters, …) observed over time. In Chapter 2, each event resulted in a possible switch from one of a relatively limited number of states to another. In the last chapter, we looked at the series of duration times between the events. In such series, no event was absorbing: it was always possible to pass from any state to another. A different situation arises when an absorbing state exists. The process ends when the individual enters such a state. In the simplest case, only two states exist and the event of interest is the passage from one to the other. Thus, we are interested in the time during which an individual survives in the first state before passing into the absorbing state, from which there is no return: the survival time.
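A minimal illustration (Python, invented survival times, censoring ignored for simplicity) of the empirical survivor function, the proportion of individuals still in the first state at each time:

```python
import numpy as np

# Hypothetical survival times: time spent in the first state before
# passing into the absorbing state. No censoring in this toy example.
times = np.array([2.0, 3.0, 3.0, 5.0, 8.0, 9.0, 9.0, 12.0])

# Empirical survivor function S(t): proportion still surviving beyond t.
def survivor(t, times=times):
    return np.mean(times > t)

for t in [0.0, 3.0, 9.0]:
    print(t, survivor(t))
```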
James K. Lindsey

Chapter 5. Growth Curves

Abstract
One special type of series measured in time concerns changes related to growth. Measurements are made repeatedly on one or more individuals. One goal of such studies is often to predict future growth. Naturally, for a given individual, the observations usually will not be independent, but will be correlated. If several individuals are concerned, the interrelation may vary among the individuals. However, we shall ignore this problem, as has often been done with growth curves, until after we have looked at classical time series analysis in the next chapter. Here, we shall treat several examples of the simple case of one series of repeated measurements (i.e. on one individual) related to growth, leaving the case of measurements on several individuals to Chapter 8, when we shall be able to handle the correlation.
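As a sketch of a simple growth-curve fit for one individual (Python, invented measurements, serial correlation ignored as in the text), an assumed exponential growth law becomes linear after taking logarithms, and the fitted curve can then be used to predict future growth:

```python
import numpy as np

# One individual's repeated measurements, assumed to follow exponential
# growth y_t ~ y0 * exp(r * t); taking logs makes the model linear in t.
t = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.1, 1.8, 3.1, 4.9, 8.2, 13.4])

T = np.column_stack([np.ones_like(t), t])
coef, *_ = np.linalg.lstsq(T, np.log(y), rcond=None)
y0, rate = np.exp(coef[0]), coef[1]

# Predict future growth at t = 6 (correlation between the repeated
# measurements is ignored here, as discussed in the text).
print(y0, rate, y0 * np.exp(rate * 6.0))
```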
James K. Lindsey

Chapter 6. Time Series: The Time Domain

Abstract
Many of the data sets which might best be handled by one of the approaches of the previous chapters would often be treated by a classical time series analysis, assuming a normal distribution. Most often, this simply amounts to fitting a multiple regression, with (a dynamic model) or without (a static model) lagged variables. Checks are then made on the various assumptions underlying the model: serial correlation of the residuals, non-normality, heteroscedasticity, and mis-specification of the linear model. Except for the first, these are the common verifications that a GLIM user applies to any model.
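A sketch of the static-versus-dynamic distinction (Python, simulated data): the dynamic model simply adds the lagged response as an explanatory variable in an ordinary multiple regression:

```python
import numpy as np

# Simulate a series in which y_t depends on its own lag and on a covariate.
rng = np.random.default_rng(7)
n = 200
x = rng.normal(size=n)
y = np.empty(n)
y[0] = 0.0
for t in range(1, n):
    y[t] = 0.6 * y[t - 1] + 1.0 * x[t] + rng.normal(scale=0.5)

# Dynamic model: regress y_t on an intercept, the lagged response y_{t-1},
# and the current covariate x_t. (A static model would omit the lag.)
X = np.column_stack([np.ones(n - 1), y[:-1], x[1:]])
beta, *_ = np.linalg.lstsq(X, y[1:], rcond=None)
print(beta)  # estimates should roughly recover 0, 0.6 and 1.0
```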
James K. Lindsey

Chapter 7. Time Series: The Frequency Domain

Abstract
In the previous chapter, we studied how a series of observations evolves over time. Another approach is to study how the series varies in frequency: the periods of cyclic phenomena. To do this, we require an estimate of the spectral density function, which is complementary to the function expressed in the correlogram. Spectral analysis involves decomposing a function into a sum of sines and cosines, an adaptation of Fourier analysis to stochastic variables. Suppose that our time series can be represented by an unknown function of time, E(yt) = g(t;ω). Then we decompose this function:
$$ g(t;\omega) = \frac{a_0}{2} + \sum\limits_{r=1}^{M} (a_r \cos \omega_r t + b_r \sin \omega_r t) $$
(7.1)
$$ = \frac{a_0}{2} + \sum\limits_{r=1}^{M} R_r \cos(\omega_r t + \phi_r) $$
(7.2)
where Rr = √(ar² + br²) is the amplitude of the rth harmonic of the series, ωr = 2π/Pr = 2πr/N is the frequency of the rth cycle, where Pr is its period or length, φr = tan⁻¹(−br/ar) is the phase, and the ωr (1 ≤ r ≤ M = [N/2]) are the fundamental or Fourier frequencies. Then,
$$ I(\omega_r) = N R_r^2 / 4\pi $$
(7.3)
is known as the periodogram, and is plotted against ωr. In this way, the total area under the periodogram equals the variance of the time series. (Rr²/2 plotted against r would give the same result.)
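These quantities can be computed directly; the following Python sketch (simulated series, odd length N so that M = (N − 1)/2 and the variance identity takes its simplest form) forms the Fourier coefficients, the periodogram ordinates, and checks that the sum of Rr²/2 over the Fourier frequencies recovers the series variance:

```python
import numpy as np

# Simulated series of odd length N: a cosine at the 5th Fourier frequency
# plus noise.
rng = np.random.default_rng(3)
N = 101
t = np.arange(N)
y = np.cos(2 * np.pi * 5 * t / N) + rng.normal(scale=0.3, size=N)

M = (N - 1) // 2
r = np.arange(1, M + 1)
omega = 2 * np.pi * r / N                      # Fourier frequencies omega_r

a = 2.0 / N * np.cos(np.outer(omega, t)) @ y   # coefficients a_r
b = 2.0 / N * np.sin(np.outer(omega, t)) @ y   # coefficients b_r
R2 = a**2 + b**2                               # squared amplitudes R_r^2

I = N * R2 / (4 * np.pi)                       # periodogram I(omega_r)

# Parseval: the sum of R_r^2 / 2 equals the variance of the series.
print(np.sum(R2 / 2), np.var(y))
```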
James K. Lindsey

Chapter 8. Repeated Measurements

Abstract
Suppose that several measurements are taken on a number of individuals, usually over time. We wish to compare the way in which the measurements change for different individuals or groups of individuals. We are, then, comparing growth curves or time series, and testing for differences among them. Since observations are taken from the same individual, they will usually be correlated. (As is often the case for such models, we ignored this correlation in the chapter on growth curves.) Note, however, that the number of observations is usually very much smaller than for a single time series or growth curve. In such a case, the covariances among observations can often be modelled by a simple autoregression. The further apart in time the observations on an individual are, the less closely related they are. The values on the minor diagonals of the variance-covariance matrix for an individual decrease the farther they are from the main diagonal, while off-diagonal elements between individuals are zero. In this sense, repeated measurements models are very closely related to the models of the previous three chapters. The difference is that more than one individual is involved. We shall deal with this case in the second and fourth sections.
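A small Python sketch of this covariance structure (hypothetical values: two individuals, four measurement times each, an assumed AR(1) decay ρ^|s−t| within individuals):

```python
import numpy as np

# Within an individual, cov(y_s, y_t) = sigma2 * rho**|s - t| (an AR(1)
# structure): covariances decay away from the main diagonal. Measurements
# on different individuals are uncorrelated.
sigma2, rho, n_times = 1.0, 0.5, 4
lags = np.abs(np.subtract.outer(np.arange(n_times), np.arange(n_times)))
block = sigma2 * rho**lags

# Block-diagonal covariance matrix for two individuals: zeros off the blocks.
cov = np.kron(np.eye(2), block)
print(cov)
```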
James K. Lindsey

Chapter 9. Stochastic Processes and Generalized Linear Models

Abstract
Most of the models presented up until now illustrate how the commonly used and classical approaches to stochastic processes can be handled in GLIM. Many demonstrate the versatility of writing macros in the GLIM programming language. This flexibility can obviously be extended to many other models, either in the family of generalized linear models, or, at least, closely related to it. For example, the models for time series in the time domain, presented in Chapter 6, all use a normal distribution. They can very simply be modified to accommodate any other member of the exponential family. Similar modifications could be applied to certain models in other chapters. In this chapter, we shall consider a heterogeneous collection of models for stochastic processes which bring together some of the ideas of the previous chapters.
James K. Lindsey

Backmatter
