
2022 | Book

Mixture and Hidden Markov Models with R


About this Book

This book discusses mixture and hidden Markov models for modeling behavioral data. Mixture and hidden Markov models are statistical models which are useful when an observed system occupies a number of distinct “regimes” or unobserved (hidden) states. These models are widely used in a variety of fields, including artificial intelligence, biology, finance, and psychology. Hidden Markov models can be viewed as an extension of mixture models that allows for transitions between states over time. Covering both mixture and hidden Markov models in a single book allows the main concepts and issues to be introduced in the simpler context of mixture models. After a thorough treatment of the theory and practice of mixture modeling, the conceptual leap towards hidden Markov models is relatively straightforward.

This book provides many practical examples illustrating the wide variety of uses of the models. These examples are drawn from the authors’ own work in psychology, as well as from other areas such as financial time series and climate data. Most examples illustrate the use of the authors’ depmixS4 package, which provides a flexible framework to construct and estimate mixture and hidden Markov models. All examples are fully reproducible, and the accompanying hmmr package provides all the datasets used, as well as additional functionality. This book is suitable for advanced students and researchers with an applied background.
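
To give a flavour of how the depmixS4 package is used, the following is a minimal sketch of its typical workflow on simulated data (this sketch is not one of the book's examples, and the data and variable names are hypothetical): specify a model, fit it by maximum likelihood, and inspect the result.

library(depmixS4)

set.seed(1)
## simulate a single series that switches between two regimes with different means
dat <- data.frame(y = c(rnorm(100, mean = 0), rnorm(100, mean = 3)))

## specify a 2-state Gaussian hidden Markov model for the response y
mod <- depmix(y ~ 1, data = dat, nstates = 2, family = gaussian())

## estimate the parameters (by default via the EM algorithm) and inspect the fit
fmod <- fit(mod)
summary(fmod)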

Table of Contents

Frontmatter
Chapter 1. Introduction and Preliminaries
Abstract
This chapter first provides a brief overview of the contents of the book. Section 1.2 then gives a short introduction to R, focusing on summarizing and visualizing data, and on modelling data with (generalized) linear models. Section 1.3 introduces the datasets used throughout the book to illustrate the models, including their origin and main research questions.
Ingmar Visser, Maarten Speekenbrink
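
As a small, hypothetical illustration of the kind of (generalized) linear modelling introduced in Sect. 1.2, consider a logistic regression of a binary accuracy score on age (simulated data; the variable names are made up for this sketch):

## simulate binary accuracy as a function of age
set.seed(123)
dat <- data.frame(age = runif(100, min = 6, max = 12))
dat$correct <- rbinom(100, size = 1, prob = plogis(-4 + 0.5 * dat$age))

## logistic regression (a GLM with binomial family and logit link)
m <- glm(correct ~ age, family = binomial, data = dat)
summary(m)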
Chapter 2. Mixture and Latent Class Models
Abstract
This chapter introduces mixture models and latent class models. After a motivating example, formal definitions of these models are presented in Sect. 2.2. In Sect. 2.3, several methods for maximum likelihood parameter estimation are outlined. In Sects. 2.4 and 2.5, parameter inference is discussed: Are parameters different from zero? Is there justification to constrain parameters to be equal to one another? What are the standard errors of parameter estimates? In Sect. 2.6, model selection is treated. A main question here is how many states (mixture components) a model should have; the various criteria used to decide this are defined and discussed. In Sect. 2.7, we then discuss how to model the effect of covariates on the prior probabilities of the states. Finally, in Sect. 2.8, we consider whether the parameters of mixture models are identifiable.
Ingmar Visser, Maarten Speekenbrink
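
A recurring question in this chapter is how many components a mixture model should have. A minimal, hedged sketch of one common approach, comparing models by BIC using depmixS4's mix() and fit() on simulated data (not one of the book's datasets), is:

library(depmixS4)

set.seed(2)
dat <- data.frame(y = c(rnorm(150, mean = 0), rnorm(150, mean = 4)))

## fit Gaussian mixtures with 2 to 4 components
models <- lapply(2:4, function(k) {
  fit(mix(y ~ 1, data = dat, nstates = k, family = gaussian()))
})

## smaller BIC indicates a better trade-off between fit and complexity
sapply(models, BIC)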
Chapter 3. Mixture and Latent Class Models: Applications
Abstract
In this chapter we provide applications of mixture and latent class models. First, two examples of Gaussian mixture models are presented. In the application to the conservation of liquid task (Sect. 3.2), an illustration is provided of the occurrence of local maxima in the log likelihood, which is a recurring challenge in mixture and latent class models. Section 3.3 extends this analysis to a multivariate Gaussian mixture model and illustrates some of the practical difficulties in estimating such models. In Sect. 3.4, a latent class model is presented for the balance scale data, consisting of four binary items. This example is then used to illustrate testing item homogeneity: are the items equivalent to each other? Given such item homogeneity, items can then be collapsed into a sum score. Hence, the application in Sect. 3.5 analyses such sum scores of items using binomial mixtures, and illustrates testing direct versus indirect effects of age. The final application (Sect. 3.6) concerns data from the dimensional change card sort task, again analysed with binomial mixture models, and illustrates the use of the bootstrapped likelihood ratio statistic for model selection.
Note that the term latent class model is mostly used for mixture models for multivariate categorical or nominal data. An example is provided with multivariate binary data in Sect. 3.4.
Ingmar Visser, Maarten Speekenbrink
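
Because local maxima of the log likelihood are a recurring challenge in this chapter, a standard remedy is to run the EM algorithm several times and keep the best solution. A hedged sketch with depmixS4 follows (simulated data; repeated calls to fit() may end in different local maxima because starting values are partly random by default):

library(depmixS4)

set.seed(3)
dat <- data.frame(y = c(rnorm(100, mean = 0), rnorm(100, mean = 2)))
mod <- mix(y ~ 1, data = dat, nstates = 2, family = gaussian())

## ten EM runs; keep the solution with the highest log likelihood
fits <- replicate(10, fit(mod), simplify = FALSE)
logliks <- sapply(fits, logLik)
best <- fits[[which.max(logliks)]]
logLik(best)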
Chapter 4. Hidden Markov Models
Abstract
This chapter introduces hidden Markov models (HMMs), which can be viewed as an extension of mixture models, in which a unit of observation (e.g., a participant) can move between the states (mixture components) over time. While in mixture models each observation is an independent draw from the mixture distribution, in hidden Markov models these draws are dependent. Leroux and Puterman (Biometrics 48:545–548, 1992) hence use the term dependent mixture model to describe a hidden Markov model, from which depmixS4 also derives its name. In particular, the sequence of components from which draws are taken in an HMM follows a Markov process. In other words, consecutive components are related through a Markov chain. As in mixture models, the states in a (hidden) Markov model are elements from a finite set. Models with an infinite set of states mapping onto a Euclidean space are called state-space models and are outside the scope of this book; see, e.g., Durbin and Koopman, Time Series Analysis by State Space Methods, 2012.
In Sect. 4.1, we first introduce (observed) Markov models, which are also called Markov chains, to discuss important properties of (hidden) Markov models, such as stationarity, homogeneity, and ergodicity. Section 4.2 then extends Markov models to hidden Markov models, which are governed by unobserved (also called latent or hidden) states. Inferring the hidden states and computing the model likelihood is done efficiently through so-called filtering and smoothing recursions, which are the topic of Sect. 4.3. Parameter estimation of HMMs is much the same as for mixture models, and Sect. 4.4 is therefore relatively brief in discussing the relevant extensions to Sect. 2.3. Assigning observations to hidden states is also called ‘decoding’, and this is the topic of Sect. 4.5. For global decoding, which takes into account the whole time series, the well-known Viterbi algorithm is introduced. Section 4.6 briefly discusses parameter inference, which again is mostly identical to inference in mixture models. In Sect. 4.7, we then discuss how to model the initial state and state transition probabilities with covariates, using multinomial logistic regression. Finally, we end the chapter with a discussion of missing data in the context of HMMs (Sect. 4.8).
Ingmar Visser, Maarten Speekenbrink
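
As a minimal sketch of the material in this chapter (not taken from the book itself), the following fits a 2-state Gaussian HMM with depmixS4, models the transition probabilities as a multinomial logistic function of a hypothetical covariate x, and decodes the hidden states:

library(depmixS4)

set.seed(4)
dat <- data.frame(y = c(rnorm(100, mean = 0), rnorm(100, mean = 3)),
                  x = rnorm(200))

## transition = ~ x: multinomial logistic regression of the transition
## probabilities on the covariate x (Sect. 4.7)
mod <- depmix(y ~ 1, data = dat, nstates = 2,
              family = gaussian(), transition = ~ x)
fmod <- fit(mod)

## posterior state estimates ("decoding", Sect. 4.5); see ?posterior for the
## available types, including the Viterbi solution for global decoding
post <- posterior(fmod)
head(post)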
Chapter 5. Univariate Hidden Markov Models
Abstract
In this chapter we discuss a number of examples of hidden Markov models applied to univariate time series. We first present a basic Gaussian hidden Markov model, applied to a financial time series. This example extends the earlier analysis of this data with a mixture model (Sect. 3.1) to account for the dependency between consecutive time points. Section 5.2 presents a Bernoulli HMM applied to data from a developmental study. This is an example of applying an HMM to a relatively large number of relatively short time series. The example in Sect. 5.3 applies a Gaussian HMM to a time series of response times in a speed-accuracy trade-off experiment. Of particular interest here is determining to what extent models account for the autocorrelation structure in the data. The fourth example (Sect. 5.4) concerns climate change data and introduces models for detecting sudden changes in continuous time series. These models use a constrained transition matrix to identify the times of discrete changes and the states in between the change points. The final application in this chapter (Sect. 5.5) illustrates HMMs in which the state-dependent models are generalized linear models, with covariates that vary over time. Such models can be viewed as “switching GLMs” and, as states can rely on different sets of covariates, these models are rather flexible. All models in this chapter are fitted using the hmmr and depmixS4 packages.
Ingmar Visser, Maarten Speekenbrink
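
The change-point models of Sect. 5.4 rely on a constrained transition matrix. A hedged sketch of the idea with depmixS4, on simulated data, is given below; it supplies starting values in which state 2 is absorbing, so that the fitted model describes a single change point (in practice such constraints can also be imposed explicitly, e.g. via the fixed argument of fit()):

library(depmixS4)

set.seed(5)
dat <- data.frame(y = c(rnorm(60, mean = 0), rnorm(60, mean = 2)))

## starting values for the transition matrix, one row per state of origin:
## from state 1: stay with 0.9, move to state 2 with 0.1;
## from state 2: state 2 is absorbing (no transitions back to state 1)
mod <- depmix(y ~ 1, data = dat, nstates = 2, family = gaussian(),
              trstart = c(0.9, 0.1,
                          0.0, 1.0))
fmod <- fit(mod)
summary(fmod)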
Chapter 6. Multivariate Hidden Markov Models
Abstract
This chapter provides three extended example analyses, applying hidden Markov models to multivariate time series. The first example (Sect. 6.1) concerns an analysis of a large number of replications of relatively short (T = 8) time series. Each observation at a time point, however, consists of a large number of variables. This is an example of how hidden Markov models can be used to analyze complex panel data. The second example (Sect. 6.2) shows how hidden Markov models can also be usefully applied to analyze mixed data, consisting of a binary (Bernoulli) and a continuous (Gaussian) variable. In other respects, this is a more traditional application of an HMM to a small number of relatively long time series. The final example (Sect. 6.3) concerns an HMM applied to a bivariate time series with one binomial and one multinomial response variable. An interesting variation in this analysis is that the multinomial response also functions as a predictor for the binomial response.
Ingmar Visser, Maarten Speekenbrink
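
A minimal, hypothetical sketch of the kind of mixed-response HMM discussed in Sect. 6.2, with one Gaussian and one Bernoulli response that are assumed conditionally independent given the hidden state (simulated data, made-up variable names):

library(depmixS4)

set.seed(6)
n <- 200
dat <- data.frame(rt  = c(rnorm(n/2, mean = 5), rnorm(n/2, mean = 7)),
                  acc = c(rbinom(n/2, size = 1, prob = 0.9),
                          rbinom(n/2, size = 1, prob = 0.6)))

## a list of response formulas with a matching list of families
mod <- depmix(list(rt ~ 1, acc ~ 1), data = dat, nstates = 2,
              family = list(gaussian(), binomial()))
fmod <- fit(mod)
summary(fmod)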
Chapter 7. Extensions
Abstract
This chapter discusses a number of extensions and variations of the models and techniques discussed in the previous chapters. First, in Sect. 7.1, we discuss higher-order hidden Markov models, in which the current state depends not only on the immediately preceding state, but also on states further back in time. In Sect. 7.2, we then focus on models which assume that underlying each observation is not a single state, but multiple simultaneous states. For both higher-order HMMs and HMMs with multiple simultaneous states, we discuss how these models can be transformed into a standard first-order HMM by redefining the states. Section 7.3 discusses how to deal with practical estimation issues, such as those illustrated in Sect. 3.3. The final part of this chapter discusses alternatives to maximum likelihood estimation. Section 7.4 introduces what can be called the “classification likelihood” approach to estimating mixture and hidden Markov models. Here, each observation is assigned uniquely to a single state (a “hard” classification), rather than allowing for uncertainty about the underlying state (“soft” classification). If the objective is purely to classify observations, this approach can be reasonable and will often converge more quickly than an EM algorithm using the standard likelihood. Finally, Sect. 7.5 touches on Bayesian estimation as another alternative to maximum likelihood estimation.
Ingmar Visser, Maarten Speekenbrink
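
The recoding of a higher-order HMM as a first-order HMM mentioned for Sect. 7.1 can be made concrete with a small sketch: for a second-order model, each composite state is a pair (previous state, current state), so a model with K states becomes a first-order model over K^2 composite states.

## enumerate the composite states of a second-order HMM with K = 2 states
K <- 2
composite <- expand.grid(previous = 1:K, current = 1:K)
composite
## a transition from composite state (i, j) to (j2, k) is only possible when
## j2 == j; all other entries of the K^2 x K^2 transition matrix are zero, so
## the number of free parameters matches that of the second-order model.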
Backmatter
Metadata
Title
Mixture and Hidden Markov Models with R
Written by
Ingmar Visser
Maarten Speekenbrink
Copyright year
2022
Electronic ISBN
978-3-031-01440-6
Print ISBN
978-3-031-01438-3
DOI
https://doi.org/10.1007/978-3-031-01440-6