Continuously variable duration hidden Markov models for automatic speech recognition

https://doi.org/10.1016/S0885-2308(86)80009-2

Abstract

During the past decade, the applicability of hidden Markov models (HMM) to various facets of speech analysis has been demonstrated in several different experiments. These investigations all rest on the assumption that speech is a quasi-stationary process whose stationary intervals can be identified with the occupancy of a single state of an appropriate HMM. In the traditional form of the HMM, the probability that a state is occupied for a given duration decreases exponentially with that duration. This behavior does not provide an adequate representation of the temporal structure of speech.
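The exponential behaviour noted above follows directly from the self-transition structure; the displayed equation below is standard HMM background rather than a quotation from the article. If state i has self-transition probability a_{ii}, the probability of occupying it for exactly d consecutive frames is

    p_i(d) = a_{ii}^{d-1} (1 - a_{ii}),   d = 1, 2, ...

a geometric distribution whose most probable duration is always d = 1, regardless of how long the quasi-stationary interval being modelled actually is.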

The solution proposed here is to replace the probability distributions of duration with continuous probability density functions, forming a continuously variable duration hidden Markov model (CVDHMM). The gamma distribution is ideally suited to specification of the durational density since it is one-sided and has only two parameters which, together, determine both the mean and the variance. The main result is a derivation and proof of convergence of re-estimation formulae for all the parameters of the CVDHMM. It is interesting to note that if the state durations are gamma-distributed, one of the formulae is non-algebraic but, fortunately, has properties such that it is easily and rapidly solved numerically to any desired degree of accuracy. Other results are presented, including the performance of the formulae on simulated data.
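For reference, the two-parameter gamma density referred to above has the standard form (this parameterization is assumed here, not quoted from the article)

    p(d) = \frac{\eta^{\nu}}{\Gamma(\nu)} d^{\nu - 1} e^{-\eta d},   d > 0,

with mean \nu/\eta and variance \nu/\eta^{2}, so the two parameters (\nu, \eta) indeed fix both moments.

The remark about a non-algebraic but easily solved re-estimation formula can be illustrated with a minimal numerical sketch. Assuming, for illustration only, that the shape-parameter update reduces to the usual maximum-likelihood condition for a gamma distribution, log(\nu) - \psi(\nu) = log(\bar{d}) - \overline{\log d} (the paper's actual formulae are not reproduced here), the hypothetical function solve_gamma_shape below shows how such a transcendental equation is solved quickly, and to arbitrary accuracy, by bisection:

    # Sketch: solve log(nu) - digamma(nu) = c for the gamma shape parameter nu.
    # This is the standard ML condition for a gamma distribution, used here as an
    # illustrative stand-in for the paper's non-algebraic re-estimation formula.
    import numpy as np
    from scipy.special import digamma

    def solve_gamma_shape(durations, tol=1e-10):
        """Return (nu, eta) fitted to a list of observed state durations."""
        d = np.asarray(durations, dtype=float)
        c = np.log(d.mean()) - np.log(d).mean()      # >= 0 by Jensen's inequality
        f = lambda nu: np.log(nu) - digamma(nu) - c  # strictly decreasing in nu
        lo, hi = 1e-6, 1e6                           # bracket containing the root
        for _ in range(200):                         # bisection; 200 halvings is ample
            mid = 0.5 * (lo + hi)
            if f(mid) > 0.0:
                lo = mid                             # root lies to the right of mid
            else:
                hi = mid                             # root lies to the left of mid
            if hi - lo < tol:
                break
        nu = 0.5 * (lo + hi)
        eta = nu / d.mean()                          # rate follows from mean = nu / eta
        return nu, eta

    # Hypothetical usage with made-up duration counts (in frames):
    # nu, eta = solve_gamma_shape([8, 12, 9, 15, 11])

Because the left-hand side is strictly monotone in \nu, the bisection converges to any desired accuracy, consistent with the abstract's observation that the non-algebraic formula is easily and rapidly solved numerically.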

