Skip to main content
Log in

Spotting and Recognition of Consonant-Vowel Units from Continuous Speech Using Accurate Detection of Vowel Onset Points

  • Published:
Circuits, Systems, and Signal Processing Aims and scope Submit manuscript

Abstract

In this paper, we propose an efficient approach to spotting and recognition of consonant-vowel (CV) units from continuous speech using accurate detection of vowel onset points (VOPs). Existing methods for VOP detection suffer from lack of high accuracy, spurious VOPs, and missed VOPs. The proposed VOP detection is designed to overcome most of the shortcomings of the existing methods and provide accurate detection of VOPs for improving the performance of spotting and recognition of CV units. The proposed method for VOP detection is carried out in two levels. At the first level, VOPs are detected by combining the complementary evidence from excitation source, spectral peaks, and modulation spectrum. At the second level, hypothesized VOPs are verified (genuine or spurious), and their positions are corrected using the uniform epoch intervals present in the vowel regions. The spotted CV units are recognized using a two-stage CV recognizer. Two-stage CV recognition system consists of hidden Markov models (HMMs) at the first stage for recognizing the vowel category of a CV unit and support vector machines (SVMs) for recognizing the consonant category of a CV unit at the second stage. Performance of spotting and recognition of CV units from continuous speech is evaluated using Telugu broadcast news speech corpus.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6

Similar content being viewed by others

References

  1. L.R. Bahl, R. Bakis, P.S. Cohen, A.G. Cole, F. Jelinek, B.L. Lewis, R.L. Mercer, Further results on the recognition of a continuously read natural corpus, in Proc. IEEE International Conference on Acoustics, Speech, Signal Processing (1980), pp. 872–875

    Google Scholar 

  2. R. Collobert, S. Bengio, C. Williamson, SVMTorch, support vector machines for large-scale regression problems. J. Mach. Learn. Res. 1, 143–160 (2001)

    MathSciNet  Google Scholar 

  3. O. Fujimura, Syllable as a unit of speech recognition. IEEE Trans. Acoust. Speech Signal Process. 23(1), 82–87 (1975)

    Article  Google Scholar 

  4. S.V. Gangashetty, Neural network models for recognition of consonant-vowel units of speech in Multiple Languages. Ph.D. thesis, IIT Madras, Oct. 2004

  5. S.V. Gangashetty, C.C. Sekhar, B. Yegnanarayana, Spotting multilingual consonant-vowel units of speech using neural networks, in An ISCA Tutorial and Research Workshop on Non-Linear Speech Processing, Apr. 2005, pp. 287–297

    Google Scholar 

  6. S.V. Gangashetty, C.C. Sekhar, B. Yegnanarayana, Detection of vowel onset points in continuous speech using autoassociative neural network models, in Proc. Int. Conf. Spoken Language Processing, Oct. 2004, pp. 401–410

    Google Scholar 

  7. S.V. Gangashetty, C.C. Sekhar, B. Yegnanarayana, Extraction of fixed dimension patterns from varying duration segments of consonant-vowel utterances, in Proc. Intelligent Sensing and Information Processing (2004), pp. 159–164

    Chapter  Google Scholar 

  8. A. Ganapathiraju, J. Hamaker, J. Picone, M. Ordowski, G.R. Doddington, Syllable based large vocabulary continuous speech recognition. IEEE Trans. Speech Audio Process. 9(4), 358–366 (2001)

    Article  Google Scholar 

  9. J.S. Garofolo, L.F. Lamel, W.M. Fisher, J.G. Fiscus, D.S. Pallett, N.L. Dahlgren, V. Zue, TIMIT Acoustic-Phonetic Continuous Speech Corpus (Linguistic Data Consortium, Philadelphia, 1993)

    Google Scholar 

  10. D.J. Hermes, Vowel onset detection. J. Acoust. Soc. Am. 87, 866–873 (1990)

    Article  Google Scholar 

  11. M.Y. Hwang, X.D. Huang, Shared distribution Hidden Markov Models for speech recognition. IEEE Trans. Speech Audio Process. 1, 414–420 (1993)

    Article  Google Scholar 

  12. K.S.R. Murty, B. Yegnanarayana, Epoch extraction from speech signals. IEEE Trans. Audio Speech Lang. Process. 16, 1602–1613 (2008)

    Article  Google Scholar 

  13. J.W. Picone, Signal modeling techniques in speech recognition. Proc. IEEE 81, 1215–1247 (1993)

    Article  Google Scholar 

  14. S.R.M. Prasanna, B.V. Sandeep Reddy, P. Krishnamoorthy, Vowel onset point detection using source, spectral peaks, and modulation spectrum energies. IEEE Trans. Audio Speech Lang. Process. 17(4), 556–565 (2009)

    Article  Google Scholar 

  15. S.R.M. Prasanna, S.V. Gangashetty, B. Yegnanarayana, Significance of vowel onset point for speech analysis, in Signal Processing and Communications (Biennial Conf., IISc) (2001), pp. 81–88

    Google Scholar 

  16. S.R.M. Prasanna, B. Yegnanarayana, Detection of vowel onset point events using excitation source information. Interspeech, pp. 1133–1136, Sep. 2005

  17. L.R. Rabiner, B.H. Juang, Fundamentals of Speech Recognition (PTR Prentice Hall, Englewood Cliffs, 1993)

    Google Scholar 

  18. R.M. Schwartz, Y.L. Chow, S. Roucos, M. Krasner, J. Makhoul, Improved hidden Markov modeling phonemes for continuous speech recognition, in Proc. IEEE International Conference on Acoustics, Speech, Signal Processing (1984), pp. 21–24

    Google Scholar 

  19. C.C. Sekhar, Neural network models for recognition of stop consonant-vowel (SCV) segments in continuous speech. Ph.D. thesis, IIT Madras (1996)

  20. C.C. Sekhar, W.F. Lee, K. Takeda, F. Itakura, Acoustic modeling of subword units using support vector machines, in Proc. of Workshop on Spoken Language Processing (2003), pp. 79–86

    Google Scholar 

  21. R. Thangarajan, A.M. Natarajan, M. Selvam, Syllable modeling in continuous speech recognition for Tamil language. Int. J. Speech Technol. 12, 47–57 (2009)

    Article  Google Scholar 

  22. A.K. Vuppala, S. Chakrabarti, K.S. Rao, Effect of speech coding on recognition of Consonant-Vowel (CV) units, in Proc. Int. Conf. Contemporary Computing, Aug. 2010. Springer Communications in Computer and Information Science (2010), pp. 284–294. ISSN: 1865-0929

    Google Scholar 

  23. J.-H. Wang, S.-H. Chen, A C/V segmentation algorithm for Mandarin speech using wavelet transforms, in Proc. Int. Conf. Acoust. Speech, Signal Process, vol. 1, Sep. 1999, pp. 1261–1264

    Google Scholar 

  24. J.F. Wang, C.H. Wu, S.H. Chang, J.Y. Lee, A hierarchical neural network based on C/V segmentation algorithm for isolated Mandarin speech recognition. IEEE Trans. Signal Process. 39(9), 2141–2146 (1991)

    Article  Google Scholar 

  25. S. Young, D. Kershaw, J. Odell, D. Ollason, V. Valtchev, P. Woodland, The HTK Book Version 3.0 (Cambridge University Press, Cambridge, 2000)

    Google Scholar 

Download references

Acknowledgement

The authors like to acknowledge the reviewers for their valuable comments and suggested corrections. Those have helped us a lot for improving the quality of the paper.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Anil Kumar Vuppala.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Vuppala, A.K., Rao, K.S. & Chakrabarti, S. Spotting and Recognition of Consonant-Vowel Units from Continuous Speech Using Accurate Detection of Vowel Onset Points. Circuits Syst Signal Process 31, 1459–1474 (2012). https://doi.org/10.1007/s00034-012-9391-4

Download citation

  • Received:

  • Revised:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s00034-012-9391-4

Keywords

Navigation