Published in: International Journal of Speech Technology 2/2012

01-06-2012

Time–domain non-linear feature parameter for consonant classification

Authors: T. M. Thasleema, P. Prajith, N. K. Narayanan

Abstract

This paper introduces an accurate time–domain approach to modelling and classifying Malayalam Consonant-Vowel (CV) speech unit waveforms. The technique is based on statistical models of the Reconstructed State Space (RSS). A feature extraction method using RSS-based State Space Point Distribution (SSPD) parameters is studied. Results of simulation experiments on Malayalam CV speech databases using Artificial Neural Network (ANN) and k-Nearest Neighbor (k-NN) classifiers are also presented. The results indicate that the RSS approach can increase speaker-independent consonant recognition accuracy.
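
The abstract does not spell out how the SSPD parameters are computed, so the following is only an illustrative sketch of the general idea: a waveform is turned into state-space points by time-delay embedding, the points are counted on a grid to give a distribution-style feature vector, and a nearest-neighbor rule assigns a label. The embedding dimension of 2, the lag of 1, the grid size, and the function names `delay_embed`, `point_distribution`, and `nearest_neighbor` are all assumptions made for this example, not details taken from the paper.

```python
def delay_embed(signal, dim=2, lag=1):
    """Time-delay embedding: map a 1-D signal to points (s[t], s[t+lag], ...)."""
    n_points = len(signal) - (dim - 1) * lag
    if n_points <= 0:
        raise ValueError("signal too short for this dim and lag")
    return [tuple(signal[t + i * lag] for i in range(dim)) for t in range(n_points)]


def point_distribution(signal, bins=4, lag=1):
    """Assumed SSPD-style feature: fraction of 2-D state-space points per grid cell."""
    peak = max(abs(s) for s in signal)
    # Scale amplitudes into [-1, 1] so every waveform shares the same grid.
    scaled = [s / peak for s in signal] if peak > 0 else list(signal)
    points = delay_embed(scaled, dim=2, lag=lag)
    counts = [0] * (bins * bins)
    for a, b in points:
        # Map [-1, 1] onto cell indices 0..bins-1 (the value 1.0 goes in the last cell).
        row = min(int((a + 1.0) / 2.0 * bins), bins - 1)
        col = min(int((b + 1.0) / 2.0 * bins), bins - 1)
        counts[row * bins + col] += 1
    return [c / len(points) for c in counts]


def nearest_neighbor(query, labelled):
    """1-NN rule: return the label of the closest (feature_vector, label) pair."""
    def sq_dist(u, v):
        return sum((x - y) ** 2 for x, y in zip(u, v))
    return min(labelled, key=lambda item: sq_dist(query, item[0]))[1]
```

In use, `point_distribution` would be applied to each CV waveform, and the resulting vectors passed to a classifier; the 1-NN function here stands in for the k-NN and ANN classifiers named in the abstract and is not a reproduction of the authors' setup.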


Metadata
Title: Time–domain non-linear feature parameter for consonant classification
Authors: T. M. Thasleema, P. Prajith, N. K. Narayanan
Publication date: 01-06-2012
Publisher: Springer US
Published in: International Journal of Speech Technology, Issue 2/2012
Print ISSN: 1381-2416
Electronic ISSN: 1572-8110
DOI: https://doi.org/10.1007/s10772-012-9136-6
