International Journal of Speech Technology

Publication date: 01-06-2012

Time–domain non-linear feature parameter for consonant classification

Authors: T. M. Thasleema, P. Prajith, N. K. Narayanan

Published in: International Journal of Speech Technology | Issue 2/2012

Abstract

This paper introduces an accurate time-domain approach to modeling and classifying Malayalam Consonant-Vowel (CV) speech unit waveforms. The technique is based on statistical models of the Reconstructed State Space (RSS). A feature extraction method using RSS-based State Space Point Distribution (SSPD) parameters is studied. The results of the simulation experiment performed on the Malayalam CV speech databases using Artificial Neural Network (ANN) and k-Nearest Neighbor (k-NN) classifiers are also presented. The results indicate that the RSS approach can increase speaker-independent consonant recognition accuracy.
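
The pipeline named in the abstract (state space reconstruction, a point-distribution feature, then k-NN classification) can be illustrated with a minimal sketch. The abstract does not define how the SSPD parameters are computed, so the version below is an assumption: it builds two-dimensional delay vectors (s[n], s[n+lag]) from an amplitude-normalized waveform, overlays a square grid, and uses the fraction of points in each cell as the feature vector. The function names and the parameters `dim`, `lag`, `bins`, and `k` are hypothetical and are not taken from the paper.

```python
import math


def delay_embed(signal, dim=2, lag=1):
    """Time-delay embedding: one vector (s[n], s[n+lag], ...) per valid n."""
    span = (dim - 1) * lag
    return [tuple(signal[n + j * lag] for j in range(dim))
            for n in range(len(signal) - span)]


def point_distribution(signal, bins=4, lag=1):
    """Assumed SSPD-style feature: share of 2-D delay vectors per grid cell."""
    if len(signal) <= lag:
        raise ValueError("signal is too short for the chosen lag")
    peak = max(abs(x) for x in signal) or 1.0
    scaled = [x / peak for x in signal]           # amplitudes now in [-1, 1]
    points = delay_embed(scaled, dim=2, lag=lag)
    counts = [0] * (bins * bins)
    for x, y in points:
        # Map [-1, 1] onto cell indices 0..bins-1; clamp the upper edge.
        col = min(int((x + 1.0) / 2.0 * bins), bins - 1)
        row = min(int((y + 1.0) / 2.0 * bins), bins - 1)
        counts[row * bins + col] += 1
    return [c / len(points) for c in counts]


def knn_predict(train, query, k=3):
    """Majority vote among the k training vectors closest to the query."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = {}
    for _, label in nearest:
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)


if __name__ == "__main__":
    # Synthetic stand-ins for two sound classes (not real speech data).
    slow = [math.sin(2 * math.pi * 0.01 * n) for n in range(400)]
    fast = [math.sin(2 * math.pi * 0.20 * n) for n in range(400)]
    train = [(point_distribution(slow), "slow"),
             (point_distribution(fast), "fast")]
    probe = [math.sin(2 * math.pi * 0.012 * n) for n in range(400)]
    print(knn_predict(train, point_distribution(probe), k=1))
```

A slowly varying waveform keeps its delay vectors near the diagonal of the grid, while a rapidly varying one spreads them across more cells, which is why the cell occupancies can separate the two synthetic classes here. Real use would need recorded CV units and tuned values for `lag` and `bins`.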

Title: Time–domain non-linear feature parameter for consonant classification
Authors: T. M. Thasleema
P. Prajith
N. K. Narayanan
Publication date: 01-06-2012
Publisher: Springer US
Published in: International Journal of Speech Technology / Issue 2/2012
Print ISSN: 1381-2416
Electronic ISSN: 1572-8110
DOI: https://doi.org/10.1007/s10772-012-9136-6
