Published in: International Journal of Speech Technology

1 September 2013

Wavelet based sub-band parameters for classification of unaspirated Hindi stop consonants in initial position of CV syllables

Authors: R. P. Sharma, O. Farooq, I. Khan

Published in: International Journal of Speech Technology | Issue 3/2013

Abstract

This paper proposes a new feature extraction technique using wavelet based sub-band parameters (WBSP) for classification of unaspirated Hindi stop consonants. The extracted acoustic parameters deviate markedly from the values reported for English and other languages, since Hindi has distinguishing manner-based features.

Because acoustic parameters are difficult to extract automatically for speech recognition, Mel Frequency Cepstral Coefficient (MFCC) based features are usually used. MFCCs are based on the short-time Fourier transform (STFT), which assumes the speech signal to be stationary over a short period. This assumption is violated particularly in the case of stop consonants.
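The contrast between the two analyses can be made concrete with their textbook definitions. The notation below (window $w$, mother wavelet $\psi$, scale $a$, shift $b$) is assumed for illustration and is not taken from the paper. The STFT uses one fixed window length for all frequencies, whereas the wavelet transform shortens its analysis window at small scales (high frequencies), which suits brief, non-stationary events such as stop bursts.

```latex
% STFT: fixed analysis window w(t - \tau) at every frequency \omega
X(\tau, \omega) = \int_{-\infty}^{\infty} x(t)\, w(t - \tau)\, e^{-j\omega t}\, dt

% Continuous wavelet transform: window width scales with a
W(a, b) = \frac{1}{\sqrt{|a|}} \int_{-\infty}^{\infty} x(t)\, \psi^{*}\!\left(\frac{t - b}{a}\right) dt
```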

In WBSP, following an acoustic study, the features derived from CV syllables are given different weighting factors, with the middle segment receiving the largest weight. The wavelet transform is applied to split the signal into 8 sub-bands of different bandwidths, and the variation of energy across sub-bands is also taken into account. WBSP gives improved classification scores. The number of filters used for feature extraction in WBSP (8) is smaller than the number used for MFCC (24). Its classification performance has been compared with that of four other techniques using a linear classifier. Further, principal component analysis (PCA) has been applied to reduce dimensionality.
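A minimal sketch of the general idea, not the authors' implementation: the abstract does not name the wavelet, the band layout, the segmentation, or the weights, so everything specific here is an assumption. The sketch uses a 7-level Haar DWT (which yields 8 dyadic sub-bands of different bandwidths), three equal-length segments, hypothetical weights `(0.25, 0.5, 0.25)` that favour the middle segment, and PCA via SVD. The energy-variation terms mentioned in the abstract are omitted.

```python
import numpy as np

def haar_step(x):
    """One level of an orthonormal Haar DWT: returns (approximation, detail)."""
    x = x[: len(x) // 2 * 2]                 # drop a trailing sample if length is odd
    even, odd = x[0::2], x[1::2]
    return (even + odd) / np.sqrt(2.0), (even - odd) / np.sqrt(2.0)

def subband_log_energies(segment, levels=7):
    """Log-energy in levels + 1 dyadic sub-bands (8 bands for levels=7)."""
    approx = np.asarray(segment, dtype=float)
    energies = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        energies.append(np.sum(detail ** 2))   # highest-frequency band first
    energies.append(np.sum(approx ** 2))       # remaining low-pass band
    return np.log(np.array(energies) + 1e-12)  # small offset avoids log(0)

def wbsp_like_features(syllable, weights=(0.25, 0.5, 0.25)):
    """Weighted sub-band features, one block per segment (weights are hypothetical)."""
    segments = np.array_split(np.asarray(syllable, dtype=float), len(weights))
    return np.concatenate(
        [w * subband_log_energies(s) for w, s in zip(weights, segments)]
    )

def pca_reduce(features, n_components):
    """Project row-wise feature vectors onto their leading principal components."""
    centered = features - features.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

# Synthetic stand-in for 20 CV syllables of 768 samples each (not real speech).
rng = np.random.default_rng(0)
syllables = rng.standard_normal((20, 768))
X = np.vstack([wbsp_like_features(s) for s in syllables])  # 3 segments x 8 bands
Z = pca_reduce(X, n_components=5)
```

Each syllable yields 24 values here (3 segments times 8 bands) before PCA. The reduced matrix `Z` could then be passed to any linear classifier. With real data, each segment should contain at least 128 samples so that all 7 decomposition levels are non-empty.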

Title: Wavelet based sub-band parameters for classification of unaspirated Hindi stop consonants in initial position of CV syllables
Authors: R. P. Sharma, O. Farooq, I. Khan
Publication date: 1 September 2013
Publisher: Springer US
Published in: International Journal of Speech Technology / Issue 3/2013
Print ISSN: 1381-2416
Electronic ISSN: 1572-8110
DOI: https://doi.org/10.1007/s10772-012-9185-x
