About this book

Digital Speech Processing Using Matlab covers digital speech pattern recognition, the speech production model, speech feature extraction, and speech compression. The book is written for beginners pursuing basic research in digital speech processing. Matlab illustrations accompany most topics to aid understanding of the concepts. The book also covers basic pattern recognition techniques (illustrated with speech signals using Matlab) such as PCA, LDA, ICA, SVM, HMM, GMM, BPN, and KSOM.

Table of Contents

Frontmatter

Chapter 1. Pattern Recognition for Speech Detection

Abstract
This chapter describes the supervised pattern recognition techniques used to design classifiers for speech and speaker detection: the back-propagation neural network (BPNN), support vector machine (SVM), hidden Markov model (HMM), and Gaussian mixture model (GMM). It also discusses unsupervised techniques such as the fuzzy k-means algorithm and the Kohonen self-organizing map (KSOM), along with dimensionality reduction techniques such as principal component analysis (PCA), linear discriminant analysis (LDA), kernel LDA, and independent component analysis (ICA). The techniques are illustrated using MATLAB for better understanding.
E. S. Gopi

Chapter 2. Speech Production Model

Abstract
The continuous speech signal (air) that comes out of the mouth and the nose is converted into an electrical signal by a microphone. The electrical speech signal is sampled to obtain a discrete signal, which is stored in a digital system for further processing. This is digital speech processing. Speech signal models are broadly classified as source-filter models and probabilistic models. The source-filter model is based on the physical phenomenon that produces the speech signal. Probabilistic models such as the hidden Markov model (HMM) and the Gaussian mixture model (GMM) are mathematical models that do not depend on the physical phenomenon. A speech model is used to extract feature vectors from the speech signal for isolated speech recognition and speaker recognition. It is used to compress the speech signal for storage, as in code-excited linear prediction (CELP). It is useful for converting text into speech, known as speech synthesis. It is also used for continuous speech recognition. This chapter deals with the source-filter model of speech production.
E. S. Gopi
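
As a rough illustration of the source-filter idea named in the abstract above, the sketch below drives an all-pole filter (standing in for the vocal tract) with a periodic impulse train (standing in for the voiced source). The function names, the single filter coefficient, the pitch period, and the sign convention `s[n] = e[n] + sum_k a[k] * s[n-1-k]` are all assumptions chosen for illustration; they are not taken from the book, which uses Matlab.

```python
def impulse_train(length, period):
    """Hypothetical voiced source: a unit impulse every `period` samples."""
    return [1.0 if n % period == 0 else 0.0 for n in range(length)]


def synthesize(excitation, a):
    """All-pole filter sketch: s[n] = e[n] + sum_k a[k] * s[n-1-k].

    `a` holds assumed feedback coefficients; a[0] weights s[n-1].
    """
    s = []
    for n, e in enumerate(excitation):
        acc = e
        for k, coeff in enumerate(a):
            idx = n - 1 - k
            if idx >= 0:  # samples before the start are treated as zero
                acc += coeff * s[idx]
        s.append(acc)
    return s


# Illustrative values only: pitch period of 4 samples, one filter pole at 0.5.
excitation = impulse_train(length=8, period=4)
speech = synthesize(excitation, a=[0.5])
print(speech)
```

Each impulse starts a decaying response shaped by the filter; a realistic vocal tract model would use more coefficients, typically estimated from recorded speech.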

Chapter 3. Feature Extraction of the Speech Signal

Abstract
Isolated speech recognition, speaker recognition, and continuous speech recognition require feature vectors extracted from the speech signal. These are subjected to pattern recognition to formulate the classifier. A feature vector is extracted from each frame of the speech signal under test. This chapter discusses various parameter extraction techniques: linear predictive coefficients as the filter coefficients of the vocal tract model, poles of the vocal tract filter, cepstral coefficients, mel-frequency cepstral coefficients (MFCC), line spectral coefficients, and reflection coefficients. Preprocessing techniques such as dynamic time warping, endpoint detection, and pre-emphasis are also discussed.
E. S. Gopi
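
Two of the steps mentioned in the abstract above, pre-emphasis and frame-by-frame processing, can be sketched in a few lines. This is a minimal illustration under stated assumptions: the first-order form `y[n] = x[n] - alpha * x[n-1]`, the value `alpha = 0.95`, and the frame length and hop size are common choices assumed here, not values given by the book.

```python
def pre_emphasis(signal, alpha=0.95):
    """First-order pre-emphasis: y[n] = x[n] - alpha * x[n-1] (alpha is assumed)."""
    if not signal:
        return []
    out = [signal[0]]  # the first sample has no predecessor
    for n in range(1, len(signal)):
        out.append(signal[n] - alpha * signal[n - 1])
    return out


def split_into_frames(signal, frame_length, hop):
    """Cut the signal into overlapping frames; an incomplete tail is dropped."""
    frames = []
    start = 0
    while start + frame_length <= len(signal):
        frames.append(signal[start:start + frame_length])
        start += hop
    return frames


# Toy input: ten samples, frames of 4 samples with a hop of 2 (illustrative sizes).
samples = [float(n) for n in range(10)]
emphasized = pre_emphasis(samples)
frames = split_into_frames(emphasized, frame_length=4, hop=2)
print(len(frames))
```

A feature vector (for example, linear predictive or cepstral coefficients) would then be computed from each frame.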

Chapter 4. Speech Compression

Abstract
The speech signal is usually sampled at a sampling frequency of \(8{,}000\) Hz. With uniform quantization at \(8\) bits/sample, \(64{,}000\) bits are required for 1 s of speech data. The redundancy in the speech signal is exploited to bring this down to as low as \(3{,}000\) bits for 1 s of data. This is known as digital speech compression. Compression reduces the quality of the speech signal. This chapter discusses techniques for compressing speech data, such as nonuniform quantization, adaptive differential pulse code modulation, and code-excited linear prediction, as well as the methodology for measuring the quality of the compressed speech signal.
E. S. Gopi
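
The bit-rate arithmetic in the abstract above can be checked directly. The short sketch below uses only the figures stated there (8,000 Hz, 8 bits/sample, 3,000 bits for 1 s); the function and variable names are hypothetical.

```python
def bits_per_second(sampling_rate_hz, bits_per_sample):
    """Bit rate of uniformly quantized speech, single channel assumed."""
    return sampling_rate_hz * bits_per_sample


uncompressed = bits_per_second(8000, 8)   # figures from the abstract
compressed = 3000                         # lowest rate quoted in the abstract
compression_ratio = uncompressed / compressed

print(uncompressed)
print(round(compression_ratio, 2))
```

Under these figures the compressed stream needs roughly one twenty-first of the bits of the uniformly quantized signal.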

Backmatter

Further Information