
2018 | Book

Application of Wavelets in Speech Processing

About this book

This new edition provides an updated and enhanced survey of wavelet analysis in a range of speech processing applications. The author presents recent developments in topics such as speech enhancement, noise suppression, spectral analysis of speech signals, speech quality assessment, speech recognition, speech forensics, and emotion recognition from speech. The new edition also features a new chapter on scalogram analysis of speech.

Moreover, in this edition each chapter has been restructured so that it is self-contained and can be read separately. Each chapter surveys the literature on one topic, explaining how wavelets are used in each work and then discussing the experimental results of the proposed method. Illustrative figures have also been added to explain the methodology of each work.

Table of Contents

Frontmatter
Chapter 1. Introduction
Abstract
As wavelets find wide application in different fields, especially within signal processing, this book provides a survey of wavelet analysis in different applications of speech processing. Many speech processing algorithms and techniques still lack robustness, which can be improved through the use of wavelet tools. Researchers and practitioners in speech technology will find valuable information in this monograph on the use of wavelets to strengthen both development and research in different applications of speech processing.
Mohamed Hesham Farouk
Chapter 2. Speech Production and Perception
Abstract
The main objective of research in speech processing is to find techniques for extracting features that robustly model a speech signal. Some of these features can be characterized by relatively simple models, while others may require more realistic models of both speech production and perception.
Mohamed Hesham Farouk
Chapter 3. Wavelets, Wavelet Filters, and Wavelet Transforms
Abstract
Spectral characteristics of speech are known to be particularly useful in describing a speech signal such that it can be efficiently reconstructed after coding or identified for recognition. Wavelets are considered one such efficient method for representing the spectrum of speech signals. Wavelets are used to model both the production and perception processes of speech. Wavelet-based features have proved successful in a wide range of practical speech processing applications.
Mohamed Hesham Farouk
Chapter 4. Spectral Analysis of Speech Signal and Pitch Estimation
Abstract
The wavelet transform (WT) provides a way to explore the spectral characteristics of non-stationary speech signals. Multiresolution analysis based on wavelet theory permits the introduction of the concepts of signal filtering with different bandwidths or frequency resolutions. As both time and frequency analysis can be conducted by WT, the tree structure of wavelet-packet (WP) analysis can be customized to match the critical bands of human hearing, giving better spectral estimation for speech signals than other methods. Wavelet-based pitch estimation assumes that the glottis closures are correlated with the maxima in the adjacent scales of the WT. This approach yields a more accurate estimate of the pitch period.
Mohamed Hesham Farouk
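The multiresolution idea in the Chapter 4 abstract can be sketched in a few lines. The example below is only an illustration under stated assumptions: it uses the Haar wavelet and a plain dyadic decomposition (not the customized wavelet-packet tree the abstract mentions), and the names `haar_step`, `haar_mra`, and `frame` are hypothetical. Each level splits the current approximation into a coarser approximation and a detail band, so every detail band covers roughly half the bandwidth of the one before it.

```python
import math

def haar_step(x):
    """One level of the orthonormal Haar DWT: returns (approximation, detail)."""
    root2 = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / root2 for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / root2 for i in range(0, len(x), 2)]
    return approx, detail

def haar_mra(x, levels):
    """Dyadic decomposition; returns [a_L, d_L, ..., d_1]."""
    if len(x) % (2 ** levels) != 0:
        raise ValueError("signal length must be a multiple of 2**levels")
    approx, details = list(x), []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    # Coarsest approximation first, then details from coarse to fine.
    return [approx] + details[::-1]

# Toy frame standing in for a short block of speech samples (assumption).
frame = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
bands = haar_mra(frame, levels=3)
band_energy = [sum(c * c for c in band) for band in bands]
```

Because the Haar transform used here is orthonormal, the band energies sum to the energy of the original frame, which is what makes per-band energies usable as a coarse spectral estimate.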
Chapter 5. Speech Detection and Separation
Abstract
Several methods used for speech detection fail when the signal-to-noise ratio (SNR) is low. Wavelet analysis has properties which can help in separating speech from other signals. Many works report better detection and separation performance using wavelet analysis than using other techniques. In addition, although segmenting speech into many classes is hard, the WT is well localized in the time-frequency domain, so the boundaries of speech segments can be readily detected.
Mohamed Hesham Farouk
Chapter 6. Speech Enhancement and Noise Suppression
Abstract
Wavelet analysis has been widely used for noise suppression in signals. The multiresolution properties of wavelets reflect the frequency resolution of the human ear. Therefore, WT can be adapted to distinguish noise in speech through its properties in the time and frequency domains.
Mohamed Hesham Farouk
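As a rough illustration of wavelet-domain noise suppression, the sketch below applies soft thresholding to the detail coefficients of a one-level Haar transform. This is a commonly used generic scheme and is not claimed to be the method surveyed in Chapter 6; the Haar wavelet, the single decomposition level, the hand-picked threshold, and all function names are assumptions made for the example.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_forward(x):
    """One-level orthonormal Haar analysis: (approximation, detail)."""
    approx = [(x[i] + x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    return approx, detail

def haar_inverse(approx, detail):
    """Inverse of haar_forward (perfect reconstruction)."""
    x = []
    for a, d in zip(approx, detail):
        x.append((a + d) / SQRT2)
        x.append((a - d) / SQRT2)
    return x

def soft_threshold(coeffs, threshold):
    """Shrink each coefficient toward zero by `threshold`."""
    return [math.copysign(max(abs(c) - threshold, 0.0), c) for c in coeffs]

def denoise(x, threshold):
    """Threshold only the detail band, then reconstruct."""
    approx, detail = haar_forward(x)
    return haar_inverse(approx, soft_threshold(detail, threshold))

# Piecewise-constant toy signal with a small ripple standing in for noise.
noisy = [1.0, 1.1, 1.0, 0.9, 5.0, 5.1, 5.0, 4.9]
cleaned = denoise(noisy, threshold=0.1)
```

With a threshold of zero the function returns the input unchanged; larger thresholds remove small, fast fluctuations while keeping the large step between the two signal levels.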
Chapter 7. Speech Recognition
Abstract
Wavelet analysis can improve speech recognition performance in several ways. First, it can be used to remove noise, so that the recognition process may perform better. Second, wavelet-based features can be added to other successful features to improve recognition performance. Third, wavelets can serve as activation functions in neural networks employed for speech recognition. A hybrid methodology may combine two or more of these approaches.
Mohamed Hesham Farouk
Chapter 8. Speaker Identification
Abstract
Mel-frequency cepstral coefficient (MFCC) features are widely used in speech recognition. However, MFCCs are less suitable for identifying a speaker, since speaker-specific cues lie in high-frequency regions, where the Mel scale becomes coarser. The speaker's individual information, which is nonuniformly distributed in the high-frequency bands, is equally important for speaker recognition. Accordingly, wavelet-based features are more appropriate.
Mohamed Hesham Farouk
Chapter 9. Emotion Recognition from Speech
Abstract
Like automatic speech recognition (ASR), emotion recognition can benefit from the merits of wavelet analysis. WT-based methodologies similar to those used in speech recognition may be followed. In particular, the literature reports that WP parameters are responsive to emotions. Many results also show that wavelet-based features improve emotion recognition.
Mohamed Hesham Farouk
Chapter 10. Speech Coding, Synthesis, and Compression
Abstract
WT-based analysis allows for the control of frequency resolution to closely match the response of the human auditory system. The inherent shaping of the wavelet synthesis filter and a controlled bit allocation to the wavelet coefficients help to minimize the perceptually significant noise due to the quantization error. Experimental results show that WT-based coders deliver superior quality to some audio standards when operating at the same bit rate and they deliver comparable quality to other codecs at lower bit rates. As a result, speech coding with WT can provide an efficient and flexible scheme for audio compression.
Mohamed Hesham Farouk
Chapter 11. Speech Quality Assessment
Abstract
Wavelet-packet analysis can be used to improve a perceptually based objective speech quality measure. In this measure, the critical bands of the auditory system may be approximated by a predefined wavelet-packet (PWP) tree structure. This PWP-based structure reduces the complexity of calculating such measures.
Mohamed Hesham Farouk
Chapter 12. Scalogram and Nonlinear Analysis of Speech
Abstract
The voice source generally interacts with the vocal tract in a nonlinear way, and this interaction may take place at the glottis during the periodic vibration of the vocal cords. The resulting excitation affects the lower-frequency components of the voice produced at the lips. In contrast, a turbulent sound source interacts in a way that influences the higher-frequency components. Wavelet decomposition can therefore explore such nonlinear behavior through multiresolution analysis (MRA). Nonlinear and chaotic components of a speech signal can be verified through scalogram analysis obtained from such MRA using the continuous wavelet transform (CWT). A scale index obtained from the CWT can confirm chaotic behavior even for highly periodic waveforms, as is the case for speech vowels.
Mohamed Hesham Farouk
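A scalogram of the kind referred to in the Chapter 12 abstract is the squared magnitude of the CWT over time and scale. The sketch below computes one by direct summation, assuming a real Mexican-hat mother wavelet, a 1/sqrt(scale) normalization, and a pure sinusoid as a stand-in for a voiced segment; these choices and all names are illustrative assumptions, and the scale index mentioned in the abstract is not computed here.

```python
import math

def mexican_hat(t):
    """Mexican-hat (Ricker) mother wavelet, without a normalizing constant."""
    return (1.0 - t * t) * math.exp(-0.5 * t * t)

def cwt(signal, scales):
    """Direct-sum CWT; returns rows[scale_index][time_index]."""
    n = len(signal)
    rows = []
    for s in scales:
        half = int(math.ceil(5 * s))  # the wavelet is negligible for |t| > 5*s
        kernel = [mexican_hat(k / s) / math.sqrt(s)
                  for k in range(-half, half + 1)]
        row = []
        for b in range(n):
            acc = 0.0
            for j, w in enumerate(kernel):
                i = b + j - half
                if 0 <= i < n:  # samples outside the signal count as zero
                    acc += signal[i] * w
            row.append(acc)
        rows.append(row)
    return rows

def scalogram(signal, scales):
    """Squared CWT coefficients, one row per scale."""
    return [[c * c for c in row] for row in cwt(signal, scales)]

# Sinusoid with a 16-sample period as a toy periodic waveform (assumption).
signal = [math.sin(2.0 * math.pi * n / 16.0) for n in range(256)]
scales = [1, 2, 4, 8, 16]
energy_per_scale = [sum(row) for row in scalogram(signal, scales)]
best_scale = scales[energy_per_scale.index(max(energy_per_scale))]
```

For a purely periodic input the energy concentrates around a single scale; a broader spread of energy across scales is the kind of pattern a scalogram-based analysis would inspect further.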
Chapter 13. Steganography, Forensics, and Security of Speech Signal
Abstract
Perfect-reconstruction wavelet filter banks can help in retrieving a hidden signal. In the wavelet domain, different techniques are applied to the wavelet coefficients to increase the hiding capacity and perceptual transparency. In general, steganography in the wavelet domain shows high hiding capacity and transparency.
Mohamed Hesham Farouk
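To make the role of perfect reconstruction concrete, here is a minimal sketch of hiding bits in wavelet coefficients. It assumes an integer Haar transform (the S-transform) and least-significant-bit embedding in the detail coefficients; this is one simple illustrative scheme, not a method attributed to the book, and the function names and sample values are hypothetical. Because the integer transform is exactly invertible, the receiver recovers the same coefficients and therefore the same bits.

```python
def int_haar_forward(x):
    """Integer Haar (S-transform): floor averages and pairwise differences."""
    s = [(x[i] + x[i + 1]) // 2 for i in range(0, len(x), 2)]
    d = [x[i] - x[i + 1] for i in range(0, len(x), 2)]
    return s, d

def int_haar_inverse(s, d):
    """Exact inverse of int_haar_forward for integer inputs."""
    x = []
    for si, di in zip(s, d):
        a = si + (di + 1) // 2
        x.extend([a, a - di])
    return x

def embed(cover, bits):
    """Write one bit into the LSB of each leading detail coefficient."""
    s, d = int_haar_forward(cover)
    if len(bits) > len(d):
        raise ValueError("not enough detail coefficients for the payload")
    stego_d = [(c & ~1) | b for c, b in zip(d, bits)] + d[len(bits):]
    return int_haar_inverse(s, stego_d)

def extract(stego, n_bits):
    """Read the payload back from the detail-coefficient LSBs."""
    _, d = int_haar_forward(stego)
    return [c & 1 for c in d[:n_bits]]

# Toy integer samples standing in for PCM speech (assumption).
cover = [100, 98, -20, -25, 7, 7, 0, 3]
payload = [1, 0, 1, 1]
stego = embed(cover, payload)
recovered = extract(stego, len(payload))
```

In this toy scheme each sample changes by at most one quantization step, which hints at the capacity versus transparency trade-off; a real system would also have to keep samples within the valid amplitude range.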
Chapter 14. Clinical Diagnosis and Assessment of Speech Pathology
Abstract
The WT coefficients of a normal voice signal differ markedly from those of a pathological one. This difference is distributed over all the speech frequency bands at different resolutions. Accordingly, WT is successfully used as a noninvasive method to diagnose vocal pathologies.
Mohamed Hesham Farouk
Backmatter
Metadata
Title
Application of Wavelets in Speech Processing
Author
Prof. Dr. Mohamed Hesham Farouk
Copyright year
2018
Electronic ISBN
978-3-319-69002-5
Print ISBN
978-3-319-69001-8
DOI
https://doi.org/10.1007/978-3-319-69002-5