
2018 | Book


Application of Wavelets in Speech Processing

Written by: Prof. Dr. Mohamed Hesham Farouk

Publisher: Springer International Publishing

Book series: SpringerBriefs in Electrical and Computer Engineering

Included in: Springer Professional "Wirtschaft+Technik", Springer Professional "Technik", Springer Professional "Wirtschaft"


About this book

This new edition provides an updated and enhanced survey of the use of wavelet analysis in a range of speech processing applications. The author presents recent developments in topics such as speech enhancement, noise suppression, spectral analysis of speech signals, speech quality assessment, speech recognition, speech forensics, and emotion recognition from speech. The new edition also features a new chapter on scalogram analysis of speech.

Moreover, in this edition each chapter has been restructured to be self-contained, so that it can be read separately. Each chapter surveys the literature on one topic, explaining how wavelets are used in each work and then discussing the experimental results of the proposed method. Illustrative figures have also been added to explain the methodology of each work.

Table of contents

Frontmatter

Chapter 1. Introduction

Abstract

As wavelets find wide application in many fields, especially in signal processing, this book surveys the use of wavelet analysis in different applications of speech processing. Many speech processing algorithms and techniques still lack robustness, which wavelet tools can help to improve. Researchers and practitioners in speech technology will find valuable information in this monograph on the use of wavelets to strengthen both development and research in different applications of speech processing.

Mohamed Hesham Farouk

Chapter 2. Speech Production and Perception

Abstract

The main objective of research in speech processing is to find techniques for extracting features that robustly model a speech signal. Some of these features can be characterized by relatively simple models, while others may require more realistic models of both speech production and perception.

Mohamed Hesham Farouk

Chapter 3. Wavelets, Wavelet Filters, and Wavelet Transforms

Abstract

Spectral characteristics of speech are known to be particularly useful in describing a speech signal such that it can be efficiently reconstructed after coding or identified for recognition. Wavelets are considered one such efficient method for representing the spectrum of speech signals. They are used to model both the production and the perception of speech. Wavelet-based features have proved successful across a wide range of practical speech-processing applications.

Mohamed Hesham Farouk

Chapter 4. Spectral Analysis of Speech Signal and Pitch Estimation

Abstract

The wavelet transform (WT) provides a way to explore the spectral characteristics of non-stationary speech signals. Multiresolution analysis based on wavelet theory introduces the concept of filtering a signal with different bandwidths or frequency resolutions. Because the WT supports analysis in both time and frequency, the tree structure of wavelet packet (WP) analysis can be customized to match the critical bands of human hearing, giving better spectral estimation for speech signals than other methods. Wavelet-based pitch estimation assumes that the glottal closures are correlated with the maxima in adjacent scales of the WT. This approach yields a more accurate estimate of the pitch period.

Mohamed Hesham Farouk
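
The multiresolution idea in the Chapter 4 abstract can be illustrated with a minimal sketch. It is not taken from the book: it assumes the Haar wavelet for simplicity, an even number of samples at every level, and hypothetical function names (`haar_step`, `haar_dwt`, `band_edges`). Each level halves the bandwidth of the approximation, so the sub-bands have octave-spaced frequency resolution.

```python
import math

def haar_step(x):
    """One level of the orthonormal Haar DWT: returns (approximation, detail).
    A trailing unpaired sample, if any, is dropped."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    return approx, detail

def haar_dwt(x, levels):
    """Multilevel DWT: returns [d1, d2, ..., dL, aL], finest detail first."""
    bands = []
    approx = list(x)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        bands.append(detail)
    bands.append(approx)
    return bands

def band_edges(fs, levels):
    """Nominal frequency range in Hz of each band returned by haar_dwt."""
    edges = [(fs / 2 ** (j + 1), fs / 2 ** j) for j in range(1, levels + 1)]
    edges.append((0.0, fs / 2 ** (levels + 1)))
    return edges
```

For a signal sampled at an assumed 8 kHz, `band_edges(8000, 3)` gives nominal bands of 2000 to 4000 Hz, 1000 to 2000 Hz, 500 to 1000 Hz, and 0 to 500 Hz. A wavelet packet tree would additionally split the detail bands, which is what allows the band layout to be matched to the critical bands of hearing.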

Chapter 5. Speech Detection and Separation

Abstract

Several methods used for speech detection tend to fail when the signal-to-noise ratio (SNR) is low. Wavelet analysis has properties that can help in separating speech from other signals. Many works report better detection and separation performance with wavelet analysis than with other techniques. Moreover, although segmenting speech into many classes is difficult, the WT is well localized in the time-frequency domain, so the boundaries of speech segments can be readily detected.

Mohamed Hesham Farouk

Chapter 6. Speech Enhancement and Noise Suppression

Abstract

Wavelet analysis has been widely used for noise suppression in signals. The multiresolution properties of wavelets reflect the frequency resolution of the human ear. Therefore, WT can be adapted to distinguish noise in speech through its properties in the time and frequency domains.

Mohamed Hesham Farouk
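
As an illustration of the noise suppression described in the Chapter 6 abstract, the following is a minimal sketch of wavelet soft-thresholding. It is not the book's method: the Haar wavelet, the fixed threshold, and the function names are assumptions, and the input length must be divisible by `2 ** levels`.

```python
import math

def haar_forward(x):
    """One level of the orthonormal Haar DWT (even-length input)."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) * s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) * s for i in range(0, len(x), 2)]
    return approx, detail

def haar_inverse(approx, detail):
    """Exact inverse of haar_forward."""
    s = 1.0 / math.sqrt(2.0)
    x = []
    for a, d in zip(approx, detail):
        x.append((a + d) * s)
        x.append((a - d) * s)
    return x

def soft_threshold(coeffs, t):
    """Shrink each coefficient toward zero by t; zero those below t in magnitude."""
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]

def denoise(x, levels, threshold):
    """Threshold the detail coefficients at every level, then reconstruct."""
    details = []
    approx = list(x)
    for _ in range(levels):
        approx, d = haar_forward(approx)
        details.append(soft_threshold(d, threshold))
    for d in reversed(details):
        approx = haar_inverse(approx, d)
    return approx
```

With a threshold of zero the function returns the input unchanged (up to rounding), since the transform is perfectly invertible. In practice the threshold is usually chosen per level from an estimate of the noise variance rather than fixed by hand.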

Chapter 7. Speech Recognition

Abstract

Wavelet analysis can improve speech recognition performance in several ways. First, it can be used to remove noise, so that the recognition process may perform better. Second, wavelet-based features can be added to other successful features to improve recognition performance. Third, wavelets can serve as an activation function in neural networks employed for speech recognition. A hybrid methodology may combine several of these approaches.

Mohamed Hesham Farouk
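
The third approach in the Chapter 7 abstract, a wavelet used as a neural-network activation function, can be sketched as a single unit. This is an illustrative assumption, not the architecture of any work surveyed in the book: it uses the Mexican hat wavelet, and the names `mexican_hat` and `wavelon` and their parameters are hypothetical.

```python
import math

def mexican_hat(t):
    """Mexican hat (Ricker) wavelet, unnormalized: (1 - t^2) * exp(-t^2 / 2)."""
    return (1.0 - t * t) * math.exp(-0.5 * t * t)

def wavelon(inputs, weights, translation, dilation):
    """A single wavelet neuron: the weighted sum of the inputs is shifted by
    `translation`, scaled by `dilation`, and passed through the wavelet."""
    z = sum(w * x for w, x in zip(weights, inputs))
    return mexican_hat((z - translation) / dilation)
```

Unlike a sigmoid, this activation is localized: the unit responds only when the weighted input falls near `translation`, and `dilation` controls how wide that region is.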

Chapter 8. Speaker Identification

Abstract

Mel-frequency cepstral coefficient (MFCC) features are widely used in speech recognition. However, MFCCs are less suitable for identifying a speaker, since speaker-specific cues lie in the high-frequency regions, where the Mel scale becomes coarser. The speaker's individual information, which is nonuniformly distributed in the high-frequency bands, is equally important for speaker recognition. Accordingly, wavelet-based features are more appropriate.

Mohamed Hesham Farouk

Chapter 9. Emotion Recognition from Speech

Abstract

Like automatic speech recognition (ASR), emotion recognition can benefit from the merits of wavelet analysis. WT-based methodologies similar to those used in speech recognition may be followed. In particular, the literature reports that wavelet packet (WP) parameters are responsive to emotions. Many results also show that wavelet-based features improve emotion recognition.

Mohamed Hesham Farouk

Chapter 10. Speech Coding, Synthesis, and Compression

Abstract

WT-based analysis allows for the control of frequency resolution to closely match the response of the human auditory system. The inherent shaping of the wavelet synthesis filter and a controlled bit allocation to the wavelet coefficients help to minimize the perceptually significant noise due to the quantization error. Experimental results show that WT-based coders deliver superior quality to some audio standards when operating at the same bit rate and they deliver comparable quality to other codecs at lower bit rates. As a result, speech coding with WT can provide an efficient and flexible scheme for audio compression.

Mohamed Hesham Farouk

Chapter 11. Speech Quality Assessment

Abstract

Wavelet-packet analysis can be used to improve a perceptually based objective measure of speech quality. In this measure, the critical bands of the auditory system may be approximated by a predefined wavelet-packet (PWP) tree structure. This PWP-based structure reduces the complexity of calculating such measures.

Mohamed Hesham Farouk

Chapter 12. Scalogram and Nonlinear Analysis of Speech

Abstract

The voice source generally interacts with the vocal tract in a nonlinear way, and this interaction may take place at the glottis during the periodic vibration of the vocal cords. The resulting excitation affects the lower-frequency components of the voice produced at the lips. In contrast, a turbulent sound source interacts in a way that influences the higher-frequency components. Wavelet decomposition can therefore explore such nonlinear behavior through multiresolution analysis (MRA). Nonlinear and chaotic components of a speech signal can be verified through scalogram analysis obtained from such an MRA using the continuous wavelet transform (CWT). A scale index obtained from the CWT can confirm chaotic behavior even for highly periodic waveforms, as is the case for vowels.

Mohamed Hesham Farouk
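
A scalogram of the kind mentioned in the Chapter 12 abstract is the squared magnitude of the CWT over time and scale. The sketch below is a hedged illustration only: it assumes a real Mexican hat wavelet, direct summation with zero padding at the edges, and hypothetical function names. It does not compute the scale index referred to in the abstract.

```python
import math

def mexican_hat(t):
    """Mexican hat wavelet, unnormalized: (1 - t^2) * exp(-t^2 / 2)."""
    return (1.0 - t * t) * math.exp(-0.5 * t * t)

def cwt_row(x, scale):
    """CWT coefficients of x at one scale, with 1/sqrt(scale) normalization.
    Samples outside the signal are treated as zero."""
    half = int(5 * scale)  # the wavelet is negligible beyond 5 scale units
    kernel = [mexican_hat(k / scale) / math.sqrt(scale)
              for k in range(-half, half + 1)]
    row = []
    for n in range(len(x)):
        acc = 0.0
        for j, w in enumerate(kernel):
            m = n + j - half  # signal index aligned with kernel tap j
            if 0 <= m < len(x):
                acc += x[m] * w
        row.append(acc)
    return row

def scalogram(x, scales):
    """Squared CWT coefficients: one row per scale, one column per sample."""
    return [[c * c for c in cwt_row(x, s)] for s in scales]

def energy_per_scale(x, scales):
    """Total scalogram energy at each scale."""
    return [sum(row) for row in scalogram(x, scales)]
```

For a pure sinusoid the energy concentrates around a single scale; for example, a sinusoid with a period of 32 samples analyzed at scales 2, 4, 8, and 16 has most of its energy at scale 8. A signal with broadband or chaotic components spreads its energy over many scales instead.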

Chapter 13. Steganography, Forensics, and Security of Speech Signal

Abstract

Wavelet filter banks for perfect reconstruction can help in retrieving a hidden signal. In the wavelet domain, different techniques are applied on the wavelet coefficients to increase the hiding capacity and perceptual transparency. In general, steganography in wavelet domain shows high hiding capacity and transparency.

Mohamed Hesham Farouk
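
The role of perfect reconstruction in the Chapter 13 abstract can be illustrated with a minimal sketch of wavelet-domain hiding. It is an assumption-laden example, not a scheme from the book: it uses a one-level integer Haar (lifting) transform, hides one bit in the least significant bit of each detail coefficient, ignores clipping to a fixed sample width, and uses hypothetical function names.

```python
def haar_int_forward(samples):
    """Integer Haar (lifting) transform of an even-length list of ints."""
    approx, detail = [], []
    for i in range(0, len(samples), 2):
        d = samples[i] - samples[i + 1]
        approx.append(samples[i + 1] + (d >> 1))  # floor of the pair average
        detail.append(d)
    return approx, detail

def haar_int_inverse(approx, detail):
    """Exact integer inverse of haar_int_forward."""
    samples = []
    for s, d in zip(approx, detail):
        b = s - (d >> 1)
        samples.extend([d + b, b])
    return samples

def embed_bits(cover, bits):
    """Hide bits (0 or 1) in the LSBs of the detail coefficients."""
    approx, detail = haar_int_forward(cover)
    if len(bits) > len(detail):
        raise ValueError("payload longer than the number of detail coefficients")
    for i, bit in enumerate(bits):
        detail[i] = (detail[i] & ~1) | bit
    return haar_int_inverse(approx, detail)

def extract_bits(stego, count):
    """Recover the first `count` hidden bits from the stego samples."""
    _, detail = haar_int_forward(stego)
    return [d & 1 for d in detail[:count]]
```

Because the integer transform is exactly invertible, re-analyzing the stego signal returns the modified coefficients unchanged, so the hidden bits are recovered without error. Each stego sample differs from the cover by at most 1, which is what keeps the embedding perceptually transparent in this toy setting.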

Chapter 14. Clinical Diagnosis and Assessment of Speech Pathology

Abstract

The WT coefficients of a normal voice signal differ markedly from those of a pathological one. This difference is distributed over all the speech frequency bands at different resolutions. Accordingly, the WT is successfully used as a noninvasive method for diagnosing vocal pathologies.

Mohamed Hesham Farouk

Backmatter

Title: Application of Wavelets in Speech Processing
Written by: Prof. Dr. Mohamed Hesham Farouk
Publisher: Springer International Publishing
Electronic ISBN: 978-3-319-69002-5
Print ISBN: 978-3-319-69001-8
DOI: https://doi.org/10.1007/978-3-319-69002-5
