Published in: Wireless Personal Communications 3/2023

10-09-2022

Analysis of Optimized Spectral Subtraction Method for Single Channel Speech Enhancement

Authors: Monika Gupta, R. K. Singh, Sachin Singh



Abstract

Speech is the primary medium of personal communication; however, ambient noise generally impairs the quality and intelligibility of the speech signal. The distorted speech signal must therefore be improved in both quality and intelligibility. Considerable effort in speech processing has gone into enhancement techniques that restore speech signals by reducing interfering noise. This work presents a critical analysis of a single-channel speech enhancement technique that performs noise reduction through spectral subtraction based on minimum statistics. Minimum statistics estimates the power spectrum of a non-stationary noise signal by tracking the smallest value of the smoothed power spectrum of the noisy speech signal, which avoids the problem of detecting speech activity. The performance of the spectral subtraction method is evaluated over a wide range of noise types at varying levels using single-channel speech data. The estimator is used to find the optimal value of the method's parameter and to improve the algorithm so that it is better suited to voice communication. The system can be implemented in MATLAB and is validated against a variety of performance measures, including signal-to-noise ratio improvement (SNRI) and spectral distortion (SD). The approach provides effective speech enhancement as measured by SNRI and SD. The paper proposes a comparatively new method, named Spectral Statistics Based on Minimum Statistics (SSBMS), which follows transient noise and provides a better response in speech enhancement.
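
The noise-tracking idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' MATLAB implementation: the function names, the smoothing constant, the window length, the bias factor and the spectral floor are all assumptions chosen for illustration, and the input is a toy sequence of per-frame power values rather than a real short-time spectrum.

```python
# Sketch only: function names and parameter values are illustrative assumptions,
# not taken from the paper.

def smooth_power(frames, alpha=0.85):
    """First-order recursive smoothing of per-bin power across frames."""
    smoothed = []
    prev = list(frames[0])  # initialise with the first frame
    for frame in frames:
        prev = [alpha * p + (1.0 - alpha) * y for p, y in zip(prev, frame)]
        smoothed.append(prev)
    return smoothed


def track_noise_minimum(smoothed, window=12, bias=1.0):
    """Per-bin noise power = bias * minimum of the smoothed power over the
    most recent `window` frames. No speech activity decision is needed."""
    noise = []
    for t in range(len(smoothed)):
        recent = smoothed[max(0, t - window + 1):t + 1]
        noise.append([bias * min(bin_history) for bin_history in zip(*recent)])
    return noise


def spectral_subtract(frames, noise, oversub=1.0, floor=0.01):
    """Power spectral subtraction with a spectral floor to avoid negative power."""
    return [
        [max(y - oversub * n, floor * y) for y, n in zip(frame, noise_frame)]
        for frame, noise_frame in zip(frames, noise)
    ]


# Toy input: one frequency bin, noise power 1.0, with a speech burst of
# power 9.0 added in frames 10..14 (so the noisy power there is 10.0).
noisy = [[1.0]] * 10 + [[10.0]] * 5 + [[1.0]] * 10
noise_est = track_noise_minimum(smooth_power(noisy))
enhanced = spectral_subtract(noisy, noise_est)
print(noise_est[12], enhanced[12])
```

In this toy case the minimum over the window still contains noise-only frames during the burst, so the noise estimate is not pulled up by the speech energy. If the window is shorter than a speech burst plus the decay of the smoother, speech energy leaks into the noise estimate, which is one reason the window length and smoothing constant are natural parameters to tune.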

Metadata
Title: Analysis of Optimized Spectral Subtraction Method for Single Channel Speech Enhancement
Authors: Monika Gupta, R. K. Singh, Sachin Singh
Publication date: 10-09-2022
Publisher: Springer US
Published in: Wireless Personal Communications, Issue 3/2023
Print ISSN: 0929-6212
Electronic ISSN: 1572-834X
DOI: https://doi.org/10.1007/s11277-022-10039-y
