2016 | OriginalPaper | Chapter

Speech Denoising Based on Sparse Representation Algorithm

Authors: Yan Zhou, Heming Zhao, Xueqin Chen, Tao Liu, Di Wu, Li Shang

Published in: Intelligent Computing Theories and Application

Publisher: Springer International Publishing

Abstract

This paper proposes a speech denoising method for corrupted speech signals based on the K-SVD sparse representation algorithm. A redundant DCT dictionary serves as the initial dictionary. To capture the time-frequency characteristics of speech clearly, spectrogram patches are used as training samples for the sparse decomposition. Because the K-SVD algorithm is limited to handling small spectrograms, the approach has to be extended to spectrograms of arbitrary size. To this end, a global spectrogram prior is defined that forces sparsity over the patches at every location in the spectrogram. The K-SVD algorithm then uses a greedy algorithm in an update scheme that alternates between the dictionary and the sparse coefficients, which yields a dictionary that describes the structure of speech effectively. Finally, the corrupted speech signal is sparsely decomposed over this redundant dictionary, and the resulting sparse coefficients are used to reconstruct the noise-free spectrogram, so that signal and noise are separated. The proposed K-SVD method is simple and effective, and is well suited to processing corrupted speech signals. Simulation experiments show that its performance is stable and that white noise is separated effectively. Its performance also surpasses that of the redundant DCT dictionary method and the Gabor dictionary method. In short, the K-SVD algorithm offers a novel alternative for denoising speech signals.
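
The steps summarized in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes orthogonal matching pursuit (OMP) as the greedy sparse-coding step, an overcomplete 2-D DCT as the initial dictionary, and vectorized spectrogram patches stored as the columns of a matrix `Y`. The function names (`dct_dictionary`, `omp`, `ksvd`), the patch size, the number of atoms, and the sparsity level are hypothetical choices made for the example. Patch extraction from a real spectrogram and the averaging of overlapping patches implied by the global prior are omitted.

```python
import numpy as np


def dct_dictionary(patch_size, atoms_per_dim):
    """Overcomplete 2-D DCT dictionary, shape (patch_size**2, atoms_per_dim**2)."""
    n = np.arange(patch_size)
    d1 = np.zeros((patch_size, atoms_per_dim))
    for k in range(atoms_per_dim):
        atom = np.cos(n * k * np.pi / atoms_per_dim)
        if k > 0:
            atom = atom - atom.mean()  # remove the DC part from non-constant atoms
        d1[:, k] = atom / np.linalg.norm(atom)
    # The Kronecker product of unit-norm 1-D atoms gives unit-norm 2-D atoms.
    return np.kron(d1, d1)


def omp(D, y, n_nonzero):
    """Greedy sparse coding: pick at most n_nonzero atoms of D to approximate y."""
    residual = y.copy()
    support = []
    coef = np.zeros(0)
    for _ in range(n_nonzero):
        k = int(np.argmax(np.abs(D.T @ residual)))
        if k in support:  # no new atom improves the fit
            break
        support.append(k)
        # Re-fit all selected coefficients by least squares.
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    x = np.zeros(D.shape[1])
    x[support] = coef
    return x


def ksvd(Y, D, n_nonzero=3, n_iter=5):
    """Alternate between sparse coding (OMP) and atom-by-atom dictionary updates."""
    D = D.copy()
    for _ in range(n_iter):
        # Sparse-coding stage: one coefficient column per training patch.
        X = np.column_stack([omp(D, y, n_nonzero) for y in Y.T])
        # Dictionary-update stage: refit each atom on the patches that use it.
        for k in range(D.shape[1]):
            users = np.nonzero(X[k])[0]
            if users.size == 0:
                continue
            X[k, users] = 0.0
            error = Y[:, users] - D @ X[:, users]  # residual without atom k
            U, s, Vt = np.linalg.svd(error, full_matrices=False)
            D[:, k] = U[:, 0]            # best rank-1 fit gives the new atom
            X[k, users] = s[0] * Vt[0]   # and its updated coefficients
    return D, X


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    D0 = dct_dictionary(patch_size=4, atoms_per_dim=6)
    # Random data as a stand-in for vectorized 4x4 noisy spectrogram patches.
    Y = rng.standard_normal((16, 200))
    D, X = ksvd(Y, D0)
    denoised_patches = D @ X  # sparse reconstruction of each patch
    print(denoised_patches.shape)
```

In a full pipeline, the reconstructed patches would be placed back at their spectrogram locations and averaged where they overlap before the spectrogram is inverted to a waveform. The sparsity level is the main tuning knob in this sketch: too few atoms per patch removes speech detail, while too many lets noise back in.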

Metadata
Title: Speech Denoising Based on Sparse Representation Algorithm
Authors: Yan Zhou, Heming Zhao, Xueqin Chen, Tao Liu, Di Wu, Li Shang
Copyright Year: 2016
DOI: https://doi.org/10.1007/978-3-319-42294-7_17
