
2016 | OriginalPaper | Book chapter

Speech Denoising Based on Sparse Representation Algorithm

Written by: Yan Zhou, Heming Zhao, Xueqin Chen, Tao Liu, Di Wu, Li Shang

Published in: Intelligent Computing Theories and Application

Publisher: Springer International Publishing


Abstract

This paper proposes a speech denoising method for corrupted speech signals based on the K-SVD sparse representation algorithm. A redundant (overcomplete) DCT dictionary serves as the initial dictionary. To capture the time-frequency characteristics of the speech signal clearly, spectrogram patches are used as training samples for the sparse decomposition. Because the K-SVD algorithm can only handle small spectrograms, the approach has to be extended to spectrograms of arbitrary size; to this end, a global spectrogram prior is defined that forces sparsity over the patches at every location in the spectrogram. K-SVD then uses a greedy algorithm in an update scheme that alternates between the dictionary and the sparse coefficients, which yields a dictionary that describes the structure of speech effectively. Finally, the corrupted speech signal is sparsely decomposed over this redundant dictionary, and the resulting sparse coefficients are used to reconstruct the noise-free spectrogram, thereby separating the signal from the noise. The proposed K-SVD method is simple, effective, and well suited to processing corrupted speech signals. Simulation experiments show that its performance is stable and that white noise is effectively separated from the speech. Its performance also surpasses that of the redundant DCT dictionary method and the Gabor dictionary method. In short, the K-SVD algorithm provides a novel alternative for denoising speech signals.
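
The pipeline summarized in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the patch size `p`, the dictionary size `atoms_1d`, the atom limit, and the stopping threshold (a factor of 1.15 times the noise level, a heuristic borrowed from K-SVD image denoising practice) are all assumptions chosen for illustration. The sketch assumes the input is a real-valued spectrogram array corrupted by white noise of known standard deviation `sigma`, omits the computation of the spectrogram from the waveform, and approximates the global prior by simply averaging overlapping denoised patches.

```python
import numpy as np

def dct_dictionary(p, atoms_1d):
    """Overcomplete 2-D DCT dictionary for p x p patches; columns are unit-norm atoms."""
    n = np.arange(p)[:, None]
    k = np.arange(atoms_1d)[None, :]
    d1 = np.cos(np.pi * n * k / atoms_1d)
    d1[:, 1:] -= d1[:, 1:].mean(axis=0)   # only the first 1-D atom keeps a DC component
    d1 /= np.linalg.norm(d1, axis=0)
    return np.kron(d1, d1)                # shape (p*p, atoms_1d**2)

def omp(D, x, max_atoms, tol):
    """Greedy orthogonal matching pursuit: add atoms until the residual norm <= tol."""
    residual, support, coef = x.copy(), [], np.zeros(0)
    while len(support) < max_atoms and np.linalg.norm(residual) > tol:
        j = int(np.argmax(np.abs(D.T @ residual)))
        if j in support:                  # no further progress possible
            break
        support.append(j)
        coef, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ coef
    a = np.zeros(D.shape[1])
    a[np.array(support, dtype=int)] = coef
    return a

def ksvd(X, D, max_atoms, tol, n_iter):
    """Alternate between sparse coding (OMP) and rank-1 SVD updates of each atom."""
    D = D.copy()
    for _ in range(n_iter):
        A = np.column_stack([omp(D, x, max_atoms, tol) for x in X.T])
        for k in range(D.shape[1]):
            users = np.nonzero(A[k])[0]   # patches that currently use atom k
            if users.size == 0:
                continue
            A[k, users] = 0.0
            E = X[:, users] - D @ A[:, users]   # error without atom k's contribution
            U, s, Vt = np.linalg.svd(E, full_matrices=False)
            D[:, k] = U[:, 0]
            A[k, users] = s[0] * Vt[0]
    return D

def denoise_spectrogram(S, sigma, p=6, atoms_1d=9, n_iter=2):
    """Denoise a spectrogram patch-wise and average the overlapping estimates."""
    S = np.asarray(S, dtype=float)
    rows, cols = S.shape
    idx = [(i, j) for i in range(rows - p + 1) for j in range(cols - p + 1)]
    X = np.column_stack([S[i:i + p, j:j + p].ravel() for i, j in idx])
    tol = 1.15 * sigma * p                # assumed heuristic: C * sigma * sqrt(p*p)
    max_atoms = (p * p) // 2              # assumed cap on atoms per patch
    D = ksvd(X, dct_dictionary(p, atoms_1d), max_atoms, tol, n_iter)
    acc, cnt = np.zeros_like(S), np.zeros_like(S)
    for (i, j), x in zip(idx, X.T):
        acc[i:i + p, j:j + p] += (D @ omp(D, x, max_atoms, tol)).reshape(p, p)
        cnt[i:i + p, j:j + p] += 1
    return acc / cnt
```

Calling `denoise_spectrogram(S, sigma=0.1)` on a noisy array `S` returns an array of the same shape. Training on all overlapping patches of the noisy input itself keeps the sketch short; training on clean speech or on a subsample of patches would be equally plausible choices.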


Metadata
Title
Speech Denoising Based on Sparse Representation Algorithm
Written by
Yan Zhou
Heming Zhao
Xueqin Chen
Tao Liu
Di Wu
Li Shang
Copyright year
2016
DOI
https://doi.org/10.1007/978-3-319-42294-7_17