Top

Wireless Personal Communications

Published in:

24-07-2017

Performance Evaluation of Silence-Feature Normalization Model using Cepstrum Features of Noise Signals

Authors: SangYeob Oh, Kyungyong Chung

Published in: Wireless Personal Communications | Issue 4/2018

Activate our intelligent search to find suitable subject content or patents.

search-config

AI-assisted search

Off

Abstract

Speech enhancement algorithms play an important role in speech signal processing. Over the past several decades, many algorithms have been studied for speech enhancement. A speech enhancement algorithm uses a noise removal method and a statistical model filter to analyze the speech signal in the frequency domain. Spectral subtraction and Wiener filters have been used as representative algorithms. These algorithms have excellent speech enhancement performance, but suffer from deterioration in performance due to specific noise or low signal-to-noise ratio (SNR) environments. In addition, according to estimations of erroneous noise, a noise existing in a voice signal is maintained so that a spectrum corresponding to a voice signal is distorted, or a frame corresponding to a voice signal cannot be retrieved, and voice recognition performance deteriorates. The problem of deterioration in speech recognition performance arises from the difference between speech recognition and training model. We use silence-feature normalization model as a methodology to improve the recognition rate resulting from the difference in the noisy environments. Conventional silence-feature normalization has a problem in that the silent part of the energy increases, which affects recognition performance due to unclear boundaries categorizing the voice. In this study, we use the cepstrum feature of the noise signals in the silence-feature normalization model to improve the performance of silence-feature normalization in a signal with a low SNR by setting a reference value for voiced and unvoiced classification. As a result of recognition rate confirmation, the recognition rates improve in performance, compared with other methods.

previous article Development of Platform-Based Knowledge Map Service to get Data Insights of R&D Institution on User-Interested Subjects

next article Comparative Study of Power Saving and Delay in LTE DRX, Directional-DRX and Hybrid-Directional DRX

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 390 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Maschinenbau + Werkstoffe

Jetzt Wissensvorsprung sichern!

inform now

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

über 102.000 Bücher
über 537 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Finance + Banking
Management + Führung
Marketing + Vertrieb
Maschinenbau + Werkstoffe
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

inform now

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 340 Zeitschriften

aus folgenden Fachgebieten:

Bauwesen + Immobilien
Business IT + Informatik
Finance + Banking
Management + Führung
Marketing + Vertrieb
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

inform now

Zoltan, T., Peter, M., Zoltan, T., & Tibor, F. (2005). Robust voice activity detection based on the entropy of noise-suppressed spectrum. In Proceedings of the international conference on speech communication and technology (pp. 245–248).

Ahn, C. S., & Oh, S. Y. (2012). Gaussian model optimization using configuration thread control in CHMM vocabulary recognition. The Journal of Digital Policy and Management, 10(7), 167–172.

Ahn, C. S., & Oh, S. Y. (2012). Echo noise robust HMM learning model using average estimator LMS algorithm. The Journal of Digital Policy and Management, 10(10), 277–282.

Shen, G., & Chung, H. Y. (2010). Cepstral distance and log-energy based silence feature normalization for robust speech recognition. The Journal of the Acoustical Society of Korea, 29(4), 278–285.

Wang, K. C., & Tsai, Y. H. (2008). Voice activity detection algorithm with low signal-to-noise ratios based on spectrum entropy. In Proceedings of the international symposium on universal communication (pp. 423–428).

Ahn, C. S., & Oh, S. Y. (2012). CHMM modeling using LMS algorithm for continuous speech recognition improvement. The Journal of Digital Policy and Management., 10(11), 377–382.

Rix, A. W., Beerends, J. G., Hollier, M. P., & Hekstra, A. P. (2001). Perceptual evaluation of speech quality (PESQ)—a new method for speech quality assessment of telephone networks and codecs. In Proceedings of the IEEE international conference acoustics, speech, and signal processing (pp. 749–752).

Park, J. S., & Ko, H. S. (2013). Robust speech endpoint detection in noisy environments for HRI. The Journal of the Acoustical Society of Korea, 32(2), 147–156.CrossRef

Yao, K. S., Visser, E., Kwon, O. W., & Lee, T. W. (2003). A speech processing front-end with eigenspace normalization for robust speech recognition in noisy automobile environments. In Proceedings of the international conference on speech communication and technology (pp. 9–12).

10.

Tai, C. F., & Hung, J. W. (2006). Silence energy normalization for robust speech recognition in additive noise environments. In Proceedings of the international conference on spoken language processing (pp. 2558–2561).

11.

Han, I. S., & Ahn, C. S. (2014). Robust speech detection using SEM and SFN. International Journal of Multimedia and Ubiquitous Engineering, 9(9), 61–68.CrossRef

12.

Rangachari, S., & Loizou, P. C. (2006). A noise-estimation algorithm for highly non-stationary environments. Speech Communication, 48(2), 220–231.CrossRef

13.

Chung, K., & Park, R. C. (2016). PHR open platform based smart health service using distributed object group framework. Cluster Computing, 19(1), 505–517.CrossRef

14.

Kim, J. C., & Chung, K. (2017). Depression index service using knowledge based crowdsourcing in smart health. Wireless Personal Communication, 93(1), 255–268.CrossRef

15.

Park, R. C., Jung, H., Chung, K., & Yoon, K. H. (2015). Picocell based telemedicine health service for human UX/UI. Multimedia Tools and Applications, 74(7), 2519–2534.CrossRef

16.

Choi, G. K., & Kim, S. H. (2009). Voice activity detection method using psycho-acoustic model based on speech energy maximization in noisy environments. The Journal of the Acoustical Society of Korea, 28(5), 447–453.

17.

Chung, K., & Oh, S. Y. (2016). Voice activity detection using improvement unvoiced feature normalization process in noisy environment. Wireless Personal Communications, 89(3), 747–759.CrossRef

18.

Oh, S. Y., & Chung, K. Y. (2014). Target speech feature extraction using non-parametric correlation coefficient. Cluster Computing, 17(3), 893–899.CrossRef

19.

Oh, S. Y., & Chung, K. Y. (2014). Improvement of speech detection using ERB feature extraction. Wireless Personal Communications, 79(4), 2439–2451.CrossRef

20.

Pearce, D., Hirsch, H., & Deutschland Gmbh, E. E. (2000). The aurora experimental framework for the performance evaluation of speech recognition systems under noisy conditions. ISCA ITRW ASR2000 (pp. 29–32).

21.

Zhu, W. Z., & Shaughnessy, D. O. (2005). Log energy dynamic range normalization for robust for robust speech recognition. In Proceedings of the international conference on acoustics, speech, and signal (pp. 245–248).

22.

Jung, H., & Chung, K. Y. (2014). Discovery of automotive design paradigm using relevance feedback. Personal and Ubiquitous Computing, 18(6), 1363–1372.CrossRef

23.

Chung, K. Y., Na, Y., & Lee, J. H. (2013). Interactive design recommendation using sensor based smart wear and weather webbot. Wireless Personal Communications, 73(2), 243–256.CrossRef

24.

Chung, K., & Park, R. C. (2016). P2P cloud network services for IoT based disaster situations information. Peer-to-Peer Networking and Applications, 9(3), 566–577.CrossRef

25.

Jung, H., Yoo, H., & Chung, K. (2016). Associative context mining for ontology-driven hidden knowledge discovery. Cluster Computing, 19(4), 2261–2271.CrossRef

26.

Oh, S. Y., & Chung, K. (2016). Vocabulary optimization process using similar phoneme recognition and feature extraction. Cluster Computing, 19(3), 1683–1690.CrossRef

27.

Hu, Y., & Loizou, P. C. (2008). Evaluation of objective quality measures for speech enhancement. IEEE Transactions on Audio, Speech and Language Processing, 16(1), 229–238.CrossRef

28.

Kim, J. C., Jung, H., Kim, S. H., & Chung, K. (2016). Slope based intelligent 3D disaster simulation using physics engine. Wireless Personal Communications, 86(1), 183–199.CrossRef

29.

Chung, K., Kim, J. C., & Park, R. C. (2016). Knowledge-based health service considering user convenience using hybrid Wi-Fi P2P. Information Technology and Management, 17(1), 67–80.CrossRef

30.

Jung, H., & Chung, K. (2016). Knowledge based dietary nutrition recommendation for obesity management. Information Technology and Management, 17(1), 29–42.CrossRef

31.

Kim, S. H., & Chung, K. (2016). Emergency situation monitoring service using context motion tracking of chronic disease Patients. Cluster Computing, 18(2), 747–759.CrossRef

32.

Jung, H., & Chung, K. (2015). Ontology-driven slope modeling for disaster management service. Cluster Computing, 18(2), 677–692.CrossRef

33.

Yoo, H., & Chung, K. (2017). PHR based diabetes index service model using life behavior analysis. Wireless Personal Communications, 93(1), 161–174.CrossRef

34.

Kim, K., Hong, M., Chung, K., & Oh, S. Y. (2015). Estimating unreliable objects and system reliability in P2P network. Peer-to-Peer Networking and Applications, 8(4), 610–619.CrossRef

35.

Chung, K., & Oh, S. Y. (2015). Improvement of speech signal extraction method using detection filter of energy spectrum entropy. Cluster Computing, 18(2), 629–635.CrossRef

Title: Performance Evaluation of Silence-Feature Normalization Model using Cepstrum Features of Noise Signals
Authors: SangYeob Oh
Kyungyong Chung
Publication date: 24-07-2017
Publisher: Springer US
Published in: Wireless Personal Communications / Issue 4/2018
Print ISSN: 0929-6212
Electronic ISSN: 1572-834X
DOI: https://doi.org/10.1007/s11277-017-4645-x

Springer Professional

Abstract

Please log in to get access to your license.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Technik"

Springer Professional "Wirtschaft+Technik"

Springer Professional "Wirtschaft"

Other articles of this Issue 4/2018

Modified Multiple Feedback QR Aided Successive Interference Cancellation Algorithm for Large MIMO Detection

Advanced Primary–Backup Platform with Container-Based Automatic Deployment for Fault-Tolerant Systems

Model of Knowledge-Based Process Management System Using Big Data in the Wireless Communication Environment

Non-intrusive Forensic Detection Method Using DSWT with Reduced Feature Set for Copy-Move Image Tampering

Modified Binary Multiplier Architecture to Achieve Reduced Latency and Hardware Utilization

Adaptive Spatiotemporal Feature Extraction and Dynamic Combining Methods for Selective Visual Attention System