
Cluster Computing

01-09-2014

Target speech feature extraction using non-parametric correlation coefficient

Authors: Sang Yeob Oh, Kyung-Yong Chung

Published in: Cluster Computing | Issue 3/2014


Abstract

Speech recognition systems for automobiles have several weaknesses, including recognition failures caused by environmental noise from inside and outside the car and by other voices mixed into the input. This paper therefore presents a technique for extracting only the selected target voice from an input sound that mixes voices and noise. The method builds a correlation map of auditory elements from the similarity between channels and the continuity over time, and extracts speech features using a non-parametric correlation coefficient. The proposed method was validated by showing that the average separation distortion decreased by 0.8630 dB. Selective feature extraction using cross-correlation performed well, but overall, selective feature extraction using a non-parametric correlation performed better.
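The abstract does not say which non-parametric correlation coefficient the authors use, so the following is only a minimal sketch under a stated assumption: it uses Spearman's rank correlation, one common non-parametric coefficient, and compares adjacent channels of a hypothetical array of per-channel envelopes. The function names (`rank_with_ties`, `pearson`, `spearman`, `adjacent_channel_correlation`) and the channel layout are illustrative and are not taken from the paper.

```python
import numpy as np


def rank_with_ties(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="mergesort")
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(1, len(values) + 1)
    for v in np.unique(values):
        mask = values == v
        ranks[mask] = ranks[mask].mean()
    return ranks


def pearson(x, y):
    """Pearson (parametric) correlation; returns 0.0 for a constant input."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    denom = np.sqrt((xc ** 2).sum() * (yc ** 2).sum())
    return 0.0 if denom == 0 else float((xc * yc).sum() / denom)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks.

    ASSUMPTION: stands in for the paper's unspecified non-parametric coefficient.
    """
    return pearson(rank_with_ties(x), rank_with_ties(y))


def adjacent_channel_correlation(channels):
    """Correlate each channel with the next one.

    channels: hypothetical 2-D array of shape (n_channels, n_frames),
    e.g. per-channel envelopes from a filterbank (an assumption here).
    """
    channels = np.asarray(channels, dtype=float)
    return np.array([
        spearman(channels[i], channels[i + 1])
        for i in range(len(channels) - 1)
    ])


if __name__ == "__main__":
    frames = np.arange(1.0, 6.0)
    # Channel 1 is a monotonic but non-linear function of channel 0;
    # channel 2 runs in the opposite direction.
    demo = np.stack([frames, frames ** 3, -frames])
    print(adjacent_channel_correlation(demo))
    print(pearson(demo[0], demo[1]))
```

Because the rank-based coefficient depends only on ordering, it treats the monotonic but non-linear pair of channels as perfectly related, whereas the Pearson value for the same pair falls below 1. That insensitivity to amplitude scaling is one plausible reason to prefer a non-parametric measure when grouping channels, though the paper's own motivation is not given in the abstract.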



Title: Target speech feature extraction using non-parametric correlation coefficient
Authors: Sang Yeob Oh, Kyung-Yong Chung
Publication date: 01-09-2014
Publisher: Springer US
Published in: Cluster Computing / Issue 3/2014
Print ISSN: 1386-7857
Electronic ISSN: 1573-7543
DOI: https://doi.org/10.1007/s10586-013-0284-5
