
Neural Computing and Applications

1 June 2013 | ICONIP 2011

Robust speech recognition based on independent vector analysis using harmonic frequency dependency

Authors: Soram Jun, Minook Kim, Myungwoo Oh, Hyung-Min Park

Published in: Neural Computing and Applications | Issue 7-8/2013


Abstract

This paper describes an algorithm for robust speech recognition that enhances speech by independent vector analysis (IVA) using harmonic frequency dependency. Whereas conventional IVA assumes uniform dependencies across the full band of each source signal, a harmonic clique model is introduced to improve enhancement performance by modeling the strong dependencies among multiples of the fundamental frequency. The IVA-based learning algorithm is derived under the non-holonomic constraint and the minimal distortion principle to reduce the unavoidable distortion of IVA, and the minimum power distortionless response (MPDR) beamformer is used as a pre-processing step. In addition, the algorithm compares the log-spectral features of the enhanced speech and the observed noisy speech to identify time–frequency segments corrupted by noise, and restores those segments with the cluster-based missing feature reconstruction technique. Experimental results demonstrate that the proposed method significantly improves recognition performance in noisy environments, especially in the presence of competing interference.
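Two of the building blocks named in the abstract can be illustrated with a minimal sketch. The first function computes textbook MPDR weights for a single frequency bin, w = R⁻¹d / (dᴴR⁻¹d); the second flags time–frequency cells where the noisy log-spectrum exceeds the enhanced one by a margin. The abstract does not give the paper's exact formulation, so the function names, the diagonal loading, the steering vector, and the threshold rule below are all assumptions made for illustration and not the authors' implementation.

```python
import numpy as np


def mpdr_weights(R, d, diag_load=1e-6):
    """Textbook MPDR weights for one frequency bin.

    R: (M, M) Hermitian spatial covariance of the observed mixture.
    d: (M,) steering vector toward the target (assumed known here).
    diag_load: relative diagonal loading (an assumption, added only
               to keep the linear solve numerically stable).
    """
    M = R.shape[0]
    R_loaded = R + diag_load * (np.trace(R).real / M) * np.eye(M)
    Rinv_d = np.linalg.solve(R_loaded, d)   # R^{-1} d without explicit inverse
    return Rinv_d / (d.conj() @ Rinv_d)     # normalise so that w^H d = 1


def corrupted_mask(log_spec_noisy, log_spec_enhanced, threshold=3.0):
    """Flag time-frequency cells as noise-corrupted.

    Hypothetical rule: a cell is marked corrupted when the noisy
    log-spectrum exceeds the enhanced one by more than `threshold`.
    """
    return (log_spec_noisy - log_spec_enhanced) > threshold


# Example with synthetic data: 4 microphones, 200 snapshots.
rng = np.random.default_rng(0)
M, T = 4, 200
X = rng.standard_normal((M, T)) + 1j * rng.standard_normal((M, T))
R = X @ X.conj().T / T                       # sample spatial covariance
d = np.exp(-1j * np.pi * 0.3 * np.arange(M)) # assumed steering vector
w = mpdr_weights(R, d)

noisy = np.array([[0.0, 5.0], [1.0, 1.5]])
enhanced = np.array([[0.0, 1.0], [1.0, 1.0]])
mask = corrupted_mask(noisy, enhanced)
```

The weights satisfy the distortionless constraint wᴴd = 1 by construction, which is the property that lets the beamformer serve as a pre-processing step without attenuating the target direction. In a full system the flagged cells would then be passed to a missing-feature reconstruction stage, which is not sketched here.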



Title: Robust speech recognition based on independent vector analysis using harmonic frequency dependency
Authors: Soram Jun, Minook Kim, Myungwoo Oh, Hyung-Min Park
Publication date: 1 June 2013
Publisher: Springer-Verlag
Published in: Neural Computing and Applications / Issue 7-8/2013
Print ISSN: 0941-0643
Electronic ISSN: 1433-3058
DOI: https://doi.org/10.1007/s00521-012-1002-6

