Extension of a Kernel-Based Classifier for Discriminative Spoken Keyword Spotting

Authors: Shima Tabibian, Ahmad Akbari, Babak Nasersharif

Published in: Neural Processing Letters | Issue 2/2014

Published: 1 April 2014

Abstract

A keyword spotter can be viewed as a binary classifier that separates utterances containing a target keyword from utterances that do not. These two classes are not inherently linearly separable, so linear classifiers are not fully suitable for the task. In this paper, we extend a kernel-based classification approach to separate these two non-linearly separable classes so that the area under the Receiver/Relative Operating Characteristic (ROC) curve, the most common measure for keyword spotter evaluation, is maximized. We evaluated the proposed keyword spotter under different experimental conditions on the TIMIT database. The results indicate that, at fewer than two false alarms per keyword per hour, the true detection rate of the proposed kernel-based classification approach is about 15 % higher than that of the linear classifiers used in previous research. Additionally, the area under the ROC curve (AUC) of the proposed method is 1 % higher than that of the linear classifiers, a difference that is significant at confidence levels of 80 % and 95 % according to t-test and F-test evaluations, respectively. We also evaluated the proposed method under different noisy conditions; the results indicate that it shows good robustness to noise.
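
The abstract names the AUC as the quantity being maximized but does not state how it is computed. As a minimal illustrative sketch, and not the authors' implementation, the AUC can be read as the fraction of (keyword, non-keyword) utterance pairs in which the keyword utterance receives the higher confidence score (the Wilcoxon-Mann-Whitney statistic). The function name `auc_from_scores` and the scores below are hypothetical and chosen only for illustration.

```python
def auc_from_scores(pos_scores, neg_scores):
    """Return the fraction of (positive, negative) pairs ranked correctly.

    A pair counts as correct when the positive score is strictly higher;
    ties contribute 0.5. This equals the area under the ROC curve.
    """
    if not pos_scores or not neg_scores:
        raise ValueError("need at least one score per class")
    correct = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos_scores) * len(neg_scores))


# Hypothetical confidence scores produced by some keyword spotter.
pos = [0.9, 0.8, 0.4]  # utterances that contain the keyword
neg = [0.7, 0.3, 0.2]  # utterances that do not contain the keyword
auc = auc_from_scores(pos, neg)
print(f"AUC = {auc:.3f}")
```

In this toy example, eight of the nine pairs are ordered correctly; the only error is the keyword utterance scored 0.4 falling below the non-keyword utterance scored 0.7. A classifier trained to maximize AUC is in effect trained to reduce the number of such misordered pairs.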

Title: Extension of a Kernel-Based Classifier for Discriminative Spoken Keyword Spotting
Authors: Shima Tabibian, Ahmad Akbari, Babak Nasersharif
Publication date: 1 April 2014
Publisher: Springer US
Published in: Neural Processing Letters | Issue 2/2014
Print ISSN: 1370-4621
Electronic ISSN: 1573-773X
DOI: https://doi.org/10.1007/s11063-013-9299-4
