Pattern Analysis and Applications

1 September 2009 | Theoretical Advances

Audio-visual speech asynchrony detection using co-inertia analysis and coupled hidden Markov models

Authors: Enrique Argones Rúa, Hervé Bredin, Carmen García Mateo, Gérard Chollet, Daniel González Jiménez

Published in: Pattern Analysis and Applications | Issue 3/2009

Abstract

This paper addresses the subject of liveness detection, which is a test that ensures that biometric cues are acquired from a live person who is actually present at the time of capture. The liveness check is performed by measuring the degree of synchrony between the lips and the voice extracted from a video sequence. Three new methods for asynchrony detection based on co-inertia analysis (CoIA) and a fourth based on coupled hidden Markov models (CHMMs) are derived. Experimental comparisons are made with several methods previously used in the literature for asynchrony detection and speaker location. The reported results demonstrate the effectiveness and superiority of the proposed new methods based on both CoIA and CHMMs as asynchrony detection methods.
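The abstract does not spell out the proposed methods, so the following is only a minimal sketch of the general idea behind a CoIA-style synchrony score, not the authors' algorithm. It assumes identity metrics and uniform frame weights, in which case the first co-inertia axes are the leading singular vectors of the audio-visual cross-covariance matrix. The function names (`synchrony_score`, `make_demo_sequences`), the synthetic features, and the choice to fit and score on the same sequence are all illustrative assumptions.

```python
import math
import random


def center(rows):
    """Subtract the per-column mean from a list of feature vectors."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def cross_covariance(X, Y):
    """Cross-covariance (dx x dy) of two centered, frame-aligned sequences."""
    n, dx, dy = len(X), len(X[0]), len(Y[0])
    return [[sum(X[t][i] * Y[t][j] for t in range(n)) / n for j in range(dy)]
            for i in range(dx)]


def normalize(v):
    """Scale v to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else v


def first_coinertia_axes(C, iters=200):
    """Leading left/right singular vectors of C via alternating power iteration."""
    dx, dy = len(C), len(C[0])
    b = normalize([1.0] * dy)
    a = [0.0] * dx
    for _ in range(iters):
        a = normalize([sum(C[i][j] * b[j] for j in range(dy)) for i in range(dx)])
        b = normalize([sum(C[i][j] * a[i] for i in range(dx)) for j in range(dy)])
    return a, b


def pearson(u, v):
    """Pearson correlation of two equal-length sequences (0.0 if degenerate)."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    su = math.sqrt(sum((x - mu) ** 2 for x in u))
    sv = math.sqrt(sum((y - mv) ** 2 for y in v))
    if su == 0 or sv == 0:
        return 0.0
    return sum((x - mu) * (y - mv) for x, y in zip(u, v)) / (su * sv)


def synchrony_score(audio, visual):
    """Absolute correlation of audio and visual features projected on the
    first pair of co-inertia axes (hypothetical scoring rule)."""
    X, Y = center(audio), center(visual)
    a, b = first_coinertia_axes(cross_covariance(X, Y))
    proj_audio = [sum(x * w for x, w in zip(row, a)) for row in X]
    proj_visual = [sum(y * w for y, w in zip(row, b)) for row in Y]
    return abs(pearson(proj_audio, proj_visual))


def make_demo_sequences(n=200, seed=0):
    """Synthetic stand-ins for acoustic and lip features driven by one latent
    signal; real systems would use e.g. cepstral and lip-shape features."""
    rng = random.Random(seed)
    latent = [rng.gauss(0, 1) for _ in range(n)]
    audio = [[s + 0.1 * rng.gauss(0, 1), -0.5 * s + 0.1 * rng.gauss(0, 1),
              rng.gauss(0, 1)] for s in latent]
    visual = [[2 * s + 0.1 * rng.gauss(0, 1), 0.3 * rng.gauss(0, 1)]
              for s in latent]
    return audio, visual
```

As a usage example, calling `synchrony_score(audio, visual)` on the sequences from `make_demo_sequences()` and then again with the visual stream shifted (for instance `visual[25:] + visual[:25]`) should yield a clearly lower score for the shifted pair, which is the kind of gap a liveness threshold would exploit.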

Title: Audio-visual speech asynchrony detection using co-inertia analysis and coupled hidden Markov models
Authors: Enrique Argones Rúa, Hervé Bredin, Carmen García Mateo, Gérard Chollet, Daniel González Jiménez
Publication date: 1 September 2009
Publisher: Springer-Verlag
Published in: Pattern Analysis and Applications, Issue 3/2009
Print ISSN: 1433-7541
Electronic ISSN: 1433-755X
DOI: https://doi.org/10.1007/s10044-008-0121-2
