
2018 | OriginalPaper | Chapter

Lip Password-Based Speaker Verification Without a Priori Knowledge of Speech Language

Authors: Yiu-ming Cheung, Yichao Zhou

Published in: Computational Intelligence and Intelligent Systems

Publisher: Springer Singapore


Abstract

The lip password, which embeds password content in lip motion, was recently proposed for visual speaker verification (Liu and Cheung 2014). One merit of the lip password is that it provides double security for speaker verification: only the target speaker saying the correct password is accepted. However, previous work on the lip password relies on identifying the distinguishing subunits of purely digit-based password content, which limits its application domain. To tackle this problem, we propose a novel visual speaker verification approach based on the lip password that requires no a priori knowledge of the speech language, i.e., the language alphabet is unknown. We take advantage of the diagonal structure of the sparse representation to preserve the temporal order of lip sequences by employing a diagonal-like mask in the pooling stage, and we build pyramid spatiotemporal features that capture the structural characteristics underlying the lip password. Experiments show the efficacy of the proposed approach compared with state-of-the-art methods.
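
The abstract does not give implementation details, so the following is only a minimal sketch of what diagonal-masked pyramid pooling over sparse codes could look like. It assumes a coefficient matrix whose rows are dictionary atoms arranged in temporal order and whose columns are frames; the function names, the `bandwidth` parameter, the pyramid levels, and the use of max pooling are all illustrative assumptions, not the authors' method.

```python
import numpy as np


def diagonal_mask(n_atoms, n_frames, bandwidth=0.2):
    """Binary mask keeping entries near the (rescaled) main diagonal.

    Hypothetical helper: `bandwidth` is the allowed distance from the
    diagonal after both axes are normalised to [0, 1].
    """
    rows = np.arange(n_atoms)[:, None] / max(n_atoms - 1, 1)
    cols = np.arange(n_frames)[None, :] / max(n_frames - 1, 1)
    return (np.abs(rows - cols) <= bandwidth).astype(float)


def pyramid_pool(codes, levels=(1, 2, 4), bandwidth=0.2):
    """Max-pool masked |codes| over a temporal pyramid.

    codes: array of shape (n_atoms, n_frames), assumed to be sparse codes
    over temporally ordered atoms. Returns a vector of length
    n_atoms * sum(levels).
    """
    n_atoms, n_frames = codes.shape
    if n_frames < max(levels):
        raise ValueError("need at least as many frames as the finest level")
    # Suppress activations far from the diagonal, i.e. out of temporal order.
    masked = np.abs(codes) * diagonal_mask(n_atoms, n_frames, bandwidth)
    feats = []
    for n_seg in levels:
        # Split the time axis into n_seg segments and pool each one.
        for segment in np.array_split(masked, n_seg, axis=1):
            feats.append(segment.max(axis=1))
    return np.concatenate(feats)


# Toy check: codes in the expected order versus the same codes time-reversed.
in_order = pyramid_pool(np.eye(8))
reversed_order = pyramid_pool(np.fliplr(np.eye(8)))
```

In this toy setting, a sequence whose activations follow the diagonal keeps its full pooled response, while the time-reversed sequence loses most of it. That is the intuition behind using a diagonal-like mask to make the pooled feature sensitive to temporal order.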


Footnotes
1
“Kai-Men-Jian-Shan”, “You-Qi-Wu-Li”, “Gong-Xi-Fa-Cai”, “Dong-Fang-Ming-Zhu”, “Wu-Jing-Da-Cai”.
 
Literature
1.
Boser, B.E., Guyon, I.M., Vapnik, V.N.: A training algorithm for optimal margin classifiers. In: The Workshop on Computational Learning Theory, pp. 144–152 (1996)
2.
Broun, C.C., Zhang, X., Mersereau, R.M., Clements, M.: Automatic speechreading with application to speaker verification. In: IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), vol. 1, pp. I-685. IEEE (2002)
3.
Cetingul, H.E., Yemez, Y., Erzin, E., Tekalp, A.M.: Discriminative analysis of lip motion features for speaker identification and speech-reading. IEEE Trans. Image Process. 15(10), 2879–2891 (2006)
4.
Chan, C.H., Goswami, B., Kittler, J., Christmas, W.: Local ordinal contrast pattern histograms for spatiotemporal, lip-based speaker authentication. IEEE Trans. Inf. Forensics Secur. 7(2), 602–612 (2012)
5.
Cheung, Y.M., Liu, X., You, X.: A local region based approach to lip tracking. Pattern Recogn. 45, 3336–3347 (2012)
6.
7.
Jourlin, P., Luettin, J., Genoud, D., Wassner, H.: Acoustic-labial speaker verification. Pattern Recognit. Lett. 18(9), 853–858 (1997)
8.
Karlsson, S.M., Bigun, J.: Lip-motion events analysis and lip segmentation using optical flow. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pp. 138–145 (2012)
9.
Lai, J.Y., Wang, S.L., Liew, W.C., Shi, X.J.: Visual speaker identification and authentication by joint spatiotemporal sparse coding and hierarchical pooling. Inf. Sci. 373, 219–232 (2016)
10.
Li, M., Cheung, Y.M.: A novel motion based lip feature extraction for lip-reading. In: Proceedings of 2008 International Conference on Computational Intelligence and Security, pp. 361–365 (2008)
11.
Li, M., Cheung, Y.M.: Automatic lip localization under face illumination with shadow consideration. Sig. Process. 89(12), 2425–2434 (2009)
12.
Liu, X., Cheung, Y.M.: Learning multi-boosted HMMs for lip-password based speaker verification. IEEE Trans. Inf. Forensics Secur. 9(2), 233–246 (2014)
13.
Liu, X., Cheung, Y.M., Tang, Y.Y.: Lip event detection using oriented histograms of regional optical flow and low rank affinity pursuit. Comput. Vis. Image Underst. 148, 153–163 (2016)
14.
Luettin, J., Maître, G.: Evaluation protocol for the extended M2VTS database (XM2VTSDB). IDIAP (1998)
15.
Saeed, U.: Person identification using behavioral features from lip motion. In: IEEE International Conference on Automatic Face and Gesture Recognition and Workshops, pp. 131–136 (2011)
16.
Serre, T., Wolf, L., Poggio, T.: Object recognition with features inspired by visual cortex. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), vol. 2, pp. 994–1000 (2005)
17.
Shaikh, A.A., Kumar, D.K., Gubbi, J.: Automatic visual speech segmentation and recognition using directional motion history images and Zernike moments. Vis. Comput. 29(10), 969–982 (2013)
18.
Shi, X.X., Wang, S.L., Lai, J.Y.: Visual speaker authentication by ensemble learning over static and dynamic lip details. In: IEEE International Conference on Image Processing, pp. 3942–3946 (2016)
19.
Wang, S.L., Liew, A.W.C.: Physiological and behavioral lip biometrics: a comprehensive study of their discriminative power. Pattern Recogn. 45(9), 3328–3335 (2012)
20.
Yang, J., Yu, K., Gong, Y., Huang, T.: Linear spatial pyramid matching using sparse coding for image classification. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1794–1801 (2009)
Metadata
Title: Lip Password-Based Speaker Verification Without a Priori Knowledge of Speech Language
Authors: Yiu-ming Cheung, Yichao Zhou
Copyright Year: 2018
Publisher: Springer Singapore
DOI: https://doi.org/10.1007/978-981-13-1651-7_41
