2019 | OriginalPaper | Book chapter

Lip-Reading Based on Deep Learning Model

Authors: Mei-li Zhu, Qing-qing Wang, Jiang-lin Luo

Published in: Transactions on Edutainment XV

Publisher: Springer Berlin Heidelberg

Abstract

With the rapid growth of computing power, deep learning plays an increasingly important role in fields such as autonomous driving, medical research, and industrial automation. To improve the accuracy of lip-reading recognition, this paper proposes an algorithm based on a deep learning model of the lips. The binary images of the lip-contour motion sequence are projected into a spatio-temporal energy representation, the resulting lip dynamic grayscale image is used to reduce noise interference during recognition, and the feature-learning capability of deep learning is then used to improve the recognition result. The experimental results show that deep learning can extract effective features of lip dynamics from the lip dynamic grayscale image and achieve better recognition results.
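
The abstract does not give the projection formula, so the following is only a minimal sketch of one plausible reading: it assumes the spatio-temporal energy is a per-pixel average of the binary lip-contour frames over time, scaled to a 0..255 grayscale image. The function name `dynamic_grayscale`, the `Frame` type, and the toy data are hypothetical and do not come from the chapter.

```python
from typing import List

# Assumption: a frame is a binary image, 1 = lip-contour pixel, 0 = background.
Frame = List[List[int]]


def dynamic_grayscale(frames: List[Frame]) -> Frame:
    """Average a binary frame sequence over time and scale it to 0..255.

    This is an illustrative stand-in for the chapter's projection step,
    not the authors' exact method.
    """
    if not frames:
        raise ValueError("need at least one frame")
    height, width = len(frames[0]), len(frames[0][0])
    for frame in frames:
        if len(frame) != height or any(len(row) != width for row in frame):
            raise ValueError("all frames must share one shape")
    count = len(frames)
    # Pixels that are set in many frames become bright; isolated noise
    # pixels that appear in only a few frames stay dark.
    return [
        [round(255 * sum(f[y][x] for f in frames) / count) for x in range(width)]
        for y in range(height)
    ]


# Hypothetical toy sequence: three 2x3 binary frames.
toy_frames = [
    [[1, 0, 0], [1, 1, 0]],
    [[1, 1, 0], [0, 1, 0]],
    [[1, 0, 0], [0, 1, 0]],
]
energy = dynamic_grayscale(toy_frames)
```

Under this assumption, temporal averaging is what suppresses noise: a pixel that flickers in a single frame contributes only a fraction of full intensity, while stable contour pixels stay at or near 255. A grayscale image of this kind could then be fed to a learned model as a fixed-size input.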

Title: Lip-Reading Based on Deep Learning Model
Authors: Mei-li Zhu, Qing-qing Wang, Jiang-lin Luo
Publisher: Springer Berlin Heidelberg
Book: Transactions on Edutainment XV
Print ISBN: 978-3-662-59350-9
Electronic ISBN: 978-3-662-59351-6
Copyright year: 2019
DOI: https://doi.org/10.1007/978-3-662-59351-6_4
