2019 | OriginalPaper | Chapter

Speaker Recognition Based on Lightweight Neural Network for Smart Home Solutions

Authors: Haojun Ai, Wuyang Xia, Quanxin Zhang

Published in: Cyberspace Safety and Security

Publisher: Springer International Publishing

Abstract

Advances in smart home devices have gradually changed people's lifestyles, and speaker recognition is now available in almost all such devices. Currently, mainstream speaker recognition services rely on very deep neural networks trained on cloud servers. However, these deep networks are not suitable for deployment and operation on smart home devices. To address this problem, we propose a packet bottleneck method to improve SqueezeNet, which has been widely used in speaker recognition tasks, and we design a lightweight structure named TrimNet. In addition, we adopt a model updating strategy based on transfer learning to keep the model from deteriorating on cold speech. Experimental results demonstrate that TrimNet is superior to SqueezeNet in classification accuracy, number of structural parameters, and amount of computation. Moreover, the model updating method increases the recognition rate for cold speech without harming the recognition rate for other speakers.
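
To make the abstract's point about parameter counts concrete, the following minimal sketch counts the weights of a SqueezeNet-style Fire module (a 1x1 squeeze layer followed by parallel 1x1 and 3x3 expand layers) against a plain 3x3 convolution with the same input and output widths. It does not reproduce TrimNet or the packet bottleneck, whose details are not given on this page; the function names and channel sizes are assumptions chosen only for illustration.

```python
def conv_weights(c_in, c_out, k):
    """Weight count of a k x k convolution (biases ignored)."""
    return c_in * c_out * k * k


def fire_weights(c_in, squeeze, expand1x1, expand3x3):
    """Weight count of a SqueezeNet-style Fire module (biases ignored)."""
    # 1x1 squeeze layer reduces the channel count before the expand layers.
    squeeze_w = conv_weights(c_in, squeeze, 1)
    # Parallel 1x1 and 3x3 expand layers both read the squeezed channels.
    expand_w = conv_weights(squeeze, expand1x1, 1) + conv_weights(squeeze, expand3x3, 3)
    return squeeze_w + expand_w


# Illustrative sizes (assumed, not taken from the chapter):
# 96 input channels, 16 squeeze filters, 64 + 64 expand filters.
fire = fire_weights(96, 16, 64, 64)
plain = conv_weights(96, 128, 3)  # plain 3x3 conv with the same 128 outputs
print(fire, plain, round(plain / fire, 2))
```

With these sizes the Fire module needs roughly a ninth of the weights of the plain convolution, which is the kind of saving that makes such structures attractive for on-device use.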

Title: Speaker Recognition Based on Lightweight Neural Network for Smart Home Solutions
Authors: Haojun Ai, Wuyang Xia, Quanxin Zhang
Publisher: Springer International Publishing
Book: Cyberspace Safety and Security
Print ISBN: 978-3-030-37351-1

Electronic ISBN: 978-3-030-37352-8

Copyright Year: 2019
DOI: https://doi.org/10.1007/978-3-030-37352-8_37
