Published in: Wireless Personal Communications 3/2022

25-04-2022

A Cancelable Biometric Approach for Efficient Identification of Speakers from Encrypted Speech

Authors: Basant S. Abd El-Wahab, Heba A. El-Khobby, Mustafa M. Abd Elnaby, Fathi E. Abd El-Samie



Abstract

Biometric systems based on feature extraction from speech signals are widely deployed in security systems. This work presents a secure speaker identification system, namely a cancelable speaker identification system, which can also be used for remote access applications. Two formulations of efficient cryptosystems, based on a chaotic map and the empirical mode decomposition (EMD) with a single key, are presented for this purpose. The encryption provides a high degree of security for the speaker identification system. In the proposed cancelable speaker identification scenario, cepstral features are extracted from the encrypted speech signals, and matching is performed with artificial neural networks (ANNs). Furthermore, various enhancement methods are implemented at the receiver side to reduce the noise effect prior to speaker identification. Consequently, the features of the speakers are protected from attackers. The simulation results show that the proposed cancelable biometric system achieves a high performance level compared to traditional biometric systems. It achieves a recognition rate of up to 100% at signal-to-noise ratios (SNRs) of −20 dB and above, using the two proposed cryptosystems with a high level of security.
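The abstract does not say which chaotic map is used or how it is combined with the EMD, so the following is only a rough sketch of the general idea of single-key chaotic scrambling of a speech signal, not the authors' cryptosystem. It assumes a logistic map, a key given as an initial condition in (0, 1), and a pure sample permutation; the function names, the map parameter `r`, and the `burn_in` length are all hypothetical choices made for illustration.

```python
import numpy as np


def logistic_sequence(key: float, n: int, r: float = 3.99, burn_in: int = 100) -> np.ndarray:
    """Iterate the logistic map x <- r*x*(1-x), starting from the key.

    Assumption: the key is an initial condition strictly between 0 and 1.
    The first `burn_in` iterates are discarded before collecting n values.
    """
    x = key
    for _ in range(burn_in):
        x = r * x * (1.0 - x)
    out = np.empty(n)
    for i in range(n):
        x = r * x * (1.0 - x)
        out[i] = x
    return out


def chaotic_permutation(key: float, n: int) -> np.ndarray:
    """Derive a permutation of 0..n-1 by sorting the chaotic sequence."""
    return np.argsort(logistic_sequence(key, n), kind="stable")


def scramble(signal: np.ndarray, key: float) -> np.ndarray:
    """Permute the samples of a 1-D signal with the key-dependent permutation."""
    perm = chaotic_permutation(key, len(signal))
    return signal[perm]


def unscramble(scrambled: np.ndarray, key: float) -> np.ndarray:
    """Invert scramble() by regenerating the same permutation from the key."""
    perm = chaotic_permutation(key, len(scrambled))
    restored = np.empty_like(scrambled)
    restored[perm] = scrambled
    return restored


if __name__ == "__main__":
    # Synthetic stand-in for one second of speech sampled at 8 kHz.
    fs = 8000
    t = np.arange(fs) / fs
    speech_like = np.sin(2 * np.pi * 220 * t) * np.hanning(fs)

    key = 0.3141592653  # hypothetical single key
    encrypted = scramble(speech_like, key)
    decrypted = unscramble(encrypted, key)
    print("recovered with the correct key:", np.allclose(decrypted, speech_like))
```

In a cancelable scheme of the kind the abstract describes, features would be extracted from the scrambled signal rather than the original, so a compromised template could be revoked by issuing a new key. A permutation alone leaves the sample amplitudes unchanged, which is one reason a practical cryptosystem would add further stages.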
Metadata
Title
A Cancelable Biometric Approach for Efficient Identification of Speakers from Encrypted Speech
Authors
Basant S. Abd El-Wahab
Heba A. El-Khobby
Mustafa M. Abd Elnaby
Fathi E. Abd El-Samie
Publication date
25-04-2022
Publisher
Springer US
Published in
Wireless Personal Communications / Issue 3/2022
Print ISSN: 0929-6212
Electronic ISSN: 1572-834X
DOI
https://doi.org/10.1007/s11277-021-08384-5