Published in: International Journal of Speech Technology 4/2022

18.10.2021

Construction of complex environment speech signal communication system based on 5G and AI driven feature extraction techniques

Written by: Yi Jiang, ErLi Cheng, YongHao Li, Yali Zhang



Abstract

In daily life, voice is the most direct and important means of human communication. With the rise of the Internet and the development of communication technology, non-voice signals such as images and data account for a growing share of communication traffic. Nevertheless, most communication systems still require a voice transmission function, so the effective transmission of voice information remains essential. With the rapid development of the Internet, and in particular the recent commercial deployment of 5G and its future civilian applications, human–computer interaction will become increasingly intelligent, which in turn poses greater challenges for speech recognition as a human–computer interface. Noise interference is one of the biggest obstacles to the practical application of speech systems. Although training deep learning models on large amounts of noisy data solves part of the noise-robustness problem, non-stationary noise remains a major challenge for speech recognition in complex scenes with very low signal-to-noise ratio (SNR). In addition, multi-information fusion communication systems must transmit many kinds of data and therefore place high demands on bandwidth and storage space. This paper therefore studies the construction of a voice signal communication system based on artificial intelligence and 5G technology. The model is designed and implemented with complex scenarios in mind, and its performance is validated in simulations against state-of-the-art methods.
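The very-low-SNR conditions the abstract describes are conventionally simulated by mixing clean speech with noise scaled to a target signal-to-noise ratio. A minimal NumPy sketch of that standard preprocessing step follows; the function name `mix_at_snr` and the toy sine/white-noise signals are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so the mixture speech + noise has the requested SNR in dB."""
    # Tile/truncate the noise to match the speech length.
    if len(noise) < len(speech):
        noise = np.tile(noise, int(np.ceil(len(speech) / len(noise))))
    noise = noise[:len(speech)]
    # Power of each signal (mean squared amplitude).
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2)
    # SNR_dB = 10 * log10(p_speech / p_noise_scaled); solve for the noise scale.
    scale = np.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return speech + scale * noise

# Toy example: a 440 Hz "speech" tone plus white noise at -5 dB SNR
rng = np.random.default_rng(0)
speech = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)
noise = rng.standard_normal(16000)
mixture = mix_at_snr(speech, noise, snr_db=-5.0)

# Verify the achieved SNR matches the target.
achieved = 10 * np.log10(np.mean(speech ** 2) / np.mean((mixture - speech) ** 2))
print(round(achieved, 1))  # -5.0
```

At -5 dB the noise power is roughly three times the speech power, which is the regime the abstract calls a "very low SNR complex scene".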


Metadata
Title
Construction of complex environment speech signal communication system based on 5G and AI driven feature extraction techniques
Written by
Yi Jiang
ErLi Cheng
YongHao Li
Yali Zhang
Publication date
18.10.2021
Publisher
Springer US
Published in
International Journal of Speech Technology / Issue 4/2022
Print ISSN: 1381-2416
Electronic ISSN: 1572-8110
DOI
https://doi.org/10.1007/s10772-021-09900-5
