Published in: Soft Computing 16/2022

01.12.2021 | Focus

Research on intelligent language translation system based on deep learning algorithm

Written by: Chunliu Shi

Abstract

To improve intelligent language translation, this paper analyzes the problems of the mean squared error (MSE) cost function used by most current deep neural network (DNN)-based speech enhancement algorithms and proposes a deep learning speech enhancement algorithm based on perception-related cost functions. It embeds suppression gain parameter estimation into the architecture of the traditional speech enhancement algorithm, so that the relationship between the noisy speech spectrum and the enhanced speech spectrum becomes a simple multiplication by the suppression gain, and combines this with deep learning algorithms to construct an intelligent language translation system. The paper then evaluates the translation effect of the system, analyzes the actual results, and uses simulation tests to verify the performance of the constructed model. The experimental results indicate that the intelligent language translation system based on deep learning algorithms performs well.
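
The multiplication relationship described in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: the function names, the clamping of the gain to [0, 1], and the per-bin weighting used as a stand-in for a perception-related cost are all assumptions made for illustration, since the abstract does not specify them.

```python
from typing import List, Sequence


def apply_suppression_gain(noisy_spectrum: Sequence[complex],
                           gain: Sequence[float]) -> List[complex]:
    """Enhanced spectrum as an element-wise product: S_hat[k] = G[k] * Y[k]."""
    if len(noisy_spectrum) != len(gain):
        raise ValueError("spectrum and gain must have the same number of bins")
    # Assumption: clamp each gain to [0, 1] so it can only attenuate a bin.
    return [min(max(g, 0.0), 1.0) * y for g, y in zip(gain, noisy_spectrum)]


def mse(estimate: Sequence[float], target: Sequence[float]) -> float:
    """Plain mean squared error: every frequency bin counts equally."""
    return sum((e - t) ** 2 for e, t in zip(estimate, target)) / len(target)


def weighted_mse(estimate: Sequence[float], target: Sequence[float],
                 weights: Sequence[float]) -> float:
    """MSE with per-bin weights (hypothetical stand-in for a perceptual cost)."""
    total = sum(w * (e - t) ** 2 for w, e, t in zip(weights, estimate, target))
    return total / sum(weights)


# Toy three-bin example; in a DNN-based system the gain would be the network output.
noisy = [2 + 0j, 0 + 4j, 1 + 1j]
gain = [0.5, 0.25, 1.5]  # the last value is out of range on purpose
enhanced = apply_suppression_gain(noisy, gain)
```

Because the plain MSE treats all bins alike, an error in a perceptually unimportant bin costs as much as one in an important bin; a weighted variant such as the one sketched here lets the weights shift that balance.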

Metadata
Title
Research on intelligent language translation system based on deep learning algorithm
Written by
Chunliu Shi
Publication date
01.12.2021
Publisher
Springer Berlin Heidelberg
Published in
Soft Computing / Issue 16/2022
Print ISSN: 1432-7643
Electronic ISSN: 1433-7479
DOI
https://doi.org/10.1007/s00500-021-06480-z