2022 | OriginalPaper | Chapter

Examination of Balance Adjustment Method Between Voice and BGM in TV Viewing

Authors: Takanori Kono, Rin Hirakawa, Hideki Kawano, Yoshihisa Nakatoh

Published in: Human Interaction, Emerging Technologies and Future Systems V

Publisher: Springer International Publishing

Abstract

As people age, their hearing deteriorates and the sounds of daily life become difficult to hear. TV sound in particular is hard to hear for one in two elderly people. The cause is that the voice is drowned out by the background music (BGM). In this research, we focus on the volume balance between voice and BGM in TV sound and propose a method that uses sound source separation to adjust this balance appropriately. The effectiveness of the proposed method is evaluated through subjective evaluation.

To make the voice in TV sound easier to hear, it needs to be emphasized. We therefore propose a method that separates the TV sound into voice and BGM by sound source separation, suppresses the gain of the BGM, and then mixes the two back together. This study uses Spleeter, a sound source separation tool based on supervised deep learning. It is mainly used to separate songs into parts: the input music data can be divided into stems (for example, vocal and accompaniment).

In the experiment, we used a mixture of voice and BGM as simulated TV sound. Two types of voice were prepared: "natural voice" and "whispering voice". The simulated data was separated by Spleeter, and the gain of the separated BGM was suppressed before the signals were mixed again. Eight male subjects in their twenties listened to the sound before and after processing and evaluated whether the voice had become easier to hear. The results show that increasing the ratio of voice makes the voice easier to hear. However, the distortion generated during sound source separation also affects the sound quality. The accuracy of sound source separation therefore needs to be improved to further enhance the effect.

In this study, we proposed a method that uses sound source separation to adjust the volume balance between voice and BGM to an appropriate level. In the future, we would like to consider ways to further improve ease of hearing by improving the accuracy of sound source separation.
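
The remixing step described in the abstract (suppress the gain of the separated BGM, then mix it back with the voice) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the voice and BGM stems have already been produced by a separator such as Spleeter's two-stem model and are available as NumPy arrays of equal shape. The function names, the -6 dB default, and the peak normalization are hypothetical choices made for illustration; the abstract does not state the gain values used in the experiment.

```python
import numpy as np


def db_to_linear(gain_db: float) -> float:
    """Convert a gain in decibels to a linear amplitude factor."""
    return 10.0 ** (gain_db / 20.0)


def remix(voice: np.ndarray, bgm: np.ndarray, bgm_gain_db: float = -6.0) -> np.ndarray:
    """Attenuate the BGM stem and add it back to the voice stem.

    `voice` and `bgm` are assumed to be separated stems of the same shape,
    e.g. (n_samples,) or (n_samples, n_channels), with samples in [-1, 1].
    The -6 dB default is an illustrative value, not one taken from the paper.
    """
    if voice.shape != bgm.shape:
        raise ValueError("voice and bgm must have the same shape")
    mixed = voice + db_to_linear(bgm_gain_db) * bgm
    # Rescale only if the sum would clip outside [-1, 1].
    peak = float(np.max(np.abs(mixed)))
    return mixed / peak if peak > 1.0 else mixed


def voice_to_bgm_ratio_db(voice: np.ndarray, bgm: np.ndarray) -> float:
    """RMS level of the voice relative to the BGM, in decibels."""
    rms_voice = np.sqrt(np.mean(np.square(voice)))
    rms_bgm = np.sqrt(np.mean(np.square(bgm)))
    return float(20.0 * np.log10(rms_voice / rms_bgm))
```

With stems of equal RMS level, attenuating the BGM by 6 dB raises the voice-to-BGM ratio by 6 dB. Any distortion introduced by the separation stage is carried into the remix unchanged, which is consistent with the abstract's observation that separation accuracy limits the benefit.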


Metadata
Copyright Year
2022
DOI
https://doi.org/10.1007/978-3-030-85540-6_120
