
2013 | OriginalPaper | Book chapter

38. Voice Transformation Using Radial Basis Function

Authors: J. H. Nirmal, Suparva Patnaik, Mukesh A. Zaveri

Published in: Proceedings of the Third International Conference on Trends in Information, Telecommunication and Computing

Publisher: Springer New York

Abstract

This paper presents a novel voice transformation (VT) technique that transforms the individual acoustic characteristics of the source speaker so that the speech is perceived as if spoken by the target speaker. Line spectral pairs (LSP) and pitch, used as the spectral and glottal parameters of the source speaker, are transformed into the target speaker's parameters using a radial basis function (RBF). The results are evaluated using subjective and objective measures of voice quality. The listening tests show that the proposed algorithm converts speaker individuality while maintaining high speech quality.
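
The abstract does not give the network architecture, the training procedure, or the frame alignment method, so the following is only a minimal sketch of the general idea of mapping source feature vectors to target feature vectors with Gaussian radial basis functions. It assumes exact interpolation with one RBF centre per training frame and a small ridge term for numerical stability. The two-dimensional toy vectors stand in for time-aligned LSP frames, and every name (`fit_rbf`, `transform`, `source_frames`, `target_frames`, `sigma`, `ridge`) is hypothetical and not taken from the chapter.

```python
import math


def gaussian_kernel(x, c, sigma):
    """Gaussian radial basis function of the distance between x and centre c."""
    sq_dist = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def solve_linear(a, b):
    """Solve a * w = b (a is n x n, b is n x m) by Gauss-Jordan elimination
    with partial pivoting. Returns w as an n x m list of lists."""
    n = len(a)
    aug = [list(a[i]) + list(b[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def fit_rbf(source, target, sigma=0.2, ridge=1e-8):
    """Fit output weights so that each source frame maps to its target frame.
    Assumption: every training source frame is used as an RBF centre."""
    n = len(source)
    phi = [
        [gaussian_kernel(source[i], source[j], sigma) + (ridge if i == j else 0.0)
         for j in range(n)]
        for i in range(n)
    ]
    return {"centres": source, "weights": solve_linear(phi, target), "sigma": sigma}


def transform(model, x):
    """Map one source feature vector to an estimated target feature vector."""
    acts = [gaussian_kernel(x, c, model["sigma"]) for c in model["centres"]]
    dim = len(model["weights"][0])
    return [sum(a * w[d] for a, w in zip(acts, model["weights"])) for d in range(dim)]


# Toy, made-up "aligned frames" (not real LSP data).
source_frames = [[0.10, 0.30], [0.25, 0.55], [0.40, 0.80], [0.60, 0.95]]
target_frames = [[0.15, 0.35], [0.30, 0.50], [0.50, 0.85], [0.65, 1.00]]

model = fit_rbf(source_frames, target_frames)
converted = transform(model, [0.30, 0.60])  # an unseen source frame
```

In a real system the centres would more likely be a subset of the training frames or cluster centroids, pitch would need its own mapping, and the converted LSP vectors would have to be checked for correct ordering before synthesis; none of these steps is covered by this sketch.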

Metadata
Title
Voice Transformation Using Radial Basis Function
Authors
J. H. Nirmal
Suparva Patnaik
Mukesh A. Zaveri
Copyright year
2013
Publisher
Springer New York
DOI
https://doi.org/10.1007/978-1-4614-3363-7_38
