2016 | Original Paper | Book Chapter

Graph Cut Based Segmentation Method for Tamil Continuous Speech

Authors: B. R. Laxmi Sree, M. S. Vijaya

Published in: Digital Connectivity – Social Impact

Publisher: Springer Nature Singapore

Abstract

Automatic segmentation of continuous speech plays an important role in building promising acoustic models for a standard continuous speech recognition system. Building such models requires a large amount of segmented data, which is rarely available for many languages. Because there are no industry-standard speech segmentation tools for Indian languages such as Tamil, there is a need to work on Tamil speech segmentation. Here, a segmentation algorithm based on graph cut is proposed for automatic phonetic-level segmentation of continuous speech. Using graph cut for speech segmentation allows the speech to be viewed globally rather than locally, which helps in segmenting vocabulary-independent and speaker-independent speech. The input speech is represented as a graph, and the proposed algorithm is applied to it. Experiments on a speech database comprising utterances from various speakers show that the proposed method outperforms two existing methods: Blind Segmentation using Non-Linear Filtering and Non-Uniform Segmentation using Discrete Wavelet Transform.
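
The abstract does not say how the speech graph is built or which cut criterion is used, so the following is only an illustrative sketch and not the authors' algorithm. It assumes one node per speech frame, Gaussian similarity between frame feature vectors as edge weights, edges only between frames that are close in time, and the normalized cut relaxation of Shi and Malik as the cut criterion. The function names, the parameters `sigma` and `max_gap`, and the toy feature matrix are all hypothetical.

```python
import numpy as np


def frame_affinity(features, sigma=1.0, max_gap=10):
    """Edge weights between frames (assumed: Gaussian similarity, banded in time)."""
    n = features.shape[0]
    diff = features[:, None, :] - features[None, :, :]
    weights = np.exp(-(diff ** 2).sum(axis=-1) / (2.0 * sigma ** 2))
    idx = np.arange(n)
    # Assumption: only frames at most max_gap positions apart are connected.
    weights[np.abs(idx[:, None] - idx[None, :]) > max_gap] = 0.0
    return weights


def normalized_cut_bipartition(weights):
    """Two-way split from the sign of the second generalized eigenvector."""
    degree = weights.sum(axis=1)  # positive: every frame has a self-edge of weight 1
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    laplacian = np.diag(degree) - weights
    # Symmetric normalized Laplacian D^{-1/2} (D - W) D^{-1/2}.
    sym = d_inv_sqrt[:, None] * laplacian * d_inv_sqrt[None, :]
    _, eigvecs = np.linalg.eigh(sym)  # eigenvalues come back in ascending order
    relaxed = d_inv_sqrt * eigvecs[:, 1]  # relaxed cut indicator per frame
    return relaxed > 0


def boundaries_from_labels(labels):
    """Frame indices at which the partition label changes."""
    return np.flatnonzero(labels[1:] != labels[:-1]) + 1


# Toy input: 20 frames of one "sound" followed by 20 frames of another.
features = np.vstack([np.zeros((20, 2)), np.full((20, 2), 2.0)])
labels = normalized_cut_bipartition(frame_affinity(features))
print(boundaries_from_labels(labels))
```

In this sketch the cut is chosen from the eigenvector of the whole affinity matrix, which is one way to read the abstract's point about viewing speech globally rather than locally. Segmenting a real utterance into many phonetic units would need repeated or multi-way cuts and real acoustic features, neither of which is shown here.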


References
1. Juneja, A., Espy-Wilson, C.: Segmentation of continuous speech using acoustic-phonetic parameters and statistical learning. In: Proceedings of 9th International Conference on Neural Information Processing, vol. 2, pp. 726–730, November 2002
2. Räsänen, O., Laine, U.K., Altosaar, T.: Blind segmentation of speech using non-linear filtering methods. In: Ipsic, I. (ed.) Speech Technologies, chap. 5. InTech Open Access, June 2011. ISBN 978-953-307-996-7
3. Cosi, P.: SLAM: a PC-based multi-level segmentation tool. In: Rubio Ayuso, A.J., López Soler, J.M. (eds.) Speech Recognition and Coding. NATO ASI Series, vol. 147, pp. 124–127. Springer, Heidelberg (1995)
4. Wickerhauser, V.: Proceedings of the Third International Conference on Wavelet Analysis and Its Applications (WAA), Chongqing, PR China. World Scientific, 29–31 May 2003
5. Tan, B.T., Lang, R., Schroder, H., Spray, A., Dermody, P.: Applying wavelet analysis to speech segmentation and classification. In: SPIE’s International Symposium on Optical Engineering and Photonics in Aerospace Sensing, pp. 750–761. International Society for Optics and Photonics, March 1994
6. Ziółko, M., Gałka, J., Drwiega, T.: Wavelet transform in speech segmentation. In: Fitt, A.D., Norbury, J., Ockendon, H., Wilson, E. (eds.) Progress in Industrial Mathematics at ECMI 2008, pp. 1073–1078. Springer, Heidelberg (2010)
7. Ziółko, B., Manandhar, S., Wilson, R., Ziółko, M.: Phoneme segmentation based on wavelet spectra analysis. Arch. Acoust. 36(1), 29–47 (2011)
8. Sarada, G.L., Lakshmi, A., Murthy, H.A., Nagarajan, T.: Automatic transcription of continuous speech into syllable-like units for Indian languages. Sadhana 34(2), 221–233 (2009)
9. Jayasankar, T., Thangarajan, R., Selvi, J.A.V.: Automatic continuous speech segmentation to improve Tamil text-to-speech synthesis. Int. J. Comput. Appl. 25(1), 31–36 (2011)
10. Nagarajan, T., Murthy, H.A.: Subband-based group delay segmentation of spontaneous speech into syllable-like units. EURASIP J. Adv. Signal Process. 2004, 1–12 (2004)
11. Shi, J., Malik, J.: Normalized cuts and image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 22(8), 888–905 (2000)
12. Bleyer, M., Gelautz, M.: Graph-cut-based stereo matching using image segmentation with symmetrical treatment of occlusions. Signal Process. Image Commun. 22(2), 127–143 (2007)
13. Xiang, T., Gong, S.: Spectral clustering with eigenvector selection. Pattern Recogn. 41(3), 1012–1029 (2008)
14. Wu, Z., Leahy, R.: An optimal graph theoretic approach to data clustering: theory and its application to image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 15(11), 1101–1113 (1993)
15. Stan, A., Mamiya, Y., Yamagishi, J., Bell, P., Watts, O., Clark, R.A.J., King, S.: ALISA: an automatic lightly supervised speech segmentation and alignment tool. Comput. Speech Lang. 35, 116–133 (2016)
Metadata
Title
Graph Cut Based Segmentation Method for Tamil Continuous Speech
Authors
B. R. Laxmi Sree
M. S. Vijaya
Copyright Year
2016
Publisher
Springer Nature Singapore
DOI
https://doi.org/10.1007/978-981-10-3274-5_21