Published in: Journal of Intelligent Information Systems 2/2019

11-02-2019

Effect of speech segment samples selection in stutter block detection and remediation

Authors: Pierre Arbajian, Ayman Hajja, Zbigniew W. Raś, Alicja A. Wieczorkowska

Abstract

Speech can be remediated by identifying the segments that detract from its content and deleting them, which improves rather than diminishes speech quality. Such remediation is particularly important when speech is disfluent, as in stuttered speech. We describe two approaches to stuttered speech remediation, both based on identifying segments whose removal would make the speech easier to understand in terms of both content and flow. In the first approach, we identified and extracted speech segments that have weak semantic significance due to their low relative intensity, and then trained several classifiers on a large set of inherent and derived features to provide a second filtering layer. This approach was effective but required a two-step process. To streamline detection and remediation, we introduced an enhancement that eliminates the domain-dependent pre-qualification stage and thereby extends disfluency detection to a broader range of speech anomalies. The new approach offers improved accuracy along with greater simplicity, flexibility and extensibility.
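
The first stage of the first approach, selecting candidate segments by low relative intensity and then removing them, might be sketched as follows. This is only an illustrative sketch under stated assumptions: the abstract does not give the frame length, the intensity measure, or the threshold, so the RMS measure, the `rel_threshold` rule relative to the loudest frame, and all function names here are hypothetical. The classifier-based second filtering stage is not shown.

```python
import math


def frame_rms(samples, frame_len):
    """RMS intensity of each full, non-overlapping frame of the signal."""
    frames = [samples[i:i + frame_len]
              for i in range(0, len(samples) - frame_len + 1, frame_len)]
    return [math.sqrt(sum(x * x for x in f) / frame_len) for f in frames]


def low_intensity_segments(samples, frame_len=4, rel_threshold=0.25):
    """Return (start, end) sample ranges of consecutive quiet frames.

    Assumption: a frame is "quiet" when its RMS is below rel_threshold
    times the RMS of the loudest frame. The paper's actual criterion
    is not stated in the abstract.
    """
    rms = frame_rms(samples, frame_len)
    if not rms:
        return []
    cutoff = rel_threshold * max(rms)
    segments, start = [], None
    for idx, value in enumerate(rms):
        if value < cutoff and start is None:
            start = idx  # a quiet run begins at this frame
        elif value >= cutoff and start is not None:
            segments.append((start * frame_len, idx * frame_len))
            start = None  # the quiet run ended before this frame
    if start is not None:
        segments.append((start * frame_len, len(rms) * frame_len))
    return segments


def remove_segments(samples, segments):
    """Drop the given sample ranges and keep everything else in order."""
    kept, cursor = [], 0
    for seg_start, seg_end in segments:
        kept.extend(samples[cursor:seg_start])
        cursor = seg_end
    kept.extend(samples[cursor:])
    return kept


# Toy signal: loud frame, quiet frame, loud frame.
signal = [1.0, -1.0, 1.0, -1.0,
          0.1, -0.1, 0.1, -0.1,
          1.0, -1.0, 1.0, -1.0]
candidates = low_intensity_segments(signal, frame_len=4, rel_threshold=0.25)
remediated = remove_segments(signal, candidates)
```

In the two-step process the abstract describes, candidates like these would not be deleted outright; they would first pass through the trained classifiers, and only the segments those classifiers accept would be removed.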

Metadata
Title: Effect of speech segment samples selection in stutter block detection and remediation
Authors: Pierre Arbajian, Ayman Hajja, Zbigniew W. Raś, Alicja A. Wieczorkowska
Publication date: 11-02-2019
Publisher: Springer US
Published in: Journal of Intelligent Information Systems, Issue 2/2019
Print ISSN: 0925-9902
Electronic ISSN: 1573-7675
DOI: https://doi.org/10.1007/s10844-019-00546-z
