
2015 | Original Paper | Book Chapter

13. Stressed Speech Recognition Using Similarity Measurement on Inner Product Space

Authors: Bhanu Priya, S. Dandapat

Published in: Advances in Communication and Computing

Publisher: Springer India


Abstract

In this paper, a similarity measurement approach on different inner product spaces is proposed for the analysis of stressed speech. The similarity is measured between the neutral speech subspace and the stressed speech subspace, with the cosine between neutral speech and stressed speech taken as the similarity parameter. It is assumed that the speech and stress components of stressed speech are linearly related to each other. The stressed speech multiplied by the cosine between neutral and stressed speech is taken to contain the speech information of the stressed speech, and the stressed speech multiplied by the complement cosine (1 − cosine) is taken as its stress component. The neutral speech subspace is created from all neutral speech in the training database, and the stressed speech subspace contains stressed (angry, sad, Lombard, happy) speech. The experiments show that the stress information of stressed speech is not consistently present in the complement-cosine (1 − cosine) multiple of the stressed speech across different inner product spaces. The linear relationship between the speech and stress components of stressed speech holds only for some specific inner product spaces. All experiments use the nonlinear TEO-CB-Auto-Env feature.
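The cosine-based split described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes a single neutral and a single stressed feature vector instead of full subspaces, and it assumes the "different inner product spaces" can be modelled by a weighted inner product ⟨x, y⟩_W = xᵀWy with a symmetric positive-definite matrix W. The names `inner`, `cosine`, `split_stressed`, and `W` are hypothetical and do not come from the chapter.

```python
import numpy as np


def inner(x, y, W=None):
    """Inner product <x, y>_W = x^T W y (assumed form; W=None is the dot product)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if W is None:
        return float(x @ y)
    return float(x @ np.asarray(W, dtype=float) @ y)


def cosine(x, y, W=None):
    """Cosine of the angle between x and y under the chosen inner product."""
    return inner(x, y, W) / np.sqrt(inner(x, x, W) * inner(y, y, W))


def split_stressed(neutral, stressed, W=None):
    """Split a stressed vector into cosine and (1 - cosine) multiples.

    Returns (cos, speech_part, stress_part), where
    speech_part = cos * stressed and stress_part = (1 - cos) * stressed,
    so the two parts always sum back to the stressed vector.
    """
    stressed = np.asarray(stressed, dtype=float)
    c = cosine(neutral, stressed, W)
    return c, c * stressed, (1.0 - c) * stressed


if __name__ == "__main__":
    # Hypothetical 2-D feature vectors, for illustration only.
    neutral = [1.0, 0.0]
    stressed = [1.0, 1.0]

    # Standard inner product versus an assumed diagonal weighting.
    for W in (None, np.diag([1.0, 3.0])):
        c, speech, stress = split_stressed(neutral, stressed, W)
        print("cosine:", c, "speech part:", speech, "stress part:", stress)
```

Under these assumptions, changing W changes the cosine and therefore how the same stressed vector is divided between the two parts. This is one way to read the abstract's observation that the linear split is meaningful only for some inner product spaces.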


Metadata
Title
Stressed Speech Recognition Using Similarity Measurement on Inner Product Space
Authors
Bhanu Priya
S. Dandapat
Copyright year
2015
Publisher
Springer India
DOI
https://doi.org/10.1007/978-81-322-2464-8_13
