2016 | OriginalPaper | Chapter

Environmental Sounds Recognition Based on Image Processing Methods

Abstract

The article presents an approach to environmental sound recognition that uses selected methods from digital image processing and recognition. The proposed technique assumes that an audio signal can be converted into a visual representation and then processed as an image. In the first stage, the audio data are converted into rectangular matrices called feature maps. A two-step approach is then applied: construction of a representative database of reference samples, followed by identification of test samples. The database is built using two-dimensional linear discriminant analysis. Recognition is then carried out in a reduced feature space obtained by a two-dimensional Karhunen–Loève projection. At the classification stage, a minimum distance classifier is applied to different features. The results are very encouraging and can serve as a basis for many practical audio applications.
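
The pipeline outlined in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes a log-magnitude spectrogram as the feature map, substitutes a plain two-sided Karhunen–Loève (2D PCA-style) projection for the reduced feature space, omits the two-dimensional linear discriminant analysis step, and uses synthetic noisy tones in place of environmental recordings. All function names and parameters (`make_tone`, `feature_map`, `fit_2d_klt`, `project`, `classify`, `SAMPLE_RATE`, frame sizes) are hypothetical.

```python
import numpy as np

SAMPLE_RATE = 8000  # Hz; assumed value for this synthetic demo


def make_tone(freq, rng, duration=0.5, noise=0.1):
    """Synthetic stand-in for a recording: a sine tone with Gaussian noise."""
    t = np.arange(int(SAMPLE_RATE * duration)) / SAMPLE_RATE
    return np.sin(2 * np.pi * freq * t) + noise * rng.standard_normal(t.size)


def feature_map(signal, frame_len=256, hop=128):
    """Log-magnitude spectrogram as a (frequency bins x frames) matrix."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    spectrum = np.abs(np.fft.rfft(frames, axis=1))
    return np.log1p(spectrum).T


def fit_2d_klt(maps, n_rows, n_cols):
    """Estimate a two-sided Karhunen-Loeve projection from a stack of maps."""
    maps = np.asarray(maps)                      # shape: (N, H, W)
    mean = maps.mean(axis=0)
    centered = maps - mean
    # Row covariance (H x H) and column covariance (W x W), averaged over N.
    row_cov = np.einsum('nij,nkj->ik', centered, centered) / len(maps)
    col_cov = np.einsum('nji,njk->ik', centered, centered) / len(maps)
    # eigh returns eigenvalues in ascending order, so reverse the columns
    # to keep the eigenvectors with the largest eigenvalues.
    _, row_vecs = np.linalg.eigh(row_cov)
    _, col_vecs = np.linalg.eigh(col_cov)
    left = row_vecs[:, ::-1][:, :n_rows]         # H x n_rows
    right = col_vecs[:, ::-1][:, :n_cols]        # W x n_cols
    return mean, left, right


def project(x, mean, left, right):
    """Reduce an H x W feature map to an n_rows x n_cols matrix."""
    return left.T @ (x - mean) @ right


def classify(x, refs, labels, model):
    """Minimum-distance decision: label of the nearest reference sample."""
    y = project(x, *model)
    dists = [np.linalg.norm(y - r) for r in refs]
    return labels[int(np.argmin(dists))]


# Build a small reference database from two synthetic classes.
rng = np.random.default_rng(0)
train_maps, labels = [], []
for label, freq in {"low": 440.0, "high": 2000.0}.items():
    for _ in range(5):
        train_maps.append(feature_map(make_tone(freq, rng)))
        labels.append(label)

model = fit_2d_klt(train_maps, n_rows=4, n_cols=4)
refs = [project(m, *model) for m in train_maps]

# Identify an unseen test sample in the reduced feature space.
test_map = feature_map(make_tone(2000.0, rng))
print(classify(test_map, refs, labels, model))
```

Projecting from both sides keeps each sample in matrix form and only requires eigendecompositions of matrices whose size matches the feature-map height and width, rather than of one large covariance matrix over flattened maps.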

Metadata
Title
Environmental Sounds Recognition Based on Image Processing Methods
Authors
Tomasz Maka
Paweł Forczmański
Copyright Year
2016
DOI
https://doi.org/10.1007/978-3-319-26227-7_68
