
2018 | Original Paper | Book Chapter

Feature Extraction of Surround Sound Recordings for Acoustic Scene Classification

Author: Sławomir K. Zieliński

Published in: Artificial Intelligence and Soft Computing

Publisher: Springer International Publishing


Abstract

This paper extends the traditional machine-listening methodology for acoustic scene classification to a new class of multichannel audio signals. It identifies a set of new features of five-channel surround recordings for classifying two basic spatial audio scenes, and it compares three artificial-intelligence-based approaches to audio scene classification. The results indicate that the method based on the early fusion of features outperforms those involving the late fusion of signal metrics.
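The early/late fusion distinction mentioned in the abstract can be illustrated with a minimal sketch. In early fusion, per-channel feature vectors are concatenated into one vector before classification; in late fusion, each channel (or feature group) yields its own decision, and the decisions are combined afterwards, e.g. by majority vote. The toy band-energy extractor and the five random "channels" below are hypothetical illustrations, not the features or classifiers used in the paper.

```python
import numpy as np

def extract_features(channel, n_bands=4):
    """Toy spectral feature extractor: mean magnitude in n_bands
    frequency bands of one channel (illustrative only)."""
    spectrum = np.abs(np.fft.rfft(channel))
    bands = np.array_split(spectrum, n_bands)
    return np.array([b.mean() for b in bands])

def early_fusion(channels):
    """Early fusion: concatenate per-channel feature vectors into
    one long vector that a single classifier would consume."""
    return np.concatenate([extract_features(ch) for ch in channels])

def late_fusion(per_channel_decisions):
    """Late fusion: combine per-channel classifier decisions,
    here by simple majority vote over class labels."""
    votes = np.bincount(per_channel_decisions)
    return int(np.argmax(votes))

# Five-channel toy recording (L, R, C, Ls, Rs), 1024 samples each
rng = np.random.default_rng(0)
channels = [rng.standard_normal(1024) for _ in range(5)]

fused = early_fusion(channels)  # one 5 * 4 = 20-dimensional vector
decision = late_fusion(np.array([0, 1, 1, 0, 1]))  # majority vote -> class 1
```

With early fusion, a single model can exploit inter-channel relationships (which matters for spatial scenes); with late fusion, each channel's classifier is simpler but cross-channel cues are only combined at the decision level.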


Metadata
Copyright year: 2018
DOI: https://doi.org/10.1007/978-3-319-91262-2_43