2016 | Original Paper | Book Chapter

Bio-Inspired Filters for Audio Analysis

Authors: Nicola Strisciuglio, Mario Vento, Nicolai Petkov

Published in: Brain-Inspired Computing

Publisher: Springer International Publishing

Abstract

Much is now known about the functions of the components of the human auditory system. Computational models of these components are widely accepted and have recently inspired work in pattern recognition and signal processing. In this work we present a novel filter, which we call COPE (Combination of Peaks of Energy), inspired by the way sound waves are converted into neuronal firing activity on the auditory nerve. A COPE filter creates a model of the pattern of neural activity generated by a sound of interest and is able to detect the same pattern and modified versions of it. We apply the proposed method to the task of event detection for road surveillance. For the experiments, we use a publicly available data set, the MIVIA road events data set. The results that we achieve (recognition rate equal to \(94\%\) and false positive rate lower than \(4\%\)) and the comparison with existing methods demonstrate the effectiveness of the proposed bio-inspired filters for audio analysis.
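
The abstract describes a filter that models a pattern of energy peaks and then detects the same pattern, including modified versions of it. The following is a minimal sketch of that general idea, not the authors' implementation: it assumes the sound is already available as a non-negative time-frequency energy map (rows are frequency channels, columns are time frames), stores the positions of local energy peaks relative to a reference time, and responds with the geometric mean of the strongest energy found near each expected position. The function names (`find_peaks_2d`, `configure`, `response`), the `threshold` and `tolerance` parameters, and the choice of geometric mean are all assumptions made for illustration.

```python
import numpy as np


def find_peaks_2d(energy, threshold):
    """Return (freq_idx, time_idx) pairs of strict local maxima >= threshold."""
    padded = np.pad(energy, 1, mode="constant", constant_values=-np.inf)
    center = padded[1:-1, 1:-1]
    is_peak = center >= threshold
    # Compare every point with its 8 neighbours.
    for df in (-1, 0, 1):
        for dt in (-1, 0, 1):
            if df == 0 and dt == 0:
                continue
            neighbour = padded[1 + df:padded.shape[0] - 1 + df,
                               1 + dt:padded.shape[1] - 1 + dt]
            is_peak &= center > neighbour
    return np.argwhere(is_peak)


def configure(prototype, t_ref, threshold):
    """Build a model: list of (frequency index, time offset from t_ref)."""
    peaks = find_peaks_2d(prototype, threshold)
    return [(int(f), int(t) - t_ref) for f, t in peaks]


def response(energy, model, tolerance=1):
    """Response per time frame: geometric mean of the maximum energy found
    within +/- tolerance bins of each expected peak position (hypothetical
    combination rule; assumes non-negative energy values)."""
    n_freq, n_time = energy.shape
    out = np.zeros(n_time)
    for t in range(n_time):
        values = []
        for f, dt in model:
            tt = t + dt
            if tt < 0 or tt >= n_time:
                values = []  # pattern does not fit at this time frame
                break
            f0, f1 = max(f - tolerance, 0), min(f + tolerance + 1, n_freq)
            t0, t1 = max(tt - tolerance, 0), min(tt + tolerance + 1, n_time)
            values.append(energy[f0:f1, t0:t1].max())
        if values:
            out[t] = np.prod(values) ** (1.0 / len(values))
    return out


# Toy prototype with three energy peaks; the reference time is frame 5.
prototype = np.zeros((8, 20))
prototype[2, 5], prototype[5, 7], prototype[3, 10] = 1.0, 0.8, 0.9
model = configure(prototype, t_ref=5, threshold=0.5)

# Same pattern delayed by 5 frames, with one peak moved by one channel.
signal = np.zeros((8, 30))
signal[2, 10], signal[5, 12], signal[4, 15] = 1.0, 0.8, 0.9
r = response(signal, model)
```

In this toy example the response is non-zero around frame 10 despite the one-channel shift, because of the `tolerance` window, and it drops to zero if any one of the three peaks is missing, because the geometric mean requires every expected peak to be present.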

Metadata
Title
Bio-Inspired Filters for Audio Analysis
Authors
Nicola Strisciuglio
Mario Vento
Nicolai Petkov
Copyright Year
2016
DOI
https://doi.org/10.1007/978-3-319-50862-7_8
