Published in: Microsystem Technologies 10/2018

12 February 2018 | Technical Paper

A micro-control device of soundscape collection for mixed frog call recognition

Authors: Chih-Cheng Chiu, Tung-Kuan Liu, Wen-Ping Chen, Wen-Chih Lin, Jyh-Horng Chou

Abstract

An ecological forest park combines leisure and research, but the balance of the local ecology can be affected if the number of tourists exceeds the quota allowed for the park. Ecologists commonly use wild soundscapes in surveys of frog ecology. However, a wild-field soundscape recorded from multiple sources into a single channel is highly complex, since it contains various types of background sounds and an unknown number of mixed sources. Under these conditions, blind source separation is ineffective as a front end for voiceprint recognition algorithms. This paper uses a micro server for automatic directional control of the microphone so that it faces the animal source. The device also uses an interference tube to reject noise from outside the directional microphone's pickup direction, and it predicts the number of mixed sources used for blind source separation from the clustering of the frogs and the discrepancy in the croaking gap. Finally, the adaptive multi-stage average spectrum (AMSAS) method is used to separate the animal sources. The experiment uses recordings from the Shan-Ping Forest Ecological Garden, comprising monosyllables of six types of frogs and mixed recordings with seven kinds. We also compare the recognition rates obtained with dynamic time warping, multi-stage average spectrum, and AMSAS to verify the superiority of the proposed method.
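
Of the three methods compared, dynamic time warping (DTW) is the classic template-matching baseline: a query syllable is aligned against one stored template per species, and the species with the smallest alignment cost wins. The following is a minimal sketch of that idea only, not the authors' implementation. The function names, the species labels, and the one-dimensional toy feature sequences are all hypothetical; a real system would use per-frame spectral features extracted from the recordings.

```python
import math


def dtw_distance(a, b):
    """Classic DTW distance between two sequences of feature vectors."""
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean frame distance
            cost[i][j] = d + min(
                cost[i - 1][j],      # stretch b (repeat frame of b)
                cost[i][j - 1],      # stretch a (repeat frame of a)
                cost[i - 1][j - 1],  # advance both sequences
            )
    return cost[n][m]


def classify(query, templates):
    """Return the label whose template is closest to the query under DTW."""
    return min(templates, key=lambda label: dtw_distance(query, templates[label]))


# Hypothetical toy templates: one feature sequence per (made-up) species.
templates = {
    "species_a": [(0.0,), (1.0,), (2.0,), (1.0,), (0.0,)],
    "species_b": [(2.0,), (2.0,), (0.0,), (0.0,), (2.0,)],
}
# A time-stretched copy of the species_a template.
query = [(0.0,), (0.0,), (1.0,), (2.0,), (2.0,), (1.0,), (0.0,)]
print(classify(query, templates))  # prints "species_a"
```

Because the query is only a slower rendition of the first template, DTW aligns it at zero cost, which is the property that makes the method tolerant of variation in call duration.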

Metadata
Title
A micro-control device of soundscape collection for mixed frog call recognition
Authors
Chih-Cheng Chiu
Tung-Kuan Liu
Wen-Ping Chen
Wen-Chih Lin
Jyh-Horng Chou
Publication date
12 February 2018
Publisher
Springer Berlin Heidelberg
Published in
Microsystem Technologies / Issue 10/2018
Print ISSN: 0946-7076
Electronic ISSN: 1432-1858
DOI
https://doi.org/10.1007/s00542-018-3754-0
