Published in: International Journal of Speech Technology 4/2018

19 July 2018

Onset detection for tar solo

Authors: Behraz Farrokhi, Ehsanollah Kabir, Hedieh Sajedi

Abstract

This paper develops a new onset detection method for the Tar, a traditional Iranian musical instrument. The proposed method combines pitch-based and energy-based features, so it can detect both soft and hard onsets. This combination yields a more precise separation between two adjacent notes, which is especially useful for detecting the reaz: notes of the same frequency played repeatedly with short durations. To evaluate the method, a data set with predetermined onsets was produced, and the results were compared, in terms of F-measure, with those of an energy-based method.
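The abstract reports results in terms of F-measure but does not spell out the matching rule. As a minimal sketch, the following assumes a common convention for onset evaluation: a detected onset counts as correct if it lies within a tolerance window of a not-yet-matched reference onset. The function name `onset_f_measure`, the greedy one-to-one matching, and the 50 ms default tolerance are illustrative assumptions, not details taken from the paper.

```python
def onset_f_measure(detected, reference, tolerance=0.05):
    """Return (precision, recall, F-measure) for onset times in seconds.

    Assumption: greedy one-to-one matching on sorted onset lists, with a
    symmetric tolerance window (50 ms by default). The paper may use a
    different matching rule or window.
    """
    detected = sorted(detected)
    reference = sorted(reference)
    i = j = 0
    true_positives = 0
    while i < len(detected) and j < len(reference):
        diff = detected[i] - reference[j]
        if abs(diff) <= tolerance:
            # Detected onset matches this reference onset.
            true_positives += 1
            i += 1
            j += 1
        elif diff < 0:
            # Detection is too early to match any remaining reference onset.
            i += 1
        else:
            # Reference onset has no detection close enough: it is missed.
            j += 1
    false_positives = len(detected) - true_positives
    false_negatives = len(reference) - true_positives
    precision = true_positives / len(detected) if detected else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f_measure = 2.0 * precision * recall / (precision + recall)
    return precision, recall, f_measure


if __name__ == "__main__":
    # Hypothetical onset times (seconds), for illustration only.
    detected_onsets = [0.10, 0.52, 0.90, 1.30]
    reference_onsets = [0.11, 0.50, 1.00, 1.31]
    print(onset_f_measure(detected_onsets, reference_onsets))
```

For closely spaced repeated notes such as the reaz, the tolerance would likely need to be smaller than the shortest inter-onset interval, otherwise one detection could plausibly match either of two neighbouring reference onsets.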

Metadata
Title
Onset detection for tar solo
Authors
Behraz Farrokhi
Ehsanollah Kabir
Hedieh Sajedi
Publication date
19 July 2018
Publisher
Springer US
Published in
International Journal of Speech Technology / Issue 4/2018
Print ISSN: 1381-2416
Electronic ISSN: 1572-8110
DOI
https://doi.org/10.1007/s10772-018-9534-5
