International Journal of Speech Technology

8 April 2021

RETRACTED ARTICLE: Audio fingerprint analysis for speech processing using deep learning method

Author: Ali Altalbe

Published in: International Journal of Speech Technology | Issue 3/2022

Abstract

Everyday Internet use generates enormous amounts of audio data, which makes accessing and analyzing that data increasingly complex across audio-based applications. Frameworks and supporting tools are therefore needed to retrieve audio data and support intelligent decisions in speech processing. However, the non-stationarity and irregularity of audio signals make segmentation and classification difficult. Audio classification methods are used in many applications, such as speaker identification, gender recognition, music genre classification, and natural sound classification. This work proposes a deep learning method based on long short-term memory (LSTM) for preprocessing, segmenting, and retrieving audio signals from the GTZAN dataset. The simulation results show that the proposed algorithm improves audio fingerprint-based retrieval accuracy and overcomes drawbacks of traditional methods. Compared with existing methods, the proposed LSTM method achieves a precision of 96.54%, a recall of 96.15%, an accuracy of 98.56%, and an F-measure of 0.96. In real-world use, the proposed audio fingerprint recognition system works effectively in voice applications, especially on heterogeneous portable consumer devices and in online distributed audio systems.
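As a quick consistency check on the reported metrics, the sketch below recomputes the balanced F-measure as the harmonic mean of the precision and recall stated in the abstract. It assumes the abstract's F-measure is the standard F1 score, which the source does not state explicitly; the function name `f_measure` is illustrative and not taken from the article.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall.

    Assumption: the abstract's "F-measure" is the standard F1 score.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract, expressed as fractions.
precision = 0.9654
recall = 0.9615

f1 = f_measure(precision, recall)
print(f"F-measure: {f1:.4f}")
```

Under that assumption the result rounds to 0.96 as a fraction (about 96% as a percentage), which matches the abstract's F-measure figure when it is read as a fraction rather than a percentage.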

Title: RETRACTED ARTICLE: Audio fingerprint analysis for speech processing using deep learning method
Author: Ali Altalbe
Publication date: 8 April 2021
Publisher: Springer US
Published in: International Journal of Speech Technology / Issue 3/2022
Print ISSN: 1381-2416
Electronic ISSN: 1572-8110
DOI: https://doi.org/10.1007/s10772-021-09827-x
