DCNN-LSTM Based Audio Classification Combining Multiple Feature Engineering and Data Augmentation Techniques

Islam, Md. Moinul; Haque, Monjurul; Islam, Saiful; Mia, Md. Zesun Ahmed; Rahman, S. M. A. Mohaiminur

doi:10.1007/978-3-030-93247-3_23

Md. Moinul Islam (ORCID: orcid.org/0000-0003-4185-382X), Monjurul Haque, Saiful Islam, Md. Zesun Ahmed Mia & S. M. A. Mohaiminur Rahman

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 371)

Included in the following conference series:

International Conference on Intelligent Computing & Optimization

Abstract

Everything we know is based on our brain's ability to process sensory data, and hearing is a crucial sense for learning. Sound is essential to a wide range of activities, such as exchanging information and interacting with others. An audio signal is the electrical representation of sound. Because of their many important applications, audio signals and their classification are of considerable value. Classifying audio signals nevertheless remains a difficult task. To classify audio signals more accurately and effectively, we propose a new model. In this study, we apply a new method for audio classification that combines the strengths of Deep Convolutional Neural Network (DCNN) and Long Short-Term Memory (LSTM) models with a combination of feature engineering techniques. We integrate data augmentation and feature extraction before fitting the data to the model and evaluating its performance. The experiment shows a higher degree of accuracy. To validate the efficacy of our model, we compare it with recent reference works.
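The abstract names the pipeline stages (data augmentation, then feature extraction, then a DCNN-LSTM model) but not their details. The following is a minimal sketch of the first two stages only, under stated assumptions: the specific augmentations (Gaussian noise, circular time shift), the features (per-frame log-energy and zero-crossing rate), and all function names and parameter values are illustrative stand-ins, not the techniques used in the chapter. The output is a (time, feature) sequence per clip, which is the general shape a convolutional-recurrent model would consume.

```python
import math
import random


def add_noise(signal, noise_level=0.005, rng=None):
    """Return a copy of the signal with Gaussian noise added (assumed augmentation)."""
    rng = rng or random.Random(0)
    return [s + rng.gauss(0.0, noise_level) for s in signal]


def time_shift(signal, shift):
    """Circularly shift the signal to the right by `shift` samples (assumed augmentation)."""
    if not signal:
        return []
    shift %= len(signal)
    return signal[-shift:] + signal[:-shift] if shift else list(signal)


def frame_features(signal, frame_len=256, hop=128):
    """Compute [log-energy, zero-crossing rate] for each frame (assumed features)."""
    feats = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
        feats.append([math.log(energy + 1e-10), crossings / (frame_len - 1)])
    return feats


def augment_and_extract(signal):
    """Build augmented variants of one clip, then extract features from each."""
    variants = [list(signal), add_noise(signal), time_shift(signal, len(signal) // 4)]
    return [frame_features(v) for v in variants]


# Usage: a one-second 440 Hz test tone at a hypothetical 8 kHz sample rate.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(sample_rate)]
features = augment_and_extract(tone)
```

Each entry of `features` is one training example derived from the same clip, so augmentation multiplies the amount of data before any model is fitted.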

Author information

Authors and Affiliations

Chittagong University of Engineering and Technology, Chittagong, Bangladesh
Md. Moinul Islam & S. M. A. Mohaiminur Rahman
Rajshahi University of Engineering and Technology, Rajshahi, Bangladesh
Monjurul Haque
Ahsanullah University of Science and Technology, Dhaka, Bangladesh
Saiful Islam
Bangladesh University of Engineering and Technology (BUET), Dhaka, Bangladesh
Md. Zesun Ahmed Mia
University of Liberal Arts Bangladesh (ULAB), Dhaka, Bangladesh
Md. Zesun Ahmed Mia

Editor information

Editors and Affiliations

Faculty of Electrical & Electronic Engineering, MERLIN Research Centre, Ton Duc Thang University, Hồ Chí Minh City, Vietnam
Pandian Vasant
Faculty of Electrical Engineering and Computer Science, VŠB TU Ostrava, Ostrava-Poruba, Czech Republic
Ivan Zelinka
Faculty of Engineering Management, Poznan University of Technology, Poznan, Poland
Gerhard-Wilhelm Weber

About this paper

Cite this paper

Islam, M.M., Haque, M., Islam, S., Mia, M.Z.A., Rahman, S.M.A.M. (2022). DCNN-LSTM Based Audio Classification Combining Multiple Feature Engineering and Data Augmentation Techniques. In: Vasant, P., Zelinka, I., Weber, GW. (eds) Intelligent Computing & Optimization. ICO 2021. Lecture Notes in Networks and Systems, vol 371. Springer, Cham. https://doi.org/10.1007/978-3-030-93247-3_23

DOI: https://doi.org/10.1007/978-3-030-93247-3_23
Published: 01 January 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-93246-6
Online ISBN: 978-3-030-93247-3
eBook Packages: Intelligent Technologies and Robotics (R0)
