DCNN-LSTM Based Audio Classification Combining Multiple Feature Engineering and Data Augmentation Techniques

  • Conference paper
Intelligent Computing & Optimization (ICO 2021)

Abstract

Everything we know is based on our brain’s ability to process sensory data. Hearing is a crucial sense for our ability to learn, and sound is essential for a wide range of activities such as exchanging information and interacting with others. An audio signal is the electrical representation of sound. Because of their many essential applications, audio signals and their classification are of considerable value, yet classifying audio signals accurately and effectively remains a difficult task. In this study, we propose a new method for audio classification that combines the strengths of Deep Convolutional Neural Network (DCNN) and Long Short-Term Memory (LSTM) models with a distinctive combination of feature engineering techniques. Data augmentation and feature extraction are applied together before the data are fed to the model, whose performance is then evaluated. The experiments show a higher degree of accuracy. To validate the efficacy of our model, we present a comparative analysis against recent reference works.
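The abstract does not say which augmentations or features are used, so the following is only a minimal sketch of the general idea of augmenting a clip and then extracting features before training. The noise injection, circular time shift, and log-magnitude spectrogram are assumptions chosen for illustration, as are all function names and parameter values; none of them are taken from the paper.

```python
import numpy as np


def add_noise(signal, noise_level=0.005, rng=None):
    """Augmentation (assumed): add low-amplitude Gaussian noise."""
    rng = np.random.default_rng() if rng is None else rng
    return signal + noise_level * rng.standard_normal(signal.shape)


def time_shift(signal, shift):
    """Augmentation (assumed): circularly shift the waveform by `shift` samples."""
    return np.roll(signal, shift)


def log_spectrogram(signal, frame_len=256, hop=128):
    """Feature (assumed): log-magnitude spectrogram from Hann-windowed frames."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hanning(frame_len)
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    return np.log1p(magnitude)  # shape: (n_frames, frame_len // 2 + 1)


def augment_and_extract(signal, rng=None):
    """Build the original clip plus two augmented variants, then extract features."""
    variants = [
        signal,
        add_noise(signal, rng=rng),
        time_shift(signal, len(signal) // 10),
    ]
    return np.stack([log_spectrogram(v) for v in variants])


# Hypothetical input: one second of a 440 Hz tone sampled at 8 kHz.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
clip = np.sin(2 * np.pi * 440 * t)

features = augment_and_extract(clip, rng=np.random.default_rng(0))
print(features.shape)
```

Each variant yields a time-by-frequency array, a layout that a convolutional front end followed by a recurrent layer could consume; the actual DCNN-LSTM architecture is described in the full chapter, not here.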



Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

Cite this paper

Islam, M.M., Haque, M., Islam, S., Mia, M.Z.A., Rahman, S.M.A.M. (2022). DCNN-LSTM Based Audio Classification Combining Multiple Feature Engineering and Data Augmentation Techniques. In: Vasant, P., Zelinka, I., Weber, GW. (eds) Intelligent Computing & Optimization. ICO 2021. Lecture Notes in Networks and Systems, vol 371. Springer, Cham. https://doi.org/10.1007/978-3-030-93247-3_23
