Abstract
The acoustic environment provides a rich source of information about the types of activity, modes of communication, and people involved in many situations, and it can be classified accurately from recordings made with the microphones commonly found in PDAs and other consumer devices. We describe a prototype HMM-based acoustic environment classifier that incorporates an adaptive learning mechanism and a hierarchical classification model. Experimental results show that a wide variety of everyday environments can be classified accurately. We also obtain good results when classifying individual sounds, although accuracy depends on the granularity of the classification.
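The abstract does not specify the classifier's structure, features, or training procedure, so the following is only a generic illustrative sketch of the maximum-likelihood decision rule used in HMM-based acoustic classification: one Gaussian-emission HMM per environment, with a test sequence assigned to the model giving the highest log-likelihood via the forward algorithm. The model parameters and the placeholder feature frames below are invented for illustration; a real system would extract MFCC-style features from audio and train each model with Baum-Welch.

```python
import numpy as np

def log_gauss(x, mu, var):
    # Log-density of frame x under a diagonal-covariance Gaussian.
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mu) ** 2 / var)

def forward_loglik(X, pi, A, means, variances):
    # Log-space forward algorithm: log p(X | model) for a feature
    # sequence X of shape (T, D) under one Gaussian-emission HMM.
    T, S = len(X), len(pi)
    log_b = np.array([[log_gauss(x, means[s], variances[s]) for s in range(S)]
                      for x in X])
    alpha = np.log(pi) + log_b[0]
    for t in range(1, T):
        alpha = log_b[t] + np.logaddexp.reduce(alpha[:, None] + np.log(A), axis=0)
    return np.logaddexp.reduce(alpha)

def classify(X, models):
    # Pick the environment whose HMM assigns X the highest likelihood.
    return max(models, key=lambda name: forward_loglik(X, *models[name]))

# Two toy 2-state models with hand-picked parameters (pi, A, means, variances);
# real systems would estimate these from labelled recordings.
models = {
    "office": (np.array([0.5, 0.5]),
               np.array([[0.9, 0.1], [0.1, 0.9]]),
               np.array([[0.0, 0.0], [1.0, 1.0]]),
               np.array([[1.0, 1.0], [1.0, 1.0]])),
    "street": (np.array([0.5, 0.5]),
               np.array([[0.9, 0.1], [0.1, 0.9]]),
               np.array([[5.0, 5.0], [6.0, 6.0]]),
               np.array([[1.0, 1.0], [1.0, 1.0]])),
}

features = np.zeros((10, 2))  # stand-in for feature frames near the "office" means
print(classify(features, models))  # → office
```

A hierarchical scheme of the kind the abstract mentions would simply repeat this decision at each level of a class tree (e.g., first indoor vs. outdoor, then finer classes within the winner).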
Index Terms
- Acoustic environment classification