SoundSense: scalable sound sensing for people-centric applications on mobile phones

Authors:
Hong Lu, Dartmouth College, Hanover, NH, USA
Wei Pan, Dartmouth College, Hanover, NH, USA
Nicholas D. Lane, Dartmouth College, Hanover, NH, USA
Tanzeem Choudhury, Dartmouth College, Hanover, NH, USA
Andrew T. Campbell, Dartmouth College, Hanover, NH, USA

MobiSys '09: Proceedings of the 7th international conference on Mobile systems, applications, and services, June 2009, Pages 165–178
https://doi.org/10.1145/1555816.1555834

Published: 22 June 2009

ABSTRACT

High-end mobile phones include a number of specialized sensors (e.g., accelerometer, compass, GPS) and general-purpose sensors (e.g., microphone, camera) that enable new people-centric sensing applications. Perhaps the most ubiquitous and unexploited sensor on mobile phones is the microphone, a powerful sensor that supports sophisticated inferences about human activity, location, and social events from sound. In this paper, we exploit this untapped sensor not in the context of human communications but as an enabler of new sensing applications. We propose SoundSense, a scalable framework for modeling sound events on mobile phones. SoundSense is implemented on the Apple iPhone and represents the first general-purpose sound sensing system specifically designed to work on resource-limited phones. The architecture and algorithms are designed for scalability. SoundSense uses a combination of supervised and unsupervised learning techniques both to classify general sound types (e.g., music, voice) and to discover novel sound events specific to individual users. The system runs solely on the mobile phone with no back-end interactions. Through the implementation and evaluation of two proof-of-concept people-centric sensing applications, we demonstrate that SoundSense is capable of recognizing meaningful sound events that occur in users' everyday lives.

Index Terms

1. Computer systems organization
  1. Embedded and cyber-physical systems
  2. Real-time systems

Published in
MobiSys '09: Proceedings of the 7th international conference on Mobile systems, applications, and services
June 2009
370 pages
ISBN:9781605585666
DOI:10.1145/1555816
Co-chair: Krzysztof Zielinski, AGH University of Science and Technology, Poland
General Chair: Adam Wolisz, Technische Universität Berlin, Germany
Program Chairs: Jason Flinn, University of Michigan, USA; Anthony LaMarca, Intel Research Seattle, USA
Copyright © 2009 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
audio processing
mobile phones
people centric sensing
sound classification
urban sensing
Qualifiers
- research-article
Acceptance Rates
Overall Acceptance Rate: 274 of 1,679 submissions, 16%