Portable meeting recorder

Authors:
Dar-Shyang Lee, Ricoh Innovations, Inc., Menlo Park, CA
Berna Erol, Ricoh Innovations, Inc., Menlo Park, CA
Jamey Graham, Ricoh Innovations, Inc., Menlo Park, CA
Jonathan J. Hull, Ricoh Innovations, Inc., Menlo Park, CA
Norihiko Murata, Ricoh Office System R&D Center, Shinei-cho, Tsuzuki-ku, Yokohama, Japan

MULTIMEDIA '02: Proceedings of the tenth ACM international conference on Multimedia, December 2002, Pages 493–502
https://doi.org/10.1145/641007.641111

Published: 01 December 2002

ABSTRACT

The design and implementation of a portable meeting recorder is presented. Composed of an omni-directional video camera with four-channel audio capture, the system saves a view of all the activity in a meeting and the directions from which people spoke. Subsequent analysis computes metadata that includes video activity analysis of the compressed data stream and audio processing that helps locate events that occurred during the meeting. Automatic calculation of the room in which the meeting occurred allows for efficient navigation of a collection of recorded meetings. A user interface is populated from the metadata description to allow for simple browsing and location of significant events.
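
The abstract says the system records the directions from which people spoke but does not say how they are computed. As a rough illustration only, the sketch below uses a generic time-delay approach for a single microphone pair: find the sample lag that maximises the cross-correlation, then convert that delay to an arrival angle under a far-field assumption. The function names, the 343 m/s speed of sound, and the sample rate and microphone spacing in the usage lines are all assumptions made for this example and are not taken from the paper.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, approximate value at room temperature (assumed)


def best_lag(a, b, max_lag):
    """Return the lag in samples (positive when b is delayed relative to a)
    that maximises the cross-correlation over [-max_lag, max_lag]."""
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, sample in enumerate(a):
            j = i + lag
            if 0 <= j < len(b):
                score += sample * b[j]
        if score > best_score:
            best, best_score = lag, score
    return best


def arrival_angle(lag, sample_rate, mic_spacing):
    """Convert an inter-microphone delay to an angle in degrees measured
    from the broadside of the microphone pair (far-field assumption)."""
    delay = lag / sample_rate
    ratio = SPEED_OF_SOUND * delay / mic_spacing
    ratio = max(-1.0, min(1.0, ratio))  # clamp delays that exceed the geometry
    return math.degrees(math.asin(ratio))


# Hypothetical usage: a short pulse that reaches the second microphone 3 samples later.
mic_a = [0.0] * 20
mic_a[5], mic_a[6] = 1.0, 0.5
mic_b = [0.0] * 3 + mic_a[:-3]

lag = best_lag(mic_a, mic_b, max_lag=5)
angle = arrival_angle(lag, sample_rate=16000, mic_spacing=0.2)
```

With four channels, one option would be to repeat this estimate for several microphone pairs and combine the results to resolve the direction over the full circle around the device.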

Index Terms

Portable meeting recorder
1. Information systems
  1. Information retrieval
  2. Information storage systems

Published in
MULTIMEDIA '02: Proceedings of the tenth ACM international conference on Multimedia
December 2002
683 pages
ISBN: 158113620X
DOI: 10.1145/641007
Conference Chair: Lawrence Rowe, UC Berkeley
General Chair: Bernard Merialdo, Institut EURECOM
Program Chairs: Max Muhlhauser, TU Darmstadt; Keith Ross, Institut EURECOM; Nevenka Dimitrova, Philips Research
Copyright © 2002 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
MPEG-2 compressed domain analysis
appliance
audio processing
meeting recorder
omni-directional video
Qualifiers
- Article
Conference

Acceptance Rates
MULTIMEDIA '02 Paper Acceptance Rate: 46 of 330 submissions, 14%
Overall Acceptance Rate: 995 of 4,171 submissions, 24%