research-article

Semantic event detection in baseball videos based on a multi-output hidden Markov model

Authors:
Yin-Fu Huang

National Yunlin University of Science and Technology, Touliu, Yunlin, Taiwan

Jyun-Jhang Huang

National Yunlin University of Science and Technology, Touliu, Yunlin, Taiwan

SAC '11: Proceedings of the 2011 ACM Symposium on Applied Computing, March 2011, Pages 929–936. https://doi.org/10.1145/1982185.1982390

Published: 21 March 2011

ABSTRACT

In this paper, we propose an event detection method for baseball videos based on a multi-output HMM (hidden Markov model) that uses high-level audio/video features. For the video part, we use eight kinds of semantic scenes detected from baseball videos in our previous work. For the audio part, we extract the audio shots from the corresponding video scenes and cut each audio shot into N one-second clips. The MFCC (Mel-frequency cepstral coefficients) and ZCR (zero-crossing rate) of each one-second clip are then extracted and fed into an SVM, which classifies the clip as either "acclaim" or "silence". Based on these classification results, the type of an audio shot is determined in a post-classification step. Next, a multi-output HMM, modified from the original HMM, combines the video and audio features to detect baseball video events. Finally, the experimental results show that the multi-output HMM achieves good event detection accuracy.
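
The pipeline in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several parts are assumptions made only for illustration: the post-classification rule is taken to be a majority vote over clip labels, the multi-output HMM is taken to emit a video symbol and an audio symbol that are conditionally independent given the hidden state, and all function and parameter names are hypothetical. MFCC extraction and SVM training are omitted; the per-clip labels are assumed to come from an already trained classifier.

```python
import math


def zero_crossing_rate(samples):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / max(len(samples) - 1, 1)


def split_into_clips(samples, sample_rate):
    """Cut an audio shot into consecutive one-second clips (a short tail is dropped)."""
    n_clips = len(samples) // sample_rate
    return [samples[i * sample_rate:(i + 1) * sample_rate] for i in range(n_clips)]


def post_classify(clip_labels):
    """Label a whole audio shot by majority vote (assumed rule; ties give 'silence')."""
    acclaim = sum(1 for label in clip_labels if label == "acclaim")
    return "acclaim" if acclaim * 2 > len(clip_labels) else "silence"


def viterbi_multi_output(video_obs, audio_obs, start, trans, emit_video, emit_audio):
    """Most likely hidden-state path when every state emits a (video, audio) pair.

    Assumption: the two outputs are conditionally independent given the state,
    so their log-probabilities add. All probabilities must be non-zero.
    """
    states = list(start)

    def emission(state, t):
        return (math.log(emit_video[state][video_obs[t]])
                + math.log(emit_audio[state][audio_obs[t]]))

    best = {s: math.log(start[s]) + emission(s, 0) for s in states}
    back = []  # back[t - 1][s] is the best predecessor of state s at time t
    for t in range(1, len(video_obs)):
        prev, best, pointers = best, {}, {}
        for s in states:
            p = max(states, key=lambda r: prev[r] + math.log(trans[r][s]))
            best[s] = prev[p] + math.log(trans[p][s]) + emission(s, t)
            pointers[s] = p
        back.append(pointers)

    # Trace the best path backwards from the most likely final state.
    path = [max(states, key=lambda s: best[s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]
```

In this sketch the video observations would be the semantic scene labels and the audio observations the shot-level "acclaim"/"silence" labels produced by `post_classify`; the decoded state path then stands in for the detected events.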

Published in
SAC '11: Proceedings of the 2011 ACM Symposium on Applied Computing
March 2011
1868 pages
ISBN: 9781450301138
DOI: 10.1145/1982185
Conference Chairs: William Chu (Tunghai University, TaiChung, Taiwan), W. Eric Wong (University of Texas at Dallas, Richardson, Texas)
Program Chairs: Mathew J. Palakal (Indiana University Purdue University, Indianapolis), Chih-Cheng Hung (Southern Polytechnic State University, Marietta)
Copyright © 2011 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
HMM
MFCC
SVM
ZCR
audio/video
baseball video
broadcast
classification
event detection
Acceptance Rates
Overall Acceptance Rate: 1,650 of 6,669 submissions, 25%