ABSTRACT
In this paper, we propose an event detection method for baseball videos based on a multi-output hidden Markov model (HMM) that uses high-level audio and video features. For the video part, we use the eight kinds of semantic scenes detected from baseball videos in our previous work. For the audio part, we extract the audio shot corresponding to each video scene and cut it into N one-second clips. The MFCC (mel-frequency cepstral coefficients) and ZCR (zero-crossing rate) of each one-second clip are then extracted and fed into an SVM, which classifies the clip as either "acclaim" or "silence". Based on these classification results, the type of each audio shot is determined in a post-classification step. Next, a multi-output HMM, modified from the original HMM, combines the video and audio features to detect baseball video events. Finally, the experimental results show that the multi-output HMM achieves good event detection accuracy.
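The audio steps and the two-stream model described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names are hypothetical, the post-classification rule is assumed to be a majority vote over clip labels, and the multi-output HMM is assumed to emit one discrete video symbol and one discrete audio symbol per step, independently given the hidden state. MFCC extraction and the SVM itself are omitted because they depend on an external library.

```python
from collections import Counter


def split_into_clips(samples, sample_rate):
    """Cut an audio shot into consecutive one-second clips.

    A trailing partial clip is dropped (an assumption of this sketch).
    """
    n_clips = len(samples) // sample_rate
    return [samples[i * sample_rate:(i + 1) * sample_rate] for i in range(n_clips)]


def zero_crossing_rate(clip):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(clip) < 2:
        return 0.0
    crossings = sum(1 for a, b in zip(clip, clip[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(clip) - 1)


def shot_type(clip_labels):
    """Assumed post-classification rule: majority vote over per-clip labels."""
    return Counter(clip_labels).most_common(1)[0][0]


def forward_likelihood(pi, A, B_video, B_audio, video_obs, audio_obs):
    """Forward algorithm for an HMM that emits two symbols per time step.

    pi[s]          initial probability of state s
    A[r][s]        transition probability from state r to state s
    B_video[s][v]  probability of video symbol v in state s
    B_audio[s][a]  probability of audio symbol a in state s
    The joint emission is taken as B_video[s][v] * B_audio[s][a], which is
    an assumption of this sketch (conditional independence given the state).
    Returns P(video_obs, audio_obs | model).
    """
    n = len(pi)
    alpha = [pi[s] * B_video[s][video_obs[0]] * B_audio[s][audio_obs[0]]
             for s in range(n)]
    for v, a in zip(video_obs[1:], audio_obs[1:]):
        alpha = [sum(alpha[r] * A[r][s] for r in range(n))
                 * B_video[s][v] * B_audio[s][a]
                 for s in range(n)]
    return sum(alpha)
```

One plausible way to use such a model for detection is to train one set of parameters per event type and label a scene sequence with the event whose model gives the highest likelihood; whether the paper does exactly this is not stated in the abstract.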
Index Terms
- Semantic event detection in baseball videos based on a multi-output hidden Markov model