Abstract
Automatic construction of content-based indices for video source material requires general semantic interpretation of both images and their accompanying sounds, but such a broadly based semantic analysis is beyond the capabilities of the current technologies of machine vision and audio signal analysis. However, if one can assume a limited and well-demarcated body of domain knowledge for describing the content of a body of video, then it becomes easier to interpret a video source in terms of that domain knowledge. This paper presents our work on using domain knowledge to parse news video programs and to index them on the basis of their visual content. Models based on both the spatial structure of image frames and the temporal structure of the entire program have been developed for news videos, along with algorithms that apply these models by locating and identifying instances of their elements. Experimental results are discussed in detail to evaluate both the models and the algorithms that use them. Finally, proposals for future work are summarized.
Cite this article
Zhang, H., Tan, S.Y., Smoliar, S.W. et al. Automatic parsing and indexing of news video. Multimedia Systems 2, 256–266 (1995). https://doi.org/10.1007/BF01225243
DOI: https://doi.org/10.1007/BF01225243