research-article

A spatiotemporal text localization and identification approach for content-based video browsing

Authors:
Bassem Bouaziz
Walid Mahdi
Tarek Zlitni
Abdelmajid Benhamadou

All authors: MIRACL: Multimedia Information systems and Advanced Computing Laboratory, CP, Sfax, Tunisia

MoMM '09: Proceedings of the 7th International Conference on Advances in Mobile Computing and Multimedia, December 2009, Pages 44–51
https://doi.org/10.1145/1821748.1821764

Published: 14 December 2009

ABSTRACT

Text in videos carries rich semantic information that can be used for video indexing and browsing. In this paper, we propose a spatiotemporal video-text localization and identification approach that proceeds in two main steps: text region localization and text region identification. In the first step, we detect the significant appearance of new objects in a frame by applying split-and-merge processes to binarized edge differences between frame pairs. Detected objects are, a priori, considered to be text. They are then filtered according to both local contrast and texture criteria to retain the effective text regions. The resulting text regions are identified using a visual grammar descriptor containing a set of semantic text class regions characterized by visual features. A visual table of contents is generated from the extracted text regions occurring within the video sequence, enriched by their semantic identification. Experiments performed on a variety of video sequences show the efficiency of our approach.
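
The first step of the abstract (binarized edge differences between a frame pair) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the threshold of 200, the use of a Sobel operator for the edge map, and the single bounding box that stands in for the split-and-merge stage are all assumptions made for illustration. Frames are assumed to be grayscale images stored as lists of rows of integers in 0..255.

```python
# Illustrative sketch only; all names and parameters here are hypothetical.

def sobel_magnitude(frame):
    """Approximate gradient magnitude |gx| + |gy| using 3x3 Sobel kernels.

    Border pixels are left at 0 because their 3x3 neighbourhood is incomplete.
    """
    h, w = len(frame), len(frame[0])
    mag = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (frame[y - 1][x + 1] + 2 * frame[y][x + 1] + frame[y + 1][x + 1]
                  - frame[y - 1][x - 1] - 2 * frame[y][x - 1] - frame[y + 1][x - 1])
            gy = (frame[y + 1][x - 1] + 2 * frame[y + 1][x] + frame[y + 1][x + 1]
                  - frame[y - 1][x - 1] - 2 * frame[y - 1][x] - frame[y - 1][x + 1])
            mag[y][x] = abs(gx) + abs(gy)
    return mag


def binarize(image, threshold):
    """Map every value to 1 if it reaches the threshold, else 0."""
    return [[1 if value >= threshold else 0 for value in row] for row in image]


def new_edge_pixels(prev_frame, next_frame, threshold=200):
    """Return a mask of edge pixels present in next_frame but not in prev_frame."""
    prev_edges = binarize(sobel_magnitude(prev_frame), threshold)
    next_edges = binarize(sobel_magnitude(next_frame), threshold)
    return [[1 if n and not p else 0 for p, n in zip(prev_row, next_row)]
            for prev_row, next_row in zip(prev_edges, next_edges)]


def bounding_box(mask):
    """Return (top, left, bottom, right) of all set pixels, or None if empty.

    A crude stand-in for the split-and-merge stage, which would instead
    partition the mask into several candidate regions.
    """
    coords = [(y, x) for y, row in enumerate(mask)
              for x, value in enumerate(row) if value]
    if not coords:
        return None
    ys = [y for y, _ in coords]
    xs = [x for _, x in coords]
    return (min(ys), min(xs), max(ys), max(xs))
```

Calling `bounding_box(new_edge_pixels(prev_frame, next_frame))` on two consecutive frames yields one candidate region around edges that appear only in the second frame. The contrast and texture filtering described in the abstract would then be applied to such candidates.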

Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision

Published in
MoMM '09: Proceedings of the 7th International Conference on Advances in Mobile Computing and Multimedia
December 2009
663 pages
ISBN:9781605586595
DOI:10.1145/1821748
General Chair: Gabriele Kotsis (Johannes Kepler University Linz, Austria)
Program Chairs: David Taniar (Monash University, Australia), Eric Pardede (La Trobe University, Australia)
Copyright © 2009 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
non-linear video browsing
region filtering
spatiotemporal features
text extraction
video indexing
visual index