1999 | OriginalPaper | Book chapter

Scale Space Technique for Word Segmentation in Handwritten Documents

Authors: R. Manmatha, Nitin Srimal

Published in: Scale-Space Theories in Computer Vision

Publisher: Springer Berlin Heidelberg

Indexing large archives of historical manuscripts, like the papers of George Washington, is required to allow rapid perusal by scholars and researchers who wish to consult the original manuscripts. Presently, such large archives are indexed manually. Since optical character recognition (OCR) works poorly with handwriting, a scheme based on matching word images, called word spotting, has been suggested previously for indexing such documents. The important steps in this scheme are segmentation of a document page into words and creation of lists containing instances of the same word by word image matching. We have developed a novel methodology for segmenting handwritten document images by analyzing the extent of “blobs” in a scale space representation of the image. We believe this is the first application of scale space to this problem. The algorithm has been applied to around 30 grey-level images randomly picked from different sections of the George Washington corpus of 6,400 handwritten document images. An accuracy of 77 to 96 percent was observed, with an average accuracy of around 87 percent. The algorithm works well in the presence of noise, shine-through and other artifacts which may arise due to aging and degradation of the page over a couple of centuries or through the man-made processes of photocopying and scanning.
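
The abstract does not spell out the filter or parameters, so the following is only a minimal sketch of the general idea of reading word regions off "blobs" at a chosen scale, not the authors' algorithm. It assumes a plain separable Gaussian smoothing followed by a fixed threshold and 4-connected component labelling; the function names (`gaussian_kernel`, `smooth`, `blob_boxes`, `demo_page`), the scales, the threshold, and the synthetic page are all hypothetical choices made for illustration.

```python
import math
from collections import deque


def gaussian_kernel(sigma):
    """Return a normalised 1-D Gaussian kernel truncated at about 3 sigma."""
    radius = max(1, math.ceil(3 * sigma))
    weights = [math.exp(-(d * d) / (2 * sigma * sigma))
               for d in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth(image, sigma_y, sigma_x):
    """Separable Gaussian smoothing; pixels outside the page count as 0."""
    h, w = len(image), len(image[0])
    kx, ky = gaussian_kernel(sigma_x), gaussian_kernel(sigma_y)
    rx, ry = len(kx) // 2, len(ky) // 2
    # Horizontal pass, then vertical pass over the intermediate result.
    rows = [[sum(kx[d + rx] * image[y][x + d]
                 for d in range(-rx, rx + 1) if 0 <= x + d < w)
             for x in range(w)] for y in range(h)]
    return [[sum(ky[d + ry] * rows[y + d][x]
                 for d in range(-ry, ry + 1) if 0 <= y + d < h)
             for x in range(w)] for y in range(h)]


def blob_boxes(image, sigma_y, sigma_x, threshold):
    """Bounding boxes (top, left, bottom, right) of blobs at one scale.

    Assumption: ink is 1.0, background is 0.0, and a blob is a 4-connected
    region where the smoothed image exceeds `threshold`.
    """
    smoothed = smooth(image, sigma_y, sigma_x)
    h, w = len(image), len(image[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if seen[y][x] or smoothed[y][x] <= threshold:
                continue
            top, left, bottom, right = y, x, y, x
            queue = deque([(y, x)])
            seen[y][x] = True
            while queue:  # breadth-first flood fill of one blob
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                               (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < h and 0 <= nx < w and not seen[ny][nx]
                            and smoothed[ny][nx] > threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return sorted(boxes, key=lambda box: box[1])  # left-to-right order


def demo_page():
    """Synthetic 9x40 'page': two words of three 2-pixel-wide letters each."""
    image = [[0.0] * 40 for _ in range(9)]
    for start in (2, 5, 8, 20, 23, 26):  # left column of each letter
        for y in range(3, 6):
            image[y][start] = image[y][start + 1] = 1.0
    return image


if __name__ == "__main__":
    page = demo_page()
    print("fine scale:  ", blob_boxes(page, 0.3, 0.3, 0.3))
    print("coarse scale:", blob_boxes(page, 1.0, 1.0, 0.3))
```

On this toy page the fine scale (sigma 0.3) keeps the six letters as separate blobs, while the coarser scale (sigma 1.0) bridges the one-pixel gaps between letters but not the ten-pixel gap between words, yielding two blobs. Choosing a scale at which blobs correspond to words rather than letters or whole lines is the crux of any such approach; how the chapter selects that scale is not stated in the abstract.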

Metadata
Title
Scale Space Technique for Word Segmentation in Handwritten Documents
Authors
R. Manmatha
Nitin Srimal
Copyright year
1999
Publisher
Springer Berlin Heidelberg
DOI
https://doi.org/10.1007/3-540-48236-9_3