Published in: Pattern Analysis and Applications 4/2006

01-02-2006 | Theoretical Advances

An efficient segmentation-free approach to assist old Greek handwritten manuscript OCR

Authors: B. Gatos, K. Ntzios, I. Pratikakis, S. Petridis, T. Konidaris, S. J. Perantonis


Abstract

Recognition of old Greek manuscripts is essential for quick and efficient content exploitation of valuable old Greek historical collections. In this paper, we focus on the problem of recognizing early Christian Greek manuscripts written in lower case letters. Based on the existence of closed cavity regions in the majority of characters and character ligatures in these scripts, we propose a novel, segmentation-free, fast and efficient technique that assists the recognition procedure by tracing and recognizing the most frequently appearing characters or character ligatures. First, we detect the closed cavities that exist in the character body. Then, the protrusions in the outer contour of the connected components that contain these closed cavities are used to classify the area around each closed cavity as a specific character or character ligature. The proposed method gives highly accurate results and offers great assistance to old Greek handwritten manuscript OCR. We also provide additional OCR applications that not only demonstrate the robustness of the proposed method but also its generic nature in cases where segmentation and text location are very difficult to perform.
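To make the first step of the abstract concrete, the following is a minimal sketch of closed-cavity detection on a binarized image. It is not the authors' implementation: it assumes a simple border flood fill, in which any background region that cannot be reached from the image border is treated as a closed cavity. The function name `find_closed_cavities`, the 0/1 pixel encoding, and the use of 4-connectivity for the background are all assumptions made for illustration.

```python
from collections import deque


def find_closed_cavities(image):
    """Return the closed cavities of a binary image (illustrative sketch).

    `image` is a list of rows; 1 is ink (foreground), 0 is background.
    A cavity is a 4-connected background region that never touches the
    image border, i.e. it is fully enclosed by ink.
    Each cavity is returned as a sorted list of (row, col) pixels.
    """
    rows = len(image)
    cols = len(image[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]

    def flood(start_r, start_c):
        # Breadth-first flood fill over background pixels.
        queue = deque([(start_r, start_c)])
        seen[start_r][start_c] = True
        region = []
        touches_border = False
        while queue:
            r, c = queue.popleft()
            region.append((r, c))
            if r in (0, rows - 1) or c in (0, cols - 1):
                touches_border = True
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if (0 <= nr < rows and 0 <= nc < cols
                        and not seen[nr][nc] and image[nr][nc] == 0):
                    seen[nr][nc] = True
                    queue.append((nr, nc))
        return region, touches_border

    cavities = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] == 0 and not seen[r][c]:
                region, touches_border = flood(r, c)
                if not touches_border:
                    cavities.append(sorted(region))
    return cavities


# A tiny "o"-like glyph: one enclosed background pixel at (2, 2).
glyph_o = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
print(find_closed_cavities(glyph_o))  # → [[(2, 2)]]
```

Under these assumptions, a glyph whose loop is open on one side (a "c"-like shape) yields no cavities, because its inner background connects to the border. The second step described in the abstract, classifying the protrusions of the outer contour around each cavity, is not covered by this sketch.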


Metadata
Title
An efficient segmentation-free approach to assist old Greek handwritten manuscript OCR
Authors
B. Gatos
K. Ntzios
I. Pratikakis
S. Petridis
T. Konidaris
S. J. Perantonis
Publication date
01-02-2006
Publisher
Springer-Verlag
Published in
Pattern Analysis and Applications / Issue 4/2006
Print ISSN: 1433-7541
Electronic ISSN: 1433-755X
DOI
https://doi.org/10.1007/s10044-005-0013-7
