2016 | OriginalPaper | Book chapter

Dense Correspondences and Ancient Texts

Authors: Tal Hassner, Lior Wolf, Nachum Dershowitz, Gil Sadeh, Daniel Stökl Ben-Ezra

Published in: Dense Image Correspondences for Computer Vision

Publisher: Springer International Publishing

Abstract

This chapter concerns applications of dense correspondences to images very different in nature from those considered in previous chapters. Rather than images of natural or man-made scenes and objects, we deal here with images of texts. We present a novel, dense correspondence-based approach to text image analysis, in place of the more traditional analysis at the character level (e.g., existing optical character recognition methods) or at the word level (the so-called word spotting approach). We focus on the challenging domain of historical text image analysis. Such texts are handwritten and are often severely corrupted by noise and degradation, making them difficult to handle with existing methods. Our system is designed for the particular task of aligning such manuscript images to their transcripts. As an alternative to performing this task manually, we propose a system that directly matches the historical text image with a synthetic image rendered from the transcript. These matches are performed at the pixel level, using SIFT flow applied to a novel per-pixel representation. Our pipeline is robust to document degradation, variations between script styles, and nonlinear image transformations. More importantly, this per-pixel matching approach does not require prior learning of the particular script used in the documents being processed, and so can easily be applied to manuscripts of widely varying origins, languages, and characteristics.
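To make the idea of pixel-level matching between a manuscript image and a rendered transcript image more concrete, the following is a minimal sketch under strong simplifying assumptions. It is not the chapter's method: it uses raw intensity patches instead of the per-pixel representation mentioned in the abstract, and a brute-force window search with no smoothness term instead of SIFT flow. The function names `patch_descriptors` and `dense_match`, and the parameters `radius` and `max_shift`, are hypothetical and chosen only for illustration.

```python
import numpy as np


def patch_descriptors(img, radius=2):
    """Describe each pixel by the intensities of its square neighborhood.

    Stand-in for a real per-pixel descriptor; returns shape (h, w, k*k),
    where k = 2 * radius + 1.
    """
    padded = np.pad(img.astype(np.float64), radius, mode="edge")
    h, w = img.shape
    size = 2 * radius + 1
    channels = [padded[dy:dy + h, dx:dx + w]
                for dy in range(size) for dx in range(size)]
    return np.stack(channels, axis=-1)


def dense_match(src, dst, radius=2, max_shift=3):
    """Brute-force dense matching between two equally sized images.

    For every pixel (y, x) of src, pick the displacement (dy, dx) with
    |dy|, |dx| <= max_shift that minimizes the L1 distance between the
    descriptor at src[y, x] and the descriptor at dst[y + dy, x + dx].
    Each pixel is matched independently (no smoothness prior).
    """
    d_src = patch_descriptors(src, radius)
    d_dst = patch_descriptors(dst, radius)
    h, w = src.shape
    pad = max_shift
    # Pad the target descriptors so shifted lookups stay in bounds.
    d_dst_p = np.pad(d_dst, ((pad, pad), (pad, pad), (0, 0)), mode="edge")

    best_cost = np.full((h, w), np.inf)
    flow = np.zeros((h, w, 2), dtype=np.int64)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            shifted = d_dst_p[pad + dy:pad + dy + h, pad + dx:pad + dx + w]
            cost = np.abs(d_src - shifted).sum(axis=-1)
            better = cost < best_cost
            best_cost[better] = cost[better]
            flow[better] = (dy, dx)
    return flow


# Toy usage: the "manuscript" is the "rendered" image shifted by (1, 2).
rng = np.random.default_rng(0)
rendered = rng.random((20, 20))
manuscript = np.roll(rendered, shift=(1, 2), axis=(0, 1))
flow = dense_match(rendered, manuscript)
```

In this toy setting the recovered displacement field is constant away from the image borders. Real manuscripts would need a descriptor that tolerates degradation and script variation, plus a regularized matcher, which is the role the abstract assigns to the per-pixel representation and SIFT flow.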

Metadata
Title
Dense Correspondences and Ancient Texts
Authors
Tal Hassner
Lior Wolf
Nachum Dershowitz
Gil Sadeh
Daniel Stökl Ben-Ezra
Copyright year
2016
DOI
https://doi.org/10.1007/978-3-319-23048-1_12