2019 | OriginalPaper | Book chapter

MultiDIAS: A Hierarchical Multi-layered Document Image Annotation System

Authors: Arnab Poddar, Rohan Mukherjee, Jayanta Mukhopadhyay, Prabir Kumar Biswas

Published in: Document Analysis and Recognition

Publisher: Springer Singapore

Abstract

The content of document images often has a hierarchical, multi-layered tree structure. Moreover, algorithms for document image applications such as line detection, paragraph detection, word recognition, and layout analysis require pixel-level annotation. This paper introduces a Multi-layered Document Image Annotation System (MultiDIAS). The proposed system provides a single platform for both hierarchical and pixel-level annotation of documents. MultiDIAS labels the document image in four hierarchical layers (layout type, entity type, line type, word type) assigned by the user. The output consists of four ground-truth images and an XML file representing the metadata. MultiDIAS is tested on a complex handwritten manuscript written by the renowned film director Satyajit Ray for the movie ‘Goopi Gyne Bagha Byne’. The annotated data generated using MultiDIAS can further be used in a wide range of document image understanding and analysis applications.
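
To make the described output concrete, the following is a minimal sketch of how a four-layer annotation tree could be turned into four label images and one XML description. It is not the MultiDIAS implementation: the abstract does not specify its data model, so the `Region` class, the integer class ids, the XML element and attribute names, and the even-odd ray-casting fill are all assumptions chosen for illustration.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field

# The four layers named in the abstract; the short names are hypothetical.
LAYERS = ("layout", "entity", "line", "word")

@dataclass
class Region:
    layer: str      # one of LAYERS
    label: int      # hypothetical integer class id chosen by the annotator
    polygon: list   # [(x, y), ...] vertices in pixel coordinates
    children: list = field(default_factory=list)

def inside(x, y, polygon):
    """Even-odd ray-casting test: is the point (x, y) inside the polygon?"""
    hit = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                hit = not hit
    return hit

def rasterize(root_regions, width, height):
    """Return one label image (list of rows) per layer; 0 means unlabeled."""
    images = {name: [[0] * width for _ in range(height)] for name in LAYERS}
    stack = list(root_regions)
    while stack:
        region = stack.pop()
        img = images[region.layer]
        for y in range(height):
            for x in range(width):
                # Test the pixel centre against the region outline.
                if inside(x + 0.5, y + 0.5, region.polygon):
                    img[y][x] = region.label
        stack.extend(region.children)
    return images

def to_xml(root_regions):
    """Serialize the annotation tree; nesting mirrors the layer hierarchy."""
    root = ET.Element("document")

    def add(parent, region):
        node = ET.SubElement(parent, region.layer, label=str(region.label))
        node.set("points", " ".join(f"{x},{y}" for x, y in region.polygon))
        for child in region.children:
            add(node, child)

    for region in root_regions:
        add(root, region)
    return ET.tostring(root, encoding="unicode")

# Usage: one word inside one line inside one entity inside one layout block.
word = Region("word", 7, [(1, 1), (3, 1), (3, 2), (1, 2)])
line = Region("line", 5, [(1, 1), (7, 1), (7, 2), (1, 2)], [word])
entity = Region("entity", 3, [(1, 1), (7, 1), (7, 5), (1, 5)], [line])
layout = Region("layout", 1, [(0, 0), (8, 0), (8, 6), (0, 6)], [entity])

images = rasterize([layout], width=8, height=6)
xml_text = to_xml([layout])
```

Keeping one image per layer means a single pixel can carry a layout label, an entity label, a line label, and a word label at the same time, while the nested XML records which region contains which.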

Metadata
Title
MultiDIAS: A Hierarchical Multi-layered Document Image Annotation System
Authors
Arnab Poddar
Rohan Mukherjee
Jayanta Mukhopadhyay
Prabir Kumar Biswas
Copyright year
2019
Publisher
Springer Singapore
DOI
https://doi.org/10.1007/978-981-13-9361-7_1