2021 | OriginalPaper | Book chapter

Weakly Supervised Bounding Box Extraction for Unlabeled Data in Table Detection

Written by: Arash Samari, Andrew Piper, Alison Hedley, Mohamed Cheriet

Published in: Pattern Recognition. ICPR International Workshops and Challenges

Publisher: Springer International Publishing

Abstract

The organization and presentation of data in tabular format became an essential strategy of scientific communication and remains fundamental to the transmission of knowledge today. Automated detection of typographical elements such as tables and diagrams in digitized historical print offers a promising approach for future research. Most table detection work relies on existing off-the-shelf detection methods. However, the datasets used for evaluation are not challenging enough because they lack quantity and diversity. To enable a better comparison between proposed methods, we introduce the NAS dataset of digitized historical images in this paper. Tables in historic scientific documents vary widely in their characteristics. They also appear alongside visually similar items, such as maps, diagrams, and illustrations. We address these challenges with a multi-phase procedure, outlined in this article and evaluated on two datasets, ECCO (https://www.gale.com/primary-sources/eighteenth-century-collections-online) and NAS (https://beta.synchromedia.ca/vok-visibility-of-knowledge). In our approach, we use the Gabor filter [1] to prepare our dataset for algorithmic detection with Faster R-CNN [2]. This method detects tables against all categories of visual information. Because labeled data is limited, particularly for object detection, we developed a new method, weakly supervised bounding box extraction, to extract bounding boxes automatically for our training set. A pseudo-labeling technique is then used to create a more general model, via a three-step process of bounding box extraction and labeling.
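The abstract names the Gabor filter [1] as a preprocessing step but gives no parameters or implementation details. As a rough illustration only, the sketch below builds the standard real-valued 2D Gabor kernel (a Gaussian envelope multiplied by a cosine carrier) and evaluates its response at one pixel. The function names `gabor_kernel` and `filter_response`, the default parameter values, and the zero-padding choice are all assumptions made for this example and are not taken from the chapter.

```python
import math


def gabor_kernel(size, sigma, theta, wavelength, gamma=0.5, psi=0.0):
    """Build a size x size real-valued Gabor kernel as a list of rows.

    All parameter values are illustrative assumptions, not the chapter's settings.
    sigma: width of the Gaussian envelope; theta: orientation in radians;
    wavelength: period of the cosine carrier; gamma: spatial aspect ratio;
    psi: phase offset of the carrier.
    """
    if size % 2 == 0:
        raise ValueError("size must be odd so the kernel has a center pixel")
    half = size // 2
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate the coordinates by theta.
            xr = x * cos_t + y * sin_t
            yr = -x * sin_t + y * cos_t
            envelope = math.exp(-(xr * xr + (gamma * yr) ** 2) / (2.0 * sigma ** 2))
            carrier = math.cos(2.0 * math.pi * xr / wavelength + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


def filter_response(image, kernel, row, col):
    """Correlate the kernel with the image at (row, col), zero-padding the border."""
    half = len(kernel) // 2
    height, width = len(image), len(image[0])
    total = 0.0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            r, c = row + dy, col + dx
            if 0 <= r < height and 0 <= c < width:
                total += kernel[dy + half][dx + half] * image[r][c]
    return total


if __name__ == "__main__":
    # Toy 9 x 9 page with a single vertical rule in the middle column.
    page = [[1.0 if c == 4 else 0.0 for c in range(9)] for _ in range(9)]
    k_vertical = gabor_kernel(5, sigma=2.0, theta=0.0, wavelength=4.0)
    k_horizontal = gabor_kernel(5, sigma=2.0, theta=math.pi / 2, wavelength=4.0)
    print(abs(filter_response(page, k_vertical, 4, 4)))
    print(abs(filter_response(page, k_horizontal, 4, 4)))
```

On the toy page, the kernel with `theta=0.0` responds more strongly to the vertical rule than the kernel rotated by 90 degrees. This orientation selectivity is one plausible reason a Gabor filter helps with ruled or column-aligned content, though how the chapter actually applies the filter is not stated in the abstract.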

References
1. Lee, T.S.: Image representation using 2D Gabor wavelets. IEEE Trans. Pattern Anal. Mach. Intell. 18(10), 959–971 (1996)
2. Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. In: Advances in Neural Information Processing Systems, pp. 91–99 (2015)
3. Piper, A., Wellmon, C., Cheriet, M.: The page image: towards a visual history of digital documents. Book History 23(1), 365–397 (2020)
4. Michie, D., Spiegelhalter, D.J., Taylor, C.C., et al.: Machine learning. Neural Stat. Classif. 13(1994), 1–298 (1994)
5. Gilani, A., Qasim, S.R., Malik, I., Shafait, F.: Table detection using deep learning. In: 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), vol. 1, pp. 771–776. IEEE (2017)
6. Shahab, A., Shafait, F., Kieninger, T., Dengel, A.: An open approach towards the benchmarking of table structure recognition systems. In: Proceedings of the 9th IAPR International Workshop on Document Analysis Systems, pp. 113–120 (2010)
7. Pyreddy, P., Croft, W.B.: TINTIN: a system for retrieval in text tables. In: Proceedings of the Second ACM International Conference on Digital Libraries, pp. 193–200 (1997)
8. Kieninger, T., Dengel, A.: Applying the T-Recs table recognition system to the business letter domain. In: Proceedings of Sixth International Conference on Document Analysis and Recognition, pp. 518–522. IEEE (2001)
9. Kasar, T., Barlas, P., Adam, S., Chatelain, C., Paquet, T.: Learning to detect tables in scanned document images using line information. In: 2013 12th International Conference on Document Analysis and Recognition, pp. 1185–1189. IEEE (2013)
10. Yildiz, B., Kaiser, K., Miksch, S.: pdf2table: a method to extract table information from pdf files. In: IICAI, pp. 1773–1785 (2005)
11. Fang, J., Gao, L., Bai, K., Qiu, R., Tao, X., Tang, Z.: A table detection method for multipage pdf documents via visual seperators and tabular structures. In: 2011 International Conference on Document Analysis and Recognition, pp. 779–783. IEEE (2011)
12. Hu, J., Kashi, R.S., Lopresti, D.P., Wilfong, G.: Medium-independent table detection. In: Document Recognition and Retrieval VII, vol. 3967, pp. 291–302. International Society for Optics and Photonics (1999)
13. e Silva, A.C.: Learning rich hidden Markov models in document analysis: table location. In: 2009 10th International Conference on Document Analysis and Recognition, pp. 843–847. IEEE (2009)
14. Tran, D.N., Tran, T.A., Oh, A., Kim, S.H., Na, I.S.: Table detection from document image using vertical arrangement of text blocks. Int. J. Contents 11(4), 77–85 (2015)
16. Schreiber, S., Agne, S., Wolf, I., Dengel, A., Ahmed, S.: DeepDeSRT: deep learning for detection and structure recognition of tables in document images. In: 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), vol. 1, pp. 1162–1167. IEEE (2017)
17. Göbel, M., Hassan, T., Oro, E., Orsi, G.: ICDAR 2013 table competition. In: 2013 12th International Conference on Document Analysis and Recognition, pp. 1449–1453. IEEE (2013)
18. Girshick, R., Donahue, J., Darrell, T., Malik, J.: Rich feature hierarchies for accurate object detection and semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 580–587 (2014)
19. Girshick, R.: Fast R-CNN. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1440–1448 (2015)
20. Fang, J., Tao, X., Tang, Z., Qiu, R., Liu, Y.: Dataset, ground-truth and performance metrics for table detection evaluation. In: 2012 10th IAPR International Workshop on Document Analysis Systems, pp. 445–449. IEEE (2012)
21. Breu, H., Gil, J., Kirkpatrick, D., Werman, M.: Linear time euclidean distance transform algorithms. IEEE Trans. Pattern Anal. Mach. Intell. 17(5), 529–533 (1995)

Metadata
Title
Weakly Supervised Bounding Box Extraction for Unlabeled Data in Table Detection
Written by
Arash Samari
Andrew Piper
Alison Hedley
Mohamed Cheriet
Copyright year
2021
DOI
https://doi.org/10.1007/978-3-030-68787-8_25