2019 | OriginalPaper | Book chapter

Improving the Perceptual Quality of Document Images Using Deep Neural Network

Authors: Ram Krishna Pandey, A. G. Ramakrishnan

Published in: Advances in Neural Networks – ISNN 2019

Publisher: Springer International Publishing

Abstract

Given a low-resolution binary document image, we aim to improve its perceptual quality for enhanced readability. We propose a simple deep-learning-based model that combines convolution with transposed convolution and sub-pixel layers to construct the high-resolution image. The proposed architecture scales across the three scripts tested, namely Tamil, Kannada and Roman. To show that the reconstructed output has enhanced readability, we use the objective criterion of optical character recognizer (OCR) character-level accuracy. The results reported for our CTCS architecture show significant improvement in terms of both the subjective criterion of human readability and the objective criterion of OCR character-level accuracy.
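
The sub-pixel layer mentioned in the abstract can be illustrated with a minimal sketch of the rearrangement step alone: r*r low-resolution feature maps are interleaved into one map that is r times larger in each dimension. This is not the authors' CTCS implementation; the function name `pixel_shuffle`, the channel ordering (row offset first, then column offset) and the toy values are assumptions made for illustration.

```python
def pixel_shuffle(channels, r):
    """Interleave r*r low-resolution maps into one high-resolution map.

    channels: list of r*r feature maps, each a list of H rows of W values.
    Returns a single map with H*r rows and W*r columns.
    The channel ordering used here is an assumption for this sketch.
    """
    if len(channels) != r * r:
        raise ValueError("expected r*r input channels")
    height = len(channels[0])
    width = len(channels[0][0])
    out = [[0] * (width * r) for _ in range(height * r)]
    for k, fmap in enumerate(channels):
        # Channel k fills the sub-pixel at offset (di, dj) in every r x r cell.
        di, dj = divmod(k, r)
        for y in range(height):
            for x in range(width):
                out[y * r + di][x * r + dj] = fmap[y][x]
    return out


# Hypothetical toy input: four 1x2 maps become one 2x4 map for r = 2.
low_res_channels = [[[1, 5]], [[2, 6]], [[3, 7]], [[4, 8]]]
high_res = pixel_shuffle(low_res_channels, 2)
print(high_res)  # [[1, 2, 5, 6], [3, 4, 7, 8]]
```

In a trained network, the r*r input maps would come from a preceding convolution, so the upscaling itself adds no learnable parameters; only the rearrangement is shown here.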

Metadata
Title
Improving the Perceptual Quality of Document Images Using Deep Neural Network
Authors
Ram Krishna Pandey
A. G. Ramakrishnan
Copyright year
2019
DOI
https://doi.org/10.1007/978-3-030-22808-8_44
