2019 | Original Paper | Book Chapter

Improving the Perceptual Quality of Document Images Using Deep Neural Network

Authors: Ram Krishna Pandey, A. G. Ramakrishnan

Published in: Advances in Neural Networks – ISNN 2019

Publisher: Springer International Publishing

Abstract

Given a low-resolution binary document image, we aim to improve its perceptual quality for enhanced readability. We propose a simple deep-learning-based model that combines convolution, transposed convolution and sub-pixel layers to construct the high-resolution image. The proposed architecture scales across the three scripts tested, namely Tamil, Kannada and Roman. To show that the reconstructed output has enhanced readability, we use the objective criterion of optical character recognizer (OCR) character-level accuracy. The results reported for our CTCS architecture show significant improvement in terms of both the subjective criterion of human readability and the objective criterion of OCR character-level accuracy.
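The abstract names sub-pixel layers as one of the upscaling building blocks but does not describe them further. As a rough illustration only, the sketch below shows the general sub-pixel rearrangement (often called pixel shuffle or depth-to-space) in NumPy: a feature map with `C * r * r` channels is reorganized into an image that is `r` times larger in each spatial dimension. The function name `pixel_shuffle`, the height-width-channel layout and the channel ordering are assumptions made for this example; it is not the authors' CTCS architecture, which is not specified on this page.

```python
import numpy as np


def pixel_shuffle(features: np.ndarray, r: int) -> np.ndarray:
    """Rearrange an (H, W, C*r*r) feature map into an (H*r, W*r, C) image.

    Assumed channel ordering (hypothetical, chosen for this sketch):
    input channel (dy * r + dx) * C + k goes to output pixel offset
    (dy, dx) and output channel k.
    """
    h, w, c_in = features.shape
    if c_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r * r")
    c = c_in // (r * r)

    # Split the channel axis into (dy, dx, k).
    x = features.reshape(h, w, r, r, c)
    # Interleave the offsets with the spatial axes: (h, dy, w, dx, k).
    x = x.transpose(0, 2, 1, 3, 4)
    # Merge (h, dy) and (w, dx) into the upscaled height and width.
    return x.reshape(h * r, w * r, c)


# Usage: one low-resolution pixel with 4 channels becomes a 2x2 patch.
low_res = np.arange(4, dtype=np.float32).reshape(1, 1, 4)
high_res = pixel_shuffle(low_res, r=2)
print(high_res.shape)  # (2, 2, 1)
```

In a network, the convolution preceding this step would learn to produce the `C * r * r` channels, so the upscaling itself involves no arithmetic beyond reordering values.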

Title: Improving the Perceptual Quality of Document Images Using Deep Neural Network
Authors: Ram Krishna Pandey, A. G. Ramakrishnan
Publisher: Springer International Publishing
Book: Advances in Neural Networks – ISNN 2019
Print ISBN: 978-3-030-22807-1
Electronic ISBN: 978-3-030-22808-8
Copyright year: 2019
DOI: https://doi.org/10.1007/978-3-030-22808-8_44
