2021 | OriginalPaper | Book chapter

On the Use of Attention in Deep Learning Based Denoising Method for Ancient Cham Inscription Images

Authors: Tien-Nam Nguyen, Jean-Christophe Burie, Thi-Lan Le, Anne-Valerie Schweyer

Published in: Document Analysis and Recognition – ICDAR 2021

Publisher: Springer International Publishing

Abstract

Image denoising is one of the most important steps in the document image analysis pipeline because of its positive effect on the rest of the workflow. However, the noise in historical documents differs substantially from the common noise found in classical image processing problems. This is particularly true of images of Cham inscriptions obtained by stamping ancient steles. In this paper, we leverage deep learning to adapt to these noisy conditions. The proposed network follows an encoder-decoder structure that combines convolution/deconvolution operators with symmetrical skip connections and residual blocks to improve the reconstructed image. Furthermore, global attention fusion is proposed to learn the relevant regions in the image. Our experiments demonstrate that the proposed method can not only remove unwanted parts of the image, but also enhance the visual quality of the Cham inscriptions.
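
The abstract names the building blocks of the network but gives no layer details. The toy sketch below only illustrates the general data flow such a design implies (encoder, residual block at the bottleneck, decoder, and a skip connection weighted by an attention gate) on a 1-D signal in plain Python. Everything in it is an assumption made for illustration: the function names, the fixed smoothing kernel, pair averaging as a stand-in for strided convolution, sample repetition as a stand-in for deconvolution, and the sigmoid gate. It is not the authors' network and nothing here is trained.

```python
import math

def conv1d_same(x, kernel):
    """Zero-padded 1-D cross-correlation that keeps the input length."""
    half = len(kernel) // 2
    padded = [0.0] * half + list(x) + [0.0] * half
    return [sum(k * padded[i + j] for j, k in enumerate(kernel))
            for i in range(len(x))]

def relu(x):
    return [max(0.0, v) for v in x]

def downsample(x):
    """Hypothetical stand-in for a strided convolution: average adjacent pairs."""
    return [(x[i] + x[i + 1]) / 2.0 for i in range(0, len(x) - 1, 2)]

def upsample(x):
    """Hypothetical stand-in for a deconvolution: repeat every sample twice."""
    return [v for v in x for _ in range(2)]

def residual_block(x, kernel):
    """y = x + relu(conv(x)): the block adds a correction to its own input."""
    return [a + b for a, b in zip(x, relu(conv1d_same(x, kernel)))]

def attention_gate(skip, gating):
    """Scale each skip feature by sigmoid(skip + gating), a weight in (0, 1)."""
    weights = [1.0 / (1.0 + math.exp(-(s + g))) for s, g in zip(skip, gating)]
    return [w * s for w, s in zip(weights, skip)], weights

def denoise_sketch(signal, kernel=(0.25, 0.5, 0.25)):
    """Toy forward pass; assumes an even-length input signal."""
    enc = relu(conv1d_same(signal, kernel))               # encoder, full resolution
    bottleneck = residual_block(downsample(enc), kernel)  # half resolution
    dec = upsample(bottleneck)                            # decoder, full resolution
    gated_skip, weights = attention_gate(enc, dec)        # attention on the skip path
    out = [d + s for d, s in zip(dec, gated_skip)]        # symmetric skip connection
    return out, weights

out, weights = denoise_sketch([0.0, 1.0, 0.0, 1.0, 5.0, 1.0, 0.0, 1.0])
print(len(out), [round(w, 3) for w in weights])
```

In a real model the kernel and the gate would be learned, and the attention weights would indicate which regions of the skip features the decoder should rely on.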

Title: On the Use of Attention in Deep Learning Based Denoising Method for Ancient Cham Inscription Images
Authors: Tien-Nam Nguyen, Jean-Christophe Burie, Thi-Lan Le, Anne-Valerie Schweyer
Publisher: Springer International Publishing
Book: Document Analysis and Recognition – ICDAR 2021
Print ISBN: 978-3-030-86548-1
Electronic ISBN: 978-3-030-86549-8
Copyright year: 2021
DOI: https://doi.org/10.1007/978-3-030-86549-8_26
