2021 | Original Paper | Book chapter

On the Use of Attention in Deep Learning Based Denoising Method for Ancient Cham Inscription Images

Authors: Tien-Nam Nguyen, Jean-Christophe Burie, Thi-Lan Le, Anne-Valerie Schweyer

Published in: Document Analysis and Recognition – ICDAR 2021

Publisher: Springer International Publishing

Abstract

Image denoising is one of the most important steps in the document image analysis pipeline because of its positive effect on the rest of the workflow. However, the noise in historical documents differs substantially from the common noise found in classical image processing problems. This is particularly true for images of Cham inscriptions obtained by stamping ancient steles. In this paper, we leverage deep learning to adapt to these noisy conditions. The proposed network follows an encoder-decoder structure that combines convolution/deconvolution operators with symmetrical skip connections and residual blocks to improve the reconstructed image. Furthermore, global attention fusion is proposed to learn the relevant regions in the image. Our experiments demonstrate that the proposed method can not only remove unwanted parts of the image but also enhance the visual quality of the Cham inscriptions.
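
The data flow named in the abstract (encoder, residual block, decoder, symmetric skip connection, attention on the skip path) can be illustrated with a small NumPy sketch. This is not the authors' network: the abstract does not specify layer counts, channel widths, or how the global attention fusion is built, so everything below is an assumption made for illustration. The function names (`conv3x3`, `attention_gate`, `denoise_sketch`), the single-channel 8x8 input, the random untrained weights, average pooling in place of a strided convolution, nearest-neighbour upsampling in place of a deconvolution, and the generic sigmoid gate in place of the paper's attention module are all hypothetical stand-ins.

```python
import numpy as np

def conv3x3(x, w):
    """Same-size 3x3 convolution of a 2-D array x with kernel w (edge padding)."""
    p = np.pad(x, 1, mode="edge")
    h, wd = x.shape
    out = np.zeros_like(x, dtype=float)
    for i in range(3):
        for j in range(3):
            out += w[i, j] * p[i:i + h, j:j + wd]
    return out

def relu(x):
    return np.maximum(x, 0.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def residual_block(x, w1, w2):
    """Identity shortcut plus two convolutions: x + F(x)."""
    return x + conv3x3(relu(conv3x3(x, w1)), w2)

def downsample(x):
    """2x2 average pooling (assumed stand-in for a strided convolution)."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample(x):
    """Nearest-neighbour 2x upsampling (assumed stand-in for a deconvolution)."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

def attention_gate(skip, gating, w_att):
    """Generic sigmoid gate (an assumption, not the paper's module):
    weights the skip features by a map with values in (0, 1)."""
    mask = sigmoid(conv3x3(skip + gating, w_att))
    return mask * skip, mask

def denoise_sketch(img, params):
    e1 = relu(conv3x3(img, params["enc"]))                  # encoder, full resolution
    e2 = downsample(e1)                                     # encoder, half resolution
    b = residual_block(e2, params["res1"], params["res2"])  # residual bottleneck
    d1 = upsample(b)                                        # decoder, full resolution
    gated, mask = attention_gate(e1, d1, params["att"])     # attention on the skip path
    out = conv3x3(d1 + gated, params["dec"])                # symmetric skip connection
    return out, mask

# Random, untrained weights: the output only demonstrates shapes and data flow.
rng = np.random.default_rng(0)
params = {k: rng.normal(scale=0.1, size=(3, 3))
          for k in ("enc", "res1", "res2", "att", "dec")}
noisy = rng.random((8, 8))
restored, mask = denoise_sketch(noisy, params)
print(restored.shape, mask.shape)
```

Because the weights are random, the output is not a denoised image. The sketch only shows that the encoder feature map `e1` rejoins the decoder at the matching resolution and that the attention mask rescales it before the two are fused.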

Metadata
Title
On the Use of Attention in Deep Learning Based Denoising Method for Ancient Cham Inscription Images
Authors
Tien-Nam Nguyen
Jean-Christophe Burie
Thi-Lan Le
Anne-Valerie Schweyer
Copyright year
2021
DOI
https://doi.org/10.1007/978-3-030-86549-8_26
