Skip to main content

2020 | OriginalPaper | Buchkapitel

Dewarping Document Image by Displacement Flow Estimation with Fully Convolutional Network

verfasst von : Guo-Wang Xie, Fei Yin, Xu-Yao Zhang, Cheng-Lin Liu

Erschienen in: Document Analysis Systems

Verlag: Springer International Publishing

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

As camera-based documents are increasingly used, the rectification of distorted document images becomes a need to improve the recognition performance. In this paper, we propose a novel framework for both rectifying distorted document image and removing background finely, by estimating pixel-wise displacements using a fully convolutional network (FCN). The document image is rectified by transformation according to the displacements of pixels. The FCN is trained by regressing displacements of synthesized distorted documents, and to control the smoothness of displacements, we propose a Local Smooth Constraint (LSC) in regularization. Our approach is easy to implement and consumes moderate computing resource. Experiments proved that our approach can dewarp document images effectively under various geometric distortions, and has achieved the state-of-the-art performance in terms of local details and overall effect.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literatur
1.
Zurück zum Zitat Amidror, I.: Scattered data interpolation methods for electronic imaging systems: a survey. J. Electron. Imaging 11, 157–76 (2002)CrossRef Amidror, I.: Scattered data interpolation methods for electronic imaging systems: a survey. J. Electron. Imaging 11, 157–76 (2002)CrossRef
2.
Zurück zum Zitat Brown, M.S., Tsoi, Y.C.: Geometric and shading correction for images of printed materials using boundary. IEEE Trans. Image Process. 15(6), 1544–1554 (2006)CrossRef Brown, M.S., Tsoi, Y.C.: Geometric and shading correction for images of printed materials using boundary. IEEE Trans. Image Process. 15(6), 1544–1554 (2006)CrossRef
3.
Zurück zum Zitat Cao, H., Ding, X., Liu, C.: A cylindrical surface model to rectify the bound document image. In: Proceedings Ninth IEEE International Conference on Computer Vision, pp. 228–233. IEEE (2003) Cao, H., Ding, X., Liu, C.: A cylindrical surface model to rectify the bound document image. In: Proceedings Ninth IEEE International Conference on Computer Vision, pp. 228–233. IEEE (2003)
4.
Zurück zum Zitat Chen, L.C., Papandreou, G., Schroff, F., Adam, H.: Rethinking atrous convolution for semantic image segmentation. arXiv preprint arXiv:1706.05587 (2017) Chen, L.C., Papandreou, G., Schroff, F., Adam, H.: Rethinking atrous convolution for semantic image segmentation. arXiv preprint arXiv:​1706.​05587 (2017)
5.
Zurück zum Zitat Courteille, F., Crouzil, A., Durou, J.D., Gurdjos, P.: Shape from shading for the digitization of curved documents. Mach. Vis. Appl. 18(5), 301–316 (2007)CrossRef Courteille, F., Crouzil, A., Durou, J.D., Gurdjos, P.: Shape from shading for the digitization of curved documents. Mach. Vis. Appl. 18(5), 301–316 (2007)CrossRef
6.
Zurück zum Zitat Das, S., Ma, K., Shu, Z., Samaras, D., Shilkrot, R.: Dewarpnet: single-image document unwarping with stacked 3D and 2D regression networks. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 131–140 (2019) Das, S., Ma, K., Shu, Z., Samaras, D., Shilkrot, R.: Dewarpnet: single-image document unwarping with stacked 3D and 2D regression networks. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 131–140 (2019)
7.
Zurück zum Zitat Das, S., Mishra, G., Sudharshana, A., Shilkrot, R.: The common fold: utilizing the four-fold to dewarp printed documents from a single image. In: Proceedings of the 2017 ACM Symposium on Document Engineering, pp. 125–128. ACM (2017) Das, S., Mishra, G., Sudharshana, A., Shilkrot, R.: The common fold: utilizing the four-fold to dewarp printed documents from a single image. In: Proceedings of the 2017 ACM Symposium on Document Engineering, pp. 125–128. ACM (2017)
8.
Zurück zum Zitat Fu, B., Wu, M., Li, R., Li, W., Xu, Z., Yang, C.: A model-based book dewarping method using text line detection. In: Proceedings of 2nd International Workshop on Camera Based Document Analysis and Recognition, Curitiba, Barazil, pp. 63–70 (2007) Fu, B., Wu, M., Li, R., Li, W., Xu, Z., Yang, C.: A model-based book dewarping method using text line detection. In: Proceedings of 2nd International Workshop on Camera Based Document Analysis and Recognition, Curitiba, Barazil, pp. 63–70 (2007)
9.
Zurück zum Zitat He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016) He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)
11.
Zurück zum Zitat Li, X., Zhang, B., Liao, J., Sander, P.V.: Document rectification and illumination correction using a patch-based cnn. ACM Trans. Graph. 38(6), 1–11 (2019) Li, X., Zhang, B., Liao, J., Sander, P.V.: Document rectification and illumination correction using a patch-based cnn. ACM Trans. Graph. 38(6), 1–11 (2019)
12.
Zurück zum Zitat Liang, J., DeMenthon, D., Doermann, D.: Geometric rectification of camera-captured document images. IEEE Trans. Pattern Anal. Mach. Intell. 30(4), 591–605 (2008)CrossRef Liang, J., DeMenthon, D., Doermann, D.: Geometric rectification of camera-captured document images. IEEE Trans. Pattern Anal. Mach. Intell. 30(4), 591–605 (2008)CrossRef
13.
Zurück zum Zitat Liu, C., Yuen, J., Torralba, A.: Sift flow: dense correspondence across scenes and its applications. IEEE Trans. Pattern Anal. Mach. Intell. 33(5), 978–994 (2010)CrossRef Liu, C., Yuen, J., Torralba, A.: Sift flow: dense correspondence across scenes and its applications. IEEE Trans. Pattern Anal. Mach. Intell. 33(5), 978–994 (2010)CrossRef
14.
Zurück zum Zitat Liu, C., Zhang, Y., Wang, B., Ding, X.: Restoring camera-captured distorted document images. Int. J. Doc. Anal. Recogn. 18(2), 111–124 (2015)CrossRef Liu, C., Zhang, Y., Wang, B., Ding, X.: Restoring camera-captured distorted document images. Int. J. Doc. Anal. Recogn. 18(2), 111–124 (2015)CrossRef
15.
Zurück zum Zitat Ma, K., Shu, Z., Bai, X., Wang, J., Samaras, D.: Docunet: document image unwarping via a stacked u-net. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4700–4709 (2018) Ma, K., Shu, Z., Bai, X., Wang, J., Samaras, D.: Docunet: document image unwarping via a stacked u-net. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4700–4709 (2018)
16.
Zurück zum Zitat Meng, G., Wang, Y., Qu, S., Xiang, S., Pan, C.: Active flattening of curved document images via two structured beams. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3890–3897 (2014) Meng, G., Wang, Y., Qu, S., Xiang, S., Pan, C.: Active flattening of curved document images via two structured beams. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3890–3897 (2014)
17.
Zurück zum Zitat Ramanna, V., Bukhari, S.S., Dengel, A.: Document image dewarping using deep learning. In: International Conference on Pattern Recognition Applications and Methods (2019) Ramanna, V., Bukhari, S.S., Dengel, A.: Document image dewarping using deep learning. In: International Conference on Pattern Recognition Applications and Methods (2019)
18.
Zurück zum Zitat Tian, Y., Narasimhan, S.G.: Rectification and 3D reconstruction of curved document images. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 377–384. IEEE (2011) Tian, Y., Narasimhan, S.G.: Rectification and 3D reconstruction of curved document images. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 377–384. IEEE (2011)
19.
Zurück zum Zitat Tsoi, Y.C., Brown, M.S.: Multi-view document rectification using boundary. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–8. IEEE (2007) Tsoi, Y.C., Brown, M.S.: Multi-view document rectification using boundary. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–8. IEEE (2007)
20.
Zurück zum Zitat Wada, T., Ukida, H., Matsuyama, T.: Shape from shading with interreflections under a proximal light source: distortion-free copying of an unfolded book. Int. J. Comput. Vision 24(2), 125–135 (1997)CrossRef Wada, T., Ukida, H., Matsuyama, T.: Shape from shading with interreflections under a proximal light source: distortion-free copying of an unfolded book. Int. J. Comput. Vision 24(2), 125–135 (1997)CrossRef
21.
Zurück zum Zitat Wang, P., et al.: Understanding convolution for semantic segmentation. In: IEEE Winter Conference on Applications of Computer Vision, pp. 1451–1460. IEEE (2018) Wang, P., et al.: Understanding convolution for semantic segmentation. In: IEEE Winter Conference on Applications of Computer Vision, pp. 1451–1460. IEEE (2018)
22.
Zurück zum Zitat Wang, Z., Simoncelli, E.P., Bovik, A.C.: Multiscale structural similarity for image quality assessment. In: The Thirty-Seventh Asilomar Conference on Signals, Systems and Computers, vol. 2, pp. 1398–1402. IEEE (2003) Wang, Z., Simoncelli, E.P., Bovik, A.C.: Multiscale structural similarity for image quality assessment. In: The Thirty-Seventh Asilomar Conference on Signals, Systems and Computers, vol. 2, pp. 1398–1402. IEEE (2003)
23.
Zurück zum Zitat Xing, Y., Li, R., Cheng, L., Wu, Z.: Research on curved Chinese document correction based on deep neural network. In: International Symposium on Computational Intelligence and Design, vol. 2, pp. 342–345. IEEE (2018) Xing, Y., Li, R., Cheng, L., Wu, Z.: Research on curved Chinese document correction based on deep neural network. In: International Symposium on Computational Intelligence and Design, vol. 2, pp. 342–345. IEEE (2018)
24.
Zurück zum Zitat You, S., Matsushita, Y., Sinha, S., Bou, Y., Ikeuchi, K.: Multiview rectification of folded documents. IEEE Trans. Pattern Anal. Mach. Intell. 40(2), 505–511 (2017)CrossRef You, S., Matsushita, Y., Sinha, S., Bou, Y., Ikeuchi, K.: Multiview rectification of folded documents. IEEE Trans. Pattern Anal. Mach. Intell. 40(2), 505–511 (2017)CrossRef
25.
Zurück zum Zitat Zhang, L., Zhang, Y., Tan, C.: An improved physically-based method for geometric restoration of distorted document images. IEEE Trans. Pattern Anal. Mach. Intell. 30(4), 728–734 (2008)CrossRef Zhang, L., Zhang, Y., Tan, C.: An improved physically-based method for geometric restoration of distorted document images. IEEE Trans. Pattern Anal. Mach. Intell. 30(4), 728–734 (2008)CrossRef
Metadaten
Titel
Dewarping Document Image by Displacement Flow Estimation with Fully Convolutional Network
verfasst von
Guo-Wang Xie
Fei Yin
Xu-Yao Zhang
Cheng-Lin Liu
Copyright-Jahr
2020
DOI
https://doi.org/10.1007/978-3-030-57058-3_10

Premium Partner