
2019 | Original Paper | Book Chapter

Learning to Clean: A GAN Perspective

Authors: Monika Sharma, Abhishek Verma, Lovekesh Vig

Published in: Computer Vision – ACCV 2018 Workshops

Publisher: Springer International Publishing


Abstract

In the big data era, the impetus to digitize the vast reservoirs of data trapped in unstructured scanned documents such as invoices, bank documents, courier receipts and contracts has gained fresh momentum. The scanning process often introduces artifacts such as salt-and-pepper/background noise, blur due to camera motion or shake, watermarks, coffee stains, wrinkles, or faded text. These artifacts pose readability challenges for current text recognition algorithms and significantly degrade their performance. Existing learning-based denoising techniques require a dataset of noisy documents paired with clean versions of the same documents, in which case a model can be trained to generate clean documents from their noisy counterparts. In the real world, however, such a paired dataset is often unavailable, and all that is available for training a denoising model are unpaired sets of noisy and clean images. This paper explores the use of Generative Adversarial Networks (GANs) to generate denoised versions of noisy documents. Where paired data are available, we formulate the problem as an image-to-image translation task, i.e., translating a document from the noisy domain (background noise, blur, fading, watermarks) to a clean target document using a conditional GAN. In the absence of paired training images, we employ CycleGAN, which learns a mapping between the distributions of noisy and clean images from unpaired data to achieve image-to-image translation for document cleaning. We compare the performance of CycleGAN trained on unpaired images with that of a conditional GAN trained on paired data from the same dataset. Experiments were performed on a public document dataset with different types of artificially induced noise; the results demonstrate that CycleGAN learns a more robust mapping from the space of noisy documents to clean documents.
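
As a concrete point of reference for the two objectives compared above, the sketch below illustrates how the generator-side losses might look in PyTorch for noisy-to-clean document translation. It is not the authors' implementation: the network modules (G, F, D_clean, D_noisy, and a conditioned discriminator D), the least-squares adversarial formulation, and the weighting constants (lambda_l1 = 100 and lambda_cyc = 10, the defaults of pix2pix and CycleGAN respectively) are assumptions made for illustration.

```python
# Minimal sketch of the two generator objectives contrasted in the abstract.
# Assumes PyTorch and generators/discriminators defined elsewhere as nn.Modules.
import torch
import torch.nn.functional as F_nn


def cgan_generator_loss(G, D, noisy, clean, lambda_l1=100.0):
    """Paired setting (conditional GAN): adversarial term plus an L1 term
    that ties the generated page to its known clean counterpart."""
    fake_clean = G(noisy)
    # Assumption: D is conditioned on the noisy input as well as the output.
    pred = D(noisy, fake_clean)
    adv = F_nn.mse_loss(pred, torch.ones_like(pred))  # least-squares GAN loss
    l1 = F_nn.l1_loss(fake_clean, clean)
    return adv + lambda_l1 * l1


def cyclegan_generator_loss(G, F, D_clean, D_noisy, noisy, clean, lambda_cyc=10.0):
    """Unpaired setting (CycleGAN): two adversarial terms plus cycle-consistency,
    so no noisy page needs to be paired with its clean version."""
    fake_clean, fake_noisy = G(noisy), F(clean)
    pred_c, pred_n = D_clean(fake_clean), D_noisy(fake_noisy)
    adv = (F_nn.mse_loss(pred_c, torch.ones_like(pred_c)) +
           F_nn.mse_loss(pred_n, torch.ones_like(pred_n)))
    # Round-trip reconstructions should recover the original images.
    cyc = (F_nn.l1_loss(F(fake_clean), noisy) +
           F_nn.l1_loss(G(fake_noisy), clean))
    return adv + lambda_cyc * cyc
```

In practice each discriminator is also trained with a symmetric real/fake objective, and CycleGAN optionally adds an identity loss; both are omitted here for brevity.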


Metadata
Title
Learning to Clean: A GAN Perspective
Authors
Monika Sharma
Abhishek Verma
Lovekesh Vig
Copyright Year
2019
DOI
https://doi.org/10.1007/978-3-030-21074-8_14
