Published in: Pattern Recognition and Image Analysis 4/2022

01.12.2022 | APPLIED PROBLEMS

Hough Encoder for Machine Readable Zone Localization

Authors: S. Ilyuhin, A. Sheshkus, V. Arlazarov, D. Nikolaev


Abstract

In this paper, we address the detection of the document machine-readable zone (MRZ) in images taken by smartphones. These images can have low quality and projective distortions. Current solutions for this task are mostly image processing algorithms, all of which have precision problems in various use cases. Known neural network methods could address this issue, but they impose strict requirements on computational power, which is critical for on-device inference. Therefore, we propose a method that combines a neural network with image processing and is both lightweight and accurate. The network is trained to process the input images and produce a heatmap of MRZ characters. After that, we use connected components analysis to merge the characters into lines and estimate the MRZ bounding box. The proposed neural network is a light version of the Hough encoder, an architecture that was designed to work with projectively distorted images. Our network is 1.7 times smaller than the original Hough encoder and more than 100 times smaller than typical autoencoders, which makes it possible to use the proposed method in embedded devices. Experiments were conducted on an open synthetic dataset of documents with 3 types of MRZ. Our results show that our method achieves high quality on the test dataset and has fewer trainable parameters than the common solution, U-Net.
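
The post-processing stage described in the abstract (heatmap, then connected components, then lines, then a bounding box) can be illustrated with a minimal sketch. This is not the authors' implementation: the threshold value, the use of 4-connectivity, the vertical-overlap rule for grouping characters into lines, the axis-aligned box, and all function names are assumptions chosen for illustration only.

```python
from collections import deque


def connected_components(mask):
    """Return bounding boxes (x0, y0, x1, y1) of 4-connected foreground regions."""
    h = len(mask)
    w = len(mask[0]) if h else 0
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill from an unvisited foreground pixel.
            queue = deque([(y, x)])
            seen[y][x] = True
            x0 = x1 = x
            y0 = y1 = y
            while queue:
                cy, cx = queue.popleft()
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((x0, y0, x1, y1))
    return boxes


def group_into_lines(boxes):
    """Greedily merge character boxes whose vertical extents overlap (assumed rule)."""
    lines = []
    for box in sorted(boxes, key=lambda b: (b[1], b[0])):
        for line in lines:
            if box[1] <= line[3] and box[3] >= line[1]:  # vertical overlap
                line[0] = min(line[0], box[0])
                line[1] = min(line[1], box[1])
                line[2] = max(line[2], box[2])
                line[3] = max(line[3], box[3])
                break
        else:
            lines.append(list(box))
    return [tuple(line) for line in lines]


def mrz_bounding_box(heatmap, threshold=0.5):
    """Threshold the heatmap, find character blobs, group them, and bound all lines."""
    mask = [[value >= threshold for value in row] for row in heatmap]
    lines = group_into_lines(connected_components(mask))
    if not lines:
        return None, []
    box = (
        min(line[0] for line in lines),
        min(line[1] for line in lines),
        max(line[2] for line in lines),
        max(line[3] for line in lines),
    )
    return box, lines


# Toy heatmap: two text lines of three character blobs each.
heatmap = [[0.0] * 10 for _ in range(6)]
for y, xs in ((1, (1, 3, 4, 6)), (2, (1, 3, 4, 6)), (4, (1, 2, 5, 7, 8))):
    for x in xs:
        heatmap[y][x] = 0.9

box, lines = mrz_bounding_box(heatmap)
print(lines)  # [(1, 1, 6, 2), (1, 4, 8, 4)]
print(box)    # (1, 1, 8, 4)
```

An axis-aligned box is the simplest choice for a sketch; for projectively distorted input, a quadrilateral fitted to the line endpoints would likely be more appropriate, but the abstract does not specify how the box is computed.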

Metadata
Title
Hough Encoder for Machine Readable Zone Localization
Authors
S. Ilyuhin
A. Sheshkus
V. Arlazarov
D. Nikolaev
Publication date
01.12.2022
Publisher
Pleiades Publishing
Published in
Pattern Recognition and Image Analysis / Issue 4/2022
Print ISSN: 1054-6618
Electronic ISSN: 1555-6212
DOI
https://doi.org/10.1134/S1054661822040150
