Published in: Cluster Computing 6/2023

20 September 2023

GPU-based and streaming-enabled implementation of pre-processing flow towards enhancing optical character recognition accuracy and efficiency

Authors: Gener Serhan, Dattilo Parker, Gajaria Dhruv, Fusco Alexander, Akoglu Ali

Abstract

Research has demonstrated that pre-processing digital images through operations such as scaling, rotation, and blurring can enhance the accuracy of optical character recognition (OCR) by emphasizing important features within the image. Using the open-source Tesseract OCR engine on a dataset of 560 phone screen images, our study found that accuracy can be improved through pre-processing techniques including thresholding, rotation, rescaling, erosion, dilation, and noise removal. However, our CPU-based implementation of this flow had an average latency of 48.32 ms per image, which can hinder OCR processing of millions of images. To address this challenge, we parallelized the pre-processing flow on the NVIDIA P100 GPU and executed it through a streaming approach, which reduced the latency to 0.825 ms, a speedup of 58.6x over the serial execution. This implementation enables the use of a GPU-based OCR engine to handle multiple sources of data streams with large-scale workloads.
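
To make three of the steps named in the abstract concrete, the following is a minimal pure-Python sketch of thresholding followed by erosion and dilation (a morphological opening, used here as a simple stand-in for noise removal). It is not the authors' GPU implementation: the function names, the fixed cutoff of 128, and the 3x3 square structuring element are all assumptions chosen for illustration, and rotation and rescaling are left out.

```python
# Illustrative sketch only: serial, pure-Python versions of thresholding,
# erosion, and dilation. All names and parameters here are hypothetical.

def threshold(gray, cutoff=128):
    """Binarize a grayscale image (list of rows, values 0-255); 1 = foreground."""
    return [[1 if px >= cutoff else 0 for px in row] for row in gray]


def _morph(binary, combine):
    """Slide a 3x3 square window over the image and reduce it with `combine`.

    Neighbours that fall outside the image are skipped rather than padded.
    """
    h, w = len(binary), len(binary[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [
                binary[ny][nx]
                for ny in range(max(0, y - 1), min(h, y + 2))
                for nx in range(max(0, x - 1), min(w, x + 2))
            ]
            row.append(combine(window))
        out.append(row)
    return out


def erode(binary):
    """A pixel stays foreground only if every in-bounds neighbour is foreground."""
    return _morph(binary, min)


def dilate(binary):
    """A pixel becomes foreground if any in-bounds neighbour is foreground."""
    return _morph(binary, max)


def preprocess(gray, cutoff=128):
    """Threshold, then erode and dilate to drop isolated foreground specks."""
    return dilate(erode(threshold(gray, cutoff)))
```

Each output pixel in these steps depends only on a small neighbourhood of input pixels, which is the property that makes this kind of flow a natural candidate for one-thread-per-pixel GPU parallelization.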

Footnotes
1. https://github.com/YSerhanGener/GPGPU-OCR-Pre-processing
 
Metadata
Title
GPU-based and streaming-enabled implementation of pre-processing flow towards enhancing optical character recognition accuracy and efficiency
Authors
Gener Serhan
Dattilo Parker
Gajaria Dhruv
Fusco Alexander
Akoglu Ali
Publication date
20 September 2023
Publisher
Springer US
Published in
Cluster Computing / Issue 6/2023
Print ISSN: 1386-7857
Electronic ISSN: 1573-7543
DOI
https://doi.org/10.1007/s10586-023-04137-0
