Skip to main content
Top

2016 | OriginalPaper | Chapter

OCR as a Service: An Experimental Evaluation of Google Docs OCR, Tesseract, ABBYY FineReader, and Transym

Authors : Ahmad P. Tafti, Ahmadreza Baghaie, Mehdi Assefi, Hamid R. Arabnia, Zeyun Yu, Peggy Peissig

Published in: Advances in Visual Computing

Publisher: Springer International Publishing

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

Optical character recognition (OCR) as a classic machine learning challenge has been a longstanding topic in a variety of applications in healthcare, education, insurance, and legal industries to convert different types of electronic documents, such as scanned documents, digital images, and PDF files into fully editable and searchable text data. The rapid generation of digital images on a daily basis prioritizes OCR as an imperative and foundational tool for data analysis. With the help of OCR systems, we have been able to save a reasonable amount of effort in creating, processing, and saving electronic documents, adapting them to different purposes. A set of different OCR platforms are now available which, aside from lending theoretical contributions to other practical fields, have demonstrated successful applications in real-world problems. In this work, several qualitative and quantitative experimental evaluations have been performed using four well-know OCR services, including Google Docs OCR, Tesseract, ABBYY FineReader, and Transym. We analyze the accuracy and reliability of the OCR packages employing a dataset including 1227 images from 15 different categories. Furthermore, we review the state-of-the-art OCR applications in healtcare informatics. The present evaluation is expected to advance OCR research, providing new insights and consideration to the research area, and assist researchers to determine which service is ideal for optical character recognition in an accurate and efficient manner.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literature
1.
go back to reference Lin, H.-Y., Hsu, C.-Y.: Optical character recognition with fast training neural network. In: 2016 IEEE International Conference on Industrial Technology (ICIT), pp. 1458–1461. IEEE (2016) Lin, H.-Y., Hsu, C.-Y.: Optical character recognition with fast training neural network. In: 2016 IEEE International Conference on Industrial Technology (ICIT), pp. 1458–1461. IEEE (2016)
2.
go back to reference Patil, V.V., Sanap, R.V., Kharate, R.B.: Optical character recognition using artificial neural network. Int. J. Eng. Res. Gen. Sci. 3(1), 7 (2015) Patil, V.V., Sanap, R.V., Kharate, R.B.: Optical character recognition using artificial neural network. Int. J. Eng. Res. Gen. Sci. 3(1), 7 (2015)
3.
go back to reference Spitsyn, V.G., Bolotova, Y.A., Phan, N.H., Bui, T.T.T.: Using a haar wavelet transform, principal component analysis and neural networks for OCR in the presence of impulse noise. Comput. Opt. 40(2), 249–257 (2016)CrossRef Spitsyn, V.G., Bolotova, Y.A., Phan, N.H., Bui, T.T.T.: Using a haar wavelet transform, principal component analysis and neural networks for OCR in the presence of impulse noise. Comput. Opt. 40(2), 249–257 (2016)CrossRef
4.
go back to reference Bunke, H., Caelli, T.: Hidden Markov Models: Applications in Computer Vision, vol. 45. World Scientific, River Edge (2001) Bunke, H., Caelli, T.: Hidden Markov Models: Applications in Computer Vision, vol. 45. World Scientific, River Edge (2001)
5.
go back to reference Gupta, M.R., Jacobson, N.P., Garcia, E.K.: OCR binarization and image pre-processing for searching historical documents. Pattern Recogn. 40(2), 389–397 (2007)CrossRefMATH Gupta, M.R., Jacobson, N.P., Garcia, E.K.: OCR binarization and image pre-processing for searching historical documents. Pattern Recogn. 40(2), 389–397 (2007)CrossRefMATH
6.
go back to reference Jadhav, P., Kelkar, P., Patil, K., Thorat, S.: Smart traffic control system using image processing (2016) Jadhav, P., Kelkar, P., Patil, K., Thorat, S.: Smart traffic control system using image processing (2016)
7.
go back to reference Afli, H., Qiu, Z., Way, A., Sheridan, P.: Using SMT for OCR error correction of historical texts. In: Proceedings of LREC-2016, Portorož, Slovenia (2016, to appear) Afli, H., Qiu, Z., Way, A., Sheridan, P.: Using SMT for OCR error correction of historical texts. In: Proceedings of LREC-2016, Portorož, Slovenia (2016, to appear)
8.
go back to reference Kolak, O., Byrne, W., Resnik, P.: A generative probabilistic OCR model for NLP applications. In: Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology, vol. 1, pp. 55–62. Association for Computational Linguistics (2003) Kolak, O., Byrne, W., Resnik, P.: A generative probabilistic OCR model for NLP applications. In: Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology, vol. 1, pp. 55–62. Association for Computational Linguistics (2003)
9.
go back to reference Kolak, O., Resnik, P.: OCR post-processing for low density languages. In: Proceedings of the Conference on Human Language Technology and Empirical Methods in Natural Language Processing, pp. 867–874. Association for Computational Linguistics (2005) Kolak, O., Resnik, P.: OCR post-processing for low density languages. In: Proceedings of the Conference on Human Language Technology and Empirical Methods in Natural Language Processing, pp. 867–874. Association for Computational Linguistics (2005)
10.
go back to reference Deselaers, T., Müller, H., Clough, P., Ney, H., Lehmann, T.M.: The CLEF 2005 automatic medical image annotation task. Int. J. Comput. Vis. 74(1), 51–58 (2007)CrossRef Deselaers, T., Müller, H., Clough, P., Ney, H., Lehmann, T.M.: The CLEF 2005 automatic medical image annotation task. Int. J. Comput. Vis. 74(1), 51–58 (2007)CrossRef
11.
go back to reference Kaggal, V.C., Elayavilli, R.K., Mehrabi, S., Joshua, J.P., Sohn, S., Wang, Y., Li, D., Rastegar, M.M., Murphy, S.P., Ross, J.L., et al.: Toward a learning health-care system-knowledge delivery at the point of care empowered by big data and NLP. Biomed. Inf. Insights 8(Suppl1), 13 (2016) Kaggal, V.C., Elayavilli, R.K., Mehrabi, S., Joshua, J.P., Sohn, S., Wang, Y., Li, D., Rastegar, M.M., Murphy, S.P., Ross, J.L., et al.: Toward a learning health-care system-knowledge delivery at the point of care empowered by big data and NLP. Biomed. Inf. Insights 8(Suppl1), 13 (2016)
12.
go back to reference Pomares-Quimbaya, A., Gonzalez, R.A., Quintero, S., Muñoz, O.M., Bohórquez, W.R., García, O.M., Londoño, D.: A review of existing applications and techniques for narrative text analysis in electronic medical records (2016) Pomares-Quimbaya, A., Gonzalez, R.A., Quintero, S., Muñoz, O.M., Bohórquez, W.R., García, O.M., Londoño, D.: A review of existing applications and techniques for narrative text analysis in electronic medical records (2016)
13.
go back to reference Herbert, H.F.: The History of OCR, Optical Character Recognition. Recognition Technologies Users Association, Manchester Center (1982) Herbert, H.F.: The History of OCR, Optical Character Recognition. Recognition Technologies Users Association, Manchester Center (1982)
14.
go back to reference Tappert, C.C., Suen, C.Y., Wakahara, T.: The state of the art in online handwriting recognition. IEEE Trans. Pattern Anal. Mach. Intell. 12(8), 787–808 (1990)CrossRef Tappert, C.C., Suen, C.Y., Wakahara, T.: The state of the art in online handwriting recognition. IEEE Trans. Pattern Anal. Mach. Intell. 12(8), 787–808 (1990)CrossRef
15.
go back to reference Assefi, M., Liu, G., Wittie, M.P., Izurieta, C.: An experimental evaluation of apple siri and google speech recognition. In: Proccedings of the 2015 ISCA SEDE (2015) Assefi, M., Liu, G., Wittie, M.P., Izurieta, C.: An experimental evaluation of apple siri and google speech recognition. In: Proccedings of the 2015 ISCA SEDE (2015)
16.
go back to reference Assefi, M., Wittie, M., Knight, A.: Impact of network performance on cloud speech recognition. In: 2015 24th International Conference on Computer Communication and Networks (ICCCN), pp. 1–6. IEEE (2015) Assefi, M., Wittie, M., Knight, A.: Impact of network performance on cloud speech recognition. In: 2015 24th International Conference on Computer Communication and Networks (ICCCN), pp. 1–6. IEEE (2015)
17.
go back to reference Hatch, R.: SaaS Architecture, Adoption and Monetization of SaaS Projects using Best Practice Service Strategy, Service Design, Service Transition, Service Operation and Continual Service Improvement Processes. Emereo Pty Ltd., London (2008) Hatch, R.: SaaS Architecture, Adoption and Monetization of SaaS Projects using Best Practice Service Strategy, Service Design, Service Transition, Service Operation and Continual Service Improvement Processes. Emereo Pty Ltd., London (2008)
18.
go back to reference Tafti, A.P., Hassannia, H., Piziak, D., Yu, Z.: SeLibCV: a service library for computer vision researchers. In: Bebis, G., et al. (eds.) ISVC 2015. LNCS, vol. 9475, pp. 542–553. Springer, Heidelberg (2015). doi:10.1007/978-3-319-27863-6_50 CrossRef Tafti, A.P., Hassannia, H., Piziak, D., Yu, Z.: SeLibCV: a service library for computer vision researchers. In: Bebis, G., et al. (eds.) ISVC 2015. LNCS, vol. 9475, pp. 542–553. Springer, Heidelberg (2015). doi:10.​1007/​978-3-319-27863-6_​50 CrossRef
19.
go back to reference Xiaolan, X., Wenjun, W., Wang, Y., Yuchuan, W.: Software crowdsourcing for developing software-as-a-service. Front. Comput. Sci. 9(4), 554–565 (2015)CrossRef Xiaolan, X., Wenjun, W., Wang, Y., Yuchuan, W.: Software crowdsourcing for developing software-as-a-service. Front. Comput. Sci. 9(4), 554–565 (2015)CrossRef
28.
go back to reference Mendelson, E.: Abbyy finereader 12 professional. Technical report, PC Magazine (2014) Mendelson, E.: Abbyy finereader 12 professional. Technical report, PC Magazine (2014)
29.
go back to reference Rice, S.V., Jenkins, F.R., Nartker, T.A.: The fourth annual test of OCR accuracy. Technical report, Technical Report 95 (1995) Rice, S.V., Jenkins, F.R., Nartker, T.A.: The fourth annual test of OCR accuracy. Technical report, Technical Report 95 (1995)
30.
go back to reference Bautista, C.M., Dy, C.A., Mañalac, M.I., Orbe, R.A., Cordel, M.: Convolutional neural network for vehicle detection in low resolution traffic videos. In: 2016 IEEE Region 10 Symposium (TENSYMP), pp. 277–281. IEEE (2016) Bautista, C.M., Dy, C.A., Mañalac, M.I., Orbe, R.A., Cordel, M.: Convolutional neural network for vehicle detection in low resolution traffic videos. In: 2016 IEEE Region 10 Symposium (TENSYMP), pp. 277–281. IEEE (2016)
31.
go back to reference LeCun, Y., Bengio, Y., Hinton, G.: Deep learning. Nature 521(7553), 436–444 (2015)CrossRef LeCun, Y., Bengio, Y., Hinton, G.: Deep learning. Nature 521(7553), 436–444 (2015)CrossRef
32.
go back to reference Shah, P., Karamchandani, S., Nadkar, T., Gulechha, N., Koli, K., Lad, K.: OCR-based chassis-number recognition using artificial neural networks. In: 2009 IEEE International Conference on Vehicular Electronics and Safety (ICVES), pp. 31–34. IEEE (2009) Shah, P., Karamchandani, S., Nadkar, T., Gulechha, N., Koli, K., Lad, K.: OCR-based chassis-number recognition using artificial neural networks. In: 2009 IEEE International Conference on Vehicular Electronics and Safety (ICVES), pp. 31–34. IEEE (2009)
33.
go back to reference Ye, Q., Doermann, D.: Text detection and recognition in imagery: a survey. IEEE Trans. Pattern Anal. Mach. Intell. 37(7), 1480–1500 (2015)CrossRef Ye, Q., Doermann, D.: Text detection and recognition in imagery: a survey. IEEE Trans. Pattern Anal. Mach. Intell. 37(7), 1480–1500 (2015)CrossRef
36.
go back to reference Smith, R.: An overview of the tesseract OCR engine (2007) Smith, R.: An overview of the tesseract OCR engine (2007)
37.
go back to reference Bradley, D., Roth, G.: Adaptive thresholding using the integral image. J. Graph. GPU Game Tools 12(2), 13–21 (2007)CrossRef Bradley, D., Roth, G.: Adaptive thresholding using the integral image. J. Graph. GPU Game Tools 12(2), 13–21 (2007)CrossRef
38.
go back to reference Rasmussen, L.V., Peissig, P.L., McCarty, C.A., Starren, J.: Development of an optical character recognition pipeline for handwritten form fields from an electronic health record. J. Am. Med. Inf. Assoc. 19(e1), e90–e95 (2012) Rasmussen, L.V., Peissig, P.L., McCarty, C.A., Starren, J.: Development of an optical character recognition pipeline for handwritten form fields from an electronic health record. J. Am. Med. Inf. Assoc. 19(e1), e90–e95 (2012)
39.
go back to reference Titlestad, G.: Use of document image processing in cancer registration: how and why? Medinfo. MEDINFO 8, 462 (1994) Titlestad, G.: Use of document image processing in cancer registration: how and why? Medinfo. MEDINFO 8, 462 (1994)
40.
go back to reference Bussmann, H., Wester, C.W., Ndwapi, N., Vanderwarker, C., Gaolathe, T., Tirelo, G., Avalos, A., Moffat, H., Marlink, R.G.: Hybrid data capture for monitoring patients on highly active antiretroviral therapy (haart) in urban Botswana. Bull. World Health Org. 84(2), 127–131 (2006) Bussmann, H., Wester, C.W., Ndwapi, N., Vanderwarker, C., Gaolathe, T., Tirelo, G., Avalos, A., Moffat, H., Marlink, R.G.: Hybrid data capture for monitoring patients on highly active antiretroviral therapy (haart) in urban Botswana. Bull. World Health Org. 84(2), 127–131 (2006)
41.
go back to reference Hawker, C.D., McCarthy, W., Cleveland, D., Messinger, B.L.: Invention and validation of an automated camera system that uses optical character recognition to identify patient name mislabeled samples. Clin. Chem. 60(3), 463–470 (2014) Hawker, C.D., McCarthy, W., Cleveland, D., Messinger, B.L.: Invention and validation of an automated camera system that uses optical character recognition to identify patient name mislabeled samples. Clin. Chem. 60(3), 463–470 (2014)
42.
go back to reference Peissig, P.L., Rasmussen, L.V., Berg, R.L., Linneman, J.G., McCarty, C.A., Waudby, C., Chen, L., Denny, J.C., Wilke, R.A., Pathak, J., et al.: Importance of multi-modal approaches to effectively identify cataract cases from electronic health records. J. Am. Med. Inform. Assoc. 19(2), 225–234 (2012)CrossRef Peissig, P.L., Rasmussen, L.V., Berg, R.L., Linneman, J.G., McCarty, C.A., Waudby, C., Chen, L., Denny, J.C., Wilke, R.A., Pathak, J., et al.: Importance of multi-modal approaches to effectively identify cataract cases from electronic health records. J. Am. Med. Inform. Assoc. 19(2), 225–234 (2012)CrossRef
43.
go back to reference Fenz, S., Heurix, J., Neubauer, T.: Recognition and privacy preservation of paper-based health records. Stud. Health Technol. Inf. 180, 751–755 (2012) Fenz, S., Heurix, J., Neubauer, T.: Recognition and privacy preservation of paper-based health records. Stud. Health Technol. Inf. 180, 751–755 (2012)
44.
go back to reference Li, X., Hu, G., Teng, X., Xie, G.: Building structured personal health records from photographs of printed medical records. In: AMIA Annual Symposium Proceedings, vol. 2015, p. 833. American Medical Informatics Association (2015) Li, X., Hu, G., Teng, X., Xie, G.: Building structured personal health records from photographs of printed medical records. In: AMIA Annual Symposium Proceedings, vol. 2015, p. 833. American Medical Informatics Association (2015)
Metadata
Title
OCR as a Service: An Experimental Evaluation of Google Docs OCR, Tesseract, ABBYY FineReader, and Transym
Authors
Ahmad P. Tafti
Ahmadreza Baghaie
Mehdi Assefi
Hamid R. Arabnia
Zeyun Yu
Peggy Peissig
Copyright Year
2016
DOI
https://doi.org/10.1007/978-3-319-50835-1_66

Premium Partner