Skip to main content

2016 | OriginalPaper | Buchkapitel

OCR as a Service: An Experimental Evaluation of Google Docs OCR, Tesseract, ABBYY FineReader, and Transym

verfasst von : Ahmad P. Tafti, Ahmadreza Baghaie, Mehdi Assefi, Hamid R. Arabnia, Zeyun Yu, Peggy Peissig

Erschienen in: Advances in Visual Computing

Verlag: Springer International Publishing

Aktivieren Sie unsere intelligente Suche um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Optical character recognition (OCR) as a classic machine learning challenge has been a longstanding topic in a variety of applications in healthcare, education, insurance, and legal industries to convert different types of electronic documents, such as scanned documents, digital images, and PDF files into fully editable and searchable text data. The rapid generation of digital images on a daily basis prioritizes OCR as an imperative and foundational tool for data analysis. With the help of OCR systems, we have been able to save a reasonable amount of effort in creating, processing, and saving electronic documents, adapting them to different purposes. A set of different OCR platforms are now available which, aside from lending theoretical contributions to other practical fields, have demonstrated successful applications in real-world problems. In this work, several qualitative and quantitative experimental evaluations have been performed using four well-know OCR services, including Google Docs OCR, Tesseract, ABBYY FineReader, and Transym. We analyze the accuracy and reliability of the OCR packages employing a dataset including 1227 images from 15 different categories. Furthermore, we review the state-of-the-art OCR applications in healtcare informatics. The present evaluation is expected to advance OCR research, providing new insights and consideration to the research area, and assist researchers to determine which service is ideal for optical character recognition in an accurate and efficient manner.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literatur
1.
Zurück zum Zitat Lin, H.-Y., Hsu, C.-Y.: Optical character recognition with fast training neural network. In: 2016 IEEE International Conference on Industrial Technology (ICIT), pp. 1458–1461. IEEE (2016) Lin, H.-Y., Hsu, C.-Y.: Optical character recognition with fast training neural network. In: 2016 IEEE International Conference on Industrial Technology (ICIT), pp. 1458–1461. IEEE (2016)
2.
Zurück zum Zitat Patil, V.V., Sanap, R.V., Kharate, R.B.: Optical character recognition using artificial neural network. Int. J. Eng. Res. Gen. Sci. 3(1), 7 (2015) Patil, V.V., Sanap, R.V., Kharate, R.B.: Optical character recognition using artificial neural network. Int. J. Eng. Res. Gen. Sci. 3(1), 7 (2015)
3.
Zurück zum Zitat Spitsyn, V.G., Bolotova, Y.A., Phan, N.H., Bui, T.T.T.: Using a haar wavelet transform, principal component analysis and neural networks for OCR in the presence of impulse noise. Comput. Opt. 40(2), 249–257 (2016)CrossRef Spitsyn, V.G., Bolotova, Y.A., Phan, N.H., Bui, T.T.T.: Using a haar wavelet transform, principal component analysis and neural networks for OCR in the presence of impulse noise. Comput. Opt. 40(2), 249–257 (2016)CrossRef
4.
Zurück zum Zitat Bunke, H., Caelli, T.: Hidden Markov Models: Applications in Computer Vision, vol. 45. World Scientific, River Edge (2001) Bunke, H., Caelli, T.: Hidden Markov Models: Applications in Computer Vision, vol. 45. World Scientific, River Edge (2001)
5.
Zurück zum Zitat Gupta, M.R., Jacobson, N.P., Garcia, E.K.: OCR binarization and image pre-processing for searching historical documents. Pattern Recogn. 40(2), 389–397 (2007)CrossRefMATH Gupta, M.R., Jacobson, N.P., Garcia, E.K.: OCR binarization and image pre-processing for searching historical documents. Pattern Recogn. 40(2), 389–397 (2007)CrossRefMATH
6.
Zurück zum Zitat Jadhav, P., Kelkar, P., Patil, K., Thorat, S.: Smart traffic control system using image processing (2016) Jadhav, P., Kelkar, P., Patil, K., Thorat, S.: Smart traffic control system using image processing (2016)
7.
Zurück zum Zitat Afli, H., Qiu, Z., Way, A., Sheridan, P.: Using SMT for OCR error correction of historical texts. In: Proceedings of LREC-2016, Portorož, Slovenia (2016, to appear) Afli, H., Qiu, Z., Way, A., Sheridan, P.: Using SMT for OCR error correction of historical texts. In: Proceedings of LREC-2016, Portorož, Slovenia (2016, to appear)
8.
Zurück zum Zitat Kolak, O., Byrne, W., Resnik, P.: A generative probabilistic OCR model for NLP applications. In: Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology, vol. 1, pp. 55–62. Association for Computational Linguistics (2003) Kolak, O., Byrne, W., Resnik, P.: A generative probabilistic OCR model for NLP applications. In: Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology, vol. 1, pp. 55–62. Association for Computational Linguistics (2003)
9.
Zurück zum Zitat Kolak, O., Resnik, P.: OCR post-processing for low density languages. In: Proceedings of the Conference on Human Language Technology and Empirical Methods in Natural Language Processing, pp. 867–874. Association for Computational Linguistics (2005) Kolak, O., Resnik, P.: OCR post-processing for low density languages. In: Proceedings of the Conference on Human Language Technology and Empirical Methods in Natural Language Processing, pp. 867–874. Association for Computational Linguistics (2005)
10.
Zurück zum Zitat Deselaers, T., Müller, H., Clough, P., Ney, H., Lehmann, T.M.: The CLEF 2005 automatic medical image annotation task. Int. J. Comput. Vis. 74(1), 51–58 (2007)CrossRef Deselaers, T., Müller, H., Clough, P., Ney, H., Lehmann, T.M.: The CLEF 2005 automatic medical image annotation task. Int. J. Comput. Vis. 74(1), 51–58 (2007)CrossRef
11.
Zurück zum Zitat Kaggal, V.C., Elayavilli, R.K., Mehrabi, S., Joshua, J.P., Sohn, S., Wang, Y., Li, D., Rastegar, M.M., Murphy, S.P., Ross, J.L., et al.: Toward a learning health-care system-knowledge delivery at the point of care empowered by big data and NLP. Biomed. Inf. Insights 8(Suppl1), 13 (2016) Kaggal, V.C., Elayavilli, R.K., Mehrabi, S., Joshua, J.P., Sohn, S., Wang, Y., Li, D., Rastegar, M.M., Murphy, S.P., Ross, J.L., et al.: Toward a learning health-care system-knowledge delivery at the point of care empowered by big data and NLP. Biomed. Inf. Insights 8(Suppl1), 13 (2016)
12.
Zurück zum Zitat Pomares-Quimbaya, A., Gonzalez, R.A., Quintero, S., Muñoz, O.M., Bohórquez, W.R., García, O.M., Londoño, D.: A review of existing applications and techniques for narrative text analysis in electronic medical records (2016) Pomares-Quimbaya, A., Gonzalez, R.A., Quintero, S., Muñoz, O.M., Bohórquez, W.R., García, O.M., Londoño, D.: A review of existing applications and techniques for narrative text analysis in electronic medical records (2016)
13.
Zurück zum Zitat Herbert, H.F.: The History of OCR, Optical Character Recognition. Recognition Technologies Users Association, Manchester Center (1982) Herbert, H.F.: The History of OCR, Optical Character Recognition. Recognition Technologies Users Association, Manchester Center (1982)
14.
Zurück zum Zitat Tappert, C.C., Suen, C.Y., Wakahara, T.: The state of the art in online handwriting recognition. IEEE Trans. Pattern Anal. Mach. Intell. 12(8), 787–808 (1990)CrossRef Tappert, C.C., Suen, C.Y., Wakahara, T.: The state of the art in online handwriting recognition. IEEE Trans. Pattern Anal. Mach. Intell. 12(8), 787–808 (1990)CrossRef
15.
Zurück zum Zitat Assefi, M., Liu, G., Wittie, M.P., Izurieta, C.: An experimental evaluation of apple siri and google speech recognition. In: Proccedings of the 2015 ISCA SEDE (2015) Assefi, M., Liu, G., Wittie, M.P., Izurieta, C.: An experimental evaluation of apple siri and google speech recognition. In: Proccedings of the 2015 ISCA SEDE (2015)
16.
Zurück zum Zitat Assefi, M., Wittie, M., Knight, A.: Impact of network performance on cloud speech recognition. In: 2015 24th International Conference on Computer Communication and Networks (ICCCN), pp. 1–6. IEEE (2015) Assefi, M., Wittie, M., Knight, A.: Impact of network performance on cloud speech recognition. In: 2015 24th International Conference on Computer Communication and Networks (ICCCN), pp. 1–6. IEEE (2015)
17.
Zurück zum Zitat Hatch, R.: SaaS Architecture, Adoption and Monetization of SaaS Projects using Best Practice Service Strategy, Service Design, Service Transition, Service Operation and Continual Service Improvement Processes. Emereo Pty Ltd., London (2008) Hatch, R.: SaaS Architecture, Adoption and Monetization of SaaS Projects using Best Practice Service Strategy, Service Design, Service Transition, Service Operation and Continual Service Improvement Processes. Emereo Pty Ltd., London (2008)
18.
Zurück zum Zitat Tafti, A.P., Hassannia, H., Piziak, D., Yu, Z.: SeLibCV: a service library for computer vision researchers. In: Bebis, G., et al. (eds.) ISVC 2015. LNCS, vol. 9475, pp. 542–553. Springer, Heidelberg (2015). doi:10.1007/978-3-319-27863-6_50 CrossRef Tafti, A.P., Hassannia, H., Piziak, D., Yu, Z.: SeLibCV: a service library for computer vision researchers. In: Bebis, G., et al. (eds.) ISVC 2015. LNCS, vol. 9475, pp. 542–553. Springer, Heidelberg (2015). doi:10.​1007/​978-3-319-27863-6_​50 CrossRef
19.
Zurück zum Zitat Xiaolan, X., Wenjun, W., Wang, Y., Yuchuan, W.: Software crowdsourcing for developing software-as-a-service. Front. Comput. Sci. 9(4), 554–565 (2015)CrossRef Xiaolan, X., Wenjun, W., Wang, Y., Yuchuan, W.: Software crowdsourcing for developing software-as-a-service. Front. Comput. Sci. 9(4), 554–565 (2015)CrossRef
28.
Zurück zum Zitat Mendelson, E.: Abbyy finereader 12 professional. Technical report, PC Magazine (2014) Mendelson, E.: Abbyy finereader 12 professional. Technical report, PC Magazine (2014)
29.
Zurück zum Zitat Rice, S.V., Jenkins, F.R., Nartker, T.A.: The fourth annual test of OCR accuracy. Technical report, Technical Report 95 (1995) Rice, S.V., Jenkins, F.R., Nartker, T.A.: The fourth annual test of OCR accuracy. Technical report, Technical Report 95 (1995)
30.
Zurück zum Zitat Bautista, C.M., Dy, C.A., Mañalac, M.I., Orbe, R.A., Cordel, M.: Convolutional neural network for vehicle detection in low resolution traffic videos. In: 2016 IEEE Region 10 Symposium (TENSYMP), pp. 277–281. IEEE (2016) Bautista, C.M., Dy, C.A., Mañalac, M.I., Orbe, R.A., Cordel, M.: Convolutional neural network for vehicle detection in low resolution traffic videos. In: 2016 IEEE Region 10 Symposium (TENSYMP), pp. 277–281. IEEE (2016)
31.
Zurück zum Zitat LeCun, Y., Bengio, Y., Hinton, G.: Deep learning. Nature 521(7553), 436–444 (2015)CrossRef LeCun, Y., Bengio, Y., Hinton, G.: Deep learning. Nature 521(7553), 436–444 (2015)CrossRef
32.
Zurück zum Zitat Shah, P., Karamchandani, S., Nadkar, T., Gulechha, N., Koli, K., Lad, K.: OCR-based chassis-number recognition using artificial neural networks. In: 2009 IEEE International Conference on Vehicular Electronics and Safety (ICVES), pp. 31–34. IEEE (2009) Shah, P., Karamchandani, S., Nadkar, T., Gulechha, N., Koli, K., Lad, K.: OCR-based chassis-number recognition using artificial neural networks. In: 2009 IEEE International Conference on Vehicular Electronics and Safety (ICVES), pp. 31–34. IEEE (2009)
33.
Zurück zum Zitat Ye, Q., Doermann, D.: Text detection and recognition in imagery: a survey. IEEE Trans. Pattern Anal. Mach. Intell. 37(7), 1480–1500 (2015)CrossRef Ye, Q., Doermann, D.: Text detection and recognition in imagery: a survey. IEEE Trans. Pattern Anal. Mach. Intell. 37(7), 1480–1500 (2015)CrossRef
36.
Zurück zum Zitat Smith, R.: An overview of the tesseract OCR engine (2007) Smith, R.: An overview of the tesseract OCR engine (2007)
37.
Zurück zum Zitat Bradley, D., Roth, G.: Adaptive thresholding using the integral image. J. Graph. GPU Game Tools 12(2), 13–21 (2007)CrossRef Bradley, D., Roth, G.: Adaptive thresholding using the integral image. J. Graph. GPU Game Tools 12(2), 13–21 (2007)CrossRef
38.
Zurück zum Zitat Rasmussen, L.V., Peissig, P.L., McCarty, C.A., Starren, J.: Development of an optical character recognition pipeline for handwritten form fields from an electronic health record. J. Am. Med. Inf. Assoc. 19(e1), e90–e95 (2012) Rasmussen, L.V., Peissig, P.L., McCarty, C.A., Starren, J.: Development of an optical character recognition pipeline for handwritten form fields from an electronic health record. J. Am. Med. Inf. Assoc. 19(e1), e90–e95 (2012)
39.
Zurück zum Zitat Titlestad, G.: Use of document image processing in cancer registration: how and why? Medinfo. MEDINFO 8, 462 (1994) Titlestad, G.: Use of document image processing in cancer registration: how and why? Medinfo. MEDINFO 8, 462 (1994)
40.
Zurück zum Zitat Bussmann, H., Wester, C.W., Ndwapi, N., Vanderwarker, C., Gaolathe, T., Tirelo, G., Avalos, A., Moffat, H., Marlink, R.G.: Hybrid data capture for monitoring patients on highly active antiretroviral therapy (haart) in urban Botswana. Bull. World Health Org. 84(2), 127–131 (2006) Bussmann, H., Wester, C.W., Ndwapi, N., Vanderwarker, C., Gaolathe, T., Tirelo, G., Avalos, A., Moffat, H., Marlink, R.G.: Hybrid data capture for monitoring patients on highly active antiretroviral therapy (haart) in urban Botswana. Bull. World Health Org. 84(2), 127–131 (2006)
41.
Zurück zum Zitat Hawker, C.D., McCarthy, W., Cleveland, D., Messinger, B.L.: Invention and validation of an automated camera system that uses optical character recognition to identify patient name mislabeled samples. Clin. Chem. 60(3), 463–470 (2014) Hawker, C.D., McCarthy, W., Cleveland, D., Messinger, B.L.: Invention and validation of an automated camera system that uses optical character recognition to identify patient name mislabeled samples. Clin. Chem. 60(3), 463–470 (2014)
42.
Zurück zum Zitat Peissig, P.L., Rasmussen, L.V., Berg, R.L., Linneman, J.G., McCarty, C.A., Waudby, C., Chen, L., Denny, J.C., Wilke, R.A., Pathak, J., et al.: Importance of multi-modal approaches to effectively identify cataract cases from electronic health records. J. Am. Med. Inform. Assoc. 19(2), 225–234 (2012)CrossRef Peissig, P.L., Rasmussen, L.V., Berg, R.L., Linneman, J.G., McCarty, C.A., Waudby, C., Chen, L., Denny, J.C., Wilke, R.A., Pathak, J., et al.: Importance of multi-modal approaches to effectively identify cataract cases from electronic health records. J. Am. Med. Inform. Assoc. 19(2), 225–234 (2012)CrossRef
43.
Zurück zum Zitat Fenz, S., Heurix, J., Neubauer, T.: Recognition and privacy preservation of paper-based health records. Stud. Health Technol. Inf. 180, 751–755 (2012) Fenz, S., Heurix, J., Neubauer, T.: Recognition and privacy preservation of paper-based health records. Stud. Health Technol. Inf. 180, 751–755 (2012)
44.
Zurück zum Zitat Li, X., Hu, G., Teng, X., Xie, G.: Building structured personal health records from photographs of printed medical records. In: AMIA Annual Symposium Proceedings, vol. 2015, p. 833. American Medical Informatics Association (2015) Li, X., Hu, G., Teng, X., Xie, G.: Building structured personal health records from photographs of printed medical records. In: AMIA Annual Symposium Proceedings, vol. 2015, p. 833. American Medical Informatics Association (2015)
Metadaten
Titel
OCR as a Service: An Experimental Evaluation of Google Docs OCR, Tesseract, ABBYY FineReader, and Transym
verfasst von
Ahmad P. Tafti
Ahmadreza Baghaie
Mehdi Assefi
Hamid R. Arabnia
Zeyun Yu
Peggy Peissig
Copyright-Jahr
2016
DOI
https://doi.org/10.1007/978-3-319-50835-1_66