A Systematic Survey on CAPTCHA Recognition: Types, Creation and Breaking Techniques

  • Survey article
Archives of Computational Methods in Engineering

Abstract

CAPTCHA stands for Completely Automated Public Turing Test to Tell Computers and Humans Apart. CAPTCHAs are used for internet security. Several CAPTCHA schemes are available today, including text-based, audio-based, video/animation-based and puzzle-based schemes. This paper brings all of these types together in one place for analysis. The main aim of this article is to present a literature review of CAPTCHA, its types, and the techniques used to create and break it. It is a systematic and comprehensive analysis of the available CAPTCHA types. Sixteen text-based CAPTCHA generation methods are discussed, with usability ranging from 3 to 100% and security ranging from 65 to 100%. Security and usability measures are not calculated or reported for some known English schemes. Of the 16 reviewed CAPTCHAs, 12 are based on the English language, 1 on Arabic, 1 on Chinese, 1 on Devanagari and 1 on the Gurumukhi script. The designs are made resistant to segmentation with overlapping random shapes, overlapping characters, clasping, different colors and different shades. To resist recognition, many techniques are used, such as image masking, local and global warping, broken characters, random rotation, arcs, jaws, etc. Approximately 50 schemes, especially those based on the English language, have been successfully broken, with success rates ranging from 2 to 100%. The techniques used to break these schemes include shape context matching, distortion estimation, the 2D Log-Gabor filter, and horizontal and vertical projection (to segment the letters). For recognition, CNN, KNN, DNN and MCDNN are used. Almost 15 image-based CAPTCHAs are discussed, with usability ranging from 90 to 100% and security ranging from 17 to 100%. Of these, 5 schemes have been successfully broken, with success rates between 7 and 100%. K-NN and SVM are the algorithms most often used to recognize the images. Five audio-based CAPTCHA designs are discussed, with usability ranging from 68.5 to 100% and security of 100%. These audio schemes have nevertheless been broken, at rates of 45–75%, using SVM and K-NN algorithms. The paper also discusses 4 popular video-based designs with usability ranging from 75 to 100% and security ranging from 98 to 100%. These schemes have also been compromised, with a break rate of 16–10%, using SIFT, NN and simple OCR techniques. The paper can serve as a benchmark and starting point for more specific research into any one of these types.
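
As a rough illustration of the vertical projection technique named in the abstract, the Python sketch below counts the foreground pixels in each column of a binarized image and cuts the image wherever a column is empty. It is not taken from any of the surveyed papers: the function names, the `min_ink` threshold and the toy bitmap are all assumptions made for this example, and a real attack would first binarize and denoise the CAPTCHA image.

```python
from typing import List, Tuple


def vertical_projection(image: List[List[int]]) -> List[int]:
    """Count foreground pixels (value 1) in each column of a binary image."""
    if not image:
        return []
    width = len(image[0])
    return [sum(row[x] for row in image) for x in range(width)]


def segment_by_vertical_projection(
    image: List[List[int]], min_ink: int = 1
) -> List[Tuple[int, int]]:
    """Return (start, end) column ranges, end exclusive, that contain ink.

    A column belongs to a character when its projection value is at least
    `min_ink` (a hypothetical threshold chosen for this sketch).
    """
    profile = vertical_projection(image)
    segments: List[Tuple[int, int]] = []
    start = None
    for x, ink in enumerate(profile):
        if ink >= min_ink and start is None:
            start = x                      # entering a character
        elif ink < min_ink and start is not None:
            segments.append((start, x))    # leaving a character
            start = None
    if start is not None:                  # character touches the right edge
        segments.append((start, len(profile)))
    return segments


# Toy 3x9 bitmap with three separated "characters" (assumed data).
TOY_IMAGE = [
    [1, 1, 0, 0, 1, 0, 0, 1, 1],
    [1, 0, 0, 0, 1, 0, 0, 1, 0],
    [1, 1, 0, 0, 1, 0, 0, 1, 1],
]

if __name__ == "__main__":
    print(vertical_projection(TOY_IMAGE))
    print(segment_by_vertical_projection(TOY_IMAGE))
```

A cut is only found where an empty column separates two characters, which is one way to see why the overlapping characters and overlapping shapes listed in the abstract make a design harder to segment.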

Funding

No funding was received.

Author information

Corresponding author

Correspondence to Munish Kumar.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Research Involving Human Participants and/or Animals

No human or animal participants were involved.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Kumar, M., Jindal, M.K. & Kumar, M. A Systematic Survey on CAPTCHA Recognition: Types, Creation and Breaking Techniques. Arch Computat Methods Eng 29, 1107–1136 (2022). https://doi.org/10.1007/s11831-021-09608-4

  • DOI: https://doi.org/10.1007/s11831-021-09608-4
