Why do we need CAPTCHAs?


If you use the Internet regularly, you have almost certainly seen a CAPTCHA box, usually when leaving a comment, creating an account or logging in somewhere. What are these mysterious puzzles, and why do we have to solve them while browsing the Internet?


What is a CAPTCHA?

CAPTCHA is an acronym that stands for “Completely Automated Public Turing test to tell Computers and Humans Apart”. Quite a mouthful. It was coined in 2000 by professors and scientists from Carnegie Mellon University and IBM.

A CAPTCHA is what is called a challenge-response test. One party presents a question or challenge and the other party must provide a valid answer or response in order to be authenticated.
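To make the challenge-response idea concrete, here is a minimal sketch in Python. The function names make_challenge and check_response are made up for illustration, and the arithmetic question is only a stand-in: a real CAPTCHA uses a challenge that is hard for a program to solve, which a simple sum is not.

```python
import random


def make_challenge(rng):
    """Return a (question, expected_answer) pair; the answer stays with the challenger."""
    a, b = rng.randint(1, 9), rng.randint(1, 9)
    question = f"What is {a} + {b}?"
    expected = str(a + b)
    return question, expected


def check_response(expected, response):
    """The responder is accepted only if the response matches the expected answer."""
    return response.strip() == expected


if __name__ == "__main__":
    question, expected = make_challenge(random.Random())
    reply = input(question + " ")
    print("accepted" if check_response(expected, reply) else "rejected")
```

The important part is the shape of the exchange: one side creates the question and keeps the answer, and the other side only ever sees the question.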

The CAPTCHA idea derives from the Turing test (as the acronym above shows). A Turing test measures a machine’s ability to exhibit intelligent behaviour equivalent to that of a human being. In the original test a human judges whether the other party is a machine; a CAPTCHA is sometimes called a reverse Turing test because here a computer creates and grades the test, and the party being tested is a human.

Which purposes does CAPTCHA serve?

CAPTCHA prevents spam in website comment sections and on blogs. Many spammers bombard comment sections with links in an attempt to boost their search engine rankings. The test makes sure that only humans can comment, without requiring users to sign in beforehand.

Many companies offer free email services, but bots used to sign up for hundreds of free accounts and then use these accounts to cause havoc on the Internet. Now people need to complete a CAPTCHA before they can get a free email account. Free services should be protected by CAPTCHA to prevent abuse via automated scripts.

CAPTCHA offers protection from scrapers that harvest users’ email addresses. Spammers crawl the Internet for email addresses posted in clear text. With a CAPTCHA in place, a visitor must solve the challenge before an email address is shown, which keeps these scrapers out.

Sometimes site owners don’t want a webpage to be indexed, so there is an HTML tag that asks robots to stay away from the page. The big search engine companies respect this, but the tag is only a request and does not stop every bot from getting through. A CAPTCHA helps to keep those bots out.

Often people will use programs to stuff online polls in favour of a certain option. Usually IP addresses are recorded to prevent people from voting more than once, but bots can circumvent this policy. This makes it hard to truly trust online polls that are not protected by a CAPTCHA.

A dictionary attack is when a computer tries every word in a dictionary as a password in order to gain access to someone’s account. CAPTCHAs prevent this by requiring the computer (or person) to solve a challenge after a certain number of unsuccessful logins.
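The rule of asking for a CAPTCHA only after repeated failures can be sketched as a simple counter. This is an illustration under assumptions, not a production design: the LoginGuard class, its method names and the threshold of three attempts are all invented for this example, and a real site would keep the counts in persistent storage and probably track IP addresses as well.

```python
MAX_FAILED_ATTEMPTS = 3  # assumed threshold; the right value is a policy decision


class LoginGuard:
    """Counts failed logins per account and says when a CAPTCHA should be shown."""

    def __init__(self, max_failed=MAX_FAILED_ATTEMPTS):
        self.max_failed = max_failed
        self._failed = {}  # username -> number of consecutive failed logins

    def captcha_required(self, username):
        return self._failed.get(username, 0) >= self.max_failed

    def record_attempt(self, username, success):
        if success:
            # A successful login resets the counter for that account.
            self._failed.pop(username, None)
        else:
            self._failed[username] = self._failed.get(username, 0) + 1
```

A login handler would call captcha_required before checking the password and record_attempt afterwards. A script guessing passwords then hits the CAPTCHA after a handful of tries, while a person who mistypes once or twice never sees it.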

CAPTCHA also protects torrent sites from bots that falsify seed counts and positive reviews in order to trick people into downloading a trojan.

Issues with CAPTCHA technology

CAPTCHAs are sometimes based only on reading text, which is a problem for people who are visually impaired: not everyone can access a protected resource, even though they truly are human. The most effective way around this is to let a person opt for an audio CAPTCHA instead.

Some CAPTCHA images are not properly distorted. They may use text that is completely undistorted or has only minor distortions. This will not deter bots from accessing protected resources, because for a bot it is like reading normal text, something it can easily do.

Secure CAPTCHA code is not easy to build, and you need to make sure that the CAPTCHA cannot be worked around at script level. Script-level issues that can occur include systems that pass the answer to the CAPTCHA in plain text and systems where the same answer can be used to solve multiple CAPTCHAs.
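One possible way to avoid both script-level problems is to keep the answer on the server and hand the browser only a random token that works once. The sketch below assumes an in-memory store; the CaptchaStore class, its method names and the two-minute lifetime are hypothetical choices for illustration rather than any particular library’s API.

```python
import secrets
import time


class CaptchaStore:
    """Keeps expected answers server-side; the client only ever sees an opaque token."""

    def __init__(self, ttl_seconds=120.0):
        self.ttl_seconds = ttl_seconds  # assumed lifetime of an unanswered challenge
        self._answers = {}  # token -> (expected answer, expiry time)

    def issue(self, answer, now=None):
        now = time.monotonic() if now is None else now
        token = secrets.token_urlsafe(16)  # random, reveals nothing about the answer
        self._answers[token] = (answer, now + self.ttl_seconds)
        return token

    def verify(self, token, response, now=None):
        now = time.monotonic() if now is None else now
        # pop() removes the entry on every attempt, so a token cannot be replayed.
        entry = self._answers.pop(token, None)
        if entry is None:
            return False
        answer, expires_at = entry
        if now > expires_at:
            return False
        return secrets.compare_digest(answer.encode(), response.encode())
```

Because verify removes the token whether the response was right or wrong, a solved CAPTCHA cannot be reused, and a bot cannot keep guessing against the same image. The price is that a person who makes a typing mistake has to be given a fresh challenge.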

If too many sites start using a certain type of CAPTCHA, it can become insecure and no longer effective. Puzzles that ask text-based questions are an example of this: they are easy to circumvent if you can program a bot to learn the answers.

So next time you are confronted with a CAPTCHA and you’re scratching your head, remember that they are confusing bits of text for very good reasons – to protect your information and defeat bots!