Captchas are Evil

Over the past few years, there has been an explosion in sites using captchas as a means of controlling automated abuse. Of course, the bad guys are also putting a lot of effort into automated ways of solving captchas. Even some of the good guys are doing that to point out how weak captchas often are, as can be seen here. But this is my rant about captchas, so I’ll get to my arguments with them, point by point.

I should point out that the captchas in question here are visual captchas. That said, a lot of my arguments can also be applied (with suitable modifications) to audio captchas, or, if the appropriate interface technology were available, tactile ones.

Let’s get the obvious problem with captchas out of the way. If I am blind, or using a device that cannot or will not render the captcha, it is useless. If you cannot see it, you cannot solve it. This makes a visual captcha inappropriate for any site that might be used by such people. That includes blind people, but it also includes people on really slow links who have disabled images and people using character cell displays.

Now we can assume that the captcha is displayed and that the human being tested can actually see it.

A simple monochrome captcha, say black on white, with some sort of perturbation used to make it hard to mechanically decode, is dependent on the quality of the perturbation. In many cases, the perturbation is easy to remove and in many cases where it is not, it is also not easy for a human to solve. This is because there ends up being too much background noise from the perturbation or the warping of the text is so bad that it is not recognizable anymore. Sometimes, the warping makes one letter look like another which, obviously, makes solving the captcha nearly impossible. With all their potential weaknesses, these are the kind I have the least trouble decoding.

Now throw in colour. Using varying colours for the captcha can certainly make the perturbation of the text much more effective. Unfortunately, it relies on certain colour combinations having contrast when viewed. A low quality display will obviously cause problems, but even ruling out hardware, there are significant issues. A great many people are affected by various colour deficiencies in their vision. This means that what is a startlingly sharp contrast to one person can be totally invisible to another. Indeed, I have experienced this problem myself in a non-captcha setting. One version of Ubuntu had a CD package that had text in an unfortunate colour combination. I could see there was something written there (barely) but I simply could not read it. There was almost zero contrast. The person sitting next to me read it with absolutely no trouble and seemed somewhat surprised that I couldn’t read it. Captchas using colour run afoul of this all the time.
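As a rough first check on a colour pair, here is a minimal sketch that computes the luminance contrast ratio as defined in WCAG 2. It only measures brightness difference, not hue, so it is a proxy and not a simulation of any particular colour deficiency. The function names are made up for this example, and applying the 4.5:1 threshold (the WCAG AA level for normal text) to captchas is an assumption, not an established rule.

```python
def _linearize(channel: int) -> float:
    """Convert one 8-bit sRGB channel to linear light."""
    c = channel / 255.0
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb: tuple[int, int, int]) -> float:
    """Relative luminance of an sRGB colour, from 0.0 (black) to 1.0 (white)."""
    r, g, b = (_linearize(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(fg: tuple[int, int, int], bg: tuple[int, int, int]) -> float:
    """Contrast ratio between two colours, from 1.0 (identical) to 21.0."""
    l1, l2 = relative_luminance(fg), relative_luminance(bg)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)


# Assumed threshold: WCAG AA for normal-sized text.
MIN_RATIO = 4.5

if __name__ == "__main__":
    red, green = (255, 0, 0), (0, 255, 0)
    ratio = contrast_ratio(red, green)
    verdict = "ok" if ratio >= MIN_RATIO else "too low"
    print(f"red on green: {ratio:.2f} ({verdict})")
```

Pure red on pure green looks vivid to many people but scores well under the threshold, which is the kind of combination that can vanish for someone with a red-green deficiency.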

Now there is another problem with a lot of captchas. Some use upper and lower case letters and require the solver to get that combination correct. Some simply ignore the case in the response. To make matters worse, many use a font that makes distinguishing an upper case I from a lower case L from the digit 1 impossible, especially once it has been perturbed. Similar problems occur with other similar glyphs. Certainly, a machine will be unable to solve these puzzles reliably. But then, neither can a human. It doesn’t matter how good a problem solver the human is; if there is no way to distinguish one item from another, a human isn’t going to be any better at it than a machine.
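One way a site could sidestep the look-alike glyph problem is to never generate those characters in the first place and to ignore case when checking the answer. The following is a minimal sketch under those assumptions. The list of confusable characters and the function names are hypothetical, and it says nothing about how the text is rendered or perturbed.

```python
import secrets
import string

# Hypothetical list of glyphs that are easily mistaken for one another
# (I/1, O/0, S/5, B/8, Z/2). Adjust for the font actually in use.
CONFUSABLE = set("I1O0S5B8Z2")

# Upper case letters and digits only, minus the confusable ones.
ALPHABET = [c for c in string.ascii_uppercase + string.digits
            if c not in CONFUSABLE]


def make_challenge(length: int = 6) -> str:
    """Pick a random challenge string from the unambiguous alphabet."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def answer_matches(expected: str, given: str) -> bool:
    """Compare the answer without caring about case or stray whitespace."""
    return expected.casefold() == given.strip().casefold()


if __name__ == "__main__":
    challenge = make_challenge()
    print(challenge, answer_matches(challenge, challenge.lower()))
```

Shrinking the alphabet makes each character easier to guess, so a site doing this would probably want a slightly longer challenge to compensate.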

Leaving aside the problems with the captchas themselves, there is another factor. I find it gravely insulting that I have to prove to a machine that I am a human in order to send a message to another human. Or to post a comment about something some human wrote. Oh, sure, some people have turned to these measures to defend themselves from being inundated with automated spam, but that’s not my problem. Most of those people, however, are aware of what they are doing and know they are alienating some potential contacts. They simply made a choice: self-defense or accessibility. People who use captchas just because some other big name person uses them are a major problem, though. I would go so far as to say that the vast majority of sites that use captchas did not make informed decisions to use them.

Let me make a suggestion for low volume sites. Moderation. Have a human moderate the submissions for comment sections. Do NOT have submissions automatically go live. If the submissions do not automatically go live, odds are the spammers will give up and go somewhere else. Sure, you have to reject the garbage, but at least it isn’t going up anywhere before you decide to allow it. That’s obviously not practical for any site with medium or high volumes of traffic or which requires realtime communication. But think about your particular site. Do you need realtime comment posting? Truly? Even if you think you do, you probably don’t.
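The "nothing goes live until a human approves it" idea can be sketched in a few lines. This is only an illustration with made-up names (`ModerationQueue`, `submit`, `approve`, `reject`, `visible`). A real site would keep the comments in a database and put authentication in front of the approval step.

```python
class ModerationQueue:
    """Holds submitted comments until a human approves or rejects them."""

    def __init__(self) -> None:
        self._pending: dict[int, tuple[str, str]] = {}
        self._published: list[tuple[str, str]] = []
        self._next_id = 1

    def submit(self, author: str, text: str) -> int:
        """Store a new comment as pending and return its id."""
        comment_id = self._next_id
        self._next_id += 1
        self._pending[comment_id] = (author, text)
        return comment_id

    def approve(self, comment_id: int) -> None:
        """Move a pending comment to the published list."""
        self._published.append(self._pending.pop(comment_id))

    def reject(self, comment_id: int) -> None:
        """Discard a pending comment; it is never shown to anyone."""
        self._pending.pop(comment_id, None)

    def visible(self) -> list[tuple[str, str]]:
        """Comments that visitors can see: approved ones only."""
        return list(self._published)


if __name__ == "__main__":
    queue = ModerationQueue()
    good = queue.submit("alice", "Nice post.")
    spam = queue.submit("bot", "Buy cheap pills!")
    queue.approve(good)
    queue.reject(spam)
    print(queue.visible())
```

The point of the design is that `visible` never looks at the pending comments, so spam that is submitted but not approved is never displayed.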

I should note that I’m not saying that site operators should avoid using the tools at their disposal. I am saying that site operators should realize that bad captchas can have a deleterious effect on a site’s reputation. If you are going to use captchas, test your captcha scheme on people who are partially colour blind to see if they can solve them. Test them on people who are NOT colour blind. Test them on poorer quality displays. Test them on high quality displays. Make sure they are at least solvable the vast majority of the time. Make sure it is possible for the human to say, "I can’t make heads or tails of this, give me another." And even if you do all that, be prepared to piss off a non-trivial number of people.

Also, make sure you don’t fall into the trap of relying only on the captchas to block abuse. You might be surprised at the number of real humans causing trouble. You might even be surprised at the number of mechanical abusers that get past the captcha. Remember, the premise behind a captcha is that humans are good problem solvers; we see patterns. That applies everywhere on your site so you have to make sure you don’t leave the back door wide open while putting three deadbolts on the front door. Real humans with nefarious intent will be scoping out your site just as the bots are and they will find the other flaws. You might find that your captcha is not blocking anywhere near the abuse you thought it would.

Before I close out this rant, there is another type of puzzle that is irking me. It is related to the text based captcha. It involves showing a picture of some object or animal or what have you that has been perturbed in some way and asking the user to identify it. This is absolutely asinine. It is just as bad as asking someone to relate "regatta" to something on an IQ test. It requires the person in question to actually have a way to recognize the object and then to provide the correct answer. In many cases, there is not a single correct answer and the creator of the puzzle may, in fact, not have thought of the same answer the viewer did. The fact that I don’t have a freaking clue what a regatta is does not, in any way, lessen my intelligence. Likewise, not having a freaking clue that I’m looking at a picture of, say, a lemur does not mean I am not human. In both cases, it merely means I do not have the necessary life experience to solve the particular puzzle. So if you are going to use pictures of objects or animals or what have you as a test for humanity, don’t. That’s even more evil than the text based captchas.

Now we come down to the end of my rant. Would I ever use a captcha on a site that I operate? Probably not. I’m also not likely to rise up the site traffic scale high enough to warrant it. There is, however, a small possibility that I might resort to something like a captcha if I felt the tradeoff was worth it. But at least it would be an informed decision. Remember, you cannot always take the high road. Sometimes it is washed out or otherwise impassable and you have to choose the best of the low roads.
