Captchas: Noisy Enough?

I think the unban script is now at the “alpha” stage. That is: It seems to work. The author (that is me) has tested it. A few people tested filling out the form. Thanks to all who helped there. Among other things, I discovered that excess white space resulted in errors and I got some feedback on captchas. I fixed the white space issue, but now I want people to comment on this captcha:

  1. If you are color blind, are the colors ok? I can specify any two colors you like.
  2. Can you read the values?

Feel free to discuss captchas generally. I’m happy to participate in any and all discussions of captchas because they are interesting and diverting.

However, as I mentioned in comments, this captcha is not intended to be a 100% bulletproof high-security captcha that a bank might use to prevent thieves from accessing the casino bank account and stealing all the quatloos. This is merely intended to thwart the garden variety ‘bots that rove the web trying to leave spam, hammer sites etc. I suspect what I am doing is good enough.

Among other things, the ‘bot repellent features already include:

  1. A blank hidden form field that must not be filled in. Bots that fill that field in are logged and blocked from further attempts to get unbanned.
  2. Names of images do not give away the correct values to enter into the boxes. None of the hidden fields contain correct answers. (Giving the answers in the hidden parts of the form is one of the ways to let stupid, easily coded bots ‘pass’ the captcha. Bots were once able to bet in UAH due to that flaw. They all lost, but they did bet.)
  3. If a bot is image recognition enabled it also needs to add. That means someone needs to code a bot that does both. Given the actual business model for most spammers and crackers, carrying around that amount of code is inefficient.
  4. The bot has a limited time to try to break the captcha. (Mind you… bots can be pretty fast. So this isn’t necessarily a big deal for a bot.)
  5. The bot has to give a valid IP and email address.
  6. Some IP ranges are perma-banned. They can’t load the form and they can’t get unbanned. I’ll be extending this feature.
  7. Individual IPs can only unban 5 IPs a day.
  8. If an IP has too many requests that have not been acted on, it is blocked.
  9. Other.
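
Items 1 and 7 in the list above can be sketched as follows. This is a minimal illustration in Python rather than the PHP the script is written in, and it is not the script's actual code: the class name `UnbanGate`, the honeypot field name `website`, and the in-memory storage are all assumptions made for the example.

```python
from collections import defaultdict

MAX_UNBANS_PER_DAY = 5  # the limit from item 7 in the list above


class UnbanGate:
    """Tracks blocked requester IPs and how many unbans each IP has used today."""

    def __init__(self):
        self.blocked = set()
        self.unbans_today = defaultdict(int)

    def check(self, requester_ip, form):
        """Return (allowed, reason) for one form submission."""
        if requester_ip in self.blocked:
            return False, "blocked"
        # Honeypot: a human never sees this field, so any content means a bot.
        # The field name "website" is hypothetical.
        if form.get("website", "").strip():
            self.blocked.add(requester_ip)
            return False, "honeypot filled"
        if self.unbans_today[requester_ip] >= MAX_UNBANS_PER_DAY:
            return False, "daily limit reached"
        self.unbans_today[requester_ip] += 1
        return True, "ok"
```

A real version would persist the counters to a file or database and reset them each day; the sketch only shows the order of the checks.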

The script also usually provides a message if one of the above blocks stops you. You are provided a contact email address and encouraged to email me so that I can sort things out.

Other ‘bot thwarting features are planned. For example: I plan to log submission attempts and block an IP if it keeps trying to solve the same captcha. (People aren’t going to try to solve it once a second, but a bot could try to do so.)
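
The planned attempt logging might look roughly like this. It is a sketch in Python under assumed names (`AttemptThrottle`, `captcha_id`) and assumed limits, since the post does not say how many attempts or how long a window would be used.

```python
class AttemptThrottle:
    """Refuses an IP that retries the same captcha too often within a time window."""

    def __init__(self, max_attempts=10, window_seconds=60):
        # Both limits are illustrative defaults, not values from the script.
        self.max_attempts = max_attempts
        self.window_seconds = window_seconds
        self.attempts = {}  # (ip, captcha_id) -> timestamps of recent attempts

    def allow(self, ip, captcha_id, now):
        """Record an attempt at time `now` (seconds) and say whether it is allowed."""
        key = (ip, captcha_id)
        recent = [t for t in self.attempts.get(key, []) if now - t < self.window_seconds]
        recent.append(now)
        self.attempts[key] = recent
        return len(recent) <= self.max_attempts
```

A person who misreads the captcha once or twice stays well under the limit, while a bot hammering the same captcha every second trips it quickly.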

But for now, the script seems to work. I don’t think bots are going to race in and overwhelm the site by trying to load the unban request page over and over. So I’m making it available and I’ll add those protections as we go along.

Ok: So if you are banned at Cloudflare, how will you get the address? I’ll be adding the address to the sidebar. The unban file is on a domain not protected by Cloudflare. People who are banned will be able to read the cache at Google, click the link and get themselves unbanned. If you want to test, visit The Unban Page.

Update: March 16. Two people who are not me successfully unbanned themselves. (Or… they might not have been banned, but they successfully ran the script. The script can’t actually verify that you are banned before it submits the “unban” request to Cloudflare.)

57 thoughts on “Captchas: Noisy Enough?”

  1. I went into the unban page using Tor and amazingly that Tor IP address was banned, so I tried to get that IP address unbanned and got the message “I’ve detected a problem that may require human intervention. Please send an email to the admin at…”

    I can email you the entire message if you need it. I’m a bit reluctant to post IP addresses or email addresses in here.

  2. Skeptical–

    1) Let me fix that part of the page so it gives you an email address. I must have missed inserting it for the case where I’m blocking connections from TOR.

    2) Blocking people connecting using TOR is a feature not a bug. 🙂

    I’ll email you.

  3. Lucia,
    Sorry if you misunderstood me. The page gave me an email address, but I didn’t post the entire message here as I didn’t want to post that email address.

  4. So you mean the page provided you an email where you can send the IPs etc? If you think the page was malfunctioning you should send information to that email address. That way I can learn that something went wrong and I can learn the details. Otherwise I can’t.

  5. Skeptical–
    On posting emails: it’s ok to post my email in comments. I can edit it! 😉

    But even apart from that, those should be disposable emails from @spamgourmet. So posting that would be no problem at all. Did you see how long that address was? Notice the ‘.2.’ in the address. After 2 emails are sent to that address, spamgourmet deactivates it!

  6. Skeptical–
    There was no way for you to know the email was disposable. Everyone can guess my non-disposable one. Here’s how:
    1) What’s my first name?
    2) What’s the domain name for my blog?

    Put those together: My non-disposable email address. I’ve got great spam filters. :)

  7. I can’t be the first to say that this blog seems to have become a blog about running a blog.

  8. Pieter– Thanks. Hmmm… phones and web. I never surf the web from a phone. Is the problem that the captcha with the 5 character images is too wide and gets shrunk down too much? Or what?

    MarkB–Oddly, I think you are the first to actually say it. I know for a fact you’re not the first to think it. I’ve been thinking it. Yes. Recently, there have been lots of posts on running the blog. But if I ban people, I need to unban. There is no ready made method for unbanning and certainly no ready made method to do it safely. The previous method was for various people to do whatever came into their heads to do. While that method doesn’t result in blog posts, it results in …. not good stuff.

  9. Usually, I can’t stand captchas because they distort beyond recognition. I have no problem reading those particular captchas, though. I am not colorblind. If these prove sufficient to thwart a bot, they are definitely usable.

    I’m not unhappy about the occasional drifts into the weeds about running the blog. Even when threads are alive that deal with the issue, there are other threads where climate stuff is being discussed.

  10. I just ran the page 10 times, and couldn’t read one of the captchas and was unsure of two more.

    So assuming I can reload if I get it wrong, I guess this means it works (till my eyes fail even more).

  11. steveta_uk–
    10 * 3 means you looked at 30 captchas. So, are you saying that 1 in 30 you couldn’t read and 2 were difficult? (I just want to be sure I know what you are saying.)

    Yes. You can reload if you get it wrong. Also, if you get it wrong, you can use your back button, look at the answer you entered and try again without refreshing– the time window is big enough. I’ve done that when I misread the captcha.

    But I want to keep the level of frustration relatively low while not making the captchas trivial for an OCR reader.

    I don’t want people to be presented with an unreadable puzzle 1 in 3 times. That’s a bit too high. I can tweak the noise down a little using the “adjustable parameter”. 🙂

  12. j ferguson–
    I don’t know but the link you provided on an earlier post indicates that it can work. The example in the article you linked to superimposed circles of the background color on the text.

    Noise is widely used in captchas and it’s easy to add so that’s my first try. I’ve included some logging that will tell me whether anything manages to simultaneously:
    1) Solve the captchas while
    2) Accidentally filling the blank hidden form.

    Only bots will do (2). Moreover, spam bots will likely be very eager to do (2) because bots tend to be programmed to fill in all blank input boxes because those are often the box for the “comment”– which is precisely what they want to post.

  13. Lucia,

    Being 56 with eyesight not what it used to be, I had difficulty making out 2 of the characters. Difficult in the sense that what I typed in was a best guess.

  14. I guessed right. I don’t remember the letter but I had difficulty distinguishing whether the number was 3 or 8.

  15. lucia, if you want to make this more “secure,” the best thing you could do is make the spacing non-uniform. The hardest thing for a computer program to do when breaking these is called “segmentation,” the process by which individual characters are separated. If the characters are uniformly spaced, segmentation is non-challenging.

    If you do change that, your CAPTCHAs should be somewhat secure. Your noise is fairly low level, but it does connect characters, so it makes boundaries harder to find. More importantly, you kept it the same color. It’s amazing how many CAPTCHAs have noise of a different color when that just makes it easy to filter. Those two things plus non-uniform spacing (ideally with characters actually overlapping a little) make CAPTCHAs fairly hard to beat.

    But without changing the spacing, it’d take me, at most, a couple minutes to modify a bot to be able to handle your CAPTCHAs.

    For a final nit to pick, it’s spelled “queue,” not “cue.”

  16. Oh. For the record, I didn’t try to check any hashing/encryption you’re using, so I can’t vouch for it.

  17. I find the security issues surrounding CAPTCHAs interesting, so I’m going to discuss a little about them. I’ll cover some of the technical aspects, but it should be simple enough for anyone to follow. Still, feel free to skip over this if you’re not interested.

    The core to all OCR is correlation masks. Correlation masks are basically patterns. When comparing binary strings, you might look for the pattern 1101001. That would be a simple example. On the other hand, you might look for the same string with a rule of, “Match at least four digits.” This would match more often, so you’d be more likely to find matches, both real and “false positives.” (This tolerance level is important when noise the same color as the background is used).
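
The binary-string example can be made concrete with a few lines of Python. This is only a toy illustration of matching with a tolerance level, not how any real OCR package is implemented; the function name `matches` and its parameters are made up for the sketch.

```python
def matches(pattern, text, min_matching):
    """Return the start offsets in `text` where at least `min_matching`
    positions agree with `pattern` (both are strings of '0' and '1')."""
    hits = []
    for start in range(len(text) - len(pattern) + 1):
        window = text[start:start + len(pattern)]
        agree = sum(1 for p, w in zip(pattern, window) if p == w)
        if agree >= min_matching:
            hits.append(start)
    return hits


# A strict rule demands all 7 digits; a looser rule tolerates a flipped bit.
strict = matches("1101001", "1101011", min_matching=7)
loose = matches("1101001", "1101011", min_matching=6)
```

Lowering `min_matching` lets a noisy copy of the pattern still match, at the price of matching more things that are not the pattern at all.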

    In images, the same idea applies. Each character (primarily a-z,A-Z,0-9) will have a mask which is looked for in an image. In addition to the positive correlation described above, it will also use negative correlation. This is for when things are found where they aren’t expected (i.e. noise). Again, there will be a tolerance level which can be adjusted to make matching more or less strict (giving a trade-off between false negatives and actual attempts).

    With CAPTCHAs, the idea is to make this pattern matching more difficult for a program without making it (significantly) more difficult for humans. The first step to doing this is to use non-uniform spacing. If spacing is uniform, the program will know to only look for patterns in fixed areas. That makes it significantly easier (and greatly reduces the opportunity for false positives).

    The second step is to make sure boundaries are not clearly separated. This serves the same purpose as the above as if two characters are completely separated by a background color, it’s easy to tell where one ends and the other begins. This can be done in part by using noise which can branch between characters, but a more effective solution is to have characters actually overlap.
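
Why clean boundaries help a program can be shown with a toy segmenter. The sketch below, in Python, treats the image as rows of text where `#` is ink and `.` is background (an assumed representation), and splits it wherever a whole column is empty. The function name `segment_columns` is hypothetical.

```python
def segment_columns(rows):
    """Split a binary image (list of equal-length strings, '#' = ink)
    into (start, end) column spans separated by completely empty columns."""
    width = len(rows[0])
    has_ink = [any(row[x] == "#" for row in rows) for x in range(width)]
    spans = []
    start = None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x                 # a character begins
        elif not ink and start is not None:
            spans.append((start, x))  # an empty column ends the character
            start = None
    if start is not None:
        spans.append((start, width))
    return spans


separated = ["##.##",
             "#..#.",
             "##.##"]
touching = ["#####",
            "#..#.",
            "##.##"]
```

With `separated` the empty middle column yields two spans, one per character. With `touching`, a single stroke bridging the gap leaves one span, and this naive approach can no longer tell where one character ends and the next begins.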

    The third step is to try to complicate the correlation masks. When a mask is determined, it is determined for an individual character within an individual font. This means using multiple fonts makes the comparison more difficult. Beyond that, character alignment and size can be adjusted (rotations and scaling make pattern matching more difficult).

    There’s a lot more to be said, but there are limits to how much interest people have, so I’ll leave it at that basic overview.

  18. Brandon– After j ferguson’s first comment, I did surf a little. Evidently, the tricks I read about were:
    1) Vary spacing. (Have some letter touch.)
    2) Distort.
    3) Use more than one font in each captcha and
    4) Noise.

    Noise was the easiest to add first, so I did that. (One reason it was easiest is the example Captcha code I modified from added noise. So I didn’t have to do too much refreshing of my memory on GD!)

    I found a distortion example– so I could pull in that code. But it’s pretty clear distortion is going to be a bit computationally intensive.

    I can easily vary spacing– I was looking into that. I think I can easily use more than one font. It’s just a matter of coding a few comments in a loop. But.. not today! 🙂

  19. Oh the encryption, I’m sure it’s breakable! The main thing is to disguise the number a little. Oddly for script repelling “on the wild” things anyone who tried could break can be amazingly effective.

  20. lucia:

    I found a distortion example– so I could pull in that code. But it’s pretty clear distortion is going to be a bit computationally intensive.

    I can easily vary spacing– I was looking into that. I think I can easily use more than one font. It’s just a matter of coding a few comments in a loop. But.. not today! 🙂

    For what it’s worth, I wouldn’t worry about adding distortion. Varying the spacing and fonts should be more than enough for your purposes (even the varied fonts are probably unnecessary). It’s not like you’re aiming for your CAPTCHAs to be unbreakable. As long as you make it non-trivial, it should be fine.

    The problem right now is with uniform spacing your noise is basically meaningless. Characters will always be in a predefined space, so noise outside that does nothing. Moreover, your noise can be filtered out easily just by density.

    Oh the encryption, I’m sure it’s breakable! The main thing is to disguise the number a little. Oddly for script repelling “on the wild” things anyone who tried could break can be amazingly effective.

    I wouldn’t worry about encryption being breakable as it wouldn’t be worth the trouble. The main thing to worry about is that your encryption is implemented properly. For example, I’ve seen web sites with only ~1,000 test strings (but the noise was randomized). The encryption used was strong, but since it always generated the same hash for any given string, it was mostly meaningless. Each answer could be matched to a hash, so…
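
The weakness described here, a strong hash that always yields the same value for the same answer, can be sketched in Python as below. SHA-256 stands in for whatever those sites used, which the comment does not say, and the function names are hypothetical.

```python
import hashlib


def token_for(answer):
    """The assumed flaw: the token depends only on the answer, so it never changes."""
    return hashlib.sha256(answer.encode("utf-8")).hexdigest()


def build_lookup(known_answers):
    """An attacker who has seen each answer once can tabulate token -> answer."""
    return {token_for(answer): answer for answer in known_answers}


def recover(token, lookup):
    """Recover the answer from the token alone, without reading the image."""
    return lookup.get(token)
```

With only about a thousand test strings, the table is complete after a thousand solved captchas, no matter how strong the hash is or how much the noise varies.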

  21. Hi Lucia,

    It’s obvious Brandon is the expert on Captchas, and I wouldn’t dream of arguing with him, but I’m afraid I’m not a fan. I believe the going rate for human solvers – Captcha teams – is $1-2 per _thousand_ solutions. Your best bet is to tackle the problems in other ways.

    Personally I favour asking people to do something interpretive – give them a (relatively) lengthy text question to read and analyse to find a simple one word answer, for example. If it takes ten seconds to work out the answer, it won’t put anyone off, but it makes it uneconomic for spammers. Oh, and anyone who can’t read and understand a simple question probably shouldn’t be able to post in any case 😉

  22. Dave–

    give them a (relatively) lengthy text question to read and analyse to find a simple one word answer, for example.

    I don’t know how to code something like that in a way that doesn’t either require enormous amounts of human effort on my part or result in a quiz that is trivial for a bot to beat. Maybe you could clarify what you mean and suggest how it would be done. (I mean nuts and bolts. How many questions do I need? How do I come up with them? Etc.)

    I don’t think human solvers are useful for my current captchas. How do you propose they would solve my captchas? (Nuts and bolts.) Obviously, if someone sends a human to my page the human is going to unban the IP they request be unbanned. But other than this, how are you envisioning the human would assist the bot?

  23. Brandon:

    On numbers:

    My captchas have either 5 or 6 characters. That results in roughly 10^10 possible strings. (I have most of the alphabet, upper case and lower case, and digits 2-9. I skipped some letters and numbers as being too confusing. For example, I don’t use (0,o,O) and I don’t use (1,l,i,j), because with noise these are all too easily confused by humans.)
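
The size of that space can be checked with a short calculation. The Python sketch below assumes that only the characters named in the comment (0, o, O, 1, l, i, j) are excluded; the comment says "some letters" were skipped, so the real alphabet may be a little smaller.

```python
import string

# Assumption: exactly these characters are excluded and nothing else.
EXCLUDED = set("0oO1lij")

alphabet = [c for c in string.ascii_letters + string.digits if c not in EXCLUDED]

# Captchas are either 5 or 6 characters long.
total_strings = len(alphabet) ** 5 + len(alphabet) ** 6
```

Under that assumption the alphabet has 55 characters and `total_strings` lands between 10^10 and 10^11, in line with the rough figure quoted.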

    The captchas are generated on the fly. So even if I had you request the captcha for the same string, say “code”, the image would look different each time it is created.

    The code to call the captcha has a query string containing the “word” in the captcha. The word is encoded. I tweaked the key so that it’s now

    $key = $stub . date("zHis", $now);

    where $now is a time which is set when someone first fills out the form. The $stub is set in a settings file, so should I eventually let others use this I would advise them to change their stub.
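
For readers who don't use PHP: in `date()`, `z` is the day of the year counted from 0, `H` is the 24-hour hour, `i` the minutes and `s` the seconds. A rough Python equivalent of the key construction might look like the sketch below. It assumes the timestamp is interpreted in UTC (PHP uses the server's configured timezone), and the function name `derive_key` is made up.

```python
from datetime import datetime, timezone

# Distinct time suffixes available in a non-leap year: one per second.
KEYS_PER_YEAR = 365 * 24 * 60 * 60


def derive_key(stub, now):
    """Approximate PHP's $stub . date("zHis", $now), assuming UTC."""
    moment = datetime.fromtimestamp(now, tz=timezone.utc)
    day_of_year = moment.timetuple().tm_yday - 1  # PHP's "z" starts at 0
    return f"{stub}{day_of_year}{moment:%H%M%S}"
```

The same word therefore encodes differently from one second to the next, while the stub keeps the key from being guessable from the time alone.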

    The images for the math are also autogenerated. The problems are always of the form:

    230+56= or 450+39= etc.

    For some reason I have forgotten, the math problem uses two images instead of 1. I’m going to see if that is easy to change.
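
Generating problems of that form takes only a few lines. The Python sketch below assumes a three-digit number plus a two-digit number, which is what the two examples suggest; the actual ranges are not stated, and `make_problem` is a hypothetical name.

```python
import random


def make_problem(rng):
    """Return (text, answer) for a problem shaped like "230+56="."""
    first = rng.randrange(100, 1000)  # assumed: any three-digit number
    second = rng.randrange(10, 100)   # assumed: any two-digit number
    return f"{first}+{second}=", first + second


text, answer = make_problem(random.Random())
```

The text would then be drawn into the image, and only an encoded form of the answer would travel with the form.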

  24. Lucia,

    How about instructions like: “Add the first two numbers below then divide by the third”, or “subtract the second number from the third then multiply by the first”. That might hold off even the smartest of bots, since the combinations of add, subtract, multiply and divide and the instruction wording could vary enough to demand considerable coding to break.

  25. Dave,
    The concept of Captcha teams is a bit unnerving. Do you know how they work? Do they see the whole screen or just the captchas?

    If they saw and solved only the captchas, it would suggest that any other traps are handled by machines and the solved captcha would just be part of the solution. I still like the idea of the required calc description being in a text graphic too.

    Alas, this isn’t a Turing question after all since it does look like humans can be on the other end.

    Is all this bot assault to get comment access to place commercial urls?

  26. Oh. I didn’t see you had responded in the wrong topic lucia. I thought it was interesting you said:

    But I had planned to change to a system where I created the key I used by taking the key in the settings file (example “whatever”) and then using something like $key=”whatever”.date(“dhis”,$time) with whatever being the code I entered into the appropriate place in my settings file.

    Of course, then I have to store the time in a file so that the script knows what to use to decrypt. But I already do store the time for the request because I use it to decide that a request is stale and also to delete from the request files. So this method means that there are over 26 million keys a year. The visitor would probably have a difficult time figuring out what the key is.

    This isn’t really true. The time is displayed on the page the request is being made from. It’s quite easy to take that and generate a hash. Not knowing exactly what algorithm you used would make it more difficult, but not substantially so. Even if you removed that display, it’s not that challenging to discover a system’s “baseline” for time.

    Of course, it’s not something which would likely ever be done on your blog given the lack of value. Still, if you wanted to use the same approach in a secure manner, you’d use a second key along with the time to generate your hash (this is basically what HMAC does, if you want to look at it). That would pretty well make your keys unrecoverable (especially if you rotated the time-hashing key).
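
The HMAC approach mentioned here can be sketched with Python's standard `hmac` module. The function names and the `issued_at:answer` message layout are assumptions for illustration; the point is that the time may be public as long as the key is not.

```python
import hashlib
import hmac


def captcha_token(secret_key, issued_at, answer):
    """Bind the answer to the issue time under a secret key."""
    message = f"{issued_at}:{answer}".encode("utf-8")
    return hmac.new(secret_key, message, hashlib.sha256).hexdigest()


def verify(secret_key, issued_at, submitted, token):
    """Recompute the token for the submitted answer and compare in constant time."""
    expected = captcha_token(secret_key, issued_at, submitted)
    return hmac.compare_digest(expected, token)
```

A visitor who knows both the time and the algorithm still cannot produce or check a token without `secret_key`, and the same answer gives a different token at a different time.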

  27. Bah. Connection problems (I believe on my end) prevented me from editing my comment. As I was trying to add, I think I didn’t read closely enough. It sounds like you are using a second key to generate the time hash. If so, ignore what I said.

  28. Dave:

    Hi Lucia,

    It’s obvious Brandon is the expert on Captchas, and I wouldn’t dream of arguing with him, but I’m afraid I’m not a fan. I believe the going rate for human solvers – Captcha teams – is $1-2 per _thousand_ solutions. Your best bet is to tackle the problems in other ways.

    First, I want to point out I’m not an expert in these things. If anything, I’m the sort of person who talks to the experts and translates for them. For example, I’ve never written a neural net to generate correlation masks, but I’ve known several people who have. In other words, I can distill information, but I won’t be providing you anything you couldn’t find on your own (if you wanted to take the time).

    Second, while I think your price is a bit low, you’re also missing an important point. The reason human solvers charge so little is because they have such a bulk of things to cover. It’s like practically any product. The more you order, the cheaper the per unit cost is.

    Who would want to do that to break lucia’s CAPTCHAs? They can’t spam enough to warrant it, nor can they hope to recover that much information. Most likely, if someone did want to target this blog, they’d just do it in a way that didn’t get them banned (and manually unban themselves if they did).

    The last time I checked (it’s been a while) Yahoo used a weaker CAPTCHA system than lucia does. That’d be a far better target for that sort of effort.

    j ferguson:

    Dave,
    The concept of Captcha teams is a bit unnerving. Do you know how they work? Do they see the whole screen or just the captchas?

    It depends on the group. Sometimes what happens is if a bot hits a CAPTCHA, it sends a message to a solver with a screenshot, and he/she then fills the answer into a form which he/she sends back. Other times, the displays the bots are using are automatically forwarded to the solvers, and they can switch to one whenever they need to. Interestingly, the latter can be done without the bot having a monitor since it doesn’t actually need to see anything.

    Other times, a group is hired to “attack” a site by solving as many CAPTCHAs as it can. Their results are then analyzed in an attempt to break the encryption scheme used by the site. In this case, they actually visit the site themselves.

    There are probably other setups I don’t know about, but that should give you an idea.

    lucia:

    I don’t think human solvers are useful for my current captchas. How do you propose they would solve my captchas? (Nuts and bolts.) Obviously, if someone sends a human to my page the human is going to unban the IP they request be unbanned. But other than this, how are you envisioning the human would assist the bot?

    The bot would send a screenshot of the page (or individual fields/images, if sophisticated enough), and a human would then respond with the “answers.” The bot would read those answers, and it would input them.

    With your time limits in place, that is less likely to happen, of course. It’d also be far easier to just send a bulk of banned IP addresses to a group and have them manually submit unban requests…

  29. It sounds like you are using a second key to generate the time hash. If so, ignore what I said.

    Yes. I use a “stem” key and then add the time to create the key that is actually used. The purpose of the ‘time’ part is to ensure that if I’m encrypting $code, the actual value of $encrypted_code will be different at different times. But I need the stem. Otherwise, anyone who knew I used the time would be able to figure out the encryption in a snap. So, at its simplest the idea is:

    $key=$stem.time();

    The purpose of ‘.time()’ is to make it difficult for the bot to figure out the value of $stem.

    If this was commercial and I was worried idiot-users would fail to change the default stem I could suggest people enter the date of some event significant to them and use time relative to that date– which could be recent or past. That would make it more time consuming for a bot to figure out the correct “time” value to add to the stem. I can think of a bunch of other things to do- either instead of or in addition to– what I’m already doing.
    (It may seem unkind to anticipate that users would be idiots. But let’s face it, people are very bad about following directions fully especially when they don’t know why a particular thing needs to be done.)

    You are right on the time– I’m passing the time when someone loaded the first page in a hidden field in the form.

  30. lucia:

    Yes. I use a “stem” key and then add the time to create the key that is actually used.

    Ah. As soon as I reread the quote, I thought that’s what you did. I’m going to blame the fact I initially missed it on the fact code excerpts can be confusing, especially when embedded in a paragraph. I figure it’s either that, or I was just too lazy while reading the text.

    And I refuse to admit fault.

    >.>

  31. j ferguson–
    There are lots of ways Captcha teams could work. The optimum way would depend on how unique the captchas are.

    I think the idea of Captcha teams goes like this:

    Suppose a bank where people keep money limits access to something by requiring customers to enter a captcha. Suppose further that when creating captchas the bank spent a lot of money creating captchas that can’t be broken by even the most sophisticated OCR on the planet. Visually, they are unbreakable! Whooo hooo.

    But then, suppose in the next step, they cheaped out and created a finite number of these perfect captchas and also cheaped out on all other security measures. If someone breaks the captcha, they are in!

    For purposes here, let’s suppose the bank created 1,000,000 captchas. Next, suppose a cracker wants to break into someone’s bank account to steal millions. They figure out they can enter names and bank account numbers easily but then are presented with a captcha to get into the account. Suppose they then discover they can’t read the captchas with OCR. But then the cracker learns he can hire a team to read the captchas at a rate of $1 per thousand captchas. So for $1,000 he can learn the words associated with all 1,000,000 captchas. The cracker now sets about capturing the images of the captchas by visiting the site over and over. He stores each image and, using image processing software similar to the kind groups like Picscout use to find copyright violations, creates his own database. Then he submits his captchas to the team and has them read each captcha. Now he knows the solution for each of the million captchas, and he logs the solution along with each image in his database.

    Later, with solutions in hand, he adds some code to his bot script and sends it back to the bank. Now it can get past the bank’s pitiful security: the bot just compares whatever captcha it is shown to the images in the database, finds the match and enters the corresponding answer.

    This works if there are a finite number of captchas and the same ones are used over and over.

    But suppose there are an infinite number of captchas and each is used only once?

    There’s another way the captchas can be broken by humans. A bot could read the image url and just send that image to a person sitting in a 3rd world country who is reloading images over and over. That person answers the captcha and relays the answer to the machine the ‘bot is on. So, the info goes

    bot fills a variety of form fields and submits->
    At some point, the bank presents Captcha image url->
    Bot sees captcha URL->
    Bot sends URL_link to person with that precise Captcha image embedded in a form->
    Person loads link->
    Person solves captcha and clicks ‘submit’->
    This sends the person’s answer to the bot->
    Bot enters that answer in the form at the bank->
    Bot continues to fill out all other form fields
    ->thief gets money.

    This is only useful if the bot can fill all other form fields without the assistance of the human.

    That means the thief has to have programmed a bot that anticipates what that particular bank is going to ask for. Writing the program takes time. Setting up the captcha solving network is also work. But the project can be an economically viable (though felonious) business endeavor provided the prize at the end of the whole thing is sufficiently attractive.

    Notice that captchas are never the only security feature for anything that needs real security. The purpose of captchas is to make ‘bot enabled business plans more expensive to operate.

  32. Brandon

    It’d also be far easier to just send a bulk of banned IP addresses to a group and have them manually submit unban requests…

    That’s what I think people would do! The reason I asked Dave is that I think the method where a bot crawls and then asks a human to answer the captcha would be more expensive than just having the human visit.

    Once my system switches to sending email that will change. Because at that point, the bot would fill in the email and IP, then they get the human to solve the two other questions, then they submit. This spares the human the time required to type in the IP and their email. It’s not much of a savings in time for the human.

    In contrast, breaking into PayPal… if they figure out all the other stuff, then maybe getting the human to read a captcha could be worth it.

  33. Brandon–
    What’s a bit interesting is that the minute one goes from brainstorming possible methods to thinking about implementing the captcha alternatives people have suggested, one starts to realize why many of these simple alternatives are not used. Especially if you think “How would I break that system”.

    SteveF

    That might hold off even the smartest of bots, since the combinations of add, subtract, multiply and divide and the instruction wording could vary enough to demand considerable coding to break

    I’d sort of thought of this but the instruction wording can only vary as much as I code it to vary. So I would need “templates” of instructions. If I have a finite number, the bot could be programmed to recognize them.

    So I don’t think explaining the math in words is helpful except to the extent that the method is dissimilar to what’s used at sites like yahoo.

    Carrick’s point holds: if one’s method differs from the other ones, the pre-existing bots aren’t programmed to solve it. This makes the novel method safer when used “in the wild”. But I think from a theoretical POV, the math-in-words method isn’t any better than the “solve the simple math problem shown in the image” method.
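
The template objection can be illustrated from the bot's side. The Python sketch below uses the two instruction wordings SteveF suggested; the regular expressions and the function name `solve` are assumptions about how a bot author might go about it.

```python
import re

# One (pattern, computation) pair per instruction template the site uses.
TEMPLATES = [
    (re.compile(r"add the first two numbers.*divide by the third", re.I),
     lambda a, b, c: (a + b) / c),
    (re.compile(r"subtract the second number from the third.*multiply by the first", re.I),
     lambda a, b, c: (c - b) * a),
]


def solve(instruction, numbers):
    """Return the answer if the instruction matches a known template, else None."""
    for pattern, compute in TEMPLATES:
        if pattern.search(instruction):
            return compute(*numbers)
    return None
```

Each new wording the site adds costs the bot author one more entry in `TEMPLATES`, so a finite set of templates only raises the one-time cost of writing the bot.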

  34. lucia:

    At some point, the bank presents Captcha image url->
    Bot sees captcha URL->
    Bot sends URL_link to person with that precise Captcha image embedded in a form->

    This may or may not work, depending on the server. It’s fairly common for a server to dynamically generate a CAPTCHA image, send it, then delete the copy it holds. Assuming the server doesn’t allow a person to force a particular hash to be used by the CAPTCHA generator, a link to the image may be useless.

    Of course, it’d be easy for the bot to just send the image.

    In contrast, breaking into PayPal… if they figure out all the other stuff, then maybe getting the human to read a captcha could be worth it.

    PayPal’s CAPTCHA was easy to break a few years ago. It didn’t mean a whole lot though. All they used it for was when people created accounts.

    What’s a bit interesting is that the minute one goes from brainstorming possible methods to thinking about implementing the captcha alternatives people have suggested, one starts to realize why many of these simple alternatives are not used. Especially if you think “How would I break that system”.

    That sort of process is part of what made me become fascinated with security. I always loved coming up with ideas, then figuring out why those ideas wouldn’t work (and after, I’d try to find ways to fix them).

    By the way, I’m not looking for sympathy, but about six hours ago I started running a fever, and it’s gotten worse. If I don’t respond in the conversation, that’s why.

    (I just don’t want anyone to ask me a question and wonder why I never respond)

  35. “There’s another way the captchas can be broken by humans. A bot could read the image url, and just send that image to a person sitting in a 3rd world country who is reloading images over and over; that person answers the captcha and relays the answer to the machine the ‘bot is on.”

    How does the bot even know that there is a CAPTCHA to solve? Many sites are quite explicit about it, with the image immediately preceding the input field. But why not have something that’s easier to read, but unless you can understand some text, the bot cannot know where the data is located – i.e. hidden in plain sight.

    Something like “enter the initial characters from the nth sentence on this page”, or
    “enter the characters th*t are m*ssing fr*m thi* line”.
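
    To make the second suggestion concrete, here is a rough PHP sketch of how such a line might be generated. The function name mask_line and the choice of positions are assumptions made for illustration; nothing like this appears in the unban script.

```php
<?php
// Replace the characters at the given positions with '*' and collect them as the answer.
function mask_line(string $line, array $positions): array
{
    $answer = '';
    foreach ($positions as $p) {
        $answer .= $line[$p]; // remember the hidden character
        $line[$p] = '*';      // hide it in the text the visitor sees
    }
    return ['shown' => $line, 'answer' => $answer];
}

$challenge = mask_line('enter the characters that are missing from this line', [23, 31, 40, 46]);
// $challenge['shown'] is displayed; $challenge['answer'] would have to stay on the server.
```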

  36. steveta_uk–
    First you have to consider two cases:
    1) pages bots will be programmed to read over and over (e.g. Pay pal). For these, no amount of moving the image will help. The programmer will just visit the page, step through and then program the bot to find whatever is on that page. The programmer will find stuff that a person will find and program accordingly.

    2) It’s likely moving the image might not confuse the bot at all. If the image is wrapped in <img src="the_captcha">, the bot will find it by looking for “img src” and/or “the_captcha”. The captcha might even be a known size– if so, the bot can look for the “height=’xx'” and “width=’yy'” bits. Moving the captcha will make it harder for the human, not the bot, because the bot finds it by searching for a certain string.

    To think about how to hide things from bots you need to think about what’s hard for bots. That means you have to think about how they find things. To think about how a bot finds something you have to think about how you might program it to find things. Bots don’t have “eyes”. But they can be programmed to search for strings.

    No matter where you put the string, the bot is likely to find it.
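
    A rough sketch of that kind of string search in PHP. The helper name find_image_sources and the sample page are made up for illustration; the point is only that the position of the tag in the markup makes no difference to the search.

```php
<?php
// Hypothetical helper: return the src of every <img> tag in a page,
// no matter where in the markup the tag sits.
function find_image_sources(string $html): array
{
    // Match <img ... src="..."> or src='...', case-insensitively.
    preg_match_all('/<img\b[^>]*\bsrc\s*=\s*(["\'])(.*?)\1/i', $html, $matches);
    return $matches[2];
}

$page = '<p>Intro</p><div><img width="120" height="40" src="the_captcha.png"></div>'
      . '<form><input name="answer"></form><img src=\'logo.gif\'>';

$sources = find_image_sources($page);
```

    A bot that also knows the captcha’s size could narrow the list further by checking the width and height attributes in the same way.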

  37. Questions of the nature

    ‘What animal/plant/object is at the center/right/top/left/bottom of this image and what is its color/size/pattern’ are probably the most difficult to bot code solve but easiest for a human to determine whilst being a reasonably effective captcha.

    That does not get round the ‘human supported bot’ problem though.

  38. I’m not sure why you think that is the most difficult for a bot to solve. I think whether or not it was difficult for a bot would depend on details of the implementation: a bot could easily be programmed to detect words like “center/right/top/left” and locate which part of the image is “center/right/top/left”. How many tips are you going to give the human to help them give the ‘right’ answer?

    Here’s an image

    Does the human enter “tyrannosaurus rex” or “dinosaur”?

    And how many animal images are you going to have? If you only have a few, the bot can be programmed to use image recognition to figure out that the image is a “whatever the captcha thinks it is”. In fact, the bot could be programmed to give more reliable answers because it doesn’t worry about whether it’s a “tyrannosaurus rex” or “dinosaur”.

  39. Not long after I posted I realised that practical versions are less easy to come by. I was more thinking along lines of multiple animals in an image and questions that require thought rather than logic.

  40. RichardLH–
    I assumed you were thinking of something like a 3×3 grid containing 9 animal images. Then someone is supposed to say what animal is in the “top right” grid. Is that what you are thinking? Because unless you have lots and lots of images of animals, I think it would be very easy to program a bot to defeat that system. Also, unless people recognize all the animals and call each animal by the same name, it’s going to be frustrating for people because some will think they should write “T Rex”, others “Dinosaur” for the picture above. Some will write “bunny” where you planned they would write “rabbit”.

    So, I continue to think whether that method is useful depends on the details.

    Of course everything works initially. But the only reason new methods work is that, for the time being, no one uses the method and so no one has written the ‘bot that breaks it.

  41. More like ‘Enter the letters/numbers the cats are sleeping on’ from an array of images.

    I accept that the number of images may need to be large so probably impractical in reality (or too easy to defeat) though it might be possible to automate merging images to get the numbers up.

    It was just trying to explore what question could be posed that require some human thought process to complete rather than something that can simply be solved by logic (or brute force).

  42. Something like that. I suppose it comes down to using a random complex background image rather than noise and then constructing a question that has human meaning to interpret rather than just ‘all visible chars’.

  43. Richard LH–
    Depending on implementation, a method where people read numbers that are superimposed on background images or animals could be trivially easy to crack or very difficult. It shares the problem with anything using images of animals or constructing questions that humans interpret. If one has a finite number of images of animals and a finite number of possible questions, it is trivial to program a bot that:

    a) memorizes the possible questions that the programmer coded in and ‘recognizes’ the process it needs to implement to get the answer. Owing to the need to ask people to recognize common animals, the number of types of animals is going to be finite, and likely rather small. That is, you’ll have people identify ( cat, dog, horse, pig ) not (cat, dog, horse, pig….. tapir, black necked stilt etc.) So, it seems to me the bot is going to know it’s looking for a “cat” if the question includes the word “cat”. The bot does this by using something like

    if( stristr($text,"cat") ){ $what_to_look_for ='cat'; }

    b) Now, assuming you have a finite number of cat pictures, and the bot recognizes the question includes the word “cat”, the bot can be programmed to compare all images to all the cat images it has memorized. The bot will need to carry around image processing stuff, but it’s existing image processing stuff. (For example, google uses it to find matching images on its image search function. So do picscout and tineye!) So, consider this “done”.

    c) The bot can then read the number superimposed on the cat picture it recognized. Because of the limitations of the human eye, we can be pretty sure the font used for the number will be bold and solid. That means the bot will find it trivial to find the number.

    d) The bot can provide the correct answer in the box. I would imagine that, if programmed this way, the bot will answer correctly nearly 100% of the time.
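
    Step (a) extended to a whole list takes only a few more lines. This is a sketch, not anyone’s actual bot: the function name what_to_look_for and the four-animal list are assumptions taken from the example above.

```php
<?php
// Hypothetical list of the animals the captcha author chose to use.
$known_animals = ['cat', 'dog', 'horse', 'pig'];

// Return the first known animal named in the question, or null if none is.
function what_to_look_for(string $question, array $known_animals): ?string
{
    foreach ($known_animals as $animal) {
        // stripos() is a case-insensitive substring search
        if (stripos($question, $animal) !== false) {
            return $animal;
        }
    }
    return null;
}

$target = what_to_look_for('Enter the number on top of the CAT', $known_animals);
```

    A plain substring match like this would also fire on “pigeon” for “pig”, which is one reason the details of the implementation matter.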

    In fact, depending on implementation of your idea, the bot might be even easier to program.

    Suppose you had 9 images with 9 number strings but someone has to pick the numeral on top of the TRex. A very easily coded spam bot that’s never seen any of your images at all will read the 9 number strings by just recognizing the numbers, which will be bold solid characters on top of “stuff”. The bot won’t be even slightly distracted by the images.

    A bot that only does this won’t know which is the right number but it knows one of them is right. It now has a captcha it can break at least 10% of the time– with almost no programming! Since bots are tireless, this will let 10% of the spam in.

    A method that lets 10% of spam attempts through will let a lot of spam pass through. Lots. Mind you, filtering 90% of spam is better than nothing. It can also be very useful as a prescreen to a downstream spam filter. It’s not bad for keeping UAH bets from being flooded– but it’s still not very good.
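
    The arithmetic behind that, as a small PHP sketch. The function names and the attempt counts are hypothetical, and the second function assumes each attempt is an independent guess against a fresh captcha.

```php
<?php
// Chance that a single blind guess among $choices candidates is correct.
function guess_pass_rate(int $choices): float
{
    return 1.0 / $choices;
}

// Chance that at least one of $attempts independent guesses gets through.
function chance_any_success(int $choices, int $attempts): float
{
    return 1.0 - pow(1.0 - guess_pass_rate($choices), $attempts);
}

$rate            = guess_pass_rate(9);        // 1/9, a little over 11%
$expected_passed = 9000 * $rate;              // about 1000 of 9000 hypothetical attempts
$any_in_20       = chance_any_success(9, 20); // roughly 0.9
```

    So a bot that simply picks one of the nine numbers at random gets through about one time in nine, and is very likely to get through at least once in a couple of dozen tries.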

    The fact is, how well a method works depends on both the basic idea and the details of the implementation. I imagine you could implement details that make this work– but they all involve distorting the characters so that the bot can’t read the nine numbers it’s lifted off the 9 images.

    But then you are back to making a good captcha. So why add the complication of the animal images?

  44. Hmm. I think I would put the question as graphical text overlayed in the image to prevent easy detection of the question by bots.

    Also large images are very difficult to process especially if they have deliberate traps in them designed to entice bots as you are doing elsewhere.

    Your existing system seems to be quite effective though so all this is probably academic.

  45. Hmm. I think I would put the question as graphical text overlayed in the image to prevent easy detection of the question by bots.

    Ok. How would you make sure the whole question is not easily read by recognizing what color represents text (which is easy) removing all non-text colors from the image (which is easy) and then using OCR to read the question? (And then interpret it as described above?)

  46. Lucia,
    I think Richard LH is getting at the same thing I was – graphical text. I had thought that presenting the bot with a field (the fill-in-blank) and a big graphic with no strings might be perplexing. It’s true that it could be OCR’d but the text could be multi-typefaced, have uneven spacing, tilted 90 degrees, and maybe even be script.

    Of course the effectiveness of something like this assumes that it won’t be turned over to a human or if so, the question could request something like what wood instrument is played with a puck.

    I thought one of the strengths of graphical text was that it would need to be converted to text to Google the question. This might make the time to solution (TTS) unacceptable to the human hand-off.

    On the other hand, this is a bit of a torture device for the miscreants who find themselves banned. I’m sure you are trying to make it easy to unban themselves especially as the ban may not have been earned. Is there a lot of this?

    It also seems possible that bot sniffer operators on encountering really really good protection might redouble their efforts on the assumption that such great protection must be warranted.

  47. j ferguson–

    I think Richard LH is getting at the same thing I was – graphical text.

    Yes. And I’m puzzled why you think a bot that can break a captcha would have any trouble with graphical text. A captcha already is a form of graphical text.

    It’s true that it could be OCR’d but the text could be multi-typefaced, have uneven spacing, tilted 90 degrees, and maybe even be script.

    In other words: The long text could be made difficult to read by distorting it in the exact same way I already distort captchas. But I don’t see how that presents any problem at all to a bot that can break a shorter captcha. If a bot can read the shorter captcha it can read the long question.

    Once the question is text, if there are a finite number of questions, the bot can parse the question and do the rest of the problem. So the bot is done.

    If the bot can’t read the long question, the shorter captcha will also be sufficient. So I don’t see what it is about this that you think presents anything difficult to a bot.

    the question could request something like what wood instrument is played with a puck.

    I don’t know the answer to that. Is there such an instrument? I’m not going to google “wood instrument” and “puck” to get the answer.

    This question seems like it’s better at preventing humans from using the unban page than preventing bots from using it. Writing the question as an image is irrelevant because I don’t know the answer period.

    I’m sure you are trying to make it easy to unban themselves especially as the ban may not have been earned. Is there a lot of this?

    Enough to create an unban page. Making it easy for them to unban themselves is one of the reasons I created it. Carrick used it soon after I announced it and before I wrote the link on the sidebar. The other reason is to collect data so that I can figure out why the innocent get banned and fix it if possible.

    It also seems possible that bot sniffer operators on encountering really really good protection might redouble their efforts on the assumption that such great protection must be warranted

    I doubt it. People who write ‘bots generally know what they hope to get from the bot based on visiting the site or sniffing out other things at the site. They know PayPal has money. If they think they can make money by leaving comment spam, they know anything that has “wp-comments-post.php” available can potentially be spammed.

    That said, professional captcha breakers do like to write web pages making fun of novel ideas that don’t work and present them as examples of what not to do.

    One of the bad ideas is multicolor noise. Multicolor noise is very difficult for humans but really, really easy for bots. It doesn’t matter if the multi-color is random noise or an image of a cat– it’s easy for a bot to remove it.

    Here’s the sort of “if” statement that would take the pixel (x,y) from the noisy image, and then only write the pixel to a cleaned up image if the color matched the “text color”


    for ($x = 0; $x < $width; ++$x) {
        for ($y = 0; $y < $height; ++$y) {
            $c = imagecolorat($noisy_image, $x, $y);
            if ($c == $text_col) {
                imagesetpixel($cleaned_up_image, $x, $y, $c);
            }
        }
    }

    That's it. It doesn't matter if the background is solid, random or the photo of a cat. As long as the background pixel color doesn't exactly match the text color (1 of 256^3 possible colors), the background is stripped. If $noisy_image was text on top of an image of a cat, the cat is gone.
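
    The same filter can be tried without any image library by treating the image as a plain array of color values. This is only a toy stand-in for the GD calls: the function name strip_background, the 3×3 “image” and the white fill color are all assumptions made for illustration.

```php
<?php
// Keep only pixels whose color equals $text_col; everything else becomes $blank.
function strip_background(array $noisy, int $text_col, int $blank = 0xFFFFFF): array
{
    $cleaned = [];
    foreach ($noisy as $y => $row) {
        foreach ($row as $x => $c) {
            $cleaned[$y][$x] = ($c === $text_col) ? $c : $blank;
        }
    }
    return $cleaned;
}

$T = 0x000000; // text color, assumed to be known to the bot
$noisy = [
    [0x3A7F12, $T, 0xC0FFEE],
    [$T,       $T, 0x123456],
    [0xABCDEF, $T, 0x654321],
];
$cleaned = strip_background($noisy, $T);
// Only the black "text" pixels survive; every multicolor pixel is now white.
```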

    Other blunders include passing the solution in a hidden form field, or using easily memorized calls to create the captchas. Etc. Humans are generally unaware of these blunders, but you'll notice Brandon asked me about some and I answered how I'd dealt with them.

  48. Lucia,
    Somehow I came to think that more text to ocr is much more challenging than one or two captchas. This may be my ignorance and/or naivete.

    In dark antiquity I spent days teaching a Kurzweil OCR routine what the letters and words it couldn’t read actually were so that it would be able to read them next time it found them. I had to do this with every new or different book, until the machine got a grip on the peculiarities of the typeface and printing quality. This was an early project to convert existing law casebooks to text. The machine got better and better, but it never got perfect. It was able to “see” context and apply all manner of algorithms, not simply try to match a template. And yes, I didn’t entirely understand how it worked.

    Assuming that bots couldn’t accurately ocr all of the words in, say, a 60 word string, one might simply ask the unbanning aspirant to type the 36th word, and then have a couple of non-words (btfsatfl for example) mixed in with the text.
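
    A rough PHP sketch of how the “type the Nth word” idea might be wired up. The function names, the wording of the question and the short word list are all hypothetical, and drawing the words into a distorted image is left out entirely.

```php
<?php
// Build a "type the Nth word" challenge from a word list (1-based $n).
function make_word_challenge(array $words, int $n): array
{
    return [
        'question' => "Type word number $n from the text in the image.",
        'answer'   => $words[$n - 1], // kept server-side, never put in the form
    ];
}

// Compare leniently: ignore surrounding white space and letter case.
function check_word_answer(string $given, string $expected): bool
{
    return strcasecmp(trim($given), $expected) === 0;
}

// A short list with one non-word mixed in, standing in for the 60-word string.
$words     = ['please', 'btfsatfl', 'type', 'the', 'word', 'requested'];
$challenge = make_word_challenge($words, 2);
```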

    I thought hockey stick might be wood instrument played with puck.

    alas, too obscure. nuts.

  49. The difficulty for bots is that this can easily turn into a multipass solution (which requires much more coding).

    Pass 1. OCR to find question.
    Pass 2. Remove text of the question from the image.
    Pass 3. Using the question OCR the resultant to find answer.

    Just deciding which is question and which is answer can be non-trivial (other than a brute force method).

    Anything can be coded for of course, it is just a question of how likely is it that general (as opposed to targeted) solutions will be coded for this level of complexity.
