
Many of you have surely noticed I’ve been silent. This is because I transitioned to incorporating the solution to all the hacking/scraping/cracking/spamming into a WordPress plugin. I know most of you don’t know what these are; but mostly, plugins are the way people share solutions with others who use WordPress software to run their blogs. In this particular case, the main thing the plugin does is run 3 cronjobs, which:
- Examines a table created by the “Bad Behavior” plugin and submits selected IPs to be banned at Cloudflare.
- Examines a file created by ZBblock and submits selected IPs to be banned at Cloudflare.
- Looks up which IPs were banned at Cloudflare 7 days ago and unbans most of them.
Now that I have this running, I will be using it for about a week, and then uploading it at WordPress so others can share. ( I estimate the plugin will solve most small-time bloggers’ major issue of hackers/spammers/crackers using huge amounts of server resources hunting for vulnerabilities. However, by itself, it doesn’t help with image scraping. But I have that dealt with and can later add an image scraper module for people. )
In the meantime, there is one issue that I could use help on: How to decide a) which IPs should never be banned at Cloudflare and b) which should never be unbanned at Cloudflare.
With regard to (a) I’ve tried to collect together a list of IPs from Google, Yahoo, the msn bot in all its variations, Feedblitz, Feedfetcher and scoutjet. Does anyone know other useful, friendly things that should never be banned even if they accidentally acquire a dangerous looking uri that makes them look like a hacker for a short time? (All spiders sometimes pick up uris that are not only wrong but bad.)
With regard to (b): The goal is to permaban spammers and hackers on servers. So, for example: My blog is hosted on dreamhost.com– a server. In contrast, I surf from an IP assigned by AT&T, an ISP. Hackers use both. But what we see is that some hosting companies have developed a reputation for permitting spammers/hackers/crackers/scrapers to operate servers that spam/crack/scrape etc. The purpose of the permaban list is to never unban an IP from these sorts of servers. (A human will still be able to unban.)
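The two lists described in (a) and (b) can be sketched roughly as follows. This is only an illustration in Python, not the plugin's code (which is PHP): the names `NEVER_BAN`, `NEVER_UNBAN`, `may_ban` and `due_for_unban` are hypothetical, and the address ranges are documentation placeholders rather than real crawler or hosting ranges.

```python
from datetime import datetime, timedelta
from ipaddress import ip_address, ip_network

BAN_DAYS = 7  # from the post: bans are lifted after 7 days

# Hypothetical lists; a real install would fill these from the plugin's entry form.
NEVER_BAN = [ip_network("192.0.2.0/24")]       # placeholder for friendly crawlers
NEVER_UNBAN = [ip_network("198.51.100.0/24")]  # placeholder for a bad hosting range


def in_any(ip, networks):
    """True if the address falls inside any of the given networks."""
    addr = ip_address(ip)
    return any(addr in net for net in networks)


def may_ban(ip):
    """(a) Whitelisted addresses are never submitted for banning."""
    return not in_any(ip, NEVER_BAN)


def due_for_unban(bans, now):
    """(b) Return IPs banned at least BAN_DAYS ago, skipping the permaban list.

    `bans` maps an IP string to the datetime at which it was banned.
    """
    cutoff = now - timedelta(days=BAN_DAYS)
    return sorted(
        ip for ip, banned_at in bans.items()
        if banned_at <= cutoff and not in_any(ip, NEVER_UNBAN)
    )
```

Keeping both lists as network ranges rather than single addresses means one entry can cover a whole hosting company's block.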
Now, given that background, I’ll cut to the chase and ask two questions:
1) Is there a generic way to discover if an IP is associated with a server (more like Dreamhost) or an ISP (more like AT&T, Verizon etc.)
2) Specifically, how do I figure out if “Brazil Sao Paulo Comite Gestor Da Internet No Brasil” is a server or an ISP?
Because if those Brazilian spammers are on servers, once banned, I want to keep them banned. If they are on ISPs, I want to unban after 7 days. Unbanning helps prevent me from eventually having banned everyone in Brazil.
Unlike the pre-plugin days, adding to the permaban list wouldn’t require any programming. The plugin has a nice entry form– sort of like betting on UAH. (That’s the whole good thing about writing these. But it was something of a PITA.)
Also: Now that I wrote the plugin, I can equally well whitelist IPs to avoid ever banning them at Cloudflare. If you have been a frequent banning victim, let me know and I can whitelist IPs. ( You might continue to be banned during the first week– while I strip out the old solution which is still running! But afterwards, you won’t get banned. I do know a few people who need to be whitelisted. But if you have been banned, pipe up, I can read the comment and whitelist you! Hhmm… maybe I should write the plugin to white list everyone who has an approved comment in the past month? That doesn’t help lurkers, but it helps some people. )
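The idea of whitelisting everyone with an approved comment in the past month could look something like the sketch below. It is Python rather than the plugin's PHP, and the record fields (`ip`, `approved`, `posted_at`) and the function name are assumptions made for illustration, not WordPress's actual comment schema.

```python
from datetime import datetime, timedelta


def recent_commenter_ips(comments, now, days=30):
    """Collect the IPs of commenters with an approved comment in the last `days` days.

    `comments` is a list of dicts with hypothetical keys:
    "ip" (str), "approved" (bool) and "posted_at" (datetime).
    """
    cutoff = now - timedelta(days=days)
    return {
        c["ip"]
        for c in comments
        if c["approved"] and c["posted_at"] >= cutoff
    }
```

The resulting set could then be merged into the never-ban list on each cron run, which, as noted, still does nothing for lurkers.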
Now…. I’m going to go exercise, visit my mother in law… and with luck, Roy will post his UAH. Because the plugin is running!
Of course… open thread. I’ll open one for climate stuff soon!
Lucia –
I’m one of the people that basically doesn’t understand much of what you’re attempting to do – except in outline – but I enjoy reading about it. It’s fascinating and it doesn’t bother me that some of the technical things go right over my head.
All power to your elbow – and programming.
P.S. I haven’t been banned in any shape or form for a couple of months 🙂
Ditto all of Anteros’s comments. Nice to have you out of purdah and back in the real world, Lucia.
Personally, I wouldn’t give Brazil Sao Paulo Comite Gestor Da Internet No Brasil the time of day, but who knows? 🙂
Anteros–
Good! I think the new way you won’t get banned– but if you do, it will be easy for me to whitelist you.
Mostly, after figuring out which steps actually work to save bandwidth/cpu/etc. I organized what is likely a more time efficient and ‘user friendly’ way to do things. I know I do need to strip out all the old code still wrapped around the blog and I’ll be doing that over the week. (It’s not a time intensive task. But notice I unban after 7 days? To minimize ‘lucia’ time on task, I need to let scripts unwind. So, some of the ‘old’ things are still running and small tasks get undone as they no longer do anything meaningful.)
People are still sometimes going to get banned– but likely less often. There will be a post when I get this loaded at WordPress; that will be in a week or so.
Still haven’t seen Roy’s March temps!
I missed you. Glad to hear that it was nothing more serious:)
denny–
Nothing major. But while trying to fiddle to figure out where the earliest hook in WordPress is, or remember how to set up a mysql table when plugin is “activated” or thinking through how to come as close to idiot-proofing a plugin that is more complex than most, it was just easier to not post something.
Using this plugin IS more complicated than most. There are very few plugins that are designed to be used only if someone has installed a 2nd plugin and/or installed a script available on a separate site totally independent of WordPress. So, I wanted to think through: If a person who blogs about ‘whatever’, who gets antsy at the word “ftp” and throws in the towel if you say “unix”, wants to install this, what can I do to make things a little easier? So, the admin panel does things like suggest the likely path to ZBblock and confirm whether ZBblock and all its files were found. Etc. (Those using Bad Behavior only don’t need to ftp, but if they use ZBblock– which is great– they need to ftp. And my plugin is only useful if one uses either one or the other, and better yet, both!)
I am used to being banned…but mostly from activities that my wife disapproves of…
Hey! Algeria (or at least my portion of it) is not on the banned list!
Ok, except for the 2 minute timeout for posting too fast…
Les-
I’m sure your wife would disapprove of:
1) You pretending to be the googlebot
2) You attempting cross-site scripting attacks (XSS) or
3) You trying to scrape my site.
If she doesn’t disapprove, I’d be happy to explain why she should kick you to the curb if you did these things.
But…. so far… I haven’t seen you do that.
But there are a few people who see the ZBblock scary page because they travel to far away places. I need them to tell me so I can comment out those bans.
Commenting out certain bans has helped a few in Norway, Thailand, Africa etc. So, it can matter. It’s a simple matter provided I’m told promptly. But now with the plugin I can do better: I can whitelist. (I still need to do some workarounds for certain people reading the blog from certain countries.)
My wife would surely disapprove of those activities, or activities that sound like those…the cross site scripting, for example…
.
But, the plugin seems to be working for me.
Interesting. If I edit an entry twice, it goes back to the original post. It does take the second change, but drops the first.
Still, I like being able to edit. Sadly, I can only see mistakes after posting. Much like I need to print a report to see the errors.
I was wondering where you were hiding. See the mail we sent you a little bit ago.
Mosher– I skimmed and marked to look at later….. Sorry, but I’ve been focused on the plugin. It’s one of those things that’s best to just do and get over with. There is so much stupid wordpress specific knowledge that is only relevant to writing a plugin and nothing else. e.g. finding which wordpress hooks are useful, figuring out the wordpress specific calls to do something safely in wordpress, or figuring out how to create a submenu in…. you guessed it… wordpress!
None of this wordpress specific stuff ever translates into creating something for some other reason.
For what it is worth, “Brazil Sao Paulo Comite Gestor Da Internet No Brasil” is being more than a bit pretentious. Translated roughly: “The steering committee for the internet in Brazil”. Almost certainly, no such thing exists. Pretentious titles are common in Brazil (OK, more common than in the States, but of course, they are present everywhere!).
SteveF–
The steering committee must be sending out proclamations saying: “Go forth and impersonate the googlebot while trying to submit spam!”
Bad Behavior (and my .htaccess rules) prevent any of the spam from even making it to Akismet’s filters. But it’s amazing how much of it there is. Oddly, the other main googlebot impersonator seems to be in Portugal. Something about speaking Portuguese must make people think, “I think I’ll pretend to be the googlebot today!”
Based on that translation I wouldn’t be surprised if that’s just the name of someone who owns a huge block of IPs and then lets others have access somehow. If so, I won’t be able to determine if they are servers or ISPs. Too bad. If they were servers and I could tell, I’d just keep them blocked at Cloudflare.
OT, but RPS seems to have the latest UAH TLT but Dr. S hasn’t posted it yet.
http://pielkeclimatesci.wordpress.com/2012/04/03/march-2012-global-temperatur%e2%80%8be-report-from-the-university-of-alabama-at-huntsville/
YFNWG – well spotted. I have a depressing feeling that moschops has scooped the quatloos this month… must have been a freak result or somesuch…
+0.11 Who’d have guessed? – Not me for one! 🙂
Lucia,
“Something about speaking Portuguese must make people think, “I think I’ll pretend to be the googlebot today!”
Eu não sei. Falando Português me dá nenhuma inclinação para representar-mim como um bot do Google. (I don’t know. Speaking Portuguese gives me no inclination to pass myself off as a Google bot.)
SteveF–Admit it. You really want to pretend to be the googlebot. You know you do!!
Anteros– I need the extra digit I’ll get when Roy posts. It could make the difference.
Lucia, to your questions:
1) Is there a generic way to discover if an IP is associated with a server (more like Dreamhost) or an ISP (more like AT&T, Verizon etc.)
I think what you really want to get at here is whether the IP is Static (like a server in your characterisation) or Dynamic (like an ISP connection). In the case of static IPs, the user has an agreement with the upstream provider that their IP will not change over time or when they reconnect. Of course, static IPs do change as well; just less frequently.
Is there an infallible way to distinguish? The general answer is no, but. The “but” is because ISP-assigned dynamic IPs are a significant source of spam from infected home PCs, so our doughty and sometimes annoying spamfighters have taken the trouble to collect ranges of IPs that ISPs use for dynamic assignment.
These ranges are put into Realtime Blackhole Lists (RBLs) so that mail servers can immediately see if a dynamic “home” address is sending mail directly, without going through an ISP’s server.
You might check out pbl.spamhaus.org and dyn.nszones.com for two such commonly used lists. http://www.spamhaus.org/faq/section/DNSBL%20Usage might be a good start point. RBLs work like DNS – if you send a DNS query to them for the IP, they will respond with a code telling you something about the queried IP. For example, Spamhaus PBL will return 127.0.0.10 or 127.0.0.11 for known dynamic IPs. Since these lists are actively maintained for anti-spam purposes, they do tend to be as good as you’ll get.
As for your second question, if I look up the relevant IP in rbls.org http://rbls.org/187.105.167.153 it shows me that it is listed as Spamhaus-assigned dynamic, and also in dyn.nszones.com.
Um. Perhaps I should have suggested http://www.nszones.com/dyn.asp as an even better place to start:
“DNSBL queries are structured by the inverse IP address as a subdomain of the DNSBL zone. For example, to check that the general DNSBL test address of 1.2.3.4 is listed in DYN, query 4.3.2.1.dyn.nszones.com with any NS lookup tool such as…
4.3.2.1.dyn.nsZones.com
You will receive an address:
127.0.0.3 – Dynamic IPs (Dial up, ADSL, Cable, no PTR IPs)”
Hemingway couldn’t have written it more clearly 🙂
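The reversed-address convention quoted above is easy to get backwards, so here is a minimal sketch of building the query name. It is written in Python for illustration only; `dnsbl_query_name` is a hypothetical helper, and it handles IPv4 addresses only.

```python
from ipaddress import IPv4Address


def dnsbl_query_name(ip, zone):
    """Build the DNS name to look up: reversed octets, then the DNSBL zone."""
    octets = str(IPv4Address(ip)).split(".")  # IPv4Address rejects malformed input
    return ".".join(reversed(octets)) + "." + zone
```

For the test address in the quote, `dnsbl_query_name("1.2.3.4", "dyn.nszones.com")` gives `4.3.2.1.dyn.nszones.com`.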
Yes. I mostly don’t have a problem with these and banning 7 days and then unbanning is fine. Also, Bad Behavior and ZBblock look up at various places (Project HoneyPot and SFS respectively. )
But what I’m trying to figure out is if the entire block at that Brazilian network is server company that just lets spammers take out accounts on their machines. Some groups do that– and in that case, it’s worth never bothering to unban IPs in that range.
Jim–
Thanks! Looking at that after drinking coffee, it looks very useful. For my purposes, it looks like:
a) If an IP is on the dyn.nszones.com only block the spammer IP for 7 days.
b) If it is not on the dyn.nszones.com consider letting it stay blocked forever (or until an actual human being squawks.)
I’ll have to do some monitoring before adding an IP block to the “permaban list”, but not being dynamic is a factor in favor of permabanning–assuming it is a known spammer.
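The a)/b) rule above boils down to a small decision function. The sketch below is an assumption about how it might be expressed, in Python rather than the plugin's PHP; the function name and the returned labels are made up for illustration.

```python
def ban_policy(known_spammer, listed_dynamic):
    """Pick a ban policy from two facts about an IP.

    known_spammer:  the IP was caught misbehaving (e.g. by Bad Behavior or ZBblock).
    listed_dynamic: the IP appears on a dynamic-address list such as dyn.nszones.com.
    """
    if not known_spammer:
        return "no_ban"
    if listed_dynamic:
        # Dynamic addresses get recycled to innocent users, so the ban expires.
        return "ban_7_days"
    # Not dynamic: only a candidate, pending the monitoring mentioned above.
    return "permaban_candidate"
```

Returning "candidate" rather than banning outright keeps a human in the loop before anything lands on the permaban list.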
This is the opposite of what a mailserver would do because mail coming into a mail server should nearly never come from dynamic IPs. In contrast, connections to the blog are not only frequently from dynamic IPs, they are nearly always from dynamic IPs.
Sadly, it looks like the Brazilians, while known spammers, are dynamic. That means I’ll just have to keep banning the dynamic IPs for 7 days and unbanning. Unbanning is because the spammers will just change IPs and I don’t want 100% of Brazil to be banned over time. (SteveF, a sometimes guest poster and frequent commenter, visits Brazil. When traveling he could end up banned if I ban all of Brazil!)
Jim–
I’m trying to figure out how to actually run queries based on the information you are pointing to. Reading Spamhaus, all their instructions seem to say things like “All three zones can be queried in one single DNS lookup at zen.spamhaus.org. ” And you quoted “query 4.3.2.1.dyn.nszones.com with any NS lookup tool ”
So….how exactly do I make the query? Where do I find or get an “NS lookup tool”?
UAH +0.108
Back to the drawing board 🙂
lucia, first, congratulations on making so much progress on your plugin. Those things can be a huge pain to make, so nice job. Second:
Most operating systems have nslookup built in as a command line tool. All you have to do is type “nslookup a.b.c.d” and hit Enter. However, I’m not sure this would be useful for you given it requires access to the command line.
Now then, DNSBL is generally used for mail servers, and those are specifically designed to have an option to set an address for DNSBL queries. The same isn’t true for blog servers. For a blog, I believe you’d need customized code which allows you to make DNS queries. I know there is PHP code available online for just that, so it shouldn’t be hard for you to incorporate into your plugin if you’d like.
You should try reading this answer in a FAQ of one of the sites jim linked to. And as always, if you have any questions, feel free to ask.
Brandon–
The main thing is that ultimately, if I’m going to do it, I would want the plugin to decide which things to add to the permaban list. So… I need to do things in php. I would run that in a cron job so that it doesn’t slow down visitor access to a blog. But, still… php!
Ahhh! that’s the query I want!! I can read that in php– I’m sure of it.
BTW: I’m going to do that later– but now I have the info in comments. 🙂
I think what I need to do is create another table to store the IPs that were banned/ whitelisted/banned/ whitelisted over and over. Then find those banned repeatedly, figure out if they are static– and if they are just ban them forever.
BTW: I’ve been banning and unbanning the Canadian Prison system IP every 7 days. It always reappears as soon as I unban it at Cloudflare. On the one hand that does show me I am successfully unbanning automatically. On the other hand….
Needless to say that IP is going on the permaban list. If the prisoners or guards in the Canadian prison system want to scrape more images they are going to have to find another way.
lucia:
That’s what I figured. I just wanted to mention nslookup being a built-in thing for completion’s sake. Well, that and because it may be useful knowledge. For example, you can test DNSBL’s by manually running nslookups, and that could be useful while troubleshooting code.
Just remember: to read a response, first you have to make the query. I checked the links in that answer, and they didn’t work. However, a quick Google search was enough to find PHP code to make DNS queries. It really isn’t a complicated task, so I think you should be able to get it to work without much trouble.
As for creating another table, couldn’t you just add a column to your current table, and have it increment whenever something is banned? It seems simpler.
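The counter-column suggestion might look roughly like this. It is a sketch only: it uses Python with an in-memory SQLite database as a stand-in for the plugin's MySQL table, and the table name, column names and threshold are all hypothetical.

```python
import sqlite3


def open_db():
    """Create an in-memory table with one row per IP and a running ban count."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE bans (ip TEXT PRIMARY KEY, ban_count INTEGER NOT NULL)"
    )
    return db


def record_ban(db, ip):
    """Increment the counter for an IP, inserting the row on its first ban."""
    cur = db.execute(
        "UPDATE bans SET ban_count = ban_count + 1 WHERE ip = ?", (ip,)
    )
    if cur.rowcount == 0:
        db.execute("INSERT INTO bans (ip, ban_count) VALUES (?, 1)", (ip,))


def repeat_offenders(db, min_bans=3):
    """Return IPs banned at least `min_bans` times, as permaban candidates."""
    rows = db.execute(
        "SELECT ip FROM bans WHERE ban_count >= ? ORDER BY ip", (min_bans,)
    )
    return [row[0] for row in rows]
```

An IP that keeps reappearing right after each unban would cross the threshold within a few weeks and could then be checked for being static.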
I know nothing about WordPress, but you might try the PHP function gethostbyname(), unless otherwise indicated, before going to external commands.
A quick google brings up the first comment on
http://theserverpages.com/php/manual/en/function.gethostbyname.php
where this is used in something very similar to your context.
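A lookup along these lines can be sketched as below. This is Python's `socket.gethostbyname`, not the PHP function of the same name, and the helper names are hypothetical. The return codes are the ones quoted earlier in the thread (127.0.0.10 and 127.0.0.11 for Spamhaus PBL, 127.0.0.3 for dyn.nszones.com); whether a given list still answers with them is an assumption to verify before relying on it.

```python
import socket

# Return codes quoted in the thread as meaning "dynamic address".
DYNAMIC_CODES = {
    "pbl.spamhaus.org": {"127.0.0.10", "127.0.0.11"},
    "dyn.nszones.com": {"127.0.0.3"},
}


def dnsbl_lookup(ip, zone, resolve=socket.gethostbyname):
    """Query a DNSBL zone for an IPv4 address.

    Returns the list's answer (a 127.0.0.x string) or None if the IP is not listed.
    `resolve` is injectable so the function can be exercised without network access.
    """
    name = ".".join(reversed(ip.split("."))) + "." + zone
    try:
        return resolve(name)
    except socket.gaierror:
        # The name does not resolve: the IP is not on this list.
        return None


def is_listed_dynamic(ip, zone, resolve=socket.gethostbyname):
    """True if the zone answers with one of its known 'dynamic' codes."""
    return dnsbl_lookup(ip, zone, resolve) in DYNAMIC_CODES.get(zone, set())
```

Run from a cron job, as planned, a slow or failing DNS answer never delays a visitor's page load.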
I’m not sure how well permabanning blocks of dynamic IP space based on the recommendations of RBLs intended to inform mailservers would work. You might try looking at how much overlap there is between your existing lists and the various RBLs.
Jim–
Thanks for the link. I’ll try that.
I’m sure permabanning blocks of dynamic IP space would be a very, very bad thing for a blog to do.
I would consider permabanning static IPs that are also spamming.
My work here is done. 🙂
All this software stuff makes my head spin.
I guess us hardware folks don’t contribute as much as we used to.
gallopingcamel–
Some people suggested hardware solutions to my problem. The difficulty is that the hardware solutions aren’t available to me because I have shared hosting.
As for the general hardware stuff- Dreamhost deals with that.
I use a python tool called denyhosts (google it) which does for ssh, telnet, and ftp what you are trying to do with http. One service they provide is a growing list of recent systems that have attempted to log onto the user community’s servers using brute force. It is a huge list and is applied to the hosts.deny file in /etc. Tools that have Wietse Venema’s tcpwrappers compiled in refer to that file to allow or disallow the connection. It is a trivial thing to do the same with .htaccess files in Apache web servers.
In addition there are a lot of dns blacklists that use the very lightweight DNS protocol to validate source IPs. I would guess this could be implemented in less than 30 lines of PHP code.
Regarding blocking servers vs people, I don’t care – I block them all and never unblock any of them. But – always whitelist your own server and all clients you expect to use or to allow (co-mods, etc).
dp– Thanks for the lists.
I’ve seen fail2ban, denyhosts etc.
I don’t want to have a wordpress plugin do anything to .htaccess for many reasons– most of which have to do with the types of users who are going to use plugins. It’s very dangerous to do much of anything to their .htaccess files. If anything gets screwed up they don’t necessarily have the skillz to even begin to unscrew it up.
I don’t want to block any “people” IPs permanently. Doing so would mean that actual, non spammy people would eventually be blocked. I don’t want that.
Denyhosts and fail2ban don’t block forever either– they block IPs for a short period of time.
But I would be happy to block known bad servers virtually forever.
I appreciate your reluctance to unleash the beast into society (auto-blocking) but as someone who is a professional in this business I’ve lost any patience with the growth of the unskilled web authors who create through their ignorance the playground that has created so much work for the rest of us.
If they adopt a good preventative method and can’t sort it out when it is abused then I am there for a fee to help 🙂
In fact I volunteer a lot of time to the problem, but the secondary problem is too many are making it too easy to be hacked to be ignored. Because they encourage hack attacks I am required to indulge hack attacks. Well aimed tough love is a blunt force instrument that is effective for the wary, useless to the foolish. We all need to play the game to win and it is no place for the weak.
Quite likely, when you’ve done this for 30 years you too will see the value in tough love. Because we’re not self-policing well enough we are soon to face the guiding hand of our respective government agencies who will act on our behalf. That will be the day the music died.
That said, follow your heart in all matters, of course.
dp–
I don’t entirely understand your comment. What sort of behavior and actions do you categorize as “tough love”?
BTW: I’m not reluctant to unleash this beast into society. I am eager to do it. But I would like to be sure my directions for installing can be followed and I would like someone to look over it to see if the way I ‘read/write’ input is safe.
One volunteer encountered a problem installing– my directions were… uhmm… less than perfect. Also, I forgot about a particular incompatibility– so I need to warn users about that.
Also, I failed to anticipate what I think is an issue with the BadBehavior plugin and Cloudflare. If BB is on while Cloudflare is resolving and if you have not yet set BB to whitelist Cloudflare and if the Cloudflare IP the admin shows to the blog is banned at Stop Forum spam, then BB will block the admin. Then the admin will not be able to properly set BB! Oddly, I think this is an issue with Bad Behavior and the blocking has nothing to do with my plugin.
I wasn’t blocked– but the tester in ‘country X’ was. Because of the way Cloudflare assigns IPs, I suspect it would have happened to many in ‘country X’. So, I need to organize installation instructions to minimize this possibility. (My instructions have the user insert some php to make sure the originating IPs are presented before the blog loads. If they have BB off when they include those, they can’t experience that particular problem.)
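The step of presenting the originating IP before anything else runs can be sketched like this. It is a Python illustration, not the PHP snippet from the installation instructions. `CF-Connecting-IP` is the header Cloudflare uses to pass along the visitor's address; `PROXY_RANGES` below is a documentation placeholder and would need to be replaced with Cloudflare's published ranges, and `visitor_ip` is a hypothetical name.

```python
from ipaddress import ip_address, ip_network

# Placeholder range standing in for the proxy's published address blocks.
PROXY_RANGES = [ip_network("203.0.113.0/24")]


def visitor_ip(remote_addr, headers):
    """Return the originating visitor IP for a request.

    The forwarded header is trusted only when the connection really comes
    from the proxy; otherwise anyone could spoof it to dodge a ban.
    """
    addr = ip_address(remote_addr)
    forwarded = headers.get("CF-Connecting-IP")
    if forwarded and any(addr in net for net in PROXY_RANGES):
        return forwarded
    return remote_addr
```

With this in place, blocking tools downstream see the visitor's own address rather than a shared proxy address that may already be on someone's blacklist.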
But auto-banning? It’s going to happen. The only way to keep these hackers from racing through and crippling a blog on shared hosting is to set a firewall. The only real firewall I can get with shared hosting is with something like Cloudflare.
Also: With Cloudflare as a firewall, it’s difficult for the unskilled to truly screw up. If they create settings with my plugin that end up autobanning the entire universe and then want to turn the banning off, they can always just ‘pause’ Cloudflare by clicking the ‘pause’ button (and then later, redirecting their DNS pointers.) At that point, they’d be back to no firewall– but that’s where they started out!