Blocking anonymizer IP ranges: Opinion?

Question for all: Recently, a commenter, ‘illya’, whom I consider borderline trollish has been leaving comments that made me think “Huh?” I noticed this commenter:

  • uses a “throwaway” email address (specifically nyms.net).
  • accesses through a service that markets itself as an anonymizer: specifically overplay.net, whose service promises:

    By rerouting your internet connection through our worldwide VPN servers you can:

  • Change your IP address to one in another country
  • Improve your security by ensuring your internet traffic is encrypted
  • Access content that is normally restricted to you.

From now on, if someone seems borderline trollish, I will be looking at their email and IP addresses. If the email looks like a ‘throwaway’, I will moderate that commenter with no additional comment. Also, if that commenter is using an anonymizer, I will block that IP range entirely. That means, this afternoon, after checking all comments for IPs to ensure that other commenters don’t use that service, I plan to block the range “176.67.84.112 – 176.67.84.119” from loading the blog entirely. That is: I’m not just blocking comments. I will not permit people using that service to view the blog; they will be required to use their normal, non-anonymous IP.

Before I begin blocking IPs from VPNs, I’d like to know whether blocking VPNs would generally cause people a problem.

Note: when blocking, I’ll only block VPNs as I notice ‘issues’ with a specific range. But I do want to know if any substantial number of visitors use VPNs to surf (for whatever reason). If it turns out that lots of people use VPNs for some reason, I can think of options other than blocking access to the blog, so that I avoid preventing people from visiting. Otherwise, at least blocking “176.67.84.112 – 176.67.84.119” seems like it would be effective for my purposes.
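
As a minimal sketch of what blocking that range amounts to, the Python below collapses the first and last address into CIDR notation and tests whether a visitor address falls inside it. Only the two endpoint addresses come from the post; the function name `is_blocked` and the constant names are hypothetical.

```python
import ipaddress

# Endpoints of the range quoted in the post; all names here are illustrative.
FIRST = ipaddress.ip_address("176.67.84.112")
LAST = ipaddress.ip_address("176.67.84.119")

# Collapse the first/last pair into the smallest list of CIDR networks.
BLOCKED_NETWORKS = list(ipaddress.summarize_address_range(FIRST, LAST))


def is_blocked(addr: str) -> bool:
    """Return True if addr falls inside any of the blocked networks."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in BLOCKED_NETWORKS)


if __name__ == "__main__":
    print(BLOCKED_NETWORKS)
    print(is_blocked("176.67.84.115"))
    print(is_blocked("176.67.84.120"))
```

The range happens to align to a single /29 network, which is the compact form a server-side deny rule would typically accept.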

34 thoughts on “Blocking anonymizer IP ranges: Opinion?”

  1. I suspect the “access content normally restricted to you” may be related to governments trying to control what their citizens can read/see on the internet. I could be wrong though.

  2. SteveF–
    That may be a reason the VPN business can make money.

    But I suspect “illya” is a sock-puppet of someone who I have already moderated and has decided it’s worth $10/month to download IP masking software, create a new sockpuppet name and create a throw-away email address.

    VPN has all sorts of legitimate uses. That’s why I want to alert people to the potential issue. I’m not planning to run around and find every VPN IP and block them all. But if someone is:
    1) Acting borderline to truly troll-like.
    2) Uses a throwaway email address and
    3) Is accessing through an anonymizing service, I’m blocking that IP.

    I did check and found Illya is the only person leaving comments from that IP address. So, I know I’m not blocking any regular commenter. I suspect I’m not blocking anyone else. It’s a very small IP range.

  3. I gather that you intercept other incoherent comments and we never see them. The ones we do see make me wonder whether the failure to communicate is a language issue or there really are people out there who comment yet have no focus whatever.

    illya’s comments did not seem to be the product of focus, nor did they seem confrontational or argumentative – maybe written in a chemically induced fog.

  4. I would add that there is a fellow who frequents the Bishop’s who seems very adept at detection of some comments as the product of a chemically influenced mind.

  5. In the dark here, but do “gravatars” or whatever the common term is, help in such matters for controlling or blocking people using VPN?

  6. Sorry Lucia,

    I had this problem when I ran a blog.

    One particular commenter had IP addresses all over the place…I ended up having to block all of Canada, then most of Europe to get rid of one ‘troll’.

    Blocking a range of 6 IP addresses probably won’t deter a troll.

  7. j ferguson–

    It’s true the ones that appear seem to just be in a fog. I did intercept. The other ones either
    a) are continuing to pester me to write whatever post it is he thinks I promised to write while refusing to provide a link or quote to help clarify what he is even asking or
    b) very, very fog like.

    Oddly though, this is precisely the way trolls whose main goal is to waste time and derail comments often behave.

    I could just ignore and hope everyone else ignores, but the whole “anonymous” bit bothers me. Maybe it shouldn’t but it does.

    Anyway, rational or not, these steps, which conceal details from me, the blog hostess, bug me. If a commenter who is “new” to the blog seems troll-like and is clearly taking steps to mask their identity, I’m going to be draconian. No one accidentally creates a throwaway email address. No one accidentally spends $10/month extra to mask their IP. If someone who has done both looks a little troll like…. well…. I’m not giving them the benefit of the doubt!

    I would add that there is a fellow who frequents the Bishop’s who seems very adept at detection of some comments as the product of a chemically influenced mind.

    Yep. Unfortunately, nothing prevents the chemically influenced from hitting “submit!”.

    Pascvaks–
    I don’t think gravatars help much. I guess a blogger could require everyone to get a gravatar and use it. To use it, people have to sign in to some service that lets gravatar recognize them. Potentially, a troll or sock-puppet can just create new logins and new gravatars. So, it doesn’t really help much but can introduce an unnecessary hurdle for the non-troll / non-sockpuppet.

  8. HarryW

    Blocking a range of 6 IP addresses probably won’t deter a troll.

    .

    Akismet catches lots of the free proxy servers. This guy seems to be paying $10/month above and beyond whatever he pays his ISP. So, I’m hoping this helps. We’ll see….

  9. SteveF (Comment #83433) October 9th, 2011 at 10:45 am: I suspect the “access content normally restricted to you” may be related to governments trying to control what their citizens can read/see on the internet.

    They might be used that way but the commercial use, I believe, is primarily about accessing commercial content that is restricted to particular regions – in particular TV.

    This story in The Australian explains some of it: http://www.theaustralian.com.au/media/harvey-norman-mulls-next-move-after-questions-on-sale-of-mctivia/story-e6frg996-1226134301994

  10. A bit of overkill, if you ask me. There are legitimate reasons to use an anonymizer, and certainly blocking read access seems both pointless and self-defeating. Blocking comments, however, seems fine for any troll.

  11. Nyg Only,

    I suspect you are right about that, especially for TV coverage of sporting events.

  12. It’s not only sporting events but also TV shows or even YouTube-videos that have certain background music. F****ing DRM. Some (new) American TV-shows get broadcast here in Germany only after 3 years, if ever, so you either download it or use a proxy.

  13. One interesting thing about illya’s proxy is the proxy IP is in Finland. Of course, since it’s a proxy, this bears no relation to where illya is.

    Today, I’m writing a few scripts to reduce the ‘bot load to the blog. I looked at my server logs and it seems it would be worth bothering about and might reduce some of the ‘slow’ periods I’ve been experiencing. (The ‘bots are likely unrelated to illya. It’s just that ‘bots find sites and start loading every page. One is racing through and loading 60 different pages an hour. That’s quite a memory drain.)

  14. j ferguson–

    Do you have any idea what the ‘bots do with the pages? Looking for something?

    Yep!

    ‘bots are just anything anyone has programmed to load a page without a human actually sitting there clicking. So, for example, when I load a page using R and download data, I’ve programmed my mac to pretty much act as a ‘bot which downloads data on a page like GISTemps temperature archive.

    Different ‘bots do different things– whatever the human who wrote the ‘bot wants to do.

    Search bots, including the google bot, index pages for their search engines. The web master can adjust the crawl rate by providing instructions in the “robots.txt” file, which well behaved ‘bots obey. I previously told them they can crawl at a rate of 1 page every 5 seconds. That’s a mistake on my part since WordPress has to create a page almost every time they hit a page. I need to look through and tell these well behaved bots to avoid certain addresses too.

    Some bots visit and try to leave comments. They’ll run through the site, filling forms and auto-clicking ‘submit’. The spam filter prevents their comments from appearing, but depending on how they are programmed, they still load a page– often old ones no one visits. So, that sucks up resources.

    Some bots try to harvest emails. No emails are displayed here; nevertheless, ‘bots visit, running the php script for WordPress and suck up resources.

    Badly behaved bots (of the search or spam variety) disobey robots.txt. Every now and then, I need to look at the server logs, discover these bots and block them through .htaccess. This does reduce use of memory and cpu. When they race through ‘categories’, ‘tags’ and other pages loading 1 a second, the blog does slow down, and can crash.

    Anyway, right now, I’m trying to deal with the well behaved bots. I’m looking at places I should tell them to stay away from, and tell them to stop coming back so often. After that, I can deal with the ‘bad’ bots!
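
    The crawl-rate and stay-away instructions described above can be sketched from the bot's side using Python's standard `urllib.robotparser`. The 5-second delay is the figure mentioned in the comment; the disallowed paths, the `ExampleBot` agent name and the helper `polite_bot_view` are assumptions for illustration. Note that `Crawl-delay` is a non-standard directive that not every crawler honours.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt. The 5-second delay is the figure from the comment;
# the disallowed paths are made-up examples, not the blog's real settings.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 5
Disallow: /musings/tag/
Disallow: /musings/category/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def polite_bot_view(url_path: str, agent: str = "ExampleBot"):
    """Return (allowed, crawl_delay) as a rule-following bot would read them."""
    return parser.can_fetch(agent, url_path), parser.crawl_delay(agent)


if __name__ == "__main__":
    print(polite_bot_view("/musings/tag/monckton"))
    print(polite_bot_view("/musings/2011/some-post"))
```

    Nothing enforces these rules; a badly behaved bot simply never reads the file, which is why the server-side blocking is still needed.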

  15. j ferguson–

    but had no concept of the load on your blog.

    Badly behaved bots can sometimes be a real problem at blogs running WordPress. First, you have to recognize my resource problem is memory limitation. It’s not cpu or bandwidth. Also, the people who coded WordPress created a memory hog. The difficulty is that it’s an easy to use memory hog, and lots of people have written plugins to extend the features. So, free, easy, lots of features etc. Who am I to complain?

    The difficulty is that loading a blog page runs a script– small program. The program assembles the page, fishing content out of the database, and slapping it in the template. Many actions– like adding a comment– also run a script. WordPress as written happens to use a lot of memory to do things. (Habari uses less. But it has limited features.)

    At its simplest, the program runs every single time a person reloads a page.

    I can do things like caching pages– that is, getting WP to create a temporary static page and delivering it to a visitor. This reduces the load– provided that the cached page is actually being viewed by several people before the page changes. (The page changes when someone comments!)

    As you can imagine, that’s useful for current blog posts.
    If Anthony Watts posts a link, 10 people might request a page within a few seconds. If they all hit at nearly the same time and I was not caching, the computer memory load would spike at that instant because the server is creating the exact same page for each one– devoting memory to each process. But if I’ve cached the page, the first visitor creates a page, and the second, third and fourth see the ‘cached’ page. This helps a lot.

    But now, suppose instead that a badly behaved bot comes through. It reads all the links on a page and starts crawling. Some ‘bots will start loading a page a second (or more!!!) Caching doesn’t help because they will do things like race through the categories and archives visiting posts from 2007, 2008, visiting the same post under every conceivable entry point (by tag, category, page, date ) etc. In fact, caching makes the problem worse because now, when the bot comes, WP not only serves the page, but might create a cache– using more memory– but no one in the world wants to see that page!! (The cache plugin is supposed to recognize bots and not create cache based on their requests. But I don’t know if this really works.)

    Just slowing the well behaved ‘bots may alleviate some of the slow server stuff we’ve been experiencing since the time when people were coming over from Anthony’s blog to argue about Monckton. The combined load of the ‘bots (some welcome and some not) and Anthony’s visitors (all welcome) was making the memory load spike.
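
    The caching behaviour described in this comment can be illustrated with a toy in-memory cache. This is only a sketch of the idea, not how WordPress or any cache plugin is implemented; the class `PageCache` and its methods are hypothetical names.

```python
from typing import Callable, Dict


class PageCache:
    """Keep rendered pages in memory until something changes them."""

    def __init__(self, render: Callable[[str], str]) -> None:
        self._render = render            # the expensive "assemble the page" step
        self._pages: Dict[str, str] = {}
        self.render_count = 0            # how many times the expensive step ran

    def get(self, path: str) -> str:
        """Serve the cached copy if there is one; otherwise render and store it."""
        if path not in self._pages:
            self.render_count += 1
            self._pages[path] = self._render(path)
        return self._pages[path]

    def invalidate(self, path: str) -> None:
        """Call when a comment is posted so the next visitor gets a fresh page."""
        self._pages.pop(path, None)

    def __len__(self) -> int:
        return len(self._pages)
```

    Ten visitors asking for the same post cost one render, while a bot walking fifty different archive pages costs fifty renders and leaves fifty cached copies that nobody comes back for, which is the extra memory use described above.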

  16. Lucia,
    This is very interesting. I imagine that you can detect the IP address where the naughty activities are originating and stop it before your part of the WordPress system is completely absorbed by these fishing expeditions – a sort of auto lock-out.

    I’m familiar with scripts – used to write them for our Sun systems in the late ’80s, but mostly to do housekeeping things where speed or resource consumption wasn’t important – and especially where no one who actually knew what he/she was doing would ever see what I’d done.

    btw, the cat with the chipmunk – what became of the chipmunk?

  17. j ferguson–
    To detect, one must first look. I try to avoid devoting much time to checking my server logs etc.

    I don’t remember what happened to that particular chipmunk. That cat liked to play catch and release. He was on his way to the cat door; I blocked the door and took the picture. Otherwise, the general might have dropped a live chipmunk in the house.

  18. jferguson–
    I thought you might like to see how many pages a bot might visit, and how quickly. A bot at IP 109.169.48.182 started visiting this morning. Around 5:11:31 am, it started loading two pages a second for a while (each page was requested twice). It hit the login page at 5:11:34. Then, it stopped at 5:13.
    No human in the world would click through that fast.

    If you visit project honeypot ( http://www.projecthoneypot.org/ip_109.123.107.151 ) you’ll see it reads:

    The Project Honey Pot system has detected behavior from the IP address consistent with that of a mail server and comment spammer. Below we’ve reported some other data associated with this IP. This interrelated data helps map spammers’ networks and aids in law enforcement efforts. If you know something about this IP, please leave a comment.

    This bot is running off a server at http://www.rapidswitch.com/

    Here are all the visits:

    109.169.48.182 - [10/Oct/2011:05:11:31 -0700] GET / HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:11:31 -0700] GET /musings HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:11:31 -0700] GET /musings/ HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:11:32 -0700] GET /frowzy.php HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:11:33 -0700] GET /\
    109.169.48.182 - [10/Oct/2011:05:11:33 -0700] GET /musings HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:11:33 -0700] GET /musings/ HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:11:34 -0700] GET /musings/wp-login.php HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:11:34 -0700] GET /musings/xmlrpc.php HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:11:35 -0700] GET /musings/2007 HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:11:36 -0700] GET /musings/2007/ HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:11:37 -0700] GET /musings/2008 HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:11:37 -0700] GET /musings/2008/ HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:11:38 -0700] GET /musings/2009 HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:11:39 -0700] GET /musings/2009/ HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:11:40 -0700] GET /musings/2010 HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:11:41 -0700] GET /musings/2010/ HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:11:42 -0700] GET /musings/2011 HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:11:42 -0700] GET /musings/2011/ HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:11:43 -0700] GET /musings/climate-links HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:11:44 -0700] GET /musings/contact-lucia HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:11:44 -0700] GET /musings/contact-lucia/ HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:11:46 -0700] GET /musings/donations-for-peer-review-papers HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:11:47 -0700] GET /musings/donations-for-peer-review-papers/ HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:11:48 -0700] GET /musings/feed HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:11:48 -0700] GET /musings/feed/ HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:11:50 -0700] GET /musings/privacy-policy-statement HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:11:50 -0700] GET /musings/privacy-policy-statement/ HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:11:51 -0700] GET /musings/site-wide-disclosure-this-blog-runs-ads HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:11:52 -0700] GET /musings/site-wide-disclosure-this-blog-runs-ads/ HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:11:54 -0700] GET /musings/wp-includes/wlwmanifest.xml HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:11:54 -0700] GET /musings/\
    109.169.48.182 - [10/Oct/2011:05:11:55 -0700] GET /musings/2009/the-tribe-speaks-to-the-press HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:11:55 -0700] GET /musings/2009/the-tribe-speaks-to-the-press/ HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:11:57 -0700] GET /musings/2011/blocking-anonymizer-ip-ranges-opinion HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:11:57 -0700] GET /musings/2011/blocking-anonymizer-ip-ranges-opinion/ HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:11:58 -0700] GET /musings/2011/chef-hank-writes-about-italian-beef-cuts HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:11:58 -0700] GET /musings/2011/chef-hank-writes-about-italian-beef-cuts/ HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:12:00 -0700] GET /musings/2011/michael-tobis-new-blog HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:12:02 -0700] GET /musings/2011/michael-tobis-new-blog/ HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:12:03 -0700] GET /musings/2011/monckton-in-your-own-words-explain-this HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:12:04 -0700] GET /musings/2011/monckton-in-your-own-words-explain-this/ HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:12:05 -0700] GET /musings/2011/monckton-neither-0-15-wk-m2-nor-0-18-wk-m2-are-the-kt-implicit-planck-parameter HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:12:06 -0700] GET /musings/2011/monckton-neither-0-15-wk-m2-nor-0-18-wk-m2-are-the-kt-implicit-planck-parameter/ HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:12:08 -0700] GET /musings/2011/monckton-planck-parameter-no-better-than-pulling-numbers-out-of-a-hat HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:12:08 -0700] GET /musings/2011/monckton-planck-parameter-no-better-than-pulling-numbers-out-of-a-hat/ HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:12:09 -0700] GET /musings/2011/plank-mugs-question-only-readers-can-answer HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:12:10 -0700] GET /musings/2011/plank-mugs-question-only-readers-can-answer/ HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:12:12 -0700] GET /musings/2011/quatloos-for-the-sept-minimum-nh-ice-extent-bet HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:12:12 -0700] GET /musings/2011/quatloos-for-the-sept-minimum-nh-ice-extent-bet/ HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:12:13 -0700] GET /musings/2011/sea-ice-bets-bet-on-aug-sept-7-day-minimum HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:12:14 -0700] GET /musings/2011/sea-ice-bets-bet-on-aug-sept-7-day-minimum/ HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:12:15 -0700] GET /musings/2011/uah-quatloo-update-september-down-from-august HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:12:16 -0700] GET /musings/2011/uah-quatloo-update-september-down-from-august/ HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:12:19 -0700] GET /musings/2011/walking-the-planck-parts-1-2 HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:12:19 -0700] GET /musings/2011/walking-the-planck-parts-1-2/ HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:12:21 -0700] GET /musings/category/betting HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:12:22 -0700] GET /musings/category/betting/ HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:12:23 -0700] GET /musings/category/crafts HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:12:23 -0700] GET /musings/category/crafts/ HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:12:25 -0700] GET /musings/category/environmental HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:12:25 -0700] GET /musings/category/environmental/ HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:12:26 -0700] GET /musings/category/gadgets HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:12:27 -0700] GET /musings/category/gadgets/ HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:12:28 -0700] GET /musings/category/global-climate-change HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:12:30 -0700] GET /musings/category/global-climate-change/ HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:12:31 -0700] GET /musings/category/haiku HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:12:31 -0700] GET /musings/category/haiku/ HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:12:32 -0700] GET /musings/category/politics HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:12:33 -0700] GET /musings/category/politics/ HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:12:34 -0700] GET /musings/category/random HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:12:35 -0700] GET /musings/category/random/ HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:12:36 -0700] GET /musings/category/statistics HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:12:36 -0700] GET /musings/category/statistics/ HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:12:37 -0700] GET /musings/category/toy-physics HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:12:38 -0700] GET /musings/category/toy-physics/ HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:12:39 -0700] GET /musings/category/uncategorized HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:12:39 -0700] GET /musings/category/uncategorized/ HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:12:40 -0700] GET /musings/comments/feed HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:12:41 -0700] GET /musings/comments/feed/ HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:12:42 -0700] GET /musings/feed/atom HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:12:43 -0700] GET /musings/feed/atom/ HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:12:44 -0700] GET /musings/feed/rss HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:12:45 -0700] GET /musings/feed/rss/ HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:12:45 -0700] GET /musings/page/2 HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:12:46 -0700] GET /musings/page/2/ HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:12:47 -0700] GET /musings/tag/climate-sesitivity HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:12:48 -0700] GET /musings/tag/climate-sesitivity/ HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:12:49 -0700] GET /musings/tag/latex HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:12:49 -0700] GET /musings/tag/latex/ HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:12:50 -0700] GET /musings/tag/michael-tobis HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:12:51 -0700] GET /musings/tag/michael-tobis/ HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:12:52 -0700] GET /musings/tag/monckton HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:12:52 -0700] GET /musings/tag/monckton/ HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:12:53 -0700] GET /musings/tag/mugs HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:12:54 -0700] GET /musings/tag/mugs/ HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:12:55 -0700] GET /musings/tag/planck HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:12:55 -0700] GET /musings/tag/planck/ HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:12:56 -0700] GET /musings/tag/planet-3-0 HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:12:57 -0700] GET /musings/tag/planet-3-0/ HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:12:58 -0700] GET /musings/tag/quatloos HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:12:59 -0700] GET /musings/tag/quatloos/ HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:13:00 -0700] GET /musings/tag/uah HTTP/1.1
    109.169.48.182 - [10/Oct/2011:05:13:00 -0700] GET /musings/tag/uah/ HTTP/1.1
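
    A listing like this can be summarised automatically. The sketch below assumes log lines in exactly the shortened shape shown above (IP, timestamp in brackets, method, path); real server logs usually carry more fields. The function names and the 0.5 requests-per-second threshold are arbitrary assumptions, and the month abbreviation is parsed assuming an English locale.

```python
import re
from datetime import datetime

# Matches the shortened log format shown in the listing above.
LOG_LINE = re.compile(r"^(?P<ip>\S+) - \[(?P<ts>[^\]]+)\] (?P<method>\S+) (?P<path>\S+)")


def parse_line(line: str):
    """Return (ip, timestamp, path), or None if the line does not match."""
    m = LOG_LINE.match(line.strip())
    if m is None:
        return None
    when = datetime.strptime(m.group("ts"), "%d/%b/%Y:%H:%M:%S %z")
    return m.group("ip"), when, m.group("path")


def requests_per_ip(lines):
    """Return {ip: (request_count, seconds_between_first_and_last_request)}."""
    seen = {}
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        ip, when, _path = parsed
        first, last, count = seen.get(ip, (when, when, 0))
        seen[ip] = (min(first, when), max(last, when), count + 1)
    return {
        ip: (count, (last - first).total_seconds())
        for ip, (first, last, count) in seen.items()
    }


def looks_automated(count: int, seconds: float, max_per_second: float = 0.5) -> bool:
    """Flag request rates above an (arbitrary, assumed) per-second threshold."""
    return count > 1 and count / max(seconds, 1.0) > max_per_second
```

    The listing above holds 104 requests between 05:11:31 and 05:13:00, a span of 89 seconds, so it would be flagged under this threshold.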

  19. Interesting,
    I wonder if they look for whatever they are looking for in real time? I would think they do and likely they aren’t looking for more than a few things.

    I wonder if any of the letter agencies taste the flavor of discourse across blogland in the hope of discovering a step change in our angst and in turn anticipating something they might consider to be troublesome.

    I was involved in a project like this in the late ’80s but it foundered on having to OCR daily newspapers worldwide. I see someone has put a book together on the supposition that some kinds of printed (or web based) material can be word-analyzed and the focus of social discontent appraised.

  20. j ferguson,
    Project honey pot is a crowdsourced system. People who have web sites put a “honey pot” on each page. These take the form of a link, invisible to humans, to a particular script. Bots see the link and visit the script, which sends a message to project honey pot. I have one of those links on this blog, and I could see that 109.169.48.182 hit it. When I see that, I can check at project honeypot and see that the IP has been hitting other people too.

    So… I blocked it. Based on today’s server logs, I blocked:
    deny from 92.53.33.215
    deny from 77.248.24.162
    deny from 109.169.48.182
    deny from 87.210.195.216
    deny from 92.53.33.215

    Based on the fact that 80% of the things that hit the script are listed as spammers, I’m going to modify the honeypot script to keep a list of things that hit it and alert me when they do. Then, I can check and add them to .htaccess. (I need to add dates too. The spammers eventually change IPs.)
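
    One way the dated list could work is sketched below: record the most recent date each IP hit the honeypot, then emit `deny from` lines only for recent entries so that stale IPs age out. The function names and the 30-day expiry are assumptions, not the actual script. The date goes on its own comment line because Apache does not accept a comment on the same line as a directive.

```python
from datetime import date, timedelta


def record_hit(hits: dict, ip: str, seen_on: date) -> None:
    """Remember the most recent date an IP hit the honeypot (de-duplicates IPs)."""
    previous = hits.get(ip)
    if previous is None or seen_on > previous:
        hits[ip] = seen_on


def deny_lines(hits: dict, today: date, max_age_days: int = 30) -> list:
    """Build dated 'deny from' lines, dropping entries older than max_age_days.

    The 30-day default is an arbitrary assumption.
    """
    cutoff = today - timedelta(days=max_age_days)
    lines = []
    for ip in sorted(hits):
        if hits[ip] >= cutoff:
            lines.append(f"# seen {hits[ip].isoformat()}")
            lines.append(f"deny from {ip}")
    return lines
```

    Keeping the hits in a dictionary keyed by IP also means an address that shows up twice in a day's log produces only one rule.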

  21. live and not devious (maybe). Bear Trap: the honeypot would tie them in knots? likely not worthwhile sport.

  22. Lucia:
    Wow! That is a lot of work to go through just to keep your site up and running efficiently. When people talk about having the ‘whole world’ in their living room, I had some idea (porn and child exploitation), but no idea the extent and magnitude of it (bots fishing).

    BTW, I do use a VPN connection, but only to access the work server.

    Roy Weiler

  23. Roy–
    I also use a VPN server to access work. Oddly enough, if someone did a whois, they couldn’t tell it was a VPN. The issue with illya is that he is using a VPN provided by a service that specifically markets its use as an “anonymizer”. It’s the anonymizer aspect of illya’s choice of VPN and email that makes me less indulgent of the comments that make me think “huh? hmmm… “

  24. Lucia:

    “It’s the anonymizer aspect of illya’s choice of VPN and email that makes me less indulgent of the comments that make me think “huh? hmmm… “”

    I agree completely. What do you have to hide?

    Roy Weiler

  25. Lucia, the funniest scam to run would be one where you run an entirely separate mirror site for those IPs. They can post there, they see all posts, but all of us go to this site where their posts never show up. They talk, they see their posts, but we can’t see them to respond. It drives trolls crazy and they never figure it out unless they log on from a different IP.

  26. steve– I wrote a plugin that detected TCO and did that. The tweak was that his comments became visible to all after some number of hours; that reduced the number of people responding. I ran the plugin for a little while.

    The difficulty is because it serves different pages based on IP, it screws up caching and ends up drawing too much cpu and memory.

    The other problem is essentially maintaining 2 sites and special coding is too much effort for dealing with trolls. Just banning the “anonymous” IPs illya set up is much less work. Moderating people whose main point in commenting is to suggest subject changes (to why the arctic temperature hovers around 0C) and complain I’m dishonest because I don’t discuss ‘their’ pet notion is less work too! (I mean shoosh/jay with the latter.)

  27. Seems like blocking an I.P. range would be easy to defeat. All one needs to do is access a public proxy list (like proxylist.net) and reconfigure their browser. Only takes a few seconds and a couple of mouse clicks.

  28. Duke C.
    It’s not quite as easy as you think. The thing is
    a) the person who wants to comment has to actually do it.
    b) if they are a spammer, they have to do it and be able to do it in an automated way. (Some spammers can; some can’t.)
    c) the IPs on public proxy lists get detected by spam monitors fairly quickly. Lots of spam filters read the IP when you comment and compare it to a spam list. Then they block that comment.

    Because of (c), many bots whose goal is to post comments find it’s not worthwhile because the blog spam filter repels them.
