Reduce image scraping to prevent blog crashing and thwarth copyright trolls.

As some readers recall, I’ve been beavering away at reducing the ridiculous server load caused by various bots making constant heavy requests on the blog. These included excessive requests for blog posts and for images. I am particularly sensitive to the image issue because I received a “Getty Demand Letter”. While I believe the Ninth Circuit’s ruling in “Perfect 10 v. Amazon” applies and hotlinking is not a copyright violation under US copyright law, I also think many people would be well advised to interfere with image scraping at their blogs. Interfering with image scraping will neither protect you from a copyright suit if you are violating copyright nor eliminate the possibility that a copyright troll will incorrectly come to believe you have stepped on their copyright. But it has the potential to increase the cost of operation of entities like Picscout (owned by Getty Images) and TinEye (owned by Idée), and so reduce the likelihood that someone like Getty Images, Masterfile, Corbis, Hawaiian Art Network (i.e. HAN), Imageline or even the now-defunct Righthaven will show up demanding money in exchange for their promise not to sue. (For more on these pests see ExtortionLetterInfo.)

More importantly: preventing image scraping will reduce your hosting costs and the frequency with which your blog crashes as a result of some entity requesting zillions of images in a very short period of time.

For those uninterested in this topic: comments will be treated as an open thread. But for those who might want to know how to reduce scraping now, I’ll post some .htaccess code you can use.

————-

The heart of my scheme to prevent image scraping is a series of blocks in .htaccess which divert certain image requests to a .php script. Those reading will see that all blocks terminate with:

RewriteRule .*\.(jpe?g|png)$ http://mydomain.com/imageXXX.php?uri=%{REQUEST_URI} [L]

This rule takes all requests ending with .jpg, .jpeg or .png and sends them to a script located at http://mydomain.com/imageXXX.php. It also tacks on the URI of the image requested, so a request for, say, /wp-content/uploads/2011/05/somechart.jpg becomes a request for http://mydomain.com/imageXXX.php?uri=/wp-content/uploads/2011/05/somechart.jpg. (Note: I do not filter access to .gifs.)

Those who have not yet written an http://mydomain.com/imageXXX.php can simply forbid access to these images by changing this rule to

RewriteRule .*\.(jpe?g|png)$ - [F]

The ‘- [F]’ forbids access rather than sending the requests through a filter.

I use the more complicated command because I want to log, filter and sometimes permit access. But if you have not yet written a script to log or filter and are noticing massive image scraping, ‘- [F]’ is a wise course. (FWIW: Initially, I did use ‘- [F]’. Though I did not take data, it seemed to me that many bots just kept requesting images after being forbidden. In contrast, as soon as I began diverting to a .php file, many bots vanished the moment they were diverted. The ‘YellowImage.jpg’ experiment and a few others were rather enlightening in this regard.)

What does imageXXX.php do?
As I mentioned: initially you can just forbid access to certain requests. But diverting to a file ultimately works better. Since I am diverting, not forbidding, the .php file does this (a rough sketch follows the list):

  1. Because I am using Cloudflare, it pulls out the originating IP and country code.
  2. Logs the request to a 15-minute image log file and to a daily image log file.
    • A cron job using a different script (called ‘checkfornasties’) checks the 15-minute log files and, among other things, counts the number of hits from each non-whitelisted IP and the number of user agents that IP used during the 15-minute span. If either is excessive, that IP is banned at Cloudflare. For those at universities or in IT worrying that they will log on with their PC or Mac and then turn around and use the browser on their workstation: my theory is Julio did just that yesterday when I was implementing the excess-hits script. I decided using two user agents in 15 minutes is not excessive. 🙂
      This cron job also checks for known nasty or image-stripping referrers and user agents (i.e. UAs), and bans requests using those.
    • I manually scan through the image log file from time to time to determine whether I should tweak the .htaccess file.
    • (I also manually scan through raw server logs, but this has nothing to do with my php file.)
  3. Runs the request through a series of checks to decide whether it should serve the image, as I sometimes wish to do. (Coding the image script to sometimes serve files is absolutely necessary if you use Cloudflare. Those who helped by answering questions about the “Yellow” and “Lavender” images: thanks! You contributed mightily to this. Especially the one who used the really unusual user agent!)
    • Images with referrers on a whitelist are given an ‘ok’ and will be served the image if they survive the final step. Lots and lots and lots of you have been served images that pass through the script with no untoward effects. (Owing to screwups, some of you did briefly experience untoward effects when you tried to look at “YellowImage.jpg”. By the way: I will not be posting my whitelist. 🙂 )
    • Recent images are given an ‘ok’ and will be served if the request survives later steps.
    • All requests, whether given an ‘ok’ or ‘not ok’, are sent through ZBblock, which blocks a lot of nasty things locally and shows people either ‘the scary message’ or a ‘503’. The ‘503’ is a lie– the server is not down. It just saves processing time relative to delivering ‘the scary message’, and it puts the request in a local blacklist and blocks that IP from further connections. (ZBblock blocked Anteros this morning; he emailed me and I fixed the issue, which was likely blocking all sorts of people on a large ISP in a particular part of the world. I cleared all IPs out of the local blacklist.)
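
To make the list above concrete, here is a minimal sketch of what a diversion script of this general shape could look like. This is not the actual imageXXX.php: the log paths, the whitelist test and the “recent image” test are invented placeholders. The CF-Connecting-IP and CF-IPCountry headers, however, are the real ones Cloudflare adds to requests it proxies.

<?php
// Hypothetical sketch of a diversion script; not the actual imageXXX.php.
// Behind Cloudflare, the originating IP and country arrive in these headers.
$ip      = isset($_SERVER['HTTP_CF_CONNECTING_IP']) ? $_SERVER['HTTP_CF_CONNECTING_IP'] : $_SERVER['REMOTE_ADDR'];
$country = isset($_SERVER['HTTP_CF_IPCOUNTRY']) ? $_SERVER['HTTP_CF_IPCOUNTRY'] : '--';
$uri     = isset($_GET['uri']) ? $_GET['uri'] : '';
$referer = isset($_SERVER['HTTP_REFERER']) ? $_SERVER['HTTP_REFERER'] : '';
$ua      = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';

// Steps 1 and 2: log to a rolling 15-minute file and a daily file (invented paths).
$line = date('c') . "\t$ip\t$country\t$uri\t$referer\t$ua\n";
file_put_contents('/path/to/logs/images-15min.log', $line, FILE_APPEND | LOCK_EX);
file_put_contents('/path/to/logs/images-' . date('Y-m-d') . '.log', $line, FILE_APPEND | LOCK_EX);

// Step 3: decide whether to serve. Whitelisted referrers and recent images get an 'ok'.
$ok = preg_match('#^https?://(www\.)?mydomain\.com/#i', $referer)  // stand-in whitelist test
   || preg_match('#/(2011/12|2012/0[124])/#', $uri);               // stand-in "recent image" test

// (The real script also runs every request through ZBblock at this point.)
$file = $_SERVER['DOCUMENT_ROOT'] . $uri;
if ($ok && strpos($uri, '..') === false                            // refuse path traversal
       && preg_match('#\.(jpe?g|png)$#i', $uri) && is_file($file)) {
    header('Content-Type: ' . (preg_match('#\.png$#i', $uri) ? 'image/png' : 'image/jpeg'));
    readfile($file);
    exit;
}

header('HTTP/1.1 403 Forbidden');
exit;

A denied request could just as easily be shown ‘the scary message’ instead of a bare 403; the point is that every request gets logged before any decision is made.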

Note: If a blog is behind Cloudflare, you must create a rule to prevent Cloudflare from caching any responses from ./imageXXX.php. (Some of you recall the experiment with the yellow and lavender images. I ran it when I couldn’t figure out what the heck was going on. It turns out Cloudflare caches things, and if a user at time 0 was sent to “imageXXX.php”, all other users in the geographic vicinity of that user were sent the same image!)
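
For what it’s worth, a script can also mark its own output as uncacheable with ordinary HTTP headers. I can’t promise this alone would stop Cloudflare from caching the responses, so treat it as a supplement to the Cloudflare rule, not a substitute:

<?php
// Sent before any other output: ask caches not to store this response.
// A supplement to, not a replacement for, the Cloudflare no-cache rule.
header('Cache-Control: private, no-cache, no-store, must-revalidate');
header('Pragma: no-cache'); // for legacy HTTP/1.0 caches
header('Expires: 0');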

So, basically: imageXXX.php logs all requests. It sometimes serves the image. It sometimes bans you locally. And if you request a ridiculous number of images in a very short amount of time and start changing your user agent to discover whether the reason you can’t see images is your user agent, you will be banned at Cloudflare.
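
To give a flavor of the 15-minute check ‘checkfornasties’ performs, here is a stripped-down sketch of the counting logic. The log format matches the sketch above; the thresholds, paths and whitelist are invented, and the actual Cloudflare ban is only stubbed out:

<?php
// Hypothetical cron-job sketch: scan the 15-minute log, count hits and
// distinct user agents per IP, and flag anything excessive.
$whitelist = array('127.0.0.1');  // placeholder for known-good IPs
$maxHits   = 200;                 // invented threshold
$maxUAs    = 2;                   // "two user agents in 15 minutes is not excessive"

$hits = array();
$uas  = array();
foreach (file('/path/to/logs/images-15min.log') as $entry) {
    // Fields as written by the logging sketch: time, ip, country, uri, referrer, ua.
    $f  = explode("\t", rtrim($entry, "\n"));
    $ip = $f[1];
    $ua = $f[5];
    if (in_array($ip, $whitelist)) {
        continue;
    }
    $hits[$ip] = isset($hits[$ip]) ? $hits[$ip] + 1 : 1;
    $uas[$ip][$ua] = true;
}

foreach ($hits as $ip => $count) {
    if ($count > $maxHits || count($uas[$ip]) > $maxUAs) {
        // The real cron job bans the IP at Cloudflare here.
        error_log("would ban $ip: $count hits, " . count($uas[$ip]) . " user agents");
    }
}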

Is imageXXX.php available to others?
Not yet. It will be eventually. Now that I know the method works, I need to organize this so people with IT skills even lower than mine can easily use it without my needing to provide a tutorial for each and every person who wants to give it a whirl. (People with great IT skills probably don’t even need my program!)

The next question people are likely to have is: which requests get sent to this file? There are three basic ways to get sent to the file. Because I suck at .htaccess, I wrote three separate blocks of code. (Brandon, Kan, or anyone who can tell me how to make these shorter, please do. I know it can be done, but in the first phase I was hunting for ‘effective’, not ‘cpu-efficient’.)

The three blocks of code are described below:
Block I:

# bad image referrers
RewriteCond %{HTTP_REFERER} ^http://(.+\.)?mydomain\.com/$ [nc,or]
RewriteCond %{HTTP_REFERER} ^http://(.+\.)?mydomain\.com$ [nc,or]
RewriteCond %{HTTP_REFERER} ^http://(.+\.)?mydomain\.com/musings$ [nc,or]
RewriteCond %{HTTP_REFERER} ^http://(.+\.)?mydomain\.com/musings/$ [nc,or]
RewriteCond %{HTTP_REFERER} index.php$ [nc,or]
RewriteCond %{HTTP_REFERER} ^feed [or]
RewriteCond %{HTTP_REFERER} ^$
RewriteCond %{REQUEST_URI} /wp-content/uploads/
RewriteCond %{REQUEST_URI} !(2011/12|/2012/01|/2012/02|2012/04)
RewriteRule .*\.(jpe?g|png)$ http://mydomain.com/imageXXX.php?uri=%{REQUEST_URI} [L]

Motivation: Way back in December, when I first began working on getting the hammering of the site to stop, I noticed that image scrapers were persistently trying to load every single image hosted at this site back to 2007. Requests came in at a rate faster than 1/second, from a range of IPs (including some really weird ones, like a prison system in Canada). Many, many, many of these requests came with referrers that were clearly missing, probably fake, or, even worse, certainly fake. For example: some came from referrers at the top of my domain (i.e. http://mydomain.com/ or http://www.mydomain.com/) with no blog post listed. This was very odd because if you hack back to the top of my domain (http://mydomain.com/) you will see the index page has no images; any image request presenting that referrer is faking the referrer. The conditions up to and including the one containing “index.php$” all relate to fake referrers. The condition “RewriteCond %{HTTP_REFERER} ^$” points to a request with no referrer (which can be legitimate). In contrast, a request for an image from a referrer containing “feed” is probably fake. Any request for an older image in my ‘/wp-content/uploads/’ directory with those referrers needs to be logged and scrutinized.

As discussed above, the script is coded to sometimes send images. In the case above, the algorithm never sends the image. That means that if you request an older image from any site that includes the word ‘feed’ in the url, you will not be served the image. (I may tweak this if I see legitimate requests in the logs.)

Initially, I thought Block I would be enough to handle my problem. In fact, initially I thought just blocking requests containing ‘feed’ in the referrer would be enough. But it seems that as I began to block, new “methods” were being developed in parallel.

I began to notice ridiculous attempts to access zillions of images from a variety of IPs in weird places (like a prison system in Canada, I kid you not). I also noticed these often had unusual user agents like “traumaCadX”, which is used to process X-rays. A request using this user agent came from an IP that seemed to correspond to a group specializing in providing hotspots to airports, and it was scraping images. Weird.

These requests were sending referrers that many would not wish to block. Some contained “google” or other search-agent strings. To catch everything using “weird” user agents, I added another block:

Block II
# catch known image user agents and google through imageXXX.
# note: this does catch the google image bot.

RewriteCond %{HTTP_USER_AGENT} (image|pics|pict|copy|NSPlayer|vlc/|picgrabber|psbot|spider|playstation|traumaCadX|brandwatch.net|search|CoverScout|RGAnalytics|Digimarc|CoverScout|psbot|java|getty|cydral|tineye|clipish|Chilkat|web|Webinator|panscient|CCBot|Phantom|sniffer|Acoon|Copyright|ahrefs|picgrabber) [nc]
RewriteCond %{REQUEST_URI} /wp-content/uploads/
RewriteRule .*\.(jpe?g|png)$ http://mydomain.com/imageXXX.php?uri=%{REQUEST_URI} [L]

Note that I use [nc] in the list of user agents to block.

This is because various companies’ capitalization conventions seem to change over time. Letter sequences like “image”, “pics” and “pict” appear in various image scrapers like ‘picscout’, ‘picsearch’ and ‘pictobot’. “Search” also often appears in various bots– but luckily not the Google bot, Bing bot or any bot I want to permit to visit. Some of these strings appear in mystery bots whose documentation I could not find.

Also, I am currently not too concerned about efficiency. I have not double-checked the list to eliminate “Webinator” on the grounds that it’s already covered by “web”. The reason is that I continue to manually check the logs to determine whether a short version might be over-inclusive. If it is, I don’t want to forget to retain “Webinator”.

Requests from these user agents go through imageXXX.php. Sometimes they are served images– which matters because the appearance of ‘image’ in the list means Googlebot-Image requests do go through the script. I can’t remove “image” from the .htaccess Block II because scrapers can fake user agents; some do try to pass as “Googlebot-Image” when scraping. (Fortunately, ZBblock will catch some of those. My script that checks for ridiculous numbers of requests in a 15-minute window also catches some. Also: the script does not necessarily send Googlebot-Image the images.)
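
As an aside: the standard way to tell the real Googlebot-Image from a scraper faking its user agent is the reverse-plus-forward DNS check Google itself recommends. I don’t know whether imageXXX.php does this; a sketch would be:

<?php
// Verify a claimed Googlebot by reverse DNS, then confirm with a forward lookup.
// Genuine Google crawlers resolve to hosts under googlebot.com or google.com.
function looks_like_real_googlebot($ip) {
    $host = gethostbyaddr($ip);   // e.g. "crawl-66-249-66-1.googlebot.com"
    if ($host === false || !preg_match('/\.(googlebot|google)\.com$/i', $host)) {
        return false;
    }
    return gethostbyname($host) === $ip;   // forward lookup must return the same IP
}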

Unfortunately, the two previous blocks aren’t enough. Bots can spoof user agents and spoof referrers. What if a bot tells me it’s using “Mozilla something or other” and comes from “http://joesblog.com” while requesting an old image? Did I start to see this? Yes I did. I don’t mind if the requests are for new images, or if I can verify they are from a blog that does link to me. So, I added this:

Block III
# catch almost anything looking for an old image and send it through imageXXX.
RewriteCond %{REQUEST_URI} /wp-content/uploads/
RewriteCond %{REQUEST_URI} !(2011/12|/2012/01|/2012/02|2012/04)
RewriteCond %{HTTP_REFERER} !(Whitelisted_domains|whitelisted_blog_posts) [nc]
RewriteRule .*\.(jpe?g|png)$ http://mydomain.com/imageXXX.php?uri=%{REQUEST_URI} [L]

This block sends requests for all old images through the script unless those requests come from blog posts I have verified link to my images. I manually verify that a blog post links my images and add it to the “RewriteCond %{HTTP_REFERER} !(Whitelisted_domains|whitelisted_blog_posts)” list. (Should I notice scrapers trying to take advantage of this whitelist, I can eliminate that line, send the blogger a note, and request they copy and host my images themselves. But for now, whitelisting some blog posts that send a lot of traffic saves cpu.) Also, I need to edit ‘RewriteCond %{REQUEST_URI} !(2011/12|/2012/01|/2012/02|2012/04)’ from time to time, because as the months roll by, images containing “2011/12” in the url will become old.
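
Since that month list has to be re-edited by hand (a commenter below suggests automating the update), a small scheduled script could compute the pattern instead. A hypothetical sketch that prints a condition covering the current month and the three before it:

<?php
// Build the alternation of the last four YYYY/MM upload directories,
// e.g. "2011/12|2012/01|2012/02|2012/03" when run in March 2012.
$firstOfMonth = strtotime(date('Y-m-01'));
$months = array();
for ($i = 3; $i >= 0; $i--) {
    $months[] = date('Y/m', strtotime("-$i months", $firstOfMonth));
}
// A cron job could splice this line into .htaccess in place of the hand-edited one.
echo 'RewriteCond %{REQUEST_URI} !(' . implode('|', $months) . ")\n";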

Once again: imageXXX.php sometimes just sends the image. In fact, the referrer whitelist inside imageXXX.php is much wider than the one in .htaccess. Many of the requests diverted by this block are shown the image. But because they are logged, I can catch image scrapers trying to race through images rather quickly.

Questions
For those who read this far: I’d like to know whether you can immediately see a huge hole in the strategy that a person who works at a company that makes money by scraping images, and who is very strongly motivated to scrape, might exploit. I tried to think of some– but it’s always better to ask people. Also, if you can tell me how to rewrite block one to eliminate at least 2 lines, let me know. And if there is something obviously stupid about having three blocks in .htaccess, let me know that too– and tell me the solution!

Meanwhile, for others: open thread. And warning: you will be seeing more of what I do to get rid of the cracker-bots and referrer spammers! I’m doing this precisely to get feedback from the IT people who visit and know more than I do. But I can report: cracker-bots and referrer spam are way down. (And since the rate of both is entirely independent of the rate of real visits, this is not merely because traffic is down due to light blogging.)

225 thoughts on “Reduce image scraping to prevent blog crashing and thwarth copyright trolls.”

  1. Doc–
I know. But people like Brandon and Kan will be able to give me advice on how to make that more efficient. I was just focused on something that *worked*. I had to do a lot of hunting, and worrying about the niceties of a splendid .htaccess was out of the question early on.

  2. Lucia –

    I can’t speak for the other guys, but it would go completely over my head. It’s not a language I speak!

    I think we’re just expressing our ignorance…

  3. Anteros–
That’s what I figure. But there are certain things that, over time, various people want to find because it helps them with their issues. So I wanted to post this. At the same time, I know most of my regular readers range from a) don’t give a hoot to b) don’t have a clue what this is even saying. (To some extent, (b) arises because of (a).) I don’t want to respond by explaining when the issue is “don’t know, don’t care”. (BTW: Not understanding the stuff after —– is not only acceptable but in some ways entirely wise. At the same time, if those who do understand and who work in the area help straighten me out, that would be great. Because I know some of these blocks aren’t efficient.)

  4. Don’t get me wrong Lucia, I am glad you have this blog and look after it.
    If the price you have to pay is to know how to defrigmisize the fluximeter using a decoupled klanistic demogistic, but not houserized, flobinator, then I am glad you do.

  5. Lucia –

I do understand you. I actually did read the whole of the post, and it’s true I got a flavour of what you’re trying to do. It’s a minor source of frustration that I have no expertise with which to offer any help. And also hope that there are both people who it helps – and those who can help.

    Good luck with it – if I encounter the scary one again, I’ll let you know 😉

  6. Lucia, boys will be boys. I read your piece with interest and am hoping you get some technical replies. I have some non technical questions about what you are doing and how these things affect other blog owners – but that can wait.

  7. Kenneth–
    Feel free to ask non-technical questions. I’ll be happy to answer and if the questions raise issues that need to be addressed, I’ll know about them.

    For the time being: Some of what I am doing can have a minor impact on other blogs. For example: If a blog hotlinks my images, those images might “disappear”. I deal with that by whitelisting in the script–but for a while images from the Blackboard that Ron had linked in certain posts at The Whiteboard vanished. I could see that was happening, and had coding to make them reappear on the “to do” list.

  8. When it comes to efficiency/effectiveness, I think the real issue now will be the imageXXX.php file. I can’t imagine any of the minor issues I’d raise about your .htaccess code would have a notable impact. Fortunately, I can tell you how to “rewrite block one to eliminate at least 2 lines.” Currently, you have these lines:

    RewriteCond %{HTTP_REFERER} ^http://(.+\.)?mydomain\.com/$ [nc,or]
    RewriteCond %{HTTP_REFERER} ^http://(.+\.)?mydomain\.com$ [nc,or]
    RewriteCond %{HTTP_REFERER} ^http://(.+\.)?mydomain\.com/musings$ [nc,or]
    RewriteCond %{HTTP_REFERER} ^http://(.+\.)?mydomain\.com/musings/$ [nc,or]

    As I imagine you’re aware, these are redundant. There are two pairs of lines, each with only a one character difference. To eliminate this redundancy, all you need to do is add a question mark after that single character (which makes the command consider that character optional). This would give you two lines:

    RewriteCond %{HTTP_REFERER} ^http://(.+\.)?mydomain\.com/?$ [nc,or]
    RewriteCond %{HTTP_REFERER} ^http://(.+\.)?mydomain\.com/musings/?$ [nc,or]

    You could even reduce it to a single line by using the same grouping feature you used near the start of the string:

    RewriteCond %{HTTP_REFERER} ^http://(.+\.)?mydomain\.com/?(musings/?)?$ [nc,or]

    That makes your string comparison a little more cluttered, but it does reduce four lines to one. I assume the grouping should be easy for you to understand, but if not, I can explain it. In addition to that, I have a couple questions:

    RewriteCond %{HTTP_REFERER} ^$

    Could you tell me what this line does/is supposed to do? I’m especially curious since it doesn’t have an or tag.

    RewriteCond %{REQUEST_URI} !(2011/12|/2012/01|/2012/02|2012/04)

    Is there a reason you have April listed in this but not March? I assume April is in it so you don’t have to add it later, but not having March seems strange. Also, do you really want to do exceptions by months? You could just do !(2011/12|2012) and be done with it (I left the specific month for December in there so you don’t allow the other 11 months from 2011). I’m also not sure why you have the / in front of the year two out of four times. Is there a reason I’m missing?

    Aside from that (and a nit or two), the things I see look fine. I don’t see any glaring flaws someone could exploit. There could be problems with the imageXXX.php file, but I obviously wouldn’t be able to tell.

    Oh, one last thing. I know you said you’re not worried about minimizing the number of UAs you check in the blacklist, but you have picgrabber listed twice (seventh and last entries). I assume you’ll want to change that.

  9. Brandon–
Thanks for reducing the 4 to 1! I knew those could at least be reduced to 2, but I’m never confident about where to put the ? bit.

The RewriteCond %{HTTP_REFERER} ^$ is for blank referrers. Is that wrong?
    There is no or after it because everything after it I want to be “and”. So, what I want is:
{ If it has certain referrers (the 1 line you compressed my 4 lines into) OR a blank referrer OR no referrer }

    AND

{ URI is in the wp-content/uploads directory AND URI is 'not new' }
    THEN
send image requests to the image processing file.

    So I thought the last command in the ‘or’ block for referrer possibilities has to not have an ‘or’ afterwards. Otherwise, it will work like:

{ If it has certain referrers (the 1 line you compressed my 4 lines into) OR a blank referrer OR no referrer OR URI is in the wp-content/uploads directory }

    AND

{ URI is 'not new' }
    THEN
send image requests to the image processing file.

    Am I wrong?

  10. Is there a reason you have April listed in this but not March?

    You got my reason exactly. Just blindness. I had added the extra months back in early January. I accidentally skipped March.

    You could just do !(2011/12|2012)

    Not really, because in march I’ll be getting rid of 2011/12. I want this to be new months only. People don’t ordinarily go back to ‘old’ posts often. The exception is just to avoid processing requests people really do request. It means scrapers can also get those without being redirected, but I don’t mind if they see 3 months of images.

but you have picgrabber listed twice (seventh and last entries).

    Thanks! I periodically find things twice too! I’ll get that out.

  11. Lucia:

    Am I wrong?

Nope. I just apparently can’t think as well at three in the morning after spending the night in a bar (though I don’t have alcohol as an excuse). As soon as I saw your comment this morning, I realized the answer to my own question.

    Not really, because in march I’ll be getting rid of 2011/12. I want this to be new months only. People don’t ordinarily go back to ‘old’ posts often. The exception is just to avoid processing requests people really do request. It means scrapers can also get those without being redirected, but I don’t mind if they see 3 months of images.

    I didn’t mean to suggest that change would be a permanent fix. I just meant it’d be simpler for the moment. I guess it might cause some trouble once you reached April though, and you wanted to stop excepting requests from January. Sure, it’d save you space now, but you’d wind up having to type things out again then. Though, come April you could just use:

    RewriteCond %{REQUEST_URI} !2012(/02|/03|/04|/05|/06)

    Then updating the rule would just require changing a few characters. It doesn’t really affect anything, but it would be simpler.

  12. Brandon–
I have to admit to a great deal of lack of confidence with .htaccess. So, I thought I might well be wrong on the use of the “or”.

I wish I could think of a way to not have to manually update the rule to avoid filtering the “new” images through the script. One reason I don’t fret tooooo much about efficiency of the .htaccess is I know that specific line is very important from a cpu point of view. Some posts have 10 images, and I really don’t want the script suddenly running 10 times every time someone visiting the just-published post loads it. If it means I don’t catch a scraper until they start loading old posts, that’s ok. The feature about scrapers is that… well.. they scrape. I can wait a while.

    Oh… naturally, today I saw a referrer from http://www.google.com. Uhmmm.. I’m pretty sure that’s gotta be fake. If you really were at http://www.google.com, the referrers nearly always look like “https://www.google.com/search?q=blah…..”

    Anyway that got sent through the filter cuz guess what? Google is absolutely not in the white list!

  13. I wish I could think of a way to not have to manually update the rule to avoid filtering the “new” images through the script.

    There’s not really a good way to do that due to the limitations of .htaccess. I suppose you could write another script that would update your .htaccess file then schedule it to run periodically, but that doesn’t seem worthwhile.

    Oh… naturally, today I saw a referrer from http://www.google.com. Uhmmm.. I’m pretty sure that’s gotta be fake. If you really were at http://www.google.com, the referrers nearly always look like “https://www.google.com/search?q=blah…..”

    It doesn’t matter for what you do, but Google said they’re switching their referral strings to use url? instead of search?. It’s supposed to take some time, but I don’t know much about it.

Brandon– The main thing is it can’t just be “http://www.google.com” with nothing else at the end. While I (and everyone) would love Google to put a direct link to my blog on its main page telling everyone in the world to visit, in reality, if they visit from Google, it will be from a search results page. The uri is going to include some type of query string.

    But lots of spammers want to spoof a referrer containing ‘google’ in it. I suspect they figure many people will just assume it was a search result. (I don’t know why they don’t go to the trouble to create a long, complicated referrer, but they don’t.)

  15. Lucia,

    All this is way over my head, but I feel I should tell you that “thwarth” sounds like you have developed a lisp 🙂

  16. Eli–
I know there was a snit about McIntyre sending bots and getting blocked. I know that visitors at Climate Audit– including those who, unlike you, like Steve– explained to him that there are protocols for sending bots. I know this got straightened out. But other than that, I have no idea precisely what you think your (or NASA’s) gripe is based on the blog post you linked. Maybe I would know if I could read the link to whatever it is in your memory hole, but it’s dead. If you decide to fix it, let me know. Then I’ll go back and read it.

On a more general note: While I do not criticize those at NASA for temporarily blocking SteveMc– and as far as I am aware I have never criticized them for temporarily blocking SteveMc– I think it’s worth noting that what you present as two equivalent instances of “scraping” are not the same in degree or even intent. SteveMc needed to learn to read the time-delay instruction and wait between requests– and to that extent the cases can be compared in terms of use of robots.txt. As SteveMc doesn’t actually run a commercial bot, it’s not surprising he didn’t know this. But it was also perfectly reasonable for those at NASA to block and wait to hear from a human being trying to get the data.

Obviously, having not read the material at your dead link, I will comment on a few things that touch on the issue in general and which, based on my recollection of what happened, matter. These touch on things you ‘threw out’ in your rather unclear blog post, including: robots.txt, “scraping”, the difference between requesting static and dynamic content, and the difference between requesting data and images at a private blog.

* Lots of bots, even nice ones, read robots.txt but do not entirely obey it. In particular, many do not entirely obey the time-delay instruction. (Google does not seem to obey if you tell it to wait 30 seconds between visits. It’s a nice bot and waits for a response– which prevents the blog from crashing due to its visits– but it will not wait 30 seconds. I’m pretty sure they even tell you this straight out.) So merely not obeying a robots.txt time instruction is not unusual.

* The quote you provide complains about uninformative user agents. Lots of bots– in fact most bots– have vague, uninformative user agents that do not tell me who they are. I imagine yours does too– you probably use a user agent saying “Hey! I’m using browser X”. So while relatively uninformative information is uninformative, it is not the same as what I often see: obvious, outright lying in the user agent. (This is not a theory: it is a fact.) Moreover, I see obvious, outright lying in the referrer. In fact, someone from

    United States Robstown Wireless Data Service Provider Corporation
    Resolve Host: mobile-166-147-065-055.mycingular.net

flat out lied and left “http://www.google.com/” as their referrer while trying to download an image 10 times in a minute. This referrer is not merely vague– it is a flat-out lie. Many of the scrapers visiting my site are not merely leaving vague information when visiting and revisiting and revisiting. They are lying about the referrer, the user agent, or both. (I know they are lying about the user agent because they often switch immediately after being refused an image; sometimes they switch multiple times. However, in the case of the stinkin’ liar above, the user agent stayed at “Mozilla/5.0 (Linux; U; Android 2.3.4; en-us; Silk/1.0.13.81_10003810) AppleWebKit/533.1 (KHTML, like Gecko) Version/4.0 Mobile Safari/533.1 Silk-Accelerated=true”)

* The problem I discuss here relates not to one sweep through, but to things that scrape repeatedly and persistently. What I mean is that once I began looking, I could see the same IP downloading roughly 4 years × 100 images/year, all within less than a half hour, at this blog. You might think this is comparable to Steve’s “several thousand” requests, but that’s because I haven’t mentioned that these were coming a half dozen times a day. Then they came back the next day. And the next day. And the next day. And– did I fail to mention that the scraping was also happening at my knitting blog? So take the current estimated number of images and multiply by 3. (Oh, and when I blocked those IPs, the requests started coming from new IPs.)

Oh– and did I fail to mention that because blogs are dynamic, not static, and because incorrect url requests trigger the blog script, when the IP tried to download an image that had been deleted, it ran the script that is WordPress? This happened often because the scraper got its image requests from somewhere– who knows where– and made lots of mistakes. Such a request takes vastly more cpu and memory– and rest assured was much more resource-intensive than any request for flat data files from NASA.

My blog literally– not figuratively– crashed during some of these episodes. It was crashing 3 or 4 times a day. Once I blocked those IPs, I would see visits from a new but related IP.

Wholesale scraping of the exact same content day after day after day that crashes the server– repeated, unending scraping over months, including mistakes that ended up requesting that my server run php scripts– is quite different from what the IT guy at NASA described.

Oh– and so far I’ve only discussed the image requests. I’ve also been getting those sorts of requests from ‘bots asking for every blog post in every archive, requesting broken urls, and trying to guess archive names.

I don’t know at what rate Steve was requesting data– but the fact is, if all I had experienced was someone requesting a piddling few thousand data files in a day– the level complained of by NASA– there would be absolutely no problem to solve. My blog would not crash. Relative to the bots hitting me, that rate is very light.

    Oh– and I’m not NASA which does, as part of its mission, make data available. I’m a small time blogger.

On the last bit: while driving to the grocery store, I decided I wanted to be a bit more specific and provide numbers. Currently, my anti-scraping solution would permit a single IP coming in at a more-or-less even rate to download 21,000 images a day– and that limit only applies if it makes ‘suspicious’ image requests that trigger the script. Many, many requests are not “suspicious” and don’t trigger the script.

While Eli might think what I call “scraping” is equivalent to the access level SteveMc achieved during the “NASA” incident, based on the numerical values in the quote Eli provided, what I call “scraping solved” permits an order of magnitude more requests than NASA permitted before they blocked SteveMc. I only implemented this draconian limit because I’ve been seeing requests at rates 10 to 100 times above those where my script halts them.

While I recognize that everyone involved in protecting a server has to be vigilant and has a right to block anything they deem suspicious– and NASA’s IT guys were right to halt it and wait for someone to communicate with them– the rate at which SteveMc was hitting NASA was a much, much lower request rate than what I am describing using the word “scraping”.

  18. Steven
I got a kick out of this, which suggests some of the bunnies and mice don’t have any sense of the difference between “accessing a site” and a “DOS attack”:

    Rattus Norvegicus said…

    McI was scraping the data, but in the process was doing a pretty good imitation of a DOS attack.

Ruedy reported thousands of hits a day.

In this post I complain of one– exactly 1– IP hitting my site tens of thousands of times in one morning.
    http://rankexploits.com/musings/2012/nasa-cerf-office-of-the-chief-information-officer-top-threat/
This is at least 10 times the rate Ruedy complained of, and no one would call this a DOS attack. Dreamhost wouldn’t have classified it as such (and didn’t).

    In this post:
    http://rankexploits.com/musings/2011/bezeqint-net-is-this-an-attack/
I discuss being hammered by something hitting at a rate sometimes exceeding 3/second, sustained for hours. That’s thousands an hour— not the thousand a day that made Ruedy cut off SteveMc. (And depending on how Apache interpreted the IP, the bot was sometimes served the page and sometimes denied. But it just kept requesting and requesting and requesting…)

    I discussed this hammering with Dreamhost– it’s absolutely not considered evidence of a DOS attack. Also, Bezeq doesn’t consider it fast enough to be abuse.

I think Ruedy or a sysop has a right to cut off SteveMc if either is worried about what he saw. But the idea that what Ruedy or the sysop saw was remotely close to the rates one sees during a DOS– that’s totally clueless. Those rates don’t even approach “normal” at blogs with medium to small levels of traffic. They are drops in the bucket and don’t even amount to what I call “hammering”!

While it’s no surprise Eli Rabett says he views those people favorably, I’m personally unimpressed by the way the situation was handled. The script Steve McIntyre used was not a “robot.” You can see why by viewing this page. Despite this, the webmaster (Robert Schmunk) claimed:

    Because the robot running on the cable.rogers.com network has rather obviously and blatantly violated those rules, I placed a block on our server restricting its access to the server.

Now then, it’s understandable Schmunk might have mistaken McIntyre’s script for a bot. However, he later said:

    It is an automated process scraping content from the website, and if that isn’t what a web robot does, then it’s close enough.

This strikes me as ridiculous. If something isn’t a bot, there is no reason to expect it to abide by restrictions placed on bots. There is no such thing as “close enough” when deciding which rules apply to something. Either something is a bot and should abide by restrictions placed on bots, or it isn’t a bot.

    Even worse, Schmunk continued to refer to McIntyre’s script as a bot well after this point. This misrepresents the situation, creating a false impression of misbehavior on McIntyre’s part. I find it humorous Schmunk claimed McIntyre’s tone was “on the arrogant side” as Schmunk’s willful disregard of facts is far more arrogant than anything from McIntyre.

    By the way, Eli Rabett posted on this issue when it first happened. In the process, he completely fabricated a claim:

    Turns out that the rapid polling of the robot resulted in denial of service to everyone else, and the webmaster told McIntyre to bug out

    There is absolutely no evidence any service was interrupted to anyone, much less to everyone else. Neither Robert Schmunk, nor anyone else from GISS, ever said a word about service having been interrupted. In fact, at the time this happened, McIntyre invited people to try accessing the website to see if they experienced any slowdown in service. Nobody did.

    This claim from Eli Rabett was either a delusion or a lie.

    TL,DR: Steve McIntyre did not use a bot. Also, Eli Rabett is full of it.

  20. There is absolutely no evidence any service was interrupted to anyone, much less to everyone else.

    If the connection level Schmunk had reported had resulted in denial of service, we would have to conclude that NASA servers operate on the amount of electricity one would expect to be generated by one or two hamsters running in their little wheels.

As I said: I might have cut off service, and I don’t criticize Schmunk for having done so. He’s going home at night. His job is to protect the server. Ordinarily, the downside of blocking someone he doesn’t know is waiting for them to send a message saying “Hey! What’s up? Why was I cut off from a server that seems to be advertised as open to the public?!” Then when you discover it’s a human, you let them through. (Honestly, I don’t see what’s arrogant about such a question. Eli doesn’t seem to have quoted what Mc said– so I’ll assume it wasn’t “You F-in’ b***” etc. If Mc had written that, Eli would have quoted it. 🙂

Ok… so usually the downside is you inconvenience the person you cut off for 24 hours. In this case the person blogged about it and caused you inconvenience. To NASA: put on your big boy pants. The requests were a drop in the bucket compared to the access requests at blogs. We all know the rates of requests here.

I wasn’t aware of the distinction between “bot” and “automated script” you, Brandon, use– but as an IT guy, Schmunk should be. If it’s not a bot, then you are right: Schmunk is full of it. (This is not to say he can’t cut off service for a day. Anyone can. But others have a right to report they were cut off.)

  21. Oh, wow. I had forgotten how much amusement I got from the Deltoid article on this issue. The title alone should be enough to ensure any unbiased reader wouldn’t trust the article:

    Steve McIntyre’s DOS attack on GISS

    I think that actually qualifies as libel. DOS attacks are criminal offenses so that means Tim Lambert accused Steve McIntyre of being a criminal without any basis. He then went on to say:

    See, it was a script collating station data, not a web robot scraping station data. In the ensuing discussion, McIntyre steadfastly refused to accept that the the two are exactly the same thing

    It’s hardly surprising McIntyre refused to accept that given it isn’t true. Similarly, the end of Lambert’s article:

    Eventually GISS agreed to unblock him if he would show some consideration to other users by running his bot late at night or on weekends.

    Yes, GISS wanted McIntyre to “show some consideration to other users” by scraping its site at night or on the weekends. You know, the exact time McIntyre had been scraping the site…

  22. Lucia:

I wasn’t aware of the distinction between “bot” and “automated script” you, Brandon, use– but as an IT guy, Schmunk should be. If it’s not a bot, then you are right: Schmunk is full of it.

    I assume Robert Schmunk simply didn’t know, or didn’t care, about the actual definition of “bot.” To him, a “bot” was one thing, and that’s that. It’s arrogant, but not surprising. I wouldn’t normally care much, but when talking to McIntyre, he conceded the possibility of being wrong about the definition. Then, when talking to others about McIntyre, he ignored the possibility. That strikes me as dishonest, and it falsely creates a negative impression of McIntyre in the people reading Schmunk’s comments.

By the way, we’re technically discussing “web robots,” not just bots. The distinction shouldn’t matter for anything said so far, but it might be important for anyone who decides to search for more information on bots. For example, if you do a search for “bot” with Google, you’ll get hits for things like CleverBot and “zombies.” They aren’t relevant to the robots.txt file, or to what McIntyre’s script did, but you will see them if you look for information on “bots.”

    (This is not to say he can’t cut off service for a day. Anyone can. But others have a right to report they were cut off.)

    Definitely. I don’t have a problem with Schmunk’s decision to block McIntyre.

As I recall, the scraping that SteveM did was a legitimate effort to download material from GISS that required him going back and forth with an R program in order to download the information in a usable form. It was probably temperature data from stations but I am not sure. I think he was surprised when he was unable to link to the GISS site and complained about on his blog. As I recall, the issues were quickly and amicably resolved with GISS. I think all they requested was a second or two delays between requests. Is not this problem resolved with what Steve Mosher put together in R? Actually, to obtain station data from KNMI would require a similar type of operation.

    Why would Eli go off on an individual attempting to do legitimate work and analysis? He seems way too worried about individuals attempting to gain information and taking initiatives that might lead to criticisms of larger organizations.

  24. It was probably temperature data from stations but I am not sure.

    It was.

    I think he was surprised when he was unable to link to the GISS site and complained about on his blog.

    I’m not sure why you think his problem was anything to do with linking. His scraping got interrupted midway through, and then he found he couldn’t access the site.

    I think all they requested was a second or two delays between requests.

    They didn’t even request that. All they requested was he do the scraping at night or on the weekend. Oddly enough, he had been doing his scraping at night, so this changed nothing.

    Why would Eli go off on an individual attempting to do legitimate work and analysis?

    Given the fact he made an obviously baseless criticism of McIntyre, I think I could wager an answer for your question. Since you can probably guess what it is, I won’t bother putting it into words.

Brandon–
    Here is their current robots.txt file
    http://data.giss.nasa.gov/robots.txt

    User-agent: *
    Disallow: /cgi-bin/
    Disallow: /gfx/
    Disallow: /modelE/transient/
    Disallow: /work/

    User-agent: msnbot
    Crawl-delay: 480
    Disallow: /cgi-bin/
    Disallow: /gfx/
    Disallow: /modelE/transient/
    Disallow: /work/

    User-agent: Slurp
    Crawl-delay: 480
    Disallow: /cgi-bin/
    Disallow: /gfx/
    Disallow: /modelE/transient/
    Disallow: /work/

    User-agent: Scooter
    Crawl-delay: 480
    Disallow: /cgi-bin/
    Disallow: /gfx/
    Disallow: /modelE/transient/
    Disallow: /work/

    User-agent: YahooSeeker/CafeKelsa
    Disallow: /

Bizarrely enough, they disallow those four directories for every bot, add a 480-second crawl delay for msnbot, Slurp and Scooter, and block “YahooSeeker/CafeKelsa” (a bot I’m unfamiliar with) entirely. So Google gets no crawl-delay instruction at all.

FWIW, since I’m not a robot, I tried to load http://data.giss.nasa.gov/cgi-bin/ and got a 403 error; clearly some IPs or agents are blocked entirely. If everything is blocked, I’m not sure why they need a robots.txt instruction. I’m not going to spoof the msn bot to see if spoofing that user agent gets me in. 🙂

    Anyway, other than the fact that Eli brought this up I have no particular interest in accessing the cgi-bin file at GISS.

Returning to Schmuck’s complaint that the user agent was “vague” (as nearly all are), I want to point out I had another scraper whose referrers weren’t vague; they were just flat-out lies. Today, modemcable020.34-131-66.mc.videotron.ca came by and requested 229 images between 16:49:36 and 16:50:29– requesting every image that appeared on the blog back to 2011/05.

All requests provided the referrer http://rankexploits.com/musings/feed/. Sorry, but there ain’t no link to http://rankexploits.com/musings/wp-content/uploads/2011/05/TruevsReconstructed-500×500.jpg at http://rankexploits.com/musings/feed/. The referrer is just a big fat lie.

    I’m constantly amazed at just how long a user agent can get:

    Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 1.0.3705; Tablet PC 1.7; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.04506.30; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; .NET4.0C; BRI/1; BRI/2)

    BTW: I wouldn’t be surprised if this is Tineye though it might not be.

  27. Here is their current robots.txt file

    That’s the same as it was when all this happened. I didn’t understand it then, and I don’t understand it now.

Do you want to hear something funny though? Access to the robots.txt file was (and I think still is) blocked by UA. That’s right. Some bots trying to check the robots.txt file to see how they should behave would be prevented from seeing the rules they’re supposed to follow.

  28. “Schmuck’s complaint?” I always thought that was Moby Dick, nothing to do with ‘bots.

  29. Steve McIntyre created a script that scrapped the GISSTemp database. The server manager saw the scrapping and blocked it because a) it was tying up a lot of the server and b) it violated site policy.

    Although you did not provide any further details about your problem, I will assume that you are the person on the cable.rogers.com network who has been running a robot for the past several hours trying to scrape GISTEMP station data and who has made over 16000 (!) requests to the data.giss.nasa.gov website.

    Much hilarity ensued as Steve denied that his script was a robot and various people tried to knock it into his head that it looked like a robot, it behaved like a robot and calling it a duck was not very useful

Steve threw a fit. We got a good insight into what happened from an FOIA response

  30. TL,DR: Steve McIntyre did not use a bot.

    At the risk of sounding repetitive:
    “Him, a merchant! It’s pure slander, he never was one. All that he did was to be very obliging, very ready to help; and, since he was a connoisseur in cloth, he went all over to choose them, had them brought to his house, and gave them to his friends for money.”
    – Molière

  31. Lucia:

    For the sake of efficiency, I suggest that you replace
    “RewriteRule .*\.(jpe?g|png)$ http://…” with
    “RewriteRule .*\.(jpg|jpeg|png)$ http…”

That won’t save a huge number of cycles, but it is a waste to call a regular expression wildcard parser in this situation. This comment draws only on generalized knowledge as a programmer, not anything language- or application-specific.

  32. SteveM was attempting to download and put station data into a form from which he could more efficiently analyze it. GISS station data, and station data in general could be difficult to work with in the form it was in in those days. I’ll have to go back to GISS and determine whether that has changed. Station data from GHCN and CRU is rather easy to access and use currently. In my recent attempts to understand what goes into the three major surface based temperature data sets and how the raw data has been processed I have gained an appreciation for what GHCN and NCDC are doing.

There were some exchanges previously between GISS and SteveM that might have led to some misunderstandings, but as I recall the scraping issue was resolved whereby SteveM could continue. I believe that discussion led to and/or resulted from some complaints from amateurs who were attempting to use the entire GISS station data set. I was, in those days, surprised not to see climate scientists paying more attention to the temperature data sets they based papers on.

To equate what SteveM was attempting to do with what Lucia is attempting to avoid is an attempt, I think, by Eli to portray those who might analyze the works of, and criticize, his favorite climate scientists as rabble rousers. That the analyses and criticisms might be coming instead from genuinely interested and informed individuals is what must stick in the craw of some of these defenders of the status quo and consensus on AGW.

  33. Lucia-

    Does your log show anything interesting at 9:50 AM (PST) associated with the following user agent string?

    User Agent: Mozilla/5.0 (Windows NT 6.1; rv:10.0) Gecko/20100101 Firefox/10.0

    IP Address: 66.87.0.38

    Was curious to see if your script would recognize or block the ImageExchange (Picscout) add-on.

  34. DukeC–
    Hmmm.. Are you from Picscout? If so, I hate you but I’ll answer anyway. 🙂

I thought I could answer that without looking, and that the answer was that my script would not block the ImageExchange add-on. I also thought the answer was that I could think of no way to do so.

    But I just looked.

It turns out the answer to your question is that I saw something very interesting, and my script banned the ass of 72.26.211.129, the IP associated with the interesting thing. Since I don’t know if the hit from 72.26.211.129 was associated with the one from 66.87.0.38, I don’t know if I blocked the ImageExchange (Picscout) add-on. If you’re from Picscout, I’ve given you enough information to know, but you’ll have to tell me. 🙂

    I’m going to do a few experiments later on to see if I see the same strange thing when I try to connect. 🙂

Oddly enough, it’s never been my goal to block that add-on. Though I have a gripe with bezeq hammering my site till it crashed right around the time I responded to the 2nd stoooopid Getty letter (and trying to hammer over and over), I don’t mind if people who surf looking for pictures happen to report back to Picscout, because my goal is not to facilitate copyright infringement. It’s to stop having *&%$#F’in bots like those from bezeq and priority colo (and other strange places) hammering my site till it drops. Individuals dropping by with the add-on aren’t going to be a problem.

But– since I’m pissed at Getty right now, if I happen to cause them inconvenience or increase their operating costs by blocking their add-on, that’s ok with me! (And if you work for Picscout, go tell them I said I’ll be happy to inconvenience the heck out of them!) 🙂

  35. Eli Rabett continues to spread the baseless figment of his imagination:

    Steve McIntyre created a script that scrapped the GISSTemp database. The server manager saw the scrapping and blocked it because a) it was tying up a lot of the server

    And continues to show he doesn’t care about what words actually mean:

    b) it violated site policy.

    Much hilarity ensued as Steve denied that his script was a robot and various people tried to knock it into his head that it looked like a robot, it behaved like a robot and calling it a duck was not very useful

    But hey, if he keeps repeating himself over and over, he’s likely to convince some people he’s right. Reality has little bearing on the views of many. Similarly, toto says:

    At the risk of sounding repetitive:

    While there is the risk of sounding repetitive, I think the larger risk is that of sounding close-minded. Mocking a position you haven’t even addressed doesn’t inspire confidence. At least, not in anyone with any sort of open mind.

    Disagreement is fine. Asking people to clarify their position is fine. Dismissing what they say without even considering it is not.

    And it’s just pathetic to do that while mocking what they say.

  36. AFPhys:

    For the sake of efficiency, I suggest that you replace
    “RewriteRule .*\.(jpe?g|png)$ http://…” with
    “RewriteRule .*\.(jpg|jpeg|png)$ http…”

That won’t save a huge number of cycles, but it is a waste to call a regular expression wildcard parser in this situation. This comment draws only on generalized knowledge as a programmer, not anything language- or application-specific.

    I don’t agree. I’m not sure why you’d say a “wildcard parser” would be called for a question mark. The question mark is not a wildcard, and it wouldn’t be parsed as one. All it would do is try to match jpeg first, then if that failed, try to match jpg. If we make the change you suggest, more would be done. If jpeg failed, the expression would then move to the next alternation and try to match j, jp, jpg. Instead of backtracking one space, it would have to backtrack three. I don’t see how your recommendation could be more optimal.

    Besides which, if you want to optimize that string, why allow the parentheses to create an unused backreference?* I may be mistaken about which of those two approaches would be more optimal (though I don’t think I am), but there’s no question it’d be better to change both as such:

    RewriteRule .*\.(?:jpe?g|png)$ http…
    RewriteRule .*\.(?:jpg|jpeg|png)$ http…

    But I doubt this sort of minor optimization is of much interest to Lucia. She isn’t dealing with long strings or complicated expressions, so changes like these should have a negligible effect.

*Backreferences are created when you use parentheses to group things. They are effectively variables which store whatever was matched by your grouping. If you never use a variable that gets stored, it’s a waste to store it. Placing ?: after the opening parenthesis will prevent a backreference from being created, saving some effort.

Brandon – we have seen similar things from the Rabett before, e.g. his one-man attempt to re-define the laws of copyright.

  38. Brandon/AFPhys–
    You both clearly know more about .htaccess than I do.

    DukeC–
I repeated the experiment at the one domain I don’t send through Cloudflare. I activated the Picscout add-on and I saw 72.26.211.130 visit. Yep. It’s weird. Yep. My script would ban it– it might take an hour, but it would ban it. I think my script detects and bans. I’m going to make it ban faster now. Such is my desire to kick Picscout/Getty in the shins that I’m going over to the ELI forum to report on the IP range. I suspect that info will be communicated far and wide; if it cuts into Picscout’s profit margins: whoo hooo!

  39. Since this is an open thread…

    Prof. Mann’s new book, The Hockey Stick and the Climate Wars: Dispatches from the Front Lines is now for sale at amazon.com. The reviews are pouring in! Split between five stars (~65%) and one star (~25%).

    Among the many glowing reviews are names I recognized from the Blackboard and elsewhere: Peter Gleick, Michael Tobis, Chris Colose, Arthur Smith, John Cook, and Gregory Laden.

    Apparently this is news at WUWT, accounting for some (many? most?) of the one-stars. Donna LaFramboise in reverse.

    I’ll be very interested to see what the book has to say, once work ebbs a bit.

    amazon.com page.

  40. Re: Eli Rabett (Comment #89597) 
February 13th, 2012 at 8:24 am

    Steve McIntyre created a script that scrapped the GISSTemp database. The server manager saw the scrapping and blocked it because a) it was tying up a lot of the server and b) it violated site policy.

    Eli,
    I followed your link and agree that “web scraping” does seem to have taken place. However, I could not find anything which demonstrated either that services had been adversely affected or that the script was a “bot” that violated site policy.

  41. Duke C.–
I saw those IPs were a cloud service, and I figure they might differ for people coming from different parts of the globe. I thought up and have partially implemented a strategy to watch for the add-on (or anything similar) and find the IPs that trail requests for images by a few seconds. Thanks for prodding me to think about watching for toolbar add-ons. Even though I don’t mind the toolbar add-ons, I think lots of people are going to want that functionality.

  42. actually I would have thought that Rabett, being a remnant of the mainframe days of yore, would know all about scraping and the tedious necessity of doing that stuff to glean data from the databases that were designed purely to hoard data without letting it loose. In his mind, that might be a golden age of scientific enquiry.

  43. “In his mind, that might be a golden age of scientific enquiry.”

    Don’t they do something like this on that board game where they mix phrases like “silence is golden age of scientific enquiry”?

  44. diogenes:
    There is also a big difference between my position vis-à-vis scraping or accessing stuff and NASA’s:

    I am a private individual using private resources. I am absolutely entitled to decree that my policy is to offer a blog where I post my writings and let those who wish visit in a fashion I consider orderly. I do not consider an automated process that comes in and hammers my site to be orderly, and I am going to stop it.

    And I am not talking out of two sides of my mouth. I am saying I’m not letting anyone do it. If google did it, I’d ban them. It is my plan to limit access in the way that I wish. I am not going to permit scraping if the method involves hammering my site. Period. If I could, I’d figure out a way to prevent scraping entirely– but I can’t. So be it.

    In contrast, NASA is a publicly funded entity and one of the things they claim to do is make data accessible to the public. To fulfill their claim that they make data publicly accessible, they must permit the activities that get called “scraping”. Not only must they permit scraping, they must permit scraping using automated bots. People will call it scraping because it is scraping.

    If NASA does not permit “scraping”, then NASA’s claim they make data accessible becomes what is ordinarily called “a big fat lie”.

    I would guess that despite all the nasty unprofessional rhetoric (pigs/jesters/arrogant etc.) that spewed from NASA’s fingertips during that incident, the reason they let SteveMc resume doing precisely what he was doing– gathering data on evenings and weekends using an automated bot– was that if they did not let him do it everyone would know they were not fulfilling their claim to permit the public access to data. (Or the claim was meaningless, because saying people can download tons and tons of data provided they do it manually is idiotic.)

    I don’t fault Schmunk for cutting off access when he saw the hits in his logs just before going home. I might have done the same thing as a safeguard. When a human contacted me, or when I came back the next day and permitted the IP to resume while I watched, I’d realize that was not an attack.

    (Though, realistically, the rates SteveMc was hitting could not look like a DOS attack. Seriously. It’s like finding one rabbet turd and deciding there was a risk that a herd of tribbles had descended on the Starship Enterprise. Still, sometimes caution is wise, and I don’t fault Schmunk for cutting SteveMc off temporarily. And I don’t fault SteveMc for mentioning it at his blog. I think the discussion was good and helped clarify NASA’s policies.)

  45. ***lucia (Comment #89604)***

    Am I with Picscout? Heavens no. Please re-direct all that ass whuppin’ energy towards Eli Rabett and his tattered strawman argument. 🙂

    Seems that Getty Images was prescient enough to include this language in their website terms:

    You are specifically prohibited from: (a) downloading, copying, or re-transmitting any or all of the Site or the Getty Images Content without, or in violation of, a written license or agreement with Getty Images; (b) using any data mining, robots or similar data gathering or extraction methods;

    http://www.gettyimages.com/Corporate/Terms.aspx

    I wonder- If a blog owner were to add a small link at the bottom of their home page which contained similar language, could he/she lawyer up and hit the bot owner with a cease and desist letter?

    Just a thought.

  46. Duke C–
    Glad to hear you aren’t from picscout! 🙂
    Either way, thanks for asking that because I discovered that, as currently constituted, their add-on is easily detectable. And blockable. People at the ELI forum have been informed of the IP ranges found so far and are banning. I’m going to see if I can detect other ranges.

    I don’t know whether, if I wrote that stuff, I could sue Getty’s hindquarters off. Someone has suggested that at the ELI forums. But really– suing is a nuisance– and how do you prove who really was at an IP in court? I have no effective tools to get much on anyone until possibly the discovery stage. By then I’d have spent $$ on lawyers– and what if I guessed who to sue incorrectly? Figuring out ways to block works. Lawyering up? Probably not so much.

  47. willard quotes Wikipedia as saying:

    > Internet bots, also known as web robots, WWW robots or simply bots, are software applications that run automated tasks over the Internet.

    I am well aware of this definition on Wikipedia. It’s one of the many things Wikipedia gets wrong. There are a few things to note about the article. First, no source is given for that definition. While I’m sure the authors of the article could find one to support it (Wikipedia authors are pretty good at finding sources which are wrong), the fact they haven’t should be enough to raise flags. Second, the article clearly isn’t referring to just what is covered by the robots.txt file:

    In addition to their uses outlined above, bots may also be implemented where a response speed faster than that of humans is required (e.g., gaming bots and auction-site robots)

    The fact the article discusses gaming bots should make it clear the definition willard quoted should not be used in this discussion. Beyond that, a line shortly after the definition willard quoted contradicts the position he holds:

    Each server can have a file called robots.txt, containing rules for the spidering of that server that the bot is supposed to obey.

    This states the only thing the robots.txt file is supposed to cover is the spidering of a server, not all “bot” behavior. This means even if you use the Wikipedia article’s definitions, it agrees Steve McIntyre’s actions did not violate any policy (what he did is obviously not spidering).

    Long story short, the Wikipedia article on this subject is a bad article which shouldn’t be used, but even it makes it clear Schmunk’s position was wrong.
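
    For reference, a typical robots.txt is nothing more than crawl rules like these (a generic example):

    # Applies to all well-behaved crawlers/spiders.
    User-agent: *
    Disallow: /cgi-bin/
    Disallow: /private/

    That is the entire scope of the standard: which paths spiders may crawl. It says nothing about other kinds of automated requests.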

  48. lucia:

    Brandon/AFPhys–
    You both clearly know more about .htaccess than I do.

    To be fair, AFPhys may not know anything about .htaccess. That he knows more about regular expressions than you doesn’t mean he knows more about .htaccess itself. Since he’s only discussed regular expressions, not .htaccess, you shouldn’t assume you know less than he does about it. Of course, you shouldn’t assume you know more either!

    If NASA does not permit “scraping”, then NASA’s claim they make data accessible becomes what is ordinarily called ” a big fat lie”.

    That’s not necessarily true. As long as NASA makes everything available in convenient “packages” to download (which it didn’t, and I believe still doesn’t), it doesn’t need to allow scraping. I don’t think there’d be anything wrong with NASA saying, “Here is what you’re looking for in a zip file. Please download that rather than tying up extra resources scraping our site.”

    diogenes:

    Brandon – we have seen similar things from the Rabett before, eg his one man attempt to re-define the laws of copyright.

    Aye. I’ve seen enough of Eli Rabett’s behavior to have no hopes of “reaching” him. I just hope some people reading his comments may take to heart what people say about him and his behavior.

    AMac:

    Prof. Mann’s new book, The Hockey Stick and the Climate Wars: Dispatches from the Front Lines is now for sale at amazon.com. The reviews are pouring in! Split between five stars (~65%) and one star (~25%).

    Among the many glowing reviews are names I recognized from the Blackboard and elsewhere: Peter Gleick, Michael Tobis, Chris Colose, Arthur Smith, John Cook, and Gregory Laden.

    I’ll be very interested to see what the book has to say, once work ebbs a bit.

    I ordered this book a while back, and I’m told it will be arriving within a few days. I think it will be fascinating to read, though not in a good way (the excerpts I’ve seen had glaring issues).

    For what it’s worth, I get the impression those glowing reviews would have turned out the same regardless of what Michael Mann wrote. I’m just not sure how much of it is them actually believing what they say, and how much of it is their policy of, “Admit nothing.”

  49. re Mann’s book, it is fairly clear that he has urged all of his contacts to give him a glowing review and rating. Tobis, in his review, admits that he rated it (4 stars) and wrote the review after reading a quarter of the book. Scott Mandia devotes over 2000 words to his review and ends with a ringing endorsement of the “hockey stick”. Who would have thought it? Actually, you could probably draft his review without even reading the book – it contains all the same memes that have been discussed so often on various blogs. It might even be possible that most of it was supplied by Skeptical Science – and of course John Cook also gives the book a 5 star rating.

  50. After Sinan Unur’s calm explanations, Nicholas, the guy who wrote Steve’s “script” conceded:
    > OK then. I think you are right.
    We’ll let Brandon Shollenberger spin his proofs by assertion and non sequitur, based on his state-of-the-art knowledge of software agents. The apprentice in parsomatics found his genial home on the Internet.

  51. Willard
    Based on what you linked, Nicholas seems to have conceded that NASA “is lucky to get 15 billion $$$ a year.”

    Posted May 17, 2007 at 9:29 AM | Permalink | Reply

    NASA is lucky to get 15 billion a year.

    Nicholas
    Posted May 17, 2007 at 9:36 AM | Permalink | Reply

    OK then. I think you are right.

    So a decent web server that wouldn’t fall over when Mr. McIntyre tries to fetch the GISS data would set them back roughly 0.000005% of their budget 😉

    I have no idea what you think Nicholas conceded by “OK then. I think you are right.” I think he was saying that the requests characterized as DOS attacks could be handled by a server that would cost roughly “0.000005% of their budget 😉”

    I realize that you don’t know a cgi-script from an armadillo. If you did, you’d know we covered this ground.

    My blog runs PHP: a script. So requests involve dynamic content– just as requests in the cgi-bin do. I explain this whenever someone tells me to just get more bandwidth. The problem isn’t bandwidth– it’s CPU and memory, and usually for requests in parallel.
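
    (Ballpark numbers, just to illustrate: a static image can be handed off by the web server in roughly a millisecond of kernel time, while a dynamic hit starts the PHP interpreter, loads the blog software, and usually runs a string of database queries– easily tens to hundreds of milliseconds of CPU plus megabytes of memory. A few dozen of those in parallel can flatten a small host.)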

    But the fact is, requests at the rate NASA complains they saw from Steve are one or two orders of magnitude lighter than what I call “hammering” when out-of-control scrapers hit my blog. No one anywhere would diagnose that rate as a DOS attack– which would involve requests at a much higher rate.

    I pay less than $20 a month for everything– hosting, people taking care of Apache etc. I’m sure NASA spends more than that on the sysop’s time to look at the logs. If they spend less on their server than I do to host my blog, that’s nuts.

    If you think Nicholas conceded that the requests looked like a DOS attack, I suggest that’s because you don’t understand what he wrote.

    If Nicholas did concede that connections at the rate SteveMc made looked like DOS attacks then Nicholas was incorrect.
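
    (To put rough, illustrative numbers on it: a few thousand requests spread over an evening– say 5,000 requests in 8 hours– averages about 0.17 requests per second. A genuine DOS flood runs to hundreds or thousands of requests per second, three to four orders of magnitude higher.)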

  52. actually Lucia, Willard was quoting Nicholas in extremely bad faith. In the post immediately before the one where someone opines about the magnitude of NASA’s budget, Nicholas says:

    “Posted May 17, 2007 at 9:02 AM | Permalink | Reply

    Sinan, having re-read your post.. yes, it’s true that if you are on a fast network, this script (without any kind of delay in the loop) could, in conjunction with a poorly written CGI script, make the CPU of the web server fairly busy.

    However, it won’t cause it to be used 100%. It will still have time left for other requests. Other requests should also share the CPU evenly, so they will in effect slow down Steve’s fetching in order to temporarily serve other people’s requests.

    As a result, I would be surprised if it had a significant impact. I think NASA’s budget is in the hundred of billions of dollars a year. They can’t spring $500 for a dual core web server for GISS? This research is related to an issue which is supposedly one of the most critical of our time, which is having lots of money thrown at it, etc. What’s worse, the reason we have to do this is because they have organized their web site in such a way to make it hard to get to the data.

    It certainly isn’t a “denial of service” because running this script will not make the web server unusable for others, nor even especially slow. They might experience small extra delays above what is normal. But, the data sets being served are so small, and the network latency such a large part of the time spent fetching the data, that they’d have to drop the ball really badly to implement this in such a way that it really couldn’t handle the extra requests gracefully.”

  53. I scanned the interminably long 2007 Climate Audit thread from the point willard linked in Comment #89648 until its end. Here are the three comments most relevant to this discussion of that episode. The third one gives a better understanding of the point of view of A. Sinan Unur, the sysadmin willard quoted immediately above.

    First

    Glen Raphael – Posted May 20, 2007 at 11:51 AM

    Steve, I know you’ve been mistreated by other researchers, but that’s no reason to go into each new encounter with a chip on your shoulder. You were running what you now realize was a poorly-written site-scraping program because it didn’t include even so much as a one-second delay to allow other requests into the queue. That’s bad practice…

    So they were correct to block you and when you complained about the block they resolved the issue in less than a day; not bad at all!

    [snip]

    Second

    Steve McIntyre – Posted May 20, 2007 at 12:35 PM

    [Glen Raphael], your comments are fair enough. I note that the ultimate resolution of the issue was them telling me just to carry on with what I was doing after hours (which is what I was doing anyway) so it couldn’t have had that deleterious an impact on their operations.

    [snip]

    Third

    A. Sinan Unur – Posted May 23, 2007 at 7:35 AM

    [Rabbett’s blog entry [apparently this one]] is a blatant mischaracterization. There was never any claim [by GISS] that Steve’s script caused a denial of service.

    I tried to point out, from the admin’s perspective, repeated requests over a long period of time, using an “automated user agent”, to a CPU and memory intensive resource, can be seen as such an attempt and tried to explain the motivation behind blocking access to something other than active data hiding. Steve M. immediately saw and understood that point and fixed the title of the post.

    [snip]

  54. Willard–
    Also note Sinan Unur doesn’t think Steve caused a DOS

    Posted May 17, 2007 at 10:28 AM | Permalink | Reply

    ….I don’t think Steve caused denial-of-service

    Also: Notwithstanding the full range of things one might call “bots”, Brandon is correct about what robots.txt is defined to cover.

  55. “As a result, I would be surprised if it had a significant impact. I think NASA’s budget is in the hundred of billions of dollars a year. They can’t spring $500 for a dual core web server for GISS? This research is related to an issue which is supposedly one of the most critical of our time, which is having lots of money thrown at it, etc. What’s worse, the reason we have to do this is because they have organized their web site in such a way to make it hard to get to the data.”

    This comment is, in my mind, what the issue of SteveM’s scraping GISS should really be all about. I still have to check what GISS has available by way of station data. I know that GHCN station data is in a form that makes it very easy to use, while CRU’s can be downloaded readily and then put into better form to use.

    In my attempts to learn exactly what the 3 major surface temperature data set users supply, I have quickly concluded that almost all data is sourced from GHCN/NCDC and that both GISS and CRU use a goodly percentage of the adjusted GHCN data. There is not much value added coming from either CRU or GISS and I think they retain their data sets primarily for marketing purposes of showing prospective funders. I advocate competition but I am not sure how that translates into government-funded or government-run operations.

  56. Steve’s script was not trying to INDEX the site as the webmaster thought. One difference between a robot and a scraper is that scrapers (at least those I write) do not search a site recursively.
    Here’s a helpful definition

    http://webdesign.about.com/od/promotion/a/aa020705.htm

    For Environment Canada, the operators in fact give you instructions on how to scrape the site since they don’t provide FTP access to the files. There are entire packages in R devoted to automating the task of scraping sites. For GHCN Daily I hit the site 27,000 times to download the files.
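
    (To illustrate the distinction, a minimal PHP sketch– not the actual GHCN code; the station IDs and URL are invented:)

    <?php
    // A scraper in this sense walks a *known* list of URLs rather than
    // discovering and following links recursively, and it sleeps between
    // requests so the server stays responsive for everyone else.
    $stations = array('USW00094846', 'USW00023174'); // illustrative IDs
    foreach ($stations as $id) {
        $url = "http://data.example.org/ghcnd/$id.dly"; // hypothetical URL
        file_put_contents("$id.dly", file_get_contents($url));
        sleep(1); // polite delay between hits
    }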

  57. Before we play the Also Note game and agree to “move on” the goalposts, it might be interesting to reach an agreement on the ontological question of bothood. To that effect, perhaps we could “read the blog” a bit more, as our beloved bot was fond of saying.

    Here is another testimony, this time by Dave Blair, on May 17th, 2007, at 10:36 am:

    > Usually there is interesting info here and I realize there is frustration on this matter but to be blunt, I think you should move on to something else. This is clearly a bot. The web administrator is just following policy – no conspiracy.

    http://climateaudit.org/2007/05/17/giss-blocks-data-access/#comment-88327

    Glen’s comment, made on May 20th, 2007, at 11:51 am, began with this sentence, which AMac did not quote:

    > Unfortunately, Lee’s right.

    http://climateaudit.org/2007/05/17/giss-blocks-data-access/#comment-88273

    We might need some more parsomatics to know what Lee’s right about. It may be of interest to note that Dave and Glen’s comments seem to have been made three days apart, and that Dave was not the first to tell Steve that his “script” can indeed be considered a bot. Meanwhile, we’ll enjoy the comment that won that thread:

    > Do Climate Auditors Dream of Electric Sheeple?

    http://climateaudit.org/2007/05/17/giss-blocks-data-access/#comment-88221

  58. Mosher–
    And of course GHCN lets you do that because letting people do that is their intended purpose. In contrast, letting automated-anything visit and download every image back to 2007 without reading any posts, or participating in any discussion is not what I spend my money to support. But the automated-non-human things want to do that for purposes of their own. I want to stop them.

    Unlike an agency thwarting data access, I have a perfect right to thwart any and every connection I choose for whatever reason I choose. Agency employees may or may not have a right to do that.

    Eli’s insinuation-by-snark that my situation somehow translates into “SteveMc was right and NASA was wrong” is nutty on so many levels it’s hard to even list them all. (Willard as usual is even more confused-sounding than Eli.)

  59. willard

    it might be interesting to reach an agreement on the ontological question of bothood

    Or it might not be interesting to follow you down winding paths into ontology.

  60. willard (Comment #89655)

    Glen’s comment, made on May 20th, 2007, at 11:51 am, began with this sentence, which AMac did not quote:

    > Unfortunately, Lee’s right.

    Very meta, given that the length and tediousness of that CA thread stems from the varied charges of nefariousness and ignorance that aggrieved partisans of the two camps lob at one another.

    willard, since you bring it up, why didn’t I quote Glen saying “Unfortunately, Lee’s right”?

    You parsomatically imply a nefarious explanation. Can you propose an ordinary one?

    Here’s a hint: my comment #89651 was already long.

    Sheesh.

  61. Before we play the Also Note game and agree to “move on” the goalposts, it might be interesting to reach an agreement on the ontological question of bothood. To that effect, perhaps we could “read the blog” a bit more, as our beloved bot was fond of saying.

    Does anyone else find it amusing willard claims to want to “reach an agreement” on what a bot is (in this context), yet he refuses to address the discussion of what a bot is?

  62. I just received an email reply from Reto Ruedy of GISS and he put the station data all in one place for download at the link below. He commented in the email that, since GISS adjusts the station data to give trends the same as the rural stations for all stations by region, GISS was hesitant to advertise the station data as being data from a particular station.

    http://data.giss.nasa.gov/gistemp/station_data

  63. “For Ghcn Daily I hit the site 27,000 times to download the files.”

    Mosher, does that mean you had to download max, min, mean temperatures separately from 9,000 plus stations one at a time?

    Monthly station data is all in one place for download, but I would suppose daily station data all in one place would be a very large file – but would that be a problem for download into R?

    “Unlike an agency thwarting data access, I have a perfect right to thwart any and every connection I choose for whatever reason I choose. Agency employees may or may not have a right to do that.”

    I sometimes think some people miss this very important point and difference. I always thought that those with a modern liberal bent tended to defend the little guy. I guess that all changes when the little guy might be portrayed to be working against their causes.

    Lucia, maybe this question was asked before, but how do other small blog owners handle these problems? I do like the determination and tenacity that you display in handling this problem. You have evidently taken some major initiative to write the code required to do something about it. Do other owners simply live with the problem, or do they pay for devices/software to control it? Or are they less susceptible to these kinds of attacks?

  65. Kenneth–
    I don’t know what other bloggers do. It can vary:
    1) Some blog on wordpress.com. They don’t see the issue.
    2) Some just get hacked.
    3) Some have so few visits they don’t notice because they don’t get complaints from visitors when the blog is down. (Let’s face it, if you post once a week and 3 people a week visit, you won’t know your blog crashed!)

    There’s a whole range.

  66. John M,

    Hmm well I have no problem with it.

    From Heartland Institute internal documents:

    funding for high-profile individuals who regularly and publicly counter the alarmist AGW message. At the moment, this funding goes primarily to Craig Idso ($11,600 per month), Fred Singer ($5,000 per month, plus expenses), Robert Carter ($1,667 per month), and a number of other individuals, but we will consider expanding it, if funding can be found.

  67. Robert (Comment #89667) —

    Sounds like there will be plenty of interesting and relevant bits of information to come to light, as was the case with analogous leaks. I’ll stay tuned.

  68. From the website:

    “An anonymous donor calling him (or her)self “Heartland Insider” has released the Heartland Institute’s budget, fundraising plan, its Climate Strategy for 2012 and sundry other documents (all attached) that prove all of the worst allegations that have been levelled against the organization.”

    Heartland Insider, eh? Not exactly catchy…

  69. Robert,

    It would be interesting to see how those numbers compare with a typical NSF grant to “study” AGW.

  70. Robert,

    If the documents “prove all of the worst allegations that have been levelled against the organization.”, where’s the part about wanting to destroy James Hansen’s “Creation”?

  71. From the documents

    An anonymous donor gave 14.2 million dollars in the last 6 years to the Heartland Institute.

  72. I think the point Robert is failing to highlight is that you don’t have to pay attention to Heartland at all, about anything, at any time. What Heartland does is not relevant to whether AGW is true or not. If AGW is a hoax (and it is), you don’t need Heartland to tell you. AGW advocates do a good enough job of that themselves.

    Andrew

  73. Robert

    An anonymous donor gave 14.2 million dollars in the last 6 years to the Heartland Institute.

    Ok. Presumably you posted that to make some sort of point. What is it?

  74. At the moment, this funding goes primarily to Craig Idso ($11,600 per month), Fred Singer ($5,000 per month, plus expenses), Robert Carter ($1,667 per month),

    If true– and it may well be– what this shows is whoever is doling out money has no idea how to spend it to achieve anything. As far as I can tell, all three of those guys are utterly ineffective promulgating any ideas about anything. Have you seen Fred Singer talk lately?

  75. and…thinking about it, Steve McIntyre was requesting 20 meg….it is lucky that the NASA server admins do not work for YouTube, where they must get hundreds of requests for 20 meg every second. I can imagine the Rabett’s head exploding at the thought of this much data being widely available.

  76. Andrew_KY–
    I’d prefer if Robert told me what point he is trying to make. I don’t trust your attempts to read his mind.

  77. I guess I should have pointed out that I was responding to John M

    “Robert,
    It would be interesting to see how those numbers compare with a typical NSF grant to “study” AGW.”

    I guess I should have been less subtle…

  78. sorry Robert, but it still means that the warmists are getting way more funding than the realists

  79. Re: lucia (Comment #89681)

    As far as I can tell, all three of those guys are utterly ineffective promulgating any ideas about anything. Have you seen Fred Singer talk lately?

    Come to think of it… yes, I have. 🙂

    He gets paid $5k/month + travel?

  80. lucia (Comment #89681)
    February 14th, 2012 at 6:11 pm
    If true– and it may well be– what this shows is whoever is doling out money has no idea how to spend it to achieve anything. As far as I can tell, all three of those guys are utterly ineffective promulgating any ideas about anything. Have you seen Fred Singer talk lately?

    The one I’m surprised about is Idso… I’ve rarely even heard of him…

  81. Well John M,

    Linking to one of the groups being funded (WUWT) (as shown in the link) isn’t exactly the best choice haha

    but regarding NSF… if you think that NSF funding pays scientists $11,000 a month in salary then I think you need to revisit the application process…

  82. Robert–

    I guess I should have been less subtle…

    Well yes. Because John M appears to have been responding to

    Craig Idso ($11,600 per month), Fred Singer ($5,000 per month, plus expenses), Robert Carter ($1,667 per month),

    Had you been less subtle, we might have immediately known your response to John M was to change the subject. Now that you have clarified, we know.

  83. Robert– Expenses: Is it salary? Or is it paid to someone’s company, and so includes money for secretaries, clerks, office rental etc.? Many NSF grants are small, but some larger NSF grants cover some salary, money for grad students, some clerical work, money for equipment etc. I’m still not sure what your point is– and your changing subjects and making comparisons that look like pomegranate seeds to watermelons makes it difficult to know. But if whatever point you are making needs an honest comparison to be made, maybe you should dig up the data, create a table and present it.

    I’m not particularly motivated to do it– and anyway, I still have no idea what point you are trying to make or argument you are trying to advance.

    Is it that someone funded Heartland? Or some people get funding? Those aren’t points — they are just information.

    If you think these bits of data mean something tell us. I’m not going to try to read your mind.

    The one I’m surprised about is Idso… I’ve rarely even heard of him…
    There are two Idsos listed at

    http://icecap.us/index.php/go/experts

    I have no idea whether the money goes to an individual Idso or to “President of the Center for the Study of Carbon Dioxide and Global Change”– which would mean it’s split for the center.

    I realize that the idea that it goes to the center might bug the c*&^ out of you, but when making comparisons, it’s not quite right to label something as salary if it covers rent, server hosting etc. (And don’t just link Desmog blog. As usual, they use a lot of rhetoric but other than that…. crickets.)

  85. Robert,

    “Linking to one of the groups being funded (WUWT) (as shown in the link) isn’t exactly the best choice”

    This from someone who started all this by linking to demonsmog?

    I guess it’s easier to dismiss the citation than it is to deal with the facts.

    Anyway, with regard to how well grantees do, I’ll let you add it up and divide by the number of people yourself (scroll down to the bragging about funding, which we know from the Penn State “investigation”, is all that counts).

    http://www3.geosc.psu.edu/people/faculty/personalpages/mmann/documents/Mann_Vitae.pdf

    Bear in mind this is on top of his tax payer funded salary.

    And I do know how the NSF works, by the way. They fund summer salaries for PIs pro-rated based on what the parent institution is fool enough to pay these guys.

    While it’s true that grantees don’t pocket all of this, you seem to be implying the Heartland fundees do, which I guess means you think they work in their dining room with a pad of paper and a pencil.

  86. Why does Robert want to compare a private foundation to NSF?

    Shouldn’t he be comparing Heartland to, say, the World Wildlife Fund or Greenpeace? Or is it only spooky and, gasp, the worst of our fears confirmed when “they” do it instead of us?

    My opinion is that Robert is just showing the true nature and character of the AGW movement: they only want their own voices to be heard, and all others silenced.

  87. Not that it’s particularly meaningful, but compare private-sector Idso’s income to public-servant James Hansen’s total income. Hansen’s base income is more than $11k a month and, unlike with Heartland, we don’t get any say whether we help pay for it or not.

    Nor should we be allowed to. If we disagree we should just shut up.

    There I think I have Robert’s position right.

  88. Carrick,

    As much as it pains me to defend Robert, I was the one who brought up the NSF.

    You are correct that the more apt comparison is with the activist NGOs, but the NSF data is more readily available.

    I guess we’ll just have to wait for another leak of internal documents. 🙂

  89. Carrick–
    Hansen’s base income is also likely his salary. Quite likely his $11K/month does not include what he gets in medical, dental or other benefits. In contrast, whatever Idso makes from Heartland (or the private donor) pays for office, supplies, staff, and yes, any health or dental insurance premiums he elects to pay out of his $11K.

    Of course, maybe I’m wrong– but if so, Robert can dig up the info and present it in a digestible way so that we can see whether it supports whatever argument or claim he thinks these factoids support. Otherwise, for now, it seems he wants to insinuate some mystery point or argument by posting factoids that are not even sufficiently complete to communicate actual facts.

  90. Robert–
    Question. Out of curiosity, what are your thoughts on the propriety of people quoting, citing and hosting confidential memos that have leaked?
    (Mine is that it’s ok. It’s ok if it’s the IPCC. It’s ok if it’s climategate emails. It’s ok if it’s Heartland’s memo. I’m just wondering what your thoughts are.)

  91. Personally I’m okay with leaked documents. I’m a big proponent of transparency, so that’s the reason why, but I do think sometimes there have to be safety concerns too, i.e. home addresses and home telephone numbers etc. should be edited out of documents…

    I think the ZOD IPCC stuff should have been public anyway…

  92. On that topic I think that there should be more transparency in a lot of processes in climate research. That is why I support open-access journals, for example the EGU ones. Then at least people can see who gets easy and hard reviews etc… having been pounded in a few, I’ve wondered whether the same rigor is applied to everyone 😛

  93. Robert

    I think the ZOD IPCC stuff should have been public anyway…

    Glad to hear it! It’s always nice to come across points of agreement on things.

    But I do think sometimes there have to be safety concerns too, i.e. home addresses and home telephone numbers etc. should be edited out of documents

    Agreed. This is a problem with leaking as a method– but OTOH, I don’t see this as a reason to not discuss what’s in the documents.

    Is the Heartland leak limited to that 1 memo? Because it’s not particularly informative. I don’t think we can discover if $11K to Idso is for salary, his company or whatever.

    On other things: I know others get to have their own reactions, but I’m just not shocked or appalled that a private individual gives lots of money and remains anonymous. It doesn’t bother me at all if Heartland has pledged to give Anthony Watts support sometime in the future for work collecting temperature data which until now he has been doing unfunded. I know he’s not who you or NSF would pick, but it seems to me that a private individual might be inclined to give money to someone outside the NSF funding circle rather than inside. This seems entirely reasonable to me. So, Heartland pledging that seems perfectly ok with me.

    I’m sure Greenpeace helps some people raise funds for things Greenpeace supports. I don’t find it shocking– and I don’t find heartland pledging to help Anthony in the future shocking.

  94. I tend to mostly have an issue with anonymous contributions of any kind. If company X gives to Greenpeace or to a SuperPAC, I think that people have a right to know where funding comes from, especially when it comes from corporate interests. The only thing from these documents (they’re on the website linked to before) that I tend to not like is the millions upon millions donated by a single anonymous donor… to me it’s just the same as the anonymous donations in politics… it undermines democracy.

  95. Robert–
    I don’t have a problem with anonymous private donations to political candidates. So I guess we both agree it’s the same– but I think it’s ok and you don’t. I don’t think it undermines democracy.

  96. Thinking more about it: I’m not even sure anonymous donations to private entities are the same as anonymous donations to elected officials. Even if I could be persuaded that anonymous donations to elected representatives with seats in the government and who can enact legislation were a big problem, I would still have no problem with anonymous donations to private entities. I can’t even begin to see any problem with the latter. Private entities don’t pass or execute laws and don’t act as judges during trials. So, I don’t see how anonymous donations to those things could possibly undermine democracy.

    In contrast, I see government payouts to private individuals, or payouts masked as crony capitalism, as potentially undermining democracy. But private individuals giving money to private individuals or entities? I just don’t see a potential for undermining democracy there.

  97. Well I suppose we can just agree to disagree on that point. For me the problem with so-called “private entities” is that when there are no caps on the amount which can be donated, it essentially tips the table towards those with money all the time. If two lobbying organizations exist and one is for industrial interests whereas the other is for, perhaps, consumer protection, the lack of limits on the amount that can be given means that one group gets its interests more represented because it has more financial capital at its disposal.

    I’m all in favor of capitalism sure, but not a warped version where those who have money rig the system to perpetually make them winners. People innovate, achieve and get to a place where they have money and power, and then use it so that they don’t need to find new ideas but can instead keep the status quo… it’s like lazy capitalism…

    Anyways enough of that…

    I think that for me the issue is that when organizations receive unlimited anonymous donations and have no requirement to be honest, it leads people to be misinformed, and I consider a key tenet of democracy to be a well-informed electorate.

    For me the problem with so-called “private entities” is that when there are no caps on the amount which can be donated, it essentially tips the table towards those with money all the time.

    Maybe. But I don’t see this as a big problem. I can give money anonymously to the Salvation Army, the humane society, a church or to some person whose tear-jerker life story is carried on the nightly news. I can give money to the neighbor. I could also buy overpriced Girl Scout cookies or overpriced Christmas wrapping paper or magazines from someone — and I can pay cash, thereby making it anonymous.

    Sure the wealthy can give more than the poor, but I don’t see how this creates a problem for democracy.

    For me the problem with so-called “private entities” is that when there are no caps on the amount which can be donated, it essentially tips the table towards those with money all the time.

    Sure. And this is why we have laws about what lobbying organizations can do and what sorts of gifts they can give legislators. But I don’t see how it means that people should not be permitted to donate money anonymously to those organizations they support. There are all sorts of reasons a private individual might not want to divulge who they donate to. Maybe my boss doesn’t like the humane society or doesn’t like me giving to the Boy Scouts. I don’t see why I shouldn’t be able to donate anonymously.

    where those who have money rig the system to perpetually make them winners.

    That’s where I see a problem with anonymous donations to elected officials, civil servants or bureaucrats. But I don’t see how this means that one private entity can’t anonymously give money to another private entity.

    it’s like lazy capitalism…

    Are you worrying about crony capitalism? I see a problem with giving money to politicians who enact legislation. But I just don’t see a problem with private individuals giving money to things they support.

    I consider a key tenet to democracy is a well-informed electorate.

    Heartland isn’t a governing body. Neither are the Boy Scouts, AARP or any number of private groups. I think all do some lobbying and also some politicking. I don’t see their getting private donations as thwarting the ability of anyone to be well informed.

  99. Looking at their proposed budget for 2012, I’d prefer that the $88K come to me to set up a web site, rather than go to WUWT.
    Of course, I wouldn’t take money from Heartland.

    On the documents: whoever leaked/stole them should have redacted personnel information. At least the CRU leaker had the sense to limit his civil liability. I imagine the person let go for truancy won’t like that fact being publicly revealed.

    Since Heartland puts out crap, I’m only too happy to see that somebody exposed the crap. Sad to see a couple people take 125 bucks a month for writing. Should have kept their noses clean.
    My sense is that if you want to challenge the establishment you have two choices:

    1. Get an establishment job and work from the inside
    2. Do it for free from the outside and accept money from no one.

  100. HaroldW–
    I wish I had the money to donate that he does!

    If there are any “Holly Golightly”s in the Chicago area, I’m sure they will be trying to figure out who this “Rusty Trawler” is!

    “I think that for me the issue is that when organizations receive unlimited anonymous donations and have no requirement to be honest, it leads people to be misinformed, and I consider a key tenet of democracy to be a well-informed electorate.”

    Robert, I think you have thrown out an issue here but I do not see you giving any detailed or definite solutions you might see for solving your problem. You are engaging in the discussion you initiated, so you are certainly ahead of Eli in that regard.

    There are a lot of potential evils lurking within that statement of a well informed electorate. We give our governments lots of power over individuals and organizations and we (you) expect those individuals and organizations to have no recourse in presenting their sides of an issue in order to have a better informed electorate. Would you limit speech or the funds to provide that speech to better inform the electorate? It would appear that you are prejudging the speech with regards to whether it better informs the public and the politicians.

    If the electorate was turned off by (over) spending for speech they might well vote against the interests of those doing the spending. Or do you perhaps consider that the electorate is not sufficiently intelligent or motivated to do that? If the electorate can be surreptitiously “bought” by spending money on lobbying and campaigning, then they can be “bought” more directly by politicians simply doling out other people’s money to their constituency – and do it with much less fear of a counter argument if speech involved in lobbying and campaigning is limited.

    Certainly you are not talking here about limiting what a private organization wants to spend on informing the public or what a private donor might want to give to such an organization – no matter the quality of the informing or subject matter – are you? It is like free speech in general. If you get to pick and choose what speech to allow, it is no longer free speech. Free speech here is defined as speech that is not used in committing fraud – and no, free speech does not mean someone can have unlimited access to my property to speak.

  102. > let’s not lose sight of the fact that Willard and Rabett were acting in bad faith.

    If some were to ask why we were having this discussion, enthusiasts in parsomatics might be tempted to blame it all on NASA.

  103. “Does anyone else find it amusing willard claims to want to “reach an agreement” on what a bot is (in this context), yet he refuses to address the discussion of what a bot is?”

    YAD: yet another diversion.

    willard is all about the marginalia. That’s ok. It’s fun to doodle in the margins. he enjoys it, try not to judge it. embrace your inner willard.

  104. “If some were to ask why we were having this discussion, enthusiasts in parsomatics might be tempted to blame it all on NASA.”

    Hardly. It’s pretty clear that none of the parties in the whole botscapade were blameless.

    1. The original code writer could have done a better job.
    2. The code user could have taken a more generous view of the actions of the sysadmin.
    3. The sysadmin could have taken more time to understand what the code was actually doing.
    4. Others could have refrained from accusing parties of crimes.
    5. Marginalia-ists could refrain from derailing. Or not.

  105. Mosher– I had turned off the troll-control delay-o-meter for a long long time. Then willard returned. At first he was writing short snippets. Then he figured out that he could write more.

    It was amusing for a while. Then I remembered why my rule for willard is he can write short snippets. If he wants to write more, he should write it at his blog and link.

    As for his most recent comment: Is he really trying to make some sort of mystery point by suggesting some unnamed person (possibly willard himself?) might ask why “we” were discussing “this” (whatever this might be), and that some other people (“enthusiasts in parsomatics”) might respond by blaming the discussion about “this” on NASA?

    Given the multiple topics in the thread, I don’t know if “this discussion” is the recent thread discussion about Heartland (a topic introduced by Robert), image scraping (the post mentions getty, ‘bots etc.), the weirdness of willard or any number of other things. All these discussions seem perfectly legitimate topics to me, so I don’t know why “enthusiasts in parsomatics” would use the verb “blame” when explaining “why” we are having them. And I really truly don’t know why anyone would suggest NASA is the cause of the conversation.

    And guess what? No one has blamed NASA for this conversation!

    But I suspect willard may think he’s being subtle or clever. Or possibly making some sort of sly point by use of sarcasm. Or… something.

    His mystery drivel is once again limited to short blurbs.

  106. Ahhh! Mosher. I see that’s what “this” was. I would have thought the answer to willard’s question about why we were having “this” discussion was that Rabett brought that subject up in Eli Rabett (Comment #89571). Prior to the Rabett thread-jack, we were discussing image scraping!

  107. lucia (Comment #89772)

    > Prior to the Rabett thread-jack…

    Up top, you had declared this an open thread, which would seem to preclude ‘jacking.

    Re: willard — a while back, I commented at Stoat (and, later, at OIIFTG). I thought it wasn’t good netiquette for their respective proprietors to wave in unflattering commentary about my ideas and me, while throttling what I offered in response. If willard and/or his writing is to be a topic, it would seem more fair to invite him to participate in the dialog.

    My two cents.

  108. I just received my copy of Michael Mann’s book. The first sentence of it is (emphasis mine):

    On the morning of November 17, 2009, I awoke to learn that my e-mail correspondence with fellow scientists had been hacked from a climate research center at the University of East Anglia in the United Kingdom and selectively posted on the Internet for all to see.

    Two pages later, there are these two sentences:

    Instrumental records from around the globe indicate that Earth has warmed by almost 1 degree Celsius (about 1.5°F) over the past century. That may seem a small amount, but it is already noticeable in glacier retreat, rising sea level, more frequent heat waves, and more intense hurricanes, among many other phenomena.

    That’s as far as I’ve gotten, but it shows a disturbing trend. Ideas which are possible, but by no means known to be true, are stated as fact. If this is remotely representative of the book’s accuracy, there is no way the people giving it glowing reviews read it with an open mind.

  109. On page three (the prologue’s pages are numbered with Roman numerals), there are these two sentences (emphasis mine):

    The recipient of a coveted MacArthur “genius” award in recognition of his groundbreaking contributions to our understanding of climate change, Santer was a primary author on a series of important papers establishing the human role in observed climate change. As such, Santer was in a better position than anyone–and certainly than a bureaucrat with a political agenda–to assess the level of scientific confidence in concluding that human activity was changing the climate.

    Say what? Are we really supposed to believe Ben Santer was more qualified than anyone in the world to judge this issue? One person is above the rest? This sort of comment seems to indicate bias, but the real problem only comes two paragraphs later:

    In February 1996, for example, S. Fred Singer, the founder of the Science and Environmental Policy Project and a recipient over the years of substantial fossil fuel funding,7 published a letter attacking Santer in the journal Science.8 Singer disputed the IPCC finding that model predictions matched the observed warming and claimed–wrongly–that the observations showed cooling.

    Whereas before Michael Mann exaggerated things, here he flat out misrepresents them. It’s an exaggeration to call disputing a conclusion attacking the conclusion’s author, but more importantly, Singer’s letter said:

    The summary (correctly) reports that climate has warmed by 0.3° to 0.6°C in the last 100 years, but does not mention that there has been little warming if any (depending on whose compilation is used) in the last 50 years, during which time some 80% of greenhouse gases were added to the atmosphere.

    While you may argue Singer was wrong, he certainly did not deny warming had been observed. Mann manages to ignore this by only referring to (without specifying what he was talking about, of course):

    The summary does not mention that the satellite data–the only true global measurements, available since 1979–show no warming at all, but actually a slight cooling, although this is compatible with a zero trend.

    Yes, Singer says one set of observations shows cooling (which he mentions is statistically insignificant). No, this does not mean Singer says “the observations showed cooling.” I suspect the next few paragraphs may contain similar misrepresentations, but I don’t have the knowledge to check (and they don’t seem easily verifiable). However, I do find one sentence amusing:

    Insiders accusing Santer of abusing the peer review system and of “political tampering” and “scientific cleansing”–a charge that was especially distasteful given that Santer had lost relatives in Nazi Germany.

    At this point, I’ve seen Mann use the word “denier” something like half a dozen times. It seems strange to use it while claiming “scientific cleansing” is a reference to Nazi Germany. I know there is disagreement over whether or not “denier” is a Holocaust reference, but it seems to me there is a far stronger case for it than for “scientific cleansing.” At least with “denier,” the original user of it explicitly stated they were making a Holocaust reference.

  110. Since this thread got hijacked, allow Eli to use the wayback machine to point out that the only person with a valid opinion of whether McIntyre set a bot onto GISS was the server manager, and at the time he clearly identified it as one.

    The identity of the computer making the requests was consistent, and as best I recall was something in the domain of Rogers Communications, a Canadian phone company and ISP.

    Plainly this activity was from an “automated” agent, which in rough parlance is usually called a “robot”. Many robots have legitimate purposes, e.g. search engines such as Google or Yahoo, but others do not (spambots), and others one just doesn’t know.

    As the robot on May 16 came from a generic ISP address rather than, say, an academic address, and further because its “user agent” tag provided no further information about who was running it, and also because the GISS websites have “robots.txt” files which instruct all well behaved web robots to stay out of the CGI directories, I cut off access to the ISP in question to the websites on Web2.

    There were several thousand requests, and the problem was that to answer any one of them the GISS server required a few seconds of time, so that tied up the site, e.g. no matter what the intent of the scraper, it acted like a DOS attack.

  111. By the way, I find it distasteful that Mann provides a reference discussing the funding of the group that made the accusation and a reference showing Santer lost family in Nazi Germany, but he doesn’t provide a reference to the report where the accusation was made. It seems like just another case of RealClimate refusing to link to what it criticizes.

  112. A nice diversion from Mann’s dishonesty is Eli Rabett’s dishonesty:

    Since this thread got hijacked, allow Eli to use the wayback machine to point out that the only person with a valid opinion of whether McIntyre set a bot onto GISS was the server manager, and at the time he clearly identified it as one.

    Clearly “the only person with a valid opinion” on what a word means is the server manager at GISS. He is also the only person in the world with a valid opinion on what robots.txt files cover. /sarc

    There were several thousand requests, and the problem was that to answer any one of them the GISS server required a few seconds of time, so that tied up the site, e.g. no matter what the intent of the scraper, it acted like a DOS attack.

    Dishonesty aside, Rabett has now shown, quite conclusively, he has no idea how connecting to servers works.

  113. I just hope that Eli never has to work at YouTube or Facebook – to have to deal with thousands of requests must be traumatic. How much traffic does YouTube get? And NASA, where they use electro-mechanical computers, will feel it most of all.

    Brandon – I was wondering whether to look at the Mann book. But from your extracts, it seems to be a copy and paste of Joe Romm. I will avoid. I wonder how Arthur Smith, who seems to be the designated enforcer on this mission, given that he regularly steps in to rebut the one star reviewers on Amazon, deals with this in his head. I particularly enjoy the way he trots out the line that, before Mann’s hockey-stick, no one had put error bars around their climate reconstructions. He just cannot bring himself to admit that he is defending junk.

  114. AMac–

    Up top, you had declared this an open thread, which would seem to preclude ‘jacking.

    True. I forgot. 🙁

    If willard and/or his writing is to be a topic, it would seem more fair to invite him to participate in the dialog.

    My two cents.

    My plan is for willard and his writing to no longer be a topic. The filter is now active but wasn’t before.

    Eli

    that the only person with a valid opinion of whether McIntyre set a bot onto GISS was the server manager,

    This is dunderheaded even for you. Anyone knows a server manager can be mistaken about what a bot is. Others with expertise can read what he described and state a) whether or not it was a bot, and b) whether what the sysop claimed about violations of robots.txt could possibly be true. The sysop cannot change what robots.txt governs– it’s a standard!

    There were several thousand requests, and the problem was that to answer any one of them the GISS server required a few seconds of time, so that tied up the site, e.g. no matter what the intent of the scraper, it acted like a DOS attack.

    Even the sysop didn’t say it acted like a DOS attack (as you can see if you read the text you quoted). That he did not make the ridiculous claim you are making shows he’s not an idiot, because several thousand requests over the time period he describes doesn’t look like a DOS attack.

    He cut off an automated agent because it was hitting the cgi directories, was not from a .edu address, and the user agent didn’t tell him who it was. No one has faulted him for this. But that doesn’t mean you get to go around suggesting he said it looked like a DOS attack, nor that anyone who is informed about systems would think so.

  115. I made it to page 16 without finding any really problematic statements (it’s mostly stuff about Mann’s life and uncontroversial science, so no surprise there). However, page 16 offers (what to me seems to be) a doozy. On it, we find Figure 2.3, a graph showing Hansen’s 1988 model projections of what temperatures might be like in the future, compared to the observed temperatures. Its source is given as, “Source: Pearson Education, Inc., 2009.” The graph is basically like the one you can find in this RealClimate post, with one major exception. That post shows two temperature lines. One is labeled Station Data, the other Land-Ocean. The article says this about them:

    The former is likely to overestimate the true global surface air temperature trend (since the oceans do not warm as fast as the land), while the latter may underestimate the true trend, since the air temperature over the ocean is predicted to rise at a slightly higher rate than the ocean temperature. In Hansen’s 2006 paper, he uses both and suggests the true answer lies in between.

    This means the figure used in this book relies on the temperature line which is said to overestimate the true trend. On top of that, that graph ends in 2005, more than half a decade before this book was published. Had a more honest version of this graph been used, the “match” in the figure would be far, far worse. Of course, this version gives a “better” impression, so…

  116. Dishonesty aside, Rabett has now shown, quite conclusively, he has no idea how connecting to servers works.

    Indeed, another clueless individual made a similar bone-headed suggestion:

    If you do programmatic retrieval, it is usually considered good manners to insert a delay of at least 1 second between requests so as not to load the server.

    Another idiot offered a link purporting to present “performance hints,” and added the inane observation that

    While the execution time of a script can be significantly reduced by using faster CPUs, the invocation is disk I/O bound.

    What a maroon!

    Another fool said he

    slept for a full 4 seconds between each station request and ran it overnight on the weekends.

    Clearly driven beyond the endurance of any normal man, Mr McIntyre responded:

    In retrospect, I would have put a sleep instruction – BTW how do you do that?
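
    For anyone who, like Mr McIntyre, wonders how you do that: it’s one line. Here is a minimal sketch in Python; his actual scripts were in R, where Sys.sleep() does the same job. The endpoint and station IDs are made up for illustration.

        import time
        import urllib.request

        # Hypothetical station IDs and endpoint, purely for illustration.
        stations = ["5010116100000", "5010116150001"]

        for station in stations:
            url = "https://data.example.gov/cgi-bin/show_station?id=" + station
            with urllib.request.urlopen(url) as resp:
                data = resp.read()
            with open("station_%s.dat" % station, "wb") as f:
                f.write(data)
            time.sleep(4)  # wait four seconds before bothering the server again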

  117. diogenes:

    I will avoid. I wonder how Arthur Smith, who seems to be the designated enforcer on this mission, given that he regularly steps in to rebut the one-star reviewers on Amazon, deals with this in his head.

    I have no idea. My personal experience with Arthur Smith is extremely negative. We had a discussion on his blog where he asked for specific examples of certain types of behavior from Michael Mann. I tried to oblige, providing examples and sources which showed the claims were true. He basically responded by saying he didn’t have the knowledge to judge the issues (even though 30 minutes of reading with no prior knowledge would more than prove what I said true), but he would look into them. A year or two later, he posted saying how nobody was ever willing to give him the specific examples and/or references to back up criticism of Mann, and so those criticisms seemed baseless. I confronted him on this, referencing the previous discussion we had, and he refused to engage. In effect, he hid from the examples and references he asked for, then claimed he never got them.

    By the way, if you want to see how different Figure 2.3 could have looked, take a look at this post. Imagine how differently people reading this book would feel about it if they saw graphs like the ones in that post.

  118. Brandon – I am now stepping into Arthur Smith mode… where did Mann make a factual error? He was the first person to put error bars around his climate reconstructions.

  119. The level of defensive silly here is astounding

    The server manager was the one who cut off McIntyre’s ISP and he did so because he considered it to be acting as a robot would and thus concluded it was a robot and in violation of site policy. Does anyone dispute that? Thought so.

    The GISS server was not set up to handle Facebook’s traffic, and it did not need to have zillions of requests to force it to its knees. Some systems are like that without large capacity. You set up what you think you need and what you can afford.

  120. Eli Rabett:

    The server manager was the one who cut off McIntyre’s ISP and he did so because he considered it to be acting as a robot would and thus concluded it was a robot and in violation of site policy. Does anyone dispute that? Thought so.

    Does anyone dispute that Schmunk did that for those reasons? No. Is that relevant to the issue of whether or not Steve McIntyre used a bot? No. The issue isn’t what Schmunk thought. The issue is whether or not Schmunk was right. He wasn’t.

    The GISS server was not set up to handle Facebook’s traffic, and it did not need to have zillions of requests to force it to its knees. Some systems are like that without large capacity. You set up what you think you need and what you can afford.

    And what was set up was easily able to handle the requests. It was never forced to its knees. In fact, experimentation suggests it didn’t even experience any slowdowns.

    This isn’t complicated stuff, but until you stop making things up, you may have trouble understanding it.

  121. The GISS server was not set up to handle Facebook’s traffic, and it did not need to have zillions of requests to force it to its knees. Some systems are like that without large capacity. You set up what you think you need and what you can afford.

    I don’t think Eli even reads what people write. If he did, he wouldn’t be bringing up this hypothetical that suggests SteveMc was somehow requesting data at a rate that would require Facebook’s servers to support it. The amount of traffic Steve caused wouldn’t even bring my blog to its knees. It would take 20 times the traffic Steve caused to bring my blog to its knees. I pay less than $20/month for shared hosting.

    politely suggests that the Rabett stop digging…

    Rabetts dig. That’s what they do.

  122. Page 18 has a disturbing passage, though it doesn’t seem to be an example of Mann being dishonest. Instead, it seems to be an example of him being unable to do basic arithmetic. Before we get to that, Mann says on page 17:

    Furthermore, we could estimate that such an increase would lead to an additional warming of anywhere between 1.5 and 4.5°C (roughly 3-8°F).

    That’s unremarkable until you reach page 18, which says (the first line is missing “of,” but that’s not important):

    There was increasing recognition by the mid-1990s that another 2°C (3.5°F) [of] warming beyond current levels (for a total of 3°C or 5°F warming relative to preindustrial times) could represent a serious threat to our welfare.4 Precisely what limitations in global greenhouse gas emissions would be required to avoid that amount of warming remained uncertain, and still does, because of the spread of predictions among models. If we choose to take the midrange model estimates as a best guess, avoiding another 2°C of warming would require stabilizing atmospheric CO2 concentrations at no higher than about 450 parts per million (ppm).
    Preindustrial levels were about 280 ppm…

    Think about that. He claims the “midrange model estimates” say to avoid a total increase of three degrees, we would need to stabilize atmospheric CO2 at 450 ppm, a 60% increase over preindustrial times. Of course, you’ll remember the sensitivity range he gave is 1.5-4.5°C. The midrange of that is 3°C.

    That means Mann is saying if we assume the Earth’s sensitivity to a doubling of CO2 levels is 3°C, to avoid an increase of 3°C we must stabilize CO2 levels at 60% over the baseline…
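
    To spell the arithmetic out (a back-of-the-envelope using the standard rule that equilibrium warming scales with the logarithm of the concentration ratio):

        ΔT = S × log2(C/C0) = 3°C × log2(450/280) ≈ 3 × 0.68 ≈ 2°C

    A full 3°C at that sensitivity corresponds to a doubling, i.e. 560 ppm, not 450 ppm.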

    Edit: So that nobody thinks this is an inconsequential math error, I should point out Mann then goes on to spend half a page discussing CO2 levels, and what would be needed to stabilize CO2 levels at 450 ppm. He then spends a paragraph discussing people who think the goal needs to be even lower.

  123. The level of defensive silly here is astounding
    The server manager was the one who cut off McIntyre’s ISP and he did so because he considered it to be acting as a robot would and thus concluded it was a robot and in violation of site policy. Does anyone dispute that? Thought so.
    The GISS server was not set up to handle Facebook’s traffic, and it did not need to have zillions of requests to force it to its knees. Some systems are like that without large capacity. You set up what you think you need and what you can afford.

    ####################

    The server manager was the one who cut off McIntyre’s ISP and he did so because he considered it to be acting as a robot would and thus concluded it was a robot and in violation of site policy. Does anyone dispute that? Nope.

    That’s not the issue.

    1. The sysadmin said he thought the bot was INDEXING
    2. The number of requests and their timing would not bring a system to its knees.
    3. People like deltoid and you who accused steve of a crime need to be a bit more circumspect.

    A program apparently violated the site’s policies. A glance at what urls were being accessed would tell anyone what was being done: serial download of files, not recursive searching of the site for indexing. Everybody jumped to conclusions. The only people to persist in bad faith are eli and tim lambert.
    It wasn’t a DOS. Not in intent. Not in fact.
    Accusing steve of a crime is probably actionable. Word to the wise.

  124. Eli

    ” it acted like a DOS attack”

    Unfortunately, that is not what you and Lambert wrote. You wrote that it was a DOS attack, not that it “acted like one.”
    Further, it didn’t even act like a DOS attack.

  125. On page 24, I find it is important to check Mann’s references. He says:

    Spencer still contends, nonetheless, that humans are not to blame for the increase [in temperature],16 while Christy accepts that there is a detectable human contribution to the warming, but argues that future warming will be less than standard climate models project.17

    The first source to examine is 16. In it, Roy Spencer says:

    This means that most (1.71/1.98 = 86%) of the upward trend in carbon dioxide since CO2 monitoring began at Mauna Loa 50 years ago could indeed be explained as a result of the warming, rather than the other way around.

    So, there is at least empirical evidence that increasing temperatures are causing some portion of the recent rise in atmospheric CO2, in which case CO2 is not the only cause of the warming.

    Now then, you can disagree with Spencer about what he says, but there is no way to take the claim “CO2 is not the only cause of the warming” to mean “humans are not to blame for the increase [in temperature].” You could say Spencer contends humans are not to blame for all of the increase, but that’s not what Mann says. Instead, Mann exaggerates Spencer’s claim in order to paint him more negatively. This is blatant misrepresentation. Mann’s handling of reference 17 is no better. It says:

    In a phone interview, Christy said that while he supports the AGU declaration, and is convinced that human activities are the major cause of the global warming that has been measured, he is “still a strong critic of scientists who make catastrophic predictions of huge increases in global temperatures and tremendous rises in sea levels.”

    John Christy criticizes “catastrophic predictions of huge increases in global temperatures.” Michael Mann uses this to say Christy “argues that future warming will be less than standard climate models project.” Unless Mann intends to claim standard climate models give catastrophic predictions of huge increases in global temperatures, he is misrepresenting his source.

  126. I don’t know how much people care to read what I’ve posted about this book, but there is one thing I think most people will be shocked to see. From page 51:

    The tests revealed that not all of the records were playing an equal role in our reconstructions. Certain proxy data appeared to be of critical importance in establishing the reliability of the reconstruction–in particular, one set of tree ring records spanning the boreal tree line of North America published by dendroclimatologists Gordon Jacoby and Rosanne D’Arrigo.

    In effect, Mann admits the hockey stick was dependent entirely upon a small amount of tree ring data. This has been a major criticism for years, and Mann and his supporters have long disputed it, yet here Mann admits it is true as though it were unremarkable. Compare it to what he said in MBH98:

    On the other hand, the long-term trend in NH is relatively robust to the inclusion of dendroclimatic indicators in the network

    I wonder if Mann even realizes what he said.

  127. > Certain proxy data appeared to be of critical importance in establishing the reliability of the reconstruction…


    Matt Skaggs (Feb 15, 2012 at 1:30 PM)

    Varvology and dendrochronology are essentially exercises in reverse engineering of natural processes. As such, the inherent weakness of reverse engineering applies: the more interesting a feature, the less likely it is to be properly understood, and the more likely it is to be a contingency of history. And there is a similar law for reverse engineers: the more confident you are in your reverse engineering efforts, the less likely you are to have ever had any conclusive confirmation that you were right or wrong…

  128. I’m trying to reconcile two passages from Mann’s book. So far, I’ve had no luck. To me, they seem to contradict each other, but perhaps someone can tell me I’ve missed something. On page 34:

    Seemingly quite relevant to the issue of whether modern warming might be natural in cause, at least in substantial part, was the purported existence of a period in our not-so-distant preindustrial past characterized by warmth rivaling that of the present day. Evidence of such warmth would not, in and of itself, necessarily contradict a human role in the current warming; after all, that proposition, as we have seen, is based on multiple lines of evidence. However, if warmth less than a thousand years ago rivaled modern warmth, it might seem to support a far larger role for natural climate variability, and the possibility that a large fraction of the current warming could itself be natural.

    On page 57:

    It was indeed possible that other natural factors, be they changes in solar output or volcanic activity, could have led to conditions that were as warm as today.

    Whether conditions in past centuries might have been warmer than today, then, would not have a scientific bearing on the case for the reality of human-caused climate change. That case, as we’ve seen, rests on multiple independent lines of evidence…

    As though those two passages aren’t confusing enough, lower on page 57 he says:

    Our finding that recent warming was anomalous in a long-term (now, apparently, millennial) context was suggestive of the possibility that human activity was implicated in the warming.

    If he accepts natural factors could have caused comparable warmth in the past millennium and claims it would be irrelevant if such were the case, how could his findings possibly support this conclusion?

  129. Sounds as though Prof Mann has belatedly figured out that the more ambiguous and vague the statement, the lower the odds that unseemly future developments will prove that you were wrong, completely wrong.

    A touch of internal inconsistency could actually be helpful, given the prevalence of quote mining. Highlight the version that’s best withstood the vagaries of time.

  130. Chapter 5, The Origin of Denial, is basically what you’d expect it to be. I won’t even try to assess the accuracy of it given how much research I’d need to check, as well as the general irrelevance of it. However, a couple things stand out:

    Some of the amateurs [were] more than willing to engage in some degree of mischief, whether it be taking advantage of the IPCC open review process by flooding its authors with countless frivolous comments (each of which must be responded to according to IPCC rules) or exploiting the Freedom of Information Act (FOIA) and related laws to launch frivolous requests for documents and private correspondence of scientists.

    The myth of FOIA flooding seems to be accepted by Mann without reservation. I am a little confused though. I don’t recall the IPCC review process being open before AR5. My understanding was reviewers were invited. Regardless, the complaint Mann raises about it is silly. Responses to legitimate comments were often no more than one or two words (often dismissing comments without any explanation), so they couldn’t be too burdensome. More importantly though, we get the first reference to McIntyre:

    Today, much of the trench warfare takes place on the Internet. Former mining industry consultant Stephen McIntyre is especially well known for his broadsides against established climate science. McIntyre frequently uses his Web site climateaudit to launch attacks against climate scientists themselves, often leveling thinly veiled accusations of fraud and incompetence–once, for example, titling a post about a highly respected NASA climate scientist with the rhetorical question “Is Gavin Schmidt Honest?”60

    I find this fascinating for two reasons. First, while Mann refers to that blog post, he doesn’t address anything in it. It seems remarkable to condemn a person for making a criticism without actually disputing the criticism. Second, Mann shows how biased he is in his reference list. 60 is listed as:

    Stephen McIntyre, “Is Gavin Schmidt Honest?” climateaudit, October 29, 2005.

    Dozens of other references in the chapter have a URL listed. Every blog post discussed has a URL listed. The only time Mann doesn’t provide a URL seems to be when referring to McIntyre. Could there be a more glaring example of bias?

    Or maybe I should be less cynical. Maybe Mann isn’t being biased. Maybe he just doesn’t want people reading the blog post to see it is completely appropriate…

    Edit: I just read the next sentence (it’s in another paragraph, on the next page), and it’s a doozy:

    Since then, a number of other amateur climate change denial bloggers have arrived on the scene.

    Because Mann had referred to McIntyre immediately before this, “other amateur climate change denial bloggers” can only be taken to mean McIntyre is an “amateur climate change denial blogger.”

    How can anyone read this book and give it a glowing review?

  131. Brandon,
    About the “midrange model estimates,” is 2°C given as the central value of estimated ranges in any of his references?

  132. AMac:

    Sounds as though Prof Mann has belatedly figured out that the more ambiguous and vague the statement, the lower the odds that unseemly future developments will prove that you were wrong, completely wrong.

    A touch of internal inconsistency could actually be helpful, given the prevalence of quote mining. Highlight the version that’s best withstood the vagaries of time.

    I haven’t noticed any change about how clearly Mann says things. He seems to be extremely clear at times (often while making explicit statements which are wrong), and incomprehensible at others, but it seems mostly random which he’ll be at any given point. Though you are right that inconsistency could help him at times. I know it’s been used by others to some effect (especially in defending the obstruction of data releases).

    Oliver:

    Brandon,
    About the “midrange model estimates,” is 2°C given as the central value of estimated ranges in any of his references?

    Nope. In fact, the passage I quoted is the only time he lists a range of sensitivity (at least, in that chapter), and he does so without providing a reference.

  133. I know it’s just a minor typo, but still (page 108):

    Five years after the IPCC Third Assessment Report finding of a “discernible human influence” on the climate, during the run-up to the 2000 presidential election, one of the two candidates said…

    Obviously that should be “the IPCC Second Assessment Report.”

  134. Re: Brandon Shollenberger (Comment #89885)

    That’s interesting. I just brought it up because of the oft-quoted (e.g., IPCC) ranges of 1.5-4.5°C/doubling with central value around 2°C. But why wouldn’t Mann bolster it with references or further explanation?

  135. Oliver, I wouldn’t try to guess why Mann does what he does. For example, in reference 45 of chapter eight, he flat-out lies:

    those claims were false, resulting from their misunderstanding of the format of a spreadsheet version of the dataset they had specifically requested from my associate, Scott Rutherford. None of the problems they cited were present in the raw, publicly available version of our dataset…

    This is a complete fabrication with no basis in reality. It’s a myth created out of thin air which contradicts readily available evidence. Not only that, it’s a myth created years ago, one which has repeatedly been shown to be false. There is no way Michael Mann should not know how completely and utterly wrong it is.

  136. Oliver (#89890) —
    AR4 said “climate sensitivity is likely to be in the range of 2 to 4.5°C with a best estimate of about 3°C.”

    Brandon (#89873)–
    Mann’s statement — “Our finding that recent warming was anomalous … was suggestive of the possibility that human activity was implicated in the warming.” — reduces the conclusion’s certainty to merely “suggestive,” which sounds reasonable to me. [Assuming you accept there’s anything at all in these reconstructions.] MBH99’s conclusion that “1998 [was] the warmest year [of the past millennium], at moderately high levels of confidence” is too certain. I’m assuming that Mann is not backing down from such claims.

  137. Brandon —
    By the way, thanks for your notes on Mann’s book. Much appreciated, as I’m unlikely to read it myself.

  138. HaroldW, glad to. I wanted to take note of some things for personal reference, and I figured people here might have some interest in it. If nothing else, it should give an alternative view to those glowing reviews the book got.

    By the way, there are a number of issues I haven’t raised. They’re usually more minor or involve things which aren’t quick and easy to check. I figure there’s enough to discuss without covering the more complicated things. I mean, I’m not even 150 pages in, and look what I’ve seen.

  139. …”other amateur climate change denial bloggers” can only be taken to mean McIntyre is an “amateur climate change denial blogger.”

    Well, in the old, strict sense of the word he is an amateur, since he doesn’t get paid for it. The old word for the thing he isn’t is “tyro,” which I’d love to see back in common use. (“Denial” doesn’t fit in either case, I know.)

  140. As we’ve been discussing robots.txt, I thought I would point out an entry from a really stupid probably-robot:

    173.245.53.249 - - [15/Feb/2012:02:16:13 -0800] "GET /robos.txt HTTP/1.1" 404 391 "-" "Mozilla/5.0 (someone@somewhere.any)"

    That’s a cloudflare IP. The real one is 213.214-14-84.ripe.coltfrance.com

    Whoever “someone@somewhere.any” is, he needs to elevate the IQ of his bots by fixing the typo and changing “robos.txt” to “robots.txt”. Assuming the script-kiddy sent his robot to more than one site, that poor probably-bot must be wondering why everyone 404’s his requests.
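
    If anyone wants to pick entries like that out of their own logs, the combined log format is easy to dissect. A minimal sketch in Python; the field names are mine:

        import re

        # Pull the interesting fields out of an Apache combined-format log line.
        LOG = re.compile(
            r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
            r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
            r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
        )

        line = ('173.245.53.249 - - [15/Feb/2012:02:16:13 -0800] '
                '"GET /robos.txt HTTP/1.1" 404 391 "-" '
                '"Mozilla/5.0 (someone@somewhere.any)"')

        m = LOG.match(line)
        if m:
            print(m.group("ip"), m.group("path"), m.group("status"), m.group("agent"))
            # -> 173.245.53.249 /robos.txt 404 Mozilla/5.0 (someone@somewhere.any)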

  141. In my last reaction to the book, I commented on Michael Mann repeating a lie (in his notes section) that McIntyre and McKitrick had requested his data in a spreadsheet. Let me pick up there, referring now to the part of Chapter 8 that note is attached to (page 123):

    The central claim of the McIntyre and McKitrick paper, that the hockey stick was an artifact of bad data, was readily refuted.45

    Now then, what follows that sentence is as false as that sentence itself. However, discussing such would involve discussing some technical details. To avoid that, I’ll simply focus on this sentence. Compare it to this sentence from the abstract of the 2003 paper Mann refers to:

    The particular “hockey stick” shape derived in the MBH98 proxy construction – a temperature index that decreases slightly between the early 15th century and early 20th century and then increases dramatically up to 1980 — is primarily an artefact of poor data handling, obsolete data and incorrect calculation of principal components.

    Notice the part I made bold. McIntyre and McKitrick explicitly state their conclusion is based on issues with the data, and issues with the calculations using that data. Mann simply ignores a major part of the MM paper in his portrayal (this is also tied to the more technical part which follows). But it gets worse. Later in the very same paragraph:

    For the time being, climate change deniers had everything they needed to do immediate damage. They had a published study purporting to call into question the basis of the scientific evidence for human-caused climate change…

    Mann says the paper purports to call the basis of global warming into question. That’s absurd. Nothing McIntyre has ever written purports to do that. Mann is just pulling things out of thin air (to phrase it politely) in order to create a false impression of McIntyre’s work.

  142. Following from my last comment, Chapter 9 has Mann making this claim (Page 130):

    McIntyre and McKitrick had quietly dropped their erroneous original assertion (in their 2003 paper discussed in chapter 8) that the hockey stick was an artifact of bad data. Their new, albeit equally erroneous, assertion was that the hockey stick was an artifact of the conventions used in applying principal component analysis (PCA) to certain tree ring networks, which, they argued, “manufactured Hockey Sticks” even from pure noise.

    Of course, as I discussed just above, Mann is flagrantly misrepresenting McIntyre and McKitrick’s work. Their earlier paper did not claim the “hockey stick was an artifact of bad data”; rather, it claimed bad data plus faulty calculations were responsible for the hockey stick. In the new paper, McIntyre and McKitrick had simply focused on the faulty calculations, as they were the key problem.

    Moreover, he misrepresents what they said about his implementation of PCA. McIntyre and McKitrick discussed red noise, not pure noise. Red noise is a type of noise which exhibits autocorrelation, and it is distinct from pure (white) noise.
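
    For anyone unclear on the distinction, it is easy to see in a few lines. A toy sketch in Python, with illustrative parameters (an AR(1) process is just the simplest red noise; per the Wahl and Ammann passage quoted later in this thread, MM05 actually modeled noise on the proxies’ complete autocorrelation structure):

        import numpy as np

        rng = np.random.default_rng(0)
        n, rho = 581, 0.9               # length and lag-1 persistence (illustrative)

        white = rng.standard_normal(n)  # "pure" (white) noise: each value independent
        red = np.zeros(n)               # red noise: AR(1), each value remembers the last
        for t in range(1, n):
            red[t] = rho * red[t - 1] + np.sqrt(1 - rho**2) * white[t]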

    And again, I want to point out this book got glowing reviews.

  143. Re: HaroldW (Comment #89892)

    Oliver (#89890) –
    AR4 said “climate sensitivity is likely to be in the range of 2 to 4.5°C with a best estimate of about 3°C.”

    You are quite right. My excuse is that I was probably confused. Either that or I was wondering why Mann had suddenly embraced Forster and Gregory’s original 2006 sensitivity estimate and range. 🙂

  144. Re: Brandon Shollenberger (Comment #89891)

    Oliver, I wouldn’t try to guess why Mann does what he does.

    I suppose that’s wisest.

  145. The next little bit of the book contains a discussion of work done decades ago in another field, and I won’t dwell on it. However, I do want to point out there is some irony in it as Mann repeats a common view of that work (advanced by Gould), and in doing so, makes a number of false claims.

    More importantly, we have another case of what can only be considered a lie from Michael Mann. In a discussion of the PCA issue, Mann explains how an “objective” selection criterion is needed. However, in his reference 18, he writes:

    In our case, we use a criterion known as Preisendorfer’s Rule N

    This is a canard. Preisendorfer’s Rule N was never mentioned in Mann’s work or his supplementary material. The first reference to it only came about years later, after McIntyre and McKitrick had criticized Mann’s work. It’s hard to imagine a more blatant example of post-hoc reasoning, though this one is actually built on a lie.

    Not only is there no evidence to show it was actually used, there is ample evidence to show it could not possibly have been used. Proving this requires calculations, and I obviously won’t be doing them here, but for a reference, read this post.

    Put bluntly, Mann’s claim is certainly false, and there isn’t the slightest reason he should have ever believed it to be true. The falsity of it has been long-established, but despite this, he repeats it. There is no way to consider it anything other than a blatant lie.
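
    For readers unfamiliar with the criterion itself: as commonly described, Rule N retains a principal component only if its eigenvalue exceeds what pure noise of the same dimensions would produce. A generic sketch in Python (the dimensions are illustrative, not Mann’s data; see the linked post for the real calculations):

        import numpy as np

        rng = np.random.default_rng(0)

        def spectrum(data):
            """Normalized eigenvalues of the columns' correlation matrix."""
            vals = np.sort(np.linalg.eigvalsh(np.corrcoef(data, rowvar=False)))[::-1]
            return vals / vals.sum()

        n_obs, n_series, n_trials = 79, 70, 1000       # illustrative dimensions
        data = rng.standard_normal((n_obs, n_series))  # stand-in for real proxy data

        # Monte Carlo benchmark: eigenvalue spectra of noise-only datasets
        noise = np.array([spectrum(rng.standard_normal((n_obs, n_series)))
                          for _ in range(n_trials)])
        cutoff = np.percentile(noise, 95, axis=0)      # 95th percentile per rank

        # Rule N: retain a PC only if it explains more variance than noise would
        retained = np.flatnonzero(spectrum(data) > cutoff) + 1
        print("PCs retained:", retained)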

  146. The sheer volume of false claims in this section is overwhelming. Earlier I didn’t mention one false claim he made because it’s sometimes hard to pick out which ones to discuss. Back when he said McIntyre and McKitrick’s earlier paper only claimed Mann used bad data, he said they got their results by throwing out large amounts of the data he had used. This was untrue, but I focused on the other misrepresentation of their work. However, it comes up again in this section where Mann claims McIntyre and McKitrick effectively did the same thing in a 2005 paper (page 138):

    They applied retention criteria that we had obtained using our convention (modern centering) to PCs calculated from the tree ring data based on a different convention (long-term centering). Through this error, they eliminated the key hockey stick pattern of long-term variation.20 In effect, McIntyre and McKitrick had “buried” or “hidden” the hockey stick. They had chosen to throw out a critical pattern in the data as if it were noise, when an objective analysis unambiguously identified it as a significant pattern.21 It was essentially the same error they had committed in their 2003 paper, wherein the key proxy data were simply thrown out22–it’s just that here, they were thrown out in a way that was not as obvious.

    In relation to McIntyre and McKitrick’s 2005 Geophysical Research Letters (GRL) paper, the paper Mann referenced, this paragraph makes basically no sense. The GRL paper was fairly short, only a few pages long, and it didn’t do anything this paragraph could be referring to. However, McIntyre and McKitrick published another paper that same year in Energy & Environment (E&E). This paper was much longer, and it is clearly what Mann is referring to here. He has simply conflated the two papers. Now then, before discussing the part I made bold, I want to offer two sentences from Mann’s note at 20:

    Using their long-term centering, the hockey stick pattern emphasizing the high-elevation western North American tree ring data was no longer PC#1, but was demoted to PC#4.

    They simply assumed that our selection rule–keeping the first two PCs only–that had been derived based on a modern-centering convention, could be applied to their results.

    All three of these are false. McIntyre and McKitrick did not assume any particular selection rule. Indeed, they reported exactly what needed to be included to get a hockey stick:

    If a centered PC calculation on the North American network is carried out (as we advocate), then MM-type results occur if the first 2 NOAMER PCs are used in the AD1400 network (the number as used in MBH98), while MBH-type results occur if the NOAMER network is expanded to 5 PCs in the AD1400 segment (as proposed in Mann et al., 2004b, 2004d). Specifically, MBH-type results occur as long as the PC4 is retained, while MM-type results occur in any combination which excludes the PC4. Hence their conclusion about the uniqueness of the late 20th century climate hinges on the inclusion of a low-order PC series that only accounts for 8 percent of the variance of one proxy roster.

    You’ll note that McIntyre and McKitrick’s results regarding PC calculations are identical to Mann’s. They reported exactly what he reports in his book. The only difference is Mann falsely claims to have used Preisendorfer’s Rule N, and falsely claims that rule would lead to the inclusion of PC#4.

    Just to reiterate, despite the fact they said the same thing he said, Mann claims McIntyre and McKitrick buried and hid the hockey stick. Think about that.

  147. Still on page 138, we find more false claims. I’m not going to discuss what Mann says about the Wahl and Ammann paper because doing so would require discussing technical issues (suffice to say his claims are wrong, in no small part because of issues with the paper). However, even with this limitation, there is much to say about one paragraph:

    They showed that, had McIntyre and McKitrick subjected their alternative reconstruction to the statistical validation tests stressed in MBH98 and MBH99 (and nearly all related studies), it would have failed these critical tests. McIntyre and McKitrick, in short, had not only failed to reproduce the hockey stick by eliminating key data, but their own results, unlike the MBH98 reconstruction, failed standard statistical tests of validity. (Wahl and Ammann provided all data and code used online so interested individuals could check any of these findings for themselves.)

    First, look at the portion I made bold. Mann actually claims McIntyre and McKitrick offered an “alternative reconstruction.” It’s hard to imagine a more absurd claim. McIntyre and McKitrick have never claimed to offer anything of the sort. All they’ve done is sensitivity tests, seeing what happens if you change certain things. Mann inexplicably ignores this obvious fact. Again, I can see no interpretation other than it being a blatant lie.

    Second, Mann claims the “alternative reconstruction” (which doesn’t actually exist) would fail statistical validation tests. This is a remarkable claim given the fact Mann’s own reconstruction failed R2 validation. Despite the fact he knew this, and the fact he published positive R2 scores for the 1820 step of his reconstruction, he never mentioned this failure in his work.

    Third, and finally, he praises Wahl and Ammann for making things available online. Wahl and Ammann originally refused to publish the R2 validation scores, the ones which Mann’s reconstruction failed. It was only after a formal complaint was filed that they included this information.

    Oh, I need to correct something. This isn’t all in one paragraph. This was all in half of one paragraph. Yeah, there’s no way anyone could read this book with an open mind and honestly give a glowing review like those found on Amazon.

  148. Brandon – I appreciate your efforts here. It must be very difficult to read this book without getting enraged.

  149. Brandon (#90096)
    I’d give Mann a pass on his statement that MM argued that hockey sticks could be produced from pure noise via the MBH algorithm. You’re correct that MM used red noise rather than white noise, but my take on the expression “pure noise” is that it refers to the absence of any “signal” component in the data series, rather than the spectral properties of the noise.

    But why does Mann consider that assertion to be “erroneous”?

  150. …and a second to diogenes’ comment above. When you’re through, it would be cool to combine your criticisms in a post here. Each item numbered & with page refs for discussion. I’m sure there are Mann defenders who would make the discussion interesting.

  151. diogenes:

    Brandon – I appreciate your efforts here. It must be very difficult to read this book without getting enraged.

    The book doesn’t really bother me in and of itself. I’ve become numb to Mann spouting off nonsense over the years. Unfortunately, I haven’t got past the annoyance/anger at people actually listening to him. If I knew people didn’t support Mann’s book, it wouldn’t bother me, and I’d get through it much more quickly. Unfortunately, I keep having to take breaks because I get too frustrated knowing these things are being promoted as the truth.

    HaroldW:

    I’d give Mann a pass on his statement that MM argued that hockey sticks could be produced from pure noise via the MBH algorithm.

    As a single statement, I’d give him a pass on it too. I’d consider it little more than a typo. It’d be worth mentioning, but only to correct it. However, given the massive misrepresentations I’ve found Mann making in this book, I can’t just dismiss this as a meaningless mistake.

    But why does Mann consider that assertion to be “erroneous”?

    Discussing the validity (or supposed lack thereof) of that would require getting into technical details, and right now, I’m trying to avoid that. I’m bogged down enough on simple misrepresentations without dwelling on mathematics.

    That said, Mann basically claims McIntyre and McKitrick didn’t use proper noise simulations, and that’s why they got the results they got. In note #25, he says of it:

    This was first shown by us in the RealClimate post “On Yet Another False Claim by McIntyre and McKitrick,” January 6, 2005, and later verified by Wahl and Ammann.

    I’ve read both the RealClimate post and the Wahl and Ammann paper, but to the best of my knowledge, neither discusses the supposed error. The RealClimate post disputes McIntyre and McKitrick’s finding, but it doesn’t ever claim they made the miscalculation the book claims they make. Wahl and Ammann actually say the exact opposite:

    The method presented in MM05a generates apparently realistic pseudo tree ring series with autocorrelation (AC) structures like those of the original MBH proxy data (focusing on the 1400-onward set of proxy tree ring data), using red noise series generated by employing the original proxies’ complete AC structure.

    Given that, this seems to be a complete fabrication on Mann’s part. This isn’t new though, as he made the same claim to Congress. You can find more information about that here.

    On an unrelated note, looking over this material reminded me of a funny claim by Mann some time back. In the comments of that RealClimate post, Mann has an inline response to a commenter which says:

    Even without technical training or a statistical background, you should have an adequate basis for discerning which of the two parties is likely wrong here. Only one of the parties involved has (1) had their claims fail scientific peer-review, (2) produced a reconstruction that is completely at odds with all other existing estimates (note that there is no sign of the anomalous 15th century warmth claimed by MM in any of the roughly dozen other model and proxy-based estimates shown here), and (3) been established to have made egregious elementary errors in other published work that render the work thoroughly invalid. These observations would seem quite telling. -mike

    Now, I’ve already discussed the nonsense of his point 2, so just look at his point 3. If you go to that page, you’ll see the link he provides (I don’t want to add more links to this comment as it might trip moderation) is to an article discussing a paper by McKitrick and Michaels. That’s right. Mann tells people they should think McIntyre and McKitrick are wrong because their names start with the same letters as McKitrick and Michaels…

    …and a second to diogenes’ comment above. When you’re through, it would be cool to combine your criticisms in a post here. Each item numbered & with page refs for discussion. I’m sure there are Mann defenders who would make the discussion interesting.

    I’d be happy to put everything into a single document, but I don’t think that would work well for a blog post. There is a lot of material, and I don’t think it could all fit. That said, I could perhaps pick a number of issues I’ve seen and discuss those in a blog post. If that’s something people would be interested in, I’d be happy to do it (though it’s up to lucia as to whether or not she’d want to host it).

  152. Brandon – I suggest you ask Lucia, but she probably does not want to seem to take sides. Climate Audit is probably the wrong place to get your views across. Maybe Bishop Hill or perhaps jeff’s site.

    A rational person would have taken the chance to say, yes, I have made mistakes but I think this is what the science reasonably reflects.

    To come out all guns firing, torpedoes shooting, full steam ahead, is the act of a total idiot. From what I can gather, that is what Mann has done… what a f…wit. His attitude seems to be: I know more maths than guys with more maths qualifications than me.

  153. AMac:

    Another vote of thanks, Brandon. You write and we’ll read and comment.

    I’m glad to do it. It’s nice to see some others are interested in it, and since I would be taking these notes regardless, it’s no extra burden (well, my notes would be less clear and structured, but improving on that is already a worthy goal).

    diogenes:

    Brandon – I suggest you ask Lucia, but she probably does not want to seem to take sides. Climate Audit is probably the wrong place to get your views across. Maybe Bishop Hill or perhaps jeff’s site.

    I’ve had almost no interaction with any blogger you mention aside from Lucia, so I’m uncertain about where I’d be best off looking to post something like that. I think my best bet would be to type up a “rough draft” of a post, then send that to a few bloggers and see what they think about hosting it. If that didn’t pan out, I could always generate a PDF file and make it available.

    Of course, before doing anything else, I want to finish the book. It’s taking longer than I’d anticipated, but I do only have a hundred pages left (the main part of the book is only 258 pages).

    By the way, I’m starting to think I shouldn’t have ever said Mann lied. While I cannot fathom any other explanation for a number of things, it’s not as though I can read his mind. The only other possibility I can think of is he is delusional, and comparatively, calling him a liar seems kind. Still, I don’t like making claims about a person’s thought processes. What do you guys think?

  154. Chapter 11 contains a sentence I find hilarious, though it has little bearing on Mann himself. Mostly, it speaks to the extreme bias of his book. On page 160:

    The [National Academy of Science] report was subjected to the Academy’s extremely rigorous peer review process (there were thirteen peer reviewers and two review editors).

    If that report was subjected to “the Academy’s extremely rigorous peer review process,” I can’t imagine any more damning thing to say about the NAS. Gerry North, the chair of the panel responsible for the report, said they “didn’t do any research.” In fact, they just got 12 “people around the table” and “took a look at papers.” Most clearly, he said they “just kind of winged it.” If Mann considers that an “extremely rigorous peer review process,” I have to wonder at the purpose of paying attention to peer-reviewed material over blog postings.

    For anyone who wants to verify that North actually said these things, check this post for a link to his recorded words (be warned, there is about an hour of speaking before these comments). In the meantime, consider how Mann describes the Wegman Report:

    The report had no backing from any recognized scientific organization, there was no evidence of attempts to solicit input from leading scientific experts, and only lip service was paid to the idea of formal peer review.

    I’m at a loss as to how paying “only lip service” to “formal peer review” is worse than if they had gotten some “people around the table” and “just kind of winged it.” If anything, it seems to me avoiding the “extremely rigorous peer review process” used by the NAS would be a good thing!

  155. Brandon, I take it as a general principle to not try and impugn the motives of another person. Unless you have definitive proof of his state of mind at the time he wrote it…for example if he admits he’s lying in the book, that pretty well settles it.

    The story is similar with Peter Gleick: when he admits using misleading terms and encouraging others to do likewise in order to “educate” the public, he’s not only admitting to lying, he’s admitting that he’s using falsehoods as propaganda tools.

    I’m not sure Mann fits into that category. (If you find emails where he admits to this, that is a smoking gun.)

  156. Carrick:

    Brandon, I take it as a general principle to not try and impugn the motives of another person. Unless you have definitive proof of his state of mind at the time he wrote it…for example if he admits he’s lying in the book, that pretty well settles it.

    I understand that principle, but I don’t agree with it. Take, for example, Wahl and Ammann. Their paper repeated the claim McIntyre and McKitrick had made an “alternative reconstruction.” Not only was this claim obviously untrue, McIntyre had specifically told one of the authors it was untrue prior to them publishing their paper.

    There is no “definitive proof” Wahl and Ammann lied in their paper. It is conceivable they somehow wrote a paper, got it published, and yet somehow failed to understand the most basic point of the work they were discussing. It’s also conceivable they did this despite having that point explicitly told to them. That is conceivable.

    But it’s also bull. At a certain point, you have to stop trying to make up excuses and go with the obvious. I don’t see why I should refrain from calling a person dishonest simply because there is some off-chance the person is actually so delusional they believe things they’ve completely fabricated out of thin air.

    By the way, Chapter 11 can be fairly summarized as, “The NAS report says we’re right. The Wegman report was wrong, and everyone siding with it should be mocked.” It’s so lacking in substance that it’s hard to find anything in it to comment on. About the only notable part is Mann says (multiple times) the issues raised by the Wegman report and McIntyre were inconsequential, and neither discussed the impact the issues had on Mann’s work. Not only is that untrue, it’s so ridiculous nobody with an open mind would find it easy to believe. I should probably write more about the matter, but eh…

  157. Page 181 brings us another massive misrepresentation from Michael Mann:

    A study in 2003 by NOAA scientist Tom Peterson and collaborators indicated that the cool park effect largely mitigates any urban heat bias in the U.S. measurements.

    The paper did nothing of the sort. In fact, the abstract of the paper states:

    It is postulated that this is due to micro- and local-scale impacts dominating over the mesoscale urban heat island.

    To postulate is basically to assume as true without proof. Despite that, Mann claims Peterson “indicated” what the paper merely postulated. Apparently, if you say something Mann likes, you prove it is true. More interestingly, Mann says:

    There were even more basic reasons for rejecting the claim that the surface temperature record was compromised by urban heat island effects. The global warming trend is seen not only in land measurements but also in ocean surface temperatures, where obviously no urbanization is occurring.12 The ocean warming isn’t as large as the observed land warming, but this is expected from basic physics and predicted by all climate models…

    So, Mann claims nobody should believe the temperature record is influenced by UHI because the ocean is also warming. He then says the ocean is warming at a slower pace…

  158. Page 183 brings a paragraph I won’t comment on. I will merely restate it. It begins:

    Now, we come to the fourth and final pillar: that warming, even if it is taking place, might very well be a natural occurrence, independent of the effects of human activity. Support for the argument that climate change could be natural was said to be provided by a medieval warm period as warm as, or warmer than, today. Such a period of comparable warmth in the not-so-distant past, the logic goes, would imply that modern warming could be natural, too. The logic is flawed, however: The mere existence of a past warm period says nothing about the cause of the current warming.

    The logic is flawed because it doesn’t do something it wasn’t trying to do.

    But the finding was hardly necessary to render implausible the argument that natural variability could account for modern warming. To reach that conclusion, it was sufficient to show, as many studies now did, that models could not reproduce the anomalous warming of the past century from natural factors alone.

    If models cannot explain the warming with only natural factors, it’s implausible natural factors are causing the warming. It doesn’t matter if there was a different period with comparable warming due to natural factors. It doesn’t matter if the explanation for that comparable warming is unknown, and thus, possibly not covered by the models. The models say humans are to blame.

  159. Page 186 has a brief mention of Douglass et al. (2008). Since the details of this issue involve technical details, I’ll just briefly summarize the situation. If anyone wants more details, I’d be happy to provide them once I’ve finished going through the book.

    Mann cites a RealClimate post as showing the paper’s conclusion “arose from a simple misunderstanding of the concept of statistical uncertainty.” This is nonsense. In reality, that is more true of the RealClimate post (which even contradicts the IPCC’s position on how to handle a key issue). He then refers to a paper by Ben Santer and others as proving Douglass’s paper was wrong, ignoring the fact Santer’s paper was contradicted by another paper not too much later.

    In other words, Mann takes the arguments he likes and praises them as being absolutely right. He then dismisses or ignores any contrary arguments.

  160. It seems Mann’s plan for this book is to repeat practically every false claim he has ever made. Presumably, he figures most people who read the book won’t discover how absurdly wrong the things he says are. Those who already know the claims are false before reading the book won’t change their opinion. Those who don’t already know will mostly just believe him. Only a small number won’t already know the truth but will have the interest to check his claims. Because of that, he gets to make all the claims he likes part of the “public record.”

    For example, on page 190, Mann once again insists his handling of the Tiljander series was not wrong:

    Stephen McIntyre wasted little time in launching a series of attacks on the PNAS paper, employing–it would seem–the strategy of throwing as much mud against the wall as possible and hoping that some would stick. Teaming up with his former coauthor Ross McKitrick, he submitted a short letter to the editor of PNAS claiming that our reconstruction used “upside down proxy data.”52 That was nonsensical, as we pointed out in our response,53 one of our methods didn’t assume any orientation, while the other used an objective procedure for determining it.54

    And because he insists it isn’t wrong, he says:

    we were able to obtain a meaningful reconstruction of the Northern Hemisphere average temperature for the past thirteen hundred years without using tree ring data at all.

    I’ll comment on this more later, but I just realized what time it is. I need to try to get some sleep as I’m going to be out and about for quite a bit of the day. In the meantime, AMac should feel free to chime in as he can probably explain the situation better than I can (I always have trouble remembering the details of what is true of CPS as opposed to EIV).

  161. In Comment #90427, Brandon Shollenberger quotes from page 190 of Michael Mann’s new book:

    > Ross McKitrick [and McIntyre] submitted a short letter to the editor of PNAS claiming that our reconstruction used “upside down proxy data.”52

    That sentence is correct. The relevant paragraph of the five comprising the letter is short enough to be quoted in its entirety:

    [Mann08’s] non-dendro network uses some data with the axes upside down, e.g., Korttajarvi sediments, which are also compromised by agricultural impact (M. Tiljander, personal communication), and uses data not qualified as temperature proxies (e.g., speleothem δ13C).

    1. Did Mann08’s non-dendro network use some data with the axes upside-down? Yes.

    2. Are the Lake Korttajarvi sediments an instance of this? Yes.

    3. Are the Lake Korttajarvi sediments compromised by agricultural impact? Yes, after 1720.

    4. Did Mann08’s non-dendro network use data not qualified as temperature proxies? Likely. The Lake Korttajarvi sediments may themselves be an example. Speleothem δ13C time series and monsoon-influenced ocean sediments are probably other instances — but I haven’t looked into them.

    I’ll expand on point 1 for any reader of Shollenberger’s review who is new to the Tiljander saga (Blackboard regulars have heard it all before). I was dismayed by the climate science community’s ready acceptance of Prof Mann’s fantastical excuses for what was obviously a careless series of mistakes. I ended up creating a blog to ease access to sources, data, and analysis concerning this issue. To explore Tiljander-related questions in more depth, go there.

    Prof Mann’s defenses of Mann08’s use of the Tiljander data series are phrased in such a way that the technical meaning of what he says cannot be clearly understood.

    The earliest example of the “defense by incomprehensibility” tactic came in Mann et al’s Reply to McIntyre and McKitrick. They wrote

    The claim that “upside down” data were used is bizarre. Multivariate regression methods are insensitive to the sign of predictors. Screening, when used, employed one-sided tests only when a definite sign could be a priori reasoned on physical grounds.

    This passage is only a defense of Mann08 to the extent that it confuses the reader, by conflating Mann08’s (faulty) implementation of the CPS procedure with its (faulty) implementation of EIV. In actuality, both CPS and EIV employ two of the four Tiljander series in upside-down orientations (a third is rightside-up and the fourth is indeterminate). However, each arrives at its incorrect usage in a distinct fashion. For CPS, the data was screened as entered — and it was entered “upside-down”. With EIV, the computational routine determined which orientation would provide the best (statistical) fit, and it assigned the upside-down orientation without operator intervention.
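
    The narrow claim that regression does not care about a predictor’s sign is true as far as it goes, and easy to demonstrate in isolation. A toy sketch in Python, with no connection to the actual Mann08 code:

        import numpy as np

        rng = np.random.default_rng(0)
        x = rng.standard_normal(100)            # a stand-in "proxy"
        y = 2.0 * x + rng.standard_normal(100)  # a stand-in "temperature"

        # Fit y on x, then on -x: the coefficient flips sign, but the
        # fitted values (and hence any reconstruction) are identical.
        b_pos = np.linalg.lstsq(x[:, None], y, rcond=None)[0][0]
        b_neg = np.linalg.lstsq(-x[:, None], y, rcond=None)[0][0]
        print(b_pos, b_neg)                        # roughly 2.0 and -2.0
        print(np.allclose(b_pos * x, b_neg * -x))  # True

    The trouble is that this excuses, at most, the EIV-style fitting. It says nothing about data that were screened and entered upside-down in the first place, and nothing about series that cannot be calibrated to the instrumental record at all.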

    Beyond this instance, the “defense by incomprehensibility” cudgel has not been wielded by Prof Mann’s co-authors. This is because they have all been entirely silent on the issue. Amusingly, one co-author was also an author of Kaufman09, which used one of the Tiljander data series (XRD) in the opposite orientation to that used in Mann08. This could be considered an implicit rebuke.

    Vigorous defenses of upside-down Tiljander have, however, been taken up by Prof. Mann’s co-bloggers at RealClimate.org, notably by Gavin Schmidt. “Incomprehensibility” has been the approach favored by Dr Schmidt and the other pro-Mainstream climate bloggers who have weighed in on the subject (I would rate “ridicule” and “ad hominem” as the runner-up tactics). Arthur Smith was the sole such blogger to attempt a sober analysis of the issue… and he lost interest partway through the process.

    Two closing thoughts.

    1. The big deal with Tiljander-in-Mann is that Mann08’s procedures — CPS and EIV both — require that all proxies be directly calibrated to the instrumental temperature record, 1850-1995. The Tiljander data series are uncalibratable due to post-1720 contamination by farming, road-building, and other activities. The authors of Tiljander03 state this plainly — and Mann08 discusses these authors’ concerns before dismissing them! “Upside down” was simply a consequence of “uncalibratable” for two of the data series. A third was “uncalibratable though rightside-up,” which is no better.

    2. Tiljander matters. It matters in the same way that the horseshoe made a difference to the king in the poem “For want of a nail.” Without Tiljander, a central claim of Mann08 fails. Its vaunted “Non-dendro” contribution to the understanding of temperatures of the past 2,000 years turns out to be of no value whatsoever.

  162. I judge that the book Mann has written should be considered the work of an advocate. The climate science recounted in the book is what it is, especially to those who have made the effort to analyze it. That Mann spins the work by making vague statements about it and against those who might criticize it is very much in line with an overly zealous advocate.

    The discouraging aspect of Mann’s works is less about Mann and more about the lack of public criticism of his works, particularly Mann (08), by other climate scientists. I would suspect that that lack is the result not of the scientific inclination of some of the climate scientists but rather of an advocacy position that might judge that criticism of Mann’s work could be construed as a chink in the armor of the consensus position on AGW.

    A published paper like Mann (08) needs to be judged on its entire merits and contents, not just certain aspects of it, although one should not lose sight of the insights those certain aspects allow into the efforts the authors have made in their analyses. The weaknesses and outright errors in Mann (08) are usually discussed piecemeal, and the implied counterargument then becomes that the conclusions of the paper do not necessarily rest on that one aspect. I have been collecting the problems with Mann (08) over time and am continuing to add to the collection. Recently I read AMac’s analysis of the Tiljander proxies and found that, besides using part of the proxies upside down and using proxies contaminated by non-climate effects, it used 4 proxies of which only 2 were independent.

    The major problems I have noted with Mann (08) deal with the claim that the proxy screening/selection process (a process that in itself flies in the face of proper a priori selection) would have had a low probability of passing the correlation test during the calibration/validation (1850-1995) period merely by chance. That screening process used an extensive number (105) of MXD dendro series (Schweingruber) that were cut off arbitrarily at 1960 and extended to 1995 with uncertain and undefined data. The selection process included 73 proxies (Luterbacher) that contained instrumental temperature data during the time of the correlation test. And, of course, we have the Tiljander proxies already discussed.

    Mann (08) also tends to limit the model possibilities for proxies so as to preclude discussion of ARIMA models, with and without a fractional d, that can show extended periods of trending data, both increasing and decreasing, at the ends of the series. If investigators do not use an a priori selection criterion but rather do posterior screening, it is easy to understand how an ending trend could be selected from a series that has no deterministic trend. An ARIMA model with long-term persistence can be shown to be consistent with the pass rate of the correlation test imposed in Mann (08) once the improperly handled proxies noted above are removed.
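    A minimal simulation makes that screening effect concrete. This is just an illustration; the persistence parameter, screening threshold, and window length below are all invented, and none of it is code from Mann (08):

        import numpy as np

        rng = np.random.default_rng(0)
        n_series, n_years, calib = 1000, 500, 146   # e.g. a 146-year 1850-1995 window

        # A rising "instrumental" target covering only the calibration window.
        target = np.linspace(0.0, 1.0, calib)

        # Trendless AR(1) "proxies" with strong persistence (phi = 0.9).
        phi = 0.9
        noise = rng.normal(size=(n_series, n_years))
        proxies = np.zeros_like(noise)
        for t in range(1, n_years):
            proxies[:, t] = phi * proxies[:, t - 1] + noise[:, t]

        # Post hoc screening: keep any proxy correlating with the target at r > 0.1.
        tails = proxies[:, -calib:]
        r = np.array([np.corrcoef(row, target)[0, 1] for row in tails])
        passed = proxies[r > 0.1]

        print(f"{len(passed)} of {n_series} trendless series pass the screen")
        # Survivors trend upward at the end purely because the screen selected them.
        slopes = np.polyfit(np.arange(calib), passed[:, -calib:].T, 1)[0]
        print(f"mean end-of-series trend among survivors: {slopes.mean():+.4f} per step")

    None of these series has a deterministic trend, yet the screened subset averages to an upward tick over the calibration window, which is exactly the selection effect described above.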

    Mann (08), on the other hand, has merits in clearly showing in graphs, and at least noting in passing, the divergence problem in not only dendro but also non-dendro proxies. The overall inattention by the climate science community to the divergence problem is another feature of that community that I see as a result of mixing advocacy and science.

  163. “Brandon, I take it as a general principle to not try and impugn the motives of another person. Unless you have definitive proof of his state of mind at the time he wrote it…for example if he admits he’s lying in the book, that pretty well settles it.”

    While I would agree with Carrick’s premise here, the alternative conclusions in the case of Mann are not very flattering either.

  164. I suppose the key question is, just how persuasive would this book be if you came to the debate knowing nothing? Is Mann an effective user of rhetoric? Could he persuade an agnostic that he knew what he was talking about?

  165. Thanks AMac. Your explanation does a good job of showing why Mann’s book is extremely wrong and misleading on this issue. There’s actually more in the book about the Tiljander proxies, but I’ll get to that after the next paragraph (yes, three paragraphs in a row are this bad):

    McIntyre also appealed to the conclusions of the 2006 NAS report to claim that our continued use of the very long bristlecone pine series was inappropriate. Yet this was a misrepresentation of what the NAS had concluded. The NAS panel expressed some concerns about so-called strip-bark tree ring records, which include many of the long-lived bristlecone pines. These trees grow at very high CO2-limited elevations, and there is the possibility that increases in growth over the past two centuries may not be driven entirely by climate, but also by the phenomenon of CO2 fertilization – something that had been called attention to and dealt with in MBH99 (see chapter 4). The NAS report simply recommended efforts to better understand any potential biases by “performing experimental studies on biophysical relationships between temperature and tree-ring parameters”.

    This is a gross misrepresentation of the NAS report’s findings. From the very same page as the quote he offers from it (page 52):

    While “strip-bark” samples should be avoided for temperature reconstructions, attention should also be paid to the confounding effects of anthropogenic nitrogen deposition (Vitousek et al. 1997)…

    The very conclusion McIntyre cited was on the same page Mann quoted from. Despite this, Mann called McIntyre’s completely accurate portrayal a misrepresentation. It’s mind-boggling that anyone would call something a misrepresentation while quoting from the very page that shows it’s accurate. I know some people feel I shouldn’t call Mann a liar based on evidence like this, but I honestly cannot see an alternative explanation.

    For a note of additional absurdity, take a look at this blog post from RealClimate. In it, you’ll see Ray Bradley, a coauthor of the paper Mann’s discussing, say:

    One final note: bristlecone pines often have an unusual growth form known as “strip bark morphology” in which annual growth layers are restricted to only parts of a tree’s circumference. Some studies have suggested that such trees be avoided for paleoclimatic purposes, a point repeated in a recent National Academy of Sciences report (Surface temperature reconstructions for the last 2,000 years. NRC, 2006).

    So one of Mann’s coauthors in the paper acknowledges the NAS report says “strip-bark” samples should be avoided, yet Mann claims that is a misrepresentation of the NAS report. As though that’s not strange enough, the seventh comment on the blog post contains an inline response from Michael Mann.

    That’s right. Michael Mann added commentary to a discussion of a blog post from his coauthor which states the exact same thing as Steve McIntyre. Despite this, he claims McIntyre simply misrepresented the NAS report.

    By the way, credit should be given to the commenter oneuniverse over at ClimateAudit for the discussion of this issue. I would certainly have discussed the absurdity of Mann’s paragraph, but I doubt I would have found that RealClimate post. He misidentifies the author of the blog post as Gavin instead of Ray Bradley, but otherwise says the same things as me.

  166. The next paragraph covers an issue AMac referred to briefly when he said (sorry for cutting the link to your blog AMac, but I’m not sure how many links I can include before tripping moderation):

    Without Tiljander, a central claim of Mann08 fails. Its vaunted “Non-dendro” contribution to the understanding of temperatures of the past 2,000 years turns out to be of no value whatsoever.

    This point is confirmed by Gavin Schmidt, a frequent defender of Mann’s work, in two inline remarks (near comment 530). First:

    Since the no-dendro CPS version only validates until 1500 AD (Mann et al (2008) ), it is hardly likely that the no-dendro/no-Tilj CPS version will validate any further back, so criticising how bad the 1000 AD network is using CPS is hardly germane. Note too that while the EIV no-dendro version does validate to 1000 AD, the no-dendro/no-Tilj only works going back to 1500 AD (Mann et al, 2009, SI).

    More clearly, he is asked by a commenter:

    So just to be clear with regard to your response to 525. Under either method (CPS or EIV) it is not possible to get a validated reconstruction to before 1500 without the use of tree rings, or the Tiljander sediments.

    To which he responds:

    That appears to be the case with the Mann et al 2008 network.

    To this date, he has never retracted that position. He also hasn’t retracted the contradictory positions he has offered elsewhere, but that failure is neither here nor there. The point is, without including Tiljander, a central claim of Mann 2008 is untenable. From the abstract:

    Recent warmth appears anomalous for at least the past 1,300 years whether or not tree-ring data are used. If tree-ring data are used, the conclusion can be extended to at least the past 1,700 years, but with additional strong caveats.

    This claim is false. It’s obviously false. As we saw above, even Gavin Schmidt acknowledges it is false. Heck, even Michael Mann himself acknowledges it is false. Don’t believe me? Look at the Supplementary Information for Mann 2009 (which uses the same reconstruction):

    Additional significance tests that we have performed indicate that the NH land+ocean Had reconstruction with all tree-ring data and 7 potential “problem” proxies removed (see original Supp Info where this reconstruction is shown) yields a reconstruction that passes RE at just below the 95% level (approximately 94% level) back to AD 1300 and the 90% level back to AD 1100 (they pass CE at similar respective levels).
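    For reference, the RE and CE figures in that quote are standard verification scores. Here is a minimal sketch using the usual paleoclimate definitions (the paper’s actual implementation involves considerably more machinery, so treat this as orientation only):

        import numpy as np

        def re_ce(obs_verif, rec_verif, calib_mean):
            """Reduction of Error (RE) and Coefficient of Efficiency (CE).

            Both compare the reconstruction's squared error over a withheld
            verification period against a naive benchmark: RE benchmarks
            against the calibration-period mean, CE against the
            verification-period mean. Scores above zero beat the benchmark.
            """
            obs = np.asarray(obs_verif, dtype=float)
            rec = np.asarray(rec_verif, dtype=float)
            sse = np.sum((obs - rec) ** 2)
            re = 1.0 - sse / np.sum((obs - calib_mean) ** 2)
            ce = 1.0 - sse / np.sum((obs - obs.mean()) ** 2)
            return re, ce

    The “95% level” language then refers, as I understand the paper’s usage, to where a reconstruction’s score falls relative to scores produced by random benchmark series; a reconstruction “validates” back to a given year only while its score stays above that threshold.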

    So, everyone seems to agree the claim is false. Mann 2008 could not produce a valid reconstruction (under their own tests) without tree rings, unless one includes the nonsensically used Tiljander proxies. Despite this wide agreement, Mann’s book says:

    McIntyre settled then on a more specific avenue of attack: our use of a small group of sediment records from Lake Korttajarvi in central Finland. But this was quite inconsequential and, ironically, we were the ones who had raised concerns about these particular data in the first place, not McIntyre. We had included them for consideration only to be complete in our survey of proxy records in the public domain. In the online supplementary information accompanying publication of our PNAS article, we had both noted the potential problems with these records and showed that eliminating them made absolutely no difference to the resulting reconstruction.57 McIntyre had thus attempted to fabricate yet another false controversy

  167. Continuing along the same paragraph, we find Mann quoting Tom Crowley:

    Paleoclimatologist Tom Crowley perhaps summarized it best: “McIntyre … never publishes an alternative reconstruction that he thinks is better … because that involves taking a risk of him being criticized. He just nitpicks others. I don’t know of anyone else in science who … fails to do something constructive himself.”58

    The link I provided is the reference Mann gives. You’ll note McIntyre not only responded to Tom Crowley’s summary, he is quoted as doing so in the very same article:

    The idea that I’m afraid of “taking a risk” or “taking a risk of being criticized” is a very strange characterization of what I do. Merely venturing into this field by confronting the most prominent authors at my age and stage of life was a far riskier enterprise than Crowley gives credit for. And as for “taking a risk of being criticized”? Can you honestly think of anyone in this field who is subjected to more criticism than I am? Or someone who has more eyes on their work looking for some fatal error?

    Paleoclimate reconstructions are an application of multivariate calibration, which provides a theoretical basis for confidence interval calculation (e.g., refs. 2 and 3). Inconsistency among proxies sharply inflates confidence intervals (3). Applying the inconsistency test of ref. 3 to Mann et al. A.D. 1000 proxy data shows that finite confidence intervals cannot be defined before ~1800.

    Until this problem is resolved, I don’t see what purpose is served by proposing another reconstruction.

    Underlying my articles and commentary is the effort to frame reconstructions in a broader statistical framework (multivariate calibration) where there is available theory, a project that seems to be ignored both by applied statisticians and climate scientists…. I’ve been working on this from time to time over the past few years and this too seems “highly constructive” to me and far more relevant to my interests and skills than adding to the population of poorly constrained “reconstructions,” as Crowley proposes.

    It would be desirable as well if journals publishing statistical paleoclimate articles followed econometric journal practices by requiring the archiving of working code as a condition of review. While progress has been slow, I think that my efforts on these fronts, both data and code, have been constructive.

    There’s obviously more to his response, and you can read it all for yourself by following the link, but the point is clear. Steve McIntyre responded to every part of Crowley’s summary in the very same spot Mann got the summary from. Despite this, Mann completely ignores everything McIntyre said.

    In short, this is a case of Mann proposing as gospel that which he likes while simultaneously ignoring anything which disagrees with it. And he gets glowing reviews for it.

  168. The next paragraph is also misleading, but I don’t think I want to discuss it right now. To explain, I’ll quote from the paragraph:

    Our own findings, of course, hardly existed in isolation. There was now an impressive array of reconstructions, a veritable “hockey team,”59 and contrarians such as Stephen McIntyre had in recent years wasted no time in expanding their attacks to the ever-grown set of reconstructions. In addition to the dozen reconstructions shown in the IPCC AR4 report…

    The implication of this is there are a dozen reconstructions shown by the AR4 report which give hockey sticks. Discussing this “hockey team” would require discussing something like a dozen different reconstructions, and that’s more detail-oriented than I’m trying to be right now. However, I want to mention it because it comes up again in the next chapter (page 198):

    Only now, there wasn’t just a lone hockey stick but, as we have seen, a “hockey team” of well over a dozen independent reconstructions, all pointing to the same conclusion–that the recent warming of the planet was indeed anomalous in a long-term context.

    This is basically the same claim as above, but now, Mann even says the reconstructions are independent. It’s false, and if people would like, I’ll discuss the details I’m skipping for now once I’ve gotten through the rest of the book.

    As a teaser, I’ll comment on three reconstructions in the figure he refers to. One of the reconstructions shown was MBH itself. One only went back to 1600 AD. Another only went back to 1400 AD.

    That inspires a lot of faith, huh?

  169. Continuing on page 198, we get this:

    When Science in early September 2009 published an article by Darrell Kaufman and his colleagues showing the most dramatic hockey stick yet–a two-thousand-year reconstruction of Arctic temperature changes19–Stephen McIntyre and his forces went on the attack on the Internet,20 immediately trumpeting the false claim that the work was compromised by bad data, despite the fact that whether or not the authors used the data in question made no difference to the result they obtained.21

    Naturally, the “data in question” is the Tiljander series. Before I get to that though, I want to discuss reference number 20. It says:

    On his climateaudit site, McIntyre posted five separate pieces attacking the work within the space of two weeks.

    One of the five blog posts he refers to is titled, “Kaufman et al: Obstructed by Thompson and Jacoby.” Does that sound like it’s likely to be an “attack” to you? You can check for yourself if you have any doubts, but it is not any sort of attack against Kaufman’s paper. I guess in Mann’s view, the fact McIntyre mentions Kaufman should be taken to mean McIntyre was attacking his paper.

    But it goes beyond that. Stephen McIntyre never said the improper use of Tiljander is what gave Kaufman’s paper its hockey stick. In fact, he was quite clear that the biggest source of the hockey stick in the paper was a different data issue involving a series by the name of Yamal.

    Now then, we come to a funny part. A month or two after Kaufman’s paper came out, a corrigendum was published. Included in it was this line:

    Record 20 was corrected to reflect the original interpretation of Tiljander et al. (S32) that X-ray density is related inversely to temperature.

    That’s right. Kaufman had used a Tiljander series upside down. He admitted it (albeit, only after being pressured to) and corrected it. That means Mann is referring to a controversy where McIntyre’s position was acknowledged as correct by the authors of the paper. That seems like an unwise move to me.

    Even worse, Ray Bradley, who I mentioned not long ago, was a coauthor on both Mann 2008 and Kaufman 2009. Both papers made the same mistake, but only the Kaufman group admitted it. This means Bradley was an author on two papers with contradictory views on Tiljander.

    Doesn’t that just seem awkward?

  170. Upside down Tiljander,
    Six sigma Yamal larch;
    Curvies turn and swirly glitter,
    Slippery fishies in a pond.
    ==============

  171. Continuing along the same paragraph:

    A more vicious attack was reserved for later that month. The matter concerned a tree ring temperature reconstruction for Russia’s Yamal region that Keith Briffa and colleagues had published some years earlier; it once again showed recent warmth to be anomalous in a two-thousand-year context. At a time when Briffa was known to be seriously ill and not in a position to respond to any allegations, McIntyre publicly accused him of having intentionally cherry-picked tree ring records to get a particular result.22

    This is rubbish. Steve McIntyre never accused Briffa of cherry-picking. In a somewhat peculiar twist, Michael Mann’s next reference (#23) is to a post on DeepClimate’s blog. The first paragraph of that post contains a link to a post where DeepClimate makes the same accusations about Steve McIntyre supposedly accusing Briffa of cherry-picking. The reason this is peculiar is the first time I ever posted on DeepClimate’s blog was on that post. I didn’t think in the process of checking Mann’s references I’d wind up being directed to my own arguments, but there you have it.

    I think my response to DeepClimate’s post holds up pretty well, and I think most anyone reading his response to me will find it unconvincing. I especially doubt many people will agree with DeepClimate that when McIntyre says D’Arrigo cherry-picks (something he proudly admitted), he is accusing Keith Briffa of doing it…

  172. And more from the same paragraph:

    Moreover, he demanded that Briffa turn over all of the individual underlying tree ring records in his possession.

    This is actually true. But what Mann doesn’t tell you is the “demand” was not made to Keith Briffa. Instead, it was made to the journals which published Briffa’s work. As for the demand, all McIntyre did was tell them they were obligated to follow the rules they had in place.

    You would never guess from this sentence all McIntyre did was say, “Follow your own rules.” You also wouldn’t guess the journal in question actually agreed with this demand and forced Briffa to archive the measurements.

    Yet correspondence later found between McIntyre and Briffa’s Russian colleagues (who had supplied the tree ring data in the first place) revealed23 that they, not Briffa, had chosen which tree ring records

    The fact it wasn’t Briffa who made the selection is hardly surprising news. Steve McIntyre even said he believed that was the case.

    McIntyre had the data all along!24

    Mann makes this sound like a big deal, but it really isn’t. McIntyre said of it:

    In response to your point that I wasn’t “diligent enough” in pursuing the matter with the Russians, in fact, I already had a version of the data from the Russians, one that I’d had since 2004. What I didn’t know until a couple of weeks ago was that this was the actual version that Briffa had used.

    Having a version of data doesn’t mean you know that version is the version somebody used. Since Briffa refused to archive his data, how was McIntyre supposed to know he had the same version? Beyond that, the fact he happened to get the same version years before doesn’t somehow change the fact Briffa was obligated to archive the data, nor does it change the fact Briffa avoided doing that for years.

    In other words, Mann is grasping at just about anything to criticize McIntyre. And apparently, a lot of his readers are going along with it.

  173. Finally, I can move onto the next paragraph:

    To support his “cherry picking” allegation, McIntyre had produced his own composite reconstruction–which happened to lack the prominent recent warming evident in Briffa’s reconstruction.

    As above, McIntyre did not accuse Briffa of cherry picking. Also as above, McIntyre has never produced his own reconstruction. Doing sensitivity tests is not the same as proposing alternative reconstructions.

    How did he accomplish this? By deleting tree ring records of Briffa’s he didn’t seem to like, and replacing them with other tree ring data he had found on the Internet, which were inappropriate for use in a long-term temperature reconstruction

    The “tree ring records… he didn’t seem to like” were 12 cores, an unreasonably small number. He deleted them to see what would happen if a different site’s data were used instead. That site had 34 cores, a far better number, and it was from the same area.

    Not only is this a perfectly reasonable test, McIntyre then did the same thing using both sets of data (the 12 and 34 cores). In other words, he took Briffa’s data set and added more data to it. Naturally Mann doesn’t tell you this.

    Finally, Mann’s phrase, “other tree ring data he had found on the Internet” strikes me as an attempt at snideness. Perhaps I’m just being cynical (possible, given what I’m dealing with), but if not, I feel it’s important to point out McIntyre got the data from the International Tree Ring Database. You know, the single largest and best resource for tree ring data in the world.

    Yeah, he just used “tree ring data he had found on the Internet.”

  174. Brandon —
    I hadn’t checked this thread in a couple of days, so a lot to catch up on! I particularly appreciate learning of McIntyre’s comments you mention in #90573.

  175. The next paragraph once again refers to all those other reconstructions. This time, Mann says:

    Most climate reconstructions either didn’t use the Yamal series in question anyway27 or were undetectably altered if the Yamal series was entirely eliminated from the pool of proxy data used.28

    Again, I’m limiting my discussion of the reconstructions, but I want to point out something important. Half of the dozen reconstructions in the IPCC AR4 figure Mann boasted about use Yamal, as do Kaufman 2009 and Briffa 2008 (which I haven’t discussed, but which is one of three more reconstructions Mann refers to). Of those, Kaufman 2009 is certainly heavily dependent upon Yamal, as are a number of others. Given this, Mann’s claim is completely bogus.

    Given this, you might wonder what the references he gives say. Number 28 is the only one which refers to anything notable, and it says this:

    This had, in fact, been explicitly demonstrated for the recent Kaufman et al. reconstruction discussed earlier and for our Proceedings of the National Academy of Science (2008) reconstruction, but it likely applied to just about all published reconstructions

    You’ll note he claims this had “been explicitly demonstrated” for the Kaufman paper, but he never says where. In fact, this is the only time he’s ever referred to such. Normally one uses notes to provide references for claims, not make more claims without any references. There is no way to verify what Mann says here, and I obviously don’t believe it for a moment. I suspect he is simply pulling this out of thin air as I’ve never seen anyone claim it before.

    I also find it interesting Mann says “it likely applied to just about all published reconstructions.” That seems to suggest he really doesn’t know the impact Yamal has on reconstructions even though he made a bold claim about that impact in the middle of that paragraph.

    By the way, I forgot to remark on something in the comment above. Mann said the cores McIntyre added “were inappropriate for use in a long-term temperature reconstruction.” There is absolutely no reason to think the cores he used were any worse for such than the 12 Briffa used. That is, unless your reason is that they don’t give the “right” results.

  176. HaroldW:

    I hadn’t checked this thread in a couple of days, so a lot to catch up on! I particularly appreciate learning of McIntyre’s comments you mention in #90573.

    I’m trying not to get too bogged down in details, but at the same time, to point out the things which are not just wrong, but easily seen as wrong. Personally, I think anyone reading this book with an open mind should notice a decent number of the things I’ve pointed out (if they were interested in the subject). Yes, a lot of them require knowledge, but that knowledge is usually available with just a few minutes of reading (20-30 perhaps, if you have no idea where to look).

    By the way, I’ve finally reached page 200. Given how much trouble the last ten pages were, I think I need a break. I’ll try to post more soon. For those who are curious what’s ahead, next is Mann’s discussion of FOIA. After that, there’s Climategate, the whitewashes which followed, and the Wegman controversy. After that, lucia can finally have her open thread back (if she still wants it)!

  177. Brandon – I imagine there will be a whole lot of Mannian b/s flung over the various enquiries and whitewashes. And I suspect that the critique of Wegman will focus on the immaterial plagiarism allegations rather than on the rather damning conclusions about his mathematical knowledge. Was I right?

  178. Page 200 has the first obviously untrue comment about FOI requests:

    In November 2008 McIntyre filed a FOIA demand to NOAA requesting not only data used in a recent paper by Ben Santer and coauthors (all of which was already, in fact, available in the public domain, but [also] all the e-mail correspondence between Santer and his coauthors.38

    The “also” I added is there because it’s missing in the paragraph. I should also point out the data Mann says was “available in the public domain” was the underlying data used by Santer et al. It was not the data as used by him. To understand the distinction, read this from Santer:

    You will need to do a little work in order to calculate synthetic Microwave Sounding Unit (MSU) temperatures from climate model atmospheric temperature information. This should not pose any difficulties for you. Algorithms for calculating synthetic MSU temperatures have been published by ourselves and others in the peer-reviewed literature. You will also need to calculate spatially-averaged temperature changes from the gridded model and observational data. Again, that should not be too taxing.

    McIntyre asked for data as used in the paper so he wouldn’t have to repeat a bunch of calculations which would introduce the possibility of inconsistencies in implementation. It’s a perfectly normal request in most fields of science, but Mann obviously gives a different portrayal. With that clarified, we can look at the contents of note 38:

    The precise wording of the demand is as follows: “This is a request under the Freedom of Information Act. Santer et al, Consistency of modelled and observed temperature trends in the tropical troposphere, (Int J Climatology, 2008), of which NOAA employees J. R. Lanzante, S. Solomon, M. Free and T. R. Karl were co-authors, reported on a statistical analysis of the output of 47 runs of climate models that had been collated into monthly time series by Benjamin Santer and associates. I request that a copy of the following NOAA records be provided to me: (1) any monthly time series of output from any of the 47 climate models sent by Santer and/or other coauthors of Santer et al 2008 to NOAA employees between 2006 and October 2008; (2) any correspondence concerning these monthly time series between Santer and/or other coauthors of Santer et al 2008 and NOAA employees between 2006 and October 2008…”

    I find it telling Mann refers to this as a “demand” not a “request,” but that’s hardly important. What’s important is Mann claims McIntyre asked for “all the e-mail correspondence between Santer and his coauthors.” He then directs the reader to a note which asks merely for “any correspondence concerning these monthly time series.”

    A person who simply read this paragraph and checked the note Mann directs them to would see this is a glaring misrepresentation.

  179. Interestingly enough, Mann says very little about the “FOIA campaign” that supposedly “flooded” people with requests. The first comment is on page 200, and it’s just a statement about FOIA requests in general:

    This strategy has increasingly involved the abuse of vexatious Freedom of Information Act (FOIA) demands to harass scientists and impede their progress in research…

    His only reference (at least, that I’ve found so far) to the “campaign” itself is on the next page:

    While contrarians were going after Keith Briffa for his Yamal tree ring work in the latter half of 2009, they were also badgering Phil Jones and his colleagues at CRU with an escalating barrage of FOIA demands–sixty of them in one weekend alone.

    While there’s obviously more to the story (and the requests were not vexatious), there’s little from Mann to comment on. It’s a nice change of pace to have fallacious commentary be so brief.

  180. On page 206, Mann says:

    In fall 2009, he made a similarly baseless accusation of plagiarism against my colleague Eric Steig.61

    Note 61 is long, so I’ll only quote parts of it:

    McCulloch wrote to Nature alleging that Steig had plagiarized his work… McCulloch “published” a piece on the climateaudit blog criticizing the Steig et al. analysis–correctly, as it turned out… Once Steig was able to confirm that such an error had been made, he recalculated the trend significances correctly… When it was published in August 2009 (Nature, 457: 459-462), McCulloch contacted Nature. McCulloch complained that Steig had appropriated his own finding. Yet it is self-evident that Steig et al. were aware of the need for the autocorrelation correction, since the paper explicitly stated (albeit, it turns out, in error) that it had been made.

    So, Mann says Steig made an error and McCulloch found it (while Steig was in Antarctica, as it happens). Later Steig became aware of the error and corrected it. McCulloch then accused him of plagiarism. That much is fine, but look at the note’s final sentence. To rebut the idea Steig became aware of the error because of McCulloch, Mann says Steig knew the correction was necessary.

    Say what? Knowing a correction is necessary means you couldn’t possibly have copied from someone who points out you failed to make it? I get that there is a lot of verbiage in this note, but how could anyone not realize that was silly?

    Incidentally, Mann’s note goes on to say something incredibly ridiculous, but to understand the problem, you need more than just the book:

    Had McCulloch notified Steig of the error when he first discovered it, or had he submitted a formal comment to Nature identifying the error, he would have received credit and acknowledgment. He chose, however, to do neither of these things. To suggest that Steig’s correction of an error in his own work, using standard methods, could constitute plagiarism was simply absurd.

    The first and third sentences of this excerpt contradict each other. Mann says if McCulloch had notified Steig of the error, he’d be given credit. Mann then says Steig’s correction couldn’t constitute plagiarism since it was just a correction of his own work using standard methods. But if McCulloch could have gotten credit, then plagiarism would necessarily have been a possibility. If he deserved credit and didn’t get it, there’d be plagiarism.

    More ridiculous still, McCulloch actually did contact Steig and all of Steig’s coauthors.

  181. Mann starts off his discussion of Climategate with a bang (page 207):

    The most malicious of the assaults on climate science would be timed for maximum impact: the run-up to the Copenhagen climate change summit of December 2009, a historic, much anticipated opportunity for a meaningful global climate change agreement.1 The episode began with a crime committed by highly skilled computer hackers…

    Mann claims “highly skilled computer hackers” were responsible, but in reality, nobody even knows how the e-mails were collected. Even if someone were inclined to believe Mann on this issue, surely they’d be suspicious of him saying it was done by hackers, plural. How could he possibly know more than one person was involved?

    But it gets worse. Note #1 says:

    The hackers had access to the materials in early October 2009, but held off releasing them until mid-November 2009, apparently to inflict maximum damage to the Copenhagen climate summit in early December 2009.

    To be fair, he does provide a reference (a Ben Webster article I haven’t been able to find online). However, that doesn’t change the fact this comment is completely absurd. The released archive contains e-mails from November. I don’t care how “highly skilled” some hackers may be. They cannot steal e-mails which had not yet been written.

  182. Brandon Shollenberger (#90710)
    Mann uses the words “vexatious Freedom of Information Act (FOIA) demands”…in context, vexatious is a term which has a specific interpretation. For the UK, see here.

    Are you aware of any FoI requests in the climate area which have been refused using the “vexatious” criterion? I don’t recall CRU ever rejecting one for that reason.

  183. Brandon (#90721)
    I believe the Ben Webster article to which you refer is here. That’s a Wayback Machine archival, as the original article is not available at The Times any more. Steve McIntyre discussed it here.

    The claim of Oct 2009 access to emails is made but not documented in that December 3 Webster article. However, a few days later, on December 7, Webster wrote this column, in which he mentions that Paul Hudson of the BBC received emails in October. Thus the Hudson evidence would seem to be the source of Webster’s claim; I don’t know if this has ever been confirmed.

    However, it appears that Hudson’s statement was misinterpreted: as Hudson had clarified, he was the recipient in October of emails which were part of the Climategate release, which allowed him to confirm that at least some of the CG emails were genuine. Hudson’s clarification on November 24 predated Webster’s columns, but Webster apparently was unaware of it when he wrote on Dec 7, “The BBC has confirmed that Paul Hudson received some documents on October 12 but no story was broadcast or printed by Mr Hudson or the corporation.”

    My conclusion is that Webster was mistaken in his Dec 3 & 7 articles, and that there is no evidence that the CG emails were available in October (and held until November). In Mann’s defense, I don’t think that Ben Webster ever made a correction to his claim to that effect. [And of course, I might be wrong!]

  184. Jeff Norman:

    Your notes on Mann’s book are greatly appreciated. Thank you.

    You’re welcome! I’m just glad people have found them helpful.

    HaroldW:

    Are you aware of any FoI requests in the climate area which have been refused using the “vexatious” criterion? I don’t recall CRU ever rejecting one for that reason.

    I believe some requests have been denied because of how much time they would take to answer, but no, none have ever been (officially) called vexatious.

    I don’t think it’s worth worrying about Mann not using a “technical” definition though. Many people would use the word the same way he does, so it’s not much of an issue.

    My conclusion is that Webster was mistaken in his Dec 3 & 7 articles, and that there is no evidence that the CG emails were available in October (and held until November). In Mann’s defense, I don’t think that Ben Webster ever made a correction to his claim to that effect. [And of course, I might be wrong!]

    Thanks for that. I probably could have found the article myself, but I didn’t spend much time looking for it. For what it’s worth, I think your interpretation is spot on. However, I don’t think what you say excuses Mann’s absurd comment in the slightest. The fact you can find a claim made in a couple newspaper articles does not mean you can state it as fact. It certainly doesn’t mean you can make an extremely bold claim without doing anything to verify the source is right.

    If it was some inconsequential issue, or it was one where the truth wasn’t easy to discover (and hadn’t been published in hundreds of places), we might be able to excuse Mann. But when even just a couple minutes with Google is enough to disprove a serious claim, there is no excuse for publishing it in a book.

  185. I get how people wouldn’t notice some of Mann’s deceptions. What I don’t get is how people wouldn’t notice the absurdity of some of the things he says (on page 210):

    The full quotation from Jones’s e-mail was (emphasis added), “I’ve just completed Mike’s Nature trick of adding in the real temps to each series for the last 20 years (i.e. from 1981 onwards) and from 1961 for Keith’s to hide the decline.” Only by omitting the twenty-three words in between “trick” and “hide the decline” were [climate]* change deniers able to fabricate the claim of a supposed “trick to hide the decline.”

    Can you imagine anything more ridiculous than this comment? Mann complains people fabricated a claim by omitting 23 words. Those words were four prepositional phrases and a parenthetical comment. They obviously do not change the meaning of the sentence (other than to clarify it). Despite this, Mann writes an entire paragraph about how removing them is dishonest. In fact, he continues in the paragraph to say:

    No such phrase was used in the e-mail nor in any of the stolen e-mails for that matter. Indeed, “Mike’s Nature trick” and “hide the decline” had nothing to do with each other.

    I don’t care how much or how little knowledge you have of the subjects covered in this book. There is absolutely no way to read this and not think Mann sounds like a fool.

    *I added the word “climate” because it was missing. At least, I assume Mann wasn’t meaning to refer to “change deniers.”

  186. Brandon (#90828)
    “But when even just a couple minutes with Google is enough to disprove a serious claim, there is no excuse for publishing it in a book.”
    You’re quite right…it did only take a few minutes with Google and following links. Less time than to write it up 🙂
    The diagnosis is confirmation bias. Mann liked the Webster scenario — it’s those nefarious climate deniers! — and stopped looking. [By the way, van Ypersele’s assessment in Webster’s Dec 7 article linked above seems about equally unfounded. Why can’t people just say “I don’t know”?]

    It doesn’t say a lot for the quality of the editing. Some of the mis-statements about reconstructions get into technical details which perhaps an editor would shy away from. But this is basic verification, no math needed.

  187. HaroldW:

    It doesn’t say a lot for the quality of the editing. Some of the mis-statements about reconstructions get into technical details which perhaps an editor would shy away from. But this is basic verification, no math needed.

    I agree, but there are even worse examples, like the one I pointed out in this comment. They are things Mann says which make no sense in and of themselves. There are quite a few sentences/paragraphs where the logic is completely nonsensical. There are also instances where Mann flat-out contradicts himself.

    It gives the impression the editing for this book was very poor. There are plenty of issues which should have been caught by someone who didn’t even try to research anything Mann said. I don’t know if that’s a problem of the editing, or if maybe Mann’s original submission was so bad this is a huge step up. Either way, this book does not deserve glowing reviews.

    By the way, while I don’t like to harp on typos, it’s kind of incredible to find words missing from sentences multiple times in a book. It really makes the editing seem shoddy.

  188. On page 211, Mann says Jones:

    was referring, specifically, to an entirely legitimate plotting device for comparing two datasets on a single graph, as in our 1998 Nature article (MBH98)–hence “Mike’s Nature trick.”

    In the next paragraph, he elaborates:

    we supplemented our plot of reconstructed temperatures in MBH98 by additionally showing the instrumental temperatures, which extended through the 1990s. That allowed our reconstruction of past temperatures to be viewed in the context of the most recent warming. The separate curves for the proxy reconstruction and instrumental temperature data were clearly labeled…

    This is true. The two lines were plotted separately. However, that is not what the trick Jones referred to was. In fact, Mann doesn’t discuss the actual trick at all. The actual trick involves smoothing.

    When you have noisy data, it can be useful to make a “smoothed” graph so a signal is easier to see. Basically, this involves averaging data with the data of the points near it (called a moving average). Of course, this means the points in the middle of the graph have to be treated differently than the points at the ends of the graph (since they’ll have less data on one side). There are various ways to handle this issue, and the way used by Mann is what we call “Mike’s Nature trick.” This is what he did:

    First, he appended the temperature record to the end of his reconstruction. This combines the two records into a single line. Next, he smoothed the record. Finally, he deleted all the data after 1980, when the reconstructed record ended. The net effect of this was to change the end of the graph from pointing down to pointing up.

    This is not “an entirely legitimate plotting device.” It is not just a case of plotting two different lines on the same graph. It is using data from one line to manipulate the data from another line without any rational basis.
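    A toy sketch shows how much the endpoint depends on that padding choice. To be clear, this is my own illustration of the three steps described above, with invented series and window length, not Mann’s actual code:

        import numpy as np

        WINDOW = 11
        HALF = WINDOW // 2

        def smoothed_end_value(series, beyond_end):
            """Centered moving average at the last point of `series`, using
            `beyond_end` as the data assumed to lie past the end."""
            window = np.concatenate([series[-HALF - 1:], beyond_end[:HALF]])
            return window.mean()

        proxy = np.linspace(0.2, -0.3, 81)   # toy reconstruction, declining at the end
        instr = np.linspace(0.1, 0.8, 20)    # toy instrumental record, warm and rising

        # Pad with the proxy's own reflected values: the smoothed end keeps declining.
        reflected = proxy[-2:-HALF - 2:-1]
        print(f"smoothed endpoint, self-padding:         {smoothed_end_value(proxy, reflected):+.3f}")

        # Pad with instrumental data, smooth, then truncate: the endpoint swings upward.
        print(f"smoothed endpoint, instrumental padding: {smoothed_end_value(proxy, instr):+.3f}")

    Same proxy data, same smoother; only the choice of what sits beyond the end of the record changes, and that alone flips which direction the smoothed curve points at its end.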

  189. On page 212, Mann defends “hide the decline” as follows:

    These data show an enigmatic decline in their response to warming temperatures after roughly 1960, perhaps because of pollution21–that is the decline that Jones was referring to.
    While “hide the decline” was poor–and unfortunate–wording on Jones’s part, he was simply referring to something Briffa and coauthors had themselves cautioned in their original 1998 publication: that their tree ring density should not be used to infer temperatures after 1960 because they were compromised by the divergence problem.

    No explanation is offered as to how one can know the divergence means the proxies stopped tracking temperatures in the modern period, yet also know they tracked temperatures well in the past. It’s not even addressed. Instead, Mann simply says Briffa (and coauthors) recommended hiding the data, therefore it was okay. This isn’t too bad, but it provides crucial context for what Mann says back on page 211:

    There was one thing Jones did in his WMO graph, however, that went beyond what we had done in our Nature article: He had seamlessly merged proxy and instrumental data into a single curve, without explaining which was which. That was potentially misleading, though not intentionally so; he was only seeking to simplify the picture for the largely nontechnical audience of the WMO report.

    So, Jones deletes the (decreasing) data after 1960, appends (increasing) instrumental data to the record, offers it as a single continuous record, and Mann says this “was potentially misleading.”

    No worries though, it wasn’t intentional if so. Jones was simply trying to make the picture simpler by deleting data then replacing it with data that went in the opposite direction without explaining what he did…

  190. Once again, a case of Mann saying something so completely ridiculous anyone reading the book should notice it (page 213):

    Some critics also claimed that the e-mails revealed a culture of “gatekeeping,” that climate scientists, myself included, were unfairly preventing skeptics from publishing in the peer reviewed literature. So claimed Patrick Michaels25 of the libertarian26 Cato Institute roughly a month after the CRU hack in a December 17 Wall Street Journal op-ed.27 Peer review, however, is by definition gatekeeping; it is intended to keep seriously deficient work from polluting the scientific literature.28

    Mann says climate scientists were accused of gatekeeping, unfairly preventing skeptics from publishing. He responds to this accusation by saying peer review is by definition gatekeeping…

    He doesn’t actually dispute anything. He never denies unfairly preventing skeptics from publishing. He simply says gatekeeping is normal. If anything, that’s a tacit admission the accusation is true!

  191. > Briffa and coauthors [cautioned] that their tree ring density should not be used to infer temperatures after 1960 because they were compromised by the divergence problem.

    Unfortunately for Mann as well as for Briffa et al, this sentence can be understood quite easily.

    “We used proper techniques in collecting tree rings, but for unknown reasons, they fail to correlate with the instrumental temperature record after about 1960. Why? Well, because of the divergence problem, of course! What date range should we use for calibration and calculation of confidence intervals? Well, pre-1960, when tree rings and thermometers don’t diverge! Of course! And as to whether ‘divergence’ could have been a factor prior to 1850… Of course not!

    This is post hoc reasoning in pure form.

  192. There are a variety of other issues in chapter 14, but I’m going to skip them. They’re often minor, and even when they’re not, they’re misleading in ways that aren’t obvious. For the same reason, I’m skipping over chapter 15’s discussion of the “investigations” (read: whitewashes) that followed Climategate. That brings me to the Wegman Report on page 239, where Mann says:

    Where Wegman and coauthors did tweak Bradley’s original text, the changes were often far from innocuous. Bradley’s words were systematically altered in a way that downplayed the reliability of the science and, in a perverse twist of irony, made them appear to undermine the conclusions of Bradley’s own work.41

    The claim in that last sentence, that the changes made Bradley’s words appear to undermine his own conclusions, is one made by both John Mashey and DeepClimate. It’s false. I first examined it in this comment at Collide-a-scape about two weeks after Mashey’s report was published. It turns out there is only one case where the Wegman Report supposedly inverted a conclusion, and it’s nothing more than DeepClimate and Mashey failing to understand a simple sentence (or intentionally misrepresenting it, if they did understand it). The passage from the Wegman Report says:

    As pointed out earlier, many different sets of climatic conditions can and do yield similar tree ring profiles. Thus tree ring proxy data alone is not sufficient to determine past climate variables.

    The original material said:

    If an equation can be developed that accurately describes instrumentally observed climatic variability in terms of tree growth over the same interval, then paleoclimatic reconstructions can be made using only the tree-ring data.

    And as I originally explained:

    Of course, this sentence begins with the word “if.” “If” something can be done, Wegman is wrong, and his text “inverts” Bradley’s “original results.” DeepClimate made no effort to show an equation like the one Bradley describes can be made, much less that one has been made. Given that, his example cannot possibly show Wegman “inverted” Bradley’s conclusions. In fact, there is not even any indication of error.

    Whatever else may be true of the Wegman Report, it never inverted any of Bradley’s original results. That accusation, the most serious one made by Mashey/DeepClimate, is a complete fabrication.

    While there are other problematic statements regarding plagiarism, both by Mann and by Mashey/DeepClimate, I’m not going to discuss them. The simple reality is the Wegman Report did contain plagiarism, and it’s not worth examining the details of the matter. It suffices to say one cannot simply trust what any of these people say about the Wegman Report to be accurate.

  193. AMac:

    This is post hoc reasoning in pure form.

    Obviously so, yet it seems to be quite popular, and almost unquestioned amongst any “consensus defenders” who speak about the subject.

  194. A paragraph on page 242 made my mouth drop. After everything else I’ve written about, I thought nothing could shock me, but look at what Mann says in it:

    Not only had there apparently been64 substantial undisclosed collaboration between the WR authors and Stephen McIntyre, as hinted at earlier65–something Wegman had denied in his testimony under oath in Congress66…

    If Wegman lied to Congress like Mann claims, he is guilty of at least one felony. That means he could be sent to jail for five years. That is what Mann accuses him of. And what is his evidence? Note #66 directs us to this link. It’s the transcript of Dr Wegman’s testimony in front of Congress, from which Mann offers this quote:

    Mr. Stupak: Did you or your co-authors contact Mr. McIntyre and get his
    help in replicating his work?
    Dr. Wegman. Actually, no…

    Notice the ellipsis. What do you think it might hide? How about:

    DR. WEGMAN. Actually, no. What I did do was I called
    Mr. McIntyre and said that when we downloaded his code we
    could not get it to work either, and it was unfortunate that
    he was criticizing Dr. Mann when in fact he was in exactly
    the same situation. Subsequently, he reposted his code to
    make it more user friendly and we did download it
    subsequently and verified that it would work.
    MR. STUPAK. And then after you re-downloaded and verified
    it worked, did you have any further contact with
    Mr. McIntyre then?
    DR. WEGMAN. Well, as I testified last week, Dr. Said and
    myself had gone to one of the meetings where he was talking,
    and we spoke with him but did not identify who we were at
    the time. This was early in the phase. Subsequently, I had
    had no contact with him until basically last week.
    MR. STUPAK. Okay. Any of your co-authors that you know of,
    Dr. Said or any others, have contact with Mr. McIntyre other
    than that one time at this convention or wherever he was
    speaking?
    DR. WEGMAN. One of my graduate students, John Rigsby, who
    did the code for us, worked the code for us, did have some
    interaction with him in order to verify some of the details
    of the code.
    MR. STUPAK. So you, Dr. Said and this Mr. Rigsby would be
    the people who had contact with Mr. McIntyre then?
    DR. WEGMAN. That is correct, yes.
    MR. STUPAK. Thank you. Nothing further.

    Wegman details the communication between himself, his co-authors, and Steve McIntyre. Mann takes this, deletes everything but two words, and offers Wegman’s answer as saying the exact opposite of what he actually said. He then uses this complete fabrication to accuse Wegman of a felony.

    There is no way this could possibly have been done by mistake. Either Mann intentionally lied, libeling Wegman, or some source he used did the lying. Either way, this is unbelievable.

  195. I’ve finished reading Mann’s book, and there’s at least one more thing from chapter 15 I want to comment on. Unfortunately, I’m too upset right now. However, I do want to comment on one passage on page 250. Back near the beginning of my commentary, I discussed a glaring error of Mann’s. On page 250, this is revisited:

    When we reach concentrations of 450 ppm (about 2030, extrapolating from current trends), we will likely have locked in at least 2°C (3.5°F) warming of the climate relative to preindustrial levels…

    The numbers given here make sense. Given that, it’s clear the earlier passage simply erred in describing an increase over current temperatures rather than preindustrial temperatures. It’s an incredibly dumb error to appear in a book, but it did. Somehow, Mann misrepresented the material he was referencing, made an obvious arithmetic error, and this didn’t get caught by any editing.
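    The arithmetic is easy to check. Assuming the standard logarithmic forcing relation, an equilibrium sensitivity of about 3°C per CO2 doubling, and a 280 ppm preindustrial baseline (conventional round values on my part, not figures taken from the book):

        import math

        sensitivity = 3.0        # deg C per CO2 doubling (conventional central estimate)
        c0, c = 280.0, 450.0     # preindustrial and projected CO2 concentrations, ppm

        warming = sensitivity * math.log(c / c0, 2)
        print(f"equilibrium warming at {c:.0f} ppm: {warming:.2f} C")   # about 2.05 C

    That lines up with “at least 2°C … warming … relative to preindustrial levels,” which is why the earlier passage’s framing relative to current temperatures had to be the error.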

  196. Yes, thank you Brandon, for the exploration. It was like watching one of those crash-test-dummy slo-mo videos. You know what’s going to happen, but it’s still interesting.

    Andrew

  197. I’ll echo Andrew_KY’s sentiments. I don’t have the time or stomach to go through Mann’s nausea-inducing self-adulation (which is what I fear I will find), but this is very interesting nonetheless.

    It paints the image of a person who is very average intellectually but possessing a giant-sized ego.

  198. Thanks Brandon – I was tempted to buy the book and wade my way through it, but I suspected it would just nauseate me. You have been selfless in this, and you deserve a time out for a truly heroic act of reading 250 pages of bilge and lies and deceit…and terrible writing.

  199. AMac:

    Thanks for posting your review, Brandon.

    Dan Hughes:

    What AMac said.

    You’re welcome!

    Andrew_KY:

    Yes, thank you Brandon, for the exploration. It was like watching one of those crash-test-dummy slo-mo videos. You know what’s going to happen, but it’s still interesting.

    I thought I knew what was going to happen, but it turns out I was wrong. I never expected libel to be in the book so brazenly like in that last example. It’s like watching one of those crash-test-dummy slo-mo videos, only to have the car explode at the last second before hitting the wall.

    Carrick:

    It paints the image of a person who is very average intellectually but possessing a giant-sized ego.

    It’s even worse if you read the book itself. I tried to avoid commenting on the tone of the book and the various misrepresentations it causes, but at times it’s almost unbelievable.

    diogenes:

    Thanks Brandon – I was tempted to buy the book and wade my way through it, but I suspected it would just nauseate me. You have been selfless in this, and you deserve a time out for a truly heroic act of reading 250 pages of bilge and lies and deceit…and terrible writing.

    I don’t deserve too much praise for this. When a book is this filled with garbage, I want to take notes of the things I find (assuming I can manage to read it). Posting them on here was almost no extra work, and it gave more purpose to my effort.

    By the way, deceit and lies I could have dealt with. What bothers me the most is the incompetence/brazenness they display. Mann being full of it doesn’t bother me. What bothers me is he either did a terrible job of writing the book, or he intentionally sought to deceive people (or both). And he gets praised for it! Heck, he’ll probably profit from the book: it seems it will enhance his image, even if he doesn’t make any money off it.

    As for a break, that might be a good idea, but I still think I should collect this into a single document. If I do that and edit it for clarity, people will have a well-informed and easy-to-read opinion which doesn’t praise Mann. It might be of some value in combating the rubbish he’s published.

  200. “I thought I knew what was going to happen…”

    Well, reading stuff from people associated with AGW promotion for a few years+ has taught me that there pretty much is no bottom in this area. The sh*t that my eyes and brain endure on a daily basis is ongoing proof of this.

    Andrew

  201. Brandon – if you can stomach it, I think it would be useful to pull together the inaccuracies, lies, deceptions and cover-ups that you have noticed. The thing is, there are probably others! It seems that the Team are trying to get attention at the moment, so it is important to show just how insecure the “science” as reported by the IPCC actually is.

  202. diogenes, I actually already started combining this into a single document. The problem I’m facing is mostly organizational. I have to figure out what things I want to discuss, the order to put them in, etc. On top of that, I have to decide how detailed/technical I want to be for many different issues.

    What I’m thinking I’ll do is focus on the simple issues first. For example, my introduction focuses on the Wegman misrepresentation. It’s both a very serious claim by Mann, and it’s easy for anyone to understand. From there, I figure I’ll discuss cases where Mann obviously distorts/contradicts sources. Then I’ll discuss where he makes things up. Then I’ll discuss nonsensical things he said. Finally, I’ll discuss some technical disagreements.

    I think readers should start with the simplest things first, and as they progress, face more complicated matters. This allows them to stop at whatever level of complexity they want without missing anything they’d be interested in. I’ll still have trouble fitting everything in, and I may have to skip some examples, but… Yeah, rambling a bit.

    Anyway, I’m not interested in trying to “take on” the Team, the IPCC or anything like that. I just want there to be some way people reading Mann’s book could have a chance to be exposed to an alternative view. As it stands, there isn’t. Short of reading tons of papers and blog posts, nobody could be expected to know how much Mann misrepresented things.

    By the way, I was a little disappointed in how long this is taking me until I saw the timestamp of the comment where I started this. That timestamp says I started reading the book only a week ago. It feels like a lot longer than that. I guess only having a dozen pages done (of an expected 20-30) isn’t so bad after all.
