Fyfe, Gillett and Zwiers have published an “Opinion/Comment” titled “Recent observed global warming is significantly less than that simulated by climate models”. The paper discusses precisely what we’ve been discussing here at The Blackboard lo these many years: the observed trends are falling outside the range of model runs.
I could comment on and discuss their method, which differs from mine. But instead, I’ll take the approach of simply showing my comparison between models and observations beginning in Jan 1993, the start date for their main comparison. (They also compare starting in January 1998. Their tests run through Dec. 2012; I’m taking the liberty of ending with the most recent month.)
While our methods differ, their test corresponds to the uncertainty bounds shown furthest right on my graph. That is: they are testing whether the observations fall inside the full model spread. Their paper uses HadCrut4 for their test. Also: I believe they recomputed the trends over the sampled part of the globe and I do not. (John Kennedy of the Met Office suggested the simplest quick and dirty way to do that would be to compute from 60N to 60S; I’m planning to do that but have not yet done so.)
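For readers who want to see what that 60N-to-60S averaging would amount to, here’s a minimal sketch in Python/numpy. It is not my actual script, and the array names are made up; the idea is just cos(latitude) area weights applied only to gridboxes between 60S and 60N.

```python
import numpy as np

def band_mean(field, lats, lat_min=-60.0, lat_max=60.0):
    """Area-weighted mean of a (lat, lon) anomaly field over a latitude band.

    field : 2-D array (n_lat, n_lon) of monthly anomalies
    lats  : 1-D array (n_lat,) of gridbox-centre latitudes in degrees
    """
    keep = (lats >= lat_min) & (lats <= lat_max)   # drop gridboxes poleward of 60
    weights = np.cos(np.deg2rad(lats[keep]))       # cos(latitude) area weights
    zonal = np.nanmean(field[keep, :], axis=1)     # average each kept latitude over longitude
    return np.sum(weights * zonal) / np.sum(weights)
```

Doing that for every month and then fitting a trend to the banded series is the “quick and dirty” version; matching the actual HadCrut4 coverage box by box is the more careful one.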
As you can see, using ‘my’ method also shows the observations outside the range consistent with models.
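For what it’s worth, the “full model spread” style of test boils down to something like the sketch below. This is a rough Python illustration, not the method in their paper or my exact code; obs and model_runs are placeholder names for monthly anomaly series starting Jan 1993 and running to the most recent month.

```python
import numpy as np

def ols_trend(series):
    """OLS trend (units per month) of an evenly spaced monthly series."""
    months = np.arange(len(series))
    slope, intercept = np.polyfit(months, series, 1)
    return slope

def obs_inside_full_spread(obs, model_runs):
    """True if the observed trend lies between the smallest and largest model-run trend."""
    obs_trend = ols_trend(obs)
    run_trends = np.array([ols_trend(run) for run in model_runs])
    return run_trends.min() <= obs_trend <= run_trends.max()
```

Ordinary least squares is used here purely for illustration; the choice of trend estimator and of how the “spread” is defined are exactly where their method and mine differ.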
I’d say more but I have to get back in the saddle and actually implement my ENSO correction and do a few other things. I’d like to have a well-organized argument when the Dec. 2013 data are available, as that makes for a nice round number.
I should add that Von Storch and Zorita are discussing their paper and this one at their blog.

This should not be a surprise to anyone doing even a rudimentary comparison of IPCC models and temperature observations.
IMHO the most significant fact is that temperatures are even at the low end of the projections based on the “commitment” scenario, which assumes zero growth in greenhouse gasses from 2000.
In my experience, the usual response from warmists is that temperatures are “well within” the range of projections.
Ray,
It’s one thing to claim “within” (which maybe, by some method of claiming it, might be true). It’s another to claim “well within”.
Oddly, on threads, I’ve seen people extol the fact that the temperatures are “within” during the baseline periods and then ‘explain’ that you do expect excursions ‘sometimes’. So, bearing in mind it is “within” during the baseline, one is supposed to give the models “credit” for that.
Well… of course they are “within” during the baseline. By definition (and subtraction!)
And moreover, since the forced trend prior to the baseline is small, the observations staying inside the spread during that period tells us very little about whether sensitivity is right or wrong. If we could run the earth and models both with steady forcings and compare over long periods, the only thing the test would tell us is whether the natural variability in models was close to correct. It couldn’t tell us if sensitivity was close to correct.
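To make the “by definition (and subtraction!)” point concrete, rebaselining is nothing more than this (a toy sketch with hypothetical names, not anyone’s production code):

```python
import numpy as np

def rebaseline(series, months, base_start, base_end):
    """Subtract the series' own mean over the baseline period.

    series : 1-D array of monthly values (observations or a model run)
    months : 1-D array of matching dates (anything comparable, e.g. numpy datetime64)
    """
    in_base = (months >= base_start) & (months <= base_end)
    return series - series[in_base].mean()
```

Every series, whatever its trend or sensitivity, ends up hovering near zero over the baseline window, so “agreement” there is built in rather than earned.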
lucia,
I fail to see the scientific justification for adjusting the models to a common baseline at all. It’s PR. It makes them look better than they actually are.
I’d be surprised if a 60N-to-60S comparison failed to show the observations within the model range, since the expectation is that global warming primarily affects the coldest places: higher latitudes, nighttime, and winters.
Lucia,
“It’s one thing to claim ‘within’ (which maybe, by some method of claiming it, might be true). It’s another to claim ‘well within’.”
I have had many discussions over whether being in the bottom 10% of the model range constituted “well within”.
MikeN–
The point is that HadCrut4 has no measurements over the poles. So it’s best to match the region that HadCrut4 covers. Whether 60N/S is the best choice, I don’t know. But I suspect John Kennedy knows what the coverage distribution is since he’s the one who puts out HadCrut.
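For concreteness, “matching the region HadCrut4 covers” amounts to masking the model field to the observed gridboxes before averaging, along these lines. This is a sketch with assumed array shapes, not HadCrut4’s processing or my actual code:

```python
import numpy as np

def masked_mean(model_field, obs_field, lats):
    """Area-weighted mean of a model (lat, lon) field using only gridboxes
    where the observation field has data (NaN marks missing coverage)."""
    has_obs = ~np.isnan(obs_field)                               # observed-coverage mask
    weights = np.cos(np.deg2rad(lats))[:, None] * np.ones_like(model_field)
    return np.sum(model_field[has_obs] * weights[has_obs]) / np.sum(weights[has_obs])
```

Do that month by month for each model run, fit trends to the masked series, and the comparison is apples to apples with HadCrut4.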
Is it time, rather than looking at an average of many models, to ask what makes the FIO-ESM model describe current trends so much better than all the others? Can one learn something from answering this question? Another question I would like answered is what makes the CanESM2 model outstandingly worse. Could one learn something from comparing these two models?
Following on denny’s post, the ranking from best to worst depends on whether trends are computed only where observations exist. Computed in this way, the CanESM2 trend will probably fall close to the model-average trend, implying an exaggerated Arctic response in this model. The ranking also depends on how a given model responds to the 1991 Pinatubo eruption. Models that respond too strongly, like CanESM2, may recover with too much anomalous warming leading up to and beyond 1993.
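To make the ranking point concrete: once per-model trends are recomputed over the observed coverage (as in the masking sketch upthread), “best to worst” is just a sort on distance from the observed trend. A toy sketch with hypothetical, made-up inputs:

```python
def rank_models(obs_trend, model_trends):
    """Rank models by how close their coverage-masked trend is to the observed trend.

    obs_trend    : observed trend over the test period (e.g. deg C/decade)
    model_trends : dict mapping model name -> trend computed over the same
                   coverage and period (hypothetical inputs)
    """
    return sorted(model_trends.items(), key=lambda kv: abs(kv[1] - obs_trend))

# Illustrative only, with invented numbers:
# rank_models(0.10, {"FIO-ESM": 0.12, "CanESM2": 0.35})  ->  FIO-ESM first, CanESM2 last
```

The interesting question is then whether the ordering is stable once the coverage masking and the volcanic response are both accounted for.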
Thank you, JCF.