Ninety Month Trends: IPCC AR4 2C/Century still outside ±95% uncertainty bands.

Trends in the global mean surface temperature from five groups (GISS, HadCRUT, NOAA/NCDC, UAH/MSU and RSS) were calculated for January 2001 through June 2008 using two methods: Ordinary Least Squares (OLS), with error bars computed using the method in Lee & Lund, and Cochrane-Orcutt. Each trend was compared to the IPCC AR4’s projected central tendency of 2C/century for the first few decades of this century. The results of the trend fits are shown in the graph and table below; a code sketch of both fitting procedures follows the table.

[Figure 1: June hypothesis test]

Figure 1: The IPCC projected trend is illustrated in brown. The Cochrane-Orcutt trend for the average of all five data sets is illustrated in orange, with ±95% confidence intervals in hazy orange. The OLS trend for the average of all five data sets is illustrated in green, with ±95% uncertainty bounds in hazy green. Individual data sets were fit using Cochrane-Orcutt and are also shown.


Results for Hypothesis Tests for Individual Cases

Data Set    OLS, red-noise corrected* (C/century)       Cochrane-Orcutt (CO) (C/century)
Merge 5     -0.8 ± 2.6  (Reject 2C/century)             -1.2 ± 2.2  (Reject 2C/century)
Merge 3     -0.4 ± 1.8  (Reject 2C/century)             -0.7 ± 1.6  (Reject 2C/century)
GISS        -0.3 ± 2.6  (Fail to reject 2C/century)     -0.9 ± 2.2  (Reject 2C/century)
HadCRUT     -1.3 ± 2.2  (Reject 2C/century)             -1.7 ± 1.7  (Reject 2C/century)
NOAA         0.0 ± 1.8  (Reject 2C/century)             -0.5 ± 1.6  (Reject 2C/century)
RSS MSU     -1.3 ± 2.3  (Reject 2C/century)             -2.1 ± 2.3  (Reject 2C/century)
UAH MSU     -1.0 ± 4.0  (Fail to reject 2C/century)     -2.2 ± 3.3  (Reject 2C/century)

Merge 5 is a data set created by averaging monthly data from all five individual sets and then fitting. Merge 3 is created by averaging the three surface-based sets (GISS, HadCRUT and NOAA).
* The red-noise correction uses the finite-size adjustment discussed in Lee & Lund.
Monte Carlo simulations for distributions with AR(1) and lag-1 autocorrelations matching those obtained with the Merge 3 data indicate that this red-noise correction results in uncertainty intervals that are approximately 0.1 C/century too large; the CO uncertainty intervals are approximately 0.1 C/century too small. I did not run Monte Carlo simulations for the other cases.

For both methods, the results assume the residuals to the fits are AR(1).
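
For readers who want to follow the mechanics, here is a minimal Python sketch of the two fits applied to synthetic monthly anomalies. The function names and the synthetic data are my own, and the simple (1+r1)/(1-r1) variance inflation stands in for Lee & Lund's finite-sample adjustment; this is an illustration of the approach, not the actual scripts used for the table above.

```python
import numpy as np

def ols_trend_ar1_ci(y, dt_years=1.0 / 12.0):
    """OLS trend with a simple AR(1) 'red noise' inflation of the slope uncertainty."""
    n = len(y)
    t = np.arange(n) * dt_years
    X = np.column_stack([np.ones(n), t])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    r1 = np.corrcoef(resid[:-1], resid[1:])[0, 1]             # lag-1 autocorrelation of residuals
    s2 = np.sum(resid**2) / (n - 2) * (1 + r1) / (1 - r1)      # inflated residual variance
    cov = s2 * np.linalg.inv(X.T @ X)
    return beta[1], 1.96 * np.sqrt(cov[1, 1])                  # trend (C/yr) and ~95% half-width

def cochrane_orcutt_trend(y, dt_years=1.0 / 12.0, tol=1e-8, max_iter=50):
    """Iterative Cochrane-Orcutt fit of y = a + b*t assuming AR(1) residuals."""
    n = len(y)
    t = np.arange(n) * dt_years
    X = np.column_stack([np.ones(n), t])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)               # start from plain OLS
    for _ in range(max_iter):
        resid = y - X @ beta
        rho = np.sum(resid[1:] * resid[:-1]) / np.sum(resid[:-1] ** 2)
        Xs, ys = X[1:] - rho * X[:-1], y[1:] - rho * y[:-1]    # quasi-differenced regression
        beta_new, *_ = np.linalg.lstsq(Xs, ys, rcond=None)
        if np.max(np.abs(beta_new - beta)) < tol:
            beta = beta_new
            break
        beta = beta_new
    resid_s = ys - Xs @ beta
    s2 = np.sum(resid_s**2) / (len(ys) - 2)
    cov = s2 * np.linalg.inv(Xs.T @ Xs)
    return beta[1], 1.96 * np.sqrt(cov[1, 1])

# Illustrative use with synthetic data: roughly -1 C/century plus AR(1) red noise.
rng = np.random.default_rng(0)
noise = np.zeros(90)
for i in range(1, 90):
    noise[i] = 0.5 * noise[i - 1] + rng.normal(scale=0.1)
y = -0.01 * np.arange(90) / 12.0 + noise

for name, (b, half) in [("OLS", ols_trend_ar1_ci(y)), ("CO", cochrane_orcutt_trend(y))]:
    reject = abs(0.02 - b) > half                              # is 2 C/century outside the interval?
    print(f"{name}: {b*100:+.1f} ± {half*100:.1f} C/century, reject 2C/century: {reject}")
```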

This month’s post is a quick summary, while I spend some time running Monte Carlo simulations to test another possible method. If you wish to review the caveats and responses to various comments, you may find them in the May post.
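
As a rough illustration of the kind of Monte Carlo check described in the footnote, here is a self-contained sketch. The AR(1) parameters are placeholders I made up, and the simple (1+r1)/(1-r1) inflation again stands in for the Lee & Lund adjustment: generate trendless AR(1) series, fit each with red-noise-corrected OLS, and count how often the ±95% interval covers the true trend of zero.

```python
import numpy as np

rng = np.random.default_rng(42)
n, rho, sigma = 90, 0.5, 0.1                     # assumed AR(1) parameters, illustration only
t = np.arange(n) / 12.0
X = np.column_stack([np.ones(n), t])
trials, hits = 2000, 0

for _ in range(trials):
    # Trendless AR(1) 'red noise' series with Gaussian innovations.
    e = np.zeros(n)
    for i in range(1, n):
        e[i] = rho * e[i - 1] + rng.normal(scale=sigma)

    # OLS fit with the simple (1+r1)/(1-r1) red-noise inflation of the residual variance.
    beta, *_ = np.linalg.lstsq(X, e, rcond=None)
    resid = e - X @ beta
    r1 = np.corrcoef(resid[:-1], resid[1:])[0, 1]
    s2 = np.sum(resid**2) / (n - 2) * (1 + r1) / (1 - r1)
    half = 1.96 * np.sqrt((s2 * np.linalg.inv(X.T @ X))[1, 1])

    if abs(beta[1]) <= half:                     # does the interval cover the true trend (zero)?
        hits += 1

print(f"Empirical coverage: {hits / trials:.3f} (nominal: 0.95)")
```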

26 thoughts on “Ninety Month Trends: IPCC AR4 2C/Century still outside ±95% uncertainty bands.”

  1. Lucia, totally OT, but if you have noticed a drop in traffic over the past few weeks, it is probably because Websense is now blocking your site as a “game” site. You should contact them to correct this.

  2. Lucia, totally OT, but if you have noticed a drop in traffic over the past few weeks, it is probably because Websense is now blocking your site as a “game” site. You should contact them to correct this.

    Must be all that talk about “Monte Carlo”…

  3. Thanks for the update Lucia.

    You will undoubtedly continue to get comments that 90 months isn’t enough time to conclude anything about Climate Change from temperature data, so maybe you ought to pay more attention to rainfall.

    http://news.bbc.co.uk/2/hi/south_asia/7511356.stm

    Ninety months too short for temperature? Just a couple of years of rainfall data seems OK (local data to boot!) to draw sweeping conclusions.

    “But in 2005 and 2006 our yearly rainfall was well below the average. We could well be witnessing a severe change in our climatic conditions because of global warming.”

    While the annual rainfall for 2007 was back to normal, Mr Rayen argues that the “pattern of delivery” of Cherrapunjee’s rainfall is changing. In previous years, 98% of the area’s rainfall was between March to October.

    This year the rains did not arrive until June, and the reason for that he says could be man-made.

  4. It “could” be. Anything within a range of possibilities is a possibility and could be. The question is whether that particular weather really is “man-made” or just weather. Could be.

  5. John M,

    It’s certainly enough time to conclude something about climate change; the question is what exactly it tells us. Lucia is doing a reasonably convincing job of showing that this data contradicts the decadal trend projections in the AR4 WGI.

    However, a 90-month stagnation in temperatures tells us very little about actual climate sensitivity over the long run, or about what sort of global temperature increases we would expect to accompany increasing atmospheric concentrations of greenhouse gases over the next century. So don’t be too quick to throw the proverbial baby out with the bathwater in this case.

  6. Also, Bob B, thanks for reminding me why I avoid reading Newsbusters. Noel’s confusion of ground level ozone and sulphate aerosols is particularly amusing, as is his misguided defense of the Montreal Protocol as a mechanism to “clean the air”. I don’t want to start a flame war over this, but surely any dispassionate observer has to realize that this particular article is anything but a model of lucidity.

  7. “However, a 90 month stagnation in temperatures tell us very little about actual climate sensitivity over the long run”

    Apparently, neither will the models. If they are off now, they would only be right in the future by random chance or for an instant in time, like a broken watch being right every day. It really suggests that either some assumption is wrong (e.g. climate sensitivity to some forcing) or some other physical process is not understood or included (including unknown or unaccounted for natural variations in some or all of the forcings).

  8. Zeke, where do you get a 90-month stagnation? Using OLS, 4 are negative and 1 is flat; under CO, all 5 are negative, 4 of them significantly so. 1 in 10 sounds like an outlier to me. I don’t know what 90 months of stagnation says, but I know that 90 months at -1.7 C/century says the climate models got some ’splainin’ to do.

  9. Zeke– Gavin has retracted and doublespoke about it. So what is your point, if you have one?

  10. Lucia,

    One question has been nagging me, and since my stats knowledge is poor, maybe you could explain something. Linear regression assumes that the y values are normally distributed and have a constant standard deviation, correct? The data in the graphs are actually averages based on the grid points on the globe (I assume). Is the data from which those averages are generated normally distributed? Does it matter?

  11. A normal Gaussian distribution would be ‘white noise.’ The recurring assumption in analysis of temperature anomalies is that they are ‘red noise.’ That is, a very hot year predisposes the next year to also be hotter than the norm. An autocorrelation.

    There are various tests to see if the residuals contain patterns or otherwise fail to conform to the assumption. The consensus seems to be “Yes, temperature data of this sort looks close enough to red noise for us.”

    Lucia words this more precisely:
    “For both methods, the results assume the residuals to the fits are AR(1). “

  12. BarryW:
    1) Linear regression assumes the residuals to the regression are normally distributed. Roughly speaking, it is the distance between each data point and the fitted line that should be normally distributed.
    2) It matters if they aren’t.
    3) I test (after someone asked me to). The Jarque-Bera and Chi-square tests say the assumption of normality for the current data is “not falsified.” (You can only test for falsification, or against a second assumption. But basically, the assumption of normality looks OK. It does break down if we get near a single volcanic eruption, though.)
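
    For anyone who wants to try this kind of normality check, here is a generic sketch with made-up data; it is not the actual test script, and the synthetic series and parameters are purely illustrative.

    ```python
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    n = 90
    t = np.arange(n) / 12.0                                # time in years
    y = 0.002 * t + rng.normal(scale=0.1, size=n)          # synthetic monthly anomalies

    # Fit a straight line, then test whether the residuals look normally distributed.
    slope, intercept = np.polyfit(t, y, 1)
    resid = y - (intercept + slope * t)

    jb_stat, jb_p = stats.jarque_bera(resid)
    print(f"Jarque-Bera p-value: {jb_p:.3f}")              # a small p-value would falsify normality
    ```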

  13. Alan–

    Red noise can also be Gaussian! The “red”/“white” distinction has to do with correlations at times (t, t+τ). Both can be Gaussian, or Poisson, or any number of other distributions.
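
    A quick numerical illustration (the parameters here are made up, purely for illustration): an AR(1) series driven by Gaussian innovations is “red” in the sense that successive values are correlated, yet its marginal distribution is still Gaussian.

    ```python
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    n, rho = 100_000, 0.6
    x = np.zeros(n)
    for i in range(1, n):
        x[i] = rho * x[i - 1] + rng.normal()               # Gaussian innovations

    print(f"lag-1 autocorrelation: {np.corrcoef(x[:-1], x[1:])[0, 1]:.2f}")           # ~0.6, i.e. 'red'
    print(f"skew: {stats.skew(x):+.2f}, excess kurtosis: {stats.kurtosis(x):+.2f}")   # both ~0: Gaussian
    ```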

  14. Sorry, that’s what I was trying to (clumsily) say. But the values you were plotting are the averages of some distribution. The point I was trying to get at was not that the residuals of the averages that you plotted were normally distributed, but would the underlying data’s residuals be also? Suppose that instead of plotting the averages, you took the grid values from which the averages were derived and plotted them for each time sample. Would their residuals also be normally distributed, and would that matter? In other words, could you get differing answers depending on whether you’re plotting the averages as opposed to the original data?

  15. Does Mr. Schmidt have a rebuttal regarding your findings, Lucia?

    Lucia: I think Gavin is Dr. Schmidt. That said, I’m Dr. Lucia– though only when I’m in a professional work-oriented capacity. I don’t consider blogging to be acting in my professional work-oriented capacity.

  16. The point I was trying to get at was not that the residuals of the averages that you plotted were normally distributed, but would the underlying data’s residuals be also?

    Not necessarily. There is an idea (not entirely proven for this setting) that if we take a huge collection of measurements, and each error is independent of the others, the average of those errors will ultimately be approximately normally distributed.

    So, it’s possible that when you average over the entire planet, the residuals to a fit will be normally distributed, while at the same time the residuals for, say, Chicago, might not be.

    In fact, with regard to “weather noise”, we know that the distribution of temperature around a mean is often far from normally distributed. Here is a histogram for Växjö, Sweden.

    You can see there is a tendency for temperature to cluster around the freezing point of water, and then around some other warmer temperature.

    Still, the idea for the planet is that even though the distributions of “weather noise” (defined based on whatever reference makes sense for a particular analysis) are not normally distributed at individual locations, the various non-normal features might average out when we average over the whole planet.

    Of course… they might not. That’s why one should check.
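
    Here is a toy check of that “might average out” idea, with entirely synthetic numbers rather than real gridded data: each “location” below has bimodal weather noise, but the average over many locations looks much closer to normal.

    ```python
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(3)
    n_months, n_locations = 90, 500

    # Each location's 'weather noise' is bimodal: two humps at -1 and +1 with a little spread.
    humps = rng.choice([-1.0, 1.0], size=(n_months, n_locations))
    local = humps + rng.normal(scale=0.3, size=(n_months, n_locations))

    global_mean = local.mean(axis=1)                       # crude area average (no latitude weighting)

    _, p_local = stats.jarque_bera(local[:, 0])
    _, p_global = stats.jarque_bera(global_mean)
    print(f"single location: Jarque-Bera p-value = {p_local:.4f}")    # tiny: clearly non-normal
    print(f"global average:  Jarque-Bera p-value = {p_global:.4f}")   # usually not rejected
    ```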

  17. Ken–
    Gavin dropped into comments. If I understand correctly, his position is that to falsify the IPCC projections one must use the magnitude of “weather noise” from the models. I think his impression is that this is somehow incorporated into the definition of “falsify”.

    I disagree. However, I did a further analysis showing that if we assume the properties of “weather noise” are those of the AR(1) process Gavin suggested in comments, and then perform a hypothesis test based on the actual “weather noise”, his suggested weather noise is falsified. This test is independent of the test for the trend.

    My hypothesis test is imperfect because Gavin didn’t say the AR(1) process totally matches the IPCC models. But, it’s the one he suggested, and it’s easy to show it can’t be correct.

    Gavin hasn’t rebutted this.

    However, I myself recognize some of the possible rebuttals. I’m planning to look at some “model data” to see whether the suggested AR(1) process matches the models, and to do a few more tests.
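
    To make the logic of that kind of test concrete, here is a generic sketch. The AR(1) parameters are placeholders of my own (not the values Gavin suggested), and the “observed” trend is just an illustrative number in the range of the table in the post: simulate many 90-month realizations of a 2C/century trend plus AR(1) weather noise, collect the fitted trends, and ask whether the observed trend falls outside the central 95% of that distribution.

    ```python
    import numpy as np

    rng = np.random.default_rng(7)
    n_months = 90
    true_trend = 0.02                          # 2 C/century, expressed in C/year
    rho, sigma = 0.5, 0.1                      # placeholder AR(1) weather-noise parameters
    t = np.arange(n_months) / 12.0

    trends = []
    for _ in range(5000):
        e = np.zeros(n_months)
        for i in range(1, n_months):
            e[i] = rho * e[i - 1] + rng.normal(scale=sigma)
        y = true_trend * t + e
        trends.append(np.polyfit(t, y, 1)[0])  # OLS trend of this synthetic realization

    lo, hi = np.percentile(trends, [2.5, 97.5])
    observed = -0.012                          # illustrative observed trend: -1.2 C/century
    print(f"95% range of simulated trends: [{lo*100:+.1f}, {hi*100:+.1f}] C/century")
    print("Observed trend outside that range?", not (lo <= observed <= hi))
    ```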

  18. Lucia,

    Suppose I had predicted, back in 2000, that temperature would fall at a rate of 2º/century for the next few decades. Could I now claim, on the basis of your figures, that only the NOAA OLS data would reject my hypothesis? That all other sets and combinations would fail to reject the hypothesis? Of course, I made no such prediction at the time but I wonder if I can use your figures to validate the hindcast my new model has just made.

    Basically, I am wondering if the statistical methods would change depending on whether the data is a forecast or a hindcast. Intuitively, it seems like cheating to use the same data to both assist in the creation of my model and then use the same data to validate it. However, as a good empiricist, I am not likely to propose a model that does not match past observations, so I can hardly ignore existing data when building my model. What is an honest modeller to do?

  19. Jorge– I checked quickly, and it looks like the answer is yes. Of course, you didn’t predict that, did you? 🙂

    When assessing the hindcast, I would draw on the same tools as for the forecast. BUT, I would remember it is a hindcast. And I would be very, very careful not to convince myself that data which existed before I began developing my model didn’t influence the model in some way. Sometimes modelers want to suggest that hindcasts of ancient data are somehow “out of sample”, or that the hindcasts are “out of sample” because the parameters in the models aren’t fit to GMST the way I fit data for the hypothesis test.

    But there is still a difficulty of influence. Scientists did develop theories about what influences climate based on associations observed over time. So the decision to include things like the effect of volcanoes was somewhat influenced by noticing they had an effect in the past. Other choices were influenced in similar ways.

    On the one hand, one might expect better luck because the models contain physics rather than pure tuning. But, on the other hand, this has to be proven. And the best way to prove something to an independent third party is to show the model can predict data the analyst absolutely, positively did not have access to before the model runs were made.

    For climate models, this means data from the future.

  20. I’ve just discovered your blog, Lucia, and am enjoying it.

    One of my worries about the way in which many (most) analysts seem to regard climate change is the apparent assumption that it is reasonable to fit a simple linear model to the data, and to contemplate the properties of the residuals from such a fit. One is of course able freely to select a model and to test its ability to describe satisfactorily the behaviour of the actual data. Any conclusions regarding the adequacy of the hypothesised model would be based on the inferential statistics derived by standard statistical techniques.

    However, apart from the /statistical/ considerations there are also “practical” ones. By this I mean that although the statistical inferences may be completely uncontroversial (being based on sound technology) they may have little practical value, simply because the effect they have statistically established as being “highly significant” may be so small that they would have little or no discernible practical impact.

    All this aside, I feel that there may be a very strong case for questioning the value of a simple linear model as a tool for describing, and even forecasting, the behaviour of a supremely complex system that lies in thrall to numerous chaotic (and thus non-linear) influences. Is there not a case for proposing a model or models that are defined in a less restrictive way?

    My own extensive investigations over the last 15 years of published (and I presume original) data sets of various provenances and types – instrumental, proxy, and assemblies of these of varying complexity (bafflingly so in some instances, such as the famous “Mann et al” data that gave rise, under Mann’s analyses, to the profoundly influential hockey stick plot) – have convinced me that climate has a pronounced tendency to be “stable” for periods that vary widely in extent, but to be subject to extremely abrupt changes that can be positive or negative. These observations have made me completely sceptical about the chance of climatologists ever being able to forecast with confidence what will happen next in a climatological sense.

    I would be happy to provide, by email, some thoughts and more importantly diagrams that give some evidence for this viewpoint, and would welcome a discussion.

    Robin

  21. Robin–
    Thanks for the kind words.
    My reason for fitting a linear trend is that the central tendency of the IPCC projections from 2000-2030 is linear. My goal is to test the hypothesis for the “underlying climate trend”, so I use that feature of their projections.

    There are other people who want to use more complex statistical models to do other things– like forecasting, explaining, teasing out relationships etc. My scope is more modest– I just want to test whether the IPCC’s projections bear out against data.

    If someone has another hypothesis, I’d be happy to test that too. But coming up with my own forecast is not any sort of major goal.

  22. When are you going to “publish” this in “peer reviewed” literature so it can be officially “cited”?

  23. David– I’m starting to look at the features of “weather noise” in models. So, to some extent, I’ll publish in the literature when I’ve figured out how to put this in the context of the “weather noise” in models.
