William Connolley seems to be having a little trouble understanding how one can apply statistics to a time series containing many data pairs, and then state a range of climate trends consistent with the noisy data.
As it happens, it's easy using standard statistical techniques for data containing noise. One of them is called "linear regression"; this is a type of averaging technique that permits us to identify average trends in data.
I have applied a regression to monthly Global Mean Surface Temperature data (aka "GMST"), averaged from four sources. The graph below shows the best fit trend in purple; that trend is the rate of warming most consistent with that particular data set. The blue lines indicate the 95% confidence bands for trends that could be consistent with the data. The best fit trend is m = 1.5 C/century; the 95% confidence interval is 1.0 C/century < m < 2.1 C/century.
This graph proves warming occurred during the indicated period with confidence greater than 95%. The exact same analysis says recent IPCC claims for future trends are incorrect.
The fact that the slope of the flattest blue line is greater than zero is empirical proof that global warming has happened. Why is it proof?
Because a straight line from 1979-2007 with a slope of zero, passing through the point where the two blue lines intersect, falls outside those blue lines. That is, its slope is lower than the lowest slope consistent with the data.
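For concreteness, here is a minimal sketch of this kind of test in Python. The temperature series is synthetic (an illustration, not the averaged four-source data used for the figure), and plain OLS is shown; the full analysis also needs a correction for serial correlation, such as the Cochrane-Orcutt procedure that comes up in the comments below.

```python
# Minimal sketch: fit a linear trend to a monthly temperature anomaly series
# and report the 95% confidence interval on the slope. Synthetic data are used
# purely for illustration, and plain OLS is shown without any correction for
# serial correlation in the residuals.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
months = np.arange(83) / 12.0                    # ~7 years of monthly data, in years
true_trend = 0.015                               # hypothetical 1.5 C/century = 0.015 C/yr
gmst = true_trend * months + rng.normal(0.0, 0.1, months.size)  # add "weather noise"

fit = stats.linregress(months, gmst)
t_crit = stats.t.ppf(0.975, df=months.size - 2)  # two-sided 95% critical value
ci = (fit.slope - t_crit * fit.stderr, fit.slope + t_crit * fit.stderr)

print(f"best-fit trend: {fit.slope * 100:.2f} C/century")
print(f"95% CI: {ci[0] * 100:.2f} to {ci[1] * 100:.2f} C/century")
# "No global warming" (slope = 0) is falsified to 95% confidence exactly when
# zero lies outside the confidence interval for the slope.
print("zero trend rejected:", not (ci[0] <= 0.0 <= ci[1]))
```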
Normally, this ‘falsification to 95% confidence’ of “no global warming” is not controversial. The method is not controversial.
However, I applied the technique to test the predictions of warming communicated by the IPCC in the AR4. I found their recent short term projections for warming are falsified to 95% confidence. Their best estimate for the near term trend in the earth's surface temperature is not consistent with observations of the earth's surface temperature since the time when the IPCC made its projections. The mean trend they projected was too high. The lowest trend they communicated to the public, in graphical form, was too high to be consistent with recent observations of the earth's surface temperature.
If the IPCC thought lower trends were likely, they failed to communicate these lower trends.
William objects to my conclusion … because weather is variable.
Evidently, William believes I can’t conclude the IPCC projections are too high, because there is some problem with the IPCC error bars. Specifically:
Because the error bars are not supposed to constrain the year-to-year variation.
Of course the IPCC error bars don't constrain year-to-year variability. Error bars for a predicted trend never constrain the year-to-year variability. They are intended to communicate uncertainty about the trend, or average, behavior.
Weather is variable, but many different realizations of weather could result in the same trend. To avoid the thorny problem of predicting weather, the IPCC only tries to predict the average behavior, aka trends. In the process, they also state an uncertainty in their prediction of the trend.
Because the IPCC only predicts trends, I compared the IPCC predictions of trends to trends consistent with the real earth’s weather. That is: I compared like to like.
What about William’s dice rolling analogy?
William suggests the dice rolling analogy shows what I did is wrong. This analogy is usually used to explain why one should compare averages to averages. That is, it is normally used to explain why the method I use works: I average over 83 monthly measurements of GMST and perform a type of average to determine the trend. In contrast, looking at the weather (aka the individual rolls of the dice) doesn't work!
One can, for example, determine the 'average' number of spots that appear when we roll a die. We would roll the die "N" times, and then calculate the average. Once we rolled the die, we could compare to the manufacturer's quality claim for fair dice; for example: "The average number on the face of this die will be 3.5 ± 0.1."
We don't test that claim by comparing to the variability of the outcome of individual dice rolls. Instead, we measure an average, and calculate confidence intervals to some stated certainty for the average. We then use the confidence intervals for the average to draw conclusions about the manufacturer's claim. If we roll the die "N" times, and compute a mean of 4.7 ± 0.5 to 95% confidence, we falsify the manufacturer's claim of 3.5 ± 0.1.
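To make the dice version concrete, here is a minimal sketch; the roll count, the observed mean, and the claimed tolerance are all illustrative numbers, not data from anywhere.

```python
# Sketch of the dice test described above: estimate the die's mean face value
# from N rolls, form a 95% confidence interval for that mean, and check the
# manufacturer's claimed average of 3.5 against that interval.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
N = 100
rolls = rng.integers(1, 7, size=N)      # N rolls of a six-sided die

mean = rolls.mean()
sem = rolls.std(ddof=1) / np.sqrt(N)    # standard error of the mean
t_crit = stats.t.ppf(0.975, df=N - 1)
ci = (mean - t_crit * sem, mean + t_crit * sem)

print(f"sample mean: {mean:.2f}, 95% CI: {ci[0]:.2f} to {ci[1]:.2f}")
# The claim "average = 3.5" is falsified (to 95% confidence) only if 3.5 falls
# outside the confidence interval for the mean. We never compare the claim to
# the variability of individual rolls.
print("claim of 3.5 rejected:", not (ci[0] <= 3.5 <= ci[1]))
```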
To test the IPCC’s claims about trends in the GMST, I computed the trend and confidence for GMST from the “experiment” called “the real earth’s weather”. I compared averages to averages, using 95% confidence intervals for averages.
Do the confidence intervals for the trend constrain weather variability?
No! And the confidence intervals for the average number of dots that appear on the face of a die don’t either! 🙂
The difference between confidence intervals on the trend (and average) and the weather (individual dice rolls) is illustrated graphically in my previous figure. Notice that, just as the IPCC confidence intervals don't constrain the year-to-year variability (aka 'weather noise'), the blue lines in my chart also don't constrain year-to-year variability.
So what of the weather variability?
Interestingly enough, standard linear regression techniques also provide an uncertainty for the year-to-year variability. Those are the yellow lines in my figure; the yellow lines enclose 95% of all data and describe weather variability about the mean. (If you wished to predict the GMST of individual months in the future, it would be prudent to use the gold uncertainty intervals. Those include the uncertainty due both to the "weather noise" and to the "slope".)
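For readers who want to see the two kinds of interval side by side, here is a sketch using the standard OLS formulas on synthetic data. The confidence band plays the role of my blue lines; the wider prediction interval plays the role of the yellow/gold lines.

```python
# Two kinds of interval from one regression: a confidence band for the fitted
# trend line (the blue-line analogue) and a wider prediction interval for
# individual monthly values (the yellow-line analogue, which also includes
# the weather noise). Synthetic data, standard OLS formulas.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
t = np.arange(83) / 12.0
y = 0.015 * t + rng.normal(0.0, 0.1, t.size)

n = t.size
fit = stats.linregress(t, y)
resid = y - (fit.intercept + fit.slope * t)
s = np.sqrt(np.sum(resid ** 2) / (n - 2))        # residual standard deviation
Sxx = np.sum((t - t.mean()) ** 2)
tc = stats.t.ppf(0.975, df=n - 2)

conf = tc * s * np.sqrt(1.0 / n + (t - t.mean()) ** 2 / Sxx)        # mean line
pred = tc * s * np.sqrt(1.0 + 1.0 / n + (t - t.mean()) ** 2 / Sxx)  # new points

# The prediction interval must enclose ~95% of individual monthly values, not
# just the plausible trend lines, so it is always the wider of the two.
print(f"half-widths at midpoint: confidence {conf[n // 2]:.3f} C, prediction {pred[n // 2]:.3f} C")
```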
These yellow lines are never used to test whether or not a predicted trend is correct.
Proving a predicted trend is right or wrong by pointing to weather variability would be as silly as "proving" the average number of dots that appear when we roll a die is not 3.5 by rolling that die once and discovering 4 dots appeared!
To do these tests, we compare averaged quantities to averaged quantities. We compare confidence intervals for averages to confidence intervals for averages. If the averages happen to be average trends, this principle still holds.
So, can we identify trends from noisy data?
Yes. We can identify trends from noisy data. That’s the way AGW was proven in the first place. When we do that, the data since 2001 indicate that either:
a) The published IPCC projections are too high (to 95% confidence) or
b) The IPCC communicated uncertainty intervals that were too small.
Before the AR4 was published, others warned against the adoption of such narrow uncertainty intervals. Roger Pielke Jr. and Roger Pielke Sr. both stated:
Such predictions represent a huge gamble with public and policymaker opinion. If more-or-less steady global warming does not occur as forecast by these models, not only will professional reputations be at risk, but the need to reduce threats to the wide spectrum of serious and legitimate environmental concerns (including the human release of greenhouse gases) will be questioned by some as having been oversold. For better or worse, a failure to accurately predict the changes in the global average surface temperature, global average tropospheric temperature, ocean average heat content change, or Arctic sea ice coverage would raise questions on the reliance of global climate models for accurate prediction on multi-decadal time scales. Surprises or experience that evolve outside the bounds of model output would likely raise questions even among some of those who have so far accepted the IPCC reports as a balanced presentation of climate science. (for a perspective different than the IPCC on applications of climate models see this).
Steady warming has not occurred since 2001. Last year, climate modelers could embrace error bars, stating that the flat temperature trends were not statistically significant. But the trend has persisted sufficiently long, and the recent decline is sufficiently deep, for people to correctly state that the trend since 2001 (and longer) is inconsistent with the IPCC projection of +2C/century.
Stronger things could be said. Stronger things will be said. And dice throwing analogies aren’t going to work, because all the statistical statements are based on many rolls of the die.
===
Update
March 18. David Stockwell suggested I read through and add "to 95% confidence" to avoid giving the impression that anything can be 100% falsified using statistics. I stepped through. If I missed any, it is my intention to convey that the tests I am doing are to 95% confidence, which is a level used by the IPCC.

Hi Lucia. You write posts with hurricane intensity, and an eye of clarity.
One quibble I have is your unqualified use of the terms 'falsification' and 'proof'. If you use the 95% confidence limit, and there is nothing special about this limit, there is still a 5% chance you could be wrong. I am sure you know this, but 'falsification' and 'proof' imply an absolute certainty that isn't there in a statistical determination. Ideally, statements would simply have a p value, e.g. 0.83, and we wouldn't use CLs at all.
As a rough guide, I think CLs of 90% are barely significant, and 99.9% is needed to be almost certain.
David– Yes. I guess I'm using 95% because that's a value the IPCC uses. I'll look through and check for 95% confidence limits. I agree nothing is ever utterly falsified with statistical tests!
“a) The published IPCC projections are too high (to 95% confidence) or
b) The IPCC communicated uncertainty intervals that were too small. ”
c) The IPCC communicated uncertainty intervals that were for something else and aren’t suitable for this purpose
c) is actually the case.
Using the right uncertainty levels would be expected to reduce your 95% CL. By how much is not clear. Yes the IPCC don’t provide those levels on this graph, since that’s not what the graph is intended to portray, but that doesn’t make the problem go away.
We also have information that is not contained in your data set that would lead us to reject your best fit slope as too low. Right now your result is telling us that it will be more likely to cool than warm over the next decade or so, whereas the reality is that everything we know tells us that is not the case.
Of course ultimately this will all come out in the wash as more data comes in. Nonetheless not taking this stuff into account is going to make it take longer than it should.
FrankD:
The uncertainties the IPCC communicated to the public, and published, fall outside the range that is consistent with the data. That's all I'm saying. Had they published other values, or communicated them, it is possible that their "extremely unlikely" range would have overlapped the "extremely unlikely" range consistent with the data. However, since they did not publish this information, we can only test what they communicated.
Policy makers and the public develop their impression of what the IPCC is predicting from the figures, words and numbers they publish.
As for your suggestion that there is information not contained in my analysis: If you have relevant information about measured values for GMST during the period from 2001- now, I’d be happy to include it. Let me know, and I will.
I'm puzzled about your suggesting that my regression predicts anything. Linear regressions are not predictive. As far as I am aware, my regression tells us nothing about what is likely in the future. It is a diagnostic to describe the range of trends that could be consistent with recent measured data.
Why would you think it is predictive?
lucia, do this very simple experiment. add 24 years onto the end of the 6 years 2 months you have. do this in two ways.
first, go back to march 1975. copy the next 24 years and graft them onto the end of your data (just use hadcru, for example). to make the graft "fit" you will have to add an offset; I would suggest .4C. this will "simulate" the kind of warming we saw from 1975 through 1998. now, run the analysis again. then do the same thing but with 24 years of data
where you simulate a 24-year period of .2C warming (just add white noise to a .2C trend line). it might be interesting
to see what has to happen to catch up to the ipcc projection.
steven mosher,
It might be fun, but I prefer to avoid spending time doing idiosyncratic projections, particularly when I would find it difficult to explain the meaning of the projections.
Either we will catch up to the IPCC projections or we won’t. There are many hypothetical strings of weather that could cause us to catch up, but the correct way of showing that would be difficult. If you want to show one, I’d enjoy reading it. If you have an account that permits you to post figures, this blog permits images to be inserted into comments.
Thank you for making your point even clearer in this post. I really do not see what more you can do to explain your work, and suggest that anyone who still does not understand probably never will.
Can I ask a question that addresses the issue in the opposite way to your recent posts about the IPCC projections? To what extent is it correct to say that there has been no statistically significant global warming over the last six/seven/eight/nine/ten years? Your recent work has been on the hypothesis that the climate has been on a warming trend of 2 degrees a century since 2001; could you put the question the other way by testing the hypothesis that “There has been a warming trend of 0 degrees” over the last decade or so?
The self-described Hansen's Bulldog made a series of posts two or three months ago attacking the suggestion that "there has been no global warming since 1998", and one of his fans posts links to that work whenever he sees this alleged. From http://tamino.wordpress.com/2008/01/09/dead-heat/ :
I apologise if I am giving you more work than you need, but I would certainly welcome a post looking at the recent trend in global temperatures, perhaps stating how far back one can go with the statement "There has been no statistically significant warming trend since 200* or 199*" while keeping it true to a 95% confidence level.
Patrick:
Using this test there has been warming and it’s statistically significant to the 95% confidence level. (It’s significant to more than that– 99% or so. I just don’t memorize these p values.)
So, I only newly taught myself Cochrane-Orcutt. I'm communicating with an econometrician, and there are other, better tests. So, I'll be learning those. It turns out that every test is subject to false positive and false negative issues. Which happens depends on the test, the features of the data, etc.
(The teaching myself may sound odd. But, in my background, I've dealt with two extremes: time series that are soooooo correlated that one is interested in taking enough data to get the spectral characteristics, and time series where one makes darn sure the individual data points are uncorrelated. So, I understand statistics in general, but have not often used the techniques econometricians use. These intermediate cases are subject to all sorts of odd behaviors in terms of statistical evaluations.)
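Since Cochrane-Orcutt keeps coming up, here is a minimal sketch of the textbook iteration, run on synthetic AR(1) "weather noise". This is an illustration of the general procedure, not the code actually used for the posts.

```python
# Minimal sketch of the Cochrane-Orcutt procedure: correct an OLS trend for
# lag-1 autocorrelation in the residuals by quasi-differencing the series with
# the estimated rho and refitting, iterating until rho converges.
import numpy as np
from scipy import stats

def cochrane_orcutt(x, y, tol=1e-6, max_iter=50):
    rho = 0.0
    for _ in range(max_iter):
        xs = x[1:] - rho * x[:-1]                # quasi-differenced series
        ys = y[1:] - rho * y[:-1]
        fit = stats.linregress(xs, ys)
        a = fit.intercept / (1.0 - rho)          # recover untransformed intercept
        resid = y - (a + fit.slope * x)
        rho_new = np.sum(resid[1:] * resid[:-1]) / np.sum(resid[:-1] ** 2)
        if abs(rho_new - rho) < tol:
            break
        rho = rho_new
    return fit.slope, fit.stderr, rho_new

rng = np.random.default_rng(3)
t = np.arange(83) / 12.0
noise = np.zeros(t.size)
for i in range(1, t.size):                       # AR(1) noise with rho = 0.6
    noise[i] = 0.6 * noise[i - 1] + rng.normal(0.0, 0.08)
y = 0.015 * t + noise

slope, stderr, rho = cochrane_orcutt(t, y)
print(f"trend: {slope * 100:.2f} C/century, stderr: {stderr * 100:.2f}, rho: {rho:.2f}")
```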
Oh… on this:
All Tamino showed was that the temperatures fell inside some wide temperature bars for the weather. See the yellow bars? Yes. All the data fall inside the yellow lines. Of course they do. They would if I did the same thing for weather from 1800 to now.
Showing the data fall inside the very wide error bars including all weather is not the standard way to compare the trends before and after. You are supposed to compare the slopes of the lines and the uncertainty in those lines.
The reason you need a lot of data is that the uncertainty in the slope, as obtained from data, varies roughly as 1/[(square root of the number of data points) (spread in time)]. (The scatter about the line is itself estimated using the number of data points minus 2 degrees of freedom.)
So, say we expect the scatter about the line to be 0.1C on average. (That is, the temperatures tend to be about 0.1C away from the line.)
If we take 30 data points spread over 30 years, we expect the uncertainty in the slope to go as
~ 'C' 0.1C/[√30 × 30 years]
where 'C' is a constant.
If we take 10 data points spread over 10 years, we expect the uncertainty in the slope to go as
~ 'C' 0.1C/[√10 × 10 years].
So, you can see that the uncertainty in the slope for 30 years is much much less than for 10 years. And the uncertainty for 5 years is even greater than for 10 years.
But this doesn't mean we can't apply statistics to five years, and it doesn't mean you should do what Tamino suggests to test slopes.
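Here is a short numerical sketch of that scaling, assuming one data point per year and 0.1C of scatter about the line; the constant 'C' is absorbed into the standard OLS formula.

```python
# Sketch of the scaling argument above. For scatter sigma about the line, the
# standard error of an OLS slope is sigma / sqrt(sum((t - tbar)^2)); for evenly
# spaced data this shrinks like 1/[sqrt(n) * (time span)].
import numpy as np

def slope_uncertainty(n_years, sigma=0.1):
    t = np.arange(n_years, dtype=float)          # one observation per year
    return sigma / np.sqrt(np.sum((t - t.mean()) ** 2))  # in C per year

for n in (5, 10, 30):
    print(f"{n:2d} years: slope uncertainty ~ {slope_uncertainty(n) * 100:.2f} C/century")
```

For 5, 10, and 30 years this gives roughly 3.2, 1.1, and 0.2 C/century, which is why short records admit such a wide range of slopes.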
tell everyone not to worry; Phil Jones has predicted that 2008 will be in the top 10 warmest years.
you know the feb data is in for all four measures
Lucia,
The uncertainties the IPCC communicated to the public, and published, fall outside the range that is consistent with the data.
Just as apples fall outside the range that is consistent with oranges.
Had they published other values, or communicated them, it is possible that their "extremely unlikely" range would have overlapped the "extremely unlikely" range consistent with the data.
Or it’s possible that it would have overlapped even more than that. However your statement of 95% CL rests on the assumption that they do not overlap at all.
That’s all I’m saying
The headline of this thread would create a different impression. Indeed hasty readers could be forgiven for concluding that you have demonstrated the entire output of the IPCC to be false.
As for your suggestion that there is information not contained in my analysis
Of course there is information not contained in your analysis. According to your analysis of the data the most likely current trend is one of cooling. Do you really think the most likely climate trend in the real world is one of cooling? Have all those birds and butterflies returned to their previous habitats since 2001? Is there any reason to consider that the warming trend would radically switch in 2001 just because the IPCC made some predictions?
I'm puzzled about your suggesting that my regression predicts anything. Linear regressions are not predictive. As far as I am aware, my regression tells us nothing about what is likely in the future.
Well your graph actually includes the future, for a kickoff. I can only go on the information you communicate.
Frank– I'll concede I drew the lines long, and added a note to the effect that past 2008 is an extrapolation.
However, my uncertainty intervals include warming. In contrast, the IPCC uncertainty intervals do not include the event we just had. In any case, I already discussed that the test for 95% confidence intervals does not require the IPCC to show their 95% confidence intervals.
You seem to be operating under the notion that if the 95% uncertainty intervals for the IPCC fall just inside those for the weather, something about the IPCC is not falsified. In reality, if that occurred, the central tendency of the IPCC projections would be falsified to better than 99%. The 1-sigma would be falsified. In fact, the only thing that would not be falsified is events they specifically said were very unlikely– that is, events that occur less than 2.5% of the time.
So, in short, you seem to be repeating an incorrect notion over and over. Repeating it over and over isn’t going to make it correct.
Lucia,
“In contrast, the IPCC uncertainty intervals do not include the event we just had.”
We don’t have the IPCC uncertainty intervals for short-term projections.
“You seem to be operating under the notion that if the 95% uncertainty intervals for the IPCC fall just inside those for the weather”
No, that is just a strawman.
I am (still) saying that the degree of overlap will affect your CL and how much is not clear, because we don’t know how much they overlap. They might overlap more than ‘just inside’ and they might even include your mean. You’re only considering the worst case possibility that the overlap will be negligible (a process that could be characterised as picking stones since we are by now running out of types of cherries).
You are by turns saying that the IPCC error bars don’t matter, expressing surprise that they are missing, noting that we don’t have them, and then making unwarranted assumptions as to how much they would overlap if we did have them.
By analogy if I made a projection that some variable’s trend over seven years would be 1.2 +/- 500 that would just be a fancy way of saying ‘I can’t predict what the trend will look like over 7 years’. It would not be a ‘falsification’ if the data realisation showed the central trend was -0.6, and indeed it would be almost pointless to go about falsifying such a projection.
However if additionally I said the trend over 20 years would be 1.2 +/- 0.05 then I would be expressing much more confidence that it would not trend down over 20 years. If I then went on to draw a figure illustrating the trend for 20 years without any error bars at all, that would change nothing about the 7 year ‘falsification’ discussed previously. It would not become more reasonable just because I didn’t show error bars on a graph intended to communicate something else.
No Frank:
You seem to suggest I need to know some other error bars. I am saying that the 1-sigma error bars on the trend are precisely what we need to do the analysis I did. I used those.
The IPCC decided what phenomena to include when estimating these quite specific uncertainty intervals. You and others seem to suggest that
a) they should have included other factors, which you believe are well understood.
b) I should pretend the correct uncertainty intervals are those for the weather, or
c) we need to show that 95% confidence intervals don't overlap 95% confidence intervals.
With regard to
(a): Yes. I think the IPCC should have included uncertainties due to all known factors in the trend. If they failed to do so– as you suggest– then I am surprised, and I would criticize them for that.
(b) The correct uncertainty intervals for this analysis are not those for the weather and
(c) The IPCC didn't provide 95% confidence intervals for the mean. If they had, I could make additional inferences or conclusions, beyond those I have discussed. But they did not. So, I can't. But you seem to believe that I need those intervals to make the conclusions I drew here. I don't need them.
This discussion is getting repetitious. Unless you say something new, that’s it for me. 🙂