It’s September 29, and I’m finally reporting the test of the hypothesis that the global mean surface temperature is rising at a rate of 2C/century, which is ‘purt dang close to the trend represented by the average of models in the IPCC AR4. That is to say, it represents a central tendency for the predictions of IPCC models used in the AR4. (For a graph of the average of model runs compared to 2C/century, read 1.)
As readers generally want to know the results more than the basis for the results, I’ll provide the executive summary first: The hypothesis of 2C/century is not rejected at the 95% level based on the average of NOAA, HadCrut3 and GISS using a statistical model treating “weather noise” as “AR(1)+white noise”; that is, it does not falsify. However, using the IPCC terminology, based on this test we would say we have “very low confidence” in the hypothesis that 2C/century represents some underlying climate trend masked by “noise”.
Why only “very low confidence” in 2C/century this month?
As readers know, I vary the form of the hypothesis test. This permits us to see how varying the assumptions related to the statistical model affects our conclusion as to whether the difference between the current trend and 2C/century is statistically significant. I believe this is useful because those who don’t want to dig into the math are often perfectly well aware that the analyst made assumptions about various things, and those assumptions often drive the results.
Last month, I posted results using four separate statistical models. Two of those tests used a “2C/century trend & AR(1) + white noise” statistical model. In addition to the trend, that model requires three parameters to describe the “weather noise” (i.e., the residuals to the trend). Last month, I assumed the “white noise” corresponded to “measurement uncertainty”, and I estimated the parameters by assuming the variance of the measurement uncertainty was known.
This month, I am using the “2C/century trend & AR(1) + white noise” statistical model, but I obtained the magnitudes of the three parameters from a fit to data collected during a historical period when stratospheric aerosols were light and did not vary substantially. Because the normal source of stratospheric aerosols is large volcanic eruptions, I’ll nickname this period the “volcano lite period”. This month’s choice results in larger uncertainty intervals than last month’s choice of assumptions.
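For readers who like to see the machinery, here is a minimal sketch of what one realization of a “trend + AR(1) + white noise” simulation might look like. The parameter values below are made-up placeholders for illustration, not the values fitted to the volcano-lite period:

```python
import numpy as np

rng = np.random.default_rng(0)

def simulated_trend(n_months, phi, sigma_ar, sigma_white, trend=2.0):
    """Simulate one realization of 'trend + AR(1) + white noise' monthly
    anomalies and return the OLS trend in C/century."""
    t = np.arange(n_months) / 1200.0            # months expressed in centuries
    eps = rng.normal(0.0, sigma_ar, n_months)   # AR(1) innovations
    ar = np.zeros(n_months)
    for i in range(1, n_months):
        ar[i] = phi * ar[i - 1] + eps[i]        # AR(1) recursion (spin-up ignored)
    y = trend * t + ar + rng.normal(0.0, sigma_white, n_months)
    return np.polyfit(t, y, 1)[0]               # OLS slope, in C/century

# 10,000 simulated trends over ~93 months of data; phi and the sigmas
# here are placeholders, not the fitted parameters from the post.
trends = np.array([simulated_trend(93, 0.5, 0.1, 0.1) for _ in range(10_000)])
```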
Why this period?
From the point of view of externally applied forcings which affect the trend in GMST, the “volcano lite period” is chosen as the historic period most closely representative of the period since 2001. However, it is worth noting that the 30-year historic “volcano lite period” includes observations of GMST from months when the stratosphere was still clearing of aerosols. Consequently, using that period may overestimate the variability of weather noise during periods when the stratosphere is calm. In this sense, we may tend to fail to falsify trends that are false because we assumed too much “weather noise” before doing any mathematical manipulations. That said, there are some factors that may cause us to underestimate the variability in the weather noise, which I plan to discuss more fully later. (For more detail on the time period, read 1, 2 and 3.)
How were the parameters obtained?
The parameters for the “noise” were obtained algebraically, in a method similar to that used to obtain ARMA(1,1) parameters for HadCrut3 during the same period. However, the algebra is slightly different. I plan to discuss the specific parameters obtained and compare the “white noise” to estimates of measurement noise in a later post.
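For the curious, here is one way the algebra can go. This is a sketch of a method-of-moments fit using the lag-0, lag-1 and lag-2 sample autocovariances; it is my reconstruction of the kind of calculation described, not necessarily the exact algebra used:

```python
import numpy as np

def fit_ar1_plus_white(residuals):
    """Method-of-moments fit of AR(1)+white noise to trend residuals.

    For y = x + w, with x an AR(1) process and w white noise, the
    autocovariances satisfy gamma_k = phi**k * var_x for k >= 1 and
    gamma_0 = var_x + var_w, so three sample autocovariances pin down
    the three parameters.
    """
    r = np.asarray(residuals, dtype=float)
    r = r - r.mean()
    n = len(r)
    g0 = np.dot(r, r) / n            # lag-0 autocovariance
    g1 = np.dot(r[:-1], r[1:]) / n   # lag-1
    g2 = np.dot(r[:-2], r[2:]) / n   # lag-2
    phi = g2 / g1                    # AR(1) coefficient
    var_x = g1 / phi                 # variance of the AR(1) component
    var_w = max(g0 - var_x, 0.0)     # remainder attributed to white noise
    return phi, np.sqrt(var_x), np.sqrt(var_w)
```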
Data Sets
This method can only be applied to data sets from groups that were reporting data as far back as 1913. For this analysis I chose land/ocean surface measurements from NOAA, HadCrut3 and GISS, and then created a fourth set by averaging over the three previous sets. The main motivation for the “averaged” group is to obtain a value that falls between the extremes of the agency currently reporting the largest trend in GMST and the one reporting the smallest trend in GMST.
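In code, the “averaged” set is just the elementwise mean of the three series, once they are aligned to a common time axis and anomaly baseline. The file names below are hypothetical stand-ins:

```python
import numpy as np

# Hypothetical files holding the three monthly anomaly series, already
# aligned to a common time axis and anomaly baseline.
noaa = np.loadtxt("noaa_anomalies.txt")
hadcrut3 = np.loadtxt("hadcrut3_anomalies.txt")
giss = np.loadtxt("giss_anomalies.txt")

avg3 = np.mean([noaa, hadcrut3, giss], axis=0)  # the fourth, "averaged" set
```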
Results based on AR(1)+White Noise
The results for this analysis are summarized in the table below.
| Data set | p (m < m observed) | Observed Trend (m) | Is 2C/century consistent with data? |
|---|---|---|---|
| HadCrut3 | 2.49% | -1.23 C/century | Inconsistent: 2C/century falsified |
| GISS | 4.18% | -0.24 C/century | Very low confidence |
| NOAA | 3.45% | -0.12 C/century | Very low confidence |
| Average of 3 | 2.55% | -0.59 C/century | Very low confidence |
Terms: I have translated ‘p’ values for the hypothesis test into one of five standard terms. Four are taken from IPCC usage. I add the term “falsified” if the result of a particular analysis indicates we should reject the hypothesis at the 95% level; that is, the observed trend falls in a range with less than a 1-in-20 chance of occurring if the hypothesis were true.
As you can see, if we use IPCC language, based on this statistical model we would conclude that we have “very low confidence” that the current trends are consistent with 2C/century + “weather noise”. However, we would not reject 2C/century at the 95% confidence level based on GISS, NOAA or the average of the three data sets. HadCrut3 is rejected at the 95% confidence level, i.e. this test says 2C/century is to be treated as false– or falsified.
For those who like graphs, I created a histogram illustrating the distribution of trends around 2C/century one would expect if the “climate trend” were 2C/century but the “weather noise” were described by an AR(1)+white noise process with parameters from the “volcano-lite” period I used as the basis for estimating “weather noise”:

As you can see, 2.55% of the “simulated weather” cases fell below the observed trend of -0.588 C/century based on the average of NOAA, GISS and HadCrut.
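That count is one line of bookkeeping on the simulated trends. This sketch reuses the `trends` array from the simulation sketch above:

```python
import numpy as np

observed = -0.588                 # C/century, average of NOAA, GISS, HadCrut
p = np.mean(trends < observed)    # fraction of simulated trends below observed
print(f"one-sided p = {p:.2%}")   # ~2.55% in the run reported above
```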
Are there caveats?
As always, there are caveats. The test assumed we can estimate the statistical properties of “weather noise” based on a particular “volcano lite” historic period. This period is only 30 years long, so there is some imprecision in the estimate of the parameters themselves. The test also assumed the “weather noise” is AR(1)+white noise (this can also be expressed as an ARMA(1,1) process). If the true process is AR(1), then the spread of trends consistent with 2C/century shown above is larger than the true spread. If the true process is something else entirely, the uncertainty intervals may be too large or too small; it is impossible to know which without specifying the candidate for the “true process”.
It is also worth noting that since I run 10,000 cases of simulated weather, I would expect to obtain slightly different results each time. So, oddly enough, there is imprecision in the p value! To be precise, I should also report that I expect a standard deviation of ±0.16% on the 2.55%. I could run more cases to reduce the ±0.16% interval, but the major source of uncertainty is that due to the uncertainty in the parameters for the AR(1)+white noise model. (Some readers will remember that when I used the ARMA(1,1) fit to the same data, HadCrut did not falsify. As I noted, rounding the parameters makes a difference, and that matters when the p’s are very near 2.5%.)
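For concreteness, the ±0.16% is just the binomial standard error of a Monte Carlo proportion; this is my reconstruction of the arithmetic:

\[
\sigma_{\hat{p}} = \sqrt{\frac{\hat{p}\,(1-\hat{p})}{N}} = \sqrt{\frac{0.0255 \times 0.9745}{10{,}000}} \approx 0.16\%
\]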
Still, based on this analysis with these assumptions, we have very low confidence that 2C/century is correct.
What will happen in September?
Beats me! But as we get more data, the bell shaped curve in figure one will get narrower and taller. That means there will be less uncertainty in the estimate of the trend– that’s the major effect of more data. So, next month, you’ll see a tiny difference in the spread. When we get the new data, we’ll see if it stays inside or falls outside the 95% confidence interval.
Or, if you know a psychic, maybe she can tell you now! 🙂
My psychic is rather low-budget and uses various proxies in lieu of a crystal ball. After careful instrumental and intuitive autocorrelation, she has concluded that the IPCC predictions will go from “very low confidence” to “deeply disappointing, if not silly”.
I am pretty sure I was ripped off on this reading, but she is nevertheless entertaining, is a lot cheaper to operate than the IPCC, occasionally provides upset NFL picks, and is thus a better overall value than the IPCC.
Cheers.
My psychic uses an ensemble model for his soothsaying. This wouldn’t be too bad, but the spread is so wide it predicts everything. So the only prediction which could be considered certain is the claim that one of his predictions will turn out correct. 🙁
George and Raphael–
My psychic insists her prediction for the 2008 presidential election is correct because she can hindcast the winners of the past 10 races.
My psychic failed utterly to predict that I would be so broke as a result of the crash I wouldn’t be able to afford a personal psychic any more, so she was rubbish anyway.
Perhaps you could get something nice and optimistic together on the economy, Lucia?
SteveUK,
I think the prediction is:
1) If you have cash, there will be buying opportunity sometime in the next 6 months.
2) If you have cash, and would like a Florida condo, there will be a buying opportunity sometime during the next 2 years.
If you don’t have cash, too bad for you!
With any luck, you won’t have done what an acquaintance of my brother-in-law did. He or his wife inherited some money. They decided to buy one Florida condo pre-bubble-burst. They sold. They decided that was so easy and splendid, they took their profits, borrowed more, and bought contracts on seven, count them, seven condos under construction! Then the bubble burst.
I don’t know if the condos are built yet, but as they come on the market, they will have big obligations on 7 condos now worth much less than they agreed to pay. Plus, if they can’t sell, they will be obligated to pay taxes and association fees– which have skyrocketed as a result of the rise in hurricane insurance rates. I feel sorry for these two… but my sympathy is also somewhat limited as… well…. they chose to leverage themselves to the hilt. Also, I think they are still generally ok.
Lucia, thank you so much, that’s really cheered me up. There’s nothing like people worse off than you for brightening up the day.
At the moment I have cash.
But it’s in banks.
Wooooh dear…!
So the transient climate response appears to be lower than expected. That does not mean that it is low, or that the equilibrium sensitivity is low.
My psychic predicted a rise of 0.5C over the next two years, falsifying Lucia’s ‘falsification’.
‘Don’t be absurd,’ I said. ‘That’s twelve times the IPCC projected rate, it ain’t happenin, babe.’
‘I knew you’d say that,’ she replied. ‘It happens to be exactly the rise in the 2 years leading up to Feb 1998, the last El Nino – we’re about due a big one, what with all these greenhouse gases building up and all.’
‘Hang on, wait just a moment,’ I objected. ‘An El Nino, that’s weather, that’s not climate. The IPCC is about climate, dammit – long term trends, decades, even more.’
‘Ssshhhh’ she soothed. ‘I knew you were going to say that, some day Lucia will apply her analytical prowess to climate, not weather. I promise …..’
Bender– I do think the predictions for the early 2000’s are dominated by whatever they estimate for the transient climate response. No, this does not mean equilibrium sensitivity is low. We can’t test that easily because we can’t do the correct experiment!
bender,
It is a question of odds and what you are willing to bet based on those odds – not facts. I certainly do not want to make huge bets on anti-CO2 policies given the current odds. That could change with another 5 years of data, but that is the beauty of science: if the facts change, you can/should change your mind.
Lucia,
I predict there will soon be a huge volcano that will take your trend lines even more negative and further away from the +0.2ºC projection. This will also lead to a huge increase in variance and the +0.2ºC will once more fall within the range of plausible trends.
This will make everybody happy. Warmers will say their projection is on track and coolers will rejoice at the even larger cooling trend.
My local psychic was called away unexpectedly so I had to make the prediction myself. 🙂
Jorge–
If a volcano erupts, warmers will say that the scenario for the projection no longer applies! Coolers will of course just note the trend itself, and point out it was low before the eruption. Then, in 2012 or so, another IPCC report! 🙂
Lucia,
It appears that Bob Carter has cited your blog in a paper:
http://www.eap-journal.com/index.html
(click on Carter’s name to download the PDF)
The primary problem with that craftily written paper is that lack of GCM skill in the short run does not in fact imply that the models do not work in the long run. At the bottom of p. 185 he dismisses GCMs as a basis for policy-making. However, as is typical for most economists, he’s looking at short time horizons – time-scales over which GCMs were not designed to perform well*. Yet the issue is global policy over much longer time horizons. Second, his argument is made especially easy by conveniently ignoring the proposition that Earth’s climate is approaching a tipping point (450ppm CO2, according to Hansen) whereupon later action will be ineffective. Note I am not saying this is correct, just that the proposition is ignored.
These two oversights are highly damaging to the paper’s credibility and usefulness. The author should consider a separate paper that treats these subjects a little less recklessly, because as it stands his argument is critically incomplete.
Looking back at the title of the paper, it is easy now to spot the straw man. “Where’s the evidence?” You’re not going to see it the way he’s looking at the data. You have to open your eyes.
*I’m not convinced he understands the difference between equilibrium climate sensitivity and transient climate response. The latter may be more muted than what the models suggest. That does not imply the former is equally biased. This refutes his impatient supposition that we should surely have the evidence in front of us by now. Yes, Earth’s climate changes, but it does so slowly relative to the economist’s and politician’s attention span. Patience, Dr. Carter.
gravityloss, check #5601. Please note it was a crosspost with your most recent.
Bender re. Bob Carter
“However, as is typical for most economists, he’s looking at short time horizons – time-scales over which GCMs were not designed to perform well*”.
Bender, Bob Carter is not an economist – far from it.
http://members.iinet.net.au/~glrmc/
From the biographical link above “Bob Carter is a Research Professor at James Cook University (Queensland) and the University of Adelaide (South Australia). He is a palaeontologist, stratigrapher, marine geologist and environmental scientist with more than thirty years professional experience, and holds degrees from the University of Otago (New Zealand) and the University of Cambridge (England). He has held tenured academic staff positions at the University of Otago (Dunedin) and James Cook University (Townsville), where he was Professor and Head of School of Earth Sciences between 1981 and 1999”.
As a palaeontologist & geologist Carter’s mode of thinking & framing of problems is most decidedly long term. I haven’t read the article yet but wanted to point this out to clear any misconceptions.
However, one question for you: over what time-spans are GCMs designed to perform well, and how successful have they been over those time-spans?
bender:
It is almost impossible for me to take seriously economic projections of this kind. The underlying climate predictions are neither precise nor reliable. Macro-economic modeling is even worse. I think the way to do it, saving time and money, is to draw the axes on graph paper, carefully pick a color to represent error bars, fill in the page with that color, put a cover on it, and title it “The Hell If We Know.”
In essence, it’s the defenders of the hockey stick handing off to guys who never get predictions right (current economic mess being a case in point) for a compound error on a (literally) global scale.
Sure. But on the other hand, the IPCC consensus projection in 1990 was 3C/century for the period one would call “about now”, and that didn’t happen. So, we are left with:
a) the only long term projections appear more or less wrong (for whatever reason)
b) more recent projections/predictions cannot be tested over the period where they are claimed to work because they are too recent. This leaves them “unproven”; and
c) Shorter term predictions aren’t looking so great. Though the comparison period is short, the deviation from the current trend is fairly large even if we account for observed variability of the earth’s surface temperatures.
None of this proves the long term projections incorrect. But, if the IPCC documents persistently tell us the ability to hindcast gives us “greater confidence” in models, then presumably, the fact that forecast ability is entirely unproven might give us “lesser confidence” in models.
So, sure the issue is over longer time horizons. (My point of view: encourage building nuclear baseload now. In terms of technology, it’s a proven way to get stable baseload now. We should also encourage research into better ways to do solar, wind etc. I’d love solar power generating shingles on my roof to run my refrigerator and TV! 🙂 )
If the planet’s time constant is long, there should be time for later action. The action would need to involve geo-engineering though! (We’d need to suck out CO2 before the heat “in the pipeline” arrives, or we’d need to sprinkle aerosols in the stratosphere. Doubtless both would be expensive; in an economic journal, that’s a problem.)
I happen to think the preponderance of the evidence indicates AGW theory is true in the sense that a) people have added GHG’s to the atmosphere and b) it’s causing warming.
Even if the thermometer record is only 150 years (or 5 batches of 30 years) and the temperature went “down, up, down, up, up”, that still gives general support to a theory that makes sense based on phenomena we also recognize as acting in less controversial areas. (The “truth” of how materials – like CO2 – can absorb and re-emit energy has applications in combustion problems. If we didn’t recognize this physics, we’d have trouble designing furnaces!)
However, I think the models appear to have some problems. It is unfortunate that the case, as presented to the public, relies so heavily on model predictions. But, they are flashy, can create color pictures, and provide ‘projections’. So, I guess it’s natural that the climate scientists lean on that hoe.
Lucia- Have you considered performing the same analysis, comparing the average of the RSS and UAH mid-troposphere temperature trends since 1979, with the GCM scenarios for the mid-troposphere? I think that would be quite telling.
Perhaps also mid-troposphere absolute humidity measurements versus scenarios from GCMs.
Great posts, by the way.
Paminator,
I can’t apply this one to RSS or UAH because I can’t get data for the earlier “no volcano” period. At some point, I’m going to try to see what I can do by accounting for volcanic eruptions, but I’m not sure I can get a long enough time period.
The plan I was thinking of was this:
1) Assume the average of the IPCC models represents the “true trend” from 1980-now. (Or 1975-now.)
2) Subtract the “true trend” from the weather data. Get the AR(1)+noise parameters.
See how that pans out.
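As a sketch of what step 2 might look like, reusing the hypothetical `fit_ar1_plus_white` from the earlier sketch (the names and the default trend value are illustrative):

```python
import numpy as np

def weather_params_vs_model_mean(observed, true_trend=2.0):
    """Treat the multi-model mean trend (C/century) as the 'true trend',
    remove it from the observed monthly anomalies, and fit AR(1)+white
    parameters to the residuals."""
    t = np.arange(len(observed)) / 1200.0          # months -> centuries
    residuals = np.asarray(observed) - true_trend * t
    return fit_ar1_plus_white(residuals)
```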
The difficulty is that I know the parameters are a bit biased, and shorter amounts of data result in greater bias. The bias makes the uncertainty intervals a little too small. So, I’m trying to figure out how to correct for the bias, which would let me use the satellite data. But… it’s going to take a while. (I’m doing a few other tests right now.)
Two things.
The distribution looks slightly skewed to the right (the way it’s done, the confidence limits assume a symmetrical distribution). As the trend line is on the short side of the curve, it looks as though it falls outside the 95% area around the middle; or to put it another way, there looks to be less than 2.5% of the data to the left of the current trend line. That would put GISS and NOAA also in the falsified band.
Secondly, although the results (possibly for the sake of argument here) fall within the 95% confidence limits, the actual measurements are so far off centre as to fall into the IPCC “very low confidence” area. What you have shown is that there is a less than 5% (maybe as low as 2-3%) chance of the models’ forecast proving correct (based on current trends). Surely this should be published? Ideally before the next IPCC Report.
MarkR–
Since I’m running simulated weather, I just count the number of cases below the current trend to get the p value. So, those are the p values based on the 10,000 cases. Things can look skewed due to lack of convergence. (I’m not going to run 1,000,000,000 cases!)
As for publishing… I do need to get off the dime and publish a few things. But there are other things about the AR(1)+white noise assumption that need to be checked. (I want to do some of the stuff I mentioned in the note to paminator.)
My psychic says that the Earth will be destroyed by an asteroid in 2112. I’ll be dead by then so I won’t see it, but what great revenge against those that will outlive me! I’m not sure of that prediction though, she was listening to Rush (the band, not the radio guy) when she made it.
On the other hand, bender comments about “the proposition that Earth climate is approaching a tipping point (450ppm CO2, according to Hansen)”
Note that this is about the 450, not about bender (the poster here, not the character on Futurama) or about Hansen (the physics and astronomy guy from Iowa who runs GISS, not the band or the Idaho or Utah politicians).
Currently, at 10 ppmv per 6 years, 450 is about 36 years from now. I am of course assuming BAU and ignoring any possible increase or decrease in rates of change et al.
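Spelled out (the ~390 ppmv starting point is what the stated rate and horizon imply):

\[
\frac{(450 - 390)\ \text{ppmv}}{(10/6)\ \text{ppmv/yr}} = 36\ \text{yr}
\]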
So we just all need to hang around until 2044.
Oh, wait, the world ends in 2012. Nevermind.
Speaking of Dr. Hansen of GISS, here’s a bit of interesting thinking. It ties in rather well with the high degree of certainty we have about carbon dioxide.
http://www.giss.nasa.gov/research/features/altscenario/
Carbon dioxide will become the dominant climate forcing if its emissions continue to increase and aerosol effects level off. Business-as-usual scenarios understate the potential for CO2 emission reductions from improved energy efficiency and decarbonization of fuels. Based on this potential and current CO2 growth trends, we argue that limiting the CO2 forcing increase to 1 W/m2 in the next 50 years is plausible.
Indeed, CO2 emissions from fossil fuel use declined slightly in 1998 and again in 1999, while the global economy grew. However, achieving the level of emissions needed to slow climate change significantly is likely to require policies that encourage technological developments to accelerate energy efficiency and decarbonization trends.