It has come to my attention that a certain blog-climate warrior has now decided to get back to work, adjusting his analysis to include factors that widen the error bars while doing a poor job of incorporating explanatory factors that would narrow them. So, I decided I might as well go ahead and extend my own uncertainty analysis. In an unprecedented move, The Blackboard will now present final results first and explain the method afterwards.
The graph below shows the uncertainty in the trend computed from the difference between the multi-run mean of a series of AR4 models used for projections and the observations from Hadley. If the multi-run mean were the average over runs that perfectly captured the earth’s climate response to externally applied forcings, we would expect the mean trend of the difference to be zero, or at least for zero to lie within the uncertainty intervals.
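Stated formally (my notation– this restates the test, it is not taken verbatim from the analysis): with d_t the monthly difference between observations and the multi-run mean, the question is whether the fitted trend b is distinguishable from zero.

```latex
d_t = T^{\mathrm{obs}}_t - \overline{T}^{\mathrm{model}}_t,
\qquad
d_t = a + b\,t + \varepsilon_t,
\qquad
H_0:\; b = 0 .
```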

Inspection of the graph indicates that, when analyzed this way, the difference between observations and a multi-run mean from an ensemble of runs used in the AR4 is negative (suggesting the models over-project warming) and moreover, the difference is statistically significant if we happen to start our analysis in ’50, ’60, ’70, ’00, or ’01. However, the difference is not statistically significant if we begin analysis in ’80 or ’90.
The results are such that, currently, the data permit cherry pickers to decide certain “key” years are the ‘right’ ones to begin analysis. It is worth noting, however, that analyses beginning prior to 2001 include comparison of “predictions/projections” to “observations” that were not only available prior to making the “predictions”, but also available prior to the freezing of the scenarios used to create projections into the future. So, the projection period after 2001 has a special status relative to the period prior to 2001. For this reason, I favor 2001 as a start year– though of course others are permitted to have their own favorite start year for reasons of their own.
I will be deferring full discussion of analytical choices to later (fairly long) blog posts. However, the synopsis is that:
- The analysis uses monthly values for observations and projections of surface temperature. This is because it can be shown that, when both monthly and annual average data are available, analysis using monthly data almost always has lower Type II error at a given choice of Type I error.
- The multi-run mean is based on an ensemble of runs used in the IPCC AR4, downloaded from the Climate Explorer. The selected runs were forced using modelers’ choices for the 20th century and extended into the 21st century using the A1B scenario.
- The analysis accounts for the effects of volcanic forcing on the temperature excursions in a more phenomenologically realistic way than the rather unphysical linear regression against time-shifted volcanic aerosols.
- The analysis accounts for the correction due to ENSO using the MEI. I don’t know if this choice minimizes the uncertainty intervals– I selected it because…well… someone else did. 🙂
- The analysis assumes the autocorrelation of temperature with time varies as it would if the residuals could be described by an ARMA(1,1) process.
- Analysis uses all data from the ‘start year’ indicated through Nov. 2009. Analyses generally begin in January of the start year; this choice is modified for 1950, because the method for correcting for MEI does not permit starting in January. (A rough sketch of the regression setup appears after this list.)
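For concreteness, here is a minimal sketch of the kind of trend test described above. This is my reconstruction, not Lucia’s actual script: the file name, the column names, and the use of statsmodels’ ARIMA class to get the ARMA(1,1) error structure are all assumptions.

```python
# Minimal sketch: regress (obs - multi-run mean) on time with MEI as an
# exogenous regressor and ARMA(1,1) errors. File/column names hypothetical.
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

df = pd.read_csv("monthly_series.csv", parse_dates=["date"], index_col="date")
df = df.loc["2001-01":"2009-11"]            # start year 2001 through Nov. 2009

diff = df["hadley_obs"] - df["model_mean"]  # observations minus multi-run mean
years = np.arange(len(diff)) / 12.0         # elapsed time in years

# Explicit constant, time, and MEI columns; order=(1, 0, 1) gives ARMA(1,1).
exog = np.column_stack([np.ones(len(diff)), years, df["mei"].values])
fit = ARIMA(diff.values, exog=exog, order=(1, 0, 1), trend="n").fit()

b, se = fit.params[1], fit.bse[1]           # coefficient on the time column
print(f"trend difference: {b:.4f} +/- {2 * se:.4f} C/yr (approx. 95% interval)")
```

A trend whose roughly 95% interval excludes zero is what “statistically significant” means in the figure above.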
For those who are curious about the rankings of annual average temperatures based on the same analysis, after adjusting for MEI and any nonlinear response due to exogenous forcings, I’ve reconstituted and plotted them below.

After adjusting for MEI etc., I find 2009 had the third warmest temperature on record– whatever that means. (I don’t like to compare adjusted years since the result varies according to the adjustment method. We should really include uncertainty intervals in any sort of comparison of this type; if we did, we could conclude that the MEI-adjusted temperature has been quite flat this century.)
Over the course of the next two weeks, I’ll be posting discussion of each important analytical choice to let you critique them. But, I thought I’d let all ‘a y’all see the almost-end-year results first. When GISS and Hadley post their December data, I’ll update the final results for the year.
I had noted the post that “someone” had authored, and was hoping you’d respond, especially since it had a negative comment about your work. I look forward to your further elucidation.
Duane–
Yes. Well it appears “someone” felt compelled to write an entire post purporting to rebut a very brief observation left in comments at Roger Pielke’s. It appears that person had no clue what I might mean. Also, if he understood phenomenology, he might have accounted for volcanic aerosols in a simpler and more physically sound manner. But… alas….
Is that a magic flute I hear? Sounds like it needs a tuning.
Lucia,
But… alas…. that “someone” would then have demonstrated what you have shown… little or no (adjusted/underlying) warming trend since 2000. That was clearly not his objective.
One thing I don’t understand is why Tamino is always so openly hostile over what are really pretty straightforward technical issues (AR(1) vs ARMA(1,1), how to model volcanic aerosols, etc.). I just don’t see why a scientist won’t address these kinds of issues dispassionately.
I look forward to the more detailed posts.
Lucia,
One other thing bothered me about Tamino’s post: he included adjustments for volcanic aerosols, but completely ignored reported substantial declines in other atmospheric aerosols (e.g. Mishchenko et al., Long-Term Satellite Record Reveals Likely Recent Aerosol Trend, Science, 2007). This strikes me as worse than “cherry picking”; after all, aerosols are aerosols… the photons don’t know the source.
SteveF–
You will be happy to learn the method I use to deal with volcanic aerosols automatically incorporates the effect of any forcings treated as exogenous by the modelers who performed the runs I am testing. So, that includes volcanic aerosols, changes in aerosols since the 70s, solar variations, etc. It takes care of any and all expected non-linearities in the “signal” resulting from the forcing, including nonlinear response to GHGs etc.
Plus, it’s a snap to implement.
In contrast, Tamino’s approach– while better than nothing– suffers from the same error Monckton makes when creating what he considers “the real IPCC” projections. Using a linear regression between volcanic aerosols and surface temperature ignores the fact that the earth’s climate system has heat capacity.
Mind you– using a regression of aerosols shifted with time is better than nothing. Under certain circumstances (though probably not those of volcanic eruptions), using a regression between an index and a response would actually work. (It might work well for a very regular 11-year solar cycle.) Even when you would not expect it to work well (like for volcanic eruptions), I would use it if I didn’t have an obviously better choice based on the physics.
But when testing physical models that, on average, purport to predict the “signal” in response to any forcing (including volcanic aerosols), there is an obviously better way to take out the signal. Obviously better. And really simple. Heh.
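To illustrate the heat-capacity objection (my illustration, with an assumed response time; the “obviously better” method itself appears to be the obs-minus-model-mean differencing already used in this analysis): with thermal inertia, the temperature response to a volcanic spike is a convolution of the forcing with a decaying kernel, not a time-shifted copy of the forcing.

```python
# Why a time-shifted regression on an aerosol index is physically crude:
# with heat capacity, the response is forcing convolved with a decaying
# kernel, not a lagged copy. The 4-year time constant is illustrative.
import numpy as np

months = np.arange(240)
forcing = np.zeros(240)
forcing[60] = -3.0                             # idealized volcanic spike, W/m^2

tau = 48.0                                     # assumed response time, months
kernel = np.exp(-months / tau) / tau           # one-box exponential response
response = np.convolve(forcing, kernel)[:240]  # smeared, multi-year recovery

# 'response' decays smoothly over several years; no single lagged copy of
# 'forcing' can reproduce that shape, which is the objection above.
```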
You are teasing us now Lucia… I am breathless with anticipation of the details, since you say “obviously better”! 😉
My comment about Tamino not incorporating known aerosol forcings also applies to the models… they assume aerosols rose through ~2000 and have been about flat since then, while the data says they fell significantly from ~1993 to 2005 (no more recent data available) to the tune of 2-4 watts per square meter at the surface. So much of the warming that took place in that period could be due to falling aerosols instead of just rising GHG’s.
SteveF–
Sure. But all I’m doing is comparing ‘predictions/projections’ from those models to observations. I want to know: at the end of the process, are they giving us unbiased guidance about the future?
The answer can be ‘yes’ or ‘no’. If the answer is ‘no’– meaning the projection is biased– then we seek reasons for the bias. If it turns out that the problem is aerosols fell from 1993-2005 while the modelers assumed they rose, that might explain any existing discrepancy (or even spurious agreement). But that’s different from noting whether or not it exists.
Lucia,
I understand your point, and for certain if you show that the models are not consistent (in a statistically significant way) with temperature data, then that alone is proof they are not accurate.
But I’m also very concerned about the potential for spurious agreement. An analysis like yours could indicate “yes, the models are giving us reasonable guidance about the future”, when in fact they are doing nothing of the sort. If the model assumptions/inputs are not concordant with reality, and it seems to me they probably are not WRT aerosol effects, then the guidance provided could still be way off. So even a model that appears consistent with the temperature data needs to be critically examined with respect to the validity of inputs and assumptions.
SteveF — I think the distinction between unbiased guidance and reasonable guidance needs to be emphasized.
oliver (Comment#29568),
Can you explain what you mean?
SteveF–
Sure. Spurious agreement is possible and something to consider. But that’s a more sophisticated question than I’m looking at.
Spurious agreement is especially likely when modelers know the results of observations before running simulations. That’s why it’s best to test predictions with data collected after modelers make predictions.
It’s not always possible to compare to data that were not available. It’s still useful to compare to hindcasts, but comparing to data that were unavailable when the forecasts/projections/predictions (or whatever someone wants to call these things) were made is the idea.
Lucia,
Yep.
But my understanding is that aerosol data is a continuing input that the modelers can adjust as they see fit… like solar cycle forcing. Am I wrong about this?
SteveF–
Yes. If modelers want to create new runs, they can use whatever forcing they like. In a publication, they would provide reasons for their choices. So, if a modeler wanted to, they could re-run the already existing models using different forcings. In principle, the choice of what forcings to use is based on observations of aerosol loadings; in practice, the observations for historic loadings are poorly constrained. Since it’s difficult to go back in time and measure loadings that weren’t well measured in the past, they are unlikely to become better constrained in the future.
This means one might pick the loadings that happen to make the runs better fit known observations for periods of time when we have fairly vague knowledge of loadings. But it also means that it’s very difficult for anyone testing model agreement with observations to “correct” the observations for aerosol loadings because we don’t know these any better than the modelers do.
Lucia,
OK. But after Mishchenko et al (2007) and several earlier reports of falling total aerosols (brightening at the surface), why the heck aren’t these reported falling aerosols being tried in the models? Why would the modelers stick with aerosol estimates that seem clearly inconsistent with credible published measurements of aerosols? The devil in me thinks the worst, and imagines less than the best of motives. Maybe the devil is being unfair to the modelers.
SteveF–
Who says the aerosol levels reported in 2007 aren’t being tried in models now?
Those levels weren’t used in model runs submitted to the IPCC prior to 2007– but how could they have been? Whether models are good, bad, or indifferent, running models takes time. The process of people working collectively also takes time. It’s pretty obvious aerosol loadings first published in 2007 were not available to those running models used to create projections in 2007.
Lucia,
Well, there were several earlier reports as well (starting around 2003 if I remember right). Still, I would be very pleased if lower aerosol forcings were given a try in some of the models. Of course, it seems clear (at least to me) that gradually falling aerosol forcing from 1994 to 2005 would make the model temperature projections diverge even further from the temperature measurements than your model vs. observation work shows… so I’m not going to hold my breath.
SteveF–
You may be right. But I don’t think there is any good way to “correct” either the model projections or the observations to try to identify how much worse model projections might have been if the modelers had known the aerosol loadings were going to fall.
FWIW, even 2003 is too late for information about forcings to propagate into the AR4. Climate modelers are trying to create a large ensemble of model runs with similar forcings in the 21st century. To do that, they had to select a common set of forcings that everyone would use. These were selected in late 2000. This probably left just enough time for various modeling groups to plan projects, get funding, allocate staff and equipment, etc., and make the runs.
(BTW: The reason I use 2001 as my start date for “freezing” projections is the SRES were published in late 2000.)
Now, I can understand why someone might wish the modelers used more recent forcings. Likely as not, some group somewhere either is currently using that information or will eventually use it. It’s true new information about forcings will always arrive in the future. But we don’t need to read in any conspiracy theories to know why information about forcings learned in 2003 wasn’t used to drive model runs published in the AR4.
Lucia,
Don’t get me wrong, I don’t suspect any conspiracy here. If someone were actually allowed to ask a modeler (say Gavin at RC), they would likely say that the reports of lower aerosols are probably wrong… (why? because they suggest the models are wrong!). It’s just confirmation/expectation biases that I am suggesting.
SteveF–
I don’t know what the modelers would say if you asked them about the reported levels of aerosols in 2007 etc.
I asked Gavin about sulfates on RealClimate at one point. His response:
Lucia (Comment#29601),
I don’t really know either, since most legitimate questions are ‘disappeared’ at RC, and email messages to modelers are not replied to.
BTW, as recently as last month James Hansen showed the standard ‘aerosols increasing to 2000, flat after 2000’ graph. Gavin works for Hansen, so let me think about the implications for a second….
Carrick (Comment#29610),
When did Gavin make that comment? Pre 2007?
One of the ‘aerosol people’ Gavin ought to talk to is NASA scientist Mishchenko, who is also lead scientist for the long delayed Glory mission to better characterize atmospheric aerosols. If he did, he would hear that Mishchenko’s best estimate (95% range) of global aerosol effect is about +2 to +4 watts per square meter between 1993 and 2005, much larger than the increase in forcing from GHG’s over the same period.
I have not had much success in finding references to this rather astounding result, perhaps because it was published in that obscure and discredited little journal, Science.
SteveF, it was January 2008.
Carrick
Two of the most interesting graphs I’ve seen recently.
You may want to take a look at “Limits on CO2 Climate Forcing from Recent Temperature Data of Earth” David H. Douglass and John R. Christy, published I think in Energy and Environment in 2008. They did a similar analysis for 1979-2008 using UAH data and extract a residual trend of about 0.06 deg/decade. Your adjusted trend will be higher than that since you start with the higher CRU trend line.
Lucia,
The link to the MEI seems to be broken. I think it’s missing an ‘l’ at the end
BarryW,
I was able to access it.
Lucia,
I had done regressions similar to Tamino’s. I used the same volcanic/MEI data, but I also included solar irradiance and cloud cover. The result is that the serial correlation does decline, so the confidence intervals don’t widen as much. Including only MEI certainly doesn’t minimize it, but it helps.
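A sketch of the kind of check Chad describes (my reconstruction; the file and the volcanic/tsi/cloud_cover column names are hypothetical): add explanatory regressors and see whether residual serial correlation drops, here gauged by the Durbin-Watson statistic (values near 2 indicate little serial correlation).

```python
# Compare residual serial correlation with and without extra regressors.
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

df = pd.read_csv("monthly_series.csv", parse_dates=["date"], index_col="date")
t = np.arange(len(df)) / 12.0   # time in years

for cols in (["mei"], ["mei", "volcanic", "tsi", "cloud_cover"]):
    X = sm.add_constant(np.column_stack([t] + [df[c].values for c in cols]))
    res = sm.OLS(df["hadley_obs"].values, X).fit()
    print(cols, "Durbin-Watson:", round(durbin_watson(res.resid), 2))
```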
lucia. i will GRANT you one thing.
this will FOSTER continued crying from tammy.
Chad–
Yes. What I meant was there are some other ENSO indices. I didn’t hunt around to see which happens to minimize most for any particular choice of analysis start year. (I don’t plan to either. )
Ah, you say “and the observations from Hadley”. But are those “observations” or are they “adjusted observations”? And, if the latter, how can you be confident that the adjustments were made honestly enough, and competently enough, to justify using “the observations from Hadley”?
Dearieme– These are the observations they report. When someone comes up with another set I’ll use those too.
SteveF,
Do you have a link to that Mishchenko reference?
I have found a Mishchenko paper in Science of the name mentioned, http://www.sciencemag.org/cgi/content/full/sci;315/5818/1543?maxtoshow=&HITS=10&hits=10&RESULTFORMAT=&fulltext=Mishchenko&searchid=1&FIRSTINDEX=0&issue=5818&resourcetype=HWCIT
It talks about a decline of .03 in the aerosol optical thickness. How does that convert to watts per square metre? (Not snarky: I do not know.)
From what I can tell, aerosol forcings are currently estimated at around -1.5 watts/square metre. The aerosol optical thickness declined from around .13 to around .1 over the time period that you refer to. This would indicate an increase from -1.95 watt/square metre to its current value, an increase of .45 watts/square metre.
(Obviously, this is assuming that the change of forcing related to aerosol optical thickness is linear.)
The increase in forcing from CO2 over that same period seems to be around .35 watts/square metre (I think – my maths could be wrong).
An increase of 4 watts per square metre in 12 years or so would be a bad thing – that would equate to around 3 degrees centigrade of warming!
I should point out that I have nowhere been able to find a paper that suggests an increase of even that much in forcing due to a decline in aerosols – these are just my rough calculations.
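Restating the scaling above as a worked check (these are David Gould’s stated assumptions, not established values: forcing linear in AOD, and -1.5 W/m^2 at the current AOD of 0.10):

```python
# Linear scaling of aerosol forcing with AOD (assumption from the comment).
aod_then, aod_now = 0.13, 0.10
forcing_now = -1.5                               # W/m^2 at AOD 0.10
forcing_then = forcing_now * aod_then / aod_now  # -1.95 W/m^2 at AOD 0.13
print(forcing_now - forcing_then)                # ~ +0.45 W/m^2 increase
```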
Not sure where this fits into the whole modeling discussion but I thought I’d pass it along.
flow modeling
David Gould,
The Science reference is correct. How much the 0.03 decline in AOD influences the surface intensity depends on how much sunlight is assumed to reach the surface (clouds, atmospheric absorption, etc.). I believe a ball-park estimate is 1366 * 0.25 * 0.7 = 239 watts per square meter, less about 75 watts of atmospheric absorption; or about 164 watts per square meter, so a decrease in AOD of 0.03 would correspond to about 5 watts per square meter! Of course, aerosols absorb a lot of light within the atmosphere, so much of the gain in energy at the surface would be offset by lesser energy gain in the atmosphere. In a related paper (Limits on climate sensitivity derived from recent satellite and surface observations, Chylek et al., J. Geophys. Res., 112, 2007) the net increase in forcing from falling aerosols is estimated at 0.36 watt per square meter per decade from 1994 to 2005. This value includes estimates of both direct and indirect (cloud albedo) effects.
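SteveF’s ball-park, restated as arithmetic (his assumed numbers: 25% geometric factor, 0.7 for albedo, ~75 W/m^2 atmospheric absorption; for small optical depths, transmission changes roughly linearly with AOD):

```python
# Ball-park surface flux change from a 0.03 drop in aerosol optical depth.
S0 = 1366.0                      # solar constant, W/m^2
surface = S0 * 0.25 * 0.7        # geometry x (1 - albedo): ~239 W/m^2
surface -= 75.0                  # less assumed atmospheric absorption: ~164
print(0.03 * surface)            # ~4.9 W/m^2 -- the "about 5" quoted above
```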
FWIW, other recent papers about global average AOD trends measured by various satellites paint a more clouded (no pun) picture of the long term aerosol trend. The different measurements use different methods and scattering models, and suffer from problems with calibration as well. The best evidence seems to be a significant fall in average AOD since 1994 (or earlier), but the post 2000 period looks quite a bit flatter than pre-2000.
My earlier comment was only intended to point out that credible data exists which suggests much of the rapid warming which took place prior to 2000 could have been due to falling aerosols. If this is the case, then the total increase in forcing was considerably higher than assumed by climate modelers, and the climate sensitivity values from those models are therefore too high. The apparent divergence of model projections (Lucia’s many posts on this divergence) from measured temperatures since 2000 is also consistent with an overestimate of sensitivity, especially if the AOD has been changing more slowly post 2000.
SteveF,
Thanks. 🙂
If the CO2 effect appears overstated because of cooked-up numbers based on Hansen’s computer model, made gospel by the Kiehl/Trenberth global energy budget, then the model-derived climate sensitivity parameter used by the IPCC is wrong.
http://chriscolose.wordpress.com/2008/12/10/an-update-to-kiehl-and-trenberth-1997/#comment-1493
“Let’s say today’s atmospheric CO2 concentration is 380 ppmv and that this is increasing annually by 2.5 ppmv.
We then have an annual change of 382.5/380 = 1.0066
The logarithm (ln) of this ratio is 0.00656
GH theory (IPCC: Myhre et al.) tells us that the CO2 climate forcing from this increase equals 5.35 times the ln(ratio) = 0.035 W/m^2
Even if we escalate this by a factor of 3.2 to account for net positive feedbacks, as estimated by the IPCC model simulations, we arrive at 0.112 W/m^2
How do K+T arrive at 0.9 W/m^2 or eight times this value?”
As far as I can figure, the relationship between forcing and temperature is derived only from observation and not from physical theory.
The formula to convert radiative forcing (RF) to surface temperature change (ΔTs) is: ΔTs = λRF, where λ is the climate sensitivity parameter. (IPCC)
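The quoted arithmetic, restated (the Myhre et al. coefficient 5.35 and the feedback factor 3.2 come from the quote above; the λ value below is my illustrative choice, roughly consistent with IPCC central estimates, not a number from the comment):

```python
import math

# Myhre et al. form: RF = 5.35 * ln(C / C0), for one year's CO2 increase.
rf = 5.35 * math.log(382.5 / 380.0)
print(rf, rf * 3.2)        # ~0.035 W/m^2; ~0.112 W/m^2 with 3.2x feedbacks

# Temperature change via the sensitivity parameter: dTs = lambda * RF.
lam = 0.8                  # K per W/m^2 -- illustrative assumption
print(lam * rf)            # ~0.03 K for that single year's forcing
```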
Nordell and Gervet claim that thermal pollution explains a large chunk (55-74%) of observed warming, based on a heat energy analysis.
http://www.ltu.se/shb/2.1492/1.5035?l=en
Barry Brook quotes a source that thermal pollution accounts for 0.1 W/m2 worst case. This is in the same ballpark as the theoretically derived CO2 effect plus IPCC positive feedbacks.
http://bravenewclimate.com/integral-fast-reactor-ifr-nuclear-power/#comment-40215
If we correct for airport heat island bias as well, where does that leave the climate model?
http://chiefio.wordpress.com/2009/12/15/of-jet-exhaust-and-airport-thermometers-feed-the-heat/
Wondering what values you use for climate sensitivity parameter, Lucia?
Re: BLouis79 (Jan 18 10:08), Radiative forcing responds rapidly to changes in GHGs. The surface temperature response is slower. How slowly it responds is the interesting question, because it relates to climate sensitivity. If you assume that the temperature response has a lag of 15 years and apply a linear increase in forcing, the temperature won’t keep up and there will be an excess forcing. That’s the 0.9 W/m2, otherwise known as heat in the pipeline. The response time and climate sensitivity are related: a longer response time implies higher sensitivity. The thing is, though, that heat has to be somewhere. The most likely place is the ocean, which is why ocean heat content (OHC) is being studied so assiduously. The heat content of the upper 700 meters of the ocean has been flat for the last 5 years. That implies that, at least for those five years, there has been no excess forcing. OHC measurement is not yet at the level of maturity of surface temperature measurement, but as the ARGO ocean floats become better understood and the length of the ARGO time series increases, we should be able to know more.
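A toy version of BarryW’s point (illustrative parameters only; this is a generic one-box energy balance, not a calculation from his comment): with a slow surface response, a linear ramp in forcing leaves a persistent unrealized imbalance F − T/λ, the “heat in the pipeline.”

```python
# One-box model: tau * dT/dt = lambda * F - T, driven by a forcing ramp.
import numpy as np

lam = 0.8                    # assumed sensitivity, K per W/m^2
tau = 15.0 * 12              # assumed 15-year response time, in months
ramp = 0.04 / 12             # forcing ramp, 0.04 W/m^2 per year

months = np.arange(12 * 100)
F = ramp * months
T = np.zeros_like(F)
for i in range(1, len(F)):
    T[i] = T[i - 1] + (lam * F[i - 1] - T[i - 1]) / tau   # dt = 1 month

imbalance = F - T / lam          # unrealized forcing, W/m^2
print(round(imbalance[-1], 2))   # settles near ramp * tau = 0.6 W/m^2 here;
                                 # different assumed numbers give the ~0.9 figure
```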
Radiative forcing responds rapidly to changes in ghg’s. Surface temperature response is slower. How slowly it responds is the interesting question because it relates to climate sensitivity. If you assume that the temperature response has a lag of 15 years and apply a linear increase in forcing, the temperature won’t keep up and there will be an excess forcing. That’s the 0.9 W/m2, otherwise known as heat in the pipeline. The response time and climate sensitivity are inversely related, longer response equals higher sensitivity. The thing is, though, that heat has to be somewhere. The most likely place is the ocean. Which is why ocean heat content (OHC) is being studied so assiduously. The heat content of the upper 700 meters of the ocean has been flat for the last 5 years. That implies that, at least for those five years, there has been no excess forcing. OHC measurement is not at the level of surface temperature measurement, but as the ARGO ocean floats become better understood and the length of the ARGO time series increases, we should be able to know more.