However, the absence of influence of old data in the case of climate models will be impossible to prove, so I still think we have to insist that climate models accurately forecast future states of the atmosphere
So says W.M. Briggs; I agree. Quite honestly, despite the often contentious debates over the accuracy of climate models, I think everyone agrees that models need to be tested against future data. The influence of old data on climate models is, in at least some sense, absolutely certain.
The influence of old data on climate models comes about in the normal way that it does in all model development projects, which generally proceed more or less like this:
- A researcher identifies a process they wish to better understand and predict.
- The researcher identifies existing data, theories and pre-existing models thought to describe the process in some way. They spend some time comparing the predictive ability of pre-existing models to existing data thought to be reliable.
- If the researcher thinks the model is fine, he may simply use the model. But if the researcher identifies some important mismatch between the data and the model predictions, he will try to develop a new model that better explains available data.
He may do this in a variety of ways. He may seek to identify new phenomena that were entirely absent from the previous model. If so, how will he select which phenomena to include? He’ll select those that result in better agreement with existing data! He may try to improve approximations for phenomena included in the previous model but treated inadequately. Which approximations will he try to improve? And how will he improve them? He’ll focus on the approximations he thinks are causing the mismatch between model predictions and the data. Moreover, he’ll improve the approximation by comparing it to existing data.
So, at this step, the modeler always relies on existing data when building his model.
- Once the modeler has developed the new model, he will nearly always run a hindcast to see if his newer model matches existing data better than the older model. If the new model does a better job than the old model in some important respect, the modeler will generally adopt the new model. If the new model does poorly, the modeler will adjust his parameterizations, or approximations, and repeat until the new model compares better with the data. (A schematic version of this loop is sketched in code below.)
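Here is a purely schematic sketch of that accept-or-revise loop, in Python. The model objects, the `new_model_factory` helper, and the data are hypothetical stand-ins, not any real GCM workflow; the point is only that adoption is decided by skill against existing data.

```python
# Hypothetical sketch of the hindcast-driven model-revision loop described above.
# `old_model`, `new_model_factory`, the forcings and the observations are stand-ins.
import numpy as np

def hindcast_rmse(model, forcings, observed):
    """Skill metric: RMSE of the model's hindcast against existing observations."""
    predicted = model(forcings)
    return np.sqrt(np.mean((predicted - observed) ** 2))

def revise_until_better(new_model_factory, old_model, forcings, observed, max_tries=10):
    """Keep trying new parameterizations until the candidate hindcasts better than the old model."""
    baseline = hindcast_rmse(old_model, forcings, observed)
    for attempt in range(max_tries):
        candidate = new_model_factory(attempt)          # a new parameterization each try
        if hindcast_rmse(candidate, forcings, observed) < baseline:
            return candidate                            # adopt the new model
    return old_model                                    # give up, keep the old one
```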
I think most modelers will recognize this as more or less the normal process. In fact, it is the only sensible way to improve models: What is one to do other than rely on existing data to test older models? Is one to improve our understanding of physics by consistently selecting models that disagree with the existing data?
So, at least to some extent, the current models are always influenced by existing data. This causes some problems, because hindcasts (that is, comparisons of models to the data on which they are based) always give over-optimistic estimates of predictive ability. It is also difficult to estimate the degree of excess optimism.
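To see why hindcast skill is over-optimistic, here is a toy holdout experiment with purely synthetic data (nothing below is Lumpy’s series or any GCM output): a deliberately flexible model is fit to the “old” portion of a noisy series and then scored on a later portion it never saw.

```python
# Toy illustration of why in-sample (hindcast) skill overstates predictive skill.
# Purely synthetic data; nothing here is Lumpy's actual series or fit.
import numpy as np

rng = np.random.default_rng(1)
t = np.linspace(-1.0, 1.0, 200)
series = 0.5 * t + np.cumsum(rng.normal(0.0, 0.02, size=t.size))   # trend plus red-ish noise

train, test = slice(0, 150), slice(150, 200)
coeffs = np.polyfit(t[train], series[train], deg=8)     # deliberately flexible fit to the "old" data
fitted = np.polyval(coeffs, t)

def rmse(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))

print("hindcast (in-sample) RMSE:", rmse(fitted[train], series[train]))    # looks impressive
print("forecast (out-of-sample) RMSE:", rmse(fitted[test], series[test]))  # typically much worse
```

The same drop shows up whether you score by RMSE, correlation or anything else; the holdout is what does the work.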
Are there mitigating factors that permit us to overlook this influence?
In the physical sciences, this issue is somewhat (or possibly largely) mitigated by the fact that we can constrain our models using some highly regarded physical principles. That said, modeling often involves many simplifications, which can introduce doubt. (Both constraining fits to physical principles and simplifying are routine in engineering, where constraining curve fits to physics has been standard practice since roughly the 40s or 60s, depending on the field.)
At the risk of giving the wrong impression about GCMs, which are complex models, I’ll illustrate the process I am describing by way of my wildly oversimplified predictive model for an isothermal climate system, which we will henceforth call “Lumpy”. (See 1, 2, 3.)
Though grossly simplified compared to any real climate model, you’ll see the mathematical form of “Lumpy” is based on a physical constraint. That is: a simple energy balance for the earth’s climate system. According to this model, the temperature anomaly θ must relate to the anomalous forcing through the relation:

dθ/dt + θ/τ = α·q

where α⁻¹ is the effective heat capacity per unit area, τ is the time constant for the climate and q is a forcing (a heat flux).¹
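Assuming the first-order form written above, the equation can be integrated in closed form (the standard result for a linear first-order system):

```latex
\theta(t) = \theta(0)\, e^{-t/\tau} + \alpha \int_0^{t} e^{-(t-s)/\tau}\, q(s)\, ds ,
\qquad \theta_{\mathrm{eq}} = \alpha\,\tau\, q \quad \text{for constant } q .
```

So the equilibrium response to a sustained forcing is set by the product ατ, which is why the fitted values of τ and α matter mostly through that product.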
If this equation applies, and I knew the magnitudes of the constants τ and α and the time series for q, I could attempt to predict or explain the temperature anomaly, θ, based on known values of the forcing, q. Unfortunately, I don’t know τ and α, so instead I obtained my estimates by finding the values that minimize the mismatch between model predictions and time series data for (θ, q) data pairs.
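Here is a minimal sketch of that fitting step, assuming the first-order form above and using synthetic data in place of the real (θ, q) series; `scipy.optimize.least_squares` stands in for whatever least-squares machinery was actually used.

```python
# Sketch of fitting (tau, alpha) in a lumped energy balance by least squares.
# The forcing series and "observations" below are synthetic placeholders, not real climate data.
import numpy as np
from scipy.optimize import least_squares

years = np.arange(1880, 2008)
# Toy forcing series (W/m^2): a slow ramp plus an 11-year wiggle.
q = 0.01 * (years - 1880) + 0.1 * np.sin(2 * np.pi * (years - 1880) / 11.0)

def simulate(tau, alpha, q, dt=1.0):
    """Step d(theta)/dt + theta/tau = alpha*q forward with a simple explicit scheme."""
    theta = np.zeros_like(q)
    for i in range(1, len(q)):
        theta[i] = theta[i - 1] + (alpha * q[i - 1] - theta[i - 1] / tau) * dt
    return theta

# Synthetic "observations": a known (tau, alpha) plus noise, standing in for measured anomalies.
rng = np.random.default_rng(0)
theta_obs = simulate(15.0, 0.4, q) + rng.normal(0.0, 0.05, size=q.size)

def residuals(params):
    tau, alpha = params
    return simulate(tau, alpha, q) - theta_obs

fit = least_squares(residuals, x0=[10.0, 0.5], bounds=([0.1, 0.01], [100.0, 5.0]))
tau_hat, alpha_hat = fit.x
print(f"fitted tau ~ {tau_hat:.1f} yr, alpha ~ {alpha_hat:.2f}, sensitivity alpha*tau ~ {tau_hat * alpha_hat:.2f}")
```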
After I obtain τ and α, I would hope to later use this model to project into the future using estimated values of the forcing q. But in the meantime, I would brag, brag, brag about how well the model predicted the data I used to create it! I mean… hey, look at this:

I would of course also brag that by requiring my model to conform to a principle called ‘conservation of energy’, one might expect my model to work better than a simple linear regression for the time series of earth’s temperature since 1880. I would further point out that the physical sciences have had great success developing models that are constrained to fit data in certain prescribed ways that are consistent with principles like conservation of mass, momentum and energy. And if you call my model into question, I can shout you down and say, “Hey, don’t you believe in modeling?”
But others might demur and suggest my model is full of baloney. (And there are reasons they could be correct.)
Others are likely to point out that Lumpy’s high correlation coefficient, based on pre-existing data, will likely drop when confronted with new data. (Others at Climate Audit have already crowed about Lumpy’s failure to predict the cold January 2008.)
So what does ‘Lumpy’ have to do with GCM’s?
Climate models (including GCMs) and “Lumpy”, my toy model, share the trait that both are constrained to agree with some physical principle. Climate models differ from Lumpy in many ways; one is that the parameters used in the models are not obtained using an OLS fit as I did for Lumpy. However, the GCMs do contain parameters with uncertain magnitudes. Unlike my model, where the parameters are quite obviously selected to fit a batch of (θ, q) data pairs, the magnitudes of the parameters in climate models may be indirectly selected based on data.
How? The process is sufficiently complicated to mask the effect. However, over the course of years, researchers do find the sets of parameters and choose the parameterizations that best match existing data.
Letting the data influence the choice of parameters is not a perversion of science. It is science, which involves a firm grounding in empiricism. The alternative is madness!
So, the reality is that any current model is always affected by pre-existing data to some extent.
How do we fix this?
In most fields, the way we break out of the box of comparing model predictions to the pre-existing data is to collect data after a model is run. This is not a trivial task in any field.
Experiments are expensive, perplexingly difficult and present a different set of technical difficulties from modeling.
But there is truly no alternative. Because, like it or not, any respectable modeler will draw on any and all available reliable data when developing his model.
So when Dr. Briggs says, “I still think we have to insist that climate models accurately forecast future states of the atmosphere”, he is entirely right. We do need to insist on this.
On the other hand: My model, Lumpy?
She’s flat out right; you can count on her. ‘Cuz I say so. And Lumpy says AGW is in the pipeline. 🙂
—–
Footnotes:
1. Regular readers will also note the similarity of this equation to the IPCC energy balance equation I commented on yesterday. Yes, I think it’s oversimplified in many ways. Yes, at some point I will answer Dan Hughes’s questions about the impact of oversimplifying.
Lucia,
One of the biggest concerns I have about climate models and their advocates is the tendency to incorporate new data into the models as soon as it is available. This means we never have enough new data to evaluate the models.
For example, a year from now you could throw in a ‘random ENSO forcing’ and recalculate the parameters for Lumpy. You would then be able to claim that the ‘new and improved’ Lumpy now predicted the La Nina and leave everyone with the impression that Lumpy 2009 actually has better predictive value than Lumpy 2008 because it has ‘incorporated new information’. In reality we have no idea whether Lumpy 2009 is better – in fact, it might even be worse because of the ad hoc nature of the ENSO fudge forcing. This leads to a catch-22: you can’t keep using Lumpy 2008 because you know it was wrong, yet Lumpy 2009 is an unknown quantity. The long-term nature of climate science makes this kind of dilemma impossible to resolve.
For that reason, I think it is a mistake to treat climate models as useful predictive tools. They are, at best, equivalent to testing a drug on a lab rat. They may provide useful insights into how the drug/climate works, but the results may have no connection to reality because of factors which the model does not/cannot take into account.
Despite my previous dump on models, I do think they have some uses. For example, I think modellers could do a better job of incorporating poorly understood things like the sun and cosmic rays. I realize that the cosmic ray theory has many detractors, but I think it is plausible enough to incorporate into a model and compare that model’s predictive ability to the model that assumes CO2 is the only major driver.
You could call it: “Lumpy, the Solar Edition”.
Here are some forcings proposed by Nir Shaviv that could be plugged into a model: http://www.sciencebits.com/CO2orSolar
This posting on Lubos’s blog has a guest post by Shaviv and links to papers with better numbers: http://motls.blogspot.com/2007/07/nir-shaviv-why-is-lockwood-and-frohlich.html
Here is a link that explains why the anti-GCR FUD spread by the AGW advocates may not be as factual as they claim: http://www.warwickhughes.com/blog/?p=131
Unfortunately, Raven, I think that the lack of ability to predict cosmic ray variability means that lucia really can’t use that info to predict. But it would be interesting to see some Lumpy postdictions! Especially seeing what happens when you vary the effects of the sun, aerosols, etc.
Ferdinand Engelbeen has this page where he demonstrates the effect of varying assumptions on postdictions using an Oxford climate model:
http://www.ferdinand-engelbeen.be/klimaat/oxford.html
The interesting thing is that the one with a smaller CO2 response fits slightly better. Interesting result, isn’t it?
Andrew,
Shaviv made a prediction in July 2007 that caught my eye:
“In fact, if you look at the total heat in the oceans, you will see that from 2001 it actually decreased! (Well, recently, after the inconvenient buoy data was removed, the heat content stopped increasing.) This lower heat content should start to cause a prolonged cooling, assuming the solar activity will remain at the 2001 level or lower.”
He made this prediction when the 2007 temps were still heading higher and it looked like Hansen was going to be right with his prediction that 2007 would be the hottest on record.
Shaviv’s argument is that the solar effect is damped and there is a lag between the solar forcing and the climate response – an effect that aligns nicely with the theory behind Lucia’s model. The solar effect is also obscured by volcanic activity, which means any comparison of CR theory to data must account for volcanos – again, something that Lucia’s model can do.
Obviously, she would have to use the version of the data which Shaviv and Svensmark used to develop their theory. Arguing whether this data is accurate is a separate issue.
@Raven & Andrew–
Yes. To include cosmic rays in Lumpy, a GCM or an energy balance model, someone would need to provide an estimate of the magnitude of anomalous forcings due to cosmic rays. For Lumpy, I’d want estimates back to 1880.
With a wildly oversimplified model like “Lumpy”, a ‘modeler’ (if you can even use that word for one who develops something like ‘Lumpy’) might be willing to do a run with very coarse estimates just to see what happens. After all, the time required to “run” the model is trivial.
But, obviously, no one is going to try to run a GCM with estimates they think are much too crude. This is particularly true if they think the forcings they currently use do explain the data. (And, like it or not, the forcings Gavin gave me, shoved into “Lumpy”, do result in a pretty decent estimate. In fact, an amazingly decent one, given the approximate nature of the functional model on which “Lumpy” is based.)
If you do have estimates for anomalous forcings, I can shove those in quite easily. That’s pretty much what I did with the forcings Gavin favors, and they make Lumpy work pretty well.
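Mechanically, “shoving in” an extra forcing is just addition: in a lumped balance the anomalous forcings sum, so a proposed cosmic-ray/solar series gets added to the total forcing and the same fit is re-run. A hypothetical sketch, with every number an invented placeholder rather than a real forcing estimate:

```python
# Hypothetical sketch: anomalous forcings in a lumped energy balance are additive,
# so a proposed cosmic-ray/solar series is included by adding it to the total forcing.
# All values below are invented placeholders, not real forcing estimates.
import numpy as np

ghg       = np.array([0.10, 0.20, 0.35, 0.55])    # W/m^2, toy values for four years
volcanic  = np.array([0.00, -0.80, 0.00, 0.00])
solar_gcr = np.array([0.05, 0.05, -0.05, -0.05])  # the proposed extra series

q_baseline = ghg + volcanic                 # the forcing set used so far
q_with_gcr = ghg + volcanic + solar_gcr     # re-run the same (tau, alpha) fit with this instead
```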
Lucia,
I will see if I can find some data for you.
I suspect the historical aerosol and volcanic forcings were estimated by constructing a model similar to yours and figuring out how large they would need to be in order to be consistent with the assumed CO2 sensitivity. This makes the match between Lumpy and the actual data much less surprising.
Raven– Heh. You cynic! 🙂
On the one hand, I don’t believe anyone intentionally did what you say. On the other hand, it is inevitable that scientists would eventually end up with a set of (f, T’) data that matched each other within the physical constraints they believe apply to the system.
I think that, with respect to the most recent ‘f’ data, things will be somewhat constrained by measurements of downwelling and upwelling radiation.
With respect to the historic ‘f’ data, things must be estimated based on indirect evidence. So, there must be quite a bit of uncertainty. Modelers must, of course, pick a specific number to drive their model. How? Well…
I think it’s obvious that any and all models must be validated against data collected after the model is developed. (Except Lumpy of course. Lumpy is always right ’round these parts. 🙂 )
Realistic – not cynical. There are not a lot of options available when it comes to estimating things like historical aerosols. Even today we don’t have a lot of options when it comes to estimating aerosols.
I don’t think people intentionally or even knowingly did this. As far as I can tell the (climate) scientific process goes something like:
1) X writes a paper that estimates CO2 sensitivity;
2) Y writes a paper that uses X’s sensitivity to estimate historical aerosols;
3) Z builds a model that uses the sensitivity from X and the aerosols from Y;
4) W compares Z’s model to X’s estimate and declares that X’s CO2 sensitivity has been validated.
Raven,
I think there are elements of 1-4 in all scientific areas. But your list does miss a few mitigating factors. For example, some information to estimate historic aerosols comes from ice cores and the analysis of materials from digs. So, there is at least some totally empirical evidence to peg estimates somewhat.
More recent estimates can be based on more direct measurements – but only after the equipment existed to measure something, and only if someone thought to do it. And naturally, no one thinks to measure something unless they find the information useful currently, for some reason.
But, on the whole, I think we are in agreement that the only true validations are those that occur after one creates (or, in Lumpy’s case, concocts) a predictive model.
I’ll share my experience with climate models, that is, models of climate in buildings (heat, air and moisture transfer).
Here everything can be deterministically computed. Earth’s climate is obviously more complex, e.g. how does plankton react to higher carbon levels in the sea? No one knows, and it certainly cannot be expressed mathematically. So climate models of buildings are comparatively easy to solve. On to my experience:
If the model is developed for a particular building, the model produces approximately correct results if it is fed with data from actual measurements in the surroundings (wind speed, temperature, cloudiness, relative humidity). Change the input data set to a statistically correct average rather than actual measurements and the model is way off target. Change the actual material to something with slightly different hygroscopic properties and the model is way off. Change the initial state and the results are often bad, even though it shouldn’t be much of an initial value problem.
Take a general model, one of the state-of-the-art ones doing CFD etc., and the model is rarely correct, and never when fed with averaged input data. All the whizzy mathematics is of no real use (though the models remain very popular among scientists – though for some reason they are less popular with the people who have to put their dollars on the outcome).
My conclusion is that a simple model that is trimmed according to very accurate, spatially and temporally representative data does a far better job at predicting future states than a much more complex model that is fed the typical averaged data.
Now GCMs are developed for a much harder job and thus include many more parametrizations, have cells that are too big and grids that are too coarse, are trimmed with very low quality data, and are meant to predict the world not as it has been but as it will be (a house remains the same; the world changes as land use practices change, ocean currents develop, etc.). They are not properly initialized, are fed with input data of poor quality, and are used to create scenarios based on guesses about future economic activity (imagine someone a hundred years ago who was supposed to predict the inventions made during the 20th century and you realize just how futile such guesses are).
Now, my experience leads me to reject a proposition such as “we have to insist that climate models accurately forecast future states of the atmosphere”. They do not. We need them as tools to develop our knowledge, and we should take efforts to develop our understanding of the climate. We should take reasonable steps such as trying to move away from a fossil fuel-based economy (at a pace that doesn’t severely dampen economic growth), including research money for alternative energy sources and carbon taxes. We should, however, not trust the predictions to such an extent that we scare our children, destroy our economies, or stand in the way of poverty reduction in the developing world. Neither should we let the risk from CO2 overshadow other environmental threats. For that, the models are simply too inaccurate.
Avfuktare Vind (I clicked, are you in Sweden?)
I’m in general agreement with you on modeling– and more for the reasons you say than the ones Raven says. (Though, Raven is partially correct.)
I think models are useful for guiding our thoughts about trends. But, I don’t happen to put a whole lot of reliance on GCM’s. The thing is that, from my view, the case for AGW is pretty good without GCM’s.
I guess, in some sense, one of the things I don’t like in this debate is that it sometimes appears that if one thinks AGW is probable, somehow one is then supposed to say “GCMs are Grrrr8!”.
GCM’s are models. They have the same problems and strengths other models have.
I’m also in general agreement on this conclusion:
I lived in El Salvador until I was 6. I’m definitely not for suppressing the economic development of poor countries unnecessarily. So, I do agree that we need to think carefully about what to do on the policy side.
Right now, I am for finding ways to lower GHG emissions (which include more than CO2), but equally, or more importantly, also developing energy sources we already understand. Right now, the only clear path is to encourage nuclear energy in addition to renewables. Encouraging conservation is useful, but that path alone simply won’t get us where we need to go.
Yes I’m in Sweden, and I couldn’t help but to think that “Liljegren” has some history from our nice but peculiar country…
I too have spent my fair share of time in the developing world. The need for economic development there is something an astonishing number of westerners are oblivious to. I’m glad you’re not one of them 😉
I think we have very good opportunities to reduce future CO2 emissions, even without any government efforts at all. The current trend for renewables is a halving of the cost per kWh per decade, and they already compete successfully with fossil fuel in some locations. Give it a few more decades and no new fossil fuel power plants will be built, and a few more and the existing ones will be out-performed and closed. Nuclear has some very interesting developments going, e.g. the “transmutation” reactor that some Swedish scientists are developing (it uses spent fuel and thus solves most of the waste problem). In all, I find the picture not gloomy at all, as current trends will phase out fossil fuel within some 50 years even without action. (This would be true even if the IPCC’s more elevated lambdas are correct, along with a high CO2 residence time; we will simply not reach CO2 levels that provide for the scarier scenarios. And I believe the IPCC somewhat exaggerates both the current radiative imbalance and the CO2 residence time.)
Best,
Avfuktare Vind —
Lucia Liljegren sure sounds like a Swedish name, doesn’t it?
In reality, I’m mostly Irish and Cuban by birth. My father-in-law was born in Malmo. His family immigrated around the time of WWII. I changed my name when I married. Oddly enough, once my name changed, people constantly say things like “I could tell you were Swedish just looking at you!” (Heh!)
I too agree 100% with this statement:
In fact, I suspect that even the most strident skeptics would agree with it. The problem is the alarmists have hijacked the political agenda with GCM-derived fear mongering, which means skeptics get sucked into debates about the predictive abilities of GCMs.
I would also have to take issue with your statement that the AGW case does not require GCMs, because I feel it is necessary to distinguish between AGW and catastrophic AGW. The case for CO2-induced AGW is fairly strong, but I feel that case can only be used to justify the prudent CO2 reduction measures described in the quote above. The case for catastrophic AGW rests entirely on model predictions, because one cannot predict catastrophe without attempting to determine the regional effects of the CO2 warming (e.g. predicting drought in region X or an increase in storm impacts on region Y). One also cannot predict ‘tipping points’ without a model.
Lucia,
You should run some of the IPCC SRES through Lumpy.
I love simple models. Kentucky windage.
Raven– You’re right. Predicting catastrophic warming does seem to require models.
I guess I tend to focus on the question: Is there warming at all? That also comes up.
I think there is a difficulty with lots of these labels. I wouldn’t say that someone doubting the likelihood of catastrophic warming is a denialist. If that makes someone a denialist, it makes “denial” fall inside the bounds predicted by the IPCC. (Well…. unless we re-define catastrophe!)
But it does appear that some people label people who generally suspect there is 1-2C / century of warming happening as denialists. (This is odd, as 2C is in the short-term mid-range predicted by the IPCC. But… there ya’ go!)
Steven,
I plan to. Then, we can validate Lumpy. But, of course, as someone previously pointed out – Raven, I think – I’ll update Lumpy with new data every year. Then, I’ll pretend Lumpy was never wrong. Incantations will be involved. Who knows, maybe I’ll find new terms? 🙂
Of course you’ll find new terms. Duh!
you’ll like this:
http://fumapex.dmi.dk/Pub/Docu/Reports/FUMAPEX_D4.6.fv.pdf
Hi Lucia,
I was arguing with people at Tammy’s about the Pinatubo prediction that supposedly validates the climate models. It appears a model did predict the post-eruption response, but there is a bait and switch going on because they did not use the GCMs to make the prediction – they used a “primitive climate model”.
http://www.giss.nasa.gov/research/briefs/hansen_02/
The mention of a “primitive climate model” made me think of Lumpy.
The paper does not say what primitive means. I wondered if it might be similar to Lumpy, but the wiggles seem a little large for that.
Now I have spent some time looking at Figure 9.5 in AR4
http://ipcc-wg1.ucar.edu/wg1/Report/AR4WG1_Print_Ch09.pdf
The model ensemble range in this figure is quite interesting because it seems to cover all of the bases, from no cooling at all to a whopping 0.6 degC of cooling. With ensembles like that it is really hard to be wrong.
That said, the bottom half of Fig 9.5 is also interesting because it seems to be telling us what the IPCC claims is ‘weather noise’ without anthropogenic forcings, and the ensemble range is amazingly tight – but that is just the eyeball method. I am curious how this noise compares to the other estimates of noise you have been using.
Do you mean this figure? I’m in the process of looking at all that spaghetti individually.
The few I’ve downloaded and looked at in a cursory manner fall into two categories:
1) Those that appear to have mega-whopping “weather noise” all the time, including the 20s and 30s when no volcanos were erupting.
2) Those that calm down to have normal “weather noise” when volcanos aren’t erupting. These calmed a bit during the 20s and 30s, and also have less weather noise now.
But…. I’ve only looked at 4. Then I got distracted by my blog freezing for some reason, and fiddled with the format. I’m going to get back to looking at the data Monday.
Yes, that’s the one.
There are a couple things that bug me about it.
1) There is a 0.6 degC temperature rise from 1910 to 1940 that is not matched in either of the scenarios. How can they claim that a rise of 0.7 is not natural when they have no explanation for a 0.6 degC rise?
2) In theory, the planet would have cooled 0.2 degC since 1940 without GHGs. Why? The sun? Longer-term effects of volcanos? That is a significant drop if one insists that natural forcings are insignificant.
The people making the models are demanding action now based on their predictions. You, as a purer scientific mind, are naive to the fact that the greens do not intend to wait for the models to work themselves out, as they are stoking the gravy train at full blast.
Sure, we should theoretically wait a few decades for the scientists to tweak their models, but the question is not about the models in the future, it’s about the models right now and the climate laws right now.
In the immediate timeframe we must reject the models for their poor predictive power and accept them once they actually match the empirical universe.
And to second Raven: what stops the modellers from simply incorporating contradictory data into their models every climate cycle?
By repeating this over and over they can always be right even if their predictions are always wrong.