Some readers may be aware that some vocal climate modelers (and some who comment on blogs) word their descriptions of AOGCM results to suggest that results from climate models are absolutely, totally, utterly and completely independent of observations of historic surface temperatures. Yet, from time to time (in fact rather frequently), it is possible to find quotes in peer reviewed articles suggesting that other competent scientists do recognize that testing models against data not available prior to the model runs might benefit our ability to evaluate models.
Or, at least, that’s how I interpret the following interesting quote from C. Reifen and Toumi:
[17] Ideally we would want to test models which have been developed independently of observations. We recognize the importance of the post 1970 climate shift period and it would have been instructive to test this period with models having no prior knowledge of it. However, this problem also applies to any climate projections using a subset of models based on 20th Century performance and it is this approach we are testing here. We are not implying that comparisons against observations are not important in model validation. Good agreement with past climate builds confidence in the reliability of a model’s future projections. Our analysis only examines selection based on models’ ability to replicate a mean anomaly over a historic time period. There are other criteria that could be used and would be worth investigating.
Of course, I may be mistaken, but “Ideally we would want to test models which have been developed independently of observations” suggests that the authors think that, at least in some sense, the models (or realizations) were not developed totally independently of observations. Moreover, “We recognize the importance of the post 1970 climate shift period and it would have been instructive to test this period with models having no prior knowledge of it” suggests the authors recognize that some choices made by modelers might, at least hypothetically, have been influenced by prior knowledge that the shift had occurred.
Of course, the only cure for this is to test projections against observations that did not precede any choices made by modelers. This is admittedly difficult to do. However, this difficulty does not mean that the public is required to gouge out their eyes and pretend they cannot see that testing models only against hindcasts limits our confidence that models can predict the future.
Someday, modelers may be able to provide evidence that climate models can predict (or even project) the future with some degree of accuracy, and will show this without resorting to shenanigans like changing the characteristics of their smoothing filters without notifying readers. (See (1), (2), and (3).)
Until that time, a not-insignificant number of people who believe model simulations are qualitatively correct will continue to suspect the accuracy of any quantitative projections or predictions of future warming.
Hat Tip: WUWT which always scoops me on everything.
Please proofread “suggest that results from climate are”, as I think there’s a word missing. The results from climate tend to involve plants, but you’re not talking about that.
Lucia,
This is like preaching to the choir, but it really is hard to imagine someone actually arguing that the climate models are independent from historical observations. You hardly need to “back out” the implication from this paper.
Models of anywhere near this complexity have some “degrees of freedom” in their construction. The models that look obviously wrong were tossed. The less well constrained parameters were either taken from observations or made up so that the whole thing looks sort of like what we observe.
Compare SST and the MOC in the Atlantic, for example. Evidently a choice was made that it was more important to produce something realistic in one area of the model than elsewhere in the model; it is also evident that reasonable models can differ quite substantially on the latter point.
Oliver
It is very difficult to believe someone would argue this in a peer reviewed article or standing in front of an audience familiar with modeling.
But, it is my distinct impression there are some (not necessarily lead bloggers) who respond by insisting that the ability to compare results of GCMs to past surface temperatures matters very, very little because models are based on “physics”, and that this somehow takes care of the whole problem associated with our having observed how an AOGCM as a whole predicted historic data. (The various arguments are hard to summarize and are scattered all over the web. But to make an analogy, it is as if someone suggested that early engineering models could predict flow over a backward-facing step because we had validated a gradient-transport model for turbulent momentum transport in pipe flow, rather than fitting a correlation for drag as a function of velocity over the backward-facing step.)
I agree with your characterization of how historic data affect the models. I’ve said more or less the same thing in the past. (You can read an attempt to explain a similar idea here.)
The fact is: no matter what we learn from separate-effects experiments or in clean flows, an AOGCM that predicts the features of greatest interest poorly will be adjusted to some extent. Surface temperatures are a feature everyone looks at intensively, so models with poor agreement with surface temperatures over the 20th century will be tweaked until they don’t look too bad.
There are also a number of FAQs, and wordings of posts, that minimize the impact of “tuning” to the extent that they convey a false impression of the level of confidence we might place in scientists’ or engineers’ ability to accurately predict behavior using parameterized equations in many problems involving transport phenomena. (For the non-engineers and scientists: “transport phenomena” = “conservation of mass, momentum and energy in moving fluids”.)
Of course, anyone who understands the process of developing and improving models knows that invoking the word “physics” is not a sufficient condition to ensure a model has not been tuned to get good results in some particular area.
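To make the point concrete, here is a minimal sketch, in Python, of the kind of “tuning” I mean. Everything in it is invented for illustration – the toy zero-dimensional model, the made-up forcing ramp, the pseudo-observations, and the parameter name lambda_s – so it is a caricature of the process, not anyone’s actual code:

```python
import numpy as np

# A deliberately silly, zero-dimensional "model": anomaly = lambda_s * forcing(t).
# The forcing ramp, the pseudo-observed record, and the parameter lambda_s are
# all invented for illustration.
rng = np.random.default_rng(0)
years = np.arange(1900, 2001)
obs = 0.006 * (years - 1900) + 0.1 * rng.standard_normal(years.size)  # fake "observed" record
forcing = 0.008 * (years - 1900)                                      # fake forcing history

def hindcast_rmse(lambda_s):
    """Mismatch between the toy model's hindcast and the 'observed' record."""
    return np.sqrt(np.mean((lambda_s * forcing - obs) ** 2))

# "Tuning": scan the free parameter and keep whichever value makes the
# 20th-century hindcast look best.
candidates = np.linspace(0.1, 2.0, 200)
best = candidates[np.argmin([hindcast_rmse(c) for c in candidates])]
print(f"tuned parameter lambda_s = {best:.2f}")
```

The point of the caricature: once the free parameter has been chosen to minimize the mismatch with the observed record, the hindcast is no longer an independent test of the model.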
Well said Lucia. Interestingly enough, we have the AR4 runs, so every year that goes by gives us a better comparison of observed reality vs. models. I am sure that within the next two years, if observations continue to refuse to warm, somebody from the “team” will try to make a big deal of one of the tiny eruptions of the last few years.
Pay no attention to that man behind the curtain!
http://www.youtube.com/watch?v=YWyCCJ6B2WE
Lucia:
Of course, all scientific models are designed to mimic some aspect of the ‘world’ and therefore are built around some previous ‘observations’. And when they are tweaked to do well with hindcasts, previous observations further ‘restrict their degrees of freedom’.
We judge how much confidence we wish to place in a model on BOTH the number of free parameters used to fit projected or predicted performance AS WELL AS any ‘confidence intervals’ around the projected/predicted mean, set by the uncertainty due to the ‘noise’ in the data. Add to that a component of uncertainty in the data that were used to judge the quality of any hindcasts.
Judging statistical confidence intervals is somewhat more ‘objective’ than judging effect of the biases in the model construction itself. The latter is where ‘expertise’ should generally be given more weight than apparent logical consistency.
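As a toy illustration of the ‘noise’ component alone (made-up numbers, a plain OLS trend, white-noise residuals assumed):

```python
import numpy as np

# Toy illustration: the trend fitted to a short, noisy record carries a wide
# confidence interval before any model-structure bias even enters.
rng = np.random.default_rng(3)
years = np.arange(2001, 2011)                    # a short 10-year record (invented)
anoms = 0.02 * (years - 2001) + 0.1 * rng.standard_normal(years.size)

# Ordinary least squares trend and its standard error (white-noise residuals)
X = np.column_stack([np.ones_like(years, dtype=float), years - years.mean()])
beta, *_ = np.linalg.lstsq(X, anoms, rcond=None)
resid = anoms - X @ beta
s2 = resid @ resid / (len(years) - 2)
se_slope = np.sqrt(s2 / np.sum(X[:, 1] ** 2))

print(f"trend = {beta[1]*10:.2f} +/- {2*se_slope*10:.2f} per decade (approx. 95% interval)")
```

With a record that short and noise that large, the interval is comparable in size to the fitted trend itself.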
Your adeptness at poking holes in over-stated confidence in analyses of noisy, short runs of data is laudable.
But making too much of semantics (e.g., the differences between predictions and projections) won’t necessarily help non-experts who read this blog.
The confusion in the quoted text simply reveals the fact that many scientists are less than expert in their use of language.
Len–
I think the difficulty with the words “projection/prediction” is that there is a distinction with almost no practical difference. It is true that the IPCC wants to use “projection” to mean provisional predictions that depend on some hypothetical, postulated forcing scenario or story line for future emissions. So, in principle, they aren’t plain vanilla predictions. But the postulated forcing scenarios aren’t pulled out of nowhere. They do represent some sort of range of predictions – and then the climate scientists themselves compare the projections to data. So… which word gives the correct nuance?
Lucia said “So… which word gives the correct nuance?”
I thought it was agreed “hepatomancy” or “augury” were the preferred choices.
John,
I had to look up “hepatomancy”. Ick!
Lucia,
Rahmstorf seems to be the mercaptan of the month here. But any sins he committed did not involve AOGCM modelling. He was testing observations against someone else’s AOGCM modelling.
But I’ll take you up on your backward step analogy. Take k-epsilon. What Launder and co did was to show, by dimensional analysis etc., that for isotropic turbulence you could get closure with a small number of empirical constants. RNG makes it even clearer. When you’ve found the constants (tuned the model), you can apply it in a large range of circumstances.
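For readers who have not met it, the standard high-Reynolds-number form of that closure – an eddy viscosity plus modeled transport equations for k and epsilon, with the usual Launder–Spalding constants – looks roughly like:

\nu_t = C_\mu \frac{k^2}{\varepsilon}

\frac{Dk}{Dt} = \frac{\partial}{\partial x_j}\left[\left(\nu + \frac{\nu_t}{\sigma_k}\right)\frac{\partial k}{\partial x_j}\right] + P_k - \varepsilon

\frac{D\varepsilon}{Dt} = \frac{\partial}{\partial x_j}\left[\left(\nu + \frac{\nu_t}{\sigma_\varepsilon}\right)\frac{\partial \varepsilon}{\partial x_j}\right] + C_{1\varepsilon}\frac{\varepsilon}{k}P_k - C_{2\varepsilon}\frac{\varepsilon^2}{k}

with C_\mu = 0.09, C_{1\varepsilon} = 1.44, C_{2\varepsilon} = 1.92, \sigma_k = 1.0, \sigma_\varepsilon = 1.3, where P_k is the production of turbulent kinetic energy by the mean shear. Those five numbers are exactly the sort of once-fitted, then widely reused, empirical constants being described.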
So incorporating a lot of physics/maths, plus a bit of tuning, gives you a widely applicable model. Just fitting constants without the physics wouldn’t do nearly as well.
That’s what AOGCM’s are seeking to do.
Nick–
Who suggested Rahmstorf committed any sin involving AOGCM modeling? I ran Firefox’s search tool to verify that you are the first to say anything about Rahmstorf at all on this post.
No one disputes those who write and run AOGCM’s seek to develop models and parameterizations that are widely applicable. But seeking to do this does not automatically guarantee success.
The question to ask is whether or not they have already succeeded, and how we verify that they have succeeded. This same question arose in developing engineering codes (and still does).
Let’s look at your example: You know that k-epsilon has been quite popular, and is still used. Historically, we applied it to a variety of problems, sometimes with success; many papers were published showing how well it worked in specific cases.
However, it doesn’t always do well (or doesn’t do well enough). This fact was quickly discovered once people tried to extend the closure to more complicated problems.
So, the fact that k-e did well in flows that share certain characteristics did not necessarily translate into equally good results in other situations. Consequently, other turbulence models were developed. (Some were more direct extensions of k-e, like Reynolds stress closures; some, like large eddy simulation, are philosophically a bit different.)
But even in flows where k-e doesn’t do so well, we all knew that modelers had some latitude to tweak simulation results. Because the magnitudes of parameters in k-e were not tightly constrained, a practitioner who had access to some empirical data could sometimes “tweak” parameters in k-e a bit to get somewhat better agreement in a particular flow. And they did. (Sometimes they made no bones about this. Models are tuned, and for some industrial applications, this is just how the tool was used.)
When no empirical data existed, no tuning was possible, and results were often less favorable when the model was used for true, honest-to-goodness predictions. This emphasizes my point: we can’t know whether a model or parameterization “really” works until we predict some data that were not observed before the models were run.
Given all this, in my opinion, what experience with k-epsilon models tells us is precisely that we should not become confident a particular physics model will work in all cases simply because a modeler could get fairly decent results predicting flows which had already been observed. The true test of the model comes when the modeler predicts a flow that has not been observed.
Whether we are discussing AOGCMs or engineering models, only when models are shown to work well at predicting observations before they are observed can we have confidence in the models’ ability to predict.
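A toy sketch of why hindcast skill can flatter a model (all numbers invented; a polynomial stands in for “a model with more adjustable parameters”):

```python
import numpy as np

# Toy demonstration: a model with more free parameters can match the "already
# observed" period better while doing worse on data it never saw.
rng = np.random.default_rng(1)
t = np.arange(100.0)
x = t / 100.0                                  # scaled time, to keep the fit well conditioned
data = 0.01 * t + 0.2 * rng.standard_normal(t.size)

past, future = slice(0, 70), slice(70, 100)    # data known to the "modeler" vs. not yet observed

for n_params in (2, 9):                        # straight line vs. 8th-degree polynomial
    coeffs = np.polyfit(x[past], data[past], n_params - 1)
    fit = np.polyval(coeffs, x)
    hindcast_rmse = np.sqrt(np.mean((fit[past] - data[past]) ** 2))
    forecast_rmse = np.sqrt(np.mean((fit[future] - data[future]) ** 2))
    print(f"{n_params} parameters: hindcast RMSE {hindcast_rmse:.2f}, "
          f"forecast RMSE {forecast_rmse:.2f}")
```

The extra knobs buy a better fit to the already observed period, not better skill on what comes after – which is why only out-of-sample tests settle the question.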
“Who suggested Rahmstorf committed any sin involving AOGCM modeling?”
Well, what does this mean?
“Someday, modelers may be able to provide evidence that climate models can predict (or even project) the future with some degree of accuracy, and will show this without resorting to shenanigans like changing the characteristics of their smoothing filters without notifying readers. (See (1), (2), and (3).)”
On k-e etc., yes, I agree it doesn’t work for everything. But it does well with a backward-facing step, to relate to your analogy. All I’m saying is that physics plus fitting is a lot better than fitting without physics. And is in fact useful.
Nick–None of those references suggest that Rahmstorf ran the codes. You said “He was testing observations against someone else’s AOGCM modelling”. Yes. That’s what he’s doing in those references.
This post is discussing the need to test against data that were not observed before the model runs are done. Rahmstorf is making some baby steps in that direction. I’d have linked his attempts whether he had done them well or badly – qualitatively, that is what needs to be done. But linking and saying it needs to be done is not a neutral thing, so yes, I mention that it’s not being done in a balanced way.
Sure. Of course fitting with physics is better than fitting without physics. Anyway, what’s with this vague “useful” business? Who says using physics is not “useful” (whatever that is supposed to mean here)?
“Useful” seems to be the blind-date equivalent of “She’s nice,” given in answer to the skeptical man’s question “Is she good looking?” Yeah… well… she probably is nice… (whatever that might even mean).
Getting back to models: “Useful”, or “Promising” isn’t the same as “Has been shown to accurately predict future surface temperatures”.
When a model depends heavily on physics/maths, it’s simply using previously established and especially well-supported models as sub-parts. Experts are in better positions to judge the ‘plausibility’ of applying the particular physics/maths to the particular problem – whether it’s a good bet or a waste of time.
But no one yet knows how to calculate an appropriate level of confidence to associate with such ‘Bayesian Priors’.
As a result, science requires new observations that ‘confirm’ model interpolations ‘between old observations’ and/or model extrapolations beyond those old observations. Confidence in a model therefore hinges mainly on measures of how good its ‘predictions’ or ‘projections’ are.
Another interesting quote: “What does the accuracy of a climate model’s simulation of past or contemporary climate say about the accuracy of its projections of climate change? This question is just beginning to be addressed…”.
IPCC AR4, page 594
Nick Stokes,
IMHO the issue is mainly the “physical principles” defense, which is often invoked when the applicability of the GCMs for future predictions has been called into question.
That they are not isolated, “pure physics,” with no dependence on observations, doesn’t seem to be in much doubt; nor, as you pointed out, would that be desirable.
P.S.: Lucia, your Lumpiness has wholly preceded me. 🙂
I think there is another fundamental problem: essential to the veracity of GCM projections is the assumption that the climate has low internal variability; however, this assumption itself comes (to a significant degree) from running the GCMs.
The AOLGCM models are not based on ‘fundamental’ principles. A few parts are instead based on models of some fundamental principles.
The fundamental principles on which some of the model equations are based, however, are not the most important aspects of AOLGCMs.
The most important aspects are buried in the parameterizations of sub-grid and too-complex-to-model-from-first-principles and too-difficult-to-calculate-from-first-principles and we-don’t-fully-understand physical phenomena and processes.
These parameterizations are modeled based on empirical information from the application domains; some are ad hoc, others are quite heuristic. They are not based on fundamental properties of the materials involved. The distinction between models of the application domain and fundamental models of material responses is critically important. The latter, after having been validated, can be extrapolated to other applications with confidence; the former cannot be extrapolated.
While there is nothing wrong with this approach to modeling inherently complex physical phenomena and processes, it is wrong to state that the modeling is comprised of fundamental descriptions of material responses. For AOLGCMs this is usually expressed as ‘based on the Navier-Stokes equations’. Most are not; they might be based on some model approximations to these fundamental equations. Radiative energy transport is also usually thrown into such statements. This characterization conveniently overlooks the harsh reality that it is the interaction of the radiative transport with the extremely-difficult-to-model and we-don’t-fully-understand particulate matter in the atmosphere that is critically important.
If the effects of the uncertainties in the models, especially the parameterizations, were propagated along with the evolution of the solution, the ultimate error bars would eventually begin to grow exponentially. Almost any ‘model projection’ would fall within those error bars.
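A bare-bones sketch of what I mean, using a deliberately trivial forward model with one uncertain feedback-like parameter (everything here is invented for illustration):

```python
import numpy as np

# Toy error-propagation exercise: run a trivially simple forward model many
# times with a perturbed (uncertain) parameter and watch the ensemble spread
# grow with time. The model and numbers are made up for illustration only.
rng = np.random.default_rng(2)
n_members, n_steps, dt = 500, 100, 1.0
feedback = 0.02 * (1.0 + 0.3 * rng.standard_normal(n_members))  # parameter known to ~+/-30%

T = np.zeros((n_members, n_steps))
for k in range(1, n_steps):
    # dT/dt = forcing + feedback * T, integrated with forward Euler
    T[:, k] = T[:, k - 1] + dt * (0.01 + feedback * T[:, k - 1])

spread = T.std(axis=0)
print("ensemble spread at steps 10, 50, 100:", np.round(spread[[9, 49, 99]], 3))
```

The spread across the ensemble starts out negligible and grows rapidly as the parameter uncertainty is carried forward with the solution.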
If I correctly recall, the original ‘universal constants’ for the k-epsilon model cannot be used for applications to the backward-facing step. I recall that the original constants were eventually determined to be applicable to mostly-parabolic, uni-directional, shear dominated flows. I’ll check my Pope on these if necessary.
The k-epsilon model of turbulent flows is an excellent illustration of the above discussion. It is a model applicable to some turbulent flows; it is not a model of material responses.
Oh, I forgot.
It is known that the numerical solution methods in all AOLGCMs have yet to be shown capable of satisfying the fundamental, critical, and overriding requirement of convergence.
Under these conditions, the nature of the continuous equations is immaterial. The numerical methods are not solving the equations anyway.
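For readers unfamiliar with the idea, a convergence check in its simplest possible form looks like this (a toy ODE with a known exact solution, forward Euler, nothing resembling an AOLGCM):

```python
import numpy as np

# Solve dy/dt = -y, y(0) = 1 with forward Euler, refine the step size, and
# confirm the numerical error shrinks at the expected rate.
def euler(dt, t_end=1.0):
    n = int(round(t_end / dt))
    y = 1.0
    for _ in range(n):
        y += dt * (-y)
    return y

exact = np.exp(-1.0)
for dt in (0.1, 0.05, 0.025):
    print(f"dt = {dt:5.3f}   error = {abs(euler(dt) - exact):.2e}")
# Halving dt should roughly halve the error (first-order convergence). If the
# errors do not shrink in a controlled way as the discretization is refined,
# the scheme is not converging to a solution of the continuous equations.
```

That is the kind of demonstration that, as far as I know, has not been produced for the full numerical machinery of the AOLGCMs.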