Real Climate Tries to Spin Pielke: A curious lesson 2.

Last week I mentioned Roger Pielke Jr’s interesting article comparing IPCC predictions/projections for Global Mean Surface Temperature to observations taken on the real planet Earth. Yesterday, Stefan Rahmstorf posted a rather odd commentary at Real Climate. I say odd because, as far as I can tell, Stefan’s commentary goes off on tangents rather than discussing Roger’s major point, which would seem to be:

It would be wise for

  1. the IPCC to formalize their methods to verify models,
  2. to communicate the results of those verifications clearly and quantitatively, and
  3. to communicate the reasons for changes in projections more clearly than found in recent documents.

I agree with Roger on these things. Standard verification methods, quantification and better communication would help the public better understand what is in store for the world’s climate.

So, now that I’ve described what appear to be Roger’s major points, let us examine what Stefan at RC says!

Stefan is concerned that Roger’s one page article doesn’t properly account for ‘noise’ in weather data.

Stefan Rahmstorf says:

If you are comparing noisy data to a model trend, make sure you have enough data for them to show a statistically significant trend.

Accounting for “noise” is important in many analyses, and it is needed to support certain types of conclusions. As it happens, it’s not quite clear that it’s necessary to support Roger’s argument in this specific paper. There is no rule of logic that says Roger must do a full uncertainty analysis in order to suggest that the IPCC ought to be doing one!

But, I can think of an example where “noise” should have been accounted for formally. That would be Stefan Rahmstorf’s own Rahmstorf et al. 2007, published in Nature. Stefan’s article shows a graphic comparing observations of GMST to IPCC projections of GMST. That graphic includes uncertainties in predicted trends, but it includes absolutely no indication of any uncertainty associated with determining an underlying trend from noisy data.

Yet that graphic is the main evidence Stefan used to support his major points about the correspondence between projections of GMST and data, and making that comparison is what Stefan is supposedly doing in his paper. David Stockwell recently discussed the uncertainty associated with Rahmstorf’s trend analysis here. Accounting for this uncertainty in determining a trend from weather data would impact the conclusions in Rahmstorf et al. 2007.

What happens if we account for uncertainty in the noisy weather data?
It just so happens that, using the GMST data available since 2001, it is possible to exclude certain predicted trends, and we can say this with a relatively high degree of statistical confidence. We find the 2C/century trend predicted/projected by the IPCC AR4 document is falsified to a confidence of 95%. Could 2C/century still be true for the underlying trend? Sure. Things that happen 5% of the time, happen 5% of the time! (Other caveats also apply, as mentioned in my previous articles.)

The major results are shown graphically:

GMST data after 2001
The red line is the trend projected by the IPCC. The fuzzy bands are the sorts of uncertainties they communicated to the public. The purple ones are trends consistent with data since 2001.1
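
For readers who want to see the mechanics, a minimal sketch of the sort of test involved follows. It is not the exact calculation behind the figure: the data file name is a placeholder, and the lag-1 autocorrelation adjustment shown is one simple choice among several.

    import numpy as np
    from scipy import stats

    # Placeholder input: monthly GMST anomalies from Jan 2001 onward.
    anomalies = np.loadtxt("gmst_monthly_since_2001.txt")
    months = np.arange(anomalies.size)

    # OLS trend in C per month
    slope, intercept, r, p, stderr = stats.linregress(months, anomalies)

    # Crude allowance for lag-1 autocorrelation in the residuals:
    # shrink the effective number of independent observations.
    resid = anomalies - (intercept + slope * months)
    rho = np.corrcoef(resid[:-1], resid[1:])[0, 1]
    n_eff = anomalies.size * (1.0 - rho) / (1.0 + rho)
    stderr_adj = stderr * np.sqrt(anomalies.size / max(n_eff, 2.0))

    per_century = 12.0 * 100.0
    fitted = slope * per_century
    ci95 = 1.96 * stderr_adj * per_century
    print("fitted trend: %.2f +/- %.2f C/century (95%%)" % (fitted, ci95))
    print("2.0 C/century outside the interval: %s" % (abs(2.0 - fitted) > ci95))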

Stefan seems to suggest there is a magic number of years required to evaluate any and all claims about weather trends.

Referring to Pielke and John Tierney’s earlier discussions of 8-year trends (and forcing us to search for the links), Stefan at Real Climate tells us:

We showed that this period is too short for a meaningful trend comparison.

And suggests,

“. . . 17 years is sufficient.”

Stefan provides no explanation of why 17 years is sufficient, and no explanation of how the reader might determine whether 15 years, 12 years or 10 years might be enough. (Though, I should note that the criteria for sufficiency of data are reasonably well understood as a matter of statistical testing.)
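
To give a feel for what “a matter of statistical testing” means here, below is a rough sketch of how one might estimate the record length needed before a trend of a given size stands out from year-to-year noise. The trend, noise level and autocorrelation plugged in are illustrative assumptions, not measured values, and the AR(1) allowance is deliberately crude.

    import numpy as np

    def years_needed(trend_per_year, noise_sd, rho=0.0, z=1.96):
        """Smallest n (years of annual data) with |trend| / se(trend) >= z."""
        inflation = np.sqrt((1.0 + rho) / (1.0 - rho))   # crude AR(1) allowance
        for n in range(3, 200):
            sxx = n * (n ** 2 - 1) / 12.0   # sum of (t - tbar)^2 for t = 0..n-1
            se = inflation * noise_sd / np.sqrt(sxx)
            if abs(trend_per_year) / se >= z:
                return n
        return None

    # Illustrative values: 0.02 C/yr trend, 0.1 C interannual noise, lag-1 rho = 0.3
    print(years_needed(0.02, 0.1, rho=0.3), "years")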

Stefan’s “proof” that 8 years is too short is, evidently, his own blog post containing a chart showing that eight-year down-trends do occur during periods with a long-term uptrend.

What does Stefan’s previous blog post discuss? It discusses a period of time during which downtrends were caused by major volcanic eruptions. You can read more about why this sort of proof is totally absurd here. The argument can be made with these two graphs, which illustrate that the longish dips in temperature are associated with volcanic eruptions:


It is no surprise that many aren’t entirely convinced by the Real Climate “proof” that 8 years is too short to test any and all claims of climate trends during any and all periods. The fact is that, when comparing data for GMST to predictions of GMST, everyone’s own tacit knowledge (aka common sense) tells them that they need to apply the first of Gavin and Stefan’s three rules from “lesson 1”:

Firstly, are the drivers changing as we expected?

In Real Climate’s example “proof”, the 8 year dips were associated with an unpredictable driver called “volcanic eruptions”. These are absent from the recent period and cannot explain the failure of the IPCC projections to track GMST since 2001. Additional statistical statements involving words like “standard deviation” and “uncertainty intervals” could be made, but need I make them here?

Stefan criticizes Roger Jr. for not properly distinguishing between forecasts and scenarios.

Stefan says,

3. One should not mix up a scenario with a forecast – I cannot easily compare a scenario for the effects of greenhouse gases alone with observed data, because I cannot easily isolate the effect of the greenhouse gases in these data, given that other forcings are also at play in the real world.

Indeed. When evaluating IPCC reports, modelers often wish to make the distinction between forecasts and scenarios. But I see no evidence Roger made this mistake.

But, since it’s Rahmstorf criticizing Roger Pielke Jr., one might ask: As a practical matter, with regard to emphasizing the distinction between forecasts and scenarios, how do Rahmstorf et al. 2007 and Pielke 2008 differ?

As lead author of Rahmstorf et al. 2007, Stefan compares projections to observations. He perceives differences between the two. He discusses why observations and projections may differ: This is a question of interest to people who develop climate models, and also of interest to the voting public. Yes. We would all like to know why the models and data are different.

In Roger’s article, Roger similarly compares the projections and observations. Like Stefan, he notices differences between the two. Roger suggests both the model projections and the comparisons to data are useful. But, rather than discussing why observations and projections differ, Roger suggests the information, as communicated to the public, might be somewhat confusing. Roger then suggests the method of communicating the projections and their relationship to observations could be improved, saying:

Such projections, no matter how accurate, can help to accelerate understanding about the human effects on the climate system as well as the limits to our ability to predict future consequences of those effects with accuracy and precision.

To facilitate such comparisons the IPCC should (1) clearly define the exact variables in its projections and the appropriate corresponding verification (observational) datasets, and (2) clearly explain in a quantitative fashion the exact reasons for changes to its projections from assessment to assessment, in even greater detail than found in the statement in 1995 regarding aerosols and the carbon cycle. Once published, projections should not be forgotten but should be rigorously compared with evolving observations.

There is no evidence Roger has confused forecasts and projections. Roger treats projections as projections, calls them projections and discusses how the information in projections could be made more useful to people (like voters and policy makers) who wish to better understand climate.

And though Roger doesn’t say this, he might suggest that the forecast/projection distinction is itself confusing to the public, and that clearer explanations on the part of the IPCC could clear up that confusion.

Yet Roger’s suggestion that the IPCC could improve the utility of their projections by making verification clearer and more quantitative seems to bother Stefan, who complains:

1. IPCC already showed a very similar comparison as Pielke does, but including uncertainty ranges.

But I would say this: The IPCC’s comparisons are similar to Roger’s to the extent that the two share deficiencies.

Roger’s comparison uses no standard statistical techniques to account for uncertainty in noisy data; neither do the IPCC’s or Rahmstorf’s. Roger’s comparison does not quantify the degree of agreement or disagreement; neither does the IPCC’s (or Rahmstorf’s). Roger’s method is purely visual; so are Rahmstorf’s and the IPCC’s.

These features are flaws when they appear in the IPCC documents and in Rahmstorf, but not in Roger Pielke’s paper. The reason is subtle: Roger’s point is that better, more quantitative comparisons are required. He does not appear to be holding his method up as the gold standard, or even claiming it is adequate!

Conclusion

I concur with Roger that the IPCC documents require clearer explanations of why the model projections change over time. The documents should be expanded by one or two pages to include clear discussions of verification of past IPCC projections. The results of these verifications should be quantified and communicated clearly and unambiguously. The exact reasons for changes in projections should be explained, and when possible related to the quantitative discussions of the verifications done on past projections.

As it stands, the narratives in the IPCC amount to qualitative feel-good ruminations that provide little guidance to those who wish to better understand how we are to interpret the current projections. Better comparisons are required. I applaud Roger for saying so, and Nature for publishing the article where he said this.

End notes:

1. Note: My use of the word “projection” is in the sense of “extended line”. As in: The tree limb projects over the fence. Given the size of the IPCC chart, and the crudeness of my graphics tools, it’s difficult for me to draw short lines and check that they are inclined to match the trend I wish to show. In future, I plan to be careful with the use of the term “projection” when I mean, “I extended the line due to the limitations of my crude graphics tools.”

2. Roger Pielke’s blog is having difficulties. The best link to his response to the Real Climate post seems to be Roger’s response:

39 thoughts on “Real Climate Tries to Spin Pielke: A curious lesson 2.”

  1. Their examples expose another problem with their position by using volcanic-eruption-related downturns. They keep saying the models account for the major drivers. If previous downturns were due to volcanism, and there is now a dip that is not volcanic and for which they have no verifiable explanation, then there must be drivers they have not accounted for. So how can they claim robustness for the models?

  2. BarryW–
    And moreover, we know they believe those dips are due to volcanic eruptions, because the fact that they hurried up, coded up the model after Pinatubo, and predicted the temperature dip before it happened is part of the proof the models have some predictive ability.

    Clearly, the models have some predictive ability. The real questions pertain to quantifying that ability: how much predictive ability do they have? Also, with respect to the full IPCC modeling process: does the full modeling process result in something people can understand? Does it help them make useful decisions?

    (It’s also necessary to make sure we don’t go off solely onto the “How good are AOGCMs?” question. They represent only a portion of the models and methods. The full methodology starts with creating scenarios using one sort of model, estimating forcings using models, running AOGCMs, creating and tuning “simple models” to match the AOGCMs, running the simple models, and then doing some statistical treatment.)

  3. I’m not sure I’d count what is almost retrofitting a model to what should be a known driver as showing predictive capabilities. I’m coming to the position (admittedly one tempered by a lot of ignorance) that the models are useful for measuring sensitivities because you can hold other variables static. Because of the number of apparent (and unquantified) feedbacks in the real world I doubt their predictive capabilities. Atmoz has a post on the Gray hurricane predictions that says that the average from previous years is a better predictor than Gray’s predictions. Admittedly that is weather and not climate. I wonder how the models are doing (as opposed to what the IPCC has derived from them) relative to the time frame you’ve been looking at? Are they any better than a short term average or trend?

  4. BarryW:

    I wonder how the models are doing (as opposed to what the IPCC has derived from them) relative to the time frame you’ve been looking at? Are they any better than a short term average or trend?

    I don’t know the answer to that. The reason I limit myself to what the IPCC documents communicate to the public is that those documents are supposed to be, in some sense, a consensus of what climatologists thought most likely at the time the documents were written.

    I have no doubt that someone, somewhere has models, and individual model runs, that would match the current behavior. I also have no doubt that someone, somewhere could dredge up model runs that would be so hilariously incorrect as to leave us gasping.

    But given that the IPCC– and climatologists– are relying on some sort of averaging to project into the future, it seems reasonable to discuss only those results that were communicated as plausible.

  5. Nice analysis and a fair parsing of the exchange. Maybe there is a cure after all for the muddy-headed thinking about “weather noise”, “internal climate variability”, and model and data uncertainty that passes in the literature as “science”. A higher level of scrutiny is clearly warranted. A reply from Dr. Rahmstorf would be most welcome.

  6. Thanks bender! (I don’t think I’m on Real Climate’s radar screen. And, for some reason, WordPress doesn’t seem to ping them the way it pings other people I link. I don’t know quite how that happens, since I thought that pings just go “out”, but can be ignored.)

    Trivia for the minute: When the email alerting me to your comment arrived, I clicked to the screen, faced the window and said “S..t! It’s snowing!” I went to the window and opened it; there are flakes, but they are wet and not accumulating. But…

    I may need to look up when we last had mid-April snow. I’m sure it’s happened, but, darn! I set up my spring house last week to get my seedlings started, and I don’t want it covered in snow!

  7. I share Bender’s view that a reply by Dr. Rahmstorf to the serious criticisms that have been made of Rahmstorf et al (2007) would be most welcome. This should have been a higher priority for him than that of posting his ‘Model-data-comparison, Lesson 2’ [for Roger Pielke, Jr] on RealClimate.

    But all seven of the Rahmstorf et al co-authors have some responsibility in this matter, as do the six prestigious research institutions in five countries (Germany, France, Australia, the UK and the US) with which they are affiliated. And so too does the journal in which the article was published (which, incidentally, was the US-based ‘Science’, not the UK-based ‘Nature’).

    Stefan Rahmstorf was a Lead Author of Chapter 6 of the 2007 WGI report. Other Rahmstorf7 co-authors were Coordinating Lead Authors, Lead Authors or expert reviewers of Chapters 1 (Somerville), 3 (Parker), and 5 (Cazenave and Church) . The article was a product of members of the IPCC milieu, and was published online in the same week as the release in Paris of the IPCC WGI report (or, strictly speaking, its Summary for Policymakers).

    Yet another co-author of the article in ‘Science’, James Hansen, has recently written to Australia’s Prime Minister (and in similar terms to all Australian State Premiers) calling on him to exercise his leadership ‘on a matter concerning coal-fired power plants and carbon dioxide emission rates in your country.’ Hansen explains that, because of the urgency of the matter, he has not collected signatures – but he offers the names of seven Australian scientists whom Mr. Rudd could consult.

    Hansen also says that he had ‘read and commend[s] the Interim Report of Professor Ross Garnaut’, which has been submitted to all Australian Governments. According to this Report, ‘Comparisons between observed data and model predictions suggest that the climate system may be responding more quickly than climate models indicate’ (Rahmstorf et al, 2007) and, specifically, that ‘Global mean surface temperature increase since 1990 has been measured at 0.33 C, which is in the upper end of the range predicted (sic) by the IPCC in the Third Assessment Report in 2001, as shown in Figure 5 (Rahmstorf et al 2007).’ As Lucia says in the above post, the paper on which the Australian report has relied ‘includes absolutely no indication of any uncertainty associated with determining an underlying trend from noisy data.’

  8. There is a very interesting paper about how to deal with uncertainties in models. The paper is not about GCMs but about carbon cycle models, where they also have to deal with the same two types of uncertainties: there is uncertainty in the observational data, and there is “structural” uncertainty, that is, uncertainty about the physical processes themselves. The abstract reads:

    Observation-based estimates of the global carbon budget serve as important constraints on carbon cycle models. We test the effect of new budget data on projection uncertainty. Using a simple global model, we find that data for an additional decadal budget have only a marginal effect on projection uncertainty, in the absence of any constraints on decadal variability in carbon fluxes. Even if uncertainty in the global budget were eliminated entirely, uncertainty in the mechanisms governing carbon sinks have a much larger effect on future projections. Results suggest that learning about the carbon cycle will best be facilitated by improved understanding of sink mechanisms and their variability as opposed to better estimates of the magnitudes of fluxes that make up the global carbon budget.

    In their conclusion, the authors also add:

    This study is intended as a first step toward investigating the potential for anticipating how learning about the carbon cycle might take place via future observations. We have therefore focused on global aggregate data, and used a simple global model. Results illustrate how new decadal budget information alone is of limited value in judging the credibility of models calibrated to previous data, since discrepancies can always be explained as being due to decadal variability. Even if the new data are used to recalibrate the model to observations over a longer time period, the effect on projection uncertainty is not large, and in particular is small compared to the effect of uncertainty in model structure.

    This is more or less a candid admission that you can always retrofit the model to the data if your data or model uncertainty is large enough. But that doesn’t help your predictive ability, because if there are processes that are not included in your model, you can miss the boat entirely.

    On another topic, am I the only one who has problems with what is called “noise” in climate data? From a physicist’s point of view, noise comes from the measuring instrument. Or else, it is an intrinsic, and probabilistic, feature of the phenomenon, like quantum noise in light measurements. But climate, albeit chaotic, is deterministic to a large degree. What is called noise is deterministic variability. Furthermore, this variability is not independent from any underlying trends, because it’s a nonlinear system with many feedbacks involved. A major volcanic eruption could, in principle, change the course of climate by triggering albedo feedbacks. Major El Ninos do not occur randomly either. The heat that they dissipate must come from somewhere. So even if we can’t predict exactly what next year’s average temperature will be, we could, in principle, have an understanding of the variability that constrains our predictions, to a narrower and narrower range as this understanding gets better and better. That’s what weather (not climate) models do, in fact. My point is that it’s just too easy to dismiss the variability as “noise”, as if it were entirely unpredictable. In fact, it just points to our ignorance of what’s really going on.

  9. Thanks Francois,

    Add me to your list of people who are bothered by the term “noise” applied to weather. I don’t know who came up with the idea to call the weather noise. Engineers do not apply the term “noise” to random-looking components of velocity, temperature, concentration or anything when these arise due to the non-linear attributes of the Navier-Stokes. These are features of interest, and if we were to start calling it “noise” we would need to explain how the “noise” in the system can be used to achieve useful results like enhanced heat transfer, diffusion, mixing etc. (We do sometimes argue about what to call it, as the appropriate name can vary depending on the particular flow. )

    I prefer thinking of measurement uncertainties as “noise”, and weather as some sort of random-seeming deviation from the “average”.

    My terminology does leave arguments as to the proper definition of “average”, but at least one isn’t led into the sorts of conceptual blunders that underlie Tamino’s criticism of Schwartz’s paper. (Tamino incorrectly interpreted Schwartz as suggesting the IPCC’s simplified equation describing surface temperature described both the temperature and the measurement noise. If one recognizes the existence of measurement error, Tamino’s criticism is found to be based on a rather silly mistake. You can read more about the problem here: http://rankexploits.com/musings/2007/time-constant-for-climate-greater-than-schwartz-suggests/ I get a higher value for the time constant than Schwartz does, but just slightly higher. And, accounting for measurement noise, the physical model fits the data rather well.)
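
    A quick synthetic-data sketch of the point (assumed numbers, not the GISS or Hadley series): white measurement noise sitting on top of an AR(1) “temperature” series pulls the lag-1 autocorrelation down, so a time constant estimated while ignoring the measurement noise comes out too low.

        import numpy as np

        rng = np.random.default_rng(0)
        tau_true = 60.0                      # months; an assumed value for illustration
        phi = np.exp(-1.0 / tau_true)
        n = 1200

        temp = np.zeros(n)
        for i in range(1, n):
            temp[i] = phi * temp[i - 1] + rng.normal(0.0, 0.05)

        measured = temp + rng.normal(0.0, 0.05, n)   # add white measurement noise

        def lag1_tau(x):
            r1 = np.corrcoef(x[:-1], x[1:])[0, 1]
            return -1.0 / np.log(r1)

        print("tau from clean series : %.0f months" % lag1_tau(temp))
        print("tau from noisy 'data' : %.0f months" % lag1_tau(measured))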

  10. Francois
    While waiting for my husband outside the pharmacy, I heard the distinctive “pings” of “wintery mix” bouncing off the car hood. This is not unprecedented for late April, but… I want to get my seedlings outside! (They are in the kitchen window right now.)

  11. Francois O, there are some interesting comments about noise here that might interest you. If you don’t want to read the entire thread, do a search for ‘Gavin’; GISS/NASA’s Gavin Schmidt who introduced the topic into the thread.

  12. I also have become a little confused by the nomenclature. I’ve read statements by certified climatologists that say climate is the average of the weather. At the same time certified climatologists tell me the GCMs do not calculate the weather. And as I read things recently the weather has been labeled noise. So, apparently the GCMs are calculating the average of phenomena and processes that are not modeled; those phenomena and processes are weather, and that is noise. Thus, the GCMs calculate the average of noise by use of models and methods applied to phenomena and processes that are not included in the codes.

    Clarifications are appreciated.

  13. Dan, I’ve read William’s post, and entirely agree with him. One can look at it this way. Imagine you’re a bunch of guys betting on horse races, and each of you has a different “model” to determine which horse is going to win. Say there are six horses in each race, and you are six people betting. You each pick a different horse. So, as an “ensemble” of gamblers, you’re always right! At least one of you is always right. Gavin’s argument in his first reply (didn’t read the whole thread) is that the average is nevertheless close to observations. Okay, so say each horse has a number from 1 to 6. Say your “metric” is to average the numbers of the horses picked. The average turns out to be 3.5, of course, and over many races, the average number of the winning horse is also 3.5. Wonderful, isn’t it? Of course, William’s argument is that if one of you consistently picks the winning horse, then his “model” is better, and one should not listen to all the others. His other argument is that even if everyone agrees on which horse will win, that doesn’t mean that it will win. So, a model is only good if it consistently matches observations. Not if it’s good with one metric and bad with another one, as Gavin claims.

    But there is another reason to be cautious with models. It is because what we’re really trying to figure out with the models is what will be the effect of higher CO2 concentrations, something that has never occurred before, or at least not at a time when we had good observational data. That makes the models extremely sensitive to our understanding of one particular physical phenomenon, furthermore one that cannot be tested with past observations. That is called, of course, “extrapolating”. Anybody who has done modelling, or just curve fitting, in his life will know the dangers of extrapolating.

    But of course the modellers know about all that. They’re not stupid. But modelling is their bread and butter, it’s what they do for a living. So they will not readily admit to the limits of the models. To a certain degree they can fool themselves too, however (I think Gavin does; that’s what happens when you start a blog…). There have been fine sociological analyses of the modelling community, particularly by Myanna Lahsen. There’s also Shackley and Wynne, and Shackley et al. When interacting with modellers, it’s good to keep that in mind.

  14. Lucia, I made this comment at Anthony Watt’s blog, which contains a question directed at you.

    Bob B, re lucia’s statement that The true weather variability in month to month measurements appears larger than the measurement uncertainty

    I see no reason why weather in the aggregate across the whole planet and over a month should be variable or noisy to the degree claimed. I.e. I agree that almost all of the variability is in the measurement.

    Lucia may have a statistical basis for that statement and I’ll ask at her (BTW excellent) blog.

  15. The following was posted at realclimate 18 hrs ago. Still awaiting moderation, according to the message.

    Stefan

    I would be grateful if you would clarify for me a puzzling aspect of your Rahmstorf et al. ‘07 Science paper. You state in the figure caption that the ‘minimum roughness criterion’ was used to get the temperature trend line. Use of this method of data padding as described by Mann 2004 should ‘pin’ the trend line to the 2006 temperature value. However, while the 2006 value lies in the center of the IPCC range, the trend line shown on the figure lies above the 2006 value, in the upper IPCC range.

    I would like to clarify this apparent inconsistency. This is an important paper for the case that ‘the climate system is responding more quickly than climate models indicate’ and it is important to verify its technical correctness. More details and graphs can be found here:

    http://landshape.org/enm/rahmstorf-et-al-2007-ipcc-error/
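
    For readers unfamiliar with the “minimum roughness criterion”: roughly speaking, the end of the series is padded by reflecting the data about both the final time and the final value (a point reflection) before smoothing, which pins the smooth near the last data point. Below is a toy sketch using synthetic data and a simple running mean rather than the filter used in the paper.

        import numpy as np

        rng = np.random.default_rng(1)
        x = np.cumsum(rng.normal(0.01, 0.1, 40))    # synthetic "temperature" series
        m = 11                                       # running-mean width (odd)

        # point-reflection padding: value k steps past the end = 2*x[-1] - x[-1-k]
        pad = 2.0 * x[-1] - x[-2 : -2 - m : -1]
        padded = np.concatenate([x, pad])

        smooth = np.convolve(padded, np.ones(m) / m, mode="same")[: x.size]

        print("last data value:", round(x[-1], 3))
        print("smooth at end  :", round(smooth[-1], 3))   # pinned near the last value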

  16. Sorry for the probably crazy sounding question I’m about to ask, but I’m curious and don’t have the skill or time to look into it myself (severe chronic pain from repetitive stress).

    Has anyone looked into a relationship between CRF and volcanic activity? The thinking goes: the properties and formation of aerosols are probably similar to the formation of ions for cloud nucleation. Could CRF 1) cause increased production and buildup of aerosols in volcanoes? And 2) if this happens, could it cause changes in pressure differentials in volcanoes, causing instability and increased activity?

  17. Add me to your list of people who are bothered by the term “noise” applied to weather. I don’t know who came up with the idea to call the weather noise. Engineers do not apply the term “noise” to random-looking components of velocity, temperature, concentration or anything when these arise due to the non-linear attributes of the Navier-Stokes.

    Me too .
    The difference is that I am much more than bothered (the use of the word “noise” generates in me an irresistible urge to send the speaker back to the ground school) and it was basically this fact that made me interested in climate 11 years ago .
    Equating weather and “noise” is about as stupid as saying that you can consistently solve Navier Stokes equations by throwing a die .
    The words “noise” and “equation” are antinomic .
    God rarely throws dice but he likes to make us think that he does often so 🙂

  18. re:noise

    I don’t see the big deal. It gets tiresome saying “unforced variability” over and over.

  19. I haven’t gotten a real good handle on noise/weather and “unforced variability” doesn’t provide much clarity to me either. All physical phenomena and processes require driving potentials in order to take place. Transport of mass, momentum and energy, exchanges of these at moving and stationary interfaces, and all chemical processes require driving potentials.

    From what I have read so far, “unforced variability” seems to refer to the ‘chaotic’ nature of calculated numbers as seen in applications of GCMs. Additionally, I remain somewhat confused that this seems also to be assigned to weather.

    The GCMs cannot calculate the weather because they do not contain models for important weather phenomena and processes, the spatial (and maybe the temporal) resolution is not sufficiently refined, and the injection of empirical data into the calculations is not carried out. To assign the observed ‘chaotic response’ to weather doesn’t make sense if the GCM models don’t have weather phenomena and processes in them and don’t make weather calculations.

    For me, assignment of an observed response to things not included just doesn’t follow. This seems to be another of the hypotheses that are simply attached to the behavior seen in the results produced by GCM applications.

    In my opinion, all the observed responses should be assignable to the equations that make up the complete models. If a physical significance is to be attached to ‘chaotic response’ then the first step is to eliminate, by deep investigations into the mathematical properties of the models, the possibilities that what is being seen are purely artifacts of the numerical methods. And by mathematical properties of the models I mean the mathematical properties of the continuous equations, their discrete approximations, and the numerical solution methods applied to the discrete equations.

    Corrections and additional clarifications appreciated.

  20. Dan,

    All physical phenomena and processes require driving potentials in order to take place.

    Agreed. However, the problem comes (I think) in that the reaction to drivers is chaotic itself. There’s no doubt that we don’t understand unforced variability, but we don’t have to understand it perfectly to estimate climate sensitivity.

    “unforced variability” includes things (like ENSO) that aren’t really “weather”. We have a very hard time predicting ENSO, but this is improving.

    The GCMs cannot calculate the weather because they do not contain models for important weather phenomena and processes, the spatial (and maybe the temporal) resolution is not sufficiently refined, and the injection of empirical data into the calculations is not carried out.

    Not sure what you mean by “important weather phenomena.” GCMs do simulate ENSO and other oceanic and atmospheric oscillations. Some are well simulated. Some are not. Some change in unexpected ways, as the Southern Annular Mode did in response to ozone depletion. Coarseness is a problem in some GCMs. The big issue is that GCMs are rarely initialized and so cannot predict short term. Moreover, the most accurate and important output of GCMs is climate sensitivity. This is done with ensemble means. Since the unforced variability from individual models is canceled out, it actually makes determining CS easier.

    The assignment of “unforced variability” to particular events is done more on observation (we know that climate is somewhat chaotic) and theory (there is no forcing known that can explain the observations). The current “cooling” is still consistent with what we know of unforced variability; however, a few more years would demand a reassessment of the forcings or of the extent to which unforced variability can have an effect.

    I rambled a bit there. 🙂
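
    A tiny numerical sketch of the “ensemble mean cancels unforced variability” point, with purely made-up numbers: averaging runs that share the same forced signal shrinks the run-to-run noise roughly as 1/sqrt(N).

        import numpy as np

        rng = np.random.default_rng(4)
        t = np.arange(100)
        signal = 0.02 * t                      # the shared "forced" trend

        for n_runs in [1, 4, 16, 64]:
            runs = signal + rng.normal(0.0, 0.2, (n_runs, t.size))
            resid = runs.mean(axis=0) - signal
            print("%3d runs: residual sd %.3f" % (n_runs, resid.std()))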

  21. In my opinion, all the observed responses should be assignable to the equations that make up the complete models. If a physical significance is to be attached to ‘chaotic response’ then the first step is to eliminate, by deep investigations into the mathematical properties of the models, the possibilities that what is being seen are purely artifacts of the numerical methods. And by mathematical properties of the models I mean the mathematical properties of the continuous equations, their discrete approximations, and the numerical solution methods applied to the discrete equations.

    Corrections and additional clarifications appreciated.

    Dan you say what you say because you did considerable work in deterministic chaos emerging from non-linear ODEs and PDEs .
    That is not the case for 99% of people dabbling in climate “models” .
    As you can always calculate an average of ANY observable , the most naive representation of ANY observable is to say that it is equal to [X] + noise(t) where [X] is the average of the observable and noise(t) is the difference between X(t) and [X] .
    Here X can be anything – a velocity , some surface or a volume integral of pressure or temperature , the frequency or amplitude of ENSO etc .
    So far apart from saying that it is naive , there is not much to add and it is a tautology anyway (an always true statement containing no additional information) .
    Serious things begin when people add that noise(t) is random , gaussian , “cancels out” etc .
    In the matter of fluid dynamics since Kolmogorov 60 years ago we know that it is not true in general .
    Noise(t) is neither random nor gaussian nor “cancels out” in the general case .
    However it is possible to construct a statistical asymptotic field theory (similar to statistical thermodynamics) for CERTAIN cases where additional hypotheses can be made .
    Namely homogeneity and isotropy .
    In those cases this theory shows that for very small scales the statistics have a universal form and allow one to make statistical statements about energy dissipation .
    Obviously the theory dramatically fails when the hypotheses are wrong and that is the domain of low dimensional chaos that you mention and Kolmogorov of course also knew that the theory was not valid for every general case .

    The climate “modellers” are still in the pre-Kolmogorov era where they ALWAYS make the hypothesis that in ANY conditions and for ANY not understood observable one has
    X(t) = [X] + gaussian noise(t) .
    Why Gaussian ?
    Because it “cancels out” so you don’t need to bother about details that you don’t understand anyway and you only calculate averages , which is what you are able to do .
    Of course if that unfounded , naive assumption were to be justified , they would have to do exactly what you say and what Kolmogorov did 60 years ago : start from the continuous equations AND their numerical approximations and prove that there are specific conditions and scales (both spatial and temporal) where a statistical asymptotic field theory could make sense .
    Everybody who knows a bit about chaos also knows that such a climate theory cannot be founded on statistics alone because in most cases it is impossible to tell the difference between deterministic chaos , which obeys no statistics and will always surprise by “improbable” brutal behaviour changes after a certain time , and a gaussian noise where the probability of changes far from the average is negligible .
    They look alike until they don’t 🙂

    In the current state of climate “modelling” there is so far not even a hint that the people have understood that “noise” does not necessarily cancel , let alone begun serious theoretical work (aka work on the properties of the continuous equations) on this matter .

    P.S
    You should put the Teixeira paper as a permanent link on your blog because I find it extremely enlightening for the understanding of chaos , convergence and continuous equations in physics .
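
    A toy illustration of the “they look alike until they don’t” point, using the logistic map as a stand-in for deterministic chaos: a fully deterministic series and a gaussian series with matched mean and variance look much the same to the simple summary statistics printed here.

        import numpy as np

        rng = np.random.default_rng(2)
        n = 5000

        x = np.empty(n)
        x[0] = 0.4
        for i in range(1, n):
            x[i] = 4.0 * x[i - 1] * (1.0 - x[i - 1])   # logistic map, r = 4 (chaotic)

        g = rng.normal(x.mean(), x.std(), n)            # gaussian, matched mean and sd

        for name, s in [("logistic", x), ("gaussian", g)]:
            r1 = np.corrcoef(s[:-1], s[1:])[0, 1]
            print("%-8s mean %.3f  sd %.3f  lag-1 r %+.3f" % (name, s.mean(), s.std(), r1))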

  22. In the above I have been trapped again by the fact that the average symbols (the angle brackets) don’t display on some blogs .
    So the term [X] which means the average of X has disappeared everywhere it appeared , which may make the understanding a little bit harder .
    So every time you think that something is missing , add [X] like in the expression X(t) = [X] + noise(t)

  23. Tom, I got some of them.

    I don’t think climate models assume gaussian. I’m not even sure which formalism they use to deal with any “off-average” behavior due to the NS. But I do agree that, the way the modelers who blog discuss things, they don’t seem to have much probability-and-statistics type understanding. (And I don’t just mean hypothesis testing.)

  24. In responding to David Stockwell’s comment #124 on the ‘Model-data comparison – Lesson 2’ thread at RealClimate, Stefan Rahmstorf, author of the initial post, criticised David and also Roger Pielke Jr for their misspelling of his (Rahmstorf’s) name. According to Rahmstorf, ‘this is … an indication of the care someone takes in getting things right.’

    The misspellings by David and Roger were in blog postings, but the name of one of Rahmstorf’s RealClimate colleagues -Caspar Ammann – was misspelled in his (Rahmstorf’s) contribution (‘Anthropogenic Climate Change: Revisiting the Facts’) to the important edited collection ‘Global Warming: Looking Beyond Kyoto’ (Ernesto Zedillo, eds., Brookings Institution Press and Yale Centre for the Study of Globalization, 2008). The error is in footnote 41 on p. 52, in which ‘Caspar M. Amman (sic)’ is cited as a co-author of a Comment published in ‘Science’ in 2006.

    Is this an indication of the care someone took ‘in getting things right’? As yet RealClimate hasn’t published a post I sent 15 hours ago, drawing attention to the error. And I suspect that they won’t: teacher mustn’t be seen to have made mistakes.

  25. Stefan Rahmstorf, author of the initial post, criticised David and also Roger Pielke Jr for their misspelling of his (Rahmstorf’s) name. According to Rahmstorf, ‘this is … an indication of the care someone takes in getting things right.’

    Wow. Stefan must be having a hard time finding real things to criticize!

  26. Ian,
    Ouch! I have put up a new post with analysis and graphs in response to the recent reply by Stefan Rahmstorf. I must say in his defence that he has been prepared so far to respond to the actual numeric issues. IMHO he is trying to defend the indefensible, particularly in regard to uncertainty. But let’s see what he says and let the readers be the judge. It’s been an instructive exchange.

  27. David– I like your response. I think so far, Stefan is overly focused on the end points. I’m equally concerned with the “slide” to put the IPCC T=0 anomaly at some particular temperature.

    Now that I’m done figuring out how cyanide dilutes, I’ll be able to blog tomorrow! 🙂

  28. Lucia, I haven’t touched on the issue you raise but it is also valid. You can see it when you run a regression line. If you were then allowed to shift the line along the x axis, it would change the location of some temperatures to above or below the trend line.

    And I haven’t started on the sea level data, though running after every Chicken Little peep is getting old. They just fail to mention the whole IPCC TAR and AR4 evaluation of the limitations of sea level data, an analysis I thought was rather thoughtful.

  29. Tom, I got some of them.

    I don’t think climate models assume gaussian. I’m not even sure which formalism they use to deal with any “off-average” behavior due to the NS

    Thanks .
    I am sure that they do and I have it from the head modeller Schmidt himself .
    In the W.Briggs blog he has written :

    But there is another component – e – the internal variability – which is chaotic and depends on exactly what the weather is doing. The atmospheric component of ‘e’ is only predictable over a few days, while for the ocean part, there might be some predictability for a few months to a few years (depending on where you are).

    As Schmidt is apparently not familiar with chaos theory , he has always been using the concept “chaotic” improperly .
    In his mind it means – stochastic and “cancelling out over some (not nearer defined) time period” .
    I agree that “gaussian” is a technicality that is not necessary – any probability density function that is symmetrical wrt the mean would do .
    But then with such a hypothesis the Gaussian imposes itself as the most convenient and relevant mathematical form .
    Besides I have several papers dealing with ENSO modelling that explicitly use gaussian “noise” .
    I am sure you would find tons of that if you wanted to look closer in the “modelling” literature .
    All those papers that I have read have one thing in common : the (gaussian) noise hypothesis is considered self-explanatory , the authors don’t seem to realize that introducing randomness in otherwise perfectly deterministic processes is an EXTREMELY strong assumption .
    Things that Kolmogorov and Lorenz (peace to his soul) were very well aware of and argued with painful carefulness basing on continuous equations describing the process they were studying .

    Then it is also possible to take the problem the other way round and there is a whole corpus of literature that takes time series of observables as such and tries to see whether some colored signal can be detected – f.Ex a paper I linked somewhere says that ENSO could be pink aka a mixture of white and red noise .
    This approach is good for detecting definitely deterministic and non or slightly chaotic systems but it is very difficult to differentiate deterministic chaos from a purely random process by exclusively statistical methods .
    So if somebody wants to do that , he has to do the legwork like Kolmogorov did and do what Dan is saying – looking hard at the continuous equations AND their numerical approximations that one uses in a model AND deducing from there under what conditions (if any) a statistical field theory could be legitimated .
    If that is not done and it isn’t , we stay at the level 0 of reasoning – “Hey I collected some data over some arbitrary time period and if I remove some fit from it , I get a thing that looks more or less random . Let’s say that the fit is the explanation and that anything else is random for any time interval and let’s move on .”

  30. re: comment 1885. Boris April 16th, 2008 at 9:49 am

    ” … (we know that climate is somewhat chaotic) … ”

    So far as I know, ‘somewhat chaotic’ is an undefined concept. Can you point us to reports and papers in this area?

    Chaotic response is fundamentally known only from numerical solutions of discrete approximations to the continuous equations. It seems reasonable then to first determine, beyond any doubt, that the observed response is not due to properties of the chosen numerical solution methods. Accurate numerical solutions of systems of nonlinear continuous equations are notoriously difficult. The process is filled to overflowing with subtle pitfalls. Mathematical analyses of the properties and characteristics of the continuous equation systems are also notoriously difficult.

    So far as I have been able to determine, both these steps have been skipped over in the case of GCM models, methods, and codes. That the trajectories of the dependent variables obtained in calculations show chaotic response is an untested hypothesis. These steps do represent very difficult work, but that is no reason to avoid them.

    A somewhat rigorous approach to develop deep understanding of mathematical models of physical phenomena and processes includes the following steps. (1) Development of the final form of the continuous model equations (generally ODEs, PDEs and some algebraic equations). (2) Determine the characteristics of these. And by characteristics I mean determine if the equations are elliptic, parabolic or hyperbolic. Sometimes there are surprises discovered at this step. (3) Determine the initial and boundary conditions that lead to a closed system of continuous equations and a well-posed problem. (4) Develop the discrete approximations to the continuous equations. (5) Analyze the discrete approximations to determine consistency, stability, and convergence of the proposed numerical solution methods for the discrete approximations. Sometimes there are surprises discovered here (see Step 2).

    For all but the most trivial problems, there are usually lots of iterative loops within each step, and around several subsets of the steps. Some might require development of software to successfully carry out the step. Analyses of numerical solution methods, for example, might in fact require that software be developed to determine that the proposed methods are stable. Numerical solution for the linear growth factors, for example, in the case of almost all real-world problems of interest is generally required. The type of continuous equations, as determined in Step 2, is an important consideration relative to maintaining conservation of mass and energy for example, in the discrete approximations. Parabolic equations, for which the dependent variables are coupled throughout the solution domain, present special problems relative to conservation of mass and energy conservation at stationary and non-stationary interfaces.

    All these processes should be completed, and the results understood in depth, before planning, designing, development, and coding of the software for the model begins. To gather up some continuous equations, throw some discrete approximations onto these, wrap everything up with some coding, and then obtain the properties of the model from the generated numbers is simply not correct.

  31. Dan

    Chaotic response is fundamentally known only from numerical solutions of discrete approximations to the continuous equations. It seems reasonable then to first determine, beyond any doubt, that the observed response is not due to properties of the chosen numerical solution methods.

    I think this is not so. Chaotic response has been observed in flows near transition to turbulence. There are also analytic solutions that proceed from a stability analysis resulting in a particular flow, then do a stability analysis on that flow, and so on. We can see that chaos happens in flows.

    Of course, these aren’t directly climate.

    I’m not really “into” describing things as chaos. But random-seeming behaviors certainly can arise in systems involving the NS — or other non-linearities. These behaviors aren’t simply artifacts of the numerics.

    In any case, whether we think of things as chaotic or not, it’s known that trying to “average” the random features, and substitute parameterizations for the averaged behavior, has worked less than splendidly in the past. The circa ’70s and ’80s transport models that were chock-full of these types of parameterizations were useful but needed to be used with great care. (This often meant tuning constants to the specific flow of interest or hunting for closures that worked for flow “a” but not flow “b”.)

    It’s difficult to believe that models could be bang-on accurate when the documentation in the literature reads like it is using ’80s commercial package software methods (e.g. Fluent, Flow3d etc.). This is not to say they aren’t useful. The ’80s commercial software was useful. But useful doesn’t mean accurate.

  32. Tom, as ever we broadly agree, but I’d just like to nitpick on one point:

    f.Ex a paper I linked somewhere says that ENSO could be pink aka a mixture of white and red noise .

    I agree with the sentiment (ENSO variability – and many other climatic parameters – could have a pink spectral response), I would also add that certain caveats apply when modelling pink noise through a combination of red and white noise processes. For the purposes of this discussion, I am assuming white noise has spectral dependency f^0, pink noise spectral dependency f^-1 and red noise f^-2.

    Clearly, on its own, white noise (e.g. gaussian i.i.d.) is a poor model for pink noise; red noise (e.g. a simple Markovian process) can be made to fit better, but obvious discrepancies still occur (cf. Tamino’s efforts). It is possible to get a better “fit” with merged red and white. This gives you additional degrees of freedom, and each part maps to a different part of the spectrum; the white noise fits to the shallower high-frequency tail of the pink noise, and the red noise maps to the low frequency components.

    However, the fit is just made to the data available. If a mix of red and white noise (e.g. Lucia’s ARMA approach) is used, you get a good fit within the limits that you can estimate for the data you have. Unfortunately, you have essentially no data for the lower frequency terms (i.e. oscillations at some multiples of the data length). In this respect, the analysis inevitably ends up extrapolating the red noise term into the lower frequencies – but this is a problem, because red and pink noise extrapolate differently.

    The result of this is that when using a red and white mix to test a pink process, a substantial inflation of significance occurs in estimates of things affected by low frequency variability – trends being an obvious case! In fact, under pink noise assumptions, even the late 20th century warming fails significance tests (e.g. Cohn and Lins 2005).
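
    A rough sketch of what those spectral dependencies look like in practice, using synthetic series; the “pink” series is only an approximation, built by summing a few octave-spaced AR(1) processes so that each contributes roughly equal power per octave.

        import numpy as np

        rng = np.random.default_rng(3)
        n = 4096

        white = rng.normal(0.0, 1.0, n)                 # ~ f^0
        red = np.cumsum(rng.normal(0.0, 1.0, n))        # random walk, ~ f^-2

        # crude pink (~ f^-1): sum of octave-spaced AR(1) processes
        pink = np.zeros(n)
        for tau in [1, 2, 4, 8, 16, 32, 64]:
            phi = np.exp(-1.0 / tau)
            e = rng.normal(0.0, 1.0, n)
            x = np.zeros(n)
            for i in range(1, n):
                x[i] = phi * x[i - 1] + e[i]
            pink += x / np.sqrt(tau)

        def rough_spectral_slope(series):
            f = np.fft.rfftfreq(series.size)[1:]
            p = np.abs(np.fft.rfft(series - series.mean()))[1:] ** 2
            return np.polyfit(np.log(f), np.log(p), 1)[0]

        for name, s in [("white", white), ("pink", pink), ("red", red)]:
            print("%-5s rough spectral slope %.2f" % (name, rough_spectral_slope(s)))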

    The failure of climate science to properly address these topics – which as you note, date back nearly 70 years – is a disgrace.

    Postscript: in principle, whilst I disagree that the red noise assumption is valid, I think Lucia’s analysis has some validity – because it tests the IPCC results using their own assumptions. The IPCC are, in effect, hoisted by their own petard. That said, if Lucia’s test found the IPCC results passed, I would point out that the red noise assumption is too strict. In other words, I like to have my cake and eat it 🙂

  33. SteveUK

    Postscript: in principle, whilst I disagree that the red noise assumption is valid, I think Lucia’s analysis has some validity – because it tests the IPCC results using their own assumptions. The IPCC are, in effect, hoisted by their own petard. That said, if Lucia’s test found the IPCC results passed, I would point out that the red noise assumption is too strict. In other words, I like to have my cake and eat it 🙂

    My approach is pretty much to test using their own assumptions, and then the real weather we got. Otherwise, it seems to me no hypothesis test is possible. I conclude, as you do, that we have to make the sorts of simplifying assumptions I make (and the IPCC makes) in order to do any sort of hypothesis test. Using the sorts they use should avoid contention, but it doesn’t seem to do so.

    On the pink vs. red vs. white: Whether or not the weather is red, pink, white, or something else, it is clear that the measurements of weather almost certainly contain some white noise.

    It should never be forgotten that the data are measurements, and measurements nearly always contain “instrument noise”. The agencies themselves admit the existence of this noise. It seems unlikely that errors associated with measurement have long-lasting autocorrelations, for the following reason:
    Suppose the “bucket” method is used to measure water temperature. Errors happen because different people drop the bucket different distances, wait longer or shorter to measure the temperature, etc. The measurement error in the “bucket” measurement done today is likely to be independent of the error yesterday. Similar things happen for every measurement.

    These tend to average out over all measurements, but the average never goes entirely to zero. So, this component of the error is “whitish”. You can step through all the components of actual errors in measurement, and see that the measurement uncertainty is likely to have a strong white component.

    (GISS, with its extrapolation over the poles and its adjustment of station temperatures toward the regional average, may end up with a red component to the measurement uncertainty.)

    But, in any case, the noise due to measurement errors will have a different spectral character from the weather noise. Weather noise is larger, but, based on reported values at GISS and Hadley, and intercomparison of the instruments, measurement noise is not insignificant.

  34. Lucia, if you keep calling me Steve this is going to get very confusing 🙂

    My approach is pretty much to test using their own assumptions, and then the real weather we got. Otherwise, it seems to me no hypothesis test is possible.

    Yes and No. (I did say I like to have my cake and eat it, didn’t I?) It is very difficult to determine which model is most appropriate through purely statistical means (e.g. red vs. pink vs. another). This does not mean simplifying assumptions are valid. One possible approach is to assess the consequence of different models and comment on them – the approach taken in Cohn and Lins, as an example. This is no panacea, but it helps us to understand some of the uncertainty associated with the assumptions we make.

    Once again, I caveat this by saying in terms of testing the IPCC, testing them by their own standards is a valid simplification. Falsification here is a falsification of their results by their own assumptions. Kinda difficult to wriggle out of that one. That said, if you failed to falsify, it does not preclude falsification through different means, e.g. by calling into question the assumptions that they make. Going further, it is more difficult to make the latter kind of criticism stick. Your method of falsification makes a powerful case.

    On the pink vs. red vs. white: Whether or not the weather is red, pink, white, or something else, it is clear that the measurements of weather almost contain some white noise.

    I agree that there are multiple noise processes going on in the measurement data, and one would expect a white noise component. But for trend analysis, the biggest problems occur from the low frequency components, not the high frequency components – and if pink noise is present, it will dominate the low frequency components of the model. Simple polynomial orders tell us f^-1 will dominate over f^0 at some cut off frequency and below.

    Another way to look at it: say I’m determining a trend from 100 years of data. I have noise with periodicity in the 5-20 year range, and noise with periodicity 500-2000 years range. The lower frequency noise has ten times the amplitude. Which of these will mess up my trend analysis the most? I would argue, in this case, the low frequency components. Yet the assumptions we make (pink / red) have little effect on the 5-20 year noise band (which may be dominated by the white noise component anyway) but have a huge effect on the probable amplitude of the 500-2000 year noise band. Yet we have insufficient data to determine what the “right” noise level is at these low frequencies. The devil is buried in the detail of the assumptions.

    From your earlier post:

    In any case, whether we think of things as chaotic or not, it’s known that trying to “average” the random features, and substitute parameterizations for the averaged behavior, has worked less than splendidly in the past.

    This is only true for ergodic time series. Pink noise series – as just one example – do not exhibit ergodicity. There is a neat quote from Prof Demetris Koutsoyiannis (highlighted by UC at ClimateAudit) who describes pink noise series as time series that “forget their mean”. I’ve used the term “pink noise” here, but of course “pink unforced variability” would be more appropriate 😉

  35. I don’t know how you mix me and Steve up. Steve seems really friendly and polite, but I come over here and pick fights! If I was Steve, I’d be pretty upset about that. But then perhaps I’d be too polite to complain 😉 Only kidding Steve.

    I agree that this does not mean the assumptions are, by definition, valid. The difficulty is without them, we sort of end up with “can’t prove anything”.

    I think “prove” is too strong a word for AGW anyway. “Build confidence in” is about the best you’ll manage. And for that, you need to have some confidence in your assumptions as well as your calculations. I don’t have a great deal of confidence in the red noise assumption. But that’s l’il old me 🙂 That may not cause you to reach for the worry beads.

    On that post, I’m discussing CFD modeling of systems with Navier Stokes. There is no “white noise, pink noise, red noise” issue involved.

    OK, I might have been talking at slight cross purposes here. I was referring to complex non-linear systems in general – and I was careful at the end to replace the concept of stochastic noise with (deterministic) unforced variability 😉 There are many examples of non-ergodic systems that are actually fairly simple to model; Vit Klemes circular cascade of reservoirs; high DC current through a carbon composite resistor; Per Bak’s self-organised criticality (sandpile experiment). These are non-chaotic, non-ergodic systems which have f^-1 (which I should really refer to as 1/f) spectral dependency in their unforced behaviour. If you want to model the unforced behaviour of these systems, pink noise is a valid mechanism for doing so; averaging or ensembles are not. For some systems you can do this, some you can’t, and have to take a different approach. Chaotic systems are another issue altogether, and have their own set of issues. These are fundamental issues that nobody really seems to be getting their teeth into in the climate community at the moment, as Dan notes.

    I say no-one is looking into it: from the modelling side, this seems true. From the statistics side some are (e.g. those cited above), and some from the observation side (e.g. below); Tom, you may like aspects of this paper (I’ve tried to link it on climate audit but got bounced by the spam filter!)

    Tropical Convective Variability as 1/f Noise, Yano, Fraedrich and Blender, Journal of Climate 2001 Vol.14

  36. Sorry Spence! I don’t know why I keep doing that. (SteveUK is the sculptor.)

    This does not mean simplifying assumptions are valid.

    I agree that this does not mean the assumptions are, by definition, valid. The difficulty is without them, we sort of end up with “can’t prove anything”.

    Once again, I caveat this by saying in terms of testing the IPCC, testing them by their own standards is a valid simplification.

    That’s the way I mean to use valid. Of course, I do still make some statistical tests to make sure the data themselves are not wildly inconsistent with those assumptions.

    From your earlier post:

    In any case, whether we think of things as chaotic or not, it’s known that trying to “average” the random features, and substitute parameterizations for the averaged behavior, has worked less than splendidly in the past.

    This is only true for ergodic time series.

    On that post, I’m discussing CFD modeling of systems with Navier Stokes. There is no “white noise, pink noise, red noise” issue involved. We try to solve conservation of mass, momentum and energy. Some sort of averaging is done on the continuum equations to come up with other “averaged” continuum equations. Unfortunately, when you average terms like <vu> terms like <v><u>+<v’u’> show up. Now you need more equations. People resort to parameterizations for <v’u’>. (Or write more equations and parameterize the higher order terms.)
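
    Writing that averaging step out (a minimal sketch in LaTeX, keeping the same angle-bracket notation, with primes denoting fluctuations about the average):

        v = \langle v \rangle + v', \qquad u = \langle u \rangle + u', \qquad \langle v' \rangle = \langle u' \rangle = 0

        \langle v u \rangle = \langle (\langle v \rangle + v')(\langle u \rangle + u') \rangle = \langle v \rangle \langle u \rangle + \langle v' u' \rangle

    The new unknown \langle v' u' \rangle has no equation of its own in the averaged system, so it either gets parameterized or gets another equation whose own higher-order terms are parameterized; that is exactly where the closure assumptions enter.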

    This method sometimes does ok and sometimes just isn’t so great. So, then people try other methods, and so on.

    The GCM’s read like methods that kinda-sorta worked, but kinda-sorta didn’t in the past.

  37. Spence UK

    Thanks for the link , it was interesting .
    I am not familiar with “Tropical convective variabilities” so couldn’t follow much the physics that they were doing nor the physical implications .
    However they confirm what I already said about ENSO : “In current ENSO modelling the tropical convective variability is considered only as white noise forcing that has no memory by itself . The current finding suggests that a fundamentally different type of stochastic forcing may be required .”
    Pity that they do a bit too much hand waving at the end by saying that the spatial averaging is no problem and 1/f stays preserved by it .
    I doubt it very much and if it is true , then it is far from being trivial .

    Dan

    Why do you say that chaos arises only from numerical solutions of ODEs ?
    It is true that that is how it became historically popular but there are results on physical systems (the 3 body problem) where Poincaré proved already somewhere around 1900 that the planets’ orbits were chaotic (without using the word) and he didn’t use any computer .
    Also the whole class of Rayleigh Benard flows is chaotic which is an experimental proof that N-S solutions can exhibit chaotic behaviour and there is a ton of chaotic systems in fluid dynamics .
    Ruelle & Takens have proven a theorem that turbulence is chaotic in some cases .
    The problem with numerical methods is that they make the things more complicated and could even be able to make appear chaos or randomness there where there is none .

    The whole problem is then to be sure how to separate numerical artefacts from the real behaviour of the system .
    Of course with Navier Stokes you will never know – as you know nothing about the solution(s) , you can’t prove that the numerical simulation converges to something unknown and the Teixeira paper shows that making only the steps smaller and smaller is not sufficient .
    If the system is chaotic we already know that there can’t be uniform convergence anyway so it is a kind of Catch 22 .
