Leaked Chapter 9 AR5: Musings.

Reading Willis's post at WUWT inclined me to start delving into the leaked Chapter 9 of the final draft. I will eventually say something a bit specific about a portion he quoted, but I wanted to plow through much of the chapter and post initial comments as they occurred to me. I admit to jumping around in the draft a bit, focusing on the pause, volcanoes, aerosols and the general evaluation of models. So please excuse this if it's a bit disorganized. But since it's a draft and the language or explanations could change, I don't think it's worth being too organized. That said, it's starting some conversation.

From page 5/205 of Chapter 9 of the leaked final draft:
As I had just read Willis’s post, I was intrigued by a discussion of the uncertainty in optical depth. Specifically, I wanted to compare the range of optical depth indicated to the range of variation due to volcanic eruption.

The majority of Earth System models now include an interactive representation of aerosols, and make use of a consistent specification of anthropogenic sulphur dioxide emissions. However, uncertainties in sulphur-cycle processes and natural sources and sinks remain and so, for example, the simulated aerosol optical depth over oceans ranges from 0.08 to 0.22 with roughly equal numbers of models over- and underestimating the satellite-estimated value of 0.12. [9.1.2, 9.4.6, Table 9.1, Figure 9.29]

Below I have put the 0.14 range in optical depth over oceans in the context of the variations due to volcanic aerosols:
[Figure: OpticalDepth]
(Note: The "high" aerosol loads from volcanoes erupting after 2000 are mentioned as possible reasons for the pause. I plan to discuss that a bit more later on.)

page 20/205
Having just read Willis’s post, this also struck me.

For the natural forcings a recommended monthly averaged total solar irradiance time series was given, but there was no recommended treatment of volcanic forcing. Both integrated solar irradiance and its spectrum were available, but not all CMIP5 models used the spectral data. The data employed an 1850-2008 reconstruction of the solar cycle and its secular trend using observations of sunspots and faculae, the 10.7 cm solar irradiance measurements and satellite observations (Frohlich and Lean, 2004). For volcanic forcing CMIP5 models typically employed one of two prescribed volcanic aerosol datasets (Sato et al., 1993) or (Ammann et al., 2003) but at least one ESM employed interactive aerosol injection (Driscoll et al., 2012). The prescribed datasets did not incorporate injection from explosive volcanoes after 2000.

Note: The CMIP5 models ought to be mostly accounting for the decline in solar forcing up to 2008. So, presumably, the fact that it declined from 2000-2008 cannot be a reason why the CMIP5 models run too warm. They are then instructed to repeat cycle 23 going forward. I'll reflect on how this interacts with explanations of the pause later on. Meanwhile, I'll let you all chew on that a bit.

Page 27

By contrast, there is limited evidence that the hiatus in GMST trend has been accompanied by a slower rate of increase in ocean heat content over the depth range 0–700 m, when comparing the period 2003–2010 against 1971–2010. There is low agreement on this slowdown, since three of five analyses show a slowdown in the rate of increase while the other two show the increase continuing unabated (Section 3.2.3, Figure 3.2).

I’ll elaborate a bit here. Chapter 9 includes this interesting bit:

During the 15-year period beginning in 1998, the ensemble of HadCRUT4 GMST trends lies below almost all model-simulated trends (Box 9.2 Figure 1a), whereas during the 15-year period ending in 1998, it lies above 93 out of 114 modelled trends (Box 9.2 Figure 1b; HadCRUT4 ensemble-mean trend 0.26°C per decade, CMIP5 ensemble-mean trend 0.16°C per decade).

Reading this, I couldn't help wondering: how do observations and models compare over the longer combined period, the 30 years from 1983-2012? This puts the hot 1998 El Nino in the center of the series, where it can have almost no effect on the computation of the trend. Reading the text, one might imagine that averaging the high and low periods would show the models were spot on.
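
(A quick aside on why centering 1998 neutralizes it: the OLS slope weights each year by its distance from the window's mean date, so the middle year gets almost zero weight. A minimal sketch in Python; the 0.2 C spike is a made-up illustrative number, not the actual 1998 anomaly.)

    import numpy as np

    # OLS slope = sum(w_i * y_i) with weights w_i = (t_i - tbar) / sum((t_j - tbar)^2),
    # so a one-year spike shifts the fitted trend in proportion to its weight,
    # and a spike at the center of the window (weight ~ 0) barely moves it.
    years = np.arange(1983, 2013)          # the 30-year window 1983-2012
    w = (years - years.mean()) / ((years - years.mean()) ** 2).sum()

    spike = 0.2                            # hypothetical one-year 0.2 C spike
    for yr in (1998, 2012):
        shift = w[years == yr][0] * spike  # change in fitted trend, C/yr
        print(f"spike in {yr}: trend shifts by {10 * shift:+.4f} C/decade")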

But. No.

It turns out that when we combine the 'hindcast only' period from 1983-1998 with the 'part hindcast, part forecast' period, the observations fall in the lower half of the model range:
[Figure: ComparisonOverFull30Years]
Note that whether one deems the models inadequate will depend on the test one applies. Does it "matter" that the observations are too low to be explained away by "internal variability" when internal variability is estimated from the pooled standard deviation of repeat runs in individual models? Or are models inadequate only if the observations fall outside the range of all runs in all models? The discussion in the leaked AR5 focuses on the latter, but we can see that even so, combining the two periods shows the models warming at too large a rate.
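
For concreteness, here is a minimal sketch of the pooled-standard-deviation version of the test I have in mind. Every number below is invented for illustration; the run trends are hypothetical stand-ins, not actual CMIP5 output.

    import numpy as np

    # Hypothetical trends (C/decade) from repeat runs of three models.
    runs = {
        "ModelA": [0.28, 0.24, 0.31],
        "ModelB": [0.20, 0.17, 0.22, 0.19],
        "ModelC": [0.25, 0.29],
    }
    obs_trend = 0.15                       # hypothetical observed trend

    # Pooled standard deviation of repeat runs: the within-model scatter,
    # i.e. the models' own estimate of trend "weather".
    ss = sum((len(v) - 1) * np.var(v, ddof=1) for v in runs.values())
    dof = sum(len(v) - 1 for v in runs.values())
    pooled_sd = (ss / dof) ** 0.5

    mm_mean = np.mean([np.mean(v) for v in runs.values()])
    z = (obs_trend - mm_mean) / pooled_sd
    print(f"pooled sd = {pooled_sd:.3f} C/decade, z = {z:.2f}")
    # |z| > ~2 says internal variability alone (as the models simulate it)
    # cannot explain the model-observation gap, even if the observations
    # still sit inside the full range of all runs in all models.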

That paragraph continues:

Over the 62-year period 1951–2012, observed and CMIP5 ensemble-mean trend agree to within 0.02 °C per decade (Box 9.2 Figure 1c; CMIP5 ensemble-mean trend 0.13°C per decade). There is hence very high confidence that the CMIP5 models show long-term GMST trends consistent with observations, despite the disagreement over the most recent 15-year period. Due to internal climate variability, in any given 15-year period the observed GMST trend sometimes lies near one end of a model ensemble (Box 9.2, Figure 1a,b; Easterling and Wehner, 2009), an effect that is pronounced in Box 9.2, Figure 1a,b since GMST was influenced by a very strong El Niño event in 1998.

My main comment on picking the 62-year period is that the comparison becomes a fairly crappy, meaningless test. The reason is that even though we ordinarily anticipate longer time series will reduce the effect of "internal variability" on the uncertainty in the trend, in this particular case the expected value of the trend itself is lower over the longer period. Consequently, the ratio of the magnitude of the trend to its uncertainty does not grow at the rate one might ordinarily hope for. Moreover, in this particular case the test begins to be heavily dominated by the large number of hindcast years, and models have been, or could have been, tuned to some extent to obtain agreement in the hindcast. (For example: consider the spread in optical depth of sulphate aerosols across models.) The fact remains: tuning happens, and it could be influenced by knowledge of the GMST, even if modelers insist they have some level of super-human objectivity, individually or as a group, and that, owing to their sooper-dooper scientific training, their knowledge of the GMST, their frequent comparisons of model and observed GMST, and their knowledge that agreement with GMST will always be highlighted did not affect their choices during the model development period.
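
To put rough numbers on the first point: a back-of-envelope sketch, where the residual standard deviation, the lag-1 autocorrelation and the trends are all assumed, illustrative values, and the effective-sample-size adjustment is only a rough approximation.

    import numpy as np

    def slope_se(sigma, n, r):
        """Rough standard error (C/yr) of an OLS trend over n annual points
        with residual sd sigma and lag-1 autocorrelation r, inflated by the
        usual effective-sample-size factor sqrt((1 + r) / (1 - r))."""
        t = np.arange(n)
        se_white = sigma / np.sqrt(((t - t.mean()) ** 2).sum())
        return se_white * np.sqrt((1 + r) / (1 - r))

    sigma, r = 0.10, 0.6                   # assumed noise sd (C) and autocorrelation
    for label, n, trend in [("15-yr (to 1998)", 15, 0.026), ("62-yr", 62, 0.013)]:
        print(f"{label}: trend/se = {trend / slope_se(sigma, n, r):.1f}")
    # Quadrupling the window shrinks the se roughly 8-fold, but because the
    # expected trend over the long window is only about half as large, the
    # trend-to-uncertainty ratio improves far less than that.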

But possibly even more importantly, over this period, the structural uncertainty in models begins to overwhelm the spread in results. This might be best illustrated by looking at the following diagram comparing model trends to observed trends over this period:

[Figure: TestOver62Years]

Note the huge spread in the "model means" (open circles) for individual models shown above. In this period, we see models whose mean trend is clearly inconsistent with those of other models. Moreover, if we gauge the effect of internal variability using the pooled standard deviation of repeat runs in models, the multi-model mean trend is inconsistent with HadCRUT at the 95% level. It is not inconsistent with NOAA (though this is a close call). GISTemp is well within the uncertainty intervals.

It's a bit surprising to read authors of the AR5 suggesting that this sort of "agreement" should give "us" much confidence in models. I would at least suggest it doesn't give me much confidence in models. Possibly, all the authors mean to suggest is that, owing to the huge structural spread in models, we can expect that observations will remain in the spread. But the wording of the text seems to suggest that somehow we should expect the model mean to remain close to the data. I would suggest one could say, with at least the same level of confidence as the forecast that the observations will stay in the spread, that the observations will track below the model mean. Specifically: I would not be surprised if the observations continue to remain in the lower half of the humongous spread of AR5 model trends.

The chapter contains a large number of other things. But they aren’t so much my interest, so I’ll stop for now. A bit later, I’m going to delve a bit more into the explanation for “the pause”.

Update: I was sent copies of leaked drafts, but they all seem to be available here: http://ipccreport.wordpress.com/2013/09/23/drafts-reviews-and-leaks/ . I have not otherwise found a trove online.

127 thoughts on “Leaked Chapter 9 AR5: Musings.”

  1. The 62-year trend bit is analogous to a stock forecaster in 2000 claiming that the market should continue on its 50-year trend. Now that it hasn't, they try to claim that it isn't that far below the 62-year trend. Except all they've done is switch the baseline.

    This seems disingenuous at best, if not downright dishonest.

  2. John Vetterling,
    The baseline doesn't matter with trend comparisons. The difficulty is that most of the spread comes from structural uncertainty. The CanESM2 model mean has about 2X the warming of the observations. No model mean is below the observations, but for some models the difference can be explained by "weather" of the magnitude seen inside that model.

    Yet, because the mean is in the spread of all these models, we are supposed to have “confidence”? In what? “The models”? That they can do what? Create a shotgun spray of predictions with a spread so big the observations will always be inside that huge spread?

    Or are we just supposed to have confidence the earth will eventually start warming again? I have confidence in that. My confidence has absolutely nothing to do with these models!

  3. Lucia –
    (1) Are the final draft chapters available somewhere on the Interwebs?
    (2) Isn’t it a little disturbing that models can’t consistently predict optical depth, given the satellite measurements and a (mostly) prescribed sulfate injection load? This would seem to be a relatively well-constrained sub-problem that would get solved far before we see a convergence on the more general GCM problem. It immediately raises the question, if they can’t get this part close to right, why should one have confidence in the larger picture?

    These glimpses into the sausage factory are not reassuring.

  4. Lucia,

    I think we are speaking to two different issues. Yours, as I understand it, is that by increasing the analysis period the variance/trend has increased where we would expect it to decrease or at least stay the same. I agree with that. More data is not improving their fit.

    My issue is that the out of sample data is still only 2000-2012. Whether they go back 62 years, or 162 years, the only valid test is the out of sample data. That’s still only 2000-2012. The rest is obfuscation.

  5. Lucia: “…. Or are we just supposed to have confidence the earth will eventually start warming again? I have confidence in that. My confidence has absolutely nothing to do with these models!”

    .
    I have similar confidence that the earth will start warming again, if we wait long enough. Maybe it will happen sooner, maybe it will happen later.
    .
    The historical rate of increase in Central England Temperature over the last 300 years is roughly + 0.3 C per century, with a series of rising/falling trends of various lengths occurring along the way.
    .
    Central England Temperature 1659 – 2007
    .
    CET’s more recent upward trend since 1975 roughly mirrors the upward trend of global mean temperature in that same period.
    .
    Extrapolating 300 years into the future using historical CET as the predictive pattern, let’s project that in the year 2300, the trend line between the year 2007 and the year 2300, as will be shown on Wikipedia in that year, will indicate a total warming along the trend line of + 0.9 C.
    .
    If that’s what happens, I instruct any of my descendants who may be reading of this prediction 300 years from now to shout from the highest rooftops, “Yea verily, he told them so!”
    .
    In any case, rumors of the IPCC’s imminent demise are highly premature. A period of cooling lasting thirty to fifty years must occur before the IPCC and the climate science community ever begin to seriously question their models.

  6. Lucia, you make an excellent point and graph in this thread introduction concerning the 62 year trends. I hope you continue with discussions of the model “weather” noise and structural differences.

    Did the AR5 present any graphical data to correspond, in a manner approaching what you did here, with their claims for the 62 year model and observed trend period? If not, I would judge their claim deceptive.

  7. “Or are we just supposed to have confidence the earth will eventually start warming again? I have confidence in that”

    And the basis of that confidence is ???

  8. Regarding the 62-year period: getting the rate just right seems to be something of an irrelevance.

    The discrepancies over the shorter periods have been explained away by numerous unrelated possible causes without those causes ever being fully quantified. The 1945-1970s pause could be aerosols or measurement issues or internal variability. As we all know, the 1970s to 1990s needs no further explanation than GHG forcing. And the recent pause might be internal variability or aerosols or volcanoes or whatnot. Given this, it would seem the match of the 62-year rate in models and obs has come about from a fortuitous zero net effect from all the unknown knowns, known unknowns and unknown unknowns. On that basis there is nothing more scientifically satisfying about the model and obs rates being just right over 62 years than if the models were running too hot or too cold.

    Now for some reason I fancy porridge for breakfast.

  9. John

    My issue is that the out of sample data is still only 2000-2012. Whether they go back 62 years, or 162 years, the only valid test is the out of sample data. That’s still only 2000-2012. The rest is obfuscation.

    I agree. It’s just that the word “baseline” is an important term of art in climate science. And the issue of out of sample data (i.e. forecast vs. hindcast) is separate from choice of “baseline”. Creating anomalies involves the baseline.

  10. A ~60 year trend period does get rid of the effects of a ~60 year period oscillation. But if you do a ~60 year moving average or fit a ~60 year sine wave, you’re left with something that isn’t linear at all. Paul_K had some posts covering that a while back.

  11. …and just following my same (wonky??) logic, there seems to be an assumption that it's all right if the obs are running hotter in the 80's and 90's than the models; I easily fall into that thinking myself. But an error is an error; it doesn't matter what sign (+ or -) it is. If we have high confidence about what caused the temperature evolution over that period, then we should be able to plug all the relevant data into the models and get a good match between models and obs. If, as Lucia says here, the rates for the 15 years up to 1998 for model hindcasts and obs are quite different (0.16°C/decade compared to 0.26°C/decade), then Shirley there are still some limitations to our understanding?

  12. HaroldW

    (2) Isn’t it a little disturbing that models can’t consistently predict optical depth, given the satellite measurements and a (mostly) prescribed sulfate injection load? This would seem to be a relatively well-constrained sub-problem that would get solved far before we see a convergence on the more general GCM problem. It immediately raises the question, if they can’t get this part close to right, why should one have confidence in the larger picture?

    Since optical depth is affected by volcanic eruptions, it's not disturbing that they can't predict it: we can't predict when volcanoes erupt. If what you mean is "isn't it a bit disturbing that the models disagree on optical depth due to sulphate aerosols?", I'm not so sure it's disturbing either. It does mean, though, that there is a potentially very large tuning knob. Potentially it can be turned to permit better agreement with historical forcings, which gets us to the question of whether agreement in the hindcast really ought to result in a large measure of "confidence".

    If the number of tuning knobs is small, or the range of tuning is limited, then we might say that agreement with hindcasts gives a very high measure of confidence. But if that were the case, all models would agree with each other! That is: structural uncertainty would be small.

    Modelers can note that net positive warming is “robust”. But… well.. yeah. And temperatures being cold at the poles and warm at the equator is “robust”. And loads of other things– possibly on the order of “rain drops fall rather than rise” are going to be “robust” in the sense that models all agree on those sorts of things. But the fact is, robust is a rather broad term.

    The very complicated, resource intensive AOGCMs do seem to “robustly” predict things we would know even without the models. One of these is: If the net forcings on the planet are positive, the planet will warm. But of course this will happen. Just as “rain falls down” is a consequence of gravity, the planet warming when the net flux is positive is a consequence of the first law of thermodynamics!

  13. Kenneth

    Did the AR5 present any graphical data to correspond, in a manner approaching what you did here, with their claims for the 62 year model and observed trend period? If not, I would judge their claim deceptive.

    My leaked draft contains text but no figures, which limits my ability to comment; I can only comment on the text.
    In the cases where I comment, it really doesn't matter what the figures show. I would ask "Why not look at the two sub-periods together?" no matter what the figures show. This is especially true as the discussion singles out the mid-point year (1998) as somehow exceptional. I would similarly work out what I thought of the comparison since 1951 no matter what graph they showed.

    At least as it stands, the discussion of “the pause” looks suspiciously like:
    1) They did not intend to discuss it.
    2) Owing to some 'blocking' of papers suggesting the pause pointed to problems in models, the AR5 authors actually have very little to draw on from the literature.
    3) They are now burdened with the task of trying to explain this pause based on literature with little to draw on.
    4) They are tiptoeing around and to some extent presenting cherry-picked arguments.

    In fact, with respect to the topic of 'the pause', the IPCC draft smells a lot like Heartland's 'NIPCC'-type drafts that (whether they admit it or not) present spin. (Though the NIPCC spins in the other direction.)

    I think the authors hoped to get away with “The pause is interesting. But we really can’t say much about it because there is very little discussion in the literature.” (That would have been true.) Instead they are trying to say things and assess levels of “confidence”.

  14. Bob (Comment #119600)
    September 23rd, 2013 at 7:31 pm

    “Or are we just supposed to have confidence the earth will eventually start warming again? I have confidence in that”

    And the basis of that confidence is ???

    (1) Confidence GHG’s have overall risen.
    (2) Confidence that radiative physics is correct and that GHG’s do tend to elevate net forcings.
    (3) Confidence 1st law of thermo is right.
    (4) Observation long term trend is positive– as expected based on 1-3.

    That’s it. That’s the entire basis for my confidence that the trend will resume eventually.

  15. My guess is that the ~60yr cycle is caused by SST’s overshooting/undershooting equilibrium. Starting from equilibrium, introduce a forcing trend which warms the well mixed layer. Due to inertia, ocean convection is at first too slow which results in an overshooting of SST’s. Eventually the convection accelerates, carrying heat to the ocean depths, but because SST’s were in an overshoot position the convection overshoots equilibrium as well. This leads to SST’s eventually undershooting equilibrium, convection decelerating, with SST’s again overshooting and so on.

    I don’t have much to support this pet theory of mine, just a couple of plots showing temperatures and sea-levels both showing a ~60yr cycle, with sea-level trends lagging temperature trends by ~20yrs.

    https://sites.google.com/site/climateadj/multiscale-trend-analysis—hadcrut4

    In a traditional one-box model, the response cycle can’t lag the forcing cycle by 1/4 cycle and retain an amplitude. If I modify the exponential decay model to introduce relative acceleration, say exp(-(t^(3/2))/tau), then I can push the lag out past 1/4 cycle and still retain an amplitude. Of course I’m assuming that sea-level is a proxy for ocean heat content and that total ocean heat is forced by surface heat.

    If this were true would it mean that SST’s were near equilibrium?
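
    One way to check the kernel idea numerically: convolve a ~60-yr sinusoidal forcing with each response kernel and read off the steady-state amplitude and lag. This is only a sketch; the tau values are arbitrary choices, not fitted to anything.

        import numpy as np

        dt, T = 0.1, 60.0
        t = np.arange(0, 600, dt)
        w = 2 * np.pi / T
        forcing = np.sin(w * t)                    # ~60-yr sinusoidal forcing

        def amp_lag(kernel):
            kn = kernel / kernel.sum()             # unit gain at zero frequency
            resp = np.convolve(forcing, kn)[: len(t)]
            tail = t >= 420                        # three full cycles, spin-up discarded
            a = 2 * np.mean(resp[tail] * np.sin(w * t[tail]))
            b = 2 * np.mean(resp[tail] * np.cos(w * t[tail]))
            return np.hypot(a, b), np.arctan2(-b, a) / w   # amplitude, lag in years

        tk = np.arange(0, 200, dt)
        for name, k in [("exp(-t/tau), tau=15", np.exp(-tk / 15.0)),
                        ("exp(-t^1.5/tau), tau=250", np.exp(-tk ** 1.5 / 250.0))]:
            amp, lag = amp_lag(k)
            print(f"{name}: amplitude {amp:.2f}, lag {lag:.1f} yr (T/4 = 15 yr)")
        # Vary tau to map out how much lag each kernel can produce and at
        # what amplitude; the plain exponential trades amplitude for lag
        # faster than the stretched kernel does.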

  16. “(1) Confidence GHG’s have overall risen.
    (2) Confidence that radiative physics is correct and that GHG’s do tend to elevate net forcings.
    (3) Confidence 1st law of thermo is right.
    (4) Observation long term trend is positive– as expected based on 1-3.

    That’s it. That’s the entire basis for my confidence that the trend will resume eventually.”

    I find it somewhat revealing that you place GHG's in the position of being the "Control Knob", i.e. 1 through 3 = 4. I am with you on 1-3 above as it stands, but I believe it rather obvious that other factors (negatives such as clouds, ENSO, etc.) are more pivotal than you allow.

  17. (2) Confidence that radiative physics is correct and that GHG’s do tend to elevate net forcings.

    The Earth’s surface temperature is driven by the surface energy budget. The surface is cooled multi-modally and mostly non-radiatively (evaporation and convection). Radiative physics is only part of the story. The so-called GHGs elevate atmospheric emissivity…

  18. DeWitt Payne (#119603): “A ~60 year trend period does get rid of the effects of a ~60 year period oscillation.”
    Not entirely. Consider a sine wave of amplitude A and period T. [That is, x=A*sin(2*pi*t/T + phase).] Taking the OLS trend over the interval 0 to T gives you a slope of (nearly) -2*A/T*cos(phase).
    .
    Assuming a 62-year cycle with amplitude 0.1 K, the contribution to a 62-year trend could be as great as +/-.003 K/yr, or 0.03 K/decade. Not much compared to the recent discrepancy between models and observations, but not negligible either.
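
    A quick numerical check of that formula (a sketch; the exact coefficient for the slope over one full period works out to 6/pi, about 1.91, which is where the "nearly 2" comes from):

        import numpy as np

        # OLS trend over exactly one period of x = A*sin(2*pi*t/T + phase),
        # compared with the closed form -(6/pi)*(A/T)*cos(phase).
        A, T = 0.1, 62.0
        t = np.linspace(0.0, T, 5001)
        for phase in (0.0, np.pi / 4, np.pi / 2):
            x = A * np.sin(2 * np.pi * t / T + phase)
            slope = np.polyfit(t, x, 1)[0]
            exact = -(6 / np.pi) * (A / T) * np.cos(phase)
            print(f"phase {phase:4.2f}: OLS {slope:+.5f}, exact {exact:+.5f} K/yr")
        # Max magnitude (6/pi)*(0.1/62) ~ 0.003 K/yr, i.e. ~0.03 K/decade.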

  19. lucia (#119606)
    Yes, you’re correct that I should have referred to the models’ failure to postdict [is that a word?] the aerosol optical depth, given the eruption strengths and locations. If the models don’t correctly estimate AOD (& therefore forcing) for historical eruptions, then what they give for sensitivity wouldn’t match actual sensitivity.

  20. Edim,
    “The so-called GHGs elevate atmospheric emissivity…”
    .
    Thanks for another of your humorous/ridiculous comments. You would get a lot more respect for those at Skydragon slayer blogs.

  21. Lucia

    I am almost 62 years old (born in ’53)

    Over the 62-year period 1951–2012, observed and CMIP5 ensemble-mean trend agree to within 0.02 ºC per decade (Box 9.2 Figure 1c; CMIP5 ensemble-mean trend 0.13°C per decade). There is hence very high confidence that the CMIP5 models show long-term GMST trends consistent with observations, despite the disagreement over the most recent 15-year period.

    Perhaps you'd embed the image of my 'escalator', which may illustrate what the IPCC meant.

    http://tinyurl.com/nnn2rou

    It lasted all of 30s on SkS before the cartoonist deleted it

  22. Interesting that the IPCC seems to be 'doubling down' on high climate sensitivity models. It seems to me an unwise gamble, since the rate of warming will most likely stay well below model projections for a decade or more. But I guess they either really believe the GCM ensemble's climate sensitivity, or feel they can't afford (politically) to simply admit earlier AR's were mistaken about climate sensitivity, or maybe some of both. I expect the consequence of losing that double down will be the end of the IPCC AR process, and maybe a (welcome) change in the IPCC structure/function. On her blog, Judith Curry related an off-the-record conversation with a newcomer to the AR process, and said that there was disagreement between the hold-overs from earlier AR's and newcomers. The holdovers wanted to preserve as much as possible of the AR4 conclusions, while newcomers wanted to 'tell it like it is'. I guess the (more senior) holdovers got pretty much what they wanted in the final report.

  23. Bob: And the basis of that confidence is ???

    Ditto on the question. And in asking this question of Lucia, let’s note too that Dr. Fred Singer believes temperatures will continue rising for the next 100 to 200 years until the maximum of the Medieval Warm Period is reached.

    Lucia: (1) Confidence GHG’s have overall risen.

    No question about that, and no question that burning of fossil fuels is the primary source of the increase in atmospheric GHG concentrations.

    Lucia: (2) Confidence that radiative physics is correct and that GHG’s do tend to elevate net forcings.

    Of course GHG’s tend to elevate net forcings, but does the water vapor amplification effect actually occur as theorized?

    Lucia: (3) Confidence 1st law of thermo is right.

    Of course the 1st law of thermo is right, but that fact does not preclude the existence of complicated self-regulating natural mechanisms.

    Lucia: (4) Observation long term trend is positive– as expected based on 1-3.

    The Central England Temperature record between 1659 and 2007 indicates the historical presence of several periods of warming equivalent in duration and rate of increase to what is occurring today, but at preindustrial levels of CO2.

    Lucia: That’s it. That’s the entire basis for my confidence that the trend will resume eventually.

    .
    We have every justification to believe the upward trend will eventually resume — ten years from now, twenty years from now, thirty years from now, fifty years from now — whenever it will happen.

    But we don’t have every justification to believe that rising GHG emissions will be the primary agent in causing the warming trend to resume.

    Over on Climate Audit, I asked the question, is it possible to accurately determine what the maximum global mean temperature at the height of the MWP was, without relying upon a full-blown proxy reconstruction, one with all the issues such proxy-based reconstructions have?

    Craig Loehle offered his opinion that it is not possible with today’s science to determine either directly or indirectly precisely what the maximum global mean temperature was at the height of the MWP.

    We suspect it was warmer at the height of the MWP than it is today, but we cannot determine with any real precision or accuracy exactly how much warmer.

    Just to throw out a number, let’s guess that global mean temperature was 1 C warmer at the height of the MWP than it is today.

    If Dr. Singer’s opinion is correct, and if the rough pattern indicated by the CET record holds, another 300 years of gradual warming will occur, doing so in a series of rising/falling episodes lasting thirty to fifty years each.

    There is every reason to believe the IPCC and the climate science community will hang tough in defending their climate models for however long it takes for the warming trend to resume.

    It is all but written in their job descriptions that they have to do that kind of thing.

  24. Bob (Comment #119611)

    I find it somewhat revealing that you place GHG’s in the position of being the “Control Knob”

    When did I do that? Above I said that, within the range of uncertainty, sulphate aerosols are. Sulphate aerosols aren't gases of the greenhouse or any other variety.

  25. SteveF

    Lucia,

    I read Willis’ post at WUWT, and found it completely unconvincing. What was your impression?

    You mean his homeostatic(sp?) theory? I don't buy it. I can see zero basis for it. The earth's climate system does include biological components, but it is not, literally, a mammal, and I can't begin to imagine why one would think a mammal's ability to self-regulate body temperature is something the planet's climate system would share.

    There are odd things about what the IPCC leaked drafts say at different spots in the report, but I don’t buy the homeostatic theory.

  26. SteveF (Comment #119618)
    September 24th, 2013 at 11:09 am

    I have quickly read through his post and recollected, perhaps incorrectly, that Willis had done a very similar analysis of aerosols from volcanoes and received a lot of criticism of his methods. I thought also that it generated a number of posts that could have been considered a good learning experience.

    Using regression to (not) find the effects of volcanic aerosols on global temperatures does not feel right to me.

  27. Lucia, have you read the paper by Tebaldi and Knutti, linked below, on using multi-model ensembles for probabilistic climate projections? The first page or so deals with a summary of the four sources of model uncertainty: initial condition, boundary condition, parameter and structural uncertainties.

    I have been having difficulty getting my head around the realizations resulting from different initial conditions and the proper method to handle the results statistically. If you plan on doing a post on that aspect of modeling I will wait until that time to post my thoughts.

    http://rsta.royalsocietypublishing.org/content/365/1857/2053.full.pdf

  28. I think “missed opportunity for a learning experience” is a pretty fair description of Willis’s earlier work and the criticisms it received. “Unconvincing” is of course an understatement here.

  29. It is interesting to me that even though several of the models really suck, the “ensemble” must persist. If the IPCC were really getting more certain, it would seem that the predicted range should be narrowing and failed models dropped in pursuit of empirically verified science.

    Instead the failed models seem to provide cover.

    If 20 years ago (1) I had built a model that predicted a fairly hefty rate of warming, then (2) recruited a dozen or so ignorant goobers to opine that global temps could be any outcome within a wide range, and (3) incorporated the opinions of the ignorant goobers into an "ensemble" with mine, then people would be less likely to notice that I blew it. The inclusion of BSWAGs by ignorant goobers would have made my hypothesis seem less falsifiable, less nakedly wrong.

    I think they should start over. Make a contest. Best model fit 15 years from now wins a prize–maybe SUVs and a lot of cash. Worst modeling team has to detail Anthony Watts’ car every month for a full year.

  30. Lucia, “When did I do that? Above I said within the range of uncertainty, sulphate aerosols are. Sulfate aersols aren’t gases of the greenhouse or other variety.”

    In comment 119609, you said, rather clearly, that 1 through 3 = 4.

  31. Is it just me, or is Bob completely incoherent?

    As far as I can see, “1 through 3 = 4” has nothing to do with sulfate aerosols and nothing do with “GHG’s in the position of being the ‘Control Knob’”.

  32. Bob
    You said

    I find it somewhat revealing that you place GHG’s in the position of being the “Control Knob”

    When did I say GHG’s are the “control knob”? Or what do you mean by that?

    Anyway, with respect to other things you said in that comment: I don’t think GHG’s causing warming means ENSO etc. have no effect. I was asked why I think we will have continued warming. I assume the question implies “longer term”, not a prediction for tomorrow, next week or next year, but possibly over the upcoming decade or two. I think this warming tendency will happen for reasons I stated.

  33. Kenneth

    I have been having difficulty getting my head around the realizations resulting from different initial conditions and the proper method to handle the results statistically. If you plan on doing a post on that aspect of modeling I will wait until that time to post my thoughts.

    http://rsta.royalsocietypublis…..3.full.pdf

    I don't know what difficulty you perceive in the statistical treatment of runs with different ICs. The treatment is to assume these are random draws from the set of all runs with all possible ICs given the boundary conditions. This works provided the draws are more or less random, and chaos virtually guarantees that they are. If one does not believe in "chaos", one needs to worry about this. (But the modelers seem to space out their selections from the spin-up enough that things should be ok even if weather is not chaotic. And as I said: if it's chaotic, the effect of any non-randomness in the draw is reduced and to a large extent simply obliterated.)

    If you want a better discussion, you will have to tell me the trouble you perceive rather than merely saying you hope for a discussion. Because as I see it, there are practically zero problems with statistical treatment over individual runs. And I can’t even begin to imagine what problem someone else would imagine. Under these circumstances I am unable to say anything more than that.

  34. FWIW,
    My take on Willis' post at WUWT is that it a) ignores the offsetting influence of ENSO on volcanic influence; when ENSO is reasonably accounted for in the temperature data, the Pinatubo effect is ~0.3C to ~0.35C, and b) does not properly consider the thermal mass of the oceans in the temporal response to volcanoes. Of course, Willis has stated pretty clearly that he doesn't believe any kind of accounting for the influence of ENSO is legitimate. (Why not?!? Perhaps because that would show the temperature response to be much greater than he believes.) The idea of an active "control system" for Earth's temperature is, IMO, just nuts. Are there significant negative feedbacks? Sure, but positive ones also. I would not be surprised if negative feedbacks from clouds are considerably larger than the GCMs suggest, which is consistent with the models always overstating the rate of warming and the rate of heat accumulation in the ocean.

    But an active control mechanism? Heck no! It is just as improbable as Lovelock’s touchy-feely ‘Gaia theory’…. long on imagination, short on substance. Makes one wonder if Gaia went on extended vacations during the ice ages of the last ~1 million years.

  35. Carrick, "Is it just me, or is Bob completely incoherent?"

    Dunno, maybe both. It was Lucia's comment (119609) I was referring to, not her post above, and certainly not sulphate aerosols. Read her comment. It is my presumption that lukewarmers and alarmists position CO2 as the pivotal GHG. Perhaps that assumption is incorrect. If you read comment 119609, she simply states her confidence that there will be future warming based on 1 through 3 = 4. I feel that with regard to future warming, the statement was rather simplistic and not climatically ecumenical.

  36. lucia:

    Because as I see it, there are practically zero problems with statistical treatment over individual runs.

    I can think of one: biased selection of a subsample of the population of runs. I suspect this does occur and, to a limited extent, is reasonable. If you have a model that for particular choices of initial conditions either diverges or leads to a non-Earth-like climate, my guess is the run gets kicked out. Once you start doing that, you start tuning the model output to match an expected temperature series.

    While I think the reconstructed temperature of Earth is reliable post 1950, I don’t think this is true prior to then, and almost certainly not before 1900. One can only imagine what effect trying to “fit” to erroneous data has on a model.

  37. Bob,

    It is my presumption that lukewarmers and alarmists position CO2 as the pivotal GHG.

    I’d say at least 97% of the interested population agree that CO2 tends to warm climate, and there are certainly more than two basic categories of people that accept this proposition. I reserve the term “alarmist” for people who are using the threat of CAGW as part of their art, not just for people who are convinced that the threat of CAGW is significant. And I think many skeptics (including Anthony Watts) would include themselves in that 97%.

    I suppose it's an assumption that the long-term climate response to increases in CO2 is strictly monotonic. That's what you need in order to say that for a small change in CO2 you still get a temperature increase.

    Probably it isn't strictly monotonic, but if you perturb the system enough (add enough CO2) and "fast" enough (on time scales short compared to the long-period response, which is on the order of thousands of years), any local maxima, if they exist, get filled in by natural variability as you "sweep the system" through them.

    CO2 isn’t a “control knob” because the amount of it over time (historically) is well characterized, and the direct effects of heating from CO2 are also well characterized.

  38. Carrick,

    If you have a model that for particular choices of initial conditions either diverged or lead to a non Earth-like climate, my guess is the run gets kicked out. Once you start doing that, you start tuning the model output to match an expected temperature series.

    True. But throwing out samples is a very general problem. So I guess I wasn’t thinking of that one.

    Also: I think most of this sort of thing tends to get thrown out at the 'spin up' point of the exercise or even earlier. So, they may be doing repeat spinups. But that's a separate issue from treating runs initiated from the spin-up as independent samples after the models have been tuned or spun up. I think these end up falling more under 'parameter uncertainty' or similar for the models and less under "random sampling of individual runs".

    As long as they keep all of those runs, I think they are independent draws for that model.

  39. Carrick (Comment #119639)

    I don’t disagree with anything you say. But it has very little to do with Lucia’s confidence prediction.

  40. Lucia:

    As long as they keep all of those runs

    That’s part of what worries me, especially in highly politicized activities like numerical experiments for inclusion in the IPCC.

    Imagine this—you are finding that you are getting too much variability compared to other models, so you trim the “outliers” so as to not look ridiculous.

    The other thing that worries me is not fully exploring the state space of the model because you find a series of initial conditions that produce “nice results.”

    Both of these concerns are much more general than climate science. If you claim something that falls outside of what other people are doing, that requires recoding (new algorithms) and requires additional computational resources to solve, people are going to resist this and in fact criticize your observation. Not because it’s wrong, but because it causes them headaches they don’t feel like dealing with, like a key algorithm they used in solving the problem doesn’t work any more.

    Now imagine scaling this to the size of large GCMs and the highly political nature of the financial support for many (most?) of these large-scale models. I suspect if we make it to 20 years for a “pause”, only then will you see the codes getting a serious look over.

  41. Carrick

    is this a question to pose directly to the interested parties such as Jules and James? It's a tough call. I remember working in a subsidiary office, and if anyone from top level had asked a straight question about the viability of the operation, I would have had to... wiggle, simply because I was having a great time at a lesser cost than at my normal home base.

    "I don't know what difficulty you perceive in the statistical treatment of runs with different ICs. The treatment is to assume these are random draws from the set of all runs with all possible ICs given the boundary conditions. This works provided the draws are more or less random, and chaos virtually guarantees that they are."

    Perhaps my problem is not knowing exactly how the chaotic behavior of climate is handled in calculating the CIs for the model mean or for the model range. If every one of the multiple runs is equally likely, then the distribution is a uniform distribution and not a normal one. The CIs and range would then be estimated as noted in the linked Wikipedia articles, depending on whether a uniform continuous or discrete distribution is assumed.

    http://en.wikipedia.org/wiki/Uniform_distribution_(discrete)

    http://en.wikipedia.org/wiki/Uniform_distribution_(continuous)

    PS: Obviously it is not discrete.

  43. Things are starting to “warm” up, especially as climate fails to.

    I imagine most people have seen Steve Mc’s latest offering, which does include some of the graphics missing from Lucia’s copy.

    Then there's this utterly slavering, foaming-at-the-mouth piece from Tamino.

    All I can say to that one is wow… somebody needs to get their meds refilled, stat!

    Tamino is arguing offsets again, as if that matters when you are comparing measured versus modeled trends. (You can look at the end temperature, but it’s a pretty noisy comparison and has little statistical power—i.e., avoid this.)
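
    A toy Monte Carlo of that parenthetical point, with all parameters invented for illustration, comparing the signal-to-noise of a single end-year difference against a fitted-trend difference:

        import numpy as np

        rng = np.random.default_rng(0)
        n, sigma, dtrend = 30, 0.1, 0.01       # years, noise sd (C), true trend gap (C/yr)
        t = np.arange(n)

        ntrial = 5000
        end_gap = np.empty(ntrial)
        trend_gap = np.empty(ntrial)
        for i in range(ntrial):
            y0 = rng.normal(0, sigma, n)                  # flat "observations"
            y1 = dtrend * t + rng.normal(0, sigma, n)     # series warming faster
            end_gap[i] = y1[-1] - y0[-1]                  # single end-year difference
            trend_gap[i] = np.polyfit(t, y1 - y0, 1)[0]   # difference of fitted trends

        print("end-year  S/N:", dtrend * (n - 1) / end_gap.std())
        print("trend-fit S/N:", dtrend / trend_gap.std())
        # The fitted-trend comparison separates the series more cleanly than
        # the single-year endpoint comparison, which wastes most of the data.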

  44. Bob, that’s not my perception. Monotonic, or nearly monotonic response of temperature to a “ramp” CO2 forcing is integral to what Lucia (and I as well) believe to be true.

    To borrow IPCC language, it is 99.9% certain that if CO2 did not act as a greenhouse gas, the Earth would be substantially colder (at least 20°C). So we "know" beyond reasonable doubt that the slope of global mean temperature versus CO2 content is positive. It is theoretically possible that there is some big "kink" around our current temperature range that tends to stabilize temperature (Willis' Gaea Earth, I guess), but regardless, push the system enough and it will start warming again.

    It seems very improbable, to me anyway, that doubling CO2 from preindustrial levels is not going to cause global mean temperature to increase from the mean value during preindustrial times.

  45. ” If you claim something that falls outside of what other people are doing, that requires recoding (new algorithms) and requires additional computational resources to solve, people are going to resist this and in fact criticize your observation.”

    I have an experience from working in a technical area as a young scientist, making CRTs that required oxide cathodes, that is kind of to the point of this discussion. We showed what we felt was a defect in a cathode we had removed from a competitor's CRT to the VP of engineering, and his response was that he had to find out how to do the same for our cathodes. At an upcoming conference we were in contact with some of this competitor's technical people and asked them, after a few drinks, about what we saw on their cathodes. Their response was: oh, we had a problem with that but we now think we have it under control. We later found in our own laboratory how to reproduce the problem and what caused it. It came from a shortcut in processing that would save considerable costs. We put in that same shortcut after determining how to compensate and avoid the problem. What our VP thought might have been a competitor's breakthrough was one, but he had it all wrong.

  46. Diogenes:

    is this a question to pose directly to the interested parties such as Jules and James?

    If you could get a straight answer from him, Gavin would be better. He understands the politics of US climate science (as a field) better than James, IMO.

  47. Kenneth:

    I have an experience working in a technical area as a young scientist[…]

    In case you didn’t guess, my description is based on personal experience too, also as a young scientist, now a while ago.

    Managers are funny beasts. They always want to know why “our system” doesn’t have the same warts as our competitors. 😉 Warts = features, apparently.

  48. Kenneth

    If any of the multiple runs is equally likely then the distribution is a uniform distribution and not a normal one.

    This doesn't follow. If you draw 100 widgets out of a huge bag of widgets and look at the distribution of weights, it's not going to be uniform merely because each widget is equally likely to be drawn. Of course the distribution might come out uniform if the distribution of weights in the bag was uniform. But equal probability of drawing each widget doesn't distort the probability distribution of the sample.

  49. Carrick (Comment #119647)

    I’m puzzled by this in Tamino (well… other than he’s picking what he “likes”)

    The flaw is this: all the series (both projections and observations) are aligned at 1990. But observations include random year-to-year fluctuations, whereas the projections do not because the average of multiple models averages those out. Using a single-year baseline (1990) offsets all subsequent years by the fluctuation of that baseline year. Instead, the projections should be aligned to the value due to the existing trend in observations at 1990.

    It makes a lot more sense to test the projections using the baseline chosen by those who published the projections. So: the baseline for the AR4 is not 1990. It's not the value at 1990 based on a linear fit using start and end dates Tamino might choose (and people might argue about). The report clearly states that the projections are relative to the mean of Jan 1980-Dec 1999 temperatures. That's what it says. That's what one should use.

    Now it might be different if those making projections were sloppy and did not specify. But if they picked something as their basis, that’s the basis that should be used.

    Not only that: to be fair one should use what they said even if it's a basis one theorizes is a crap choice. For example: if the authors of the TAR said relative to 1990, we should assume that's what they meant. It may well be that the anomaly in 1990 is higher (or lower) than some trend line fit through some years including 1990, but presumably the authors writing the TAR in 2001 already knew that. So one might imagine that when they wrote "relative to 1990", they actually meant relative to 1990, even if that year happened to be "warm" or "cold". Possibly the authors of the TAR even took that 'deviation' into account when selecting 1990 and considered it when making their projections.
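
    In code, the two conventions differ only by a constant offset, but that constant is exactly what is being fought over. A minimal sketch; the series below is synthetic, standing in for real GMST anomalies.

        import numpy as np

        # Synthetic stand-in for a GMST anomaly series (C); not real data.
        years = np.arange(1975, 2013)
        temps = 0.016 * (years - 1975) + np.random.default_rng(1).normal(0, 0.1, years.size)

        # AR4 convention: anomalies relative to the Jan 1980 - Dec 1999 mean.
        base = (years >= 1980) & (years <= 1999)
        rel_baseline = temps - temps[base].mean()

        # Single-year alignment: offset so the 1990 value is zero.
        rel_1990 = temps - temps[years == 1990][0]

        # The two differ by a constant: the 1990 "fluctuation" about the
        # twenty-year mean, which is exactly the offset under dispute.
        print(f"constant offset = {(rel_1990 - rel_baseline).mean():+.3f} C")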

  50. I should add: also, consistency with Tamino's notion of picking the 1990 'point' based on the trend line means that he should align the trendline through the models to the trendline for the observations. Instead, he aligns the value of the model mean at 1990 to the trendline for the observations. That's apples and oranges.

    (Moreover, if he aligns trendlines at 1990 for both models and observations, and fits the trend from Jan 1980-Dec 1990, he will actually match the projections from the AR4. Because that's the way OLS works!)

  51. SteveF critiques Willis’ idea of a thermostatic, dynamic model of the climate, with “The idea of an active ‘control system’ for Earth’s temperature is, IMO, just nuts.”
    http://rankexploits.com/musings/2013/leaked-chapter-9-ar5-musings/#comment-119633

    If he did, yes; but that's far from how I construe his suggestion (i.e., as being like Lovelock's Gaia hypothesis, itself too much mischaracterized). I see his notions along physical, dynamic lines instead.

    What no one seems to notice here is this: the additive physics-based model of climate, via AR5, has failed. So Lucia and others want to double-down on failure instead of doing the substantive hard work of rethinking things? That’s so wrongheaded I must repeat myself.

    What's missing in this thread is any recognition of the charm and scientific humility of Willis's proposal: replacing this failed (again) climate model with an energetic balance model, where we have to keep assessing and re-assessing the uncertainties or certainties of the components going into the model, as scientists work and useful data streams are evaluated and emerge to bear on the model.

    In this conception of a proper climate change model, instead of betting everything on the few known knowns like CO2, scientific institutions are encouraged to work to understand the vast oceans of empirically known unknowns and emergent influences on climate.

    Thus, instead of an IPCC governed by bureaucratic exegesis and political ex-cathedra, scientists can regain their ethical self-respect with the public because – as Curry steadfastly argues – the uncertainty monster can kill ya!

    The statistics assessed here and at climateaudit augur for a more humble climate science, in order to achieve these meliorations of modern-day institutionalized arrogance and fiscal corruption.

    Far from dismissing Willis, we would all do better to join him in a long-overdue re-think of what’s been lacking in current climate science exercises like TAR, FAR, and AR5.

    PS I think Willis's conception of a problem-solving climate science has much more in common with the pioneering climatologist William Gray's, noisily retired from Colorado State University. Willis's notions more directly incorporate descriptive problems within the climate system and thereby require direct acknowledgement of spurious assumptions and uncertainties. Again, I see win-win-win in this, not some dinosaur to be rejected (which is how Gray felt, given the way AGW-mania pushed out funding for his hurricane prediction projects as his career ended in the late 1990s).
    Mentioning this ought to ring bells of recognition and recrimination when we look at the past decade's failed "predictions" of rising hurricane threats. (cf. Chris Mooney's "Storm World: Hurricanes, Politics, and the Battle Over Global Warming", which appeared in the wake of hurricane Katrina, and his brain-dead recycling of failed science in 'Mother Jones' in the past two years.)

  52. This is a demonstration of overshoot involving two coupled boxes of infinite volume. Box 1 is filled at a constant rate of an arbitrary unit. Box 2 is filled from box 1 at an accelerating rate that is calculated as the previous rate plus a fraction of the difference between the volumes in boxes 1 and 2. The chart shows boxes 1 and 2 swapping overshoot/undershoot positions.

    https://docs.google.com/spreadsheet/ccc?key=0AiP3g3LokjjZdFN0d1pfWUNTMk5DVVUwcTNnTmEwdmc&usp=sharing

    I see this as a sort of “demonstration of concept” which could help explain “the pause”. Box 1 being the well mixed layer and box two the ocean depths. Due to inertia convection is at first slow but then accelerates carrying the surface heat to the depths.
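
    For anyone who doesn't want to open the spreadsheet, here is the described scheme in a few lines of Python (a sketch of the rule as stated; the spreadsheet remains the authoritative version, and the fill fraction is an arbitrary choice):

        # Box 1 gains a constant inflow; box 2 fills from box 1 at a rate that
        # accelerates by a fraction of the box-1 minus box-2 volume difference.
        frac, inflow = 0.02, 1.0
        v1 = v2 = rate = 0.0
        for step in range(101):
            rate += frac * (v1 - v2)       # transfer rate accelerates with imbalance
            v1 += inflow - rate
            v2 += rate
            if step % 10 == 0:
                print(f"step {step:3d}: v1 - v2 = {v1 - v2:+8.2f}")
        # The sign of (v1 - v2) flips periodically: the two boxes swap
        # overshoot/undershoot positions, as in the linked chart.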

  53. Orson

    What no one seems to notice here is this: the additive physics-based model of climate, via AR5, has failed. So Lucia and others want to double-down on failure instead of doing the substantive hard work of rethinking things? That’s so wrongheaded I must repeat myself.

    Could you tell us the basis for your conclusion that "the additive physics-based model of climate… has failed", and possibly give us your precise definition of "the additive physics-based model of climate"? Not knowing the latter, I can't begin to guess your basis for concluding the former. Beyond that, there is a huge difference between models over-predicting warming and failure of "the adjective-of-your-choice physics-based model of climate". And further to that: even if some model of climate failed, this would not represent any sort of evidence in favor of Willis's homeostatic model; it could equally well be evidence for my "Leprechauns caused it" model of what is going on.

    Far from dismissing Willis, we would all do better to join him in a long-overdue re-think of what’s been lacking in current climate science exercises like TAR, FAR, and AR5.

    It’s fine to rethink what’s lacking in current climate science exercises. But that doesn’t mean anyone has to believe the following represents a remotely plausible belief

    I am a climate heretic. I say that the current climate paradigm, that forcing determines temperature, is incorrect. I hold that changes in forcing only marginally and briefly affect the temperature. Instead, I say that a host of emergent thermostatic phenomena act quickly to cool the planet when it is too warm, and to warm it when it is too cool.

    I get that Willis might believe that "emergent thermostatic phenomena act quickly to cool the planet when it is too warm, and to warm it when it is too cool", but I cannot begin to imagine what collection of physical, biological and chemical processes would result in this, nor how the planet would know what is "too warm" or "too cool". The planet is not a mammal. The planet is not a chicken. What in the world would make the planet's system act like the temperature-regulating device of a warm-blooded animal? (Yep. I am violating my own rule about rhetorical questions. But if someone has a suggestion, please tell me.)

    which is how Gray felt

    How people feel about others' reactions to their views is hardly relevant evidence for judging the plausibility of Willis's suggestion of a thermostatic effect.

    when we look at the past decade’s failed “predictions” of rising hurricane threats.

    Also not relevant evidence.

    If you want to convince detractors that Willis's notion of a thermostatic effect is something other than totally implausible, you are going to have to bring forward some positive argument explaining how it would work. Otherwise, it looks about as plausible as my suggestion that "Global Warming is caused by Leprechauns." If I liked, I could put out a whole bunch of stuff about having the courage to suggest some theory, tell you about my feelings when people reject it, and point out that other theories have failed. Then I could pretend that all that is a positive argument in favor of "Leprechauns", or at least a reason why SteveF, Carrick, DeWitt and so on should not dismiss it out of hand. But rest assured: unless I provide some fairly decent evidence in favor of "Leprechauns" (including some evidence or plausible argument to support my belief in the existence of Leprechauns), they will dismiss my theory.

    Likewise, they will dismiss Willis's theory of the thermostatic effect. Because right now, as far as I can see, there is absolutely no explanation of why this thermostatic effect exists, how it works, or how it might be connected to any physical understanding we might have developed in any field. It seems to amount to:

    “Chickens can regulate their body temperature keeping it constant even if external conditions change. Let’s assume the earth is a chicken. Conclusion: The earth can regulate its temperature even if forcings change.”

  54. Not being in possession of an up-to-date AR5 draft — I must be the only one! — I have to ask what may be a silly question. Tamino’s 2nd “IPCC projections” chart realigns the baselines, from which he claims that “observed global temperature has gone “right down the middle” of the IPCC projections.” The untrained eye, of course, thinks that the observations are hugging the lower bounds. (His claim might be more defensible for SAR.)

    My question, though, is about the lower bounds. For FAR, is this scenario D — “stringent controls in industrialized countries combined with moderated growth of emissions in developing countries”? And if so, isn’t this misleading by the IPCC, inasmuch as this has not come to pass? They should be comparing observations with the low/best/high estimates of FAR SPM Figure 8. (I think they would match up fairly well with the low estimate, which iirc was based on ECS of 1.5 K/doubling.)

  55. Lucia:

    “This doesn’t follow. If you draw 100 widgets out of a huge bag of widgets, and look at the distribution of weights, it’s not going to be uniform merely because you count the weight of each widget as widget (i.e. equal to each other.) Of course the distribution might come out uniform if the distribution of widgets in the bag was uniform. But equal weighting of the widgets doesn’t distort the probability distribution of the sample.”

    This explanation would be clearer if you put it into terms of climate realizations and trends. What I am saying is that the chaotic nature of climate makes one multiple model run (given all conditions the same except initial conditions) no more likely than another, i.e. a uniform distribution. The different trends manifested by these different runs that occur when measured over the same period of time should have the same uniform distribution – at least in my mind. If you can show a connection between realizations and trends like the example you gave for widgets and weights you would put my mind at ease here with this matter. Otherwise, I see all realizations/trends I pull from the bucket are equally likely and would have a uniform distribution.

    Your widget example is one of merely pulling random samples from a population with a given probability distribution and that sample will obviously have approximately the same distribution. Can you show what the distribution of trends in a bucket of model realizations is?

  56. Willis E. has pointed out some shortcomings in climate science on occasion and I have noted that he has often publicly admitted to mistakes he has made in methods and concepts. In that area he, like other relatively well informed laypersons, can make a contribution to the discussions of climate. Having said that, when laypersons start initiating grand theories without good evidence or without submitting their ideas for peer review, I go the way of SteveM at CA: ignore it.

  57. Kenneth

    This explanation would be clearer if you put it into terms of climate realizations and trends.

    You are suggesting something about statistics that ought to be generalizable. And the generalization is the same. The process of averaging is:

    1) You have a bag of “all possible Xs”. X can be anything: widgets, realizations.
    2) Each thing (X) has some property Y you can measure. It could be the weight of a widget, or the trend of a particular realization.
    3) You want to estimate the average “Y” for all Xs in the bag. Ideally, you would measure the Y for all Xs in the bag. But possibly that is either impractical or impossible. So, instead you take random draws of “Xs” and measure their property Y. So: You pull out “N” widgets and measure the weight. Or you pull out N realizations and measure their trend.
    4) You estimate the average Y of all things (X) in the bag by taking the average over Y.

    As for probability distribution: The probability distribution of the “Ys” for the “Xs” in your sample should match that of all Xs in the bag. That is: the probability distribution for the widget weights you pulled in your sample matches that of widget weights in the bag. If the widget weights in the bag are normally distributed, then the same holds for those in your sample. (Whether you can detect it is another question. That depends on how many widgets you draw. But the probability distribution does match that in the bag.) The exact same thing holds for trends from the realizations in your sample.

    Otherwise, I see all realizations/trends I pull from the bucket are equally likely and would have a uniform distribution.

    All realizations are equally likely. But that does not make all trends equally likely.

    Consider this case with widgets:

    Suppose a bag contains 900 1 kg widgets and 100 10 kg widgets. You draw 100 widgets from the bag, with each draw independent of all other draws such that in any given draw, any widget in the bag is equally likely to be drawn. If you were to measure the average weight of a widget in the bag it would be (900×1 + 100×10)/1000 = 1.9 kg. That is the average. (Whatever else it may or may not ‘mean’, that is the average weight of a widget in the bag.)

    Quite likely, you will draw roughly 90 1 kg widgets and only 10 10 kg widgets, though of course you will quite likely not draw exactly these numbers. Nevertheless, for now, we estimate the average weight of widgets by assuming each widget is equally representative. We get (90×1 + 10×10)/100 = 1.9 kg. This would be just right.

    Now notice two things:
    1) The fact that 1 kg widgets are more abundant in the bag resulted in a larger number of 1 kg widgets in our sample. So: the probability for “weight” falls out naturally by counting each widget as 1. The pdf for weight does not get distorted. It does not become uniform. It more or less matches that in the bag. (In my example it matches exactly.)

    2) We did not say “oh. I got two types of weights: 1 kg and 10 kg. So, let’s say 1 kg and 10 kg are equally probable and estimate the average weight as if 1 kg and 10 kg are equally probable.” (That would be imposing a uniform distribution on the system for no good reason.)

    The exact same thing happens with trends and realizations. Equally weighting by realization does not result in making the pdf of trends look “uniform”. It should make the pdf of trends match the pdf of trends in the ‘bag’ of all possible realizations.

    I think this is fairly easy if you recognize that ‘the trend’ is a property of the realization. You weight by realization, and the probability distribution of trends in your sample of realizations should match that in the bag.

    The only way this can be distorted is if you have some sort of bias involved in the method of drawing runs. But before we move on to that, first let us see if you:
    1) accept my explanation for widgets (thing) and their weights (property),
    2) see how that is the exact same as for realizations (thing) and their trends (property),
    and can see that weighting each “thing” as 1 gets you the proper distribution for properties– and so the proper averages, and any other moments of that distribution.
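
    If it helps, here is the whole widget example in a few lines of R. This is purely a sketch of the hypothetical 900/100 bag above– nothing here comes from any climate model:

        # Hypothetical bag: 900 one-kg widgets and 100 ten-kg widgets
        bag <- c(rep(1, 900), rep(10, 100))
        mean(bag)     # exactly 1.9 kg

        # Draw 100 widgets, counting each draw as 1 (no reweighting by type)
        set.seed(1)
        draw <- sample(bag, 100)
        table(draw)   # roughly 90 one-kg and 10 ten-kg widgets
        mean(draw)    # close to 1.9 kg; the bag's pdf carries over to the sample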

  58. lucia:

    Instead, he aligns the value of the model mean at 1990 to the trendline for the observations. That’s apples and oranges.

    Yes it is. I found the post to be bizarrely written: Even for him, this post seemed very irrational and “spittle flecked” with ad hominems.

    I’ve noticed that’s been happening more and more in recent years. One wonders if the “pause” is having an effect on his psyche.

  59. HaroldW

    Tamino’s 2nd “IPCC projections” chart realigns the baselines, from which he claims that “observed global temperature has gone “right down the middle” of the IPCC projections.” The untrained eye, of course, thinks that the observations are hugging the lower bounds. (His claim might be more defensible for SAR.)

    Bear in mind: his argument for realigning the way he “likes” is nonsense. One should align the observations and projections the way those making the projections aligned them. Those are their projections. If you realign, they become “what I think they should have projected based on how I think about things”.

    Moreover: it’s not even remotely clear that if Tamino had been hurled back to 2001 and explained the issue to them, the authors of the TAR published in 2001 wouldn’t have responded like this:

    We have projections that don’t account for the eruption of Pinatubo, which depressed temperature after 1990. We think our projections make more sense starting from 1990– a year not affected by the temporary depression due to the eruption of Pinatubo– than projecting from some hypothetical value obtained by doing a least squares fit from year 19(??) to 20(??). (Argue over what ?? should be. Yes, it does change the answer.) Given that we think choosing 1990 is a more appropriate choice, we are using that for projections. And anyway: our current choice projects more warming, and so will scare people more. So that’s what we are going to show instead of projecting less cooling. In short: stuff it, time-traveling blogger Tamino.”

    So: of course those writing the TAR could have made projections the way Tamino “thinks” we should interpret their projections. But the fact is: they did not make them that way.

    My question, though, is about the lower bounds. For FAR, is this scenario D — “stringent controls in industrialized countries combined with moderated growth of emissions in developing countries”? And if so, isn’t this misleading by the IPCC, inasmuch as this has not come to pass?

    I don’t know which scenario in the FAR it is. But if current temperatures track “moderated growth in emissions” while emissions really tracked higher, that would mean the FAR methodology over-predicted warming. And I think Tamino, who corrects for solar, volcanic and so on in his curve fits, would insist on recognizing that you need to correct for forcings that turned out lower than in the scenario.

  60. Kenneth Fritsch:

    Having said that, when laypersons start initiating grand theories without good evidence or without submitting their ideas for peer review, I go the way of SteveM at CA: ignore it.

    For some of the newer players on the board, it appears necessary to ignore everything they say, once they start trying to produce graphs or rolling out their pet theories. (This applies equally to players on both sides of the board too.)

  61. Lucia (#119671)-
    Steve McIntyre’s dry comment at CA: “my opinion of Tamino’s rant: it does not appear to be plagiarized.”

  62. lucia (Comment #119668)
    September 25th, 2013 at 8:55 am

    You are perfectly clear on the widgets and I agree with you about the generalization to sampling a distribution. When I say I have trouble getting my head around the problem of realizations and trends from multiple model runs, it is in attempting, in my mind, to separate the trend from the realization the way you separate the weights from the widgets.

    Of course none of this answers the question about the distribution of trends in multiple runs of a climate model. I plan to look at the 10-run models and determine with a Monte Carlo analysis whether the trends in the runs could fit a uniform distribution or, alternatively, better fit a normal distribution.

    I do not have a good feel for the number of multiple runs required to make a reasonable test of the distribution. It would also be important that all model runs made ended up in the data.

  63. HaroldW (Comment #119675)
    September 25th, 2013 at 10:18 am

    Lucia (#119671)-
    Steve McIntyre’s dry comment at CA: “my opinion of Tamino’s rant: it does not appear to be plagiarized.”

    If I’d been drinking coffee I would have sneezed it out through my nose!

  64. lucia (Comment #119609) – I’m disappointed with your answer to Bob (Comment #119600), based as it is on GHGs. I hoped you had a more robust logic, based on long term climate behaviour, i.e. at some time the climate will necessarily warm because warming and cooling is what climate does. I am mindful of the reply given by the head of a major international oil company, when asked whether the price of oil was going to go up or down: “It will go up. And it will go down. But I don’t know in which order.”

  65. Re: DaveJR (Sep 25 11:11),

    This relieves the stress put on plankton by the Sun’s harmful UV rays.

    I seriously doubt that turning direct into diffuse UV really makes much difference to the plankton. Unless the cloud cover is continuous and thick enough that there are no visible shadows, UV forward scatters quite well through clouds. It’s quite possible to get badly sunburned on a cloudy day. In fact, according to this article, in some cases UV exposure is higher on a cloudy day.

  66. DaveJR–
    That’s interesting. That said: being stressed by UV rays isn’t the same as being stressed by elevated temperature. So this wouldn’t be a direct thermostatic effect. Moreover, if temperature rises while the UV radiation remains constant (as predicted for AGW), this effect would do pretty much nothing. Zip. Nada.

  67. Kenneth Fritsch:
    “Having said that, when laypersons start initiating grand theories without good evidence or without submitting their ideas for peer review, I go the way of SteveM at CA: ignore it.”
    #####
    Most people round here know this. But it ain’t gonna stop me from trying!
    🙂

  68. Lucia, using the Kolmogorov-Smirnov test (KS), I could not reject the null hypothesis that trends for the 1964-2005 period from 2 models with 10 multiple runs from CMIP5 come from uniform or normal distributions. The p-values for both cases were in the range of 0.60 to 0.80 from an average of 4 comparisons. The models used were CNRM-CM5 and CSIRO-Mk3-6-0, the only models that submitted more than 6 multiple runs for CMIP5. Ten multiple runs are apparently not a sufficient number to make a robust test of distribution with KS.

    If this distribution were uniform it would change how we look at a comparison of the observed trends versus those generated by the models. I am wondering whether any papers have been published dealing with these distributions – at least on a theoretical basis given the lack of empirical data. The ramifications would be critical to these analyses. I know in some of the earliest comparisons of observed climate variables versus those from models some of Karl’s papers dealt with ranges of values.
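
    To illustrate the sort of check I ran, here is a sketch in R with made-up trend numbers standing in for the actual CMIP5 trends:

        # With only n = 10 trends, KS has little power to separate distributions
        set.seed(123)
        trends <- rnorm(10, mean = 0.20, sd = 0.03)  # 10 hypothetical trends, C/decade

        ks.test(trends, "pnorm", mean(trends), sd(trends))  # against a fitted normal
        ks.test(trends, "punif", min(trends), max(trends))  # against a uniform on the sample range
        # Both tests typically return large p-values: neither distribution can be
        # rejected. (Estimating the parameters from the same sample also biases KS
        # toward non-rejection, so these p-values are, if anything, optimistic.)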

  69. Mike Jonas (Comment #119681)
    I have no idea why you think my logic is not “robust”. So what if “at some time the climate will necessarily warm because warming and cooling is what climate does.” Sure. It’s done that in the past. But we are currently adding GHGs. I get that you might be “disappointed” that I recognize we are doing so and that I believe radiative physics is generally correct and so on. But that doesn’t mean my logic is not “robust” nor that anything is “wrong with it”.

    As far as I can tell, your logic goes like this:

    “Over history, I know the interior of this cave warmed and cooled. Now you are saying it will warm merely because you noticed that someone came in and started a fire in the center. You are saying that because the fire is there, the cave will warm. But don’t you know caves warm and cool because that’s just what caves do? And I recollect the chairman of an oil company saying oil prices will go up and down. Because that’s what they do.”

    If your argument is qualitatively different from this, let me know.

  70. Kenneth

    If this distribution were uniform it would change how we look at a comparison of the observed trends versus those generated by the models. I am wondering whether any papers have been published dealing with these distributions – at least on a theoretical basis given the lack of empirical data. The ramifications would be critical to these analyses. I know in some of the earliest comparisons of observed climate variables versus those from models some of Karl’s papers dealt with ranges of values.

    Of course the form of the distribution affects how we look at things. But why would you assume uniform in the first place? I wouldn’t assume any such thing. I don’t believe a trend of ∞ C/dec is equally likely as a trend of 0.2 C/dec. And if you mean uniform between (a,b) where a and b are numbers, I can’t imagine any physical basis for a sudden drop in probability from a finite value at b−ε to 0 at b+ε, where ε is some very teeny tiny number.

    Ordinarily, you don’t just assume every possible distributional form and then insist each must be considered because the data ‘fail to reject’ it. You propose a reasonable one and then use it unless the data reject it. The other course is madness. After all: the distribution could be a binomial distribution of some sort, or chi-squared, or anything! If we have little data and no reason to reject the normal distribution, that’s a sensible distribution to assume going forward: it has reasonable properties and we can’t reject it. There is no comparably good reason to assume a uniform distribution. Why get one’s knickers in a twist over this?

  71. Orson (Comment #119657),

    Wow. You are really, really lost.
    .
    “I see his notions along physical dynamic lines, instead.” No, dynamic physical processes do not have “set-points” like PID controllers, they generally respond in the expected direction to an applied outside force… more radiative forcing, higher temperature, less forcing, lower temperature. It is the rejection of this rather simple and obvious expectation which is nutty.
    .
    “What no one seems to notice here is this: the additive physics-based model of climate, via AR5, has failed. So Lucia and others want to double-down on failure instead of doing the substantive hard work of rethinking things?”
    What on Earth are you talking about? Please define “additive physics-based”. The model projections have clearly diverged from reality. I, and many others, have consistently said that the most plausible explanation for that is the models are a lot less than perfect (or if you want, they failed to accurately predict what has actually happened), and that the divergence is consistent with climate sensitivity being much lower than the models suggest. You certainly don’t need to reject radiative physics and basic scientific reasoning to accept that the models are inaccurate about climate sensitivity.
    .
    “Far from dismissing Willis, we would all do better to join him in a long-overdue re-think of what’s been lacking in current climate science exercises like TAR, FAR, and AR5.”
    Please. Many of the things that Willis posts are simply and demonstrably wrong on their face. I certainly don’t dismiss him for any reason other than that his posts are often plainly wrong. Climate science obviously has lots of problems, and IMO, these are due mostly to bias in the field from widely held and strongly desired ‘green’ public policy outcomes (less energy use, less wealth, more preserving nature, etc.). These outcomes are considered ‘morally right’ by most who work in climate science, independent of how much the Earth will warm due to GHG forcing, which leads to the commonly heard argument: “Even if the models are wrong and climate sensitivity is much lower, restricting fossil fuel use is the right thing to do anyway.” IMO, most people who enter the field do so because of preexisting ‘green’ political views. This has twisted ‘the science’ into a grotesque pretzel. That doesn’t mean I am going to accept as correct a bunch of nonsensical rubbish, just because that rubbish conflicts with the projections of climate models. And lest you think I don’t hold climate science to the same standard, please see this comment from yesterday: SteveF (Comment #119699) on “The Pause: Leaked SPM” thread, or any of several guest posts I have made here at The Blackboard.

  72. “Why get one’s knickers in a twist over this?”

    Yes, why get one’s knickers in a twist? I just checked mine and they are no more twisted than usual.

    “But why would you assume uniform in the first place?”

    I believe that we agreed that model realizations in the multiple runs, which differ only by initial conditions, would have a uniform distribution, i.e. one realization is as likely as another. In my mind I noted that I had trouble separating the trends from the realizations, but we both agreed that the distribution of the trends calculated from those realizations would correspond to the distribution of trends from the realizations. We agreed that it is that distribution that is critical to the analysis of comparing observed and model trends.

    I merely showed that the number of multiple runs in the 2 most-run models in the CMIP5 Historical series is probably not sufficient to reject a normal or uniform distribution. Obviously the trends are limited in range, and a finite-range uniform distribution, if it existed, would be the applicable form.

    The model realizations are limited by the variation in initial conditions used for the model runs, and those initial conditions are in turn limited by the variation in conditions determined by the model running over an extended period of time under pre-industrial conditions, and by the modeler’s selections of which conditions/times to use. The trends from these realizations should follow those same limitations and thus as far as the model is concerned there are end point limits.

  73. Kenneth

    We agreed that it is that distribution that is critical to the analysis of comparing observed and model trends.

    Yes. But I don’t think it is necessary to collect enough data to exclude hypothetical but entirely implausible probability distributions that might apply to the data if we knew nothing about the data or physics. I think worrying or agitating about this amounts to indulging in some sort of “nirvana fallacy”.

    I merely showed that the number of multiple runs in the 2 most-run models in the CMIP5 Historical series is probably not sufficient to reject a normal or uniform distribution.

    You didn’t need to “show” that. I already knew it. I’m pretty sure I said it above, but I’m not going to hunt to see if I actually did say it. The fact is: of course. 2 runs are insufficient to test whether a distribution is normal. So are 10 runs. You need a sizeable number of samples before you have reasonable statistical power to reject any hypothetical distribution for a set of observations. We do not disagree on this point.

    The model realizations are limited by the variation in initial conditions used for the model runs, and those initial conditions are in turn limited by the variation in conditions determined by the model running over an extended period of time under pre-industrial conditions, and by the modeler’s selections of which conditions/times to use.

    The initial conditions are themselves drawn at random from a set of conditions thought possible with radiative conditions set to a constant level and the model run for a long, long period. This is an entirely reasonable way to select the initial conditions.

    It would be unreasonable to select ICs for the beginning of the industrial period from something like– say– a spin up with conditions from a glacial period 1000 years ago or conditions on the planet Venus. To do this properly, you want the distribution of ICs to themselves match the probability distribution of initial conditions you think might exist given the historical forcing (i.e., the ‘boundary conditions’) at the time you initiate the industrial runs.

    If you still think there is some problem with the ICs, could you be specific and state what the problem is? Are you worried they didn’t consider the possibility that the Wisconsin Ice Sheet would have existed over the Americas given the forcings that existed from year -xxxx to 18(??)? Or are you worried about something else? Because for me to understand your concern, I need you to give an example of the general sort of ICs you think they should have considered but did not.

    The trends from these realizations should follow those same limitations and thus as far as the model is concerned there are end point limits.

    What limitations do you perceive? And does this have anything to do with the ‘uniform distribution’ issue or is it entirely separate?

  74. Lucia, I have no problem with the ICs. I continue to attempt to think (aloud) through all these issues that could affect how we look at trend comparisons between models and observed trends, and the limitations we have given the available data.

    I will note here that Ross McKitrick at CA has raised the issue of how we make these comparisons and he quotes from an IPCC reply to his comment for AR5 as I have linked and excerpted below.

    http://climateaudit.org/2013/09/24/two-minutes-to-midnight/#comment-440761

    “Likewise, to properly represent internal climate variability, the full model ensemble spread must be used in a comparison against the observations (e.g., Box 9.2; Section 11.2.3.2; Raftery et al. (2005); Wilks (2006); Jolliffe and Stephenson (2011)).”

    I have not found any papers that take a detailed look at what the multiple model runs from a given model mean and the limitations in using the results. I did find the following link to Fred Singer’s attempt to determine a minimum number of model runs required in order to address the issue of chaotic noise. I am not sure I agree with his approach but at least it indicates that somebody is interested in addressing the issue.

    http://www.sepp.org/science_papers/Chaotic_Behavior_July_2011_Final.doc

    I need to do more thinking on these issues and it is probably best not to impose further on you and your blog at this time. I’ll only state here some basics of my current thoughts.

    Multiple runs of a given model, with the only change run to run being the initial conditions, will provide several measures of a trend, which is in turn the sum of a deterministic trend that should be constant across all runs and chaotic noise that differs with each run. I would think intuitively that we can obtain an estimate of that range of noise as it affects the overall trend if we knew the pdf of the trends. Knowing that pdf with any reasonable degree of confidence would require many more multiple runs of the same model than we currently have from CMIP5. That multiple runs in those numbers are not available may well explain why the literature dealing with this matter is limited to nonexistent.

    Knowledge of those distributions is critical to making comparisons from model to model and model to observed. One estimate that models can produce, with sufficient multiple runs of the same model, that we cannot ever know for the observed is the pdf of the chaotic noise effects on the overall trend. We will never have more than a single earth run, and that run provides as much information as a single model run.
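
    A toy simulation illustrates the decomposition I have in mind; every number in it (the trend, the AR coefficient, the noise size, the run count) is invented purely for illustration:

        # Each "run" = the same deterministic trend + independent AR(1) "weather" noise
        set.seed(42)
        n.yrs <- 42                      # e.g., an annual 1964-2005 series
        b.det <- 0.02                    # deterministic trend, C/yr (invented)
        yr    <- seq_len(n.yrs)
        run.trends <- replicate(1000, {
          noise <- arima.sim(list(ar = 0.6), n = n.yrs, sd = 0.1)
          coef(lm(b.det * yr + noise ~ yr))["yr"]
        })
        mean(run.trends)  # centers on the deterministic trend
        sd(run.trends)    # spread due to the chaotic noise alone
        hist(run.trends)  # the pdf we can never get from the single earth run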

  75. Kenneth

    I have not found any papers that take a detailed look at what the multiple model runs from a given model mean and the limitations in using the results. I did find the following link to Fred Singer’s attempt to determine a minimum number of model runs required in order to address the issue of chaotic noise.

    I glanced at it. It strikes me that he is making a simple problem complicated. Estimating the number of samples required to lower the uncertainty in a sample mean to a particular value can be accomplished in closed form and is an undergraduate exercise. It’s a fairly standard thing to do when estimating how many samples you need prior to designing an experiment. “Chaos” is just a red herring here. It doesn’t matter if the problem is “chaos” or some “non-chaos” mechanism that results in randomness from the point of view of the person looking at the samples. Nothing about the Monte Carlo relaxes any of the assumptions one might make in estimating the number of samples.
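
    To be concrete, here is the undergraduate version in a few lines of R; the sigma and the tolerance are placeholders, not numbers from Fred’s paper:

        sigma <- 0.03  # assumed sd of trends across runs, C/decade (placeholder)
        E     <- 0.01  # desired 95% half-width on the ensemble-mean trend (placeholder)
        ceiling((qnorm(0.975) * sigma / E)^2)  # about 35 runs; no Monte Carlo required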

    Also, it’s not clear to me what he means by a “reliable E” nor why we need to obtain it. Presumably he has arbitrarily selected some uncertainty in the multimodel mean “E” and thinks values greater than that represent “unreliable Es” while others represent “reliable Es”. But it’s not at all clear we need to deem any particular E reliable or unreliable. We can test models even if “E” is “unreliable”. We just have larger uncertainty intervals, and our tests are of lower power.

    Of course it would be nice if modelers ran more runs. But you, I and Fred have no control over that, and we can test models anyway. Of course, people who want to do what Fred is doing are welcome to do it. But, mostly I think:
    1) He could get a better, clearer answer to the question he poses by doing simple closed-form, non-Monte-Carlo analyses, and
    2) There can be no ‘impact’ of this exercise because no one is going to decide to run the number of models needed to get the levels of “E” Fred considers “reliable”. The reason they won’t is that there really isn’t much good reason to get a ‘reliable E’.

    I would think intuitively that we can obtain an estimate of that range of noise as it affects the overall trend if we knew the pdf of the trends.

    We can estimate the range from the standard deviation. Of course this doesn’t tell us the PDF– but we can make an assumption about the pdf based on what is plausible. Absent other information a Gaussian is plausible. This is what’s done in most fields. I don’t see any reason why climate science should be an exception.
    But if one wanted to speculate that some other form applies, that’s doable. It’s not necessarily complicated. There are theorems (Chebyshev-type inequalities) that let you bound things for any and all pdfs. The bounds on probabilities depend on… get this… the standard deviation. 🙂

    Knowledge of those distributions is critical to making comparisons from model to model and model to observed. One estimate that models can produce, with sufficient multiple runs of the same model, that we cannot ever know for the observed is the pdf of the chaotic noise effects on the overall trend.

    I think using a normal distribution is fine as a first estimate. But if you want to do sensitivity studies for what you get if the pdf is a ‘top-hat’ with the same standard deviation as you got from the sample, that’s fine too. Or some other pdf with greater or lesser kurtosis– or skewness. All these are possible. One could see whether results are robust if one wished to do so. You don’t need more runs for this. If you want, you can assume the data fit a generalized normal distribution (en.wikipedia.org/wiki/Generalized_normal_distribution).

    All these things could be done. But my view is that starting by using a normal distribution makes sense.

  76. Kenneth–
    The more I look at Singer’s paper, the worse it seems to me. For example:

    Yet how do we know that, say, five runs are sufficient to produce a reliable EM to compare with an observed trend?

    This statement suggests to me that Fred simply does not have a clue about the difficulty in comparing a model mean to an observed trend. The moment we have even 2 model runs, the problem lies in not knowing how the observed trend relates to the mean one might have gotten if we could create a zillion runs of the earth applying similar boundary conditions. To get an estimate we either (a) use the standard deviation of the model runs and an assumption about the pdf of those runs or (b) create an estimate of the spread of earth trends based on the earth’s time series. You know what happens if we have a zillion model runs? Exactly the same thing. Having more model runs does help if we use approach (a), and it does tighten our uncertainty intervals on the model mean in either case. But the problem still arises 100% from the difficulties we have with the variability of earth weather, which cannot be made to go away no matter how “reliable” we make our estimate of the model mean trend.

    The fact is: while it would be nicer if the models had more runs (for a variety of reasons, including that various tests might have more power), increasing model runs cannot eliminate the problem that we have only 1 earth run.

    While I was jogging, it also occurred to me that much of your concern about the shape of the probability distribution would be totally irrelevant to the Santer/Singer imbroglio. It just doesn’t matter, because even if each of 22 groups had done only 1 run and the probability distribution of runs in each model had been binomial with only two extreme outcomes, the probability distribution of the model mean over 22 runs would be nearly normal. And that’s all you need for a t-test. This seems counterintuitive, but all you need to do is look at the probability distribution for the average of 22 coin flips where heads is 1 and tails is -1. Look at the blue trace here

    Notice at 20 flips, the distribution is already more or less normal. And that’s starting with a distribution of runs that is pathologically non-normal and totally discontinuous (only 2 possible outcomes!).
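
    Anyone who wants to check the coin-flip claim can do it in a few lines of R (the 22 matches my example; everything else is generic):

        # Distribution of the mean of 22 coin flips (heads = +1, tails = -1)
        set.seed(7)
        flip.means <- replicate(1e5, mean(sample(c(-1, 1), 22, replace = TRUE)))
        hist(flip.means, breaks = 50)  # already roughly bell shaped, even though
                                       # each single flip is as non-normal as it gets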

    More model runs would tighten the uncertainty intervals in a computed t-test, but in a way that makes almost no difference relative to what we don’t know: the variability of trends for the earth given an imposed set of boundary conditions. We only have one realization for that. We will only ever have 1 realization.

    No number of model runs reduces the fundamental issue of not knowing the ‘true’ variability of trends for the earth. (And remember: Singer’s mistake in his paper was to set that variability to zero. For all I know he still doesn’t understand it’s not zero!)

  77. http://www.realclimate.org/index.php/archives/2011/10/global-warming-and-ocean-heat-content/

    Gavin Schmidt:

    The deep ocean is really massive and even for the large changes in OHC we are discussing the impact on the deep temperature is small (I would guess less than 0.1 deg C or so). This is unlikely to have much of a direct impact on the deep biosphere. Neither is this heat going to come back out from the deep ocean any time soon (the notion that this heat is the warming that is ‘in the pipeline’ is erroneous).

    Didn’t notice this sensible point in the IPCC summary.

  78. Steve McIntyre,
    Obviously, “in the pipeline” is a metaphor. But I’ve always assumed it’s an expression of the notion that the earth’s surface temperature is not at “steady state” (i.e. pseudo-equilibrium) given the level of forcings. That would not represent any “hidden” heat but rather the heat that has yet to accumulate before the earth’s surface temperature is at “steady state”.

  79. If the heat is “hiding in the deep ocean” then wouldn’t this show up in the sea level data as increased thermal expansion? If it is, then the correction for the extra “weight” of the melt water might be overstated.

  80. Lucia@119790
    Given the age of the Earth one might think it has had ample time to stabilize. It does not appear, from even the visible geologic record, to have ever done this. What mechanisms need to be in place to prevent a “steady state”? Given the known climate history, not to mention what I see when I look at the Grand Canyon, has there ever been a “steady state” on this planet? Do we truly know the machinery involved?

  81. Lucia writes “And that’s all you need for a t-test. This seems counterintuitive, but all you need to do is look at the probability distribution for the average of 22 coin flips where heads is 1 and tails is -1. Look at the blue trace here”

    Coin flips are measuring the same thing whereas models are (arguably) not. Would your reasoning hold if you took 22 opinions on something?

  82. Lucia writes “(1) Confidence GHG’s have overall risen.
    (2) Confidence that radiative physics is correct and that GHG’s do tend to elevate net forcings.
    (3) Confidence 1st law of thermo is right.
    (4) Observation long term trend is positive– as expected based on 1-3.
    That’s it. That’s the entire basis for my confidence that the trend will resume eventually.”

    Obviously you’d consider the second law of thermodynamics to be right too (maximised entropy). So how does the second law impact on your beliefs?

  83. Would your reasoning hold if you took 22 opinions on something?

    If the opinions are expressed as numbers, and each is independent, yes.

  84. Lucia writes “If the opinions are expressed as numbers, and each is independent, yes.”

    Really? So if you asked 22 people what they thought sensitivity was, then their answers would form a normal distribution around …what exactly?

  85. Re: TimTheToolMan (Comment #119803)

    So if you asked 22 people what they thought sensitivity was, then their answers would form a normal distribution around …what exactly?

    Well first of all the 22 people’s opinions would probably not be “independent.” And even if they were, 22 might still not be enough to form a reasonably “normal” looking distribution, although you might also be surprised.

    A normal distribution around what? The mean, of course. 😉

  86. I’m pretty sure that the IPCC models are seriously marred by spurious correlation. This happens typically when trended series are used for causality testing and forecasting. Same old story as the Donella Meadows population growth reports in the Seventies.

  87. TimTheToolMan

    Really? So if you asked 22 people what they thought sensitivity was, then their answers would form a normal distribution around …what exactly?

    First, that’s not what I said. I said the probability distribution of means. So: do what you said, find the mean. That’s experiment 1, giving mean X1.
    Go to a new room, find 22 new people. Compute the mean over that group of 22. That’s experiment 2, giving mean X2.

    Repeat this a zillion times.

    It is the distribution of these sample means that tends toward normality as the number of samples in each mean increases. With 22 individuals in each sample, the distribution of these means will tend to look normal. And that’s all that is required for a t-test.

    Note “tends” does not mean “becomes exactly normal right away”. It means the distribution of sample means approaches normality as the number in each sample increases, even if the things you are sampling are not normal (like individual coin flips or individual opinions).


  88. Sometimes I think Lucia would prefer if I went off to do it a zillion times rather than spend time here 🙂

  89. TTTM,
    I can’t speak for Lucia, but the issue for me is whether you can/will revise your thinking when confronted with clear contrary evidence. The IPCC summary for policy makers appears to clearly fail that test; the IPCC is IMO populated by people for whom physical reality is secondary to political desires. The WUWT gaggle is identical in behavior, even if opposite in sign.
    .
    The question is, do you pass the test that the IPCC fails? Your response to information contrary to your previous understanding about thermohaline circulation/global upwelling in the ocean was constructive. But can you honestly ever accept that GHG’s can cause significant warming via radiative forcing? It seems to me from your comments you are most reluctant to accept the obvious. I do not know why.

  90. SteveF writes “But can you honestly ever accept that GHG’s can cause significant warming via radiative forcing?”

    Of course I accept that increasing CO2 has imposed a radiative forcing; that part is unavoidable physics. However, I also believe that the SECOND law of thermodynamics will lessen the effect and not allow feedbacks to enhance it. I think my thinking aligns more closely with Lindzen on this.

    I don’t believe models are evidence of anything, and I think there is sufficient doubt in the paleo record to question whether CO2 has driven warming at any time in our history. Therefore warming could be the result of something else. I don’t pretend to know what.

    But having said all that, I DON’T completely discount AGW as she is described. It could be true. But I’m coming from the “other side” of assuming it’s false until proven otherwise.

  91. TTTM,
    “I DON’T completely discount AGW as she is described.”
    .
    Humm…. gendered pronoun for a non-gendered noun. Is English your second language? (Nothing wrong with that if it is.)

  92. SteveF asked “gendered pronoun for a non-gendered noun. Is English your second language? (Nothing wrong with that if it is.)”

    Nope. I chose to put it that way.

  93. TTTM,
    ” I also believe that the SECOND law of thermodynamics will lessen the effect and not allow feedbacks to enhance it.”
    I wonder if you can appreciate how that comment diminishes your credibility…. no, probably not.

  94. SteveF “I wonder if you can appreciate how that comment diminishes your credibility…. no, probably not.”

    I don’t have any credibility. But I do think I understand thermodynamics. AGW says a small radiative forcing translates into approximately three times its effect with the feedbacks at the surface. I disagree, and frankly wouldn’t be surprised to see it translate to 0.5 times the effect.

    I completely appreciate the reasons why AGW says the feedbacks will be positive… it’s just that I put more faith ultimately in the second law, without specifying precisely which feedbacks will bring it about.

    On the flipside, I’m believing this in the face of measured warming, so I don’t blame you for being sceptical of my position.

  95. TTTM–
    How do you think the 2nd law of thermo comes into play? It’s not remotely clear to me how the 2nd law would relate to feedbacks.

  96. Lucia says “It’s not remotely clear to me how the 2nd law would relate to feedbacks.”

    But it should be; it’s worth thinking about. The CO2 forcing decreases the temperature gradient and in doing so decreases entropy, because it has concentrated energy towards the surface. What happens next?

  97. Re: TimTheToolMan (Comment #119830)
    CO2 forcing decreases which temperature gradient? Meridional? Vertical? Difference between the surface temperature and space?

  98. Oliver asks “CO2 forcing decreases which temperature gradient?”

    Well I normally consider the vertical (i.e. effective radiation level) and therefore the “difference between the surface temperature and space”. Do you think meridional changes might compensate by increasing entropy, because vertically that wouldn’t seem to be the case?

  99. TTTM-
    Uhmm.. you’re going to need to make a more complete 2nd law argument. Why do you think decreasing a temperature gradient decreases entropy? (Note: do not refer to an isolated closed box with no heat transfer in explaining this.) And anyway, the 2nd law doesn’t say entropy cannot decrease. I tend to suspect people who make big vague claims based on the 2nd law don’t know what they are talking about. So unless you can show rather clearly what you mean, writing down some physical laws while you are at it, I’m very dubious of your claim.

    (For example of proofs: Derive speed of sound. Show choking in isothermal flow. Stuff like that.)

  100. Lucia writes “And anyway, the 2nd law doesn’t say entropy cannot decrease.”

    And I agree. It’s just that, looking at the big picture, the second law will tend to organise atmospheric processes to maximise the atmosphere’s transmission of heat to space, not enhance its ability to keep it at the surface.

    If I could write down the equations that showed this to be the case objectively for our atmosphere, then there would be no more arguments about AGW, because you’re basically asking me to solve what the models cannot do (IMO).

  101. Re: TimTheToolMan (Sep 29 07:46),

    the second law will tend to organise atmospheric processes to maximise the atmosphere’s transmission of heat to space, not enhance its ability to keep it at the surface.

    I don’t see how you arrive at this conclusion. For one thing, keeping more heat near the surface would require an increase in the lapse rate. If specific humidity increases with increasing temperature, which is very likely, the lapse rate will decrease. It certainly isn’t going to increase as the physics of the atmosphere won’t allow it. (See R. Caballero Lecture Notes in the Physics of the Atmosphere for example) Increasing the concentration of ghg’s results in the warming of the whole troposphere, not just the surface.

    I think you have a serious misapprehension of how the system actually works.

  102. TTTM,
    You can’t actually show any equations, so your 2nd law claim is just based on some kind of instinct or gut feeling, or maybe even political desires. Sounds almost IPCC-like. Funny, very funny. Do yourself a favor and only make claims which are technically defensible.

  103. DeWitt,
    “I think you have a serious misapprehension of how the system actually works.”
    Understatement of the week.

    BTW, I recently saw road signs for the town of DeWitt in North Carolina. Any connection to your name?

  104. Maybe family of this dude:
    en.wikipedia.org/wiki/Johan_de_Witt
    One of the most influential Dutch people in the 17th century (“The Golden Age”).

  105. Re: SteveF (Sep 29 09:07),

    Understatement of the week.

    I’ve mellowed over the years.

    Any connection to your name?

    Dunno. DeWitt is one of those names that can be a surname or a given name. I haven’t done any genealogy to see when and where it first turned up in the family.

  106. Re: Nick Stokes (Sep 29 15:45),

    Sounds like Garth Paltridge’s maximum entropy principle.

    Except that Paltridge et al. agree that doubling CO2 will increase the surface temperature. However, lower poleward energy transport at higher temperatures seems counter-intuitive, as does higher tropical water vapor levels leading to fewer clouds. Paleontological evidence would seem to show that when the planet had higher temperatures, the latitudinal temperature distribution was flatter.

  107. SteveF writes “You can’t actually show any equations, so your 2nd law claim is just based on some kind of instinct or gut feeling”

    At the end of the day, this is true.

    Nick writes “Sounds like Garth Paltridge’s maximum entropy principle. Must be a Tasmanian thing”

    Haha. Could be. I didn’t know about that paper but I know one of the (Tasmanian) climate scientists who’s worked with Garth! It’s a small world.

  108. DeWitt writes “Except that Paltridge et al. agree that doubling CO2 will increase the surface temperature.”

    I never argued that increasing CO2 wouldn’t increase surface temperature. I am arguing that feedbacks will tend to minimise that effect rather than amplify it at the surface. There is no denying that increased temperature will tend to mean increased water vapour which is one effect that WILL amplify it. But the idea is that’s one feedback of many.

  109. Lucia, after giving this matter more thought, I think I found a way to estimate what the distribution of the trend effects coming from the chaotic noise in the multiple model runs might be. Obviously if we had sufficient multiple model runs the distribution could be determined with good confidence. My analysis involves the 2 models that have 10 multiple runs each.

    I initially looked at the multiple run series for each model for the period 1964-2005, using kruskal.test in R to determine whether all 10 multiple runs came from the same distribution. The p-values for this test for CN and CS were 0.95 and 0.80, respectively. With these results, I proceeded to ARMA modeling of the 10 multiple run series for both models. I was surprised to find that I could fit all the multiple runs very well with an ARMA(1,1) model (using best AIC scores), and with a Box.test showing that the residuals from these models were independent. The ARMA models in all cases had ar and ma coefficients that were very close across all multiple runs.

    I then ran 5000 simulations of an “average” ARMA model of the multiple model runs for each model, applied shapiro.test from R to test the simulated trends for normality, and obtained p-values for CN and CS of 0.56 and 0.91, respectively. From these simulations I also calculated standard deviations to compare with those obtained from the trends of the 10 multiple runs. From the simulations I obtained (in degrees C per decade) for CN: SD = 0.026 and for CS: SD = 0.043, while from the 10 runs I obtained for CN: SD = 0.029 and for CS: SD = 0.039.

    From this exercise, for the models with 10 runs from the CMIP5 Historical models and for the period 1964-2005, I have convinced myself that the trends from multiple runs fit a normal distribution. I need to look at other models with multiple runs, which will have fewer than 10, with a similar analysis.
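
    In outline, my R procedure looked like the sketch below. The ar, ma and sd values shown are placeholders; the numbers I reported above came from fits to the actual CMIP5 series, not from this sketch.

        # For each run: detrend with OLS, then fit ARMA(1,1) to the residuals
        fit.run <- function(x) {
          r <- residuals(lm(x ~ seq_along(x)))
          arima(r, order = c(1, 0, 1))  # check AIC and Box.test on residuals(.)
        }

        # Simulate an "average" ARMA(1,1) and collect the resulting trends
        sim.trend <- function(ar, ma, s, n = 42) {
          y <- arima.sim(list(ar = ar, ma = ma), n = n, sd = s)
          coef(lm(y ~ seq_len(n)))[2]
        }
        trends <- replicate(5000, sim.trend(ar = 0.9, ma = -0.5, s = 0.1))
        shapiro.test(trends)  # normality of simulated trends (5000 is shapiro.test's max n)
        sd(trends)            # compare with the sd of trends from the 10 actual runs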

  110. DeWitt writes “I don’t see how you arrive at this conclusion. For one thing, keeping more heat near the surface would require an increase in the lapse rate. If specific humidity increases with increasing temperature, which is very likely, the lapse rate will decrease.”

    Precisely. And hence I expect to see lower surface temperatures as a result. In line with my expectation, specific humidity increases could result in a faster water cycle, which will involve greater transfer of energy from the surface to higher in the atmosphere.

  111. TTTM

    And I agree. It’s just that, looking at the big picture, the second law will tend to organise atmospheric processes to maximise the atmosphere’s transmission of heat to space, not enhance its ability to keep it at the surface.

    I could make a “big picture” claim that the 2nd law will tend to make the planet hotter on average because, as you know, entropy is higher at higher temperature and 0 at absolute zero. But it’s generally very dangerous to delude oneself into thinking one can apply the 2nd law to some “big picture”.

    Anyway: I can’t begin to see why the 2nd law would tend to organize atmospheric processes to maximize transmission to space.

    If I could write down the equations that showed this to be the case objectively for our atmosphere, then there would be no more arguments about AGW because you’re basically asking me to solve what the models cannot do (IMO)

    Sure. If you could show it. But there are at least two reasons why you can’t show it:
    1) your claim may simply be untrue, so it might be that those who could do the problem would reach the opposite conclusion, and
    2) you don’t really understand the 2nd law, and the problem is difficult.

    Sure: I am to some extent asking you to do what models can’t do. But you are trying to make a claim and not supporting it. And the claim really sounds like nonsense. Sort of like people who try to claim evolution violates the 2nd law with no proof.

  112. Re: TimTheToolMan (Sep 29 17:04),

    Precisely. And hence I expect to see lower surface temperatures as a result. In line with my expectation, specific humidity increases could result in a faster water cycle, which will involve greater transfer of energy from the surface to higher in the atmosphere.

    The combination of positive water vapor feedback and negative lapse rate feedback is already included in estimates of warming and is, in fact, considered well understood.

    IPCC AR4 WG I, 8.6.3.1

    Furthermore, the combined water vapour-lapse rate feedback is relatively insensitive to changes in lapse rate for unchanged RH (Cess, 1975) due to the compensating effects of water vapour and temperature on the OLR

    As far as increased latent heat transfer: maybe, maybe not. Increasing the size of the bucket (i.e. the specific humidity) does not require a change in the flow rate into and out of the bucket at steady state.

  113. Kenneth Fritsch (Comment #119852)
    September 29th, 2013 at 4:59 pm

    Actually, for the models CN and CS I should have reported doing a Kolmogorov-Smirnov test (ks.test in R) on all the model run regression residuals, one against all the others, to form a matrix of p-values. The results show that the residuals of the multiple runs come from the same distribution, even though a few pairwise p-values fall just short of the 0.05 level.

  114. Follow on discussion of post at:

    Kenneth Fritsch (Comment #119852)
    September 29th, 2013 at 4:59 pm

    After some thought on the matter of what the randomly drawn temperature realizations represented by the multiple runs of a given climate model should look like, I am proposing, without any rigor or proof, that the realizations can be simulated from an ARMA model that best fits the multiple run series. I have found that the regression residuals for multiple run series, with 10 or 6 multiple runs, from the CMIP5 Historical model runs for the period 1964-2005 fit an ARMA(1,1) model quite well. The multiple run regression residuals also appear to come from a common distribution. The fits vary from model to model but overall are surprisingly good. I should remind readers that the variations I am looking at here are attributed to the chaotic or “weather” noise and are impressed on the deterministic trend to yield resulting trends that, although containing the same deterministic trend, vary due to this noise.

    From ARMA simulations for each model, I can determine what the distribution of trends should be using 10 and 6 samples, and compare the standard deviations estimated from the ARMA model with those estimated from the trends of the multiple model runs. The linked table shows that 3 models, 2 with 10 multiple runs and 1 with 6 multiple runs, have trend standard deviations that are well within the 95% probability range as estimated from an ARMA model. The remaining models examined have much less variability in the trends from the published runs, significantly smaller than the variation in trends predicted using an ARMA model.

    I am aware of papers that have used ARMA model parameters to determine confidence intervals for trends from observed series; the Santer (08) and Foster (12) papers come immediately to mind. I have never seen climate model series ARMA modeled, but if one can apply that approach to observed series it is difficult to see why it could or would not be applied to a model that is attempting to emulate the observed series. Actually, if indeed one can use an ARMA model for the observed series (and trends), the limitation of the observed series being the only one we will ever see is mitigated by the potential use of simulations to produce the distribution of realizations and trends.

    I am not so much interested in a discussion of the details of my analysis at this time, but rather any comments about this approach making any sense.

    http://imageshack.us/a/img5/7193/xyhf.png

  115. To finish this analysis, I applied a loess filter to the 1964-2005 series for the 9 CMIP5 Historical models that had 6 or 10 multiple runs. I used that filter to determine the trends, and the standard deviation of those trends, for the multiple runs of each of these models. I extracted the residuals from the loess smooth and determined an ARMA model that best fit the residuals. In almost all cases the best fit (AIC score) was ARMA(1,1), but in a few cases it was ARMA(2,0). I then ran 5000 simulations using the ARMA model, determined trends, and compared the standard deviations of those trends with the ones from the model runs as described above. Unlike using a linear regression and modeling its residuals for the same comparison, using a loess smooth provided trends from ARMA simulations that had the model run trend standard deviation within the 95% confidence intervals for all models, albeit 2 models had multiple-run trend standard deviations near the limit on the low side.

    I used two span parameters for the loess smooth: one with span = 0.30 and another with span = 0.40. The choice of span determines whether the smooth and residuals look like a straight line and the residuals from that line, or like the trends and residuals that would be produced using segmented linear fits, or somewhere between these extremes. Decreasing span values reduce the standard deviations of the residuals, and at some point a sufficiently low span value would be including natural variations in the smooth. Increasing the span value increases the standard deviations of the residuals, and at some point could, like a linear regression, put the changes in the deterministic trend into the residuals. Using the loess filter with lower values of span assumes that the deterministic trends are probably not linear, and allows some tuning in optimizing the agreement between the standard deviations of trends from the model results and the ARMA simulations while at the same time obtaining a good fit with an ARMA model.

    The results of my trend comparisons are shown in the linked table. The best fits to an ARMA model were found when using the loess smooth with span = 0.40, and were better than those found using linear regression. The loess smooth with span = 0.30 produced fits not as good as those using a linear regression.

    If the deterministic trends are suspected not to be linear over extended periods of time, like 1964-2005, I would think that measuring trends and extracting residuals would be better accomplished using a smoother like a loess filter; otherwise, as noted above for linear regression, the residuals will contain the deterministic trend variations.

    An interesting side note is that the standard deviations from the ARMA simulations for the two spans will, by an F test, not have significant differences, while those derived from the multiple model runs will have significant differences in several pairwise comparisons.

    http://imageshack.us/a/img46/5936/wafh.png
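
    Stripped to its skeleton, the loess variation of the procedure looks like this in R (the span values are as discussed above; the series itself is a placeholder):

        # x: one run's annual series for 1964-2005
        detrend.loess <- function(x, span = 0.40) {
          t  <- seq_along(x)
          sm <- loess(x ~ t, span = span)
          list(trend = coef(lm(fitted(sm) ~ t))[2],  # trend of the smooth
               resid = residuals(sm))                # residuals to feed to arima()
        }
        # Then, as before: fit ARMA(1,1) to resid, run 5000 simulations, and compare
        # the sd of the simulated trends with the sd of trends from the model runs.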
