Models Over Predict Using Another Version of a ‘Long Trend’.

In comments, it appears Deep Climate wants us to compare the current trends to the stated trend in Table 10.5 in the AR4.

Slight correction to my previous comment: I was confusing the A2 and A1B scenario IPCC projections. The former is 0.66 C (0.21/decade) to 2030 and the latter is 0.69 C (0.22/decade). The current trends would be slightly lower of course. The exact trend probably depends on how it’s baselined and smoothed, but 0.2C/decade is a reasonable benchmark that has been used by every other analyst to my knowledge, including Roger Pielke Jr (who you have quoted from time to time).

I suppose your position is that that’s the wrong benchmark, and we should be using 0.27 or 0.28 C/decade instead. Of course, you’re entitled to your opinion. We’ll have to agree to disagree.

As I answered in comments previously: if at all possible, I think we should compare apples to apples. We should compare 20 year trends based on observations for Year A through Year B to 20 year trends from model projections for Year A through Year B. We should compare changes in 20 year averages over 31 years to changes in 20 year averages over 31 years. (That's what the values in Table 10.5 quoted by Deep Climate represent.)

That said, when I did not have access to model data, I used the “about 2C/century”, and still mention it from time to time (as does everyone).

What we should not do is compare 20 year trends based on observations for 1989 through 2008 to changes in 20 year averages over 31 years with the second 20 year period ending in 2030.

Now let us make two like-to-like comparisons and, in the process, I'll show a graph highlighting numbers I mentioned in a response to barry.

Compare Trends based on changes in 20 year averages over 31 years

In the AR4, the IPCC created projections based on the multi-model mean, weighting the mean temperature anomaly from each model equally and correcting each model's anomalies for drift using control runs. Table 10.5 in the AR4 describes projected temperature changes based on the difference between the average temperature from 2011-2030 and the average temperature from 1980-1999. These are the values Deep Climate quoted.

To compare models and observations based on this metric, I downloaded all runs driven by the A1B SRES from all models used by the IPCC in the AR4 at The Climate Explorer months ago. (All but one run were available at the time; I'm going to check for the final run again tomorrow, but adding it can't make much of a difference.)

I computed the multi-model mean trend without correcting for model drift; if model drift is sizable, that could affect results. (Only two control runs are available.)

Below, I have plotted these trends as a function of the end year. The projection for the A1B SRES from Table 10.5 of the AR4 was divided by 31 years and highlighted with a purple circle.
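
For concreteness, here is a minimal sketch of the metric being plotted (my own illustration in Python, assuming annual global mean anomalies in NumPy arrays; it is not code from The Climate Explorer):

    import numpy as np

    def change_in_20yr_means(years, temps, end_year):
        # Change in 20-year mean temperature between two 20-year windows whose
        # centers are 31 years apart, expressed in C per decade. With
        # end_year=2030 this reproduces the Table 10.5 definition:
        # mean(2011-2030) minus mean(1980-1999), divided by 3.1 decades.
        late = (years >= end_year - 19) & (years <= end_year)
        early = (years >= end_year - 50) & (years <= end_year - 31)
        return (temps[late].mean() - temps[early].mean()) / 3.1

Sliding end_year produces the curves shown below.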

Trends based on 31 year change in 20 year average temperatures.

When examining this graph, I think it’s useful to note:

  1. If we were to compare the 31 year change in 20 year model means ending now to the observed trends for the same period, the models are running high. This is an apples to apples comparison.
  2. If we were to compare the 31 year change in 20 year model means ending now to those ending in 2030, the models would look horrible. This is because the models project warming will accelerate. Clearly, it does not make sense to compare trends over periods with different start and end years. This is why not even the “ice-age is coming stone cold denialists” are suggesting we make this apples to oranges comparison.
  3. The trend based on models including volcanic eruptions lies below the trend based on all models during the earlier portion of the period illustrated, but rises above that full multi-model mean trend during the later period. This behavior arises because volcanic eruptions during the 60s through early 90s are thought to have cooled the surface of the earth. This cooling appears in simulations by those model runs driven by volcanic forcing but is absent in runs that did not include volcanic forcing.

    All other things being equal, the effect of the volcanic eruptions is to depress trends with end years during periods when many volcanoes erupted and to elevate trends with beginning years during those periods. For particularly strong eruptions or many clustered eruptions, the effect is visible in one year, 5 year, 20 year and longer trends.

    Because this effect is understood and captured in simulations, I believe it is wise to compare the observed trends to simulations that capture the volcanic forcing.

  4. Because Deep Climate suggested that my trends are high relative to those in the IPCC: the projected trend I computed over the specific period selected by the IPCC is lower than the Table 10.5 value; the difference may be due to the IPCC's correction for model drift.

Projections for 20 year trend based on A1B

As some will recall, in the earlier post I compared observed 20 year trends to simulated trends from models containing volcanic forcings driven by A1B, and ended the graph in March 2009.
Based on comments, it appears I should have included a larger number of years.

That graph contained a few extra models (because they were available from The Climate Explorer). I deleted the ones not used by the IPCC and replotted, extending the graph to the end year of 2030. (Note: I do not correct for model drift.)

I plotted the running 20 year trends from models and observations:

Twenty Year Trends Extended to 2030.
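
For readers who want to reproduce the curves, here is a minimal sketch of the running 20 year trend calculation (again my own illustration in Python, assuming monthly anomalies; the model output and the observations are treated identically):

    import numpy as np

    def running_20yr_trends(dates, temps):
        # OLS trend in C per decade over each 20 year window, labeled by the
        # window's end date. `dates` are decimal years, `temps` monthly anomalies.
        ends, trends = [], []
        for i in range(len(dates)):
            window = (dates > dates[i] - 20) & (dates <= dates[i])
            if window.sum() < 240:        # require a full 20 years of monthly data
                continue
            slope = np.polyfit(dates[window], temps[window], 1)[0]
            ends.append(dates[i])
            trends.append(slope * 10.0)   # per year -> per decade
        return np.array(ends), np.array(trends)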

At his blog, Deep Climate said this:

Unfortunately, she has compared 20-year observed trends to a model subset that has an unrealistically high projected trend. Lucia’s graphs have the model trend rising as high as 0.28 deg C per decade, whereas the commonly accepted IPCC benchmark projection is 0.2 deg C per decade for the current decade.

Well…as you can see,

  1. The models I use reproduce the IPCC benchmark projection of about 0.2 C per decade if we match the time period the IPCC actually discussed! This is illustrated by the purple square showing the multi-model trend for the period from 2000-2020, i.e. the trend over the first two decades of this century.
  2. The multi-model 20 year trends for Feb 1988-March 2009, which include years from the past century, are higher than the trend based on the first two decades of this century; these are indicated by triangles. These trends do exceed about 0.2 C per decade, but this does not indicate the models I picked are high relative to the IPCC models, and it does not contradict what the IPCC said: the IPCC did not say that the 20 year trend ending in year “N” would be about 0.2 C per decade for all end years between 2000 and 2020!
  3. Once the start point of the projections clears Pinatubo, the effect of including volcanic eruptions on the calculated trends disappears. So, for example, the two multi-model 20 year trends ending in 2030 (with and without volcanic forcing) are indistinguishable from each other.
  4. If we compare the observed 20 year trends ending in March 2009 to the 20 year trends ending in 2020 (as Deep Climate may or may not be suggesting we ought to do), the models would appear to over-predict only slightly. But that's an apples to oranges comparison, particularly as we know the trend ending in March 2009 is affected by the eruption of Pinatubo.
  5. If we compare the observed trends ending in March 2009 to projections ending in March 2009, as I did, a choice Deep Climate appears to criticize, the models appear to predict too much warming for that period.

Closing

As mentioned in a few earlier posts, these graphs are merely descriptive. I do not now, nor have I ever, adhered to the belief that there is anything magic about 20 or 30 year trends. However, as long as Deep Climate is going to tell us these sorts of graphs speak for themselves, I figured I might as well correct the “Swahili substituted for Greek” he injects into his comparisons. These substitutions cause his graphs to speak in a language commonly called “gibberish”.

As I mentioned previously, these long term trends can be difficult to interpret. Though using long trends reduces the variability due to “weather noise”, some noise persists even in 30 year trends, particularly when volcanoes erupt.

As the graphs I show today do not include uncertainty intervals, it is not possible to say whether the over-prediction by models is statistically significant. However, it is possible to note that if we examine the data this way, comparing observed and simulated trends computed using the same method and with identical beginning and end points, the model projections exceed the observations.

Sorry Deep Climate, but that's just the way it is. The case you are making involves a) comparing observed and calculated trends using different time periods and/or b) comparing trends based on different methods. Many people will notice the mismatch in time periods and definitions of trends, correct the mismatches, and notice the model trends are currently rather high relative to observations.

29 thoughts on “Models Over Predict Using Another Version of a ‘Long Trend’.”

  1. Lucia: Please be patient with me, for you have answered this question before but I still wonder about “fitting”. The second graph above qualitatively shows the model means doing well until 2006. Then divergence appears. One question is when were the models run? They presumably have been fine tuned over time and use observations for such tuning. I did read some documentation for one model (I forgot which) and it appeared to have several unobservable parameters whose values I presume are fit in some non-heuristic way. Hence I still worry that there is some sort of empiricism and fitting and this appears to enhance accuracy when hindcasting. I could, of course, be out to lunch on this (it is 11:45 and I am actually hungry) and would like to learn more.

    Thanks

    Jack

  2. JackMosevich:

    One question is when were the models run? They presumably have been fine tuned over time and use observations for such tuning.

    I don’t know the precise dates for each of the 60-80 runs. They take place over time.

    However,

    * the SRES on which they were based were informally published in 2000 and formally published in 2001. So, these runs can’t predate that.

    * parameterizations in models have been adjusted over time. Many of these models did not contribute runs to the TAR published in 2001. So, presumably, in some sense, the models, parameterizations etc. were created this century.

    * sensitivity studies and studies to determine the sensitivity to doubling of CO2 etc. are done with all models. So, modelers do know the sensitivity of their models, they know the effect of including higher or lower aerosol and they know the surface temperature prior to performing their 20th century runs. So, the potential for some tuning by adjusting parameterizations or aerosol loadings exists.

    I don’t know whether you can say “fine tuned”. I don’t know how much they can tune and still have a working model. However the correlation between climate sensitivity in a model and the magnitude of forcing used to drive runs has been discussed in the peer reviewed literature.

    Like you, I take the ability to hindcast with a grain of salt. Good hindcasts are necessary but not sufficient for an accurate or even useful model.

  3. Lucia,
    This would probably be a good time to review some points:

    a) My original observations were to the effect that longer-term trends generally rose over the 2000-2008 period and that this in turn supported the contention that the 2000s have seen continued and even increased warming relative to previous decades, even though the intra-period trend was down or flat.

    I am *not* asserting that 20 or 30 year trends are the end-all and be-all of analysis. Rather, the point was, and is, to emphasize that we should be comparing recent observations to a past baseline period, and not place much stock in the trend within the recent period.

    b) I agree that an “apples-to-apples” comparison to the end of 2008 would show less warming in the observations than in the IPCC projections. The real question concerns the magnitude and significance of that gap.

    c) I don’t “want” you to do any particular analysis. I believe, though, it would be more useful to frame the analysis as a comparison of the post-baseline period to the baseline, using a similar period average comparison as contained in the IPCC projections.

    This is what I have done in my most recent post. When I compare the 2000-2008 observations average to the baseline, computed trends are 0.17 (HadCRU) to 0.18 (GISTemp) C per decade, a gap of 0.02-0.03 deg C.

    http://deepclimate.wordpress.com/wp-admin/post.php?action=edit&post=256&_wp_original_http_referer=http%3A%2F%2Fdeepclimate.wordpress.com%2Fwp-admin%2Fedit.php&message=1

    In contrast, if I am interpreting your graph correctly, you are showing a substantially larger 0.5-0.6 deg C difference in 20-year trend at present (for the A1B scenario). But even if the 20-year trends were directly comparable (and I’m not convinced they are), such a comparison obscures the analysis by conflating the details of pre-2000 period with the period of interest, instead of explicitly evaluating the projections and observations in terms of the baseline given in AR4.

  4. DeepClimate–
    If you wish to review:

    a) Whatever you intended as your major point in your first comment here, you began it by saying “You forgot to put the 1979-2000 trend line on your RSS graph, so I’ve rectified that for you.”, followed with a series of questions, and assigned me a quiz. What point you wished to make was obscure.

    b) Despite your rather oblique method of making your point, I and others here have discussed that the continued rise is not particularly meaningful. A longer term trend being more positive in magnitude than the two underlying ones is a relatively common phenomenon in statistics. The fact is: the rise that has occurred is much lower than projected by the models. This should, presumably, be the feature that matters if you wish to conduct a debate with Pat Michaels' claim that models have failed abjectly.

    c) Since the time of your first comment, you have followed up with a number of incorrect points, and apples to oranges comparisons to make some point that appears to be unrelated to anything I argued in my earlier blog post or in the current one. Obviously, I am free to respond to your later points and comment on the apples to oranges comparisons you are making.

    d) On this:

    I don’t “want” you to do any particular analysis. I believe, though, it would be more useful to frame the analysis as a comparison of the post-baseline period to the baseline, using a similar period average comparison as contained in the IPCC projections.

    What does this gobbledygook even mean?

    Obviously, no one can compare models to observations from 2020; it’s only 2009.

    Those comparisons that you have been posting at your blog or suggesting in comments compare trends based on periods whose end points do not match and/or are based on different definitions.

    If this is framing “the analysis as a comparison of the post-baseline period to the baseline, using a similar period average comparison as contained in the IPCC projections.” then that is an utterly misleading practice.

    I’m not the slightest bit surprised to discover that you are now trying another method of computing the longer term trends.

    Out of curiosity, why don’t you compute the exact same trend based on the model runs and slap that on your newer graphs? I can of course do this, and once again, the disagreement between models and observations will look worse than you show.

    You are complaining that Michaels criticized the models. At least Michaels compared observations to model data, which, in your quest to rebut Michaels' contention that they are an abject failure, you seem loath to do. Of course, if one wants to show the models are doing well, it's best to avoid comparing the observations to what the models actually project!

    I still don't know why you have decided the place to rebut Michaels is in comments here at my blog, on posts that do not discuss Michaels' analysis or claims.

  5. “I still don't know why you have decided the place to rebut Michaels is in comments here at my blog, on posts that do not discuss Michaels' analysis or claims.”

    It’s called misdirection, lucia. Deep Climate sticks to the play book well. Don’t be deterred, and don’t let him distract you.

  6. Andrew_FL– His tendency to make points obliquely, compounded by the fact that, when he first introduced his point, he was rebutting an argument that had not been made in the post or comment thread where he posted his “point”, makes Deep Climate's points particularly difficult to detect.

    In any case, in his most recent post, he
    a) continues, by running trends through zero at the center of the IPCC baseline, to take credit for the known run-up in temperature before the projections were made while suggesting he is not taking credit, and
    b) compares a least squares trend ending in 2009 (and sliding) to the single trend the IPCC computed based on 20 year averages separated by 31 years, with the second period ending in 2030.

    So…. a) the trends are computed in different ways and b) the years don’t match. So, they continue to be apples to oranges.

    What he is doing wouldn't bother me too much if we didn't have access to the underlying model data. Nevertheless, I would be very cautious about the period with volcanic eruptions, which are known to cause the model trend to be highly non-linear. I would avoid assuming a trend by any definition is linear if it spans a volcano. (Since we have the model data, we know the caution is warranted. Model mean trends spanning the volcanic eruptions are not linear.)

    In any case, we have access to the model data. So, there is no reason to pretend that his apples to oranges comparison is somehow better than comparisons between trends computed by applying the same definitions and matching the end years.

  7. 30 years is a good averaging period because it encompasses approximately 3 solar periods.

    When people do running averages, they are essentially low-pass filtering the data. If the purpose of this running average is to filter out the solar cycle, then you would need approximately 20-years or 2 solar cycles, according to the Nyquist Sampling Theorem.

    One goes to 30 years or 3 cycles simply because an unweighted running average does a horrible job of filtering out high frequency noise. If you stick to a 20-year period, some of the higher frequency jitter is getting aliased back into your signal as noise, which obviously isn’t desirable.

    This can all be fixed of course (so we can stick to 20 year averages) by going with cleaner filter designs. Mann has used Butterworth filters for (I believe) exactly that reason. Whether by accident or design, they use the filtfilt function in Matlab, which does the filtering correctly (it runs the data forward and backward through the filter, eliminating phase distortion from the Butterworth; it also properly handles end effects, something that SteveM has commented on previously).
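
    For anyone who wants to try it, here is a rough sketch of that kind of zero-phase smoothing in Python/SciPy rather than Matlab (the function name and the cutoff choice are only illustrative):

        import numpy as np
        from scipy import signal

        def smooth_acausal(monthly_anomalies, cutoff_years=20.0, order=4):
            # Zero-phase ("acausal") Butterworth low-pass, analogous to Matlab's
            # filtfilt: the data are run forward and backward through the filter,
            # so the smoothed series has no phase shift. cutoff_years sets the
            # period below which variability is attenuated.
            fs = 12.0                                    # samples per year (monthly data)
            cutoff = (1.0 / cutoff_years) / (fs / 2.0)   # normalized to Nyquist
            b, a = signal.butter(order, cutoff, btype="low")
            return signal.filtfilt(b, a, monthly_anomalies)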

    On another point, it is particularly meaningless to treat the ensemble of models as if they represented temporally-uncorrelated, normally distributed data. Model outputs are constrained by physical limitations, such as energy conservation (which places bounds on the variance of the output) and e.g. inertia (which places limits on how rapidly physical quantities can change, leading to time correlations in the data).

    The only approach I know that I would consider acceptable would be to evaluate the range of possible outputs, and their deviations from the data, of a given model using the Monte Carlo method. Beyond that I'd have to think about the proper statistical method, but it would probably involve computing deviations between individual model runs and the global mean value.

    The point is, you test each model (allowing for uncertainties in the input parameter set) separately to the data, and if each model separately fails, then the only proper conclusion is that none of the models are valid. What you don’t do is mash the models together and pretend you can treat the result as if it were normally distributed. (Yes I know the Large Number Theorem, but it doesn’t apply to model outputs for the reasons I’ve outlined—that’s why physicists were forced to start using Monte Carlo methods.)

    This is a very difficult problem to do correctly, and I haven’t seen anybody in the climate community who has come close yet to doing this problem correctly.

  8. Carrick wrote:

    If the purpose of this running average is to filter out the solar cycle, then you would need approximately 20-years or 2 solar cycles, according to the Nyquist Sampling Theorem.

    I am somewhat confused by this application of the Nyquist theorem (perhaps it is just too late at night and too many beers for reading!).

    For the slow among us (e.g., me), why is a periodic signal not filtered out by a running mean which is one cycle in length?

  9. oms,

    I always thought that averaging over fixed periods produced a comb filter response. Anything that is an exact fraction of the averaging/integrating period gets attenuated completely. Other frequencies are attenuated by half for each octave of increasing frequency.

    This is certainly how it works with an integrating voltmeter.

    http://phobos.iet.unipi.it/~barilla/pdf/INTEGRATING_ADC_tutorial.pdf

    Like you, I am mystified about why one would need to average over more than one cycle to remove signal at that exact frequency. Perhaps it is because the solar cycle is subject to some frequency modulation and the effects of this are reduced by averaging over more cycles.
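
    To put numbers on this, the magnitude response of a T-year running mean at f cycles/year is |sin(pi f T)/(pi f T)|, so the notch at the nominal period is real but narrow. A small sketch (illustrative only, in Python):

        import numpy as np

        # np.sinc(x) is sin(pi x)/(pi x), so the response of a T-year running
        # mean at f cycles/year is |np.sinc(f * T)|: a comb with nulls wherever
        # the signal period divides evenly into T.
        def running_mean_response(f, T):
            return np.abs(np.sinc(f * T))

        T = 11.0                                    # 11-year running mean
        print(running_mean_response(1 / 11.0, T))   # 11-yr cycle: essentially 0 (on a null)
        print(running_mean_response(1 / 12.5, T))   # cycle drifts to 12.5 yr: ~0.13 leaks through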

  10. Carrick–
    The Nyquist frequency criterion relates to detecting the highest frequencies (i.e. faster than El Nino), not the lowest (i.e. PDO, solar, NAO etc.). But we would need to average over at least 1 solar cycle to begin to remove energy at that low frequency and at least two to detect that the cycle exists in the data.

    “Beyond that I'd have to think about the proper statistical method, but it would probably involve computing deviations between individual model runs and the global mean value.”

    A method like this is discussed in Santer's paper. Of course, you still need to account for the noise in the earth data, for which we only have 1 realization. However, I applied the test, using the standard deviation of trends over multiple realizations, and discussed that here. Bar and whisker diagrams for the 95% confidence interval on model mean trends and the data are shown below:

    The multi-model mean is shown also. The only reason to test the multi-model mean is that, whether or not it makes sense, it is used as the basis for the projection.
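
    Schematically, the interval calculation looks something like this (a rough sketch only, not Santer et al's exact recipe; the array names are illustrative):

        import numpy as np

        def mean_trend_with_interval(dates, runs):
            # Mean trend across realizations with a ~95% interval based on the
            # spread of trends. `runs` is 2-D, one row per model run; `dates`
            # are decimal years. Trends are returned in C per decade.
            trends = np.array([np.polyfit(dates, run, 1)[0] * 10.0 for run in runs])
            mean = trends.mean()
            se = trends.std(ddof=1) / np.sqrt(len(trends))   # std. error of the mean trend
            return mean, mean - 2.0 * se, mean + 2.0 * se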

    Out of curiosity, how would you test models using a Monte-Carlo method?

  11. oms:

    For the slow among us (e.g., me), why is a periodic signal not filtered out by a running mean which is one cycle in length?

    That’s correct, as long as the signal is truly periodic. The problem arises when the signal has a significant fluctuation in its period.

    Jorge:

    Perhaps it is because the solar cycle is subject to some frequency modulation and the effects of this are reduced by averaging over more cycles.

    This is exactly the problem. Don't concern yourself with the notches in the transfer function—since you don't know ab initio the proper frequency to notch out, it's the envelope of the response that is important:

    Note for a simple running average you need 3+ periods to get a 20-dB rejection of the signal.

  12. I messed up the image link, see if this works:

    This is a screen grab from Jorge’s tutorial

  13. Lucia:

    But we would need to average over at least 1 solar cycle to begin to remove energy at that low frequency and at least two to detect that the cycle exists in the data.

    Basically this has been covered above. I shouldn’t have brought up the Nyquist Sampling Theorem, it’s just obfuscated the issues.

    Basically this is right. See the linked image above. (If you know ab initio the frequency, however, you can measure its amplitude and phase with just a fraction of a period of the signal using a least-squares-fit filter; I had the occasion of doing that yesterday [0.2 Hz signal, 1-s recording time]. But that’s another topic.)

    A method like this is discussed in Santer's paper.

    I was referring to Santer in my comment. Basically what they did was wrong because they tried to interpret the variance between models as being associated with a normal distribution, which is flat wrong.

    The mean is a well defined operation; it's the interpretation of the variance as normally distributed that is at fault.

    I’m going to take a stab at this, see how much I get wrong:

    If you had two curves with associated measurement error, one way you could test whether the two curves are consistent with each other (the null hypothesis) is to use the chi-square statistic:

    chi2 = sum[(y[i]-x[i])^2/(sigma_x[i]^2+sigma_y[i]^2),{i,1,N}]

    then use the chi-square test on the data.
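
    Written out as a runnable sketch (Python/SciPy; I'm assuming independent Gaussian errors and no fitted parameters when counting degrees of freedom):

        import numpy as np
        from scipy import stats

        def consistency_pvalue(x, y, sigma_x, sigma_y):
            # Chi-square test of the null hypothesis that two curves, each with
            # known independent Gaussian errors, are consistent with each other.
            chi2 = np.sum((y - x) ** 2 / (sigma_x ** 2 + sigma_y ** 2))
            dof = len(x)                      # no fitted parameters assumed here
            return stats.chi2.sf(chi2, dof)   # probability of a chi2 this large under the null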

  14. Thanks for fixing the link. Somehow it is getting chewed up by your blog software…

    Lucia:

    Out of curiosity, how would you test models using a Monte-Carlo method?

    With the Monte Carlo method, what we do is run the same model with a set of randomly generated input parameters, where the input parameters are constrained to be physically realizable.

    E.g., let p_n refer to the nth parameter of the model, bounded by d_n <= p_n <= u_n. Then for the kth run of the model, we would have:

    p_n(k) = d_n + (u_n-d_n) * urand()

    where urand() is a uniform random number generator bounded between 0 and 1.

    This will produce for example a global mean temperature versus time, call it GMT_k(t) [kth realization of the model].

    You would end up with a distribution of values, k = 1,…,K for each time t of the global mean temperature.

    The main issue that would have to be addressed would be studying the distribution of the model outputs GMT_k(t) about their mean value so that one could make a valid statistical inference about the significance of any difference between the data versus model.
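
    As a schematic only (here `model` and `bounds` are placeholders; an actual AOGCM is far too expensive to run this way, as discussed below):

        import numpy as np

        rng = np.random.default_rng(0)

        def monte_carlo_runs(model, bounds, n_runs):
            # Run `model` with input parameters drawn uniformly within physically
            # plausible bounds: p_n = d_n + (u_n - d_n) * urand(). `model` maps a
            # parameter vector to a time series (e.g. global mean temperature);
            # `bounds` is a list of (d_n, u_n) pairs.
            lo = np.array([b[0] for b in bounds])
            hi = np.array([b[1] for b in bounds])
            runs = []
            for _ in range(n_runs):
                params = lo + (hi - lo) * rng.uniform(size=len(bounds))
                runs.append(model(params))
            return np.array(runs)   # shape (n_runs, n_times): the distribution of GMT_k(t)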

  15. Carrick– Yes. Regarding the test applied to the multi-model mean: it is highly unlikely (approaching impossible) that the distribution of model means, drawn from whatever might be the universal population of AOGCMs meeting the criteria for the AR4, is normally distributed.

    There are an insufficient number of models to apply the Chi-square test to determine whether this multi-model mean is normally distributed. The assumption of normality is an assumption, and there is not enough data to test it.

    I know how monte-carlo is run. It sounds like you would run the AOGCMs many many times? So the AOGCM’s replace your random number generator in your example.

    That's not done because it's too computationally intensive. Santer-type tests end up being done because, from a Monte Carlo point of view, the number of Monte Carlo runs is less than 7 for all individual models and the total number of models is on the order of dozens, not hundreds or thousands.

  16. Carrick: Why filter out the solar cycle or use data over 3 cycles when there does not seem to be a solar signature in the observed temperature history? Question: do models utilize any non-constant solar forcings?

  17. Lucia:

    I know how monte-carlo is run. It sounds like you would run the AOGCMs many many times? So the AOGCM’s replace your random number generator in your example.

    I would run the same model many times with different input assumptions, bounded by the range of known possible values. I wish I could be more coherent, but Gavin is right, if we’re going to play with this stuff we need to get Climate Explorer and learn how to use it (I’m not there yet).

    I think one could use the Santer approach if you had enough different models; you just can't use the assumption of normality of the observed variance across models. I personally think what this means in practice is the climate models are in far worse shape than Santer et al are intimating (e.g. this figure), but of course that's just my opinion.

    jack:

    Why filter out the solar cycle or use data over 3 cycles when there does not seem to be a solar signature in the observed temperature history

    The answer to that question as I understand it is yes there is an effect from solar forcing, and climate models include that.

    In fact, it is generally invoked to explain the cooling of the Earth during the Little Ice Age. It is also needed to explain the warming trend from 1850 to roughly 1980. (Prior to 1980, again according to the climate models, the anthropogenic warming from human CO2 emissions was almost completely balanced by pollution. Thus to explain the warming from 1850-1980, one must dominantly invoke natural mechanisms for that warming.)

  18. By the way, Lucia, if I were doing the analysis (assuming the availability of CPU time), I would run the Monte Carlo simulations for each model separately and end up with a p-value for each model as to how consistent it is with the measured data. If I ended up with no models that were consistent with the data, the right answer is none of them are consistent.

    In other words, stacking 10 bad models on top of each other doesn't put the aggregate in any better agreement with the data. That's fundamentally the problem with Santer's approach, IMHO.

    I actually have free access to a super computer cluster, so this type of monte carlo is something that is practicable for me to do. I’m pretty sure my boss would kill me though, as there are a lot of other things that we are on the line for right now.

  19. Carrick–
    I'm still fuzzy on how you believe you will run the Monte Carlo. The IPCC models are AOGCMs. Running one realization requires a lot of computing power. You could contact climate modelers for any codes that might be available.

    The stuff at The Climate Explorer is processed output from AOGCMs. It isn't the models themselves.

    It may be that you have a useful idea, but I don't grasp what it is.

  20. Carrick,

    I agree that the notches will not be very helpful in filtering out a particular signal if the frequency/period does not match the chosen averaging time period.

    However, it would be a mistake to think that a time average of a signal is not affected by the comb filter response. For example, if the average is taken over 20 years, any frequency component with that period will be removed completely. The same would be true for periods of 20/2, 20/3, 20/4, etc.

    Those of us that have faced the problem of extracting a signal from noise have usually had the luxury of knowing the frequency spectrum of the signal and could design filters to attenuate frequencies outside that range as noise. The great difficulty with climate series is that nobody knows which frequencies are signal and which are noise.

    It is easy to say that high frequencies are weather/noise and low frequencies are signal, but in practice there is no way to decide where to place the frequency cutoff. This is why we have so much confusion about the meaning of trends. A trend is simply the sum of the frequencies you have decided to retain after passing the data through a low pass filter with an attenuation-versus-frequency shape of your choosing.

    I'm still fuzzy on how you believe you will run the Monte Carlo. The IPCC models are AOGCMs. Running one realization requires a lot of computing power. You could contact climate modelers for any codes that might be available.

    I have at least one code (GISS ModelE) on my main workstation. And yes I realize it’s extremely time intensive just for one simulation, at least on a desktop workstation. As I said I realize it’s a hard problem. To do this practically would require running it on our super computer cluster.

    You’d have to run each of the multitude of climate models allowing the input specifications to covary in order to get any real grasp of the uncertainty in the model projections. And then you’d have to make sure you nailed the statistical analysis, because you can see personally how touchy Gavin and the other people in that field are about the slightest mistake made by interlopers like us.

    The stuff at The Climate Explorer is processed output from AOGCMs. It isn't the models themselves.

    That’s useful to know. I hadn’t played with it so I wasn’t sure what it could do or couldn’t.

    The bottom line is the “right approach” may not be tractable, but that doesn’t make wrong approaches suddenly more “right” as a consequence.

    Sometimes the best we can do isn’t enough, and we just have to deal with that.

  22. Jorge:

    Those of us that have faced the problem of extracting a signal from noise have usually had the luxury of knowing the frequency spectrum of the signal and could design filters to attenuate frequencies outside that range as noise. The great difficulty with climate series is that nobody knows which frequencies are signal and which are noise.

    I agree with this… most of the data I collect is “experimental data”, that is I provide a known input stimulus, and measure the response of the system to that stimulus.

    Climate science is mostly an observational science, which means you can’t change anything (you can’t do very much real world experimentation).

    For observational measurements, filter designs such as running averages are especially bad because the tail of the distribution varies rapidly with frequency. (That means the noise that leaks through the filter isn't in general stationary, which is a very nasty property of such filters.)

    That’s why when I do smoothing, I use an approach like an acausal Butterworth filter, much as has been done by Michael Mann’s group. I have no way of knowing whether he realizes how good his approach is, but it is a good approach, in my opinion.

  23. The distribution of output temperatures from these model runs will depend on the distribution of the input parameter variations. How do you know what that should be?

    The a priori assumption of a uniform random distribution of input parameter variations seems unjustified.

    To solve the computational problem it would be nice to create a similar volunteer network to that used for finding prime numbers via GIMPS Prime95.

  24. Alan– The parameters are, at best, drawn 'randomly' from the set of all parameters that would be selected by people who earn their living running climate models at agencies with sufficient funds to employ people who write climate models. 🙂

  25. Alan:

    The a priori assumption of a uniform random distribution of input parameter variations seems unjustified.

    When we know a priori the distribution of the uncertainty in the input parameters, we use that.

    However, this is a case where the large number theorem works in our favor.

    E.g., suppose we have a model with just two parameters x, y bounded between 0 and 1…

    urand(x) + urand(y) is a triangular distribution, not a uniform one.

    The main reason for the use of a uniform distribution is that it enforces known constraints on parameter values. Parameter uncertainties are rarely Gaussian for that reason. So there is an a priori reason for using it over a Gaussian, even when the underlying distribution of the uncertainty in the parameter space is unknown.
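
    A quick numerical check of the triangular-distribution point (illustrative only):

        import numpy as np

        rng = np.random.default_rng(1)

        # Two inputs sampled uniformly on [0, 1]: their sum is triangular on
        # [0, 2] and peaked at 1, not uniform -- the output distribution does
        # not have the same shape as the input distributions.
        u = rng.uniform(size=(100_000, 2))
        hist, edges = np.histogram(u.sum(axis=1), bins=20, range=(0.0, 2.0), density=True)
        print(hist.round(2))   # rises roughly linearly toward ~1.0 in the middle, then falls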

Comments are closed.