The Blackboard

Where Climate Talk Gets Hot!


Distribution of 8 Year OLS Trends:
What do the data say?

16 June, 2008 (12:24) | Data Comparisons

One of the oddities of climate blog discussions is the tendency of modelers to rely on model data with little comparison to measured data. For example, sometimes readers are told that we should estimate the variability of the earth’s temperature trends averaged over 8 years– unaffected by volcanic eruptions– from this figure created by Gavin of Real Climate:

Distribution of 8 year means

What was this figure supposed to tell us?

Gavin created this figure to explain that the recent flat-temperature trends are consistent with the IPCC’s projection of a rate of increase in surface temperatures of +2C/century. In principle, if a current trend is within roughly 2 standard deviations of the predicted value, it falls within the 95% confidence intervals of the predicted value. So, if the standard deviation is ±2.1 C/century– as Gavin would have it– then a trend of 0C/century is consistent with the IPCC predictions.
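As a rough sketch of the 2 standard deviation rule of thumb described above, the check can be written in a few lines. The function name and the cutoff of exactly 2 are illustrative assumptions, not anything taken from Gavin's post; the numbers are the ones quoted in the text.

```python
def within_two_sigma(observed, predicted, sd, k=2.0):
    """Return True if `observed` lies within k standard deviations of `predicted`."""
    return abs(observed - predicted) <= k * sd

# Values quoted in the text, all in C/century.
predicted = 2.0   # IPCC projected trend
observed = 0.0    # a flat trend
sd_models = 2.1   # standard deviation reported for 8 year model trends

print("flat trend consistent with projection:",
      within_two_sigma(observed, predicted, sd_models))

# Largest standard deviation for which a flat trend would sit
# at or outside the 2-sigma band around the projection.
threshold_sd = abs(observed - predicted) / 2.0
print("sd must be below", threshold_sd, "C/century to reject")
```

Under this rule of thumb, the conclusion depends entirely on which standard deviation is plugged in, which is why the size of that number matters so much in what follows.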

In his view, the correct method of performing a hypothesis test is to assume the variability across models is “weather noise”. And in fact, it’s quite clear Gavin suggests that model variability is weather noise if we examine his discussion of the scatter between model predictions, which he explains as follows:

This is the impact of the uncorrelated stochastic variability (weather!) in the models that is associated with interannual and interdecadal modes in the models -

So, since Gavin believes the scatter in model runs is due to, to use his word, “(weather!)”, he evidently calculated all the 8 year trends from all model runs used by the IPCC, found their standard deviation and then plotted the histogram. He concluded that the earth data falls within the model scatter.

But this is model data. So one might ask:

How does Gavin’s standard deviation compare to those seen on the real planet earth?

It’s easy enough to answer this if one computes something Gavin chose not to include in his post:
The actual standard deviation of 8 year trends in the thermometer record.

Fortunately, it’s easy enough to calculate them.

To do so, I averaged over historical records for monthly GMST from NOAA, GISS and HadCrut (so as to minimize measurement error due to differences in treatments across agencies.) Then, I calculated all 8 year trends from 1880 to now using LINEST from EXCEL.

Afterwards, I calculated the standard deviation of 8 year OLS trends two ways:

  1. I used STDEV over every single 8 year OLS trend. This resulted in a standard deviation of 1.9 C/century.
  2. I used STDEV over trends with beginning points spaced every 100 months. This eliminated overlap between the 8 year trends, even placing some spacing in between. This resulted in a standard deviation of 1.4 C/century.
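The two calculations above can be sketched in Python. This is only an illustration under stated assumptions: the monthly series here is synthetic (a linear trend plus white noise with made-up parameters), standing in for the averaged NOAA/GISS/HadCrut record, and `rolling_ols_trends` is a hypothetical helper playing the role of LINEST. The sample standard deviation (`ddof=1`) corresponds to EXCEL's STDEV.

```python
import numpy as np

def rolling_ols_trends(series, window):
    """OLS slope (per time step) for every run of `window` consecutive points."""
    t = np.arange(window, dtype=float)
    t_centered = t - t.mean()
    denom = np.sum(t_centered ** 2)
    n = len(series) - window + 1
    slopes = np.empty(n)
    for i in range(n):
        y = series[i:i + window]
        slopes[i] = np.sum(t_centered * (y - y.mean())) / denom
    return slopes

# Synthetic stand-in for monthly GMST anomalies (assumption, not real data).
rng = np.random.default_rng(0)
n_months = 1500
months = np.arange(n_months)
anomaly = 0.0005 * months + rng.normal(0.0, 0.1, n_months)  # degrees C

window = 96  # 8 years of monthly values
slopes = rolling_ols_trends(anomaly, window) * 1200.0  # C/month -> C/century

# Method 1: every overlapping 8 year trend.
sd_all = np.std(slopes, ddof=1)

# Method 2: start points spaced 100 months apart, so windows do not overlap.
spaced = slopes[::100]
sd_spaced = np.std(spaced, ddof=1)

print(f"{len(slopes)} overlapping trends, sd = {sd_all:.2f} C/century")
print(f"{len(spaced)} spaced trends, sd = {sd_spaced:.2f} C/century")
```

With only a handful of non-overlapping windows, the second estimate rests on very few samples, so it can shift noticeably with the choice of starting month.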

So what do we see? Gavin’s estimate of a standard deviation of ±2.1 C/century for 8 year trends is larger than both historical estimates of ±1.9 C/century and ±1.4 C/century. One might, however, say, “Well, 1.9 C/century isn’t that much smaller than ±2.1 C/century!”

But, actually 1.9 C/century is much smaller than 2.1 C/century. :)

Why? Well, the measured variability in 8 year trends on the real honest to goodness earth should be quite a bit larger than the “weather noise” across IPCC models predicting trends for the period from 2000 through 2007, inclusive.

Here’s why: Gavin’s estimate of the “weather noise” effect on calculated 8 year trends is for computations with smoothly varying forcings, no volcanic eruptions, and contains no measurement uncertainty whatsoever. Gavin even emphasizes the point that his estimate of 2.1 C/century is without volcanic eruptions, saying:

Note that this is over a period with no volcanoes, and so the variation is predominantly internal (some models have solar cycle variability included which will make a small difference).

In contrast, the measured data contain dramatic excursions due to volcanic eruptions, variability due to normal measurement uncertainty, and even the “bucket to jet-inlet transition and back” correction noise– which has a considerable effect on some calculated 8 year trends.

So, the 1.9 C/century (or 1.4 C/century if you prefer) contains positive contributions due to:

  1. internal variability “(weather!)”,
  2. measurement uncertainty (including the rather remarkable ‘bucket to jet inlet transitions’),
  3. volcanic eruptions and
  4. non-linear variations in the forcing due to GHG’s between 1880-now

In contrast, Gavin’s 2.1 C/century is, according to Gavin’s text, only “weather noise”.

How does the 2.1 C/century compare to periods with no volcanic eruptions?

Unfortunately, the historical period of time with no-volcanic eruptions and no-jet-inlet to bucket measurement noise is quite short. However, if I calculate the standard deviation of 8 year trends for the period from roughly the 20s-40s, I get a standard deviation of 0.9 C/century. This is less than 1/2 the value computed by Gavin. But, I’m not at all confident it is correct, as the period is very short.

Nevertheless, it appears that the standard deviation of 8 year trends starting in 2000 predicted by the IPCC models is, well… a bit high compared to real earth data. (And that is, after all, what the models are supposed to predict.)

So why is the IPCC “climate model noise”, called “(weather!)” by Gavin, so large in 8 year trends?

I may be mistaken, but I suspect the most likely reasons IPCC models over-estimate the variability in 8 year trends are:

  1. The “climate model noise” includes variability across models, each of which has different parameterizations. Moreover, even the history prior to 2001 may differ from model to model. Some agencies included the volcanic eruptions from the 90s, some did not. This can affect initial conditions, and is bound to affect the trend calculated afterwards.
  2. The climate model noise may include effects of climate model drift– a feature where the control runs of models go off track. Modelers describe how they attempt to correct this but the existence of drift itself could certainly elevate the variability of short term trends in models though not the variability on earth.

Regardless, what we have is evidence that model estimates of the variability of 8 year trends during periods with no volcanic eruptions are noticeably higher than those seen on the real physical earth.

Needless to say, I’m still trying to find a good way to estimate the actual variability of 8 year trends on earth so that I can present a hypothesis test that persuades those who do not wish to believe the results.

One thing I do know: using the estimate of ±2.1 C/century based on IPCC model runs is…what was the word Gavin used in his blog post? Bogus.

Comments

Arthur Smith (Comment#3383)

Lucia - what was your starting point for the 100-month-separated 8-year trends that got the 1.4 C/century standard deviation? I suspect that number would depend quite a bit on where you started, given that it’s so different from the overall 1.9 C/century number… Given that we only have 1900 months from the start of the HADCRU series, 100-month chunks would be only 19 samples so not great statistics in itself.

Anyway, even 1.4 C/century standard deviation doesn’t falsify the IPCC trend if the present trend is 0, you need a s.d. less than 1 (which you claim for the non-volcanic data, but that’s even less total samples, so hardly convincing…)

lucia (Comment#3384)

Arthur,
The first year is the average starting December 1880. I picked that because it’s the month and year where every service reports. It happens to be in row 51, so after that, I use row 151, 251 etc. It’s amazing how much lower it is. I didn’t do this for other choices of start month. Presumably that would affect the number. (Maybe make it higher? Lower? I don’t know.)

No– the 1.4 C/century standard deviation wouldn’t falsify anything. But, on the other hand, the calculation giving 1.4 C/century includes volcanic eruptions, where as the post 2001 period does not.

The short “not volcanos” period had sd’s near 0.9 and 1.2 C/century depending on how I look at it– but there are only about 3 independent samples especially if I stop at the suspected “jet inlet bucket transition spot”. The sudden plunges in that data really contribute to the variability in the temperature record over all. The shorter the record the more that matters– and that transition happened right at the end of the “no-volcano period”!

I have histograms too if you’d like to see them. I wish EXCEL let me easily color and highlight contributions from different periods. I may need to fiddle to show that so you can see where all the “bucket transitions” stuff is on the histogram!

Basically though– given the trends etc. we are at the point where knowing if 1 sd is 1.0 or 1.2 matters to some degree for falsification. I’ve been trying to look at things a few ways without blogging results for a bit to see if I can come up with anything that is not too complicated to get estimates.

Of course, no matter what I find, we’ll learn more as the temperatures come in.

(Oh– did you read at Roger Pielke Jr.’s blog the issue about more Carbon being spewed out? I’m wondering about aerosols.)

lucia (Comment#3385)

Oh– Arthur– The other oddity is that if we do a chi-squared on the uncertainty in the sd, the scatter doesn’t converge very quickly. So, the uncertainty in the sd based on model runs is also really high!

I’ve been looking at a couple of other things. For example, I downloaded the GISS Model E data for “solar only”. Then I calculated the 8 year trends and s.d.’s. For solar only from 1880-now, I get a standard deviation in the 8 year trends of 0.4 C/century.

But those are averaged over 5 runs, so I multiplied by sqrt(5), and I get an estimate of 0.9 C/century.

That’s amazingly similar to what the non-volcano periods actually show. But I need to think through whether multiplying by the sqrt(5) is quite right to estimate a single model run.

But if it is, that could suggest that the Model E s.d. of 8 year trends matches the “no-volcano weather noise”. But that good result is one model, with one set of physics, not results for a collection of models with a variety of parameterizations with a bunch of different possible histories up to 2000.
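The sqrt(5) scaling mentioned above can be checked with a small simulation. This is a sketch under a strong assumption: the five runs are treated as statistically independent draws with identical spread and no shared signal. The value 0.9 is used only as a hypothetical single-run standard deviation.

```python
import numpy as np

rng = np.random.default_rng(1)
n_runs = 5
n_samples = 200_000
sd_single = 0.9  # hypothetical single-run sd of 8 year trends, C/century

# Each row is one independent "run"; each column is one 8 year trend.
runs = rng.normal(0.0, sd_single, size=(n_runs, n_samples))

# Averaging across runs shrinks the spread by about sqrt(n_runs).
ensemble_mean = runs.mean(axis=0)
sd_of_mean = ensemble_mean.std(ddof=1)

# Multiplying back by sqrt(n_runs) recovers the single-run spread.
recovered = sd_of_mean * np.sqrt(n_runs)

print(f"sd of 5-run average: {sd_of_mean:.3f}")
print(f"rescaled by sqrt(5): {recovered:.3f}")
```

The rescaling only applies to the part of the variability that is independent between runs. Any forced response common to all runs (the solar signal itself, for example) does not average down, so multiplying the whole standard deviation by sqrt(5) would inflate that shared part.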

(I also have no idea how different models deal with any climate drift they might have. One of the Hansen papers discusses how they do it. If some model had a lot of drift and it was dealt with badly, I suspect that could really screw up 8 year trends.)

Alan S. Blue (Comment#3386)

Lucia, this particular argument is tilting at windmills. I’ll take a run myself. ;)

Once you’ve made the assumption that you can treat the available models as an ensemble, combining their predictions and treating the ensuing statistical analysis as valid, you’ve concurrently made the assumption that you can likewise add a new model. This is Monte-Carlo-esque. More runs is the very definition of ‘better’ there. They’re accepting that any individual model can’t predict weather - the aim is the shape of the envelope, the climate. So any individual model can be flat wrong about ‘weather’, but the assumption is that it still has plenty of (negative) information about Climate. “At no time in my model was temperature in place X below Y.”

But when you’re building a model based on something like Monte-Carlo Simulations, you’re doing as many runs as your computational power will support. Say 1000 runs of your model. Any individual run might say anything crazy. But you’re grouping the runs together to determine a probability density. When you’re running thousands of runs of a reasonable model, there’s likely several individual runs that match up reasonably with any particular metric you can apply to the real, observed data. So seeing an individual run diverge for awhile doesn’t send up any red flags - it is just one run, and at the end of the simulation, it ends up within spitting distance of the right spot.
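A toy version of this idea, as a sketch only: 1000 runs of a made-up "model" in which annual temperature is a fixed trend plus independent noise. The trend of 2 C/century matches the projection discussed in the post; the noise level of 0.1 C per year is purely an assumption chosen for illustration, not a property of any real climate model.

```python
import numpy as np

rng = np.random.default_rng(2)
n_runs = 1000          # "Say 1000 runs of your model"
n_years = 8
true_trend = 0.02      # C/year, i.e. 2 C/century
noise_sd = 0.1         # C, hypothetical year-to-year noise

years = np.arange(n_years, dtype=float)
t_centered = years - years.mean()

# Each row is one run: the same underlying trend plus its own noise.
series = true_trend * years + rng.normal(0.0, noise_sd, size=(n_runs, n_years))

# OLS slope of every run at once, converted to C/century.
slopes = (series @ t_centered) / np.sum(t_centered ** 2) * 100.0

frac_flat = np.mean(slopes <= 0.0)
print(f"mean 8 year trend across runs: {slopes.mean():.2f} C/century")
print(f"sd of 8 year trends:           {slopes.std(ddof=1):.2f} C/century")
print(f"fraction of runs with trend <= 0: {frac_flat:.3f}")
```

In this toy setup some individual runs show a flat or negative 8 year trend even though every run shares the same underlying warming, which is the sense in which a single divergent run raises no flag. How often that happens depends entirely on the assumed noise level.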

But the kicker is that you can start seeing ‘the real observations’ as ‘one more run’.

This is reasonable when you have a solid grasp of the physics. Picture a ball balanced on the end of a stick. Balance the stick on a slow-to-react but otherwise perfectly computer controlled cart. Add Gaussian-distributed variable wind from due north. We’ve got a chaotic system, and infinitesimal changes in the initial conditions rule. If we assume the wind never crosses a level too strong for the cart to counteract, we can eventually come up with a model that describes the system. And we’ll have a lot of rules nailed down: “The stick never reaches 80 degrees off vertical”, or “The cart never exceeds 17 m/s” or whatever. We can graph ‘average ball height’ or whatever. All deviations from the long-term projection are “initial-condition noise”. But predicting the actual angle of the stick, the actual height of the ball, or the actual velocity of the cart is going to be pretty futile. We can assign a probability density, but not predict the actual variables.

The key here is that there’s a central trend: we can say the cart is going to go north (opposing the wind), and we can eventually even predict the average northwards velocity of the cart using our Monte-Carlo Simulation derived model. Say 100 meters north (± 1 m @ 2 sigma) over ten minutes. Yet seeing a ‘real-world’ demonstration, even at a million-to-one odds of the combination of observed variables, one can still be confident that the cart will end up 100 meters ± 1 meter north of the starting point after ten minutes. Weirdly, the ‘initial condition noise’ goes down, at least in a relative sense, as the prediction is extended to, say, 20 minutes. The wind is the only disturbance, and it is constrained to a maximum speed.

But the “I see a statistically improbable combination of variables” path is a tough argument. With a solid lock on the physics, there’s a solid grasp of ‘the envelope’. Anyone believing the model won’t be disturbed by the rarity of the path the ‘actual data’ takes, because to get a Monte-Carlo model to this point, there’s an awful lot of ‘individual runs’ that have been examined, and the craziness of an individual run isn’t significant. The ‘envelope of expectations’ encompasses the wildest levels of all the outliers. The statistical information is treated as a probability density. The “actual prediction” is of being at a certain point in 10 minutes. Or a hundred years. There’s only two things that shake that worldview - the ball actually falling off the stick, or the cart not arriving in the designated spot in 100 years. And, if there’s a firm grasp of the physics, even ‘not arriving’ wouldn’t be treated as a disproof. “Hm, that was mighty unlikely. One in 10 million chance to be that far off? Wow. Well, can’t say anything statistical until we’ve done - at the very least - seven runs. So… let’s go again!”

You’re seeing this here. Nearly a decade of near-flatness was an improbable prediction to make in 2000 based upon the current ensemble of models. I don’t think anyone would argue with you on that point any more. Clearly it isn’t impossible though - we’ve done it. But occasional individual runs internal to the models have probably also done something similar. Then recovered well enough to not drag the ensemble average down when extended to a hundred years out. The trend needed to hit the mark in 2100 from where we are clearly isn’t excessively improbable. That’s what, an extra 0.2C/c from 2010 to 2100?

Anyway, sorry for being so longwinded. It’s just that the very term “weather noise” - and the way it is used - implies to me a view of the model(s) that assumes the physics is well enough understood that the viewpoint just won’t be shaken until the ball hits the floor. Any specific individual route to the target is equally improbable after all.

Niche Modeling » Chaitén Eruption June (Pingback#3387)

[...] - is pure speculation, but at the very least, this is a new stage of activity at Chaiten. Over at The Blackboard, Lucia finds a huge statistical contribution of volcanic eruptions to climate variation. How does [...]

lucia (Comment#3388)

Alan–
The main problem I have is the whole “is it chaos” issue? That may well exist. But even if it doesn’t, the model runs aren’t runs of the actual earth. So, as a result, even if we agree the average of weather exists, we still need to prove that any individual model and/or the collection of models gives the correct average value for anything. That includes trends, standard deviations, weather noise etc.

Those defending the idea of averaging always like to explain and re-explain that the idea of averaging is like averaging coin flips. Yep. Everyone gets this. The problem with the whole “coin flip” analogy is that, in the analogy, to be certain of determining whether “coin A” is fair, you flip “coin A” many times. You don’t tell people you created “coin B” which you think is sort of like “coin A” and flip coin B.

But what’s even worse is that we could average over a zillion realizations of “coin B” and (in the analogy) the earth’s climate system might turn out to be “a die”. The statistical properties of rolling the die are different from flipping a coin. And no matter how often you flip the coin, it won’t become a die!

And so… it’s a bit much to insist that we must believe the average of models is correct for anything in particular before it’s ever been proven!

Alan S. Blue (Comment#3389)

And with your comment of 3388, I completely agree. I find it very odd that we’re going to place any trust in models that are known - in advance! - to not be complete. Witness volcanos. I can understand not being able to predict volcanos (or sunspots, or any other potential disturbance), but when you’re validating your model on historical data, it had better react correctly to the known disturbances.

But ‘this is just one run’ is seductive. And there’s a whole lot of room for moving goalposts.

fred (Comment#3395)

The problem with Gavin’s argument is the same as the problem with Pascal’s wager. He’s an intelligent guy so it is strange he does not see it. It makes it impossible to discriminate between this set of models and other sets of incompatible models by contrasting the different sets with the empirical evidence available. Pascal similarly made the argument for making his wager, with the hidden premise that there was only one hypothesis to bet on: ‘religion’. In fact, there are an infinity. Similarly there are infinitely many model sets which are compatible with the observations, in Gavin’s sense. Some forecast great warming, some cooling, some moderate warming or cooling. All they have to have in common is a large enough SD.

He has had to destroy the hypothesis in order to save it. All you have to do to blow it up is construct another model showing ultimate cooling with a large SD, and it will be just as compatible with observations. Stages in the unravelling of a scientific hypothesis. It takes time, but it’s happening.

steven mosher (Comment#3399)

fred, nice application of Pascal’s wager! On other occasions I have explained to people that the precautionary principle is also an extension of the wager. They get really mad.

On another note I raised the issue of “down selecting” GCMs with Gavin. That is, only using those models that hindcast well, when doing forecasts. He did not find the idea compelling. In short, in the IPCC world all models are treated equally. The bad hindcasters get to forecast and the best get to forecast. Further, the number of realizations or runs each model performs is very small. Some only do 1 run, most do 3 or so, ModelE submitted 5. It’s quite a patchwork. Ideally, there would be 1 GCM, the best hindcaster. That model would be distributed to scientists. They would be charged with improving it. Those improvements would have to be validated and then that new model would become the standard. And more computer resources would be dedicated to monte and his carlo.

anna v (Comment#3401)

3388, Lucia:

“And so… it’s a bit much to insist that we must believe the average of models is correct for anything in particular before it’s ever been proven!”

I have been saying over at CA that the average of IPCC models cannot be correct for the following mathematical reasons:

It is a well known effect that setting up the coupled differential equations whose solutions control the weather development is hopeless. Figure 4 of this link http://www.globalchange.umich......echanisms/ illustrates how coupled and non linear any system of differential equations for weather will be.

We know though that weather prediction programs work with fairly good success, and we know that there are good scientists designing the models behind the programs. How is it done? By in effect accepting that the solutions are linear, or possibly quadratic in the variables: i.e. the first terms in a perturbative series expansion of the unknown real solutions, from given initial values. This works for a certain spread of values, because it is very unusual for a solution not to have a linear or a quadratic term in expansion. It cannot work for a large range of the “stepping” through the variables to predict the future because the real solution might be highly divergent after the first few terms, and the further one extrapolates the more the probability of hitting a divergence. That is why weather predictions can sometimes be completely off, and certainly do not presume to go over a few weeks.

Now, weather models have been turned into climate models by using average values for highly turbulent behaviors like cloud formations and oceanic heat transfers over large grids of 100 or 200 kilometers and time slices of twenty or thirty minutes. Taking the mean values for many of the variables is the same as taking the first order term of the real solutions, so the constraints of the usefulness of the weather modeling are not overcome by calling it a climate model, and the fact that the stepping goes orders of magnitude over the usual weather prediction times guarantees that divergences from the linear approximations are inevitable. It makes little sense to average over all these unknown quantities.
Look again at figure 4 in the link above for the number of variables that should enter in the true solution.

I suspect if the time stepping was made much coarser, which means all the input values etc will follow suit, one could use these models for a thousand steps or so, i.e. take a month instead of twenty minutes to step through time, and have as good a success record for 100 years as weather for the next ten days; but I may be wrong here. There may be intrinsic time steps (cloud formation, heat transfer) that cannot be ignored.

Going back to fig 4 of the link, I have been wondering if an analogue computer climate model would be more successful in modeling the climate. By construction analogue computers solve coupled differential equations fully.

lucia (Comment#3402)

Steve–

That is, only using those models that hindcast well, when doing forecasts. He did not find the idea compelling.

I’m guessing insisting on good hindcasting would create a slippery slope. Once you start saying you’ll pick only models that hindcast surface temperature well, why wouldn’t you start to insist on additional criteria:

1) The model must get the actual average GMST temperature for a decade during a “calm” period (say 1930) to match the earth’s measured temperature within 1K. (I don’t mean the anomaly– I mean the actual. I pick 1K because the dumb energy balance treating the earth as a single isothermal lump with no GHG’s at all gets within 33K. Simple 1-d analyses of the atmosphere get you much closer than 1K. So, if we are to assume predictions that can’t be verified are true for the purpose of planning, I would think a good GCM should get within 1K.)

2) The model must get the average day/night variation during the 30s within 1K. Then, while we’re at it, throw in the 50s!

3) The model must get the peak summer/mid winter variations in every latitude band within 1K. (I’ll call knowing the ‘peak summer temperature’ at my latitude the full month of July — or something like that.)

4) The model must get the temperature difference between the tropopause and the earth’s surface at the equator to within some value.

If the IPCC wanted to do so, they could start forcing the modelers to quantify “good agreement”. If they wanted, they could even come up with a 5 point metric to decree which model matches the earth’s properties ‘best’. In principle, some masters student should be able to go through all the IPCC runs, post process all that gridded data, and create a table containing these metrics.

Then, instead of vague prose telling us the models are “good” with no quantification, we could quickly read this information. Though predictive ability would still be unproven, at least we’d have a quantified evaluation of hindcast skill available to the public. Right now, all the public gets is the squishy assurances backed by claims to authority!

Dan Hughes (Comment#3404)

I’m sure that everyone is tired of hearing my favorite issues, but here goes again.

(1) How do mathematical models, along with the necessary associated numerical solution methods and computer code applications, that do not contain nearly complete descriptions of, and do not numerically resolve both spatially and temporally, physical phenomena and processes important to weather produce weather noise? Isn’t this approach exactly the same as saying that numerical solutions of the Navier-Stokes equations at temporal and spatial scales that have not a single possibility of resolving turbulence and yet the results ‘look’ like turbulence are valid solutions of the equation system? [Navier-Stokes used only as a specific model; any model can be used as an illustration.]

(2) It has yet to be determined that the results of GCM calculations that have no theoretical foundation other than the visual/eyeball appearance of ‘chaotic response’ are in fact results that correspond to a mathematical system that should produce ‘chaotic response’. That weather is chaotic and climate is not, but individual trajectories from a GCM calculation are chaotic, is an untested hypothesis. In fact there are three untested hypotheses listed there. That all the various equation systems of PDEs plus ODEs plus algebraic parameterizations used in GCMs are mathematical systems that produce chaotic response has yet to be determined.

(3) Verification of the coding of the numerical solution methods and Verification of the actual real-world-application order of the solutions of the discrete approximations has yet to be determined for any GCM application. The effects of the stopping criteria used in any iterative approaches in the solution methods have yet to be determined.

(4) It is a known and published fact (in Peer-Reviewed Certified Journals, even) that ‘solutions’ of the discrete approximations used in GCM calculations are not converged either spatially or temporally. If you insist that this is due to chaos and turbulence, see (2) above.

My working hypothesis, in the absence of demonstrations that these issues are incorrect, is that the calculated results from GCM applications are numerical noise unrelated to the continuous equations and most certainly unrelated to weather, climate, or the physical world.

fred (Comment#3405)

SM -

Yes indeed, the precautionary principle is another of these crazed convoluted reasonings which tries to explain why we should do something with no rational justification, by pretending that there is only one alternative when there are many. If there is only the slightest chance that interrupting the increase of CO2 may produce catastrophic cooling, clearly we should do everything in our power to avoid it, the costs of getting it wrong are so great, that no matter how small the probability, we should not risk it. In fact, if there is only the slightest chance that standing on our heads a half hour every day could avoid eternal damnation, we should do that too. We should also eat both more and less dairy produce in order to avoid the very small probability that doing either one could raise our risk of heart disease.

The usual response to this is to abandon the Precautionary Principle, and say that the above argument is invalid because actually there are good scientific reasons….. In which case, we did not need the PP. So, in an argument which must remind us of Hume’s reasoning on miracles, if you need the PP because of the absence of any other support, it will by virtue of that fact be insufficient.

anna v (Comment#3406)

3386,Alan S. Blue June 16th, 2008 at 5:35 pm

I cannot see the ensemble of IPCC models as a Monte Carlo project.

A monte carlo program is a tool of integration when there are many variables. The IPCC has many variables, and that is where the comparison ends.

The values of the variables in a monte carlo are generated randomly according to error distributions and equations that constrain them. The longer the run, i.e. the more random numbers thrown, the more accurate the integration, and distributions are the outputs. New runs are additive, according to statistics of course, so there is no meaning in averaging distributions, just run more to get a better accuracy.

The IPCC model ensembles are not the same model being run many times. They are different models with different initial conditions and different internal assumptions within a general framework. As far as I know no variables are generated randomly.

JM (Comment#3407)

Lucia

Slightly off topic, but I just spotted your response to a comment of mine over at New Matilda so I thought I’d point out my riposte in case you haven’t seen it (pasted below). Relevant to this discussion I think.

————————

Lucia

I’ve read tamino’s post and your characterization of it as “there was a time when Tamino also thought that roughly 7 years is sufficiently long” is directly contradicted by two of his statements in the same post:

“So we can tell, pretty much by looking, that a few years aren’t going to give us a reliable trend (we can tell by running the numbers as well).”

and

“Note that the time span is so short that these results are far less precise than the 30-year trend”

I don’t think you can fairly say that tamino has ever thought 7 years was adequate. He was simply making a rhetorical point that - accepting a short time scale for the sake of argument - warming was still apparent from the popularly cherry-picked starting year of 1998 (and also the less popular one of 2000) although perhaps not your personal favorite of 2001.

Tamino offers you no support.

Secondly, since you offered to back your views with a bet, but refused me when I took you up on it, I think we can conclude you don’t have much confidence in your own analysis.

Thirdly, you’ve made this claim several times: “the IPCC literature survey indicates that some scientists claim there is a 0.1C peak to trough effect of the 11 year solar cycle”.

Now I’ve scoured the IPCC reports and I can’t find this statement, and I’ve asked you for a pointer to it over at your site but you didn’t respond.

Could you provide a reference for this? Because I’m very dubious about it, either you’ve misread something or there is a misprint.

Total Solar Irradiance (TSI) varies peak-to-trough over the sunspot cycle by about 0.1 percent but does not cause a corresponding 0.1C temperature fluctuation in the earth’s climate.

(If it did, the 7% annual variation in incident radiation as the earth moves in its elliptical orbit would cause a 7C variation in average global temperature over the year - something which just doesn’t happen).

So. Are you able to provide a reference for your repeated claim?

lucia (Comment#3408)

JM–

Your response has a first, second and third.

In response to your “firstly”:
Could you do us both a favor? When ‘rebutting’ my statements, it might be best to include the end of the sentence:
“It appears there was a time when Tamino also thought that roughly 7 years is sufficiently long to perform a hypothesis test.”

I know you didn’t mean to do this, but after lopping off the end of my sentence, you rebutted a sentence that ends:
“sufficiently long to determine the trend precisely.”

I have never claimed one could determine the trend precisely; I never suggested Tamino claimed so either. So, rebutting that was rather pointless.

I claimed Tamino said it was sufficiently long to perform a hypothesis test. He made this claim, and was correct when he made it. I have always claimed it is possible to perform this hypothesis test despite the fact that one can’t get a precise estimate of the trend. This is also what Tamino says after all the bits you quote:

Despite the brevity of the time span, there’s still a statistically significant warming trend in both data sets. GISTEMP indicates warming at a rate of 0.028 +/- 0.019 deg.C/yr, HadCRU indicates 0.018 +/- 0.016 deg.C/yr.

In other words, the short period is:
a) too short to get an accurate trend but
b) long enough to test a hypothesis and exclude it if it falls outside the very large bounds.

Tamino tests 0C/century– saying this can be done even though there is large uncertainty in the trend. I test 2C/century, which I say I can do even though there is a large uncertainty in the data.
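The logic of (a) and (b) can be sketched in a few lines of Python. This is only an illustration: it assumes the quoted ± values are roughly 95% half-widths, the function name `excluded` is hypothetical, and the only numbers used are the two trends quoted from Tamino above.

```python
def excluded(trend, half_width, hypothesis):
    """Return True if the hypothesised trend lies outside trend +/- half_width."""
    return abs(hypothesis - trend) > half_width

# Trends quoted above, in deg C/yr (assumed here to be ~95% intervals).
quoted = {"GISTEMP": (0.028, 0.019), "HadCRU": (0.018, 0.016)}

for name, (trend, half_width) in quoted.items():
    # The interval is wide (imprecise trend), yet a hypothesis can still fall outside it.
    print(name, "excludes 0 C/yr:", excluded(trend, half_width, 0.0))
```

The same check applies to any hypothesised value; only the interval and the value being tested change.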

In response to “secondly”:

Secondly, since you offered to back your views with a bet, but refused me when I took you up on it, I think we can conclude you don’t have much confidence in your own analysis.

JM: As I said: I don’t ordinarily bet, but I might under specific circumstances that could result in my finding the bet a fun diversion and which would make sure hordes of nearly anonymous commenters don’t step forward proposing screwy bets that are poorly aligned with what I have claimed to be true.

You persistently formulate bets that insist I bet *against* what I state will happen, and/or which are so poorly worded no one could possibly figure out what outcome would mean a particular person “won”. You also propose bets for things that already happened in the past.

I also will never bet with a nearly anonymous commenter who won’t use his own name, won’t put his own reputation on the line etc. If you start a blog under a real traceable name and generate traffic, we can negotiate the specific wording and conditions of the bets. Among these are:

a) All bets must be based on events that happen in the future,
b) I will only bet chocolate chip cookies, baked by the person placing the bets. The cookies must be mailed to a third party when the bet is placed, and the third party will deliver the cookies to the winner afterwards.
c) Both parties must understand what the heck the bet actually is, and which outcome means they win.
d) The other party must be a blogger with some traffic and who is making their own claims. This will permit us to negotiate bets testing their claims against mine.
e) Other conditions of exchange will apply and are to be negotiated by blog post so readers can see what stakes both parties are or are not willing to bet.

Obviously, you can and will believe whatever you wish to believe based on my unwillingness to agree to bet someone who uses only the initials “JM”, and proposes nearly incomprehensibly worded bets that seem to be based on utter misunderstanding of what I say.

In response to your thirdly:

Thirdly, you’ve made this claim several times: “the IPCC literature survey indicates that some scientists claim there is a 0.1C peak to trough effect of the 11 year solar cycle”.

Now I’ve scoured the IPCC reports and I can’t find this statement, and I’ve asked you for a pointer to it over at your site but you didn’t respond.

I’m afraid you tend to post a nearly uncountable number of comments arguing about various things and I must have missed this one. Here’s the quote and link:

In a comment by JohnV

JM: You keep insisting that you both read all my posts thoroughly, and have scoured the IPCC literature. Here is the comment JohnV left at my post:

John V April 22nd, 2008 at 3:18 pm

lucia,

Section 9.2.2.1 of IPCC AR4 WG1 discusses the effect of the solar cycle on temperature:

“A number of independent analyses have identified tropospheric changes that appear to be associated with the solar cycle (van Loon and Shea, 2000; Gleisner and Thejll, 2003; Haigh, 2003; White et al., 2003; Coughlin and Tung, 2004; Labitzke, 2004; Crooks and Gray, 2005), suggesting an overall warmer and moister troposphere during solar maximum. The peak-to-trough amplitude of the response to the solar cycle globally is estimated to be approximately 0.1°C near the surface. Such variations over the 11-year solar cycle make it necessary to use several decades of data in detection and attribution studies.”

http://www.ipcc.ch/pdf/assessm.....apter9.pdf
(I tried to include the entire pertinent paragraph — I apologize in advance if there is missing text that contradicts the excerpt above).

You will also find a link to this in the article: http://rankexploits.com/musing.....ification/.

You’ll also find all these references in Chapter 9 of the AR4:
IPCC AR4 Solar Cycle References:

van Loon and Shea, 2000:
van Loon, H., and D.J. Shea, 2000: The global 11-year solar signal in July-August. Geophys. Res. Lett., 27, 2965–2968

Gleisner and Thejll, 2003:
Gleisner, H., and P. Thejll, 2003: Patterns of tropospheric response to solar variability. Geophys. Res. Lett., 30, 44–47.

Haigh, 2003:
Haigh, J.D., 2003: The effects of solar variability on the Earth’s climate. Philos. Trans. R. Soc. London Ser. A, 361, 95–111.

White et al., 2003:
White, W.B., M.D. Dettinger, and D.R. Cayan, 2003: Sources of global warming of the upper ocean on decadal period scales. J. Geophys. Res., 108, 3248, doi:10.1029/2002JC001396.

Coughlin and Tung, 2004:
Coughlin, K., and K.K. Tung, 2004: Eleven-year solar cycle signal throughout the lower atmosphere. J. Geophys. Res., 109, D21105, doi:10.1029/2004JD004873.

Labitzke, 2004:
Labitzke, K., 2004: On the signal of the 11-year sunspot cycle in the stratosphere and its modulation by the quasi-biennial oscillation. J. Atmos. Solar Terr. Phys., 66, 1151–1157.

Crooks and Gray, 2005:
Crooks, S.A., and L.J. Gray, 2005: Characterization of the 11-year solar signal using a multiple regression analysis of the ERA-40 dataset. J. Clim., 18(7), 996–1015.

Paul Linsay (Comment#3409)

This whole approach of averaging models together is total nonsense. Let me illustrate it as follows. The State of NY puts out a bid to build a bridge across the Hudson River. Two companies respond, Hansen Engineering and Schmidt Construction. Both proposals have nice architectural drawings, the budgets are reasonable, and the construction schedules are good. But my engineers come back and tell me that the suspension cables on the Hansen Engineering design are going to start breaking after six months’ use. They also tell me that the Schmidt Construction design will lead to a severely crumbling roadbed within a year. Now we apply climate science logic. The average strain on the suspension cables of both designs is well within the tensile strength of steel cables. The average buckling forces on the roadbed of both designs are also well within the strength of concrete. Therefore, they’re both right! I have to choose one of the designs for construction. OK…

The real problem is the Orwellian use of the word “experiment” applied to runs of the climate models. They are no such thing. An experiment measures properties of the physical world, e.g., a thermometer measures the temperature of a pot of water. The models don’t do that. They are guesses about how the climate works. It’s possible for a pair of models to make the same assumptions about the physical world but have different internal realizations of the science. We’d know this because they would both produce identical numerical results for identical situations. However, the IPCC ensemble of models don’t agree among themselves. We can conclude from this that they either make different assumptions about the science or have incorrect internals, possibly both (see Dan Hughes). This means that at best, only one model is correct and the rest are wrong for whatever reason. The average of a correct model with a bunch of incorrect models is meaningless.

Steve Mosher at 3399 is exactly correct in how the models should be treated, throw out the bad ones and improve the best one.

lucia (Comment#3410)

Paul:

Steve Mosher at 3399 is exactly correct in how the models should be treated, throw out the bad ones and improve the best one.

Steven is describing what is, to a large extent, the engineering approach.

The caveat is: approximate models get used. However, everyone admits the level of approximation– and generally in quantitative terms. Quantifying what we mean by “good agreement” is very important in engineering. So is comparing predictions to out of sample data. Both let you decide on safety factors, tolerances, figure out what can and can’t be done etc.

I sympathize that climate modelers don’t have much out of sample data to test against. But…. That’s no excuse for just relabeling “in sample” data “out of sample”!

Arthur Smith (Comment#3411)

JM, you claim:

the 7% annual variation in incident radiation as the earth moves in its elliptical orbit would cause a 7C variation in average global temperature over the year - something which just doesn’t happen

This would be true if the response for solar variations was time-independent, or at least the full response was much shorter than 1 year. Some of the response is fast, but some of it is also quite slow - at least the 7-8 years we’ve been talking about on other threads here. So you would expect maybe half or less of the response within the annual cycle.

Second: it’s actually rather hard to find data on absolute global average temperature. What you will usually see is the temperature anomaly (GISS, Hadley, UAH, RSS, etc. all plot this) which is the difference between the measured temperature and a reference average for that month. You cannot derive the global average annual temperature cycle from these anomaly series because the main component has been removed by that differencing process, that’s what the “anomaly” means.
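To make the differencing step concrete, here is a minimal Python sketch of how an anomaly series is formed. The temperatures are synthetic, made-up numbers (a seasonal cycle plus a slow warming), and `monthly_anomalies` is a hypothetical helper, not any group's actual processing code.

```python
import numpy as np

def monthly_anomalies(temps):
    """temps: array of shape (n_years, 12) holding absolute monthly means.
    Returns each value minus the average for that calendar month."""
    temps = np.asarray(temps, dtype=float)
    climatology = temps.mean(axis=0)  # one reference value per calendar month
    return temps - climatology

# Synthetic data: an annual cycle plus a warming of 0.02 C per year.
months = np.arange(12)
cycle = 14.0 + 1.9 * np.sin(2 * np.pi * months / 12)
years = np.arange(10)[:, None]
temps = cycle + 0.02 * years

anoms = monthly_anomalies(temps)
# Within any one year the anomalies are flat: the annual cycle has been differenced away.
print(anoms.std(axis=1).max())
```

The slow warming survives in the anomalies, but the seasonal swing cannot be recovered from them without the reference climatology.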

According to this page: http://www.spaceweather.com/glossary/aphelion.html
(note the quote from Roy Spencer!) the actual variation in global mean temperature between perihelion and aphelion for Earth is 2.3 degrees C, so not too different from the order of magnitude you would expect. The surprise is - it’s warmer at aphelion (when Earth is farther away). This could be partly a lagged response phase effect, but the explanation on that page makes sense too - aphelion happens during northern hemisphere summer, and the northern hemisphere has a much larger fraction of continental land area, so lower heat capacity, so more warming despite the lower insolation.

Anyway, the magnitude of the effect is pretty clearly consistent with what Lucia talked about, and the solar cycle amplitude of about 0.1 C is indeed referred to by IPCC.

steven mosher (Comment#3412)

re lucia 3402. A while back on RC I suggested a listing of metrics GCM should be tested on.

My reasoning was this. The metrics must be connected to the harm. So, 3 metrics came to mind.

1. having GSMT Skill. ( versus a naive straight-line historically based forecast)
2. having Sea level Skill ( blah blah)
3. having precipitation Skill.

basically, we are worried about warmer temps, higher seas, and floods and droughts. So those metrics should dominate the selection of the best models.

Now, I’m not sure how to weight these. I know looking at some IPCC stuff that the skill level of models differs in these regards. But having 19 different models is just an engineering abomination and huge waste of intelligence, and undermines the credibility of the science rather than enhancing it.
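One common way to put a number on "skill versus a naive forecast" is a mean-squared-error skill score. The Python sketch below assumes that definition; it is not necessarily the metric the comment has in mind, and the function name and sample values are made up for illustration.

```python
import numpy as np

def skill_score(observed, forecast, reference):
    """MSE skill score: 1 is a perfect forecast, 0 is no better than the
    reference (e.g. a naive straight-line extrapolation), negative is worse."""
    observed = np.asarray(observed, dtype=float)
    mse_forecast = np.mean((np.asarray(forecast, dtype=float) - observed) ** 2)
    mse_reference = np.mean((np.asarray(reference, dtype=float) - observed) ** 2)
    return 1.0 - mse_forecast / mse_reference

# Made-up numbers, purely to show the call.
observed = [0.10, 0.25, 0.20, 0.35]
naive_line = [0.05, 0.10, 0.15, 0.20]   # hypothetical straight-line forecast
model_run = [0.12, 0.20, 0.25, 0.30]    # hypothetical model output
print(skill_score(observed, model_run, naive_line))
```

The same function could be applied separately to temperature, sea level and precipitation; how to weight the three scores is left open, as in the comment.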

steven mosher (Comment#3413)

re fred. 3405. ha, It’s very rare to find someone versed in Hume on Miracles. Nicely argued.

steven mosher (Comment#3414)

JM. re 3407. You ask Lucia for a reference? go ask Phil Jones for data.

Here is what I don’t get. I don’t get why people like JM demand a reference for a blog post, but don’t demand data for a scientific study.

So, JM. should Dr. Phil Jones release his source data? this is the data that determines GSMT. and if a scientist doesn’t release his data, what to make of that?

Here is what I suppose. I suppose that if we can ask bloggers to link to their source, if we can ask authors to footnote, we can ask scientists working on a problem that threatens all of humanity to post their source data and methods. huh, JM? ya think.

This is a tenet of Lukewarmers. free the data. free the code.

steven mosher (Comment#3415)

anna v 3406.

it’s never been exactly clear what is different between ensemble runs. It seems clear that it’s not a Monte Carlo on initial conditions or parameterizations. But beyond that your guess is as good as mine, ok maybe a bit better. I’m feeling generous.

Atmoz (Comment#3416)

Then, instead of vague prose telling us the models are “good”, with no quantification we could quickly read this information. Though predictive ability would still be unproven, at least we’d have a quantified evaluation of hindcast skill available to the public. Right now, all the public gets is the squishy assurances backed by claims to authority!

A simple Google search found this in less than 2 minutes. Skip to figure 23 on page 35 for the Taylor diagrams.

Schmidt, G.A., R. Ruedy, J.E. Hansen, I. Aleinov, N. Bell, M. Bauer, S. Bauer, B. Cairns, V. Canuto, Y. Cheng, A. Del Genio, G. Faluvegi, A.D. Friend, T.M. Hall, Y. Hu, M. Kelley, N.Y. Kiang, D. Koch, A.A. Lacis, J. Lerner, K.K. Lo, R.L. Miller, L. Nazarenko, V. Oinas, Ja. Perlwitz, Ju. Perlwitz, D. Rind, A. Romanou, G.L. Russell, Mki. Sato, D.T. Shindell, P.H. Stone, S. Sun, N. Tausnev, D. Thresher, and M.-S. Yao, 2006: Present day atmospheric simulations using GISS ModelE: Comparison to in-situ, satellite and reanalysis data. J. Climate, 19, 153-192, doi:10.1175/JCLI3612.1.

Paul Linsay (Comment#3417)

Lucia,

Steven is describing what is, to a large extent, the engineering approach.

Interestingly enough, that’s how science is done too. I watched the Standard Model of subatomic particles being developed. There were lots of ideas, some of which were true. As the data came in they were winnowed out and in the end only one was left standing. Some of the competing ideas were very strongly defended and lasted a long time. But they ultimately couldn’t explain some new and crucial experiments and died, overnight.

lucia (Comment#3420)

Atmoz–
Yep! You got me — the TAR did do a snippet of quantification rather than none!

That said– the four things I actually discuss aren’t discussed quantitatively in the TAR. :) But, they do show that particular graph for three metrics, which is to their credit.

I mostly read the AR4, and WG1 of the IPCC decided to omit that level of quantitative information from the AR4. Or, if they show it, I don’t see it in chapter 8, where I would expect it. Instead, they show figure 8.11 (Chapter 8, page 619).

It’s interesting to compare figure 8.11 to the Taylor Diagram in the TAR. First– it has less information by design. The rms are compared, but not the correlation. Also, where the TAR uses distinct figures to inform readers which models are crummy and which are better, the AR4 doesn’t distinguish which models are which in figure 8.11. (One might ask: Which is the wild hair model that has nearly 100% RMS error on precipitation? Is it the same one that’s not so hot on surface temperatures? Or a different one?)
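For readers who have not met a Taylor diagram, the Python sketch below computes the quantities one summarizes: the two standard deviations, the correlation, and the centered RMS difference. The fields are random synthetic numbers and `taylor_stats` is a hypothetical helper; nothing here reproduces the IPCC's or anyone else's actual calculation.

```python
import numpy as np

def taylor_stats(model, obs):
    """Return (sigma_model, sigma_obs, correlation, centered RMS difference)."""
    model = np.asarray(model, dtype=float)
    obs = np.asarray(obs, dtype=float)
    sigma_m = model.std()
    sigma_o = obs.std()
    corr = np.corrcoef(model, obs)[0, 1]
    # Remove each field's mean before differencing, so only the pattern is compared.
    crms = np.sqrt(np.mean(((model - model.mean()) - (obs - obs.mean())) ** 2))
    return sigma_m, sigma_o, corr, crms

rng = np.random.default_rng(0)
obs = rng.normal(size=200)                      # synthetic "observed" field
model = 0.8 * obs + 0.5 * rng.normal(size=200)  # synthetic "model" field

sigma_m, sigma_o, corr, crms = taylor_stats(model, obs)
# The diagram works because these obey a law-of-cosines relation:
# crms^2 = sigma_m^2 + sigma_o^2 - 2 * sigma_m * sigma_o * corr
print(crms**2, sigma_m**2 + sigma_o**2 - 2 * sigma_m * sigma_o * corr)
```

An RMS-only comparison keeps one of these numbers and drops the correlation, which is the loss of information noted above.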

In my opinion, the AR4 is very qualitative in its text and figures.

As for the Schmidt paper– I was discussing what the IPCC shows. Certainly, there are individual papers out there that discuss more– and on a variety of topics. But, as it happens, that paper isn’t really a discussion of the basis for the IPCC and it’s not even particularly quantitative. Mostly, there are colored figures we compare by eye. There are a few Taylor diagrams (Figure 23) for things modelers consider important. They are important– I don’t want to suggest they aren’t. But regardless of what climatologists discuss when comparing their various models to each other, the fact that one can find a paper on Model E is not the same as the IPCC compiling relevant information for readers to understand how the various models compare to each other or to metrics the IPCC is specifically discussing. The IPCC discussions are mostly qualitative.

In any case, the Schmidt paper mostly does what I find not particularly useful as a consumer of information to guide what I believe about IPCC projections. (Which is fine– it’s not meant to be a paper for the general public.) That Schmidt Model E NASA paper is what it is: a journal article explaining model results for 1979. While it is laudable that NASA describes that in detail, it doesn’t tell us much about the specific results relied on by the IPCC. The Taylor diagrams aren’t even the things the IPCC included in the TAR Taylor diagrams.

So, the result is that even after finding that Schmidt paper, a reader would still need to figure out which of the various configurations of Model E were used in the IPCC projections. Then they would need to hunt down the runs on which the IPCC relied and do all their own comparisons to the metrics the IPCC discusses. This is no criticism of the Schmidt paper– which is fine. But, with regard to the IPCC providing useful information, the IPCC would do well to compile comparisons, make more Taylor diagrams and write more quantitative prose. The fact that the public is left with a research project to decide what they think is a problem for modelers and the IPCC with regard to communicating the variety of predictions from models.

anna v (Comment#3421)

Arthur Smith June 17th, 2008 at 1:16 pm

A good link that shows temperature day by day from satellites.

One can see the seasonal variation and the dominance of the northern cycle to the global, and compare with previous years and averages to see the anomaly. June 2008 had climbed over June 2007 but is now leveling off to cross down again. It is like watching horses :).

anna v (Comment#3422)

comment 3415

steven mosher

I found the following presentation useful for visualizing the insides of GCMs

It seems to be a numerical solution step by step fulfilling imposed boundary conditions. I am then guessing that the differences in the models consist of how they use the many parameters that control the equations on the boundaries to get a fit to some data. Like cooking? more of baking powder less of cinnamon etc.

As you say, it is a guess, :) unless one is willing to become an accredited climatologist.

cohenite (Comment#3424)

Koutsoyiannis has shown the shortcomings, well complete failure, of the models to predict temp and rainfall from 1990-2008; have the current models been used to hindcast over the same period?

steven mosher (Comment#3425)

atmoz thanks for the link to the taylor diagram. i had seen it in a couple reports on the IPCC site for GCM data, and could not figure it out.

JM (Comment#3426)

Lucia (3408) I’ve responded briefly over at New Matilda. Thanks for the references, I’ll get back to you on them.

Arthur (3411). Thanks for that, it’s very informative and encompasses something I hadn’t considered, please let me review that before I respond.

JM (Comment#3427)

A cross-post of my New Matilda comment:

———-

Lucia

“I claimed Tamino said it was sufficiently long to perform a hypothesis test.”

Well of course it is - mathematically. The question is whether it is valid in physical terms, which it isn’t. I can do a hypothesis test on “every coin has two heads” and mathematically test it with a single toss. It doesn’t mean the result has any physical meaning. The error bars are too large to draw conclusions (which is Tamino’s point re. climate data over 7 years). You are technically correct, but the conclusion is meaningless in real world terms.

As I said Tamino was clearly making a rhetorical point, and your characterization of it is simply solipsistic, ie. a claim so weak that it cannot be realistically falsified.

‘I don’t ordinarily bet, but I might under specific circumstances that could result in my finding the bet a fun diversion ‘

But … this was a bet - against a 30 year dataset - that you proposed. I took you up on it, you squirmed and eventually wriggled off the hook, despite all my concessions and attempts to make it more palatable to you.

Your bet, not mine. You refused it. Even after it was modified heavily in your favor. I think the conclusion that you lack confidence in your own analysis is obvious.

Thanks for the references re. 0.1C. I’ll follow up and get back to you.

lucia (Comment#3429)

JM–
I responded over there. I’m mystified why you want to waste time posting at two places.

Two points: A) I never proposed a bet with you or anyone like you. The proposal was always restricted to people with certain characteristics. B) The bet I proposed was very specific, and none of your proposals matched.

You can read the characteristics highlighted here:

(Since the various bloggers seem to be all for demonstrating their confidence in various predictions by offering bets, maybe if the modelers are truly confident the central tendency over 30 years is 2C/century, they’ll take an even money bet. If the OLS trend over the first 30 years of this millennium is greater than 2C/century, they win. If it’s less than 2C/century, they lose. If 2C/century is really the central tendency, and 7 year trends far off the mark are just insignificant, commonly occurring blips, that even money bet ought to be attractive, right? )
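For concreteness, here is a rough Python sketch of how that bet could be settled once the data exist. The temperature series below is synthetic with a known slope, and the names `ols_trend_per_century` and `modelers_win` are hypothetical; only the 2C/century threshold comes from the quoted text.

```python
import numpy as np

def ols_trend_per_century(years, temps):
    """Ordinary least squares slope of temps against time, in deg C per century."""
    years = np.asarray(years, dtype=float)
    temps = np.asarray(temps, dtype=float)
    # Centering the time axis keeps the fit well conditioned; the slope is unchanged.
    slope, _intercept = np.polyfit(years - years.mean(), temps, 1)
    return slope * 100.0

def modelers_win(trend_per_century, threshold=2.0):
    """Even-money bet as quoted: modelers win if the trend exceeds 2C/century."""
    return trend_per_century > threshold

# Synthetic monthly series for 2001-2030 with a built-in trend of 1.5 C/century.
years = 2001 + np.arange(360) / 12.0
temps = 0.4 + 0.015 * (years - 2001)
trend = ols_trend_per_century(years, temps)
print(round(trend, 3), modelers_win(trend))
```

Real data would add noise around the line, which is exactly why the bet is stated over 30 years rather than 7 or 8.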

The post did not indicate what is to be bet– as it happens, I’d bet chocolate chip cookies rather than money. But, possibly if an appropriate blogger came around promptly, (before new data arrives) I might have negotiated something.

But you, an anonymous person with the initials “JM”, popped in and immediately decided to propose entirely different bets. Your first suggestion was:

Why don’t you recast it to represent the reality of your position. You win if the slope is less than 2 - 1.4 = 0.6C/century, they win otherwise. ie. you win if the IPCC is “falsified”, they win otherwise.

Note: “I win if the slope is less than 2.0 C/century” is not the same as “I win if the slope is less than 0.6 C/century”.

I turned this down. You proceeded to propose a scattershot of bets worded differently from the one described in the post. Most seemed to have conditions pulled out of your… uhmm… fevered imagination. And, you continually announced things like “we are agreed then”, after posting your own proposal.

I’m not going to discuss this bet further with you because it’s pointless.

A new view on GISS data, per Lucia « Watts Up With That? (Pingback#3570)

[...] This is discussed Distribution of 8 Year OLS Trends: What do the data say? [...]

Surface Temperatures Trends Through May: Month 89 and counting! | The Blackboard (Pingback#3583)

[...] Ringo: There are a lot of… Distribution of 8 … [...]

Gavin Schmidt Corrects for ENSO: IPCC Projections Still Falsify | The Blackboard (Pingback#3969)

[...] In one of his previous discussions of the short term trends, Gavin suggested one could not falsify the IPCC projection of 2C/century using data beginning in 2000 because a group of models with different parameterizations and different initial conditions gave very large standard errors for the best-fit trend over 8 years. The standard error he suggested is larger than displayed by the entire thermometer record on the real earth! So, it is likely his variance is an artifact of ensemble averaging over the physical approximation equivalent of several different small planets, each intended to be “like” earth, but none precisely identical to the real earth. I show that comparison here. [...]

Hypothesis test for 2C/century: now with Monte Carlo! | The Blackboard (Pingback#4233)

[...] noise based on models? Because a ) the model “weather noise” for 8 year trends is larger than seen in the thermometer record, including periods with volcanic eruptions and large measurement errors, b) the properties of that [...]
