What is the “true weather noise”?
What is the “true earth” weather variability? Is it the full range of variability over all possible climate models? Does it matter for testing?
I am asking this because the answer has relevance to the questions I discussed yesterday in Are the IPCC AR4 predictions falsified?. I showed that indeed the IPCC AR4 predictions appear falsified and that Gavin’s method of testing their fidelity is inadequate. The reason is related to how we estimate the true variability of “weather noise” on the real earth, and the actual questions we ask about climate models.
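Yesterday, I emphasized that the question I am asking and attempting to answer is:
Does the IPCC AR4 forecast central tendency of 2 C/century fall within the range of trends consistent with the real earth?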
When I use the term “falsify”, I mean it in the sense that the answer to Q1 is “No, 2C/century central tendency forecast is not consistent with the trends observed on the real earth.”
In comments here and at other blogs (like Roger Pielke’s) many visitors ponder this different question:
The two questions share similar words, but they are different questions. They have different answers. I believe the answers to the two questions are:
- Answer to Q1: The temperature trends of the earth do fall within the extremely wide range exhibited by climate models used to create IPCC AR4 predictions. This is what Gavin demonstrated at Real Climate. I don’t dispute this and never have.
Gavin gave an answer to an important academic question many ask when trying to compare and improve models.
- Answer to Q2: The central tendency of 2 C/century predicted by the IPCC AR4 does not fall within the relatively wide range consistent with the trend we have witnessed on the real earth. In my opinion, now that projections are published and real data are available, this question has important policy implications.
Does it matter which question we ask?
Of course it does! My opinion is we should be asking both and making sure people understand the difference between the two questions.
Global warming is real; man is exerting an influence on the climate. For that reason, I would like to have some idea about the probable magnitude of warming. Models give a huge range of magnitudes in the trend over 100 years. I think it’s important to try to assess whether models are predicting high or low, and whether the central tendencies predicted by the IPCC AR4 appear to fall inside the uncertainty bounds consistent with observations of the earth. For this reason, we must ask question 2.
Of course, to improve any current batch of models, modelers must also ask question 1. As a general rule, in the infant stage of development, models often don’t even achieve the weaker skill indicated by a successful answer to the academic question. I’ve assumed GCMs are past that infant stage. So, I never seriously doubted the answer to question 1. I think it’s laudable for Gavin to continue to verify that the predictive ability of models has not fallen so low as to fail this less stringent test.
So, I think answering question 1 is fine, provided we never forget to also ask and answer question 2. Question 2 represents a more stringent test and is extremely important from a policy perspective.
Why are the answers to the two questions different in the IPCC case?
The reason the answers to these questions differ is simple: the appropriate uncertainty range to answer the academic question is larger than that required to answer the second question. This means the models predict a very, very, very wide range of trends around 2C/century.
In contrast, while the uncertainty intervals around the trend experienced on earth are large, the magnitude of these uncertainty intervals is proportional to the “true weather noise”. This magnitude is smaller than that for the full panoply of climate models for a simple reason: the “true weather noise” is weather noise on one specific “model”: the planet earth.
By comparison, the uncertainty intervals for predictions from a collection of models arise both from the “weather noise” in the model and from the scatter about the average behavior of an individual model.
Interestingly, if the models all get the correct magnitude of “weather noise” relative to true earth weather, it is trivial to show the variance in the population of trends across all models is
σ²models = σ²model-noise + σ²weather
where σ²model-noise is the variance in the ensemble average trend predicted by the full collection of models, and σ²weather is the variance due to “weather noise” in either one individual model run with one specific set of forcings or for the earth itself. (Running ensembles of the model and averaging is permitted; they simply must all be “the same” case.)
Because we are summing squares, we can see that σmodels, the variation in predictions across models or across forcing scenarios, is higher than that due to weather alone.
To determine whether or not 2C/century is consistent with observations of the recent earth’s temperature trends, we must use the smaller uncertainties associated with weather: σweather. Using the larger ones, σmodels, can result in unnecessarily frequent “failing to falsify” diagnoses of models that predicted incorrectly.
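The claim that model-to-model scatter and weather noise add as squares can be checked with a small simulation. This is only a sketch under stated assumptions: the function name `simulate_trends`, the Gaussian noise, the independence of the two noise sources, and the values 0.5 and 1.0 C/century are all hypothetical choices for illustration, not numbers taken from the IPCC models.

```python
import random
import statistics

def simulate_trends(n_models=20000, sigma_model_noise=0.5, sigma_weather=1.0,
                    central=2.0, seed=0):
    """Draw one trend (C/century) from each of many hypothetical models.

    Each model gets its own underlying trend (central value plus model noise).
    A single run of that model then adds independent weather noise on top.
    """
    rng = random.Random(seed)
    trends = []
    for _ in range(n_models):
        underlying = central + rng.gauss(0.0, sigma_model_noise)
        trends.append(underlying + rng.gauss(0.0, sigma_weather))
    return trends

trends = simulate_trends()
var_models = statistics.pvariance(trends)   # spread across the whole collection
expected = 0.5 ** 2 + 1.0 ** 2              # sigma_model_noise^2 + sigma_weather^2

print(f"simulated variance across models: {var_models:.3f}")
print(f"sum of the two variances:         {expected:.3f}")
```

Under these assumptions the spread across the collection is always wider than the weather-only spread, which is why the choice of uncertainty interval changes the outcome of the test.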
These sorts of false negative errors can have harmful policy consequences. (To resort to analogy: just as false negative results on a Pap test can result in untreated cervical cancer, false negatives in “falsifying” IPCC projections are bad. False positives are also bad. So, we’d like to be fair in both regards.)
How large is “model noise”?
But some may suggest this “climate model noise” is small compared to weather noise. So they might claim that σmodels is approximately equal to σweather.
But I must then ask: Why would anyone think the “climate model noise” (i.e. variability across models) is small? If it were small, why would climate modelers bother to use multi-model ensembles? If all models predicted identical averaged outcomes under identical forcing scenarios, one would simply run multiple ensembles of one model!
Maybe my “what if” holds no force for you. Then take a look at this figure, which I modified from chapter 10 of the WG1 submission to the AR4:
Examining this figure, ask yourself:
- Do the different models predict the same average underlying increase in temperature over 100 years? This is the period where differences between model predictions are not strongly affected by “weather noise”. So, presumably, the differences are due to “climate model noise”, i.e. different average results due to parameterizations used by different modeling teams.
- Do the different models predict the same average underlying increase in temperature over 30 years? While pondering that question, recall that 30 years has more “weather noise” than 100 years, but less than 7 years.
- There is a lot of overlap obscuring detail, but if we look at the edges, does it appear the models that predict larger increases over 30 years tend to predict higher increases over 7? And 100?
- Do the cases with the low trends over seven years appear to have 2C/century trends over 30 years?
So, in short, are those flat trends we see over 7 year periods consistent with models that predict 2C/century over 30 years? Or are the flat trends associated with lower rates of temperature increase over 30 years?
Or, to return to yesterday’s example: are individual Swedes usually taller than the average of Vietnamese, Maltese, Portuguese and Norwegians because Swedes tend to be tall?
To answer the question of whether the 2C/century prediction/projection by the IPCC falls within the 95% confidence intervals of the trend experienced on the real earth we need the uncertainty intervals for the true weather noise. Using the larger uncertainty intervals that describe the variability of the model predictions makes sense if we ask the more academic question: Does the earth’s temperature trend fall inside the trends for all possible models (which includes models that may not mimic the earth)?
Conclusion
Whether we are climate modelers or not, we all know which information we personally use to make decisions about which policies to support with regard to adaptation and mitigation. We all have the ability to decide which questions we wish to ask and have answered. In that regard, I am asking a specific category of questions. Other bloggers ask different questions.
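Our answers are different because we are asking different questions. With regard to statistical treatment, I perform my assessments using the uncertainty intervals based on the “true weather noise” of the earth. I do this because I am asking:
Does the IPCC AR4 forecast central tendency of 2 C/century fall within the range of trends consistent with the real earth?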
Others use larger uncertainty intervals based on both the “weather noise” and the “climate model noise”. They do this because they wish to ask a more academic question, which is important when working toward improving models. Those uncertainty bars are correct for a particular question. It’s just not the one I ask.
Written by lucia.



Comments
jmrSudbury (Comment#2839) May 15th, 2008 at 1:39 pm
Thank you for posting a large graph. Tired of squinting at small pixellated graphs, I went looking and found this link. The 5th frame has the graph on which Gavin (finally) drew error bars. The graph you have above is on frame six. I hope this helps you and your readers. — John M Reynolds
Martin Ringo (Comment#2853) May 15th, 2008 at 3:55 pm
Lucia,
Excellent point or distinction. There is the real world (= the “true model”) which we measure. Those measurements have some randomness, be it from the measurement errors or underlying randomness. We little humans make models of the real world — some simple, some complex — to explain parts of the real world. We take our models and make predictions or produce fitted series for the measured variables, e.g. global average temperature anomaly. If we have some methodological candor, we report the standard errors of our fitted or predicted lines. (Or if you like, the standard errors of the trends of our fitted lines.) That is our ESTIMATE of the real world noise. Those estimates will differ across models.
However, those estimates are different from the differences in the fitted lines or predictions or slopes, if you will. This is an important point. The differences in slopes are a measure of the uncertainty of our understanding, not the variation in temperature trends or such. Just to give you some numbers, I estimated 13 models of temperature trend for the post 2001 period for the models discussed in these pages and a few others (a pure Moving Average model and variations with GARCH(1,1) estimation, a technique often used for estimating volatility of financial returns and generally quite useful for time series where the variance of the underlying series may be changing as a function of time, a sort of autocorrelation of the variance). The average trend was 0.043 degrees C per DECADE with standard deviation amongst the 13 estimated trends of 0.077 degrees C per decade. If you use the same models for estimating the weather noise (expressed in the terms of a slope in degrees per decade), the value is 0.078.
So for time series models the variations of my 13 choices is about the same as the estimated weather noise (expressed in terms of slope). This also should give perspective on the 0.19 standard deviation of the climate models. Now to be fair, those models are making predictions and the variation might be expected to be a bit higher. But your “models predict a very, very, very wide range of trends” is maybe a couple “very”s too many, but it is the right idea.
If I get energetic, I will take the 13 models and estimate from 1980 to 2000 and then forecast for 2001-2008. I can then give you a standard deviation of the time series model forecasts which would be an “apples to apples” comparison with the 0.19 standard deviation.
lucia (Comment#2854) May 15th, 2008 at 4:02 pm
Martin– Estimates of uncertainties in the trends based on the models would be great! That’s what I’ve been telling John I wanted to do in my not particularly good way. But for the purpose of the 2001-2008 period, I was going to get a SWAG (scientific wild ass guess) based on the GISS model runs with solar only forcing! (Unfortunately, the versions online are averaged.)
Evidently, one can get the IPCC model runs that Gavin used on line– but you need to register. Is that what you got?
Tom Fiddaman (Comment#2855) May 15th, 2008 at 4:13 pm
The information content of 7 years of data is the same regardless of how the question is asked. So, unless you can make some compelling statement about the statistical power of a particular procedure used to answer one or the other, I don’t see how you can assert that one test is more stringent than the other. In fact, the second question is quite misleading to the lay public, because it makes a definitive-sounding statement (IPCC models falsified!) with very low confidence (which the press will ignore).
Gavin’s approach is the traditional approach to falsification: evaluate the probability that a model (or models) can generate observed data, and reject when it’s too low. The failure to reject in this case is due to the low information content of the short term data. That doesn’t mean the method is bad; it means that we should seek more data or be patient.
Your question (Does the IPCC AR4 forecast central tendency of 2 C/century fall within the range of trends consistent with the real earth?) is valid, but the answer must be qualified, e.g., “No, 2C/century central tendency forecast is not consistent with the trends observed on the real earth SO FAR.” A negative answer does not automatically constitute falsification, because we have experienced only 7% of the IPCC forecast horizon. Thus the information content of the negative response is as low as the null response to the first question, only rephrased. If you could demonstrate that the IPCC predicts a constant 2C/century trend, then you might have a case, but I don’t think you can.
One way to do that would be to demonstrate that models have low endogenous trend variability, which you seem to be attempting here by partitioning noise into components. Eyeballing Figure 1 above, the differences in models due to trend are slow to emerge from the general noise in the first decade or so. That would suggest that, early in the simulations, sigma(weather) >> sigma(model-noise). That in turn implies that RC’s trend histogram, linked above, is a useful upper bound on trend variability, not unduly influenced by cross-model or parameter variation over short time spans. It might be possible to refine the RC figure by looking at single model ensembles (some exist in the AR4 CMIP3 archive as I recall), yielding narrower distributions. However, I suspect that it would not make much difference.
In any case, it cannot be true that you “perform my assessments using the uncertainty intervals based on the “true weather noise” of the earth.” No one has access to the true noise. It can be estimated various ways, e.g. with agnostic AR models (at the peril of ignoring forcings), with simple models (Schwartz), or with GCM ensembles. No matter which you choose, the measurement is assumption laden. If the endogenous weather trend has even a little autocorrelation (as one would expect given that the ocean has 1000x the heat capacity of the atmosphere, for example), then the 7% problem is quite debilitating to your argument.
Martin Ringo (Comment#2857) May 15th, 2008 at 4:38 pm
Lucia,
Alas, no. I would love to get — make that estimate myself — the uncertainties of the slopes of the climate models. Having read climate modelers discuss statistics for over a decade now, I simply don’t trust them to do what is a computationally intensive, probably quite expensive but otherwise straightforward exercise.
What I was referring to was time series models. One can forecast from them also. Indeed somebody once argued that there is little evidence that the climate models can do a better job of predicting than a linear model with the same exogenous variables as used in the climate models. Note that with time series models, unless there are large AR coefficient values for deeply lagged variables (e.g. t-48 for monthly data), the forecasted trend-slopes will be very similar to the estimated. {Time Series forecasting paradigm: AR models asymptotically approach their equilibrium values in the forecast, MA(p) models hit the equilibrium in p+1 steps.} So other than the first few wobbles in the forecasted series, one gets the estimated slope. Hence, the numbers I gave are a rough idea of the variation of time series forecasts from 2008 onward. I was going to do it for the same time period as the IPCC forecast/projections being discussed.
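The forecasting paradigm in braces can be illustrated with textbook point forecasts. This is a minimal sketch, assuming a zero-mean (detrended) series, an AR(1) process for the autoregressive case, and known coefficients; the function names and all numbers are hypothetical and are not taken from the 13 fitted models.

```python
def ar1_forecast(last_value, mean, phi, steps):
    """h-step-ahead point forecasts for an AR(1) process around `mean`.

    The gap to the mean shrinks by a factor phi each step, so the forecast
    approaches the equilibrium asymptotically without ever reaching it.
    """
    return [mean + (phi ** h) * (last_value - mean) for h in range(1, steps + 1)]

def ma_forecast(mean, thetas, last_shocks, steps):
    """Point forecasts for x_t = mean + e_t + sum_i thetas[i-1] * e_(t-i).

    `last_shocks[k]` is the estimated shock k steps back (newest first).
    Future shocks have expectation zero, so only shocks already observed
    contribute, and nothing is left once the horizon exceeds q.
    """
    q = len(thetas)
    forecasts = []
    for h in range(1, steps + 1):
        known = sum(thetas[i - 1] * last_shocks[i - h] for i in range(h, q + 1))
        forecasts.append(mean + known)
    return forecasts

ar = ar1_forecast(last_value=1.0, mean=0.0, phi=0.6, steps=5)
ma = ma_forecast(mean=0.0, thetas=[0.5, 0.3], last_shocks=[1.0, -0.5], steps=5)

print("AR(1):", [round(v, 4) for v in ar])   # decays geometrically toward 0
print("MA(2):", [round(v, 4) for v in ma])   # exactly 0 from step q + 1 = 3 onward
```

In both cases the departure from equilibrium is confined to the first few forecast steps, which is the "first few wobbles" described above.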
As a technical note: unless one understands the nature of how a particular model run/realization is made with respect to the modeling of the inherent uncertainty, then having a whole set of them doesn’t do a lot of good. This is a case where understanding the physics does one next to no good, because this is an issue of how one models a multivariate distribution. All the Monte Carlo does then is to integrate the distribution. But with time series it is easy to go wrong.
steven mosher (Comment#2858) May 15th, 2008 at 5:01 pm
Tom
Yes, with Gavin’s approach we should be more patient. We should wait 20-30 years.
Ask Hansen if he wants to wait 20-30 years?
He’ll say no.
He’ll say the models show that going above 450PPM is disaster. No error bars for him.
Tom Fiddaman (Comment#2861) May 15th, 2008 at 7:26 pm
Indeed somebody once argued that there is little evidence that the climate models can do a better job of predicting than a linear model with the same exogenous variables as used in the climate models.
I’ve tried this with simple models (Schwartz model that I think Lucia experimented with and higher order variants). The results aren’t quite as good as the envelope of AR4 models, for reasons I don’t know. However, I think it’s somewhat irrelevant. The point of GCMs is to develop an operational understanding of how things work. Adding the spatial dimension is essential for doing that, and also brings vastly more data to bear on the validation question – you get to look at regional and seasonal patterns, lapse rates, and a zillion other things. Those may not improve your ability to predict the global temperature time series, but they tell you a lot about whether you have your physics right.
Which brings me to my second point …
Yes, with Gavin’s approach we should be more patient. We should wait 20-30 years. Ask Hansen if he wants to wait 20-30 years?
Even if you want to be patient, you have to make a decision under uncertainty. It would be stupid to make that decision on the basis of a single 7yr experiment when you have reams of other data to consider.
steven mosher (Comment#2863) May 15th, 2008 at 8:27 pm
Tom.
Every decision is made under uncertainty. So the decision to turn food into fuel (ethanol) was a decision made under uncertainty. Luckily some people have figured out in a rather short time horizon the obvious fact that many yelled about. If you burn food for fuel, then people starve. (DUH. Burn uranium for fuel instead!)
It was brought home to me one day a year ago as I sat in the airport bar. A guy sat next to me. He ordered doubles. I knew at a glance he was from the midwest, as am I, so I struck up a conversation about the weather. He brightened. Talking about the weather: it’s a midwestern thing. We have four seasons. Anyway, when I asked him what he did he said he sold corrugated boxes.
No shame in that. The world needs pencils, pens, cardboard, paper, exotic sex toys, and climate science. Lots of needs.
Then I asked what his biggest challenge was. He said, “getting glue,” because the glue that is used to make corrugated boxes is derived from corn syrup. We chatted about Archer Daniels and how they allocated their corn syrup product (ok, boring crap). But it occurred to me that politicians were making decisions about the use of corn product (turn it into fuel) without a fart in the wind’s idea about how that decision would impact people.
Did I go off topic? Tangent MAN!
Tom Fiddaman (Comment#2864) May 15th, 2008 at 9:05 pm
Steven – I actually quite agree with your tangent. Ethanol was a stupid decision made under uncertainty, without an appreciation of the uncertainty (or even really an appreciation of the more obvious certainties).
John V (Comment#2866) May 15th, 2008 at 9:13 pm
mosher — I’m with you on uranium vs corn. Nobody likes the taste of uranium anyways.
I’m also being swayed by the idea of liquid fluoride reactors. Google Nuclear Green and Charles Barton for more.
Gotta go.
KuhnKat (Comment#2869) May 16th, 2008 at 12:33 am
Tom,
“It would be stupid to make that decision on the basis of a single 7yr experiment when you have reams of other data to consider.”
Yes, we have about a million years of proxy data without a hint of catastrophe from too much CO2. We should wait 20 years to validate the models and learn a LOT more!!
Technically we could have 19 years of flat weather and make it all up in the last year. Of course, we have no observations to support this. Where is YOUR cut off??
Tell you what, let’s agree to wait and see whether Solar Cycle 24 has a max at least as high as 23’s before passing legislation!!! That is only another 5 years and if warming doesn’t resume till then we haven’t lost anything by waiting!!
I should also ask you what is going to be done about the natural increase in CO2 if warming resumes?? All we did was increase the RATE. If the AGW theory is right, we would eventually reach the same point with the natural additions!!! In other words, just cutting our CO2 emissions is not enough. If rising CO2 is an issue we need to take the NATURAL increase out of the atmosphere also. Do we reduce the biosphere or lock the excess CO2 up in calcium carbonate or some other compound???
Now, about those observations. The AGW physics requires that the oceans warm and the upper trop to warm faster than the lower trop. Care to take a shot at explaining why these indicators are negative along with no temp increase?? It doesn’t leave too many places to hide that extra energy that AGW is supposed to be hoarding. If there is no excess energy in the system, the last 20 years of warming are meaningless. Just more WEATHER NOISE!!!
Nick Stokes (Comment#2871) May 16th, 2008 at 1:59 am
Lucia,
Thanks for stating the questions clearly. Let me address the one you stated first (but sometimes referred to as the second). I think there is a problem with your use of the term “central tendency”. You said here that it is equivalent to a mean (tho I think a mean is a “measure of central tendency”). But you are not testing it as a mean; you are testing it as an instance. I think that the IPCC should not, and did not, make such a prediction.
An elementary illustration. I’m the IPCC, and asked to predict the future location of a mark on the tyre of a passing vehicle. The wheel radius is 1 m, and the speed, expected to be uniform, is 10 m/s. I can’t see the mark.
So I produce a graph with a line showing where the hub is, and a shaded region 1 m (scaled) wide. My prediction is that the mark will be found within that band at any future time. I make remarks like “the graph shows the longterm trend is 10 m/s”.
So after 1/10 sec, the onlookers, who can see the mark, say “but it went down and up, and has barely made any progress at all. Your prediction of a central tendency of 10 m/s is falsified”. And I say, but no, it is, as predicted by my model, within the shaded region.
Now back to AR4, the IPCC did show a lot about the model’s working, they did make a prediction (in that Fig 10.1) of exactly that form. It’s true that whoever wrote the section on committed change in the TS may have had and communicated the wrong idea, though the words are technically correct (“about”). And that can be criticised. But there is ample evidence of the nature of the IPCC prediction, eg in the plots in Ch 10 and the supplementary materials actually setting out the model mechanics.
Josh (Comment#2875) May 16th, 2008 at 4:51 am
If anyone thinks Gavin’s point has any validity, couldn’t we make his “test” 100% certain by adding the following 2 “climate models” to the ensemble:
T = x
T = -x
where T is the temperature anomaly and x is the year? These models would predict global warming (or cooling) of 100°/century, which is virtually certain to contain the real trend. [Or if you want 95% confidence intervals, add enough of these so that 5% of the models are these 2.] If you can add ridiculous things to your argument and MAKE IT BETTER, you’re not saying anything useful.
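Josh's thought experiment is easy to reproduce numerically. The sketch below uses made-up numbers: the seven ensemble trends and the observed trend are hypothetical values chosen only to show the mechanism, and the "mean plus or minus two standard deviations" check is a simplified stand-in for whatever test is actually applied to the model spread.

```python
import statistics

def within_two_sigma(observed, trends):
    """True if `observed` lies within mean +/- 2 standard deviations of `trends`."""
    mean = statistics.mean(trends)
    sd = statistics.pstdev(trends)
    return abs(observed - mean) <= 2 * sd

# Hypothetical ensemble of model trends in C/century (illustrative only).
ensemble = [1.6, 1.8, 1.9, 2.0, 2.1, 2.2, 2.4]
observed = -1.0   # hypothetical observed trend, far outside the ensemble

# T = x and T = -x, with x in years, are trends of +100 and -100 C/century.
padded = ensemble + [100.0, -100.0]

before = within_two_sigma(observed, ensemble)
after = within_two_sigma(observed, padded)
print("consistent with original ensemble:", before)
print("consistent after adding T = x and T = -x:", after)
```

With these numbers the observation fails the check against the original ensemble and passes it once the two absurd models are included, because they inflate the spread without adding any skill.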
lucia (Comment#2882) May 16th, 2008 at 7:06 am
Nick-
Yes. I test the mean as an instance. I test the 1-sigma error bars as instances. In March, both the mean and the uncertainty intervals falsified. The temperature rose and the lower bound of the 1-sigma uncertainty interval has now pulled into the range consistent with data. The full shaded region falsified.
I have been discussing both the shaded region and the central tendency in my various posts. You will note that in the Swedish cartoon, I include 1-sigma error bars for the VMPN “height” models.
John V (Comment#2883) May 16th, 2008 at 8:01 am
lucia,
I hope you’ll have a chance to look at the uncertainty on 7-year trends (vs the underlying trend) in the near future. No matter how I look at it this uncertainty is very large and will influence your falsification.
On the other thread you mentioned using GISS Model E to estimate the 7-year uncertainty. The data wasn’t available yesterday but seems to be back online now. I’ll have a look as well. For “fun” I also hope to get ModelE running on my computer this weekend so we can generate results for any set of inputs.
In the meantime, I hope you can look at my analysis of 7-year trends from GISTEMP and HadCRUT. I probably posted too many graphs yesterday; this is the one that really matters:
It shows 7-year and 22-year trends from GISTEMP (OLS using annual data ending in 2007). To me it’s clear that the 7-year trends deviate significantly from the underlying 22-year trend. The strong pattern in the 7-year deviations strongly suggests that they are not caused by random volcanic events. The range is easily plus or minus 2C/century.
It’s always possible that my calculations are wrong. I encourage anyone who’s interested to review my spreadsheet:
http://www.opentemp.org/_resul.....ia_7yr.xls
I will be adding Model E results and using Cochrane-Orcutt in the spreadsheet this morning.
Morgan (Comment#2884) May 16th, 2008 at 8:39 am
If I’m following the arguments correctly, from here and elsewhere, then:
1) The models as a group have falsified – their predictions are biased to the high side over the last 7-10 years.
1a) This falsification may be a result of overestimating the “underlying climate trend” OR of underestimating real-world, cyclical(?) variation in global mean temperature over the relevant time period (7-10 years)
1b) The complete “list” of cyclical variations is not known, and the unknown variations may be of any duration or amplitude – hence…
1c) No one has any real idea which is the problem with the models as a whole
I think that’s critical, because:
2) The viability of climate modeling as an endeavor has not been discredited, but…
2a) The potential utility of current climate models for the prediction of long term climate change depends critically on the reason for falsification
and because…
3) We can’t really reject even one model’s ability to forecast longer-term “underlying trends” given our uncertainty regarding the reason for falsification (which will persist indefinitely)
Am I on the right track?
jmrSudbury (Comment#2885) May 16th, 2008 at 8:49 am
James, you said, “[i]n the meantime, I hope you can look at my analysis of 7-year trends from GISTEMP and HadCRUT.”
I see the data and graph for GISS, but not HadCRUT. Are HadCRUT data in a different spreadsheet?
John M Reynolds
John V (Comment#2886) May 16th, 2008 at 9:03 am
Ok, I’ve repeated my analysis using GISS Model E results that were generated using solar irradiance forcing only. I have to stress that this is preliminary work so it’s not as robust as it could be.
I downloaded the data from here:
http://data.giss.nasa.gov/mode.....imsim.html
If I understand correctly the data I used is an average of 5 model runs. If I assume the weather noise between runs is orthogonal, the noise in the average should be approximately 45% of the noise in a single run (1/sqrt(5)). In reality the weather noise is probably not orthogonal because the solar forcing is the same in every run. All I can say with confidence is that the weather noise for the average is less than the weather noise for a single run (average noise is between 45% and 100% of single-run noise).
Using the procedure I described yesterday (link below), I calculated the deviation of the 7-year trend from the “underlying” 22-year trend. (I was incorrectly using the word “bias” yesterday when I meant “deviation”).
http://rankexploits.com/musing.....mment-2831
I found that the 7-year trend deviation has a near-normal distribution with a standard deviation of 0.5C/century.
Scaling to compensate for the averaging, I expect the standard deviation for a single run to be between 0.5C/century and 1.1C/century. For comparison, the standard deviation of the 7-year trend deviation for GISTEMP and HadCRUT data are listed below:
GISTEMP: 1.9 degC/century (excluding years with major volcanoes)
HadCRUT: 2.6 degC/century (excluding years with major volcanoes)
Model E: 0.5 to 1.1 degC/century
The Model E weather noise is significantly less than the observed weather noise. The extra observed noise could be caused by smaller volcanoes which I did not exclude, measurement noise, un-modelled complexities in the real system, or other factors.
For fun I will assume the Model E results as representing the true weather noise on 7-year trends and choose an intermediate value of 0.8 degC/century. Using this assumption the 95% confidence intervals for 7-year trends are plus or minus 1.6 degC/century.
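The arithmetic in this comment can be retraced in a few lines. The inputs (5 runs, 0.5 degC/century for the averaged series, the intermediate guess of 0.8) come from the comment itself; the variable names are made up for this sketch, and the factor of 2 for a 95% interval assumes a roughly normal distribution.

```python
import math

n_runs = 5
sd_of_average = 0.5   # degC/century, 7-year trend deviation of the 5-run average

# If run-to-run weather noise is uncorrelated, averaging n runs shrinks
# the standard deviation by a factor of 1/sqrt(n).
shrink = 1 / math.sqrt(n_runs)

# Bounds on the single-run standard deviation:
sd_single_lower = sd_of_average            # runs perfectly correlated
sd_single_upper = sd_of_average / shrink   # runs uncorrelated

# Intermediate value chosen in the comment, and its ~95% half-width (2 sigma).
assumed_sd = 0.8
half_width_95 = 2 * assumed_sd

print(f"shrink factor: {shrink:.3f}")
print(f"single-run sd: {sd_single_lower:.1f} to {sd_single_upper:.2f} degC/century")
print(f"95% interval: +/- {half_width_95:.1f} degC/century")
```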
I think I’ll try looking at the monthly Model E data using Cochrane-Orcutt…
lucia (Comment#2888) May 16th, 2008 at 9:05 am
John– Have you considered the fact that the strong pattern in seven year averages is due to the fact the averages overlap? You have, at most, 120-22 ≈ 100 years to create real 22 year means.
In the 100 years you have at most 100/7 ≈ 14 statistically independent “samples” of 7 year trends. Of course the 1965-1972 trend is not much different from the 1966-1973 trend. They wouldn’t be even if the underlying data were white noise.
So, technically, if I don’t throw out any data, I have at most 14 samples in a histogram– if I throw out NO data.
So how many independent samples am I going to have?
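The point about overlapping windows can be checked on pure white noise, where there is no real structure to find. This is a sketch only: the series length of 7000 points is an arbitrary choice to make the estimates stable, and the helper names are hypothetical. It compares the correlation between neighbouring 7-point OLS trends when the windows slide by one point versus when they do not overlap at all.

```python
import random

def ols_slope(y):
    """Ordinary least squares slope of y against 0, 1, ..., len(y) - 1."""
    n = len(y)
    x_mean = (n - 1) / 2
    y_mean = sum(y) / n
    num = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(y))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den

def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5

def window_slopes(series, width, step):
    """Slopes of successive windows of `width` points, advancing by `step`."""
    return [ols_slope(series[s:s + width])
            for s in range(0, len(series) - width + 1, step)]

rng = random.Random(42)
white = [rng.gauss(0.0, 1.0) for _ in range(7000)]

overlapping = window_slopes(white, width=7, step=1)   # consecutive windows share 6 points
disjoint = window_slopes(white, width=7, step=7)      # windows share no points

overlap_corr = pearson(overlapping[:-1], overlapping[1:])
disjoint_corr = pearson(disjoint[:-1], disjoint[1:])
independent_windows = 100 // 7   # non-overlapping 7-year windows in 100 years

print(f"neighbouring overlapping trends: r = {overlap_corr:.2f}")
print(f"neighbouring disjoint trends:    r = {disjoint_corr:.2f}")
print(f"independent 7-year windows in 100 years: {independent_windows}")
```

Neighbouring overlapping trends come out strongly correlated even though the data are white noise, while the non-overlapping ones do not, so a smooth-looking pattern in a sliding 7-year trend plot is not by itself evidence of structure in the data.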
John V (Comment#2889) May 16th, 2008 at 9:05 am
jmrSudbury:
Thanks for checking my spreadsheet.
There are two tabs (worksheets). The first is GISTEMP and the second is HadCRUT.
John V (Comment#2890) May 16th, 2008 at 9:15 am
lucia:
I agree that the number of independent samples is not very large. I believe correcting for this would require increasing the uncertainty intervals using a t-distribution.
After thinking about it and running some tests I also think you are right that at least some of the structure comes from the overlapping averages.
These are probably two manifestations of the same problem (lack of independence between measurements).
lucia (Comment#2892) May 16th, 2008 at 9:34 am
John– you are trying to estimate the uncertainty intervals in a sort of bayesian way. The fact that we may not have enough data to do it your way doesn’t propagate uncertainty into estimating them using other methods! (Or, at least I don’t see how it does.)
John V (Comment#2898) May 16th, 2008 at 10:30 am
lucia,
The point that I’m trying to make is that 7-year trends move all over the place. Your method of looking at only one 7-year period says nothing about the weather noise over all 7-year periods. Your uncertainty in the slope of a single 7-year period does not represent the weather noise on larger time scales. Any period longer than 7 years is excluded.
My plot in comment #2833 above illustrates the problem. The underlying 22-year trend is quite smooth. The 7-year trend swings back and forth wildly. I do not see how it is possible to determine the scale of the 7-year trend noise from a single 7-year trend (as you are doing). The information is just not there.
You are estimating the uncertainty in *this* 7-year trend, *not* the uncertainty in the difference between this 7-year trend and the underlying trend.
I understand some of your concern with Gavin’s method of using the models to estimate the uncertainty in 7-year trends. I am attempting to do the same using observations, and my results agree with Gavin’s. I get standard deviations of 1.9 to 2.5 degC/century from observations. Gavin showed a standard deviation of 2.1 degC/century from the models.
I’ll try another analogy:
Pretend I’m driving from Calgary to Vancouver (it’s a nice drive). It’s about 1100km and it takes me 11 hours of driving (to use round numbers). My average speed over 11 hours is 100 km/hour. You could sample my distance travelled every minute and use that information to estimate my average speed. The first hour (60 samples) is on a divided highway and my average speed was 120km/h. Using Cochrane-Orcutt you get 95% confidence limits of 112 to 128 km/h. My speed was pretty steady.
Four hours later the road is deep in the mountains. Tourists slow down because of an elk, two bears, and a couple of mountain goats (darn tourists). My average speed for this hour is only 60km/h. I was going fast for the non-animal sections so the confidence intervals are very wide: 25km/h to 95km/h.
For all the other hours we can say that the Cochrane-Orcutt estimation was good. The problem is that by looking at a single hour in isolation the observer has no knowledge of the other hours. Cochrane-Orcutt confidence intervals based on a single hour deep in the mountains say nothing about my speed on a divided highway. The confidence intervals on the highway and in the mountains both exclude the true underlying average.
What I’m trying to do is equivalent to looking at the average speed for every 1-hour period. The range of these limits comfortably encompasses the true average speed.
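The comparison being argued over here, the error bar from a single window versus the spread of trends across many windows, can be sketched on synthetic data. Everything below is an assumption made for illustration: the AR(1) noise model, its parameters (`phi=0.8`, `sigma=0.1`), the 84-month window, and the function names. The single-window error bar is the plain OLS standard error, not the Cochrane-Orcutt estimate lucia uses, so this shows the mechanics of the comparison rather than settling who is right.

```python
import math
import random
import statistics


def ols_slope_and_se(y):
    """OLS slope of y against 0..n-1 and its naive standard error."""
    n = len(y)
    x_mean = (n - 1) / 2
    y_mean = sum(y) / n
    sxx = sum((i - x_mean) ** 2 for i in range(n))
    slope = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(y)) / sxx
    intercept = y_mean - slope * x_mean
    sse = sum((v - intercept - slope * i) ** 2 for i, v in enumerate(y))
    # Naive standard error: assumes uncorrelated residuals.
    return slope, math.sqrt(sse / (n - 2) / sxx)


def ar1_series(n, phi, sigma, trend, rng):
    """Linear trend plus AR(1) noise: noise[t] = phi * noise[t-1] + shock."""
    noise, out = 0.0, []
    for t in range(n):
        noise = phi * noise + rng.gauss(0.0, sigma)
        out.append(trend * t + noise)
    return out


def compare_spread(series, width):
    """Std. dev. of slopes across non-overlapping windows vs mean naive SE."""
    fits = [ols_slope_and_se(series[s:s + width])
            for s in range(0, len(series) - width + 1, width)]
    slopes = [f[0] for f in fits]
    errors = [f[1] for f in fits]
    return statistics.stdev(slopes), statistics.mean(errors)


if __name__ == "__main__":
    rng = random.Random(42)  # arbitrary seed
    monthly = ar1_series(n=84 * 50, phi=0.8, sigma=0.1, trend=0.0, rng=rng)
    spread, naive_se = compare_spread(monthly, width=84)
    print("std. dev. of 84-month slopes across windows:", spread)
    print("mean naive OLS standard error within a window:", naive_se)
```

With positively autocorrelated noise the spread across windows comes out larger than the naive within-window error, which is the reason a correction such as Cochrane-Orcutt is applied in the first place. Whether the corrected error bar from one window matches the spread across windows is the open question in this exchange.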
John V (Comment#2899) May 16th, 2008 at 10:36 am
lucia — could you please check my images in #2886? Thanks.
lucia (Comment#2900) May 16th, 2008 at 10:42 am
JohnV–
I understand the point you are trying to make. But the process of throwing out data, picking time intervals for the underlying average, and filling in for blanks (?), combined with having at most 14 independent samples of 7 year averages, makes it extremely difficult to estimate the fraction of downtrends during periods unaffected by volcanic activity.
That’s all I’m saying.
I didn’t say I’m not going to see what I would come up with. But I hate padding, filling in, adjusting data to calculate distributions. My inclination is to find the longest string of time that is clearly not affected by volcanic activity and deal with it with as few “corrections” as possible.
George W (Comment#2908) May 16th, 2008 at 1:40 pm
An off-the-wall different idea:
Re your Figure 1 and Gavin’s first graph plotting “the global mean temperature anomaly for 55 individual realizations” where he says “It should be clear from the above the plot that the long term trend (the global warming signal) is robust, but it is equally obvious that the short term behaviour of any individual realisation is not.”
I wonder if it would be useful to go through the individual model ‘realizations’ determining the max length of time from when a temperature starts to drop to when it has recovered (i.e., the width of the temperature valleys) and comparing those to the length of the current temperature ‘valley’ (since 98 per Hadley and 98 to 07 per GISS). (The long post-1944 valley may not be relevant since CO2 levels are so much greater now.)
This might give an indication of how well the models are comparing to ‘reality’ in terms of short term behavior.
It might also require some statistics using all valley widths.
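The valley-width idea could be sketched as follows. The definition used here is one possible reading of the suggestion, not something fixed by the comment: a valley opens when the series falls below its running peak and closes at the first later point that matches or exceeds that peak. The function name `valley_widths` and the sample numbers are made up for illustration.

```python
def valley_widths(series):
    """Return (closed_widths, open_width) for a sequence of temperatures.

    A valley opens when the series falls below its running peak and closes
    at the first later point that matches or exceeds that peak. Widths are
    counted in time steps from the peak to the recovery point. open_width
    is the length of a valley still unrecovered at the end (0 if none).
    """
    closed = []
    peak, peak_index = series[0], 0
    in_valley = False
    for i in range(1, len(series)):
        if series[i] >= peak:
            if in_valley:
                closed.append(i - peak_index)
                in_valley = False
            peak, peak_index = series[i], i
        else:
            in_valley = True
    open_width = len(series) - 1 - peak_index if in_valley else 0
    return closed, open_width


if __name__ == "__main__":
    # Made-up anomalies for illustration only.
    demo = [0.1, 0.3, 0.2, 0.1, 0.35, 0.4, 0.2, 0.3]
    print(valley_widths(demo))  # -> ([3], 2)
```

Applied to each model realization in turn, the closed widths would give a distribution of valley lengths, and the open width of the observed series could then be compared against it.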
steven mosher (Comment#2922) May 17th, 2008 at 6:32 am
JohnV and Lucia.
Take the HadCRU observations. REMOVE the modelled temp decrease due to volcanoes using ModelE data
Then do the JohnV approach???? err something like that
lucia (Comment#2923) May 17th, 2008 at 6:53 am
Steve– What I’m trying to mull over is how to do this using the method with the minimal number of steps that involve “judgement” and, if possible, avoid using model predictions. (The alternative is to just estimate the “weather noise” based on model predictions.)
The problem with too many steps, and/or any steps where we “adjust” any temperatures, is that they can introduce noise. So, after “correcting” you get a distribution that is a poorer representation of the uncertainty in the presence of volcanoes, and in particular, in which the uncertainty intervals are too large, owing to the combination of either (or both) a) not taking out some periods that are affected by volcanoes or b) “adjusting” data to the wrong value and then calculating.
This is why in labs you try to take lots and lots of good data under controlled circumstances. “Fixing” data afterwards is rarely the way to discover truth.
But… I am looking at JohnV’s suggestion when I can get to it. If we had 600 years of data with no volcano eruptions, I’m sure we could get the right answer and we would all agree. Or, as happens with statistics, if the variance is much, much larger than we get from the regression, we could get an answer we all agree with. We don’t have 600 years of data, but who knows? JohnV has shown his graphs, maybe when I look I’ll conclude the same thing.
John V. (Comment#2925) May 17th, 2008 at 7:44 am
steven mosher:
Using Model E to remove volcanoes is a good idea. I may give that a try.
Hopefully lucia will come up with an even better idea.
lucia:
My “conclusions” are very provisional. I did a few quick analyses to get an idea of the scale of the weather noise. The results indicate to me that it’s large enough to pursue. I hope you’ll pursue it before publishing this month’s “falsification”.
lucia (Comment#2926) May 17th, 2008 at 8:49 am
JohnV:
That’s the plan. We both agree this is an issue with the uncertainty intervals. Unlike some suggestions by other bloggers, this is concrete and can be pursued. It’s not cherry picking a period with volcano eruptions, changing the question, or pretending the uncertainty due to our lack of understanding of physics (which results in scatter in predictions) is “weather noise”.
Martin Ringo (Comment#2927) May 17th, 2008 at 10:06 am
John V,
The pattern of OLS slope estimates in your figure is very interesting. Looks like an AR(2) pattern with a near 1.0 AR1 and sizable but negative AR2. And indeed it is (AR1 almost 1.00 and AR2 around -0.5). This indicates that the first differences have some explanatory power beyond the AR structure. In fact your pattern suggests an interesting slope estimate:
(1) temp(t) = B0 + X1…k*B1…k + C*trend + AR terms + u(t), with u(t) ~ i.d. N(0, V(t))
that is, u(t) is independent but not necessarily identically distributed normal with a variance V(t), which we assume can be estimated by GARCH procedures.
(2) V(t) = V0 + A1*resid(t-1)^2 + G1*V(t-1),
where “resid” refers to the residual of Eqn 1.
But like the V(t), the C, the slope of the trend, may be dependent on time in an autoregressive manner:
(3) C(t) = S0 + S1*C(t-1) + S2*C(t-2)
Estimate Eqn 1 subject to Eqns 2 and 3 and you have a time varying estimate of the slope. But since the lag terms decay (subject to roots outside the unit circle, i.e. the equation is stationary), the forecasted equilibrium estimate of the trend is the S0 of Eqn 3.
That is the Maximum Likelihood way of estimating the slope. It can be done without messing around with the log-likelihood function and optimization code (which these days I seem to be a real klutz at). That is, just estimate the slopes for particular time spans; fit the series of slopes with Eqn. 3 and a differencing equation (not shown). And then use S0 as equilibrium.
Of course, it might be a good idea to test it. So why don’t you take some monthly data and estimate a series of slopes for a seven or so year time span. Do this for a long temperature series, but don’t tell me the series (e.g. HadCRUT3v Southern Hemisphere) or the period to which the series of slopes applies (e.g. Jan 1927 to Dec 1981). Send Lucia the series or post it where I can download it. And I will try to forecast the slopes for the next seven or so years.
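Eqn 3 on its own is a deterministic second-order recursion, so two of the claims above can be checked directly: the stationarity condition and the value the slope settles to. This is a minimal sketch with made-up coefficients (`s0=1.0`, `s1=1.0`, `s2=-0.5`, loosely echoing the "AR1 almost 1.00 and AR2 around -0.5" pattern); nothing here is fitted to data, and the function names are hypothetical. It does not attempt the GARCH step in Eqn 2.

```python
import cmath


def ar2_is_stationary(s1, s2):
    """True if all roots of 1 - s1*z - s2*z**2 lie outside the unit circle."""
    if s2 == 0:
        return abs(s1) < 1  # reduces to an AR(1)
    disc = cmath.sqrt(s1 * s1 + 4 * s2)
    roots = [(-s1 + disc) / (2 * s2), (-s1 - disc) / (2 * s2)]
    return all(abs(r) > 1 for r in roots)


def ar2_equilibrium(s0, s1, s2):
    """Fixed point of C(t) = s0 + s1*C(t-1) + s2*C(t-2)."""
    return s0 / (1 - s1 - s2)


def iterate_ar2(s0, s1, s2, c_prev, c_prev2, steps):
    """Iterate the recursion forward and return the path of C values."""
    path = []
    for _ in range(steps):
        c = s0 + s1 * c_prev + s2 * c_prev2
        path.append(c)
        c_prev2, c_prev = c_prev, c
    return path


if __name__ == "__main__":
    s0, s1, s2 = 1.0, 1.0, -0.5  # hypothetical values, not estimates
    print("stationary:", ar2_is_stationary(s1, s2))
    print("equilibrium:", ar2_equilibrium(s0, s1, s2))
    print("after 50 steps:", iterate_ar2(s0, s1, s2, 0.0, 0.0, 50)[-1])
```

For the recursion exactly as written in Eqn 3, the fixed point is S0/(1 - S1 - S2), which is 2.0 for these coefficients. It coincides with S0 itself only when the AR terms are applied to deviations from the mean, which is how some statistics packages parameterize the model, so it is worth checking which form a given package reports.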
Andrew (Comment#2930) May 17th, 2008 at 4:25 pm
Well, in the space of one year, temperature can go up or down by almost .4 C, so noise could be quite large indeed. Interestingly, that is presumably the result of internal processes, and if it operated on the long term, it could obviously account for large portions of the trend you are measuring, or considerably mask it.
steven mosher (Comment#2931) May 17th, 2008 at 5:39 pm
funny what happens to data when you turn it over to engineers.
I observe.
fred (Comment#2933) May 18th, 2008 at 2:26 am
It’s excellent work, and thank you for it. For the second time one becomes aware that the issue is much simpler than it seemed at first sight, and that the RC position is basically obfuscation. The first case was the Hockey Stick, where, when you get to the bottom of it, we had incorrect statistical methods applied to give undue weight to one set of proxies which are not temperature indicators.
This is a second one, where after going through the detail carefully, we see that a poor prediction is being rescued by retrospectively increasing the range of events that it ‘predicts’ to the point where, had it been made clear when it was issued that the range was so large, we would not have thought it a prediction of any value. Psychics do this all the time but no reasonable person takes their ‘predictions’ seriously. Are they even properly called ‘predictions’?
The props for AGW are being knocked away one after another. It’s not proving it is not warming. It’s not proving CO2 is harmless. It still could well be that common prudence should stop us changing the composition of the atmosphere on any scale. But it is showing that many of the key arguments for those propositions are based on bad science. They could still be true though. It’s important to keep that in mind. Just not for the reasons given so far.
Jorge (Comment#2936) May 18th, 2008 at 5:32 am
What is weather noise, climate noise or an underlying trend?
My feeling is that these terms are not well defined and this causes confusion when people are using alternative versions. Modellers seem to imply that without extra forcing there would be no underlying trend, that climate noise does not exist and weather noise cancels out over shortish periods. From the point of view of those looking only at observations, it seems that there are variations in temperature over all time scales, from ice ages down to days. From this perspective, weather merges into climate and underlying trends depend entirely on starting and ending times.
Clearly there are problems with all temperature data sets over all time scales which throw additional uncertainty into the mix. In the case Lucia is dealing with here, there are uncertainties in the observations which she deals with by averaging the data sets. In addition there is noise of some kind. In looking for a trend in this period of data you can do no more than try to fit a straight line. Clearly there will be error bands depending on how certain you wish to be that a given trend is consistent with the data. I am not sure how the noise should be handled in a statistical sense but I have a feeling that it is cheating to introduce data about the noise that is not present in this particular time period of the data set.
I am also unhappy about the idea of correcting the data in some way to account for known cycles etc. It seems to me that it is the job of the modellers to change their predictions to bring them into line with the observations rather than the other way around. The fact they have to do this retrospectively is simply because they chose not to model these aspects of the real world or they tried to model them and failed.
As far as I am concerned the data show little sign of a positive trend during this period and it is up to the modellers to come up with their post hoc explanations. On the point of utility of the predictions so far, they are either true but pointless or wrong but precise.
lucia (Comment#2937) May 18th, 2008 at 6:40 am
JohnV and others– I have looked at this enough that I will be showing two sets of uncertainty bound in future. They’ll be “pseudo-bayesian” based on the ’20s – ’60s (roughly) and C-O error bounds.
I don’t like Gavin’s method of getting error bars because those apply to a different question. But JohnV’s suggestion makes sense for my question. The pseudo-bayesian error bars are smaller than Gavin’s but large enough to “unfalsify”. So, then we can use these going forward.
I’ll post more later during the week explaining the pseudo-bayesian error bars.
Mikel Mariñelarena (Comment#2946) May 18th, 2008 at 8:12 pm
John V, I welcome a sense of humor as much as the next one. And certainly I follow your thoughtful posts with a lot of interest. But I would suggest we keep in mind that indeed we are talking about millions of human beings suffering from decisions taken on the basis of the models you so much defend.
In fact, I don’t think the biofuels nonsense is the primary factor in the global rise of food prices we’re witnessing. I’ve seen good quantitative analysis arguing against this hypothesis. However it should be a strong warning for climate scientists to take the critiques of the skill of their models more seriously.
In a world where still so many live in misery or on the verge of it, *any* policy taken that will reduce global economic growth will literally translate to a matter of life or death for lots of people. Under the current circumstances, I think climate modelers have both a scientific and moral obligation to address the empirical validation gap that exists between their discipline and others.
rex (Comment#2948) May 18th, 2008 at 10:10 pm
It’s becoming ridiculous… now another model backtrack, this time from Knudsen, so of course it’s got 300+ news articles, BUT it’s still only a model
http://news.bbc.co.uk/2/hi/sci.....404846.stm
I think it’s about time the models were all thrown out for anything related to climate. Please stick to meteorology (3 days prediction at most is close to accurate even now…). I still cannot understand why climate scientists cannot conceive that global temperatures may actually drop well below average at any given moment and for a very long time (it’s actually possible). They may also go up as well, of course.
Rex (Comment#2949) May 18th, 2008 at 11:46 pm
I think it’s time the models for climate change were thrown out. Stick to meteorology (at most 3 days for anything close to accurate prediction). This is another example of the lack of credibility in these models
http://www.theledger.com/artic...../813623488
Of course, because it’s Tom Knudsen there are now 300+ news stories on this. The skeptics were saying this a long time ago. This applies to Arctic ice melt (not happening), temps (going down), precipitation etc. Nothing is changing apart from normal climate change… In fact I could venture to say that T Knudsen is wrong again, because if temps rise due to natural causes one would expect hurricane activity to increase? (official AGW posture up to today)
Rex (Comment#2950) May 18th, 2008 at 11:49 pm
My apologies, lucia; I did not think the previous comment had been posted. BTW, posting from Firefox does not seem to work.
Larry Bolz (Comment#2960) May 19th, 2008 at 2:41 pm
“Technically we could have 19 years of flat weather and make it all up in the last year. Of course, we have no observations to support this. Where is YOUR cut off??”
Yes, good question: what’s the cut off, how many years? When is it no longer reasonably possible to get the required deg C, and how many years is that? The last 5 years were at -.01, the last 7 years .05, the last 10 years .11, the last 20 years .18, and the last 30 years at .16. Maybe it would be better to say that the climate system’s behavior over the last 5, 7 and 10 years doesn’t meet .2 C/decade.
But clearly, if you look at three periods of 7 years 1937-1943, 1943-1949 and 1964-1970, 7 years can display a great number of things that aren’t out of whack with a 100 year trend.
But since the 2001 prediction, it’s not on line so far. I would be surprised if we go over 2005’s .51 anomaly, but it depends on the weather cooperating. When do we find out if something that’s not happened in 130 or so years would have to happen to reach +2 for the next 100 years?
Global Climate At a Glance
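Trailing trends like the 5, 7, 10, 20 and 30 year figures quoted above can be computed with an ordinary least-squares fit over the last N annual values. The sketch below assumes those figures are trends in degrees C per decade (the comment does not say so explicitly) and uses a synthetic, exactly linear series rather than any real temperature record; `trailing_trend_per_decade` is a name made up for this example.

```python
def trailing_trend_per_decade(years, anomalies, span):
    """OLS slope over the last `span` annual values, in degrees C per decade."""
    if span < 2 or span > len(years):
        raise ValueError("span must be between 2 and the series length")
    xs, ys = years[-span:], anomalies[-span:]
    x_mean = sum(xs) / span
    y_mean = sum(ys) / span
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return 10.0 * sxy / sxx  # per-year slope scaled to per-decade


if __name__ == "__main__":
    # Synthetic placeholder: a steady 0.02 C/year rise, NOT observed data.
    years = list(range(1978, 2008))
    anomalies = [0.02 * (y - 1978) for y in years]
    for span in (5, 7, 10, 20, 30):
        trend = trailing_trend_per_decade(years, anomalies, span)
        print(span, "years:", round(trend, 3))
```

On a real record the shorter spans swing much more than the longer ones, which is the pattern in the figures quoted above.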
Larry Bolz (Comment#2961) May 19th, 2008 at 2:59 pm
Whether or not a volcano erupted in that 7 year period should be immaterial if we only care about whether point A (0 years) to point B (100 years) has an average rise of .2 C per decade. We do need to know whether a non-volcano 7 years was ever only at .05 over the period, though, to see how normal or abnormal this is.
That said, this last 7 years not having had a volcano until recently certainly is odd and points towards now as simply a cooler part in a cycle of natural variability, rather than any kind of success in linking the rise to AGW.