Do IPCC projections falsify? (Are Swedes Tall?)
Recently, Gavin at Real Climate suggested that the IPCC projections don’t falsify. He also explained the reasons he thinks they do not.
Today, I will explain that the IPCC projections do indeed falsify in any sense that is meaningful. But if you don’t think so, I think I can demonstrate that you must also think we can use the average height of people from all countries to correctly predict the average height of Swedes.
My main counterargument to Gavin’s post is made by means of an analogy, and will be illustrated by a synthetic experiment comparing the “predictions” of the average height of Swedes to the actual “measured” (that is, synthetically generated) heights. The point will be: if a predictive model (for heights or climate) is biased but contains lots of “model noise”, the falsification will manifest itself precisely in the way we are seeing, and which is illustrated in the figure below:

Figure 1: Comparison of the distribution of Swedes’ heights to predictions of the height of Swedish men based on men of four other nationalities (based on a synthetic experiment). Note: The central tendency of the “prediction” and the 1-sigma uncertainty bounds on the mean of the “models” on which it is based both fall outside the uncertainty bounds for the group of interest: Swedish men. The full interpretation of this graph is deferred.
What do Swedes have to do with climate change?
Nothing really. I bring them up because I’m pretty sure the difference between Gavin’s and my answers to the question of falsification comes from asking different questions, and I think a simple example using heights helps me explain the answers to these questions:
- Is the mean trend in surface temperature over time predicted by the IPCC consistent with the temperature trends we have been experiencing? (That is: is 2C/century consistent with the trend we’ve seen? )
- Is the lowest uncertainty bound the IPCC shows the public consistent with the trend in GMST (global mean surface temperature) we have seen since 2001?
I think these questions are important to the public and policy makers. They are the questions people at many climate blogs are asking and they are the questions many voters and likely policy makers would like answered.
I think the answer to both questions is “No, the IPCC predictions are inconsistent with recent data.”
What question is Gavin answering? I don’t know. I have my guesses, but, preferring not to waste time arguing against a strawman, I won’t go there.
On to men’s heights and climate change.
Let us now imagine a fictional panel of “height-o-logists” who will do their best to “predict” the height of Swedish men. However, they will be restricted as follows:
- The height-o-logists will have access to data for men’s heights in Norway, Vietnam, Malta and Portugal.
- The panel will not be permitted to know that Swedes are more like Norwegians than like Vietnamese etc.
- The height-o-logists will then average over all countries to develop a “model” that “predicts” the range of heights of Swedes.
How is this similar to Climate models?
Obviously, this clunky “model” to predict the height of Swedes is not a climate model. But it shares these factors:
- None of the “models” is the real thing we want to predict. Vietnamese men share some similarity with Swedish men: they are both homo sapiens. However, just as Vietnamese aren’t Swedes, the GISS Model E is not, strictly speaking, the planet earth.
- The panels don’t have enough information to know which “sub-model” most closely resembles the thing they wish to predict. My height-o-logists don’t know Norwegians are similar to Swedes; the climatologists don’t know which climate model contains the best set of parameterizations.
- The panel makes a prediction based on the average of all possible “models”. We will call their full model the VMPN model (the Vietnamese - Maltese - Portuguese - Norwegian height model).
What does the panel create?
After much contemplation, the height-o-logists realize they have access to four “models” of humans. Since they don’t have access to Swedes, they decide to create a model by averaging over all the four “model” groups.
To mimic this process, I created a synthetic “super ensemble model”, whose properties I will not describe at length except to say:
- The super ensemble model includes four “sub-models”, each of which creates a population of heights with an average height that matches a particular nationality out of the four sampled (for example, Norwegians). Because each model predicts a different average height, following the analogy, this mimics the effect of the IPCC including a range of models that predict different average trends in mean surface temperature under different forcings and histories.
- Each height sub-model also includes a random number generator to create “height noise” to mimic the effect of variations in human height on the average. In our analogy, this mimics the “weather noise” which exists in the real world.
- Each height sub-model also includes “regional, ethnic, generational ” noise in the measurement of the height within a country. In our analogy, this mimics the variability climate modelers introduce when including a range of different initial conditions into their model.
I ran the synthetic “sub-models” then “measured” 6 individuals and calculated an average. I ultimately created 56 averaged heights, and created this histogram showing the number of outcomes in each of the 56 “runs”:

Figure 2: Histogram of average height measured from batches of 6 men. The horizontal bars represent the number of samples in a 5 cm wide “bin”, the smooth yellow curve is the equivalent Gaussian curve, the solid yellow line is the average of all heights, the dashed yellow lines represent the ±1 sigma bands on heights of groups of 6 men, and the orange dashed lines represent the 95% uncertainty bands for average heights of 6 men, calculated based on a Gaussian assumption. (Yes, they are “pseudo-error bars” because this distribution is not actually Gaussian. Note, however, that 3 realizations lie on the high side of these “pseudo-error” bars.)
Note that the graph provided by the “height-o-logists” includes more information than is conveyed in IPCC documents. The IPCC documents communicate a) the mean of the predicted trend in temperature (analogous to the mean height) and b) the ±1 sigma uncertainty intervals on the mean trend (analogous to the vertical dashed lines).
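For readers who want to play along, here is a minimal Python sketch of a super ensemble of this general kind. It is not the spreadsheet used for the figures in this post. The four sub-model mean heights, both noise levels, and the even split of 14 batches per sub-model are placeholder assumptions chosen only for illustration; the batch size of 6 and the total of 56 batch averages are the only numbers taken from the text.

```python
import random
import statistics

# Hypothetical sub-model means in cm. The post does not list the values used.
SUB_MODEL_MEANS_CM = {
    "Vietnam": 164.0,
    "Malta": 169.0,
    "Portugal": 172.0,
    "Norway": 180.0,
}
HEIGHT_NOISE_SD_CM = 6.0     # assumed man-to-man "height noise"
REGIONAL_NOISE_SD_CM = 2.0   # assumed "regional, ethnic, generational" noise
BATCH_SIZE = 6               # from the post: 6 men per averaged height
BATCHES_PER_SUB_MODEL = 14   # assumed even split; 4 * 14 = 56 averages


def batch_average(rng, mean_cm):
    """Average height of one batch of men drawn from a single sub-model."""
    # One shared offset per batch stands in for the regional noise.
    offset = rng.gauss(0.0, REGIONAL_NOISE_SD_CM)
    heights = [
        rng.gauss(mean_cm + offset, HEIGHT_NOISE_SD_CM)
        for _ in range(BATCH_SIZE)
    ]
    return statistics.fmean(heights)


def run_super_ensemble(seed=0):
    """Return the batch averages pooled over all four sub-models."""
    rng = random.Random(seed)
    return [
        batch_average(rng, mean_cm)
        for mean_cm in SUB_MODEL_MEANS_CM.values()
        for _ in range(BATCHES_PER_SUB_MODEL)
    ]


def summarize(averages):
    """Central tendency and 1-sigma spread of the pooled batch averages."""
    return statistics.fmean(averages), statistics.stdev(averages)


if __name__ == "__main__":
    mean, sd = summarize(run_super_ensemble())
    print(f"ensemble mean: {mean:.1f} cm")
    print(f"1-sigma band:  {mean - sd:.1f} to {mean + sd:.1f} cm")
```

Because the inputs are placeholders, the printed numbers will not reproduce the figure exactly. The structural point carries over, though: most of the spread in the pooled averages comes from the differences between sub-models rather than from the noise within any one of them.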
To give more detail, I added the number count data to this diagram. This permits the reader to compare it to a graph Gavin provided at Real Climate:
What are the predicted heights?
Based on what they know, the height-o-logist panel decides to make the following prediction about those elusive Swedes:
- The predicted average height of Swedes is 171.2 cm.
- The standard deviation in the average height of all six-measurement batches used in our model is 5.9 cm.
- The height-o-logists explain that their findings are robust. For example, they explain that if they remove 1/2 the samples, they get the same average answer– but with more “height noise”. They can also note that the model also shows the rich are taller than the poor in all countries– a robust finding.
In the IPCC analogy, these correspond to the best estimate of the global mean surface temperature (GMST) for a particular year, and its standard deviation, on graphics like this:

Figure 4: In this figure, the “mean” temperature as a function of time is communicated by the IPCC using the bold solid curves. The spread of the haze communicates the 1-sigma uncertainty bands for the predicted value of the underlying trend as reported by the IPCC. I have highlighted the “uncertainty” interval near the year 2007 with a vertical yellow line.
How are the height predictions analogous to the IPCC temperature projections?
These stated predictions are equivalent to the IPCC providing estimates of the central tendency at any time (say 2008) and the 1-sigma standard deviation of central tendencies predicted by all climate models. There is no “weather” or “height noise” in these predictions.
Now let’s test against the “real” world!
For outsiders to test against the real world, we must now collect data and treat it in a way that lets us test like to like: That is, average height of Swedes to average predicted height. (For climate modeling we collect data to compare average temperature trends to average temperature trends.)
How do we do this?
In the case of the height study, we go to Sweden and “measure us some Swedes!”
Since this is a simulated study, I “synthesized” the height of 6 Swedes using a random number generator set to provide an average height of 181 cm. Even though I only “measured” 1 batch of 6 Swedes, using the magic of Excel, I also computed 95% uncertainty intervals on the estimate of the average height of Swedes based on my sample of 6 Swedes. (Similarly, just as I can only sample 1 realization of the earth’s global mean surface temperature, I can calculate the average trend consistent with measurements over time, and also estimate the uncertainty in the true underlying trend.)
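The same calculation can be sketched outside Excel. The Python below assumes a standard Student-t interval for the mean of 6 measurements; the post does not say which Excel function was used, so treat that choice as an assumption. The spread of individual heights (4 cm) and the random seed are also placeholders, since only the 181 cm target mean and the sample size of 6 are given.

```python
import math
import random
import statistics

TRUE_MEAN_CM = 181.0    # from the post
ASSUMED_SD_CM = 4.0     # hypothetical; the post does not state the spread
SAMPLE_SIZE = 6         # from the post
T_CRIT_95_DF5 = 2.571   # two-sided 95% Student-t value, 5 degrees of freedom


def mean_with_95ci(heights):
    """Return (sample mean, half-width of the 95% interval) for 6 heights."""
    if len(heights) != SAMPLE_SIZE:
        raise ValueError("the hard-coded t value is only valid for 6 samples")
    mean = statistics.fmean(heights)
    std_err = statistics.stdev(heights) / math.sqrt(len(heights))
    return mean, T_CRIT_95_DF5 * std_err


if __name__ == "__main__":
    rng = random.Random(2008)  # arbitrary seed
    swedes = [rng.gauss(TRUE_MEAN_CM, ASSUMED_SD_CM) for _ in range(SAMPLE_SIZE)]
    mean, half_width = mean_with_95ci(swedes)
    print(f"estimated mean height: {mean:.1f} cm +/- {half_width:.1f} cm (95%)")
```

Note that the interval is built entirely from the 6 sampled heights. Nothing about the VMPN model enters the calculation.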
Here is an illustration of the outcome of one particular experiment:

Figure 5: The true average height of Swedes is compared to the outcome from one measurement from a sample of 6. Note that the true mean falls between the 95% uncertainty bands: this is expected to occur in 19 of 20 experiments based on 6 measurements.
So, what does this figure tell me?
The graph indicates that based on the 6 measurements of Swedes the current best estimate of the height of Swedish men is 182.8 cm (illustrated with a solid purple line). This happens to be greater than the “true” value of 181 cm illustrated in red. (In a normal experiment, we would not know the 181 cm, but I know this because this is a synthetic experiment.)
Having computed the uncertainty around my estimate, I would report that based on 6 measurements, the true height of Swedes is 182.8 cm ±3.8 cm with a confidence of 95%. These uncertainty intervals are illustrated with purple dashed lines above; they arise purely due to “height noise” in the specific sample population. Notice the red vertical line illustrating the true mean falls inside these uncertainty intervals.
So, with regard to Swedes, without knowing anything about the predictions of the height-o-logists, we can say that, based on the data from Swedes (not the height-o-logist model), the average height falls within a certain range. This estimate has nothing to do with predictive models, or model uncertainties.
The range is based on the properties of the Swedes we happened to sample.
The analogy to climate.
In the same sense, with regard to estimating the temperature trends experienced on earth: We can calculate a mean trend over a period of time and also estimate the uncertainty in the underlying trend based on the “weather noise” of the actual earth.
These are the sorts of trends and uncertainty intervals I computed in posts discussing falsification of the IPCC projections, and which can be read here as well as in many previous blog posts. (What we found was the trends consistent with weather since 2001 are inconsistent with IPCC projections for the central tendency.)
In both the case of the Swedes and the case of the empirically determined temperature trend, the estimates themselves and the stated uncertainty bounds can be questioned based on data or some physical understanding. But the predictions of the climate or height models are largely irrelevant to the empirical estimate.
So how do these compare to the “predictions”?
Of course, I already showed readers the outcome.
This is how the real data for Swedes compares to the predictions:
Notice the following:
- The central tendency predicted by models falls outside the 95% confidence intervals for all possible heights for Swedes. This can be observed by noticing the solid yellow line indicating the central prediction does not fall between the dashed purple lines indicating 95% confidence interval for the Swedes.
This is analogous to what I find when falsifying the IPCC projections for temperature trends: The central tendency predicted by the IPCC falls outside the confidence intervals consistent with recent measurements of the real, honest to goodness earth.
Conclusion: the predicted value of the central tendency is falsified relative to the “true” value.
- The 1-sigma uncertainty intervals for the predicted height fall outside the 95% confidence interval for all possible values consistent with real Swedes. Likewise, the central tendency for Swedes falls outside the 1-sigma error bars. This can be observed by noticing the dashed yellow lines indicating the 1-sigma intervals for the prediction do not fall between the dashed purple lines indicating the 95% confidence interval for the Swedes.
This is analogous to what I found when falsifying the IPCC projections in March. The +1 sigma trend predicted by the IPCC falls outside the 95% confidence intervals for the real temperature consistent with the measurements on earth.
So, the full region between the 1 sigma error bars is falsified.
- The average height of Swedes falls inside the 95% uncertainty bands for the full range model height outcomes. This can be seen by noticing the solid purple line for the average height of Swedes falls inside orange dashed lines for the 95% bands of the individual model outcomes.
In fact, in the model I concocted, 3 groups of 6 Norwegians and 1 group of Portuguese ended up “taller” than the average of Swedes. So, 4 out of 55 “realizations” were taller than the Swedes.
But: Notwithstanding Gavin’s laser-like focus on this point, this feature does not unfalsify the two previous items! The full model gives biased predictions.
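The first two comparisons in the list above are plain arithmetic on numbers already quoted in this post, and can be checked with a short Python sketch. The function and variable names are illustrative. The third comparison is left out because the post does not give numerical values for the orange 95% bands.

```python
PREDICTED_MEAN_CM = 171.2       # VMPN central tendency (from the post)
PREDICTED_SIGMA_CM = 5.9        # SD of the batch averages (from the post)
SWEDE_MEAN_CM = 182.8           # mean of the 6 sampled Swedes (from the post)
SWEDE_CI_HALF_WIDTH_CM = 3.8    # 95% half-width (from the post)


def inside(value, low, high):
    """True when value lies within the closed interval [low, high]."""
    return low <= value <= high


def check_predictions():
    """Compare the predicted mean and its +1 sigma bound to the Swedes' interval."""
    ci_low = SWEDE_MEAN_CM - SWEDE_CI_HALF_WIDTH_CM
    ci_high = SWEDE_MEAN_CM + SWEDE_CI_HALF_WIDTH_CM
    plus_one_sigma = PREDICTED_MEAN_CM + PREDICTED_SIGMA_CM
    return {
        "swede_ci": (ci_low, ci_high),
        "plus_one_sigma": plus_one_sigma,
        "central_tendency_inside": inside(PREDICTED_MEAN_CM, ci_low, ci_high),
        "plus_one_sigma_inside": inside(plus_one_sigma, ci_low, ci_high),
    }


if __name__ == "__main__":
    result = check_predictions()
    low, high = result["swede_ci"]
    print(f"95% interval for Swedes: {low:.1f} to {high:.1f} cm")
    print(f"predicted mean inside?     {result['central_tendency_inside']}")
    print(f"+1 sigma ({result['plus_one_sigma']:.1f} cm) inside? "
          f"{result['plus_one_sigma_inside']}")
```

The interval for Swedes runs from 179.0 cm to 186.6 cm, while the prediction plus one sigma only reaches 177.1 cm, so both checks come back false.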
What is the significance of the fact that the heights of Swedes do fall in the full range for models? It means the models are biased and imprecise.
In the case of the “height” analogy we know why they are biased: All of the models are “shorter” on average than Swedes. The falsification of the mean is not due to “height noise” in individual men. The issue of “height noise” is accounted for in the uncertainty intervals for the heights of Swedes, shown with purple dashed lines.
In the “height” analogy, the average height for Swedes falls inside the total range of heights predicted by the height models because even though
a) the models are biased on average it is also true that
b) some of the models happened to be close to right.
In particular, the Swedes tend to fall in the range of height for Norwegians.
So, the whole VMPN model is biased, despite the inclusion of Norwegians.
Interestingly, if we kept collecting lots and lots of data on Swedes, we would likely continue to find they fall in the range of heights for Norwegians. The result is: even though the “height” model as a whole is distinctly biased, the average heights of Swedes will tend to fall inside the full span of the model predictions.
Is this model useful for planning purposes? I guess that depends. But if you requisition uniforms for the Swedish army based on that model, expect to run out of uniforms for tall men quickly.
What does this mean about the IPCC predictions?
When compared to earth, the IPCC AR4 predictions appear biased and falsified for that reason. So, one might want to consider this when making plans for the future.
The fact that some models in the lower end may be predicting things accurately doesn’t magically erase the issue of bias. The fact that the climate model has huge uncertainty bars doesn’t “unfalsify” the full model with regard to the question about the central tendency:
The central tendency predicted by the model appears biased relative to the weather data.
What is the cause of the bias? Beats me!
In the case of IPCC predictions, we might suspect that at least some of the models contained in the ’super ensemble’ over-predict temperature increase on earth during the current period of time.
But some models may be ok. Just as Norwegians are a fairly decent model for predicting the height of Swedes, some of the individual models used by the IPCC may be less biased relative to the true earth.
In this regard, it might be best if future panels engage in a winnowing process to remove individual models that appear less trustworthy from the full collection of models used to make projections. I suspect they will, but this doesn’t retroactively fix the AR4 projections, which appear biased.
So… about that falsifiability issue.
Because Gavin brought up such a novel idea for not-falsifying models, I need to comment on Roger Pielke Jr’s frequent discussions of falsifiability.
One of the interesting things about Gavin’s method is that, oddly enough, the use of many models results in a humongounourmous range of “weather” that can be consistent with model predictions. If this were due solely to the range of variability of weather on earth, that would be fine. But a sizeable amount is due to the “climate parameterization noise”, which causes a sizeable spread in predictions. Insisting that we cannot observe that the central tendency of the predictions is clearly biased does, indeed, result in the “unfalsifiability” problem often brought up by Roger Pielke Jr.
Why? Because no matter how biased the VMPN-height model is relative to Swedes, the fact that it contains Norwegians means that the heights of Swedes will always be contained in some realizations in the ensemble.
But the average will still always be wrong, and the VMPN model has no skill. But evidently, we are not permitted to observe that this average is wrong. For… some… reason.
In conclusion
The IPCC projections remain falsified. Comparison to data suggests they are biased. The statistical tests account for the actual weather noise in data on earth.
The argument that this falsification is somehow inapplicable because the earth data falls inside the full range of possibilities for models is flawed. We know why the full range of climate models is huge: It contains a large amount of “climate model noise” due to models that are individually biased relative to the system of interest: the earth.
I will continue to admit what I have always admitted: When applying hypothesis tests at a significance level of 5%, one does expect to be wrong 5% of the time when the hypothesis being tested is actually true. It is entirely possible that the current falsification falls in the category of 5% incorrect falsifications. If this is so, the “falsified” diagnosis will reverse, and we won’t see another one anytime soon.
However, for now, the IPCC projections remain falsified, and will remain so until the temperatures pick up. Given the current statistical state (a period when large “type 2” error is expected) it is quite likely we will soon see “fail to falsify” even if the current falsification is a true one. But if the falsification is a “true” falsification, as is most likely, we will see “falsifications” resume. In that case, the falsification will ultimately stick.
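The 5% rate of incorrect falsifications can be illustrated in the height analogy with a short Monte Carlo sketch. It assumes a perfectly unbiased prediction (equal to the true mean), Gaussian “height noise”, and a Student-t interval on batches of 6; the 4 cm spread, the trial count, and the seed are arbitrary placeholders, not values from the post.

```python
import math
import random
import statistics

SAMPLE_SIZE = 6
T_CRIT_95_DF5 = 2.571   # two-sided 95% Student-t value, 5 degrees of freedom


def false_falsification_rate(trials=4000, true_mean=181.0, sd=4.0, seed=0):
    """Fraction of trials in which an unbiased prediction is rejected anyway."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sd) for _ in range(SAMPLE_SIZE)]
        mean = statistics.fmean(sample)
        half_width = (
            T_CRIT_95_DF5 * statistics.stdev(sample) / math.sqrt(SAMPLE_SIZE)
        )
        # The prediction equals the true mean, so every rejection is a false one.
        if abs(true_mean - mean) > half_width:
            rejections += 1
    return rejections / trials


if __name__ == "__main__":
    rate = false_falsification_rate()
    print(f"unbiased prediction rejected in {rate:.1%} of trials")
```

The rate should hover near 5% whatever spread is assumed, because the interval width scales with the sample's own noise. That is the sense in which a single falsification can be a fluke.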
For now, all we can do is watch the temperature trends of the real earth.
Update: I realized I’d left the word “model” out of “climate model noise”.
94 Responses to “Do IPCC projections falsify? (Are Swedes Tall?)”
Reference May 14th, 2008 at 1:41 pm
Finally GISS released the April 2008 Global Near-Surface Anomaly - it’s + 0.41°C (a whopping decline of 0.26°C from March) - that’s a lot of energy flowing out of the near surface atmosphere, if it could be harnessed just think how much fossil fuel would be saved.
lucia May 14th, 2008 at 1:43 pm
It declined from March? Wow!
I haven’t mostly been expecting increases right now.
John V May 14th, 2008 at 2:55 pm
lucia,
After just a quick read I can see a couple of major issues with your response to Gavin. I don’t have much time right now so I’ll just mention them quickly:
#1: In the graph showing “Error bars according to Gavin” you took the error bars on a 7-year trend and extended them out 100 years. Of course they look too large. The 20-year trend would look much more reasonable.
#2: Your analogy to the height of Swedes is only applicable if the height of Swedes changes over time. How does ENSO fit into your analogy? What about the Schwabe cycle? You can wave these off as “weather noise” but your confidence intervals do not include this noise.
John V May 14th, 2008 at 3:00 pm
Also, don’t forget that the IPCC prediction is *not* for a constant trend. The IPCC model results which form the basis of the prediction clearly show that. What you have falsified is a constant trend of 2.0C/century. The confusing thing from a scientific pov is why you insist on calling the constant trend the IPCC prediction.