Yesterday, I discussed the different questions one might ask when pondering the “statistical significance” of the high 2007 melt index relative to reconstructed values in FKM2011. I came up with two new questions, and mentioned I had figured out the answer to this one:
Q2: Given the uncertainty in the estimated MI for a given year, where does 2007 fall in the range of “true values” associated with the years in the reconstruction?
(Note: I now have a list of four reasonable questions one might ask. I revised the post to answer Q1(a), my first understanding of what Steve Mosher might mean by “How high would 2010 have to be to be statistically different from the reconstructed melt?”. The answer to Q1(a) is: I don’t know the number, but if we assume at least one value in the past had an MI equal to 2007’s, the probability we’d see the sort of data we saw is very high! So, we can’t reject that null, no way, no how.)
I promised people I’d discuss the probability that 2007 was actually a record, and I plan to do so. But that post is more work than the one required to provide the answer for Q2, so I’m going to answer Q2 first.
I’ll begin by estimating the range of “true values” consistent with one particular year. For this, I will pick 1928, which had a reconstructed value of $MI_{1928,est} = 1.64$. The uncertainty in this value can be obtained from the reconstruction itself. From that reconstruction, we find the standard error of the residuals is 0.63, that the residuals appear Gaussian, that they lack temporal autocorrelation, and that, as required for a reconstruction, the magnitude of the errors is uncorrelated with the reconstructed values. So, I will assume the errors are normal, with mean zero and standard deviation 0.63.
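Written out (the “true” and “est” subscripts here are just my shorthand for the actual and reconstructed values in a given year y), the error model this amounts to is

$MI_{true,y} = MI_{est,y} + \epsilon_y$, with $\epsilon_y \sim N(0, 0.63^2)$,

with the errors treated as independent across years, consistent with the lack of temporal autocorrelation in the residuals.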
Under this assumption, I can use “rnorm” in R to create a distribution of 1000 possible values for the “true” MI for 1928. I did so, and saved those values.
I then generated 1000 values for every other pre-satellite year for which FKM created a reconstructed value. This results in an estimate of the distribution of “true” values that we would have seen conditioned on the reconstructed values in FKM. Note: The reconstructed values are deterministic; only the “errors” are synthesized using Monte Carlo methods.
Once I had this list of numbers, I sorted them and found that the upper edge of the ±95% confidence interval for the “true” Melt Indices, based on the reconstructed Melt Indices and the known magnitude of the residuals, is 1.85. This is less than 2.015, the value observed in 2007. So the melt index for 2007 is definitely outside the ±95% confidence intervals of Melt Indices for the pre-satellite record. (It’s roughly just at the edge of the ±97.5% confidence intervals.)
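For concreteness, here’s a minimal R sketch of the procedure I just described. The residual standard deviation (0.63), the number of draws per year (1000), and the 2007 comparison value (2.015) are the numbers discussed above; the vector of reconstructed values is just a placeholder you’d swap for the actual FKM2011 reconstruction, so the result it spits out won’t match mine exactly.

```r
# Minimal sketch of the Monte Carlo described above.
set.seed(42)
n_draws     <- 1000
sigma_resid <- 0.63                  # standard error of the reconstruction residuals
mi_recon    <- c(1.64, 0.95, 1.10)   # placeholder reconstructed MI values, one per pre-satellite year

# For each year, synthesize n_draws possible "true" MI values:
# the reconstructed value plus a Gaussian error with mean 0 and sd 0.63.
sim_true <- sapply(mi_recon, function(mi_est) mi_est + rnorm(n_draws, mean = 0, sd = sigma_resid))

# Pool all the simulated "true" values and find the upper edge of the +/-95% interval.
upper_95 <- quantile(as.vector(sim_true), probs = 0.975)
upper_95   # with the actual FKM reconstructed values this works out to about 1.85, below the 2.015 observed in 2007
```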
But does this mean that the 2000s are outside the previous range? No. Claiming that would require more work. The reason is that there are 31 years in the post-satellite period. If the melt indices were uncorrelated (which they are not), the probability that at least one year would have a Melt Index falling outside the ±95% confidence intervals is roughly 80%. So the fact that 2007 falls outside the confidence intervals is not, by itself, sufficient to decree the recent warm period inconsistent with past melt periods.
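For anyone who wants to check that arithmetic, treating the 31 post-satellite years as independent, each with a 5% chance of falling outside a two-sided ±95% interval:

```r
# Probability that at least one of 31 independent years falls outside a 95% interval
1 - 0.95^31   # approximately 0.80
```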
So for now:
- The MI for 2007 is outside the ±95% confidence intervals based on the FKM2011 reconstruction. (I don’t think there was ever much of a dispute about this.)
- By itself, this observation is not enough to decree the current period inconsistent. It may or may not be. The analysis to determine which it is will require taking into account other years and considering serial auto-correlation in the melt index.
- I haven’t shown this yet, but my estimate of the probability that 2007 is a record Melt Index, relative to values in the reconstruction in FKM, is p=12.6%.
Coming up: Tomorrow, I’m going to show some interesting features of reconstructions that people should be aware of when using “the eyeball method” to try to learn something from a reconstruction. This will help explain why so many people comparing the reconstructed values to the “true” values tend to imagine that “true” values outside the reconstructed range must be records, and/or why they resist the notion that, given the information in the reconstruction, 2007 is probably not a record MI. For this purpose, I’ll be using reconstructions of synthetic data, where we can “know” the true values corresponding to all reconstructed values. I think it’s pretty interesting. As a teaser, I’ll show you this and ask you to guess which color corresponds to “true” and which to “reconstructed”:

Hint: Guess based on which trace has the highest highs and the lowest lows. 🙂
Ok – I will give it a try.
I guess the blue is the true and the red is the reconstructed.
My rationale is that the reconstruction will dampen the original variability, so under that theory, the higher highs and lower lows of the blue trace indicate the original, or true, values.
RickA–
Yep. It turns out that for reconstructions using the method of FKM, the “true” values tend to have higher highs and lower lows. That’s why people looking at a graph that shows the reconstruction and the “true” values together will tend to perceive outliers in the “true” series even when there are none.
Note: I suspect this is a feature of all reconstructions based on a proxy. The reason is that the portion of variability not explained by the proxy is simply lost in the reconstruction. It happens that the variance lost is (1-R^2)*σ^2, where R is the correlation coefficient for the reconstruction and σ^2 is the variance of the “true” series; equivalently, the variance of the reconstruction itself is R^2*σ^2.
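A quick way to see this loss of variance is with synthetic data. The sketch below is purely illustrative (the series, the proxy noise level, and the sample size are made up, not taken from FKM); it just shows that the fitted values from a simple regression have variance R^2 times the variance of the series being reconstructed.

```r
# Synthetic illustration: a regression-based reconstruction recovers only R^2 of the
# variance of the "true" series; the remaining (1 - R^2)*var(true) is lost.
set.seed(1)
n       <- 500
true_mi <- rnorm(n, mean = 0, sd = 1)            # the "true" series
proxy   <- true_mi + rnorm(n, mean = 0, sd = 1)  # an imperfect proxy of it
recon   <- fitted(lm(true_mi ~ proxy))           # reconstruction: regress "true" on the proxy

r2 <- cor(true_mi, proxy)^2
c(var_true  = var(true_mi),
  var_recon = var(recon),            # matches r2 * var(true_mi)
  r2_x_var  = r2 * var(true_mi))
```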
Lucia,
Mike Mann may not completely agree. 😉
SteveF–
He may not. That’s one of the reasons I think it is interesting to discuss this feature.
Mind you: this doesn’t mean the end point with real data must show a max or min. It’s just that the “true” values appear to be more variable, and that is owing to the fact that no proxy is perfect, so the reconstruction must always show less variability. This is important to recognize when assessing whether the end point is a record or falls outside the range of true data one is trying to estimate using the reconstruction.
Lucia,
And your loss of variance doesn’t even address the bigger issue in multi-proxy reconstructions of ‘fishing’ through a multitude of potential proxies to find some that (maybe by pure chance) fit the data in the instrument record… and then basing the entire reconstruction on those, with the potential for much more ‘loss of variance’ (AKA just wrong) in the reconstruction period. Weak. Very, very weak.
SteveF–
No. This is just recognizing an issue that exists even when people don’t fish. The fishing problem you describe also resulted in biases in the errors and in “features” that someone might think had to do with the historical variation.
As long as we are being philosophical: I’m concerned about the distinction between the error or uncertainty that is calculated when deriving the proxy and the actual uncertainty, which of course could only be known if we knew the true values for the years at issue (i.e., the reconstruction years).
The assumption is generally that the calculated uncertainty is a good representation of the true uncertainty. Certainly, the calculated uncertainty is a lower bound. But what the (in)famous Mann and similar multiproxy reconstructions show is that there can be unrecognized sources of error that are significant and uncalculated. Perhaps uncalculable. Both “known unknowns” that are, for whatever reason, not added in to the error calculations, and “unknown unknowns” that are completely ignored.
Knappenberger’s online persona is far superior to Mann’s (IMO), but that would not exempt his proxy-based reconstruction from these problems.
Khandekar, writing at Pielke Sr’s blog, says the two warmest decades for Greenland were the 1920s and 1930s.
http://pielkeclimatesci.wordpress.com/2011/05/11/guest-weblog-post-commentary-on-%e2%80%98sea-level-rise%e2%80%99-by-madhav-khandekar/
Don B,
The trouble is that back in the 1920s and 1930s they didn’t have computers and satellites, and all good scientists today ‘know’ they don’t ‘lie’ 🙂