Recently, Anthony Watts has been reporting on weather anomalies, showing data from GISS Land-Ocean, HadCRUT, UAH, and RSS. Jeff, one of Anthony’s commenters, had previously remarked that, based on visual inspection,
Seems like GISS is the odd man out and should be discarded as an “adjustment”.
I wondered if this were so.
I know it can be difficult to compare temperature anomalies, as each is referenced to a different baseline. Moreover, two of the series refer to surface temperatures and two to lower-troposphere measurements. However, if we subtract one similar anomaly from another, we might expect the result to be a constant plus noise. If so, then when fitting a linear regression to any temperature anomaly difference, the slope should be zero. Examining differences in measurements also permits us to visually inspect the data to see whether the differences appear consistent with “measurement noise” without being distracted by the overall trend in the underlying data itself.
So, I downloaded the data Anthony was plotting, subtracted GISS from each of the other three data sets, and plotted the differences. (Note: out of laziness, I did not double-check this particular data against the sources. I just want to see differences, and I’m assuming Anthony got the correct data.)
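For readers who want to try the subtract-and-regress check themselves, it can be sketched in a few lines of Python. The two series below are synthetic stand-ins sharing one made-up warming signal (I’m not reproducing the actual spreadsheet here), so only the procedure, not the numbers, is meaningful:

```python
import numpy as np

# Synthetic stand-ins for two monthly anomaly series that share one
# underlying trend; the real check would use the downloaded data.
rng = np.random.default_rng(0)
months = np.arange(1979, 2006, 1 / 12)            # ~27 years, monthly
signal = 0.017 * (months - months[0])             # common warming, deg C
giss = signal + 0.1 * rng.standard_normal(months.size)
hadcrut = signal - 0.05 + 0.1 * rng.standard_normal(months.size)

# If both track the same climate signal, the difference should be a
# constant offset plus noise, so the fitted slope should be near zero.
diff = hadcrut - giss
slope, intercept = np.polyfit(months, diff, 1)
print(f"slope of (HadCRUT - GISS): {slope * 100:+.3f} C/century")
print(f"mean offset: {diff.mean():+.3f} C")
```

With a shared signal, the fitted slope of the difference comes out small compared to its scatter, while the constant offset (the baseline difference) survives the subtraction.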
Over the past 27 years, we see good agreement in the measured anomalies reported by the four groups.
This is comforting, since it means that validations of hindcasts done over that time period will generally give similar results regardless of the data set selected.
When looking backwards, GISS Land-Ocean data does not look like an outlier. (Of course, I found similar results comparing GISS Land-Ocean to HadCRUT data previously. Those match well, though the GISS Met / CRUT data show systematic differences.)
Of course, my readers know I am somewhat interested in validating IPCC forecasts first published in 2000 and included in the recent IPCC report. To do this, we must use data beginning in 2001.
So, I also compared data from 2001 on. Right now, it appears that HadCRUT and RSS anomalies are trending down at a rate of about 1 C/century relative to GISS. (Of course, we sort of already knew this. We know that over the past 7 years, HadCRUT shows no slope in the anomaly, while GISS Land-Ocean shows a small slope.)
UAH tracks GISS quite well.
Of course, we can say little about the disagreement in the data. We already know from previous analyses that the uncertainty in the slope is large when we have only 7 years of data. While a difference in slope of 1 C/century in measured data would be huge if it persisted, we cannot distinguish this from a slope of zero using 7 years of weather data. I’ll admit that I had always assumed most of the scatter around best-fit lines was due to weather, but these results may suggest quite a bit of the scatter is due to measurement error. If so, the various measuring agencies may be a bit optimistic in their estimates of measurement errors.
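To put a rough number on that, here is a back-of-the-envelope sketch of the standard error of an OLS trend fit to 7 years of monthly data, under the optimistic assumption of independent monthly noise with a guessed standard deviation of 0.1 C (serial correlation in real weather data would widen this further):

```python
import numpy as np

n_months = 7 * 12                     # 2001-2007, monthly
t = np.arange(n_months) / 12.0        # time in years
sigma = 0.1                           # assumed monthly noise, deg C

# Standard error of an OLS slope with independent noise:
#   SE(slope) = sigma / sqrt(sum((t - mean(t))^2))
se_slope = sigma / np.sqrt(np.sum((t - t.mean()) ** 2))
print(f"slope standard error: {se_slope * 100:.2f} C/century")

# A 2-sigma band of roughly +/- 1.1 C/century easily covers both a
# slope of zero and a 1 C/century difference, consistent with the text.
```

Even with these generous assumptions, the two-sigma uncertainty band on a 7-year trend is about as large as the 1 C/century disagreement itself.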
Interestingly, over the past 7 years, the differences in the climate warming trend estimated using different data sets are as large as or larger than the underlying trend itself. Too bad measurements aren’t easier to do. This scatter in experimental results is one of the reasons it takes so long to validate or falsify projections. Both the weather and the measurements are noisy!
—
Update: 12:09 am March 3, 2008: I’m trying to figure out the provenance of the data I downloaded from Anthony’s site. The blog post where I found the link to the data temporarily gave a 404 error, but it reappeared later, having been drastically edited.
Because the provenance affects how I got my data, I fished the text out of the Google cache, and I’m reproducing the relevant text. From Watts Up:
I recently plotted all four global temperature metrics (GISS, HadCRUT, UAH, RSS) to illustrate the magnitude of the global temperature drop we’ve seen in the last 12 months. At the end of that post, I mentioned that I’d like to get all 4 metrics plotted side-by-side for comparison, rather than individually.
Of course I have more ideas than time these days to collate such things, but sympathetic reader Earle Williams voluntarily came to my rescue by collating them and providing a nice data set for me in an Excel spreadsheet last night.
Traceable sources are contained in the linked blog post.
Some of the differences between GISS and HadCRUT can be explained by their methods of handling polar data. GISS interpolates over the poles while HadCRUT ignores the poles (in effect, the HadCRUT method assigns the global average temperature to the polar regions).
I believe both sets of satellite data (UAH and RSS) also ignore the poles. I have not been able to find a reference for the geographic limits of each record.
It would be interesting to make GISTEMP and HadCRUT equivalent by omitting the poles from the GISTEMP analysis. I am not sure if the required GISTEMP gridded data is available.
JohnV–
Yes. Presumably, the differences have to do with handling of the poles. In some sense, it would be interesting if someone did the analysis to provide yet another data product using the same data. For myself, I’m content to just compare and remain aware that there are uncertainties in data.
If a third party did something different with the thermometer measurements, then we’d just have another averaged product based on different, plausible ways to look at the data. To some extent, I just think this is the level of uncertainty we have, and mostly, we need to wait for more data to arrive to narrow the uncertainty bands on the amount of warming in the future.
I agree.
Calculating a “non-polar” GISTEMP would be interesting in an academic sense but would not contribute much.
I haven’t been able to find a gridded GISTEMP dataset, but I did find a couple of GISS graphics that show the effect of the Arctic:
http://data.giss.nasa.gov/gistemp/graphs/ArcticEffect.pdf
Basically, ignoring 80N-90N has very little effect. Ignoring 70S-90S would work in the opposite direction. Eyeballing the slope, it seems that GISTEMP would have a positive slope from 2001 to 2007 even without the Arctic. Other explanations for the differences between GISTEMP and HadCRUT are likely required.
Well… there are always those station adjustments. Does Hadley do anything to eliminate urban heat island effects? If so, maybe one group removes “more”.
I reserve judgment on whether taking out “more” urban heat island effect would be better or worse. The ultimate problem with this adjusting is that there is no calibration standard.
HadCRUT uses ship measurements of sea surface temperatures, whereas GISS uses satellite data.
Hi,
I wonder if your slopes would be even steeper if they were started in the first descending month after the highest month in 1998.
Grant Hodges
Lebanon, IN
Grant–
I don’t know for sure what the slopes would look like if we started right after the high spot in 1998. However, there are two big problems with choosing that point:
1) It’s irrelevant to testing IPCC projections which were published in 2000. So, for that, 2001 is the correct year.
2) When applying hypothesis tests, one must avoid selecting the data set based on features of the data itself. That’s cherry picking and it really screws up any statistical hypothesis test.
I picked 2001 as the start year because that’s the first year after the projections/predictions were made. In the end, the IPCC projections may hold up, or not. We won’t know until weather happens and more fresh data arrive.
I’m curious how interpolation of the poles would give GISS a warmer record.
On its face, it makes no sense.
The other differences between HadCRUT and GISS are in the handling of grid cells that are partially land and partially sea, and the stations used are different.
Ah, one last thing… GISS uses stations up to 1200 km apart to create a temperature value. I found this; I might bring it up at CA:
Several thousands of temperature records from the Global Daily Climatology Network are analysed by means of detrended fluctuation analysis (DFA). Long-range temporal power-law correlations extending up to several years are detected for each station. Contrary to earlier claims, the correlation exponent is not universal for continental locations. Short-range correlations are also evaluated by DFA and by first order autoregressive models. The strength of short- and long-range temporal correlations seems to be coupled for large geographic areas. The spatial patterns are quite complex, simple parameter dependence such as elevation or distance from oceans cannot explain the observed variability.
Steve–
All the things you bring up matter. Obviously, to the extent that the “instruments” get different results, we must have at least that much uncertainty in the measurements. I’m using “instrument” broadly here to include the actual devices (thermometers etc.) and algorithms used to estimate the measured value.
The differences here must reflect measurement uncertainty.