Chad was curious about a common claim that pops up in blog discussions. The argument goes more or less like this:
Temperature stations report minimum and maximum temperatures to the nearest degree. Therefore, the mean computed from these rounded values can have no more precision than the two numbers that went into calculating it. Thus the mean ought to be rounded to the nearest degree as well.
So, this argument would say that if we round measured data to 1 degree, then the average can’t be known to better than one degree.
Now there are two issues here in the claim:
- Does the rounding matter?
- Does the min/max issue matter?
Oddly, the way the issue above is worded, one might think that there are people who claim you can’t get better than 1 C of precision because the measurements are rounded to 1 C. That’s bunk. It’s always been bunk. I’ve known that as a general rule, the argument is wrong. I’ve known this as long as I can remember. Mind you, there was probably a time when I did not know it was wrong… but I don’t remember not knowing it. I’d guess that predates my knowing the definition of “standard deviation”. (There are a few special cases where rounding causes the specific problem just described–but these problems don’t apply to global average temperatures.) I’ll talk about this a bit more later on.
The other issue is: Can computing the average based on min and max matter? And by matter, I don’t mean does it affect the accuracy of an individual average, but will it affect trends.
The answer to this is not as obvious. Hypothetically, replacing the average over 24 hours with the average of the minimum and maximum recorded values could introduce error into the computation of a trend if the shape of the diurnal pattern for warming and cooling changed as the planet’s climate changed. For example: if the diurnal temperature variation was once perfectly sinusoidal over a 24 hour period, but changed to something else, the relationship between the average of the minimum and maximum temperatures and the 24 hour average temperature could change. If it changed markedly, this could affect the computed trend. Assuming that you really want to know the average over 24 hours, then, hypothetically, the use of min/max could cause a bias in the observed values of the earth’s surface trend.
Anyway, out of curiosity, Chad computed both ways using climate model data. He extracted ‘model data’ from an AOGCM, and then computed the monthly anomaly data using the full precision in the ‘model data’. He then repeated the exercise by recording the min/max and recomputed the trends. He basically found that, at least for GISS EH, recording min/max and rounding made practically no difference to the computed trend.
You can read about that here: “False Precision – It Doesn’t Matter”.
Now, some of you might wonder: Could these things have ever mattered? Is Chad’s analysis absolute proof that using the min/max method won’t introduce a bias? And what about the rounding?
I’ve put two spreadsheets together creating two hypothetical examples to show how these might hypothetically affect the trend. (But sorry, those of you who think there is a big problem with rounding to 1 C. No. There isn’t. Never has been. Nope. )
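If you want a quick sanity check that doesn’t require a spreadsheet, here is a minimal sketch in C. The numbers are made up purely to illustrate the point; this is not Chad’s calculation or the spreadsheet analysis.

#include <stdio.h>
#include <stdlib.h>
#include <math.h>

/* Toy illustration: rounding individual readings to the nearest degree
   does not limit the precision of their average, provided the readings
   themselves vary by more than the rounding step. */
int main(void)
{
    const int N = 10000;                 /* number of readings */
    double true_sum = 0.0, rounded_sum = 0.0;

    srand(12345);
    for (int i = 0; i < N; i++) {
        /* hypothetical "true" temperature, uniform between 10 C and 20 C */
        double t = 10.0 + 10.0 * rand() / (double)RAND_MAX;
        true_sum    += t;
        rounded_sum += floor(t + 0.5);   /* round to the nearest degree */
    }

    printf("true mean    = %.4f\n", true_sum / N);
    printf("rounded mean = %.4f\n", rounded_sum / N);
    printf("difference   = %.4f\n", rounded_sum / N - true_sum / N);
    return 0;
}

The rounded mean lands within a few hundredths of a degree of the true mean, not within ±1 C. The special cases I alluded to above are things like a reading that never varies: if every value were exactly 1.51 and each were rounded to 2, the rounding error would never cancel.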
There might be an analogy with baseball averages. A batter can effectively score only 0 or 1, but his average is reported to three decimal places. If we followed a rule that averages should have the same precision as the individual source data points, nearly all baseball batting averages would be reported as zero.
Should it read “rounding made practically NO difference to the computed trend.”?
It’s worse than that Lucia.
The precision stated on the thermometer (something on the order of 0.1C for mercury thermometers) only applies when the temperature you want is the actual temperature of the bulb.
The actual use for the information though is as “the average gridcell temperature.” Here, the actual thermometer is just a proxy for the desired variable. Yes, it is an actual thermometer, but a glance at a local weather map would seem to indicate that “0.1C” is a wild overstatement of the precision.
Braddles, your analogy doesn’t quite fit. The real comparison would be to extend your analogy and assume that instead of a 0 or 1 at each at bat, a player could get any number – but it was recorded only as 0 or 1.
(let’s assume above 0.5 goes to 1 and below 0.5 goes to zero).
Then the question comes: How much error could you have in a batting average?
the amount of error is HIGHLY DEPENDENT on assuming there is a random distribution of results. Then the batting average acts like a polling sample, and is akin to monte carlo sampling… I haven’t done the math, but I’d guess a 1000 at-bat ‘run’ could be off by no more than 4% 19 times out of 20. BUT … If there is ANY non-linearity or non-randomness … all bets are off.
More to the point – there IS an error range, and it is likely calculable if you make certain assumptions.
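As a rough check on that guess, treating each at-bat as an independent 0/1 draw (the big assumption noted above):

#include <stdio.h>
#include <math.h>

/* 95% margin of error for a batting average treated as a binomial
   proportion over n independent at-bats (the assumption stated above). */
static double margin95(double p, int n)
{
    return 2.0 * sqrt(p * (1.0 - p) / n);
}

int main(void)
{
    const int n = 1000;
    double p;
    for (p = 0.2; p < 0.55; p += 0.1)
        printf("p = %.1f : +/- %.3f\n", p, margin95(p, n));
    return 0;
}

The margin tops out around ±3.2% at p = 0.5 for 1000 at-bats, so “no more than 4%, 19 times out of 20” is in the right ballpark – provided the independence assumption holds.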
Alan–
There are some issues related to precision. But the questions here relate to a) the rounding and b) how min/max might hypothetically affect trends and whether or not they seem to as a practical matter.
The difficulty with thinking that 0.1 C is a wild overstatement of precision is that when you are discussing precision or accuracy, you need to pin that to the thing you are computing. Whether people like it or not, sometimes a method can determine (A-B) more accurately than it can determine either “A” or “B”. It seems weird, but this sometimes occurs.
With regard to climate change, we usually care about temperature differences, not absolute temperatures. So, whether or not something “matters” depends on what we are trying to diagnose. So, for example: when computing a trend, we don’t actually care about the temperature of the “Alaska” grid cell. We only care about the change in the temperature relative to a baseline.
Lucia,
I thought the issue was in the uncertainties. As Briggs says, too many people are too certain of too many things.
Stan–
Oh there are certainly uncertainties. But that’s not quite the same as the issue some claim. Some people do really claim that if you round thermometer measurements to 1 C you can’t know changes in anomalies to better than ±1 C because of the rounding.
Now, it may well be that we don’t know the changes very well. But it’s not because of the rounding of individual thermometer readings!
It is sometimes important to understand the actual source of the uncertainty. I suspect Briggs would agree with this.
Lucia, the min/max problem has been addressed in the literature, but I liked Chad’s approach. Folks can do it for themselves with real data.
http://www.ncdc.noaa.gov/crn/
ftp://ftp.ncdc.noaa.gov/pub/data/uscrn/products/daily01/
braddles’ baseball analogy got me thinking. A batter’s average (0 for an out, 1 for a hit) can give me good information about how this hitter compares to other hitters “hits” vs. “outs”. But it doesn’t tell me enough about how “good” the hitter is (which is what we want to know). There is lots of other information that needs to be included to give the batting average meaning. You can even get into things like corked bats and juiced balls, what era the player was in, steroids, who has been batting around him, what kind of pitching he faced, slugging percentage, walks and strikeouts, the number of at-bats vs plate appearances, etc…
I’m not a climate scientist but I imagine it is similar when talking about temperature numbers in the context of climate.
Andrew
steven–
Of course it’s addressed in the literature! But that doesn’t always mean it’s done in a way that readers really “get”.
Andrew_KY– Why is how “good” the hitter is what “we” want to know? Can’t “some of us” “really” just want to know the probability he will get on base by some means other than a walk?
The current focus is on narrow claims:
1) Does rounding station temperature readings to ±1C mean we can’t know global average temperature to better than ±1C? (The answer is absolutely no.)
2) Does using min/max affect the estimate of the 24 hour average temperature? (Chad finds in GISS EH, it does not. However, the answer could change under certain hypotheticals– in which case, we have to consider the plausibility of the hypotheticals.)
3) Does using either min/max or rounding affect the computation of the trend over, say, 30 years? With respect to rounding: the answer will be “for all practical purposes, no”. With respect to min/max, Chad finds for GISS EH, the answer is no. But we’ll see that for hypotheticals the answer doesn’t have to be no. (Tomorrow, I’m going to show implausible hypotheticals, but that’s because I need some to show when this can happen. )
Lucia,
Because a hitter’s average is, say, .250, that doesn’t mean that’s the probability he will get a hit. He may be hitting very well against lefties and horribly against right handers. Is the pitcher he is currently facing a righty or a lefty? Or he may hit the cover off the ball against the Royals and couldn’t hit the Yankees if he was swinging a ping-pong table. Are we playing the Yankees this series?
WE need more info than just the batting average. 😉
Andrew
Lucia, that’s exactly why I liked Chad’s approach.
I recall some articles where trends for Tmin, Tmax, and (Tmin+Tmax)/2 were looked at separately.
It seems that in a lot of cases, Tmax doesn’t have much trend, but Tmin has a much larger upward trend.
I don’t recall where I saw this analysis, but probably over at Pielke Sr’s blog, since he is researching (some would say fixated on) land use effects, and land use changes seem to have a greater effect on Tmin than on Tmax.
The BA is a nice baseline (ha ha) statistic. When you want more specifics, you can then look at stuff like total OBP, “slugging percentage,” BA vs lefties, etc. If there are large discrepancies, like he bats > 0.300 normally but 0.186 against the Yankees, then we might want to guess why this could be (is there some pitcher that just has his number, does he just play the Yanks in the postseason and crack under the pressure, what?) but like any stats probably doesn’t really “prove” anything.
I’d guess that true baseball “stat junkies” have some intuitive feel of the variance given the “at-bats” stat plus the stats for this week, this month, this season, and all that. These days, more-sophisticated stat junkies probably write a model. 🙂
Amusingly, having a noisy signal makes it easier to increase measurement resolution by oversampling. Very common with A/D converters in radio design, where noise might even be intentionally added to the signal in order to reduce quantization noise and improve linearity.
Sean — yes but the noise must be “rich and smooth”!
oliver,
“but like any stats probably doesn’t really “prove” anything.”
Indeed. And like calculating a player’s overall batting average and the global average temperature anomaly… what have we “proved”?
Andrew
Many electrical engineers are familiar with using averaging to increase the effective resolution of an analog-to-digital converter.
It doesn’t work in a system where there is not much noise. Nor does it work well in a system with too much noise. It works when the standard deviation (or root-mean-square value in electrical engineeringeese) of the noise is a few of the smallest steps of the analog to digital converter.
If there isn’t any noise, then the reading is stuck on one reading all the time. The effective quantizing noise is +/-1/2 of the smallest step (and IIRC, with a variance of 1/12 of the smallest step squared). I have designed systems where I purposefully added in some noise, which is sometimes called “dither”.
If there is a bit of noise, then averaging multiple readings will increase the resolution. The noise is reduced by the square root of the number of readings. With just a couple smallest steps of noise, a lot of systems can average enough readings to get rid of most of the noise while leaving the improved resolution. I’ve used this to effectively get 10 bits of resolution (1 out of 1024) from an analog to digital converter with good accuracy but only 8 bits of resolution (256 steps).
If you add in too much noise for the amount of averaging one can do, then you degrade the system.
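Here’s a rough numerical sketch of that dither effect (the numbers are toy values for illustration, not from any real design I’ve built): quantize a constant signal with and without added Gaussian noise, then average many readings.

#include <stdio.h>
#include <stdlib.h>
#include <math.h>

#define TWO_PI 6.283185307179586

/* Crude Gaussian random number via the Box-Muller transform. */
static double gauss(void)
{
    double u1 = (rand() + 1.0) / (RAND_MAX + 2.0);
    double u2 = (rand() + 1.0) / (RAND_MAX + 2.0);
    return sqrt(-2.0 * log(u1)) * cos(TWO_PI * u2);
}

/* Average n quantized readings of a constant signal, with added noise
   of standard deviation sigma (both in units of one converter step). */
static double averaged_reading(double signal, double sigma, int n)
{
    double sum = 0.0;
    int i;
    for (i = 0; i < n; i++)
        sum += floor(signal + sigma * gauss() + 0.5);  /* quantize to one step */
    return sum / n;
}

int main(void)
{
    const double signal = 10.3;   /* true value, in converter steps */
    srand(42);
    printf("no noise      : %.3f\n", averaged_reading(signal, 0.0, 10000));
    printf("1 step of rms : %.3f\n", averaged_reading(signal, 1.0, 10000));
    return 0;
}

With no noise the average is stuck at 10.000; with about one step of rms noise it comes back to roughly 10.3, i.e. a small fraction of a step.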
——————————-
The climate/weather system has an actual signal variation which is many times the size of the smallest step, which is 1 degree F or 1 degree C. So if the rounding is random (which I think it is) and if we average enough readings (which happens over an entire year), then the effective resolution is much smaller than 1 degree. And that is just at one station. Combining multiple stations further enhances the potential resolution.
I guess this is an awfully long post to just say “Yep, I agree with you and have done analogous things in the real world”.
Lucia,
The IPCC are way ahead of you, read up on “Understanding and attributing climate change”.
Thanks,
I’ve seen the argument made about +/- 1 degree and thought it was nonsense, but wasn’t sure why. Looked it up and found some classroom science examples that, as pointed out, spelled out using the standard deviation to determine the precision with which to express the answer.
One quibble with braddles’ baseball analogy—at bats and hits are exact integers. I think you can calculate out a batting average to as many significant figures as you like. Any probability arguments you want to make would, of course, be based on the number (sample size) of at bats and hits.
Come to think of it, I seem to remember a batting championship that was awarded based on the fourth decimal place.
“The IPCC are way ahead of you”
Of course they are, bugs. You and Big Al and the IPCC all are superior to dumb hick Andrew_KY. I’m just a guy who likes baseball. Forgive me for forgetting my place. 😉
Andrew
I’m a guy with math skills challenged by long division, so please be gentle if I have totally misunderstood the issue, or really am as ignorant about this as I’m afraid I might be, but if I add 1.5 ten times I get 15. If I add 1.51 ten times I get 15.1, but if I round up 1.51 to 2 and add that 10 times I get 20. That seems a significant difference. Thanks for the explanations I hope to get here. KR
One thing that hasn’t been mentioned is that some of the historical records are time of day measurements, not min/max. The TOD measurements have to be extrapolated to min/max to be included.
charlie–
This is important. More tomorrow. 🙂
bugs (Comment#27997)
Why do you organize that to make it appear you are quoting me? Andrew_KY wrote that. That said… huh?
JohnM
Yes. It seems to spring from some partial understanding of rounding I learned in 4th grade, but later learned when and why it doesn’t apply to averages by… oh… 11th grade. The rounding doesn’t matter in the context of climate change. Other things might, but the rounding doesn’t.
I’m surprised nobody has mentioned this. In the USA the thermometers are in Fahrenheit, not Celsius. Metric system tried and failed and all that.
The observer does the rounding from nearest 0.1F, records it in the B91 form, mails it to NCDC where it is transcribed, checked for errors. Then published as preliminary data.
End users like GISS/CRU are THEN converting to Celsius. So we have two possible errors, rounding plus conversion.
Uhg.
You can see paper B91 forms here:
http://www7.ncdc.noaa.gov/IPS/coop/coop.html
And the transcriptions as “preliminary”
http://cdo.ncdc.noaa.gov/dlyp/DLYP
Final single station data costs bucks, Government’s gotta eat ya know.
Getting a higher resolution by oversampling depends on your measuring the same thing.
That is, if you measure the local temp 5 times as quickly as you can using the same procedure, your average will probably give you a higher precision.
Measuring different things, that is, the same measurement on different days in different seasons mathematically gives you a higher precision but tells you little about the energy in the system, which is what the temp is a proxy for.
For instance a wet day may have as much, or more, energy in the local environment as a dry warmer day!! Yet, we average these things and pretend the result is helpful and that the increase in precision means something in relation to climate.
Parochial school snob. 🙂
Baseball averages I don’t think are a valid analogy because you’re talking about percentages of hits vs at bats, not, for example, the trend in the distance the ball flies when hit over a season.
You need to state the full conditions. As Charlie says, there are conditions where it works but there are some where it doesn’t. Consider 5 measurements of .1 followed by 5 of .49. Rounding gives you 0 in all cases. If the second set of 5 was .51 you get 5 zeros followed by 5 ones. I understand those are not the conditions that apply here, but that’s the point. Are the correct conditions there? The model data may or may not be a valid test. There is field data available with higher resolution from the CRN sites. Chad might want to verify his calculations against actual data rather than the models. Of course we trust the model output implicitly….
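To make that concrete, here’s a minimal sketch using the same numbers:

#include <stdio.h>
#include <math.h>

/* Average ten values after rounding each one to the nearest integer. */
static double rounded_mean(const double *x, int n)
{
    double sum = 0.0;
    int i;
    for (i = 0; i < n; i++)
        sum += floor(x[i] + 0.5);
    return sum / n;
}

static double true_mean(const double *x, int n)
{
    double sum = 0.0;
    int i;
    for (i = 0; i < n; i++)
        sum += x[i];
    return sum / n;
}

int main(void)
{
    /* five readings of 0.1 followed by five of 0.49: every one rounds to 0 */
    double a[10] = {0.1, 0.1, 0.1, 0.1, 0.1, 0.49, 0.49, 0.49, 0.49, 0.49};
    /* same, except the second five are 0.51: those all round to 1          */
    double b[10] = {0.1, 0.1, 0.1, 0.1, 0.1, 0.51, 0.51, 0.51, 0.51, 0.51};

    printf("a: true mean = %.3f, rounded mean = %.3f\n", true_mean(a, 10), rounded_mean(a, 10));
    printf("b: true mean = %.3f, rounded mean = %.3f\n", true_mean(b, 10), rounded_mean(b, 10));
    return 0;
}

The first set averages 0.295 but the rounded values average 0.0; nudge the second five readings from .49 to .51 and the rounded average jumps to 0.5. That is the pathological case where the values all sit near one rounding threshold.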
Charlie:
I have tried to follow the critiques of how we measure temperature at Pielke Sr’s blog. In addition to the work on land use factors, there is an interesting argument that we need measures that better reflect the global heat balance.
For example a one degree rise in the northern latitudes requires less heat energy than a one degree rise over the same area in the tropics because of the smaller base. See Pielke on this.
Maybe ocean temps or some weighting by latitude would be better? I dunno. Probably better than increased reliance on correction calculations on data from fewer and fewer stations.
JohnM
Parochial? The only parish school I attended was in 1st grade for 1 semester. That was when we moved to Buffalo from El Salvador. After that, I went to public school. Then Woodlands Academy of the Sacred Heart, which is roman catholic but not parochial. (Very few RC high schools are parochial.)
“bugs (Comment#27997)
Why do you organize that to make it appear you are quoting me? Andrew_KY wrote that. That said… huh? ”
Cut/Paste error.
Bugs.
.
“The IPCC are way ahead of you, read up on “Understanding and attributing climate change”.”
.
And this is where they mislead. Remember, they also state quite clearly that the models are SCENARIOS. If the insolation, cloud cover, air pressure, wind patterns… match the model, THEN and ONLY THEN, will the model results apply to the physical world. Since many of these model parameters start diverging from actual measurements as soon as the model starts, what does that tell you??
.
Sorry Bugs, the IPCC is full of themselves!!
Lucia, I have often found stat wisdom in your discussions but I thought the issue was one of false resolution, not false precision. Definitions: Resolution = the smallest change or increment in the measured quantity that the measuring instrument can detect with certainty; Precision = the expected scatter or spread of a group of repeated measurements of a fixed quantity (Ref http://www.ncbi.nlm.nih.gov/pubmed/10511912). So since the measurements are in x.y deg C, then the best the average can be is in x.y deg C and the best the anomaly can be in is x.y deg C. So isn’t going to x.yz an unjustified higher resolution?
BTW the baseball analogy doesn’t hold because the baseball at bat data is discrete not continuous and so resolution doesn’t apply (each measurement has perfect resolution) but temp measurements are continuous (the better the instrument, the better the resolution) so the resolution of the readings does matter. Also I would have thought that some of the temp measurements (especially pre 1900) might only have a resolution of 1/2 deg C, so when using them for a trend calc the final resolution should only be in 1/2 deg C.
“And this is where they mislead. Remember, they also state quite clearly that the models are SCENARIOS. If the insolation, cloud cover, air pressure, wind patterns… match the model, THEN and ONLY THEN, will the model results apply to the physical world. Since many of these model parameters start diverging from actual measurements as soon as the model starts, what does that tell you??”
The models to the current date are valid tools for analysis of attribution to date. You have also not read the chapter. “Fingerprints” of climate also support AGW, the stratosphere cooling not warming is one. If the warming was due to increased solar activity, the stratosphere would have been warming as well.
If I have 10 thermometers in a line, all reading 10C, to one digit.
Over time, at one end, one starts reading 9.
Then the next one starts reading 9.
More time, the one after that one is reading 9.
Is it a good reason to conclude that there is a cooling trend? Can we give this trend a rate less than one but greater than zero? Can we tell this trend is coming from a certain direction even?
The traditional method of following error propagation in calculations in this situation is to say (x ± 0.5) + (y ± 0.5) = (x + y) ± 1; the mean is this value divided by 2, so it’s (x + y)/2 ± 0.5. So the error in the mean is the same as in the original observation. Of course, a statistician will disagree, saying that the error (say, 99% confidence interval) decreases as 1/sqrt(n) so keeps getting smaller with more measurements.
In my view both are right. The key difference is that the traditional method is estimating a 100% confidence interval rather than a 99% CI. But why do we think 99% is OK, and 99.9% better? Surely because they are close to 100%. But if we have a 100% CI shouldn’t we use it? (Of course, for most theoretical distributions the 100% CI is infinite, but not here.)
I vote for the 100% CI. If we could get a consensus on this we could wipe out global warming.
bugs (Comment#28017)
December 14th, 2009 at 8:37 pm
So why has the stratosphere stopped cooling?
http://www.acd.ucar.edu/Research/Highlight/stratosphere.shtml
Lucia, (Tmax+Tmin)/2 can be a better estimate of the central tendency of a skewed data distribution (central tendency is what humans would consider “typical”). I think that is why it gets used in meteorology. It doesn’t always work that way… e.g., consider a Rayleigh distribution for temperature (very artificial I admit).
Obviously for a Rayleigh distribution, (Tmax+Tmin)/2 -> infinity as the observation time T -> infinity, since Tmin -> 0 but Tmax -> infinity.
Another choice that sometimes gets used is the median (though this has the undesirable property that the error of the median is equal to the error of an individual measurement; unlike the mean, your precision remains fixed as T -> infinity).
For the case of a Rayleigh distribution, the mean is sigma * Sqrt[Pi/2] ~= 1.25 sigma, and the median is sigma * Sqrt[2*Log[2]] ~= 1.17 sigma, and the mode = sigma. If what you want is the central tendency, median is (always?) a better metric than mean.
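A quick Monte Carlo sanity check on those numbers (sigma = 1; this is just an illustrative sketch):

#include <stdio.h>
#include <stdlib.h>
#include <math.h>

#define N 200000

static double x[N];

static int cmp_double(const void *a, const void *b)
{
    double d = *(const double *)a - *(const double *)b;
    return (d > 0) - (d < 0);
}

/* Monte Carlo check of the Rayleigh mean and median quoted above (sigma = 1). */
int main(void)
{
    double sum = 0.0, pi = acos(-1.0);
    int i;

    srand(7);
    for (i = 0; i < N; i++) {
        double u = (rand() + 1.0) / (RAND_MAX + 2.0);
        x[i] = sqrt(-2.0 * log(u));     /* Rayleigh(sigma = 1) sample */
        sum += x[i];
    }
    qsort(x, N, sizeof x[0], cmp_double);

    printf("sample mean   = %.3f   sqrt(pi/2)   = %.3f\n", sum / N, sqrt(pi / 2.0));
    printf("sample median = %.3f   sqrt(2 ln 2) = %.3f\n", x[N / 2], sqrt(2.0 * log(2.0)));
    return 0;
}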
For physical quantities related to e.g. comparisons to models, I’d still say the appropriate metric is the arithmetic mean, rather than what is “typical.” Relative to that, (Tmax+Tmin)/2 is going to be biased, unless the distribution is strictly unskewed, and for some distributions can be extremely poorly behaved.
If we had to report to the nearest 1 degree, the warming would be very dramatic indeed. On GISS data, the world would have warmed by a full degree in the last couple of decades …
In fact, since 1980, the slope would be warming of .03 degrees per year. Since 2000, it would be .06 per year (since 2001, it would be back to .03). 😉
Australian weather stations report temperature to .1 of a degree. I am not sure how they flow into global data, though.
I realize the purpose of the present point was 1 vs 0.1 degree from the trend of the results using just 1 degree readings, and Lucia is reasonable in the comments. However, the real problems are not 1 vs 0.1 degrees; they are the effects of invalid site location, change over time in site surroundings, shade variation, pre-corrections of data with the basis for correction lost, inadequate site numbers and distribution, etc. Also deliberate bias adjustment and cherry-picking to emphasize a point. Those make the shown accuracy of the trend moot.
carrick
I agree. The main reason being that’s what’s computed from the models. So, it’s best to compare an arithmetic mean to an arithmetic mean. But then the question is: If we compare a “(Tmax+Tmin)/2” type average to an arithmetic mean, “does that matter”? In principle, it’s suboptimal. But in practice, it depends on what you are trying to discover and also what the shape of the diurnal variation looks like and whether or not that’s changed.
I wouldn’t use climate model data to test statistics. It should be based upon the expected diurnal pattern, so you’re just testing that the result is the same as the input (both input temperatures and any diurnal code); of course weather models are more likely to have diurnal code than climate models. The input temperatures to climate models have odd adjustments, and they’re probably not daily numbers anyway.
For testing/demonstration I’d use min/max of two linear trends, two sine waves, addition of several sine waves, and random values, concluding with some raw temp data. Of course it would be interesting if the raw data behaves differently than the others.
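Along those lines, here’s a minimal sketch with two toy diurnal shapes of my own (not the full battery suggested above), comparing the true 24-hour mean with (Tmin + Tmax)/2:

#include <stdio.h>
#include <math.h>

#define PI 3.14159265358979323846

/* Compare the 24-hour mean with (Tmin + Tmax)/2 for a given diurnal shape. */
static void compare(const char *label, double (*shape)(double))
{
    const int steps = 1440;                 /* one sample per minute */
    double sum = 0.0, tmin = 1e30, tmax = -1e30;
    int i;

    for (i = 0; i < steps; i++) {
        double t = shape(24.0 * i / steps);
        sum += t;
        if (t < tmin) tmin = t;
        if (t > tmax) tmax = t;
    }
    printf("%-10s true mean = %6.3f   (Tmin+Tmax)/2 = %6.3f\n",
           label, sum / steps, 0.5 * (tmin + tmax));
}

static double pure_sine(double hour)   /* symmetric diurnal cycle */
{
    return 15.0 + 5.0 * sin(2.0 * PI * hour / 24.0);
}

static double skewed(double hour)      /* add an asymmetric second harmonic */
{
    double w = 2.0 * PI / 24.0;
    return 15.0 + 5.0 * sin(w * hour) + 2.0 * cos(2.0 * w * hour);
}

int main(void)
{
    compare("pure sine", pure_sine);
    compare("skewed", skewed);
    return 0;
}

For the pure sine the two agree exactly; with the asymmetric harmonic the midrange comes out biased low by roughly 1.7 degrees. A trend would only be affected if that bias changed over time, which is the question the raw-data comparison would address.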
I thought we couldn’t really know global average temperature at all since it is a mathematical construct. The rounding or min/max is just part of how it is built.
I don’t have a problem with the rounding so much as I have a problem with how people seem to think about the average. Say the global average temperature went up a degree over some period of time. Does that matter where you are? Maybe not if most of that warming took place somewhere else.
Imagine I have a room full of 100 people whose height ranges from 5.5 feet to 6.5 feet and the average is 6.0 feet. Now imagine I remove 10 of these people and replace them with 2 people 4 feet tall and 8 people 7 feet tall.
Now the “average height” went up but none of the original people got any taller. In fact, there are two seats in the room that are filled with shorter people than before.
Just because the “global average” went up it doesn’t mean that it went up where an individual lives. I am not sure if the global average value really matters for anything. Temperature stations are only on land. Weather patterns could warm places where there are recording stations and cool places where there are no stations (such as open ocean areas) resulting in a rise in “global average” when there really wasn’t one.
Or a weather pattern could change in some way that causes some area to warm more than another area cools resulting in a warmer global temperature when maybe more land area actually cooled than warmed.
In other words, I find the “global average” to be for amusement purposes only.
Lucia:
If you change the average cloud cover, for example, that would affect Tmin more than Tmax. So in principle yes, not only is there a bias, but it changes.
I guess the next thing to look at would be Klotzbach et al., which claims to have modeled this.
At Climate Science – Roger Pielke Sr.
David R. Easterling, Briony Horton, Philip D. Jones, Thomas C. Peterson, Thomas R. Karl, David E. Parker, M. James Salinger, Vyacheslav Razuvayev, Neil Plummer, Paul Jamason, Christopher K. Folland, 1997: Maximum and Minimum Temperature Trends for the Globe. Science, Volume 277, 18 July 1997.
An interesting implication of the differences in min/max vs average trends by Klotzbach et al. 2009.
Klotzbach, P.J., R.A. Pielke Sr., R.A. Pielke Jr., J.R. Christy, and R.T. McNider, 2009: An alternative explanation for differential temperature trends at the surface and in the lower troposphere. J. Geophys. Res., 114, D21102, doi:10.1029/2009JD011841.
See update at: December 3, 2009…7:00 am
Corrigendum To Klotzbach Et Al 2009 – Our Conclusions Are Unchanged
I have seen further papers by Pielke et al on differing trends in min vs max.
Bugs asked: “If I have 10 thermometers in a line, all reading 10C, to one digit.
Over time, at one end, one starts reading 9.
Then the next one starts reading 9.
More time, the one after that one is reading 9.”
Given that resolution I’d say your ave starts at 10 and only drops to 9 when 6 of the thermos read 9 — then you have 2 points and we all know how to draw a straight line and get a trend 🙂
Of course you really have a time series more like 10, 10, 10, 10, 10, 10, 9, 9, 9, 9, 9 and your trend will be something like -1 deg/time period, but it won’t be +0.74 (Pachauri on the radio this morning, Aust time) when the resolution at best only allows +0.7.
And no I’m not saying that there is any practical diff between 0.74 and 0.7 especially given the error bar but rather that it is sloppy IPCC science to give results with unjustified resolution.
In my personal physicist opinion, arguing about the errors of a global temperature anomaly trend has almost as much meaning as measuring the number of angels on a pin head.
My basic objection comes from the knowledge that it is not temperature but radiated energy that contributes to changes of the climate. Temperature is only a proxy for that.
Temperature is a good proxy for radiated energy on a local basis: the F=sigma*T^4 formula that gives basis for this proxy for radiated energy is good for a black body problem. Each location of the earth on which temperature can be measured does not have the sigma of a black body, but the sigma of a gray body, variable in the value it gives and the spectrum which it gives rise to, coming from the matter which is doing the radiation: water, air, all the variations of ground, snow, forests, deserts, etc. etc. Add to this the dimensions which are fractal, not just three, (think forests, think craggy mountains) and if anybody has made a model of this to the spacial accuracy necessary to be able to integrate the difference over the globe I would be interested in a link.
In other words, the average radiated energy over the globe has an uncomputable, in my opinion, distant relationship to the average temperature, which is the crux of the problem and will really tell us if the measurement errors are important or not. (and I have not included convection, sublimation, biota, etc)
I think it is an impossible problem, a problem that fits the tools of chaos and complexity and that is the way to go if one is serious in computing climate values.
I will give you a gedanken experiment: Let the temperature anomaly at the poles be 40C, let the temperature anomaly in the tropics be 2C, and let the globe be equally divided into pole and tropics. Would the radiative balance be dominated by the huge pole anomaly or by the small tropics one? We know that it is the tropics that radiates (gray body sigmas) . Why would we average linearly the huge pole anomaly with the small tropics? Is there a meaning in the radiation budget derived this way?
You are right AnnaV, but yours is another topic – the one of whether spatially averaged temperatures (arithmetical average) are relevant and/or interesting for Earth’s energy variations. They are not.
.
To the theme Tmax , Tmin there is a simple model used in some energy calculations .
This model uses a 2 valued function which takes the value Tmax during the day and Tmin during the night .
One writes then that Tmax = Tmin + DTR (Diurnal temperature range) .
Then the difference between the true temperature average and (Tmax + Tmin)/2 is (12 – h)·DTR/24, where h is the duration of the night.
Clearly it can be seen that this difference is low in the tropics because DTR is low and h is near 12 .
In higher latitudes it is large because both h varies and DTR is large . It is also there that the profiles can be very far from the simple step function .
For instance in December in Paris, which can be approximately described by this model, the difference between the true average and (Tmax + Tmin)/2 is around -1°C to -2°C (negative meaning that the true average is 1 to 2°C lower than (Tmax + Tmin)/2).
These differences indeed “matter” if what one is interested in is a true temperature average .
Such kind of models is a gold mine for “adjustments” that the weather station people sofar (thanks God !) refrained from .
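As a small numerical sketch of that two-valued model (the winter-Paris-like numbers below are only illustrative guesses):

#include <stdio.h>

/* Two-valued diurnal model: the temperature is Tmax for the (24 - h)
   daylight hours and Tmin for the h night hours.  Compare the true
   daily mean with the midrange (Tmax + Tmin)/2. */
int main(void)
{
    /* rough winter-Paris-like numbers, chosen only for illustration */
    double tmin = 2.0, tmax = 7.0;          /* so DTR = 5 C     */
    double h = 16.0;                        /* hours of night   */

    double dtr       = tmax - tmin;
    double true_mean = (tmax * (24.0 - h) + tmin * h) / 24.0;
    double midrange  = 0.5 * (tmax + tmin);

    printf("true mean        = %.2f C\n", true_mean);
    printf("(Tmax + Tmin)/2  = %.2f C\n", midrange);
    printf("difference       = %.2f C\n", true_mean - midrange);
    printf("(12 - h)*DTR/24  = %.2f C\n", (12.0 - h) * dtr / 24.0);
    return 0;
}

With these numbers the gap is about -0.8 C; a larger DTR or a profile further from the step function pushes it toward the -1 to -2 C range mentioned above.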
Anna V
As a physicist myself (never used the qualification in anger) I applaud you.
Thank you for explaining it simply.
Now if we could only get others to understand.
TomVonk – AnnaV is on the money in my opinion
Coincidentally I was having this same discussion with someone in the pub last night. He argued that rounding to the nearest degree meant that the annual average temperature could only be known to the nearest degree. Wrong! My take was rounding to the nearest degree for Tmax and Tmin gave an effective measurement precision of +/-0.5 degrees C. Simply propagating this over the year results in an annual average temperature +/-0.02 degrees C. There’s no mystery, no magic, just simple propagation of errors.
But this precision applies for the annual average at the point where the thermometer sits. The bigger question for me is how you estimate the average of a grid area given that the length scale of temperature variations varies over a large range of distances in nature (100’s metres to 100’s km).
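For what it’s worth, here is the back-of-the-envelope propagation behind the ±0.02 figure, treating the ±0.5 degree rounding error as an independent, roughly standard-deviation-sized scatter on each of the year’s Tmin and Tmax readings (a rough model, not a rigorous one):

#include <stdio.h>
#include <math.h>

/* Propagate an independent +/-0.5 C per-reading error over a year of
   daily Tmin and Tmax readings (2 * 365 = 730 values). */
int main(void)
{
    const int    n           = 2 * 365;
    const double per_reading = 0.5;      /* half of the 1 C rounding step */

    printf("error of the annual mean ~ +/- %.3f C\n",
           per_reading / sqrt((double)n));
    return 0;
}

That works out to roughly ±0.019 C, i.e. the ±0.02 quoted above.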
Anna V
“My basic objection comes from the knowledge that it is not temperature but radiated energy that contributes to changes of the climate. Temperature is only a proxy for that.”
Because temperature is all we have for a history. That’s not by choice, just the situation we find ourselves in. Research is being done into better ways of representing the internal energy flows, but as Trenberth was misrepresented saying, we can’t explain it all yet.
However, here is the radiation balance for the globe (no models, to keep kuhnkat happy).
http://www.agu.org/pubs/crossref/2009/2009JD012105.shtml
http://scienceblogs.com/stoat/2009/11/floods_not_linked_to_climate_c.php
Look for the graph of the earth’s total heat content anomaly.
Lucia,
My understanding was that you cannot create precision greater than the precision of the instrument used to make the initial measurement. It was always a discussion of significant digits, and the only case where this didn’t apply was the already discussed oversampling. However, I understood that such oversampling was a very strict case, i.e. one could use statistics to determine how many iterations were required, etc., but that the most fundamental requirement was that the same thing had to be measured a certain number of times. How does one temperature reading, in different places with different instruments at different times, meet this criterion?
Does this not come down to an assumption that errors will cancel rather than add?
John M (Comment#28026) December 14th, 2009 at 9:43 pm
IANACS, but as I understand it, the cooling is a result of the retention of heat in the troposphere. This is all to do with how the incoming energy is absorbed. If the earth’s troposphere has reached a balance for a period of time, (as it has for a few years), then the cooling will stop. Note that it hasn’t warmed, either. But if you want something better than a layman’s guess, you will have to ask someone else. Either way, the fact that it cooled, while the troposphere warmed, is significant.
I think this observation is already a well known statistical phenomenon, although the name escapes me now. It applies to more than just this in the matter of temperatures, for example, the assumption that all errors in readings must be positive, which Watts seems to make.
The whole point by Briggs (he is doing a very good homogenisation of temperature series, fresh off the print) is that temperature series should necessarily carry uncertainty bounds. You are right when you say that an average can have a value more precise than 1°C, but it is wrong to assume the rounding is not important, since it would affect the uncertainty bounds (that should be attached, but are not).
Lucia,
What about precision issues, as contrasted with rounding issues?
Coincidentally I was having this same discussion with someone in the pub last night. He argued that rounding to the nearest degree meant that the annual average temperature could only be known to the nearest degree.
.
Interesting . Actually the biggest problem is with the word known .
All those errors and paradoxes arise because people don’t ask the simple question what is there to “know” .
Actually an average temperature can never be “known” in the sense “measured” .
An average temperature is a mathematical construct , you divide a number by another number and there is no limit on the precision of a division .
So if by “known” is meant “calculated” then an average is known with an infinite precision .
But even when one does what W.Briggs is saying – to attach uncertainty which is due to the uncertainty of its components around an average , it has still nothing to do with measuring an average .
It has just to do with adding and dividing numbers where each operation can be done with infinite precision and if the said numbers are within an interval then one just adds and divides a set of numbers instead of a single one .
Temporal averages with very few exceptions have no physical meaning and can’t be verified experimentally .
bugs (Comment#28054) December 15th, 2009 at 5:48 am
Interesting, but I just saw in the abstract of your link:
“Important terms that can be constrained using only measurements and radiative transfer models are ocean heat content, radiative forcing by long-lived trace gases, and radiative forcing from volcanic eruptions. We explicitly consider the emission of energy by a warming Earth by using correlations between surface temperature and satellite radiant flux data and show that this term is already quite significant.”
That temperature word there, and no mention of gray body sigmas all nicely mixed once more?
True, temperature is what we have as a historical record. BUT it is only a proxy for radiation, and radiation is what is important. So even if we had perfect temperature measurements and correctly integrated over the diurnal and season and geographic spectra etc. etc. still the error on the quantity would be irrelevant because we do not know really the gray body constants and how they should have been applied to that perfect measurement to turn it into a radiation measure.
David L. Hagen:
One wonders about Tmin vs Tmax trends in summer vs winter. Thinking US here. If summer waste heat is generated largely from air conditioning then the summer UHI trend should be higher for daytime Tmax. Similarly, if winter waste heat is generated primarily from night-time home heating then the winter UHI trend should be higher at nighttime Tmin. Autumn and spring should be more neutral re: both. I wonder if this has been examined.
Haven’t visited for a while. Are the data all still ‘falsifying’ the IPCC projections? Can’t find any recent posts on this – and it used to be a monthly event to look forward to – or has this language changed for some reason?
slightly off topic, but ……
George Tobin (Comment#28007) December 14th, 2009 at 6:44 pm
says: “Maybe ocean temps or some weighting by latitude would be better? I dunno.”
Google on “Ocean Heat Content” and also look at http://www.argos-system.org/ , which has information on the Argos Buoy system.
The heat capacity of the oceans is far greater than the atmosphere. As heat is gained by the earth due to a radiative imbalance, many more joules of energy will be stored in the oceans for a given amount of temperature rise.
The models and calculations of greenhouse effect are often given in equivalent radiative forcings of watts/meter-squared. These can then be turned into the expected rise in heat content.
The rise in ocean heat content has been less than expected by most (all?) models. See Lucia’s post regarding the Copenhagen Synthesis report for a good graph of the three main reconstructions of Ocean Heat Content history.
Lucia,
Would you address the point about significant digits?
There is heat transfer from low to high latitudes so radiant emission at the TOA at high latitudes is higher than the absorbed solar radiation for that area and the tropics radiate less than the absorbed solar radiation, further complicating the relationship between surface temperature and TOA thermal IR emission.
OT:
The ENSEMBLES final report has come out: http://www.ensembles-eu.org/
Lucia (#27971) mentioned situations where the difference can be measured more precisely than the absolutes. An example of such a situation is with gravimeters http://en.wikipedia.org/wiki/Gravimeter
which can now measure local gravity differences to 1 part in a trillion. Such instruments are generally designed for differential measurements and can assume that the base of the measurements does not vary throughout the measurement period. Are there differential thermometers in use today that can be sure of a constant baseline?
On another matter, there seems sometimes a sense in the discussion that the standard deviation tells us something about the mean, whereas my understanding is that they are coequal in determining the statistical pattern of the measurement set. If we are to examine questions such as mean differences then we use a statistic of the mean itself, such as the standard error of the mean.
As Lucia noted, if the diurnal waveform distorts significantly over the period one tries to track, (max + min)/2 can develop a serious bias.
Clouds produce only greenhouse warming at night, but can produce both albedo-cooling and/or greenhouse warming during the day.
So since CO2 increase can increase water vapor and cloudiness, the rise in the minimum temperature could be considerably greater than the rise in the maximum. Of course, it can be much more complicated, e.g., read the Klotzbach/Pielke/Pielke type arguments.
Development of such asymmetry could also bias ‘true’ average daily temperatures.
Gus–
Yes. When compared beginning in Jan 2001– the first year after the SRES were published– the multi-model mean of the IPCC projection remains inconsistent with the earth’s trend at 95% confidence.
I changed language because people dispute what it means to “falsify”. But the two remain inconsistent.
Len–
I’m going to discuss the tmin/tmax issue later. I decided to split the two because there are sufficient numbers of people bothered by the rounding issue.
To some extent, the difficulty is… uhmm… semantic. So, I included a discussion from NIST. It engages the semantic issue by insisting people define what they mean by precision when writing documents and standards because more than one definition exists.
Lucia, regarding truncating results to 1°, there is a well known result that if the self-noise associated with the measurements (stations) is larger than the value you truncate to, in the large N limit you will recover (in a statistical sense) the original mean of the distribution.
I am pretty certain that temperature data satisfy this condition.
On the other hand, if the truncation error is large compared to self-noise, the truncation errors matters, and you won’t recover the mean of the untruncated series in the large N limit.
When they digitize analog signals they add a couple of bits of worth of noise to the signal before digitizing it, for just this reason. (E.g., if you have a 16-bit system, the most resolution you’d ever want is 14 bits.)
“Andrew Kennett (Comment#28044)
..Given that resolution I’d say your ave starts at 10 and only drops to 9 when 6 of the thermos read 9 — then you have 2 points and we all know how to draw a straight line and get a trend .
Of course you really have a time series more like 10,10, 10, 10, 10, 10, 9, 9, 9, 9, 9 and your trend will be something like -1 deg/time period”
Just how would you draw your line? With readings of 10 and 9 you have a resolution of .5 only. This means that your readings could have been:
10.4, 10.2, 10.0, 9.8, 9.6, 9.4, 9.2, 9.0, 8.8, 8.6 trend: -0.2/t(ime per.)
or:
9.9, 9.8, 9.7, 9.6, 9.5, 9.4, 9.3, 9.2, 9.1, 9.0 trend: -0.1/t
or:
9.5, 9.5, 9.5, 9.5, 9.5, 9.4, 9.4, 9.4, 9.4, 9.4 trend: =< -0.01/t
So, precision DOES matter, even in calculating trends, as you get a difference of at least 20 times the magnitude in trend.
People confuse precision with accuracy. You can improve the precision with oversampling; you can’t improve the accuracy. I think this is the source of the confusion.
Joel, it depends on the internal noise in the original measurements. In practice, you’d never see a series like ” 10,10, 10, 10, 10, 10, 9, 9, 9, 9, 9″ unless the self-noise is small compared to the truncation error.
If that is true, then it is indeed a mistake to truncate the series to 1°
Here’s an illustration of the effect of increasing the number of points.
I’m assuming a 1° truncation error, a measurement uncertainty of sigma = 2, and a series that depends on the length n as:
y(n) = a0 + a1 * n
where I take a1 = -0.2 for illustration (a0 doesn’t matter, you can set it to 0). Using C notation, we can generate the underlying series as follows:
for (k = 0; k < Ny; k++) y0[k] = a0 + a1 * k;
To generate a particular instance of the measured, truncated data, we could write:
for (k = 0; k < Ny; k++) y[k] = floor(y0[k] + sigma * gauss() + 0.5);
where gauss() is a Gaussian random number generator with standard deviation of 1 and mean of 0.
Fit these to a straight line for a number of trials (I chose 1000000), then histogram them for different choices of Ny. This is what I get for Ny = 5, 10, and 20:
Figure here.
Even for the truncation error being 1/2 the measurement uncertainty, by the time you get to 20 points you’ve nearly recovered the distribution you would have gotten had you not truncated the data. There are still differences that are kind of interesting:
See this.
But the point is by the time you have Ny as large as the GHCN data set is, the truncation error is irrelevant.
For comparison, when you make the truncation error larger than the measurement error (e.g., take sigma=0.1), this is no longer true:
Even for Ny=20, the original series is not recovered.
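For readers who want to reproduce something like this end to end, here’s a self-contained version along the lines of the snippets above; the gauss() implementation, the least-squares fit, and the trial count are just one way to fill in the details, not necessarily identical to what produced the figures.

#include <stdio.h>
#include <stdlib.h>
#include <math.h>

#define NY      20        /* series length            */
#define TRIALS  100000    /* number of simulated fits */

/* Gaussian random numbers via Box-Muller (a stand-in for gauss() above). */
static double gauss(void)
{
    double u1 = (rand() + 1.0) / (RAND_MAX + 2.0);
    double u2 = (rand() + 1.0) / (RAND_MAX + 2.0);
    return sqrt(-2.0 * log(u1)) * cos(2.0 * 3.14159265358979 * u2);
}

/* Ordinary least-squares slope of y[k] against k = 0 .. n-1. */
static double ols_slope(const double *y, int n)
{
    double sx = 0, sy = 0, sxx = 0, sxy = 0;
    int k;
    for (k = 0; k < n; k++) {
        sx += k; sy += y[k]; sxx += (double)k * k; sxy += k * y[k];
    }
    return (n * sxy - sx * sy) / (n * sxx - sx * sx);
}

int main(void)
{
    const double a0 = 0.0, a1 = -0.2, sigma = 2.0;
    double y[NY], sum = 0.0, sumsq = 0.0;
    int t, k;

    srand(1);
    for (t = 0; t < TRIALS; t++) {
        for (k = 0; k < NY; k++)      /* noisy data, truncated to 1 degree */
            y[k] = floor(a0 + a1 * k + sigma * gauss() + 0.5);
        double b = ols_slope(y, NY);
        sum += b;  sumsq += b * b;
    }
    double mean = sum / TRIALS;
    double sd   = sqrt(sumsq / TRIALS - mean * mean);
    printf("fitted trend: mean = %.3f, sd = %.3f  (true a1 = %.1f)\n", mean, sd, a1);
    return 0;
}

Re-running with sigma = 0.1 instead shows the fitted trends no longer recovering the untruncated case, as in the comparison above.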
I am not a statistician but have been reading chiefio on this subject and his view seems to be different. Is he wrong or am I missing something obvious?
Now what I learned in school (“Never let your precision exceed your accuracy!” – Mr. McGuire) was that any time you did a calculation, the result of that calculation could only have the accuracy of (and thus ought to only be presented with the precision of) your least accurate term. Average 10, 12.111111111, and 8.0004 and you can only say “10”, not 10.0 and certainly not 10.1111 or 10.04 as that is false precision.
(In fact, it’s slightly worse than this, due to accumulation of errors in long strings of calculations and the repeated conversion that GISS does from decimal in intermediate files to binary at execution and back to decimal in the next file… but that’s a rather abstruse topic and most people glaze so I’ll skip it here. But just keep in mind that repeated type changing corrupts the purity of the low order bits.)
So what gets trumpeted and ballyhooed?
Things (Not temperatures! Calculated anomalies based on averages of interpolations of averages of averages of temperatures – no, that is not an exaggeration! In fact I’m leaving out a few averaging and interpolating and extrapolating steps!) measured as X.yz C! Not only is the “z” a complete fabrication, but any residual value in “y” from the greater precision of F over C in the raw data has long ago been lost in the extended calculations and type conversions.
IF you are lucky the X has some accuracy to it
http://chiefio.wordpress.com/gistemp/
Carrick-
Yes. There is also a well known correction for the inflation of the variance due to binning (which is a form of rounding). I used this in my Ph. D. thesis. 🙂
But for the blog post, the EXCEL demonstrations actually help a lot of readers.
Arthur Dent (Comment#28117)–
I quote Chiefio in my follow on post. My view is different from his.
anna v (Comment#28068) December 15th, 2009 at 8:46 am
IANAP, but from what I have read, grey body does not apply. If Mr Stokes is around, I think he would know a lot more about it than me.
Joel Heinrich (Comment#28090) December 15th, 2009 at 11:22 am
But with temperatures, we are taking repeated readings. Over time, the odds of the last example persisting will be small.
Bender
Good observation. Worth exploring.
Happy hunting.
bugs (Comment#28057)
December 15th, 2009 at 6:27 am
You must have hot-keyed IANACS. Good idea.
If you look at the link I gave with the actual temperature plots, the only time the stratosphere has cooled is after volcanic eruptions. For some reason, it has re-equilibrated to a lower temperature after the eruptions, but in the absence of major volcanic eruptions, it has been rock-steady. This includes the time period from ~1995-2002 when the troposphere warmed significantly.
The only “fingerprint” I see wrt stratospheric temperature trends has ashes on it.
“If you look at the link I gave with the actual temperature plots, the only time the stratosphere has cooled is after volcanic eruptions. ”
No, that is just wishful thinking.
bugs (Comment#28127) December 15th, 2009 at 1:46 pm
One cannot turn temperatures into watts/meter^2 if there is not a formula that takes temperatures into radiation fluxes. That is what the “gray” body constants do, use the black body formula (F=sigma*T^4) with a matter specific sigma, sigma_gray ,in order to get radiation fluxes from a non black body, which is what the earth is. The earth in no way is a black body.
bugs:
That statement seems contrafactual.
Re: Carrick (Comment#28115)
One thing that might be worth noting about the analysis of truncation errors vs. “measurement” error: the “true” temperature series typically contains large variations compared to the truncation error — even if the measurement error is much smaller.
oliver:
Yes that is a good point. Even if one is using a high resolution temperature sensor (e.g., 0.04°C like the stuff I use), natural variability in temperature is going to be generally large compared to that, if the sensors are placed at widely separated locations.
For UAH, I get -0.37 C/decade, which I would describe as “generally in agreement.”
One wonders what planet bugs lives on. Not Earth, that’s for sure. Maybe Klendathu? That name may have been a hint.
Carrick,
.
Your simulations are quite impressive.
.
For the case when sigma=0.1, rounding to the nearest integer (truncation?) basically adds 1 to 5 standard deviations (sigma) to the model or data… a major change this is.
.
The rounding has altered the underlying linear model, and the built-in linear regression algorithm in any statistical software will not be appropriate any more. Well, I believe there are solutions to it though.
.
From a statistical-modeling point of view, the truncation error will depend on the value of y; the measurement error, however, can be independent of y.
Lucia, I like the feature that I can edit my comments after submission.
Bill Pearce (Comment#28084) December 15th, 2009 at 10:54 am
“Lucia (#27971) mentioned situations where the difference can be measured more precisely than the absolutes. An example of such a situation is with gravimeters”
An example more directly related to climate change is the satellites that measure the TSI, the total solar irradiance arriving at the top of earth’s atmosphere. The measurements of each satellite have high resolution, and their internal blackbody calibration sources will do a lot to maintain good long term stability.
But the absolute accuracy is poor. The difference in readings between the satellites is significantly larger than the estimated forcing of all greenhouse gases.
So we must compare satellites and turn the readings into anomalies and/or adjust the readings of one satellite by a few watts/meter-squared to put it on the same baseline as the other satellite.
This is an example of high precision, high resolution, high relative accuracy, but poor absolute accuracy.
cool carrick.