Comment on Pat Frank’s Uncertainties: The algebra

At ‘The Air Vent’, Pat Frank presented results from his recent paper on errors in surface temperature measurements, published in Energy & Environment. Jeff Id introduced this paper with the following invitation:

Pat Frank has a new article recently published in E&E on the statistical uncertainty of surface temperatures. He has requested an open venue for discussion of his work here. This is an opportunity for readers to critically assess the methods and understand whether the argument/conclusion is sound. – Jeff

In the present post I am going to discuss an element of Pat’s analysis with which I disagree, limiting the discussion to the math. This element appeared in paper 1, but the error propagates into his final results in paper 2. My discussion will focus on the first two equations in paper 1 and end there. I think this will be sufficient to explain why I think Pat Frank’s estimate of uncertainty for monthly mean temperature anomalies is inappropriate. That is: He is describing a sort of uncertainty that is irrelevant to CRU’s estimate of the uncertainty in the monthly mean temperatures they report.

This post will be followed by one illustrating my results using Monte Carlo, which will likely clarify matters for readers who prefer not to wade through the math.

For those who have a good grasp of statistics, I will begin with a discussion explaining that the problem is not really in the math. The problem with Pat’s uncertainty intervals has to do with this question:

Precisely what are the uncertainties in the observed mean ‘X’ supposed to communicate?

With regard to statistical concepts, the specific thing, ‘X’, an experimentalist wishes to observe isn’t particularly important; whatever ‘X’ is, one wishes to report uncertainty intervals that are relevant to that specific thing, not something else. It happens that Pat Frank discusses monthly mean temperatures reported by CRU, so in my discussion I will elaborate using monthly mean temperatures as examples.

The short answer to the previous question involves first recognizing that:

  • In an experiment, X is some specified thing which the observer says they are trying to measure. In the case of CRU, when they report the monthly mean temperature anomaly for July 2011, they are trying to report the monthly mean temperature for July 2011, $latex \bar{T}_{July, 2011}$ , where the ‘bar’ denotes a sample average. It may seem silly to state the following, but it is important in the context of this post. The monthly mean temperature for July 2011 is a) a mean over the daily values in July, b) the value that actually occurred during July, not some other month, and c) the value that occurred in the July that happened on earth during 2011, not some other year when ENSO, the PDO, the AMO or any other oscillation might have been in some other phase.
  • The uncertainty in that mean of X expresses the notion that if we had sampled that specific thing (i.e. X) differently, using different equipment, we would get a slightly different answer. The reason for this is that our measurements contain slight errors, and the result of these errors is that our measurement of the thing we intend to observe differs from the “true thing”. So, our estimate of ‘X’ is not precisely equal to the ‘X’ we intended to measure and report. So, for example, when CRU reports the monthly mean of observed values for temperature anomalies in July, the reported uncertainty in the mean is supposed to express the range in which we would expect reported values of observed mean temperatures to fall given the uncertainty in our measurements.

Pat’s paper 1

I’ll now turn to section 2 of Pat’s first paper, in which he discusses several ‘cases’ one might wish to measure and then explains how he would compute and report the error for each case. I will be discussing cases 1 and 2 only.

In case 1, Pat discusses computation of the standard error for “repetitive measurements of a constant temperature” and tells us that, in a random-noise model, the measurement is

(1)$latex \displaystyle t_{i}=\tau_{c}+n_{i} $

where $latex t_{i} $ is the measured temperature, $latex \tau_{c}$ is the constant “true” temperature, and $latex n_{i} $ is the random noise associated with the $latex i^{th} $ measurement.

In the case above,

  1. The thing the experimentalist is trying to measure is $latex \tau_{c}$.
  2. The experimentalists will estimate this by reporting the mean over N samples: $latex \bar{T}_{s} = \sum\limits_{i=1}^N t_{i} /N $
  3. The experimentalist might choose to report the “standard error” of the mean,
    $latex \epsilon = \frac{1}{\sqrt{N}} \sqrt{ \frac{\sum\limits_{i=1}^N [ t_{i} - \bar{T}_{s} ]^2 }{ N-1 } } $.
    In the limit where the number of measurements approaches infinity, this standard error approaches $latex \sigma_{n} /\sqrt{N} $, where $latex \sigma_{n}^2 = \lim\limits_{N \to \infty} \sum\limits_{i=1}^N n_{i}^2 / N $ is the variance (the square of the standard deviation) of the population of all possible ‘noise’ values, $latex n $.

So far, this is in general agreement with Frank’s discussion. That is: For case 1, Frank and I would report the same estimate for the true value and we’d report the same standard error in our estimate.
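To make this concrete, here is a minimal Monte Carlo sketch of Case 1 in Python. The particular numbers (tau_c = 20, sigma_n = 0.2, N = 30) are arbitrary assumptions, and the noise is taken to be Gaussian; the only point is that the scatter of the sample means matches sigma_n/sqrt(N):

```python
import numpy as np

rng = np.random.default_rng(0)

tau_c = 20.0      # constant "true" temperature (arbitrary)
sigma_n = 0.2     # standard deviation of the measurement noise (arbitrary)
N = 30            # number of repeated measurements
trials = 100_000  # number of simulated experiments

# Each row is one experiment: N noisy measurements of the same tau_c.
t = tau_c + rng.normal(0.0, sigma_n, size=(trials, N))

# Spread of the sample means across experiments vs. the theoretical value.
print(f"empirical SE of the mean: {t.mean(axis=1).std():.4f}")
print(f"sigma_n / sqrt(N):        {sigma_n / np.sqrt(N):.4f}")
```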

Case 2

Building on Case 1, Pat moves on to a second case:

Now suppose the conditions of Case 1 are changed so that the N true temperature magnitudes, $latex \tau_{i} $, vary inherently but the noise variance remains stationary and of constant average intensity. Thus, $latex \tau_{1} \neq \ldots \neq \tau_{i} \neq \tau_{j} \neq \ldots \neq \tau_{n}$ while $latex \sigma_{1} =\ldots= \sigma_{i} = \sigma_{j} =\ldots= \sigma_{n} $. Then

(4)$latex \displaystyle t_{i}=\tau_{i}+n_{i} $

Pat’s equation (4) is correct.

However, I am now going to diverge from Pat’s discussion which is ultimately intended to provide an estimate of uncertainty that is relevant to CRU monthly mean temperatures. I am diverging to focus on two things:

  1. Using the analog of “case 2”, what sort of ‘true’ value corresponds to the “monthly mean temperatures” reported by CRU, and
  2. what is the correct uncertainty for a quantity of the sort reported by CRU.

Recall, CRU reports monthly mean temperature anomalies for specific months. Examining Pat’s equation (4), if there are M days in a particular month, the closest approximation to the “true” value they are trying to estimate based on measurements is:

(2)$latex \displaystyle \bar{T}_m = \sum\limits_{i=1}^M \tau_{i} / M $

When estimating this true value (that is, the thing they actually wish to report), CRU would compute the mean over the sample:
(3)$latex \displaystyle \bar{T}_s = \sum\limits_{i=1}^M t_{i} / M $

So, the error in their reported value will be equal to

(4)$latex \displaystyle (\bar{T}_s -\bar{T}_m ) = \sum\limits_{i=1}^M ( t_{i} - \tau_{i}) / M = \sum\limits_{i=1}^M n_{i} / M $

If someone– like CRU– is reporting the mean over “M” days, then the standard error in the monthly mean temperature will be approximately equal to:

(5)$latex \displaystyle \epsilon_{mean} = \sqrt{ \sum\limits_{i=1}^M n_{i}^2 } / M \approx \sigma_{n} /\sqrt{M} $

So, other than my having changed “N” to “M”– which was done to emphasize that “M” will be the number of days in the month as opposed to some arbitrary number of measurements– the uncertainty in case 2 is identical in form to that estimated in case 1. Contrary to Pat Frank’s discussion in section 2, the spread in the temperatures $latex \tau_{i} $ over the “M” days of the month makes absolutely no contribution to the uncertainty in CRU’s ability to estimate the mean value of the M temperatures. Only the errors, $latex n_{i} $, will contribute to the uncertainty in the estimate of the observed value.

In a follow-up post, I’ll demonstrate this using Monte Carlo and provide a script that will permit people to run and re-run cases to see that, if the reported observed mean is an estimate of the values that actually occurred and the ‘noise’ consists of the errors that arise when measuring those values, then the uncertainty in the monthly mean is described by my equation (5) and not by the equation used by Pat Frank.
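As a preview, here is a minimal sketch of that experiment. The daily ‘true’ temperatures tau_i, the noise level, and the Gaussian noise model are all assumptions made purely for illustration; the follow-up post will be more thorough:

```python
import numpy as np

rng = np.random.default_rng(1)

M = 31            # days in the month
sigma_n = 0.2     # measurement-noise standard deviation (arbitrary)
trials = 100_000

# Case 2: the daily "true" temperatures vary widely over the month,
# but they are the SAME target values in every simulated re-measurement.
tau = rng.uniform(10.0, 30.0, size=M)
true_monthly_mean = tau.mean()

# Each trial re-measures the same month with fresh noise.
t = tau + rng.normal(0.0, sigma_n, size=(trials, M))
errors = t.mean(axis=1) - true_monthly_mean

print(f"spread of tau over the month:  {tau.std(ddof=1):.3f}")  # large
print(f"empirical SE of monthly mean:  {errors.std():.4f}")     # small
print(f"sigma_n / sqrt(M):             {sigma_n / np.sqrt(M):.4f}")
```

The spread in the tau_i is enormous compared to the scatter in the estimated monthly means, and that scatter tracks my equation (5).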

Meanwhile, I’ll be happy to participate in questions and answers– including discussing what the spread in temperatures does mean, and hypothetical problems where that spread is meaningful.

113 thoughts on “Comment on Pat Frank’s Uncertainties: The algebra”

  1. I thought of an example using sine waves. That might make the point clear in your future posts.

  2. It would be cool to build a simulation as Jeff implies.

    To that (in the future) we could add:

    1. sensor precision
    2. noise
    3. a Tmax observation with rounding
    4. a Tmin observation with rounding

    and finally your monthly mean stuff

    It would be a handy dandy tool to answer some of the “brain busters” (hehe) about rounding, precision, and errors in monthly estimates in one handy package.

  3. Lucia,

    There are soooo many possible examples!

    Yes, I thought of a couple of illustrative examples from sampling of process tanks, and the like. Maybe just the Monte Carlo simulation will be clear enough.

  4. SteveF–
    I read that Pat Frank wrote he would answer a pertinent question at TAV. I’ll pick which example after I read what he writes.

  5. As a layperson in these matters, it sometimes appears that the correct determinations are rather straightforward, but I find it always instructive to go back to the basics, as Lucia is doing here, to assure oneself that one understands what is being determined.

    My biggest issues with the uncertainties of temperature measurements over time and space is calculating those uncertainties, given incomplete coverage both temporally and spatially. I have read papers that attempt to do this, but my layperson’s view of uncertainty remains unsatisfied. I would guess that empirically something could be done by way of comparing grids of satellite data with various ground station coverage, but I have not seen any publications that have made that comparison.

  6. Lucia 79007,

    I hope Pat takes the time to really think this one through. Nobody needs another (long and absurd) Anastasia M type thread arguing about an obvious error!

  7. one question to ask pat is this:

    will he accept a monte carlo simulation that shows the following:

    and then lay out what you think you will show.

    otherwise, he will just go back to some nonsense. and when he does that, all the other fur starts to fly.. nonsense like ‘carrick is mean’, global temperature has no meaning, thermometer metrology, blah blah blah, and the spherical cow will doubtless make a return.

  8. Mosher – Mr. Frank set out a paper for us to review and it appears to be flawed. Why is it you find it necessary to demean every skeptic who puts out something that ends up being flawed, and kiss the ass of every team member that does the same? Mr. Frank said he is willing to be shown where he is wrong; Carrick claimed he isn’t by asserting that he won’t. Time will tell, but in the meantime you find it necessary to mock him. Go hang out at Real Climate where you belong.

  9. geochemist (Comment #79011),

    Why is it you find it necessary to demean every skeptic who puts out something that ends up being flawed, and kiss the ass of every team member that does the same?

    I don’t think that is fair; Mosher seems to me to be pretty hard on most everyone when they err (for example, read some of his comments during the Steig et al/O’Donnell et al controversy). I admit I don’t really understand why he thinks it is helpful to be so hard on people.

  10. I think in an effort to appear “fair and balanced” he thinks he has to disparage people on the “skeptic side” even when they don’t deserve it. And I haven’t seen him comment during some discussions of some very questionable Team papers in recent weeks.

  11. SteveF. That’s a fair assessment.
    Geo: I think you are not very discerning. There are a variety of skeptical arguments that I think have run their course and come up wanting. that is, they have been proven to have no merit. yet they persist. They persist like the hockey stick persists. I think there are good skeptical arguments. I think people like Stevef and carrick make them. To the extent that people like Pat and others persist in discussing nonsense, people like carrick and steveF are not heard. Now, I am not just kissing carrick’s butt or steveF’s butt, but I pick them out as examples.

    Anyway: will you accept Lucia’s monte carlo result? Do you understand what she will do and why it will show whether pat is right or wrong?

    If you can accept the results of her test prior to seeing them, that is if you understand why her test will give you the answer, then there is hope for you.

    If you further commit to press pat to accept Lucia’s results, then there is hope for you.

    you can be sure that if Lucia doesn’t prove pat wrong she will post her failure.

    have a nice day.

  12. If I understand correctly:
    τi = true temp val
    ti = measured temp val
    ni = noise value

    ti = τi + ni

    Now we know ti, as we’ve measured it.
    How do we make any meaningful estimate of τi or ni?

  13. Mosher – I understand that most skeptical blog sites have a mixture of good stuff and fantasy (lots of really crappy solar stuff, for instance). Yes, I think the Monte Carlo simulations are the way to go, and I can gladly accept the outcome. I am simply not aware of any behavior on the part of Pat Frank that warrants the boorish comments. He put out his work to be dissected, is not hiding anything, and is polite to his detractors, who are questioning not only his stats but his ethics as well. If he is shown to be wrong and acts like a Team member, then he will deserve criticism.

  14. Kenneth Fritsch —
    Open questions related to determining the uncertainty in the temperature anomalies exist and are important. However, the particular one Pat suggests in his “case 2” in his first paper does not contribute to the uncertainty in the estimate of the monthly mean temperature. But what’s worse is that the supposed uncertainty discussed in that example appears to be a major contributor to Pat’s estimate of the uncertainty.

    It may be that somewhere later in paper 1 or paper 2, Pat identifies things that do contribute to the uncertainty and that what he observes would be useful to discuss. But before we can get there, we need to eliminate this pesky addition to the supposed uncertainty which appears to dominate his results.

    It seems to me that this false ‘uncertainty’ introduced in Pat’s paper 1 may be the main reason why the time series of annual average temperatures in his figure 4 look amazingly smooth given the supposedly huge observational uncertainty in his individual annual average measurements. It is, in fact, simply impossible that annual average temperatures of adjacent years are consistently so close in value if the actual uncertainty in any individual observed value is so enormous.

  15. Lucia, Steven Mosher,

    Why Monte Carlo? There are plenty of observations that show what you want to show.
    You could use the TLT or MXD, for instance. This would give an upper bound on the error, but it would be even more relevant, no?

  16. MichaelJ

    First:
    ti, the measured value is an estimate of tau_i. It’s the “meaningful” estimate of tau_i for any i.

    Uncertainty estimates are attempts to bound the magnitudes of the ni’s, which are never known. No individual value of ni is ever estimated. No one even tries to do that for individual values. What they try to do is estimate the standard deviation of all possible ni’s.

    But, more importantly, Pat Frank is trying to estimate the uncertainty in the mean of a group of ti’s (and tau_i’s). He’s not trying to estimate the uncertainty in the individual ti’s. But I think (actually, am sure) he’s gotten confused and mistaken the spread in the distribution of tau_i’s for an “uncertainty in the mean” of the tau_i’s. It just isn’t an uncertainty in the mean.

  17. phi-
    Monte Carlo lets us create something where we “know” the “true” value. I have no idea how you think you would show Pat’s mistake using TLT. What’s MXD?

  18. Lucia,

    Imagine that, assuming the TLT is the true value, you find (for instance) a standard deviation of the error a quarter of that proposed by Pat; you would then have proven that there is a flaw.

    MXD (maximum density) is the characteristic of tree rings that is best correlated with temperature.

    A warning anyway, remove the trend before the test.

  19. Re: lucia (Jul 13 14:32),

    MXD = tree ring density

    I think.

    I was thinking of looking at sea ice data. You have months where the rate of change/day is high and months where it is low. So if you average different numbers of days/month to get a monthly average and look at the overall trend, you should be able to show that the spread in the data over a month doesn’t make any difference in the calculated trend, only in the uncertainty of the trend, which should reduce as the number of points averaged increases. But a Monte Carlo is better as you can change all the parameters used to generate your data sets.

  20. I never really got into it much, but was Frank’s original argument that, because biases are shared across instruments and stations are not independent, the standard error of all stations combined into a temperature record should be the mean standard error of all stations rather than scaling as 1/sqrt(N)?

    This is of course rather silly, as instrument errors have a strong random component related to station moves, instrument changes, etc., and are not significantly concentrated one way or another (Menne et al. 2009: ftp://ftp.ncdc.noaa.gov/pub/data/ushcn/v2/monthly/menne-etal2009.pdf has a good analysis of this). Furthermore, a simple Monte Carlo analysis of randomly choosing a particular subset of stations (say, 5%) and constructing a global temperature will quickly demonstrate that the error bars are nowhere near as large as Frank alleges.
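    [A minimal sketch of the subsampling experiment Zeke describes, using an entirely synthetic stand-in for a station network; every number below is invented:]

    ```python
    import numpy as np

    rng = np.random.default_rng(2)

    # Synthetic network: every station sees a common "global" anomaly
    # plus station-level scatter (weather, siting, etc.). No real data.
    n_stations = 2000
    anomalies = 0.5 + rng.normal(0.0, 2.0, n_stations)

    subset = n_stations // 20   # a 5% sample of stations
    means = [rng.choice(anomalies, size=subset, replace=False).mean()
             for _ in range(10_000)]

    print(f"full-network mean:          {anomalies.mean():.3f}")
    print(f"spread of 5%-subset means:  {np.std(means):.3f}")
    # The subset means scatter by roughly 2.0/sqrt(100) ~ 0.2,
    # not by the full station-to-station spread of 2.0.
    ```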

  21. Lucia,

    Oh, I agree that there is plenty of non-random bias in the temperature record, but not all the standard error in instrument measurements is related to non-random biases. While completely ignoring systematic bias probably underestimates uncertainty, ascribing all variance to non-random bias certainly overestimates it :-p

  22. Zeke–
    I’m not sure about the sum total of all possible contributions to uncertainty claimed in Frank’s analysis. The mistake around equation (4) related to case 2 is fundamental and has nothing to do with correlation. In fact, up to (4) he appears to be making a claim that ought to be general to estimating the uncertainty in the mean of anything one might want to measure.

    Based on the text and the equation, he seems to be saying the standard error of all stations combined into an average would be non-zero even if every station measured temperature perfectly in the vicinity of the station. This would happen merely because the temperatures at the stations are not identical to each other.

    This is incorrect.

    Of course, you lay over that the issue of anomalies, station moves etc. and things get complicated. But it is not useful to start off with a concept about uncertainty that would be wrong for very, very simple problems.

  23. Lucia,

    In his original TAV post, he remarked that:

    “In short, if one doesn’t know the error is random, then applying the statistics of random error is a mistake.

    Guesstimated errors don’t go as 1/(sqrtN). They go as 1/(sqrt[N/(N-1)]). That means at large N, the error rapidly goes to 1×(the original guesstimate).”

    I took this to imply that in his analysis, the sample size of stations was irrelevant to the resulting standard errors.

  24. Re: lucia (Jul 13 16:20),

    This would happen merely because the temperatures at the stations are not identical to each other.

    It wasn’t at all clear to me that was the source of the uncertainty. Besides, isn’t that the reason they use anomalies in the first place? I’m also not convinced that, even if the uncertainty were as large as Pat Frank claims, you can’t detect a trend. Sure, the minimum detectable trend will be larger, but given enough points, the confidence limits on the trend could still exclude zero even for a relatively small trend. Of course it seems that the error is assumed to be i.i.d. rather than autocorrelated and possibly fractionally integrated.

    I took this to imply that in his analysis, the sample size of stations was irrelevant to the resulting standard errors.

    He does appear to make this claim. But in paper 1 he appears to make it in a very, very general way. His fairly simple “case 2” example posits that if you measure 10 things with 10 different temperatures, and each measurement has an error, the error in the determination of the mean has two contributions:
    1) Contribution 1 is related to errors in the 10 individual measurements. This contribution decreases as 1/sqrt(N). This exists in case 1 also, and is the error I discuss above.

    2) Contribution 2 is the standard deviation of the “true” temperature of the 10 things. This does not decrease as 1/sqrt(N).

    It happens that contribution 2:
    a) is large relative to the measurement error;
    b) would not go to zero even if we gridded the entire planet;
    c) would not go to zero even if we measured the temperature perfectly at every point on the surface of the earth.

    I did not show the formula for #2. But the fact is: This “contribution” to the uncertainty in the determination of the mean for the earth’s surface temperature is fiction. The standard deviation in temperature (or anomalies) across the surface of the earth is interesting, but if station coverage was perfect, it would contribute absolutely nothing to the estimate of the uncertainty in the observed temperature of the earth’s surface. But in Pat’s formulation, it contributes a lot— and not because of incomplete station coverage or any nuance about weather patterns.

  26. DeWitt–

    It wasn’t at all clear to me that was the source of the uncertainty. Besides, isn’t that the reason they use anomalies in the first place?

    In his first paper, Pat has not yet transitioned to anomalies when discussing cases 1 and 2. The sections could be discussed in terms of anomalies. But the issue is presented as very general. While the symbol “T” is used to imply temperature, prior to equation (4) nothing in the discussion even implies that the results would be limited to temperatures much less anomalies.

    But the fact is, what is discussed in case 2 is wrong for very simple things. I would be very surprised if it magically became correct when you suddenly changed to anomalies — but if it does, Pat needs to show why or how his equation to estimate the uncertainty, which does not apply in the general case — and can be shown to be absurd in the general case — somehow magically becomes correct for the specific case of anomalies.

    To get there, I need to discuss the general case first.

  27. Re: Zeke (Jul 13 15:08),

    Yes, it would seem that if pat were correct, then I should be able to construct a sample of stations (say 500 worldwide sites), compute his uncertainty bounds, and then… if I select 500 other stations, say 100 samples, a good portion should lie outside his 1-sigma line.

    Another way to put this is to ask pat what do we have to observe to show that his error bars are too wide?

    A more interesting line is to question whether there are 60 observations per month in the mean or 30, since the mean isn’t observed. I think RomanM and I discussed this back in 2007.

    Given a thermometer with a .2C sd, and given that the thermometer makes two measurements per day (min and max), what is the proper way to calculate the error? I’m not sure it’s 1/sqrt(60)…
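    [A quick numeric sketch of the 60-vs-30 question. The Gaussian read-error and the same-day correlation rho between min and max errors are both invented assumptions; the point is only that correlation moves the answer away from 1/sqrt(60):]

    ```python
    import numpy as np

    rng = np.random.default_rng(3)

    M = 30        # days in the month
    sigma = 0.2   # per-reading standard deviation (the ".2C sd" above)
    rho = 0.5     # hypothetical correlation between same-day min/max errors
    trials = 100_000

    # Same-day Tmin/Tmax read-errors drawn with correlation rho;
    # days are independent of one another.
    cov = sigma**2 * np.array([[1.0, rho], [rho, 1.0]])
    e = rng.multivariate_normal([0.0, 0.0], cov, size=(trials, M))

    # The monthly-mean error averages all 2*M readings' errors.
    err = e.reshape(trials, 2 * M).mean(axis=1)

    print(f"empirical SE:              {err.std():.4f}")
    print(f"sigma/sqrt(60), rho = 0:   {sigma / np.sqrt(60):.4f}")
    print(f"sigma*sqrt((1+rho)/60):    {sigma * np.sqrt((1 + rho) / 60):.4f}")
    ```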

  28. SteveF (Comment #79012)
    July 13th, 2011 at 1:34 pm

    I don’t think that is fair; Mosher seems to me to be pretty hard on most everyone when they err (for example, read some of his comments during the Steig et al/O’Donnell et al controversy). I admit I don’t really understand why he thinks it is helpful to be so hard on people.

    For the origin of that mode of behavior, I would suggest you go back to such web sites as ClimateAudit, for example. McIntyre is just far more selective in who he targets for that behavior.

  29. Lucia starts the post with a quote from Jeff Id:

    Pat Frank has a new article recently published in E&E on the statistical uncertainty of surface temperatures. He has requested an open venue for discussion of his work here. This is an opportunity for readers to critically assess the methods and understand whether the argument/conclusion is sound. – Jeff

    I am incredibly naive when it comes to having an article published in a journal; however, wouldn’t it have been more prudent for Pat Frank to have had these discussions prior to submitting his paper for publication? Aren’t the arguments raised here, and at tAV, questioning his findings generally the point of peer review?

  30. DeWitt,

    “Besides, isn’t that the reason they use anomalies in the first place? I’m also not convinced that even if the uncertainty were as large as Pat Frank claims, that you can’t detect a trend.”

    Anomalies don’t correct for differences in weather patterns in the basic standard-deviation uncertainty equations. The variance is reduced compared to raw temperature, of course, but the variance between stations is guaranteed to contain at least some valid weather signal. That is the problem: not all of the variance is error, and my guess is that, in fact, most of it is not.

    As to your second point, I don’t believe that Pat’s paper addresses knowledge of the trend whatsoever, except in its conclusion, so I think we agree on that.

  31. bugs (Comment #79041),

    Who said anything about Steve McIntyre? Lighten up dude (or dudette)!
    You ought to try to get over your apparent fixation with Mr. McIntyre. Whatever you think of him personally, both his blog’s technical content and his treatment of commenters, even those he disagrees with, are far better than you regularly suggest. It seems to me, bugs, that you ignore the endless snark and hostility that fall upon anybody who disagrees with the blog host’s POV at most CAGW blogs (that is, if thoughtful opposing comments are allowed at all… often they are not).

  32. Ian,

    Pat published in E&E, which is often… a tad less selective than desirable in its peer review (e.g. the Iron Sun paper and other gems).

  33. Re: lucia (Jul 13 17:05),

    I would be very surprised if it magically became correct when you suddenly changed to anomalies

    I didn’t mean to imply that I think Pat Frank’s analysis is correct for either absolute temperature or anomalies. An analysis that says that the uncertainty in the computed GAT is going to be large even if you had a perfect thermometer at every square centimeter of the Earth’s surface is problematic.

    Looking at daily Arctic ice area anomalies, weather noise is large. The standard error of a linear trend hardly changes whether you use 1 day or a 30 day average for each year. The trend is also fairly large.

  34. Consider a month, say: the month of July, 2010, which begins on 00:00 July 1, 2010 GMT and ends at 00:00 August 1, 2010 GMT (please notice the careful use of “on” and “at”). Consider that this month of July is taken to be entirely present during the time specified across the entire “surface” of the earth, where “surface” means whatever volume of space, enveloping the entire earth, is convenient for classifying temperature measurements as “surface” temperature measurements, according to whether the measurement in question was made within or without the said volume.

    Within this volume, which we are calling the “surface”, measured temperatures will vary from place to place and time to time, where “place” means a volume large enough to contain the temperature-sensitive portion of a thermometer and “time” means a duration long enough to allow a thermometer in that place to equilibrate with its immediate environment and record a reading. Considered in this way, there will be a very large number of temperatures which might be measured within the “surface” over the course of the month; and one can imagine that if all of the possible measurements were made and recorded there would be produced a very large set of temperatures. That set would have a mean. That set would have a variance and a standard deviation.

    [Now, one might also argue that in addition to this very large set of possible temperature measurements there exists a corresponding very large set of “actual” temperatures which existed in each place at each time and to which the measurements are but approximations, and that there must be an “error” associated with each measurement which expresses the likelihood of that measurement being equal to the “actual” temperature which would have been measured if only the thermometer had been some sort of “ideal” thermometer having perfect accuracy and infinite precision. I suggest that this “ideal” set is not worth worrying about and that the “errors” which its hypothetical existence implies are also not worth considering.]

    Returning to the very large set of possible measurements with its mean and standard deviation, which set would exhaustively characterize, to the limits of our instrumental capabilities, the temperature history of the “surface” of the earth for the month of July, 2010: we can regard the temperatures which have been measured and recorded in the CRU database as a random and unbiased (or not so random and hopefully quantifiably biased) sample drawn from the very large set of temperatures described above. That random (or not so random) sample will have its own mean and standard deviation. We don’t know what the mean of the large set of possible measurements is. We don’t know what the standard deviation of the large set of possible measurements is. We can calculate the mean and standard deviation of the CRU sample. We can only sample any given July once, because we are only present during any given July once.

    So: Given the mean and standard deviation of the single CRU sample of the set of possible temperature measurements of the global surface temperature for the month of July 2010, how does one estimate an interval within which there is a 95% probability that the mean of the large set of all possible temperature measurements will lie? And what is that interval for July 2010? And does not that interval characterize the error in the estimate of the mean surface temperature of July 2010?

  35. JT

    how does one estimate an interval within which there is a 95% probability that the mean of the large set of all possible temperature measurements will lie?

    Might I suggest you propose a method?

    And what is that interval for July 2010?

    And does not that interval characterize the error in the estimate of the mean surface temperature of July 2010?
    As you have not defined “interval”, I don’t know. But it appears to me you are trying to argue by rhetorical question.

    If you are trying to say that Pat Frank’s method gives the correct estimate of the uncertainty in the mean of the surface temperature (and I suspect this may be the “point” of your rhetorical question), then the answer is “No. It doesn’t.”

    But if you have an actual argument to suggest it does, provide it. And don’t try to do it by rhetorical question because that is actually a violation of the blog policies. (Because rhetorical questions don’t work and only lead to vast amounts of miscommunication as well as plausible deniability on the part of the person trying to make claims using the method of rhetorical question.)

  36. DeWitt:

    Looking at daily Arctic ice area anomalies, weather noise is large. The standard error of a linear trend hardly changes whether you use 1 day or a 30 day average for each year. The trend is also fairly large.

    That’s due to autocorrelation of course.

    I did this analysis at one point using the spectral Monte Carlo method from real temperature data, to compute the uncertainty in the OLS trend as a function of the integration period. Based on this analysis, you need about a 40-year period to measure a 2°C/century trend to a 10% accuracy.
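    [Not Carrick’s spectral method, but a crude stand-in: fit an OLS trend to synthetic AR(1) noise (all parameters invented) and watch the trend SE shrink as the record lengthens:]

    ```python
    import numpy as np

    rng = np.random.default_rng(4)

    def trend_se(years, sigma=0.2, phi=0.6, trials=2000):
        """Monte Carlo spread of OLS trends fitted to pure AR(1) noise.
        sigma (C) and phi are made-up monthly-anomaly noise parameters."""
        n = years * 12
        t = np.arange(n) / 12.0          # time in years
        # innovations scaled so the AR(1) series has marginal std sigma
        w = rng.normal(0.0, sigma * np.sqrt(1 - phi**2), size=(trials, n))
        e = np.empty((trials, n))
        e[:, 0] = rng.normal(0.0, sigma, size=trials)
        for i in range(1, n):
            e[:, i] = phi * e[:, i - 1] + w[:, i]
        slopes = np.polyfit(t, e.T, 1)[0]  # one fitted slope per trial
        return slopes.std()

    for years in (10, 20, 40):
        print(f"{years:>2d} years: trend SE ~ {trend_se(years):.4f} C/yr")
    ```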

  37. Zeke:

    Pat published in E&E, which is often… a tad less selective than desirable in its peer review (e.g. the Iron Sun paper and other gems).

    I did query Pat on E&E, and this is what he says (thanks to jstults for being my reader and interpreter on this):

    As you raised the issue of review, E&E 2010 22(8) and E&E 2011 23(4) were once a single manuscript that was submitted to the AMS Journal of Applied Meteorology and Climatology. It went through four reviewers, two associate editors and 3 rounds of review with them. Three of the four reviewers recommended publication. One of them was adamantly opposed from the outset. This latter reviewer merely continued to repeat his objections after I had repeatedly shown they were meritless. The editor accepted the one reviewer over the three, and rejected the manuscript. The three reviewers had many critical questions but none of them found any serious error in the analysis. Neither did the fourth, but his adamancy carried the day. That particular reviewer also gratuitously accused me of dishonesty. The whole process took a full year. It was a fairly standard climate science experience, for a skeptical submission.

    On the other hand, one of the E&E reviewers found an error in one of the equations that had gotten by all four AMS reviewers, two AMS AEs, and the editor.

    I would personally like to see him post the reviews.

  38. Steven Mosher:

    i dont think anyone has questioned his ethics. Carrick raised a fair issue about whether pat is convincible.
    I would take no umbrage at anyone raising such an issue with me. In any case we shall see what we see.

    I wouldn’t take umbrage at it either… it’s the opposite of what MarkT suggested on that thread; people in science are used to others challenging us on how objective we are being when we make an argument in favor of something we obviously have a personal investment in. Seriously, I can’t imagine this issue not coming up in scientific circles.

    Being as this wasn’t Pat’s first foray into this field, I admit I started with prior expectations as to how this will come out.

    But as you say, we’ll see what we’ll see.

  39. Re: bugs (Jul 13 17:06), Bugs.

    I love the way you try to make McIntyre responsible for my being hard on people. You ignore the very real possibility that I was attracted to McIntyre because I like the way he is hard on people. You ignore the fact that it takes years of practice to be a hard ass. You can probably go find a few interviews I did in the 90s where I was extremely hard on people. You should have at least had the sense to search and see if I was a hard ass prior to ever reading Steve.

  40. I wonder what we would conclude about a single thermometer using Pat’s approach. You know that thing is never hit by the same molecule twice. Is it really measuring the ‘temperature’ at its location… hmm

  41. steven mosher (Comment #79057)
    July 13th, 2011 at 11:51 pm

    Re: bugs (Jul 13 17:06), Bugs.

    I love the way you try to make McIntyre responsible for my being hard on people. You ignore the very real possibility that I was attracted to McIntyre because I like the way he is hard on people. You ignore the fact that it takes years of practice to be a hard ass. You can probably go find a few interviews I did in the 90s where I was extremely hard on people. You should have at least had the sense to search and see if I was a hard ass prior to ever reading Steve.

    steven mosher (Comment #79058)

    He is selectively hard on people, I said. Those who don’t support his initial guess that the ‘hockey stick’ was a fraud are treated harshly, but those who aren’t a part of the IPCC get off without a comment. He is remarkably gentle all of a sudden.

  42. “He is selectively hard on people”
    bugs,
    So are you. Steve McIntyre, for example.
    Andrew

  43. bugs–
    It’s very odd for you to hop in here and complain about Steve McIntyre. This thread has nothing to do with him. Clearly, you want to make everything about whatever gripe you have about steve.

    Given your mono-mania, and need to try to derail all threads with your mono-mania, I may need to write a filter that prevents person A (i.e. ‘bugs’) from using word B (i.e. Steve McIntyre).

  44. Lucia:

    In his first paper, Pat has not yet transitioned to anomalies when discussing cases 1 and 2. The sections could be discussed in terms of anomalies. But the issue is presented as very general. While the symbol “T” is used to imply temperature, prior to equation (4) nothing in the discussion even implies that the results would be limited to temperatures much less anomalies.

    You could modify his argument slightly and write:

    T_i = tau_i + n_i + s_i + l_i

    where n_i is monthly averaged “measurement noise” (which includes weather), s_i is the offset error associated with the instrument, l_i is the offset associated with location, and tau_i is the long-term (e.g., 30-year average) global mean temperature, e.g., something like this.

    We could assume that n_i obeys the statistics of weather (temporal and spatial) plus an instrumentation measurement error, and that it has the property $latex \sum_{i=1}^{360} n_i/360 \approx 0$.
    Generally the instrumentation measurement noise can be neglected relative to the weather noise.

    Replacing our original expression with an integral over a field,

    $latex T(t) = {1\over 4\pi M} \int_{t-M/2}^{t+M/2} dt \int_0^{2\pi} d\phi \int_0^\pi d\theta [\tau(\phi, \theta) + n(\phi, \theta) + l(\phi, \theta) + s(\phi, \theta)] $

    where M = one month.

    we could make the further requirement that the location error L given by $latex L = \int d\phi \int d\theta l(\phi,\theta) = 0 $.

    (phi=longitude, theta=latitude)

    That is, if you have “perfect coverage”, and s = 0, you’ll recover tau after smoothing long enough over t.

    But it makes clear that if you have imperfect coverage of the globe, then $latex L= \sum_i \ell_i \ne 0$, so L represents a systematic offset from the global mean temperature. Changing the distribution of stations over time (or relocating one) causes L to change.

    If you can assume that your coverage is “good enough”, then small changes in the distribution of stations can be neglected; otherwise you’ll get a variation in L over time that mimics a real signal (long-term temperature drift). You can estimate l_i by performing an offset-aligned weighted mean around location phi_i, theta_i, then subtracting it from the smooth global mean temperature. The point is that if what we are interested in is long-term trends, variations in l_i can affect them (regardless of the source… changes in the polar ice cap, land usage changes, as well as geographical shifts, will affect it; the only point being that it is the local deviation from the global mean temperature at time t).

    The instrumentation error S, given by $latex S = \int d\phi \int d\theta \; s(\phi,\theta) \ne 0 $, represents a systematic offset from the true temperature; however, this can be estimated by comparing the instruments actually used at some period (based on meta data) against NIST-calibrated thermometers. If the instruments never change, this has no effect on anomaly measurements.

    If the instrumentation changes, you’ll see “step functions” in temperature associated with the change in methodology (e.g., migration from a Stevenson screen with a mercury max/min thermometer to a thermistor in an aspirated solar screen with an electronic hourly readout), so this would include the time-of-observation bias too.

    From my view, the proper way of looking at the problem of measuring global mean temperature is to break out the error sources (another person may choose to break them out further than I did), so that each source can be separately modeled. This allows a Monte Carlo approach to be employed to set bounds on the uncertainty in T(t), something not possible when one doesn’t fully explicate the problem.
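    [A toy version of the decomposition Carrick describes; every number is invented, and the structure is the only point: the random terms average down with station count, while the systematic s term leaves a bias that does not:]

    ```python
    import numpy as np

    rng = np.random.default_rng(5)

    n_stations, trials = 500, 5000
    tau = 0.0                              # "true" global anomaly to recover

    # Systematic pieces are drawn ONCE: they do not refresh between trials.
    s = rng.normal(0.05, 0.1, n_stations)  # instrument offsets, biased (invented)
    l = rng.normal(0.0, 0.3, n_stations)   # location offsets (invented)
    l -= l.mean()                          # perfect coverage: L = sum(l) = 0

    est = np.empty(trials)
    for k in range(trials):
        n = rng.normal(0.0, 1.5, n_stations)  # weather + measurement noise
        est[k] = np.mean(tau + n + l + s)

    print(f"mean recovered anomaly: {est.mean():+.4f}  (offset ~ mean of s)")
    print(f"spread of estimates:    {est.std():.4f}  (~ 1.5/sqrt(500) = {1.5/np.sqrt(500):.4f})")
    ```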

  45. Lucia, my question was not rhetorical. I don’t know how to go about making such an estimate which is why I asked, having first described the problem as I understand it (which could easily be a misapprehension). I suppose one could make assumptions about the shape of the distribution of the large set of all possible measurements of temperature but I am not at all confident that the most physically reasonable assumption would be a standard normal distribution. Given a range of temperatures over the surface in July from say 50C in Bahrain to -90C in Antarctica and the asymmetrical distribution of temperatures in space across the surface and the fact that they evolve with time during the month I have no idea what a reasonable assumption in relation to the population distribution would be. I was hoping someone here would know.

  46. Had a latex error, that should read

    The instrumentation error S given by $latex S = \int d\phi \int d\theta \; s(\phi,\theta)$ represents a systematic offset from the true temperature

  47. JT, the noise distribution (if I am correct that it is dominated by weather/short-period climate) is not normal, nor is it uncorrelated between sites. However, the distribution can be modeled based on measurements of this noise, and Monte Carlo’d.

    Given that the temperature trend varies in latitude in a manner uncorrelated with population, while one should study the effect of changes in land usage (not just UHI), I wouldn’t expect that to dominate the net measurement error.

  48. JT–
    Ok. But I guess the problem is that your questions represent a subject change. There are– oddly enough– two separate issues:
    1) Are Pat’s claims justified?
    2) How would we do what Pat is trying to do correctly?

    My post is about (1), and mostly I’m trying to stay focused on that. Pat’s paper appears to contain claims that are simply flat-out wrong, and the way in which he deals with the time series seems equally wrong.

    Carrick’s comments above seem to be addressing (2)– but as you can see, doing (2) properly requires some sophistication. But we don’t need to actually do (2) fully or properly to see that some things Pat is claiming seem to not line up with reality.

  49. Lucia, I’m going to step through your analysis, top to bottom.

    You wrote, that you’re restricting your analysis to my equations 1 and 2. About them you wrote, “He is describing a sort of uncertainty that is irrelevant to CRU’s estimate of the uncertainty in the monthly mean temperatures they report.

    There’s no argument here, Lucia. Equations 1 and 2 are statistical illustrations only. As is Case 3. They are meant to lead up to Case 3b, which is clearly described as the only case applicable to the analysis of the CRU error model and then to only part of it: the CRU estimate of measurement read-error.

    But you prefaced that sentence with the statement that your discussion of Case 1 and Case 2 will be, “sufficient to explain why I think Pat Frank’s estimate of uncertainty for monthly mean temperature anomalies is inappropriate.

    But Case 1 and Case 2 are irrelevant to my actual analysis of the uncertainty in air temperature anomalies. I didn’t use them at all in making my estimate. Your analysis starts with a misapprehension, right off the bat.

    You then went on to discuss some measured “X,” about which we would like uncertainty estimates. About my concern with this “X” you wrote, “It happens that Pat Frank discusses monthly mean temperatures reported by CRU, so in my discussion I will elaborate using monthly mean temperatures as examples.

    But to go from “X” to a mean is already a jump too far. My entire concern has been with individual temperature measurements, and the errors accruing to them. Obviously, uncertainties in twice-daily temperature measurements will propagate into a monthly mean. But my primary concern about uncertainty is based in the individual measurement, the “X” values, and not in the uncertainty in the mean. You’ve now displayed two misapprehensions.

    In your first bullet point about the July 2011 mean temperature you then state that it’s:
    a) “a mean for each daily value in July“; It’s the mean of 62 twice-daily July min-max temperature measurements.

    b) “it is the value that actually occurred during July, not some other month“; The mean is not a physical observable. It’s not a temperature, it’s a statistic. It’s associated with July 2011, but it would never occur at all at any time during July.

    c) “it is the value that occurred in the July that happened on earth during 2011…” Again, the mean is a statistic. It never occurred and has no physical reality.

    At the end of your second bullet point, you again mention, “observed mean temperatures,” as though a statistical mean is a physical observable.

    With respect to Section 2, you wrote that in it I explain “how [I] would compute and report the error for that case.” Those cases are cited to references 14 – 17. Cases 1 and 2 describe completely standard methods of computing uncertainty. It’s not that I would compute the error that way. It’s that the equations describe the standard and correct way to compute uncertainty in those cases. I know of at least two professional statisticians who have read paper 1, and neither of them objected to, or reported any error in, those cases.

    In item 3 under your equation (1), your summation expression for sigma_n is missing a supervening square root sign. As it stands your equation describes the variance of ‘n,’ not the standard deviation.

    Regarding paper Case 2, you wrote that it, “is ultimately intended to provide an estimate of uncertainty that is relevant to CRU monthly mean temperatures.

    As I noted above, Case 2 is not intended to provide an estimate of uncertainty in monthly mean temperatures. To suppose so requires that daily temperatures are measured under Case 2 conditions of stationary noise.

    In fact, I describe the Case 3 conditions, of unequal temperatures and unequal variances as “closest to a real-world spatial average, in which temperature measurements from numerous stations are combined.

    It’s Case 3 that most closely applies to the temperatures CRU collates, not Case 2.

    One of the central points of my paper is that noise stationarity, specifically read-error stationarity, is not known to be present. Section 3.1 discusses this point in detail.

    It is a capital mistake to suppose that Case 2 conditions apply, or were applied in my paper, to the uncertainty in CRU temperature means.

    Your entire analysis rests on this mistake.

    In your equations (2), (3) and (4), the summations should start at i = 1, not i = 0.

    In your equation (5), the summation should be over i = 1 to M, not 0 to N, and the summation, but not the “M” divisor, should be supervened by a square root sign. Equation 5 is wrong as you’ve written it.

    Following equation (5), you wrote, “Contrary to Pat Frank’s discussion in section 2, the spread in the temperatures T_i over the “M” days of the month makes absolutely no contribution to the uncertainty in CRU’s ability to estimate the mean value of the M temperatures.

    Is that in my discussion? Let’s see: my equation (5) gives the computed mean as “T-bar (+/-) sigma_n/sqrtN.”

    So the Case 2 computed uncertainty in my paper is sigma_n/sqrtN. Your equation (5) gives the uncertainty in the computed mean as sigma_n/sqrtM. Apart from the fact that your uncertainty should be written as sigma_m/sqrtM, it seems to me that our expressions are identical.

    Where’s my mistake?

    But then, to emphasize that you object to noticing the uncertainty due to the inherent spread of tau_i magnitudes, you wrote that, “Only the errors, n_i will contribute to the uncertainty in the estimate of the observed value.

    This is true. But your statement is a complete non-sequitur with respect to the uncertainty due to the spread in tau_i magnitudes.

    Let’s see what I wrote about tau_i in Section 2, Case 2, immediately following my equation (5):

    However, a further source of uncertainty now emerges from the condition tau_i [is not equal to] tau_j. The mean temperature, T-bar, will have an additional uncertainty, (+/-)s, [notice, it’s not (+/-)sigma] reflecting the fact that the tau_i magnitudes are inherently different. The result is a scatter of the inherently different temperature magnitudes about the mean,…(emphasis added)”

    I then pointed out that (+/-)s and sigma_n are statistically independent. How is it remotely possible to suppose, as you did, Lucia, that I meant (+/-)s to be a measure of the statistical uncertainty in a computed mean?

    The (+/-)s is magnitude uncertainty.” Here’s what I wrote about magnitude uncertainty, (+/-)s, at the end of Section 2, Case 2:

    The magnitude uncertainty, (+/-)s, is a measure of how well a mean represents the state of the system. A large (+/-)s relative to a mean implies that the system is composed of strongly heterogeneous sub-states poorly represented by the mean state [18]. This caution has bearing on the physical significance of mean temperature anomalies (see below).

    Does it seem to anyone here that I represented magnitude uncertainty, (+/-)s, as a measure of the statistical uncertainty in the computed mean temperature?

    Magnitude uncertainty figures into paper 1 in several places. It is always distinguished from noise uncertainty. It is always represented as a measure of the physical variability of the system.

    We are, after all, discussing a physical system. That means we must hew to physical meaning.

    When we discuss a physical system, the variability of the physical state is of central interest. To represent a dynamical state as only the statistical mean value of all the sub-states is to simplify the description into near physical meaninglessness.

    Air temperature varies continuously. Daily temperatures will vary across a given month. To describe monthly temperature only as the mean value is to withhold all the information about the temperature dynamics of the month.

    Magnitude uncertainty communicates this dynamical information. It is a real aspect of any monthly mean temperature. We come to this: when we communicate the full meaning of a mean temperature, not only is the noise (measurement) uncertainty relevant, but the dynamical variability of the physical system about the state mean is also relevant.

    The statistical noise uncertainty in the mean is sigma_n/sqrtN; this is clearly derived in my paper and both Lucia and I agree on this.

    However, the physical variability in daily temperatures is given by the magnitude uncertainty, (+/-)s. The (+/-)s is present because we are interested in the physical variability about a statistical mean meant to represent a dynamic physical system. The (+/-)s is not statistical uncertainty. It is the statistical representation of a physical variability. Its application to the variability about a mean temperature is entirely justified.

    Lucia, you have misapprehended my analysis by supposing that Case 2 statistics were applied to CRU temperatures. They were not.

    It seems likely that either you have not read Section 3 at all, or you have not understood how it justifies my analysis of CRU temperatures, in terms of Section 2, Case 3b, not Case 2.

    You have either not read or not understood any of the descriptions provided in the paper about the meaning of magnitude uncertainty.

    You have completely misunderstood the analysis in my paper.

    Your analysis starts with an error and concludes with a mistake.

    To reiterate the central points of paper 1: measurement read-error is not known to be random, and is merely given an estimated average magnitude by the CRU scientists. As an estimated average, the statistics of random error are inapplicable.

    Second: instrumental systematic error is completely neglected by the CRU scientists. Sensor systematic error varies in time and place. It is deterministic and not random. The statistics of random error do not apply to systematic error, either.

    None of the central points of this analysis of the uncertainty in temperature anomalies involves random error. So any attempt to assess them using a Monte Carlo analysis is very much more than likely to be entirely irrelevant.

  50. There are many comments here, but they’ll have to wait because it’s now very late. But I’d like to respond to Steve Mosher’s second comment.

    You wrote that I might, “go back to some nonsense … like ‘carrick is mean’”

    Where did I write that Carrick is mean, Steve? As I recall, I just pointed out that his comments about me and about E&E were unfair and unjustified. I wrote nothing personal about Carrick himself. That’s a fact. Where is my “nonsense”?

    Did I really write that, “global temperature has no meaning“? Or was it that global average surface air temperature anomalies without appropriate error bars are almost physically meaningless? Do you find that statement to be nonsense?

    The post at WUWT about “thermometer metrology” was by Mark Cooper, with a nice preamble by Anthony. How nonsensical.

    So far as I can see, Steve, the only nonsense appearing in your post is you making stuff up.

  51. Pat Frank,
    I do not see that for any physical process with internal variability it is less possible to accurately describe the uncertainty of the state of a system than for one which does not have internal variability, so long as there is sufficient sampling coverage of the system. I think you continue to conflate uncertainty in a trend (the CRU temperature history) with weather variability. Whatever you are calculating, I doubt it has anything to do with uncertainty in our knowledge of the trend in Earth’s average temperature, and that is what people want to know about Earth’s temperature history. Maybe your analysis tells us about weather variability over time, but I think that is about it.

  52. Pat Frank

    Equations 1 and 2 are statistical illustrations only. As is Case 3. They are meant to lead up to Case 3b, which is clearly described as the only case applicable to the analysis of the CRU error model and then to only part of it: the CRU estimate of measurement read-error.

    In case 3b you write, “The condition in Case 3 also produces a magnitude uncertainty, ±s, in analogy with Case 2.” You end with “In Case 3b, [the average noise uncertainty] does not diminish as [1/sqrtN], and ±s cannot be separated from [it].”

    But the “s” from case 2 is wrong.
    So, your discussion of uncertainty in case 3b is wrong because your discussion of uncertainty in case 2 is wrong.

    But to go from “X” to a mean is already a jump too far.

    No it’s not.

    My entire concern has been with individual temperature measurements, and the errors accruing to them.

    If this is your entire concern, then your abstract should not say things like “representative lower-limit uncertainty of ±0.46 C was found for any global annual surface air temperature anomaly.” The “global surface air temperature anomaly” is not an individual temperature measurement; it is a mean. So, your abstract sure gives the impression that your concern is means.

    The uncertainty of 0.46C is wrong if attached to “the global surface air temperature anomaly”. It is wrong because your method cannot be used to compute uncertainties for means. If the difficulty is that your true focus is uncertainties of individual values, but you then applied those uncertainties to a mean, then you clearly managed to confuse yourself and forgot that it is utterly, totally and completely inappropriate to attach errors that apply to individual measurements to computed means.

    (FWIW: I’m not sure your estimate is even correct for any individual value since I don’t know what aspect of “uncertainty” you are trying to convey. Maybe that uncertainty is correct for something else– I will defer judgement on that.)

    As for your silly quibbles:
    (a) Even if you compute the daily value based on two measurements, the value for July is still the mean of the daily values in July.
    (b) Rebutting strawmen is stupid. I didn’t say the mean is a physical observable. Even if something is a statistic, it still occurred. It’s stupid to try to play Mr. Language Person to distract from your egregious error in case 2, which propagates into case 3b and paper 2.
    (c) I repeat (b).

    Cases 1 and 2 describe completely standard methods of computing uncertainty.

No. Case 2 is wrong — or if it’s correct for some mystery statistical problem, you need to reveal the statistical question for which you think those uncertainties are relevant.

    Case 2 is not intended to provide an estimate of uncertainty in monthly mean temperatures

    Then your paper is irrelevant to discussing CRU temperature and there is simply no point in paying any attention to it whatsoever.

    It’s Case 3 that most closely applies to the temperatures CRU collates, not Case 2.

Equation 1 in your paper 2 parallels case 2, not case 3.

FWIW: I’ve read section 3. The errors in your “case 3b”, which spring from your error in case 2, propagate into that section.

As to how the error in case 2 infects section 3, you write:

Therefore, the ±0.2 C estimate in Ref. [12] is the assessed […] of Case 3b above, namely an adjudged assignment taken to represent the average uncertainty from an ensemble of surface stations

So, you used Case 3b, which is wrong because it includes the incorrect error from case 2.

    You are introducing this ±s uncertainty all over the place in that section. When applied to estimating the uncertainty in the annual average– a mean– the correct value for ‘s’ is zero. But you are using the relation for error you describe as ‘±s’ in case 2.

    That’s wrong.

Also it is clear you are claiming your error estimates apply to averages, not individual temperatures:

    Example text:
    * before equation (9)

“Applying the estimated average per measurement uncertainty, […], the total noise uncertainty in any measurement average is”

    Finally, you end by claiming:

    None of the central points of this analysis of the uncertainty in temperature anomalies involves random error. So any attempt to assess them using a Monte Carlo analysis is very much more than likely to be entirely irrelevant.

Well, it’s convenient to decree your statistical claims can’t be tested! But in any case, what you say is mystifying. Monte Carlo isn’t restricted to random error. There only needs to be something apparently random.

In the end, I have to say:
    based on your response I have concluded that you are so confused about statistics that it is pointless to discuss it with you at all. You are simply so confused about what you even did that I suspect no one can possibly get you to understand how totally, utterly, wrong it is. You have achieved a level of bogosity that defies imagination.

    Yes. I am joining Carrick on this.

That said: I do invite you to provide answers to the questions in http://rankexploits.com/musings/2011/questions-to-clarify-contribution-of-spread-in-population-to-uncertainty/ . Because I would truly like to know what answers you would get using your insights into statistics. Ideally, I would like you to provide the answer, explaining how you would do the problem using the method you outlined in paper 1. If you like, ignore the uncertainty of the thermometers.

  53. Wow…. Is there some sort of history to the relationships among Pat Frank, Lucia, Carrick and Mosher that would account for the nasty devolution of this discussion? When I don’t have a firm grasp of the technical issues, I consider the tone of the respective arguments. Aggression and abuse generally function to compensate for missing substance, and arguments that employ them are unreliable.

  54. “devolution of this discussion”

    It happens here frequently. People have axes to grind, big and small, and there are no exceptions.

    Andrew

  55. Amac–
I don’t think anyone is going to learn anything unless Pat Frank is willing to step back to case 2 and explain what he thinks is right or wrong about it. For example, part of his ‘defense’ seems to be to marvel

    I then pointed out that (+/-)s and sigma_n are statistically independent. How is it remotely possible to suppose, as you did, Lucia, that I meant (+/-)s to be a measure of the statistical uncertainty in a computed mean?

Yet, the reason it is “remotely possible” for me, Lucia, to think he meant ±s to be a measure of statistical uncertainty in the computed mean is that he says so in his paper 1. In fact, he says so when discussing case 2. He says

    Following from eqns. (5) and (6), although the impact of random noise on $latex \bar{T} $ diminishes
    with $latex 1/ \sqrt{N} $, the magnitude uncertainty in $latex \bar{T} $, given by, ±s = […] [17, p. 9ff], does not.

$latex \bar{T} $ is the computed mean. He is clearly discussing contributions to the uncertainty in the computed mean, not individual measurements. This usage– treating s as the uncertainty in a mean– continues and persists throughout the entire paper and appears in paper 2.

At some point, Pat is going to have to at least admit his paper does literally say that ±s is the uncertainty in the mean and this is not a misunderstanding on my part. His paper says it.

    Then, if he admits his paper says it, he can either
    a) say it was a typo.
    b) that he literally did not use it. (Then we can move to section 3b and talk about that. )

But right now, I think he would rather be pugnacious and defend case 2 by a) ignoring it and b) writing snotty rhetorical questions, hoping no one answers them and imagining that people like clazy will not notice they were snotty and will think that the mere appearance of a rhetorical question suggests that the reason it is “remotely possible” for me to “assume” something is that I am confused.

Quite honestly, unless Pat is going to step back and address case 2, sticking to what he means, what his paper says, and what his paper actually does, I don’t think anyone is going to learn anything.

  56. Lucia,
    .
The problem is (I think) that Pat is so dug into the equations and formalism of his approach that he is completely missing the big picture; that is, he just doesn’t seem to clearly understand the most basic relationships between statistical methods and what those methods are really doing, so he has no ‘reality check’ to fall back on when he is wandering off into shoulder-high weeds.
    .
    Pat rejecting the legitimacy of a Monte Carlo test is a really bad sign of where this thread will end up.
    .
    Pat,
    You may imagine that a bunch of people of (more or less 😉 ) good will, who have probably 100’s of years combined experience handling noisy data from fields outside of climate science, are all wrong about this, and that you are right. But that seems to me very unlikely. IMO, you are nit-picking and not seeing the big picture; you are not even addressing the substantive issues.

  57. clazy8 (Comment #79120),
I agree that snark and hostility serve no purpose. But in this case, Pat is, based on my personal experience and understanding, utterly wrong in a purely technical sense, and seems unwilling to consider reasoned arguments which show this is the case. His argument is comparable in quality to arguments like “back-radiation from CO2 is impossible because it violates the second law of thermodynamics”, which you will hear from a lot of well intentioned folks…. who have not a clue what they are talking about.
    .
Being very wrong, in a purely technical sense, combined with the hostility evident in Pat’s comment above, is a sure-fire way to elicit hostility. If you read over the earlier comments here and at The Air Vent, I think you will see that most people (OK, maybe not Carrick!) were very measured in their initial comments.

  58. SteveF:

    I think you will see that most people (OK, maybe not Carrick!) were very measured in their initial comments.

    Well here is what I said:

    The real test of objectivity will be if this error analysis is shown to be in error, would he and you [Pat and phi] admit it.

    My bet is on won’t.

    This wasn’t prescience on my part by the way. It was a judgement based on prior experience with Pat and phi.

    My bet still is on “won’t”.

  59. Lucia:

    Carrick’s comments above seem to be addressing (2)– but as you can see, doing (2) properly requires some sophistication. But we don’t need to actually do (2) fully or properly to see that some things Pat is claiming seem to not line up with reality.

    I think (1) is self-evident, so I find the second more interesting.

    Pat:

    None of the central points of this analysis of the uncertainty in temperature anomalies involves random error. So any attempt to assess them using a Monte Carlo analysis is very much more than likely to be entirely irrelevant.

The underlying function need not be random in order to use the Monte Carlo method (simple counterexample: using the Monte Carlo method to compute an integral). In fact, I described a framework above for how one would apply the Monte Carlo method to this problem.
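A minimal sketch of that counterexample (a toy integrand of my own choosing, nothing from either paper):

```python
# Monte Carlo estimate of a purely deterministic quantity: the
# integral of x^2 on [0, 1], whose exact value is 1/3. Nothing
# about the integrand is random; randomness enters only through
# where we choose to sample it.
import random

N = 100_000
estimate = sum(random.random() ** 2 for _ in range(N)) / N
print(estimate)  # ~0.333; converges to 1/3 as N grows
```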

    It’s obvious you just don’t know what you’re talking about.

  60. Carrick,

    “This wasn’t prescience on my part by the way. It was a judgement based on prior experience with Pat and phi.”

Oh, I’d be really happy if you gave a reference about me. I do not much like these unsubstantiated claims.

  61. phi, it’s a pattern of behavior I’m referring to. If people are interested, they can read the thread in question and come to their own conclusion. I’m not sure why I have to be specific as to why I formed a judgement.

But if you want a specific example—how about Pat Frank’s paper, which most people here now view as flawed and which you supported on JeffID’s thread?

  62. Carrick,

    This wasn’t prescience on my part by the way. It was a judgment based on prior experience with Pat and phi.
    My bet still is on “won’t”.

Short of a sudden insight on Pat’s part, it looks like your bet is the right one. I had not really had any interaction with Pat before The Air Vent thread; I will be a bit more ‘skeptical’ about trying to engage him on technical issues in the future. Life is too short to waste time… especially when you are already well into the descending part of the curve.

  63. SteveF:

    Life is too short to waste time… especially when you are already well into the descending part of the curve.

Well hopefully you’ll surprise yourself and end up in that urn much farther down the road, but (and this is just a general statement, applicable to whomever one pleases) when it becomes obvious people have entrenched positions, and lack the technical ability to adequately justify them to boot, then it serves no purpose to have extended discussions with them after this has been established.

  64. Carrick,

    when it becomes obvious people have entrenched positions, and lack the technical ability to adequately justify them to boot, then it serves no purpose to have extended discussions with them after this has been established.

    Agreed. I would go further, it usually makes no sense to have even brief discussions. Which is why I have an informal list of commenters that I never engage, or even respond to, on both sides of the AGW divide… it would be a waste of my time and theirs.

Anticipating that Pat would reject a monte carlo test, on Airvent I posed the question differently.

    What observation would we have to make to convince pat that his error bars are too large?

failing to provide an answer to this question puts his position squarely in non-falsifiable land.

  66. Re: Pat Frank (Jul 15 00:46),

Pat, I’m referring to all the OTHER nonsense that invariably gets raised by other participants in these discussions: from carrick is mean, to mosher is hard on people, to global temperature has no meaning, to metrology… it’s the diversions. Doubtless, I could go to the Air Vent thread and point this out to you, but let’s not make the topic of diversions a diversion. Lucia and others have put questions to you. Start there.

  67. steven, unfortunately Pat is as prone as the others to resort to diversions to avoid discussing problems with his paper, such as Eq. (2).

Carrick, if you’re going to use a Monte Carlo analysis to estimate systematic error in a large ensemble of measurements, you’d have to know something about the distribution of the magnitudes and skewness of the errors. Your Monte Carlo method would have to pick out representative subsets.

    But the distribution of the magnitudes and skewness of the systematic error in surface air temperatures is almost entirely unknown. Absent that knowledge, you’d have to simulate the error and its distribution. Your Monte Carlo analysis would just be sophisticated speculation. Made in ignorance, your Monte Carlo analysis would tell us virtually nothing about real world errors.

You could end up telling us how the total error might look if the systematic errors were distributed in some assumed way or another with respect to magnitude, skewness, and kurtosis. Interesting, maybe. Empirically helpful? Not really, especially when the systematic error entered the data in the unrecoverable past.

    As I see it, the only way to approach the problem is to set up climate station calibration experiments in representative regions of the globe, and spend a few years measuring the systematic error produced by a representative set of temperature sensors. To get an idea of how that experiment might look, I’d suggest reading the papers by Hubbard and Lin, referenced in paper 1. One might make a start by doing representative sensor calibration experiments in wind-tunnels able to also simulate solar heating.

    It seems to me you have put confidence in the method without thinking enough about the problem.

  69. Carrick (#79161),
    Fun. That’s exactly what I thought. You make disparaging judgments without having the least argument. This is called defamation.

Lucia, you wrote, “But the ‘s’ from case 2 is wrong.”

    I looked among your comments to see if I could find the reason why you think it’s wrong.

    There’s this, “But what’s worse is that the supposed uncertainty discussed in that example appears to be a major contributor to Pat’s estimate of the uncertainty. … It seems to me that this false ‘uncertainty’ introduced in Pat’s paper one may be the main reason why the time series of annual average temperatures in his figure 4 look amazingly smooth…

    You’re here discussing (+/-)s in Case 2. The (+/-)s makes zero contribution to the uncertainty I calculated and displayed in paper 1 Figure 4. The uncertainty bars in Figure 4 represent the r.m.s. of MMTS systematic error and the CRU estimated average of read-error.

    The time series in Figure 4 is exactly the anomalies downloaded from GISS, with the link given in Figure 4, Legend. Whatever you think of their smoothness, I did nothing to them.

    You also wrote there, “It is, in fact, simply impossible that annual average temperatures of adjacent years are consistently so close in value if the actual uncertainty in any individual observed value is so enormous.

    Not if the sensor the physically determined systematic sensor error, like air temperature, is correlated regionally and tracks air temperature persistence.

And there’s this, “But, more importantly, Pat Frank is trying to estimate the uncertainty in the mean of a group of ti’s (and tau_is). He’s not trying to estimate the uncertainty in the individual ti’s. But I think (actually, am sure) he’s gotten confused and mistaken the spread in the distribution of tau_is for an “uncertainty in the mean” of the “tau_i’s”. It just isn’t an uncertainty in the mean.

    But I call the distribution of tau_i magnitudes a measure of the physical variability of the system. This is different from the noise uncertainty and different from systematic error. It is not a measurement uncertainty and is never represented in my paper as a measurement uncertainty.

    And there’s this one, “Based on the text and the equation, he seems to be saying the standard error of all stations combined into an average would be non-zero even if every station measured temperature perfectly in the vicinity of the station. This would happen merely because the temperatures at the stations are not identical to each other.
    This is incorrect.

    But (+/-)s is not the standard error in the temperature measurements. It’s the standard deviation of the magnitude variation of the anomalies. It will always be non-zero and always reflect the physical variability of the anomalies over time. That meaning for (+/-)s is not incorrect. And that is the meaning I give to (+/-)s in the paper.

    There’s also this one, “I did not show the formula for #2. But the fact is: This “contribution” to the uncertainty in the determination of the mean for the earth’s surface temperature is fiction. The standard deviation in temperature (or anomalies) across the surface of the earth is interesting, but if station coverage was perfect, it would contribute absolutely nothing to the estimate of the uncertainty in the observed temperature of the earth’s surface. But in Pat’s formulation, it contributes a lot– and not because of incomplete station coverage or any nuance about weather patterns.

    Once again, (+/-)s is represented throughout my paper as an estimate of the physical variability of the anomalies. It is never represented as a measurement error statistic.

    In calculating the uncertainties shown in Table 1, Table 2, and Figure 4, (+/-)s is never part of them.

Maybe I should not have used the term “magnitude uncertainty” when I described (+/-)s. Maybe I should have used ‘magnitude variability.’ Same thing, different name, still not a statistical measurement uncertainty.

In this comment, after quoting the passage about equations (5) and (6), you wrote, “T_bar is the computed mean. He is clearly discussing contributions to the uncertainty in the computed mean, not individual measurements. This usage– treating s as the uncertainty in a mean– continues and persists throughout the entire paper and appears in paper 2.

“At some point, Pat is going to have to at least admit his paper does literally say that ±s is the uncertainty in the mean and this is not a misunderstanding on my part. His paper says it.

    Above the part you quoted, I wrote this: “The mean temperature, T_bar, will have an additional uncertainty, (+/-)s, reflecting the fact that the tau_i magnitudes are inherently different.” And below the part you quoted, I wrote this: “The magnitude uncertainty, (+/-)s, is a measure of how well a mean represents the state of the system. A large (+/-)s relative to a mean implies that the system is composed of strongly heterogeneous sub-states poorly represented by the mean state [18].
The second quote clearly gives the meaning of (+/-)s as indicating physical variability. Nowhere have I written that (+/-)s reflects a statistical measurement error.

    End of page 977, Section 3.2.1: “This uncertainty [i.e., (+/-)s] transmits the confidence that may be placed in an anomaly as representative of the state of the system.” Once again, (+/-)s is not represented as statistical measurement error of the mean.

    But in any case, looking over your objections, Lucia, it seems to me that they rest on a misperception of the stated meaning of (+/-)s.

    It’s not a measurement error. It’s not a measurement uncertainty. It’s a measure of the variability of physical magnitudes. I have defined it that way every single time.

    I’ve looked through Case 2 to find where you might have gotten your mistaken impression of (+/-)s, and found this likely candidate sentence: “Therefore under Case 2, the uncertainty never approaches zero no matter how large N becomes, because although (+/-)sigma_n should automatically average away, (+/-)s is never zero.

    I can see how, taken alone and with no context, that sentence could lead someone to believe as you do.

    But let’s put that sentence into the context provided by the paper. Just after that sentence, I give the usual equation for calculating the standard deviation: SD = sqrt[sum over (t_i – T_bar)^2/(N-1)].

    Above equation (6) I point out that when tau_i is not equal to tau_j, etc., then (t_i – T_bar) = (n_i + delta_tau_i), where delta_tau_i is tau_i minus T_bar.
    (I just noticed a typo in the paper: the parenthetical T is missing its bar.)

    As you noted in one of your comments, Lucia, we never know the magnitude of n_i. We also do not know the magnitude of delta_tau_i.

    That means we cannot separately calculate (+/-)s or sigma_n. When we calculate the standard deviation of our measurements, the quantity under the square root sign, (t_i – T_bar)^2, contains both n_i and delta_tau_i.

When N is large, the portion of the calculated standard deviation due to n_i averages away as 1/sqrt(N). However, the portion due to delta_tau_i never averages away.

    Therefore, when n_i and delta_tau_i are both present, the standard deviation does not average to zero no matter how large N becomes. Under the conditions of Case 2 and at large N, the empirical standard deviation will not approach zero. It will approach (+/-)s for that value of N.

    That is the meaning of the Case 2 sentence. It does not mean that (+/-)s is some sort of statistical measurement uncertainty in T_bar. It means that when the uncertainty in T_bar is empirically calculated as the standard deviation of a set of temperatures that vary inherently in magnitude, the total standard deviation will not tend to zero at large N.

    Admittedly, to get that meaning, one must put together the concepts that are separately given in the paragraph around that sentence; about the given meaning of (t_i – T_bar) joined to the given meaning of an empirical standard deviation.

    When someone calculates a standard deviation of a series of measurements of temperature that vary inherently in magnitude, the uncertainty in the mean as conventionally represented by standard deviation will never go to zero and will include (+/-)s.

    If the experimenter knows that the measured temperatures vary inherently, then s/he will be aware that the standard deviation includes the inherent magnitude variation and realize the measurement uncertainty is thereby over-estimated. If s/he does not know the temperature magnitudes vary inherently, then the standard deviation contaminated with (+/-)s will be (mistakenly) represented as the measurement uncertainty.
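For what it’s worth, here is a minimal numerical sketch of exactly that arithmetic, with toy numbers of my own rather than anything from the paper. The sample standard deviation settles at sqrt(s^2 + sigma_n^2) instead of shrinking toward zero:

```python
# t_i = tau_i + n_i, with tau_i varying inherently (a sinusoid with
# spread s ~ sqrt(2)) and n_i stationary noise with sigma_n = 0.5.
# The sample SD of the t_i does not shrink with N; it settles at
# sqrt(s**2 + sigma_n**2) ~ 1.50, because the delta_tau_i portion
# never averages away even though the n_i portion does.
import math
import random
import statistics

sigma_n = 0.5
for N in (100, 10_000, 1_000_000):
    tau = [2.0 * math.sin(2 * math.pi * i / N) for i in range(N)]
    t = [x + random.gauss(0.0, sigma_n) for x in tau]
    print(N, round(statistics.stdev(t), 3))  # ~1.50 at every N
```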

    If that sentence has led you astray, Lucia, I sincerely regret it, regret the trouble it has caused you, and wish I had written it a different way.

    However, given the context of the paper and the explicit statements about the content of (t_i – T_bar) and the expression for standard deviation, the sentence is correct as written.

    Finally, I’ll reiterate the existence of several places in my paper where (+/-)s is defined as representing physical variability, and never as being a statistical measurement uncertainty.

  71. Pat–
Case 2 is wrong. You misattribute the spread in the temperatures as an uncertainty in the mean. That you don’t admit that you say this in your paper and wish to “rebut” by posing a rhetorical question asking how anyone can think you did this is a very bad sign. Read your case 2. Concoct a Monte Carlo simulation where you can ever find an error in the mean that obeys that equation. You won’t be able to do so.

Don’t claim it can’t be done with Monte Carlo. If it can’t be done with Monte Carlo, then you don’t understand Monte Carlo or statistics, so go learn.

In any case, are you suggesting the equations in paper 2 were not used to compute the uncertainty intervals in figure 4 of paper 2? Because they are wrong too. Equation 1 in paper 2 is wrong. After you run a Monte Carlo script on simple things like case 2, we can move on to your other wrong equations.

    r.m.s. of MMTS systematic error

    Presumably your perception of what constitutes “systematic error”. You have sucked things into “systematic error” that are not “systematic error”.

    For example, you are saying things like this

    Not if the sensor the physically determined systematic sensor error, like air temperature, is correlated regionally and tracks air temperature persistence.

Air temperature persistence from day to day due to weather is real; it is not an “error” to measure that it was hotter yesterday and cooler today.

Look: I’m not going forward on this unless you first show in what way your case 2 is right. I think you are very confused, trying to do big boy problems first, and you are refusing to step back and look at case 2. You are also not properly partitioning “error” and real, correctly measured variability. Until you are willing to step back to simple problems to discuss how to do them, you are not going to be able to de-confuse yourself, and you will be wasting everyone’s time spewing nonsense.

Your paper discusses error bars that are, depending on viewpoint, either
a) utterly irrelevant to assessing the accuracy of the CRU record or any record of annual surface temperatures, because you are tagging something other than measurement uncertainty, or
b) intended to be relevant but computed utterly incorrectly.

But once again: If you want to discuss this, please focus on case 2 and not the other stuff. Then, delete everything from your comment that addresses something that occurs after case 2 in your paper. Then we can move forward.

  72. Pat

    That means we cannot separately calculate (+/-)s or sigma_n.

Nonsense. Absolute nonsense. Absolute “adjective” nonsense.

    We can calibrate. Calibration is routine.

Absent calibration, we can still estimate the uncertainty range for instruments and instrumental installations. This concept seems to have eluded you– or at least, your paper seems founded on the notion that if we can’t, the only way to estimate it is by defining the variability in the object measured as the uncertainty.

Both the importance of calibration and methods to estimate instrument uncertainty are taught in undergraduate engineering and science laboratory courses. Often in freshman year, but repeated throughout various labs.
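A minimal sketch of the idea, with toy numbers (a hypothetical sensor with a 0.3 C offset and 0.2 C of noise, read alongside a trusted reference):

```python
# Routine calibration: read the sensor alongside a trusted reference
# while the measured quantity varies widely. The mean of
# (sensor - reference) estimates the systematic offset; the scatter
# of the differences estimates the noise. The variability of the
# measurand itself drops out entirely.
import random
import statistics

true_T = [random.uniform(-5.0, 35.0) for _ in range(500)]    # varying conditions
sensor = [T + 0.3 + random.gauss(0.0, 0.2) for T in true_T]  # offset 0.3 C, noise 0.2 C

diffs = [s - T for s, T in zip(sensor, true_T)]
print(round(statistics.mean(diffs), 2))   # ~0.30 : systematic error
print(round(statistics.stdev(diffs), 2))  # ~0.20 : random (noise) error
```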

  73. phi:

    Fun. That’s exactly what I thought. You make disparaging judgments without having the least argument. This is called defamation.

    You don’t know what defamation is either. What I did was make a speculation, based on my judgement of prior experiences with your inability to reason.

  74. Pat:

    Carrick if you’re going to use a Monte Carlo analysis to estimate systematic error in a large ensemble of measurements, you’d have to know something about the distribution of the magnitudes and skewness of the errors. Your Monte Carlo method would have to choose out representative subsets.

    I have ideas for how this could be done, for each error type, so yes it could be done here.

    But the distribution of the magnitudes and skewness of the systematic error in surface air temperatures is almost entirely unknown

    I view that as a purely religiously held belief on your part. Obviously I disagree with this conclusion.

    As I see it, the only way to approach the problem is to set up climate station calibration experiments in representative regions of the globe, and spend a few years measuring the systematic error produced by a representative set of temperature sensors.

    Measurements of this sort have been done (see e.g., CASES99), the only reason to do them in different parts of the world would be to measure climate noise, not self noise.

    One might make a start by doing representative sensor calibration experiments in wind-tunnels able to also simulate solar heating.

No you can’t do it in wind tunnels. Wind tunnels generate entirely the wrong statistical distribution. You need something that looks like Kolmogorov turbulence.

    It seems to me you have put confidence in the method without thinking enough about the problem.

    I have thought about the actual problem a lot more than I believe you have. I don’t think you understand what you think you understand nearly well enough for you to be making so many blanket statements.

  75. Again this is the comment that phi now considers defamatory:

    The real test of objectivity will be if this error analysis is shown to be in error, would he and you [Pat and phi] admit it.

    My bet is on won’t.

    I’m starting to decide that SteveF and DeWitt are right. With people whose only response to claims they lack objectivity is to claim victimhood, the only appropriate behavior is to ignore their blather.

  76. Pat–
    It seems you don’t understand one of the major strengths of doing Monte Carlo. The purpose of Monte Carlo would be to show that your method and equations would properly describe the uncertainty in cases where you do know the properties. If it doesn’t even work for those systems, then it can’t work for other systems.
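For concreteness, a minimal version of such a test, using a toy Case-2-like system whose properties we know exactly:

```python
# Fix a set of inherently different tau_i (spread s ~ 1.4), then
# "re-measure" them many times with fresh stationary noise and watch
# how the computed mean actually scatters. It scatters like
# sigma_n / sqrt(N); the spread s contributes nothing.
import math
import random
import statistics

sigma_n = 0.5
N = 1000
tau = [2.0 * math.sin(2 * math.pi * i / N) for i in range(N)]  # fixed "true" values

means = []
for _ in range(5000):  # same tau_i each time, fresh noise each time
    means.append(sum(x + random.gauss(0.0, sigma_n) for x in tau) / N)

print(round(statistics.stdev(means), 4))  # ~0.0158
print(round(sigma_n / math.sqrt(N), 4))   # ~0.0158, not sqrt(s**2 + sigma_n**2)
```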

  77. Carrick,
You talk about demonstrated error going unrecognized but are unable to give a reference. You just make noise. End of this stupid conversation.

  78. Carrick

    (Pat Frank wrote)

    As I see it, the only way to approach the problem is to set up climate station calibration experiments in representative regions of the globe, and spend a few years measuring the systematic error produced by a representative set of temperature sensors.

(Carrick’s response)
    Measurements of this sort have been done (see e.g., CASES99), the only reason to do them in different parts of the world would be to measure climate noise, not self noise.

    I think Pat’s view merely demonstrates that he does not recognize the difference between an error in measuring something (i.e. “X”) and the distribution of “X”.

    If people studying turbulence or engineers were to view Pat’s definition of “systematic noise” as actually being “systematic noise” we would never be able to learn anything about anything.

Engineers, scientists of all types (physical and social), technicians and plain ol’ high school students understand that you can calibrate instruments and determine the systematic error for an installation. The instruments can be deployed into a situation where the properties of the process vary over time for any reason (including “noise”, “turbulence”, “chaos”, or deterministic trends) and the systematic error can be known (or at least estimated) based on the calibration.

No informed, technically competent person ever decrees that the variability of the process studied magically becomes the systematic error for the measurements. It’s nuts.

Even in the case where we might speculate that the systematic error in the deployed installation differs from that in the calibration, no sane engineer, scientist or plain ol’ high school student concludes that the variability of the process is the systematic error.

    There are two possible sane responses to the possibility that the errors in the installation differ from those in the calibration:

1) conclude we don’t know the uncertainty, in which case we should identify which factors exist in the deployment that did not exist in the calibration and estimate the error arising from those; or

    2) Repeat the calibration this time including the features that we think will cause our measurement to differ from the “true” value we wish to observe.

    In the former case, we report uncertainty intervals based on the calibration, our estimate of any additional uncertainties based on features of the installation we think might add error (which we justify based on something other than the variability of the process itself) and that’s it.

As far as I can tell, Pat thinks that defining the variability of the process as the error is the only way you can determine what he calls “the systematic error”, because his definition of “systematic error” is entirely non-standard, to the extent of being flat-out wrong.

  79. No you can’t do it in wind tunnels.

    Yes you can, but it’s considered impolite during normal operating hours.

    Wind tunnels generate entirely the wrong statistical distribution.

    Wind tunnels generate what you make them generate.

You need something that looks like Kolmogorov turbulence.

    All the way down to the wall? You think fast dynamic effects matter for getting a reasonable calibration? Have you done a back of the envelope to see where the length-scale of the screen fits in the spectrum? How far away from “the wall” are they usually mounted?

If you are doing linear time-series modeling, then the natural variability about the monthly mean is treated as noise. I think the problem here is more than a confusion between prediction and confidence intervals. Obviously, Lucia is right: when temporal variation is treated as error, we don’t learn much.

  80. Wind tunnels generate what you make them generate.

Of course. But presumably Pat is suggesting the wind tunnel would be used to simulate the conditions outside the Stevenson screens. With some frequency, flow outside the Stevenson screen is the result of natural convection or mixed convection.

Also, I suspect temperature measurement errors are likely to be larger when there is no wind, because wind will tend to reduce any difference in temperature between the walls of an enclosure and the ambient air. So, this is a case that would need to be tested. It’s a bit difficult to simulate free convection in a wind tunnel. I’m sure it could be done; you can heat a lower surface. But the walls of the wind tunnel tend to prevent convective motions from achieving the sorts of behaviors seen in many of the flows that will occur when a Stevenson screen is deployed.

  81. Acthof Unimty:

    Wind tunnels generate what you make them generate.

Except near their walls, wind tunnels generally approach streamline flow. You can generate turbulence with a screen, but it has nothing to do with the turbulence found in the atmospheric boundary layer. I can’t imagine why anybody would even try (a flat plain with good fetch and a well-developed ABL is a better way to go).

    All the way down to the wall? You think fast dynamic effects matter for getting a reasonable calibration? Have you done a back of the envelope to see where the length-scale of the screen fits in the spectrum? How far away from “the wall” are they usually mounted?

    I don’t think there’s a good way to directly answer this.

The answer to that depends on how you are measuring the temperature (is the radiation shield aspirated or nonaspirated), how far off the surface of the ground the measurement is, what the surface roughness is, how much fetch you have upwind of the measurement, etc.

  82. Lucia:

    Also, I suspect temperature measurement errors are likely to be larger when there is no wind because wind will tend to reduce any difference in temperature between the walls of an enclosure and the ambient air.

    This is correct. Nonaspirated radiation shields rely on air flow to average over the surrounding air.

So, this is a case that would need to be tested. It’s a bit difficult to simulate free convection in a wind tunnel. I’m sure it could be done; you can heat a lower surface. But the walls of the wind tunnel tend to prevent convective motions from achieving the sorts of behaviors seen in many of the flows that will occur when a Stevenson screen is deployed.

Getting the scale lengths (roughly meters to kilometers) and frequencies (roughly 10 mHz to 10 Hz) that are important for temperature measurements inside of a wind tunnel would be challenging, to say the least. Also, turbulence in the atmosphere is typically due to a mix of forced thermal convection and wind shear. At typical wind speeds (say 3-6 m/s) near the surface, this translates into a source frequency region around 1 Hz (it generally looks Kolmogorov above that, up to the thermal viscous limit).

    As I told Acthof Unimty, I don’t see much point in trying to replicate an outdoor environment inside of a wind tunnel.

    (The distinction between wind tunnel and outdoor is important enough that measurements taken in a wind tunnel can lead to conclusions opposite to “real world” outdoor measurements.)

  83. Lucia,

    There have been lots of experiments to measure the effects of wind speed and radiation on measured temperature by the sensor in various kinds of enclosure.

    Here is just one example:

    http://nargeo.geo.uni.lodz.pl/~icuc5/text/P_6_5.pdf

    It is virtually impossible to prevent radiation effects causing meaningful errors at low wind speeds unless forced ventilation is used. All shields have to compromise between blocking radiation and allowing air movement past the sensor.

The solar radiation problem is obviously most acute on cloudless days at high noon near the equator. There is also quite a strong seasonal effect at higher latitudes. There is also a measurable longwave radiation effect at night, when wind speeds tend to be low and air temperatures can differ from those of the surrounding surfaces.

Contrary to popular opinion, most air temperature readings are contaminated by variations in wind speed, cloud and continually varying radiation effects. To further add to the complication, the precise effects do depend on exact siting details.

    Maybe the anomaly methods can average everything out but one has to also hope that there are no long term trends in any of these disturbing variables.

  84. Carrick, you probably already know, but it’s common in wind tunnel work not to be able to match all the similarity parameters simultaneously. I don’t think what you’re saying presents any extraordinary impediment to doing useful calibration work for these screens using wind tunnels, especially if done in conjunction with field experiments and numerical simulation. There may be extraordinary impediments, but the things you mention seem par for the course.

    This is a tangent (if an interesting one); I thought your main criticisms were spot on, but that the wind tunnel stuff wasn’t reasonable.

  85. Actof Unimty–
    The free convection vs. forced convection issue isn’t merely not matching Reynolds, Rayleigh, Prandtl and Mach numbers at the same time. It’s not even achieving a qualitative match.

    I think this is the issue Carrick means.

    Carrick–
You can model some outdoor problems in a wind tunnel. For example, it can be useful for finding wind loads on roofs, etc. Of course, you and Actof will see these are cases where the sustained (or gusting) wind speed is appreciable.

Returning to Stevenson screens: These cases are the ones where we can be pretty darn confident the temperature recorded by a thermometer inside the Stevenson screen is close to that outside. We anticipate appreciable errors only in cases where there is low wind– which is precisely the kind where a wind tunnel is of limited use. If anything, the existence of the walls will interfere with getting the correct sort of flow, while the ability to generate a steady or gusting wind is not required.

    Actof

    This is a tangent (if an interesting one);

    We go off on tangents all the time here. 🙂

  86. Wind tunnels? That is a bit of a tangent.
    .
    The thread does seem to have turned out much like the Anastasia M “latent heat is irrelevant to atmospheric motion” thread. Equally bizarre conjecture supported by meaningless equations. The good news is that this one did not go on for as long.

Achtof Unimty, yes, I mean something different than you do here. I also agree with Lucia and you that there are absolutely places where you can do similar testing in a wind tunnel and outdoors. I think you’ll find wind tunnels do great in simulating the atmosphere once you get above the ABL, for example. If you are looking at load effects on a building, similarly no problems.

The issues with comparing wind tunnel data to surface boundary layer turbulence are related to the difference in the statistical properties of the two.

Our group has experience with both wind tunnels and outdoor turbulence measurements at my lab, so this isn’t something I’m just making up. It really is possible to find examples where the optimal configuration (in the sense of being the closest to a measurement of the “static” quantity) gets flipped between wind tunnels and surface layer turbulence. Here’s a reference; this one is for pressure sensor shrouds, but the problems are similar for obstructions like Stevenson screens and other radiation shields.

    Lucia is spot-on in her comments on the Stevenson screen.

  88. SteveF:

    Wind tunnels? That is a bit of a tangent.

    It’s an argument that Pat raised instead of discussing Eq. 2. So yes it’s a tangent. 😉

  89. Acthof, no problem.

Just to be clear, we are generating turbulence in our slow-speed wind tunnel using a screen. This produces fairly small-scale turbulence (usually smaller than the object obstructing the flow being tested) that is very close to isotropic. The big problem with outdoor turbulence (in the ABL) is that it is not isotropic, and the range of scale sizes for the turbulence encompasses the dimensions of the obstructing object. Here, cross-flow contamination yields results that are distinctly different than for the case of finer-grained isotropic turbulent flow.

    Whether the turbulence looks Kolmogorov in the inertial subrange probably doesn’t influence the results that much, so I was being sloppy in my initial comments on this—I really should have said “statistics typical of outdoor conditions for flow near the surface of the ground”.

  90. Carrick, the anisotropy thing is exactly why your ‘name dropping’ triggered my bs sniffer. Thanks for teaching me something new, regardless.

    steven mosher, did you forget the rest of your comment?

Lucia, you wrote, “Case 2 is wrong. You misattribute the spread in the temperatures as an uncertainty in the mean. … where you can ever find an error in the mean that obeys that equation.” (emphasis added)

    You have misattributed “uncertainty” to exclusively mean “error.” It doesn’t. This mistaken attribution powers your entire objection. “Uncertainty” means uncertainty; it does not mean ‘error.’ In your criticisms, you have invariably made this mistake of meaning.

You wrote, “are you suggesting the equations in paper 2 were not used to compute the uncertainty intervals in figure 4 of paper 2? Because they are wrong too.

    Equation 1 in paper 2 is the standard equation for “instrumental uncertainties” given on page 71 of Bevington and Robinson, ref. 17. It’s not wrong.

    Equations 2 through 9 develop the analysis of the “monthly constant plus weather noise” model of air temperature normals offered by Brohan, 2006. They’re not wrong.

    Equation 10 represents the standard deviation of the variation of 30 monthly anomaly normals about their mean. It’s not wrong.

    The bars plotted in Figure 4 are identified as “1-sigma uncertainty bars,” not as error bars.

    The “(+/-)0.17 C magnitude variation” uncertainty bars were calculated using equation 10. This magnitude variation is clearly identified as the natural variation of the anomalies about their mean. That is not a misapplication of equation 10.

    It is not a mistake to calculate the variation of magnitudes about their mean.

    Those (+/-)0.17 C bars are nowhere represented as error bars or as measurement errors. The (+/-)0.17 C is nowhere represented as a measurement error.

    “Magnitude uncertainty” does not mean ‘measurement error,’ and is never represented by me, or in my papers, as a measurement error.

    Where I have written “uncertainty” you have apparently invariably inferred ‘measurement error.’ This is your mistake.

    You’ve insisted that your mistake is my meaning. It is not.

    Nowhere is it suggested in my papers that “total uncertainty” exclusively means ‘measurement error.’

    Throughout your criticism, you have consistently imposed your mistaken reading on my paper, and then represented your mistake as mine.

Lucia, you also wrote, “Presumably your perception of what constitutes ‘systematic error’,” about my “r.m.s. of MMTS systematic error.”

No, Lucia, that is the MMTS systematic error as measured by Hubbard and Lin, thoroughly discussed in paper 1, page 977, “3.2.2. Uncertainty due to systematic impacts on instrumental field resolution”.

    Then you wrote, “For example, you are saying things like this
    Not if the sensor the physically determined systematic sensor error, like air temperature, is correlated regionally and tracks air temperature persistence.
    Air temperature persistence from day to day due to weather is real it is not an “error” to measure that it was hotter yesterday cooler today.

    It’s worth examining what you wrote here. I offered an “If” statement, suggesting that the physically determined systematic sensor error, like air temperature, could be regionally correlated.

    I.e., surface air temperature is known to be regionally correlated. Systematic sensor error could also be regionally correlated (because, as Hubbard and Lin have shown, it arises from the same physical determinants as produce surface air temperature).

    You entirely missed that meaning. You apparently saw neither the mention of systematic sensor error, nor the mention of regional correlation. You wrote your reply as if these things were not present. You wrote your reply as though I had suggested only that variation in daily temperature, itself, is an error.

    But I wrote nothing of the kind. I wrote about the possible regional correlation and persistence of sensor systematic error.

    In short, you ignored the manifest meaning in my comment, substituted your own mistaken interpretation, and proceeded to criticize me as though your mistake was my mistake.

    So, I’m actually glad you wrote the above. That small example illustrates your globally persistent mistake. You have invariably overlooked the plain meaning of what I have written. You have imposed your own inferences. You have never granted me the courtesy of letting my papers mean exactly and only what I have written. You have invariably added a meaning that is not present in the text, inferred what I did not mean, and then criticized me from within your inference.

    Throughout your criticisms, your inferred meanings have been thoroughly mistaken, just as they were above. You have imposed your own mistaken meaning on me. And you have criticized me, and my papers, on the basis of your mistaken inferences rather than on what is actually present.

    Admittedly, when I edited the sentence above, I left in an extra “the sensor” (sentence words two and three). My mistake. Apologies. Hazards of blog postings.

    But someone who is reading and making an effort to discern what I actually meant might quickly decode away the extra “the sensor” wording.

    You not only apparently did not make that effort, but went on to assign your own meaning to the sentence; a meaning that is obviously not present and is manifestly wrong.

    Your mistake above is a small-scale example of the mistake you have made throughout your criticism. From the evidence, I’m guessing that your reading of my papers has been careless.

  93. Lucia wanted to know how Case 2, Section 2 in paper 1 is right.

    The following post appeared at tAV, but under the circumstances it’s worth posting here, too.

    The whole thing can be made really very simple.

    Beginning under Section 2, page 970, Case 1 through Case 3, substitute the following:

    Wherever the word “temperature” appears, substitute ‘intensity.’

    Removing the word “temperature” will remove the apparently irresistible impulse you folks have to add climatological meanings inappropriately to what is axiomatically defined as strictly limited derivations of three cases of basic signal-averaging statistics.

For t_i, substitute y_i.

For tau_c, substitute upsilon_c.

For tau_i,j, etc., substitute upsilon_i,j, etc.

For T_bar, substitute Y_bar.

    So, for example, Case 1 equation (1), page 970 becomes:

    y_i = upsilon_c + n_i, (1)

where y_i is the measured intensity, upsilon_c is the constant “true” intensity, and n_i is the random noise associated with the i_th measurement.

    Case 1 becomes repetitive measurements of a constant intensity, meaning the “true” intensities are constant,

    i.e., upsilon_1 = upsilon_c, upsilon_2 = upsilon_c,…, upsilon_n = upsilon_c; and the noise is stationary.

    The mean intensity, page 971, line 2, becomes Y_bar = (1/N)*[sum(i=1,N)y_i].

    The variable metric along the abscissa could be time, space, wavelength, frequency, you-pick-it. The observable is just the intensity of some arbitrary signal along that metric.

    These substitutions throughout Section 2 will make it clear that the Section is strictly dealing with the statistics of signal averaging, just as Section 1 stated would be done and as is introduced in the opening sentences of Section 2.

    Making these substitutions, the Section will clearly step through examples of serially more complicated signal-intensity and noise combinations, showing how the statistics of signal-averaging change with each case.

    For the Case 2 that has caused everyone so much trouble, then, the signal averaging model is axiomatically limited to signals in which the “true” intensities are not equal:

    upsilon_i =/ upsilon_j,

    where “=/” means ‘is not equal to,’

    The other part of Case 2 is that the noise is still stationary, i.e., sigma^2_i = sigma^2_j, etc.

    No more meaning is allowed to Case 2 than that.

    Apart from these changes in notation, the step-wise statistical development through the three Cases remains identical.

The application of the Case statistics to understanding the meaning of a subjectively adjudged estimated error, such as was offered by Folland et al., 2001, remains identical.

The only difference is that all references to “temperature” are removed from the three Cases. The more abstract notation still carries the entire statistical message originally intended, which is about the evolution of the standard deviation.

    But with the use of abstract notation, no one will be seduced into reflexively adding in any meaning to the Cases that is not explicitly stated in the axiomatic definitions given at the outset of each signal averaging Case.

    Here’s how the following sentence under Case 2, for example, will change:

    Original: “The mean temperature, T_bar, will have an additional uncertainty, (+/-)s, reflecting the fact that the tau_i magnitudes are inherently different. The result is a scatter of the inherently different temperature magnitudes about the mean …

Abstracted: ‘The mean intensity, Y_bar, will have an additional uncertainty, (+/-)s, reflecting the fact that the upsilon_i magnitudes are inherently different. The result is a scatter of the inherently different intensity magnitudes about the mean …

    There is now no temptation to find some cryptic meaning about ‘weather intensity’ in Case 2, and impose that meaning on the rest of the paper. Nevertheless, the statistical meanings associated with the two sentences are identical.

    It should now be very clear that Section 2 is only about basic concepts of signal averaging:

Case 1: simplest system — constant signal + constant noise.
    Case 2: more complicated system — variable signal + constant noise.
    Case 3: most complicated system — variable signal + variable noise.
    Case 3b: adjudged estimated average uncertainty.

    Maybe I should have used abstract notation in Section 2 from the outset. But it never, ever occurred to me — not a hint of a wisp of a suspicion — that anyone would misunderstand Section 2 in the manner we’ve all experienced here.

None of the four AMS reviewers from JAMC – not even Dr. Adamantly_Opposed – nor either of the two associate editors raised any problem with understanding the intended meaning of Section 2. Nor did the E&E reviewers, and one of those last, at least, must have read Section 2 carefully, because (he) found an error in the original equation (6) that everyone else missed.

    Not a hint of a problem from any of them.

    But I truly regret the storm that was caused, and that the way I wrote caused so many of you to have a problem parsing my intended meaning. Sincere regrets to you all for that.

  94. Jeffid asked some core questions illuminating his views here following my post about substituting ‘intensity’ for “temperature,” etc., in Section 2 of paper 1. That post is reproduced above.

    I’m posting my reply to him here, because I suspect the views he expressed are widespread here, too.

    The quote Jeff engaged [from my paper] is, “delta_tau represents the difference between the “true” magnitude of tau_i and T-bar, apart from noise.”

    Jeff asked, “How is the full standard deviation of tau additional measurement error ‘s’ of Tbar?

    The standard deviation of delta_tau is not additional measurement error. Delta_tau_i does not represent an error. It is not part of a random normal spread around T_bar.

    The magnitudes of the delta_tau_i do not represent random fluctuations about a mean.

    They represent the outcome of inherently different magnitudes — similar to magnitude variations (intensity variations) one might get when taking measurements of the observable of a deterministically and systematically varying process.

    This is what I meant by the properties of the case being axiomatically defined. The tau_i were defined as having inherently different magnitudes, and no more than that.

    If the delta_tau_i were to represent random fluctuations about a mean, I would have defined them to be so. But absent that definition, a tau_i attribute of random variation about a mean can not be assumed (or imposed).

The first sentence Jeff quoted above said that, “… (+/-)s, must enter into the total uncertainty…” Total uncertainty. Not measurement uncertainty. Not even total measurement uncertainty. Total uncertainty.

    Part of the total uncertainty of the mean of a set of measured observables of inherently different intensities (magnitudes) is the magnitude uncertainty itself, which is apart from the measurement uncertainty. It is a measure of the non-random variation in intensity one would obtain if one measured the system again.

    In a science/engineering context it’s a measure of the natural variability of the observable magnitudes associated with a deterministically varying system.

Usually, a magnitude uncertainty is reported separately from a measurement uncertainty, if they can be known separately, as, e.g., value (+/-)sigma, (+/-)s.

    Or sigma and ‘s’ can be combined as the r.m.s. if one wanted to express the total measured variation in observational magnitude, as recorded by your instrument.
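In symbols (my shorthand here, not a quotation from either paper), and assuming the measurement and magnitude components are independent, the separate report is value $latex \pm\sigma, \pm s$ and the combined r.m.s. form is value $latex \pm\sqrt{\sigma^2 + s^2}$.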

    Jeff then asked, “And how is it that this exact statement does not include weather noise?

    Because weather noise is not part of the statistical model. The model isn’t about climate. It’s not about daily temperature. As I pointed out explicitly in #169, it’s not even about temperature at all. It’s about how standard deviation changes from Case 1 when the observables come to have an inherently different magnitude but the measurement noise remains stationary.

    Jeff finally asked, “And how is it that measurement of this noiseless ‘source of error’ adds to the total error in knowledge?

    Magnitude uncertainty is not a source of error. Magnitude uncertainty is not an error in knowledge. Magnitude uncertainty is positive knowledge. It’s knowledge of natural variability. It’s knowledge of the natural variability of a system that exhibits inherently different intensity magnitudes. Magnitudes that are non-randomly distributed.

  95. Carrick, I wrote, “But the distribution of the magnitudes and skewness of the systematic error in surface air temperatures is almost entirely unknown

    And you replied: “I view that as a purely religiously held belief on your part. Obviously I disagree with this conclusion.

    Can you cite the field studies?

    I wrote, “As I see it, the only way to approach the problem is to set up climate station calibration experiments in representative regions of the globe, and spend a few years measuring the systematic error produced by a representative set of temperature sensors.

    And you replied, “Measurements of this sort have been done (see e.g., CASES99), the only reason to do them in different parts of the world would be to measure climate noise, not self noise.

Thank you for telling me about CASES99. I had in mind the sort of experiments done by Hubbard and Lin, which I have repeatedly described as exemplary. The field experiments, like theirs, would include a high-accuracy temperature sensor against which the performance of the usual surface station sensors could be tested. Independent measurements of irradiance and wind speed would allow estimation of how the systematic error of each test sensor changes with variations in local climate.

    Obviously, an experiment using an external reference standard for air temperature allows the measurement of sensor self-error.

    Setting up such experiments with a selection of differently configured sensors (LIG+CRS screen, MMTS, etc.), at globally well-chosen locations would reveal something about the global distribution of the systematic error produced by surface air temperature sensors within a globally representative distribution of real local climates.

    Once established, these experiments could continue for years. Properly configured, they could produce an excellent estimate of the accuracy of the surface air temperature record.

96. Pat, you made the claim “But the distribution of the magnitudes and skewness of the systematic error in surface air temperatures is almost entirely unknown”, so it would seem that the onus is on you to demonstrate that these systematics are almost “entirely unknown”.

Nonetheless, one can set limits on the systematics, and it doesn’t require an appeal to authority to do so (e.g., citing previous literature, though that is certainly available).

Before I say anything further, let me preface by saying that anytime I talk about temperature in connection with global warming, I am referring to temperature anomalies, not absolute temperature. If a systematic is present that only produces an offset (but not a scaling) error, it will have a negligible effect on the estimate of the global anomalized mean temperature or on the trend in temperature over time.

    Methods for estimating the uncertainty:

    We can use Zeke’s method to test the variability in the trend estimation using different subsets of data. The fact that he gets substantively the same results with different subsets indicates that any systematic error would have to be shared between all instruments.
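
    (A toy version of this subset test in Python, with an invented synthetic network; Zeke’s actual procedure is more elaborate, and no real station data are used here:)

        import numpy as np

        rng = np.random.default_rng(0)

        # Synthetic "network": 100 stations over 50 years sharing one common trend,
        # each with independent noise (all values invented for illustration).
        years = np.arange(50)
        true_trend = 0.02  # degrees per year
        stations = true_trend * years + rng.normal(0.0, 0.5, size=(100, years.size))

        # Fit a trend to the network mean of many random subsets of stations.
        trends = []
        for _ in range(200):
            subset = rng.choice(100, size=30, replace=False)
            slope = np.polyfit(years, stations[subset].mean(axis=0), 1)[0]
            trends.append(slope)

        # Absent a subset-specific systematic, the subset trends cluster tightly
        # around the full-network trend.
        print(f"subset trends: {np.mean(trends):.4f} +/- {np.std(trends):.4f} per year")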

    Secondly, we can compare different ways of measuring temperature, e.g. ocean versus land versus satellite. In general, they will not share the same systematics. If a particular systematic error in one data set were an important component of the total measured temperature trend (e.g., UHI), that measure, e.g., anomalized land surface temperature, should not be in good agreement with other measures.

While we expect the three not to agree numerically (because each measures a different thing), nonetheless they should covary together (the Pearson correlation between the different measurements should be very high). And of course this is what is seen, as the sketch below illustrates.
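
    (A minimal sketch of this covariation test, using synthetic series standing in for the land, ocean, and satellite records; none of these numbers come from the real datasets:)

        import numpy as np

        rng = np.random.default_rng(1)

        # One shared underlying signal; three "measurement systems" with different
        # scales, offsets, and independent errors (all invented for illustration).
        t = np.arange(360)  # months
        signal = 0.015 * t / 12 + 0.2 * np.sin(2 * np.pi * t / 60)

        land = 1.3 * signal + rng.normal(0, 0.10, t.size)
        ocean = 0.8 * signal + rng.normal(0, 0.05, t.size) + 0.1
        sat = 1.1 * signal + rng.normal(0, 0.08, t.size) - 0.05

        # The three disagree numerically, yet their Pearson correlations are high.
        print(np.corrcoef([land, ocean, sat]).round(2))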

Finally, we can look at the spatial distribution of the contributions to the observed temperature increase over time. If we posit a particular mechanism for producing a systematic error, each such mechanism will have a predicted geographical distribution associated with it. For example, UHI effects on anomalized temperature will correlate geographically with areas of rapid population growth. However, this is not what is seen.

(Both land and sea share a common temperature trend. There is a pronounced geographical influence on the land temperature trend and almost none on the sea-based one; however, the land-based trend does not follow a distribution peaked near the regions of rapid population growth. Indeed, at first blush the maximum trend appears to correlate inversely with population density.)

Taken together, I believe the overall uncertainty in the temperature trend is probably no more than about 0.2°C/century, or roughly 10% of the measured trend. So not only do I not agree that “these systematics are almost entirely unknown”, I believe they are quantifiable, and further that correcting for them will not substantially influence our current understanding of the anomalized temperature time series for the Earth.

97. Pat, I believe the experiment you suggest has been going on for 7 years.

    http://www.ncdc.noaa.gov/crn/

    http://www.ncdc.noaa.gov/crn/instrdoc.html

    http://www.ncdc.noaa.gov/crn/elements.html
    3 sensors per station.

    Paired with the old network at 14 sites

    http://www.ncdc.noaa.gov/crn/stationmap.html

Now, given what you’ve argued, what would you expect to see from these paired locations? What would you expect to see from the tri-redundant sensors?

In short, is your position subject to disconfirmation?

  98. Pat
    Sorry for the delay in answering. I lost internet connection.

    The whole thing can be made really very simple.

    I agree the whole thing is very simple. I thank you for temporarily focusing on case 2.

    Beginning under Section 2, page 970, Case 1 through Case 3, substitute the following:

    Wherever the word “temperature” appears, substitute ‘intensity.’

    Note that my post does, in fact, discuss the problem in case 2 generally by using ‘X’. Specifically, I wrote:

    With regard to statistical concepts, the specific thing, ‘X’,

I am glad to see you are coming around to thinking we should look at the statistics and evaluate statistical issues generally.

    Removing the word “temperature” will remove the apparently irresistible impulse you folks have to add climatological meanings inappropriately to what is axiomatically defined as strictly limited derivations of three cases of basic signal-averaging statistics.

    Agreed. That’s why, when I read the beginning of your comment, I hoped you might finally sort yourself out.

Though, FWIW: I have no idea why you think I have added any climatological meaning to temperature in your Case 2. I don’t believe I’ve said anything about climatological meaning. I merely note that the variance across the sample is not equal to the uncertainty in the mean. It isn’t for the weight of widgets, it isn’t for the temperature of ice cream cones, and it isn’t for the resistance of resistors.

    Back to Pat:

    For t_i substitute y_i.

    for tau_c substitute upsilon_c.

For tau_i,j etc., substitute upsilon_i,j, etc.

    For T_bar, substitute Y_bar

    So, for example, Case 1 equation (1), page 970 becomes:

    y_i = upsilon_c + n_i, (1)

where y_i is the measured intensity, upsilon_c is the constant “true” intensity, and n_i is the random noise associated with the i_th measurement.

    Case 1 becomes repetitive measurements of a constant intensity, meaning the “true” intensities are constant,

    i.e., upsilon_1 = upsilon_c, upsilon_2 = upsilon_c,…, upsilon_n = upsilon_c; and the noise is stationary.
    The mean intensity, page 971, line 2, becomes Y_bar = (1/N)*[sum(i=1,N)y_i].

No one disputes that this is the garden-variety definition of the average, which can be correctly expressed using any number of variables.

    The variable metric along the abscissa could be time, space, wavelength, frequency, you-pick-it. The observable is just the intensity of some arbitrary signal along that metric.

    These substitutions throughout Section 2 will make it clear that the Section is strictly dealing with the statistics of signal averaging, just as Section 1 stated would be done and as is introduced in the opening sentences of Section 2.

Making these substitutions, the Section will clearly step through examples of serially more complicated signal-intensity and noise combinations, showing how the statistics of signal-averaging change with each case.

    For the Case 2 that has caused everyone so much trouble, then, the signal averaging model is axiomatically limited to signals in which the “true” intensities are not equal:

    We all agree that Case 2 is limited to signals where the “true” intensities are not equal.
No one has disputed your equation for the average. It is the equation claiming to compute the uncertainty in the mean that is under dispute for this case. Your equation estimating the uncertainty in the mean of N measurements of a thing in Case 2 remains incorrect whether the thing you averaged was temperature, mass, resistance, or any intensity you like. The variance across the sample is not the uncertainty in the mean, which is what you claim in Case 2 of paper 1.

    upsilon_i =/ upsilon_j,

    where “=/” means ‘is not equal to,’

    The other part of Case 2 is that the noise is still stationary, i.e., sigma^2_i = sigma^2_j, etc.

    No more meaning is allowed to Case 2 than that.

I believe everyone appreciated that you wrote this in Case 2.

    Apart from these changes in notation, the step-wise statistical development through the three Cases remains identical.

Agreed. And it remains wrong. The variance in upsilon_i is not the uncertainty in the mean of upsilon, no matter what upsilon is. Your mistake is fundamental at a simple statistics level.
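
    (A Monte Carlo sketch of this point in Python, with invented numbers. The upsilon_i are fixed, inherently different “true” magnitudes; only the measurement noise changes between trials. The spread of the resulting means tracks sigma/sqrt(N), not the sample standard deviation s:)

        import numpy as np

        rng = np.random.default_rng(2)

        # Case 2 setup: N fixed, inherently different "true" magnitudes,
        # measured with stationary noise of standard deviation sigma.
        N = 30
        upsilon = np.linspace(5.0, 15.0, N)  # fixed, distinct magnitudes (invented)
        sigma = 0.4

        # Re-measure the same system many times; only the noise draws differ.
        means = [(upsilon + rng.normal(0, sigma, N)).mean() for _ in range(10000)]

        s = upsilon.std(ddof=1)  # scatter of the magnitudes about their mean
        print(f"s (sample std of the upsilon_i): {s:.3f}")
        print(f"sigma / sqrt(N):                 {sigma / np.sqrt(N):.3f}")
        print(f"observed spread of the means:    {np.std(means):.3f}")
        # The spread of the means matches sigma/sqrt(N); s does not enter.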

    >The application of the Case statistics to understanding the meaning of a subjectively adjudged estimated error, such as was offered by Folland, et al, 2001, remains identical.

The only difference is that all references to “temperature” are removed from the three Cases. The more abstract notation still carries the entire statistical message originally intended, which is about the evolution of the standard deviation.

If Folland claimed the variance across a sample is the uncertainty in the mean, then Folland is incorrect. If you are referring back to the only equation in Folland et al. 2001, it appears you are misinterpreting what Folland says. Note, for example, that the equation in Folland does not contain individual temperatures (as yours does), but the difference between an already averaged value and the true value of an already averaged thing. (That is, we see the difference of two things, both of which are averages.)

    This is a very important difference.

So: your abstract statistical message remains wrong and may be based on your misinterpretation of Folland. Let me repeat: given your definitions, the variance in upsilon_i is not the uncertainty in the mean of upsilon, no matter what upsilon is.

    > But with the use of abstract notation, no one will be seduced into reflexively adding in any meaning to the Cases that is not explicitly stated in the axiomatic definitions given at the outset of each signal averaging Case.

Let me remind you: I discussed the abstraction in my post, and then merely mentioned that you happen to be discussing temperature. But my point is: what you write is simply wrong from a statistics point of view. It would be wrong for the heights of American men. It would be wrong for the weight of cupcakes. It would be wrong for the length of the femurs of Tyrannosaurus rex.

    Here’s how the following sentence under Case 2, for example, will change:

Original: “The mean temperature, T_bar, will have an additional uncertainty, (+/-)s, reflecting the fact that the tau_i magnitudes are inherently different. The result is a scatter of the inherently different temperature magnitudes about the mean …”

Abstracted: ‘The mean intensity, Y_bar, will have an additional uncertainty, (+/-)s, reflecting the fact that the upsilon_i magnitudes are inherently different. The result is a scatter of the inherently different intensity magnitudes about the mean …’

    There is now no temptation to find some cryptic meaning about ‘weather intensity’ in Case 2, and impose that meaning on the rest of the paper. Nevertheless, the statistical meanings associated with the two sentences are identical.

As far as I can tell, no one has infused a cryptic meaning of “weather intensity” into Case 2. I’ll discuss Jeff’s diagnosis later, explaining how he is correct in his interpretation. But I’ll only discuss this after you understand that Case 2 is wrong *no matter what thing is being measured.*

    It should now be very clear that Section 2 is only about basic concepts of signal averaging:

Agreed. That’s why I focused on this section, which is wrong at a very basic level. Let me edit this:

Case 1: simplest system: constant signal + constant noise. OK.
Case 2: more complicated system: variable signal + constant noise. Wrong.
Case 3: most complicated system: variable signal + variable noise. Undiagnosed; I’m focusing on Case 2.
Case 3b: adjudged estimated average uncertainty. Wrong, because it includes the error from Case 2.

    >Maybe I should have used abstract notation in Section 2 from the outset. But it never, ever occurred to me — not a hint of a wisp of a suspicion — that anyone would misunderstand Section 2 in the manner we’ve all experienced here.

    Using the abstract notation would have made the error more obvious. Maybe even a reviewer at E&E or elsewhere would have seen it.

None of the four AMS reviewers from JAMC, not even Dr. Adamantly_Opposed, nor the two associate editors, raised any problem with understanding the intended meaning of Section 2. Nor did the E&E reviewers, and one of those last, at least, must have read Section 2 carefully, because (he) found an error in the original equation (6) that everyone else missed.

Then there is a problem with the reviewers from JAMC, including Dr. Adamantly_Opposed, because Case 2 is flat-out wrong. (It would be interesting to read what they actually wrote. If you supply them to me, I will be happy to post them.)

    Not a hint of a problem from any of them.

    Those who found nothing wrong with your paper did a poor job reviewing. Dr. Adamantly_Opposed at least might be given some slack for possibly not identifying the problem where it first appeared.

    But I truly regret the storm that was caused, and that the way I wrote caused so many of you to have a problem parsing my intended meaning. Sincere regrets to you all for that.

Regret or no regret, your Case 2, and by extension Case 3b, is simply wrong. Since you like the word ‘axiomatically’, I will stay away from the notion of “weather noise”, because people alluding to climatologically important concepts seem to confuse you into believing the criticism of your method is based on climatology.

In fact, it is clear the problem with your paper has to do with not understanding the meaning of “uncertainty in a mean” (and a few other statistical concepts).

  99. Referring to this comment by Pat that Lucia highlighted,

    >Maybe I should have used abstract notation in Section 2 from the outset. But it never, ever occurred to me — not a hint of a wisp of a suspicion — that anyone would misunderstand Section 2 in the manner we’ve all experienced here.

I’d still like to see Pat publish his reviewers’ comments, so we can see what the criticisms were rather than rely on his paraphrase of them. From what I have seen of Pat’s responses to Lucia’s comments, I find it very plausible that Pat simply didn’t understand the reviewers’ criticisms.

100. I’ve cut and pasted the equation from Folland 2001:

    [equation image from Folland et al. (2001), not reproduced here]
    I think those who understand notation and statistics will recognize the difference between what Folland wrote and Pat Frank’s interpretation.

    Carrick
I too would like to read the reviewers’ comments in their original form. Heck, it would be interesting if the negative reviewer revealed him or herself!
