Munchkin

Mar25

Comparing IPCC Projections to Individual Measurement Systems.

Recently, the subject of using only one set of measurements to perform a hypothesis test arose. As many are aware, I prefer to average over instruments. But, I’m willing to consider each set individually. So, today I did that.

My main results are: Looking at the data 12 possible ways, I get 9 results that say “reject the IPCC best estimate” to a confidence of 95%.

So, in today’s post, I’ll explain my results. But first, I will explain why I prefer to use merged data when comparing IPCC projections to data.

Why I use an “ensemble average” of multiple GMST data sets.

When testing data, it is obviously possible to take two approaches: a) Select a data set and use only that data set or b) Use a collection of all data thought reliable by practitioners. I prefer the second approach. As I see it, the advantages and disadvantages of each approach are as follows:

  1. Pick one GMST data set, ignore evidence from other data sets.
    Advantages: Least effort. Readers will notice that back in January, when I first began using data, I would use only one data set: GISSTemp. This choice was dictated by pure laziness. I was interested in getting up to speed and gaining familiarity with available data sets and the literature.

    However, while some analytical laziness is excusable in a blog post, I always planned to include more data sets as they became available.

    Disadvantages: There are two main disadvantages of using only one data set. These are: one may be suspected of cherry picking and one increases β error. The difficulties with the first can be largely minimized by explaining one’s data choice prior to performing an analysis.

    The second disadvantage cannot be eliminated. Selecting one data set when 5 are available increases β error. Period. There are some valid reasons for discounting available data. For example, if some data are known to be of poor quality or in error, one can justify leaving them out of an analysis. So, for example, had NASA GISS failed to correct the Y2K error recently discovered in their data, this might be good reason to leave it out of the analysis. However, if the reasons are valid, they can be stated up front, and should be.

  2. Use all the standard data sets thought reliable.
    Disadvantages: Slightly more work.
    Advantages: a) appears more trustworthy, b) reduces β error, c) reduces uncertainty in the mean results.

I may be wrong, but in my opinion, the averaging over multiple data sets is better than relying on only one.

However, since this topic has been discussed in blog comments, I will now take the liberty of elaborating a bit further on these two issues, as both are important in the context of the “blog climate wars” we all enjoy. :)

What is the problem with raising suspicions of cherry picking? Of course no one cherry picks. :)

Nevertheless, should a blogger with a particular point of view accidentally select a temperature record that happens to be the outlier that gives the result that blogger is known to prefer, using that one data set fosters suspicions of cherry picking.

I believe AGW to be true, but since I am willing to pro-actively test projections against data during what appears to be a “stall” in warming, much of my audience consists of skeptics. Clearly, they are not going to be convinced this stall is meaningless if I restrict my analysis to using GISSTemp, the data set that shows the least recent cooling. Rather, what will happen is this: They will decide I simply pick data to suit my pre-conceived notions.

I know that trust and distrust are feelings that last. So, for this reason, I prefer to include a variety of respected data sets in my analysis and report on as many results as possible. That way, when the temperatures do warm, and my updated plots and trends show the renewed warming, I think my audience will trust my plots are not simply attempts to present a tendentious argument in favor of a theory I believe to be true.

What is the problem with elevated β error? Oddly enough, the possibility that I might be accused of cherry picking is minor compared to the real difficulty, which is that using one data set inherently introduces more scatter due to instrument variability. This elevates β error without reducing α error. I discussed β error previously and explained that if a null hypothesis is actually wrong, it can take many, many years of data to disprove even a false hypothesis to a chosen, high level of statistical confidence.

I know many of my readers are aware of β error. Many are aware that using a test with high β error is a well-known trick to claim something is proven when, in fact, all one has done is fail to disprove it using very little data. Since my readers know this trick, know that I know it, and know that any competent statistician is aware of this issue, I prefer to use methods that minimize β error while holding the α error at a specified value.

This results in hypothesis tests that, on average, do not increase the rate of rejecting IPCC projections when they are correct (i.e. α error), but still have some chance of rejecting them when they are, in fact, false (i.e. they keep β error as low as the data allow).

(By convention, a “failed to falsify” result should be accompanied by an estimate of the β error, or statistical power. I haven’t seen these discussed in the ‘climate blog wars’, but I do intend to extend my spreadsheet to include these at some point. Reporting both α and β errors is important if people are to draw inferences about statistical results.)

Current comparison between IPCC projections and five data sets.

My readers already know I computed the trend in Global Mean Surface Temperature (GMST) from four data sets using data from Jan. 2001 through Jan. 2008. I compared that result to the IPCC projections, and found the IPCC projections…. erhmmm… not so good? (That is: a hypothesis test using Cochrane-Orcutt, with confidence intervals computed using a “t” distribution, indicated that the IPCC projections should be rejected at the 95% confidence level when compared to the data.)

But now it’s March! So, February data are in. Also, due to interest in this exercise, other bloggers are now performing variations on this analysis. So, naturally, I am extending my analysis. I think the variety of results will give various people more information to consider when forming their opinions about the predictive ability of IPCC projections.

To extend the analysis, I have decided to show results of hypothesis tests to determine whether the IPCC best estimate for the trend in GMST during the next three decades, published in the AR4, is consistent with observations of the GMST measured after the projections were made.

I will use two basic analytical methods to test the hypothesis, both using two-sided 95% confidence intervals (i.e. α = 5%). The two methods are:

a) Cochrane-Orcutt (CO), with two-sided confidence intervals calculated assuming the uncertainty in the mean is t-distributed, and
b) Ordinary Least Squares (OLS), with the number of degrees of freedom adjusted using Neff/N = (1-ρ)/(1+ρ), where ρ is the correlation of the residuals for the OLS fit.
In addition, I will test:

  1. The “average” temperature for each month, computed by averaging the temperature from each of 5 reliable data sources (see footnote 1). This gives one trend based on an average. Done this way, the uncertainty intervals on the mean trend include the uncertainty due to weather noise; however, uncertainty due to measurement error, which arises from the lack of precision of each data source, is minimized.
  2. The temperatures from each source individually. This results in 5 trends. Because of the lack of precision of each instrument, these will have the largest uncertainty intervals. Drawing conclusions based on these maximizes β error. That is: we increase our risk of failing to reject the IPCC projections when they are wrong.
  3. The average of the trends for each source, with the uncertainty intervals calculated as if the residuals for each instrument at a given time are uncorrelated with each other. This is incorrect, but this uncertainty band would enclose the uncertainties in the slope computed using the five sources. It is illustrative for this reason.

Methods 1 & 2 have identical α (alpha) errors. So, I consider the method with the minimum β error superior, as it gives the fewest errors overall. This is why I average over all instruments. Method 3 is deficient as a method to test the IPCC hypothesis, and merely gives some sense of the uncertainty due to measurement noise without regard to ‘weather noise’.
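
For readers who want to try method (b) themselves, here is a minimal sketch of an OLS trend whose standard error is widened using Neff/N = (1-ρ)/(1+ρ). It is not the spreadsheet used for this post. It assumes ρ is the lag-1 autocorrelation of the OLS residuals and that Neff simply replaces N in the residual variance; the function names and the synthetic series (a 0.02 C/year trend plus made-up red noise) are illustrative only.

```python
import math
import random


def ols_fit(t, y):
    """Least-squares line y = intercept + slope * t; also returns residuals and Sxx."""
    n = len(t)
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    resid = [yi - (intercept + slope * ti) for ti, yi in zip(t, y)]
    return slope, intercept, resid, sxx


def lag1_autocorrelation(resid):
    """Sample lag-1 autocorrelation of the residuals (assumed meaning of rho)."""
    mean = sum(resid) / len(resid)
    dev = [r - mean for r in resid]
    return sum(a * b for a, b in zip(dev[:-1], dev[1:])) / sum(d * d for d in dev)


def ols_trend_adjusted(t, y):
    """OLS slope with its standard error widened via N_eff = N (1 - rho) / (1 + rho)."""
    n = len(t)
    slope, _, resid, sxx = ols_fit(t, y)
    rho = lag1_autocorrelation(resid)
    n_eff = n * (1.0 - rho) / (1.0 + rho)
    if n_eff <= 2.0:
        raise ValueError("too few effective samples to estimate the uncertainty")
    sse = sum(r * r for r in resid)
    se_naive = math.sqrt(sse / (n - 2) / sxx)          # treats every month as independent
    se_adjusted = math.sqrt(sse / (n_eff - 2.0) / sxx)  # fewer effective samples
    return slope, se_naive, se_adjusted, rho, n_eff


# Synthetic monthly series (made-up numbers): 0.02 C/year trend plus AR(1) noise.
rng = random.Random(0)
true_slope = 0.02
t = [i / 12.0 for i in range(240)]
noise, y = 0.0, []
for ti in t:
    noise = 0.5 * noise + rng.gauss(0.0, 0.1)
    y.append(true_slope * ti + noise)

slope, se_naive, se_adjusted, rho, n_eff = ols_trend_adjusted(t, y)
print(f"slope={slope:.4f} C/yr  rho={rho:.2f}  N_eff={n_eff:.0f} of {len(t)}")
print(f"standard error: naive={se_naive:.4f}  adjusted={se_adjusted:.4f}")
```

With positively correlated residuals Neff is smaller than N, so the adjusted interval is wider than the naive one. That widening is the whole point of the correction.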

Results

After applying this test, I find that using the method I prefer (averaging first, then fitting the trend), the best estimate by the IPCC is rejected to a confidence of 95%. It is too high to be consistent with the weather data we have experienced.

Results of Hypothesis Test for IPCC Best Estimate Projection of 2 C/century.

                                  Best fit trend (C/century)   Reject 2.0 C/century to confidence of 95%? (α=5%)
Method                            C-O           OLS            C-O               OLS
Average all, then fit trend.      -1.1 ± 2.2    -0.3 ± 2.2     Rejected          Rejected
Fit trend to each, then average.  -0.9 ± 1.6    -0.3 ± 1.4     See note 1.       See note 1.
Individual instruments:
  GISS                            -0.4 ± 2.2     0.2 ± 2.3     Rejected          Fail to reject
  HadCrut                         -1.6 ± 1.8    -1.0 ± 1.9     Rejected          Rejected
  NOAA                            -0.3 ± 1.7     0.0 ± 1.7     Rejected          Rejected
  RSS                             -1.4 ± 2.1    -0.6 ± 2.2     Rejected          Rejected
  UAH                             -0.8 ± 2.9     0.0 ± 2.9     Fail to reject    Fail to reject

Note 1: ‘Method 3’, that is, taking the average of the 5 individual trends, results in ‘reject/reject’ for the IPCC 2 C/century trend. However, as I noted, that is meaningless, as the uncertainty intervals only include the variation due to measurement uncertainty and fail to properly include weather.
Note 2: Estimates using OLS are given for comparison only. When data exhibit ‘red noise’, the C-O results are more accurate than OLS.

Examining the table, we see that the IPCC projections of 2C/century are “rejected” to the 95% confidence level using most of the methods I tested. If we average the data, and then test, the trend is rejected to the 95% confidence level using both C-O and OLS. (Note however, that when the two methods disagree, C-O is more accurate.) Using each individual instrument, it is rejected under 7 out of 10 possible test methods. The ambiguous result “fail to reject” arises in 3 out of 10. However, due to the small sample time, we know that β error is large. So, “fail to reject” is best interpreted as “not enough data to tell for sure”, rather than “IPCC projections are likely correct”.

I have illustrated the main result graphically below.


[Figure: GMST vs Time, March 25, 2008]

“Average” results are those obtained by applying Cochrane-Orcutt to the “averaged” temperature, which is my standard for determining the trend. The ±95% uncertainty intervals are also calculated using Cochrane-Orcutt; I assume the uncertainty in the mean is t-distributed. (These give very slightly larger uncertainty intervals than the corrected OLS. So, this reduces the rate at which I reject the IPCC trends.)
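
For anyone who wants to reproduce this kind of fit outside a spreadsheet, the following is a rough sketch of the textbook iterative Cochrane-Orcutt procedure, assuming AR(1) (“red”) noise. It is not the spreadsheet behind the figure. The function names, the synthetic series, and the approximate t critical value of 1.97 (for a couple of hundred degrees of freedom) are assumptions made for illustration.

```python
import math
import random


def ols_fit(x, y):
    """Least-squares line; returns slope, intercept, sum of squared residuals, Sxx."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    return slope, intercept, sse, sxx


def cochrane_orcutt(t, y, max_iter=50, tol=1e-8):
    """Fit y = a + b*t + e, where e_k = rho * e_{k-1} + white noise."""
    slope, intercept, _, _ = ols_fit(t, y)  # start from plain OLS
    rho = 0.0
    for _ in range(max_iter):
        # Residuals of the current fit in the original (untransformed) variables.
        resid = [yi - intercept - slope * ti for ti, yi in zip(t, y)]
        denom = sum(r * r for r in resid[:-1])
        new_rho = sum(a * b for a, b in zip(resid[:-1], resid[1:])) / denom if denom else 0.0
        # Quasi-difference both variables to strip the AR(1) part of the noise.
        t_star = [t[k] - new_rho * t[k - 1] for k in range(1, len(t))]
        y_star = [y[k] - new_rho * y[k - 1] for k in range(1, len(y))]
        slope, a_star, sse, sxx = ols_fit(t_star, y_star)
        intercept = a_star / (1.0 - new_rho)  # transformed intercept is a * (1 - rho)
        converged = abs(new_rho - rho) < tol
        rho = new_rho
        if converged:
            break
    dof = len(t_star) - 2
    std_err = math.sqrt(sse / dof / sxx)
    return slope, std_err, rho, dof


# Synthetic monthly series (made-up numbers): 0.02 C/year trend plus AR(1) noise.
rng = random.Random(1)
true_slope = 0.02
t = [i / 12.0 for i in range(240)]
noise, y = 0.0, []
for ti in t:
    noise = 0.5 * noise + rng.gauss(0.0, 0.1)
    y.append(true_slope * ti + noise)

slope, std_err, rho, dof = cochrane_orcutt(t, y)
T_CRIT = 1.97  # approximate two-sided 95% t value for about 200 degrees of freedom
print(f"trend = {100 * slope:.2f} ± {100 * T_CRIT * std_err:.2f} C/century (rho={rho:.2f}, dof={dof})")
```

The 95% interval is the slope plus or minus the t critical value times the standard error from the transformed regression. With a real record the critical value should be looked up for the actual degrees of freedom rather than taken from the constant above.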

The best fit for each instrument is illustrated, as are all the data. Currently, GISS gives the least negative trend; HadCrut gives the most negative trend. Other instruments provide intermediate results, with UAH MSU giving results closest to the mean of the individual instruments.

The IPCC central tendency is illustrated: it lies outside the uncertainty intervals, which corresponds to rejecting the hypothesis that the IPCC projections are correct.

So, can this change?

I suspect that the current trend will break, as all trends do. Warming is hardly excluded by the current data. As I have said repeatedly, warming is not rejected by this data. In fact, pre-existing 30 year trends are not excluded by the current data.

So, given the past trend, and the strength of the theory of AGW, warming is likely to resume. When this occurs, the central tendencies for all data sets should turn positive.

But what this data indicates is that if and when warming resumes, it will likely occur at a rate that is lower than projected by the IPCC. So, while the trends will turn up they are unlikely to reach the 2C/century of warming.

I’d also like to note another feature of these tests. Let us suppose the “true” underlying tendency turns out to be 1 C/century. How will these hypothesis tests “look” over time?

Oddly enough, due to β error, we are quite likely to see the number of “failed to rejects” increase and decrease over time. The reality is that, though I have not calculated β error, we are in the period of time when β error is anticipated to be large. So, until there is sufficient time for β error to drop below 50%, we will tend to see more periods of “failed to reject” than periods of “reject” even if the IPCC projections are wrong.
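
To make the point about record length concrete, here is a small Monte Carlo sketch. It is an illustration only, not a calculation of the β error for the real data. It assumes white (uncorrelated) monthly noise with a made-up standard deviation of 0.2 C, a plain OLS trend test, and an approximate critical value of 2.0; real GMST noise is red, which makes β larger still.

```python
import math
import random


def slope_and_se(t, y):
    """OLS slope and its standard error, treating the noise as uncorrelated."""
    n = len(t)
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    slope = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y)) / sxx
    intercept = y_mean - slope * t_mean
    sse = sum((yi - intercept - slope * ti) ** 2 for ti, yi in zip(t, y))
    return slope, math.sqrt(sse / (n - 2) / sxx)


def rejection_rate(n_months, true_trend, null_trend, rng,
                   noise_sd=0.2, trials=1000, t_crit=2.0):
    """Fraction of simulated records in which the null trend is rejected."""
    t = [k / 12.0 for k in range(n_months)]
    rejected = 0
    for _ in range(trials):
        y = [true_trend * tk + rng.gauss(0.0, noise_sd) for tk in t]
        slope, se = slope_and_se(t, y)
        if abs(slope - null_trend) > t_crit * se:
            rejected += 1
    return rejected / trials


rng = random.Random(2008)
NULL_TREND = 0.02  # 2 C/century, expressed in C/year
TRUE_TREND = 0.01  # hypothetical "true" trend of 1 C/century

alpha_7yr = rejection_rate(84, NULL_TREND, NULL_TREND, rng)        # null is true
beta_7yr = 1.0 - rejection_rate(84, TRUE_TREND, NULL_TREND, rng)   # null is false
beta_30yr = 1.0 - rejection_rate(360, TRUE_TREND, NULL_TREND, rng)

print(f"alpha with 7 years of data:  {alpha_7yr:.2f}")
print(f"beta with 7 years of data:   {beta_7yr:.2f}")
print(f"beta with 30 years of data:  {beta_30yr:.2f}")
```

Under these assumptions the false-rejection rate stays near the chosen 5%, while the chance of failing to reject a wrong 2 C/century null is well above 50% with seven years of data and close to zero with thirty.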

Because of the effect of high β (beta) error, careful scientists rarely interpret “failed to reject” as confirming a hypothesis that has not been supported by very large amounts of historical data and sound theory with very few approximations or assumptions. While the theory of AGW is well supported, it is not clear to me that the specific quantitative predictions by the IPCC are, by extension, supported with equal strength.

My understanding is: The consensus states that AGW is proven. But the magnitude is still being debated. One of the ways to test the various hypotheses regarding the magnitude is to do data comparisons. This comparison to observation suggests the IPCC’s estimates are high.

Footnotes:

1. Data from GISS Land Ocean, UAH MSU, NOAA, RSS, and HadCrut.

Updates
3/27/2008: I inserted a link to a relevant post comparing IPCC projections to data.

3/27/2008: I uploaded a figure about beta error. The figure is supplied by reader martin ringo.
[Figure: Illustration of Power of a Test]


116 Responses to “Comparing IPCC Projections to Individual Measurement Systems.”


Comments


  1. comment 2465

    JM….
    I guess I should have added the noise term and discussed ensemble averaging.

    In principle, since both GISS and Hadcrut measure the temperature trend for the same planet (earth) over the same time frame, the underlying trend in GMST over time is supposed to be the same. If you can find a statistically significant difference in the trend, that would be news and you should report it widely.

    BTW: Should you show the trends in GMST based on GISS and HadCrut are different and the difference is statistically significant, you will find the “temperature measurements are bad” contingent of skeptics and denialists applauding you wildly. They are the main group suggesting such a difference may exist due to biases introduced by scientists who are finding reasons to adjust historic data up and down. :)

  2. comment 2466

    JM. Hadcru and GISS do not use the same stations, they do not use the same methods. Neither does NOAA.

    I will simplify for you.

    Imagine there are 5 stations in the world to measure temperature. 1,2,3,4,5.

    GISS selects station 1,2,and 3. They compute their average, using their own unique technique.
    Hadcru selects 2,3, and 5. They compute their average using their own unique technique.
    NOAA Selects, 2,4 and 5. They compute their average, using their unique technique.

    So GISS gives you one answer, Hadcru another, NOAA a third. They sample a global database of stations. They select different stations. They use unique techniques to adjust and compute averages.

    So, do you pick 1 of the three? Average all three? Or do what Lucia did: average all three and then ALSO analyze each in isolation.

  3. comment 2475

    Ok Lucia, one more question, which I think is pretty fundamental to the statistical analysis if you are trying to “falsify” something: the independence of the investigation from the phenomenon being investigated. Perhaps exactly the study you have done here has been done at other times, but it did not lead to “falsification” because the trend was up, not down, and well within expectations, so nobody took any notice. Or perhaps it hasn’t been done. But the fact is, the reason there is an interest in this right now is because temperatures have been on a (warm) plateau for about 10 years. You started this investigation recently (January?), not back in 2001, the start of the data you are looking at. To have a truly unbiased study, you would have to start the study where both you and the subject being examined have no knowledge: you could do that now by starting from the April 2008 numbers going forward, and see how things go, of course.

    Or you could look at adjusting the start and termination dates of your comparison period, to see how robust the “falsification” conclusion is to those effects.

    Or you could try to use (as I said, I’m a stats novice, but I know something I think) Bayes’ theorem. Say you have, given the chosen data period, a 4% probability that the data matches the IPCC trend. Now, we also know that the present period of 3-4 years is the first time in about 3 decades that we’ve had this sort of plateau. I.e. the likelihood we’d be looking at it and see this sort of thing, all else being equal, over the past 3 decades is about 1 in 10. Then Bayes’ theorem tells us the actual likelihood that the trend matches IPCC is 40%, not 4%, because of that factor-of-10 observational bias.

    Unless I’m missing something else in what you’ve done here?

  4. comment 2479

    Arthur–
    I disagree that there is interest in testing the IPCC projections only because they currently look incorrect. I am equally interested in data comparison whether the IPCC projections look right or wrong. I happen to believe it is important to do data comparisons, in a systematic way, no matter how they look. I would assume everyone is interested in this sort of test.

    Presumably, those who have been making and endorsing the projections are interested in comparisons of this sort, and in principle would always have been interested in these comparisons.

    If anyone is uninterested in verification of projections or predictions as a matter of routine, I would like to read their explanations of why they think data comparisons should not be done.

    I picked the date of 2001 not because the temperatures turned flat around that time. As far as I can see, there are two rational dates for testing IPCC AR4 projections: start when AR4 was published, which would be 2007, or start in the year when, by their own account, their projections for this century begin to apply. Their projections show data comparison as hindcasts up through 2000, and project starting afterwards. That means starting data comparisons in 2001 makes sense (as far as I can determine). Otherwise, a case could be made for 2007, but in that case we end up in a ridiculous situation for purposes of comparison. The next document is scheduled for publication in 2014. So, are we seriously going to make a rule that we can never test “current” projections using more than 7 years of data?

    I’m starting based on 2001. Not at the 1998 high, not at the 2000 low, not at the 2002 sort of high, but in 2001, which I picked based on the claims in the AR4. If you wish to start with data from 2007, feel free. I would be happy to watch the progress.

    For my part, I plan to repeat this when the fifth report emerges, if I haven’t moved on to another interest, and will test regardless of current state of the weather.

    As for your suggestion on Bayes methods: I’m not entirely clear what you are suggesting, but I sort of think I know what you are saying. If you have a concrete idea, and want to do it, I’d love to see it.

    I’m just applying the traditional methods taught to undergraduates who need to test hypotheses. So, those are the results I’m getting.

    I would also point out that when you are trying to detect the rate of “flat spots” by looking at historic data, to say what we are seeing is not rare, you need to find a collection of “flat spots” for which all of the following three are true:
    a) 2C/century was expected to occur under the theory of AGW. (This does not apply to periods earlier than 2000.)
    b) There were no volcano eruptions during the period and
    c) protracted flat or negative trends persisted for at least 8 years.

    Matching the circumstances is necessary if you want to fish out the probability of this flat trend from historic data while also testing the 2C/century hypothesis.

  5. comment 2481

    Re #2475 (Arthur Smith):

    Yes, I think that you ARE missing a lot in what Lucia has done here and on other blogs in the past few months. Some of her most informative posts have been directed to answering the questions that you’re now raising - see, in particular, the ‘Raniers or Maraschino? Accusations of Cherry Picking and Climate Change’ thread on this blog and some contemporaneous posts at David Stockwell’s ‘Niche Modeling’.

    On my count, Lucia has made 11 posts on 3 different threads on this blog within the past 12 hours - and during that time she has also made a valuable contribution at Climate Audit (Unthreaded #34, post #13). These posts alone total some 4000 words, and many of them were detailed and patient responses to your inquiries. I urge you to read the previous discussions before posing further questions in language that seems to me to be needlessly combative.

  6. comment 2482

    Arthur,

    I hadn’t seen Lucia’s Comment 2479 when I made my Comment 2481. It fully confirms my point.

  7. comment 2486

    I’ll try again

    Two glasses of water, one at 0C, one at 273K.

    Mix them Actual temp = 0C or 273K

    Avg them =(273+0)/2 = -136.5C / 136.5K

    Which answer is correct?

    You cannot mix numbers from different scales (baselines) even if they have the same ticksize

    You *must* apply a correction factor or the “hotter” baseline will “cool” the result

    I think you should redo your analysis

    Best regards

  8. comment 2487

    JM, your averaging of points is pointless (excuse the pun). Also you’re mixing temperature scales C & K, which is not the same as using a different base in the same temperature scale.

    What’s important, as Lucia has pointed out, is the trend. In Lucia’s example above she used 2 linear functions with the same trend m and the same scale. However, say they have different trends, m1 & m2. When you average the 2 functions the resultant trend will be (m1 + m2)/2.

    Say the 2 functions have different base periods, as they do in these different time series (HADCRU, GISS etc). You decide to first bring them to the same base, before averaging, by adding a constant term to the constant term of the relevant function. The trend of the modified function doesn’t change, the linear function just moves up or down. When you average these modified functions, the trend is still (m1 + m2)/2.

    So there is really no need to bring all series to a common base.
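
The argument in this comment is easy to check numerically. The sketch below uses two made-up anomaly series on the same time axis (hypothetical numbers, not GISS or HadCRU data) and plain OLS. It shows that subtracting a constant from one series leaves its trend unchanged, and that the trend of the averaged series equals (m1 + m2)/2 whether or not the baselines were aligned first.

```python
def ols_slope(t, y):
    """Ordinary least-squares slope of y against t."""
    n = len(t)
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    return sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y)) / sxx


# Two hypothetical anomaly series with different baselines and different trends.
t = list(range(10))
wiggle = [0.03, -0.02, 0.01, 0.04, -0.05, 0.02, -0.01, 0.00, 0.03, -0.04]
series_a = [0.10 + 0.020 * ti + w for ti, w in zip(t, wiggle)]
series_b = [0.35 + 0.010 * ti - w for ti, w in zip(t, wiggle)]

m_a = ols_slope(t, series_a)
m_b = ols_slope(t, series_b)

# Re-baseline series_b by subtracting a constant, as if moving it to series_a's base period.
offset = 0.25
series_b_rebased = [v - offset for v in series_b]
m_b_rebased = ols_slope(t, series_b_rebased)

# Average the two series month by month, with and without the re-baselining step.
avg_raw = [(a + b) / 2 for a, b in zip(series_a, series_b)]
avg_rebased = [(a + b) / 2 for a, b in zip(series_a, series_b_rebased)]
m_avg_raw = ols_slope(t, avg_raw)
m_avg_rebased = ols_slope(t, avg_rebased)

print(f"m_a={m_a:.4f}  m_b={m_b:.4f}  m_b after re-baselining={m_b_rebased:.4f}")
print(f"trend of average: raw={m_avg_raw:.4f}  rebased={m_avg_rebased:.4f}  (m_a+m_b)/2={(m_a + m_b) / 2:.4f}")
```

The equality is exact for OLS because the fitted slope is a linear function of the data and ignores any constant offset. A nonlinear procedure such as Cochrane-Orcutt need not give exactly the average of the individual trends.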

  9. comment 2488

    Lucia - if it’s a robust “falsification”, it shouldn’t matter whether the starting point is January 2001, July 2001, January 1999, July 2004, etc. except the different time spans will inherently give you different uncertainty numbers because of the different numbers of data points. If the 7-year span is too short to do this kind of robustness analysis because the uncertainties are inherently large, and if you feel the need to only compare data from after the prediction was (effectively) made, then we really should go back to the TAR and compare its numbers with trends from 1995 or so on. You could do the same with AR4 once we have enough years beyond 2001 to make a meaningful analysis.

    The point is not that you were deliberately trying to bias things - I am pretty sure you wouldn’t do that. The point is that you did have to make a choice on starting point - you have “rational” arguments for it, but a proper statistical analysis really needs to show whether making that choice as opposed to all the other possible choices for starting (and ending) was in itself a low- or high-likelihood event, that’s where Bayes comes in.

    Ian - I’ve visited here when Lucia started up this blog, but I’ve been away a while. I did read (some of?) Lucia’s posts on this before commenting, and many of the comments as well. I am aware there’s an issue with solar cycle forcing, but I hadn’t noticed any previous discussion of the two issues I have raised: the actual meaning of the +- in Lucia’s tables (she answered that question just fine) and this Bayesian a priori probability issue.

    JM - if each glass of water has some internal heat source and one is increasing temperature at 1 degree C per hour, and the other is increasing at 3 K per hour, since the relative units are the same, when you mix them it doesn’t matter whether you normalize to the same baseline or not, you get the same average rate of increase. I think you’re being misled by tamino’s critique of something very different that Anthony Watts did comparing absolute anomaly numbers; it’s not an issue for what Lucia’s doing here.

  10. comment 2490

    Arthur,

    “Lucia - if it’s a robust ‘falsification’, it shouldn’t matter whether the starting point is January 2001, July 2001, January 1999, July 2004,”

    This is silly.

    First, we can’t fairly “falsify” or “verify” a prediction using data from 1999 or earlier for two quite obvious reasons.

    * The IPCC AR4 projections don’t apply to 1999. They don’t hindcast 2C/century into the past; the 2C/century value is higher than values in the past and applies only to this century. So, using data from 1999 or earlier to test the 2C/century prediction is like determining the average height of Swedes by measuring Norwegians. (However, if you are really serious that the start date doesn’t matter, and you wish me to do the illogical exercise, I could perfectly well test to see if 2C/century is consistent with the past trend, starting in… 1950? Yes, that’s reductio ad absurdum. But the fact is, the IPCC hindcast doesn’t “post-dict” 2C/century for the 90s.)

    Second, the IPCC AR4 projections, and the method for creating them are tested and developed by hindcasting over past data. You cannot “verify” a prediction using data used to develop the prediction. Moreover, one would expect these things to hindcast relatively well— had hindcasting been pitiful, those predicting would surely adjust their method. (Everyone in science and engineering would do this. To do otherwise, actually violates the scientific method! But, even outside science, we don’t test the accuracy of prediction by including hindcast data; we don’t test psychic’s predictive skill by letting them predict the outcome of the 1960-2004 presidential election in 2006, reading their prediction in 2008, and then testing using data from 1960-2008 to falsify or verify. Of course the test won’t falsify. The stuff from 1960-2004 was not a prediction!)

    Mind you, it is true that we can show methods are wrong by showing they don’t even hindcast, but in that case we aren’t falsifying a prediction or projection. In the case of the psychic we might be showing he has access to a poor library, or has a poor memory.

    So, having dispensed with why we can’t start with data before the projections are made, let us discuss why we can’t expect falsification to be “robust” as we decrease the amount of data. We know the beta (type II) error in these sorts of tests approaches 100% when we have little data. This means, if we were to begin falsifying using data from Feb 2008, simply because Arthur Smith decrees I can’t use data before it entered my mind to test a prediction, then the type II error for the test is 100%. We can’t falsify because we will have zero degrees of freedom for the test. So, the uncertainty intervals on a slope determined from two data points are infinite. We will never falsify even horrifically poor projections. The beta error decreases as we get more and more data.

    So, it is reasonable to pick the data that gives the maximum data permissible to test a falsification. In fact, this is the only reasonable choice.

    I’m not sure where people are getting the idea that falsification needs to be “robust” to be meaningful. Falsification at early times is rarely “robust”, unless the prediction is insanely incorrect. For example, had the IPCC projected 10C/century, we would be getting falsifications which will never reverse to “fail to falsify”.

    What we are likely to see, and I have blogged about this, is that the measured trend will oscillate about the true underlying trend. Suppose this is, say, 1.5C/century, the central tendency in the TAR and the lower bound from the AR4. If that value is correct, we are going to see the best estimate of the trend oscillate around 1.5C/century, sometimes exceeding 1.5, sometimes falling below. As we accumulate more and more data, the uncertainty intervals will narrow, and we will eventually falsify 2.0C/century in what you might call a “robust” way.

    In contrast, if 2C/century is correct, we would expect to get “falsification” at a rate of roughly 5%, because we’ve set alpha (type 1) error to 5%. The fact that we got one, the first time I happened to test it, does cast serious doubt on 2C/century.

    The fact of the matter is, if the IPCC projection of 2C/century is correct, getting a value as low as we got over a period as long as we got using the only reasonable start date, is an unusual event that requires explanation. The possibilities are:

    a) this is totally random in the way you can flip 6 heads in a row,
    b) the IPCC decision to not include effects like solar/land use etc. in their projections is leading to biased results,
    c) we hit a “perfect storm” of unusual weather events in 2001 (Example: simultaneously hitting the solar peak, a turn in the PDO, yada, yada, yada),
    d) the scenarios fed to the models are unrealistic or
    e) the models themselves are biased even if the correct scenarios are fed to them.

    Conditions (a) and (c) would still mean the statistical result is “falsified” at the 95% confidence level. It is simply a fact that statistical tests always have some level of uncertainty, and so there is a finite rate of false positives. For a 95% confidence interval, that rate is 5%. It could happen now.

    Conditions (b), (d) and (e) would indicate deficiencies in the IPCC process.

    Ian– I’m fine with Arthur’s questions. I know questions often sound a bit more confrontational in blog comments than intended. (So do questions at professional society meetings. )

    As for Arthur’s response; It’s true that I have not directly addressed the Bayesian issue in those terms. However, I’m not entirely sure what specifically Arthur meant. However, I think I may have indirectly addressed Arthur’s issue when responding to Stoat’s comment about the frequencies of flat spots in past data. I replied to that here:
    http://rankexploits.com/musing.....2ccentury/

    I included this figure which shows that all negative and flat spots embedded in swiftly rising periods of time occur as a result of volcanic eruptions:

    I don’t expect everyone to read every one of my posts. (Plus, creating better archives is on my “to do” list.) But it is true that I’ve commented on many of the recurring issues here:
    http://rankexploits.com/musing.....snt-apply/

  11. comment 2491

    Lucia

    Please stop distracting people

    Please address my point at 2486.

    You said at 2449:

    “Baseline:

    I use whatever baseline period each agency chooses and average the data they provide. With respect to finding trends, the baseline doesn’t matter and it doesn’t even need to be consistent.

    T = m * time + b

    We seek “m”. The baseline shifts “b”.

    If you go through the math, if you average 5, this results in a different “b”. But as we don’t care what it is, that doesn’t matter.”

    I understand this to mean that you do no correction to bring anomalies into line before averaging.

    “… the baseline doesn’t matter and it doesn’t even need to be consistent …”

    “… the baseline doesn’t matter …”

    Both statements seem pretty definitive to me. You’ve gone to some effort to confirm them in subsequent comments.

    Are they true or not?

    And please don’t talk about slopes again, I’m talking about data acquisition, not analysis.

  12. comment 2492

    JM wrote:

    Lucia

    Please stop distracting people

    JM, I snorted coffee out my nose when I read this.

    For what it’s worth, I don’t think answering Arthur or Ian’s comments is “distracting” people. Also, it may surprise you to learn this is my blog. If I want to “distract” visitors who elected to come here and have conversations with me, I will do so.

    If you believe that differences in the baseline affect a trend, I suggest you find a copy of Excel and download some data from any of the agencies. Afterwards, find the trends. Then rebaseline and find the trends over time.

    If you do it right you will find the baseline doesn’t affect the trend.
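    For readers without a spreadsheet handy, the exercise above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the anomaly values are synthetic (a made-up 0.02 C/year trend plus a repeating wiggle), not data from any agency, and `ols_slope` is a hypothetical helper written for this sketch.

```python
def ols_slope(times, temps):
    """Ordinary least-squares slope of temps regressed on times."""
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(temps) / n
    num = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, temps))
    den = sum((t - mean_t) ** 2 for t in times)
    return num / den

# Synthetic monthly anomalies (assumption: NOT real agency data).
months = list(range(84))
years = [m / 12.0 for m in months]
wiggle = [0.10, -0.05, 0.00, 0.07, -0.12, 0.03]
anoms = [0.02 * t + wiggle[m % len(wiggle)] for m, t in zip(months, years)]

# Rebaseline: subtract the mean of the first 36 months from every value.
baseline = sum(anoms[:36]) / 36
rebased = [a - baseline for a in anoms]

slope_original = ols_slope(years, anoms)
slope_rebased = ols_slope(years, rebased)
print(slope_original, slope_rebased)  # the two slopes agree to rounding error
```

    Any other choice of baseline period only changes the constant that is subtracted, so the fitted slope comes out the same.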

    Or, better yet, if you think it does, and you don’t want to do the work yourself, why don’t you go back to Tamino’s thread, and ask him to do it for you? :)

  13. comment 2493

    Geoff Larsen: “s not the same as using a different base in the same temperature scale.”

    Thanks Geoff, that is my point exactly. Each of these data series uses a different base. Kelvin and Celsius - same ticksize, different base.

    GISStemp, HadCru, RSS, UAH, NCDC - same ticksize, different base.

    [blather about interception points and slopes] - irrelevant.

    Arthur Smith: “JM - if each glass of water has some internal heat source and one is increasing temperature at 1 degree C per hour [etc]”

    OMG: Arthur, please try and understand the example. You have two glasses of ice water and mix them together.

    What is the temperature of the mix?

  14. comment 2494

    Lucia

    Please address my point at 2486.

    What is the correct temperature of the mix? 0C or the calculated average of -136.5C?

  15. comment 2496

    JM.

    What point? That you intentionally made both a sign error and a unit conversion error in a problem involving thermodynamics? And which has nothing to do with the determination of a slope? And is, in short irrelevant?

    If you wish to discuss baselines, the correct answers are infinite in number. They include:

    a) The average temperature is T = 0C,
    b) The anomaly is T’=+1C relative to a baseline of To=-1C where T’= T-To.
    c) The anomaly is T’=-1C relative to a baseline of To=+1C where T’= T-To.
    d) T= 273 C relative to a baseline of -273C.

    In any case, when graphed, my data are set to a common baseline. So, even if you were correct about this making a difference in principle, it does not in practice. The data are on a common baseline.

  16. comment 2497

    Lucia

    Sorry I missed this bit of your post:

    “download some data from any of the agencies. Afterwards, find the trends. Then rebaseline and find the trends over time.”

    So you recognize “rebaselining” is important?

    Good.

    Progress.

    But it completely contradicts your earlier description of how you’ve done this, where you’ve said many times “… the baseline doesn’t matter …”

    How do you rebaseline then? And when? Before or after finding the trends?

    Because you should do it before.

  17. comment 2498

    Lucia: “In any case, when graphed, my data are set to a common baseline”

    I’m talking about data acquisition.

    How do you get it on a common baseline? And when? Before or after averaging? Before or after trending?

    You’ve previously said categorically that baselines don’t matter and that you *don’t* do it.

    What’s the story?

    I can’t figure out what you’re doing here unless you tell me.

  18. comment 2499

    Lucia: “when graphed, my data are set to a common baseline.”

    When graphed?

    That is simply not good enough.

    I want you to put them on a common baseline before you even start to analyse them.

    Do you do that?

  19. comment 2505

    Spence_UK. Well put

  20. comment 2506

    JM - my goodness, you’ve been answered a dozen times here already. Try actually reading some of the responses to your comments and think about them carefully. Several of us have experience as practicing scientists, PhDs, etc, and we don’t agree on many things. But on this one you are just wrong. Go do what Lucia said, get a spreadsheet yourself and try it with real data. Let me repeat, the baseline does not matter when what you are interested in is the rate of change with time!!!

    Do you know any calculus? What Lucia has been studying here is dT/dt - the rate of change of temperature with time. If you modify the temperature T by a constant value C, then it is a constant adjustment. It is pretty fundamental in calculus that the derivative of a constant is exactly zero. I.e. d(T+C)/dt = dT/dt.

    The baseline does not matter!
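    The same argument carries over from the derivative to a fitted least-squares trend. As a sketch, with notation introduced here for illustration only (monthly values T_i at times t_i, a constant baseline shift C, and fitted slope and intercept written with hats):

```latex
\hat{m} = \frac{\sum_i (t_i - \bar{t})\,(T_i - \bar{T})}{\sum_i (t_i - \bar{t})^2},
\qquad
\hat{b} = \bar{T} - \hat{m}\,\bar{t}.

% Shift every value by the same constant: T_i' = T_i + C, so \bar{T}' = \bar{T} + C.
T_i' - \bar{T}' = (T_i + C) - (\bar{T} + C) = T_i - \bar{T}
\quad\Longrightarrow\quad
\hat{m}' = \hat{m},
\qquad
\hat{b}' = \hat{b} + C.
```

    The shift C cancels in every term of the numerator, so only the intercept moves.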

  21. comment 2508

    Arthur: “JM - my goodness [etc]”

    Arthur, don’t teach your grandmother to suck eggs.

    Yes I do understand calculus and a lot more besides.

    I doubt however, that Lucia understands her input data.

    Get out of the way.

  22. comment 2509

    Lucia

    Do you put your datasets on a common baseline before averaging or not?

    You first said that you didn’t, and went to quite some effort to tell me it wasn’t necessary.

    Now you say you do.

    Which is it? And when do you do it?

  23. comment 2510

    JM–

    Sometimes I rebaseline; sometimes I don’t. I do it at different times as a matter of convenience. It doesn’t matter whether or not one does it or when. As many have patiently explained to you, it doesn’t matter.

    Also, I do not permit visitors to say things like “don’t teach your grandmother to suck eggs”.

    I am adding your name to the “slow down boris” plugin, and possibly modifying it to deal with your special habits. Maybe Arthur can suggest a special feature just for you.

  24. comment 2511

    “Sometimes I rebaseline; sometimes I don’t.”

    Then you are condemned out of your own mouth.

    Read the footnote at the bottom of the GISS file that you download each month,

    Try to understand it.

    Your results mean nothing.

  25. comment 2512

    Me: “Read the footnote at the bottom of the GISS file that you download each month,”

    Oh, and do read the usage notes on the web pages for every other data file as well. They say exactly the same thing.

  26. comment 2513

    Geoff (2487)

    Just before I go, I want to address the point you are trying to make. You’re saying that baseline doesn’t count when looking at trend right?

    But it does in Lucia’s procedure. Averaging two values assumes that both values make equal contributions to the result, but if you look at my example with K/C you’ll find that the C value makes *no* contribution to the result.

    The result is skewed towards the higher-valued scale (Kelvin, in my example). This is how the “hotter zero” -> “cooling” effect works. The measure against the hotter zero point contributes less to the result and therefore acts to “cool” the result.

    I thought I was making that clear in my example (which I’ve restated in a couple of different ways in this thread) but it appears that Lucia doesn’t understand it.

    This is why you *cannot* mix values from different scales. The differing zero points completely screw up the “divide by 2” assumption.

    If you don’t agree, please let me know and we can discuss.

  27. comment 2515

    “In fact, I think all these things! I’ve been for alternative energy sources since…. the oil embargo in the ’70s! At the time, my thoughts were not related to CO2 accumulation, but this is now an additional important factor.”

    No it isn’t. Look Lucia. There is just no need to be compromising with these lunatics. The starting point of this debate always must be that we are in a brutal and pulverising ice age. You cannot let yourself be beaten down by the sheer weight of mindless leftist idiocy.

    In the middle of this current food and energy crisis we cannot let these environmentalists repeat the mass-killing they pulled off with the DDT bans by spuriously agreeing with their conclusions, even after showing they have the science wrong.

    Now is not the time. If we have ubiquitous, saturation nuclear power, then we can think of these things 50 years down the track when a reduction in CO2 output isn’t going to lead to the starvation of millions. These are not honest mistakes that people like JM make.

  28. comment 2516

    JM - you say your 0 C/273 K average proves “the C value makes *no* contribution to the result.”

    But your proof is wrong - the zero does contribute just as much as the 273. It doesn’t matter if the number is 0 or -10000 or 10^26, when you average it with another number, half of the resulting value comes from one, and half from the other.

    When you average degrees C and K, the averaged unit has a zero point half-way between the zero points of the two. The “baseline” of the average is at exactly 273/2 K, and -273/2 degrees C, i.e. +136.5 K or -136.5 degrees C. This is a well-defined and perfectly valid, though unusual, temperature scale. In this temperature scale, the freezing point of water is at a value of +136.5. Not coincidentally, that’s exactly what you get when you average 0 C and 273 K - you get 136.5 in these new units.

    If you take measurements on different baselines and average them all together you get a new measurement with a modified baseline. That is proved in your 0 C/273 K example, and it’s perfectly valid with the averaging done by Lucia here.
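    A small numerical check of this claim, as a sketch only: the five series below are synthetic, share an assumed trend of 0.02 C/year, and differ only by made-up baseline offsets. The labels A to E and the helper `ols_fit` are hypothetical and do not correspond to any real data set or to the procedure used in the post.

```python
def ols_fit(times, temps):
    """Return (slope, intercept) of an ordinary least-squares line."""
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(temps) / n
    num = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, temps))
    den = sum((t - mean_t) ** 2 for t in times)
    slope = num / den
    return slope, mean_y - slope * mean_t

years = [i / 12.0 for i in range(84)]
true_slope = 0.02  # assumed common trend, C/year

# Hypothetical baseline offsets, one per synthetic "instrument".
offsets = {"A": 0.00, "B": 0.15, "C": -0.10, "D": 0.24, "E": -0.30}
series = {name: [true_slope * t + b for t in years] for name, b in offsets.items()}

# Average the five series month by month WITHOUT putting them on a common baseline.
average = [sum(vals) / len(series) for vals in zip(*series.values())]

slope, intercept = ols_fit(years, average)
mean_offset = sum(offsets.values()) / len(offsets)
print(slope, intercept, mean_offset)
```

    The slope of the averaged series matches the common trend, and its intercept is simply the mean of the individual offsets: the "modified baseline" described above.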

    Now, there is an issue with the troposphere (UAH, RSS) numbers fundamentally measuring something different from the surface (GISS, HADCRU) numbers - sometimes people divide the troposphere numbers by a factor of 1.2 to make them comparable, so maybe that should have been done, and that would have changed the effect of averaging. But the differing baselines have no effect on the averaging.

  29. comment 2519

    Arthur, Lucia, et al., take note. Do not think, when engaging with JM, that you are engaging with an honest interlocutor; he is in fact the opposite. How else can one regard the following statement about Lucia, made by JM at the following site, where I myself was engaging with him:

    “Sorry DB, I tried, but I cannot take Lucia seriously. Anyone who thinks mixing two iced drinks creates liquid helium is beyond my capacities to argue with.”

    http://catallaxyfiles.com/?p=3.....ment-93438

    You might also ask him to consider the situation if he were right. What are the implications for each of the major datasets themselves, seeing as they are collections of temps. at different instrument stations, each with its own adjustments, etc.? Further, what are we to make of the ensemble means of the IPCC projections that represent the results of separate models, each with different parameterisations, etc.?

    Apologies, Lucia, for ever pointing such a dissembler as JM to your site.

  30. comment 2522

    Arthur (2516)

    You are simply wrong. Water does not freeze at -136.5C, it freezes at 0C. No mathematics can change that.

    Refer to Lucia’s comment at 2416:

    “Obviously, if GISS reported in C and Hadcrut reported in F, I would need to convert to maintain consistency of units”

    That is a correct statement.

    But while GISS and Hadcrut report in the same *units* (C), they report on different *scales* (zeropoints).

    You have to adjust to get them on the same *scale* as well as units.

    Lucia does not do that. I can have no doubt about it - she has spent a lot of effort telling me - so her results are meaningless.

    She should make the adjustments, redo her analysis and present her new results.
