How Do We Know That Temperature is I(1) and not I(2)?

I’m not going to go through all the math that has been posted elsewhere on unit roots; this will be much more practical. I’ll treat the climate system as a black box that just adds noise to the input signal. I can add more complexity, like time constants for a two-box model, later. As a first pass, I’ll use white noise and see what happens. First I’ll create the synthetic time series A, A1 and A2, with 0, 1 and 2 unit roots respectively. I’m also going to use just the Phillips-Perron (PP) test for unit roots. I’m using R for Windows version 2.10.1, and the PP test is from the tseries package. The PP test has a null hypothesis that the series has a unit root, so a p-value of less than 0.05 means the null hypothesis is rejected (no unit root) at greater than 95% confidence.

One can easily create a time series with one unit root by integrating, or cumulatively summing, a white noise series. The code in R for this is:

A <- rnorm(n)

A1 <- cumsum(A)

where n is an integer defining the length of the time series, rnorm(n) generates n samples of Gaussian noise with mean zero and standard deviation one, and cumsum computes the running sum A1(n) = A(n) + A1(n-1). Similarly, an I(2) time series can be created by cumulatively summing an I(1) series. Differences are created using diff(X). Before each summation, the series are centered to as close to a mean of zero as I can get, to minimize trends in the summed series, and then scaled to similar ranges. The length of each series is 1000 points.
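Putting that together, the construction looks something like this in R (a sketch; I didn’t seed my original run, and the exact centering and scaling steps may differ slightly):

```r
set.seed(42)  # for reproducibility; not in the original run
n <- 1000

A  <- rnorm(n)         # I(0): Gaussian white noise
A  <- A - mean(A)      # center before summing
A1 <- cumsum(A)        # I(1): one unit root

A1c <- A1 - mean(A1)   # center again before the second summation
A2  <- cumsum(A1c)     # I(2): two unit roots

# scale A2 to roughly the same range as A1 for plotting
A2 <- A2 * diff(range(A1)) / diff(range(A2))
```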

[Figure: the three synthetic series. A2 is black, A1 is blue and A is red.]

Testing A2, A and A1 for unit roots:

Phillips-Perron Unit Root Test

data: A2
Dickey-Fuller Z(alpha) = 0.0049, Truncation lag parameter = 7, p-value
= 0.99
alternative hypothesis: stationary

Warning message:
In pp.test(A2) : p-value greater than printed p-value

> A2d1<-diff(A2)

> pp.test(A2d1)

Phillips-Perron Unit Root Test

data: A2d1
Dickey-Fuller Z(alpha) = -13.204, Truncation lag parameter = 7, p-value
= 0.3732
alternative hypothesis: stationary

> A2d2<-diff(A2d1)

> pp.test(A2d2)

Phillips-Perron Unit Root Test

data: A2d2
Dickey-Fuller Z(alpha) = -921.257, Truncation lag parameter = 7,
p-value = 0.01
alternative hypothesis: stationary

Warning message:
In pp.test(A2d2) : p-value smaller than printed p-value

Phillips-Perron Unit Root Test

data: A
Dickey-Fuller Z(alpha) = -923.5331, Truncation lag parameter = 7,
p-value = 0.01
alternative hypothesis: stationary

Warning message:
In pp.test(A) : p-value smaller than printed p-value

> pp.test(A1)

Phillips-Perron Unit Root Test

data: A1
Dickey-Fuller Z(alpha) = -13.4424, Truncation lag parameter = 7,
p-value = 0.3599
alternative hypothesis: stationary

> A1d1<-diff(A1)

> pp.test(A1d1)

Phillips-Perron Unit Root Test

data: A1d1
Dickey-Fuller Z(alpha) = -922.0155, Truncation lag parameter = 7,
p-value = 0.01
alternative hypothesis: stationary

Warning message:
In pp.test(A1d1) : p-value smaller than printed p-value

The PP test results are as expected: A2 has to be differenced twice, and A1 once, before the PP test rejects the unit root hypothesis. But a cumulative sum is a low pass filter. Does the temperature series look like data from a low pass filter? Not really. If the climate is chaotic, then one expects to see variation at all time scales even when the input isn’t changing. I’ll add some noise, a fraction of series A, to A2 and see the effect of the noise and the series length on the test results.

> A2n<-A2+0.1*A

Phillips-Perron Unit Root Test

data: A2n
Dickey-Fuller Z(alpha) = -0.0466, Truncation lag parameter = 7, p-value
= 0.99
alternative hypothesis: stationary

Warning message:
In pp.test(A2n) : p-value greater than printed p-value

The full series with noise still has at least one unit root. But what are the results for a shorter segment?

> pp.test(A2n[1:100])

Phillips-Perron Unit Root Test

data: A2n[1:100]
Dickey-Fuller Z(alpha) = -9.8403, Truncation lag parameter = 3, p-value
= 0.545
alternative hypothesis: stationary

> pp.test(A2[1:100])

Phillips-Perron Unit Root Test

data: A2[1:100]
Dickey-Fuller Z(alpha) = -2.2552, Truncation lag parameter = 3, p-value
= 0.9602
alternative hypothesis: stationary

The first 100 points still fail to reject the presence of a unit root for both the original and the noisy series, but the p-value for the noisy series is lower than that for the noise-free series. Now I’ll test the first difference.

> A2nd1<-diff(A2n)

Phillips-Perron Unit Root Test

data: A2nd1[1:100]
Dickey-Fuller Z(alpha) = -119.0049, Truncation lag parameter = 3,
p-value = 0.01
alternative hypothesis: stationary

Warning message:
In pp.test(A2nd1[1:100]) : p-value smaller than printed p-value

Phillips-Perron Unit Root Test

data: A2nd1[1:200]
Dickey-Fuller Z(alpha) = -252.2832, Truncation lag parameter = 4,
p-value = 0.01
alternative hypothesis: stationary

Warning message:
In pp.test(A2nd1[1:200]) : p-value smaller than printed p-value

Phillips-Perron Unit Root Test

data: A2nd1[1:500]
Dickey-Fuller Z(alpha) = -722.274, Truncation lag parameter = 5,
p-value = 0.01
alternative hypothesis: stationary

Warning message:
In pp.test(A2nd1[1:500]) : p-value smaller than printed p-value

Phillips-Perron Unit Root Test

data: A2nd1[1:1000]
Dickey-Fuller Z(alpha) = 4430.345, Truncation lag parameter = 7,
p-value = 0.99
alternative hypothesis: stationary

Warning message:
In pp.test(A2nd1[1:1000]) : p-value greater than printed p-value

Nearly the full series is needed before the test fails to reject the unit root in the first difference; on shorter segments of the noisy series, the second unit root goes undetected.
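For reference, the segment-length experiment can be scripted end to end; here is a self-contained sketch (my own seed and noise scaling, so the p-values won’t match the runs above exactly):

```r
library(tseries)  # provides pp.test

set.seed(1)  # assumption: the original run was not seeded
n <- 1000
a  <- rnorm(n);  a  <- a - mean(a)
a1 <- cumsum(a); a1 <- a1 - mean(a1)
A2 <- cumsum(a1)
A2 <- A2 * diff(range(a)) / diff(range(A2))  # scale to a range similar to the noise
A2n <- A2 + 0.1 * a                          # add a fraction of the white noise

A2nd1 <- diff(A2n)  # first difference of the noisy I(2) series
for (len in c(100, 200, 500, 1000)) {
  p <- suppressWarnings(pp.test(A2nd1[1:len]))$p.value
  cat(sprintf("n = %4d   p-value = %.2f\n", len, p))
}
```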

To see how things compare, here’s a plot of the first 124 points of the synthetic noisy series in black and the GISS anomaly from 1880-2003, scaled to the same range, in red:

In answer to the title question: we don’t know that the temperature series is I(1) and not I(2), because it’s too short and too noisy. As for whether there’s a trend or not, I think there are more suitable tools available based on control chart theory, like CUSUM or EWMA charts, which are designed to detect small trends or deviations from the mean in short series. That’s another subject, though, and besides, I don’t have a package to do EWMA control charts.
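(For the curious, a bare-bones EWMA chart doesn’t strictly need a package. Here is a minimal sketch with textbook control limits; the function name is mine and this is not a validated implementation:)

```r
# EWMA control chart: z[t] = lambda*x[t] + (1-lambda)*z[t-1], started at the target.
# Flags points where the EWMA statistic leaves the +/- L-sigma control limits.
ewma_chart <- function(x, target, sigma, lambda = 0.2, L = 3) {
  n <- length(x)
  z <- numeric(n)
  prev <- target
  for (t in 1:n) {
    prev <- lambda * x[t] + (1 - lambda) * prev
    z[t] <- prev
  }
  tt <- 1:n
  se <- sigma * sqrt(lambda / (2 - lambda) * (1 - (1 - lambda)^(2 * tt)))
  list(z = z, upper = target + L * se, lower = target - L * se,
       out = which(z > target + L * se | z < target - L * se))
}

# e.g. a series that shifts by two standard deviations halfway through:
shifted <- c(rep(0, 50), rep(2, 50))
res <- ewma_chart(shifted, target = 0, sigma = 1)
```

The chart flags the shift a few points after it occurs, which is the sort of small-trend sensitivity in short series the post alludes to.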

82 thoughts on “How Do We Know That Temperature is I(1) and not I(2)?”

  1. DeWitt,

    You performed this test on one A series, one A1 series, and one A2 series (all three are shown in the first figure).

    Suppose, Monte Carlo style, you went back to R and composed a second, third, nth set of A / A1 / A2 series, and tested them one by one.

    Would the p-values generated by the Phillips-Perron Unit Root Test be consistent, such that the “fail to reject” threshold was consistently met/unmet for nearly all “sets” of A / A1 / A2 series (and the differenced series, and the noisy series)?

    That should be the case, I think.

    (As a near-innumerate, apologies in advance if this question isn’t sensible.)

  2. Re: AMac (Mar 30 12:04),

    I should probably do that. If I were a better programmer, I would have done it already. The problem is that although rnorm produces a series with a nominal mean of zero and a standard deviation of one, which means that the standard deviation of the mean should be about 0.03 for 1000 points, the calculated value of the mean for A was ~0.07. Integrate that twice and you get a trend and some large numbers. So to automate the process, I have to calculate the mean and subtract it from the series, integrate, calculate the mean and subtract, integrate again and scale before I can test. I should probably also adjust the scale factor and noise level so that the p-value for A2nd1 is ~0.05 too.

    Another thing I didn’t do was calculate the correlation coefficient between the noisy and noise free series. If the two series were really of different integration orders, then a high coefficient would be spurious. Or at least that’s my understanding.
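    In R, the demean-integrate-demean-integrate recipe would look something like this (a sketch; the function name is my own):

```r
# Build an I(2) series, removing the sample mean before each integration
# to suppress spurious trends in the summed series.
make_I2 <- function(n) {
  a  <- rnorm(n)
  a  <- a - mean(a)     # demean before the first cumulative sum
  a1 <- cumsum(a)
  a1 <- a1 - mean(a1)   # demean again before the second
  cumsum(a1)
}

x <- make_I2(1000)
```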

  3. Hi lucia,

    “Does the temperature series look like data from a low pass filter? Not really. If the climate is chaotic, then one expects to see variation at all time scales even when the input isn’t changing. I’ll add some noise, a fraction of series A, to A2 and see the effect of the noise and the series length on the test results.”

    Why that particular DGP?

    I *measured* the series formally, and came to an ARIMA(3,1,0) specification. I.e. three autoregressive terms in the first difference equation. That’s why I used it in my simulations.

    Simulation results pertaining to other types of DGP models (like the one you employ here), are less informative as to how the estimator behaves in *our sample*.

    Here are my Monte-Carlo results, with Matlab code, if you want to reproduce/modify:

    PP: test (result: heavy bias)

    http://ourchangingclimate.wordpress.com/2010/03/01/global-average-temperature-increase-giss-hadcru-and-ncdc-compared/#comment-2457

    Subsequent ADF 3 lag test (result: almost exact)

    http://ourchangingclimate.wordpress.com/2010/03/01/global-average-temperature-increase-giss-hadcru-and-ncdc-compared/#comment-2468

    Cheers, VS

  4. PS. I believe temperatures (or in this particular case the GISS record) to be I(1) rather than I(2) because I tested the first difference series for unit roots as well, and rejected.

    All ADF test equations (using all available information criteria to determine lag length) reject unit root in first difference series

    All DF-GLS test equations (using all available information criteria to determine lag length) reject unit root in first difference series

    I posted the ADF test results, but only reported the DF-GLS, because we were discussing the first unit root.

    Here’s a link to the test results:

    http://ourchangingclimate.wordpress.com/2010/03/01/global-average-temperature-increase-giss-hadcru-and-ncdc-compared/#comment-1524

  5. Oh dear,

    “Hi lucia”, should be “Hi DeWitt”..

    I somehow assumed lucia posted this, because it’s here.. and.. eh, never mind..

    apologies!

    Cheers, VS

  6. Re: VS (Mar 30 15:26),

    lucia didn’t write this post, I did.

    Sorry, but you’re assuming your conclusion. You reject the possibility of a second unit root based on analysis of a short, noisy data set. The question really is not whether temperature is I(0) or I(1), it’s whether it’s I(1) or I(2), and I contend that the rejection of the unit root in the first difference is because the data set is too short and too noisy. The temperature record is a sample of the temperature, not the temperature itself, and as a result it must contain sampling noise. Differencing amplifies that noise. What effect does that have on the unit root tests? If I do tests on an ARIMA(3,1,0) model, I’ve already rejected the possibility of temperature being I(2). For simplicity, I started with a (0,2,0) model. Did you test, for example, an ARIMA(2,2,0) model? That should have somewhat similar properties to a (3,1,0) model.

  7. DeWitt,

    Hmm, I didn’t express myself clearly. I listed my Monte-Carlo as an example of how to (canonically) perform the simulation you were trying to perform. I didn’t say you needed to use an ARIMA(3,1,0). I said you needed to justify the relevance of your chosen DGP.

    Apart from that, testing has indicated the presence of the second unit root, that’s what I referenced to in the PS.

    Anyhow, I see no reason to infer that the ADF / DF-GLS tests misbehave on our particular sample (i.e. first differences of temperatures).

    Cheers, VS

    PS. Sorry again for the mix-up 🙂 was doing multiple things at the same time.

    PPS. Note that nobody in the literature found a second unit root.

  8. Re: VS (Mar 30 16:21),

    PPS. Note that nobody in the literature found a second unit root.

    Of course they haven’t. The tests have little power in the presence of noise and a limited sample length.

  9. Hi DeWitt,

    “Of course they haven’t. The tests have little power in the presence of noise and a limited sample length.”

    Do note that these issues have been studied to death in the econometric literature 🙂 However, your effort is appreciated.

    Kind regards, VS

  10. Hi DeWitt,

    “The tests have little power in the presence of noise and a limited sample length.”

    I do have to warn you that this has been studied to death in the econometric literature. The theoretical part of the field deals with little else.

    Nevertheless, I appreciate the effort! 🙂

    Kind regards, VS

  11. oops, I just noticed an (important) typo:

    “Apart from that, testing has indicated the presence of the second unit root, that’s what I referenced to in the PS.”

    Testing has in fact REJECTED the presence of a second unit root.

    VS

  12. VS

    you apply econometrics to a system that can be modelled using a physical basis. I think you need to broaden your education, it seems to be lacking.

  13. VS–
    I sometimes (though rarely) invite guests posts.

    Could you define your acronyms, like DGP? You are discussing issues with people who are not econometricians, and that means everyone is going to have to go back to using more general language. Does the DGP refer to DeWitt’s low pass filter? He picked that because the mathematics resembles a one-lump model for a system that conserves energy and is forced. It is the simplest possible model that might possibly apply to the earth’s climate system. (It is oversimplified. But at least it’s not a linear regression of this sort:
    (1) T(t) = a + b*t + e(t)

    which everyone absolutely knows is mis-specified based on phenomenology.)

    Or does DGP mean something else?

    The underlying question many of us have is this:

    If we have systems that obey the sorts of ordinary differential equations (ODE’s) and partial differential equations (PDE’s) we commonly see when mass, momentum and energy are conserved, and we force these models to create a synthetic series “Temperature” as a function of “time”, what sorts of outcomes do we get from the sorts of tests you apply?

    (We will be examining longer time series, shorter etc.)

    I do have to warn you that this has been studied to death in the econometric literature. The theoretical part of the field deals with little else.

    That’s fine. Just explain whatever you think applies from inside the field to those outside the field, and when possible, use general language instead of jargon or “terms of art” from economics.

    In particular, we’d all love to know the answer to something like this:

    If we force a “two-lump” model for a “cartoon” climate system using F(t), with the model set to conserve energy and not violate the 2nd law of thermo, and then pass the output T(t) through your tests, do we reject the unit root? (We’ll want to apply various types of forcing and run things for various lengths of time.)

    One of the reasons this is asked is that no one thinks the correct specification for the temperature (T) of the earth’s surface as a function of time, t, is
    (1) T(t) = a + b*t + e(t)

    where e(t) is any type of noise and a + b*t is the ‘deterministic’ part. (I don’t know what language you guys use in econometrics. In turbulence studies we might say ‘ensemble average’ for the deterministic part and ‘fluctuation’ for the non-deterministic part. The signal processing guys would probably say “signal” and “noise”.)

    But basically, given that no one thinks equation (1) applies, we are curious about what happens if we test output from systems of equations that resemble the systems we expect to be more similar to those that govern the climate system.
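    To make the question concrete, the simplest version of that cartoon might look like this in R (a one-lump sketch; all parameter values are purely illustrative, not anyone’s actual model):

```r
# One-box energy balance cartoon: C * dT/dt = F(t) - lambda * T(t),
# discretized with a unit time step. Parameters are illustrative only.
one_box <- function(forcing, C = 10, lambda = 1) {
  n <- length(forcing)
  temp <- numeric(n)
  for (t in 2:n) {
    temp[t] <- temp[t - 1] + (forcing[t - 1] - lambda * temp[t - 1]) / C
  }
  temp
}

# White-noise forcing gives an AR(1) response with coefficient 1 - lambda/C = 0.9,
# i.e. a near-unit-root series. The question is what the unit root tests make of it.
Tsim <- one_box(rnorm(1000))
# pp.test(Tsim)  # requires the tseries package
```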

  14. Re: VS (Mar 31 01:24),

    I do have to warn you that this has been studied to death in the econometric literature.

    I’m not going to plow through the literature. I’m running some simple minded tests to see if this sort of thing makes sense. So far, it doesn’t.

    The PP test does seem to be the most sensitive to noise so I’ve done some Monte Carlo type stuff with the KPSS and ADF tests. I’m mostly testing 1,000 examples because I don’t want to wait around for an hour while R grinds through 50,000 examples. I test the original series A2 and A2n and the first and second difference of each series. With the original series scaled to a mean of zero and a range of 20, for a 128 point long series both the ADF (using the default settings which calculates a lag of 4) and KPSS (also default) tests find two unit roots about 95% of the time at the 95% confidence level. That is, ADF fails to reject for the original and first difference series and rejects for the second difference. KPSS rejects for the original and first difference and fails to reject for the second difference. For the noisy series where the noise added has an s.d. of 2, both KPSS and ADF fail to find a unit root in the first difference series.

    To determine the approximate noise level of the GISS series in degrees times 100, I fitted a sixth order polynomial to the data. The standard error of the fit was 9.3 and the range of the data was 96. No, I haven’t done normality tests on the residuals. Maybe I’m too simple minded about this, but it still looks to me that the failure to find a second unit root in the temperature series is not proof that it doesn’t exist and so the whole cointegration argument fails.
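    The kind of simulation I’m describing looks roughly like this (a sketch using adf.test from tseries; the settings, names and noise level are all adjustable assumptions):

```r
library(tseries)  # provides adf.test

# Fraction of replicates in which the ADF test rejects a unit root in the
# first difference of the noisy I(2) series, i.e. misses the second unit root.
miss_rate <- function(reps = 100, n = 128, noise_sd = 2, range_target = 20) {
  hits <- 0
  for (i in 1:reps) {
    a  <- rnorm(n);  a  <- a - mean(a)
    a1 <- cumsum(a); a1 <- a1 - mean(a1)
    a2 <- cumsum(a1)
    a2 <- a2 * range_target / diff(range(a2))  # scale to the stated range
    a2n <- a2 + rnorm(n, sd = noise_sd)
    p <- suppressWarnings(adf.test(diff(a2n)))$p.value
    if (p < 0.05) hits <- hits + 1
  }
  hits / reps
}
```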

  15. “it still looks to me that the failure to find a second unit root in the temperature series is not proof that it doesn’t exist and so the whole cointegration argument fails.”

    That does not read to me like a valid conclusion. I think you would need to demonstrate a second unit root exists before you can conclude the cointegration argument fails.

    If it is possible that a second unit root exists, then it is possible that the cointegration argument fails, but this is as yet unproven.

    So long as it is possible that no second unit root exists, it is possible that the cointegration argument is valid.

  16. I think this paper has been discussed here before in comments?

    http://cbe.anu.edu.au/research/papers/pdf/wp495.pdf

    It seems like a fairly decent “in” for someone like me who is trying to follow along but doesn’t really have the stats chops to argue strongly one way or the other. For the most part it seems a logical approach to take and pretty well explained. Is there a linear trend? Can we trust it, perhaps there is a unit root? What does it mean if there is a unit root? Can we account for a unit root and see if there is a trend anyway? All without resorting to “but the physics say, so…”

    Their conclusion isn’t the same as VS as far as I can make out. But regardless of that, as a primer on the types of tests that econometricians might apply it seems alright to this non-expert.

    A quick google shows that this sort of analysis has been applied to global temperature time-series since at least the early ’90s so it is hardly a new approach or avenue of enquiry and the issues certainly don’t seem to be unknown in the context of climate, interesting as it certainly is.

  17. Re: lucia (Mar 31 07:23),

    I interpret DGP as Data Generating Process, which refers to how we generate our synthetic time series. In ARIMA(p,d,q) notation, I think my test series are (0,2,0), where the first number is the number of autoregressive lags, the second is the order of differencing (the number of unit roots) and the third is the number of moving average terms. I read VS as having convinced himself that the temperature series is (3,1,0), so all his tests are based on data generated with that model using AR lag coefficients determined by fitting to the temperature series. I contend that he can’t really tell the difference between (2,2,0) and (3,1,0), as autoregressive lags have behavior similar to unit roots, so he is in some respect begging the question. I think it also implies that there is no detectable deterministic signal in the 1880-200x temperature series.

    The problem in modeling this is what to do about the high frequency information in the temperature signal. It isn’t really all random noise, but for modeling purposes I don’t know of any way to simulate it other than adding random noise, short of an AOGCM (coupled Atmosphere-Ocean General Circulation Model). And that probably wouldn’t work either since, IIRC, AOGCMs don’t correctly reproduce the frequency domain behavior of measured temperature.

  18. Re: Paul_K (Mar 31 07:39),

    DGP = Data Generating Process

    Thanks! Then I guessed correctly.

    Does the temperature series look like data from a low pass filter?

    A low pass filter exhibits the same mathematical properties as a lumped parameter model. A bunch of us have been discussing “DGPs” of this sort, since they are the simplest possible over-simplifications of the sorts of equations one might use to describe climate trends in a system that conserves energy.

    I haven’t had time to read how these things might relate to the way econometricians view things. If VS has literature that tries to span the gap between the ODE (ordinary differential equation) and PDE (partial differential equation) representations and the linear-regression-like specifications his tests assume in support of his statistical models, that would be useful.

  19. Re: jr (Mar 31 07:53),
    That paper concludes with

    We can see that the realised temperatures in most of the past 10 years lie above the no-drift forecast intervals. This re-iterates the results of the previous section: there is sufficient evidence in all three temperature series to reject the hypothesis of no drift in favour of a warming trend in global temperatures. If we look at this question from the perspective of 30 years ago, we reach the same conclusion.

    So, they are saying that they think the recent temperatures are unusual even accounting for drift. Scanning quickly, some of the differences seem to arise from differing assumptions about the possible forms taken on by the deterministic process to which noise is added.

  20. Re: jr (Mar 31 07:53),

    From the linked paper:

    Stock (1994) emphasises the importance of properly specifying the deterministic trends before proceeding with unit root tests

    Also, their model is (0,1,2) not (3,1,0) and they conclude that temperature in the last decade has exceeded confidence limits calculated from temperature data up to 1950.

  21. Re: DeWitt Payne (Mar 31 08:39),
    My comment on Stock got eaten– I concur with you on what they conclude. I don’t know which aspect of their analysis causes them to conclude differently from VS. (I skimmed. My brother and nephew are here; we’ll be on the road taking a trip to visit Popsie-Wopsie soon.)

  22. Re: ge0050 (Mar 31 07:53),

    I think you would need to demonstrate a second unit root exists before you can conclude the cointegration argument fails.

    Each noisy series A2n has a deterministic trend A2. By construction, A2n-A2 is stationary. But if we test A2n and A2, A2 has two unit roots (again by construction) while A2n has only one by test. My understanding of the cointegration argument is that those test results would say that the correlation of A2 and A2n was spurious and A2 could not be subtracted from A2n. Tell me where I’m going wrong here.
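    The construction is easy to check numerically (a sketch with my own seed and noise scale):

```r
library(tseries)  # provides pp.test

set.seed(7)
n  <- 1000
a  <- rnorm(n);  a  <- a - mean(a)
a1 <- cumsum(a); a1 <- a1 - mean(a1)
A2 <- cumsum(a1)                      # "deterministic trend" with two unit roots
A2n <- A2 + 0.1 * sd(A2) * rnorm(n)   # noisy copy; A2n - A2 is white noise

cor(A2, A2n)        # essentially 1: the two series share their trend
# pp.test(A2n - A2) # the difference is stationary by construction
```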

  23. “it still looks to me that the failure to find a second unit root in the temperature series is not proof that it doesn’t exist and so the whole cointegration argument fails.”

    P1. failure to find a second unit root in the temperature series is not proof that it doesn’t exist
    P2. (implied) if there is a second unit root then cointegration argument fails
    C1. so the whole cointegration argument fails.

    Is C1 valid given P1 and P2 are true?

    C1 is True if P2 is true.
    P2 is true if there is a second unit root.
    However, P1 says we don’t know if there is a second unit root.

    Therefore it is unknown whether P2 is true or not, and therefore it is unknown whether C1 is true or not. Therefore we cannot conclude that C1 is true.

  24. Jr.

    nice paper:

    “However, if we remember that a unit root process is observationally equivalent to a process with a dominant stochastic cycle with periodicity that exceeds the span of our sample, then a unit root is just a proxy for long cycles. It is a well-established principle in time series analysis that statistical inference is more accurate if we approximate a near unit root by a unit root than by a stationary process. Looking at it this way, a unit root model is just a vehicle for accounting for the strong persistence in the series and getting better measures of confidence.”

  25. Re: ge0050 (Mar 31 10:04),

    No, it’s more like:

    P1 Only a single unit root is found in the temperature series
    P2 Two unit roots are found in the CO2 forcing data
    P3 A mismatch in integration order invalidates a determinative trend
    C1 CO2 forcing is not directly determinative for temperature but its first difference may be.

    But I have shown in a trivial case that P1 for A2n and P2 for A2 can be true but the conclusion that A2 is not determinative of A2n is false. Therefore P3 cannot always be true.

  26. Re: steven mosher (Mar 31 11:34),

    “Looking at it this way, a unit root model is just a vehicle for accounting for the strong persistence in the series and getting better measures of confidence.”

    On the one hand, that’s fine. But on the other hand, . . .

  27. “Therefore P3 cannot always be true.”

    That seems reasonable to me. There are likely some data series in which P3 is not true.

    However that leaves open that P3 is sometimes true and there appear to be learned papers to support this.

    To me this does appear to call into question the learned papers that did not test for unit roots (when they should have). I don’t take it to mean they are wrong, simply that they need to be checked for spurious trends.

    Isn’t this sort of problem common in mathematics? For example, division is a very handy mathematical tool. You can do quite a bit with it.

    However, in the case of division by zero all sorts of spurious results are possible. Thus, data that contains a zero in the divisor requires special handling. One simple method is a coordinate change. Shift the data so the divisors are all positive.

    The unit root concept seems very similar to me. It tests for the presence of poorly behaved data that may lead to spurious results. So we difference the data, shifting the coordinates, to improve the accuracy of the results.

  29. Re: ge0050 (Mar 31 17:21),

    However that leaves open that P3 is sometimes true and there appear to be learned papers to support this.

    It’s certainly true if you compare A2 and A1. You have to difference A2 to have it strongly correlate with A1. Doing unit root tests makes the most sense to me when you have two time series that you think should be strongly correlated but are only weakly correlated. But would you bother to test for unit roots before comparing the output of a GCM to the temperature record? lucia’s approach of subtracting one from the other and testing the residuals is more logical.

    Question: Can someone show me two time series that have different integration order but have a correlation coefficient R of 0.9 or greater where that correlation is spurious and the series with higher integration order has to be differenced? Or do these spurious correlations all have relatively low correlation coefficients? The square of R for the optimal two box model fit was greater than 0.8.

  30. I think that there is something awry here. It is to do with the units.

    The temperature record is in degrees (K) and the Flux forcing per unit area in J/m^2/s.

    Now thermal mass/unit area has the units J/m^2/K.

    So the first derivative of the temperature has the units K/s the same as the Flux forcing per unit area divided by the thermal mass per unit area as one would expect.

    That is the case for the flux retained as heat, which in the short term at least is the major component.

    The flux forcing rejected to space can be treated as the product of the temperature and the conductivity per unit area into space with the units K*(J/s/m^2/K) which is J/m^2/s as required. This models as the dominant term in the long term.

    So the Flux forcing can be resolved into components due to K the temperature, and K/s its first derivative. Or put the other way the temperature varies with the time integral of the flux in the short term and with the flux in the long term.

    This alone is I think enough to guarantee an apparent unit root in the temperature record if one considers it is driven by white noise fluxes, certainly at sub decadal scales but likely to extend to multi-decadal scales.

    More importantly, in order to compare flux forcings from say WMGs one would need to consider not the values of the forcings but their integrals with respect to time.

    I believe that in order to get the units (and a lot of the physics) right one should be comparing the forcings with the first derivative of the temperature, which is I(0).

    So if the WMGs are I(2) two differences need to be found not one. Alternatively one would need to integrate the WMG series making them I(3) against the temperature series which is I(1), it is the same problem.

    Now one can remove one root from the WMG series by admixture with other I(2) forcings, but to remove the second one without changing the units one would need to compose a new series by admixture with I(1) forcings, if possible, and regress that against the first derivative of the temperature series. One could of course just take the first difference with respect to time and multiply by a factor that has seconds as its unit, but then one might want to know what physical meaning such a factor might have.

    I think it is erroneous in general to try and regress forcings against temperature without some attempt to allow for the integrating effects of thermal mass, whether there be unit roots or not.

    Alex

  31. On another point,

    I do not know the rules for adding series with unit roots but I imagine that:

    I(n) + I(n) => I(n) and

    I(k) + I(l) => I(m) for some k<=m<=l {when k<l}

    I believe I can model the SST record as an admixture of a stochastic term that is I(0) (at least in the long term) plus terms derived from GISS forcings, and get something that is both I(1) and a good match for the spectral properties of SST when working with monthly timesteps, based on the principles outlined above.

    Now I understand the difficulties in regressing such a series on the I(2) component used to generate it. It is a bit like trying to unmix mayonnaise, but I know it is in there because I put it there. Even when it is "correctly" scaled to match the 20th century SST warming, it is still swimming in the sea of partially integrated noise needed to obtain (on occasion) decent matches on the first few dozen lagged correlation coefficients and (on average) produce roughly the right amount of overall variance. Of course much of the time it does its own thing, but in general it looks and smells like an earthly SST series, just not necessarily the one we are familiar with.

    Of course the "correct" scale for the WMG component needs those quotation marks, as it simply is not possible to say what it should be, due to the not inconsiderable stochastic trends it swims with.

    Alex

  32. Hi guys, I stayed up late last night and investigated that Breusch and Vahid (2008) working paper posted up there.

    http://cbe.anu.edu.au/research/papers/pdf/wp495.pdf

    These three posts are relevant:

    Use of PP by B&V 2008:

    http://ourchangingclimate.wordpress.com/2010/03/01/global-average-temperature-increase-giss-hadcru-and-ncdc-compared/#comment-3056

    ARIMA(3,1,0) and ARIMA(0,1,2) compared:

    http://ourchangingclimate.wordpress.com/2010/03/01/global-average-temperature-increase-giss-hadcru-and-ncdc-compared/#comment-3061

    Impulse response functions ARIMA(3,1,0) and ARIMA(0,1,2) compared:

    http://ourchangingclimate.wordpress.com/2010/03/01/global-average-temperature-increase-giss-hadcru-and-ncdc-compared/#comment-3063

    Hi lucia,

    The main reason I’m trying not to get too involved in this discussion here, is that I’m trying to keep everything at Bart’s (since he’s been a nice host so far, it seems like the decent thing to do :)).

    Now, I completely understand that you haven’t had the time to read the whole thread yet, but I believe that doing so will clear up a lot of misunderstandings. A lot of the questions you pose here have been posed and answered in this month-long discussion.

    I’m really interested in your opinion once you have had the time to look at my argument more carefully 🙂

    Cheers, VS

  33. Stephan,
    Very compelling. We are clearly facing an Arctic ice catastrophe that will drown even more Polar Bears.
    ;^)

  34. VS

    The main reason I’m trying not to get too involved in this discussion here, is that I’m trying to keep everything at Bart’s (since he’s been a nice host so far, it seems like the decent thing to do 🙂 ).

    That’s fine. We understand you are answering questions there. Discussion is also happening here. That’s the way of blogs. If you would prefer to stay there, that’s ok. But, of course, if you drop in and tell us what we are doing has been addressed in the literature, then people will ask you to just point to where you think it has been so addressed.

    As for these figures:

    Which you suggest “From my ‘layman’ climate-science perspective, I would say that my naive ARIMA(3,1,0) looks more like something global mean temperature trend would ‘do’ after a shock in one period, than their ARIMA(0,1,2) specification. ”

    I don’t know what your econometrics definition of “shock” is. Is it a Dirac delta function in the forcing at a certain time? Why do you think the ARIMA(3,1,0) looks more like something the global mean temperature would do in response to such a thing?

    Anyway, I don’t know, in words, what you did when you say something like

    Here are the impulse responses for the two specifications. I believe that the bottom one (i.e. accumulated effects of an exogenous shock) is more relevant for the question I posed.

    Did you just impose a temperature excursion and then watch it propagate? Why?

    With your formulation, can you impose anything on the forcing side and watch the temperature evolve? I don’t see how what you did relates to what those who are interested in understanding the physical connection are looking for.

    For what it’s worth, the correlogram for an ARMA(1,1) process looks like the response of a very simple one-lump-parameter model that conserves energy, driven by white-noise forcing to create a temperature of the planet, with white noise (possibly from measurements, possibly something else) then added on top. If we apply other forcings, we can do other things.
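    To make that concrete, here is a Python sketch of such a one-lump model (all parameter values, the AR coefficient a = 0.9 and the two noise levels, are arbitrary illustrative choices, not fitted to anything). An AR(1) “planet” plus white measurement noise is observationally an ARMA(1,1), and its autocorrelations decay geometrically after lag 1:

```python
import random

def autocorr(x, k):
    """Sample autocorrelation of x at lag k."""
    n = len(x)
    m = sum(x) / n
    num = sum((x[t] - m) * (x[t - k] - m) for t in range(k, n))
    den = sum((v - m) ** 2 for v in x)
    return num / den

random.seed(1)
a = 0.9        # discretized one-box model: a = 1 - lambda*dt/C (hypothetical value)
n = 20000

T = [0.0]      # 'planet' temperature driven by white-noise forcing
for _ in range(n - 1):
    T.append(a * T[-1] + random.gauss(0.0, 1.0))

y = [Ti + random.gauss(0.0, 0.5) for Ti in T]   # add white measurement noise

# ARMA(1,1) signature: r(k+1)/r(k) equals a for every lag k >= 1
r2, r3 = autocorr(y, 2), autocorr(y, 3)
print(r3 / r2)   # close to a = 0.9
```

    The geometric decay of the correlogram after lag 1 is what identifies the observed series as ARMA(1,1) rather than pure AR(1); the measurement noise only rescales the first autocorrelation.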

    But much of the discussion of the physics doesn’t have to do with your choice of noise model. It has to do with the assumption in ADF that the deterministic trend is linear. That assumption is almost certainly wrong, and the questions on my part relate to that. Nothing I read over at Bart’s addresses this issue– which is fine. But it’s one of the issues that is going to be explored over time.

  35. Hi lucia,

    The impulse represents the ‘measured’ response of the process to an exogenous shock (so an ‘error’, size one s.d., at time t). Note the general ARIMA(p,d,q) specification is *extremely* flexible.

    If you don’t believe me, simulate a few, and you will see the enormous variety of outputs the general specification can produce.

    Now, I hope it is clear from this that an ARIMA specification is not a ‘noise model’; it is a description of a system or, when estimated, a ‘trend measurement’ (hence the ‘metrics’ in econometrics). We use formal methods to ‘exclude’ some specifications, we use information criteria (entropy measures) to prevent ‘over-fitting’… etc etc. So there exists a formal approach to arriving at one’s specification.

    I checked elaborately with diagnostics, and found that for the GISS record over this period, the system is better described as an ARIMA(3,1,0) than an ARIMA(1,1,1).

    The idea with these two impulse response functions was to compare the ARIMA(3,1,0) with the ARIMA(0,1,2). Now, I think that physics makes some predictions as to how ‘the system’ would respond to a shock. Considering that, I honestly think that the measurement of this impact arrived at via the ARIMA(3,1,0) specification is more appropriate for the system we are studying (in terms of impact on ‘the trend’).

    ARIMA(3,1,0)

    http://img694.imageshack.us/img694/460/impulseresponsear.gif

    ARIMA(0,1,2)

    http://img511.imageshack.us/img511/212/impulseresponsema.gif

    You see my point?

    As for the ADF: note that I didn’t only apply an ADF. I applied outlier-consistent estimators, endogenous structural break estimators, H0-stationarity asymptotic estimators, heteroskedasticity-consistent estimators… etc. I really tested it from all sides.

    Now, what I’m disputing, are nonsense statements like these (first paragraph, first section, IPCC summary report):

    “Eleven of the last twelve years (1995-2006) rank among the twelve warmest years in the instrumental record of global surface temperature (since 1850). The 100-year linear trend (1906-2005) of 0.74 [0.56 to 0.92]°C is larger than the corresponding trend of 0.6 [0.4 to 0.8]°C (1901-2000) given in the TAR (Figure 1.1). The linear warming trend over the 50 years from 1956 to 2005 (0.13 [0.10 to 0.16]°C per decade) is nearly twice that for the 100 years from 1906 to 2005. {WGI 3.2, SPM}”

    Do note that, in light of my test results, all these point estimates and confidence intervals have absolutely no (formal statistical) meaning.

    Best, VS

  36. Do note that, in light of my test results, all these point estimates and confidence intervals have absolutely no (formal statistical) meaning.

    while the intervals you give us (±1°C over 60 years) have no real world meaning.
    .
    and this is based on the assumption that you got stuff right.

    just take a look at the weight example that Bart gave.
    .
    http://ourchangingclimate.wordpress.com/2010/04/01/a-rooty-solution-to-my-weight-gain-problem/#comment-3124
    .
    ps: the forecast interval around your figure 2 also looks wrong. why would it be wide around the pre-1935 period that you base your forecast on?

  37. Re: Alexander Harvey (Mar 31 21:03),

    I think it is erroneous in general to try and regress forcings with temperature without some attempt to allow for the integrating effects of thermal mass, whether there be unit roots or not.

    I completely agree. A one box model of the planet is like lucia’s tank with a porous plug in the drain. Increase the flow into the tank and the level goes up, but not instantaneously. I was planning to get to that eventually, but I wanted to look at a trivial case first to see if my suspicions about the various unit root tests and noisy data were valid.

  38. Re: VS (Apr 1 08:36),

    If the accumulated response is meant to represent the measured temperature, then (0,1,2) looks a lot more like the real world than (3,1,0). We have something approximating an impulse in the temperature record in the major volcanic eruptions of Pinatubo and El Chichon. Where is the overshoot and ringing that your (3,1,0) model predicts?
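    Whether an AR(3)-in-differences specification rings after a shock depends entirely on the signs and roots of its coefficients. A Python sketch of the accumulated impulse response, with purely hypothetical coefficients (NOT VS’s fitted values; these are chosen only so that sum(|phi|) < 1 guarantees a stable AR part):

```python
# Hypothetical AR(3) coefficients for the *differenced* series.
# Illustrative only; sum(|phi|) = 0.8 < 1 guarantees stability.
phi = (0.4, -0.3, 0.1)
steps = 300

h = [1.0]   # impulse response of the differences (unit shock at t = 0)
for t in range(1, steps):
    h.append(sum(phi[i] * h[t - 1 - i] for i in range(3) if t - 1 - i >= 0))

level = []  # accumulated (I(1)) response: running sum of h
s = 0.0
for v in h:
    s += v
    level.append(s)

# The level response settles at 1 / (1 - phi1 - phi2 - phi3).
print(level[-1])   # -> 1.25 for these coefficients
print(min(h) < 0)  # True: the negative phi2 makes the differences undershoot
```

    With other (still stable) coefficient choices the response decays monotonically with no ringing at all, so overshoot in a fitted (3,1,0) is a statement about the estimated coefficients, not about the specification per se.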

  39. VS

    The impulse represents the ‘measured’ response of the process to an exogenous shock (so an ‘error’, size one s.d., at time t). Note the general ARIMA(p,d,q) specification is *extremely* flexible.

    I know ARIMA is flexible. So? I’m trying to figure out how what you are doing relates to anything physical. The fact that ARIMA is flexible does not mean it will automatically relate to anything physical about climate.

    In a physical process, applying this sort of “shock” to temperature violates the first law of thermo. So, what do you think we are supposed to learn from this? Do you see it as some sort of limiting case to something?

    You see my point?

    No. Do you think you can provide words to explain why you think one of those pictures tells us one thing is more physical than the other? (I think it’s impossible. But if you can find words, that would be useful.)

    Where is the overshoot and ringing that your (3,1,0) model predicts?

    I think that image looks wildly unphysical. I’m not sure the first one is “good”, but at least it doesn’t look like a system that violates thermodynamics!

  40. “lucia’s approach of subtracting one from the other and testing the residuals is more logical.”

    Why would that be true? Isn’t late correction typically more complex/less reliable than early correction, because of the possibility of complex interaction between the errors?

    Since in this case the model and the data are not truly independent, this would appear to increase the chances for the errors to interact in a complex fashion.

  41. Re: ge0050 (Apr 1 10:40),

    You’re right about the lack of independence during the historical period because the models were tuned to reproduce the historical period as well as possible. But if we compare the model projections and new temperature data, I don’t see the problem with subtraction. I’m unsure what you mean by early and late correction.

  42. Since in this case the model and the data are not truly independent, this would appear to increase the chances for the errors to interact in a complex fashion.

    When I test, I test whether models match data. I subtract because the residuals from a linear fit show cross-correlation from model run to model run. So subtracting is the method of dealing with that.

    In the context of what VS is doing, the difficulty is that his tests assume that the deterministic trend is linear. No one expects that. People expect it to look more like the multi-model mean from AOGCMs. That mean may be wrong– but it’s certainly not of the form T(time) = f(time) = a + b*time.

    One question is: what does the fact that VS’s tests assume f(time) is a linear function do to the outcome of his tests? Is it possible that he is finding a “unit root” merely because the deterministic response to the applied forcings is highly non-linear?

    Basically, no one who understands the physics assumes that
    T(time) = a + b*time + e(t), where e(t) is some sort of noise.
    They think it’s
    T(time) = f(t) + e(t), where f(t) is highly non-linear. But VS’s tests assume f(t) = a + b*time. That automatically means that a large amount of the deterministic variation is being viewed as whatever is in e(t)– which is “unexplained” or “noise” or whatever word someone from some other field wants to call it.
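    The worry can be illustrated in a few lines of Python. Fit a straight line to a purely deterministic, nonlinear curve (a hypothetical shape, quadratic plus a slow oscillation, with no noise at all), and the residuals come out almost perfectly autocorrelated, which is the kind of residual behavior a unit root test can mistake for a stochastic trend:

```python
import math

n = 200
t = list(range(n))
# Purely deterministic, nonlinear 'trend' -- a hypothetical shape, zero noise.
T = [5e-5 * ti ** 2 + 0.1 * math.sin(ti / 20.0) for ti in t]

# Ordinary least-squares straight line a + b*t.
tbar = sum(t) / n
Tbar = sum(T) / n
b = (sum((ti - tbar) * (Ti - Tbar) for ti, Ti in zip(t, T))
     / sum((ti - tbar) ** 2 for ti in t))
a = Tbar - b * tbar
resid = [Ti - (a + b * ti) for ti, Ti in zip(t, T)]

def lag1(x):
    """Sample lag-1 autocorrelation."""
    m = sum(x) / len(x)
    return (sum((x[i] - m) * (x[i - 1] - m) for i in range(1, len(x)))
            / sum((v - m) ** 2 for v in x))

print(lag1(resid))   # very close to 1, yet there is no stochastic trend at all
```

    There is nothing random in this series, yet the residuals from the linear specification look exactly like the highly persistent “noise” a unit root test is built to detect.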

  43. lucia, I’m greatly appreciating this discussion but I’m having difficulty with your assertion that no one who understands the physics assumes the deterministic trend is linear; unless of course you are being very ironic. The IPCC insists the trend is linear; for example FAQ 3.1 Figure 1, which purports to show the increasing linear trend during the 20thC; see page 19 of Chp 3 AR4:

    http://www.ipcc.ch/pdf/assessment-report/ar4/wg1/ar4-wg1-chapter3.pdf

    Of course from a physical viewpoint CO2 cannot have a linear effect on temperature unless its concentration is increasing exponentially; the opposite is the case, with Beer-Lambert causing an exponential decline in effect for increases in CO2. This is why AR4 relies on an enhanced greenhouse effect, which in turn is defeated by a host of real phenomena including convection, cloud forcing and the troposphere location of SH. In this respect VS is entirely within his rights to object to the usual apocalyptic IPCC warming warning which he refers to at comment 39733.

    As for sod with his ubiquitous tinny disclaimers about VS:

    “while the intervals you give us (+-1°C over 60 years) have no real world meaning.”

    Good one sod; have a look at the Figure 1 in the above link; it shows an accelerating linear trend over the course of the 20thC with the last 25 years showing a trend of 0.177C per decade or, in the 60 year equivalent, 1.062C; why do you bother.

  44. ouch. there is no claim made by VS that is too stupid to be picked up by some of his not so well educated followers.
    .
    the IPCC insists the trend is linear for example FAQ 3.1 Figure 1 which purports to show the increasing linear trend during the 20thC; see page 19 of Chp 3 AR4:

    http://www.ipcc.ch/pdf/assessm…..apter3.pdf
    .
    beware, you will be disappointed. follow that link, and you won’t find the claim that the trend is linear, neither on page 253 nor on any other page.
    .
    but perhaps the evil folks at the IPCC simply hide their claim? but make it explicit in the other stuff they write?
    .
    ahm, no. here is what they say:
    .
    “Linear trend fits to the last 25 (yellow), 50 (orange), 100 (purple) and 150 years (red) are
    shown, and correspond to 1981 to 2005, 1956 to 2005, 1906 to 2005, and 1856 to 2005, respectively. Note that for shorter recent periods, the slope is greater, indicating accelerated
    warming”

    .
    this makes it perfectly clear that they do NOT think that the temperature trend is linear. if they thought it was, all those time periods should show EXACTLY THE SAME TREND.
    .
    and the graph at the top (let us assume that the IPCC took a look at it, before they printed it) doesn’t look like a linear trend either.
    .
    Of course from a physical viewpoint CO2 cannot have a linear effect on temperature unless its effect is exponentially increasing; the opposite is the case with Beer-Lambert causing an exponetial decline for increases in CO2; this why AR4 relies on an enhanced greenhouse effect which in turn is defeated by a host of real phenomena including convection, cloud forcing and troposphere location of SH. In this respect VS is entirely within his rights to object to the usual apocalyptic IPCC warming warning which he refers to at comment 39733.
    .
    so you build a strawman, and now you are trying to burn it? with some vague hints to whatever physics fit your beliefs?
    .
    nice.
    .
    As for sod with his ubiquitous tinny disclaimers about VS:

    “while the intervals you give us (+-1°C over 60 years) have no real world meaning.”

    Good one sod; have a look at the Figure 1 in the above link; it shows an accelerating linear trend over the course of the 20thC with the last 25 years showing a trend of 0.177C per decade or, in the 60 year equivalent, 1.062C; why do you bother.
    .
    this is the reason why i caution against handing out complicated statistical methods on April Fools’ day.
    .
    cohenite has been initiated into the “unit root” cult, even though it is obvious that he struggles to understand the simplest concepts.
    .
    but perhaps someone around here can explain the difference to him? i will only offer one illustration:
    .
    Business model 1 gives $100000 +- 20000 at the end of 60 years.
    .
    Business model 2 gives between $+100000 and $-100000 after the same timespan.
    .
    can you spot the difference?

  45. Earth to sod; end point fallacy is your poison, not mine; the AR4 graph is just another example of the perpetual season of cherry-picking in AGW ga-ga land; your snark at VS about intervals based on anomalies, which are neutral, misses the point of stationarity, don’t you think [sic]? Still, irony is not your strong point; to quote approvingly from AR4 that:

    “Linear trend fits to the last 25 (yellow), 50 (orange), 100 (purple) and 150 years (red) are
    shown,”

    and then to assert that “this makes it perfectly clear, that they do NOT think, that the temperature trend is linear.” because the linear trends are different for the cherry-picked intervals is so symptomatic of the alarmists’ position on AGW as to be a text-book example of that particular pathology.

  46. Cohenite, you and SOD are both correct in a way. What AR4 did was to take a phenomenon that they think is nonlinear, and then linearize it to make a point. If you go to the models or attribution sections you will find that SOD is correct. However, you will find that they took the models’ runs and used a mean from the models to compute the expected trend. However, their confidence intervals were not simple OLS projections, other than for discussion. They backcast and describe it similarly, as nonlinear. Somewhere, they also state that linearizing is acceptable for short time spans. I agree; this is done repeatedly by engineers. It has its own set of problems and assumptions. One, which you mention, is the cherry picking. One can cherry pick and show that there were periods with similar span and rate increases in the past. Another problem is extrapolating forward, which is what I believe VS is getting on about wrt a misspecified model. I don’t think that VS’s points are more than “be careful” at this point. I definitely don’t think the models fit his framework. And just as he points out model misspecification, I think his framework, applied to what the authors actually do and did, is misspecified.

  47. Another point to ponder is that some of the forcings that comprise the ModelE input data set, particularly the greenhouse gas data which is derived from ice core data, do not have annual resolution. The annual data has been created by interpolation and smoothing. Smoothing can not only increase autocorrelation, but can also increase the apparent integration order.
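    That smoothing effect is easy to demonstrate; a Python sketch (the window length w = 10 is an arbitrary stand-in for the interpolation/smoothing resolution):

```python
import random

def lag1(x):
    """Sample lag-1 autocorrelation."""
    m = sum(x) / len(x)
    return (sum((x[i] - m) * (x[i - 1] - m) for i in range(1, len(x)))
            / sum((v - m) ** 2 for v in x))

random.seed(7)
raw = [random.gauss(0.0, 1.0) for _ in range(10000)]   # annual 'data': white noise

w = 10                                                 # hypothetical smoothing window
smooth = [sum(raw[i - w:i]) / w for i in range(w, len(raw))]

print(lag1(raw))     # near 0: white noise
print(lag1(smooth))  # near (w - 1) / w = 0.9: smoothing manufactures autocorrelation
```

    A moving average of completely independent data acquires lag-1 autocorrelation of (w-1)/w, so any unit root test applied to an interpolated and smoothed forcing series is partly testing the smoothing, not the underlying data.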

  48. JohnF

    And just as he points out model misspecification, I think his framework applied to what the authors actually do and did is misspecified.

    The difficulty is that, if this is his main point, he is pointing out that fitting a linear trend to data from 1880-now is a mis-specification. Everyone agrees with that.

    But I at least think VS is trying to say more than that. Exactly what is not entirely clear. At a minimum he seems to be saying that if we look at the data only, we can’t decide that recent temperatures are outside expected ranges based on data. Maybe. Maybe not. It is true that some of the confidence people have is based on physics, not curve fits to past data.

  49. Re: lucia (Apr 2 08:27), I agree that VS is saying more. What I find so interesting are the implications. I think that climate does show long term persistence and that this needs to be accounted for. More interesting is that the “burst” effect of CO2 wrt temperature has such similarities to the (contested) findings that water vapor is short term positive, but long term negative. With the predominance of water vapor as the GHG forcing, the modeled assumption of its being only positive would, from a superficial examination, appear to be in agreement. Don’t know if it would stand up to an in-depth examination; thus my interest. Of course, I find temperature such an awkward consideration, since my training and experience indicate one should be using a mass and energy balance to investigate forcings. Or even discuss them.

    “One question is: What does the fact that VS’s tests assume f(time) is a linear function do to the outcome of his tests? Is it possible that he is finding a “unit root” merely because the deterministic response to the applied forcings is highly non-linear?”

    That isn’t what I read. What I read so far is that VS is demonstrating that the accepted conclusion that recent warming is somehow different from past warming is largely dependent upon the statistical methods used and the choice of end points; that when you choose a different statistical method and/or different end points, the current warming is statistically within natural variability. Thus, the currently accepted conclusion may be wrong.

    As to the merits of unit roots and polynomial co-integration, I believe that is where VS is headed next. Non-linear analysis is an area in which mathematics struggles. Thus, any technique that allows us to reduce complexity and increase accuracy I find of great interest.

    In concept it makes sense to me. Replace the linear co-ordinates with non-linear co-ordinates and map the problem from non-linear to linear as a means of simplification. Use this simplification to improve the accuracy. However, until I’m more familiar with the technique I’d prefer to see more. Comments about the relative merits of the technique seem misplaced otherwise.

  51. Thank you John; as galling as it may be to admit that sod is right! I have been interested in the ‘linear’ trend issue for some time; as a matter of fact what got me interested was a post by our host on isolating the trend after removing natural variables such as ENSO, volcanoes etc;

    http://rankexploits.com/musings/2008/gavin-schmidt-corrects-for-enso-ipcc-projections-still-falsify/

    If natural variation is removed I can’t see how the GHG trend can be other than linear; Bob Tisdale looks at a Thompson paper in this context [noting that the literature to do with isolating either natural factors or GHGs as ‘determinants’ of temperature is fairly extensive];

    http://wattsupwiththat.com/2009/09/24/a-look-at-the-thompson-et-al-paper-hi-tech-wiggle-matching-and-removal-of-natural-variables/#comment-193229

    This is my response to Bob’s analysis:

    “Decomposing natural/periodic and AGW components of trend is the hot issue of the ‘debate’ right now; however the waters are being muddied by AGW attempts to establish high-order dependencies of periodic components on AGW [ see Vecchi, Cai and the oddest of the lot Meehl;

    http://ams.confex.com/ams/88Annual/techprogram/paper_133611.htm ]

    and the Modoki as a proxy for AGW component of temperature trend is making a comeback;

    http://www.agu.org/pubs/crossref/2009/2009GL037885.shtml
    http://www.nature.com/nature/journal/v461/n7263/full/nature08316.html

    But as a matter of the 2nd law of thermodynamics, if AGW components are causing variations in periodic components then that must reduce the direct effect of the AGW component on temperature, otherwise the total input will exceed 100%. In the first instance what is required is a comprehensive comparison of periodic/stationary and AGW/non-stationary/model components to establish what dominates the trend [and I believe David Stockwell is attempting such an analysis]; if it is the case that periodic components do dominate the trend then all that is left for AGW is the fanciful notion of ‘possession’ of periodic components”

    It seems to me that VS has made a valuable contribution to this debate, even if there is some uncertainty to what that contribution is!

  52. Re: ge0050 (Apr 2 16:15),

    That isn’t what I read. What I read so far is that VS is demonstrating that…

    The assumption is implicit in using the ADF test. It permits a drift and a linear trend as the functional form for the deterministic trend.

  53. Of course, I find temperature such an awkward consideration since my training and experience indicates one should be using a mass and energy balance to investigate forcings. Or even discuss them.

    They weren’t taking energy balance readings over the past century or so. Temperature is what we are stuck with. The research into energy balance is quite active now.

  54. Re: lucia (Apr 2 16:19),

    According to this paper, even the CADF test, which allows multiple covariates to be specified rather than just a drift and intercept, requires that the covariates be trend stationary. So Tamino’s use of the net forcings as a covariate in the CADF test was probably incorrect, as the KPSS test rejects their stationarity at better than 99% confidence.

  55. Bugs, that is correct. However, they (the IPCC and certain climate scientists) have already drawn conclusions and made attributions. It is not that I necessarily disagree with them. It is that without such an approach the CIs may literally go from floor to ceiling. The really irritating part is that some of the climate community are treating legitimate questions about assumptions, methodology, or confidence intervals as an attack on the science. It is no such thing. One can be wrong in one’s assertions of the relevance of assumptions, methodology or CIs, but that is not the same as attacking science when one examines these attributes. Short version: certain climate scientists have made their bed and now are complaining about it.

    That we are stuck with temperature does not mean that it is an acceptable vehicle of explanation for one side or the other. That is the fascinating part of the VS and Bart discussion.

    Besides, one of the reasons it is “in” is because we do it so poorly and it is in need of illumination.

  56. DeWitt, is it fair to say that the Lupi paper overcomes the ‘need’ to isolate natural and AGW factors in order to achieve attribution of trend or otherwise, and also overcomes problems such as collinearity in such a process of attribution?

  57. Re: cohenite (Apr 2 16:17), LOL, I too have had my discussions with sod, but credit where it is due: when (s)he is right, (s)he is right.

    Yes, Lucia’s discussion and explorations on ENSO I think are publishable. But then publishing is not what I do, so consider my opinion as coming from the peanut gallery.

    You say

    But as a matter of the 2nd law of thermodynamics, if AGW components are causing variations in periodic components then that must reduce the direct effect of the AGW component on temperature, otherwise the total input will exceed 100%. In the first instance what is required is a comprehensive comparison of periodic/stationary and AGW/non-stationary/model components to establish what dominates the trend [and I believe David Stockwell is attempting such an analysis]; if it is the case that periodic components do dominate the trend then all that is left for AGW is the fanciful notion of ‘possession’ of periodic components.

    I would caution you that, when one follows a phase envelope with a substance such as water, with its high heat of vaporization, heat of fusion, and GHG capacity, claiming that a periodic stationary component would push the total input past 100% based on past temperature observations is especially ill-defined if one has not performed a mass and energy balance.

    On “Gerald A. Meehl”, I would accept or reject their model the same as I would accept or reject Model E or any other model that used hyperviscosity in a set of continuum equations and assumptions to go from PDEs to a finite differencing/integrating model. I do not think we have the data from the 1970s in the Pacific to qualify and quantify a reasonable heat and mass balance to support the observed relationship of ENSO and temperature. It may be no more, no less, than a fortuitous expression of a model.

    La Niña Modoki impacts Australia autumn rainfall variability

    has teleconnection claims. I am not convinced that teleconnections are anything more than phenomenological expressions of spurious correlations or confirmation bias. I think VS’s discussions are especially relevant to this claim.

    El Niño in a changing climate

    has this quote

    “El Niño events, characterized by anomalous warming in the eastern equatorial Pacific Ocean, have global climatic teleconnections and are the most dominant feature of cyclic climate variability on subdecadal timescales.

    So on the one hand one states inconsistent (anomalous), and on the other one states teleconnection, which from Wiki means “in atmospheric science refers to climate anomalies being related to each other at large distances (typically thousands of kilometers)”. I find inconsistent warming (or cooling) relations over thousands of kilometers to be indistinguishable from spurious correlation or confirmation bias without extraordinary evidence. Plus they are dominant?!? Consider me old fashioned: “don’t know means don’t know”, though I agree a good guess is a good guess.

  58. Re: cohenite (Apr 2 17:53),

    Would you go into more detail? I’m not sure exactly what you mean. I also didn’t read the paper in depth. I was mainly looking at the mechanics of using CADFtest in R and like all computer languages, the R help files only help if you don’t actually need them.

  59. DeWitt, I’m perilously close to exceeding my pay scale here, but the issue I am referring to is ascertaining whether natural factors are stationary or, as is denied by AGW, like GHGs, particularly ACO2, deterministic and trend setting. Following the McLean et al brouhaha, David Stockwell did a comment looking at this issue of whether natural factors, in this instance cSOI, would be non-stationary, but found due to collinearity with GHGs that separation and quantification of any trend attribution by cSOI was not possible, or at least difficult:

    http://arxiv.org/PS_cache/arxiv/pdf/0908/0908.1828v1.pdf

    The Lupi paper seems to overcome this sticking point in attribution of trend to natural factors by considering all [multivariate] factors, so that, presumably, determining whether natural factors are stationary without first separating them from GHG effects does not have to be done. Is this a fair interpretation, or am I, in sod’s parlance, still suffering from the rose coloured glasses of the VS/unit root cult?

  60. Re: cohenite (Apr 2 19:13),

    I seriously considered making a similar comment about exceeding my pay scale in my previous reply. Basically, I dunno. Suggest something and I’ll try it. I’ll try to read the Stockwell paper ASAP, but it’s a Formula 1 race weekend, not to mention a few other things (I really liked How to Train Your Dragon, but Joe Morgenstern was rather hard on Clash, so I may make that a rental), so it may be a bit.

  61. DeWitt, my suggestion is to watch the races while I try to wade through John’s advice; I will also pass on the Lupi paper to David and see what he comes up with.

  62. Sorry that you have to wade Coh, but perhaps that is not my fault but the “state of the art”.

  63. Figure of speech John; I’m an ex-surfer with a bad back; wading is a pleasant fall back.

  64. There appears to be a difference in behavior, or at least the ARIMA specification for the GISS temperature series, depending on exactly which series you use. This must be a consequence of GISS’ continual revision of their series. I get results very similar to VS for the most recent GISS incarnation, but quite different, IMO, for the series that ends in 2003. So now I’m questioning how much of the autocorrelation of the series is actually caused by GISS.

  65. Actually, as long as we are doing phenomenology, you can get a very nice linear regression of the temperature to the logarithm of the CO2 concentration, after you “remove” the lowest-frequency cycles (possibly associated with the Pacific Decadal Oscillation) in the temperature time-series. Please see here:

    http://comp.uark.edu/~jgeabana/gw2.html

    I haven’t run any sophisticated statistical analyses on it, but it looks pretty good to me…

  66. Re: julio (Apr 6 12:50),

    If you use the forcing rather than the concentration then you are already using the logarithm of the concentration. One problem with Beenstock and Reingewertz is that it’s not at all clear, to me anyway, whether they are using forcing or concentration. The simplified version of the forcing according to the IPCC is 5.35*ln([CO2(t)]/[CO2(0)]), where [CO2(t)] is the CO2 concentration time series.
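    For reference, the simplified expression is one line of Python (coefficient 5.35 per Myhre et al., often rounded to 5.3; the 280 ppm baseline is the usual pre-industrial value):

```python
import math

def co2_forcing(c, c0=280.0):
    """IPCC simplified CO2 forcing, W/m^2: 5.35 * ln(C/C0)."""
    return 5.35 * math.log(c / c0)

print(co2_forcing(560.0))   # doubling from 280 ppm: 5.35*ln(2), about 3.7 W/m^2
```

    Because the forcing is logarithmic in concentration, regressing temperature on raw concentration rather than forcing builds in a curvature mismatch before any statistics are done.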

  67. Re: DeWitt Payne (Comment#40030)

    If you use the forcing rather than the concentration then you are using the logarithm of the concentration.

    I know, that’s why I used it 🙂 This is not to say that one should necessarily expect a linear relationship between the temperature and the forcing, but at least using the forcing makes more sense (to me anyway) than just using the concentration. And, as it turns out, the relationship is remarkably close to linear, once you remove the most obvious long-term “natural variability” component.

  68. Re: julio (Apr 6 14:34),

    Ad hoc filtering of the data raises other questions about significance, though. The question remains whether long term cycles are deterministic or chaotic artifacts that only appear to be periodic. Koutsoyiannis would probably say that they are artifacts and that the time series would be stationary if you had enough data.

  69. Re: DeWitt Payne (Comment#40033)

    Ad hoc filtering of the data raises other questions about significance, though. The question remains whether long term cycles are deterministic or chaotic artifacts that only appear to be periodic. Koutsoyiannis would probably say that they are artifacts and that the time series would be stationary if you had enough data.

    Yes, except that, at a minimum, radiation physics requires a deterministic term approximately given (in degrees C) by 1.1 log2[CO2] (that’s a “log base 2”). That’s the bare minimum, with no feedbacks of any sort.

    I get a good fit to a similar expression with a coefficient of the order of 1.9 instead. I would personally be very happy if anybody could prove convincingly that the difference is just a particularly persistent random walk…
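    The two coefficients being compared here can be sketched side by side (my illustration, not julio's code; the 280 ppm baseline and the doubling case are assumptions):

```python
import math

def no_feedback_dt(c, c0=280.0):
    # Bare radiative response quoted above: 1.1 * log2(C/C0), in deg C
    return 1.1 * math.log2(c / c0)

def fitted_dt(c, c0=280.0):
    # julio's empirically fitted coefficient of ~1.9 in place of 1.1
    return 1.9 * math.log2(c / c0)

# For a doubling of CO2 these give 1.1 C and 1.9 C respectively.
dt_bare = no_feedback_dt(560.0)
dt_fit = fitted_dt(560.0)
```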

  70. lucia (Comment#39749)
    When I test, I test whether models match data. I subtract because..

    They think it’s
    T(time)=f(t) + e(t) where f(t) is highly non-linear.

    As I understand it, you are subtracting the model TM(time) from the observed TO(time), trying to minimize the result.

    Solve TM – TO = 0

    Assume that you do find an fm(t) for your model that exactly matches the observed fo(t). Then by subtraction you get:

    fo(t) + e(t) – fm(t) = 0

    Since we know that fo(t) = fm(t) if your model describes observation, we are left with solving

    e(t) = 0

    However, this is meaningless. e(t) is the unknown, the thing that is beyond our present knowledge and ability to measure.

    This would suggest the model accuracy is limited by natural variability, and by subtraction the model error at any one time can be as much as twice the natural variability.

    So, for example, if we ran the earth itself as a model multiple times, given the “same” initial conditions, how much variability would be seen in temperature? Surely temperatures would be different with only very minor changes in the earth itself, below the level of our present technology to measure or recreate. This appears to be at the heart of the confidence levels that can be placed on the observed data used to train a model.
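    The subtraction argument can be checked numerically. This is my sketch, not ge0050's, with an assumed linear f(t) and Gaussian e(t):

```python
import random

random.seed(0)
n = 100
f = [0.02 * t for t in range(n)]                # deterministic signal f(t)
e = [random.gauss(0.0, 0.1) for _ in range(n)]  # unknown noise e(t)
observed = [fi + ei for fi, ei in zip(f, e)]    # TO(t) = f(t) + e(t)
model = f                                       # assume fm(t) = fo(t) exactly

# The residual TO - TM recovers e(t); nothing forces it to zero.
residual = [o - m for o, m in zip(observed, model)]
```

    Even a perfect model leaves a residual equal to the natural variability e(t), which is the limit on accuracy being described here.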

  71. Re: ge0050 (Apr 20 20:08),

    The residuals, e(t), are assumed to be purely stochastic noise. There is no information content. The problem is that, other than noise introduced by random measurement and sampling error, most of the fluctuation observed in temperature is actually signal, not noise.

    The infinite parallel Earths thing was discussed to death during the discussion of the Douglass paper. Does the name ‘beaker’ ring a bell?

  72. most of the fluctuation observed in temperature is actually signal, not noise.

    Is that proven? The difference between deterministic and stochastic is often just a matter of resolution. It could be argued that both the observations and the models lack resolution. For example, isn’t “natural variability” one explanation put forward for models failing to predict the lack of significant warming in the past decade? If this “natural variability” is deterministic, why wasn’t it predicted by the models? This suggests there is a stochastic component which adversely affected training.

  73. Re: ge0050 (Apr 20 23:02),

    I get the impression that deterministic means that you can tell why something happened after the fact, but you can’t predict it. You may find a statistical model that fits a particular time series, but the model parameters may not be the same for the next series for a period of the same length. The standard error for any parameter may increase over longer periods rather than decrease.

    What is natural variability and how do you estimate it? I’m not at all convinced that the observed variability of the temperature over the last century or so is a good estimate of the variability one would observe between runs of a hypothetical parallel Earth over the same time period. It’s also not at all clear to me that anything less than a resolution equivalent to a parallel Earth is sufficient. If I read Jerry Browning correctly, increasing the resolution from the current 100km scale leads to more rather than fewer problems with the programs, i.e. requires more kludges to achieve stability.
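    The point about standard errors growing rather than shrinking with record length can be demonstrated with a small stdlib-Python experiment (my sketch, not from the comment): for a unit-root series, the spread of the sample mean across independent realizations grows with series length, instead of falling like 1/sqrt(n) as it would for a stationary series.

```python
import random
import statistics

def random_walk(n, rng):
    # Cumulative sum of Gaussian white noise: an I(1) series
    x, out = 0.0, []
    for _ in range(n):
        x += rng.gauss(0.0, 1.0)
        out.append(x)
    return out

rng = random.Random(0)
spread = {}
for n in (100, 1000):
    # Standard deviation of the sample mean across 200 independent walks
    means = [statistics.fmean(random_walk(n, rng)) for _ in range(200)]
    spread[n] = statistics.stdev(means)

# For a random walk this spread grows roughly like sqrt(n/3),
# so the longer record gives the *less* certain mean.
```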

  74. I’m not at all convinced that the observed variability of the temperature over the last century or so is a good estimate of the variability …

    I agree. Looking at the temperature record going back hundreds of millions of years, it looks to me like average temperature is bounded by approximately 15 C and 22 C. In spite of all the factors, including large changes in solar output, CO2 and land mass, something keeps temperature in that range. Further, these bounds are preferred states. Temperature is stable at the bounds, and varies rapidly between the bounds. This suggests that:

    1) temperature is not a distribution around a mean.
    2) climate variability is not constant over time.

  75. ge0050,

    ‘most of the fluctuation observed in temperature is actually signal, not noise.
    Is that proven?’

    A weed is a beautiful flower growing in the wrong place.
    Whether something is signal or noise depends entirely on what one wishes to observe. If that fluctuation is indeed our focus of interest, then it is, I would say, our desired ‘signal’.
    I’m not sure how that could be ‘proven’.

  76. >>>If I read Jerry Browning correctly, increasing the resolution from the current 100km scale leads to more rather than less problems with the programs, i.e. requires more kludges to achieve stability.<<<

    That seems to support what I said earlier and seems to contradict "most of the fluctuation observed in temperature is actually signal, not noise."

    For example, consider my earlier simple formula:

    fo(t) + e(t) – fm(t) = 0

    By increasing the resolution, we get a more exact solution, such that fo(t) = fm(t) more closely. This means that after subtraction e(t) = 0 more closely as well.

    However, since the solution is increasingly unstable, this implies that e(t) = 0 is contradicted as the resolution increases.

    This contradiction implies the assumption is false: most of the fluctuation observed in temperature is NOT actually signal, at least at this resolution.

  77. I’ve uploaded a very simple excel simulator to the public domain. All are welcome to give it a try and modify as you wish.

    http://rapidshare.com/files/378673183/tempSim.xls

    This uses a coin toss to simulate temperature change: H = +1, T = -1. The resultant temperature is the net sum of the H and T values.

    The results are graphed. Each time you press F9, the graph is recalculated. Ignore the scale, it is arbitrary. Look at the graphs produced each time you press F9.

    I believe you will find that this simulator produces plausible temperature records. This suggests to me that temperature could well be a random walk.

    Try this simulator.
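    The same coin-toss idea can be reproduced in a few lines of Python (a sketch of the spreadsheet's logic, not the spreadsheet itself):

```python
import random

def coin_toss_temps(n=1000, seed=None):
    """Cumulative sum of +/-1 coin tosses: H = +1, T = -1."""
    rng = random.Random(seed)
    temp, series = 0, []
    for _ in range(n):
        temp += 1 if rng.random() < 0.5 else -1
        series.append(temp)
    return series

walk = coin_toss_temps(1000, seed=1)
```

    Re-running with a different seed plays the role of pressing F9 in the spreadsheet: each call produces a fresh random walk.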

Comments are closed.