Questions for VS and Dave

Several readers, most recently Alex Heyworth, have been asking me to jump into the VS/Bart/Dave Stockwell/Tamino fray. It appears VS is making some sort of claims in comments at Bart’s blog. The claims and clarifications are spread out over 760 or so comments, with various interjections. I have been alerted that these claims all have something to do with a paper by Beenstock and Reingewertz (PDF). Dave appears to have written several tutorials [(1), (2) and (3)] to help us with terminology. Meanwhile, Tamino has performed a number of statistical analyses which may or may not address VS’s claims. (I cannot say, because I do not know precisely what VS’s claims actually are. This is due in large part to my not wishing to tease out the claim by reading 700 comments, but may be due to other factors.)

So, since 1) my readers are asking me to comment on the VS issue, which seems to have something to do with Beenstock and Reingewertz, 2) the claims in Beenstock and Reingewertz are dramatic (they claim “Although we reject AGW, we find that greenhouse gas forcings have a temporary effect on global temperature”), and 3) I don’t want to read 760+ comments to tease out VS’s claims (or anyone else’s), I’m going to step back and ask some very simple questions for clarification. I won’t proceed to make any conclusions about who is right or wrong until I’m sure I understand what is being claimed.

Oddly enough, my first questions will not be based on anything VS wrote, nor on anything specifically in Beenstock and Reingewertz (PDF).

I’m going to ask questions that will help me understand the terminology (I(0), I(1) etc.) and possibly uncover what is being claimed. These questions may seem mysterious to others who want to attack the full claim, but I want to relate the almost purely statistical argument in comments at Bart’s to at least one simple, concrete physical system. To ask my questions, I’m going to modify the “water pouring into a tank” example David used to illustrate the concept of co-integration.

Set up of Tank Problem

My main goal for this tank problem is to show a physical problem where I think:

  1. The flow rate into the tank Qi is I(0)
  2. The level of water in the tank is I(∞) (if I read people who say the definition is based on how often I need to difference to make the process stationary) or I(1) (if I read this wikipedia page).
  3. The orders do not match: that is, 0 ≠ ∞.
  4. The level of water in the tank is dictated by the flow rate.

To do this, I will describe a simplified problem, and provide the solution. We will see that the flow rate totally dictates the level of water in the tank.

I’m hoping that by using this simple problem to present questions, someone can better explain how VS’s claims relate to a simple physical problem, and we can also use this simple problem (or some extension) to better understand VS’s claims, particularly in the context of something physical.

I will now provide some details for the explanatory problem. They aren’t necessarily important to the main argument, but I think I must show them. Many of you can just skip the text between the horizontal rules.


For the purpose of asking questions, I have added a drain to the bottom of the tank, and I have written down a simple approximation often used to estimate the rate of change of the water level in the tank. Below is a schematic of my tank system:

Figure 1: Tank being filled and drained.

In the tank above, water flows in at a rate, Qi; in principle, this can vary with time, but for today’s post, I will assume the tank level was previously held at some constant value. The tank has a drain pipe which contains a long porous plug whose characteristics were designed to ensure that flow out of the tank varies linearly with pressure drop across the plug. Consequently, the flow out of the tank, Qo, varies linearly with the level of the tank contents, i.e. Qo = βY, where β has dimensions length²/time, and its magnitude is positive definite and a property of the plug geometry.

Assume at time t=0 a process operator changed the setting on the valve, to increase the flow rate into the tank. Because the flow rate Qi increased we expect the water level, Y, to increase with time, but eventually stabilize at some level, which I will denote Ye, with “e” representing “equilibrium”.

Applying a mass balance to the tank we find the water level in the tank will vary as

(1) At dY/dt = Qi - Qo = Qi - βY

where At is the cross-sectional area of the tank.

It’s possible to show the equilibrium level of water, Ye, for any flow rate Qi by setting the time derivative of the level to zero (dY/dt = 0):

(2) Ye = Qi/β

Defining the difference y = Y - Ye, equation (1) becomes:
(3) At dy/dt = -βy

whose solution is
(4) y = C exp(-[β/At] t)
where C is a constant that may be adjusted to match the initial condition for y at time t=0; we will denote this value as yi-1. For our problem as stated, yi-1 = (Qi-1 - Qi)/β, where Qi-1 is the flow rate before the operator tweaked the valve. Note that if the new flow rate is higher than the previous one, yi-1 is negative. This means at time t=0 the water level is below its equilibrium level; we should expect it to rise.

The water level will rise, and y will rise toward zero exponentially as:

(4) y = yi-1 exp(-[β/At] t)

The level of water in the tank is then

(5) Y = {(Qi-1 - Qi)/β} exp(-[β/At] t) + Qi/β

The properties of this solution for the water level (i.e. eqn 5) are: 1) as time grows very large, the water level reaches a steady value equal to Qi/β; 2) at short times, the water level relaxes exponentially toward that steady value.
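As a sanity check on equation (5), here is a short numerical sketch (the parameter values for β, At and the two flow rates are arbitrary choices, not anything from a real system): it integrates the mass balance (1) directly and compares the result with the closed-form solution.

```python
import math

# Arbitrary illustrative parameters (nothing physical implied by the values)
beta, A_t = 0.5, 2.0        # plug coefficient [length^2/time], tank area
Q_prev, Q_new = 1.0, 3.0    # inflow before and after the valve change

Y = Q_prev / beta           # start at the old equilibrium level
dt, T = 0.001, 30.0
for _ in range(round(T / dt)):
    Y += dt * (Q_new - beta * Y) / A_t   # Euler step of At dY/dt = Qi - beta*Y

# Closed-form solution, equation (5):
Y_exact = ((Q_prev - Q_new) / beta) * math.exp(-(beta / A_t) * T) + Q_new / beta

print(Y, Y_exact, Q_new / beta)   # both approach the new equilibrium Qi/beta
```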

We could, by the way, add noise to both Q and Y if we wished, and make this more complicated. (I imagine that we will eventually make this more complicated, since I’m hoping this simple problem will help VS the statistician explain how his method relates to something physical.)


In the context of the VS article, we have a specific process where:

  1. The water level Y is absolutely, totally and completely dictated by variations in Q. That is, the absolute level of Q, and variations in Q, cause variations in Y. (This would be so even if we varied Q with time or added noise to either Q or our observations of Y.)
  2. If I understand the terminology, I think Qi (“the cause”) is I(0). That is: it is stationary without taking differences.
  3. I think Y is I(∞) (or I(1)?). That is, if we added noise to Y and studied it, strictly speaking, we would need to take an infinite number of differences to make Y stationary. (Or, if I read the wikipedia page, it’s I(1) because I can write yi = ρ yi-1 + εi.)
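On point 3: if we sample the anomaly solution (4) at a fixed interval Δt, each sample is exactly ρ times the previous one, with ρ = exp(-[β/At]Δt) < 1. So the drained tank gives a near-unit root for small Δt, not a true unit root. A sketch with arbitrary parameter values:

```python
import math

beta, A_t, dt = 0.5, 2.0, 0.1   # arbitrary illustrative values
rho = math.exp(-(beta / A_t) * dt)

# Anomaly y = Y - Ye decays per equation (4): y(t) = y0 * exp(-(beta/A_t) * t)
y0 = -4.0
y = [y0 * math.exp(-(beta / A_t) * n * dt) for n in range(50)]

# Sampled at interval dt, each value is exactly rho times the previous one
ratios = [y[n] / y[n - 1] for n in range(1, 50)]
print(rho, ratios[0])   # identical: an AR(1) recursion with rho < 1
```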

The Questions

So, for some questions:

  1. Is anyone claiming that if two processes are I(n) and I(m) with m ≠ n, then one cannot cause the other?
  2. Is anyone claiming that if Q ’causes’ Y, then the order for Q must be lower than for Y? (That is, Q = I(0) can be the cause of Y = I(∞), but if Q were I(2) it could not cause Y = I(2).)
  3. Is anyone claiming that if Y is I(2) or greater (which it is), we can’t estimate uncertainty in a trend? (I can show cases where Q is I(1) where, for all practical purposes, we could fit trends.)

If no one is claiming any of these things, can anyone tell me what is being claimed about the relationship between the properties of the causal variable Q to its effect Y?

I apologize to anyone who thinks these questions are too elementary, but I do want to understand the main claims associated with all the discussion of determining whether processes are I(n). A certain fraction of the discussion has revolved around whether a particular statistical procedure was applied correctly to determine the magnitude of “n”, or whether a time series has a unit root, or anything of the sort. However, currently, I cannot figure out how the determination of the “n” for any series, or possession of a unit root, relates to any honest-to-goodness claim touching on AGW. I’m hoping that if I get simple answers using the non-tendentious bucket problem, I might see the light and understand what people believe they have shown and what they believe their proofs mean.

61 thoughts on “Questions for VS and Dave”

  1. The Bart VS thread is one of the most exciting threads this year – as Bishop Hill said Statistics was not meant to be this fun.

    I hope you get a chance to read through it.

    In the meantime I am going to do a cartoon of you Lucia – your setting up of toy planets is inspiring (as is your blog).

    It will appear on http://www.cartoonsbyjosh.com but I will send you a preview.

  2. Thanks Josh,
    VS’s thread is interesting. But I think VS needs to write a few blog posts so he can interlace figures. A blog post would also permit better formatting, and let him isolate his main points from questions and answers.

    Worse, VS’s first comments are deleted. So, it’s very difficult to figure out what he actually claims.

    A cartoon of me should be fun! But…. do you even know what I look like?

  3. Very interesting to see the angle you take on this. I’ve been mostly thinking about it in physical terms; VS has argued only in statistical terms; perhaps your questions/argument can shed some light on combining the two.

  4. Bart–
    I actually always try to use statistics in light of what physics tells us. No matter how fancy a statistical test might get (and I actually don’t do fancy), I prefer to specify as “null” something we actually expect based on a physical model.
    I’m also trying to post my questions here so I can keep track of them. Your comment thread is just too long!

  5. Agreed, it got a bit out of hand (quite literally).

    Which comments that are deleted are you referring to? AFAIR, I haven’t deleted any comments by VS.

  6. Here is a list of cites VS provides here:
    http://ourchangingclimate.wordpress.com/2010/03/01/global-average-temperature-increase-giss-hadcru-and-ncdc-compared/#comment-1444

    ** Woodward and Grey (1995)
    – reject I(0), don’t test for I(1)
    ** Kaufmann and Stern (1999)
    – confirm I(1) for all series
    ** Kaufmann and Stern (2000)
    – ADF and KPSS tests indicate I(1) for NHEM, SHEM and GLOB
    – PP and SP tests indicate I(0) for NHEM, SHEM and GLOB
    ** Kaufmann and Stern (2002)
    – confirm I(1) for NHEM
    – find I(0) for SHEM (weak rejection of H0)
    ** Beenstock and Reingewertz (2009)
    – confirm I(1)

    Here is a good VS summary post – with a cite for the ADF test:

    http://ourchangingclimate.wordpress.com/2010/03/01/global-average-temperature-increase-giss-hadcru-and-ncdc-compared/#comment-1524

    In the 1524 post linked just above – VS also says:

    (i.e. the GHG forcings are I(2) and temperatures are I(1) so they cannot be cointegrated, as this makes them asymptotically independent. They, therefore have to be related via more general methods such as polynomial cointegration).

    So I gather that because the temperature and the CO2 forcing are of a different order they cannot be cointegrated.

    I don’t think VS is saying you cannot do statistics on them – only that OLS is not the proper tool – unless the two variables are of the same order.

    I look forward to reading your thoughts on this cointegration stuff.

  7. Bart–
    I’ve also found something I actually mostly disagree with!

    We can therefore never ‘accept’ a null hypothesis. We can only reject it, or fail to find sufficient evidence to reject it.

    Technically what VS says is true. But as a practical matter, if I claim I can reject a hypothesis as false, I think that I can equally well accept a null hypothesis as true. That is: I consider “reject”, “accept” and “fail to reject” to be possible outcomes of a test.

    I will pick a significance level: α=5%. I figure out if I “reject” or “fail to reject” the null. If I reject, that’s a meaningful finding, in the sense, that either the result is a false positive that happens 1 in 20 times, or the null really is false.

    I can also “fail to reject”. If I fail to reject, what does this mean? Well, I might say “ho hum”. I could throw up my hands and decree one never confirms a null. But that leaves me in an odd situation if I really think the null is true and I want to convince someone.

    So, what I do is I set up an alternate hypothesis, that someone else might think is true. So, for example: in this comment at your blog, I would look up the trend expected by the models. Then, I would determine the power to reject a trend of m=0, if the models were right about the trend. Then, if I found I “failed to reject” m=0, with a power of 95%, I would say I have pretty good evidence that m=0 or is pretty close given the levels people who believe in warming expect to occur.
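    Here is a sketch of that power calculation with made-up numbers (a 0.02 C/yr “model” trend over 120 months, and white noise with sd 0.25 C; real residuals are autocorrelated, so this overstates the true power): simulate worlds where the models are right and count how often the m = 0 null is rejected.

```python
import random, statistics

random.seed(4)

# Made-up illustrative numbers: 120 months of data, a "model" trend of
# 0.02 C/yr, white noise with sd 0.25 C. Real residuals are autocorrelated,
# so this overstates the true power.
n, slope_true, sigma = 120, 0.02 / 12, 0.25
t = list(range(n))
tbar = statistics.mean(t)
sxx = sum((ti - tbar) ** 2 for ti in t)

def rejects_zero_trend():
    # Simulate one "world" where the model trend is true, then apply the
    # naive 5% two-sided OLS test of the null m = 0.
    y = [slope_true * ti + random.gauss(0, sigma) for ti in t]
    ybar = statistics.mean(y)
    slope = sum((ti - tbar) * (yi - ybar) for ti, yi in zip(t, y)) / sxx
    resid = [yi - ybar - slope * (ti - tbar) for ti, yi in zip(t, y)]
    se = (sum(r * r for r in resid) / (n - 2) / sxx) ** 0.5
    return abs(slope / se) > 1.98

power = sum(rejects_zero_trend() for _ in range(1000)) / 1000
print(power)   # how often a true model-sized trend rejects m = 0
```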

    It may be that VS would agree with me on this, and the issue would have something to do with semantics.

  8. Re: RickA (Mar 25 12:57),

    It’s going to be a while before I can get to the co-integration stuff. For now, all my questions are much more elementary. As in: What’s he really treating as null? Why? Before we even get to cointegration, what is the bulleted outline of his argument? (After that, we can get to the evidence to support the claims.)

    One of the difficulties with everything being in comments is I see lots of details, but it’s hard to find the actual synopsis. Which is ok– but I’m hoping someone who thinks what VS did was right can tell me. After all, someone who thinks what he did is right ought to be able to give a two-paragraph high-level summary abstracting the argument.

  9. Lucia:

    Also writing as VS metrics – here:

    http://ourchangingclimate.wordpress.com/2010/03/01/global-average-temperature-increase-giss-hadcru-and-ncdc-compared/#comment-1643

    VS says:

    IMPORTANT:

    **I’m not ‘disproving’ AGWH here.
    **I’m not claiming that temperatures are a random walk.
    **I’m not ‘denying’ the laws of physics.

    *****These are all strawmen, posted by Tamino’s (admittedly statistically illiterate) ‘fan base’ here, in an effort to dillute my argument, and make my contributions unreadable.

    All that I am doing is establishing the presence of a unit root in the instrumental record. The presence of a unit root renders regular OLS inference invalid. Put differently, you cannot simply calculate confidence intervals assuming a trend-stationary process, because the temperature series is shown to be non-stationary (i.e. contains a unit root).

    Alex gives the technical reason why OLS inference is invalid in the presence of a unit root. This concerns non-singularity/finiteness of lim n->Inf, Qn=(1/n)*X’X matrix (‘consistency’, or ‘raakheid’ in Dutch, of the t and F based tests demands Qn to be finite/non-singular in its limit). In case of unit root(s) somewhere in X, Qn is infinite in its limit. This is a violation of one of the assumptions of OLS-based testing.
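    The practical content of the quoted claim, that OLS trend inference misbehaves when a series has a unit root, is easy to demonstrate by Monte Carlo. This sketch (my construction, not VS’s) fits a naive OLS trend to pure random walks that contain no deterministic trend at all, and counts how often the usual t-test “finds” a significant trend at the 5% level:

```python
import random, statistics

random.seed(0)
n, trials, hits = 100, 500, 0
for _ in range(trials):
    # A pure random walk: it has a unit root and NO deterministic trend
    y, level = [], 0.0
    for _ in range(n):
        level += random.gauss(0, 1)
        y.append(level)
    t = list(range(n))
    tbar, ybar = statistics.mean(t), statistics.mean(y)
    sxx = sum((ti - tbar) ** 2 for ti in t)
    slope = sum((ti - tbar) * (yi - ybar) for ti, yi in zip(t, y)) / sxx
    resid = [yi - ybar - slope * (ti - tbar) for ti, yi in zip(t, y)]
    se = (sum(r * r for r in resid) / (n - 2) / sxx) ** 0.5
    if abs(slope / se) > 1.98:     # naive 5% two-sided OLS trend test
        hits += 1

print(hits / trials)   # far above the nominal 0.05 false-positive rate
```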

  10. Lucia,

    Thanks for starting with a physical model.

    I think you need to look at a physical model where you have the kind of positive feedback you would see in the climate. But for now filling a tank will do.

  11. Hi Lucia,
    OK, where to start. It’s not I(1) because it can be written yi = ρ yi-1 + εi, but because ρ = 1. Let’s backtrack, sort of.

    Two variables are correlated if the residuals of a linear combination are small (in some sense). Two variables are cointegrated if the residuals of a linear combination are stationary (in an AR sense). That is, if y is +ve then the next change will likely be negative; if y is -ve then the next change will likely be positive (plus noise). That is I(0).

    It is just a different type of condition on the residuals. In the long run a stationary variable will ‘come back to zero’.

    A normal variable is stationary I(0) — the change in y is negatively correlated with y.

    OK, Q1 – can I(n) cause I(m) with n ≠ m? As in the bucket example, I(0) may ’cause’ I(1), but while I(0) hangs around zero, I(1) could wander off to infinity. They don’t cointegrate, and a correlation of trends is deemed to be ‘spurious’. To attempt to develop a model with a linear combination of them is a system misidentification.

    This is why B&R find the derivative of rfCO2 appropriate in a linear model, and not rfCO2 itself. This makes a big difference projected forward, as the effects of CO2 increase are short-lived. CO2 must keep increasing exponentially for temperature to increase linearly. If CO2 only increases linearly, then temperature is constant (is this what’s happening since 1998?).

    Your example with the leakage is perhaps a ‘fractional root’ or ‘near unit root’ in that changes have a long memory, but not infinite memory. Temperature could be like that (too close to be sure).

    Over a short period there would be a correlation between I(0) and I(1) but over the long run not. That is what this is about. Cointegration is usually about two I(1) variables sharing a trend, but its not limited to that.
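    The “wander off to infinity” point can be illustrated with the no-drain tank: let Q be stationary noise (I(0)) and Y its running sum (I(1)). The variance of Q stays put, while the variance of Y grows roughly linearly in time. A sketch with arbitrary values:

```python
import random, statistics

random.seed(1)
trials, n = 2000, 400

# Q(t) is stationary I(0) noise; Y(t) is its running sum (the no-drain tank)
Y_early, Y_late = [], []
for _ in range(trials):
    s = 0.0
    for t in range(n):
        s += random.gauss(0, 1)      # Q: hangs around zero
        if t == 49:
            Y_early.append(s)        # Y after 50 steps
    Y_late.append(s)                 # Y after 400 steps

# For an I(1) series, Var(Y(t)) grows like t: Y "wanders off"
print(statistics.pvariance(Y_early))   # roughly 50
print(statistics.pvariance(Y_late))    # roughly 400
```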

  12. Lucia:

    Let me try to summarize.

    I believe the VS null is:

    “In particular, the H0 (null-hypothesis) of the ADF test is that the series [GISS] contains a unit root, while the Ha (alternative hypothesis) is in fact stationary (and in this case ‘could’ contain a deterministic trend).

    VS finds GISS does contain a unit root (meaning it has a stochastic trend – not deterministic).

    This then means that OLS cannot be used – but polynomial cointegration must be used instead.

    I hope I did VS justice with this summary (I am an electrical engineer/patent attorney – not a statistician).

  13. Rick A–
    Is that really how he’s asking his questions?

    First, taken literally, if “contains unit root” is the null, then “fail to reject” the unit root is not the same as showing something contains a unit root. It only means he hasn’t proven it does not contain a unit root.

    The question then becomes: Why make “contains unit root” (i.e. is not stationary) the null in the first place? Why not make “does not contain unit root” (i.e. is stationary) the null?

    And if he is going to set up “contains unit root” as the null, has he given any information about the statistical power of his test relative to some reasonable alternate hypothesis. (Like, the root some value? If that makes sense.)

    Basically, I guess I want to know rather precisely what his “null” is. If that “null” is not physically realistic, I want to know the statistical power of his test relative to physically realistic null hypotheses. If the statistical power of his test is low, I will keep believing something physically realistic before I will accept a null that makes no physical sense based on a “fail to reject” from a test that has little statistical power.

  14. Rick A: It’s not that the tests are restricted to that setup. The KPSS test works the way you want. You have to understand the tests, I think, to talk about them in a sensible way, as always. The only physically unrealistic part is the asymptotic behavior, i.e. temperature doesn’t go to infinity (I hope). You could say that, with the finite data we have, it is indistinguishable from unit root behavior, and be correct.

  15. David,
    Thanks.

    We are going to have to do babysteps here. We can talk about consequences later. I just want to be sure of the terminology.

    1) In my bucket problem with the drain, and Q=constant, is Q I(0)?
    2) In my bucket problem, with the drain,and Q=constant is Y I(0)? Or is it I(1)?
    3) In your bucket problem with no drain and Q=constant, is Q I(0)?
    4) In your bucket problem with no drain, is Y I(1)?

  16. Hiya Lucia,

    I think eventually one of two things will happen: Either someone will scrape the relevant VS explanations and make one post, or you will be sucked into the vortex of the 700 comments, swearing, kicking and screaming. I hope it’s the former.

    VS used successive tests for unit root, but I don’t think that resolves your question…

  17. Tom–
    The outcome of the tests for unit root absolutely does not resolve my questions. My questions have to do with what the heck his null hypothesis is, what the structure of the main argument is, and what the actual conclusions are.

    What I am seeing is a bunch of tests.

  18. Lucia, this was a great start, but I think it’s going to leave a lot of people in the dark who don’t know the lingo involved.

    I’m just going to add a few details that I think are pertinent; it would be great if you or somebody volunteered to write a tutorial on some of this, though.

    When VS is referring to a unity root, he is talking about e.g., this equation:

    y(n) = a(1) y(n-1) + a(2) y(n-2) + … a(p) y(n-p) + noise

    where a(1), a(2) … a(p) are constants. Basically, you compute the characteristic equation for the impulse response y(0) = m, y(n) = 0 for n < 0, to this autoregressive process, which gives you

    m^p – a(1) m^(p-1) – a(2) m^(p-2) – … – a(p) = 0.

    A solution m=1 is referred to as a unity root. Suppose there are "r" roots m=1 of this equation; then the multiplicity "r" of the m=1 root is referred to as the order of integration, and is specified as I(r).

    Hopefully that helps with the notation a bit. Lucia or anybody else please feel free to add some tutorial material. This stuff isn’t that complicated, and there are some really interesting physics questions that are getting suffocated by notation.
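    As a concrete instance of the recipe above (the AR(2) coefficients are made up purely for illustration): with a(1) = 1.5 and a(2) = -0.5, the characteristic equation m^2 - 1.5 m + 0.5 = 0 factors as (m - 1)(m - 0.5), so there is one root at m = 1 and the process is I(1).

```python
import math

# Hypothetical AR(2), coefficients chosen purely for illustration:
#   y(n) = 1.5 y(n-1) - 0.5 y(n-2) + noise
# Characteristic equation (per the recipe above): m^2 - 1.5 m + 0.5 = 0
a1, a2 = 1.5, -0.5
disc = math.sqrt(a1 * a1 + 4 * a2)       # discriminant of m^2 - a1*m - a2
roots = [(a1 + disc) / 2, (a1 - disc) / 2]
print(roots)   # [1.0, 0.5]: one root at m=1, so the process is I(1)
```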

    One of these is whether a particular physical system admits to any unit roots. There are some basic theorems related to this.

    It is made less transparent by the fact that these “roots” are actually the positions of the poles of the digital filter representation of an underlying analog process. When you look at the system, you have to start with constraints like causality and conservation of energy, which apply to the Green’s function for the physical system’s PDE.

    You then have to map these into the digital domain to see what these constraints mean for e.g. an autoregressive model.

    Short version of this, I agree with Lucia you do need to make connection with the physical system.

    One other issue in applying statistics is statistical quantities are not pure numbers… they represent distributions (so a “mean” is not “statistically meaningful” if it isn’t accompanied by a standard deviation) and even statements like the observation of a unity root need to come with an associated confidence interval. I haven’t seen any uncertainty statement for this claim yet.

    Till an uncertainty analysis has been done, in my book, VS hasn’t even done a “purely statistical analysis” let alone done the necessary testing for physicality.

  19. Carrick

    A solution m=1 is referred to as a unity root. Suppose there are “r” roots m=1 of this equation; then the multiplicity “r” of the m=1 root is referred to as the order of integration, and is specified as I(r).

    Thanks.

  20. Carrick

    It is made less transparent by the fact that these “roots” are actually the positions of the poles of the digital filter representation of an underlying analog process. When you look at the system, you have to start with constraints like causality and conservation of energy, which apply to the Green’s function for the physical system’s PDE.

    You then have to map these into the digital domain to see what these constraints mean for e.g. an autoregressive model.

    Coming from continuum mechanics, I tend to think analog first. 🙂

  21. Having made that comment, it’s easy to rewrite Lucia’s model in the digital domain:

    A_t dY/dt = Qi-Qo

    Finite difference method: dY/dt -> (Y(n) – Y(n-1))/dt,

    Rearrange terms:

    Y(n) = Y(n-1) + (Qi-Qo) dt/A_t

    This actually is a special version of:

    Y(n) = Y(n-1) + X(n) * dt/A_t

    where in this case X(n) = Qi-Qo is a constant.

    Even this simple system doesn’t reduce to a purely AR process. Is that important?

  22. Tom Fuller:

    And my recollection of the 700 posts is that he answers those questions, but in bits and pieces throughout the thread.

    Well if somebody, VS or one of his knights, wants to go through those 700 comments and try and summarize responses that would be great.

    I certainly don’t have time to do that.

  23. X(n) = Qi-Qo
    It is AR(1). Instantaneously, Qo(t) is a function of Y(t)

    If Qi is a constant during the time step “dt”, we can find the exact value for ρ, the lag-1 autocorrelation, and show this is AR(1).

    (I picked this problem just in case we need to move forward with AR(1) noise.)

  24. Lucia:

    It is AR(1). Instantaneously, Qo(t) is a function of Y(t)

    In my lingo, because X(n) ≠ 0, it’s an inhomogeneous AR process. I guess my question was does the inhomogeneity matter here?

    (This is also a special case where the inhomogeneity is a constant, that isn’t going to be something that happens very often in physical systems of course.)

  25. Re: Carrick (Mar 25 14:58),
    Create an anomaly, that is go back to equation (4). Now it’s homogeneous.

    Then, use the substitution I gave you for Qi. AR(1) and X goes away.

    (If you want noise, just make Qi be a constant plus noise. This is the classic analog process that becomes AR(1).)
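    Here is a sketch of that “constant plus noise” setup for the drained tank (parameter values arbitrary). Discretizing (1) with an Euler step gives an anomaly recursion y(n) = ρ y(n-1) + noise with ρ = 1 - βΔt/At, and the simulated lag-1 autocorrelation matches:

```python
import random, statistics

random.seed(2)
beta, A_t, dt = 0.5, 2.0, 0.1      # arbitrary illustrative values
rho = 1 - beta * dt / A_t          # AR(1) coefficient of the Euler scheme

Q_mean, n = 3.0, 200000
Y = Q_mean / beta                  # start at the equilibrium level Ye = Qi/beta
ys = []
for _ in range(n):
    Qi = Q_mean + random.gauss(0, 1)     # constant inflow plus white noise
    Y += dt * (Qi - beta * Y) / A_t      # discretized mass balance, eq (1)
    ys.append(Y - Q_mean / beta)         # anomaly y = Y - Ye

ybar = statistics.mean(ys)
num = sum((ys[i] - ybar) * (ys[i - 1] - ybar) for i in range(1, n))
den = sum((v - ybar) ** 2 for v in ys)
print(num / den, rho)   # estimated lag-1 autocorrelation vs the exact value
```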

  26. Lucia:

    Create an anomaly, that is go back to equation (4). Now it’s homogeneous.

    Yeah I realized that, hence the “special case” comment. But this only happens because you have a constant forcing.

    An equally interesting problem is

    AtdY/dt = Qi * (1 + cos(2 pi f t))/2 – Qo

    (Raised cosine driving. This is more like the type of forcings of a physical system that I would see, or be likely to analyze.)

    Climate has very complicated forcings that can’t be reduced using anomaly methods. I was just curious what happens with the econometric assumptions when that happens.

  27. But this only happens because you have a constant forcing.

    I think if Qo is a Wiener process, this is AR(1).

    If Qo has some weird deterministic behavior, then we get something else. But that’s why I want to figure out what VS is stating as nulls.

    In engineered systems, we often have a good idea of the drivers (in this case, we might know a lot about both the deterministic and stochastic components of Q). I think in economics they often don’t, and one reason someone might set something like “has a unit root” as the null is to avoid having people go on fishing expeditions to find trends when they have no particular argument for why a trend should or should not exist.

    So… I want to see the power of the test VS used to reject the trend early on in his analysis.

  28. I went through the VS posts – and think I found the one with the complete explanation of his analysis:

    He lays out his null and all his statistical tests.

    Here is the link:

    http://ourchangingclimate.wordpress.com/2010/03/01/global-average-temperature-increase-giss-hadcru-and-ncdc-compared/#comment-1524

    Here is the pasted version:

    I will show all the steps taken in the process of establishing the I(1) property of temperature series. I will list all test results, motivations, and decisions. This way Alex, or anybody else for that matter, will be able to inspect them.
    I will use the GISS-NASA combined surface and sea temperature record that I downloaded from their website. I will resort to this series, because everybody seems to be using it in this discussion. However, I have to stress that more or less the same results are established using HADCRUT or CRUTEM3 (or the GISS-NASA land only) temperature records.
    ————————–
    TESTING THE I(1) PROPERTY
    ————————–
    We start by examining the GISS-NASA temperature series 1880-2008 (GISS-all). We want to see whether the series contains a unit root. As mentioned here, and on various other places, the presence of a unit root in a time series invalidates regular statistical inference (including OLS with AR terms) because the series is no longer stationary (this is a necessary condition).
    Definition stationarity (from wiki):
    http://en.wikipedia.org/wiki/Stationary_process
    “In the mathematical sciences, a stationary process (or strict(ly) stationary process or strong(ly) stationary process) is a stochastic process whose joint probability distribution does not change when shifted in time or space. As a result, parameters such as the mean and variance, if they exist, also do not change over time or position.”
    ————————–
    AUGMENTED DICKEY FULLER TESTING
    ————————–
    I start with applying the Augmented Dickey Fuller test. The definition (and purpose) of the ADF is given here, again on wikipedia:
    http://en.wikipedia.org/wiki/Augmented_Dickey%E2%80%93Fuller_test
    I stress this part of the definition:
    “By including lags of the order p the ADF formulation allows for higher-order autoregressive processes. This means that the lag length p has to be determined when applying the test. One possible approach is to test down from high orders and examine the t-values on coefficients. An alternative approach is to examine information criteria such as the Akaike information criterion, Bayesian information criterion or the Hannan-Quinn information criterion.”
    The ADF can be applied in different forms, depending on how you want your alternative hypothesis to look like. The null hypothesis is the presence of a unit root. The alternative hypothesis (determining the specification of the test-equation) can be:
    (1) no intercept
    (2) intercept
    (3) intercept and trend
    I will focus on (3) here, because this is the most ‘restrictive’ case and because I have been accused of ‘ignoring’ this alternative hypothesis when arriving to my test results. It also corresponds to what has been posted here and elsewhere as the probable alternative hypothesis. Do note however, that the results given below are *much* more conclusive in cases (1) and (2).
    I will furthermore use all the information criteria (IC) available to me to arrive at the required lag length (‘p’ in quote above, I will refer to it as ‘LL’ below) in the ADF test equation.
    Hypothesis specification:
    H0: GISS-all contains a unit root
    Ha: GISS-all is trend stationary (testing against case 3)
    NOTE: All residuals of test equations have been tested for normality via the Jarque-Bera test for normality (the p-value is reported as JB below), and in all cases the null hypothesis of normality is not rejected. The ADF test, under the assumption of normality of residuals, is then exact. For a definition of this normality test, see here:
    http://en.wikipedia.org/wiki/Jarque_bera
    ADF test results:
    IC: Akaike Info Criterion (AIC)
    LL: 3
    p-value: 0.3971
    Conclusion: presence of unit root not rejected
    JB: 0.393560
    IC: Schwartz / Bayesian Info Criterion (BIC, used by a critic of mine)
    LL: 0
    p-value: 0.0000
    Conclusion: presence of unit root rejected (I will get to this below, bear with me)
    JB: 0.202869
    IC: Hannan-Quinn Info Criterion (HQ)
    LL: 3
    p-value: 0.3971
    Conclusion: presence of unit root not rejected
    JB: 0.393560
    IC: Modified Akaike
    LL: 6
    p-value: 0.8619
    Conclusion: presence of unit root not rejected
    JB: 0.370261
    IC: Modified Schwartz
    LL: 6
    p-value: 0.8619
    Conclusion: presence of unit root not rejected
    JB: 0.370261
    IC: Modified HQ
    LL: 6
    p-value: 0.8619
    Conclusion: presence of unit root not rejected
    JB: 0.370261
    Now, we see that using the ‘BIC’ one arrives at a deviant number of lags (namely 0). This warrants further inspection. Note that the purpose of the lag length is to eliminate all residual autocorrelation, so that the ADF tests can function properly.
    In order to inspect this issue, we compare the residuals of the test equations with 0, 3 and 6 lags respectively. Here I report the Q statistics for the first 10 lags in the residual series. The Q statistic is used to determine the presence of residual autocorrelation. A more detailed explanation is given here:
    http://en.wikipedia.org/wiki/Ljung%E2%80%93Box_test
    I quote, for those with no time to ‘click’ ;), the following:
    “The Ljung–Box test is a type of statistical test of whether any of a group of autocorrelations of a time series are different from zero. Instead of testing randomness at each distinct lag, it tests the “overall” randomness based on a number of lags, and is therefore a portmanteau test.”
    0 Lags in test equation:
    0.447
    0.683
    0.858
    0.102
    0.161
    0.159
    0.215
    0.168
    0.178
    0.081
    3 Lags in test equation:
    0.862
    0.885
    0.953
    0.983
    0.912
    0.950
    0.938
    0.837
    0.854
    0.731
    6 Lags in test equation:
    0.939
    0.997
    1.000
    0.999
    0.998
    1.000
    1.000
    0.999
    0.999
    0.989
    So, once we use the BIC to determine the lag length, our residuals are very messy (i.e. borderline significances; see the first sequence of Ljung-Box Q-statistics). Higher numbers of lags, however, especially 6, successfully eliminate all traces of residual autocorrelation. Note also that the condition that the residuals of the test equation are normal is least solid when using the BIC for lag selection.
    Both conditions are necessary for the ADF to function properly.
    By using statistical diagnostic measures, we can therefore safely disregard the deviant lag length arrived at via the BIC, and use one of the other measures (so either AIC or HQ, or the modified versions of all three, so basically any IC except the BIC/SIC).
    Our ADF-based inference is drawing to a close. We now need to test the I(1) versus I(2) property of the GISS-all series, in order to make sure temperature is not I(2). Again, we perform the tests, now on the first difference of GISS-all, or D(GISS-all).
    For the sake of readability (and because we still have a bunch of other tests to do) I will only report the p-values of the test using the remaining 5 ‘untainted’ ICs. The IC-implied lag length will again be reported as ‘LL’.
    VERY IMPORTANT NOTE: The alternative hypothesis for the first-difference series will now be intercept (or drift) instead of intercept and trend. So this is case (2). The reason is that an intercept in the first differences immediately implies a trend in the level series. Again, as above, I am giving the ‘deterministic trend hypothesis’ the benefit of the doubt (contrary to what has been claimed elsewhere).
    ADF test results, for D(GISS-all):
    IC: Akaike Info Criterion (AIC)
    p-value: 0.0000
    LL: 4
    Conclusion: presence of unit root rejected
    IC: Hannan-Quinn Info Criterion (HQ)
    p-value: 0.0000
    LL: 2
    Conclusion: presence of unit root rejected
    IC: Modified Akaike
    p-value: 0.0000
    LL: 0
    Conclusion: presence of unit root rejected
    IC: Modified Schwarz
    p-value: 0.0000
    LL: 0
    Conclusion: presence of unit root rejected
    IC: Modified HQ
    p-value: 0.0000
    LL: 0
    Conclusion: presence of unit root rejected
    So, using the ADF, we do not reject the presence of a unit root in the level series. However, once we difference the series, the unit root is rejected in all instances. We therefore conclude that the ADF test implies that GISS-all is in fact I(1).
    Now, let’s turn to other tests.
    ————————–
    KWIATKOWSKI-PHILLIPS-SCHMIDT-SHIN TESTING
    ————————–
    The careful reader has probably noted that the null hypothesis of the ADF test is that the series actually contains a unit root. One might argue that, due to the low number of observations in the series, or simply bad luck, this test fails to reject an untrue null hypothesis, namely that of a unit root, in the level series. In other words, there is the possibility that we are making a so-called Type II error.
    We can however test for the presence of a unit root, by assuming under the null hypothesis that the series is actually stationary. The presence of a unit root is then the alternative hypothesis. In this case we ‘flip’ our Type I and Type II errors (I’m being very informal here, the analogy serves to help you guys ‘visualize’ what we are doing here).
    To do that, we use a non-parametric test, the KPSS, which does exactly that. Namely, it takes the null hypothesis as being stationarity around the trend, and the alternative hypothesis is the presence of a unit root.
    See also: http://en.wikipedia.org/wiki/KPSS_tests
    “In statistics, KPSS tests (Kwiatkowski-Phillips-Schmidt-Shin tests) are used for testing a null hypothesis that an observable time series is stationary around a deterministic trend.”
    IMPORTANT NOTE: The KPSS test statistics’ critical values are asymptotic. Put differently, the test is exact only when the number of observations goes to infinity. The ADF, on the other hand, is exact in small samples under normality of errors (which we tested for above using the JB test statistic).
    KPSS test results, for two spectral estimation kernels (Bartlett and Parzen), each with two bandwidth selection methods.
    The asymptotic (!) critical values of this test statistic are:
    Critical values:
    1% level, 0.216000
    5% level, 0.146000
    10% level, 0.119000
    So once the Lagrange Multiplier (LM) test statistic is ABOVE one of these values, STATIONARITY is rejected at that significance level.
    BARTLETT KERNEL:
    Newey-West bandwidth selection:
    TEST STATISTIC: 0.165696
    Conclusion, stationarity is not rejected at 1% significance level. Rejected at 5% and 10% significance levels.
    Andrews bandwidth selection:
    TEST STATISTIC: 0.154875
    Conclusion, stationarity is not rejected at 1% significance level. Rejected at 5% and 10% significance levels.
    PARZEN KERNEL:
    Newey-West bandwidth selection:
    TEST STATISTIC: 0.147904
    Conclusion, stationarity is not rejected at 1% significance level. Rejected at 5% and 10% significance levels.
    Andrews bandwidth selection:
    TEST STATISTIC: 0.130705
    Conclusion, stationarity is not rejected at the 1% and 5% significance levels. Rejected at the 10% significance level.
    Let’s now try to interpret the results of the KPSS test.
    We see that the null hypothesis of stationarity (NO unit root) is rejected at 10% for all methods used, and at 5% in most cases. At a 1% significance level, however, it is not rejected.
    Two things to note:
    (1) The test is asymptotic, so the critical values are only exact in very large samples
    (2) The null hypothesis is in this case stationarity, and the small-sample distortion severely reduces the power of the test (the power being one minus the probability of a Type II error). In other words, the test is biased towards NOT rejecting the null hypothesis in small samples.
    However, in spite of this small-sample bias, we nevertheless manage to reject the null hypothesis of stationarity in all cases, at a 10% significance level and in all but one case using a 5% significance level. I conclude that there is strong evidence, when testing from ‘the other side’, and minding the small sample induced power reduction of the test (i.e. the fact that it is biased towards not rejecting stationarity in small samples), that the level series is NOT stationary.
    I(0) is therefore rejected.
    ————————–
    PHILLIPS-PERRON TESTING
    ————————–
    Unlike the ADF, the Phillips-Perron test doesn’t deal with autocorrelation parametrically. Instead, the test statistic is modified directly to account for it robustly. This modification also makes the test robust to heteroskedasticity (varying variance). However, as always with robust tests, these modifications reduce efficiency if the ‘robustness corrections’ are in fact not needed. This is however a very lengthy discussion and I’ll leave it there for now.
    Let’s take a look at those PP test results then, shall we. We begin by taking case (3) again, so our test equation contains both an intercept and a trend. The test results reject the presence of a unit root:
    Phillips-Perron test on GISS-all, Bartlett kernel, Newey-West bandwidth:
    Ha: Trend and intercept (case (3))
    TEST STATISTIC -5.744931
    1% level, -4.031899
    5% level, -3.445590
    10% level, -3.147710
    Conclusion: the presence of a unit root is rejected
    Now let’s, just for the sake of sensitivity analysis, test using just an intercept (and no trend) in the test equation.
    Ha: Intercept (case (2))
    TEST STATISTIC: -1.555403 (p-value 0.5024)
    1% level, -3.482453
    5% level, -2.884291
    10% level, -2.578981
    Conclusion: the presence of a unit root is NOT rejected
    Just as was claimed elsewhere, and confirmed by Kaufmann and Stern (2000), the PP test results lead us to conclude that the series is I(0) when setting the presence of a trend as the alternative hypothesis. Setting simply an intercept in the alternative in fact fails to reject the presence of a unit root.
    ————————–
    DICKEY FULLER GENERALIZED LEAST SQUARES TESTING
    ————————–
    Our final set of tests will concern the DF-GLS tests, which are similar to, but not the same as, the ADF tests. Again, we will use case (3) as the alternative hypothesis, and we will use all available ICs to derive the required lag length.
    DF-GLS test results:
    The critical values of the relevant Elliott-Rothenberg-Stock DF-GLS test statistic are given below:
    1% level, -3.551200
    5% level, -3.006000
    10% level, -2.716000
    IC: Akaike Info Criterion (AIC)
    LL: 3
    TEST STATISTIC: -1.759718
    Conclusion: presence of unit root not rejected
    IC: Schwarz / Bayesian Info Criterion
    LL: 3
    TEST STATISTIC: -1.759718
    Conclusion: presence of unit root not rejected
    IC: Hannan-Quinn Info Criterion (HQ)
    LL: 3
    TEST STATISTIC: -1.759718
    Conclusion: presence of unit root not rejected
    IC: Modified Akaike
    LL: 6
    TEST STATISTIC: -1.065158
    Conclusion: presence of unit root not rejected
    IC: Modified Schwarz
    LL: 5
    TEST STATISTIC: -1.305844
    Conclusion: presence of unit root not rejected
    IC: Modified HQ
    LL: 6
    TEST STATISTIC: -1.065158
    Conclusion: presence of unit root not rejected
    Again, just as in the case of the ADF test series, we do not reject the presence of a unit root when using case (3), i.e. linear trend and intercept, as our alternative hypothesis Ha. In this case even the SIC/BIC measure points to the use of 3 lags, in line with both the HQ and AIC.
    If we move on to the first difference series, the presence of a unit root is clearly rejected (I won’t bore you again with a series of tests, since this isn’t what we’re debating).
    So on the basis of the DF-GLS test series, using all information criteria, we again conclude that the GISS-all series is I(1).
    ————————–
    SUMMARY AND CONCLUSIONS
    ————————–
    We have now applied a myriad of different methods to check for the presence of unit roots. As you can see, and as Alex pointedly noted, you do actually have to interpret the results.
    ADF: Clear presence of a unit root
    KPSS: Stationarity (no unit root) rejected at 5% and 10% sig, not at 1% sig.
    PP: No presence of unit root, but only when using (3) as an alternative hypothesis (this is a robustness issue)
    DF-GLS: Clear presence of a unit root
    For me personally, adding all these together (and minding the small-sample properties of the ADF, if the autocorrelation is properly dealt with and the errors are normal) leads me to conclude that the GISS-all series is in fact I(1).
    I do have to ***stress*** here that I’m not the only one who, looking at these results, draws this conclusion. These tests have been extensively reported in the literature (see references in my first post), by both AGWH proponents and AGWH skeptics, and all conclude I(1).
    A very conservative econometrician or statistician, *might* conclude that the evidence is ‘mixed’, although it leans towards the presence of a unit root. However, if one is THAT conservative, it is truly impossible to conclude, in light of all this evidence, that the series does NOT have a unit root.
    That was my whole point, and this was my statistical argument.
    VS

  29. I tried pasting in the entire post – but it didn’t work.

    Here is VS’s own summary of his analysis – all in a single post.

    It lays out his null hypothesis here:

    Hypothesis specification:

    H0: GISS-all contains a unit root
    Ha: GISS-all is trend stationary (testing against case 3)

    Here is the link to the whole 2500 word analysis:

    http://ourchangingclimate.wordpress.com/2010/03/01/global-average-temperature-increase-giss-hadcru-and-ncdc-compared/#comment-1524

  30. I thought this comment from Nick’s blog was interesting:

    This is used in causal arguments as follows. If, say, a temperature rise is believed to be caused by a CO2 rise, then there is a consistency requirement for the series. The long-term qualitative difference between a series with roots less than unity, and one with unit roots, is great. So it would be odd to find that if temperature is I(1), so its a0 = 0 but a1 is not 0, while for CO2 (with coefs b0, b1 etc) b0 = b1 = 0. For that would mean that the temperature differences were stable, with disturbances decaying, even though the corresponding differences of CO2 could drift. That’s the supposed significance of the failure of “polynomial cointegration”. Though it’s still a big jump to say “AGW is refuted”.

    I’m still struggling to understand what VS is actually claiming, but it’s easy to see that if CO2 isn’t the only forcing on temperature, then even if you could positively establish that CO2 is I(2) and global mean temperature is I(1), that doesn’t imply anything more than that there are other forcings besides CO2 controlling global mean temperature.

    We knew that.

  31. Lucia:

    I think if we have Qo as the Wiener process, this is AR(1).

    Good point, lol.

    I was implicitly allowing for any stationary stochastic process. Measurement “noise” is a “given”.

    XD

  32. RickA, thanks for the link.

    It is going to take a while to think about what it means when you have climate driven by systematically varying forcings (so that neither the system nor the system noise is stationary) and whether that is an issue for the sort of tests that VS is trying to apply.

    That’s why I’m focusing in on the “inhomogeneous” issues of the AR process. The whole thing could be as simple as: lumped anthropogenic forcings appear like a nonstationary stochastic process. This is interesting, in VS’s words, because:

    We start by examining the GISS-NASA temperature series 1880-2008 (GISS-all). We want to see whether the series contains a unit root. As mentioned here, and on various other places, the presence of a unit root in a time series invalidates regular statistical inference (including OLS with AR terms) because the series is no longer stationary (this is a necessary condition

    If you lump anthropogenic forcings (which are in principle measurable) in with “noise” you might arrive at a conclusion that the system can’t be tested/modeled/whatever using “regular statistical inference”. The question I have here is what happens if you “subtract” the various forcings out (i.e., “anomalize” the measured temperature series by correcting for the variance associated with these forcings).

    Does the resulting system still appear nonstationary?

    Because if the resulting anomalized series is stationary, all you are really saying by examining the raw data is that anthropogenic forcings are not represented by a stationary stochastic process.

    I think that is a given.

  33. If you lump anthropogenic forcings (which are in principle measurable) in with “noise” you might arrive at a conclusion that the system can’t be tested/modeled/whatever using “regular statistical inference”. The question I have here is what happens if you “subtract” the various forcings out (i.e., “anomalize” the measured temperature series by correcting for the variance associated with these forcings).

    Does the resulting system still appear nonstationary?

    That’s what I want to know too!
    For example: If you subtract the multi-model mean of models for the 20th century from GISS and analyze that, do you still fail to reject a unit root? (And as usual, I would still like to know the power!)

  34. Lucia,

    With all due respect to you, your decision not to read the VS comment stream in Bart’s blog is like saying, “don’t read Sir Francis Bacon, just read what someone else says about him”.

    It is not reasonable.

    I just picked Bacon arbitrarily. I am not implying anything about VS or Bacon.

    John

  35. I am not sure about this – but I also think VS is using annual values for 1880 – 2008. I remember reading something about 120 data points. Hence his discussion about the small sample size. I could be wrong about that – but it seems like an important point to raise.

    As somebody who only had one quarter of statistics in college – I have no idea of “the power!”. Sorry.

  36. John–
    I have read the thread.

    The thread is long and digresses a lot. I’m not finding a core, stated by VS, that precisely states his claims. No one else seems to have found it either, because those who are providing me links are providing links to comments that, as far as I can tell, discuss details required to support some claim or conclusion that is itself absent.

    What I am not going to do is spend 7 days 24 hours a day trying to find the claims or what they are supposed to mean.

    Do you think you know what VS claims, and whether or not he has supported his claim? If yes, can you tell me what you think these things are?

  37. John Williams:

    With all due respect to you. Your decision to not read the VS comment stream in Bart’s blog is saying, “don’t read Sir Francis Bacon, just read what someone else says about him”.

    Mm… bacon.

    Actually it’s more akin to declining to interpret his graffiti, scrawlings on envelopes, and transcribed barroom arguments, and to piece together a coherent argument from those. If, like Sir Francis, VS has written coherent arguments on this, please let us know.

    ””””lucia (Comment#39254) March 25th, 2010 at 4:15 pm – What I am not going to do is spend 7 days 24 hours a day trying to find the claims or what they are supposed to mean. . . . . Do you think you know what VS claims, and whether or not he has supported his claim? If yes, can you tell me what you think these things are? ”””””

    Lucia,

    Thanks for your reply.

    I don’t comment here much although I do spend some good time here, so please excuse any impertinence on my part. I do respect the venue you have set up for minds to meet. Thank you for that.

    The best source of what VS says is of course VS. He has written, in my opinion, clearly and with politeness. I am most impressed by his exhaustive energy devoted to explaining repeatedly, to both statisticians and laymen, what he is stating. I am on a steep but enjoyable learning curve by reading and rereading and researching his comments.

    Why do you choose the word ‘claim’ in the lead post? Isn’t ‘state’ a more neutral expression? ‘Claim’ in a scientific discussion gives me a negative connotation. It appears pejorative. Maybe it is just me. Why start out on that foot?

    I have spent a lot of time on climate science posts in the past 1.5 years. I would vote the VS comment stream at Bart’s one of the most fascinating I have read, even including all the Climategate stuff.

    As to my views of what VS says, of course I will comment on ‘statements’ made here as I can. And at Bart’s thread.

    Would you consider to invite VS to post here? I personally (as others I am sure) would enjoy that.

    John

  39. Lucia, Carrick,

    If you look at the thread title/header over at Bart’s, it is ‘Global average temperature increase GISS HadCRU and NCDC compared’

    i.e. Bart posted some comparisons between the 3, including a trend for the last 30-40 years or whatever. VS posted a comment suggesting that when dealing with time series, it is a good idea to test for unit roots, and posted some test results for several temperature series indicating that they did indeed have unit roots.
    (all I(1)).
    And it was all downhill from there.

    But, VS is proposing that the GISS-all dataset, when hit with a variety of tests, is I(1), has a unit root, whatever, and therefore it is not appropriate to calculate OLS trends etc.

    There was a whole lot more on a million different subjects, side alleys, diversions etc, as well as the whole Beenstock & Reingewertz paper, and GCM and models, all of which will be dealt with ‘real soon now’ in future posts.
    For the moment I’d stick to the temp series are I(1) and the implications thereof.

    Hope that clarifies slightly.

  40. @Lucia

    Just read the post from whbabcock March 17

    This summarises things quite well, I believe

  41. ””””Carrick (Comment#39255) March 25th, 2010 at 4:25 pm – Mm… bacon . . . . If like Sir Francis, VS has written coherent arguments on this, please let us know. ””””’

    Carrick,

    Good idea, please go over to the VS stream on Bart’s blog and ask him to stop that ‘writing graffiti’ stuff. : )

    Bacon . . . mmmmmmmm . . . it is breakfast time here in Taipei. I think I will get some bacon and eggs!

    John

  42. Lucia,
    my interpretation is that the claims by VS and by B&R are different, although they may be related. I found the ms by B&R clearer, at least for me, but this may just indicate that it is a manuscript, whereas VS wrote more spontaneous blog postings.

    B&R more correctly use radiative forcing by GHG, instead of the concentrations. They are not linearly related: for CO2 there is a logarithmic dependence, and for CH4 the forcing is proportional to the square root of the concentration. In the end they claim that the level of the temperature anomaly is related to the time derivative of the GHG forcing, and not to the forcing itself. This should be why they write that the effect of GHG forcing is temporary: it drives the temperature only when the GHG forcing is changing. Once the GHG forcing has risen and stabilized at a certain level, the temperature should drop again to its original value. They also claim that the warming in the last decades is indeed caused by GHG forcing (it seems that this has been overlooked, or I misunderstood the ms), but this happened just because the GHG forcing was rising rapidly. Once it stabilizes ‘on the long term’, temperature should drop again.

    I see a couple of things that could influence their results: aerosols and volcanic forcing are not being considered. There are also very basic formal flaws that indicate that the authors are not very familiar with the physics. For instance, physical units are virtually never indicated. What are the units of all parameters in equations 1, 2, etc.? And when units are indicated, they are wrong. On page 9 they write about the rate of change of GHG forcing as 1 W/m2, and one has to assume that it is 1 W/m2 per year (??? that would be ridiculous; probably per century? who knows?). I guess that Nature rejects manuscripts on the fly for much less than this, if this ms is intended for Nature. Not being climate researchers (apparently), they could easily have asked someone to check those formal aspects.

    But to be honest, the background message, i.e. that T is driven by the time derivative of the forcing, is intriguing. Perhaps it is bonkers.
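For reference, the forcing relationships eduardo describes correspond to the usual simplified expressions (Myhre et al. 1998). The sketch below uses assumed pre-industrial baseline concentrations and omits the CH4-N2O band-overlap term, so it is an approximation, not B&R's exact inputs.

```python
import math

def co2_forcing(c_ppm, c0_ppm=278.0):
    """Simplified CO2 radiative forcing in W/m^2: 5.35 * ln(C/C0)."""
    return 5.35 * math.log(c_ppm / c0_ppm)

def ch4_forcing(m_ppb, m0_ppb=722.0):
    """Simplified CH4 forcing in W/m^2, square-root form; overlap term omitted."""
    return 0.036 * (math.sqrt(m_ppb) - math.sqrt(m0_ppb))

# Doubling CO2 gives the familiar ~3.7 W/m^2:
print(round(co2_forcing(2 * 278.0), 2))
```

This is why forcing, rather than concentration, is the natural variable for the integration-order bookkeeping: the log and square-root transforms change how the trends in the raw concentration series carry over.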

  43. Re: John Whitman (Mar 25 17:24),

    Why do you choose the word ‘claim’ in lead post? Isn’t ’state’ more of a neutral expression? Claim in a scientific discussion gives me a negative connotation.

    I don’t think “claim” should be seen as communicating a negative connotation in a scientific discussion. Any argument (scientific or otherwise) should consist of a claim and then provide support for that claim. See UNC’s writing handout:

    “What is an argument? In academic writing, an argument is usually a main idea, often called a “claim” or “thesis statement,” backed up with evidence that supports the idea. “

    I’m using claim in that way– not as in “She claimed to be the reincarnation of Cleopatra! (eyeroll!)” So, I do understand what you are saying. It’s just that “state” is not the right word in context of an academic type argument– which is what VS is advancing. Claim is the right word (I think.)

    I’m not saying the individual posts are confusing or not clear. Only that the argument (in the academic sense) is spread out over a large number of comments, and it now takes a lot of time for someone to go through, find the “claims” and link them to details that are intended to support those specific claims.

    As for inviting VS: On the one hand, I’d be happy to run a post that permitted him to organize things so we know what the claims are, how he supports them, and what he thinks what he has shown means in terms of climate. Posts are better for that than comment threads. But I’m reluctant to permit someone so totally anonymous on the admin side of the blog. So, if he agreed, we’d have to figure out how that’s run. (Zeke just has permission on the admin side.)

  44. Hi Lucia
    Thanks for posting this thread. I’ve been following the thread on Bart’s blog almost since day one and find it one of the most interesting I have read recently. I was unaware of these statistics.

    It would be good if at some point VS summarised his key points. It appears to me that his thoughts have to a degree developed and clarified as the thread progressed; e.g. earlier on there was talk about global temperature following a “random walk”, which he later agreed was “somewhat of a red herring”

    http://ourchangingclimate.wordpress.com/2010/03/01/global-average-temperature-increase-giss-hadcru-and-ncdc-compared/#comment-1550

    This comment 25/3, by VS, showing differences between stochastic & deterministic trends is very informative to me.

    http://ourchangingclimate.wordpress.com/2010/03/01/global-average-temperature-increase-giss-hadcru-and-ncdc-compared/#comment-2550

    With the link to the two figures.
    FIGURE 1 – (misspecified) Deterministic trend forecasting confidence intervals (based on 1880-1935 observations), together with GISS series

    FIGURE 2 – Stochastic trend forecasting confidence intervals (based on 1880-1935 observations), together with GISS series

    http://img146.imageshack.us/img146/6674/deterministicvsstochast.gif

    “They say a picture is worth a 1000 words. I think these two are worth more than that, considering the length of this thread 🙂

    The fundamental difference here is that our misspecified deterministic trend model would lead us to conclude that the warming seen in the past 20-30 years is anomalous considering the trend observed over 1880-1935. Note how the actual realized GISS series is jumping in and out of our 95% forecasting confidence interval.

    Climate scientists looking at this seem to conclude that something fundamental has changed in the latter half of the last century (e.g. increased climate forcings). An econometrician however would refrain from jumping to conclusions, and investigate the series for non-stationarity.

    Now, notice how the forecasting confidence intervals of the stochastic trend estimate leads us to conclude that ‘the trend’ has not changed at all significantly (in statistical terms).

    In fact, every single realization of the temperature anomaly between 1935-2008 falls perfectly within our 95% forecasting confidence interval !!!

    In other words, when applying formal methods to arrive at our ‘trend estimate’ (i.e. find the unit root, and account for it), we find that the temperatures observed over the period 1935-2008 are perfectly in line with the ‘trend behavior’ exhibited by the temperature series in 1880-1935″.

    Now that is a claim. As to the length of the blog, wait till, in VS’s words, “we actually start reproducing Beenstock and Reingewertz”.
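The qualitative contrast in VS's two figures can be sketched without his data: for a correctly specified deterministic-trend model with i.i.d. errors, the forecast-interval width is roughly constant in the horizon (ignoring parameter-estimation uncertainty), while for a random walk it grows like the square root of the horizon. The error scale below is assumed purely for illustration.

```python
import numpy as np

sigma = 0.1           # innovation / error standard deviation (assumed)
h = np.arange(1, 74)  # forecast horizons, e.g. 1936..2008 from a 1935 origin

# Approximate 95% forecast-interval widths:
width_det = 2 * 1.96 * sigma * np.ones_like(h, dtype=float)  # trend + iid errors
width_rw = 2 * 1.96 * sigma * np.sqrt(h)                     # random walk

print(width_det[-1], width_rw[-1])
```

The widening random-walk band is what lets the 1935-2008 observations sit inside the stochastic-trend interval while escaping the deterministic-trend one.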

  45. Lucia,

    My ‘to state’ is the same as your ‘to claim’. No problem. We have differing backgrounds perhaps. In the area of my professional background, ‘claim’ is not typically used in polite technical business discussions, that’s all. It is a turn-off, especially internationally. Maybe discussion in pure science doesn’t worry about polite/civil so much?

    Hope one of the fair blogs (such as yours) does somehow pick up another VS stream. I admire Bart, even though it appeared for some reason that the VS comment stream caused (is causing) him some considerable tension.

    Well, back over to the VS stream at Bart’s for more work on my statistics learning curve. Note: I find David Stockwell’s comments to be an excellent tutorial.

    I enjoy your venue.

    John

    “The water level is currently below its equilibrium level”

    Shame on you Lucia, it is a ‘steady state’, not an ‘equilibrium’. The two are quite different, both thermodynamically and kinetically.

  47. eduardo:

    I see a couple of things that could influence their results: aerosols and volcanic forcing are not being considered

    IMO this is a huge problem.

    Until 1980 or so anthropogenic CO2 and aerosols more or less canceled each other. So spake the models.

  48. John

    Maybe discussion in pure science doesn’t worry about polite/civil so much?

    I think it’s just that “claim” doesn’t have a negative connotation in an academic/debate setting. To some extent, business isn’t generally about arguing whose analysis is right, but figuring out if people have mutually beneficial interests.

  49. John:

    Maybe discussion in pure science doesn’t worry about polite/civil so much?

    Honestly it doesn’t, though Lucia strives much harder for civil discourse than most, all the while instituting a much more liberal comment policy.

    I don’t see any big deal with “claims”. That’s what they are. We aren’t supposed to take their word for what they found. Skepticism is not only assumed, it’s demanded.

  50. lucia,

    If you want to model the atmosphere, shouldn’t you be changing the length of the porous plug while the flow rate into the tank remains constant rather than changing the flow? Or are they mathematically equivalent?

  51. Carrick (Comment#39280) March 25th, 2010 at 9:26 pm

    lucia (Comment#39276) March 25th, 2010 at 8:43 pm

    Lucia & Carrick,

    Lucia, I do understand your point: that the use of ‘to claim’ in the context of your post is not pejorative.

    Carrick, this venue is very polite. So the use of ‘to claim’ versus a more neutral (strictly in my opinion) word like ‘stated’ seemed . . . off. Off particularly in regards to the civility and openness of VS over at Bart’s. I am too sensitive perhaps.

    I will adjust to the venue. I just wondered, hence my comments to you, Lucia. Let’s get back to looking at VS’s statements. : )

    John

  52. DeWitt,

    If you want to model the atmosphere, shouldn’t you be changing the length of the porous plug while the flow rate into the tank remains constant rather than changing the flow? Or are they mathematically equivalent?

    Well, we could also have the holes in the plug getting filled with gunk or getting backflushed by the operator. 🙂

    The plug is to make the flow out of the tank be Qo ~ Y instead of Qo ~ √Y, which would be a horrible problem if we end up using this to explain what the various claims translate into physically for the simplest possible physical system that everyone has seen in some form.

    Maybe if the main purpose of this model were to concoct something similar to the atmosphere, it would be useful to discuss a plug whose properties vary somehow. But for the purpose of wherever some discussion of unit roots might go, varying Q is easier than varying the properties of the plug. I’m not sure whether mathematically having the plug properties change could be made into the same problem as varying Q. I’d have to think about that.

    Might I suggest something, Lucia: run the model. You have a very simple steady-state model with a zero-order influx and a first-order efflux, the latter being based on the height of the water and the size of the porous plug.
    Either change the influx rate with time or the size of the hole.
    Use a sine wave or a sawtooth wave to change the influx with respect to time, and observe the height of the water.
    Now, see if you observe all those lovely auto-correlations you love so much.

  54. Doc–
    Actually, the properties of the system that has the math above are well known. Closed-form solutions for the autocorrelation between Q and Y are known for a number of simple cases, particularly Q = white noise. It’s also known for Q = sine wave etc. That might look like a tank of water, but the equations are also those for a) a small particle suspended in turbulent flow, b) a ‘one lump’ climate system.

    DeWitt Payne is planning to run a two-lump climate model and test its output. I don’t know how he’s going to force his two-lump system.
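Doc's suggestion above is easy to sketch: the tank equation dY/dt = Q(t) − kY discretizes to an AR(1), and driving it with a sine-plus-noise influx produces exactly the autocorrelation being discussed. The outflow constant, time step, and forcing period below are assumed values, not anyone's calibrated model.

```python
import numpy as np

# Discretized tank: Y[t] = (1 - k*dt) * Y[t-1] + dt * Q[t], an AR(1).
k, dt, n = 0.2, 1.0, 2000          # assumed outflow constant, step, length
rng = np.random.default_rng(3)
t = np.arange(n)
Q = np.sin(2 * np.pi * t / 50) + rng.normal(scale=0.5, size=n)  # sine + noise influx

Y = np.zeros(n)
for i in range(1, n):
    Y[i] = (1 - k * dt) * Y[i - 1] + dt * Q[i]

# Lag-1 autocorrelation of the water level; the AR(1) memory shows up here.
r1 = np.corrcoef(Y[:-1], Y[1:])[0, 1]
print(round(r1, 2))
```

Swapping the sine for a sawtooth, a step, or a ramp (as DeWitt proposes) only requires changing the Q line.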

  55. Re: lucia (Mar 26 06:15),

    I’ll try the GISS forcings first, but I’m also thinking about white noise alone, then step, linear ramp, and exponential ramp with different amounts of noise added. Then there’s the question of whether white noise should be added to the synthetic temperature series to model sampling error.
