Correcting for Serial Autocorrelation: Cochrane-Orcutt

Today, I’m going to try to apply Cochrane-Orcutt, a method that can be used to deal with serial autocorrelation in the residuals of a linear regression. I hadn’t done this before, so I hunted down some references and read how it’s supposed to be done.

I have some results, but the problem is I really need someone to look them over and tell me if I’ve made a hash of this. And, even more unfortunately, the only way to do that is to publish the analysis, even if it might be wrong.

So, I’ll be doing that, and hoping someone tells me what I’ve done right or wrong in applying the method.

So, this post is organized as follows:

  1. Discussion of Least Squares applied to GMST data from 1979-Jan 2008 without correcting for red noise.
  2. Discussion of how I corrected for red noise using Cochrane-Orcutt.
  3. Discussion of how I think one interprets the uncertainty intervals.
  4. Questions for real statisticians!

I’m asking questions because the issue of calculating the uncertainty intervals is key to validating or falsifying hypotheses. Today, I’m limiting my post to examining a long run of data, because the conclusions will be no surprise: temperature rose since the late 70s. This lets me illustrate what I have done, and get answers to my questions, without dealing with any potentially controversial issues.

So, here goes the boring post!

Ordinary Least Squares: Ignore Red Noise.

For the past few blog posts, and in many statistical analyses on the web, we’ve been assuming that GMST anomaly data can be described as obeying a linear trend with time, plus some amount of noise. The noise may be due to weather or measurement error or both. That is, if we have a series of (temperature, time) pairs (Ti, ti), the data obey the following equation:

(1) Ti=( m ti+ b )+εi

where “m” is a trend in temperature over time, b is an intercept at t=0, and εi is the ‘noise’ or ‘residuals’ associated with data point ‘i’.

Now, if the values of εi (the residuals) are all uncorrelated with each other, one can simply perform an OLS fit using LINEST in Excel (or whatever tool you prefer). The values of the slope, m, and the intercept pop out. (They may be wrong. But Excel will give you numbers if you ask!)

[Figure: temperature anomalies with 20-year trends shown.]
To the left, I show a simple linear regression of the monthly data, obtained using Excel.

Even better, LINEST provides the standard error of the slope, sm. Knowing that, along with the number of data pairs used to create the fit, permits the analyst to state 95% uncertainty intervals or do simple hypothesis tests.
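
For readers who would rather follow along in R than Excel, here is a minimal sketch of the same OLS step. The monthly series below is synthetic (a 1.6 C/century trend plus AR(1) noise) standing in for the real GMST anomalies in the attached spreadsheet, and the names yr and temp are just placeholders I made up.

    # Minimal R sketch of the OLS step that LINEST performs in the spreadsheet.
    # The data are synthetic, standing in for the real monthly GMST anomalies.
    set.seed(1)
    yr   <- 1979 + (0:348)/12                  # 349 monthly time points, Jan 1979 to Jan 2008
    eps0 <- as.numeric(arima.sim(list(ar = 0.78), n = length(yr), sd = 0.1))
    temp <- 0.016 * (yr - 1979) + eps0         # 0.016 C/yr = 1.6 C/century, plus red noise

    fit <- lm(temp ~ yr)                       # OLS fit of equation (1): T = m*t + b + noise
    summary(fit)$coefficients                  # slope, intercept, and their standard errors
    m   <- coef(fit)["yr"]                     # trend in C per year (times 100 for C/century)
    sm  <- summary(fit)$coefficients["yr", "Std. Error"]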

In fact, using OLS results in fits like those illustrated in the figure to the left, where I include four GMST data sets plus the average of the four. As you can see, depending on which of the four temperature data sets I use, the trend in temperature between 1978-2007 ranges from 1.4 C/century to 1.8 C/century.

For my test case, I averaged the temperature measurements from all four sources; I get m=1.6 C/century with a claimed standard error of sm=0.1 C/century. (If you download the Excel spreadsheet, this result is the fit to average temperature and year in columns C and D. The LINEST is done in cell C364.)

In principle, if GMST data really follows a linear fit as in (1), I can look up the “t” for 5% confidence bands, and I find t=1.967. (I used TINV with probability 0.05 and 346 degrees of freedom.)
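
In the R sketch above, qt() plays the role of TINV; the two-sided 5% critical value and the resulting interval are:

    tcrit <- qt(0.975, df = 346)               # about 1.967, matching Excel's TINV(0.05, 346)
    m + c(-1, 1) * tcrit * sm                  # 95% confidence interval for the slope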

Using this “t”, I found confidence bands by taking the product t sm = ±(1.967)(0.1 C/century). This results in: 1.5 C/century < m < 1.8 C/century. But there is a problem: the trend “m”, and particularly sm, aren’t correct. Why not, some might ask?

The Residuals Are Autocorrelated.

[Figure: residuals from the least squares fit to global mean temperature anomalies.]
It turns out that if I examine the values of the residuals, εi, and calculate something called the “autocorrelation”, I find the residuals are strongly autocorrelated. (This is because weather trends tend to persist.)

To calculate the residuals, I use the formula:

(2) εi = Ti-( m ti+ b )

If I’ve found “m” and “b” with Excel, this is easy to do. I just put the formula that matches (2) in a column next to the (temperature, time) data (Ti, ti) and fill down to create a column of residuals.
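
In the R sketch above, the same residuals come straight from equation (2), or equivalently from residuals(fit):

    # Residuals per equation (2); identical, up to floating point, to residuals(fit)
    eps <- temp - (coef(fit)["yr"] * yr + coef(fit)["(Intercept)"])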

I then examine the autocorrelation of adjacent residuals by taking the ratio of two sums:
(3) ρ = ( Σ εiεi-1 ) / ( Σ εiεi )

In Excel, this is most easily done using the CORREL function. I did this for the ‘average’ monthly GMST measured using four standard methods, and got ρ=0.78. (See cell E10, shown in green.)
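
Continuing the R sketch, the CORREL-style lag-1 correlation of adjacent residuals is:

    n   <- length(eps)
    rho <- cor(eps[2:n], eps[1:(n - 1)])       # lag-1 autocorrelation of the OLS residuals
    rho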

Now, it happens that there is a method to diagnose whether ρ=0.78 is statistically significant– but I haven’t learned it yet. Plus, for a correlation coefficient this large, I don’t need to. With 30 years worth of monthly data, I know ρ=0.78 is statistically significant.

Cochrane-Orcutt: Correction for Red Noise

I read about Cochrane-Orcutt. It appears to be very easy.

The first step is to assume the residuals are AR(1) noise with a serial autocorrelation of ρ. If we know the ‘true’ correlation of the residuals, ρ, we can linearly transform the variables of interest (i.e. temperature and time) into new variables using some simple algebra. Once we do this, we apply OLS to the transformed variables. If we really knew the true correlation, ρ, we would now be done. The trend, mnew, for the transformed variables is the same as for the original variables. We can also obtain a new intercept, bnew.

However, if we didn’t know the true ρ, we do something a bit more involved. It requires iteration, involving a number of relatively simple steps.

The iteration:

  1. We make a guess for the ‘true’ serial correlation of the residuals, ρ; I’ll call the current guess ρold. (It makes sense to pick the current guess from the OLS we already did. So, I pick ρold=0.78 from above.)
  2. Using ρold, transform the data set (T,t) into a new data set (X,Y). (Equation given later.)
  3. After doing this, we obtain new estimates for slope and intercept (mnew, bnew).
  4. We then compute new residuals, using the updated slope and intercept. Calculate ρnew based on the new residuals.
  5. If ρold=ρnew to within some limit, we just decide that’s ρ, and use the current values of “m” and “b” as correct. Otherwise, we set ρold equal to ρnew and repeat from step 2.

What I do involves a few steps:
Computational Step 1: Guess ρold. On my first try, I guessed ρold=0.78 because that’s what I got for the OLS on T and t. (I put this in cell E9 and named the variable Correl so I could later proof-read.)

Computational Step 2: Define and calculate new transformed variables (X,Y) using these two formulas:

(4a&b)
Yi= Ti-ρoldTi-1 and Xi= ti-ρoldti-1

To plug and chug, I just created columns for (Yi, Xi) adjacent to those for (Ti, ti), entered the formula, and filled down. (Naturally, I have one fewer set of (Y,X) than of (T,t). You can find Y and X in columns H and I.)
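
Continuing the R sketch, with rho playing the role of ρold, the transformation in (4a&b) is:

    # Computational step 2: quasi-difference the original series with the current guess for rho
    Y <- temp[2:n] - rho * temp[1:(n - 1)]
    X <- yr[2:n]   - rho * yr[1:(n - 1)]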

Computational Step 3: Perform a linear regression on (Y,X) to get new values for “m” and “β”. (I just use LINEST on (Y,X); m is the slope, β is the intercept.)

That is, I assume Y and X obey:

(5)Yi=( m Xi+ β )+εi

I obtained the slope and intercept for Y vs. X using LINEST. The formula is in H364.

It turns out that the best-fit slope for equation (5) is expected to be the same as the slope from equation (1) fit to the (T,t) pairs. So, “m” from this new fit is the new, improved “mnew”.

Computational Step 4: However, the new intercept is not “bnew”. The intercept from this LINEST is β, which is related to the intercept in (1) by β = b(1-ρ). So, to get the new, improved “bnew”, I calculate bnew = βnew/(1-ρold).
(This isn’t actually in a cell. I use it algebraically in step 5.)
Now, in principle, I have “new improved” values of “m” and “b”.
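
Continuing the R sketch, steps 3 and 4 look like this:

    # Computational steps 3 and 4: regress Y on X, then recover b_new from the intercept
    fit2  <- lm(Y ~ X)
    m_new <- coef(fit2)[["X"]]                 # plays the same role as the slope in equation (1)
    beta  <- coef(fit2)[["(Intercept)"]]
    b_new <- beta / (1 - rho)                  # since beta = b*(1 - rho)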

Computational Step 5: Now, using the new values of “mnew” and “bnew”, use equation (2) to calculate new residuals, εi, from the original (T, t) pairs. (I turned this into a new column, rather than replacing the old ones. It’s column F.)

Computational Step 6: Now, use equation (3) to re-compute a new estimate, ρnew. (I used the CORREL() function in Excel; see cell F10, in yellow.)

Compare the new value of ρ to the value you had before you began step 1. If the new and old values agree to within some criterion you set, you are done! The most recent values of “m” and “b” are the converged values, and apply to this data sample.

If the values of ρ differ too much, guess a new value for ρold. Your new guess should, of course, be ρnew!

Repeat computational steps 2-6, starting with the new value of ρ. (I substituted ρnew for ρold by copying the new value and then pasting it into E9 using “paste special”, pasting only values.)

It happens that this computation converged in one iteration. (I set the cells to read to 4 decimal places.)

If a real statistician confirms this is correct, I’ll probably write a macro to simplify this. (I couldn’t find a free one for Excel.) But, I’m pretty sure that’s what all the various articles I found are telling me to do.
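
In the meantime, here is a rough R sketch of the whole iteration, simply stringing together computational steps 1-6 above. The function name and the convergence defaults are mine; treat it as an illustration of the recipe, not a vetted implementation.

    cochrane_orcutt_sketch <- function(temp, yr, tol = 1e-4, max_iter = 25) {
      n   <- length(temp)
      fit <- lm(temp ~ yr)                          # Step 1: initial guess from the OLS residuals
      eps <- residuals(fit)
      rho <- cor(eps[2:n], eps[1:(n - 1)])
      for (i in seq_len(max_iter)) {
        Y <- temp[2:n] - rho * temp[1:(n - 1)]      # Step 2: transform with the current rho
        X <- yr[2:n]   - rho * yr[1:(n - 1)]
        fit2  <- lm(Y ~ X)                          # Step 3: OLS on the transformed variables
        m_new <- coef(fit2)[["X"]]
        b_new <- coef(fit2)[["(Intercept)"]] / (1 - rho)  # Step 4: recover the intercept
        eps   <- temp - (m_new * yr + b_new)        # Step 5: residuals on the original data
        rho_new <- cor(eps[2:n], eps[1:(n - 1)])    # Step 6: updated serial correlation
        if (abs(rho_new - rho) < tol) break         # converged: new and old rho agree
        rho <- rho_new
      }
      list(slope = m_new, intercept = b_new, rho = rho_new,
           se_slope = summary(fit2)$coefficients["X", "Std. Error"])
    }

Called as cochrane_orcutt_sketch(temp, yr), it just automates the copy-and-paste-special loop described above.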

Results so far

Having done this, I find m=1.5 C/century and sm=0.3 C/century for Jan. 1979 to now.

Now, I think that standard error is an honest-to-goodness standard error on the slope, “m”. I think I have 346 degrees of freedom for any t-tests. The transformations (4a&b) were supposed to give me a regression with uncorrelated residuals. So, I think I can now go ahead and do t-tests with this.
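
In the R sketch, that calculation would look like this (whether 346 is really the right number of degrees of freedom is exactly the question below):

    co    <- cochrane_orcutt_sketch(temp, yr)
    tcrit <- qt(0.975, df = 346)
    co$slope + c(-1, 1) * tcrit * co$se_slope  # tentative 95% interval for the corrected trend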

Questions for Real Statisticians

Can I do hypothesis tests using sm=0.3 C/century as the standard error to get confidence intervals? Is the number of degrees of freedom 346? (That’s three less than the 349 (T,t) data pairs I began with.)

But you aren’t really done.

Ok, that seems easy enough. Tedious, but easy.

Unfortunately, there are a few more steps. Once you have the value of ρ, you should check the residuals of the regression on the transformed variables, X and Y, and see if those residuals are correlated. Mine had a correlation coefficient of -0.21.
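
In the step-by-step R sketch above, this check is just the same lag-1 correlation applied to the residuals of the transformed regression:

    eps2 <- residuals(fit2)
    cor(eps2[-1], eps2[-length(eps2)])         # lag-1 correlation of the transformed-model residuals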

If this second set of residuals is correlated at a statistically significant level, you basically pretend your (X,Y) variables are the “original” variables and redo the fit.

I skipped the test to see if ρ=-0.21 is statistically significant, and just assumed it was. (I will teach myself how to test later on. But for now, I figured, just assume it is.)

When I finished this, my new results were:

Temperature Trend and Standard Errors for “mean” GMST anomaly.

                          No correction          AR(1) correction       AR(2) correction
m                         1.6 C/century          1.5 C/century          1.5 C/century
sm                        ±0.1 C/century         ±0.3 C/century         ±0.2 C/century
95% confidence interval   1.5 to 1.8 C/century   1.0 to 2.1 C/century   1.1 to 2.0 C/century
ρ of residuals            0.78                   -0.21                  0.03

Conclusions

I think I have applied Cochrane-Orcutt correctly to the monthly data from 1979-Jan. 2008. My goal today is simply to request feedback from statisticians in the know: Did I do this correctly?

I have done some reading, and understand that uncertainty intervals, while improved by using Cochrane-Orcutt, still tend to be too narrow. (I suspect this is because of the uncertainty in determining ρ.)

So, if there are any real statisticians reading:

  1. Did I apply this technique correctly?
  2. Do I need to modify my uncertainty intervals? If so, how do I do this?

To those wondering: Yes, I have done this for another, more interesting data set. I have also attached the Excel spreadsheet. Obviously, you can modify it to look at other data sets. But before you fiddle, let’s wait to hear back from the econometricians to see if I understood the method correctly.

—–
Attachment: Excel spreadsheet, Cochrane-Orcutt, since 1979

Question I asked at a blog: Standard Errors.

10 thoughts on “Correcting for Serial Autocorrelation: Cochrane-Orcutt”

  1. I downloaded the spreadsheet and took the temperature series to try a couple of models correcting for the serial correlation. First, I tried a nonlinear Cochrane-Orcutt regression. Then I tried the basic ARMA(1,1) followed by ARMA(3,3), the model I use when I don’t know what the heck is going on in the serial correlation. Finally, I noticed that the higher MA terms were either not significant or substantially changed the nature of the ARMA model. Thus, I tried a pure AR model with AR1, AR2 and AR12.

    The results for the trend coefficient and its standard error are below:

    Model          trend slope   standard error
    C-O            0.001261      0.000230
    ARMA(1,1)      0.001150      0.000344
    ARMA(3,3)      0.001169      0.000324
    AR 1, 2, 12    0.001333      0.000230

    All models except the Cochrane-Orcutt passed the Breusch-Godfrey Lagrange Multiplier Test for Serial Correlation. We could eliminate the ARMA(3,3) model because of a lack of significance for most of the ARMA coefficients and the distortion of the first order autocorrelation. The ARMA(1,1) performs slightly better on the usual regression statistics but that isn’t the issue.

    Look at the standard errors, recalling that they are computed under the assumption that the regression specification is the correct view of the world. We might take the greater SE as appropriate, in which case the trend would be significant. However, the difference between the ARMA(1,1) coefficient and the AR 1, 2, 12 coefficient is large: 50-80% of the standard errors. This tells me that the measure of dispersion is highly sensitive, much more so than the trend coefficient itself, to the specification of the ARMA model. And thus while I cannot say the trend coefficient is not significant, I am at least skeptical of the significance. Assuming I were some hot-shot distribution theorist who could figure out how to incorporate the variance of the variance estimator into an efficient estimate of the variance, I would bet that my forecast interval would be so large as to make comparison (in a post hoc forecast test) to the actual values effectively meaningless, akin to forecasting a coin flip to be either heads or tails.

    Recall (or note if you don’t recall) the purpose of correcting for serial correlation is to reduce the variance of the coefficient estimates and any forecast made from the estimates. What I am saying here is that the computed reduction in the variance may be illusory, just an artifact of the particular specification of the ARMA structure. And hence tests based on those standard errors may overstate their significance.

    Since the above comments aren’t of much help, let me offer this: as a rough rule the significance of a correlation coefficient estimate is +/- 2 / square root of the sample size. Thus, an autocorrelation of 0.2 with a sample size of roughly 350 is significant.

    Marty

  2. I can’t help with the statistics, but I just wanted to say that I enjoy these exploratory posts. I’m trying to educate myself beyond undergrad-level stats and it’s nice to follow along.

  3. JohnV–

    I think this technique may be undergrad material in econometrics. But in my branch of engineering, we learn to avoid taking data that requires this technique! Depending on what phenomenon I wish to study, I either estimate the integral time scale (which is the integral of the autocorrelation function over time) and sample much, much more slowly than that, or, if I want to learn about the spectral properties of something in a turbulent flow, I estimate the magnitude of the smallest scales. There are a couple of ways to do this.

    As a first cut, I’d take data fast, estimate something called a “Taylor microscale”, and use that to make a SWAG about sampling rates. I’d also do a back of the envelope calculation to estimate the Kolmogorov time scale. Then, I’d take even more data, check the spectra and try to get a cut off.

    But basically, I would never use this particular technique– I’d modify my experiment. (Some of the things Marty is mentioning fall in the category of questions that are difficult to resolve with limited data. So, the “cure” is to take the right data. That works in laboratory settings!)

    Marty– Thanks for the feedback. Since my numbers and yours are disagreeing at the 2nd significant figure, I need to check a few things. Either I coded something wrong or there is rounding error in Excel. (The ill conditioning of curve fits can result in odd behavior when the programmers decide to be ‘computationally efficient’ from the point of view a coder might take. I’m going to subtract means before doing OLS etc. and see what difference that makes. I need to be sure I’m doing this right before I go boldly forward to test IPCC predictions.)

  4. A heuristic method to correct the standard error s for correlations is to reduce the number of degrees of freedom in the uncorrelated standard error by the factor (1 – p). To see this, think of the first nearest neighbor, say (n+1), to a random point n as containing a contribution ‘p’ from point n; point (n+2) has p^2, etc., leading to a total ‘influence’ of 1/(1 – p). So for p = 0.78, you have effectively only 0.22 of the raw degrees of freedom, and in going from the uncorrected to AR(2), multiply by sqrt( 1/(1 – 0.78)) = 2.1.

    Nice work.

  5. Lucia, while taking breaks from working in my garden, I looked at your Excel SS and did a quick annual trend and tested for autocorrelation with the GISS series. I found no significant autocorrelation. I suspect that the autocorrelation for these series derives from the correlation of month to month temperature anomalies.

    By the way if you do a regression analysis from your Excel add-in it automatically calculates your residuals that you require for a check for autocorrelation.

    I agree with the poster here who suggested compensating for autocorrelation by reducing the degrees of freedom. As you and others have suggested a model that avoids autocorrelation is the best answer for compensating for autocorrelation.

    The Durbin-Watson statistic is used for testing for significant autocorrelation – both positive and negative. You do need at least 50 data points, as I recall, for a reasonable DW test.

    I appreciate your approach to showing your complete analysis and not hesitating to ask for help where you think you require it. That takes self-confidence. And that leads me to my disclaimer that I am not sufficiently qualified to pass judgment on what you have done here. I can only suggest what I would do and hope like you that I am corrected by the experts. I am not so much self confident as I am of an advanced age that being wrong is no big deal and particularly if I can learn from it.

  6. On further analysis of the monthly versus annual GISS time series and autocorrelation, I failed yesterday to note that with the monthly series I obtain the same standard error of the mean (SEM) as you did for the uncorrected monthly and using the annual (without autocorrelation) I obtained the same SEM as you did for the corrected monthly. Therefore under these conditions I do not see why one would use the correlated monthly series.

    I did a DW test on the yearly series residuals and found no significant autocorrelation (not even close, but the sample size was only 28 with 50 being preferred) while the test showed the monthly series having a very significantly and positive autocorrelation.

    I am not up to date on how you plan on using the derived information here, but if it is to compare it to a climate model result from a scenario then it seems we invariably get the interminable arguments about was the input wrong/right or was the model processing of the input wrong/right or were both wrong but gave the right final result.

  7. Lucia, if you apply the methods used in Santer et al 2008 (based on much earlier work by others), they calculate (or appear to calculate) the effective number of degrees of freedom as n_eff = N*(1-r)/(1+r), where r is the lag-1 autocorrelation of the OLS residuals. This reduces the number of degrees of freedom to 42 in your example. They then increase the standard error of the trend estimate by sqrt( (N-2)/(n_eff-2) ). In your case, the OLS CI is 1.45 to 1.81; using the autocorrelation adjustment of Santer et al, this increases the CI to 1.1 to 2.1 deg C in your example.

  8. Lucia, here’s an R-script which should provide a “macro” in a sane language for your calculation here.

    cochrane_orcutt=function(y,K=10) {
    CO=rep(list(NA),K)
    year=c(time(y));N=length(y)
    CO[[1]]=list()
    CO[[1]]$fm=lm(y~year)
    CO[[1]]$rho=arima(CO[[1]]$fm$resid,order=c(1,0,0))$coef[1]

    for (i in 2:K) {
    CO[[i]]=list()
    rho=CO[[i-1]]$rho
    ynew=y[2:N]-rho*y[1:(N-1)]
    xnew=year[2:N]-rho*year[1:(N-1)]
    fm=lm(ynew~xnew)
    CO[[i]]$fm=fm
    mnew=fm$coef[2]; bnew=fm$coef[1]/(1-rho)
    resid = y - (mnew*year + bnew)
    CO[[i]]$rho=arima(resid,order=c(1,0,0))$coef[1]
    }

    cochrane_orcutt= list(fm=CO[[K]]$fm,rho=CO[[K]]$rho,CO=CO)
    cochrane_orcutt
    }

    library(xlsReadWrite)
    download.file("http://rankexploits.com/musings/wp-content/uploads/2008/03/cochraneorcutt79_2008.xls","temp.xls",mode="wb")
    lucia=read.xls("temp.xls",colNames = TRUE,sheet = 5,type = "data.frame",colClasses="numeric",from=11)
    lucia=lucia[1:349,1:15] #only to row 360 although one extra row
    names(lucia)[c(3:5,10)]=c("time","y","resid","CO.resid")
    y=ts(lucia$y,start=c(1979,1),freq=12)
    test=cochrane_orcutt(y)
    test$rho #0.78554945
    summary(test$fm)

    sapply(test$CO, function(A) A$rho)
    #0.78527670 0.78554795 0.78554944 0.78554945 0.78554945 0.78554945 0.78554945 0.78554945 0.78554945 0.78554945

  9. SteveM–
    I haven’t gone back to this old post; it was written when I was working on figuring out how to deal with autocorrelation, so it is mostly a question. Since it was written as a question while I was figuring things out, it might not give the correct result. (I framed the question around a non-contentious case precisely for that reason.)

    I’ve been using the Santer method more recently and have compared. The sizes of the uncertainty intervals with CO are usually comparable to those from Santer, but not exactly the same. So, yes, the numbers are different.
