{"id":15698,"date":"2011-06-17T09:48:35","date_gmt":"2011-06-17T15:48:35","guid":{"rendered":"http:\/\/rankexploits.com\/musings\/?p=15698"},"modified":"2011-06-17T09:54:12","modified_gmt":"2011-06-17T15:54:12","slug":"noaa-may-cooler-than-april","status":"publish","type":"post","link":"https:\/\/rankexploits.com\/musings\/2011\/noaa-may-cooler-than-april\/","title":{"rendered":"NOAA: May cooler than April."},"content":{"rendered":"<p>NOAA&#8217;s May temperature anomaly is in. I&#8217;ll be darned but it was down relative to April.  Must be that dying sun. \ud83d\ude42   <\/p>\n<p>The monthly observations since Jan 2000, are plotted along with the multi-model mean based on the A1B SRES, trend fits based on least squares and an choice of ARMA(1,1) that creates the widest uncertainty intervals based on a subset tested:<\/p>\n<p><a href=\"http:\/\/rankexploits.com\/musings\/wp-content\/uploads\/2011\/06\/NOAA_NCDC_May2011.png\"><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/rankexploits.com\/musings\/wp-content\/uploads\/2011\/06\/NOAA_NCDC_May2011-500x500.png\" alt=\"\" title=\"NOAA_NCDC_May2011\" width=\"500\" height=\"500\" class=\"aligncenter size-medium wp-image-15700\" srcset=\"https:\/\/rankexploits.com\/musings\/wp-content\/uploads\/2011\/06\/NOAA_NCDC_May2011-500x500.png 500w, https:\/\/rankexploits.com\/musings\/wp-content\/uploads\/2011\/06\/NOAA_NCDC_May2011-300x300.png 300w, https:\/\/rankexploits.com\/musings\/wp-content\/uploads\/2011\/06\/NOAA_NCDC_May2011.png 1008w\" sizes=\"auto, (max-width: 500px) 100vw, 500px\" \/><\/a><\/p>\n<p>Some readers will notice I&#8217;ve picked a new color scheme. Let me know if these choices seem &#8216;visible&#8217; or &#8216;difficult to see&#8217;.<br \/>\n<!--more--><\/p>\n<ol>\n<li>As usual, the graph contains the trend fit using ordinary least squares and the associated uncertainty intervals computed under the assumption that the &#8216;weather+measurement error&#8217; process is AR(1).  These are shown in gold.  Under this assumption, the multi-model mean trend (dashed black: 0.205 C\/dec) as a point value falls outside the 2&sigma; uncertainty for the trend of earth&#8217;s temperatures as reported by NOAA\/NCDC which indicates the upper 2&sigma; value for trends is 0.164 C\/decade.   <\/li>\n<li>The trend fit using ARIMA(4,0,0) is also shown (dark green). This fit was selected in a rather odd way. It is not the <i>best<\/I> fit ARIMA for the data shown: it is the ARIMA(p,0,q) with p or q up to 4 that results in the <I>largest<\/i> uncertainty intervals.  I pumped up the uncertainty intervals from that fit by taking the pooled average of the reported uncertainty intervals and the standard deviation of all trends from all ARIMA&#8217;s tested.    The 2&sigma; uncertainty intervals for the trend are illustrated with straight dashed lines in dark green.   Note the trend based on the multi-model mean also falls outside the upper 2&sigma; value for the trends on this basis.\n<p>Statistical tests to determine whether the multi-model mean is consistent with the NOAA\/NCDC trend require using the pooled variance of trends from all models in the multi-model mean and the estimated variance. The ratio of the difference of the multi-model mean and observed trends nomalized by the pooled results in d*= 2.09; this is larger than the critical T value for the 95% confidence interval for 135 degrees is 1.98.  
Being conservative, and assuming I need to reduce the number of degrees of freedom for the t-test by estimating the degrees of freedom with a red-noise correction, the critical t is 2.02. In both cases, d* is greater than the critical t. Based on this comparison, the trend for the multi-model mean is both higher than and inconsistent with the observations. Contingent on the assumptions used to estimate the uncertainty intervals, the multi-model mean is rejected relative to the trend in NOAA/NCDC. (A sketch of this trend comparison appears after the list.)

Cherry picking notes:
Bear in mind that if one wished to cherry pick a start date to get a result one "likes", start dates during relative minimums will give higher trends; start dates during relative maximums will give lower estimates. Jan 2000 is during a La Niña and gives higher observed trends than a start date of Jan 2001. So, the choice of 2000 tends to be more favorable to the models than the choice of 2001. (I still prefer 2001 for testing, owing to the date when the SRES were published.)

Also: this is NOAA/NCDC only. Full results should involve HadCrut and GISTemp. I'll discuss those when those agencies report.

3. Having noticed that the observations have tended to fall below the multi-model mean over the entire decade, I decided to test both the constant and the trend for a least squares fit. The appropriate place to test whether the constant for the fit through the observations agrees with a forecast is the center of the data set tested. So, for 137 evenly spaced monthly points, we test at the midpoint of the record. The easiest way to do this is to shift the time index so it runs over [-68, 68] instead of [0, 136] and perform the fit on that centered series. Fit routines then return an uncertainty in the intercept which corresponds to the appropriate value to test. (A sketch of the centered fit appears after the list.)

(For those wondering: despite framing this as a parameter in a model fit, this test amounts to testing whether the mean temperature over the 137 points differs between the observations and the multi-model mean. Also, to some extent, the differences in outcome between this test and the previous one will be affected by whether the trend for the model mean exceeded the observed trend during the baseline period.)

Owing to baselining, comparisons of the constants associated with fits can only be done using data outside the baseline period. IPCC projections are stated relative to the average for Jan 1980-Dec 1999, which sets the period I use to rebaseline models and data. So, comparisons of the constants for fits are restricted to fits with data beginning in 2000. This is the main reason today's post shows data beginning in 2000 rather than my usual choice of 2001.

Also owing to the baselining, the uncertainty in the estimate of the observed mean relative to the multi-model mean must include the uncertainty in determining the value of '0' from the finite amount of data in the baseline period. I estimated this, too, using the ARIMA fit to the baseline period that gave the largest uncertainty estimate during the baseline period.

To let readers visualize the uncertainty in the constant, I placed a vertical line at the appropriate time point. A thin vertical dark green line appears just after 2005 in the figure above. The top and bottom of the line touch the curved dashed dark green lines.
Note that the top of this line falls below the multi-model mean at that point in time and just grazes the lower 1σ uncertainty for the multi-model mean (heavy grey line just outside the heavy black line). This suggests the constant value for the fit through the observations falls below that for the multi-model mean.

Of course, I also applied a test to determine whether the constant value for the multi-model mean and the observations match, using an estimate of uncertainty that includes that for the spread of model means. The result is d* = 2.88, which indicates the difference between the 137-month mean for the observations and for the multi-model mean is statistically significant. The diagnosis: contingent on the choice of start date, observational set and method of estimating the uncertainty intervals, the multi-model mean is warmer than the data and the difference is statistically significant at the 95% confidence level.

Cherry picking note: Of course we should do the test with all agencies' reports of observations. Also, for this test, picking a start date during a relative minimum has the opposite 'cherry picking' effect relative to that for comparing trends. That is: starting during a low temperature will result in larger values of d*. So, tests starting in 2001 will result in lower d* than tests starting in 2000. When all three land series have reported for May, I'll report all results in a table, and also show results starting in 2001.

4. Believe it or not… rumor has it that theory tells us the errors in the estimated trend and in the estimated constant for a linear fit are uncorrelated. (And, yes, I've checked the rumor, at least when the errors are white. I have not fully tested other noise models, but I suspect I'll confirm it more generally.)

This suggests that I can create a more powerful metric based on the pooled values of the two d*'s. That is: if I would 'reject' the models as showing too much warming when the combination of both d*'s suggests the models were too warm over the 137-month period and the model trend is also too high, and I would reject them as showing too little warming if the opposite occurred, but otherwise accept the models as on track, then I can create a statistic to monitor this by computing the pooled d* as the square root of the average of the squares of the two d*'s. (A sketch of this check and the pooling appears after the list.)

This results in d*_pooled = 3.51, which is well outside the 95% confidence intervals, contingent on accepting that the ARIMA I chose gives an upper bound on the uncertainty for the 137-month trends and 137-month means.

Cherry picking note: Recall that people testing trends who wish to cherry pick the start year to get a result they "like" would make opposite choices for comparisons of trends and of means. So this pooled metric is fairly robust to the choice of start year and more difficult to cherry pick. It is also more statistically powerful because it is based on data from 1980-2011 rather than data from 2000-2011 alone, and includes information from both deviations in the mean and in the trends. (Note to those wondering: picking a more powerful statistical test is not 'cherry picking'; it's called 'good practice'. If one becomes aware of a more powerful statistical test to apply to a predefined data set and the new method doesn't introduce adverse features like bias, one should pick the more powerful test.
Always.)

For those wondering about the main message: based on NOAA/NCDC, the multi-model mean is running hot, and the difference is statistically significant. You'll be seeing more pooled d* results when GISTemp and HadCrut report.
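For readers who like to see the machinery, here is a minimal Python sketch of the trend comparison from points 1 and 2. It uses synthetic monthly data in place of the NOAA/NCDC anomalies, a rough AR(1) inflation of the trend standard error, and an assumed spread of model trends; only the 0.205 C/decade multi-model trend is taken from the post, so the printed numbers will not reproduce d* = 2.09.

```python
# Sketch only: synthetic anomalies and an assumed model spread stand in for
# the real NOAA/NCDC series and the A1B multi-model runs.
import numpy as np
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(0)

n = 137                                          # Jan 2000 - May 2011, monthly
t = np.arange(n) / 120.0                         # time in decades
obs = 0.05 * t + 0.10 * rng.standard_normal(n)   # stand-in anomalies (C)

# Ordinary least squares trend fit
X = sm.add_constant(t)
fit = sm.OLS(obs, X).fit()
b_obs, se_ols = fit.params[1], fit.bse[1]        # trend and its naive SE

# Rough AR(1) ("red noise") inflation of the trend standard error using the
# lag-1 autocorrelation of the residuals -- one common approximation.
rho = np.corrcoef(fit.resid[:-1], fit.resid[1:])[0, 1]
se_obs = se_ols * np.sqrt((1 + rho) / (1 - rho))

# Multi-model mean trend (quoted in the post) and a hypothetical model spread.
b_model = 0.205                                  # C/decade
s_model = 0.06                                   # C/decade, assumed spread

# d*: trend difference normalized by the pooled uncertainty, compared to the
# critical t for ~135 degrees of freedom (about 1.98 at 95%).
d_star = (b_model - b_obs) / np.sqrt(se_obs**2 + s_model**2)
t_crit = stats.t.ppf(0.975, df=n - 2)
print(f"observed trend = {b_obs:+.3f} C/decade")
print(f"d* = {d_star:.2f}  vs  critical t = {t_crit:.2f}")
```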
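The "pick the noise model with the widest uncertainty intervals" step in point 2 can be sketched as a loop over ARMA(p, q) error models. This is an illustration of the idea, not the actual script used for the figure: it fits a trend with ARMA errors for every p and q up to 4 on a synthetic series and keeps the fit with the largest trend standard error. The extra step of pooling with the spread of trends across all ARIMAs tested is omitted.

```python
# Sketch only: keep the ARMA(p, q) error model giving the widest trend SE.
import warnings
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(1)
n = 137
time = pd.Series(np.arange(n) / 120.0, name="time")          # decades
y = (0.1 * time + 0.1 * rng.standard_normal(n)).rename("anomaly")

widest = None
with warnings.catch_warnings():
    warnings.simplefilter("ignore")          # some (p, q) fits complain
    for p in range(5):
        for q in range(5):
            if p == 0 and q == 0:
                continue
            try:
                res = ARIMA(y, exog=time, order=(p, 0, q), trend="c").fit()
            except Exception:
                continue                      # skip fits that fail outright
            se = res.bse["time"]              # SE of the trend coefficient
            if widest is None or se > widest[2]:
                widest = ((p, q), res.params["time"], se)

(p, q), b, se = widest
print(f"widest-uncertainty noise model: ARMA({p},{q})")
print(f"trend = {b:.3f} +/- {2 * se:.3f} C/decade (2 sigma)")
```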
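The centering trick in point 3 is easy to demonstrate. In the sketch below (again with stand-in data and placeholder model numbers), shifting the time index so it runs from -68 to +68 makes the fitted intercept equal the mean over the 137-month window, and the fit routine's intercept uncertainty is then the right quantity to test against the multi-model mean.

```python
# Sketch only: centered fit so the intercept is the mean over the window.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 137
idx = np.arange(n, dtype=float)                  # months 0 ... 136
anoms = 0.002 * idx + 0.1 * rng.standard_normal(n)

centered = idx - idx.mean()                      # runs from -68 to +68
X = sm.add_constant(centered)
fit = sm.OLS(anoms, X).fit()
c_obs, se_c = fit.params[0], fit.bse[0]          # intercept = mean over window

print(f"intercept at the record midpoint = {c_obs:.3f} +/- {se_c:.3f}")
print(f"plain sample mean for comparison = {anoms.mean():.3f}")

# The d* for the constant pools the fit uncertainty, the spread of model
# means, and the uncertainty in the 1980-1999 baseline 'zero'
# (all placeholder values here, not the post's inputs).
c_model, s_model, s_base = 0.55, 0.08, 0.03
d_star = (c_model - c_obs) / np.sqrt(se_c**2 + s_model**2 + s_base**2)
print(f"d* for the constant = {d_star:.2f}")
```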
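Finally, the 'rumor' in point 4 and the pooling are both easy to check numerically. The sketch below runs a small Monte Carlo with white noise to confirm that the slope and intercept of a centered fit are uncorrelated, then forms the root-mean-square pooled d* from two placeholder values rather than the d*'s computed above.

```python
# Sketch only: white-noise check of slope/intercept independence for a
# centered fit, plus the root-mean-square pooling of two d* values.
import numpy as np

rng = np.random.default_rng(3)
n, n_sims = 137, 5000
x = np.arange(n) - (n - 1) / 2.0        # centered index, -68 ... 68

slopes, intercepts = np.empty(n_sims), np.empty(n_sims)
for i in range(n_sims):
    y = rng.standard_normal(n)          # pure white noise, zero true trend
    b, a = np.polyfit(x, y, 1)          # slope, intercept
    slopes[i], intercepts[i] = b, a

print("correlation(slope, intercept) =",
      round(float(np.corrcoef(slopes, intercepts)[0, 1]), 3))   # ~0

# Pooled d*: root-mean-square of a trend d* and a constant d*
# (placeholder inputs for illustration).
d_trend, d_const = 2.5, 3.1
d_pooled = np.sqrt((d_trend**2 + d_const**2) / 2.0)
print("pooled d* =", round(float(d_pooled), 2))
```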