The GISTemp Land Fraction

Much digital ink was spent over the weekend castigating GISTemp for the supposed sin of mixing up the land and ocean areas of the earth. Frank Lansner (and Bob Tisdale, in a somewhat more nuanced post) both point out that successfully replicating the global land/ocean GISTemp record requires a weighting ratio of around 70% land and 30% ocean instead of the 29% land / 71% ocean that characterizes the real world. If this were true, it would indicate a practice difficult to defend on the part of our friends at NASA. Fortunately, however, it is based on a simple misconception: mistakenly using the published GISTemp land record as an estimate of actual land temps.

[Fig 1]

As we discovered back during The Great GISTemp Mystery, the land temperature record published by GISTemp is not actually an estimate of global land temperatures. Rather, it’s an approximation of global temperatures using only land stations. It does not use a land mask, and has strong zonal weightings (90°N to 23.6°N, 23.6°N to 23.6°S and 23.6°S to 90°S with weightings 0.3, 0.4 and 0.3, respectively) whose net result is to considerably lower the resulting temp record vis-à-vis a true land-only record. This is shown in Figure 1, where we can see the GISTemp land record running considerably below a standard land-masked and unweighted reconstruction using the GISS Step 0 data.

The claim that GISTemp would use such an odd weighting scheme seemed rather far-fetched to me, since I’ve spent so much time working on reconstructing GISTemp using the underlying data. Indeed, the very fact that we can easily replicate Hadley-area GISTemp by using a standard 29% land / 71% ocean weighting of GISS Step 0 and HadISST/Reynolds suggests that GISTemp isn’t doing anything too screwy. The difference between standard GISTemp and GISTemp run using the Hadley area is entirely interpolation, and isn’t nearly large enough to account for a land/ocean ratio reversal.
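
To make the replication concrete, here is a minimal Python sketch of the two combinations discussed above: the fixed-ratio land/ocean blend and the zonally weighted met-station index. All series here are synthetic placeholders, and the real GISTemp code works cell-by-cell (see Nick Barnes’s comment below) rather than on global series, so treat this only as an illustration of the arithmetic.

    # Minimal sketch, assuming aligned monthly global anomaly arrays.
    # The inputs are synthetic placeholders, not real reconstructions.
    import numpy as np

    rng = np.random.default_rng(0)
    months = 12 * 130  # roughly 1880-2009

    land_masked = rng.normal(0.0, 0.2, months)  # stand-in for a true land-only record
    sst = rng.normal(0.0, 0.1, months)          # stand-in for HadISST/Reynolds

    # Replicating the combined record with real-world area fractions:
    land_ocean = 0.29 * land_masked + 0.71 * sst

    # The published GISTemp "land" index is instead a zonally weighted average
    # of station-only zonal means (90N-23.6N, 23.6N-23.6S, 23.6S-90S):
    zone_n, zone_t, zone_s = (rng.normal(0.0, 0.2, months) for _ in range(3))
    dts_style = 0.3 * zone_n + 0.4 * zone_t + 0.3 * zone_s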

[Fig 2]

Indeed, the only thing that can account for the odd 70% / 30% land/ocean ratio is mistakenly combining the GISTemp land series with an ocean series, and failing to take into account that the GISTemp land series is not a true land record. Figure 3, below, shows how a 70 / 30 combination does replicate GISTemp land/ocean fairly well. It also shows, however, that using the correct GISS Step 0 series requires only a 35% land / 65% ocean ratio to fully match the interpolated version:

[Fig 3]


We see that GISTemp’s interpolation adds an implied 6% to the global land area, rather than the 41% originally claimed. Even this is somewhat misleading, though, since the area mainly covered by interpolation (e.g. the Arctic) is warming much faster than the average land temperatures, so adding a little bit of coverage there has an outsized effect on the trend. Regardless, this exercise shows that folks should be a tad more careful before accusing our friends at NASA of deliberate manipulation, though it also suggests that GISS should probably alter the descriptive text on their website to make it clearer what exactly their land record represents.

As a small bonus, here is what you would get if you actually did a 70% land / 30% ocean weighting using the correct land record:

[Fig 4]

Update

A more careful reading of Bob Tisdale’s post shows that he indeed pointed out this issue, remarking that “Note how the GISTEMP LST data extends out over the oceans. This is not the case for their combined product, because GISS masks the LST data over the oceans in its combined product. So in order to properly create a weighted average of GISTEMP land and sea surface temperature data with 1200km radius smoothing, the land surface data where it extends out over the oceans would first need to be masked.”

169 thoughts on “The GISTemp Land Fraction”

  1. To be fair, GISS is responsible for much of the confusion by not being clearer about what the land temp series on their site actually represents.

  2. Could you show the components a bit more explicitly, to show why the GISS met station index runs cooler than a true land record? Maybe show your results for the three different latitude zones, so one could get a sense of what happens when they’re put together using different weightings?

    I agree with your conclusions. The GISS page needs a note explaining what their land record is. The sceptics.. well, I thought Bob Tisdale was on the right track, at least.

    Anyway, my wish came true, in a most unexpected way. When you stumbled across this the first time, I wanted you to write a breathless post, accusing GISS of fraud because their land index was not warming fast enough. Little did I expect the sceptics to take that and unwittingly turn it upside down in this way, and bless us with the ‘Hansenizer’.

    It’s actually hilarious – it’s as if the entropy of the universe is maximised when the sceptics find some way to accuse GISS or NOAA of inflating warming. Somehow, they’ll just find a way to do it.

  3. Luis Dias:
    I wouldn’t call it a stupid mistake by WUWT and friends. The GISS page should give a better description. But the fault is in how the sceptics react, when they find something they don’t understand. Especially the lack of self-doubt, considering how the official records have been independently confirmed inside and out at this point.

    Compare it to the “great gistemp mystery” post here, when Zeke came across the same thing from the front (instead of the back), and there was puzzlement for a bit, until an inquiry to GISS and somebody reading the IPCC report cleared it all up.

  4. What GISTEMP actually does is this:
    0. It uses a global grid with 80 boxes, each of which is divided into 100 cells. 8000 cells altogether. Each cell has equal area.
    1. From GHCN, USHCN, and a small amount of other data, it computes a land-based monthly anomaly series for every grid cell (steps 0-3).
    2. From Reynolds it computes an ocean-based monthly anomaly series for every grid cell (step 4).

    Then in step 5:

    3. For each grid cell, it combines these land-based and ocean-based series to produce a single monthly anomaly series.
    4. It combines these grid cell series into box series: each cell is weighted equally (because they have equal area). Each datum in a box series has an associated weight, according to how many cell data contributed to it (i.e. how many cell series had valid data for that month).
    5. It combines box series into zonal series, including a global series. This combination is weighted according to the weights calculated in part 4 above.

    Obviously the meat relating to your article is in part 3 above, which has some tricky parameterized code which turns out to have very simple behaviour because of the values of the parameters. For each grid cell, if there are fewer than 240 valid monthly data points in the ocean series, or if the nearest land station is less than 100km from the centre of the cell, then the land series is used. Otherwise the ocean series is used for that grid cell.

    The global land series is computed using the same weighting and combining code, and can be thought of as simply flagging all the ocean series as missing data (and in ccc-gistemp this is exactly what we do).

    So all this discussion of 30% land weight or 70% land weight is based on a misconception: that the GISTEMP global combined series is calculated somehow from the global land series and the global ocean series. It is not.
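
    A minimal sketch of that per-cell selection rule, with invented helper names (the real, parameterized logic lives in step5.py and parameters.py):

        # Sketch of the per-cell land/ocean choice described above; the
        # parameter values are as stated, the function names are invented.
        MIN_OCEAN_MONTHS = 240      # less valid ocean data than this: use land
        MAX_STATION_DIST_KM = 100   # a land station this close: use land

        def choose_cell_series(land_series, ocean_series, nearest_station_km):
            """Return the anomaly series to use for one grid cell."""
            ocean_months = sum(1 for v in ocean_series if v is not None)
            if (ocean_months < MIN_OCEAN_MONTHS
                    or nearest_station_km < MAX_STATION_DIST_KM):
                return land_series
            return ocean_series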

  5. Nick Barnes,

    You are correct, of course, though what GISTemp does effectively results in the correct land/ocean ratio, since you can reconstruct it (or, at least the Hadley area run) by using GISS Step 0 and HadISST/Reynolds in a 29% / 71% combination.

    The zonal weighting also does screwy things when only land temps are used.

    Carrot eater,

    I could run each band separately, though I’d have to know the actual land fraction in each to put them together in a way that reflects the true land record (instead of simply using 0.3, 0.4, and 0.3 weights).

  6. I’m aware of my responsibility to familiarize myself with the content of recent postings before asking questions (I’ve been an avid lurker on this site for a long time but the recent “reconstruction” discussions don’t interest me, although I am VERY glad they are happening), but I could use some help.

    Apparently, independent reconstructions based on the raw data closely resemble the prior “official” records? If so, that’s good to know. But, it seems to me that people are saying that independent reconstructions are matching GISTemp, etc., which, I thought, was supposed to be corrected for UHI effects.

    My primary interest has always been HOW the UHI adjustments were made and discussions as to the best method of making them. Do we now know how UHI adjustments were made and are we now in position to discuss that topic? Does this mean that GISTemp isn’t using UHI adjustments?

    I’d appreciate anyone taking the time to straighten me out on this. I’m sure I’m missing something (or a lot of things).

  7. BillB,

    Not quite, it just means that UHI adjustments are small enough that they are well within the range of methodological uncertainty in the reconstructions. E.g. valid choices on methods (CAM, LSM, grid box sizes, etc.) have as large an impact as UHI adjustments.

    Here is the GISTemp adjustment per CCC:

    http://clearclimatecode.org/gistemp-urban-adjustment/

  8. Zeke,
    Even if you don’t know the land/sea fractions for the three bands, it’d be useful to see the bands individually. The differences between them should tell us something.

    Realising that you aren’t the most able to do this.. I could just run ccc for myself.

  9. BillB

    The issue with GISTEMP UHI adjustments is that they are rather small. In fact, some of the adjustments warm “urban” stations.

    The algorithm essentially takes an “urban” station and then adjusts it using nearby “rural” stations.

    Issues:

    1. Are the rural stations really “rural”? This is a TOUGH metadata
    question. But Ron B is making some headway on new metadata.

    2. Does the GISS adjustment make sense? The algorithm DOES add warmth to some URBAN stations. This is counterintuitive. Not wrong, but it definitely bears some scrutiny.

    However the bottom line is this. The UHI signal is most likely to be rather small. We know this by doing gross comparisons between urban and rural. For example, if the land is showing
    a warming of 1C, we know that the UHI portion of this is less
    than .3C or so (that’s the highest figure cited in the literature). By looking at UAH we can further constrain the envelope of expected UHI contamination.

    We can probably have a bunch of fun arguments about how to size and measure the signal. I’m probably going to argue for an approach that reduces the effect by careful selection of stations
    and live with the attendant uncertainty. Or perhaps an approach like Jones’, whereby the UHI is buried in the error bars.

    Adjusting for UHI seems like a fool’s errand. I’ll have to look at all the stuff I’ve said in the past, but off the top of my head.. Jones puts the figure at .05C per century
    and M&M put it at .3C since 1979. The truth’s in between that. Maybe
    .15C of the land record.. if it was bigger than that it would pop right out in an area analysis.

    So UHI is real. It’s likely bigger than what Jones says, but not as large as skeptics need to make their case. Oh, and the refinement of this measure doesn’t change the science. It MIGHT impact testing models against observations. But CO2 still warms the planet and that is not something we can ignore.

  10. McI was whinging about the upwards UHI adjustments on isolated stations for a long time. Given his general intelligence level, I almost wondered if he was being deliberately disingenuous. I pointed out the concept for why this might happen several times and he just totally failed to even acknowledge and deal with the issue.

    Mosh is correct that this may/may not be an addressable issue (for instance if the stations we believe to be unaffected by UHI are themselves really affected by UHI). But this is a separate issue.

    The basic rationale for the Hansen adjustments is to take rural stations and assume they have no UHI. We can debate this, but let’s say we were in business, had to make a marketing bet and…it’s a decent assumption. So at least take it as a given, for the purpose of discussion here.

    So, essentially what Hansen does is to “correct” the urban station for what it ought to say if it were responding like the rural station. How this affects degrees of freedom, whether one should use rural only, or whether there is some benefit in keeping the urban but correcting it is an interesting question. (Still not the point of this post though.)

    So you have Rn (rural reference = average of nearby rural stations) and U1 (urban to be corrected). And you take the difference between the two and call that UHI. Then assign that correction to U1. Essentially U1(new) = U1 – (U1 – Rn), which reduces to U1(new) = Rn: the urban station is made to track the rural reference.

    Now…something to consider is that there is inherent noise and variability INDEPENDENT of UHI. For one thing, two different R stations may have some difference in trend themselves (and remember these are the references, assumed to have no UHI). Therefore, even an assumed U1 (let’s say with “0” UHI in reality) might have a CHANCE difference from the rural references (just like they have chance differences from each other).

    Given this, it is likely that SOME of the corrected urban stations will actually be “warmed” by the UHI correction (especially if in reality UHI is a small effect compared to inherent sensor-to-sensor variations). When McI blathers about this it makes no sense…since he’s not grokking the above concept.

    Furthermore, if you “culled” these warmed corrections from the dataset, you would BE WRONG, since there are also some “chance” variations that give OVERCORRECTION of UHI. (IOW too large of a correction.) You have to leave in the intuitively strange (warmed urban) spots to counteract the effect of overcorrected (overcooled by correction) urban spots. (A toy simulation after this comment illustrates the effect.)

    I’m honestly BLOWN AWAY that a guy as smart as McI did not at least comprehend and discuss this simple concept. It actually makes me think he’s a sophist, that he did not describe this concept or respond when it was explained to him. Several times! By several people!

    Note1: If the rural references themselves are biased, that’s a different issue. But independent of the concept above.

    Note2: If UHI really is small, negligible, this is the expected result. And let’s at least entertain that possibility! Let’s not be like Watts who looks for surface station effects and when he doesn’t find it, but still sees warming, says “that’s the signature of larger scale UHI”. It’s like even the possibility of GW (what we are looking for) was not in his possibility list! Maybe cities were mostly grown by 1900, maybe the urban stations are a small fraction, maybe wind blows a lot, maybe UHI is an effect that has diminishing returns after a certain size, maybe airports really are a decent place to measure temperature, etc.
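
    A toy simulation of that chance-correction effect (all numbers invented for illustration; this is not the GISTemp algorithm, just the statistical point):

        # With zero true UHI, station-level trend noise means roughly half
        # the corrections (Rn - U1) come out positive, i.e. they "warm" the
        # urban station. All parameters here are made up.
        import numpy as np

        rng = np.random.default_rng(1)
        n_stations = 10_000
        noise_sd = 0.05  # chance trend difference between stations, degC/decade

        urban_trend = rng.normal(0.0, noise_sd, n_stations)  # true UHI = 0
        rural_trend = rng.normal(0.0, noise_sd, n_stations)

        correction = rural_trend - urban_trend  # adjustment applied to urban trend
        print((correction > 0).mean())          # ~0.5: half the stations "warmed"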

  11. It’s a curious situation when I view replies from the likes of Zeke & Steve Mosher in the same light as others might see a personal communication from Obama, the Pope or Bono. I’m going to have to reflect on this, I might just be spending too much time reading about Climate Change.

    I have always believed that there has, indeed, been warming over the last 100 + years or so. I also suspected that some portion of the increase in the record was probably due to UHI. The question (for me) has always been, How Much?

    The skeptical community’s obsession with micro-siting, UHI adjustment algorithms, data collection, etc. was brought about by the climate scientists’ lack of transparency regarding data and methods. Now, thanks to your hard work, the fog is clearing and the details are emerging.

    The skeptical position, that there is probably some warming coming and some portion of that may well be due to man’s actions but that it is unlikely to be catastrophic, is no more dependent on GISS fudging the temperature record with evil intent than the alarmist position was dependent on the Hockey Stick. The “Alarmists” lost credibility defending “that which could not be defended”. I don’t want to see skeptics make the same mistake by continuing to attack “that which perhaps should no longer be attacked”. But I suspect my hopes will be in vain.

  12. I’m sorry this has nothing to do with the post but I am going to keep asking this question until somebody gives me an answer. In the time of the dinosaurs, we have 7,000ppm of co2. What happened to all of this co2? Why didn’t the planet blow up? Why won’t somebody just admit that the earth is self-regulating? I find it amazing that myself, a 24yr old seems to know more about the climate system than Gavin Schmidt or Michael Mann or take your pick. Is it not obvious that the oceans simply absorb the co2 and convert into rock, like limestone? Honestly, it is such a joke that a blog site like this even exists. I am convinced that Lucia knows that I am 100% right but will not say anything because then her blog site has no purpose. I’m sorry but until a global warmer explains this anomaly I am forced to treat this whole debate as a complete joke. The earth had tons of co2 and it managed to filter all of it out and the earth didn’t blow up, that is why we are here. Therefore, you people are essentially saying that if we add a little more co2 (nowhere even close, not a chance of seeing 7,000ppm for an incredibly long time) it is going to cause disasters. Therefore, the argument holds no water, makes no sense. We know that the earth had x amount of co2, but now!…it cannot hold even close to that much but all of these bad things are going to happen. Lucia, please just admit that I nailed it.

  13. “So UHI is real.”

    Nobody questions the reality of UHI. It just doesn’t bias the GISTEMP trend towards warming. The satellite record shows the same trend. Nor do the siting issues. Menne et al (2010) and the satellites, again. You need a new talking point.

  14. The weird thing is one can’t tell if someone like Shooshman or MikeC is a parody, as there are actually commenters that have this level of understanding. But it sure looks like a parody. Looks like someone trying to make skeptics out to be boobs, by running a sleeper agent.

  15. I find it amazing that myself, a 24yr old seems to know more about the climate system than Gavin Schmidt or Michael Mann or take your pick.

    .
    When I was ten, I thought my parents knew everything. When I became twenty, I was convinced they knew nothing. Then, at thirty, I realized I was right when I was ten.
    -Mark Twain

  16. The flipside of what Zeke said is that solar radiation was less intense, long ago.

    The carbon that we’re putting back into the atmosphere will eventually go back into the oceans and finally the weathering of rocks. But that’s a very slow process.

  17. Zeke: Thanks for misrepresenting the intent of my post, which was to illustrate the error made by Frank Lansner. Did you miss the discussion of Figure 6? It read,
    Figure 6 is a map illustrating the GISTEMP LST data (trends) from 1982 to 2009. Note how the GISTEMP LST data extends out over the oceans. This is not the case for their combined product, because GISS masks the LST data over the oceans in its combined product. So in order to properly create a weighted average of GISTEMP land and sea surface temperature data with 1200km radius smoothing, the land surface data where it extends out over the oceans would first need to be masked.

    It was the simplest way I could present the basic error Frank made. Another portion that the weighted average can’t account for is the fact that GISS deletes SST data where there’s seasonal sea ice and replaces it with Land Surface Data.

  18. Bob,

    Fair enough, I confess to not reading your post carefully enough (and being distracted more by Fig. 5). I’ll correct the original post to indicate as such.

    It is worth noting that the particular approach to zonal weighting decreases the trend about as much as the failure to apply a land mask.

  19. Nick Barnes (Comment#49280) July 19th, 2010 at 10:24 am

    What GISTEMP actually does is this:
    .
    thanks Nick. one of the clearest descriptions of GISS that i have seen so far.
    .
    and thanks to zeke, for the article. very well written.
    .
    and finally thanks to lucia, for allowing such guest posts. good stuff!

  20. Regarding how GISTEMP does UHI analysis and adjustment, here’s another mini-essay on that. This is pretty clear in ccc-gistemp step2.py. We have a new release out now; David Jones has just spent some time clarifying this very section.

    If you are not familiar enough with numbers, science, or data handling for this description to make sense to you, that does not mean that it is all mumbo-jumbo, or “data manipulation” in any pejorative sense. It simply means that *you* cannot tell whether or not it is any good. You are not competent to criticise it. You should therefore think twice before commenting on it.

    For every station, generate a series of annual (Dec-Nov) anomalies. For each urban station, find nearby rural stations and combine their annual anomaly series, weighted by distance from the urban station, into a combined rural annual anomaly series. If you can’t find enough nearby rural stations, look further afield. If you still can’t find enough rural stations, or if their combined series doesn’t have enough overlap with the urban station, discard the urban station completely (it plays no further part in GISTEMP).
    Once you have a satisfactory combined rural annual anomaly series, subtract from it the annual anomaly series for this urban station. Find the best two-part linear fit (a line with a knee), to this difference series. If that linear fit is not “good enough” (several parameters, see parameters.py) use a one-part linear fit instead. Add this linear fit to the urban monthly series, for a period including the fit (i.e. overlapping) section and extending (as a constant) if possible on either side.

    The essential idea is to make sure that each urban station has the same long-term trends as the nearest-available combination of rural stations. The idea of the two-part fit is to allow for urban stations which start, or stop, urbanising at some specific time during the data period (1880-).

    Yes, this will generate step-wise annual behaviour in the resulting urban series (because the same value, based on a fit to December-November annual anomaly differences, is added to each month from December to November). This is already well-known. It doesn’t change the validity of the process although it does suggest a refinement.

    Finally, this process was all described pretty well in Hansen et al 1999 (although that paper describes a fixed ‘knee’ at 1950; later papers describe the refinement). Anyone interested in this question could find that out very quickly with Google. Anyone asking about it here either doesn’t care very much, or cares and has looked up the paper and can’t understand it (in which case you should say “what does this mean?”), or has swallowed some garbage about “secret data manipulations”, or possibly doesn’t know how to use Google.

    Hansen, J., R. Ruedy, J. Glascoe, and Mki. Sato, 1999: GISS analysis of surface temperature change. J. Geophys. Res., 104, 30997-31022, doi:10.1029/1999JD900835.

    http://pubs.giss.nasa.gov/cgi-bin/abstract.cgi?id=ha03200f
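
    For the curious, here is a rough sketch of that two-part (“line with a knee”) fit, using a brute-force knee search. The real step2.py differs in its details (the annual Dec-Nov anomaly handling, the “good enough” tests in parameters.py), so treat this only as an illustration of the broken-line fit itself:

        # Fit diff ~ continuous piecewise-linear in years with one knee,
        # by brute-force search over candidate knee positions.
        import numpy as np

        def two_part_fit(years, diff):
            """Return (knee, rss, coef) for the best one-knee linear fit."""
            years = np.asarray(years, dtype=float)
            diff = np.asarray(diff, dtype=float)
            best = None
            for knee in years[2:-2]:  # leave a few points on each side
                x = years - knee
                # Basis: intercept, base slope, extra slope after the knee.
                X = np.column_stack([np.ones_like(x), x, np.where(x > 0, x, 0.0)])
                coef, _, _, _ = np.linalg.lstsq(X, diff, rcond=None)
                rss = float(np.sum((X @ coef - diff) ** 2))
                if best is None or rss < best[1]:
                    best = (knee, rss, coef)
            return best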

  21. Incidentally, it appears to be my lot both (a) to correct people who attack scientists for “manipulating data” – as if they’ve discovered some secret and nefarious data processing such as, say, the UHI adjustment – and (b) to correct people who claim that the temperature record doesn’t account for (say) UHI. Sometimes the same people.

  22. I feel the frustration seeping through, Nick.

    The giss adjustment is conceptually extremely simple, yet the form of it still generates so much confusion out there.

    I ask people if they’d be happy if GISS just left the urban stations out altogether, assuming they’re being appropriately identified. They usually say yes. Then I try to explain that this is much the same thing. Never get anywhere.

  23. Nick,
    As for your lot in life: look around, and you see this. First, people claim there are all sorts of mysterious adjustments and evil processing that goes on in giss somewhere to invent warming, and they say that all raw data are pure and holy and must never be adjusted. You tell them there isn’t anything mysterious, and you explain what adjustment is done by GISS. Then they start complaining about UHI and all sorts of siting issues, and complain that nobody is accounting for these.

    At which point you smack your head.

    There is definitely legitimate room for discussing whether the GISS or NCDC adjustments can be improved in some way. They surely can be; this will never be perfect. To his credit, Mosh thinks about that, and tries to actually numerically bound the effect he’s looking for. But for the rest of these guys, it seems like they just see a graph with an upwards trend, and assume it’s spurious for some reason or other.

  24. Bob,

    No worries; the poor blog software tends to get overwhelmed easily 😛

    Nick,

    Thanks for the great explanation of the GISTemp method for UHI corrections. Don’t worry about the tone; all of us get frustrated by the relentless sophisms about the temp record every now and then.

  25. Hi!

    When I posted my long article (see PART4):
    http://hidethedecline.eu/pages/posts/the-perplexing-temperature-data-published-1974-84-and-recent-temperature-data-180.php

    And when we posted at Joanne Nova:
    http://joannenova.com.au/2010/07/did-giss-discover-30-more-land-in-the-northern-hemisphere/

    then what we did was to ask: Can this really be??

    !

    Only after a while, when it seemed no really heavy argument against my finding came up at Joanne Nova’s blog, did my tone become more confident in the finding.

    Here at this site it seems that you are pretty sure in your findings. I would like to understand you guys fully, but so far I’m not convinced, and I’m a little sad and surprised about your tone against other sceptics who do an honest truth-seeking job.

    I see you operate with a “step0” – is this the strictly land-area version of GISS LST? (Or could you please describe what the step0 of GISS LST is.)

    In the replies on various blogs, I’ve been told over and over that GISS land also covers ocean. This I knew in advance; I have myself described that in a huge article recently, but that did not explain why only SST is represented in 1900-20 etc. etc. etc.

    I will create a shortcut to this blog in the hope someone takes the time to write in logical terms what it is you are saying. If competent people cannot reach anyone besides those with the same knowledge, their message won’t get far, and what did they win?

    K.R. Frank Lansner

  26. Nick Barnes: You wrote, “For each grid cell, if there are fewer than 240 valid monthly data points in the ocean series, or if the nearest land station is less than 100km from the centre of the cell, then the land series is used. Otherwise the ocean series is used for that grid cell.”

    Since both SST datasets (HadISST and Reynolds OI.v2) are complete datasets, the “if there are fewer than 240 valid monthly data points in the ocean series” portion is no longer required.

    You continued, “So all this discussion of 30% land weight or 70% land weight is based on a misconception: that the GISTEMP global combined series is calculated somehow from the global land series and the global ocean series. It is not.”

    This, and the “if the nearest land station is less than 100km from the centre of the cell, then the land series is used”, and the fact that GISS masks the SST data where there’s seasonal sea ice and extends land data out over the Southern and Arctic Oceans, all prompt the question, what percentage of the GISS data contains land surface data?

  27. For the record, here’s how the subject was introduced in the original article, quote:
    “I would like some reader comments on this one: Did I overlook something or is this a severe problem for GISS?”

    Here’s just one of several cautious words from Joanne Nova’s article:
    “Frank is looking for feedback and suggestions, and wondering if there could be any other explanation. So am I.”

    !!!

    Bob Tisdale showed me a row of maps with different ocean coverage from GISS land stations, which I’m very familiar with. But since the land % from 1980 to 1995 jumped from 40% to 73%, these small changes in GISS ocean coverage 1980-95 are obviously not “it” when explaining things. So I’m very, very sorry, but if you guys “have seen the light”, please share it in a good, logical, easy-to-understand way. Normally, people who really understand a subject are very good at explaining themselves…

  28. Whoops, I spelled my name wrong ;-), it’s Lansner..

    My comment 49368 (6.04) I can’t see, awaiting moderation.
    Bob Tisdale’s comment 49370 (6.10) I can see.
    My comment 49371 (6.22) I can’t see, awaiting moderation.
    ?
    K.R. Frank

  29. Frank, for some reason your comments aren’t clearing the spam filter; I’m asking Lucia to look into it.

    The GISS Step 0 referred to in this post is the temp data that results when unadjusted GHCN data is combined with (adjusted) USHCN data and Antarctic station data in the first step of the GISTemp code. You can find an R script written by Mosh with the necessary datafiles here: http://drop.io/GlobalTemp

    Note that many others (Jeff Id, Nick Stokes, Chad, myself, etc.) all get similar results for the land record. See this post: http://wattsupwiththat.com/2010/07/13/calculating-global-temperature/

  30. Just wondering if anyone thinks GISS should reduce the 1200 km smoothing radius. This episode demonstrates another problem beyond the questionable Southern and Arctic Ocean impacts. Even the original Hansen-Lebedeff paper showed the correlation drops to 0.50 or lower at 1200 km, with varying impacts depending on latitude.

  31. Luis Dias: “WUWT making yet another stupid mistake? WATT? Not possible!”

    Actually, Anthony did something quite interesting that few people noticed. He presented Frank Lansner’s post with my rebuttal immediately after it.

  32. Anthony did something quite interesting that few people noticed.

    I did notice that.

  33. Frank,

    This post might be instructive as to why the GISTemp land series differs from a true land-only reconstruction: http://rankexploits.com/musings/2010/the-great-gistemp-mystery/

    To quote the reply that Dr. Reto Ruedy sent me:

    “The curve NCDC and most likely you are computing shows the mean temperature over the land area (which covers about 1/3 of the globe, a large part of it located in the Northern hemisphere).

    None of our graphs represents that quantity. We could obtain it by
    creating a series of maps, then averaging just over the land areas
    (similar to what we do to get the US graph).

    Since our interest is in the total energy contained in the atmosphere which correlates well with the global mean surface temperature, all our graphs display estimates for the global mean, the ones based on station data only as well as the ones based on a combination of station and ship and satellite data. Obviously, the latter is the more realistic estimate and we keep the first one mostly for the following historical reason:

    When we started out in the 1980s analyzing available temperature data, historic ocean temperature data were not yet available and we did the best we could with station data. As soon as ocean data compilations became available, we used them to refine our estimates (calling it LOTI). But we kept the earlier estimates also, mostly for sentimental reasons; they are rarely if ever mentioned in our discussions (see also the “note” in the “Table” section of our main web site).

    To get back to your question: The mean over the land area is heavily weighted towards the Northern hemisphere and that hemisphere experienced a larger warming than the Southern hemisphere. Hence our estimate which gives equal weight to both hemispheres exhibits a smaller trend, as you noticed, but it still somewhat overestimates the true global mean trend.”

  34. Bob,

    Lucia sent me a note that Frank was being caught in the spam filter for posting links as a first-time poster. He should be fine getting his comments through here now.

  35. I love it when legacy analysis keeps kicking around due to ‘sentimental reasons’

  36. Easy to understand explanation, or my attempt at one:

    Lansner thought (not unreasonably) that the GISS land index was a record for land areas only. But it isn’t. Leave aside for now what the GISS land index actually represents.

    On figure 1, you see that the actual land area trend (calculated by Zeke or Mosh) warms more quickly than the published GISS land index.

    If you take 29% of the actual land record, and 71% ocean record, you get something that’s about the same as the final GISTEMP product. So GISS is not out of whack in this respect.

    But since the GISS land index warms less quickly than the actual land area record, if you try to back-calculate the land/ocean area percentages using the GISS land index, you’ll way overestimate the land area percentage.
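
    Here is a small sketch of that back-calculation (my own illustration, with placeholder inputs): solve combined ~= w*land + (1-w)*ocean for the implied weight w by least squares. Fed the GISS land index as “land”, it should recover something near 0.7; fed a true land-masked series, something near 0.3 (the post above found 0.35 against the interpolated version).

        # Rearranged: (land - ocean) * w ~= (combined - ocean), a 1-D
        # least-squares problem for the implied land weight w.
        import numpy as np

        def implied_land_fraction(land, ocean, combined):
            a = np.asarray(land) - np.asarray(ocean)
            b = np.asarray(combined) - np.asarray(ocean)
            return float(a @ b) / float(a @ a)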

  37. This is marginally off-topic, but there seems to be occasional talk on this thread regarding land vs. ocean records, so I’ll throw it out there.

    Klotzbach et al. (2009) compared 2 surface temperature sets (from CRU & NCDC) with satellite measurements of lower tropo temperatures (UAH & RSS). The key finding is that the 30-year trends for ocean data were similar between the surface & satellite sets, but land values diverged sharply, with the surface-measured trend ~50% higher than satellite-measured. [Actually this represents an even greater disparity, as the tropo trend was expected to be ~1.1x the surface trend over land, apparently based on GCMs.]

    The paper concludes that there is a positive bias trend in the surface land temperature measurements. They cite many possible causes, but barely touch upon urbanization as a source of bias.

    Klotzbach, P. J., R. A. Pielke Sr., R. A. Pielke Jr., J. R. Christy, and R. T. McNider (2009), An alternative explanation for differential temperature trends at the surface and in the lower troposphere, J. Geophys. Res., 114, D21102, doi:10.1029/2009JD011841.

    Klotzbach, P. J., R. A. Pielke Sr., R. A. Pielke Jr., J. R. Christy, and R. T. McNider (2010), Correction to “An alternative explanation for differential temperature trends at the surface and in the lower troposphere”, J. Geophys. Res., 115, D01107, doi:10.1029/2009JD013655.

    http://hurricane.atmos.colostate.edu/Includes/Documents/Publications/klotzbachetal2009.pdf

    http://hurricane.atmos.colostate.edu/Includes/Documents/Publications/klotzbachetal2010.pdf

  38. Hi all!

    Have the patience for me to check your things out tonight, Danish time, OK?

    *question*
    In the meantime, you elite people could briefly explain: NEVER MIND what the GISS LST is, why is it weighted “ZERO” in 1900-20 (only trends of SST to be seen) and then weighted “A LOT” later?
    See: http://hidethedecline.eu/media/GISSglobal/fig1b.jpg

    (Another thing: What you guys don’t know is that I asked Jo Nova to have my findings checked out by her good, competent colleagues before publishing – that was one of the reasons that I used Jo Nova and will co-operate with her in the future. So she did, and in all this went through 6 sets of reviewer eyes before publishing. More: Your idea that Anthony posted Bob’s article as some kind of heads-down to my article, blah blah, is not my impression from the communications going on; I think you are wild guessing.)

    I’m really looking forward to going through whatever it is that makes you so sure of your case, but it will be tonight 🙂 Hope you have patience.

  39. And in general: I think there are 2 discussions relevant:

    1) Is it correct to use a 1200 km radius over sea?

    2) If so, are the final land+SST from GISS correct and based on methods of reasonable logic?

    Check out fig 51 from my new article; it just shows how significantly different the SST and the supposed land+ocean trends are. Imagine that the brown curves were truly 50% land / 50% ocean. See how from 1970 to 1974 the “land-ocean” graphs go 0.1K UP while the SST is going 0.2K UP. It doesn’t exactly point to the correctness of the message from Hansen-Lebedeff 87, that their temperatures were just as much land as sea, does it?

  40. A quick logging run of step5.py, with some slightly old data which I happened to have around, tells me this:

    In 75 grid cells, there is no ocean data and no land data, so no data is used.

    In 2529 grid cells, there is no ocean data, so land data is used.

    In 7 grid cells, there is ocean data for exactly 50 months, so land data is used. I don’t know why this is.

    In 469 grid cells, there is complete ocean data (1564 months) but there is a land station within 100km, so the land data is used.

    In 3 grid cells, there is very nearly complete ocean data (1560, 1562, 1563), and no nearby land station, so ocean data is used.

    In 4917 grid cells, there is complete ocean data and no nearby land station, so ocean data is used.

    A breakdown of the cells in which there is land data but no ocean data:

    864 cells have a land station within 100km.
    677 cells have a land station within 100-200km.
    541 cells have a land station within 200-400km.
    292 cells have a land station within 400-700km.
    108 cells have a land station within 700-1000km.
    57 cells have a land station within 1000-1200km.

  41. HaroldW,

    You said “The paper concludes that there is a positive bias trend in the surface land temperature measurements. They cite many possible causes, but barely touch upon urbanization as a source of bias.”

    The actual conclusion from the paper is:

    “Specifically, the characteristics of the divergence across the
    data sets are strongly suggestive that it is an artifact resulting
    from the data quality of the surface, satellite and/or radiosonde
    observations.”

    and from the abstract

    “These findings strongly suggest that there remain important inconsistencies between surface and satellite records.”

    They do say:

    “The differences between surface and satellite data sets tend to be largest over land areas, indicating that there may still be some contamination because of various aspects of land surface change, atmospheric aerosols and the tendency of shallow boundary layers
    to warm at a greater rate”

    Note the use of the word “may”.

    The conclusions have been portrayed somewhat differently via less formal communication channels with “may still be” becoming “is”.

    As far as I can see, the paper doesn’t “prove” that there is a bias in the surface temperature data. The conclusions and abstract reflect this. The authors do make it clear that they think that the problem lies in the surface data, which is perfectly consistent with their results. However, their results are also consistent with a more “consensus”-style explanation.

    The paper doesn’t demonstrate decisively that either of the viewpoints is “right” (or wrong), which is a shame. I suppose it does put one point of view across quite comprehensively, but it hasn’t moved the science on.

  42. Quote, “GISTemp land series is not a true land record”

    Why sow confusion and create complexity, intentionally or not, when simplicity is at hand, namely the satellite data?

    The GISTemp land(?) series cannot be defended post Climategate.

  43. Mac,
    I also have no idea what “Climategate” has to do with anything here.
    If it makes them happy, there’s no reason why GISS shouldn’t keep putting up their met station index, dTs or whatever they call it. It’s simply a matter of expanding the note there, so people know what it is, and what it isn’t. There probably aren’t very many instances in which that’s the analysis you’d want to use, so that should be more clear.

  44. Bob: Since both SST datasets (HADISST And Reynolds OI.v2) are complete datasets, then the “if there are fewer than 240 valid monthly data points in the ocean series” portion is no longer required.
    There’s not much SST data for Kansas, Kenya, or Kazakhstan.

    I hope the numbers above answer your question. In my test run, 4920 cells (61.5% of the total) use ocean data. 3005 cells (37.6%) use land data, and 75 cells (0.94%) have no data.

    As I described earlier, the GISTEMP code is written in a parameterized fashion allowing interpolation between the ocean series and the land series for cells in which both series are present. However, the parameters are set such that interpolation does not take place.

  45. Lansner

    what ever it is that makes you so sure in your case

    Basic difference: you’re taking end results, and trying to work backwards to try to guess at how they came about. Sometimes you’ll correctly diagnose things, but sometimes you won’t. And sometimes you’ll misuse something, out of confusion.

    Zeke and others are starting with the source data, and working forwards. Zeke using his own methods; Barnes using GISS’s methods. So they can test ideas much more directly.

  46. FWIW, the few unusual cells to which I refer – with some ocean data but without complete ocean data series – are as follows:

    count box cell south north west east
    50 2 33 68.435 70.052 27.0 36.0
    50 4 86 59.317 61.642 -153.0 -148.5
    50 4 87 59.317 61.642 -148.5 -144.0
    50 4 88 59.317 61.642 -144.0 -139.5
    50 4 89 59.317 61.642 -139.5 -135.0
    50 6 75 57.140 59.317 -67.5 -63.0
    50 6 85 59.317 61.642 -67.5 -63.0
    1560 2 80 78.521 81.890 0.0 9.0
    1562 11 43 51.261 53.130 148.5 153.0
    1563 2 70 75.930 78.522 0.0 9.0

    On this run a complete data series has 1564 items. The ‘count’ column is the number of monthly data items in the ocean data series for each cell. The ‘box’ and ‘cell’ columns identify the cell by number in the GISS grid. The S/N/W/E columns here are the positions of the cell edges, in degrees lat and long.

    On a quick look these all appear to be cells around the periphery of the arctic ice. In any case there are not enough cells here to make any appreciable difference in the results.

  47. For the record:

    Not for ONE NANOsecond did I think that Hansen would not have methods and data series that “explain” matters (!!!)
    Of course he has!!

    Obviously (!!) Hansen did not just write some numbers up without some backup data and methods??!!?! Welcome to the climate debate..!

    But it appears that you have come one step further in these matters – which is good, and then.. that’s it?

    Then who cares that the mixed land+ocean temperatures differ strongly from SST, even though ocean covers 70% of the globe and a mixed land+SST should therefore always be rather near the SST?
    (And that the big discrepancy just happens to occur with huge land warming in recent years.)

    Who cares that this magic Hansen dataset (supported by another Hansen set) is weighted ZERO for twenty years, 1900-20, and then weighted strongly in 2007?

    Who cares that these “land-ocean” curves often have trends for years in the opposite direction of the SSTs, even though ocean data was supposed to be a big part of the land data?

    Who cares; Hansen explains one set with another set and a method, big surprise.

    Are you sure you have come to the bottom of this issue?

  48. Lansner:

    Calm down. So you went down the wrong path on something, though you left open the chance that you missed something. People pointed out what you missed. It happens. Why go lashing out so angrily? [what shouldn’t happen is your friend putting up a nice picture of a Hansenizer, but that’s a different matter].

    Then who cares that the mixed land+ocean temperatures differ strongly from SST, even though ocean covers 70% of the globe and a mixed land+SST should therefore always be rather near the SST?
    (And that the big discrepancy just happens to occur with huge land warming in recent years.)

    That’s simply what’s in the underlying data. That’s what you get, regardless of whether you use GISS’s methods or your own.

    Oceans have a good deal more thermal mass. You’d expect them to lag in response to a radiation imbalance.

    Now maybe you think there is some problem with the underlying data that is exaggerating such effects beyond physical reality. But that’s not a question of processing methods, as you’re banging on about, but rather the quality of the underlying data.

    Who cares that this magic Hansen dataset (supported by another Hansen set) is weighted ZERO for twenty years, 1900-20, and then weighted strongly in 2007?

    I’m having to guess at what you actually mean here. Which is a poor state of affairs. I am guessing that you observe that in the recent decades, the land shows a higher trend than the ocean, whereas in 1900-1920 they showed similar trends to each other. Is this correct?

  49. Frank,

    It might be useful to redo your analysis with a true land record and see how the results change. As Ruedy’s email above makes pretty clear, the GISTemp land series is not appropriate to use in this case. You can either grab one of the publicly available scripts to do this (via Mosher, Jeff Id, Chad, Nick Stokes, or myself), or just use the NCDC land record (which, conveniently enough, is almost exactly the same as what you get using the GISS Step 0 stations).

    You will see that a simple linear combination of 0.29 * land + 0.71 * oceans does a pretty good job of replicating GISTemp over the last century (it only diverges a tad in recent years, due to GISTemp’s method of interpolation).


    http://i81.photobucket.com/albums/j237/hausfath/Picture488.png

    I’m also not sure where your argument that land has a 0 weight in the 1900-1920 period is coming from. 0.29 * land + 0.71 * oceans seems to do a reasonably good job of replicating temperatures over that period.


    http://i81.photobucket.com/albums/j237/hausfath/Picture489.png

    [Note that GISTemp itself does not actually use fixed ratios of land/ocean to generate global anomalies. Nick Barnes has a comment early on in this thread that goes into more detail about the actual GISTemp method.]

  50. Carrot Eater:

    Oceans have a good deal more thermal mass. You’d expect them to lag in response to a radiation imbalance.

    And there’s a matter of loss of ice coverage and associated land coverage changes for land (esp northern climes) and the positive feedback associated with it.

    It gives you a picture like this, and it’s in the underlying data set. It’s the strongest signal present (beyond the global mean temperature trend itself) and appears to overwhelm any other signal, like UHI, shift in station and so forth. In addition to the physical properties you’ve mentioned, ocean is liquid and can adjust to reduce south-to-north temperature trends via circulation while of course land cannot.

  51. I have a question for Nick Barnes.

    If you look in the lower left of the image below you can see where I identified with a black dot the location of Byrd, which shows a 0.000C temperature anomaly for all cells north of the station. My contention is that the adjacent gray area should be filled in the same. Maybe Nick Barnes can explain. Is this an instance of no anomalies calculated for those cells or is it a plotting error? I noticed this about a year ago and I have always assumed it is a plotting error. Figured now was a good time to ask.

    The rest of this is for general consideration.

    The reason for the 0.000C anomaly for that year is due to the fact that Byrd doesn’t have a record prior to March 1980 and no other stations are within 1200 km of those cells. It seems just a single station month or year within the anomaly period is sufficient for use in calculating 1951-1980 average anomalies. If you look at a couple maps you will see that Jan, Feb, and the Winter season never show anomaly data for Byrd.

    If a coastal station was placed 600 km north of Byrd and accumulated 20 years of readings, it would be compared to the same one month cell values as Byrd. Does this seem reasonable?

    I have seen numerous stations with five years or less of monthly values at either end of the 1951-1980 period. Is it reasonable to use those stations when calculating the 1951-1980 period averages? Maybe someone has an easy way to count those stations.

    Why is it that land station data is smoothed out to 1200 km, yet the same isn’t done with SST data? It seems an inconsistent way to do things. If ice cover doesn’t hinder the smoothing of land anomaly data, why shouldn’t it be reasonable to do the same with SST? It seems incongruous to me that a land station anomaly at 75N can be extended 1670 km all the way to the pole, but SST at the same latitude isn’t.

  52. To clarify the preceding graphs a tad, GISS Step 0 Land Reynolds Ocean refers to my model run. MoshTemp Land Reynolds Ocean refers to Steven Mosher’s model. GISTemp Hadley area and GISTemp are both provided via NASA.

  53. Bob Koss, in an indirect fashion, the image you show demonstrates one of the biggest weaknesses of most/all current reconstructions: Namely the inability to set uncertainty bounds on their central values.

    The largest warming trends are in areas with the least number of stations; clearly this must have some impact on the uncertainty of temperature estimations. (Though I’d suggest plotting figures like this on an equal-area projection, since it gives a more realistic idea of what region is being covered.)

  54. Mac.

    Climategate was not about GISTEMP. It was about a small team of scientists who felt embattled by skeptics and took it out on McIntyre. As a result (as Muir Russell found) they were overly defensive about sharing data. As a result (as Muir Russell found) they brought more trouble upon themselves (a pile of FOIA requests). They violated FOIA law (as the ICO found) to protect themselves, not to hide any flaws in the science. They created presentations of data that, while largely accurate, had a certain misleading quality to them.

    GISTEMP only enters the Climategate record in a few places, namely where Phil Jones criticizes it.

  55. Climategate was the point where the tricks, the past confusions and uncertainties over climate change were brought to the fore. The result: the certainty surrounding the entire AGW narrative collapsed.

    You simply cannot sustain complexity, the confusion and the different interpretations surrounding issues, data sets, etc, such as this GISS discussion when there is clarity at hand, the satellite data.

  56. Bill Illis (Comment#49375) July 19th, 2010 at 6:41 pm
    Just wondering if anyone thinks GISS should reduce the 1200 km smoothing radius. This episode demonstrates another problem beyond the questionable Southern and Arctic Ocean impacts. Even the original Hansen-Lebedeff paper showed the correlation drops to 0.50 or lower at 1200 km, with varying impacts depending on latitude.

    *********************
    There is a study we discussed on the Airvent that has more coverage than Hansen’s 87 study and that suggests 750 km; there are latitude and seasonal variations. For the NH at high latitudes, 1200 km isn’t so bad:
    if you are at (90, 0) it’s cold whichever way you choose to walk.
    The SH is more problematic.

    There is some moving buoy data one could use, I suppose, …
    In the end, to extrapolate or not is just a choice. So you have one record (CRU) that chooses not to, and one record that chooses to extrapolate.

    The extrapolation matters ONLY IF you are extrapolating over
    a zone where evidence would suggest a long-term difference in trend.

    Bottom line: the trend since 1880 doesn’t vanish.

  57. Bob, can you separate that into a number of easier questions? I’m pretty busy this week and I’m not really familiar with the visualisation which you present.

    For the NH at high latitudes, 1200 km isn’t so bad.

    And that’s the place where doing it or not doing it makes a difference.

  59. PolyisTCOandbanned (Comment#49291) July 19th, 2010 at 1:17 pm
    McI was whinging about the upwards UHI adjustments on isolated stations…

    Jeez.

    As I see it, one problem with Hansen’s adjustments is that some urban stations are WARMED by the method. That’s counterintuitive and anti-theoretical. I would think those cases merit some further investigation. Essentially the urban is just smoothed to match the rural. That doesn’t get you increased N. It may look like you have more information but you don’t.

    The other issue is land use change with rural stations. Basically any changes in land use (deforestation, dam building, different vegetation) that change the water vapor transport.

    I’ve just stumbled on some papers that have GIS stuff on water vapor transport and land use (from land cover classes) thinking..

    http://www.gwsp.org/fileadmin/downloads/7612.pdf

  60. Mosh

    As I see it, one problem with Hansen’s adjustments is that some urban stations are WARMED by the method. That’s counterintuitive and anti-theoretical.

    No, no, no and no. It’s only those things if you think UHI is some dominant, predictable and monolithic thing.

    It should not be that surprising if you find some urban stations which have lower trends than the rural neighbors. As is the case.

    Now in some cases, this can come about because of some other discontinuity. Say the urban station had an instrument change that led to recent temperatures being offset to the cold side. This sort of thing can certainly happen; GISS's method doesn't go looking for it.

    But otherwise, there’s no a priori reason to assume an urban station can’t have less warming trend than the rural neighbors. Micro-site is as important or more important than meso-scale. And then if those influences aren’t changing over the observed time, you won’t get necessarily get a trend from them, anyway.

  61. Ya, carrot.

    This would be one of those cases where some synthetic data could make the point.
    Create faux stations north of 70 degrees:
    trend = -1C
    trend = +1C

    trend = global land average.
    trend = trend at 60-70 degrees north.

    Something like that might help make the point.
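A minimal version of that synthetic experiment, with made-up numbers: prescribe a trend for the region north of 70N, then compare a global mean that fills the region against one that omits it (omission implicitly assigns the region the mean of everywhere else).

    import numpy as np

    # Area fraction of the sphere north of 70N (~3%) vs everything else.
    frac_arctic = (1 - np.sin(np.radians(70))) / 2
    frac_rest = 1 - frac_arctic

    rest_trend = 0.7  # assumed trend everywhere south of 70N, degC/century

    # Faux-station trends north of 70N: -1, +1, the global land average,
    # and a fast-warming value standing in for the 60-70N trend.
    for arctic_trend in (-1.0, +1.0, 0.7, 2.0):
        filled = frac_rest * rest_trend + frac_arctic * arctic_trend
        omitted = rest_trend
        print(f"arctic trend {arctic_trend:+.1f}: filled {filled:.3f}, "
              f"omitted {omitted:.3f}, difference {filled - omitted:+.3f}")

The difference is zero exactly when the prescribed Arctic trend equals the mean elsewhere, which is the point of comment 56: extrapolation matters only over zones with a genuinely different trend.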

  62. Carrick (Comment#49437) July 20th, 2010 at 9:20 am
    Bob Koss, in an indirect fashion, the image you show demonstrates one of the biggest weaknesses of most/all current reconstructions: namely, the inability to set uncertainty bounds on their central values.
    The largest warming trends are in areas with the fewest stations; clearly this must have some impact on the uncertainty of temperature estimations.

    ***************************************
    That's a concern. You have two cells of equal area. One has 25 stations, the other has 1. You average the 25 to get one value for that cell, and this is given the same weight as the cell that has one station. Not sure how to represent that in the error calculation,
    but the quality of the two estimates is different.

  63. Mosh:

    Not sure how to represent that in the error calculation.

    I’d use a Monte Carlo approach.

  64. That's a concern. You have two cells of equal area. One has 25 stations, the other has 1. You average the 25 to get one value for that cell, and this is given the same weight as the cell that has one station. Not sure how to represent that in the error calculation, but the quality of the two estimates is different.

    DUH! That was the point of a little discussion I was having with carrot eater a while back when Carrick butted in.

  65. That's a concern. You have two cells of equal area. One has 25 stations, the other has 1.

    That sort of thing is taken care of, if you use a high resolution model to gauge error due to undersampling. Whether you like that approach or not.

    But that’s only the sampling error. Things get amplified if some of the more isolated thermometers have data quality issues.

  66. I agree with Carrot Eater wrt UHI… if you move a station from one of Anthony's famed asphalt rooftops into the middle of a city park… guess what? You end up with a negative UHI.

  67. Indeed. I've been looking at individual urban/rural station pairs, and it's quite noisy. Some signal emerges in aggregate, but there are plenty of urban stations that trend lower than their nearby rural counterparts.

  68. Carrot Eater:

    That sort of thing is taken care of, if you use a high resolution model to gauge error due to undersampling. Whether you like that approach or not.

    Clearly, when you have one station in a cell, there is less smoothing from area averaging than when you have, say, 25 stations in a cell. The cell with one station will add more weather and short-period climate noise than the cells with e.g. 25 stations. That's true without regard to whether station issues are present.

    Thus, cells with few stations will make a greater contribution to the fluctuating part of the global temperature reconstruction. That's almost unavoidable. As the number and spatial distribution of stations change (or equally, as you go between the different data sets used in the various reconstructions), the weighting of the various regional weather/climate noise changes with it.

    Again, what I was discussing was bounding the uncertainties associated with this sort of episodic noise in the reconstructions. At the moment it's mostly not done, and certainly not in any of the blogger reconstructions to date.

  69. carrick

    The cell with one station will add more weather and short-period climate noise than the cells with e.g. 25 stations.

    That isn’t necessarily true. In climate talk, we call weather to be noise, but it isn’t noise. It’s something real. And it correlates out some ways. So your 25 stations in the same area could report much the same ‘noise’ – they’ll show the same variations with ENSO, for instance. It’s the measurement errors, the true noise, that would hopefully cancel out.

  70. Man, this is awesome. We are all talking about how to stop global warming, each of us probably burning 20 pounds of coal per day using our computers and other electronics…not to mention petroleum and petroleum derivatives. How funny would it be if there was a giant earthquake that broke up one of the ice caps?

    It is really funny to read everybody's arguments about the satellite and thermometer data; we're not even talking about 1 full degree. Oh, by the way, what happened to alternative energy? So much for wind and solar. Oh, and I would like to offer proof that people like the carrot eater only come on here to have "feel good" moments but do not truly care about global warming.
    1. We know the carrot eater is probably burning 20 lbs of coal per day (this is what the average person consumes daily).
    2. If the carrot eater cared about global warming, he would not be using a computer, or he would use one that runs on a "clean energy" source. Extremely doubtful.
    3. If the carrot eater truly cared, he would move to Spain. Man, Spain is just a green paradise. Sure, the unemployment is high and you can't use an air conditioner, but hey, they're saving the planet. So because the carrot eater will not move to Spain, he is a fraud.

    Additionally, I find it unlawful and outright rotten that some people have the audacity to try and force a change they want on everybody else, e.g. healthcare.

    If you want a government-run program, move to Europe. Move to Canada. These types of people think they know better than everyone else, and instead of just moving to an area that suits their political beliefs, they want to wreck it for everyone else.

    Here is an excellent contrast between Bush and Obama. The Iraq war was not popular. However, Bush did not force anybody to sign up and fight. Also, he did not impose a war tax. Therefore, people who complained about the war (not counting family of soldiers) had no right and no stake at all. If I had family in the twin towers, I would be fighting in Iraq or Afghanistan right now.

    Contrast Obama: he completely dwarfs Bush's spending, under the guise that Bush made things so bad he had to spend a lot more money than Bush. Next, he mandates changes to healthcare and demands certain preventative services be covered free. He creates a tanning tax (I do not tan, and think people that do are extremely stupid, to the point that they may need counselling). The danger, however, is that these mandates open the door for new mandates. If the government can force you to buy healthcare (you will be fined if you do not pick up health insurance, and I suspect if you do not pay the fine you go to jail), what is to stop them from forcing other things, like handing over retirement money?

    I apologize for shattering many world views and making people think. Please forgive my thoughtful insight and dismiss me as a right-wing extremist, homophobe, racist, corporate pig, tea partier or whatever you can think of. Keep in mind, though, that I will vote for a Democrat in a second if they go back to their roots (Harry Truman was a great president) and quit crying about social issues. At some point class struggle has to end, and I believe it has. Why do we need labor unions now? I have not been hearing about unsafe labor conditions. Why is it that union employees can fail drug tests up to 3 times? Hmmm….

  71. Nick Barnes,

    That image is a GISS-provided anomaly map of 1980 values, which they have interpolated from the 8000-subbox method onto a 2×2 grid for plotting purposes. The plotted map indicates no data between -77S -143W and -77S -127W. Also no data between -75S -143W and -75S -137W.

    Station Byrd (the black dot on the map) is located at -80.0S -119.4W. It is my contention that the area in question should have anomaly data similar to Byrd and not be empty. The distance from Byrd to the farthest point, at -75S -143W, is only 785 km. I'm wondering if this is simply a flaw in their data plotting, or if the GISS algorithm actually generates no data for that area. I suspect it is a plotting flaw, but I'm not certain. I'm hoping you will take a look and confirm whether data has been generated for that area. I don't have a way to look at the individual subbox data generated by your CCC to check.

    The other questions were of a more general nature and not meant specifically for you. No comment is necessary.

    Thanks in advance.

  72. Mosh:

    I already explained why one sees these "counterintuitive corrections". It has to do with the correction being (at least sometimes) smaller than random sensor-to-sensor variation.

    If you think about it, you really won't be surprised. After all, there are always some stations that show a different trend (even opposite in direction) than the average. Right? You know and agree that happens, right? That you can't base things off a single station…that you can always find some that go opposite to the general trend? Well, given that…imagine paired stations and doing corrections using paired stations. SOME of those pairs will have trends that are randomly different from the overall trend! This will cause the counterintuitive corrections. Capisce? And you can't just "cull" them, or indict the method, because of the behavior at individual stations. Just like you can't fault the average because every station does not have an identical trend! Furthermore…you're only seeing and thinking and worried about the counterintuitive (warmed-urban) stations…but you need to realize that there are ALSO overcooled urban stations (because this mathematical effect occurs in both directions).

    Maybe your inability to grok this gives me hope for Steve's honesty. Maybe he's just not as smart as I thought. Or at least self-blinded. Kind of amazes me though. So smart on philosophy and code and all…but such poor "physical intuition" (as a physicist would use the term). Maybe it's just a block that you all have to questioning your own thinking. Jeff Id took forever to get the point on the negative thermometers. Lucia took forever with ENSO and the nature of noise in her 10-year temp decline postings.
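The counterintuitive-correction argument above is easy to check with a toy simulation: give urban stations a real mean UHI bias plus random station-to-station scatter, apply the paired correction U_new = U_old - (U_old - R) = R, and count how many urban stations come out warmer. All the numbers here are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 10_000
    climate_trend = 0.7   # assumed shared real trend, degC/century
    uhi_bias = 0.1        # assumed mean spurious extra urban trend
    scatter = 0.3         # assumed station-to-station trend scatter

    rural = climate_trend + scatter * rng.standard_normal(n)
    urban = climate_trend + uhi_bias + scatter * rng.standard_normal(n)

    # Paired correction: each urban station inherits its rural neighbor's trend.
    corrected = rural.copy()
    warmed = np.mean(corrected > urban)  # the "counterintuitive" cases

    print(f"urban mean before: {urban.mean():.3f}, after: {corrected.mean():.3f}")
    print(f"fraction of urban stations WARMED by the correction: {warmed:.2f}")

With these numbers, roughly 40% of individual urban stations get "warmed" even though the correction removes the bias on average, which is exactly the point being made.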

  73. Carrot Eater:

    That isn’t necessarily true. In climate talk, we call weather to be noise, but it isn’t noise. It’s something real. And it correlates out some ways. So your 25 stations in the same area could report much the same ‘noise’ – they’ll show the same variations with ENSO, for instance. It’s the measurement errors, the true noise, that would hopefully cancel out.

    I agree, it’s not necessarily true that regional scale climate fluctuations get suppressed with the 25 stations. And I agree that the anomalized measurement errors are more likely to be cancelled with the 25 than with the one.

    It’d be interesting to do some studies on how much of the climate-related high-frequency content in the fluctuation spectrum gets reduced by averaging…

  74. Well, Carrick suggested a procedure and I can give it a whack.

    Just for clarity, assume I have four cells:

    1: 50 stations
    2: 10 stations
    3: 5 stations
    4: 1 station.

    How would you proceed?
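One way to give that a whack is the parametric Monte Carlo Carrick suggests: assume a per-station noise level, simulate many synthetic realizations of the four-cell network, and look at the spread of the resulting area-weighted mean. The cell anomalies and noise level below are illustrative assumptions, not real data.

    import numpy as np

    rng = np.random.default_rng(42)
    counts = [50, 10, 5, 1]                  # stations per cell, as above
    truth = np.array([0.5, 0.3, 0.8, 0.2])   # assumed true cell anomalies, degC
    noise = 0.5                              # assumed per-station noise sd, degC

    trials = 5000
    sims = np.empty(trials)
    for i in range(trials):
        # Average the stations within each cell, then equal-area average the cells.
        cell_means = np.array([(t + noise * rng.standard_normal(n)).mean()
                               for t, n in zip(truth, counts)])
        sims[i] = cell_means.mean()

    print(f"true area mean: {truth.mean():+.3f} degC")
    print(f"Monte Carlo sigma of the estimate: {sims.std():.3f} degC")

    # The one-station cell dominates: its cell mean carries sigma = noise,
    # versus noise/sqrt(50) for the 50-station cell.
    for n in counts:
        print(f"{n:3d} stations -> cell-mean sigma ~ {noise / np.sqrt(n):.3f} degC")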

  75. There’s something to keep in mind with the high frequency stuff. Some of it is like a seesaw, with thermal energy being redistributed, but no net global change at the surface – some area is warmer than usual, while some other area is colder than usual. But of course we can all look at the global surface means, and see plenty of wiggles there, too, which survive all the spatial surface averaging. To that extent, what I’d really like to see is a more accurate measure of the total heat content (read: ocean heat content). Not saying that what we have now is at all bad, but I’m sure the level of observation will improve over the years.

  76. PCTO:
    "If you think about it, you really won't be surprised. After all, there are always some stations that show a different trend (even opposite in direction) than the average. Right?"

    Yes, of course I get that. The issue is that the bias should be large enough to come through. Assuming, of course, that the bias is large.

    It's entirely possible to have an urban station that undergoes a local trend in climate that is negative (or less than the up trend) while a nearby rural station sees an upward climate trend that is greater than the trend seen at the urban station. And it's mathematically possible that the UHI trend is cancelled and even overwhelmed. It would be surprising if this confluence happened a bunch of times. Who knows, maybe we build our cities in places (by coasts and lakes and rivers) where the climate trend tends to be less severe or lagging (because of the ocean).. I would hope that the spatial pairing would diminish the probability that you get an urban site that sees less climate trend than a rural site.

    Still, I find the result surprising, and Peterson found the result surprising, and I think it BEARS investigation. I find it curious. I'm a curious fellow. Who knows, maybe we build our cities in places that see higher winds? More cloudiness? (on average) Who knows, maybe cities change precipitation patterns downwind?

    Fundamentally, we look at heat maps of cities and we get alarmed because we see large urban heat islands. We note this even as cities grow from no people to a few thousand. We don't have cases where we document widespread COOLING in an urban environment. Pockets of cold? Yup. Transient pockets of cold.
    BUT IT'S NOT COLD that matters, it's trend. Peterson's whole appeal to cool parks is kinda strange. That's why I think there is another reason. I think Parker was more on the right path, although he too waved his arms and said "cool park".

    But in the end the argument is over mousenuts, which makes for interesting conversations between those who give a rat's ass about mousenuts and those who think they are the family jewels.

  77. PTCO.

    To be clear I’m not indicting the method. But I dont think the method gives you anything worthwhile. I would not correct for UHI.

    Looking at one station ( my favorite) orland california. You have rural station that gets classified as a urban station because of nightlights. I can probably tell you which friken light it is and if I took my shot gun to it, the site would be rural. Single story buildings. farmland.

    http://moyhu.blogspot.com/2010/02/giss-uhi-adjustments.html#more

    Take a look at the adjustment in 1880. when Orland had around 200 people.

    The algorithm is a meaningless meatgrinder. The arguement that it all evens out has no force with me, because if it all evens out then it is quite meaningless to apply it, except for the marketing appeal. In short, Hansens algorithm does not adjust for UHI.

    It classifies stations by using nightlights and smooths some stations with data from other stations. It has nothing to do ( being extreme to make a point) with UHI.

    Wanna fight? ( just kidding)

  78. Wow, can the logic get any more circular?

    In short, Hansen's algorithm does not adjust for UHI.

    This is only true if you can show that GISS’s dark stations actually have significant UHI themselves.

    Otherwise, it’s a meaningless statement.

    And it’s not a meat grinder; it’s quite simple in concept. That it doesn’t have the effect you expect is more a statement on your expectations vs what the underlying data actually contain.

  79. But McI cited individual examples as some sort of "tada". And it's OBVIOUS that there will be SOME individual examples of this counterintuitive correction, given that we know we can always find some stations which decline in temp even though the average goes up. Furthermore, the phenomenon that causes the counterintuitive examples is ALSO causing some overcorrected examples (which are not honed in on). This is why citing the individual stations as some sort of "tada" is silly. It's analogous to citing individual stations that go in the opposite direction from the average! (Thinking about paired-station corrections makes this more apparent.) The failure to explain this was either an example of lack of physical insight or sophistry.

    -moving on-

    Regarding the lack of much difference for the ENSEMBLE, this is a different issue (we could have a definite impact on the ensemble and STILL find individual stations with counterintuitive corrections):

    Essentially, what the Hansen correction does is almost the SAME as taking a rural-only measurement set. You can see this if you imagine a paired-set correction: U1(new) = U1(old) - UHIcorrection; UHIcorrection = U1(old) - R1. Therefore, U1(new) = R1!

    So, if doing the UHI correction doesn't change things much, that is similar to saying "do a rural set only" and finding the temp increase is the same. There could be a lot of reasons for that.

    1. Could be that the method of looking for UHI is inadequate (population or nightlights or what have you).

    2. Or could be that UHI is not really that much of a contaminant of the data (for various reasons…perhaps the effect is not as dramatic versus time as we think, perhaps the sensors are in good spots within urban areas (cool parks, cool airports, etc.)).

    3. Perhaps both rural and urban areas are all warming with UHI. This is sorta similar to 1, I think, because what it means is you don't have a valid (uncontaminated) reference set.

    ——————————

    Nevertheless: I think we can at least say that a reasonable, plausible attempt (nightlights) was used to try to correct for UHI and the effect (on trend) was tiny. At least to some Bayesian extent, this makes me think it more likely that (2) is correct.

    Consider a business case where you were looking at some market estimation and worried about a confounding factor. So you try to remove the confounding factor: you use your best available test for the confounding factor and see no impact. Sure, you may not have a perfect ability to look for that confounding factor. But it makes your Bayesian belief higher. And the Wattsian willingness to believe in UHI instead is something I find vexing. It shows a bias.*

    If you know a better way to look for UHI, let me know. If we keep hammering at it, with a couple more different approaches and still don’t see much UHI, then this makes it more and more likely that 2 is correct. 🙂

    *Not exactly UHI, but indicative of the problem: remember how Watts cheered Steve's initial analysis of his preliminary surface station data…but then when JohnV did a better calculation (and I think you will agree with me that V's geo-weighted averaging comparison was superior to McI's numerical averaging), Watts got unhappy…wanted all kinds of holes poked in V's work…and took his data away and hid it!

  80. Steve:

    1. Hansen’s nightlight method certainly IS at least A method of removing UHI as a confounding factor! If nightlights are a GOOD proxy for urbanization, then running his method will remove the UHI signal. If you run his test, see minimal UHI, then you can say “provided nightlights are a good proxy for urban heat islands, UHI is minimal”. Of course, if that caveat is wrong, you can’t say how much UHI there is. But this is slightly different than saying he did “nothing”.

    2. Think about it from first principles. Let’s say you didn’t even KNOW that the Hansen test was going to show low UHI. Imagine going back in time. Let’s say, you were challenged to try to deconvolute the average and remove any UHI trend impact. Wouldn’t you at least think of nightlights as a PLAUSIBLE method of looking at urbanization? It’s actually kinda neat and techie. 😉

    3. If you think the nightlights are a bad measurement device, use population data or something. How else can we test this? If a population mask gives you the same effect (minimal UHI), what else can you suggest? How many ways of separating UHI out will it take to satisfy you? What would be a "gold standard test" for you?

    4. Orland, Orland, Orland. I read the blog too. I've been around these parts a long time; Wildcat I remember. So Orland as an issue is not news to me. Let's try thinking about it…

    Orland on its own can't be taken as a general repudiation of nightlights unless you expect a method that is PERFECT. Look…you're going with single-station criticism again! The bigger question, because we are interested in trends overall and the correction of them overall, is how nightlights do IN GENERAL at measuring urbanization. Some sort of test, where you look at a statistically significant number of urban and rural (by nightlights) sites (randomly selected) and see how nightlights perform in showing urbanization, would be the way to evaluate nightlighting.

    This is ANALOGOUS to taking survey data where you are worried about a response bias (let's say it's on dieting…a subject near and dear to my heart) and you go back and do some deeper examination of a small number of the respondents (not the whole set). A case study is a more exacting, but more expensive and time-consuming, method of getting information than just a survey response. So you check people by going and actually weighing them or what have you. Maybe you find that the answers are exactly the same. Maybe you find a little bit of difference, but random (without bias). Maybe you find that there's a definite bias. So now you apply that bias against your survey result. Or even future survey results (realizing that the case studies are expensive, else you would just do them to start with, always, and over the full data set).

    But puh…leeze. Don't make more "Orland" hay. The single-site kvetching is so silly.

    5. (And no fighting…go paint "The Rock"…you wouldn't want to mess with me…I'm 70 pounds lighter, 99/58, lifts almost doubled…no booze for a year…and a little guy, but with that high-school defensive-back attitude.)

    —————-

    Lucia, I respectfully request to be taken out of troll control. It's hard to have discussions with people when my comments don't come through for hours. I promise no drunk posts or Palin love (I don't like her anymore anyhow).

  81. Are you really addressing the issues about UHI effects that McIntyre has put forward? McIntyre's position seems to be that the Surfacestations project confirms that the GISS UHI corrections have some merit, at least in the US, and that it disconfirms the NOAA and CRU approaches of non-correction:

    "As I've mentioned before, in my opinion, the moral of the surfacestations.org project in the US is mainly that it gives a relatively objective means of deciding between these two discrepant series. As others have observed, the drift in the GISS results looks like it's going to be relatively small compared to results from CRN1-2 stations – a result that has caused some cackling in the blogosphere. IMO, such cackling is misplaced. The surfacestations results give an objective reason to view the NOAA result as biased. It also confirms that adjustments for UHI are required. Outside the US, the GISS meta-data on population and rural-ness is so screwed up and obsolete that their UHI "adjustment" is essentially random and its effectiveness in the ROW is very doubtful. Neither NOAA nor CRU even bother with such adjustments as they rely on various hokey "proofs" that UHI changes over the 20th century do not "matter"."
    http://climateaudit.org/2009/01/16/noaa-versus-nasa-us-data/
    http://climateaudit.org/2009/01/20/realclimate-and-disinformation-on-uhi/

    The tropospheric temperature trends measured by satellites are a tad lower than the surface trends, whereas in theory they should be about 1.2 times higher. To me that also suggests that UHI could be undercorrected in the surface trends.

  82. When I read these statements about UHI, I think about the strict discipline that Steve McIntyre keeps about making broad statements. It would be better if the people in this place did more studying and less proclaiming.

  83. Niels:

    I'm NOT trying to comment on everything McI has said about UHI, and in particular I was not commenting on some evaluation he had of surfacestations. I'm commenting on what I commented on: a clear logic gap (or sophistry) regarding the counterintuitive corrections. Kvetching about those counterintuitive corrections is similar to kvetching about individual stations that show an opposite trend to the average trend. I've explained ad nauseam why this can occur. Others have too. FOR YEARS!

    McI is supposed to be some super brainiac that uses Latin and weird Buckeyesque words, that codes (at his age!) and that knows vector algebra and all. But he is either too dumb to understand a basic concept, self-blinded and won't think, or a sophist and won't admit something.

  84. TAG: McI plays a lot of games, though. He will fall back on the caveats when pushed ("I never said that"). But he'll also try to have a PR impact. He will use "tada" argumentation (not asserting the point, just labeling it). I've actually caught him a few times at it and called him on it. [Basically, I explained that he had never made some negative assertions, since he didn't spell them out! ;)]

  85. surfacestations isn’t about UHI as a mesoscale phenomenon in the first place. It’s a microscale thing. Seems to me that the micro is more important than the meso, though.

  86. Niels,

    Excellent! Your links took me back to a thread where I had a rare comment summarizing what I thought at the time: the GISTemp UHI correction is adequate; the ROW correction likely isn't.

    Now I’m back to knowing what I knew a year and a half ago. With any luck, in another year and a half I’ll be back to that point again.

  87. Well, there is also updated metadata now (thanks in large part to Ron's work). Plus, GISTemp no longer uses the metadata urbanity designations. They switched to satellite nightlights back in January for the whole world.

  88. TAG (Comment#49506) July 20th, 2010 at 2:33 pm

    When I read these statements about UHI, I think about the strict discipline that Steve McIntyre keeps about making broad statements. It would be better if the people in this place did more studying and less proclaiming.

    But he does make broad statements. He generalizes quite quickly from his nitpicking to claiming the IPCC is corrupt.

  89. PTCO.

    Brief comment; more later, but I have to run (helping a pretty girl with stats.. wawa).

    "4. Orland, Orland, Orland. I read the blog too. I've been around these parts a long time; Wildcat I remember. So Orland as an issue is not news to me. Let's try thinking about it…
    Orland on its own can't be taken as a general repudiation of nightlights unless you expect a method that is PERFECT."

    Never said that. Look, as somebody who has developed methods for adjusting data and truncating data and performing all sorts of lewd acts on data, I know the value of a spot check. That check of Orland says to me: "think of something better than this approach." THAT is the total sum of my insight: think of something better than this approach, which failed with the first station I looked at.
    #2: From early on I did not criticize nightlights. I criticized using it without CHECKING. I criticized it for using old data. I criticized it for labelling a town of 50K as rural. I suggested using the 1 km population density data from Columbia (which Ron now has). I suggested using impervious-surfaces data.
    #3: Since nightlights is a proxy for population, and since factors like building height play a stronger role in UHI (according to some), I think it makes sense to look at screens in addition to nightlights. Hey, I even suggested (following Hansen) looking at Gallo's work on NDVI. Read Hansen's paper carefully.

    Bottom line: first and foremost, I would NOT adjust. I would eliminate suspect stations and live with the noise. IF you decide to adjust, then nightlights and a two-legged slope matcher is a quick and dirty spackle job.

    “look…you’re going with single station criticism again! The bigger question, because we are interested in trends overall and the correction of them overall is how do nightlights do IN GENERAL at measuring urbanization. Some sort of test, where you look at a statistically significant number of urban and rural (by nightlights) sites (randomly selected) and see how nightlights performs in showing urbanization would be the method to evaluated nightligghting.”

    TCO: My first criticism of nightlights is that it is a PROXY for urban density. Go read me over at CA two years ago. As a proxy it hadn't been tested very well.. I think Imhoff's tests were constrained to the US. I noted that if POPULATION DENSITY was what Hansen wanted, Hansen could have used the 1 km population density information from Columbia (GRUMP). (You go search CA .. oh wait:

    http://climateaudit.org/2008/02/23/googling-the-lights-fantastic/#comment-138640

    http://climateaudit.org/2008/02/23/googling-the-lights-fantastic/#comment-138647

    http://climateaudit.org/2008/02/23/googling-the-lights-fantastic/#comment-138649

    http://climateaudit.org/2008/02/23/googling-the-lights-fantastic/#comment-138657

    To quote myself:
    "So what IMHOFF97 proved was this: nightlights is as good as census data in determining URBANITY."

    and we already have census data.

    Gotta run. Pretty girls are more fun.
    But good to see ya outta the spam filter.

  90. CE.

    Sorry you didn't get it.

    Hansen's algorithm doesn't adjust for UHI. Think slowly about that.

    Anyway, these arguments are a lot more interesting than arguments about emails.

  91. Moshpit:

    As a single example of nightlights being flawed, ALL that Orland shows is that nightlights are NON-PERFECT. If you want to show that nightlights are a lousy proxy in general, then you need some sampling, some set of Orlands. But the posts on Orland (and I read them) were single-station gloating. Like Watts jumping up and down on charcoal-grill and air-conditioner pictures.

    If you have a general concern with nightlights, fine. But I would be a lot more impressed if that had come from you before finding the low measured UHI. I mean SERIOUSLY, DUDE…if you went back in time, didn't know what answer you would get, and didn't have the money to somehow site-check and validate nightlights, would you think that they were a meaningless proxy? If your choice was just never checking for UHI or checking it with nightlights, would you think that nightlights would do no good? Or twist it a different way…if we take 100 urban stations at random and 100 rural stations at random (as determined by nightlights), do you think the two samples will have the same population density, or some significant differences? And it's the GROUP impact that matters.

    P.s. I'm still troll-controlled. Slows the discussion down (bunch of comments you can't see yet).

    P.s.s. Have fun on the date.

  92. Hansen's algorithm doesn't adjust for UHI. Think slowly about that.

    Wasting time with word games now?

    GISS takes urban stations, and gives them the trends of the rural neighbors. In some stations, this will take out a recognizable UHI trend. In other stations, there never was any such obvious trend to begin with, either because it simply wasn’t there, or it was obscured by other factors. In any event, this adjustment is essentially equivalent to just erasing the urban stations. And whether you do this using NOAA’s old population data, or nightlights, or GRUMP, or not at all, it doesn’t really move the global mean very much.

    So now what?

  93. CE had an interesting point above. It's actually NOT the mislabeling of rural stations as urban that might mess the corrections up (the Orland example). It's the (possible) mislabeling of the rural (dark) reference stations.

    If you have a true rural station that is mistakenly labeled as urban, the only downside is that you lose a data point (but no bias is introduced), since the urban stations are all corrected back to rural references anyhow, and the trend impact is essentially what would happen if we just threw out the urban stations and did a rural-only trending.

    However, if you mistakenly classify urban stations as rural, then you could bias the results, since the dark stations are assumed to be clean and non-UHI-impaired. Even though here we might be talking about the urban measurement (nightlighting) being the concern, it's a similar issue if you somehow think every station shows UHI (IOW, that UHI happens a lot on farms and in towns and such…I've heard this concern, by the way).

    BTW, Mosh: You're really wrong to say that Hansen didn't adjust for UHI at all. You might not like the urbanization measurement he used, but he did USE ONE. At least entertain the idea that we do a big validation study (which you say is missing) and then find out nightlights are a GREAT proxy for urbanization (even for the kind of urbanization that causes UHI). I'm not arguing that's what'll happen, but entertain that it is POSSIBLE. Then, since Hansen did use the nightlights to correct, he did correct for UHI!

    ———————

    Think about an example in a different context. Let's say we look at children and see declining high-school reading performance, and we conclude that reading education delivery (whether from the school or effort by the student) has declined. Maybe. Let's say someone else comes in with a concern that actually this has nothing to do with education quality or level of effort by the students, but that what has changed is the population. The confounding-factor concern would be that students aren't as smart nowadays (from population changes, say).

    Let's say we measure this by controlling for IQ. We would be removing that as a factor. Let's say we don't have direct IQ tests, so we use parents' IQ tests. Then the question becomes how good a proxy they are. And note it's NOT critical that the parents' IQs be perfectly correlated to the children's (there are kids smarter than parents and vice versa), but that the group cohorts correlate well.

    Do you capisce? This is really simple, man.
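The asymmetry is easy to demonstrate numerically. In the toy run below, mislabeling clean rural stations as urban merely shrinks the reference set, while mislabeling UHI-affected urban stations as rural biases the reference trend by roughly (fraction mislabeled) × (UHI extra trend). All parameters are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(1)
    n_ref = 100_000
    climate_trend = 0.7   # assumed real trend, degC/century
    uhi_extra = 0.3       # assumed spurious extra trend at truly urban sites
    scatter = 0.3

    for false_rural_frac in (0.0, 0.05, 0.20):
        # Some fraction of the "rural" reference set is secretly urban.
        contaminated = rng.random(n_ref) < false_rural_frac
        trends = climate_trend + scatter * rng.standard_normal(n_ref)
        trends[contaminated] += uhi_extra
        bias = trends.mean() - climate_trend
        print(f"{false_rural_frac:4.0%} urban-as-rural -> reference bias "
              f"{bias:+.3f} degC/century")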

  94. Oh…and we’ve kinda segued which is fun and all…but do you understand why site to site variation (what makes not every site reflect regional trends) can implicitly cause some counterintuitive looking corrections, and that’s not wrong…and that you ahve to presever that since some other sites would be overcorrected? I mean do you undersstand and yeild that point at least? do you see why some urban stations will have to be warmed by the UHI correction because of this site variability issue? I’d just like that admitted and we can agree that STeve should have understood it (if he didn’t) or admitted it (if he did) when it was explained to him a couple bazillion times and when he had a headpost making hay out of the counterintuitive corrections!

  95. Nebuchadnezzar (#49411):
    Thanks for responding, and sorry for the delayed reply, but work has priority of course.

    I disagree with a couple of things you said about Klotzbach et al. (2009). We’re mostly in agreement about the paper’s point – you say that “[t]he authors do make it clear that they think that the problem lies in the surface data” while still holding on to the paper’s use of Santer’s first hypothesis, viz. “the characteristics of the divergence across the data sets are strongly suggestive that it is an artifact resulting from the data quality of the surface, satellite and/or radiosonde observations.” But those words are Santer’s, not Klotzbach’s. Klotzbach makes it clear that (in their view) Karl et al. (2006) focused solely on uncertainties in the satellite/radiosonde observations, and Klotzbach only discusses possible weaknesses in the surface measurements.

    I agree with you that the paper doesn’t prove that the divergence is 100% due to inaccuracies in the surface measurements. Frankly, in this low signal-to-noise regime, I’m not sure what would constitute proof anyway. And to compound the issue, the “true” value which is being estimated is already somewhat abstract, as the temperature can really only be described as a hypothetical reading of an instrument under ideal circumstances.

    What I disagree more strongly with is the assertion that Klotzbach “hasn’t moved the science on.” Karl et al. claimed that the surface v. satellite discrepancies were resolved; Klotzbach demonstrates that there remain significant discrepancies esp. over land. Karl concludes that “[i]t is likely that these [surface measurement] biases are largely random and therefore cancel out over large regions such as the globe or tropics”; Klotzbach challenges this assertion, and proposes reasons for biases which might account for the discrepancies. Moving away from the Karl et al. assessment, which (to my mind) unduly favored models over observations, is a step forward.

    ==========
    Klotzbach et al. citations can be found at #49387

    Temperature Trends in the Lower Atmosphere: Steps for Understanding and Reconciling Differences. Thomas R. Karl, Susan J. Hassol, Christopher D. Miller, and William L. Murray, editors, 2006. A Report by the Climate Change Science Program and the Subcommittee on Global Change Research, Washington, DC.

    http://www.climatescience.gov/Library/sap/sap1-1/finalreport/sap1-1-final-all.pdf

  96. Stephen Mosher, sorry… I thought I had replied to you regarding Monte Carlos at work today.

    I think I got caught up in algorithms for producing a random field with a given covariance matrix, and never pressed the Submit button. No doubt it's sitting there ready to go. I'll post the reply in the morning.
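For anyone curious, a standard recipe for the random-field step Carrick mentions (sketched here under assumed parameters, not necessarily his approach) is to color white noise with the Cholesky factor of the desired covariance matrix:

    import numpy as np

    rng = np.random.default_rng(11)
    n = 50
    x = np.linspace(0, 1000, n)  # hypothetical station positions along a line, km

    # Assumed exponential covariance with a 300 km correlation length.
    cov = np.exp(-np.abs(x[:, None] - x[None, :]) / 300.0)

    # Cholesky factor (with a tiny jitter for numerical stability).
    L = np.linalg.cholesky(cov + 1e-10 * np.eye(n))

    field = L @ rng.standard_normal(n)      # one correlated realization
    print("one realization:", np.round(field[:5], 2))

    # Check: the empirical correlation of neighboring points over many draws
    # should approach exp(-spacing/300), about 0.93 here.
    draws = L @ rng.standard_normal((n, 5000))
    print(f"neighbor correlation ~ {np.corrcoef(draws[0], draws[1])[0, 1]:.2f}")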

  97. OK, so everybody seems agreed that GISS is warming slightly faster than CRU of late, and that this is due to the interpolation, pretty much over the Arctic.

    That’s well and good and tidy, but I’m a little surprised it had not caused a divergence (however slight) between GISS and CRU before. Greenland has a habit of doing some interesting things, after all.

  98. Carrot Eater, GISS and CRU have diverged before. A quick trip to woodfortrees shows me the trends in temps since 1900:

    GISS 0.66 deg/century
    CRU 0.74 deg/century

    A trend map at GISS shows some cool spots on the north coast of Russia that may have been extrapolated to extra cooling in the Arctic that is not included in CRU.

    On the general question of extrapolation, assuming CRU in any way reflects global temperature trends is making an implied extrapolation: the average temperature for the parts of the globe that are measured is extrapolated into whatever parts of the globe are not being measured. In contrast, GISS extrapolates temperatures from stations 'close by'.
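MH's point about implied extrapolation can be made concrete with a toy example: five equal-area bands, the last one unmeasured. Leaving the band out assigns it the mean of everywhere else; a GISS-style infill assigns it the value of its nearest measured neighbor. The band trends are invented numbers.

    import numpy as np

    bands = np.array([0.4, 0.5, 0.6, 0.9, np.nan])  # trend per band; last unmeasured

    measured = bands[~np.isnan(bands)]
    cru_style = measured.mean()   # missing band implicitly gets the overall mean

    filled = bands.copy()
    filled[np.isnan(filled)] = bands[3]   # nearest measured neighbor (band 4)
    giss_style = filled.mean()

    print(f"measured-only mean (implicit extrapolation): {cru_style:.3f}")
    print(f"nearest-neighbor infill mean:                {giss_style:.3f}")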

  99. MH

    Carrot Eater, GISS and CRU have diverged before.

    Mostly before 1960. The difference between the two shows some more dramatic trends back then, compared to now. But it has been more stable since 1960 (though it's noisy), and I was wondering about exactly that, in light of what you said afterward.

  100. Carrot Eater, I think part of the apparent agreement for 1960-1980 is the baseline choice and the relative lack of warming or cooling during that period, isn’t it?

  101. Regarding the baseline choice: that's why I'm plotting the residuals between the two.

    But the apparent agreement (at least visually) continues past 1980.

    But I guess, given the relatively small area up there, the stations surrounding the Arctic used by GISS have to have really different trends from the mean of the rest of the world for an obvious divergence due to Arctic extrapolation to show up. And that difference has to be sustained.
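A sketch of that residual comparison: put both series on a common baseline first, so the difference isn't contaminated by each record's reference-period choice. Synthetic series with the trend figures quoted above stand in for GISS and CRU.

    import numpy as np

    rng = np.random.default_rng(7)
    years = np.arange(1900, 2011)

    # Synthetic stand-ins: 0.66 and 0.74 degC/century plus weather noise.
    series_a = 0.0066 * (years - 1900) + 0.1 * rng.standard_normal(years.size)
    series_b = 0.0074 * (years - 1900) + 0.1 * rng.standard_normal(years.size)

    base = (years >= 1961) & (years <= 1990)   # common 1961-1990 baseline
    series_a = series_a - series_a[base].mean()
    series_b = series_b - series_b[base].mean()

    residual = series_a - series_b
    slope = np.polyfit(years, residual, 1)[0] * 100
    print(f"residual trend: {slope:+.3f} degC/century")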

  102. If you look at old photos of cities, you will see a lot of unirrigated, barren, treeless hellholes. Depending on when the records start and where the instruments are placed (and moved), it is certainly possible to get artificial cooling within urban environments.

  103. Ah, Carrick, I finished the first experiment.

    Subdivide the world into 3×3 cells, then randomly select one station per cell. 699 cells only have one station.. anyway, the answers
    are pretty damn tight (did 100 cases).

    Next I suppose I would resample by randomly dropping cells. Your thoughts?
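A sketch of that experiment (assuming 3×3-degree cells; swap a real station table of lat, lon, anomaly in place of the fake one):

    import numpy as np
    from collections import defaultdict

    rng = np.random.default_rng(3)

    # Fake station table; replace with real (lat, lon, anomaly) data.
    n_st = 5000
    lats = rng.uniform(-90, 90, n_st)
    lons = rng.uniform(-180, 180, n_st)
    anoms = 0.5 + 0.4 * rng.standard_normal(n_st)

    # Assign each station to a 3x3-degree cell.
    cell_id = (np.floor((lats + 90) / 3).astype(int) * 200
               + np.floor((lons + 180) / 3).astype(int))
    cells = defaultdict(list)
    for i, cid in enumerate(cell_id):
        cells[cid].append(i)

    def one_draw():
        # Pick one station at random per occupied cell; note a cell with a
        # single station always returns that same station. Weight the chosen
        # stations by cosine of latitude as a stand-in for cell area.
        chosen = [idxs[rng.integers(len(idxs))] for idxs in cells.values()]
        w = np.cos(np.radians(lats[chosen]))
        return np.average(anoms[chosen], weights=w)

    draws = np.array([one_draw() for _ in range(100)])
    print(f"mean of 100 draws: {draws.mean():.3f} degC, spread: {draws.std():.4f} degC")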

  104. There’s no choice to make in the cells that only have one station. Wouldn’t that make the answers more ‘tight’ than they ought to be? Why not limit the analysis to cells where your random choice has a choice to make?

  105. Carrot, no word games.

    Hansen's adjustment adjusts for nightlights, not for UHI. Nightlights are a proxy for population density. Population density is a proxy for UHI. And as you say, Hansen's method effectively erases stations that were bright or dim in 1995. Not sure what the purpose of that is OTHER than to make a marketing claim that you adjust for UHI, which it clearly does not. Better, as I said, to eliminate the stations which have the potential of a contaminating influence. And yes, when you do this I would not expect to see the land numbers move by much.. maybe 0.15C, if that.

  106. Well, since everybody rags on Palin, I'm going to have to say I think she would be a better president than Obama. I don't think she is very smart, but I think she is more intelligent than Obama. Palin would be better because she wouldn't do a stimulus package and she wouldn't screw up our healthcare or pass crap and trade.

    Carrot eater, do you have a defibrillator at your house in case global temperatures rise .0001 degrees and you have a heart attack?

    Did anyone read “America’s Ruling Class and the Perils of Revolution”?
    This quote is just dynamite:
    “If, for example, you are Laurence Tribe in 1984, Harvard professor of law, leftist pillar of the establishment, you can “write” your magnum opus by using the products of your student assistant, Ron Klain. A decade later, after Klain admits to having written some parts of the book, and the other parts are found to be verbatim or paraphrases of a book published in 1974, you can claim that your plagiarism was “inadvertent,” and you can count on the Law School’s dean, Elena Kagan, to appoint a committee including former and future Harvard president Derek Bok that issues a secret report that “closes” the incident. Incidentally, Kagan ends up a justice of the Supreme Court. Not one of these people did their jobs: the professor did not write a book himself, the assistant plagiarized instead of researching, the dean and the committee did not hold the professor accountable, and all ended up rewarded. By contrast, for example learned papers and distinguished careers in climatology at MIT (Richard Lindzen) or UVA (S. Fred Singer) are not enough for their questions about “global warming” to be taken seriously. For our ruling class, identity always trumps.”

    This is why I have to ridicule people like the carrot eater. If Hitler were alive today and Michael Mann said, "There is propaganda against Hitler suggesting he does not like Jewish people. This is wrong; Hitler loves Jews."

    I have no doubt the carrot eater and countless others would believe him.

  107. Lucia, I hope you don't delete my post. That is such a good example of how ridiculous and gullible the people who believe CO2 is going to blow up the planet are.

  108. No, those are absolutely word games you’re playing. If you obliterate the stations you think are currently urban, then there’s not much chance of UHI creeping through.

    One remaining question is simply whether you have too high a rate of false negatives, on the test for urban-ness. But that question doesn’t justify you saying “they don’t correct for UHI”. It would only justify you saying “I’m not sure yet if their correction is effective, because it’s possible they are mis-identifying which stations could be prone to UHI”.

    The other remaining question is about micro-site effects even at rural stations; that isn’t a matter of UHI but it could still be a non-climate signal. Transition from LiG to MMTS, that sort of thing.

  109. Re: cce (Jul 21 08:41),

    If you look at old photos of cities, you will see a lot of unirrigated, barren, treeless hellholes. Depending on when the records start and where the instruments are placed (and moved), it is certainly possible to get artificial cooling within urban environments.

    You mean like Houston, Texas?
    http://files.harc.edu/Projects/CoolHouston/Documents/RemoteSensingStudy.pdf

    “over the course of 12 years, between 1987 and 1999, the mean nighttime surface temperature heat island of Houston increased 0.82 ± 0.10 [°C].”

  110. OK, CE, you can call it a word game. I think taking things absolutely literally is sometimes instructive. Hansen corrects for nightlights. At some point I want to see the physical theory that predicts a cooling for urban locations.

    Simply:

    If you have an urban station surrounded by rural stations, and the urban station COOLS relative to the rural stations, I agree with Peterson. This is counter to everything they teach in climate science 101. Such that:

    1. The data is wrong.
    2. The metadata is wrong.
    3. The theory requires modification or enhancement.

    Peterson held #3. He argued that these stations MUST BE in cool parks; hence, the data is right, the metadata is right, and, if we consider what Oke said, the theory is not challenged.

    Peterson POSTULATED (his word) that the stations were in cool parks. That is subject to verification. I'd suggest that #1 is on the table in some cases, #2 is on the table in some cases, and #3 bears some scrutiny.

  111. I agree….it’s word games. It’s like if I wanted to deconvolute student intellegence differences from training methods examination. And I corrected for IQ differences between two schools. You could always say IQ is just a proxy for intelligence!

    The issue is how good is the proxy. But certainly an attempt was made to correct. And you have to use some proxy for UHI. You can't use temp change ("H") itself! That would be like Jacoby favoring certain responders in his tree data sets. You have to deconfound for UHI-ness.

    If you think there is some other factor that causes UHI, then you need to label it, and we can at least conceive of an experiment to deconfound it. But label it…the way a real analyst would. Any type of marketing or social-science analysis involves trying to exclude confounding factors.

    You can say Hansen did (or really, I think all you can say is MAYBE did) a poor job of correcting for UHI. But saying he didn’t do it at all? AT ALL!?

    And "proxy for a proxy" is another word game. It's a proxy, either a poor one or a good one. And you really have no consistent evidence that it is a poor one. One site, Orland, doesn't cut it! That's arguing for a trend from an isolated, dramatized example. In fact…I'd Bayesian-bet that if nightlights really WERE just arbitrary and did not resolve urban from rural settings, some skeptic blogger would have posted that. Heck, I'll bet it would have already been noticed by Hansen. I'll further Bayesian-bet that nightlights have a high correlation with population density OR with pavement density.

    Furthermore, you really need to have nightlights mislabeling urban AS rural to affect the overall trend. Mislabeling a rural station as urban just loses you a data point, but doesn't bias the trend. (You follow why? Because Hansen's method is like doing a rural-only analysis.)

    P.s. I would appreciate it if you acknowledged understanding the CONCEPT of why site-to-site variation will make selected sites get a counterintuitive UHI correction, but that this does not bias the trend. We can move on and discuss the efficacy of nightlights as a proxy. But I want to make sure we agree on this basic math point first. Also, it would be good to have you on record, since SM has played possum and refused to acknowledge this concept.

    P.s.s. I had several long posts (with a lot of content) either moderated out or spam-filtered out. I can't continue to post if this continues to be the pattern (my work is wasted writing the posts). If you want to discuss things, we have to do it elsewhere, Steve. And if you don't hear from me…it definitely doesn't mean you beat me or something. (I'd man up and tell ya if you did. I get so sick of the hoi polloi on CA and RC and the like, who say they've beaten someone when they don't get a response…in some cases when the responses are being moderated out.)

    P.s.s.s. The time delays are a hassle also.

  112. PCTO: "The issue is how good is the proxy."

    Yes, how good is the proxy. Duh, I asked that two years ago. Catch the hell up already.

    Satellite maps of anthropogenic night lights provide an additional source of information on the spatial extent of urban land use. The Defense Meteorological Satellite Program/Operational Linescan System (DMSP/OLS) has provided observations of the location and frequency of occurrence of visible and near-infrared (VNIR) emissions from fires, lighted human settlements, and other anthropogenic sources since 1972 (Croft, 1978). These data have been compiled and processed to provide a unique measure of the location and spatial extent of human settlements in 1994/95 by Elvidge et al. (Elvidge et al., 1997a; Elvidge et al., 1999). Temporal persistence distinguishes stable lights associated with permanent settlements from intermittent emissions related to fires. The global distribution of lighted settlements is shown in Figure 1. From this dataset it is possible to derive estimates of the locations (centroids) and areas of contiguous lighted settlements worldwide. In discussions of distance and resolution, it is more intuitive to think of a linear dimension than an area, so the size of lighted settlements is given in units of a circular equivalent diameter, defined here as 2(A/π)^(1/2) for a lighted area A. The size distribution and cumulative area of the light dataset are shown in Figure 4.

    The use of lighted areas as a proxy for human settlement is subject to a number of caveats. A detailed comparison of the stable light dataset to higher-resolution (30 m) Landsat imagery indicates a detection threshold below which increasing numbers of lighted settlements are not imaged (Small et al., 2004, manuscript submitted to Photogramm. Eng. Remote Sens.). The limited spatial resolution (2.5 km) and sensitivity of the operational linescan system (OLS) sensor preclude detection of many small light sources (Elvidge et al., 2004). Atmospheric scattering and spatial uncertainty in geopositioning also diminish the spatial accuracy of satellite-detected night lights, resulting in lighted areas somewhat larger than the associated settlements. Because the spatial extent of a lighted area is known to be significantly greater than the actual built-up area of many settlements (Elvidge et al., 1997a), discussions of area and dimension must account for overestimation. A recent comparison of lighted areas with Landsat imagery by Elvidge et al. (Elvidge et al., 2004) indicates that a lighted area linearly overestimates a built-up area by approximately a factor of 2. In units of equivalent diameter this translates to an overestimation by a factor of 1.4.

    Stable lights provide a valuable complement to census-derived proxies for human settlement patterns. The median area of the 60,080 contiguous lighted settlements in the 1994/95 dataset is 33 km², which corresponds to a median settlement diameter of 6.5 km (circular equivalent). In comparison to the population-weighted median area of 961 km² and equivalent resolution of 31 km for the census tracts used in GPW2, the night light dataset offers considerably higher spatial resolution and therefore provides complementary estimates of the number and size of urban settlements. A lighted area is moderately correlated (r² = 0.68) with population (Sutton et al., 1997; Sutton et al., 2001). Because of the considerable variability in the relationship between a lighted area and population, no conclusions are drawn about the number of people living within specific lighted areas. Rather, the size and location of the lighted settlements are used as spatially explicit indicators of local concentrations of population, which are associated with urban land use…..

    If a lighted area is used as a proxy for urban area, then satellite data indicate that at least 2.5 million km² are occupied by urban and developed areas (including airports, oil production facilities, and other sparsely populated lighted areas). This is certainly an underestimate because of the detection threshold of the OLS sensor (Elvidge et al., 2004), and the fact that many small settlements are not detected or even illuminated at night. Accounting for undetected smaller (and less brightly lit) settlements would increase this area, but without knowing the size frequency distribution of undetected settlements, it is not possible to make a meaningful estimate. The size frequency distribution of lighted settlements does, however, provide a reasonably accurate description of the scaling properties of larger human settlements. In the context of this analysis, the combination of lighted settlements and moderate-resolution census data can distinguish between densely populated rural areas and more heavily developed centers of economic activity.

    NEXT: how important is population WRT UHI? Duh.

  113. Mosh, you are being a bit careless here, insisting on being literal to the point of losing meaning.

    At some point I want to see the physical theory that predicts a cooling for urban locations.

    There are many things wrong with this statement, though the end of your comment cleans some of them up.

    1. Flip that around. There’s no physical theory that necessarily predicts or specifies the level of extra warming for urban locations. There simply isn’t. It may be observable in many parts of the city, but it’s not required, nor is it cleanly predictable. It depends on a multitude of factors, such as whether wind is blocked in the area, whether the radiation view upwards is blocked in that area, whether there is ground moisture in the area, what the materials of construction are in the area.

    2. For an urban station to get adjusted up, it didn’t have to be cooling. It only had to be warming less quickly than the neighbors. We’re being literal, aren’t we?

    3. Once the factors that cause a local hotspot are in place, there isn’t necessarily going to be a continuing extra trend there. If an area around a weather station is already built up, and it’s got all the concrete and tall buildings it’s going to get, the UHI won’t keep going up and up and up.

    4, and possibly the biggest one: Signal/noise. You want to work with individual stations? You have to remember the noise that comes with it. Take any one station, I don’t care if it’s rural, urban, whatever. Look at it. The long term climate trends could be quite difficult to make out, because of all the non-climate disruptions (station moves, etc). Take an average over many stations, and a lot of that noise cancels out and the signal emerges.

    Simply: GISS takes one urban station, and compares it to a composite of rural neighbors. So long as there are several rural neighbors (and I can’t quantify ‘several’ at the moment), that composite should be pretty OK. But the one urban station could be any sort of mess, with nominal trends that are meaningless and indicative of neither climate nor UHI. All it takes is one station move, and it can be offset colder from some point on, and the GISS algorithm will react by putting a warming trend through that period.

    This is counter to everything they teach in climate science 101.

    A bit overly dramatic there? I don’t think any of my above points are at all obscure or difficult to grasp.

    1. The data is wrong.
    2. The metadata is wrong.
    3. The theory requires modification or enhancement.

    Peterson held #3. He argued that these stations MUST BE in cool parks; hence, the data is right, the metadata is right, and, if we consider what Oke said, the theory is not challenged.

    “MUST BE” is your own invention, apparently based on your reading of ‘postulated’. In context, that’s obviously a poor reading, so it’s disingenuous for you to continually make a point of it. He suggested it as a logical explanation, not as some fundamental principle. Should he have used a different word? Sure, but it’s clear that he didn’t mean what you’re making it out to be. Clear unless you want to play games.

    What’s happening here is that you are really really expecting to see a certain signal, but it’s actually quite difficult to unambiguously detect it, in trends over time in the regional means. This is apparently difficult for some to accept. Oh well.

    If you want to spend a ton of time trying to figure out why it isn’t easily observed, then go for it. Look at individual urban stations, one by one. Look at the station histories. Look at all the rural neighbors. I’m sure you’ll find a few stations that had an unambiguous spurious trend. This could be from a series of jumps, it could be smooth, but it probably doesn’t go on for 100 years. I’m also sure you’ll find some urban stations with non-UHI discontinuities, such that putting an OLS trend through the record makes it look like it’s warming less quickly (or is cooling), but that’s just an artifact of the inhomogeneities. I’m also sure that you’ll find some stations that are at open park or airport areas.

    It seems to offend you that nobody has gone through each station, case by case, looking for these things. Well, if it makes you happy, it’s something you can start doing for yourself.

  114. PCTO.

    http://www.epa.gov/hiri/resources/pdf/HeatIslandTeachingResource.txt

    Nightlights is a proxy (better than tree rings, I suppose) for population, and population I suppose is a proxy for building height, building materials, and waste heat.

    “Within the urban canopy layer, building geometry and surface thermal properties have been shown to have the largest effect on the magnitude of the UHI (Oke, 1982, 1987). Measurement of building geometry includes 1) building height/canyon width (H/W) ratio, 2) sky view factor, which is the proportion of sky seen from an outdoor point in space (Grimmond et al., 2001), 3) a compactness index, which is the ratio of building surface area to the surface area of a cube which has the same volume as the building (Unger, 2004; Emmanuel and Fernando, 2007). Other causal factors of the UHI include anthropogenic heat release from buildings and vehicles on the roadways, loss of evapotranspiration due to reduced vegetation and latent heat transfer, and the loss of wind within the built environment to transport heat out of the city (Oke, 1988).”

    When you get down to the physics of what causes UHI, let me know. But fundamentally it’s geometry and materials.

    And let me know when you have a physical theory that predicts cooling in urban areas. Waving your arms (like skeptics) and special pleading for natural variation is kinda weak.

  115. Steve:

    Please… let’s disaggregate issues so we can move forward. Find areas of agreement. Find areas of disagreement, and work through those disagreements, changing your views or mine by figuring out the right answers.

    The discussion of UHI proxy correlation is interesting and we should really get into it. Honest. I will get into both the logic (where I think you may have some gaps) and the physical details, where I agree you have looked at and talked about this a lot. More than I have. Honest, I will address this with you.

    Could you please do me the courtesy, though, of letting me know if you understand the concept of why INDIVIDUAL sites may have a counterintuitive correction? That it’s analogous to the reason why individual sensors may differ from a trend, but the overall trend can still be meaningful. If you can’t grasp this math point, it’s going to be hard to move forward on more complex issues. And I don’t want this to be like the Jeff Id negative-thermometer thread, where people (not just me, but Carrick et al.) kept explaining a basic math concept and he just could not grok it for hundreds of posts. The only way for me to break the logjam was to round up some actual PCA professors; they weighed in, and finally Jeff thought to reconsider his too-bold statement. Or like the amount of time it took for AP Smith to correct a minor point you had misstated. (This is also important to me, as it’s something CA has trumpeted, but that shows a logic gap. I actually expect more from you in intellectual fairness than from Steve… although I’m sure he’s brighter on the math and all…)

  116. Mosh

    It took a long time for the chart to display, too. I’m surprised how tight it is. Can you give more of an idea of the cell population distribution, for the cells you used?

  117. CE

    “1. Flip that around. There’s no physical theory that necessarily predicts or specifies the level of extra warming for urban locations. There simply isn’t. It may be observable in many parts of the city, but it’s not required, nor is it cleanly predictable. It depends on a multitude of factors, such as whether wind is blocked in the area, whether the radiation view upwards is blocked in that area, whether there is ground moisture in the area, what the materials of construction are in the area.”

    On the contrary. Note your switch. There are physical theories, and they do predict. About as well as early GCMs, I would hazard.
    One of the biggest determinants separating the positive UHI regions from the negative UHI regions is canopy cover.
    Then building height. Gosh, I wonder how that physics works.

    http://pages.unibas.ch/geo/mcr/Projects/BUBBLE/textpages/ov_frameset.en.htm

    Kinda cool

    “3. Once the factors that cause a local hotspot are in place, there isn’t necessarily going to be a continuing extra trend there. If an area around a weather station is already built up, and it’s got all the concrete and tall buildings it’s going to get, the UHI won’t keep going up and up and up.”

    Yes. I’ve said this repeatedly. Hence the importance of some glance toward history. Obviously.

    “Simply: GISS takes one urban station, and compares it to a composite of rural neighbors. So long as there are several rural neighbors (and I can’t quantify ‘several’ at the moment), that composite should be pretty OK.”

    Hypothesis, not fact. How to test it is the question. Put ANOTHER WAY: what evidence would you accept as evidence that it is NOT PRETTY OK? What would make you question the algorithm? What would you look at? Specific outputs, obviously. Or do you accept it blindly on the face of its description? It should be pretty OK. Cool. How do you propose testing whether it is pretty OK?

    ““MUST BE” is your own invention, apparently based on your reading of ‘postulated’. In context, that’s obviously a poor reading, so it’s disingenuous for you to continually make a point of it. He suggested it as a logical explanation, not as some fundamental principle. Should he have used a different word? Sure, but it’s clear that he didn’t mean what you’re making it out to be. Clear unless you want to play games.”

    Wrong. There are only certain logical possibilities. Peterson postulated, because the alternative, the LOGICAL alternative, would be to question the data. It is a logical explanation, a hypothesis.
    He argued that 1. he corrected the data, and 2. he audited the metadata. Then he found no warming trend, and concluded (that would be a logical step) that the urban sites were in cool parks, properly sited. His conclusion, not mine. I point out that he is forced to this conclusion by the logic of the situation. He argued this was a “mystery”, his word, and solved the mystery by: correcting the data (controlling for altitude, instruments, etc.), eliminating certain rural sites (rooftop locations), and appealing to a “cool park” postulate. In terms of the logic of argumentation it was something that he must do.

    Now, his argument has been recapitulated elsewhere; his postulate was recapitulated by Parker and by Jones. But his argument has not been tested. Is that point really that hard to grasp?

    To be SURE, there is no INTEREST in testing Peterson’s hypothesis, for a variety of reasons:

    1. It’s not very interesting to climate scientists.
    2. It’s tedious.
    3. The governing belief is that UHI is not a problem, so the “logical” explanation does the job of maintaining the status quo.
    4. Your chance of getting published is next to nil.

  118. 49551 (Steve):

    I couldn’t get the graph, and I might be violating the 5 minute rule by jumping in… but…

    Is the result unexpected? Wouldn’t sampling theory have made you expect this? Central limit theorem and all that?

  119. CE, sure. [N] = station count per cell.
    [2] 45 would mean 45 cells with 2 stations per cell.

    The tightness, I suppose, comes from the fact that a large portion of cells ONLY HAVE 1-3 stations. I resampled with replacement.

    Hmm, I posted on this last night but I think it got eaten. With so many cells having few stations, you are not going to see big differences by resampling. No leverage.

    Next, I’ll resample by cell. But that’s a big programming change, to collect the actual area sampled (I have it, buried in a subroutine), and I think I should look at that monthly.

    [1] 699
    [2] 247
    [3] 125
    [4] 68
    [5] 58
    [6] 39
    [7] 28
    [8] 23
    [9] 14
    [10] 16
    [11] 13
    [12] 12
    [13] 11
    [14] 6
    [15] 8
    [16] 7
    [17] 8
    [18] 6
    [19] 8
    [20] 4
    [21] 4
    [22] 3
    [23] 3
    [24] 2
    [25] 3
    [26] 3
    [27] 2
    [28] 2
    [29] 1
    [30] 1
    [31] 1
    [32] 3
    [33] 1
    [34] 0
    [35] 1
    [36] 1

  120. That’s why I said to do it for only those cells with a larger number of stations to pick from. Things will of course be tight if you keep choosing the same stations over and over again, by default.

  121. Great Zeke, a lot of the CO2 the earth had went into the coal deposits we are busily burning. So what? I have not watched your video yet, as I am at work. So we’re burning coal, and it’s going to add CO2 to the atmosphere. Over about 1000 years (rough estimate), the ocean will take all of that carbon out and put it into rocks again, so I don’t care.

  122. Zeke, Thanks

    The raster package made a bunch of stuff way easier. The biggest issue is getting the data into the right form for R to work quickly.
    As Ron found out (me too), loops kill you. Hehe, you can visit the R help archives and see me asking all sorts of stupid beginner questions. The guys on that list ROCK. It’s really kind of cool as far as help lists go. The rules are RTFM first, and then POST CODE with your problem. Really slick.

  123. Zeke, later today I’ll post up the code with the ‘bootstrapping’ scripts. I’ve added parameters to govern how you want to resample (pick the cutoff, say 3 stations per cell if possible, and then pick the size of the sample you want: 1 station, 2, 3, etc.).

    Next, I’ll do a sampling of grids. That might take some fiddling.

  124. steven mosher:

    “How would you proceed?”

    I’d start with a cell that has good coverage.

    I’d measure the time and spatial correlational structure for that site (I’d divide by season).

    I’d use this set of measurements as the basis for generating realistic temperature fields as a function of (lat, long, time)

    Of course you do this a bunch (“N”) times. So you have N instances,

    T_1(lat, long, time), T_2(lat, long, time), …, T_N(lat, long, time)

    For each instance compute the average value of T_n in the cell of interest.

    Also compute it for your synthetic networks of sizes m=1,5,10,50.

    In general your synthetic network may be written as a series of doublets:

    (lat_m, long_m), m = 1, 2, … M.

    I would probably analyze the spatial correlation scale of real networks (real networks often show a lot of clustering; you want to incorporate this in your modeling, at least at some point), and generate synthetic networks that exhibit realistic clustering.

    For each network choice, compute Tavg_M(time):

    Tavg_M(time) = sum( weight(m) * T(lat_m, long_m, time), m = 1, …, M ).

    (Assuming here weights sum to one.)

    Then compute

    err_M(time) = Tavg(time) - Tavg_M(time)

    Of course compute the mean square error of err_M(time), and plot this against M.

    I would probably generate a series of synthetic networks for each M, and average the mean-square error over this series.

    The main technical issue here is generating the pseudorandom distributions from the known correlational structure. The best way of doing that is to take the 2-space + 1-time inverse Fourier transform of the correlational matrix.
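
    A minimal sketch of this procedure in R, purely spatial and with assumed numbers throughout: an exponential covariance with a 600 km scale on a small 10×10 grid, fields drawn with MASS::mvrnorm rather than the inverse-FFT route described above, equal station weights, and unclustered random networks (so the clustering point is not modeled here).

    library(MASS)  # for mvrnorm

    set.seed(42)
    pts    <- expand.grid(x = (1:10) * 100, y = (1:10) * 100)  # 100 points, 100 km apart
    d      <- as.matrix(dist(pts))                             # pairwise distances, km
    Sigma  <- exp(-d / 600)                                    # assumed covariance model
    fields <- mvrnorm(500, mu = rep(0, nrow(pts)), Sigma = Sigma)  # N = 500 instances

    true.avg <- rowMeans(fields)  # full-coverage cell average, per instance

    ## mean-square error of an M-station network, averaged over random networks
    mse.for.M <- function(M, n.nets = 50) {
      mean(replicate(n.nets, {
        idx <- sample(nrow(pts), M)
        est <- rowMeans(fields[, idx, drop = FALSE])  # equal weights summing to one
        mean((true.avg - est)^2)
      }))
    }

    Ms <- c(1, 5, 10, 50)
    plot(Ms, sapply(Ms, mse.for.M), log = "xy", type = "b",
         xlab = "stations M", ylab = "mean-square error")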

  125. Ugh. I thought about synthetic data here, but I’m not entirely convinced it would actually add any information in the end.

    By the way, there’s a distinction to keep in mind here. We’re interested in the sampling density required to capture spatial variations in anomaly. But when you look at the distribution and correlations of real station data, you’ll also pick up variations due to various sorts of measurement error (station moves, etc.). That’s a separate issue. The advantage of using model results here is that you don’t have to worry about deconvoluting those effects. The ‘measurements’ are perfect.

  126. Slimething,

    What does the linked paper have to do with the examples I gave in my post?

  127. Carrot Eater:

    “The ‘measurements’ are perfect.”

    They don’t contain station location errors, that’s true, but they do contain physical modeling errors, and over the correlation times of interest, they most likely are meaningless.

    With respect to station location moves and so forth, these are broadband noise sources, so they aren’t going to contribute much to a covariance matrix analyzed over, e.g., 100 years. Another advantage to this approach is that you can evaluate one grid area where the station history is super well known. In this case, the station moves can explicitly be accounted for using “manual” (rather than automatic) adjustments.

    Of course, it would be very interesting to compare a statistically-based Monte Carlo to a model-based one: How do the covariance matrices compare?

    It’s a pretty rich topic, actually, if you’re not scared of analyzing data.

  128. Carrick

    “but they do contain physical modeling errors”

    Ahem. Keep in mind what is being attempted here. You’re trying to figure out spatial correlation patterns over latitude and longitude when you don’t have dense (or sometimes, any) data in the parts of the world that you’re worried about in the first place. In that case, I’d give the model a second look, since it’s got circulation patterns driven by physics, even if there is some error there. I’d do that if I could, instead of taking some nicely populated cell at one place on earth, and hoping its characteristics would hold elsewhere.

    “and over the correlation times of interest, they most likely are meaningless.”

    Well, there’s a sweeping handwave. Seeing as using a model would be a heck of a lot easier, and it doesn’t suffer from measurement error at all, I wouldn’t be so fast to dismiss it. A fairly simple test: repeat H&L’s 1987 correlation study with today’s data, as well as with the model results (a rough sketch of that comparison follows at the end of this comment). If the correlation vs. latitude relationships match OK (and, if you want to go further into longitude, by grid cell), then so far as you can tell, the model is doing OK on spatial correlation. So you don’t feel so bad about using the model to gauge spatial correlation where you don’t have so much data.

    “Another advantage to this approach is that you can evaluate one grid area where the station history is super well known.”

    Elusive.

    “the station moves can explicitly be accounted for using ‘manual’ (rather than automatic) adjustments”

    Again elusive. Say the field notes say some tree branches overgrew the station, then were cut down a few years later. How do you manually adjust for that? You have no choice but to go with an objective statistical method.

    And anyway, one thing at a time. The model path that you’re so fast to dismiss avoids these issues, and allows you to home in on the question that is being posed. So it deserves a look.

    If I recall correctly, there’s a paper that tried to do this using both model results and observation. Believe it or not, the model method came back with the more conservative answer: that more observation points were required.
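
    A rough sketch of that check in R, with hypothetical inputs, and binned by station separation rather than by latitude band (the 250 km bins and 5000 km cap are arbitrary choices): ‘anoms’ is a years-by-stations anomaly matrix, ‘pts’ holds each station’s lon/lat, and the geosphere package supplies great-circle distances. Run it once on observations and once on model output sampled at the same points, and compare the two curves.

    library(geosphere)  # for great-circle distances (distm)

    corr.vs.distance <- function(anoms, pts, breaks = seq(0, 5000, by = 250)) {
      r    <- cor(anoms, use = "pairwise.complete.obs")        # station-pair correlations
      d    <- distm(as.matrix(pts[, c("lon", "lat")])) / 1000  # separations in km
      keep <- upper.tri(d)                                     # count each pair once
      bins <- cut(d[keep], breaks)
      tapply(r[keep], bins, mean, na.rm = TRUE)                # mean correlation per bin
    }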

  129. Carrot Eater:

    “Ahem. Keep in mind what is being attempted here. You’re trying to figure out spatial correlation patterns over latitude and longitude when you don’t have dense (or sometimes, any) data in the parts of the world that you’re worried about in the first place. In that case, I’d give the model a second look, since it’s got circulation patterns driven by physics, even if there is some error there. I’d do that if I could, instead of taking some nicely populated cell at one place on earth, and hoping its characteristics would hold elsewhere.”

    The problem is precisely that the current generation of GCMs is unlikely to reproduce regional-scale spatial and temporal correlations. It has to do with the “rule of 10”: you need roughly 10 discretization points to accurately reproduce the physics of a given spatial scale. If the scale you are interested in is circa 600-km, you need to have no more than 60-km sized discretizations.

    If people want to claim they can do better, the way to do this is by bootstrapping: namely, validate the models in the regions where you have good spatial coverage first, before applying them to regions where you have no idea whether they are giving reliable results or not.

    Secondly “elusive” is over-sell here. You are basically making the same logical error that the critics of GISTEMP and other reconstructions are making. Simply pointing out the possibility of an error is not a demonstration that it matters. And further, we do have nearly complete knowledge in some stations at least, and Monte Carlo’ing can be used to find out what rate of station shifts etc would be required before this became a confounding factor.

    I’d be willing to bet that in many regions it doesn’t significantly affect the covariance matrix. You might disagree, but that isn’t proof of anything. Proof requires quantification of the magnitude of the effect, and demonstration that under realistic circumstances (e.g., not assuming yearly hops of stations) you can substantially affect the fidelity of your empirically obtained covariance matrix.

    I’d be happy if the models were good enough to sort this out, because what it means is you could use a combined model+data approach to interpolate over regions where you have limited data, and improve the accuracy of the temperature reconstruction. But it’s been my experience in analyzing short-period fluctuations from climate model outputs that they do a horrible job of reproducing climate on relevant time scales for this sort of problem. Given that there’s a relationship between temporal and spatial correlations for weather/climate, I don’t personally hold much hope that the models can give us much useful information for this problem…

    …but as I said this is testable.

    “If the scale you are interested in is circa 600-km, you need to have no more than 60-km sized discretizations.”

    I would agree with that, if you’re expecting strong gradients in anomaly within that 600 km. What’s the resolution on the highest-res GCM these days? Otherwise, I think there are regional models at higher res.

    “Namely, validate the models in the regions where you have good spatial coverage first, before applying them to regions where you have no idea whether they are giving reliable results or not.”

    That’s exactly what I said above. I’d feel more comfortable using a model that works elsewhere, to get an idea for the sparse areas, than extrapolating empirical characteristics from a lower latitude to the Arctic.

    “Secondly ‘elusive’ is over-sell here. You are basically making the same logical error that the critics of GISTEMP and other reconstructions are making. Simply pointing out the possibility of an error is not a demonstration that it matters. And further, we do have nearly complete knowledge in some stations at least,”

    Not oversell. Some histories are fairly complete, but it is difficult. You can’t expect all the microsite effects to be recorded. And again, even if everything was recorded, then what? What does that do for you? In many cases, you’re going to end up comparing against the neighbors anyway. Now and again you might get a station move so sharp, or with enough overlap, that you can figure out the adjustment just by looking at that station itself. But otherwise, you have to compare with the neighbors.

    In the USHCN’s latest automatic procedure, about half the adjustments made can be matched up with something in the field notes. About half cannot. I’m inclined to think that the entire unassigned half is not just the algorithm making up phantom things; it’s picking up some real discontinuities.

    “I’d be happy if the models were good enough to sort this out, because what it means is you could use a combined model+data approach to interpolate over regions where you have limited data, and improve the accuracy of the temperature reconstruction.”

    Already done. The reanalysis products take this approach.

    “But it’s been my experience in analyzing short-period fluctuations from climate model outputs that they do a horrible job of reproducing climate on relevant time scales for this sort of problem. Given that there’s a relationship between temporal and spatial correlations for weather/climate, I don’t personally hold much hope that the models can give us much useful information for this problem…”

    I’d say the spatial effects are more important than temporal here. Yes, they are coupled to some extent.

    “…but as I said this is testable.”

    As I said as well.

  131. Carrot Eater:

    “Not oversell. Some histories are fairly complete, but it is difficult. You can’t expect all the microsite effects to be recorded. And again, even if everything was recorded, then what? What does that do for you?”

    The oversell is claiming that every little microsite change matters for covariance studies.

    Sounds like we agree on most everything else. 😉

    Other than that, I’m not aware of any temperature reconstructions (e.g., CRU, GISTEMP) that regularly include GCM modeling to interpolate sparsely sampled regions of the globe.

  132. “Other than that, I’m not aware of any temperature reconstructions (e.g., CRU, GISTEMP) that regularly include GCM modeling to interpolate sparsely sampled regions of the globe.”

    No, they sure as heck don’t, though GISS uses it to figure out the error bars we’re talking about here.

    I’m talking about reanalysis, like the NCAR/NCEP product. Its objectives are much, much wider than anything we’re talking about here, but in essence it combines a ton of historical observations with a model.

  133. Ok, Carrick.

    To do that is going to require a reasonable coding effort. I’m really close to being able to pull off a grid resampling. If I can put that to bed, I can start on the procedure for generating synthetic data, per your description. You and Carrot can debate the merits.

    The station resampling code is done. Based on Carrot’s suggestion I added some parameters to control how you resample.

    On my grid resampling effort…

    There are 3236 grid cells with land in a 3×3 degree world; currently about 1340 of these have temp records.

    According to source:
    148940000 km2 : global land (wiki)
    146473976 km2 : according to my land mask (arrg, Trenberth has missing heat..)

    summary stats for actual coverage

    summary(DF2[,5])
        Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
    25800000 38510000 49690000 54990000 74690000 83220000

    So on average about a third of the land area is sampled, and a bit more than half at best.

    The data looks like this:

    “Date” “Temperature” “Stations” “Cells” “Area”
    1900 0.207797063061673 1324 439 25883932.1227342
    1900.08 -0.638757141373956 1331 437 25820030.5862553
    1900.17 -0.495168576073524 1352 441 26108594.0212702
    1900.25 -0.350728258261984 1360 442 26123168.6257816
    1900.33 0.135146067458723 1360 440 26053950.8777645
    1900.42 0.300874305898727 1341 437 25891858.1985893
    1900.5 -0.0189891136055169 1348 440 26171126.7426165
    1900.58 0.214695748671025 1335 433 25803861.0681256
    1900.67 -0.0363089820855964 1345 441 26223051.2476435
    1900.75 0.817315097206382 1341 440 26161471.9325204
    1900.83 -0.178316870876534 1343 443 26321135.3760828
    1900.92 0.93473340239923 1343 445 26433878.9627745
    1901 -0.00705570904559395 1348 440 26158721.2442363

    So I can randomly drop cells, compute the difference, and get some gauge of the relationship between area and the deviations that arise from randomly losing area coverage. Kind of an area-versus-error curve, if that makes sense (a sketch follows below).

    Before doing the Monte Carlo, I will probably bound the problem from above and below.
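
    A minimal sketch of that experiment in R, assuming a hypothetical per-cell table with columns ‘area’ and ‘anom’ for a single month: drop a random fraction of cells, recompute the area-weighted mean, and log the error against the area lost.

    ## hypothetical input: one row per cell, columns 'area' (km2) and 'anom' (C)
    drop.cells <- function(cells, frac, n = 200) {
      full <- weighted.mean(cells$anom, cells$area)  # full-coverage mean
      t(replicate(n, {
        keep <- sample(nrow(cells), round((1 - frac) * nrow(cells)))
        c(area.lost = 1 - sum(cells$area[keep]) / sum(cells$area),
          error     = weighted.mean(cells$anom[keep], cells$area[keep]) - full)
      }))
    }

    ## toy usage: 1340 cells with random areas and anomalies
    cells <- data.frame(area = runif(1340, 5e4, 1e5), anom = rnorm(1340))
    res   <- drop.cells(cells, frac = 0.25)
    plot(res[, "area.lost"], res[, "error"],
         xlab = "fraction of area lost", ylab = "error in the mean (C)")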

  134. (Carrot Eater, Zeke and others please read)

    Dear Bob,
    regarding your article:
    http://bobtisdale.blogspot.com/2010/07/land-surface-temperature-contribution.html

    I offered you a bottle of wine if you would go through my new article:
    http://hidethedecline.eu/pages/posts/the-perplexing-temperature-data-published-1974-84-and-recent-temperature-data-180.php

    If you had done this, you would certainly not have written as you do.

    One more time: please look at PART 2 of my article, chapter 3.4.
    This is where I explain that GISS includes ocean in their station data series, and where I show a graphic of the ocean included.

    So if anyone is aware of ocean data in GISS station “land” data, it’s me. The fact that you and others keep writing that you think I’m not aware of ocean data in GISS station “land” data might be my fault, due to bad communication.

    I wrote in PART 4 of my article:

    “I am sure that the algorithm or specific method used by GISS to combine land temperature and SST explains some of these apparently odd findings. But whatever the ‘algorithm’ used by GISS is, can it be justified that GISS gradually weights the warm NH-Land graph more and more? And ends up with around 67% NH land fraction in 2007, although NH only has 40% land? Maybe this algorithm or method deserves some attention?”

    And in the WUWT article (http://wattsupwiththat.com/2010/07/17/tipping-point-at-giss-land-and-sea-out-of-balance/#more-22126) I write:

    “In general GISS defends use of larger land fraction due to their 1200km zones around land stations reaching some ocean areas. But this does obviously not explain a land fraction that appears to go from near zero to around 70% globally during the 20th century.”

    Now, your article, Bob: you focus on the similarities between CRU and GISS, I suppose to say that the resulting GISS is OK?
    The thing is, CRU and GISS end up rather alike. But in the CRU data I find much more direct land data adjustment than for GISS. On the contrary, for GISS the direct land data adjustments are not so big at all (to my surprise); instead, the GISS warming trend that is similar to CRU comes from combining the SST and “land”.

    Something that is messy in all this is that you seem to trust that CRU land is not ocean while GISS is… Yes, yes, GISS has ship and island data included, but a BIG part of the GISS ocean area in their “land” data is obviously from coastal stations. These stations are exactly the same as for CRU. So it is nonsense to say “CRU is just land data”.
    Just because CRU says that their coastal stations are land, while GISS’s (the same) coastal stations cover huge ocean areas, you cannot just treat the same data as if completely different.

    I have raised some serious problems in the data, and I know you disagree strongly, but I have not seen convincing arguments from you. It is not ill will.

    K.R. Frank

  135. Frank

    “But whatever the ‘algorithm’ used by GISS is, can it be justified that GISS gradually weights the warm NH-Land graph more and more? And ends up with around 67% NH land fraction in 2007, although NH only has 40% land? Maybe this algorithm or method deserves some attention?”

    “In general GISS defends use of larger land fraction due to their 1200km zones around land stations reaching some ocean areas. But this does obviously not explain a land fraction that appears to go from near zero to around 70% globally during the 20th century.”
    I don’t know what calculation you are doing to back-calculate the land/sea fractions, and I’m honestly not curious enough to care. Maybe somebody else will be. But whatever it is that you are doing, it’s just incorrect.

    As stated before, if you go forward instead of backward, and start with the same source data that GISS starts with, and calculate your own land average, and calculate your own sea average, and combine them in a properly proportionate way, you get something like GISTEMP’s land-ocean index. This should give you great pause about your own calculations, and how meaningful they might be.

  136. Frank,

    Please point out where you acknowledge that the GISTEMP “meteorological” index (dTs) is weighted not by the amount of land, but by the size of latitude zones. It warms too slowly for a land-only index. It therefore cannot be used to back-calculate the weight given to the sea surface in the land-ocean index. Once you comprehend this, you will (hopefully) understand why all of your calculations are wrong.

    You can see the relative size of the ocean and land coverage for various years by going to the GISTEMP map creation tool.

    http://data.giss.nasa.gov/gistemp/maps/

  137. Moshpit:

    1. Please note here that you understand why many individual urban stations could have counterintuitive (“wrong sign”) UHI corrections if the mean of site-to-site non-UHI variation were of the order of the mean of UHI corrections.

    2. Please also note here that you understand why a SMALL number of individual urban stations might have a counterintuitive UHI correction even if the mean of non-UHI site-to-site variation is significantly smaller than the mean of UHI, provided there is a bell curve of distributions.

    3. Please note that you understand why any rural stations misclassified as urban may have counterintuitive corrections, because of site-to-site variation swamping the negligible UHI correction.

    4. Please also note that you understand, for all of the above, that there will be other stations which are OVER-corrected, and that the overall effect on TREND will be nil (just some stations over- and under-corrected). A toy simulation of these points follows at the end of this comment.

    ———————————————-

    After this, we can then discuss the fundamental concept of how well UHI is being assessed and what the best measurement should be. Agreeing to the above does not require you to make any concessions on nightlights, efficacy of UHI spotting, etc. I want agreement on the basic concept of corrections within distributions that also have random differences not related to the corrected variable. It just clears the decks of some single-station pontificating done earlier as an “aha”.
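
    A toy check of points 1-4 in R, with made-up numbers (the +0.3 bias and the 0.3 noise level are assumptions): every urban station carries the same true UHI bias relative to its rural composite, plus site-to-site noise of similar size. A noticeable fraction of the individual “corrections” come out with the wrong sign, yet the average correction still recovers the bias, so the mean trend is unharmed.

    set.seed(7)
    urban <- 0.3 + rnorm(1000, 0, 0.3)  # urban minus rural composite, per station
    corr  <- -urban                     # naive per-station correction
    mean(corr > 0)                      # ~0.16: counterintuitive, wrong-sign cases
    mean(corr)                          # ~ -0.3: the bias is removed on average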

  138. Nick Barnes,
    That’s good stuff there. Highlights what GISTEMP ends up doing in oddball cases.

    And a lot of effort to answer somebody’s question. Nice.

  139. Harold W said: “What I disagree more strongly with is the assertion that Klotzbach ‘hasn’t moved the science on.’”

    Sorry for the very slow response. I didn’t want to drift off topic, and then I forgot which thread I’d posted to.

    Maybe I was being a bit harsh. The problem, as I saw it, was that the way their hypotheses were constructed, they were bound to be rejected based on what we already knew. Predictions are supposed to be bold and their consequences unlikely, and I didn’t think Klotzbach fit into that category.

    If I look in the IPCC report, I see that the rate of warming is higher over land than over the oceans. I see that nights have warmed faster than days and that estimates of tropospheric temperature don’t show the ‘expected’ amplification. These things we already know, so the hypotheses proposed by Klotzbach et al. were bound to be rejected.

    It seemed that they were rehashing an alternative hypothesis. I no doubt missed some subtlety in the interpretation, but that was the way I saw it at the time I read it.

    I agree with you that saying it’s an either/or situation is perhaps simplifying the problem unnecessarily.

  140. Zeke and Lucia,

    Stumbling about, I came upon this very nice website, woodfortrees.org. Software developer Paul Clark put it together: the site allows the user to choose various temperature data sets and see how they look over user-selected time periods.

    Maybe “everybody else” has already seen this, but it’s elegant, and new to me.
