Category Archives: Toy Physics

Toy Math vs. ‘Lew’ math: Toy wins.

On Saturday, I discussed Lewandowsky’s No Lew. We don’t need that level of flood defenses. In that post, I presented a qualitative discussion of a Lewandowsky claim; today, I’m going to show some “math”. The purpose is to explain why the italicized claim in Pancost and Lewandowsky below is exaggerated to the point of being wrong:

Second, the uncertainty in our projections makes adaptation to climate change more expensive and challenging. Suppose we need to build flood defences for a coastal English town. If we could forecast a 1m sea level rise by 2100 without any uncertainty, the town could confidently build flood barriers 1m higher than they are today. However, although sea levels are most likely to rise by about 1m, we’re really looking at a range between 0.3m and 1.7m. Therefore, flood defences must be at least 1.7m higher than today – 70cm higher than they could be in the absence of uncertainty.
And as uncertainty increases, so does the required height of flood defences for non-negotiable mathematical reasons.

(In their blog post, the claim about ‘non-negotiable mathematical reasons’ was linked to a paywalled version of their paper. I’ve replaced the link to send you to a free copy.)
Getting mathematical, I will also show that while the final claim, “And as uncertainty increases, so does the required height of flood defences for non-negotiable mathematical reasons”, is true, the increase due to uncertainty can be so small that it is lost when one rounds the required height to the number of significant figures in the estimated sea level change or its stated uncertainty. This makes the claim rather less impressive than it might sound.

On Friday, my comment about this was:

The idea that “flood defences must be at least 1.7m higher than today” is utter nonsense. The more correct claim is “we need to consider the possibility that flood defenses in 2100 may need to be 1.7 m higher than today”.

Note that 1.7 m is the upper end of the previously discussed uncertainty window for sea level rise. The general point would appear to be that one ‘must’ build to protect to the upper end of the range; that’s bunk.

Today, I’m going to use a simple “Toy” model to show that, in fact, ‘uncertainty’ does not require anyone to build a wall to match the upper range of projections now or ever. One only needs to build to that range if that sea level rise materializes.

Why don’t we need to do this? Because we can wait.

Of course, I already said this, but today, I’m going to do math. The math will be similar to that done by Lewandowsky and co-authors in the paper he linked. Unlike Lewandowsky and Pancost, I’m not going to suggest that my results are based on ‘non-negotiable’ math. They are based on

  1. A decision making algorithm policy makers could choose to follow if they wished.
  2. Some simple “toy” assumptions permitting a mathematical function to ‘project’ the expected value of sea level rise (SLR) and its uncertainty as a function of time. This model will require specifying two parameters.
  3. A mathematical model to predict inundation risk. This model will require specifying one parameter, λ.
  4. The mathematical manipulations themselves.

Though it might not be obvious based on Lew’s verbiage in his recent blog post, Lew did three of the four in his paper. I’ll discuss his assumptions and my extension below:

  1. In his peer reviewed paper, Lew’s ‘analysis’ is based on the assumption that decision makers build defenses one time and one time only, and that they act based on current, or at least reasonably current, projections for sea level height in 2100. That is: they don’t get to build a little now and then re-visit needs later.

    In contrast, I will assume that a board deciding on implementing engineered protections that might be required 86 years from now (2014-2100) can elect to build protections in increments. So, for example, they can choose to build in 2 increments, starting by building protections they anticipate will protect through the first 43 years and deferring the decision for further building to the future.

    Note: this assumption is not ‘math’; it will turn out to make a very large difference to our estimate of the height of engineered protections ‘required’ in light of uncertainty in projections.

  2. Because of the assumption the board only acts once, Lew didn’t need a toy model to project sea level rise. This simplifies his “math”. You get to decide if it simplified it a bit too much.

    Since I am permitting the board to make prudent decisions, I will assume

    (a) The expected value of sea level rise $latex E[SLR] $ varies quadratically with time, and the board will continue to believe this is the functional form. The current (2014) level is 0 m and the projected value in 2100 is 0.5 m, so $latex E[SLR]= (0.5m ) (t/86)^{2} $ where time t is measured in years from 2014. (Note: I introduced a parameter here. It’s the exponent of ‘2’. A short R sketch of these toy functions appears after this list.)

    The expected value of sea level rise is illustrated with the black squares in the figure below which shows the current sea level and projections at the end of each build time if we assume 5 build periods:

    [Figure: QuadraticProjectionsofSLR]
    Note the final expected value for sea level rise is 0.5 m and matches that used by Lewandowsky. His analysis didn’t need additional detail because the board was assumed to lack any critical thinking skills and could act only once.

    Those with critical thinking skills know the entire point of deferring decisions is to permit the current and future boards to use future events to guide their future decisions. The current board can’t know the future value of $latex SLR $, nor can they know the future projections. For the purpose of decision making, they need a model. I’ll suggest this simple one:

    The current board will assume that when some future year Y arrives (e.g. Y=2050), scientists will observe the sea level rise, which we will call “Observed”, $latex O[SLR](Y) $. At that future time, the current board assumes the future board will continue to believe the sea level rise varies quadratically, passing through the pairs $latex (2014, 0) $ and $latex (Y, O[SLR](Y)) $ with its minimum at 2014. That future board will be assumed to use the same risk methodology the current board likes, but will base flood risk calculations for events further into the future on new projections which have been updated based on new observations.

    Note that under this “delayed action” plan, if the observed rate of rise exceeds the current best estimate, the future board will project the sea level to rise at a faster rate than currently projected; at that point they can build the wall higher than the current board’s best estimate of the build height for that future year. The converse is also true.

    (b) The current board will also assume the uncertainty in projected sea level rise varies quadratically with time measured from the date on which they are making decisions. The current level is 0 m; the value in 2100 is 0.36 m. So, in 2014, the uncertainty in projected sea level rise varies as $latex \sigma = (0.36 m) (dt/86)^{2} $ where $latex dt = t-2014 $; note $latex T=2100-2014 = 86 $ in the denominator is the full time period the board is considering when developing their response. (Note: I’ve introduced my second parameter; it’s 2.) The current uncertainty intervals are illustrated by the range bars in the figure above.

    When the future arrives, the current board assumes scientists’ ability to predict sea level rise will not have improved, so that $latex \sigma = (0.36 m) (dt/86)^{2} $ still applies, where $latex dt $ is now computed from the decision year, i.e. $latex dt = t-Y_{current} $. So, for example, when making decisions in 2050, they assume the future board will use the equation indicated above substituting $latex dt = t-2050 $. (Note: if scientists’ ability to predict improves, that will make the case for waiting even better than presented here.)

    (c) With respect to the decision to build in the start year (i.e. 2014), the current board will assume they will build $latex N_{build} $ times between now and 2100, each time building at the beginning of the period. So, the first build period begins in 2014. If there are two build periods, the second one begins 43 years from now in 2057. At each build time, the board will apply the exact same method Lew used to estimate the proper wall height, except that rather than building to protect through 2100, they will build to protect up to the next scheduled build time. That is: in 2014, they will build to protect to 2057. Moreover, Lew actually provided three methods of estimating the added protection height required to make flood risk in the final year match current flood risk, each using a different probability distribution function (pdf) for the uncertainty. Out of caution, the board will assume this pdf is Gaussian, which maximizes the predicted height of protections.

    Also, in its wish to use the exact same method as Lew, the current board will sift through his papers to find any parameters he used, and match those. They will notice he cites “Hunter”, who uses a formula that includes a parameter $latex \lambda $ which is assumed constant with time. The board will read Lew’s paper and discover this bit of text, which contains numbers that permit them to back out the value of $latex \lambda $ used by Lew:

    When uncertaintySLR is non-zero, then irrespective of what assumptions are made about the distribution of SLR, the required protective response increases and deviates rapidly and
    in an accelerating manner from the anticipated mean SLR. For example, under a Gaussian assumption, if uncertaintySLR is around 0.36 m, this raises the required protective response to around 1m. That is, an expected SLR of 0.5m requires that dikes and levees be raised by twice that amount in order to keep the risk of flooding constant in light of uncertaintySLR. If other distributional assumptions are made, the values change but the in-principle conclusion remains the same:

    (Note: consultation with Hunter suggests Hunter used the 5%-95% uncertainty range and the 0.36 m may be a typo; Hunter shows 0.26 m. I’ll be using the 0.36 m as my goal is to make a qualitative rather than quantitative point.)

    Note: the text corresponds to a discussion of the point highlighted in the figure below:

    [Figure: LewindowskyFigureWhichPoint]

    My future figures will use this value of $latex \lambda $ which is held constant, as Lew suggests.

    At future build times, the current board assumes the future board will act as follows: if, based on the observations available at that future time, the height of the existing wall is estimated to be sufficient to protect the village through the end of the next build period, the future board will skip that build (but will not demolish the wall); otherwise, they will build to protect using the updated risk model that incorporates knowledge of the currently observed sea level.
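
    Since the toy model above is entirely specified by a few numbers, here is a minimal R sketch of it. The functional forms follow the assumptions above; the allowance formula (expected rise plus $latex \sigma^{2}/(2\lambda) $ under a Gaussian assumption) is my reading of the Hunter-style method Lew cites, and the value of λ is the one I backed out of the quoted text, so treat both as assumptions rather than quotations.

    slr_expected <- function(dt, total = 86, slr_total = 0.5) {
      # expected sea level rise (m) dt years after the decision year
      slr_total * (dt / total)^2
    }
    slr_sigma <- function(dt, total = 86, sigma_total = 0.36) {
      # uncertainty (std dev, m) in the projection dt years out
      sigma_total * (dt / total)^2
    }
    allowance <- function(dt, lambda = 0.13) {
      # Hunter-style allowance under a Gaussian assumption (see note above);
      # lambda = 0.13 is the value backed out of the quoted text
      slr_expected(dt) + slr_sigma(dt)^2 / (2 * lambda)
    }
    allowance(86)   # one build protecting all the way to 2100

    With these numbers, a single 2014 build protecting through 2100 comes out at roughly 1 m, consistent with the value quoted above.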

Results! (Otherwise known as the fun part!)

First: My results represent a detailed discussion of the height of the protections for the case represented by the highlighted point in “Lew’s” figure above. Recall the fundamental reason my results will differ from Lew’s is that I permit the board to build in increments; in particular, they may build more than once. I will call the number of planned builds $latex N_{build} $. Since I am using Lew’s method of risk analysis, my results will reproduce his when the number of builds is $latex N_{build} =1 $, as that is the case his analysis represents. (Note however, I didn’t spend much time matching. I had to back a parameter out from the text, and comparison of the two papers suggests a typo in one or the other. The results are somewhat sensitive to that parameter.)

To obtain results, I coded the decision algorithm described above in “R”. I did this because, as far as the current board is concerned, the future observed values of sea level rise, $latex O[SLR](Y) $, are random. So, I made $latex O[SLR](Y) $ in build periods after the initial one a random variable with standard deviation equal to that estimated for the projected time between builds and mean equal to the updated projection for $latex E[SLR](Y) $. Recall that the updated projection in future years is based on $latex O[SLR](Y) $ for the most recent observation. I also implemented the board’s decision to build only if the protection required at the end of the current build period exceeds the height of the existing wall. Results below are based on 10,000 iterations.
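
I haven’t reproduced my full script here, but the following is a rough, self-contained R sketch of the algorithm just described; the parameter values (including λ) and implementation details are illustrative, so expect it to reproduce the qualitative behavior rather than my exact numbers.

set.seed(42)
simulate_builds <- function(n_build, n_iter = 10000, total = 86,
                            slr_total = 0.5, sigma_total = 0.36, lambda = 0.13) {
  step <- total / n_build                      # years in each build period
  final_height <- numeric(n_iter)
  for (k in seq_len(n_iter)) {
    wall <- 0; slr_now <- 0; t_now <- 0
    coef <- slr_total / total^2                # coefficient of the quadratic projection
    for (b in seq_len(n_build)) {
      e_slr <- slr_now + coef * ((t_now + step)^2 - t_now^2)  # projected rise at end of period
      sigma <- sigma_total * (step / total)^2  # uncertainty accumulated over this period only
      need  <- e_slr + sigma^2 / (2 * lambda)  # Hunter-style allowance, Gaussian case
      if (need > wall) wall <- need            # build only if the existing wall is too low
      # 'observe' sea level at the end of the period, then refit the quadratic
      # through (2014, 0) and the new observation
      slr_now <- rnorm(1, mean = e_slr, sd = sigma)
      t_now   <- t_now + step
      coef    <- slr_now / t_now^2
    }
    final_height[k] <- wall
  }
  c(mean = mean(final_height), sd = sd(final_height))
}
round(sapply(1:5, simulate_builds), 2)         # expected wall height for 1 to 5 builds

Run as written, this gives roughly 1 m (with no spread) for a single build and mean heights close to 0.5 m once there are several builds, which is the qualitative behavior shown in the figures below.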

Expected Height of Protections
First, recall, in Lew’s analysis, there was 1 build. For the case I picked out, the “required” height of protections was 1 m, with no variability in the height the board might build. Essentially, the board figures out the level they ‘need’ to build given current information, and builds it. They are done. That result is represented by the black square above ‘1’ build below. Meanwhile, the blue circle represents the expected value of sea level rise $latex E[SLR] $.
[Figure: HeightOfWallBuilt]

Note that in the $latex N_{build}=1 $ case, on average, the protection will exceed the amount actually required in 2100 to meet acceptable flood risk by 0.5 meters. This was the horrible idea that Lew’s text suggests is somehow “required”.

Next look at the figure, allowing your eyes to travel to the right. Suppose the board decides to build twice, $latex N_{build}=2 $, with an initial build now, and a second one 43 years from now. In this case, we can’t know the level of protections the board will judge proper 43 years from now. That magnitude will depend on the observed sea level rise. However, what we can examine is the expected value of the protections they will build, and its standard deviation. In this case, the expected value for the wall they will ultimately build is 0.53 m; this is substantially smaller than the 1 m they would build if they planned on protecting the citizens of 2100 using protections built in 2014. Moreover, it’s only a smidge above the expected sea level rise of 0.5 m.

Looking further to the right, you can see the expected value of the protections declines as the number of build increments increases, approaching 0.5 m as the number of builds increases to infinity. At this point, it is worth noting that qualitatively this result is fairly general: the expected value of the wall height required to match flood risk in the beginning and end periods will tend to diminish. However, the quantitative results are affected by the functional forms the board assumes for the projections and their uncertainty.

Next: it’s worth admitting that future boards may build protection levels that are either higher or lower than the best estimate for the future protection height. The ±90% spread is illustrated with the blue uncertainty bars. Generally speaking, under the multiple build scenario, boards will build for higher protection if sea level actually rises at a higher rate than currently anticipated, and lower if it rises at a lower rate. Interestingly, under the current set of assumptions for the parameters (both creating quadratics), the height of protections ‘required’ if the board builds only once lies outside the ±90% spread of heights they will build if they defer part of their decision for 43 years. So: the ‘build full barriers now‘ approach tends to result in overbuilding, which is unnecessarily costly. (Necessary funds might need to be taken from lunch subsidies for low income children, or from medical care for the elderly. Who knows?)

But some might think: Well, at least the public will get ‘better’ protection. Sort of. Recall that even though the height of the wall built is deterministic under Lew’s “1 build” scenario, the future sea level is a random variable. So the height of protection actually required in 2100 is a random variable whose mean is 0.5 m and whose ±90% variability is 0.36 m. The following graph compares the height built to the height of protection actually required in 2100.

[Figure: ExcessHeight]

Notice that in the figure, the height of the wall and its ±90% uncertainty intervals are well away from 0 m. This means that in more than 95% of future outcomes, the public has much more protection than required to maintain adequate flood risk. In fact, they would have obtained the level of protection the board thinks is adequate by building a wall more than 0.14 m lower.

In the other cases, there is a possibility that when 2100 comes along, the wall is a bit too short. With two builds, when the final protection height is too short relative to the height required for adequate flood risk in 2100, the shortfall arises because sea level rose faster during the final period than anticipated based on the previously observed periods. In the case computed, there is a 5% chance the wall is 0.06 m too short to give whatever level of protection the board deems adequate.

This too-short wall might be seen as a big “disadvantage”, but fear not. The future board can schedule another build. If they believe the sea has stopped rising, they can add 6 cm to the wall elevation. If the 2100 board believes the sea is continuing to rise, and they believe the area continues to need protection, they can base their decision on new, updated, hopefully improved projections and uncertainties.

Summary

  1. There is no mathematical reason a current, 2014, board needs to build protections to levels required to protect citizens in 2100.
  2. There is no mathematical reason a current, 2014, board needs to build to protect to the upper uncertainty bound for sea level rise in 2100.
  3. If the board opts to schedule several build periods, the best estimate for the required protection height approaches the mean value for the projected sea level rise.
  4. If the board opts to build as required, they can come close to building “just the right” height protections.
  5. Other factors not discussed here become very important to the board’s decision. These include: the discount rate, which makes current expenditures more costly than future ones; the incremental cost of maintaining unnecessarily tall protections for 100 years; the risk of unnecessary excess loss if the unnecessarily tall protections are destroyed by an earthquake sometime between 2014 and the time when flood protection of that height might be needed; and the added costs incurred when engineering projects start and stop. Most of these will tend to argue in favor of many builds; the final one argues in favor of a smaller number of builds. Careful calculations would be required to determine the optimum number of builds; it is unlikely to be 1.
  6. It is true that uncertainty results in higher costs. However, note that for the case considered, the “Lew” method suggested the uncertainty meant one needed to build 1 m protections when the best estimate for required protections under certainty was 0.5 m. But by responding with sanity, the expected values of build heights were (0.53, 0.51, 0.50, 0.50) m for (2, 3, 4, 5) builds respectively, with the additional height above the 0.5 m required under certainty falling within rounding error. Admittedly, rounding down was required, but I think few boards would be impressed by the thought that ‘climate uncertainty’ adds horrific costs when the difference in cost is less than 0.5% of the expected costs, represents less than 1 mm in the height of a protection, and is dwarfed by other uncertainties that affect board decisions.

Anyway, I thought some of you might enjoy this “toy math” post. I did.


Links to papers
Readers might want these handy links:
A simple technique for estimating an allowance for uncertain sea-level rise, John Hunter. My analysis is an application of equation (6).

Scientific uncertainty and climate change: Part I. Uncertainty and unabated emissions. Stephan Lewandowsky, James S. Risbey, Michael Smithson, Ben R. Newell, John Hunter.

Oscillating heating: Toy Problem for Ocean Heat Content.

This post is a “gedanken” to illustrate something that happens in simple systems. It’s prompted by a discussion in comments about the conversion of Ocean Heat Content to an equivalent Temperature Change. This is not intended to explain the complexities of what happens in the ocean, but merely to show a feature that makes it difficult to unambiguously interpret what is happening in the depths of a body based on data about the total heat content in the lower and mid layers of a body heated from above.

The Gedanken
In this Gedanken or “toy” problem, we will examine a very deep solid body that is heated from above. This body could be a solid block of any very good heat conductor (e.g. copper, aluminum, gold) with perfectly insulated sides. The very bottom of the block will be dunked in a stirred ice water bath keeping the bottom surface of the aluminum at T=0C. At time t=0 we assume the temperature of the aluminum is T=0. The conceptual diagram is shown below:
[Figure: VeryDeepPlate]

In our mind we will partition the upper layer illustrated and call it “atmosphere”. (For the purpose of the gedanken, ignore the fact that the earth’s atmosphere is not aluminum. We can discuss in what ways this problem is similar or different from “the earth’s climate system” in comments.) The next layer down is the “top ocean”, the next down is the “top 2000 meters” and the lowest bit is “the lower ocean”. We will not explore what it means for the bottom of the ‘ocean’ to be kept at 0C.

Now we start the actual gedanken: at the top of the aluminum block we place an element that can either heat or cool the top of the block. This element provides a controlled heat flux of Q = 0.1 cos(πt) where t is time. That is illustrated with the red arrows. Note that the heat flux is positive for 0 < t < 1/2, then it goes negative. This is shown below:
[Figure: HeatAtTop]
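
I did the original calculation in EXCEL, but for readers who want to poke at this themselves, here is a minimal explicit finite-difference sketch of the same setup in R. The grid, diffusivity and heat capacity are arbitrary made-up numbers chosen only to show the qualitative behavior, so don’t expect it to match my figures exactly.

nz <- 100; dz <- 0.05           # depth nodes and spacing (arbitrary units)
alpha <- 1                      # thermal diffusivity (arbitrary units)
dt <- 0.4 * dz^2 / alpha        # time step satisfying the explicit stability limit
nt <- round(2 / dt)             # integrate through t = 2, one full forcing cycle
temp <- numeric(nz)             # initial temperature: 0 everywhere
for (n in seq_len(nt)) {
  Q <- 0.1 * cos(pi * n * dt)   # prescribed heat flux at the top
  tnew <- temp
  # interior nodes: plain 1-D conduction
  tnew[2:(nz - 1)] <- temp[2:(nz - 1)] +
    alpha * dt / dz^2 * (temp[3:nz] - 2 * temp[2:(nz - 1)] + temp[1:(nz - 2)])
  # top node: conduction plus the imposed flux (unit heat capacity assumed)
  tnew[1] <- temp[1] + alpha * dt / dz^2 * (temp[2] - temp[1]) + Q * dt / dz
  tnew[nz] <- 0                 # bottom held at 0 C by the ice bath
  temp <- tnew
}
plot(temp, -(seq_len(nz) - 1) * dz, type = "l", xlab = "Temperature", ylab = "Depth")

Saving snapshots of temp at a few intermediate times should reproduce the kind of traces discussed next.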

Below, I’ve illustrated how temperature varies with depth at three different times: two at times t < 1/2; the third a bit afterwards.
[Figure: TemperatureWithDepth]

The earliest time illustrated is shown in blue. As nearly everyone expects, the heating at the top causes the temperature at the top of the block to rise. Because the bar is conductive, heat propagates downwards. The red trace shows the progression in temperature over time. Because the heat flux at the surface remains positive in the time frame from the blue trace to the red one, the temperature at the surface continues to rise, and heat continues to propagate downward.

Finally, the green trace represents a time just after t = 1/2, when the heat flux at the top surface turns negative. What happens during this time is a bit complicated. At the very top surface, heat is sucked out, which tends to cool the top of the block. At the same time, recall that at t = 1/2 the temperature at the top of the block is higher than the temperature lower down. So, at that point, conduction acts to carry heat away from the top and down toward the bottom. So, the “top” layer loses heat both because the heat flux at the top turned negative and by conduction to lower layers.

When creating the green trace, I chose a point in time when the top three points had lost sufficient heat that the maximum temperature is now between the third and fourth triangles on the left hand side of the green trace. For those who think ‘this doesn’t happen’: Yes it does. Go into a desert before sunset. The top of the sand will be hot; dig a bit: the sand lower down will be cooler. Go later: the top will have cooled, but if you dig a bit, the sand lower down will be warmer than the top. This happens.

Returning to the figure: At the point in time represented by the green trace, conduction will act to transfer heat from points closer to the maximum toward those further away. So: to the right of the 4th point from the left, conduction will cause heat to flow deeper into the block, while for points to the left of the peak it will cause heat to flow upward toward the surface. We could show more and more traces. But for now, I won’t. Instead, I’ll discuss the relevance to the discussion of interpreting what it might mean when:

  1. A top layer cools
  2. A mid layer warms
  3. The lower layer warms (possibly even faster.)

Note that in my conceptual figure of the block, I divided the region into “atmosphere”, “top ocean”, “mid ocean”. Suppose we were to mentally place the division for the “atmosphere” between the three green triangles on the left and the fourth from the left:
[Figure: TemperatureWithDepth]

In that case, it’s fairly easy for someone integrating by eye to determine that the mean temperature (or heat content) of the top layer falls slightly, while those of all lower layers rise during the time period from the “red” trace to the “green” trace. It happens that if I continue this exercise, depending on the choice of where I place the “mid” and “lower” layer partitions, I can get all sorts of behaviors.

I can have the

(top layer cools, mid layer warms, lower layer warms less quickly than the mid layer)
(top layer cools, mid layer warms slowly, lower layer warms more quickly than the mid layer)
(top layer cools, mid layer cools, lower layer warms)

and so on. And I’ve only just started the oscillating heat application, and (because EXCEL bogs down) I didn’t make my block very deep. Let’s suppose I continue to apply oscillating heat. Look at the temperature variation with depth at time 2, which is the first full cycle of heat application:
[Figure: LaterOn]

Given this illustration of what happens in a very, very simple physical system, I think you might want to contemplate what one might conclude if the only thing one knows about a “hiatus” in surface heating is that the “ocean” continues to heat while the atmosphere displays a “hiatus” from previous heating. I would suggest that unless someone fills in more blanks, it’s difficult to conclude much at all. Because if the heat is applied from the top and stalls, this is precisely what happens, with no need of introducing any fancy theory about “enhanced mixing” at lower layers, or “the top didn’t really stall”, or “heat is hiding” or whatever. It’s just what happens in a simple system.

Some will wonder: Could I make this problem more complicated and still show the main result? Sure. But I can’t simultaneously make it simple and sufficiently complicated to match the dynamics of the ‘earth ocean’. If I add complications, that can cause people to think the behavior has something to do with the complications when, in fact, the behavior is a sort of leading order, very simple behavior. So, for discussion, I think this is best. The purpose is merely to let people see this and reflect. If you have complications that interest you, I might be able to explore some of the simpler ones. (Though, I have to admit, I did this in EXCEL which is behaving like a lard-ass cranky program.)

Still, discussion is welcome.

“The truth is out there” : Comment on the Dimple Part I of III

It is often good to focus on “The Truth”, which in this post we will represent using $latex T_{true} $. We will also call this truth “the measurand”. With “The Truth” in mind, we will discuss the error in a measurement, apply that definition to define an error in a proxy reconstruction, and see if we can learn something about Marcott’s Dimple.

[Figure: uncertainty]

In the figure above, we see a plot that shows “Uncertainty” in some unstated thing. Blog arguments have broken out; I think it’s worth considering how we would fill in the blank in “Uncertainty in _____”. I’ll begin with this question:

What is the measurement error in a proxy reconstruction for the surface temperature of the earth?
Let us consider the goal of such a reconstruction, which is to estimate or measure the surface temperature of the earth, $latex T_{true}(t) $, which varies over time t. The proxy reconstruction is an estimate which we will represent as $latex T_{rec}(t) $. Given this definition of what we wish to measure, the error in the proxy reconstruction, $latex e_{rec} $, can be determined by applying the definition of a measurement error. That is, we subtract the thing we wish to measure from our estimate.

(1)$latex \displaystyle e_{rec}(t) = T_{rec}(t)- T_{true}(t) $

The confidence interval for the measurement error $latex e_{rec}(t) $ is based on the standard deviation across proxies of $latex e_{rec}(t) $, which we will represent as $latex \sigma_{e}(t) $. I claim confidence intervals based on this $latex \sigma_{e}(t) $ are the proper definition of “the measurement uncertainty” for the proxy reconstruction.

Before proceeding, it’s worth considering two example properties of confidence intervals described by $latex \sigma_{e}(t) $.

Feature 1: provided the temperature of ‘the earth’ $latex T_{true} $ is defined using a consistent baseline that is not affected by the choice of proxy, then the measurand, $latex T_{true}(t) $, is effectively deterministic. In this case, as long as $latex T_{rec}(t) $ from all possible proxies shares a common baseline, $latex \sigma_{e}(t) $ and $latex \sigma_{T_{rec}}(t) $ should be identical.

Feature 2: Since the motivation for the Marcott proxy reconstruction is to discover changes in temperatures of the earth’s surface over time, it is worth considering how the uncertainty intervals in (1) can be used to estimate the uncertainty in $latex \Delta T_{true} =[ T_{true}(t_{2}) - T_{true}(t_{1})] $ for an arbitrary choice of $latex t_{1} $ and $latex t_{2} $.

Notice that if the errors at different times are independent white noise, using the definition of error in (1) has the desirable property that the uncertainty for $latex \Delta T_{true} =[ T_{true}(t_{2}) - T_{true}(t_{1})] $ is $latex \sqrt{ \sigma_{e}^{2}(t_{2}) + \sigma_{e}^{2}(t_{1}) } $. Consequently, uncertainty and confidence intervals defined based on the standard deviation of these errors will exhibit the property one anticipates for uncertainty and confidence intervals of a measurement of $latex T_{true} $. (They will also share other properties one anticipates for confidence intervals of measurement uncertainties.) This means these confidence intervals are useful if we wish to make claims like the one that follows:
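
Spelling that propagation step out (this is standard propagation of independent errors, nothing specific to proxies): the error in the difference is $latex e_{\Delta T} = e_{rec}(t_{2}) - e_{rec}(t_{1}) $, and if the errors at the two times are uncorrelated, its variance is the sum of the individual variances, so $latex \sigma_{\Delta T} = \sqrt{ \sigma_{e}^{2}(t_{2}) + \sigma_{e}^{2}(t_{1}) } $.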

” Global temperatures are warmer than at any time in at least 4,000 years, scientists reported Thursday, and over the coming decades are likely to surpass levels not seen on the planet since before the last ice age. ”

See: New York Times

They are also useful if one merely wishes to know the uncertainty when one compares the earth’s temperature in 1850 to the temperature at other times during the Holocene.

Other confidence intervals
Because our motive is to discuss alternate views of possible definitions of “confidence intervals for measurement uncertainty in a proxy reconstruction”, I will discuss a second set of confidence intervals discussed at some length by Nick Stokes. He seems to call these “confidence intervals [for some unstated property] of the Holocene temperature reconstruction”. (Note: the bracketed words are my insertion.) His confidence intervals (for this unstated property) are defined as the standard deviation of $latex T_{rec, rebaselined}(t) $, which is $latex T_{rec}(t) $ from an individual reconstruction that has been rebaselined by subtracting the mean of that reconstruction’s $latex T_{rec}(t) $ computed over the baseline chosen for a particular analysis; we will denote the standard deviation for this quantity as $latex \sigma_{Trec,rb}(t) $.

For now I would like to highlight this feature of the confidence intervals for (some unstated thing): The measurand, $latex T_{true}(t) $, for the proxy reconstruction does not appear in the definition. Of course, if we compute a standard deviation of $latex T_{rec, rebaselined}(t) $ we will obtain confidence intervals for $latex T_{rec, rebaselined}(t) $, which is something. So Nick is describing confidence intervals in the measurement of something.

However, it is rather difficult to use these intervals to compute the uncertainty in $latex \Delta T_{true} $ above. (We will later see that if you try to use them for that purpose, you do so incorrectly.)

The Toy Problem
I will now set up and discuss a toy problem which will show that, at least in the simplest problem, the confidence intervals based on $latex \sigma_{Trec,rb}(t) $ show a “dimple” in the baseline period, while confidence intervals based on $latex \sigma_{e}(t) $ describing measurement uncertainty have no dimple. I’m not going to go so far as to claim that confidence intervals describing measurement errors can never have a dimple. It may be that under certain conditions they do. However, in this simple case, the dimple does not appear in confidence intervals for measurement errors.

True temperatures in ‘The Toy’
To set up the toy I will posit that the true earth temperature is measured at evenly spaced intervals $latex t_{i}$ and the measurements $latex T_{true}(t_{i}) $ are Gaussian white noise with mean 0 and standard deviation 1. The seeming contradiction that from the point of view of a proxy reconstruction the earth temperature is deterministic will be resolved by decreeing that the temperature already exists and once generated is considered “frozen”. The goal of proxy reconstruction is to figure out what those temperatures are.

We will decompose the temperature $latex T_{true}(t_{i}) $ into the mean computed over a baseline defined by “N” specifically selected points in time, $latex \overline{T_{base}} $, and a residual $latex T(t_{i}) $. The decomposition is:

(2)$latex \displaystyle T_{true}(t_{i}) = T(t_{i}) + \overline{T_{base}} $

Henceforth the overline $latex \overline{X} $ will be used to denote the sample average of X over whatever period is defined as the baseline for a proxy reconstruction. The subscript “base” above will communicate the same notion; its use above is redundant. (The redundancy may be helpful to those who wish to read the accompanying code.)

Note that outside the baseline $latex T(t_{i}) $ is a random variable with mean $latex -\overline{T_{base}} $ and standard deviation 1, while inside the baseline it is a random variable with a mean 0.

Proxies in ‘The Toy’
Next suppose we can obtain an estimate of $latex T_{true}(t_{i}) $ from available proxies. There are many, and any individual proxy will be denoted with a subscript ‘j’. The raw proxy value of each proxy will be assumed to vary linearly with true temperature as

(3)$latex \displaystyle P_{j}(t_{i}) = m [T(t_{i}) + \overline{T_{base}}] + W_{j}(t_{i}) + P_{mean,j} $

where $latex m $ is the linear conversion constant which, to simplify the analysis, will be considered common to all proxies, and $latex W_{j} $ is Gaussian white noise with variance $latex B $ and zero mean. $latex P_{mean,j} $ is a constant which may be specific to the individual proxy. It is introduced to account for the effect of calibration biases of unknown magnitude in proxy ‘j’. For perfectly calibrated proxies, the term could be set to zero and would have no effect on the value of $latex \sigma_{e}(t) $.

Let us now define the raw reconstructed proxy calibrated temperature based on a proxy ‘j’ to be
(4)$latex \displaystyle T_{rec,j}(t_{i}) = P_{j}(t_{i})/m $
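
To make the toy concrete, here is a small self-contained R sketch of (2)-(4), using illustrative values for m and B and with $latex P_{mean,j} $ set to zero (the perfectly calibrated limit discussed below); it also computes the measurement-error spread $latex \sigma_{e}(t) $ from (1):

set.seed(1)
n_t <- 200; n_prox <- 1000            # time steps and number of proxies
m <- 2; B <- 0.5                      # conversion constant and proxy noise variance
T_true <- rnorm(n_t)                  # 'frozen' true temperatures: Gaussian white noise
W <- matrix(rnorm(n_t * n_prox, sd = sqrt(B)), n_t, n_prox)
P     <- m * T_true + W               # equation (3) with P_mean_j = 0
T_rec <- P / m                        # equation (4)
sigma_e <- apply(T_rec - T_true, 1, sd)   # std dev across proxies of the error (1)
plot(sigma_e, type = "l")             # flat in time: no dimple

Because the only random ingredient in the error is $latex W_{j}/m $, this spread sits near $latex \sqrt{B}/m $ at every time step.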

Error in the reconstruction and Uncertainty Intervals in ‘The Toy’.
With a small amount of algebra it is possible to show that the error is
(5)
$latex \displaystyle e_{rec,j}(t_{i}) = T_{rec,j}(t_{i}) -[T(t_{i}) + \overline{T_{base}}] = W_{j}(t_{i})/m + P_{mean,j}/m $

If we like baselines, we could define the average of $latex T_{rec,j}(t_{i}) $ over the baseline as $latex \overline{T_{rec,j}} $ and rewrite (5) above as

(6)$latex \displaystyle e_{rec,j}(t_{i}) = [ T_{rec,j}(t_{i}) -\overline{T_{rec,j}} ] -[T(t_{i}) -\overline{T_{rec,j}}] - \overline{T_{base}} = W_{j}(t_{i})/m + [P_{mean,j}/m] $

Recognizing that both $latex \overline{T_{base}} $ and $latex T(t_{i}) $ are unaffected by choice of proxy, j, the standard deviation of $latex e_{rec,j}(t_{i}) $ (i.e. $latex \sigma_{e}(t) $ ) over all possible proxies is unaffected by the magnitude of either. Their effect will be to shift every point in a reconstruction up or down uniformly by some unknown temperature. This is called a bias error and the standard deviation in that quantity would represent an uncertainty in the bias.

Diagnosing the contribution of $latex [P_{mean,j}/m] $ to $latex \sigma_{e}(t) $ is a bit more difficult. If all proxies respond like perfectly calibrated laboratory thermometers and the true surface temperature of the earth were known, then we would anticipate that $latex [P_{mean,j}/m] $ would be effectively deterministic for a proxy reconstruction. (Specifically, the analyst creating the reconstruction could subtract out the value at the “true” baseline temperature for the earth and the term could be forced to zero for every proxy.) So the term would make no contribution to $latex \sigma_{e}(t) $.

However, if a proxy is more like a single “treenometer”, or the true surface temperature of the earth can never be known, then $latex [P_{mean,j}/m] $ may include a sizable random component that corresponds to different calibration errors, both for true thermometers and “treenometers”. The magnitude of that calibration error will depend on the noisiness of the ‘treenometer’ itself (i.e. the magnitude of “B” for the proxy), the number of data points used to calibrate the individual ‘treenometer’, and the uncertainty in the true temperature of the earth.

However, the effect of $latex P_{mean,j}/m $ is irrelevant to the question of “The Dimple”. So in the (not yet written) Part II of this discussion, I will show graphs of the uncertainty computed with $latex P_{mean,j}/m $ set to a deterministic value. This decision means that I will be finding the

  • lower bound on the estimate of the variance in the measurement errors in the reconstruction. In (as yet not written) part III I will compute bounds varying its magnitude.

    Before moving on, let’s consider some features of confidence intervals based on (5) or (6). In ‘The Toy’ problem neither $latex P_{mean,j}/m $ nor the standard deviation of $latex W_{j}(t_{i}) $ is a function of time; therefore $latex \sigma_{e}(t) $ is not a function of time. So $latex \sigma_{e}(t) $ will contain no dimple in this problem.

    Nick’s uncertainty intervals in the toy.
    Finally, it’s useful to examine the properties of Nick’s confidence intervals for some unstated thing. Recall his confidence intervals are defined as the standard deviation of rebaselined reconstructions. Using the overbar terminology, that means he is taking the standard deviation of:

    In the original post, I had an error. I’m revising the bits shown below; the original version of (7) is followed by the corrected one.
    (7 original) $latex T_{rec,rb}(t_{i}) = [ T_{rec,j}(t_{i})-\overline{T_{rec,j}}] = W_{j}(t_{i})/m - \overline{ W_{j} }/m $

    If we take the standard deviation of (7) to obtain $latex \sigma_{Trec,rb}(t_{i}) $ we will find that both terms on the right hand side contribute to this standard deviation. In this case, $latex \sigma_{Trec,rb}(t_{i}) $ is a function of time. For points inside the baseline, $latex W_{j}(t_{i}) $ and $latex \overline{ W_{j} } $ are positively correlated and $latex \sigma_{Trec,rb}(t_{i}) $ will be smaller than the standard deviation of $latex W_{j}(t_{i})/m $. For points outside the baseline, the two noise terms are uncorrelated; the computed standard deviation will be larger than that of $latex W_{j}(t_{i})/m $.

    ——–

    (7 corrected) $latex T_{rec,rb}(t_{i}) = [ T_{rec,j}(t_{i})-\overline{T_{rec,j}}] = T(t_{i}) + W_{j}(t_{i})/m - \overline{ W_{j} }/m $

    From this we can identify a different measurement error, which is
    (8) $latex e_{rec,anom}= T_{rec,rb}(t_{i})- T(t_{i}) = W_{j}(t_{i})/m - \overline{ W_{j} }/m $

    If we take the standard deviation of (8) to obtain $latex \sigma_{Trec,rb}(t_{i}) $ we will find that the two terms on the right hand side contribute to this standard deviation. In this case, $latex \sigma_{Trec,rb}(t_{i}) $ is a function of time. For points inside the baseline, $latex W_{j}(t_{i}) $ and $latex \overline{ W_{j} } $ are positively correlated and $latex \sigma_{Trec,rb}(t_{i}) $ will be smaller than the standard deviation of $latex W_{j}(t_{i})/m $. For points outside the baseline, the two noise terms are uncorrelated; the computed standard deviation will be larger than that of $latex W_{j}(t_{i})/m $.

    This is the origin of the dimple which does appear in the sorts of confidence intervals Nick is computing but does not appear in confidence intervals describing measurement errors of absolute temperatures.
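
    Continuing the R sketch from earlier (regenerated here so this block runs on its own), we can put the rebaselined, “Nick-style” spread next to the flat measurement-error spread. I use a deliberately short baseline so the dimple is easy to see:

    set.seed(1)
    n_t <- 200; n_prox <- 1000; m <- 2; B <- 0.5
    base <- 1:10                                     # a deliberately short baseline
    T_true <- rnorm(n_t)
    W <- matrix(rnorm(n_t * n_prox, sd = sqrt(B)), n_t, n_prox)
    T_rec <- (m * T_true + W) / m
    sigma_e  <- apply(T_rec - T_true, 1, sd)         # measurement-error spread: flat
    T_rec_rb <- sweep(T_rec, 2, colMeans(T_rec[base, ]))   # rebaseline each proxy by its own baseline mean
    sigma_rb <- apply(T_rec_rb, 1, sd)               # Nick-style spread
    matplot(cbind(sigma_e, sigma_rb), type = "l", lty = 1, ylab = "std dev across proxies")

    The first curve should come out flat; the second dips inside the baseline and is slightly wider outside it.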

    Nick will be happy to know that I now see a “measurement error”, because it is expressed relative to a measurand! (Whether this measurement error is properly interpreted when these intervals are compared to temperatures in the thermometer record or to projections I cannot say; we now have two types of measurement errors, and which is most easily used when patching to a thermometer record I don’t yet know.)

    Before continuing on to the exciting plots (to appear in Part II), I think it’s also worth noting that $latex P_{mean,j}/m $ does not appear in (7 or 8).

    The following is largely right, but the interpretation has to change. Missing that term does affect comparisons to a temperature (or anomaly) at a time outside the time period of the reconstruction, but possibly not comparisons between two times inside the reconstruction.

    This means that whatever (7) is supposed to account for, it does not capture the effect of calibration bias in proxies on the uncertainty in measurement errors in proxy reconstructions. For this reason alone, whatever these confidence intervals might be, what they are not is confidence intervals that can be used to compute the uncertainty in $latex \Delta T_{true} =[ T_{true}(t_{2}) - T_{true}(t_{1})] $.

    I don’t know if confidence intervals computed the way Nick computes them were applied to estimate the uncertainty in $latex \Delta T_{true} $ in Marcott. But if the claims in Marcott about temperature differences relative to those in the thermometer record are based on that sort of confidence interval, those claims would be based on misinterpreting what confidence intervals computed the “Nick” way describe. Remedying this mistake would require computing confidence intervals for the error in the reconstruction. A proper estimate would require capturing the effect of the calibration uncertainties described by $latex P_{mean,j}/m $.

    Summary: Part I
    Since I’m going to be showing graphs in separate posts it’s useful to summarize the main points in this post:

    1. The uncertainty intervals computed using (1) correspond to the uncertainty in the error in a reconstruction. These are useful if you wish to determine the uncertainty in the difference in the temperatures at two different times in the earth’s history. Learning this difference seems to be the goal of a proxy reconstruction, so these confidence intervals are useful and describe the uncertainty in the measurand of interest. These uncertainty intervals will not show “The Dimple”. I call these proper “measurement uncertainties” for a proxy reconstruction because they can be used to estimate the uncertainty in the measurand of interest (i.e. the changes in the temperature of the earth.)
    2. The uncertainty intervals computed using the method discussed by Nick in his blog post may not be useful if we wish to estimate the uncertainty in the difference in the temperatures at two different times in the earth’s history when one of those times lies outside the range of the proxy reconstruction. If we base claims about the statistical significance of differences in temperature on that sort of confidence interval, our conclusions will have no legitimate foundation and, unless we are sufficiently lucky, our claims will be incorrect. These confidence intervals will show “The Dimple”.
    3. Computation of the two sets of confidence intervals shares some features. In certain limits, the quantitative difference between the two will be imperceptible and conclusions based on the “Nick” type intervals will be nearly identical to those made with proper uncertainty intervals. This happens only when both of the following are true: the proxy reconstruction used a baseline of sufficiently long duration to make “The Dimple” so small as to be imperceptible, and the proxies’ calibration is perfect. However, the fact that “The Dimple” appears is sufficient evidence to demonstrate this goal has not been achieved.
    4. I have no idea whether Marcott’s claims about the uncertainty in temperature changes relative to the thermometer record were based on uncertainty intervals computed as in the top figure in this post. I know those uncertainty intervals are described, but not having read the paper, I don’t know whether those uncertainty intervals were used without modification to estimate the uncertainty in the difference in the earth’s temperature at different points in time, nor do I know whether those uncertainty intervals were placed around the mean reconstruction in their figure of the reconstruction. (If they were, that choice would be misleading.) Because I don’t know whether they did these things, I cannot say how Marcott might have used confidence intervals computed the “Nick” way, and I can’t say whether what they did was “right” or “wrong”. I can nevertheless say that if they used that sort of confidence interval to compute the uncertainty interval for the temperature difference at different times on earth, or if they slapped those sorts of confidence intervals around the mean reconstruction and represented that as the uncertainty in the measurement of the earth’s temperature, that would be misleading to the point of being incorrect.

    Upcoming: Graphs showing “The Dimple” as it appears in “Nick-type” confidence intervals if the temperature series and the proxy noise are Gaussian white noise. Afterwards, graphs showing the relative size of confidence intervals using selected values chosen to highlight qualitative effects of interest, which may not correspond to values relevant to Marcott. Those of you who like math and have been reading Marcott can suggest reasonable values for the random component of $latex P_{mean,j}/m $, the noise in the proxy reconstruction, and so forth.

    Teaser graph
    [Figure: RejectionRates]

Observation vs Model – Bringing Heavy Armour into the War

    As I have noted before, most of the AOGCMs exhibit a curvilinear response in outgoing global flux with respect to average temperature change. One of the consequences of this is that there is a well-reported apparent increase in the effective climate sensitivity with time and temperature in the models; in particular, the effective climate sensitivity required to match historical data over the instrument period in the GCMs is less than the climate sensitivity reported from long-duration GCM runs. This is not a small effect, although it varies significantly between the different GCMs. In the models I have tested, it accounts for about half of the total Equilibrium Climate Sensitivity (“ECS”) reported for those models. (Equilibrium Climate Sensitivity is defined by the IPCC as the equilibrium temperature change in degrees C after a doubling of CO2.) In general, models which show a more pronounced curvature will have a larger ratio of reported ECS to the effective climate sensitivity required to match the model results over the instrument period, and vice versa.

    Kyle Armour et al. have produced a paper, Armour 2012, which offers a simple, elegant and coherent explanation for this phenomenon. It comes down to geography.


Uncertainty: Drives up expected value of costs.

    This post is a follow-on to my earlier post discussing what happens if we were to believe the uncertainty in the climate sensitivity rose relative to our current understanding while our best estimate of the expected value of the temperature rise ($latex E[T] $) remained constant. For now, the focus will simply be numbers. For the purpose of this blog post, I make the same assumptions discussed in my previous post and assume the damage function (i.e. present value of costs) for climate change is a function of the temperature rise due to doubling of CO2 and that the costs in Quatloos are a quadratic function of temperature with zero cost at no rise. That is, $latex C(T) \propto T^{2} $. This is the choice Lewandowsky made in his post on implications of uncertainty which touched on costs. Some results I will discuss depend on the choice of damage function; others do not.

    Some discussion of the economic model
    I have selected a damage function of the form $latex C(T) \propto T^{2} $, where C is the present value of costs arising from damages from climate change, denominated in Quatloos, and T is the temperature rise under doubled CO2. I selected this form because it is the choice Lewandowsky made. This somewhat cartoonish choice is suitable for the current discussion which, despite containing numerical values, is qualitative. Even though this is a cartoon model, I think it's important to understand the features the cartoonish economic model is intended to capture. One of these features is the effect of "temporal discounting". In Lewandowsky's post that touched on costs, there was quite a bit of huffing and puffing (some of it nonsense) about economists temporally discounting costs. For example:

    This is an important point to bear in mind because if the greater damage were delayed, rather than accelerated, economists could claim that its absolute value should be temporally discounted (as all economic quantities typically are; see Anthoff et al., 2009). But if greater damage arrives sooner, then any discounting would only further exacerbate the basic message of the above figure: Greater uncertainty means greater real cost.

    Economists do temporally discount costs. (In fact, most engineers take engineering economics to help guide rational choices about mundane things like spending money on improvements to physical plant.) One reason costs are temporally discounted is simple and is touched on in 8th grade home economics. Here’s an example one might discuss in 8th grade:

    Suppose you have the opportunity to buy a refrigerator now. It costs $700. You have $700. You currently don’t need and can’t use this refrigerator but anticipate you will need a refrigerator a year from now when you move from your dorm to an apartment. You also know you can lend your money safely to an enterprising friend who will pay you back $770 a year from now and you are 100% certain they will pay you back. You are also 100% certain the refrigerator will cost $720 a year from now. Given this fact pattern, should you buy the refrigerator now or wait a year?

    Written this way, everyone knows they should defer buying the refrigerator and lend the money to their enterprising friend. The $70 you make by not spending your money and putting it to work means you will easily cover the $20 cost increase in the price of the refrigerator. The “discount rate” is the formal method of accounting for the fact that available money can generally be used to do something that creates value. In economics, these valuable things are denominated in money; in this example, the extra value is $70.
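
    Just to put that arithmetic in one place, here is the example in R; the numbers are the ones in the story, and the last line shows the equivalent present-value view, which is the formal way the discount rate enters:

    price_now <- 700; price_next_year <- 720
    return_rate <- 0.10                      # the enterprising friend pays back 770 on 700
    c(buy_now = price_now,
      wait    = price_next_year - price_now * return_rate)   # 700 vs 650: waiting wins
    price_next_year / (1 + return_rate)      # present value of next year's price: about 654.5 < 700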

    Because this post is largely motivated by Lewandowsky’s discussion, it’s worth noting that if we are going to do the sort of cartoon analysis he did, and which I largely imitate here, the damage function (i.e. the estimate of the present value of costs based on the change in climate sensitivity) has already been assumed to include the temporal element. That is, the discount rate has already been applied.

    This must be so because, mathematically speaking, the damage function we chose for this exercise is only a function of climate sensitivity and so must be understood to depend only on the climate sensitivity. (More complicated models can be created from more detailed damage functions by using cost models that account for time and finding more complicated probability distribution functions for scenarios of temperature. From these one might develop the simpler, more cartoonish model shown here.)

    The consequence of the fact that the cost function used here already accounts for the time evolution is that one ought not to suggest the costs of climate change should be further inflated to account for the fact that they may arise sooner rather than later, because that effect is already accounted for in the cost model. When any such suggestion follows an analysis using a cost model of this sort, it can be taken as evidence the person making the suggestion may not understand the math underlying their cartoon analysis. Alternatively, they have forgotten the implications of their earlier assumptions.

    Results of effects of uncertainty on estimated costs of climate change
    Now, based on the cost model described above, I’m going to present some simple graphics showing what happens to our estimate of the probable costs of climate change if we use the sort of cost model used by Lewandowsky. The analysis is simple: To compute the probability density function for costs I inverted my cost function to obtain the temperature rise as a function of cost, inserted that into the pdf for temperature (shown in the previous post), and used the change-of-variables rule, dCost = (∂C/∂T) dT. I coded that up, and did higher mathematics also called “summing things up” (i.e. integrated).
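
    A Monte Carlo version of the same calculation is easy to sketch in R. The Roe and Baker style feedback numbers, the no-feedback sensitivity, the cost scaling and the truncation of the runaway tail below are placeholders rather than the values behind the Quatloo figures quoted later, so the sketch reproduces the method, not my numbers:

    set.seed(7)
    T0 <- 1.2                              # no-feedback sensitivity (placeholder, deg C)
    f_bar <- 0.65; sigma_f <- 0.13         # feedback mean and std dev (placeholders)
    k <- 1                                 # cost scaling: C(T) = k * T^2, in Quatloos
    f <- rnorm(1e6, f_bar, sigma_f)
    f <- f[f < 0.95]                       # truncate the runaway tail; results are sensitive to this choice
    Tr <- T0 / (1 - f)                     # temperature rise under doubled CO2 (Roe & Baker form)
    C <- k * Tr^2                          # quadratic damage function
    c(cost_at_expected_T = k * mean(Tr)^2, expected_cost = mean(C))   # C(E[T]) vs E[C(T)]
    quantile(C, c(0.05, 0.95))             # spread of possible costs

    The comparison on the second-to-last line is the point of item 1 below: with a convex (quadratic) damage function, the expected cost exceeds the cost evaluated at the expected temperature.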

    The results in graphical form are below. On the left, I show the cumulative probability distribution of costs computed in the ‘base case’ (i.e. lower variance). To the right I show the values computed assuming that the best estimate of temperature is fixed but the standard deviation increases. I did this by increasing the standard deviation of our uncertainty in the feedback parameter from Roe and Baker (2007).

    Examining the figure we can see some features that are independent of the cost model; others are dependent on the cost model.

    The feature that is independent (or nearly independent) of the choice of economic model:

    1. Greater uncertainty in the projected temperature rise increases the uncertainty in our ability to predict the cost. That is: the range of costs we consider possible increases. This is tautological.

    Other features we predict depend on our choice of economic model. (It is worth noting some depend on the shape of the probability distribution for the uncertainty in our ability to predict the feedback parameter.) If the expected value of the rise in temperature was correctly reflected in our first analysis, but the standard deviation of our "true" uncertainty in the feedback parameter is twice the value we used to compute costs:

    1. In both the base case and the higher uncertainty case, the expected value of the cost is greater than the cost computed based on the expected value of the temperature. That is, $latex C(E[T]) < E( C[T]) $. This will always occur when the present value of the costs is assumed to be a quadratic function of the temperature, but might not occur for other economic models. It's worth noting that this observation doesn't mean we need to "worry more". It merely means that when estimating costs, one uses $latex E( C[T]) $, not $latex C(E[T]) $. The principle that one uses $latex E( C[T]) $ is well established in economics and would hold even if the inequality were reversed. For this reason, reports generally mention $latex E( C[T]) $.
    2. In the base case, the expected cost of unmitigated climate change ( $latex E( C[T]) $ ) is 13.9 Quatloos. What this means is that under the base case, if we had only two choices: a) do nothing or b) undertake a perfect method to prevent temperature from rising at all, we should undertake that method provided the present value of its cost is less than 13.9 Quatloos. Otherwise, if we believe our cost model properly accounts for the costs at all levels of climate sensitivity and our statistical model describing the uncertainty in climate sensitivity is correct, we should select “do nothing”. That’s how economic models are used.
    3. In the base case, the probability that costs would fall below 13.9 Q is 62.7%. That means that if we do spend 13.9 Quatloos to avoid climate change, there is a 62.7% chance we will have spent more Quatloos than we ought to have. How much we would have wasted will depend on the actual temperature rise, which we know we cannot predict. (Note: if we spend the 13.9 Quatloos and the method works, we may never know the actual costs that would have occurred if we did not spend the Quatloos.)

      In contrast, there is a 37.3% chance damages will exceed 13.9 Q if we do nothing. The damages may possibly exceed 13.9 Q by a large amount. That means that it is possible that damages could be some amount – say 35 Quatloos. If we are presented with a perfect mitigation strategy whose cost has a present value of 14 Q, and we pass that up because it costs more than 13.9 Q, we will end up being out of pocket 35 Quatloos when we could have gotten away with spending only 14 Quatloos.

      Note however, that the skewed distribution means that if we use the expected value of costs to decide whether to undertake a mitigation project, we are more likely to spend too much rather than too little. The flip side is that in cases where we spend too little, our costs might be very large. It is generally accepted that using the expected value balances these two issues properly.

    4. The estimate for the expected value of costs ( $latex E( C[T]) $ ) computed by doubling our estimate of the standard deviation of the uncertainty in the climate feedback is 17% higher than anticipated using the base case. That means that if we estimate using the higher level of uncertainty, we should choose mitigation strategies even if they cost 17% more than under the base case.
    5. Even though the expected value of the cost of climate change increased to 16.3 Quatloos when we considered the possibility that the uncertainty in our estimate of the feedback parameter is larger, the probability that the actual cost will fall below that expected value is 64%. That is: we are still more likely to spend a little too much. That said: the item we should focus on is the expected value of the cost: 16.3 Quatloos.
    6. For those who like to consider the extreme outcomes: Whereas in the base case, we anticipate there is a 90% chance the costs will fall between 3.7 Quatloos and 33.23 Quatloos, we now think the 90% range is 1.6 Quatloos to 53.6 Quatloos. Expressed in Quatloos, the width of the interval encompassing 90% of the likely outcomes increased by 76%.
    7. Comparing the cost estimate under higher uncertainty to that under the base case (i.e. lower uncertainty), the likelihood that costs will fall below the anticipated lower bound from the base case increases from 5% to 19.8%. The likelihood that costs will rise above the anticipated upper bound computed in the base case increases from 5% to 12.1%. So, the likelihood that costs will fall below the level we anticipated under the base case rises more rapidly than the likelihood costs will rise above the level we anticipated.
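    To make item 1 above concrete: for a quadratic cost function $latex C(T) = a T^2 $, the inequality follows directly from the identity relating the mean of a square to the variance,

    $latex E( C[T]) = a\,E[T^2] = a\left( E[T]^2 + \mathrm{Var}(T) \right) = C(E[T]) + a\,\mathrm{Var}(T) > C(E[T]) $

    whenever the variance is non-zero. Doubling the spread of the feedback parameter widens the temperature distribution and so increases $latex \mathrm{Var}(T) $, which is why the expected cost rises even when the best-estimate temperature does not.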

    It seems to me those are the major observations one can make about how uncertainty affects costs under my assumptions about the probability distribution function for the feedback parameter, using the simple cost function and using feedback parameter values from one of the cases described in Roe and Baker. Of course, as with all results involving statistical models, assumptions made prior to doing any math matter. Numerical values would differ if we chose a different cost function, a different probability distribution function and so on. It’s perfectly legitimate to suggest the cartoon cost function is not correct (it’s not) or that one might pick a different probability distribution function for feedback. I certainly invite debate on those issues – my goal is to show a cartoon or ‘toy’ analysis that is food for thought.

    So, what should we do?
    A bit later on, I’ll write a post discussing how we would use the results of analysis of this sort (whether cartoon or real) to make decisions about mitigation strategies. Much of that will be plain old discussion with no graphs or math, but I’ll try to avoid rambling about how much we should “worry” or whether or not uncertainty is our “friend”. Naturally, if the analysis involves making decisions based on cost, then the discussion will involve making decisions based on costs. As we can see above, uncertainty affects those costs – generally speaking, greater uncertainty increases the expected value of the cost of everything. This includes both the cost of the impacts of climate change and the cost of mitigation strategies.

    Screening: Now with good+bad treenometers.

    In my recent posts, I discussed how screening treenometers (or any proxy) based on correlation with temperature during the calibration period can bias a proxy reconstruction in ways that deceive the analyst who is either unaware of, or refuses to believe in, the types of biases that screening introduces. With respect to blog spats about AGW, the short, unnuanced interpretations of the two limits of what screening can do are:

    • “If you screen a batch of treenometers that do contain some signal, you can exaggerate a hockey stick.” and
    • “If you screen treenometers that contain no signal at all, you can create a hockey stick from trendless data.”

    Both issues clearly make it difficult to trust any hockey stick contained in a reconstruction where the specific proxies were selected by screening on correlation. (Note: What I mean by screening here is “peeking” at the data to decide what to toss out. It’s ok to believe certain conditions result in high correlation, decree you will go out and collect trees from pre-designated sites, and then use all tree cores from all sites. What you must not do is collect the trees and afterwards toss out any tree cores or sites based on correlation with temperature during the calibration period.)

    Naturally, based on my two extremes, people whose intuition says there must be some advantage to screening want to know two things:

    • Suppose you pick sites, and you were somewhat successful, but some of the treenometers were temperature sensitive and others weren’t.
    • Couldn’t you figure out how to “improve” results from a batch of “good” treenometers and bad ones?

    Today, I’m going to show what your results would look like in two cases. In one, by picking sites, you got a batch of “quite good treenometers” mixed with a batch of “not treenometers at all”; in another, you got “adequate treenometers” mixed in with “not treenometers at all”.

    In the example, the “quite good treenometers” will have a correlation between “ringwidth” and “temperature” of R=0.50; these are better treenometers than in the previous example where I used R=0.25. The “not treenometers” will have R=0, which is utterly, totally, completely unresponsive to temperature. In the second case, the “adequate treenometers” will have R=0.25 as previously.

    My toy cases will involve 1000 ‘treenometers’ and 1000 ‘not treenometers’. For brevity, I’m going to assume people have read the previous posts tagged “screening” where my “true” temperature is described as being a sine wave plus a piecewise linear function, the calibration period is described, and so on.

    Meanwhile, I’m going to show a case where the “true” temperature is a sinusoid with a period of 100 years, the calibration period is the past 50 years, and I generate “proxies” with the target values of R. Then I recreate my best estimate of the temperature as described previously, both by averaging over all trees in the batch (including the bad ones) and then over only those with the highest R. The results (with little explanation of “why”) are shown below.
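    For anyone who wants to replicate the flavor of this experiment, here is a minimal sketch of how such a batch can be generated and screened. The construction below (standardized sinusoid, unit-variance white noise, a 56% screening cut) is my own placeholder and not the exact scaling used for the figures:

        import numpy as np

        rng = np.random.default_rng(0)
        n_years, n_good, n_junk = 1000, 1000, 1000
        cal = slice(n_years - 50, n_years)              # calibration = final 50 'years'
        T = np.sin(2 * np.pi * np.arange(n_years) / 100.0)
        T = (T - T.mean()) / T.std()                    # standardized 'true' temperature

        def make_proxies(R, n):
            # synthetic 'ring widths' with population correlation R to the true temperature
            noise = rng.standard_normal((n, n_years))
            return R * T + np.sqrt(1.0 - R**2) * noise

        good = make_proxies(0.50, n_good)               # 'quite good' treenometers
        junk = make_proxies(0.00, n_junk)               # white noise: 'not treenometers at all'
        batch = np.vstack([good, junk])

        # sample correlation with temperature over the calibration period only
        r_cal = np.array([np.corrcoef(p[cal], T[cal])[0, 1] for p in batch])

        keep = r_cal >= np.quantile(r_cal, 0.56)        # screen out the 'worst' 56%
        recon_unscreened = batch.mean(axis=0)           # the 'violet' trace
        recon_screened = batch[keep].mean(axis=0)       # the 'green' trace

    Swapping R=0.50 for R=0.25 gives the “adequate treenometers” case.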

    Case 1: “Quite Good Treenometers & Not Treenometers”

    Below is an example of the two reconstructions I created based on a batch of 1000 treenometers with R=0.50 and 1000 series of white noise. Violet is the unscreened reconstruction and green is created by screening out the worst 56%. (I could screen differently; this happens to be the percentage Gergis appears to have screened.)

    Notice in the case above, the violet trace is unbiased. The main effect of introducing the 1000 white noise series is to add noise– but it doesn’t bias the reconstruction.

    However, if someone applies a filter and screens out the worst 56%, the result (green trace) becomes slightly biased. It happens that the filter will screen out most of the ‘white noise’ proxies, which had introduced noise but had not biased the result. But the filter also screens out a few of the good treenometers. The latter results in bias – which is generally a bad thing.

    I’ll discuss why this happens a little more in an upcoming post where I show the histograms of the correlation coefficients and discuss what information you could use to try to screen out some of the noise without introducing too much bias.

    Case 2: “Adequate Treenometers & Not Treenometers”

    I repeated the example above, but this time the “treenometers” have R=0.25: they contain a signal, but that signal is dirty. I kept 44% of the treenometers. The results are below:

    Notice in this case, the violet (unscreened) case remains unbiased. However, the bias in the green trace is quite noticeable and in this particular case, it could lead an analyst to conclude that the current temperatures exceed those in the historic record by a small amount.

    This conclusion would be incorrect.

    In my opinion, introducing a bias – particularly one that causes an analyst to reach the incorrect conclusion in his main results – is much worse than introducing unbiased white noise. All unbiased white noise does is force the analyst to recognize that his proxy contains uncertainty, which is true. Reducing the white noise by introducing bias hides that uncertainty from the analyst, causes him to believe the wrong result and, by masking the size of the uncertainty, makes him state the wrong result with greater confidence than the true uncertainty would warrant.

    This is not a good thing.

    So I think one should very rarely prefer a slight bias to a noisy signal. In the rare case where it might be warranted, the analyst who uses screening needs to estimate how large the bias potentially introduced by the method might be. One way is to compare screened and unscreened results. But it turns out the data themselves do contain some information that can help us refine our estimate of the bias introduced by screening (and possibly screen in a way that limits the magnitude of the potential bias): that information is in the distribution of the sample correlation coefficients. Tomorrow, I’ll show those histograms.
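    As a preview, the kind of histograms I have in mind can be sketched like this (a toy construction consistent with the setup above, not the exact code behind the upcoming figures):

        import numpy as np

        rng = np.random.default_rng(1)
        n_years, cal = 1000, slice(950, 1000)
        T = np.sin(2 * np.pi * np.arange(n_years) / 100.0)
        T = (T - T.mean()) / T.std()

        def sample_r(R, n):
            # calibration-window sample correlations for n proxies with population correlation R
            proxies = R * T + np.sqrt(1.0 - R**2) * rng.standard_normal((n, n_years))
            return np.array([np.corrcoef(p[cal], T[cal])[0, 1] for p in proxies])

        r_good, r_junk = sample_r(0.50, 1000), sample_r(0.00, 1000)

        # the two distributions overlap, which is why a correlation cut-off both
        # lets some noise through and throws away some genuine treenometers
        bins = np.linspace(-1.0, 1.0, 41)
        hist_good, _ = np.histogram(r_good, bins=bins)
        hist_junk, _ = np.histogram(r_junk, bins=bins)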

    I think anyone who hopes to use screening to improve results should want to know how to estimate the potential bias. After all, in the case of diagnosing whether the most recent years are warmer or colder than past years, the fact that a bias of unknown magnitude may have been introduced by the processing method casts doubt on any diagnosis that the recent years were the warmest in a millennium. Naturally, if screening is relied on in multiple studies, the fact that each “confirms” the general result that recent years are the warmest is rather unpersuasive.

    After all: the method introduces a bias, and it will introduce it over and over and over in all results where the method is used. So the replication of results in which we have little confidence owing to the repeated use of screening can never convince people who understand screening really does introduce a bias and will do so every time it is used. Those using screening need to admit the issue exists, address it head on, and estimate how much error it introduced in their case.

    [Figure: HockeyStick_Cherry_Calibration]

    Update:
    Amac commented that I might get a different effect if the noise added to the temperature was red. Below, I repeated the case where the treenometers are quite good (R=0.50), but the ‘noise’ has a lag-1 coefficient of R1=0.9. This is very red:
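    For reference, lag-1 ‘red’ noise of that sort can be generated with a minimal AR(1) recursion like the one below (my own sketch; it is not the code used for the figure):

        import numpy as np

        def ar1_noise(n, rho=0.9, rng=np.random.default_rng(2)):
            # AR(1) noise with lag-1 autocorrelation rho and unit marginal variance
            e = rng.standard_normal(n) * np.sqrt(1.0 - rho**2)
            x = np.empty(n)
            x[0] = e[0] / np.sqrt(1.0 - rho**2)   # start in the stationary distribution
            for i in range(1, n):
                x[i] = rho * x[i - 1] + e[i]
            return x

        red = ar1_noise(1000, rho=0.9)            # substitute for the white noise in the proxies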

    Fun with screening: Create your own “Decline”!

    Try as I may, I can’t get people to not discuss “The Uniformity Principle”. One reason is that “The Decline” is often thought to prove “The Uniformity Principle” can’t possibly apply to trees.

    I used to suspect that “The Decline” was fairly persuasive evidence that treenometers used to create some hockey sticks were responding to something other than temperature. Or that their response was non-linear. Or…

    But then… then… I started screening. In the past posts, I screened using a calibration period that spanned the final 50 years, all of which had an uptick. But I noticed that Roman’s calibration period ended in 1980. So, I thought: What does the reconstruction look like if I calibrate on a period that doesn’t end with the final temperatures in the data record?

    I tweaked the setup a little to make a very sharp “decline” (which I don’t plan to hide!) The specific tweaks are:

    • the period of the oscillation for my ‘True’ Temperature is 200 years instead of the 100 used in last week’s posts, and the “True” temperature is a pure oscillation.
    • the calibration period starts 100 years before the ‘end’ of my ‘known’ Temperature series. (So, at year 900.)
    • the calibration period ends in year 980 instead of year 1000.
    • My synthetic proxies have an R=0.10 with true temperature. So, they do carry a temperature signal, but it’s weaker than in the previous examples.
    • only 10% of the 2000 synthetic proxies are retained. These are selected as having the highest correlation with the “true” Temperature during the calibration period.

    I then created a reconstruction by averaging over all the 200 retained proxies. In addition to showing a smaller amplitude oscillation pre-calibration, the new reconstruction shows….

    Decline!!!
    Here’s a graph comparing screened and unscreened reconstructions. The unscreened reconstruction shows no decline.
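    For those who want to play along, the recipe above can be sketched roughly as follows (my own placeholder construction; the posts’ exact scaling differs, and the size of any post-calibration ‘decline’ will vary from run to run):

        import numpy as np

        rng = np.random.default_rng(3)
        n_years, n_proxies, R = 1000, 2000, 0.10
        T = np.sin(2 * np.pi * np.arange(n_years) / 200.0)    # pure 200-year oscillation
        T = (T - T.mean()) / T.std()
        cal = slice(900, 980)                                  # calibration ends before the record does

        proxies = R * T + np.sqrt(1.0 - R**2) * rng.standard_normal((n_proxies, n_years))
        r_cal = np.array([np.corrcoef(p[cal], T[cal])[0, 1] for p in proxies])

        keep = np.argsort(r_cal)[-n_proxies // 10:]            # retain the best-correlated 10%
        screened = proxies[keep].mean(axis=0)                  # compare these two traces after year 980
        unscreened = proxies.mean(axis=0)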

    Ok… I don’t know whether “The Decline” was caused by screening. I suspect it will turn out not to have been. Now I have to get the papers and see whether a) they screened by correlation and b) whether the calibration period ended before the end of the record. Anyone happen to know?

    I’m pretty sure the answer to (b) will turn out to be “no” and my bubble will be burst. But I’m still pretty excited to discover I can use the mathemagic of screening to create a decline!

    Screening Fallacy: So what’s the past really like?

    In a previous post, I showed that bias can be introduced by screening a batch of proxies that do respond to temperature to retain only those individual proxies with high correlations during a calibration period. That post was intended to address a situation where:

    1. the analyst assumes one can use ‘metadata’ to identify which proxies are temperature sensitive and selects proxies whose metadata indicate they will be temperature sensitive.
    2. the analyst assumes response of individual trees to temperature is uniform over all time.
    3. both the assumptions above are actually correct.
    4. but despite having made the first assumption, the analyst decides that maybe the assumption might be incorrect and then eliminates those individual proxies whose metadata indicated they should be temperature sensitive but whose correlation between width and temperature during the calibration period is low.

    (Note that the analyst is simultaneously assuming he can predict a proxy is temperature sensitive and then using a process whose motivation springs from the assumption he cannot predict a proxy is temperature sensitive. Note that in the process, when the analyst finds ‘tree_i’ does not correlate, he does not throw out all trees with similar metadata. He only throws out ‘tree_i’. )

    The previous post was organized to show a qualitative feature: after correlation screening, reconstructions based on both the “correlation screened” and “unscreened” cases give qualitatively correct reconstructions for the pre-calibration period. However, only the unscreened version permits someone to compare the temperatures in the calibration period to those in the pre-calibration period. The correlation screening introduces a bias that prevents anyone from making any comparison between temperatures in the calibration period and those outside the calibration period.

    I am now going to show graphs where I use a particular method to scale the ‘ring widths’ up into ‘temperature’. The method I will use is as follows:

    Given a batch of N proxies

    1. Compute the mean width as a function of time. This will result in a mean-width trace for the unscreened (violet) and screened proxies (green).
    2. Compute the best fit trend between width and temperature for each of the N proxies. This is of the form
      width = m Temperature + b. This results in N estimates of “m”.
    3. Compute the mean value of “m” over all N proxies.
    4. Compute the mean of the N proxy widths as a function of time. This was shown in the previous post, but I will repeat it here because, owing to a boo-boo, I didn’t use the same value of R I am using now. That boo-boo makes no difference to the qualitative results (which are the point of this exercise) but does change a scaling factor.
    5. Divide the mean width as a function of time by the mean m. This is the estimate of the temperature at time (t) according to this method. (A code sketch of these steps appears just after this list.)
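    A minimal sketch of those five steps, under the assumption that the slope m is fit over the calibration period (the fit window isn’t spelled out above, so treat that choice as mine):

        import numpy as np

        def reconstruct(widths, T, cal):
            # widths: (n_proxies, n_years) synthetic ring widths
            # T:      (n_years,) 'known' temperature, used only inside the calibration slice
            mean_width = widths.mean(axis=0)                    # steps 1 and 4
            slopes = [np.polyfit(T[cal], w[cal], 1)[0]          # step 2: width = m*Temperature + b
                      for w in widths]
            m_bar = np.mean(slopes)                             # step 3
            return mean_width / m_bar                           # step 5: temperature estimate vs. time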

    For my example case, the synthetic data consist of:

    1. Temperature (dark grey) is the sum of a sine wave over 1000 ‘years’ (blue dashed) and a piecewise linear function that is 0 for the first 950 years and declines during the final 50 (red dashed). The calibration period is the final 50 years.

    2. There will be 2000 synthetically generated proxies (aka ‘treenometers’). For each ‘treenometer’, the “ringwidth” has a correlation of R=0.25 with Temperature. I’ve also set the value of an arbitrary constant to make the variance of ring widths equal to the variance of the true temperature over the full 1000 years.
    3. I will select the 44% ‘best’ proxies to use for my fit. (This choice is totally arbitrary, but was picked because Gergis picked 44% of their proxies.)
    4. I will not use any proxies “upside down”.

    The graphs

    The mean ‘ring widths’ as a function of time are shown for the unscreened and screened ‘treenometers’ respectively:

    The next step is to compute the best fit trend of the form

      width = m Temperature + b
    for each of the 2000 treenometers and find the mean value for the batch. For the unscreened batch, I obtained m=0.25 ‘length/temp’, which happens to match the true (unchanging over time) value that applies to every single ‘treenometer’ in the synthetic batch. Computing m using the screened batch, I obtained m=0.35 ‘length/temp’, which is larger than the true value.
    (Carrick discussed this effect in a comment here.)

    The reason the value of ‘m’ is greater than the known true value when based on screened treenometers is that the screening method selected only those treenometers whose ‘noise’ during the calibration period happened to bias the fit toward a high estimate for m.

    (Note: Because Nick keeps bringing up “the uniformity principle”, what that principle would tell us is that the value of the correlation coefficient RwT between ring width and Temperature for a treenometer does not vary over time. What it does not imply is that the sample value computed during the calibration period is the true value. The bias introduced by screening arises because the true value of “R” differs from the sample value, and we threw away some trees based on a misinterpretation of the cause of variations in sample values estimated from a finite calibration period. You cannot get around this by saying “uniformity principle” because these biases arise even if we assume the uniformity principle applies to individual trees.)

    Scaling the mean ring width by the mean value of ‘m’ based on the treenometers retained in each respective batch results in the reconstructions below:

    Examining the figure above we note:

    1. Both reconstructions are qualitatively correct in the proxy era. That is: Both show temperature oscillating as it did.
    2. Both correctly follow the temperature during the calibration period. This is because the method of rescaling forces that to happen in a noiseless case.
    3. The unscreened reconstruction is quantitatively correct. The screened reconstruction is biased, showing smaller oscillations in the past. So, historic maxima appear cooler than they actually were.
    4. If you compare current temperatures to the proxy reconstruction based on the screened data, you will incorrectly conclude that current temperatures exceed those that occurred in the past.

    Of course there could be other methods of converting the ring widths to temperature. Each will give somewhat different results. But the main difficulty to overcome is that the screened batch of data systematically overestimates the true value of the correlation between temperature and ring widths. The result is that, when there is a final uptick, screening exaggerates how unusual that uptick appears relative to the past.

    Screening bias: Cartoon form

    In comments at Climate Audit, Jeez explained the screening bias in cartoon form:

    One aspect of the problem for people with zero math skills.

    Here are six scattered chronologies in abstraction of randomness, which, if all averaged together, produce a flat line. Only two of these chronologies correlate with modern temperatures-the red circles. When those two are averaged, voila, a hockeystick magically appears.

    I tweeted this, and there seems to be some interest in a cartoon version explanation of the two-sided method. In the two-sided method, we pick proxies ending with upticks and those ending with downticks.

    We throw away the other ones, flip the proxies ending with downticks (to use them ‘upside down’) and then average. Voila:
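    In code, the two-sided cartoon looks something like this (a toy sketch with an arbitrary correlation threshold, not anyone’s actual method):

        import numpy as np

        rng = np.random.default_rng(4)
        n_series, n_years, cal_len = 2000, 1000, 50
        noise = rng.standard_normal((n_series, n_years))      # trendless 'chronologies'

        # correlate each series with a simple modern 'uptick' over the calibration window
        uptick = np.linspace(0.0, 1.0, cal_len)
        r = np.array([np.corrcoef(s[-cal_len:], uptick)[0, 1] for s in noise])

        # two-sided pick: keep strong positive AND strong negative correlators,
        # flip the negative ones 'upside down', then average
        picked = np.vstack([noise[r > 0.3], -noise[r < -0.3]])
        hockey_stick = picked.mean(axis=0)                    # flat shaft, uptick at the end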

    IPCC’s overestimation of climate sensitivity: Kimoto

    Kyoji Kimoto emailed me a response to comments and questions I made on his paper when discussing Monckton’s claim about the Planck parameter “implicit” to a paper by Kiehl and Trenberth describing the earth’s energy balance. I offered him a guest post and he responded: “Thank you for your invitation to your blog. I am not familiar with posting method. Could you please manage the WORD document attached?”
    I have reformatted the Word document into HTML and am posting it on behalf of Mr. Kimoto. I invite your comments.
     

    — — — —

    IPCC’s overestimation of climate sensitivity
    Kyoji Kimoto, 2011

    Climate sensitivity is 0.5K from the global energy budget of the earth (see Fig.1). However, the IPCC claims the most probable value is 3K with a range of 1.5-4.5K.

    This overestimation comes from the following two sources.

    1. Cess’s mathematical error in the Planck feedback parameter calculation
      (K. Kimoto, Energy & Environment, Vol. 20, 1057, 2009; http://www.mirane.co.jp/)
    2. Overestimation in the 1D RCM study of Manabe et al. 1964/67 (see Fig.2)

    Fig.1 Global energy budget of the earth (adapted from Kiehl et al., 1997)

     

    Fr: long wave radiation; Fb: back radiation; Fe: evaporation; Fs: short wave absorption; Ft: thermal conduction; OLR: outgoing long wave radiation
    True greenhouse energy: Fb - Fs = 257 W/m2
    Greenhouse effect: 288K - 255K = 33K
    Climate sensitivity factor with feedbacks: 33K / 257 W/m2 = 0.13 K/(W/m2)
    IPCC’s radiative forcing for CO2 doubling: 3.7 W/m2
    Climate sensitivity with feedbacks: 0.13 K/(W/m2) × 3.7 W/m2 = 0.5 K
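    [Editorially inserted sketch: the arithmetic above can be checked in a few lines; this only reproduces the numbers as stated in the post, and the variable names are mine.]

        # Reproducing the stated arithmetic, using the figures given above
        greenhouse_energy = 257.0            # W/m2, stated as Fb - Fs
        greenhouse_effect = 288.0 - 255.0    # K
        sensitivity_factor = greenhouse_effect / greenhouse_energy    # ~0.13 K/(W/m2)
        forcing_2xco2 = 3.7                  # W/m2, IPCC forcing for CO2 doubling
        print(round(sensitivity_factor, 2), round(sensitivity_factor * forcing_2xco2, 1))   # 0.13 0.5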

     

                                         
     

         

                                  CO2 contribution in GH effect of 33K    Climate sensitivity with feedbacks
    Manabe et al. (1964/1967)     over 10K                                2.4K
    Observed*)                    3.3-6.7K                                0.5K
    Overestimation                over 200-300%                           500%

     
    *) R. E. Newell et al., Journal of Applied Meteorology, Vol. 18, 822-825 (1979);
    J. Barrett, Energy & Environment, Vol. 16, 1037-1045 (2005)

                       (Copyright Kyoji Kimoto, 2011)
    Link to original Word document: TruthofAGW

    Update: On this thread, at least for the first 24 hours, I would like people to stick to questions for Kimoto. Avoid promoting your own theory, or making comments about skeptics vs. alarmists and so on. I’m moving off-topic comments to another thread.