Wishing Away

The problems of climate science
and the dangers of policy projections

By Josh Lerner

“Scientists, like artists, have the bad habit of falling in love with their models.” - Professor George Box

With the recent breakdown of talks in Copenhagen (or, for those in an alternate universe, the agreement on “non binding” suggested carbon “targets”) and the admission by several high-profile Democratic Senators that any major climate change legislation passing in this Congress is “unlikely,” the political momentum seems to be rapidly escaping advocates for large-scale reform and regulation to counteract the effects of man-made global warming.
            And why shouldn’t it? To the layman (also known as the average voter), the case for catastrophic consequences coming from man-made global warming seems to be getting weaker and weaker. Two major developments have recently increased the doubts that these voters have over both the predictive validity of these results and the honesty of the scientists who collected them.
            The first development is the abnormally cold winter we are having in much of the country. The two massive December snowstorms in Oklahoma, a state so unprepared for the snow that major cities like Tulsa were not plowed out for weeks, along with rather unremarkable temperatures in much of the country since 2006, create the impression that this mythical warming is simply not happening. Obviously, temperature readings, particularly over such a short time frame, say very little about global climate trends. Climate scientists are right to say that these fluctuations neither prove nor disprove any element of anthropogenic climate change. This development is, unfortunately, mostly a sideshow.
            But the other major development is much more significant and has drawn major attention to a more banal, yet equally important element of climate projections: the validity of the data. That would be the scandal quickly dubbed “climategate.”
            This past November, a large cache of emails and other documents from the Climate Research Unit (CRU) at the University of East Anglia in Britain was either hacked or leaked to the general public. These emails, consisting of almost 10 years of exchanges, show a group of insulated scientists who, in the words of MIT’s Michael Schrage, engaged in “malice, mischief and Machiavellian maneuverings.” The scandals in these emails, with regard to the validity of their research, range from trivial (tax code “hand waving”) to catastrophic (the “massaging” of data and discussing their desire to avoid complying with the Freedom of Information Act).

            The scandal, although far from disproving the claims of the scientists embroiled in it, did cast major doubt on the honesty and the interpretation of this data. Professor Phil Jones, the CRU’s director, is in charge of the two key sets of data the IPCC (Intergovernmental Panel on Climate Change) used to draw up the most famous reports that provided the United Nations with its catastrophic predictions. Through its link to the Hadley Centre, his global temperature record is one of the most important of the four data sets on which the IPCC relied to make its eschatological predictions. The charges of widespread malfeasance have cast serious doubt upon the whole operation.

            But even ignoring the whole controversy over the emails (new information has been slowly leaking out over the past few months, and the story of those implicated has changed just as rapidly), there exists a much more important, and often unstated, problem underlying the entirety of this climate prediction industry. The problem is essentially one of systemic uncertainty, the uncertainty that is built into the very foundation of a prediction. The greater part of the case for a radical reformation of the global economy is built on the catastrophic nature of predictions based entirely on long-term probability models with inputs both ancient and modern. The systemic uncertainty here, at every level of this intellectual formulation, is the crucial, and often overlooked, element of analysis. Given the gravity of the policies designed to alleviate anthropogenic global warming (hereafter referred to as AGW), it is a topic that must be discussed.


What is Uncertainty?

            Uncertainty is an idea whose very nature is hard to define. It is usually thought of, in the dictionary sense of the word, as “the lack of sureness about someone or something.” This definition, however, doesn’t capture the important effects of uncertainty. A better definition would be one given by Professor Frank Knight, of our very own University of Chicago, in his seminal 1921 book Risk, Uncertainty, and Profit, which clarifies the important distinction between risk and uncertainty:


Uncertainty must be taken in a sense radically distinct from the familiar notion of risk, from which it has never been properly separated.... The essential fact is that ‘risk’ means in some cases a quantity susceptible of measurement, while at other times it is something distinctly not of this character; and there are far-reaching and crucial differences in the bearings of the phenomena depending on which of the two is really present and operating.... It will appear that a measurable uncertainty, or ‘risk’ proper, as we shall use the term, is so far different from an unmeasurable one that it is not in effect an uncertainty at all.


When we break up uncertainty, we can create different categories of ignorance and decide how one should respond to each. At the most basic level there is uncertainty: the lack of certainty, a state of limited knowledge in which it is impossible to describe exactly the existing state or the future outcome, and in which more than one outcome is possible. Once uncertainty has been described, there are some situations in which it can be measured. If one can measure uncertainty, one can define the set of possible states or outcomes and assign a probability to each; this also includes the application of a probability density function to continuous variables.

            Risk, however, is a state of uncertainty where some possible outcomes have an undesired effect or significant loss. This is the crucial concept in the evaluation of policies designed to mitigate the warming. The measurement of risk is the set of measured uncertainties where some possible outcomes are losses, together with the magnitudes of those losses; this also includes loss functions over continuous variables.

            The classic example of uncertainty and risk would be simple weather predictions. If I do not know if it is going to rain tomorrow, I have an uncertainty. If I know there is a 70% chance of rain tomorrow, I have a quantified and qualified uncertainty. If I’m planning a picnic tomorrow, the probability it is going to get rained out versus the cost of putting it on is the risk I’m taking.
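The picnic arithmetic can be made concrete with a small sketch. The 70% chance of rain comes from the example above; the dollar figure is hypothetical, chosen only to show how a measured uncertainty and a loss combine into a risk.

```python
def expected_loss(p_rain: float, cost_if_rained_out: float) -> float:
    """Risk as an expected loss: probability of the bad outcome
    multiplied by the size of the loss if it occurs."""
    if not 0.0 <= p_rain <= 1.0:
        raise ValueError("p_rain must be between 0 and 1")
    return p_rain * cost_if_rained_out


P_RAIN = 0.70        # from the example in the text
PICNIC_COST = 100.0  # hypothetical: money spent on the picnic, lost if it rains

risk = expected_loss(P_RAIN, PICNIC_COST)
print(f"Expected loss from holding the picnic: ${risk:.2f}")
```

Note what the sketch cannot do: it has no entry for outcomes nobody thought to list, which is exactly where the measurable part of uncertainty ends.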

            What is rather hard to do is to measure the uncertainty without an a priori knowledge of the probabilities built into the system. Quantifying uncertainty is something that becomes more and more difficult with the increased complexity of the system you are analyzing, and the simple probabilities become far more difficult to comprehend. Uncertainty that is by its very nature immeasurable, sometimes described as Knightian uncertainty, is an important factor in undermining the usefulness of a probability model.

            Knightian uncertainty is best understood as an epistemological phenomenon, one that necessitates separating objective and subjective uncertainty. Objective uncertainty can be quantified: it is uncertainty that is based around what Donald Rumsfeld in a radically different context called “known unknowns.” This is uncertainty that can be planned for, quantified, and assessed. All models and scientific experiments deal with these known unknowns; the very existence of probability modeling is the measurement and use as a predictive tool of this objective uncertainty.

            Knightian uncertainty, however, deals with the “unknown unknowns” or, said differently, with subjective uncertainty. This type of uncertainty creates situations in which grand systems and predictive forecasts break down in unexpected and unforeseen ways. Knightian uncertainty is something that, in, say, financial markets, drives people to behave very cautiously and react as if there exists highly probable catastrophic risk even if none is detectable. Knightian uncertainty arises from inherent complexities in a system, from the errors made in data collection, from the tiniest human errors in computation, and so on. By their very nature, these mistakes are nearly random and unpredictable. Compounded Knightian uncertainty can lead to catastrophic outcomes if too much weight is given to the predictive capabilities of the models. If there exists the possibility of poorly collected, calibrated, or refined data, the uncertainty that arises renders the model useless. Going back to our weather example, Knightian uncertainty would refer to the probability of the forest burning down the night before, or a meteorite striking, or your car breaking down on the way to the park, when all you know is the weather probabilities. Given that any of these incidents has a nonzero probability of happening, and that many of them could be exacerbated by human error, Knightian uncertainty can drastically derail any sort of prediction made with incomplete information.

            Which, of course, brings us back to the Climate Research Unit. The language used to describe the data corrections in the emails is not, by itself, troublesome. Just because Phil Jones wants to “hide the decline” when he speaks of inconsistent tree ring data doesn’t mean that the scientists at the CRU are, necessarily, doing things to the data that are suspect. Anyone who deals extensively with raw data can tell you that it must be treated and standardized before meaningful analysis can be done. It’s the worst-kept secret of all scientific research: a lot of value judgments come into play when you are preparing the data to be used. As statistician Ronald Thisted once said, “Raw data, like raw potatoes, usually require cleaning before use.” So these types of adjustments are far from unusual.

            What makes this unusual is the reluctance of the scientists at the CRU to make their data available for scrutiny. Olympian efforts were made to continually quash Freedom of Information Act requests by multiple parties. Michael Mann, a key figure in this kerfuffle and the author of the infamous “Hockey Stick” projection, was particularly obstinate about turning over his data to Canadian statistician Steve McIntyre (a data fraud specialist whose work showed both drastic overestimations of global temperature projections by NASA’s James Hansen, a point NASA grudgingly admitted on their website, and misleading certainty in Mann’s Hockey Stick graph[1]), going as far as to say that “If [McIntyre et al] ever hear there is a Freedom of Information Act now in the UK, I think I’ll delete the file rather than send to anyone.” This refusal to let others see their raw data is a troubling indicator that something else must be going on.

            Indeed, there was something else afoot; Phil Jones admitted that they were unable to comply with the Freedom of Information Act request because about five percent of the raw data had been lost in a data transfer. As the Climategate emails clearly show, the scientists who made the adjustments to the data are heavily invested in proving the veracity of anthropogenic global warming. And they now admit that they’ve lost some of the data. This is by far the most damning element of it all. If one is conducting important work that he knows will be controversial, particularly if it will have public policy implications, he cannot lose the data. He should document everything he did to the data and make the data available to others. Even if only a small amount of data was lost, the data set becomes worthless if it is not the exact same reference they used; even the smallest difference in the data set could have enormous consequences.

            This is where Knightian uncertainty rears its ugly head: if the data collection cannot be considered totally honest (and how is one to know it was totally honest when the results cannot be checked against the raw data?), the unknown unknowns within the model become astronomical. The only way to test the validity of the models, if they have any left, is by back-testing their predictive validity. These models must show that they are the superior predictor of future temperatures, something they consistently fail to do. Scott Armstrong, founder of the International Journal of Forecasting and a professor of “marketing and projection” at the Wharton School of Business, has engaged in a rigorous analysis of the predictive power of these models, particularly the IPCC models, comparing them to what he calls naïve models: models that assume no advance knowledge of climate and simply assume precisely zero change.

            The first test Armstrong performed was to compare the error in predicting temperatures over a one hundred year interval. In Armstrong’s own words, “Assume that it is 1850 and you make a forecast that global temperature will be the same 50 years later (i.e., 1900). In 1851 you make another such forecast for the year 1901 . . . And so on up to 1958 when you forecast to 2008. You then compare the forecasts against HadCRUT3 [IPCC model] and calculate the errors (ignore the signs). What would be the average error for the 108 50 year forecasts in degrees centigrade?” Using rolling forecasts and the UAH temperatures, the naïve model had a mean absolute error of 0.215 degrees Celsius, versus the IPCC model with a mean absolute error of 0.203 degrees Celsius, a difference of 0.012 degrees Celsius. So, at first glance, it appears that the IPCC model, for all of its refinement of the inputs and precise knowledge of climate permutations, is almost no better than the assumption of steady to nonexistent growth. What is even more troubling is that the longer the timeframe, given the selected inputs, the worse the IPCC model does. For the 10-year projections, it’s nearly perfect. For the 100-year projections, it is almost 12 times worse, and performs slightly worse than the naïve model.
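The mechanics of the no-change benchmark described above can be sketched in a few lines. This is only an illustration of the technique, not Armstrong’s actual procedure or data: the series is an invented toy, and the function names and the horizon are assumptions made for the example.

```python
def mean_absolute_error(forecasts, actuals):
    """Average of |forecast - actual|, ignoring the sign of each error."""
    if len(forecasts) != len(actuals) or not forecasts:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(f - a) for f, a in zip(forecasts, actuals)) / len(forecasts)


def naive_forecast_pairs(series, horizon):
    """No-change benchmark: the value observed at step t is used as the
    forecast for step t + horizon. Returns (forecasts, actuals)."""
    if horizon <= 0 or horizon >= len(series):
        raise ValueError("horizon must be between 1 and len(series) - 1")
    return series[:-horizon], series[horizon:]


# Invented toy anomalies, NOT real temperature data.
toy_series = [0.0, 0.1, 0.3, 0.2, 0.4, 0.5]
HORIZON = 2  # hypothetical forecast horizon, in steps

forecasts, actuals = naive_forecast_pairs(toy_series, HORIZON)
naive_mae = mean_absolute_error(forecasts, actuals)
print(f"Naive no-change MAE at horizon {HORIZON}: {naive_mae:.3f}")
```

A model’s forecasts for the same target dates could be scored with the same `mean_absolute_error` function; the comparison in the text is simply the gap between the two averages.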

            Armstrong takes it a step further and begins comparing his own selection of correlates of global temperature against the carbon input integral to anthropogenic global warming. When simply looking at correlation coefficients, atmospheric carbon does fairly well, with a .86 correlation. It does about as well as US postal rates (.85), the Consumer Price Index (.87), NOAA expenditures (.83), and books published in the US (.73), hardly an overwhelming result. When you break the record into fifty-year intervals and use each interval to predict the results of the next fifty-year chunk, atmospheric carbon falls to the middle of the pack in Weighted Cumulative Relative Absolute Error,[2] between NOAA expenditures and books published in the US. The predictive value of atmospheric carbon is, it would seem, rather pedestrian, and it furthers the doubt as to the long-term projections of the model. Given the long evolution of the climate model, and of both its inputs and outputs, one would expect that the quality of the models, in terms of their predictive power, would increase over time. What Armstrong found, however, was that increasing the total inputs, partially by incorporating paleoclimatological data (particularly tree ring size), has made at best minor improvements and, in some cases, has even decreased their quality.
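Why a high correlation coefficient is “hardly an overwhelming result” can be shown with a short sketch: any two series that happen to drift upward over the same period will correlate strongly, whether or not one has anything to do with the other. Both series below are invented for illustration and are not taken from Armstrong’s work; the variable names are hypothetical labels.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Two invented series that share nothing except an upward drift.
steady_trend = [1, 2, 3, 4, 5, 6]        # hypothetical stand-in for any rising quantity
noisy_trend = [10, 12, 11, 15, 14, 18]   # hypothetical, unrelated, also rising

r = pearson_r(steady_trend, noisy_trend)
print(f"Correlation between two unrelated trending series: {r:.2f}")
```

This is why the out-of-sample test described above matters more than the coefficient itself: a shared trend inflates correlation, but it does not guarantee better forecasts of the next interval.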

            Which is where uncertainty reemerges: if the validity of the inputs cannot be verified and the predictive power of the models is fairly unremarkable, the level of Knightian uncertainty in these models becomes overwhelming. Further, it is the lack of care that these specific scientists, particularly Michael Mann and his cadre, have previously put into their statistical methodology that drastically undermines any predictions these models make. Returning to Mann’s Hockey Stick, it was one of the few major studies for which the National Academy of Sciences felt compelled to appoint a special review committee to inspect its methods, its data, and its conclusions. This committee concluded that


Substantial uncertainties currently present in the quantitative assessment of large-scale surface temperature changes prior to about A.D. 1600 lower our confidence in this conclusion compared to the high level of confidence we place in the Little Ice Age cooling and 20th century warming. Even less confidence can be placed in the original conclusions by Mann et al. (1999) that “the 1990s are likely the warmest decade, and 1998 the warmest year, in at least a millennium.”


Kurt Cuffey, a physicist from UC Berkeley and a committee member, later chided the IPCC for using the Hockey Stick model, stating that it “sent a very misleading message about how resolved this part of the scientific research was.”

In 2006 Mann’s Hockey Stick model was further scrutinized, this time by a group commissioned by the United States House of Representatives Energy and Commerce Committee. The Wegman Report (named after the head of the panel, Edward Wegman, a statistics professor at George Mason University and former chair of the National Research Council’s Committee on Applied and Theoretical Statistics) concluded:


The sharing of research materials, data and results was haphazardly and grudgingly done. In this case we judge that there was too much reliance on peer review, which was not necessarily independent. Moreover, the work has been sufficiently politicized that this community can hardly reassess their public positions without losing credibility. Overall, our committee believes that Dr. Mann’s assessments that the decade of the 1990s was the hottest decade of the millennium and that 1998 was the hottest year of the millennium cannot be supported by his analysis.


The sheer enormity of the corruption here (methodological and intellectual more than monetary) demands that the totality of paleoclimatological forecasting models be audited and recreated by an independent group of forecasting experts and scientists. Neither of these committees attempted to recreate the results of Mann’s study or any other major climatological study from the raw data. These committees only performed a review of his procedures and his results, not a recreation of the studies.

            This, of course, would not be too much of a hassle were it not for the fact that some of the raw data had been lost. Recreating the results with only a partial sample—with reports as to what parts of the data were lost, whether it was areas affected by the Urban Heat Island effect or simply outliers, varying greatly—is just impossible. Effective recreation is simply not a viable option now, and leaves us with very little to alleviate the uncertainty about the nature and extent of the results.

            Now, it has been argued that these were peer-reviewed pieces, and therefore have met these basic requirements: recreating the experiment wholesale seems time-consuming and wasteful. In most cases, this argument certainly would hold true: we simply do not have the time or the resources to recreate every single study in the sciences; nothing would ever get done, and the process of increasing total knowledge would be slowed immensely. What makes this an unusual case is the controversial and, even more important, consequential nature of the findings.

            By their very nature, most scientific studies have minimal impact associated with them: the veracity of their claims affects only a very small subset of the population, and the difference of magnitude within a result is not terribly important. With climate modeling, given the prescriptive nature of the field, the reliability of the results is of absolute importance to everyone in the world: the cost of a miscalculation or other basic error would be enormous. When studies have such a wide impact on public policy, or if they theorize a radical or controversial finding, they are simply held to a different standard of ex post facto corroboration. Since these studies pertain directly to the world of public policy and politics, the ways in which they are evaluated have to take on the far more extreme verification requirements of a policy piece or, even, prescription drug research. These requirements are far from standardized, but there are certain practices that seem to be the norm.

            For some of the most far-reaching and controversial studies, the scrutiny they are placed under is far greater than anything these anthropogenic global warming studies have faced. One of the most famous examples of such a wide-reaching study would be the Coleman Report, a study of the effects of racial integration and differences in income on performance in primary and secondary education. The huge implications of the study (which concluded that the quality of the school was independent of the spending and that the single biggest predictor of academic success was the background and education of the students’ parents) necessitated not only a major and thorough review of the methods Dr. Coleman used, but also a recreation of his study by a panel at Harvard University. Coleman not only provided them his raw data, but he also provided the manipulations he used every step of the way to get the data into treatable form. The significance was that the Harvard panel, aside from finding a minor coding problem, largely reaffirmed Coleman’s monumental results and provided the study with the necessary legitimacy to act upon it.

            But James Coleman is far from the only social researcher to provide this type of access to his models and his data. Robert Putnam, in his equally controversial study of democratic institutions, allows his whole data set, raw and treated, to be accessed by anyone simply by going to his website. What makes the paleoclimatological research so unnerving is that there are no cases in which these researchers allowed anywhere near this kind of access to their methods, data, and models if the person requesting said materials was skeptical. The procurement of data in these studies (where considerable doubt remains as to the validity of seemingly arbitrary choices, a claim made by MIT physicist and climatologist Richard Lindzen) involves such a large and widespread population that it necessitates direct and systematic reviews of what raw data was included and excluded, why, and how it was treated.

            Given the admitted loss of the original inputs and a deliberate refusal to release the specifics of the treatments, the uncertainty about the data, models, and conclusions is simply too large to overlook. These flawed studies are not the things that good policy is made from; they seem to be the product of an over-politicized and poorly constructed process that emphasizes the urgency of the results over the measured skepticism that the scientific process needs. While we cannot be certain about the eschatological predictions made by climate scientists, what we can be certain of is the economic damage caused by implementing some of the “solutions” to this problem; in no uncertain terms, enacting a cap-and-trade mechanism in the United States would cost us anywhere from 1.7 to 4.8 trillion dollars in the next two decades. By some estimations, the bill would cause a net loss of over 1 million American jobs by 2016. The economic costs of these programs are much more likely to be known quantities, if for no other reason than that we’ve done it before and know, to a far greater extent, the strengths and weaknesses of such analysis. It also does not involve the rampant uncertainty of the climatological predictions; these are costs one can, proverbially, take to the bank.

            Where does that leave us on the issue of climate change? Well, a simple reading of the relative uncertainties and predictive powers would suggest that the best course of action may be to do very little or, in fact, nothing. The costs of action are high and known; the costs of inaction have yet to be fully explored and are, as of right now, mostly unknown. The obvious solutions should entail gradual and prudential changes that won’t drastically weaken the footing of the American economy, while moving us in the right direction, probably, to fight this theoretical danger. Small, not broad, strokes are what we need now; large-scale action based on poorly constructed models would seem not only imprudent, but reckless. We shouldn’t undermine the American economy over poorly constructed and nonreplicable forecasting models.

[1] The Hockey Stick Graph famously showed that global temperatures remained largely consistent in the millennium leading up to 1850, but have, since then, gone up like the blade of a hockey stick. It was a major selling point of the 2001 IPCC report and was one of the most inflammatory studies ever done on global temperature. Most of Al Gore’s most apocalyptic predictions were at least somewhat based on this graph.


[2] Relative to a no-change benchmark, weighted so that errors for each forecasting horizon are counted equally.