Metadata Errors in the global weather station database

Errors in GHCN metadata inventories show stations off by as much as 300 kilometers

In the debate over the accuracy of the global temperature nothing is more evident than errors in the location data for stations in the GHCN inventory. That inventory is the primary source for all the temperature series.

One question is “do these mistakes make a difference?” If one believes as I do that the record is largely correct, then it’s obvious that these mistakes cannot make a huge difference. If one believes, as some do, that the record is flawed, then it’s obvious that these mistakes could be part of the problem. Up until know that is where these two sides of the debate stand.

Believers convinced that the small mistakes cannot make a difference; and dis-believers holding that these mistakes could in fact contribute to the bias in the record. Before I get to the question of whether or not these mistakes make a difference, I need to establish the mistakes, show how some of them originate, correct them where I can and then do some simple evaluations of the impact of the mistakes. This is not a simple process. Throughout this process I think we can say two things that are unassailable:

1. the mistakes are real. 2. we simply don’t know if they make a difference. Some believe they cannot (but they haven’t demonstrated that) and some believe they will (but they haven’t demonstrated that). The demonstration of either position requires real work. Up to now no one has done this work.

This matters primarily because to settle the matter of UHI stations must be categorized as urban or rural. That entails collecing some information about the character of the station, say its population or the characteristics of the land surface. So, location matters. Consider Nightlights which Hansen2010 uses to categorize stations into urban and rural. That determination is made by looking up the value of a pixel in an image. If it is bright, the site is urban. If it’s dark (mis-located in the ocean) the site is rural.

In the GHCN metadata the station may be reported at location xyz.xyN yzx.yxE. In reality it can be many miles from this location. That means the nightlights lookup or ANY georeferenced data ( impervious surfaces, gridded population, land cover) may be wrong. One of my readers alerted me to a project to correct the data. That project can be found here. That resource led to other resources including a 2 year long project to correct the data for all weather stations. Its a huge repository. That led to the WMO documents one of the putative sources for GHCN. This source also has errors. Luckily the WMO has asked all member nations to report more accurate data back in 2009. That process has yet to be completed and when it is done we should have data that is reported down to the arc second. Until then we are stuck trying to reconcile various sources.

The first problem to solve is the loss of precision problem. The WMO has reports that are down to the arc minute. It’s clear that when GHCN uses this data and transforms it into decimal degrees that they round and truncate. These truncations, on occasion, will move a station. I’ve documented that by examining the original WMO documents and the GHCN documents. In other cases it hard to see the exact error in GHCN, but they clearly dont track with WMO. First the WMO coordinates for WMO 60355 and then the GHCN coordinates:

WMO: 60355 SKIKDA 36 53N 06 54E [36.8833333, 6.9000]

GHCN: 10160355000 SKIKDA 36.93 6.95

GHCN places the station in the ocean. WMO places it on land as seen above.

To start correcting these locations I started working through the various sources. In this post I will start the work by correcting the GHCN inventory using WMO information as the basis. Aware, of course that WMO may have it own issue. The task is complicated by the lack of any GHCN documents showing how they used WMO documents. In the first step I’ve done this. I compared the GHCN inventory with the WMO inventory and looked at those records where GHCN and WMO have the same station number and station name. That is difficult in itself because of the way GHCN truncates names to fit a data field. It’s also complicated by the issue of re spelling, multiple names for each site and the issue of GHCN Imod flags and WMO station index sub numbers.

Here is what we find. If we start with the 7200 stations in the GHCN inventory and use the WMO identifier to look up the same stations in the WMO official inventory we get roughly 2500 matches. Here are the matching rules I used.

1. the WMO number must be the same

2. The GHCN name must match the WMO name (or alternate names match).

3. The GHCNID must not have any Imod variants. (no multiple stations per WMO)

4. The WMO station must not have any sub index variants. (107 WMO numbers have subindexes)

That’s a bit hard to explain but in short I try to match the stations that are unique in GHCN with those that are unique in the WMO records. Here is what a sample record looks like.WMO positions are translated from degrees and minutes to decimal degrees and the full precision is retained. You can check that against GHCN rounding. As we saw in previous posts slight movements in stations can move them from Bright to dark and from dark to bright pixels.

63401001000 JAN MAYEN 70.93 -8.67 1001 JAN MAYEN 70.93333 -8.666667

63401008000 SVALBARD LUFT 78.25 15.47 1008 SVALBARD AP 78.25000 15.466667

63401025000 TROMO/SKATTO 69.50 19.00 1025 TROMSO/LANGNES 69.68333 18.916667

63401028000 BJORNOYA 74.52 19.02 1028 BJORNOYA 74.51667 19.016667

63401049000 ALTA LUFTHAVN 69.98 23.37 1049 ALTA LUFTHAVN 69.98333 23.366667

You also see some of the name matching difficulties where the two records have the same WMO and slightly different names. If we collate all differences on lat and lon in matching stations we get the following:

And when we check the worst record we find the following

WMO: 60581 HASSI-MESSAOUD 31.66667 6.15

GHCN: 10160581000 HASSI-MESSOUD 31.7 2.9

GHCN has the station at longitude [smm] 2.9. According to GHCN the station is an airport:

The location in the WMO file

And the difference is roughly 300km.WMO is more correct than GHCN. GHCN is off by 300km

An old picture of the approach (weather station is to the left)

And diagrams of the airfield

Now, why does this matter. Giss uses GHCN inventories to get Nightlights. Nightlights uses the location information to determine if the pixel is dark (rural) or bright (urban)

NASA thinks this site is dark. They think it is pitch dark. Of course they are looking 300km away from the real site. From the inventory used in H2010.

10160581000 HASSI-MESSOUD   31.70    2.90  398  630R  HOT DESERT    A    0

0 0 votes

Article Rating

79 Comments

Oldest

Newest Most Voted

Inline Feedbacks

View all comments

Eric Anderson

October 31, 2010 7:42 pm

“1. the mistakes are real. 2. we simply don’t know if they make a difference. Some believe they cannot (but they haven’t demonstrated that) and some believe they will (but they haven’t demonstrated that).”
While this is true — as a pure logical matter — the position that the record is accurate bears the burden of showing that it is the case and that the record can be trusted. The proponent of a theory, or a methodology, or a dataset must demonstrate a meaningful level of accuracy before anyone else should be required to even take it seriously. I can’t simply assert that my dataset is good, and then it is up to the other side to pick it apart.
Having seen poor siting, poor data entry (sometimes even switching the sign of the entry), infilling over hundreds of kilometers, adjustments to data, and, now, sites in the wrong locations, I don’t personally feel any need to take the record seriously until the mess gets cleaned up.
Of course, we really need to ask: accurate in which sense? Is the record “accurate” enough in broad strokes and at a very rough resolution to show some level of warming over the past century? Yeah, probably. Is it accurate enough for us to know, to within tenths of a degree what has happened across the globe, whether the warming is meaningful, whether it has been global, regional, isolated, etc.? Is it accurate enough for us to identify any kind of meaningful trend from the noise? It is in these areas that, if I may be so bold, many of us have serious doubts. The record has certainly never been shown to have any such skill, and each revelation of problems with the record only adds to the unease. The whole thing is pretty spooky. 🙂

Claude Harvey

October 31, 2010 7:58 pm

“If we start with the 7200 stations in the GHCN inventory and use the WMO identifier to look up the same stations in the WMO official inventory we get roughly 2500 matches.”
This kind of “slop” certainly does not engender confidence. With 65% of the stations not being located exactly where the analysts think they are and the analysts categorizing and adjusting data based on location, who could trust the results?
Skeptics like me have noticed that “slop” which produces results skewed toward”warming” is accepted and goes unquestioned and uncorrected within the “true believing” scientific community. Conversely, “slop” that produces results contrary to the theory of AGW is intensely questioned within that community and corrected without delay. That mind-set is guaranteed to produce a false record tending toward a preconceived conclusion.

jorgekafkazar

October 31, 2010 8:03 pm

Fascinating, Steven! Not sure of the ultimate significance, but another interesting piece of the pie.

October 31, 2010 8:12 pm

“1. the mistakes are real”
Ok, with all the money spent on climatology, why is it that the raw data is not more carefully collected?
E.g. in the UK 1700 ‘scientists’ signed a petition supporting the conclusions of the IPCC. How come these same ‘scientists’ could not keep the temperature records straight? I rather think at this point we have more ‘scientists’ than thermometers.

James Allison

October 31, 2010 8:13 pm

Perhaps its as much about scientific credibility as it is about whether “these mistakes make a difference?”
If the so called scientists who use the data these stations produce are either unaware of or unconcerned that station positioning is up to 300km off then I would question the credibility of these scientists. After adding in all the other temp recording/adjustment problems we hear about on WUWT and these so called climate scientists lose all credibility. They have simply become a complete waste of money, time and space.

Judd

October 31, 2010 8:25 pm

In the end, this all deals somewhat on philosophy. I’ve been deeply concerned (albeit not from an alarmist side) about AGW since 1988. I believe that AGW has long broken past being a scientific issue (if it ever was one) and into a religious, political issue. I want to be careful here in relationship to bringing up my medical issues as a parallel to the issues regarding AGW. But those issues have taught me things about the lack of ‘certainty’ in science. This lack of certainty is, in the end, a wondrous thing. I wish to dwell upon this further on this fine website.

AusieDan

October 31, 2010 8:48 pm

If you start from the basis of one partiular location in your community, then expand that to cover other locations as well, then it is difficult if not impossible to find that there has been any warming at all since records began.
If you start to find evidence of some scattered warming, then look for evidence of UHI.
Match rainfall with temperature.
Rising temperature with flat rainfall is a sure pointer to UHI.

John Kehr

October 31, 2010 8:48 pm

A simple check would be to determine the trend at the 2500 stations that match and compare it to the entire sample. Statistically a population that large is enough to show the trend.
If there is not difference between the 2500 and the 7200 then the catalog error is insignificant. If there is a difference, then it needs investigation.
I don’t think it matter that much as there is enough data to support that the Earth is warmer now than it has for most of the last few hundred years. That it is warmer now also doesn’t matter as that kind of variation is natural.
John Kehr
The Inconvenient Skeptic

Evan Jones

Editor

October 31, 2010 8:55 pm

Mosh, take a look at MMS record for Blue Hill station, MA.
You will see several different locations listed. I spoke to the curator up there (very longtime old dude). He insisted there had been only one station move since 1905 and that was decades ago and the move was less than 20 feet.
They have the full suite of equipment up there, but they take the USHCN data straight from a big old Hazen screen, the same one that’s been there over 100 years. They also list the equipment as MMTS since 1990 (the curator said they never even hooked that junk up). They also have two hygros, only one of which was listed (but no longer is listed). They only mention the SRG although they have all sorts of new and spiffy equipment (including one of those spanking new ASOS-type rain gauge jobs) not listed in their equipment.
The Hazen screen is never mentioned as part of equipment (though a more recent CRS is noted).
So you can count on NCDC not only making big errors regarding siting, but they also appear prone to error on equipment and even what sort of sensor is being used to collect their data.

Gary

October 31, 2010 8:56 pm

Once current locations are verified, how will the correctness of previous locations be dealt with? Presumably the same type of errors have always existed with the data set.

Anything is possible

October 31, 2010 8:58 pm

“Consider Nightlights which Hansen2010 uses to categorize stations into urban and rural. That determination is made by looking up the value of a pixel in an image. If it bright, the site is urban. If its dark (mis-located in the ocean) the site is rural.”
_____________________________________________________________
Correct me if I’m wrong, but even with increasing population and urbanisation, the over-whelming majority of the Earth’s surface would still be categorised as dark.
It is therefore logical to assume that if a station is randomly mis-located, then there is a far greater chance that it will be wrongly categorised as a rural station when it should be urban than as an urban station when it should be rural.
Wrongly categorising an urban station as rural would mean that no allowance would be made for any UHI effect.
Lo and behold, we have the potential, at least, for bias in the record.
That seemed too easy, so if I am missing something blindingly obvious, I apologise in advance…….

Evan Jones

Editor

October 31, 2010 9:01 pm

Ok, with all the money spent on climatology, why is it that the raw data is not more carefully collected?
Okay, remember the old Soviet-era Russian joke about the bridge and the watchman?

EthicallyCivil

October 31, 2010 9:05 pm

Thank you Steven — the difference with Dr Trenberths “Scientists almost always have to massage their data, exercising judgment about what might be defective and best disregarded” couldn’t be more striking.
If the data is potentially flawed, a scientist looks to quantify the size and impact of the errors and correct them if possible. Failing that, identifying the impact of the flaws on the conclusions and fully disclosing these limitations is acceptable.
Dr. Trenberths comments read far more the way sales and marketing spreadsheets are “massaged” to give a good business case — a job I used to have.

a jones

October 31, 2010 9:26 pm

Natural philosophy has a long and honourable tradition of welcoming the contributions of the skilled or expert amateur: where indeed would our astronomical friends be without their army of amateur stargazers watching the heavens.
And indeed as Anthony Watts’ surface station project has shown the sheer manpower the amateurs can mobilise is awesome.
You would think climate scientists might welcome these resources and interact with them as the astronomers do.
Not a bit. Yet as the amateurs and some professional scientists have slowly come together in the blogosphere they are increasingly challenging the scientific integrity of Cimatology from almost every aspect. You yourself Mr. Mosher are addressing a serious issue with the quality of the observations used for surface temperature sets.
Although your approach is valid and offers important insights it will not welcomed by the professionals. Personally I find your observations fascinating. But no doubt your painstaking work will be derided, ridiculed and personal slurs made about you.
Strange is it not? Yet stranger still that the blogosphere has managed to mobilise a large number of well informed largely unpaid persons to challenge the scientific establishment of climatology on almost every aspect of their discipline: as they carry it out to their profit, their professional advancement: and above all else to their entire satisfaction. For none in their tight little world will gainsay them.
Yet despite all attempts to ignore, suppress and vilify it this groundroots approach has gained traction. And slowly, very slowly, is shifting the debate from rhetoric, polemic and derision, not to mention obstruction, to serious scientific investigation.
Which is to be welcomed.
But it is very strange nevertheless.
Amazing thing t’internet. Interesting times Mr. Mosher, interesting times indeed.
Kindest Regards

Jim Owen

October 31, 2010 9:49 pm

I’m a long-distance hiker and I’ll tell you with no uncertainty that 10 km can, and very often does, make a very large difference in local conditions. A 300 km error makes the data AND any calculations based on that data simply ridiculous.
I”m also a systems engineer with over 50 years of experience with instrumentation of little things like nuclear reactors, amphibious vehicles, scientific spacecraft, etc. For most of that time I was the science operations engineer for such minor programs as Landsat, UARS and HST. And I have never in all those years seen either scientists or engineers be as unconscionably sloppy as has become abundantly and obviously common in the “climate community”. If I’d been that sloppy at any time in my 50+ year career, I’d have been fired – and rightfully so.
Tell me again – why do we pay these guys the big bucks? IMO, what they seem to term “science” wouldn’t get a passing grade in a middle school science fair.
Hmmm – I guess it’s obvious that I’m not happy with these people. They give science, engineering — and mathematics, a black eye.

Frank Lee Meidere

October 31, 2010 9:59 pm

Place thermometer. Note location. Take periodic readings.
I can see how that could pose problems.
On a related note, are there any statistics how many of these scientists have done themselves a mischief through being unable to distinguish between depressions in the earth and certain areas of their anatomy?

Latimer Alder

October 31, 2010 11:19 pm

I hope that this is not a dumb question, but has anyone done any work looking at the combination of uncertainties involved in the sequence of
a. recording the temperature at all these places – with possibilities for error/UHI/equipment failures/calibration errors/misreads etc
b. working out where the damned things are
c. collecting the data
d. ‘adjusting’ the data
e. sticking it all into some ginormous computer program (preferably not by

Harry_Read_Me

) and computing a daily/weekly/monthly/annual average
and all the other processing steps involved.
My ‘feeling’ is that given all the uncertainties involved, we might be able to detect really large changes in temperature (5 degrees or so over a century). But to accurately find a change of less than 1 degree seems to stretch the point a bit. Even with a faultless sequence of processing from data capture to overall answer, that would seem to be pretty unlikely.
But we continue to see more and more evidence that we do not have faultless data collection nor processing…and there is a slight odour of sharp practice as well.
Can we really have the level of faith in the temperature record that we are asked to believe? Or is the signal too small and the noise too great?

pat

October 31, 2010 11:26 pm

I remember a meteorologist in Honolulu laughingly pointing out to a reporter trumpeting the record temperature that his instruments a mere half mile away would read significantly lower. Because he was stuck at a frickin airport. That the ‘record’ was bogus. Of course the local news outlet left off the explanation and went with the ‘record’.Even tho the meteorologist gave a very concise explanation of the urban heat effect on near ground instrumentation.
Let everyone be clear. It is warmer in Hawaii than it was 25 years ago. Is it unprecedented? Absolutely not.

Share this:

Like this:

Related