Getting oldWeather data ship-shape for science

Author of Post: Larry Spencer

Date of Post: Tuesday, December 04, 2018

 

     The oldWeather science team is using the ship logbook observations to help make a three-dimensional global reconstruction of the Earth’s weather in order to understand the past climate of the Earth and better predict its future. This uses the ship observations in exciting and fascinating ways, but it needs them to be processed into a precise format and as accurately as possible. We know that the ship logbook transcriptions made by our volunteers are very accurate, but we still have plenty of challenges in using the data. Some of the entries were incorrect when originally recorded in the logs, and we need to infer some information that is not in the logbook explicitly. For example, we have to calculate exact positions and standardized dates from the local dates and location names provided in the logs. We need to get the observations ready to sail!

     You might be wondering how all of that transcribed historical weather data is processed and then ends up being used by the scientists after the work of the volunteers has been completed. As one member of the science team, I work on this extensively, and I can tell you that it is a very enjoyable process. For each one of the oldWeather ships available, I meticulously analyze and quality-control the transcribed data for both latitude and longitude positions and weather observations (checking for and removing existing errors in the data) in order to ensure the highest-quality datasets possible. As a way to illustrate just how important this process is, two maps are displayed below. The first one is a “Before Map” that shows the Thetis’s positions and tracks (represented by the gold dots and lines) that were automatically generated by Philip’s software from the place names transcribed from the logbooks. The second one is an “After Map” that shows the ship’s positions and tracks after I carefully analyzed and quality-controlled the geographical positions data from the ship. There is a significant difference between the two maps displayed, with the “After Map” illustrating a much more “polished-up” and realistic version of the ship’s voyage positions and tracks (for instance, the ship not traveling over any land masses).

 

Figure #1. – “Before Map” of the oldWeather Thetis ship’s voyage positions/tracks throughout the time period of May 01, 1884 – December 31, 1908 before quality-control on the geographical positions data from the ship.

 

Figure #2. – “After Map” of the oldWeather Thetis ship’s voyage positions/tracks throughout the time period of May 01, 1884 – December 31, 1908 after quality-control on the geographical positions data from the ship.

 

     The transcribed data produced by oldWeather are all stored in one big database. From this database, Philip’s software makes two separate data files for each ship: one for the latitude and longitude positions and one for the weather observations. I then apply a meticulous data analysis/quality-control process to the data contained in both of these files for each ship. As a part of this process, to make error detection a bit easier, I import the weather data into a spreadsheet and sort it both from the least to the greatest values and from the greatest to the least values. This is a quick and easy method of detecting and correcting the most obvious errors that exist in the data. I complete this for the three meteorological variables of surface pressure, 2-meter air temperature, and sea surface temperature. In addition to this, I manually edit the data where errors exist throughout the entire files, both in the geographical positions data file and in the weather observations data file. I use a standard set of criteria as a guide for how to locate errors and perform quality-control on the data where I detect problems. For example, with regard to the surface pressure data, if any value in the original file is lower than 28.00 inches of mercury or higher than 32.00 inches of mercury, then I remove that value and replace it in the file with an “NA”, which represents “Not Available”, because it would be classified as a “meteorologically-unrealistic” value. This criterion applies whether such a value appears in the original file or was recorded in the original ship logbook.
Another example is values that have been transcribed correctly by the volunteers but were obviously recorded incorrectly in the original ship logbooks. I either replace these values in the original file with an “NA” or change them to the obviously-correct values, depending upon the exact circumstances. In those particular circumstances, I make a careful comparison of the “obviously-incorrect” values with the values provided directly above and below them in the original ship logbooks and then proceed to manually edit them accordingly. When I have completed this technical process in its entirety, I then convert a given ship’s quality-controlled weather and position data from the transcribed database into a standard format called the International Maritime Meteorological Archive (IMMA) format. Each oldWeather ship has a final file that is created for it, which contains the combined weather and position data presented in the IMMA format. The overarching goal of this entire process is to produce standardized data, as accurate as possible, that can be easily used by professional scientists in major research projects.
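
The surface pressure range check described above can be sketched in a few lines of Python. This is only an illustration of the criterion, not the actual workflow (which uses a spreadsheet and manual editing). The function names are hypothetical, and treating unreadable entries as “NA” is an assumption of this sketch.

```python
NA = "NA"  # marker for "Not Available", as used in the data files

# Thresholds from the text, in inches of mercury
PRESSURE_MIN_INHG = 28.00
PRESSURE_MAX_INHG = 32.00


def qc_pressure(values):
    """Replace meteorologically unrealistic pressures with NA.

    Values exactly equal to a threshold are kept; only values lower than
    the minimum or higher than the maximum are replaced.
    """
    cleaned = []
    for raw in values:
        try:
            pressure = float(raw)
        except (TypeError, ValueError):
            # Assumption: anything that is not a number is treated as NA
            cleaned.append(NA)
            continue
        if pressure < PRESSURE_MIN_INHG or pressure > PRESSURE_MAX_INHG:
            cleaned.append(NA)
        else:
            cleaned.append(pressure)
    return cleaned


def extremes(values, n=3):
    """Return the n lowest and n highest numeric values, mimicking the
    sort-ascending / sort-descending inspection done in a spreadsheet."""
    numbers = sorted(v for v in values if isinstance(v, float))
    return numbers[:n], numbers[-n:]


if __name__ == "__main__":
    column = ["29.92", "27.50", "32.00", "33.10", "NA"]
    print(qc_pressure(column))
```

Looking at the few lowest and highest values first, as `extremes` does, is what makes the most obvious errors stand out before any line-by-line editing begins.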

     This IMMA-formatted data is being used in several ways, notably for assimilation into the Twentieth Century Reanalysis Project (20CR) and for distribution through international databases, such as the International Comprehensive Ocean-Atmosphere Data Set (ICOADS) and the International Surface Pressure Databank (ISPD). For example, this IMMA-formatted data for OldWeather3 ships will be archived as an ICOADS auxiliary dataset in the Research Data Archive (RDA) at the National Center for Atmospheric Research (NCAR) located in Boulder, Colorado. The overarching purpose of doing this work is to develop weather and climate models and to better predict and understand extreme, high-impact weather and climate phenomena on a global scale. We would be much less able to successfully accomplish this purpose without our oldWeather volunteers! So, I want to extend a huge “thank-you” to all of our volunteers for contributing so much time, effort, and dedication to this project, because it all starts with the very critical step of getting the observations and positions contained in the original ship logbooks accurately transcribed!

A better centenary

Exactly 100 years ago: at 11 a.m. Paris time on 11 November 1918, the Armistice of Compiègne came into force, and the Great War came to an end. Though there had been no major naval battles since Jutland, the Royal Navy played a vital role in hastening the end of the war by defeating the German submarine campaign and maintaining an economic blockade. A key moment leading to the armistice was the Kiel mutiny – when sailors of the German High Seas fleet refused to sortie for another battle with the Royal Navy.

We get to read our own particular view of the armistice, and the surrender and internment of the German Navy that followed, in the words of our own logbooks. The logbook editing team has been working extra hard over the last few months to make sure that there is an edited version of every one of the oldWeather logbooks from the WW1 period available on naval-history.net. You can read the characteristically terse accounts of the end of the war from a whole list of ships from HMS Almazora (‘1.5am: Signal re “Armistice” received.’) to HMS Virago (‘11.30am: Party left for thanksgiving service on board HMS Tamar’).

This November 11th we not only remember those who died in the war, we remember Gordon Smith, who did so much to lead and inspire the historical work done with our logbooks. It was Gordon’s fond wish to have the WW1 logs edited by the centenary of the armistice, so we also have something to celebrate: Congratulations to all those who have worked on transcribing and editing the logs to complete this.

The Sitka Hurricane of 1880

Two reconstructions of the ‘Sitka Hurricane’. Pressure contours before (left) and after (right) adding oldWeather-Arctic observations from USC&GS Yukon. (Details).

Sitka is in the Alaska panhandle, at 57 degrees North. So the storm that hit them on October 26th, 1880 can’t possibly have been an actual hurricane. But it was a very severe extratropical cyclone – probably stronger than any storm that has hit the Sitka region since.

Worst-on-record storms for any region are worth studying – they set a benchmark for predictions of future extreme weather, and they are a great target for attribution – can we find out just why they were so bad? Some of our colleagues, led from the University of Bern, looked at this storm in the Twentieth Century Reanalysis. Almost immediately, they hit a snag:

Because the next assimilated pressure measurements are located more than 1000 km south of Sitka, the storm cannot be found in the 20CR2c ensemble mean

That is, because there are no pressure observations from the north-east Pacific in our databases for late 1880, the reanalysis uncertainty is so large we can’t say anything much about it.

But that was pre-oldWeather Arctic – since then we’ve put many new observations from oldWeather into a major database update, and the Twentieth Century Reanalysis (20CR) team have been working night and day building a new version of their reanalysis. The resulting improvement is large – the image above shows a before-and-after reconstruction: The key is those black concentric circles – a characteristic marker of a storm in a weather map, and of course the yellow dots – those mark our new observations. The hero here is USC&GS Yukon, providing those vital observations in the north Pacific. (You can just about see the Jeannette up there in the Arctic Ocean also, but she’s too far away to have much effect on this storm).

But Wait, There’s More!

When we sent out our last batch of new observations to the climate datasets we had not completed USS Jamestown, but now we have, and the Jamestown was moored in the harbour at Sitka at the time of the storm. So we shipped those data over to the 20CR team, and they quality controlled them and managed to add them to their system just in time to include them in the final reconstruction for their new reanalysis. So we have another new reconstruction, with our Jamestown observations in too:

Two reconstructions of the ‘Sitka Hurricane’. Pressure contours from 20CR2c (left) and 20CRv3 (right) adding oldWeather-Arctic observations from USC&GS Yukon and USS Jamestown. (Details).

Adding the Jamestown strengthens and improves the storm reconstruction still further (particularly apparent in the video diagnostic).

So thanks to everyone who has worked on the Yukon, the Jamestown, and 20CRv3 – between us, we’ve created a hurricane: An iconic storm which was missing in the last reconstruction is present in the new one. The uncertainty in the reconstruction is still large, but future researchers now have something concrete to work on.

Editing the weather observations

Original oW1 observations (red dots) and as revised using the edited ship histories (smaller yellow dots). Some new observations have been added (ships for which we now have good positions), a few errors in the original observation set have been removed, and much valuable new detail has been added.

We finished transcribing the original RN WW1 oldWeather logbooks some time ago. And as soon as we could, we produced a set of climate data from those transcriptions: 1.6 million new weather observations. Those data have now been included in standard climate databases, and have produced a notable improvement in major research products.

However, we said at the time that those were preliminary data. They were useful as they were, but we hoped to improve them further in the future. In particular we struggled to get good position information from every logbook page: Generally, when they reported their latitude and longitude (’32 27 03 N’, ’24 78 08 W’) we could locate them successfully (though there are difficulties), but often the logs give their location as the name of a port or place, and tracking that down can be hard.
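
To give a flavour of the easy case, here is a minimal sketch of turning a position string like those above into decimal degrees. It assumes the three numbers are degrees, minutes and seconds followed by a hemisphere letter. The function name and the range checks are illustrative assumptions, not the project's actual software. Under this reading, an entry with 78 minutes fails the range check, which is one small example of the difficulties mentioned.

```python
import re

# Assumed layout: "DD MM SS H" with H one of N, S, E, W
_POSITION = re.compile(
    r"^\s*(\d{1,3})\s+(\d{1,2})\s+(\d{1,2})\s*([NSEW])\s*$", re.IGNORECASE
)


def parse_position(text):
    """Convert 'DD MM SS H' to signed decimal degrees.

    Returns None when the text does not match the assumed layout or when
    a component is out of range, so the entry can be checked by hand.
    """
    match = _POSITION.match(text)
    if match is None:
        return None
    degrees = int(match.group(1))
    minutes = int(match.group(2))
    seconds = int(match.group(3))
    hemisphere = match.group(4).upper()

    # Latitudes run to 90 degrees, longitudes to 180
    limit = 90 if hemisphere in "NS" else 180
    if minutes >= 60 or seconds >= 60:
        return None
    value = degrees + minutes / 60 + seconds / 3600
    if value > limit:
        return None
    # South and West are negative by the usual convention
    return -value if hemisphere in "SW" else value


if __name__ == "__main__":
    for entry in ("32 27 03 N", "24 78 08 W"):
        print(entry, "->", parse_position(entry))
```

Returning `None` rather than guessing keeps doubtful positions visible for an expert to resolve, which is the kind of case software alone handles badly.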

When we made the preliminary set of climate data we did all the data processing (to turn the log transcriptions into climate records) in software. This is fast, but not powerful enough to deal correctly with the difficult positions. It got us what we most needed (as much data as possible, now) but a substantial fraction of our 1.6 million observations were left with positions that were missing, approximate, and, occasionally, just wrong. These issues did not damage climate products using the observations, but they did restrict them. Effectively, many of our observations were not good enough to use. What we really needed was a careful, expert examination of the record for each ship, teasing out the precise route of each ship from the limited, idiosyncratic, and occasionally just wrong, information in the log. This would require two things, a group of expert analysts, and time for them to work.

One of the glories of oldWeather is the expertise and dedication of the project community, and the ship history editors, in particular, have been working through the log transcriptions, using their expertise to make edited histories, and maps, of their travels. They have not had quite enough time to finish the task – not every ship has yet been edited – but most are done, and once again we need the data now. So I have gone through the edited histories and used their position records to improve the weather data records.

This has improved our weather records a lot (see the video above). Some ships with log details that defeated the software the first time around now have good positions and can be used for the first time. More have details improved and some errors fixed. A particularly noticeable improvement is in the gunboats on rivers in China, which now show movement along the Yangtze in good detail. We still have 1.6 million records, but about 500,000 of them have received a big upgrade in their quality, and this will feed through to substantial improvements in the climate products we derive from them.

Gordon Smith

Gordon Smith, who guided the historical side of oldWeather from the beginning, died on 16 December 2016 after a long illness: he was 75.

Gordon joined oldWeather in April 2010. He was brave enough to team up with a group of scientists planning a citizen science project rescuing historical climate observations. His job was to broaden the scope of the project – to teach us to value and use the ship logbooks we were reading as historical records, not just sources of pressure and temperature observations.

Gordon was a serious scholar, the author of two books on naval history, but he also had the vision to see that writing books was no longer the best way to communicate his subject, and the courage to try something new. He founded a website (naval-history.net) and, with a group of collaborators, built it into a valuable resource for both professional and amateur historians.

The thousands of volunteers contributing to oldWeather offered a flood of new material for naval-history.net, but that material needed to be checked, collated, and edited, to be useful to researchers. Gordon dealt with this by engaging unreservedly with the volunteers reading the logbooks; advising, encouraging, and teaching anyone interested. Some of the volunteers became sufficiently expert and enthusiastic to take on the necessary editing work, and this group of new naval historians is now playing a major role in the ongoing development of naval-history.net.

The success of oldWeather as a history project has also helped our work in climate science. Expanding the project in this way has been vital in sustaining the public interest that has kept oldWeather going for six years; now 20,000 people have contributed, generating millions of additional historic weather observations for use by researchers.

Gordon was able to do all this because he was not trying to write a book, or build his personal career as a historian. Instead he was willing to build naval-history.net as a public service, and to train and support a large number of amateur historians working with him. This long period of innovative and unselfish work has not only produced a valuable historical resource, but has also been of material assistance to climate science. oldWeather is both bigger and better for his contributions, and we’ll go on building on what he started.

The four million

The contributions to classic oldWeather as they have changed with time

Contributions to classic oldWeather: Time runs from top to bottom, each descending ribbon is a different contributor, the colours mark different ships. (Bigger, printable, version).

We started 2016 with some good news from classic oldWeather – the 3-millionth transcribed weather observation from the US Government Arctic logbooks. It’s great to finish it with some more – we have just rescued our 4-millionth.

Those 4-million observations come from more than 480,000 transcribed pages, and are the work of 4,730 different people. We’ve looked before at how the work has been divided up between the participants, but to celebrate 4 million I wanted to go beyond that, and show not only what had been done, but also when.

This is oldWeather meets abstract expressionism, but the descending stripes are not abstract – each one represents one contributor: If you’ve transcribed a page for oldWeather-classic, one of them is you. Time runs down, from the project launch at the top, to the present at the bottom, and the colours distinguish work done on the different ships.

We can see most clearly the large and consistent contributions of our most committed and expert participants – truly the backbone of the project. But also the brief bursts of interest produced by newsletters and media mentions, and everything in between. Again I hope everyone is proud of their contribution – it’s taken us all to do it; and in spite of the awesome size of our achievement (4 million new observations) we are, just about, still all on the same page.

Global Warming as you’ve never seen it before

Temperature anomalies from the HadCRUT dataset: Blue regions are colder than normal, red warmer. Where the indicator is missing, we have no observations.

Working on the world’s weather observations means I spend a lot of time looking at maps. I like the equirectangular (plate carrée) projection (fills the screen nicely, latitude and longitude are all you need to know), but it does have a couple of disadvantages: Map geeks disdain it as both boring and badly distorted, and it’s hopeless for looking at the Arctic and Antarctic.

You can work around both of these problems by the technical trick of ‘rotating the pole’. There is no fundamental reason why a map has to have the North Pole at the top. If you rotate your globe so that some other point is at the top before performing the projection that turns it into a flat map, you can make a map that is still equirectangular, but looks very different, and has the Arctic (or location of your choice) in the middle. It’s no less distorted, but it is less boring, as the distortion has moved into different places.
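
The rotation itself is a small piece of geometry. As a rough sketch (not the code used to make these maps; the function name and the particular sequence of rotations are assumptions of this example), you can convert each latitude/longitude to a point on the unit sphere, turn the sphere so the chosen point sits at the top, and convert back:

```python
import math


def rotate_pole(lat, lon, pole_lat, pole_lon):
    """Return (lat, lon) in degrees in a frame whose pole is at
    (pole_lat, pole_lon) in the ordinary frame."""
    phi, lam = math.radians(lat), math.radians(lon)

    # Point on the unit sphere
    x = math.cos(phi) * math.cos(lam)
    y = math.cos(phi) * math.sin(lam)
    z = math.sin(phi)

    # Step 1: spin about the polar axis so the new pole lies at longitude 0
    a = math.radians(pole_lon)
    x1 = x * math.cos(a) + y * math.sin(a)
    y1 = -x * math.sin(a) + y * math.cos(a)
    z1 = z

    # Step 2: tip the sphere by the new pole's colatitude so it ends up on top
    b = math.radians(90.0 - pole_lat)
    x2 = x1 * math.cos(b) - z1 * math.sin(b)
    y2 = y1
    z2 = x1 * math.sin(b) + z1 * math.cos(b)

    # Back to latitude/longitude; clamp to guard against rounding error
    new_lat = math.degrees(math.asin(max(-1.0, min(1.0, z2))))
    new_lon = math.degrees(math.atan2(y2, x2))
    return new_lat, new_lon


if __name__ == "__main__":
    # A point used as the new pole should land at latitude 90
    print(rotate_pole(57.0, -135.0, 57.0, -135.0))
```

The rotated coordinates can then be drawn on an ordinary equirectangular grid. Different tools use different longitude conventions for the rotated frame, so the longitudes here may differ by a constant offset from those of other software.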

HadCRUT is a global temperature monitoring dataset. We use it to keep track of global warming, amongst other purposes. It combines thermometer observations, from ships and land weather stations, to make estimates of temperature change month-by-month back to 1850. The sea-temperature observations we are rescuing in oldWeather will be used to improve HadCRUT.

HadCRUT is constructed on a regular grid on a conventional equirectangular map. Looking at it on a map with a rotated (and rotating) pole gives a fresh look at what we know about global temperature change (and a sharp reminder of the problems with map projections). I like this visualisation because not only does the changing observation coverage show the same sort of historical effects we’ve already seen in the pressure observations, but it illustrates what we know and what we don’t about past temperature: The growing global warming is unmistakable in the last few decades, in spite of the large regional variability and observational uncertainties, but smaller-scale changes, further back in time, can still have large uncertainty – new observations could make a big difference.

Free at last

We highlight improvements to the International Comprehensive Ocean-Atmosphere Data Set (ICOADS) in the latest Release 3.0 (R3.0; covering 1662–2014). ICOADS is the most widely used freely available collection of surface marine observations, providing data for the construction of gridded analyses of sea surface temperature, estimates of air–sea interaction and other meteorological variables. ICOADS observations are assimilated into all major atmospheric, oceanic and coupled reanalyses, further widening its impact. R3.0 therefore includes changes designed to enable effective exchange of information describing data quality between ICOADS, reanalysis centres, data set developers, scientists and the public. These user-driven innovations include the assignment of a unique identifier (UID) to each marine report – to enable tracing of observations, linking with reports and improved data sharing. Other revisions and extensions of the ICOADS’ International Maritime Meteorological Archive common data format incorporate new near-surface oceanographic data elements and cloud parameters. Many new input data sources have been assembled, and updates and improvements to existing data sources, or removal of erroneous data, made. Coupled with enhanced ‘preliminary’ monthly data and product extensions past 2014, R3.0 provides improved support of climate assessment and monitoring, reanalyses and near-real-time applications.

Sounds exciting, doesn’t it? Well, it’s even more exciting than it sounds, because that’s the abstract of Freeman, E., Woodruff, S. D., Worley, S. J., Lubker, S. J., Kent, E. C., Angel, W. E., Berry, D. I., Brohan, P., Eastman, R., Gates, L., Gloeden, W., Ji, Z., Lawrimore, J., Rayner, N. A., Rosenhagen, G. and Smith, S. R. (2016), ICOADS Release 3.0: a major update to the historical marine climate record. Int. J. Climatol. doi:10.1002/joc.4775, which has just hit the scientific literature, and which describes the latest version of the International Comprehensive Ocean-Atmosphere Data Set, the collection of marine weather data.

Everything about oldWeather has been free from the start: our ambition has always been to make the information in our logbooks openly available for anyone to use, and our results already have seen use in datasets, reanalyses, historical and personal projects, … But whenever anyone asks me “What are you doing with the results of this project?”, I’ve always answered “We’re going to put the new observations in ICOADS – to make them available for all future uses, in climate and other fields”. With ICOADS R3.0, we have finally achieved this: ICOADS is internally arranged in ‘decks’ (a reminder that data collection is older than the digital computer) – it now includes decks 249 “World War I (WW1) UK Royal Navy Logbooks” and 710 “US Arctic Logbooks” – clearly illustrated in figure 1.

I’ve been working as a scientist for a while now, but the publication of a new paper is still something of an event. Scientific papers come in many forms: some describe wild new ideas, brave experiments, or dramatic breakthroughs – this one is nothing like that; it just reports the work of many people, over several years, scavenging observations from wherever we can find them, systematising them, quality controlling them, analysing them, and now releasing them. It makes up for its lack of drama by being useful – the surface marine record is one of the most widely used datasets in all of climate: our observations are used directly in the monitoring datasets that measure climate change, they are assimilated in all reanalyses, they provide boundary conditions and validation for the models we use for predicting future change, and they provide calibration to palaeoclimate reconstructions of the deep past.

So from now on, if you’ve contributed to oldWeather, keep an eye out for any new climate results. Whether it’s a new global temperature record, a prediction of climate for the next generation, a study of changes in flood or drought risk, a government report on climate impacts and adaptation, … anything really. When you see it, square your shoulders and stand a little taller – that result owes something to you.

Better bad weather with oldWeather

A spaghetti-contour plot of a storm in 1916 as reconstructed by the 20th Century reanalysis. On the left, without oldWeather observations, on the right with the oldWeather ships added. The yellow dots mark observations used in the reconstruction – the additional yellow dots in the right image are from Royal Navy ship logs transcribed by oldWeather.

Next week sees an important event in the calendar of the observational climatology community: the ninth annual meeting of the Atmospheric Circulation Reconstructions over the Earth (ACRE) project, in Maynooth, Ireland. I’ll be there to talk about the new knowledge we are generating with oldWeather, and I thought I’d share a sneak preview here.

The picture above is a pair of contour plots: on a map contour lines mark places with the same height and they are used to show the shape and size of hills. Here the lines mark places with the same atmospheric pressure, and they show the size and shape of a valley in the atmosphere – an atmospheric depression – a storm. The picture is messy because this is also a spaghetti plot – I have 56 different maps of the same storm (the individual ensemble members of the 20th century reanalysis) and I’ve drawn all 56 in the same image.

In an ideal world we’d know exactly the size and shape of this storm, so all 56 maps would be exactly the same and the plot would show pin-sharp simple contours. We’re not there yet, but adding the oldWeather observations to the reconstruction has made things a lot better – our map of this storm is much more precise than it was before.
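
One simple way to put a number on “more precise” is the spread of the ensemble members at each point on the map: if every member agreed, the spread would be zero. The sketch below is only an illustration of that idea, with a hypothetical function name and made-up pressure values. It is not how the reanalysis team measures uncertainty.

```python
from statistics import pstdev


def ensemble_spread(members):
    """Standard deviation across ensemble members at each grid point.

    `members` is a list with one entry per ensemble member; each entry is
    a list of pressure values at the same grid points, in the same order.
    """
    n_points = len(members[0])
    if any(len(member) != n_points for member in members):
        raise ValueError("all ensemble members must cover the same grid points")
    return [pstdev([member[i] for member in members]) for i in range(n_points)]


if __name__ == "__main__":
    # Made-up values (hPa): three members, two grid points
    members = [
        [1000.0, 990.0],
        [1000.0, 994.0],
        [1000.0, 992.0],
    ]
    print(ensemble_spread(members))
```

In a spaghetti plot, a small spread shows up as contours lying nearly on top of each other, and a large spread as a tangle.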

Too windy for Zeppelins

100 years ago today, on 31st May 1916, a major fleet action began between the British and German navies: the Battle of Jutland.

This is right in the middle of the period covered by the original oldWeather project, so you’d think we had all the logbooks and observations, at least from the British half of the battle, but alas, it’s not so. The Grand Fleet sounds impressive, and with as many as 40 major warships surely was impressive, but it didn’t travel much: The doctrine of ‘Fleet in being’ means that all those battleships stayed in port as a threatening influence rather than travelling to distant locations, and that puts them right at the bottom of our priority list for transcription, and we’ve never looked at them.

So we don’t have the Grand Fleet, but we can still reconstruct the weather of the battle, and our observations still make a major contribution to the reconstruction:

Weather (contours are pressure, streamlines show wind and temperature), as reconstructed by 20CR version 2c, for the battle of Jutland. Dots mark observations used to make the reanalysis - red dots are observations from oldWeather, yellow dots are observations from other sources

Plenty of our ships are contributing to that weather reconstruction – in port on the West coast, on patrol or convoy duty in the North Atlantic, and they are doing a good job describing the dominant weather feature – the low pressure moving into the Norwegian Sea. But the North Sea around Jutland is pretty bare of observations – two major fleets, and we have almost nothing from either of them. The reason we don’t have them is that we don’t need them – the European weather stations give us a pretty good picture of the weather anyway, and the ships were only out of port for two days, so they wouldn’t be a huge asset to science. It’s still a pity from the historical perspective, though, and we’ll keep an eye out for a future opportunity, both in the UK and in Germany.

The weather did, apparently, play a part in the battle – strong winds grounded the Zeppelin fleet that would otherwise have been scouting and bombing on 31st May – and it was one of those Zeppelins that engaged (on June 1st) the only member of our fleet to have participated in the action at all: HMS Fearless was not big enough to mix it with the battleships and battlecruisers; but she was there, and you can read the story we rescued from her logs at naval-history.net (and see the observations she made in the video above).

The battle had no winners – 9,823 men died, and 25 ships were sunk, including one (HMS Invincible) whose story helped inspire the start of oldWeather.