Archive | Data

Editing the weather observations

Original oW1 observations (red dots) and as revised using the edited ship histories (smaller yellow dots). Some new observations have been added (ships for which we now have good positions), a few errors in the original observation set have been removed, and much valuable new detail has been added.

We finished transcribing the original RN WW1 oldWeather logbooks some time ago. And as soon as we could, we produced a set of climate data from those transcriptions: 1.6 million new weather observations. Those data have now been included in standard climate databases, and produced a notable improvement in major research products.

However, we said at the time that those were preliminary data. They were useful as they were, but we hoped to improve them further in the future. In particular we struggled to get good position information from every logbook page: Generally, when they reported their latitude and longitude (’32 27 03 N’, ’24 78 08 W’) we could locate them successfully (though there are difficulties), but often the logs give their location as the name of a port or place, and tracking that down can be hard.
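To make the difference between the easy and the hard cases concrete, here is a minimal sketch of the kind of check software can apply to a transcribed position. It is not the oldWeather processing code: the function name `parse_dms` is hypothetical, and it assumes the three numbers are whole degrees, minutes and seconds followed by a hemisphere letter. Anything that does not fit that pattern, or has an out-of-range field, is handed back as `None` for a person to look at.

```python
import re

# Hypothetical helper for illustration; not the oldWeather processing code.
# Assumes fields are whole degrees, minutes, seconds, then a hemisphere letter.
_DMS = re.compile(
    r"^\s*(\d{1,3})\s+(\d{1,2})\s+(\d{1,2})\s*([NSEW])\s*$", re.IGNORECASE
)


def parse_dms(text):
    """Return signed decimal degrees, or None if the text is not a usable position."""
    match = _DMS.match(text)
    if match is None:
        return None  # e.g. a port or place name instead of coordinates
    degrees, minutes, seconds = (int(part) for part in match.groups()[:3])
    hemisphere = match.group(4).upper()
    limit = 90 if hemisphere in "NS" else 180
    if minutes >= 60 or seconds >= 60:
        return None  # out-of-range field: needs a human to check the page
    value = degrees + minutes / 60 + seconds / 3600
    if value > limit:
        return None
    # South and west are negative by the usual convention.
    return -value if hemisphere in "SW" else value


print(parse_dms("32 27 03 N"))
print(parse_dms("24 78 08 W"))
print(parse_dms("Scapa Flow"))
```

Under these assumptions the first string comes out at about 32.45 degrees north, while the second (78 in the minutes field) and the third (a place name, used here only as an example) are both returned as `None`. Cases like those are exactly the ones that need more than a simple pattern match.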

When we made the preliminary set of climate data we did all the data processing (to turn the log transcriptions into climate records) in software. This is fast, but not powerful enough to deal correctly with the difficult positions. It got us what we most needed (as much data as possible, now) but a substantial fraction of our 1.6 million observations were left with positions that were missing, approximate, and, occasionally, just wrong. These issues did not damage climate products using the observations, but they did restrict them. Effectively, many of our observations were not good enough to use. What we really needed was a careful, expert examination of the record for each ship, teasing out the precise route of each ship from the limited, idiosyncratic, and occasionally just wrong, information in the log. This would require two things: a group of expert analysts, and time for them to work.

One of the glories of oldWeather is the expertise and dedication of the project community, and the ship history editors, in particular, have been working through the log transcriptions, using their expertise to make edited histories, and maps, of the ships’ travels. They have not had quite enough time to finish the task – not every ship has yet been edited – but most are done, and once again we need the data now. So I have gone through the edited histories and used their position records to improve the weather data records.

This has improved our weather records a lot (see the video above). Some ships with log details that defeated the software the first time around now have good positions and can be used for the first time. More have details improved and some errors fixed. A particularly noticeable improvement is in the gunboats on rivers in China, which now show movement along the Yangtze in good detail. We still have 1.6 million records, but about 500,000 of them have received a big upgrade in their quality, and this will feed through to substantial improvements in the climate products we derive from them.

The four million

The contributions to classic oldWeather as they have changed with time

Contributions to classic oldWeather: Time runs from top to bottom, each descending ribbon is a different contributor, the colours mark different ships. (Bigger, printable, version).

We started 2016 with some good news from classic oldWeather – the 3-millionth transcribed weather observation from the US Government Arctic logbooks. It’s great to finish it with some more – we have just rescued our 4-millionth.

Those 4-million observations come from more than 480,000 transcribed pages, and are the work of 4,730 different people. We’ve looked before at how the work has been divided up between the participants, but to celebrate 4 million I wanted to go beyond that, and show not only what had been done, but also when.

This is oldWeather meets abstract expressionism, but the descending stripes are not abstract – each one represents one contributor: If you’ve transcribed a page for oldWeather-classic, one of them is you. Time runs down, from the project launch at the top, to the present at the bottom, and the colours distinguish work done on the different ships.

We can see most clearly the large and consistent contributions of our most committed and expert participants – truly the backbone of the project. But also the brief bursts of interest produced by newsletters and media mentions, and everything in between. Again I hope everyone is proud of their contribution – it’s taken us all to do it; and in spite of the awesome size of our achievement (4 million new observations) we are, just about, still all on the same page.

Global Warming as you’ve never seen it before

Temperature anomalies from the HadCRUT dataset: Blue regions are colder than normal, red warmer. Where the indicator is missing, we have no observations.

Working on the world’s weather observations means I spend a lot of time looking at maps. I like the equirectangular (plate carrée) projection (fills the screen nicely, latitude and longitude are all you need to know), but it does have a couple of disadvantages: Map geeks disdain it as both boring and badly distorted, and it’s hopeless for looking at the Arctic and Antarctic.

You can work around both of these problems by the technical trick of ‘rotating the pole’. There is no fundamental reason why a map has to have the North Pole at the top. If you rotate your globe so that some other point is at the top before performing the projection that turns it into a flat map, you can make a map that is still equirectangular, but looks very different, and has the Arctic (or location of your choice) in the middle. It’s no less distorted, but it is less boring, as the distortion has moved into different places.
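For anyone who wants to try the trick, here is one way the rotation might be sketched. It is an illustration rather than the code behind the video: the function name `rotate_pole` is hypothetical, and the choice of where the rotated longitude starts is just one of several conventions in use. The idea is to treat each point as a unit vector, turn the globe about its axis until the chosen pole sits on the prime meridian, then tip it until that pole is at the top.

```python
import math


def rotate_pole(lat, lon, pole_lat, pole_lon):
    """Coordinates (degrees) of a point on a globe turned so that
    (pole_lat, pole_lon) sits at the top. Hypothetical helper."""
    phi, lam = math.radians(lat), math.radians(lon)
    phi_p, lam_p = math.radians(pole_lat), math.radians(pole_lon)

    # The point as a unit vector.
    x = math.cos(phi) * math.cos(lam)
    y = math.cos(phi) * math.sin(lam)
    z = math.sin(phi)

    # Turn about the polar axis so the new pole lies on the prime meridian.
    x1 = x * math.cos(lam_p) + y * math.sin(lam_p)
    y1 = -x * math.sin(lam_p) + y * math.cos(lam_p)

    # Tip the globe so the new pole moves to the top.
    x2 = x1 * math.sin(phi_p) - z * math.cos(phi_p)
    z2 = x1 * math.cos(phi_p) + z * math.sin(phi_p)

    # Clamp before asin to guard against rounding just outside [-1, 1].
    new_lat = math.degrees(math.asin(max(-1.0, min(1.0, z2))))
    new_lon = math.degrees(math.atan2(y1, x2))
    return new_lat, new_lon


# With the pole left where it is, nothing moves.
print(rotate_pole(10.0, 20.0, 90.0, 0.0))
```

Plotting the rotated longitude against the rotated latitude then gives an equirectangular map again, just one drawn on the tilted globe. Changing `pole_lat` and `pole_lon` a little between frames is one way to get a pole that rotates as well.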

HadCRUT is a global temperature monitoring dataset. We use it to keep track of global warming, amongst other purposes. It combines thermometer observations, from ships and land weather stations, to make estimates of temperature change month-by-month back to 1850. The sea-temperature observations we are rescuing in oldWeather will be used to improve HadCRUT.

HadCRUT is constructed on a regular grid on a conventional equirectangular map. Looking at it on a map with a rotated (and rotating) pole gives a fresh look at what we know about global temperature change (and a sharp reminder of the problems with map projections). I like this visualisation because not only does the changing observation coverage show the same sort of historical effects we’ve already seen in the pressure observations, but it illustrates what we know and what we don’t about past temperature: The growing global warming is unmistakable in the last few decades, in spite of the large regional variability and observational uncertainties, but smaller-scale changes, further back in time, can still have large uncertainty – new observations could make a big difference.

Free at last

We highlight improvements to the International Comprehensive Ocean-Atmosphere Data Set (ICOADS) in the latest Release 3.0 (R3.0; covering 1662–2014). ICOADS is the most widely used freely available collection of surface marine observations, providing data for the construction of gridded analyses of sea surface temperature, estimates of air–sea interaction and other meteorological variables. ICOADS observations are assimilated into all major atmospheric, oceanic and coupled reanalyses, further widening its impact. R3.0 therefore includes changes designed to enable effective exchange of information describing data quality between ICOADS, reanalysis centres, data set developers, scientists and the public. These user-driven innovations include the assignment of a unique identifier (UID) to each marine report – to enable tracing of observations, linking with reports and improved data sharing. Other revisions and extensions of the ICOADS’ International Maritime Meteorological Archive common data format incorporate new near-surface oceanographic data elements and cloud parameters. Many new input data sources have been assembled, and updates and improvements to existing data sources, or removal of erroneous data, made. Coupled with enhanced ‘preliminary’ monthly data and product extensions past 2014, R3.0 provides improved support of climate assessment and monitoring, reanalyses and near-real-time applications.

Sounds exciting, doesn’t it? Well, it’s even more exciting than it sounds, because that’s the abstract of Freeman, E., Woodruff, S. D., Worley, S. J., Lubker, S. J., Kent, E. C., Angel, W. E., Berry, D. I., Brohan, P., Eastman, R., Gates, L., Gloeden, W., Ji, Z., Lawrimore, J., Rayner, N. A., Rosenhagen, G. and Smith, S. R. (2016), ICOADS Release 3.0: a major update to the historical marine climate record. Int. J. Climatol. doi:10.1002/joc.4775 which has just hit the scientific literature, and which describes the latest version of the International Comprehensive Ocean-Atmosphere DataSet, the collection of marine weather data.

Everything about oldWeather has been free from the start: our ambition has always been to make the information in our logbooks openly available for anyone to use, and our results already have seen use in datasets, reanalyses, historical and personal projects, … But whenever anyone asks me “What are you doing with the results of this project?”, I’ve always answered “We’re going to put the new observations in ICOADS – to make them available for all future uses, in climate and other fields”. With ICOADS R3.0, we have finally achieved this: ICOADS is internally arranged in ‘decks’ (a reminder that data collection is older than the digital computer) – it now includes decks 249 “World War I (WW1) UK Royal Navy Logbooks” and 710 “US Arctic Logbooks” – clearly illustrated in figure 1.

I’ve been working as a scientist for a while now, but the publication of a new paper is still something of an event. Scientific papers come in many forms: some describe wild new ideas, brave experiments, or dramatic breakthroughs – this one is nothing like that; it just reports the work of many people, over several years, scavenging observations from wherever we can find them, systematising them, quality controlling them, analysing them, and now releasing them. It makes up for its lack of drama by being useful – the surface marine record is one of the most widely used datasets in all of climate: our observations are used directly in the monitoring datasets that measure climate change, they are assimilated in all reanalyses, they provide boundary conditions and validation for the models we use for predicting future change, and they provide calibration to palaeoclimate reconstructions of the deep past.

So from now on, if you’ve contributed to oldWeather, keep an eye out for any new climate results. Whether it’s a new global temperature record, a prediction of climate for the next generation, a study of changes in flood or drought risk, a government report on climate impacts and adaptation, … anything really. When you see it, square your shoulders and stand a little taller – that result owes something to you.

neōn katalogos

It’s not all about the shiny and the new – we should appreciate, also, the virtues of the classics: In particular classic oldWeather, our original and ongoing project to rescue data from the US Government Arctic logbooks, which has now transcribed more than three million (3,000,000) weather observations.

“All the contributors I could not tell nor name, nay, not though ten tongues were mine and ten mouths and a voice unwearying, but now I will tell the leaders of the ships and the ships in their order:”

  • Of the Albatross (1884); leelhat and Hanibal94 were captains, with steeleye and jd570b and Zovacor, with 569 more. They brought 150,734 weather observations, rich in pressures, temperatures, and wind directions.
  • Of the Albatross (1890); hurlock and Ravendrop were captains, with p3nguin53 and listritz and 1049 more. They brought 62,931 weather observations.
  • Of the Albatross (1900); Danny252, hurlock and pommystuart were captains, with HHTime, JanetET-S and wendolk with 482 more. They brought 57,991 weather observations.
  • Of the Bear, veteran of many campaigns; lollia paolina, gastcra and Hanibal94 were captains, with DennisO, jil and pommystuart, with 410 more. They brought 349,015 weather observations.
  • Of the Concord; pommystuart and gastcra were captains, with Hanibal94 and MAPurves, and 1207 more. They brought 380,191 weather observations.
  • Of the Corwin; gastcra, pommystuart and lollia paolina were captains, with but 24 more. They brought 9,588 weather observations.
  • Of the Jamestown (1844); kimma001 was captain, with gastcra and Zovacor and 92 more. They brought 83,533 weather observations.
  • Of the Jamestown (1866); leelhat, Hanibal94 and kimma001 were captains, with 445 more. They brought 128,922 weather observations.
  • Of the Jamestown (1879); lollia paolina was captain, with gastcra, LouisaEvers, smith7748 and 475 more. They brought 93,696 weather observations.
  • Of the Jamestown (1886); leelhat was captain, with lollia paolina with 385 more. They brought 82,624 weather observations.
  • Of the Jeannette; gastcra, Clewi and jil were captains, with 67 more. They brought 42,982 weather observations and much knowledge of the ice.
  • Of the Patterson; Hanibal94, gastcra and asterix135 were captains, with helenj, avastmh and 101 more. They brought 334,146 weather observations.
  • Of the Perry; leelhat and Hanibal94 were captains, with exim_202, elizabeth_s, and rbertin1068, with 427 more. They brought 7,352 weather observations.
  • Of the Pioneer; Hanibal94 was captain, with gastcra and helenj and 86 more. They sought out 182,586 weather observations.
  • Of the Rodgers; leelhat was captain, with Hanibal94, avastmh and 50 more. They saved 19,718 weather observations from the fire.
  • Of the Rush; lollia paolina was captain, with leelhat and researchib with 368 more. They carried 25,174 weather observations.
  • Of the Thetis; lollia paolina was captain, with jil, leelhat, KookyBird and 716 more. They brought 220,493 weather observations.
  • Of the first Unalga; Hanibal94 and propriome were captains, with gastcra and Caro, with 92 more. They brought 136,001 weather observations.
  • Of the second Unalga; Hanibal94 was captain, with gastcra, Caro, and 36 more. They brought 10,395 weather observations.
  • Of the Vicksburg; leelhat and lollia paolina were captains, with 393 more. They brought 357,525 weather observations.
  • Of the Yorktown; Lekiam and lollia paolina were captains, with gastcra with 737 more. They brought 279,546 weather observations.
  • Of the Yukon; gastcra and Hanibal94 were captains, with 80 more. They brought 31,111 weather observations.

Collectively awesome

We launched the new site a month ago, which means that the volunteers using the site have provided quite a bit of new data, and we can start to analyse it. This is one of my favourite moments in any project – first blood, when we get the initial sense of what we’ve got, how it’s going to work, what we can learn from it.

One of the golden rules of statistical analysis is “first plot the data” – always start by making a simple visualisation, so you can be sure you understand what you’ve got, and you’re not missing anything obvious. But the oldWeather data is not easy to plot: the database contains records from hundreds of people making thousands of annotations on dozens of different logbook pages; what, exactly, should we look at?

So I’ve taken inspiration from Listen to Wikipedia, and asked ‘what would it look like if we could see (and hear) the data as it came in – in (accelerated) real time?’ The video below shows every contribution to the site over a three-hour period on December 3rd 2015. The number of pages shown is the number of volunteers contributing at each point in time. Each box drawn, and sound played, is one annotation, a contribution to the project. Blue boxes contain weather data, yellow boxes ship positions, orange boxes dates, and red boxes other events; pages that have moved on to the transcription phase have grey boxes.

Listen to oldWeather: December 3rd 2015.

December 3rd was when we launched the new site, so we can see a large change in the number of people participating as they learn about the launch. It’s instantly clear that it’s working – we are collecting annotations and transcriptions in quantity, as we hoped. There is much to be learned from careful examination of visualisations like this, but mostly I think it shows the power of the project – the awesome capability of collective public science.

Note that this shows only data from the new Panoptes version of oldWeather. It does not include data from the whaling site – I’m still working on that.

Mentioned in despatches

We're in there: Figure 9 from Cram et al. 2015.

oldWeather does not produce that many research papers: The process of science is changing fast at the moment – moving away from individual researchers and small groups working independently, and towards much larger consortia working together with big datasets on major problems – fewer small papers, more big science. As a major data provider, closely linked into the international climate research community, we fit nicely into this new model.

But the academic papers industry is still out there, and now and again one appears that involves us directly. One of the major datasets we contribute to is the International Surface Pressure Databank, and Tom Cram and colleagues have a new paper about the data (including our contributions) and how it’s assembled and distributed through the excellent Research Data Archive at NCAR. It’s open access – free for all to read (here’s the link) – why not check out the oldWeather references and see exactly how our results are being used by researchers.

News from the U.S. National Archives: Over a Half Million Digitized Logbook Pages

April is here and cherry trees in Washington, DC, are in peak bloom this week, with hundreds of thousands of white and pink flower petals gently swaying in the breeze. At the National Archives in Washington, DC, we have something else in abundance. I am pleased to announce that on April 1st the Old Weather imaging team at the National Archives reached a new milestone: we have imaged over a half million logbook pages of the U.S. Navy, Coast Guard, and Coast and Geodetic Survey vessels since the start of the Old Weather project in the United States in 2012. Our thanks go out to all of our current and former imagers and our collaborative partners, including the funding organizations that enabled us to carry out the digitization work, and especially Mark Mollan at the National Archives who provided the imaging team with 1,026 boxes and volumes of logbooks to image over the last 2½ years.

The growing collection of newer images is organized under five main themed categories and further divided into yearly voyages. You will not see them on the list for transcription just yet, as we are currently revising the software to make transcription easier and support new logbook formats. But come the summer there will be many new ships and voyages to explore.

Happy transcribing to all.


At the National Maritime Museum, London, April 15th

On April 15th (2015) the (UK) National Maritime Museum is hosting a special one-day seminar organised by the (UK) Royal Meteorological Society. The meeting is in honour of the remarkable Matthew Fontaine Maury (1806-1873), who established the value of marine weather observations for scientific research.

The meeting covers everything from using the very earliest records to make circulation indices, to modern satellite observations. The speakers include several members of the oldWeather science team, and one of the talks is about the leading current method of recovering marine observations: I’m talking about oldWeather at 14:40. (Full agenda).

It’s an open meeting – all are welcome.

Crossovers (3)

What links weather observations, citizen science, open data, the Met Office, contributions to climate science …?

It does all sound rather familiar, but actually I was thinking of The Secret Life of a Weather Datum – a project from the Arts and Humanities Research Council’s Digital Transformations theme, led by the University of Sheffield and the University for the Creative Arts.

So while we are mostly interested in the effect of newly-recovered weather observations on our understanding of physical climate variability, The Secret Life are interested in the cultural effect of the same data as it flows through people, systems, and organisations – including oldWeather.

Their website launched on March 16th. Joan Arthur and Helen J. contributed interviews as representatives of the oldWeather participants, and Joan was also at the launch event to give a talk.

Thanks to @LifeOfData, @PaulaGoodale, and our own Joan and Helen for this citizen science/weather/open data crossover.