There’s a lot of history hiding in even purely scientific datasets. This movie shows just the locations of the 1.4 billion observations in the International Surface Pressure Databank (1851-2008), and in it I think I can see:
- The constraints on sailing-ship trade routes imposed by the global wind fields.
- The transition from sail to steam in shipping (late nineteenth century).
- The opening of the Suez canal in 1869 (01:30).
- The famous Arctic voyage of Nansen’s Fram (03:20).
- The heroic age of Antarctic exploration (starting at about 04:00).
- The opening of the Panama canal in 1914 (05:10).
- The first world war (05:10).
- The second world war (07:00).
- Major administrative changes in India (08:00).
- The introduction of drifting buoys in 1978 (10:20).
- And, sadly, a reduction in observations coverage in the last couple of decades as participation in the Voluntary Observing Fleet declines.
Of course these observations are not all that were made. Many more historical observations exist (on paper, or in restricted-access collections), but these are the ones that are currently available to science. The process of rescuing the observations has also left its mark on the coverage – including right at the beginning of the video, where the coverage of ship observations falls sharply in 1863, at the end of Matthew Fontaine Maury’s pioneering data collection work. Various subsequent rises and falls in coverage result from the work of many other scientists and teams, including, of course, a large group of Royal Navy ship observations in the period around the First World War (starting at about 05:00), clearly distinguishable just from their locations, as naval ships move in a quite different pattern from commercial shipping. (Our US Arctic ships are not in this database yet – they will be in the next version.)
We start 2015 with another big achievement – we’ve completed all 14 years of records (1922-1935) from the Pioneer. That’s more than 60,000 new weather observations from nearly 11,000 logbook pages. Congratulations to the lightning-fingered captain Hanibal, lieutenants gastcra, helenj, jill, pommystuart, and the 84 other crew, on another tremendous piece of work.
As a Coast and Geodetic Survey ship, Pioneer behaves quite differently from our earlier vessels – rarely venturing out into the open ocean, and often staying in one place for extended periods. Possibly as a result, her logs rarely include latitude and longitude positions, giving instead the current port or land location. This does make tracking her more difficult, but with our excellent database of port and place locations, and a little effort, we can estimate a good location for almost every day.
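As an aside, a minimal sketch of that kind of lookup is below – the gazetteer entries and the helper function are made up for illustration, not taken from the project’s actual place database:

```python
# Hypothetical gazetteer of port name -> (latitude, longitude).
# These entries are illustrative only, not the oldWeather place database.
PORT_POSITIONS = {
    "San Francisco": (37.79, -122.46),
    "Seattle": (47.60, -122.34),
    "Honolulu": (21.31, -157.87),
}

def position_for_day(logged_place, gazetteer=PORT_POSITIONS):
    """Estimate a (lat, lon) for a day from the port named in the log."""
    return gazetteer.get(logged_place)

print(position_for_day("Honolulu"))  # (21.31, -157.87)
```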
Today is the last Thursday in November, and our friends in the U.S.A. are celebrating Thanksgiving. This festival has not caught on here in the UK, so I’m spared the turkey, and the pumpkin pie.
But I do know about being thankful, and today I’m particularly thankful for the 19,683 people who have transcribed at least one logbook page for oldWeather. Every one of them has made a contribution, from those who visited only once to those who have done thousands of pages and help guide and drive the project and its community. I’m proud to count them all as co-investigators.
So it’s an appropriate day to release a revised version of the project credits reel:
The Met Office, where I work, has just finalised an agreement to buy a new supercomputer. This isn’t that rare an event – you can’t do serious weather forecasting without a supercomputer and, just like everyday computers, they need replacing every few years as their technology advances. But this one’s a big-un, and the news reminded me of the importance of high-performance computing, even to observational projects like oldWeather.
To stand tall and proud in the world of supercomputing, you need an entry in the Top500: This is a list, in rank order, of the biggest and fastest computers in the world. These machines are frighteningly powerful and expensive, and a few of them have turned part of their power to using the oldWeather observations:
- Currently at number 34 in the world is Hopper: a Cray XE6 at the US National Energy Research Scientific Computing Center (NERSC). Hopper is the main computing engine for the current developments of the Twentieth Century Reanalysis (20CR).
- At numbers 60 and 61 in the list are the pair of IBM Power775s (1,2) which used to support the European Centre for Medium-Range Weather Forecasts (ECMWF). Operational centres, like ECMWF, tend to buy supercomputers in pairs so they can keep working even if one system needs repair or maintenance – we have to issue weather forecasts every day, and we can’t just stop while we fix the computer. These two machines were used to produce ERA-20C.
Two other machines have not used our observations yet (except for occasional tests), but are gearing up to do so in the near future:
- At number 18 in the world is Edison: NERSC’s latest supercomputer, a Cray XC30.
- At number 64 is Gaea C2 – the US National Oceanic and Atmospheric Administration (NOAA) supercomputer at Oak Ridge.
My personal favourite, though, is none of these: Carver is not one of the really big boys. An IBM iDataPlex with only 9,984 processor cores, it ranked at 322 in the list when it was new, in 2010, and has since fallen off the Top500 altogether; overtaken by newer and bigger machines. It still has the processing power of something like 5000 modern PCs though, and shares in NERSC’s excellent technical infrastructure and expert staff. I use Carver to analyse the millions of weather observations and terabytes of weather reconstructions we are generating – almost all of the videos that regularly appear here were created on it.
The collective power of these systems is awe-inspiring. One of the most exciting aspects of working on weather and climate is that we can work (through collaborators) right at the forefront of technical and scientific capability.
But although we need these leading-edge systems to reconstruct past weather, they are helpless without the observations we provide. All these computers together could not read a single logbook page, let alone interpret the contents; the singularity is not that close; we’re still, fundamentally, a people project.
Today is the fourth birthday of oldWeather, and it’s almost two years since we started work on the Arctic voyages. So it’s a good time to illustrate some more of what we’ve achieved:
I’m looking at the moment at the Arctic ships we’ve finished: Bear, Corwin, Jeannette, Manning, Rush, Rodgers, Unalga II, and Yukon have each had all of their logbook pages read by three people; so it’s time to add their records to the global climate databases and start using them in weather reconstructions. From them we have recovered 43 ship-years of hourly observations – more than 125,000 observations concentrating on the marginal sea-ice zones in Baffin Bay and the Bering Strait – an enormous addition to our observational records.
The video above shows the movements of this fleet (compressed into a single year). They may occasionally choose to winter in San Pedro or Honolulu, but every summer they are back up against the ice – making observations exactly where we want them most.
So in our last two years of work, we’ve completed the recovery of 43 ship-years of logbooks, and actually we’ve done much more than that: the eight completed ships shown here make up only about 25% of the 1.5 million transcriptions we’ve done so far. So this group is only a taster – there is three times as much material again already in the pipeline.
Sometimes there is just no word powerful enough to describe the achievements of oldWeather.
Back in March we reached a million, and since then we’ve powered on from that milestone, now having added an additional five hundred thousand observations to our tally. That’s two new observations every minute, night and day, 7 days a week: Come rain or shine; snow or sleet; ice, fire, or fog.
As I’ve mentioned previously, last Thursday I was the warm-up man for Charles Darwin and Robert Fitzroy (finally, a job truly worthy of oldWeather) – I was giving a talk about the project at the Progress Theatre in Reading.
HMS Beagle isn’t (yet) one of our ships; the observations from her 1831-6 circumnavigation had been rescued before oldWeather started. But I could use what I’ve learned from analysing the oldWeather observations to show the route of the ship, the weather they experienced, and the effect of their observations on our reanalyses for the period.
The answer, as we know, is 42 – but does that mean that it’s exactly 42; or somewhere between 41.5 and 42.5; or is 42 just a ball-park estimate, and the answer could actually be, say, 37?
The value of science is its power to generate new knowledge about the world, but a key part of the scientific approach is that we care almost as much about estimating the accuracy of our new knowledge as about the new knowledge itself. This is certainly my own experience: I must have spent more time calculating how wrong I could be – estimating uncertainty ranges on my results – than on anything else.
One reason I like working with the 20th Century Reanalysis (20CR) is that it comes with uncertainty ranges for all of its results. It achieves this by being an ensemble analysis – everything is calculated 56 times, and the mean of the 56 estimates is the best estimate of the answer, while their standard deviation provides an uncertainty range. This uncertainty range is the basis for our calculation of the ‘fog of ignorance’.
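In code terms that ensemble calculation is very simple. The sketch below is mine, not 20CR’s production code: it assumes the 56 members of a pressure field are stacked along the first axis of a NumPy array, and fills the array with random numbers just so it runs end to end.

```python
import numpy as np

# Illustrative ensemble: 56 members of a (latitude, longitude) pressure field.
# Random numbers stand in for real 20CR output so the sketch is self-contained.
ensemble = np.random.randn(56, 91, 180)

best_estimate = ensemble.mean(axis=0)        # ensemble mean: best estimate
uncertainty = ensemble.std(axis=0, ddof=1)   # ensemble spread: uncertainty range

print(best_estimate.shape, uncertainty.shape)  # (91, 180) (91, 180)
```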
We are testing the effects of the new oldWeather observations on 20CR – by doing parallel experiments reconstructing the weather with and without the new observations. We have definitely produced a substantial improvement, but to say exactly how much of an improvement, where, and when, requires careful attention to the uncertainty in the reconstructions. In principle it’s not that hard: if the uncertainty in the reanalysis including the oldWeather observations is less than the uncertainty without the new observations, then we’ve produced an improvement (there are other possible improvements too, but let’s keep it simple). So I calculated this, and it looked good. But further checks turned up a catch: we don’t know the uncertainty in either case precisely, we only have an estimate of it, so any improvement might not be real – it might be an artefact of the limitations of our uncertainty estimates.
To resolve this I have entered the murky world of uncertainty uncertainty. If I can calculate the uncertainty in the uncertainty range of each reanalysis, I can find times and places where the decrease in uncertainty between the analysis without and with the oldWeather observations is greater than any likely spurious decrease from the uncertainty in the uncertainty. (Still with me? Excellent). These are the times and places where oldWeather has definitely made things better. In principle this calculation is straightforward – I just have to increase the size of the reanalysis ensemble: so instead of doing 56 global weather simulations we do around 5600; I could then estimate the effect of being restricted to only 56. However, running a global weather simulation uses quite a bit of supercomputer time; running 56 of them requires a LOT of supercomputer time; and running 5600 of them is – well, it’s not going to happen.
So I need to do something cleverer. But as usual I’m not the first person to hit this sort of problem, so I don’t have to be clever myself – I can take advantage of a well-established general method for faking large samples when you only have small ones – a tool with the splendid name of the bootstrap. This means estimating the 5600 simulations I need by repeatedly sub-sampling from the 56 simulations I’ve got. The results are in the video below:
By bootstrapping, we can estimate a decrease in uncertainty that a reanalysis not using the oldWeather observations is unlikely to reach just by chance (less than a 2.5% chance). Where a reanalysis using the oldWeather observations has a decrease in uncertainty that’s bigger than this, it’s likely that the new observations caused the improvement. The yellow highlight in this video marks times and places where this happens. We can see that the regions of improvement show a strong tendency to cluster around the new oldWeather observations (shown as yellow dots) – this is what we expect, and it supports the conclusion that these are mostly real improvements.
It’s also possible, though unlikely, that adding new observations can make the reanalysis worse (increase its estimated uncertainty). The bootstrap also gives an increase in uncertainty that a reanalysis not using the oldWeather observations is unlikely to reach just by chance (less than 2.5% probable) – the red highlight marks times and places where the reanalysis including the observations has an increase in uncertainty that’s bigger than this. There is much less red than yellow, and the red regions are not usually close to new observations, so I think they are spurious results – places where this particular reanalysis is worse by chance, rather than systematically made worse by the new observations.
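To make the bootstrap and the two 2.5% thresholds concrete, here is a minimal sketch of the idea at a single grid point. It is a simplification of the real calculation: the variable names, the use of raw ensemble spread as the quality measure (see the caveat at the end of the post), and the resampling scheme are all mine.

```python
import numpy as np

rng = np.random.default_rng(42)

# 56 ensemble members at one grid point, from the two parallel reanalyses.
# Random values stand in for the real pressure fields.
without_ow = rng.normal(1000.0, 3.0, 56)  # reanalysis without the oldWeather obs
with_ow = rng.normal(1000.0, 2.0, 56)     # reanalysis with the oldWeather obs

def spread(members):
    """Ensemble standard deviation - the uncertainty estimate."""
    return members.std(ddof=1)

# Bootstrap: resample the 'without' ensemble (with replacement) many times,
# to see how large a change in spread can appear just from sampling noise.
n_boot = 5600
boot_spreads = np.array([
    spread(rng.choice(without_ow, size=without_ow.size, replace=True))
    for _ in range(n_boot)
])
changes = boot_spreads - spread(without_ow)

# Chance thresholds: changes more extreme than these have less than a
# 2.5% probability of arising from sampling alone.
decrease_threshold = np.percentile(changes, 2.5)   # a negative number
increase_threshold = np.percentile(changes, 97.5)  # a positive number

observed_change = spread(with_ow) - spread(without_ow)

if observed_change < decrease_threshold:
    print("yellow: improvement unlikely to be chance")
elif observed_change > increase_threshold:
    print("red: degradation unlikely to be chance")
else:
    print("no confident change")
```

The real calculation repeats this comparison at every time and place in the reconstruction, which is what builds up the yellow and red regions in the video.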
This analysis meets its aim of identifying, formally, when and where all our work transcribing new observations has produced improvements in our weather reconstructions. But it is still contaminated with random effects: we’d expect to get spurious red and yellow regions 2.5% of the time each anyway (because that’s the threshold we chose), and there is a second problem: the bootstrapped 2.5% thresholds in uncertainty uncertainty are only estimates – they have uncertainty of their own, and where the thresholds are too low we will get too much highlighting (both yellow and red). To quantify and understand this we need to venture into the even murkier world of uncertainty uncertainty uncer… .
No – that way madness lies. I’m stopping here.
OK, as you’re in the 0.1% of people who’ve read all the way to the bottom of this post, there is one more wrinkle I feel I must share with you: The quality metric I use for assessing the improvement caused by adding the oW observations isn’t simply the reanalysis uncertainty, it’s the Kullback–Leibler divergence of the climatological PDF from the reanalysis PDF. So for ‘uncertainty uncertainty’ above read ‘Kullback–Leibler divergence uncertainty’. I’d have mentioned this earlier, except that it would have made an already complex post utterly impenetrable, and methodologically it makes no difference, as one great virtue of the bootstrap is that it works for any metric.
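And since I’ve mentioned it, here is what that metric looks like in the simplest possible case: a sketch assuming both the climatological and the reanalysis PDFs are treated as one-dimensional Gaussians (my simplifying assumption, and the direction of the divergence here is an illustrative choice), where the Kullback–Leibler divergence has a closed form.

```python
import numpy as np

def kl_gaussian(mu_p, sigma_p, mu_q, sigma_q):
    """Kullback-Leibler divergence D(P || Q) between two 1-D Gaussians.

    In this sketch P stands for the climatological PDF and Q for the
    reanalysis PDF; the pairing and direction are illustrative choices.
    """
    return (np.log(sigma_q / sigma_p)
            + (sigma_p**2 + (mu_p - mu_q)**2) / (2.0 * sigma_q**2)
            - 0.5)

# A reanalysis PDF much sharper than climatology diverges strongly from it:
print(kl_gaussian(mu_p=1010.0, sigma_p=8.0, mu_q=1002.0, sigma_q=2.0))
```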