OldWeather is steaming past milestone after milestone, and a few days ago we passed a big one: 150 ships complete. That is, 150 ships, from Acacia to Wonganella, have had all of their log pages for the period transcribed by at least three people. That’s 89,000 pages of new information for climate and historical research.
To mark the completion of 100 logs, we made a movie showing the ships bustling about across the world’s oceans. Rather than updating this with the new information, I thought I’d be a little more ambitious and show the transcribed data in a more comprehensive and interactive format.
I’ve long been an admirer of Google Earth – a geospatial data viewer that’s powerful, easy to use, and, most important, free to download and use. If you haven’t played with it I urge you to give it a try – download a copy and have a look at the satellite’s-eye view of your favourite places. But the real charm of Google Earth is that we can add our own data as overlays. It works very well for following ships, and I’ve made an overlay from the 150 completed oldWeather ships.
So once you’ve got Google Earth, download the overlay and have a look at what you’ve created: select a ship from the list on the left, pick a time using the slider at the top, and click on the markers to see the day’s records for that ship. The user interface takes a bit of getting used to, but with practice you can make your own animations.
I included as many of the transcriptions as I could, but there are some that I haven’t yet managed to convert into this format. So if you can’t find something you know you entered, don’t worry: we haven’t lost it. It will take a little longer, but we will make it all available.
Edmond Halley is best known for his comet, but he was one of the great polymaths – as well as making astronomical discoveries he was also a notable meteorologist: he did important early work understanding the trade winds and monsoons. It’s less well known that he was also a Naval Officer: in 1699 he was granted a commission as captain in the Royal Navy, and he commanded HMS Paramour (a pink) on an expedition into the South Atlantic to investigate the variation of the compass.
His main concern was with magnetism, but as a man of wide interests, Halley took with him examples of those two exciting modern scientific instruments: the thermometer and the barometer. I can’t find the logbook of the voyage, but Halley’s notes have survived: they were published by Alexander Dalrymple, in 1775, as part of “A collection of voyages chiefly in the Southern Atlantick Ocean”. They date from 220 years before the logbooks we’re used to in OldWeather, but to anyone who’s looked at our logbooks they are oddly familiar: records of latitude, longitude, wind force and direction and, in the left-hand margin, thermometer and barometer readings.
In 1699 the barometer had been around for more than 50 years, and the barometer records in Halley’s account are clearly in the familiar inches of mercury. But the thermometer did not become a reliable, precision instrument until about 1725, when Fahrenheit invented the mercury thermometer with a standardized, calibrated scale. So when Halley says the temperature is ‘33’ it’s not immediately obvious how this should be interpreted. Careful scholarship has established, however, that Halley was using a thermometer designed by Robert Hooke, and lavishly described in his book Micrographia:
The Stems I use for them are very thick, straight, and even Pipes of Glass […] above four feet long […] [filled] with the best rectified Spirit of Wine highly tinged with the lovely colour of Cocheneel, which I deepen the more by pouring some drops of common Spirit of Urine, which must not be too well rectified, […]
From Hooke’s description we can convert Halley’s reported units into modern equivalents at least approximately – Halley’s ‘33’ was about 8°C.
The diary entries are mostly routine accounts of the movements of the ship, but occasionally he puts in longer and more interesting reports: here’s an example from Thursday 1st February 1700, when they were close to South Georgia, in the cold waters of the Southern Ocean:
[…] between 4 and 5 we were fair by three Islands as they then appeared; being all flat on the top, and covered with Snow milk white, with perpendicular Cliffs all round them […] The great height of them made us conclude them land, but there was no-appearance of any tree or green thing on them, but the Cliffs as well as the tops were very white, our people called A by the name of Beachy-Head, which it resembled in form and colour. And the Island B in all respects was very like the land of the North-foreland in Kent, and was at least as high and not less than 5 miles in front, […]
The following day they were disconcerted to discover that these ‘islands’ had moved, and fled north to warmer waters. This is the first recorded sighting of a tabular iceberg.
Halley’s observations are probably not of great value to climate scientists: his instruments were state-of-the-art for 1699, but it took decades longer for such observations to become accurate and plentiful enough for climate reconstructions. He did set a precedent though – possibly as the first person to go to sea with a barometer and a thermometer – and we’re still following his example more than 300 years later.
Working with the logbooks has done wonders for my knowledge of global geography. If it’s at sea level, one of our ships has probably been there, or at least mentioned sighting it on the way past, and we can travel, vicariously, with them; from Abadan to Zanzibar by way of Cockatoo Island, Fernando Po, Nuku’alofa, Surabaya, and Wuhu (with assistance from lighthouses on Mwana Mwana, Muckle Roe, and Makatumbe).
We’d expect the Royal Navy to spend most of their time in British ports, but we deliberately chose the logs we’re looking at to include those going foreign, and omit the stay-at-homes, because this gives us better information on global weather. This choice means that foreign ports are the most frequently mentioned in our logs. In the 300,000 or so log-pages we’ve looked at so far, Hong Kong tops the ‘most visited’ table (with 23,000 mentions), followed by Bermuda and Shanghai. The first UK port comes in fourth: Devonport (6000 mentions) and though most of these are for the UK naval base near Plymouth, its statistics are boosted by the existence of another base of the same name in Auckland.
The existence of two Devonports highlights a difficulty we run into in using the port names. When the ship is in port, and sometimes when it is operating close to land, the port name or landmark is the only information we have on the ship’s location. So we have to convert the name into a latitude and longitude, and this can be challenging. For many ports a position is not hard to find: Gibraltar, Bombay, Glasgow and Aden are all well known. Many more are only a quick web search away: Esquimalt is on Vancouver Island, Thursday Island is in the Torres Strait, and Walvis Bay is in Namibia.
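As a minimal sketch of the name-to-position step, assuming a small hand-built gazetteer: the table, the function names and the 'pick the candidate nearest the ship's last known position' rule are all illustrative assumptions rather than the project's method, and the coordinates are rough values for illustration only.

```python
from math import radians, sin, cos, asin, sqrt

# Toy gazetteer: lower-case name -> list of (description, latitude, longitude).
# Coordinates are approximate and purely illustrative.
GAZETTEER = {
    "devonport": [
        ("Devonport, Plymouth, UK", 50.37, -4.18),
        ("Devonport, Auckland, NZ", -36.83, 174.80),
    ],
    "gibraltar": [("Gibraltar", 36.14, -5.35)],
}


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula)."""
    p1, p2 = radians(lat1), radians(lat2)
    dlat = p2 - p1
    dlon = radians(lon2 - lon1)
    h = sin(dlat / 2) ** 2 + cos(p1) * cos(p2) * sin(dlon / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))


def resolve_port(name, last_known):
    """Return the gazetteer entry nearest to the last known (lat, lon).

    Returns None when the name is not in the gazetteer, which is what
    happens for vague entries such as 'on patrol'.
    """
    candidates = GAZETTEER.get(name.strip().lower(), [])
    if not candidates:
        return None
    lat, lon = last_known
    return min(candidates, key=lambda c: distance_km(lat, lon, c[1], c[2]))


# A ship last seen in the Western Approaches is taken to mean Plymouth;
# one last seen off northern New Zealand is taken to mean Auckland.
print(resolve_port("Devonport", (49.9, -6.0)))
print(resolve_port("Devonport", (-34.0, 172.0)))
```

A rule like this only narrows the choice between known candidates; names that never made it into the gazetteer still need a human with an atlas.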
After that it gets harder – East London is nowhere near East London, St Vincent usually means Cape Verde, rather than the identically named place in the West Indies or the Portuguese headland made famous by the battle of 1797. ‘No. 10 dock’, ‘No. 5 buoy’, and ‘No. 7 warf’ are all in Plymouth, but ‘on patrol’, ‘southern base’, and ‘on surveying ground’ could be anywhere.
The Navy are renowned for their courage and seamanship. Their orthography and penmanship are a little more variable, so we have Wei Hai Wei (2345 entries), Wei-hai-wei (1357), Wei hai Wei (633), Wei hai wei (314), Wei-Hai-Wei (231), wei hai wei (91), wei lai wei (69), Weihai wei (57), wei-hai-wei (53), Wei hei wei (33), Wei-hei-wei (32), WEI HAI WEI (30), and even W.H.W (26) – all of which are references to the same place.
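Much of this variation can be folded together mechanically. The sketch below uses the counts quoted above; the canonical key, the similarity cutoff and the function names are assumptions chosen for illustration, not the project's actual clean-up code.

```python
import re
from collections import Counter
from difflib import SequenceMatcher

# Spellings and entry counts as quoted in the text.
LOG_SPELLINGS = {
    "Wei Hai Wei": 2345, "Wei-hai-wei": 1357, "Wei hai Wei": 633,
    "Wei hai wei": 314, "Wei-Hai-Wei": 231, "wei hai wei": 91,
    "wei lai wei": 69, "Weihai wei": 57, "wei-hai-wei": 53,
    "Wei hei wei": 33, "Wei-hei-wei": 32, "WEI HAI WEI": 30,
    "W.H.W": 26,
}


def simple_key(name):
    """Lower-case the name and drop spaces, hyphens and other punctuation."""
    return re.sub(r"[^a-z]", "", name.lower())


def group_spellings(counts, canonical="weihaiwei", cutoff=0.85):
    """Split spellings into those close to the canonical key and the rest.

    The canonical key and the 0.85 cutoff are illustrative assumptions.
    """
    grouped = Counter()
    unmatched = Counter()
    for spelling, n in counts.items():
        similarity = SequenceMatcher(None, simple_key(spelling), canonical).ratio()
        if similarity >= cutoff:
            grouped[canonical] += n
        else:
            unmatched[spelling] += n
    return grouped, unmatched


grouped, unmatched = group_spellings(LOG_SPELLINGS)
print(grouped)    # case, spacing and one-letter slips folded together
print(unmatched)  # the abbreviation 'W.H.W' still needs a hand-made alias
```

Stripping case and punctuation handles most of the entries, a fuzzy match catches slips such as 'wei lai wei', and abbreviations are left for a manual alias table.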
With the technology of 1914-22, sorting all this out into a set of positions would have been a terrible job; but modern internet search engines, atlases, encyclopaedias and gazetteers are very powerful tools for tracking down obscure and badly spelt place-names. Today I’m particularly grateful that I live in the future.
I was really excited to see, a few days ago, that we’ve now completed the logs for 100 ships (100! and that only includes those where every page has been looked at by at least three people – there are many more making good progress). So we’ve now got enough information to look at the results from the fleet as a whole, as well as individual ships.
As one way of doing this I’ve made a video showing the ship movements from all the completed pages so far: every point is one digitised weather observation, coloured by pressure (red=high, blue=low). At 10 days a second it takes 5 minutes to get through 1914-22, but there’s plenty to see – I like such visualisations because they always bring out new information:
Apart from the weather information (watch the sporadic outbursts of widespread low pressure – bad weather – in the North Atlantic), I like the way one ship settles in Bermuda – apparently determined to watch out the whole war from there; I bet that’s a very desirable posting. It’s also notable how big an effect the war apparently has on the patterns of movement – there is an explosion of activity in August 1914 and a clear reduction in late 1918. The war gets very little mention in the logbooks, but our results still indicate that it had a big impact on the activity of the ships.
We can also see a few cases that require more quality control: there are a few ships travelling through the Sahara, the Amazon and the Greenland Ice-cap – these are probably position errors in the logbooks. I think the ships in inland China, however, are the river gunboats about their proper business.
As we said in our recent blog post, Old Weather has been churning through Royal Navy logbooks from World War 1 for long enough now that we can start to extract some interesting stats from the words transcribed by the community.
Social networks are all the rage now, but here at Zooniverse HQ we’ve been wondering what the 90-year-old social graph of Old Weather would look like. We’ll have more to say in the near future about the interactions of people on board the Royal Navy ships from our logs, but what about the ships themselves? When ships pass each other at sea, or meet to exchange supplies, officers and information, they make a note of this in their logs.
This enormous chart shows all of the Old Weather ships in a big grid, highlighting in purple where ships connect to each other. You can look down the chart, or across it, to find the interactions for a given ship. You can see that the HMS Arlanza and the Alsatian seem to meet up with quite a few of the other ships of the chart. Both are Armed Merchant Cruisers that cross the busy stretch between the UK and the USA. So is the HMS Motagua, and it too has a fair few interactions with other vessels.
Taking those ships that are often mentioned, we can delve further into their interactions and create arc plots for those vessels. The arc plot below, for the HMS Alsatian, shows that it has encountered 26 ships in the transcriptions made to date. The thickness of the lines connecting vessels indicates the relative number of times that the two ships reference each other. The HMS Moldavia and HMS Patia are fairly well-connected with the Alsatian.
What isn’t shown on the large network plot is that the most mentioned vessel in the Old Weather fleet is the HMS Bee, a river gunboat and a ship that is only 36% complete so far on Old Weather (maybe you could jump aboard and help to complete it?). This ship is not mentioned a great deal by every ship but rather features regularly in the logs of a few vessels in the fleet. The arc plot for the HMS Bee is shown below. The HMS Bee interacts a great deal with the HMS Scarab and the HMS Cricket. All three are gunboats, as is the HMS Gnat. The next step here is to examine the logs and find out when these vessels interacted so much, and why. A blog post on these will follow at a later time.
Finally, for this post, let’s look at the arc plot for the top twenty most-connected vessels in Old Weather so far. These are the ships from the large network plot that connect with the most other ships. These plots can be made for the whole fleet – but they become very large and complex and thus difficult to take value from. This slimmed-down version showing just the top twenty gives you an idea of the ships that are linked to other ships.
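For readers who want to try something similar, here is a minimal sketch of how a grid or arc plot could be fed from log mentions. The list of mentions is made up for illustration (only the ship names come from the post), and the function names are hypothetical; this is not the code used to draw the charts.

```python
from collections import Counter, defaultdict

# (ship whose log it is, ship mentioned in that log).
# Illustrative pairs only, not real counts from the transcriptions.
MENTIONS = [
    ("Alsatian", "Moldavia"), ("Alsatian", "Patia"), ("Moldavia", "Alsatian"),
    ("Alsatian", "Arlanza"),
    ("Bee", "Scarab"), ("Bee", "Cricket"), ("Scarab", "Bee"), ("Bee", "Scarab"),
]


def build_network(mentions):
    """Return (edge weights, neighbour sets) for an undirected ship network."""
    weights = Counter()            # how often a pair of ships reference each other
    neighbours = defaultdict(set)  # which distinct ships each ship connects to
    for log_ship, mentioned in mentions:
        if log_ship == mentioned:
            continue  # ignore a ship naming itself
        weights[frozenset((log_ship, mentioned))] += 1
        neighbours[log_ship].add(mentioned)
        neighbours[mentioned].add(log_ship)
    return weights, neighbours


def most_connected(neighbours, top=20):
    """Ships ranked by number of distinct connections, ties broken by name."""
    return sorted(neighbours, key=lambda s: (-len(neighbours[s]), s))[:top]


weights, neighbours = build_network(MENTIONS)
print(most_connected(neighbours, top=2))
print(weights[frozenset(("Bee", "Scarab"))])
```

The neighbour sets correspond to the purple squares in the big grid, and the edge weights to the line thickness in an arc plot.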
This is the kind of simplistic data that can be extracted from your transcriptions of events. So far, only the development team have been looking at this, but the tools are being made available to the historians of Old Weather for further analysis. I’m excited by what they can uncover.
Many of the ships listed in these charts are available on our Old Weather Voyages page, so you can see for yourselves how they interact with each other. You can use that page to read the log entries and see where ships were when they encountered one another. We’re always trying to find new ways for everyone to explore the Old Weather data and if you have any suggestions we’d love to hear them, either here on the blog or via twitter @oldweather.
The Old Weather log books contain a huge number of events that occurred on board ships. These can range from reporting the weather in more colloquial terms, to noting that crewmen are sick or injured or even that a comet has been seen in the night sky. Cleaning the ship, notable visitors and the details of battles all constitute a huge chunk of the massive volume of text that now fills the Old Weather data bulkheads.
Stuart Lynn, at Zooniverse HQ in Oxford, is the principal developer of Old Weather. He’s been busy toying with interesting ways to mine the ocean of textual event data that now exists thanks to the tireless efforts of the Old Weather volunteers. We’ve toyed with Old Weather words before, but creating full-text search for the entire project is not a simple task. With 1.97 million words contained in a quarter of a million logbook pages it can be painfully slow to find what you’re looking for if you don’t approach the task, and the database, in the right way. Nevertheless, Stuart has been busy and we’re now able to delve into these events to begin to extract some useful – and some fun – information.
Happy vs. Sad
This simple pie chart shows the relative proportions of log pages that include happy or sad events. As you can see, the result is not too cheerful. Only 5% of log pages containing these emotional words are happy. We searched for the words ‘happy’, ‘joy’, or ” and then for ‘sad’, ‘sadness’, ‘funeral’ or ”. One imagines that perhaps it felt more important to note the sad events in the log. Either way, these ships were not out on a jolly; they were at war.
There are often mentions in the logs regarding leisure activities aboard ship. The crew might play against each other or against other crews, but they definitely played some sport. Perhaps unsurprisingly, for British vessels, football is the most mentioned sport in the Old Weather logs, followed by cricket and a smattering of tennis.
As well as recording the details of the weather that were required, the logs often also make reference to conditions in general. We grouped mentions of various weather conditions into good and bad weather categories. You can see that bad weather dominates the logs – again this could be a selection effect. It may be that people don’t note good weather as often as bad.
These are just some very simple charts that represent the initial skimming of the amazing data Old Weather is creating. As we get more transcriptions and learn how to navigate the database more nimbly, we hope to bring you more in-depth observations. We have a few more charts to share with you this week – so keep your eyes on the blog. If you can think of something you’d like to see from this data, then please let us know, either in the comments here, or on Twitter @oldweather.
One question I’m asked again and again by people encountering OldWeather for the first time is ‘How accurate are the transcriptions?’. We’ve known for a while that the answer is ‘very accurate’, but it’s always nice to be precise about such things, so just how accurate are we?
To find out, let’s look at HMS Defence, which we followed through much of 1914 and 1915, on a voyage from the Dardanelles, to Montevideo, to South Africa, and then back to the UK and patrol in the North Sea. The figure shows the air temperature and pressure recorded during this voyage.
We can see clearly in this image the date when they stopped cruising in tropical and sub-tropical oceans, and returned to the colder and stormier seas around Great Britain – around the beginning of 1915 the air temperature fell by around 30F and the pressure became much more variable. But looking closely at the image, we can also see some errors, both ours and those of the mariners writing the logs in the first place.
We can spot our own errors because each log page is transcribed by at least three people, and when those three people disagree, someone has made a mistake. The logs of the Defence yielded 1119 pressure observations (six a day for about 6 months). For 997 of those observations (89%) everyone who transcribed the observation agreed what it was; for 107 of the observations (10%) two or more of the transcribers agreed on a value, but 1 person disagreed; and for the remaining 15 observations (1.3%) the transcribers did not agree: there was no value with a clear majority of the inputs. (The values entered by individuals that did not agree with the majority are shown in the figure as small red points.)
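The three categories can be sketched in a few lines. This is a minimal illustration, assuming that 'clear majority' means more than half of the entries agree; the function name and category labels are made up for the example.

```python
from collections import Counter


def classify(readings):
    """Sort one observation's transcriptions into a consensus category.

    Returns (category, agreed_value); agreed_value is None when no value
    is entered by more than half of the transcribers.
    """
    counts = Counter(readings)
    value, votes = counts.most_common(1)[0]
    if votes == len(readings):
        return "unanimous", value
    if votes > len(readings) / 2:
        return "majority", value
    return "no_agreement", None


print(classify(["29.80", "29.80", "29.80"]))  # everyone agrees
print(classify(["29.80", "29.80", "28.80"]))  # one dissenting entry
print(classify(["30.18", "30.10", "30.12"]))  # three different readings
```

Values are compared as entered (as text), so '30.1' and '30.10' would count as different unless they are normalised first.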
From the first two categories we can estimate the transcription error rate: in 997*3+107*2=3205 cases the value entered is correct, and in 107 cases it is incorrect, so the error rate is 107/3312 – about 3%. So transcriptions are about 97% accurate – in other words, about 97% of the time the value entered by an individual transcriber is the value that most people would agree is written in the logs – an excellent individual accuracy rate.
If you are familiar with statistics, you may have spotted an inconsistency here: if one person makes a mistake 3% of the time, at least two out of three people should make a mistake on the same observation only about 0.3% of the time (3%*3%*3), while actually this happens much more often than that (1.3% of the time). The reason for this excess of cases where all the transcribers disagree is that some of the entries are illegible. For example, consider the barometer height at 4am in the log for Thursday 10th September 1914; this was variously transcribed as ‘30.18’, ‘30.10’, and ‘30.12’ – all of which are plausible readings. In this case there is no one answer we can agree on and the disagreement is not a transcription error but a success – we have flagged an entry which cannot be transcribed with confidence. (This is why we encourage you to guess when entering hard-to-read values: when everybody guesses a different answer we know the entry is illegible.)
Even when we have transcribed a value with certainty it may not be correct – sometimes the log-keepers wrote the wrong value in the log: there is no doubt that the barometer height entered for midnight on Wednesday 7th October 1914 is ‘28.80’ inches, but there is also no doubt that the actual pressure was much higher than this (possibly ‘29.80’), and this error can be seen as the first of the three spikes in the figure above. So there are three errors in the log big enough to be obvious in the plot, and probably others with a smaller effect.
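The arithmetic in the last two paragraphs can be checked with a short script. It uses only the counts quoted above and the same simplifying assumption that transcribers err independently; the variable and function names are illustrative.

```python
# Counts quoted in the post for HMS Defence pressure observations.
unanimous = 997     # all transcribers agree
two_agree = 107     # a majority value plus one dissenting entry
no_majority = 15    # no value with a clear majority
observations = unanimous + two_agree + no_majority

# Individual entries judged correct or incorrect in the first two categories.
correct_entries = unanimous * 3 + two_agree * 2
wrong_entries = two_agree
error_rate = wrong_entries / (correct_entries + wrong_entries)


def p_at_least_two_of_three(p):
    """Chance that two or three of three independent transcribers err."""
    return 3 * p**2 * (1 - p) + p**3


expected = p_at_least_two_of_three(error_rate)
observed = no_majority / observations

print(f"individual error rate: {error_rate:.1%}")
print(f"expected rate of no-majority cases: {expected:.2%}")
print(f"observed rate of no-majority cases: {observed:.2%}")
```

The exact binomial figure comes out close to the rough 3%*3%*3 estimate, and the observed rate is several times larger, which is the gap the illegible entries explain.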
This post has turned out much longer and more complicated than I planned – mostly because the definition of ‘transcription error’ from a logbook containing erroneous and illegible entries is not simple – so, in summary:
- Individual transcriptions are about 97% accurate
- Of 1000 transcribed logbook entries:
  - 3 will be lost because of transcription errors
  - 10 will be illegible
  - At least 3 will be errors in the logs
So for every 16 errors in the transcribed data (which we pass to the science team), only 3 are the responsibility of those of us reading the logs; the other 13 are the problems in the logs themselves. We can say with some confidence that we are better at reading the logs than the original log-keepers were at writing them.
Congratulations to captain ebaldwin and the crew for an excellent job on HMS Defence; and to all the oldWeather participants, as the accuracy of transcription is similarly high on all the ships I’ve looked at.
Can you believe that Old Weather is 44% complete‽ I can’t, but that’s what the site is currently telling me. It’s amazing how much care and effort has been poured into this project by people all over the world. 63 vessels are now complete and the task of processing and understanding everything is underway.
Yesterday we said we had a little thank you coming and here it is: Old Weather Voyages. Stuart Lynn, the principal developer for Old Weather, and I have been toying with the idea of displaying your weather and event transcriptions in a fun and interesting way. Old Weather Voyages lets you see the data that you have all provided in a way that puts the voyages of these ships front and centre.
The Old Weather Voyages site displays one ship’s log book at a time and lets you watch events unfold as they did nearly a century ago. You can either just sit back and watch the ship as it voyages around the globe, or you can grab the time slider and whiz back and forth, seeing what happens and when.
As the ship moves around the world, its track is coloured according to the sea temperatures that have been recorded. The events from the ship’s log are shown on the log page on the left hand side. The image below shows the voyage of the HMS Africa up to the afternoon of Saturday November 3rd 1917. The log on the left shows that there have been several sicknesses reported over the past week on the ship. The ship’s track shows that the water is warmer nearer the equator than it is in South Africa, for example.
We hope that this new way of exploring the Old Weather data shows not only that the information being transcribed is coherent and useful, but also that your data really does go somewhere! We are often asked, here at the Zooniverse, what happens to your clicks and transcriptions. Here is a great way to explore the early results of the project and explore the history behind the science.
[If you have a Mac, you can also grab Old Weather Voyages as a screensaver]
Excellent news arrived today. The powers-that-be at JISC have seen fit to reward Old Weather’s success with further funding. We’re one of the projects who were successful in the latest round of their rapid digitization grants. The application – which incidentally laid great emphasis on the hard work that you’d all already done – went in six weeks after the original project launched, and will provide for a host of new things in the next six months or so.
Firstly, we’re going to get more logs. The grant provides funding for the imaging of another (roughly) 3000 ships’ logs – so that’s 3000 more months of history to feed to the site. The idea is to go back and fill in the gaps during the existing time period covered by the site, so there will be new ships, and some existing ships will gain new images.
Secondly, we’re going to add an interface that allows you to assist us in the task of cleaning up the transcribed data. In the proposal, I used the example of HMS Acacia, where the final temperature record contained some sudden jumps between temperatures in the 70s and in the 40s. Reviewing the logs makes it pretty clear what went wrong – 7s are easy to misread as 4s (at least in the handwriting of the officer who wrote the log of the HMS Acacia) – but that’s easily fixed once you review the temperature series. The same goes for sudden jumps in position caused by mixing up East and West. Producing a tool to help make changes like this will not only help us maintain data quality, but also will mean that it’ll be easy to review your ship’s results once it reaches the end of a log. We’re also going to take the chance to build a more flexible interface, allowing us potentially to transcribe logs that aren’t in the same format as the current set.
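To give a flavour of the kind of check such a tool might run, here is a minimal sketch that looks for a single reading in the 40s or 70s that jumps away from both of its neighbours and would fit again if the tens digit were swapped. The 15-degree tolerance, the whole-number temperatures and the function name are assumptions made for the example, not a description of the planned interface.

```python
def suggest_tens_digit_fixes(temps, max_jump=15):
    """Suggest 4<->7 tens-digit corrections for isolated jumps.

    temps is a list of whole-number temperatures in time order.
    Returns {index: suggested_value}; nothing is changed in place.
    """
    fixes = {}
    for i in range(1, len(temps) - 1):
        before, value, after = temps[i - 1], temps[i], temps[i + 1]
        if abs(before - after) > max_jump:
            continue  # neighbours disagree with each other, so no reference
        reference = (before + after) / 2
        if abs(value - reference) <= max_jump:
            continue  # reading is already in line with its neighbours
        tens, units = divmod(value, 10)
        swapped_tens = {4: 7, 7: 4}.get(tens)
        if swapped_tens is None:
            continue  # not a reading in the 40s or 70s
        candidate = swapped_tens * 10 + units
        if abs(candidate - reference) <= max_jump:
            fixes[i] = candidate
    return fixes


print(suggest_tens_digit_fixes([72, 74, 43, 75, 73]))
```

A suggestion like this is only a prompt to go back to the log page; the final call on whether a 4 is really a 7 stays with the person reviewing it.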
We’ll report on progress here as we get cracking. Thanks, JISC, for the support, but thank you all for your hard work that led to this vote of confidence. As a thank you, we’ve got a little surprise prepared but I’ll save that for tomorrow.
It probably won’t surprise many of you to hear that the Earth is generally warmer at the equator, and colder towards the poles. I base my holiday plans heavily on latitude: going north (from England) for snow, and south for sunbathing. We all know this, but OldWeather has now completed enough log pages that we can prove it just from the logbook observations – the image below shows how air temperature changes with latitude, using the 120,000 temperature observations from pages that have already been examined by the three people we need to provide reliable results.
So it’s warmer (on average) in Singapore, and colder in Scandinavia; we didn’t need the logbook records to tell us that, but that doesn’t mean that this way of looking at the data is not interesting – partly because comparing the temperature records with others made at the same latitude is a good way of finding outliers: values that are likely to be errors in either recording or transcription.
One thing we can immediately see from the figure is the spikes at locations associated with ports. The spikes go both up and down, meaning there are more of both high and low temperatures at these locations. Partly this will be a physical effect – temperatures over land do vary more than those over the ocean – but it’s also partly an artefact of the way I’ve made the plot: the Navy ships spend a lot of time in port, so we have many more observations from those locations, and so more unusually high or low values. Even in the ports, however, there are very few really way out values, but some are suspicious: are there really marine temperatures below 0F at about 45N? (Seawater freezes at 29F) Those values come from HMS Bayano, off the Canadian coast in December, (thanks captain spudman and lieutenant Dinsdale, among others) so very low temperatures can’t instantly be ruled out, but they will need further investigation.
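Comparing each reading with others from the same latitude is easy to sketch. The example below groups observations into latitude bands and flags readings far from the band median; the 5-degree band width, the 20F threshold, the sample data and the function name are all illustrative assumptions.

```python
from collections import defaultdict
from statistics import median


def flag_outliers(observations, band_width=5, threshold=20.0):
    """Flag (latitude, temperature) pairs far from their band's median.

    A flagged value is a candidate for review, not a confirmed error.
    """
    bands = defaultdict(list)
    for lat, temp in observations:
        bands[int(lat // band_width)].append(temp)
    medians = {band: median(temps) for band, temps in bands.items()}
    return [
        (lat, temp)
        for lat, temp in observations
        if abs(temp - medians[int(lat // band_width)]) > threshold
    ]


# Made-up readings in degrees F: four near 45N, three near the equator.
sample = [
    (45.2, 38), (45.9, 41), (46.5, 36), (47.0, -2),
    (1.3, 82), (1.1, 84), (2.0, 81),
]
print(flag_outliers(sample))
```

As with the Bayano values, a flag only says 'look again': a genuinely cold December day off Canada would be picked out just as readily as a slip of the pen.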
The variation of barometer height (air pressure) with latitude is less well known, but just as interesting: this picture is dominated by the low pressure variability in the tropics (steady weather) and the much more variable pressure in the higher latitudes (anticyclones, depressions and storms). We can see very nicely the transition, in the southern hemisphere, from the steady trade-wind regions to the famous ‘roaring forties’ and ‘furious fifties’.
Captains care about the air pressure because it warns them of changes in the wind. This sort of plot isn’t ideal for showing winds, because the wind measurements are restricted to the Beaufort scale categories, but we can still see where the strong winds are to be found. Cruising in the North Atlantic, the Royal Navy’s main stamping ground, was clearly no picnic: with temperatures down to freezing, variable weather and strong winds.
The Beaufort scale only goes up to 12; extensions are sometimes used for severe tropical storms, but the value of 15 recorded by HMS Cambrian in Rosyth dockyard in March 1919 is not credible. (Though I congratulate captain MamaLizard and the crew on correctly entering the value in the log – we always want the value written, even when it’s obviously an error). If we disregard the Cambrian’s exaggerations, there are four reports of wind force 12 so far, but they are all typographical errors – it’s not much of a slip of the pen to turn ‘1-2’ into ‘12’. We’re still waiting for our first real hurricane.
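Both checks are simple to automate once the values are transcribed. A minimal sketch, assuming wind force arrives as a whole number; the function name and the labels it returns are made up for the example.

```python
def check_beaufort(force):
    """Label a transcribed wind force for quality control.

    Values outside the standard 0-12 scale are flagged, and a 12 is marked
    for a second look because a written '1-2' is easily read as '12'.
    """
    if not 0 <= force <= 12:
        return "out_of_range"
    if force == 12:
        return "check_for_1-2"
    return "ok"


for value in (5, 12, 15):
    print(value, check_beaufort(value))
```

The transcribed value itself is left untouched, in keeping with the rule of entering what is written; the label only tells the science team where to look.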