5,000 sites archived and counting – Scottish web archiving in the UK.

At the end of August 2018 I added my 5000th URL to the UK Web Archive – and although that was two months ago at the time of writing, it is a chance to reflect on my archiving experience.

I was something of a latecomer to web archiving, although I knew of the concept, having written a master's dissertation whose principal focus was the introduction of Non-Print Legal Deposit. The theory was there, but how it might be achieved was to my mind nebulous.

However, by 2015 I had started working in a curatorial department with close links to all forms of collection through legal deposit, alongside some colleagues who had worked on early special collections of websites for the 2014 Commonwealth Games and the Scottish Independence Referendum. It was time for our team to be involved in less lofty but still important matters. And to help out, we were all introduced to the British Library’s web archiving platform – called W3ACT.

The first sites I tackled were family and local history ones relating to Scotland – I was aware that one of the most popular activities online was searching for family trees, so it was a little daunting. However, there are organisations in Scotland that list their affiliates, the Scottish Association of Family History Societies and The Scottish Local History Forum, and there was always the option of following every possible link found while researching. There is of course a considerable amount of overlap between local and family history, and building a picture through a network of sites around the country became quite straightforward.

Once a satisfying quantity of family and local history sites had been targeted, things began to waver a little, and I started targeting sites on a pretty ad hoc basis; but it struck me, obvious as it is, that meaningful targeting requires a cohort of similar sites that are linked by subject, activity and so on. I had been working on the National Library of Scotland’s football programmes collection at that time, and thought what better than to tackle Scotland’s national sporting obsession and target all the Scottish football leagues and teams, down to amateur level (which includes the peculiarly Scottish notion of “Junior” football: not football for children, or the faint-hearted, but a level of semi-professional football). Scotland is small enough for this to be achievable, but the forty or so teams in the ‘top’ four leagues are just the tip of the iceberg. Once that was done, rugby, shinty, orienteering and, for some quite forgotten reason, pétanque sites were all targeted, creating a nice mix of Scottish sporting material in one collection. Moving on, I looked at other Scottish material relating to Festivals, Brewing, Visitor Attractions, Film & Cinema, Tourism, Masonic lodges and the University of the Third Age (U3A).

Many of the nationally important organisations in Scotland were targeted in the early days of non-print legal deposit – among them the confusing array of governing bodies for churches. If there is anything Scotland has ‘enjoyed’ over the years it is the fracturing of its Presbyterianism into smaller communities of churches. And then there are Roman Catholicism, the Scottish Episcopal Church (part of the worldwide Anglican Communion), Methodism, Baptists, Elim Pentecostal: the list of Christian denominations goes on. And of course there are Islam, Judaism, Hinduism, Sikhism, and many other faiths, religions and sub-denominations to consider.

It is ambitious to try to target sites of individual congregations and, needless to say, identifying the sites to target can be a challenge – however, churches are switched on to the possibilities of the Internet and tend to provide comprehensive lists of their denominations’ representation throughout Scotland and sometimes beyond. The difficulty is that there are an awful lot of them. Nevertheless, the effort is valuable as churches remain the focus of communities in some parts of the country, and their websites’ content often includes church newsletters which would otherwise be almost impossible to identify let alone collect. The process of listing and targeting large numbers of Scottish church websites continues.

The value of web archiving is plain – in September 2018 I revisited the 1277 sites I targeted during 2016, to find that 5.72% of those once-live sites are no longer available. Of course, much of the material would have been scooped up in the annual collection of UK websites carried out by the British Library, but clearly some material had not been identified as within scope for web archiving in the UK. The ongoing effort to identify and collect sites that reflect our society now and in the future is enlightening for me, as I learn so much I did not know about Scotland. Hopefully it will prove to be equally valuable to historians, researchers and the plain nostalgic in the future.

What was the 5000th site I targeted? Why thank you for asking. It was https://u3asites.org.uk/eras/, Easter Ross & Sutherland University of the Third Age.

Trevor Thomson
6 November 2018