Finally getting around to posting a little something about the web archiving conference held at the British Library a couple of weeks ago.
From a local archives perspective, it was particularly interesting to hear a number of presenters acknowledge the complexity and cost of implementing and using currently available web archiving tools. Richard Davis, talking about the ArchivePress blog archiving project, went so far as to argue that relying on them was using a ‘hammer to crack a nut’, and we’ll certainly be keeping an eye out at West Yorkshire Archive Service for potential new use cases for ArchivePress’s feed-focused methodology and tools. ArchivePress should really appeal to my fellow local authority archivist Alan, who is always on the look-out for self-sufficiency in digital preservation solutions.
I also noted Jeffrey van der Hoeven’s suggestion that smaller archives might in future be able to benefit from the online GRATE (Global Remote Access to Emulation Services) tool developed as part of the Planets project, offering emulation over the internet through a browser without the need to install any software locally.
Permission to harvest websites, particularly in the absence of updated legal deposit legislation in the UK, was another theme which kept cropping up throughout the day. So here is a good immediate opportunity for local archivists to get involved in suggesting sites for the UK Web Archive, making the most of our local network of contacts. I still think, though, that there is a gap here in the European web archiving community for an Archive-It type service to enable local archivists to scope and run their own crawls to capture at-risk sites, sometimes at very short notice, as we had to at West Yorkshire Archive Service with the MLA Yorkshire website.
Archivists do not (or should not) see websites in isolation – they are usually one part of a much wider organisational archival legacy. To my mind, the ‘web archiving’ community is at present too heavily influenced by a library model and mindset, which concentrates on thematic content and pays too little attention to more archival concerns, such as provenance and context. So I was pleased to see this picked up in the posting and comments on Jonathan Clark’s blog about the Enduring Links event.
Lastly in my round-up, Cathy Smith from TNA had some interesting points to make from a user perspective. She suggested that although users might prefer a single view of a national web collection, this did not necessarily imply a single repository – although collecting institutions still need to work together to eliminate overlap and to coordinate presentation. This – and the following paper on TNA’s Digital Continuity project – set me thinking, not for the first time, about some potential problems with the geographically defined collecting remits of UK local authority archive services in a digital world. After all, to the user, local and central government websites are indistinguishable at the .gov.uk domain level, not to mention that much central government policy succeeds or fails depending on how it is delivered at local level. Follow almost any route through DirectGov and you will end up at a search page for local services. Websites, unlike paper filing series, do not have distinct, defined limits. One of the problems with the digital preservation self-sufficiency argument is that the very nature of the digital world – and increasingly so in an era of mash-ups and personalised content – is the exact opposite: highly interdependent and complex. So TNA’s harvesting of central government websites may be of limited value over the long term, unless it is accompanied by an equally enthusiastic campaign to capture content across local government in the UK.
Slides from all the presentations are available on the DPC website.