
Posts Tagged ‘LOCKSS’

Visiting Arizona was a useful way of pulling together many of the strands of what I’ve learnt so far. I was particularly interested in the Persistent Digital Archives and Library System (PeDALS) project, which aims to create an automated workflow for processing digital collections, but also to keep costs as low as possible in an effort to reduce the barriers to addressing the challenges of digital preservation.

The automation aim is of course shared with another of the State Government NDIIPP projects at Washington State Digital Archives, and there are indeed some conceptual similarities in the workflow. However, PeDALS also makes use of a LOCKSS (Lots of Copies Keeps Stuff Safe) private network to provide inexpensive storage with plenty of redundancy and automatic error detection and correction. Having visited the LOCKSS team earlier in my Fellowship, I was curious to see how this system (originally designed to enable libraries to collect and preserve locally materials published on the internet) could be implemented in an archival context.
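As a rough illustration of what "redundancy and automatic error detection and correction" means in practice, the sketch below compares checksums of the copies held by several storage nodes and overwrites any copy that disagrees with the majority. It is a deliberately simplified assumption of mine, not the actual LOCKSS polling protocol, and the node names and the `audit_and_repair` function are hypothetical.

```python
import hashlib
from collections import Counter


def digest(data: bytes) -> str:
    """Return the SHA-256 checksum of one stored copy."""
    return hashlib.sha256(data).hexdigest()


def audit_and_repair(copies: dict[str, bytes]) -> list[str]:
    """Compare the copies held by each node and repair any outliers.

    `copies` maps a (hypothetical) node name to the bytes it holds.
    Copies that disagree with the majority checksum are overwritten
    in place with a good copy. Returns the names of repaired nodes.
    """
    votes = Counter(digest(content) for content in copies.values())
    winner, count = votes.most_common(1)[0]
    if count <= len(copies) / 2:
        # Without a clear majority we cannot tell which copy is right.
        raise ValueError("no majority among copies; manual review needed")

    good_copy = next(c for c in copies.values() if digest(c) == winner)
    repaired = []
    for node, content in copies.items():
        if digest(content) != winner:
            copies[node] = good_copy
            repaired.append(node)
    return repaired


copies = {
    "node_a": b"council minutes 1998",
    "node_b": b"council minutes 1998",
    "node_c": b"council minutes 1?98",  # simulated corruption on one node
}
repaired_nodes = audit_and_repair(copies)
```

The more independent copies there are, the more damaged ones the vote can tolerate, which is why cheap storage across many nodes can stand in for expensive storage in one place.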

The envisaged workflow for PeDALS works best when there are clear series of records – in other words, it should work pretty well for government record series, but less well for miscellaneous private and personal accessions. This is because the system is based upon the application of systematic ‘business rules’ to process large sets of similar records in the most efficient way possible. This programming work can only be justified where there are sufficient records of a similar type, created as the result of a routine process. As has become something of a theme in most of the operational digital archives I have visited, the PeDALS team originally intended to focus on born-digital records but has found that many routine processes are still embedded in a paper system, and hence is currently working primarily with digitised records.

The current phase of the collaborative, inter-State NDIIPP PeDALS project is looking at writing these business rules and setting up the PeDALS workflow and storage systems.  Without going into all of this in a tremendous amount of detail (I’d suggest a look at the PeDALS website for further details), the basic idea is to write the rules once and then allow individual participants in the network to tweak them to suit their local circumstances.
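The "write once, tweak locally" idea can be pictured as a shared set of default rules with each participant's overrides layered on top. The sketch below is purely illustrative: the rule names, values and the `rules_for` function are my own assumptions and are not taken from the PeDALS project.

```python
# Hypothetical shared rules for one record series, written once for the network.
SHARED_RULES = {
    "retention_years": 10,
    "access": "public",
    "target_format": "PDF/A",
}


def rules_for(local_overrides: dict) -> dict:
    """Return the shared rules with one participant's local tweaks applied.

    Only rules that already exist in the shared set may be overridden,
    so a typo in a local configuration fails loudly instead of being
    silently ignored.
    """
    unknown = set(local_overrides) - set(SHARED_RULES)
    if unknown:
        raise KeyError(f"unknown rule(s): {sorted(unknown)}")
    return {**SHARED_RULES, **local_overrides}


# A hypothetical participant with a longer local retention period.
local_rules = rules_for({"retention_years": 25})
```

The shared set itself is never modified, so every participant starts from the same baseline and only the differences need to be maintained locally.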

Whilst the system is still very much in the early stages of being built, the project is definitely worth colleagues in the UK local archives network keeping an eye on – not least because of the emphasis on keeping costs down. As well as the main project website, there is an update log at https://pedals.updatelog.com/login (you need to register for a username and password).


Probably the best posting I can make on the Internet Archive, based in San Francisco, is to encourage colleagues to have a look at their Archive-It subscription service, and perhaps attend a free webinar about the tool (details on the site) or at least have a look through some of the collections from partner institutions in US State Archives.

Although not yet listed on the site, some UK colleagues are already experimenting with the tool. The Internet Archive offers full hosting and storage, or can ship the results of the web crawl back to the partner institution – as they will be doing for the major full Australian domain web crawl for the National Library of Australia, which had just been completed at the time of my visit. The IA is also working with LOCKSS for storage of harvested websites, and hoping to work with the digital repository software platforms DSpace and Fedora. Tools to enable more sophisticated pre-crawl scoping and to bookmark potential sites of interest before harvesting are also due for release soon.
