Posts Tagged ‘Costs’

Presentations from the successful open consultation day held at TNA on 12 November on digital preservation for local authority archivists are now available on the DPC website – including my report on my Churchill Fellowship research in the US and Australia.  Also featured were colleagues from other local authority services already active in practical digital preservation initiatives – Heather Needham on ingest work at Hampshire, Viv Cothey reporting on his GAIP tool developed for Gloucestershire Archives, and Kevin Bolton on web archiving work at Manchester City. 

Heather and I also reported back on the results of the digital preservation survey of local authorities; a copy of the interim report is now available on the DPC site. A paper incorporating the discussion arising from the survey and from the afternoon sessions of the consultation event will be published in Ariadne in January 2009.


Lots of interesting work going on at North Carolina State Archives – plenty to read on their electronic records page. One project I’d particularly like to highlight is their work on the preservation of e-mail.

E-mail seems to be one of those types of electronic record about which there has been a great deal of discussion of how difficult it is to preserve, but not much (at least that I knew of) in the way of practical advice on how you might actually go about keeping it.

As well as the very practical guidelines for users, and suggested retention periods for e-mail, staff in the North Carolina State Archives Government Records Branch have been working on a collaborative project to transform e-mail from its native format into XML for preservation. The catalyst for this project was the deposit of e-mail messages from a former North Carolina governor and his staff. The website for the e-mail project has a full set of documentation, and links to other e-mail preservation initiatives. More recently, North Carolina has been working with the Collaborative Electronic Records Project (CERP) at the Smithsonian Institution Archives and the Rockefeller Archive Center, and an XML schema for a single e-mail account has now been published.
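
To give a feel for what that kind of transformation involves, here is a minimal sketch in Python, using only the standard library, of reading messages out of an mbox mailbox and writing them into a simple XML structure. The element names and file names are invented for illustration; the published North Carolina/CERP account schema is considerably richer.

```python
import mailbox
import xml.etree.ElementTree as ET

def mbox_to_xml(mbox_path, xml_path):
    """Convert an mbox mailbox into a simple, illustrative XML file."""
    account = ET.Element("Account")
    for msg in mailbox.mbox(mbox_path):
        message = ET.SubElement(account, "Message")
        # Carry over a handful of common headers.
        for header in ("From", "To", "Date", "Subject", "Message-ID"):
            value = msg.get(header)
            if value is not None:
                ET.SubElement(message, header.replace("-", "")).text = str(value)
        # Keep only the plain-text body for this sketch; a real project must
        # also deal with attachments, character encodings and multipart mail.
        if msg.is_multipart():
            parts = [p for p in msg.walk() if p.get_content_type() == "text/plain"]
            raw = parts[0].get_payload(decode=True) if parts else b""
        else:
            raw = msg.get_payload(decode=True)
        ET.SubElement(message, "Body").text = (raw or b"").decode("utf-8", "replace")
    ET.ElementTree(account).write(xml_path, encoding="utf-8", xml_declaration=True)

# Hypothetical file names, purely for illustration.
mbox_to_xml("governor_account.mbox", "governor_account.xml")
```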

I have also visited the Smithsonian Institution Archives, who have also developed some automated tools to help with the processing of e-mail archives, which they hope to make available on their website in due course. The CERP Project will be of particular interest to UK local archives, since this work has been achieved with an emphasis on low-cost solutions suitable for small and medium-sized organisations.


Visiting Arizona was a useful way of pulling together many of the strands of what I’ve learnt so far. I was particularly interested in the Persistent Digital Archives and Library System (PeDALS) project, which aims to create an automated workflow for processing digital collections, but also to keep costs as low as possible in an effort to reduce the barriers to addressing the challenges of digital preservation.

The automation aim is of course shared with another of the State Government NDIIPP projects at Washington State Digital Archives, and there are indeed some conceptual similarities in the workflow. However, PeDALS also makes use of a LOCKSS (Lots of Copies Keep Stuff Safe) private network to provide inexpensive storage with plenty of redundancy and automatic error detection and correction. Having visited the LOCKSS team earlier in my Fellowship, I was curious to see how this system (originally designed to enable libraries to collect and preserve locally materials published on the internet) could be implemented in an archival context.
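
The basic idea behind that redundancy is easy to illustrate. The sketch below is a toy example, not the actual LOCKSS polling protocol: it keeps several copies of the same file, detects a damaged copy by comparing checksums, and repairs it from a copy that still matches the majority. The file paths are invented.

```python
import hashlib
from collections import Counter
from pathlib import Path

def sha256(path):
    """Return the SHA-256 digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()

def audit_and_repair(copies):
    """Toy audit-and-repair across several replicas of one file.

    The real LOCKSS network uses a peer-to-peer polling protocol between
    independent boxes; a simple majority of local hashes stands in for
    that here, just to show the principle.
    """
    digests = {path: sha256(path) for path in copies}
    majority_digest, _ = Counter(digests.values()).most_common(1)[0]
    good_copy = next(p for p, d in digests.items() if d == majority_digest)
    for path, digest in digests.items():
        if digest != majority_digest:
            print(f"repairing damaged copy: {path}")
            Path(path).write_bytes(Path(good_copy).read_bytes())

audit_and_repair(["copy1/record.tif", "copy2/record.tif", "copy3/record.tif"])
```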

The envisaged workflow for PeDALS works best when there are clear series of records – in other words, it should work pretty well for government record series, but less well for miscellaneous private and personal accessions. This is because the system is based upon the application of systematic ‘business rules’ to process large sets of similar records in the most efficient way possible. This programming work could only be justified where there are sufficient records of a similar type, being created as the result of a routine process. As has become something of a theme in most of the operational digital archives I have visited, the PeDALS team originally intended to focus on born-digital records but has found that many routine processes are still embedded in a paper system, and hence is currently working primarily with digitised records.

The current phase of the collaborative, inter-State NDIIPP PeDALS project is looking at writing these business rules and setting up the PeDALS workflow and storage systems.  Without going into all of this in a tremendous amount of detail (I’d suggest a look at the PeDALS website for further details), the basic idea is to write the rules once and then allow individual participants in the network to tweak them to suit their local circumstances.
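
As a rough illustration of what such a rule might look like (the series names, fields and values below are invented, not taken from the PeDALS documentation), the idea is that each record series gets a small, declarative rule describing how its files should be processed, which each participant can then override in part for local circumstances.

```python
from dataclasses import dataclass, field

@dataclass
class SeriesRule:
    """A made-up example of a 'business rule' for one record series."""
    series: str                # record series the rule applies to
    accepted_formats: tuple    # file formats expected on ingest
    retention: str             # retention/disposal decision
    access: str = "open"       # default access restriction
    extra_metadata: dict = field(default_factory=dict)

    def customise(self, **overrides):
        """Let a participant tweak the shared rule to suit local circumstances."""
        return SeriesRule(**{**self.__dict__, **overrides})

# A rule written once and shared across the network...
shared = SeriesRule(
    series="County court minutes",
    accepted_formats=(".pdf", ".tif"),
    retention="permanent",
)

# ...and tweaked by one participant to reflect a local closure period.
local = shared.customise(access="closed for 30 years")
print(local)
```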

Whilst the project is still very much in the early stages of building the system, it is definitely worth colleagues in the UK local archives network keeping an eye on – not least because of the emphasis on keeping costs down. As well as the main project website, there is an update log at https://pedals.updatelog.com/login (you need to register for a username and password).


This was my most challenging (in a thought-provoking way) visit so far. The Washington State (upper left hand corner of the US, for those whose geography is as hazy as mine was!!) Digital Archives doesn’t seem to be terribly well known in the UK, and I’d certainly recommend colleagues have a look at their website, particularly some of the background documents in the About Us section. The Center for Technology in Government’s case study on the public value of returns to government resulting from the Digital Archives investment (available at http://www.ctg.albany.edu/publications/reports/proi_case_washington?chapter=1) is also well worth reading.

Why was the visit challenging? Well, essentially because this digital repository has been largely conceived and operated as an IT development project, and more recently as a business service and disaster recovery facility for creating agencies and departments within state and local government in Washington (and in this sense has certain parallels with TNA’s Digital Continuity Project). Microsoft, being in Washington’s backyard in Seattle, also have a not inconsiderable influence, and the Digital Archives staff have a strong working relationship and level of support from Microsoft.

Quite a contrast, then, from the Australian operational repositories, whose workflows are firmly rooted in archival and recordkeeping paradigms, often with a strong commitment to the use of open source software and XML open standards.

Initially, I found this approach very difficult to grasp, and indeed the staff at the State Archives freely admit that future developments of the Digital Archives will require a greater degree of partnership between archival staff and technologists. As I learnt more about the detail of the Digital Archives operation, however, I began to see both parallels with other digital archive operations (for instance, in maintaining authenticity and safe transfer of custody of files by means of sealed hard drives and secure FTP transfer) and ways in which the greater level of IT input into this Digital Archives has enabled extremely high levels of automation and efficiency in processing and searching.

The current run rate for ingest of single-page TIFF images is over a million a day; use of the website (boosted by the decision to concentrate initially on the ingest of digitised birth, marriage and death records) runs at around fifty to sixty thousand uses a day.

I still struggle, from a conceptual archival point of view, with the way in which different record series are merged together for access, and would hope to see a greater degree of contextual information in series descriptions in the future, although I can understand the processing efficiencies gained through only having to manage a single large database. The approach really makes you think about which of your archival assumptions are vital theoretical foundations for facilitating secondary use of archival resources, and which are merely legacies of a paper world.

Washington will also be of interest to those colleagues who would like to see regional partnerships of digital archives develop in the UK. Washington is leading one of the current round of NDIIPP projects to develop a centralised multi-state digital preservation consortium so that other States in the US can benefit from the expertise and workflows developed in Washington. Further details are available from the project website.


Having read through the original LIFE project documentation, I was looking forward to the project conference for the follow-on research, LIFE2. It is all too common, unfortunately, to hear doom-laden rumours peddled about the supposedly high costs of digital preservation, often in contexts where this received wisdom becomes a convenient excuse to avoid addressing the real challenges of digital curation and preservation. Chris Rusbridge argued in an Ariadne article that the idea that digital preservation is expensive is simply a fallacy. For the local government archives context at least, it seems to me that there is simply insufficient evidence either to support or to discredit the assertion. Would LIFE2 offer us an objective tool to assess the likely costs of developing and running a digital preservation service for local government?

LIFE2 promised a revised lifecycle costing model, including mappings to relevant digital preservation standards such as OAIS, clearer element descriptions, and a new set of case studies, including an examination of digitised (rather than born-digital) newspaper material. This case study was designed to allow for the comparison of analogue and digital lifecycles and to begin a cost comparison.
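
To make the idea of a lifecycle costing model a little more concrete, here is a deliberately simplified sketch. The stage names and figures are invented for illustration and are not LIFE2's element definitions; the point is simply that one-off and recurring costs are accumulated per lifecycle stage over the period the material is kept.

```python
# Toy lifecycle cost calculation. The stages loosely echo the kind of
# elements a lifecycle model covers (acquisition, ingest, metadata creation,
# storage, access, preservation planning), but names and figures are invented.
ONE_OFF = {
    "acquisition": 2_000,
    "ingest": 3_500,
    "metadata": 4_000,
}
PER_YEAR = {
    "storage": 1_200,
    "access": 800,
    "preservation_planning": 1_500,
}

def lifecycle_cost(years):
    """Total cost of keeping the collection for a given number of years."""
    return sum(ONE_OFF.values()) + years * sum(PER_YEAR.values())

for horizon in (5, 10, 20):
    print(f"{horizon:>2} years: £{lifecycle_cost(horizon):,}")
```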

Whilst the revised LIFE2 model is more closely aligned in terminology to OAIS, and the elements now appear in a more logical order, I admit I was disappointed that the model does not seem as transferable to the local government archives context as I had hoped. As Neil Beagrie pointed out in his presentation on the costs of curating research data, the decision to exclude infrastructure costs (such as the start-up costs of building a digital repository or of maintaining a technology watch service) means that the tool could not be used in a business case comparing in-house curation of digital material with outsourcing, which is the major decision facing me at West Yorkshire Archive Service. Some attempt to address this shortcoming is being made with the development of a Generic Preservation Model (GPM), which will be released at the end of the LIFE2 Project.

The newspaper case study also ran into difficulties around differing patterns of access and the problems of retrospective costing. The case study used a per-entity costing model to compare the costs of preserving a digitised newspaper collection with a year's analogue curation costs for legal deposit newspapers, but the results, while interesting, are not truly comparable.

I had hoped I might be able to use the tool to compare the not inconsiderable costs of building and fitting out an archives building for traditional materials conforming to BS 5454 with the costs of developing automated tools and digital storage and management capacity for born-digital and hybrid collections. There was discussion at the end of the day about how the LIFE project might progress in a next phase, which – promisingly – included the development of a predictive costing tool, further case studies and scenario building, and a proposal that comparison studies be made between the costs of a shared preservation service and those of an in-house digital repository.

There was also extensive discussion during the panel session about the need to demonstrate the value of digital preservation, particularly to funding bodies. The LIFE tool offers a method for digital repositories to assess costs; different kinds of value assessments are required to convince funders. The point was also made that the more significant properties of digital objects a repository attempts to preserve, the greater the cost – making me ponder on the potential for integrating the lifecycle costing models into preservation planning tools, such as PLATO.
