Posted in Digital Preservation Networks, tagged Arizona, Costs, data storage, digital archives, digital preservation, digital storage, LOCKSS, low cost, NDIIPP, PeDALS, Preservation Networks, USA on 5 October 2008
Visiting Arizona was a useful way of pulling together many of the strands of what I’ve learnt so far. I was particularly interested in the Persistent Digital Archives and Library System (PeDALS) project, which aims to create an automated workflow for processing digital collections, but also to keep costs as low as possible in an effort to reduce the barriers to addressing the challenges of digital preservation.
The automation aim is of course shared with another of the State Government NDIIPP projects at Washington State Digital Archives, and there are indeed some conceptual similarities in the workflow. However, PeDALS also makes use of a LOCKSS (Lots of Copies Keeps Stuff Safe) private network to provide inexpensive storage with plenty of redundancy and automatic error detection and correction. Having visited the LOCKSS team earlier in my Fellowship, I was curious to see how this system (originally designed to enable libraries to collect and preserve locally materials published on the internet) could be implemented in an archival context.
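For readers who like to see an idea in code, here is a minimal sketch of what "redundancy with automatic error detection and correction" can mean in practice. It is an illustration only and not the actual LOCKSS protocol, which uses a considerably more sophisticated polling process between peers. The node names, the function names and the simple majority rule are all my own assumptions.

```python
import hashlib
from collections import Counter


def digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of one stored copy."""
    return hashlib.sha256(data).hexdigest()


def audit_and_repair(copies: dict[str, bytes]) -> list[str]:
    """Compare every node's copy, repair the dissenters, return their names.

    Hypothetical helper: a copy counts as damaged when its digest differs
    from the digest held by a strict majority of nodes. If there is no
    strict majority, nothing is repaired, because no copy can be trusted
    over the others.
    """
    if not copies:
        return []
    digests = {node: digest(data) for node, data in copies.items()}
    winner, votes = Counter(digests.values()).most_common(1)[0]
    if votes * 2 <= len(copies):
        return []  # no strict majority, so leave everything untouched
    good = next(data for node, data in copies.items() if digests[node] == winner)
    damaged = sorted(node for node, d in digests.items() if d != winner)
    for node in damaged:
        copies[node] = good  # overwrite the damaged copy with an agreed one
    return damaged


# Three hypothetical storage nodes, one of which holds a corrupted copy.
copies = {"node-a": b"record", "node-b": b"record", "node-c": b"rec0rd"}
repaired = audit_and_repair(copies)
```

The point of the sketch is simply that the more independent copies there are, the easier it becomes to spot and fix a bad one without anyone having to intervene.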
The envisaged workflow for PeDALS works best when there are clear series of records – in other words, it should work pretty well for government record series, but less well for miscellaneous private and personal accessions. This is because the system is based upon the application of systematic ‘business rules’ to process large sets of similar records in the most efficient way possible. This programming work could only be justified where there are sufficient records of a similar type, being created as the result of a routine process. As has become something of a theme in most of the operational digital archives I have visited, the PeDALS team originally intended to focus on born-digital records but has found that many routine processes are still embedded in a paper system, and hence is currently working primarily with digitised records.
The current phase of the collaborative, inter-State NDIIPP PeDALS project is looking at writing these business rules and setting up the PeDALS workflow and storage systems. Without going into all of this in a tremendous amount of detail (I’d suggest a look at the PeDALS website for further details), the basic idea is to write the rules once and then allow individual participants in the network to tweak them to suit their local circumstances.
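The "write once, tweak locally" idea can be sketched in a few lines. This is purely illustrative and is not taken from the PeDALS documentation: the rule fields, the series name and every value below are hypothetical, chosen only to show a shared rule set being overridden by one participant.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SeriesRule:
    """One hypothetical business rule describing how a record series is processed."""
    series: str
    file_format: str
    retention_years: int
    access: str


# Shared rules, written once for the whole network (all values are made up).
SHARED_RULES = {
    "example-series": SeriesRule("example-series", "TIFF", 75, "public"),
}


def local_rules(shared: dict[str, SeriesRule],
                overrides: dict[str, dict]) -> dict[str, SeriesRule]:
    """Return a participant's rule set: the shared rules plus local tweaks.

    The shared rules are never modified; each override produces a new rule
    with only the named fields changed.
    """
    rules = dict(shared)
    for series, changes in overrides.items():
        rules[series] = replace(rules[series], **changes)
    return rules


# A hypothetical participant shortens the retention period for one series.
rules = local_rules(SHARED_RULES, {"example-series": {"retention_years": 50}})
```

Because the shared rules stay untouched, each participant only has to maintain its own short list of differences.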
Whilst very much in the early stages of building the system, the project is definitely worth colleagues in the UK local archives network keeping an eye on – not least because of the emphasis on keeping costs down. As well as the main project website, there is an update log at https://pedals.updatelog.com/login (you need to register for a username and password).
This was my most challenging (in a thought-provoking way) visit so far. The Washington State (upper left hand corner of the US, for those whose geography is as hazy as mine was!!) Digital Archives doesn’t seem to be terribly well known in the UK, and I’d certainly recommend colleagues have a look at their website, particularly some of the background documents in the About Us section. The Center for Technology in Government’s case study on the public value of returns to government resulting from the Digital Archives investment (available at http://www.ctg.albany.edu/publications/reports/proi_case_washington?chapter=1) is also well worth reading.
Why was the visit challenging? Well, essentially because this digital repository has been largely conceived and operated as an IT development project, and more recently as a business service and disaster recovery facility for creating agencies and departments within state and local government in Washington (and in this sense has certain parallels with TNA’s Digital Continuity Project). Microsoft, being in Washington’s backyard in Seattle, also have a not inconsiderable influence, and the Digital Archives staff have a strong working relationship with, and level of support from, Microsoft.
Quite a contrast, then, from the Australian operational repositories, whose workflows are firmly rooted in archival and recordkeeping paradigms, often with a strong commitment to the use of open source software and XML open standards.
Initially, I found this approach very difficult to grasp, and indeed the staff at the State Archives freely admit that future developments of the Digital Archives will require a greater degree of partnership between archival staff and technologists. As I learnt more about the detail of the Digital Archives operation, however, I began to see both parallels with other digital archive operations (for instance, in maintaining authenticity and safe transfer of custody of files by means of sealed hard drives and secure FTP transfer) and ways in which the greater level of IT input into this Digital Archives has enabled extremely high levels of automation and efficiency in processing and searching.
The current run rate for ingest of single page TIFF images is over a million a day; use of the website (boosted by the decision to concentrate initially on the ingest of digitised birth, marriage and death records) runs at a level of around fifty to sixty thousand uses a day.
I still struggle, from a conceptual archival point of view, with the way in which different record series are merged together for access, and would hope to see a greater degree of contextual information in series descriptions in the future, although I can understand the processing efficiencies gained through only having to manage the one large database. The approach really makes you think about which of your archival assumptions are vital theoretical foundations for facilitating secondary use of archival resources, and which are merely legacies of a paper world.
Washington will also be of interest to those colleagues who would like to see regional partnerships of digital archives develop in the UK. Washington is leading one of the current round of NDIIPP projects to develop a centralized multi-state digital preservation consortium so that other States in the US can benefit from the expertise and workflows developed in Washington. Further details are available from the project website.
Posted in Digital Preservation Networks, Journal Articles, Preservation Tools, Research Projects, tagged AONS, Cultural Challenges, Library of Congress, National Library of Australia, NDIIPP, PeDALS, Planets, PLATO, Preservation Planning on 17 August 2008
A couple of articles in the most recent edition of the International Journal of Digital Curation caught my eye this week as I prepare for my forthcoming Winston Churchill Memorial Fellowship to Australia and the US.
Martha Anderson reviews the evolution of the National Digital Information Infrastructure and Preservation Program initiated by the Library of Congress, and draws some conclusions about lessons learned, many of which will be familiar to those of us working within existing partnership organisations, such as West Yorkshire Joint Services. The layered stewardship model introduced in the paper is nevertheless a useful concept to bear in mind as the UK archive sector begins to build our own national network of diverse stakeholders to tackle the digital preservation challenge. The full paper is available at http://www.ijdc.net/ijdc/article/view/59/60.
David Pearson and Colin Webb discuss issues of file format obsolescence and introduce the AONS II Project, something I hope to find out more about when I visit the National Library of Australia in September. The project aimed to develop a software tool that would find and report indicators of obsolescence risks. It will be interesting to see how this work fits with the European Planets Project and their PLATO preservation planning tool. The IJDC paper can be found at http://www.ijdc.net/ijdc/article/view/76/78.
I see more papers have appeared on the PeDALS project website in Arizona too – plenty of reading to get through…