Developing the Digital Archive at PROV
Australian archival theory emphasises the continuum of recordkeeping, with good records management seen as the core foundation of archival practice. This continuity is well illustrated by the Digital Archive project at Public Record Office Victoria (PROV).
Funding for the development of the Digital Archive was won on the back of previous successes in digital recordkeeping, beginning as far back as 1995 with the report ‘Keeping Electronic Records Forever’ and the resulting specification of a Victorian Electronic Records Strategy (VERS).
With firm backing at a strategic level within the Victorian State government, the project appears to have flourished in spite of its complexity, in terms of both project structure and technical implementation. Alongside staff contracted specially for the project and external contractors and consultants, PROV seconded around a fifth of its own staff to the Digital Archive over the life of the implementation project, to ensure good staff buy-in and familiarity with the Digital Archive workflows. This contribution was recognised in annual performance appraisals and emphasised as important to personal development. Because several of the contract staff were retained at PROV after the project ended, knowledge of the system has remained in-house, and the operation and maintenance of the Digital Archive is now well integrated into the general organisational structure.
With 67 FTE staff, PROV is larger than the typical local Archive Service in the UK but is not a very large institution, and it was felt important to integrate the Digital Archive into existing systems. Technical support for the ongoing operation of the Digital Archive consists of just three staff members. OAIS was found useful in establishing a common vocabulary at the very beginning of the project, but proved too high level to be of use in the tender and design phases. It also treats the archival system as a standalone function rather than one that must fit into a pre-existing organisation and its automated systems (such as the PROV catalogue). This integration – with the PROV archival control database and with existing workflow models for processing new accessions – proved to be one of the most challenging aspects of the whole project. Many of the processes followed in the analogue world can be streamlined in the digital one. For error handling, however, repeated trial runs before a transfer is completed have been found to work better than PROV staff attempting to correct errors in the records submitted.
The objective was to keep digital archives for the long term, defined as up to 100 years – the project found that this concrete period was far easier for the IT contractors to grasp than a vague aspiration of keeping records indefinitely. The solution had to be cost effective over time and to allow (online) access immediately: unlike the National Archives of Australia, PROV is not subject to a ‘30 Year [closure] Rule’ under the State of Victoria’s Public Records Act. Building on these considerations and the previous work on VERS, the decision was taken to use a single long-term format, the VEO or VERS Encapsulated Object. Essentially this is an XML wrapper containing the original digital object plus a normalised version (for example, PDF/A) and associated metadata to enable management of the digital record over time. The VEO serves as both the Submission Information Package (SIP) and the Archival Information Package (AIP) under the OAIS model, and as the basis for the Dissemination Information Package (DIP). The whole package, which may contain anything from a single record to hundreds, each consisting of multiple encodings, is then locked and signed with a digital signature. Future users will be able to verify that each record has remained unaltered since its transfer from the government Agency which deposited it at PROV.
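To make the idea of a signed wrapper concrete, here is a minimal sketch in Python. It is not the VERS schema: the element names (Package, Record, Metadata, Encoding, Seal) and the functions build_package and verify_package are invented for illustration, and an HMAC with a shared key stands in for the public-key digital signature a real VEO would carry. The sketch only shows the principle that several encodings of one record plus its metadata are wrapped together, sealed, and can later be checked for alteration.

```python
import base64
import hashlib
import hmac
import xml.etree.ElementTree as ET


def _canonical_bytes(element):
    """Serialise an element in canonical XML form so the seal is reproducible."""
    text = ET.tostring(element, encoding="unicode")
    return ET.canonicalize(text).encode("utf-8")


def build_package(record_id, encodings, metadata, key):
    """Wrap the encodings of one record, plus metadata, in a sealed XML package.

    encodings: dict mapping a label (e.g. "original", "pdfa") to bytes.
    metadata:  dict of simple string fields.
    key:       bytes for the illustrative HMAC seal (NOT a real VEO signature).
    """
    record = ET.Element("Record", {"id": record_id})
    meta = ET.SubElement(record, "Metadata")
    for name, value in sorted(metadata.items()):
        ET.SubElement(meta, "Field", {"name": name}).text = value
    for label, data in sorted(encodings.items()):
        enc = ET.SubElement(record, "Encoding", {"format": label})
        enc.text = base64.b64encode(data).decode("ascii")

    seal = hmac.new(key, _canonical_bytes(record), hashlib.sha256).hexdigest()

    package = ET.Element("Package")
    package.append(record)
    ET.SubElement(package, "Seal", {"algorithm": "HMAC-SHA256"}).text = seal
    return ET.tostring(package, encoding="utf-8")


def verify_package(package_bytes, key):
    """Return True if the record inside the package still matches its seal."""
    package = ET.fromstring(package_bytes)
    record = package.find("Record")
    if record is None:
        return False
    seal = package.findtext("Seal", default="")
    expected = hmac.new(key, _canonical_bytes(record), hashlib.sha256).hexdigest()
    return hmac.compare_digest(seal.encode("utf-8"), expected.encode("utf-8"))
```

Calling verify_package on an untouched package with the right key returns True; changing any encoding or metadata field afterwards makes it return False. A production system would use asymmetric signatures so that anyone can verify a record without being able to re-seal it.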
Long-term preservation formats are chosen on the basis of being open specifications and widely used (so that good translation tools are likely to remain available even after the format itself becomes obsolescent in the future). The current specification lists plain text, PDF (with a preference for PDF/A), JPEG, JPEG2000, TIFF and MPEG-4. Interestingly, PROV have not yet begun to work with database formats, which appears to be one of the more immediate issues facing West Yorkshire Archive Service if a recent records management audit at Leeds City Council is anything to go by.
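As a rough illustration of how a depositing Agency might triage files against this list before building a transfer, the sketch below sorts filenames into those already in a listed long-term format and those needing conversion. The extension mapping and the function name classify are assumptions made for the example; a real check would identify formats from file content rather than trusting extensions, and nothing here is taken from PROV's own tools.

```python
from pathlib import Path

# Extensions mapped to the long-term formats named in the text.
# The extension spellings are assumptions for illustration only.
LONG_TERM_FORMATS = {
    ".txt": "plain text",
    ".pdf": "PDF (PDF/A preferred)",
    ".jpg": "JPEG",
    ".jpeg": "JPEG",
    ".jp2": "JPEG2000",
    ".tif": "TIFF",
    ".tiff": "TIFF",
    ".mp4": "MPEG-4",
}


def classify(filenames):
    """Split filenames into accepted long-term formats and files needing conversion."""
    accepted = {}
    needs_conversion = []
    for name in filenames:
        fmt = LONG_TERM_FORMATS.get(Path(name).suffix.lower())
        if fmt is None:
            needs_conversion.append(name)
        else:
            accepted[name] = fmt
    return accepted, needs_conversion
```

A database export such as a hypothetical rates.mdb would land in the needs-conversion list, which mirrors the gap around database formats noted above.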
Bolstered by strong archival legislation, the PROV Digital Archive workflow pushes as much of the load back onto the depositing government Agencies as possible. Agencies are required to create the VEOs (on transfer) – software suppliers can participate in a compliance programme run by PROV to certify which products meet the VERS specifications and can generate valid VEOs. Currently five products are fully compliant, with several more close to completing the programme.
Unlike the National Archives of Australia, PROV do not maintain a ‘dark archive’. Digital accessions are securely stored on Centera file systems, which cannot be accessed by standard protocols and are continually monitored to detect corruption and repair failures. Off-site back-up copies are automatically updated. Controlling the whole system is a customised version of Documentum’s Enterprise Content Management platform – this proved costly in the short term, and is likely to continue to be a problematic and expensive aspect of the Digital Archive until such time as a public domain engine might become available to run an automated workflow on ingest.
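The monitor-and-repair idea can be sketched generically as a fixity audit: each stored file is compared against a recorded checksum and, if it fails, restored from a second copy. This is not how Centera works internally, which is not described here; the directory layout, the manifest of SHA-256 digests and the function names are all assumptions for the purpose of illustration.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def audit_and_repair(primary_dir, replica_dir, manifest):
    """Check each file against its recorded digest and repair from the replica.

    manifest: dict mapping filename -> expected SHA-256 hex digest
              (a hypothetical record made at ingest).
    Returns a dict mapping filename -> "ok", "repaired" or "unrecoverable".
    """
    report = {}
    for name, expected in manifest.items():
        primary = Path(primary_dir) / name
        replica = Path(replica_dir) / name
        if primary.exists() and sha256_of(primary) == expected:
            report[name] = "ok"
        elif replica.exists() and sha256_of(replica) == expected:
            # The primary copy is missing or corrupt; restore it from the good replica.
            shutil.copyfile(replica, primary)
            report[name] = "repaired"
        else:
            report[name] = "unrecoverable"
    return report
```

Run on a schedule, an audit of this kind turns silent corruption into a reported and usually self-healing event, which is the property the storage layer is relied on to provide.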
Public access to preserved digital records is via the PROV website at http://www.access.prov.vic.gov.au. Different icons indicate the format of individual items – a sheet of paper for analogue archives, a camera for digitised material, a disk for born-digital. Digital archives are delivered to the user in their long-term preservation format, although the full VEO can also be downloaded and the user can manually extract other formats from it. PROV decided not to require users to log in to obtain digital archives – this was felt to be an audit control needed in the reading room which was not necessary in the digital world. The Digital Archive currently holds over 300,000 VEOs, many of them digitised images from PROV’s own paper-based collections. Use of the PROV website has increased enormously.
Lessons learned from the project and subsequent operation of the Digital Archive included the importance of strong project management and stakeholder communications, and the need to have developers working on site rather than contracting the project overseas. The bulk sometimes encountered in archival accessions has created bottlenecks at certain points along the Digital Archive workflow. Customising off-the-shelf software is expensive. Further training is required for Agency staff who prepare the records for transfer, and simplifications to the procedures may be required.