
Archive for the ‘Preservation Tools’ Category

It’s been a while since I’ve posted here purely on digital preservation issues: my work has moved in other directions, although I did attend a number of the digital preservation sessions at the Society of American Archivists’ conference this summer.  I retain a keen interest in digital preservation, however, particularly in developments which might be useful for smaller archives.  Recently, I’ve been engaged in a little work for a project called DiSARM (Digital Scenarios for Archives and Records Management), preparing some teaching materials for the Masters students at UCL to work from next term, and in revising the contents of a guest lecture I present to the University of Liverpool MARM students on ‘Digital Preservation for the Small Repository’.  Consequently, I’ve been trying to catch up on the last couple of years (since I left West Yorkshire Archive Service at the end of 2009) of new digital preservation projects and research.

So what’s new?  Well, from a small archives perspective, I think the key development has been the emergence of several digital curation workflow management systems – Archivematica, Curator’s Workbench, the National Archives of Australia’s Digital Preservation Software Platform (others…?) – which package together a number of different tools to guide the archivist through a sequenced set of stages for the processing of digital content.  The currently available systems vary in their approaches to preservation, comprehensiveness, and levels of maturity, but represent a major step forward from the situation just a couple of years ago.  In 2008, if – like me when WYAS took in the MLA Yorkshire archive as a testbed – you didn’t have much (or any) money available, your only option was, as one of the former Liverpool students memorably pointed out to me, to cobble together a set of tools as best you could from old socks and a bit of string.  Now we have several offerings approaching an integrated software solution; moreover, these packages are generally open source and freely available, so would-be adopters are able to download each one and play about with it before deciding which might suit them best.

Having said that, I still think it is important that students (and practitioners, of course) understand the preservation strategies and assumptions underlying each software suite.  When we learn how to catalogue archives, we are not trained merely to use a particular software tool.  Rather, we are taught the principles of archival description, and then we move on to see how these concepts are implemented in practice in EAD or by using specific database applications, such as (in the U.K.) CALM or Adlib.  For DiSARM, students will design a workflow and attempt to process a small sample set of digital documents using their choice of one or more of the currently available preservation tools, which they will be expected to download and install themselves.  This do-it-yourself approach will mirror the practical reality in many small archives, where the (frequently lone) archivist often has little access to professional IT support.  Similarly, students at UCL are not permitted to install software onto the university network.  Rather than see this as a barrier, again I prefer to treat this situation as a reflection of organisational reality.  There are a number of very good reasons why you would not want to process digital archives directly onto your organisation’s internal network, and recycling and re-purposing old computer equipment of varying technical specifications and capabilities to serve as workstations for ingest is a fact of life even, it seems, for Mellon-funded projects!

In preparation for writing this DiSARM task, I began to put together for my own reference a spreadsheet listing all the applications I could think of, or have heard referenced recently, which might be useful for preservation processing tasks in small archives.  For each tool, I set out to record the following (a sketch of the spreadsheet’s structure appears after the list):

  • the version number of the latest (stable) release
  • the licence arrangements for each tool
  • the URL from which the software can be downloaded
  • basic system requirements (essentially the platform(s) on which the software can be run – we have surveyed the class and know there is a broad range of operating systems in use, including several flavours of both Linux and Windows, and Mac OS X)
  • location of further documentation for each application
  • end-user support availability (forums or mailing lists etc)
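
(For what it’s worth, that structure translates into something like the following – a quick sketch using Python’s csv module, where the single ClamAV row is purely illustrative; always check the project’s own site for current versions and licence terms.)

    # A sketch of the spreadsheet structure only; the single row shown is
    # illustrative, not authoritative - always check the project website.
    import csv

    fields = ["tool", "latest_stable_version", "licence", "download_url",
              "system_requirements", "documentation", "user_support"]

    rows = [
        {"tool": "ClamAV", "latest_stable_version": "(check site)",
         "licence": "GPL", "download_url": "https://www.clamav.net/",
         "system_requirements": "Unix/Linux; a Windows port exists",
         "documentation": "project website", "user_support": "mailing lists"},
    ]

    with open("preservation_tools.csv", "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fields)
        writer.writeheader()
        writer.writerows(rows)
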
This all proved surprisingly difficult.  I was half expecting that user-friendly documentation and (especially) support might often be lacking in the smaller projects, but several websites also lack clear statements about system requirements or the legal conditions under which the software may be installed and used.  Does ‘educational use and research’ cover a local authority archives providing research services to the general public (including academics)?  Probably not, but it would presumably allow for use in a university archives.

Thanks to the wonders of interpreted programming languages (mostly Java, but Python also puts in an occasional appearance), many tools are effectively cross-platform, but it is astonishing how many projects fail to say so clearly.  This is self-evident to a developer, of course, but not at all obvious to an archivist, who will probably be worried about bringing coffee into the repository, let alone a reptile.  Oh, and if you expect your software to be compiled from source, or require sundry other faffing around at a command line before use, I’m sorry, but your application is not “easy to implement” for ordinary mortals, as more than one site claimed.  Is it really so hard to generate binary executables for common operating systems (or, if you have a good excuse – such as Archivematica, which is still in alpha development – at least to provide detailed step-by-step instructions)?  Many projects of course use SourceForge to host code but another website for documentation and updates – it can be quite confusing finding your way around.  The venerable ClamAV seems to have undergone some kind of Windows conversion, and although I’m sure that Unix packages must be there somewhere, I’m damned if I could find them easily…

All of which plays into a wider debate about just how far the modern archivist’s digital skills ought to reach (there are many other versions of this debate; the one linked – from 2006, so now quite old – just happens to be one of the most comprehensive attempts to define a required digital skill set for information practitioners).  No doubt there will be readers of this post who believe that archivists shouldn’t be dabbling in this sort of stuff at all, especially if they also work for an organisation which lacks the resources to establish a reliable infrastructure for a trusted digital repository.  And certainly I’ve been wondering lately whether some kind of archivists’ equivalent of The Programming Historian would be welcome or useful, teaching basic coding tailored to common tasks that an archivist might need to carry out.  But essentially, I don’t subscribe to the view that all archivists need to re-train as computer scientists or IT professionals.  Of course, these skills are still needed (obviously!) within the digital preservation community, but to drive a car I don’t need to be a mechanic or have a deep understanding of transport infrastructure.  Digital preservation needs to open up spaces around the periphery of the community where newcomers can experiment and learn, otherwise it will become an increasingly closed and ultimately moribund endeavour.


8am on Saturday morning, and those hardy souls who had not yet fled to beat Hurricane Irene home, or who were stranded in Chicago, plus other assorted insomniacs, were presented with a veritable smörgåsbord of digital preservation goodness.  The programme has many of the digital sessions scheduled at the same time, and today I decided not to session-hop but to stick it out in one session in each of the morning’s two hour-long slots.

My first choice was session 502, Born-Digital archives in Collecting Repositories: Turning Challenges into Byte-Size Opportunities, primarily an end-of-project report on the AIMS Project.  It’s been great to see many such practical digital preservation sessions at this conference, although I do slightly wonder what it will take before working with born-digital truly becomes part of the professional mainstream.  Despite the efforts of all the speakers at sessions like this (and in the UK, colleagues at the Digital Preservation Roadshows with which I was involved, and more recent similar events), there still appears to be a significant mental barrier which stops many archivists from giving it a go.  As the session chair began her opening remarks this morning, a woman behind me remarked “I’m lost already”.

There may be some clues in the content of this morning’s presentations: in amongst my other work (as would be the case for most archivists, I guess) I try to keep reasonably up-to-date with recent developments in practical digital preservation.  For instance, I was already well aware of the AIMS Project, although I’d not had a previous opportunity to hear about their work in any detail, but here were yet more newly suggested tools for digital preservation.  I happen to know of FTK Imager, having used it with the MLA Yorkshire archive accession, although what wasn’t stated was that the full FTK forensics package is damn expensive, and the free FTK Imager Lite (scroll down the page for links) is an adequate and more realistic proposition for many cash-strapped archives.  BagIt is familiar too, but Bagger, a graphical user interface to the BagIt Library, is new since I last looked (I’ll add links later – the Library of Congress site is down for maintenance).  Sleuthkit was mentioned at the research forum earlier this week, but fiwalk (“a program that processes a disk image using the SleuthKit library and outputs its results in Digital Forensics XML”) was another new one on me, and there was even talk in this session of hardware write-blockers.

All this variety is hugely confusing for anybody who has to fit digital preservation around another day job, not to mention the potential expense of buying hardware and software, and the skills necessary to install and maintain such a jigsaw-puzzle system.  As the project team outlined their wish list for yet another application, Hypatia, I couldn’t help wondering whether we couldn’t promote a little more convergence between all these different tools, both digital-preservation-specific and more general.  For instance, the requirement for a graphical drag ’n’ drop interface to help archivists create the intellectual arrangement of a digital collection and add metadata reminded me very much of recent work at Simmons College on a graphical tool to help teach archival arrangement and description (whose name I forget, but will add when it comes back to me!*).  I was particularly interested in the ‘access’ part of this session: the idea that FTK’s bookmark and label functions could be transformed into user-generated content tools, enabling researchers to annotate and tag records, and the use of network graphs as a visual finding aid for email collections.
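
(As an aside, for anyone who hasn’t met BagIt: creating and validating a bag really is simple.  Here is a minimal sketch, assuming the Library of Congress’s bagit-python library is installed – the accession directory name is made up.)

    # A minimal sketch, assuming the Library of Congress bagit-python
    # package is installed (pip install bagit); the directory name is
    # hypothetical.
    import bagit

    # Turn an existing directory of files into a bag in place: payload
    # checksums are generated and a bag-info.txt is written alongside.
    bag = bagit.make_bag("accession-2011-042",
                         {"Source-Organization": "Example Archives"})

    # Later - e.g. after moving the bag to new storage - re-validate it.
    if bag.is_valid():
        print("bag validates: no files lost or altered")
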

The rabbit-caught-in-headlights problem seems less of an issue for archivists jumping on the Web 2.0 bandwagon, which was the theme of session 605, Acquiring Organizational Records in a Social Media World: Documentation Strategies in the Facebook Era, where we heard about the use of social media, primarily Facebook, to contact and document student activities and student societies in a number of university settings, and from a university archivist just beginning to dip her toe into Twitter.  As a strategy of working directly with student organisations and providing training to ‘student archivists’ was outlined – a method of enabling the capture of social media content, both at the point of upload and by web-crawling sites afterwards – I was reminded of my own presentation at this conference: surely here is another example of real-life community development?  The archivist is deliberately ‘going out to where the community is’ and adapting to the community norms and schedules of the students themselves, rather than expecting the students to comply with archival rules and expectations.

This afternoon I went to learn about SNAC: the Social Networks and Archival Context project (session 710), something I’ve been hearing other people mention for a long time now but knew little about.  SNAC is extracting names (corporate, personal, family) from Encoded Archival Description (EAD) finding aids into EAC-CPF records, then matching these together, and against pre-existing authority records, to create a single archival authorities prototype.  The hope is then to extend this authorities cooperative both nationally and potentially internationally.
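
(To make that first step concrete – and this is my own illustrative sketch, not the SNAC project’s code – pulling candidate names out of an EAD finding aid can be as simple as walking the XML for the relevant elements.  The filename is hypothetical, and namespace handling is glossed over.)

    # An illustrative sketch only (not SNAC's own pipeline): pull personal,
    # corporate and family names out of an EAD finding aid using the Python
    # standard library. The input filename is hypothetical, and EAD
    # namespaces are ignored for simplicity.
    import xml.etree.ElementTree as ET

    tree = ET.parse("finding_aid.xml")
    names = set()
    for tag in ("persname", "corpname", "famname"):
        # These elements turn up in <controlaccess>, <origination>, etc.
        for element in tree.iter(tag):
            if element.text and element.text.strip():
                names.add((tag, " ".join(element.text.split())))

    for kind, name in sorted(names):
        print(kind + ": " + name)
    # Matching these strings against existing authority records - the hard
    # part - is where the real SNAC work begins.
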

My sincere thanks to the Society of American Archivists for their hospitality during the conference, and once again to those who generously funded my trip – the Archives and Records Association, University College London Graduate Conference Fund, UCL Faculty of Arts and UCL Department of Information Studies.

* UPDATE: the name of the Simmons archival arrangement platform is Archivopteryx (not to be confused with the Internet mail server Archiveopteryx, which has an additional ‘e’ in the name)


Day 1 Proper of the conference began with acknowledgements to the organisers, some kind of raffle draw and then a plenary address by an American radio journalist.  Altogether this conference has a celebratory feel to it – fitting since this is SAA’s 75th Anniversary year, but very different in tone from the UK conferences where the opening keynote speaker tends to be some archival luminary.  More on the American archival cultural experience later.

My session with Kate Theimer (of ArchivesNext fame) and Dr Elizabeth Yakel from the University of Michigan (probably best known amongst tech-savvy UK practitioners for her work on the Polar Bear Expedition Finding Aid) followed immediately afterwards, and seemed to go well.  The session title was: “What Happens After ‘Here Comes Everybody’: An Examination of Participatory Archives”.  Kate proposed a new definition for Participatory Archives, distinguishing between participation and engagement (outreach); Beth spoke about credibility and trust, and my contribution was primarily concerned with contributors’ motivations to participate.  A couple of people, Lori Satter and Mimi Dionne, have already blogged about the session (did I really say that?!), and here are my slides:

After lunch, I indulged in a little session-hopping, beginning in session 204 hearing about Jean Dryden’s copyright survey of American institutions, which asked whether copyright limits access to archives by restricting digitisation activity.  Dryden found that American archivists tended to take a very conservative approach to copyright expiry terms and obtaining third party permission for use, even though many interviewees felt that it would be good to take a bolder line.  Also, some archivists’ knowledge of American copyright law was shaky – sounds familiar!  It would be interesting to see how UK attitudes would compare; I suspect the results would be similar.  However, I also wonder how easy it is in practical terms to suddenly start taking more of a risk-management approach to copyright after many years of insisting upon strict copyright compliance.

Next I switched to session 207, The Future is Now: New Tools to Address Archival Challenges, hearing Maria Esteva speak about some interesting collaborative work between the Texas Advanced Computing Center and NARA on visual finding aids, similar to the Australian Visible Archive research project. At the Exhibit Hall later, I picked up some leaflets about other NARA Applied Research projects and tools for file format conversion, data mining and record type identification which were discussed by other speakers in this session.

Continuing the digitisation theme, although with a much more philosophical focus, Joan Schwartz in session 210, Genuine Encounter, Authentic Relationships: Archival Covenant & Professional Self-Understanding, discussed the loss of materiality and context resulting from the digitisation of photographs (for example, a thumbnail image presented out of its album context).  She commented that archivists are often too caught up with the ‘how’ of digitisation rather than the ‘why’.  I wouldn’t disagree with that.

Back to the American archival cultural experience, I was invited to the University of Michigan ‘alumni mixer’ in the evening – a drinks reception with some short speeches updating alumni on staff news and recent developments in the archival education courses at the university.  All in all, archives students are much in evidence here: there are special student ‘ribbons’ to attach to name badges, many students are presenting posters on their work, and there is a special careers area where face-to-face advice is available from more senior members of SAA, current job postings are advertised, and new members of the profession can even pin up their curriculum vitae.  Some of this (the public posting of CVs in particular) might seem a bit pushy for UK tastes, and the one-year length of UK Masters programmes (and the timing of Conference) of course precludes the presentation of student dissertation work.  But the general atmosphere seems very supportive of new entrants to the profession, and I feel there are ideas here that ARA’s New Professionals section might like to consider for future ARA Conferences.


This should be the first of several posts from this year’s Society of American Archivists Annual Meeting in Chicago, for which I have received generous funding to attend from UCL’s Graduate Conference Fund, and from the Archives and Records Association who asked me to blog the conference.  First impressions of a Brit: this conference is huge.  I could (and probably will) get lost inside the conference hotel, and the main programme involves parallel tracks of ten sessions at once.  And proceedings start at 8am.  This is all a bit of a shock to the system; I’m not sure anybody would turn up if proceedings started before 9am at the earliest back home!  Anyway, the twitter tag to watch is #saa11, although with no wifi in the session rooms, live coverage of sessions will be limited to those who can get a mobile phone signal, which is a bit of a shame.

The conference proper starts on Thursday; the beginning of the week is mostly taken up with meetings, but on Tuesday I attended an impressive range of presentations at the SAA Research Forum.  Abstracts and bios for each speaker are already online (and are linked where relevant below), and I understand that slides will follow in the next week or so.  Here are some personal highlights and things which I think may be of interest to archivists back home in the UK:

It was interesting to see several presentations on digital preservation, many reflecting similar issues and themes to those which inspired my Churchill Fellowship research and the beginning of this blog back in 2008.  Whilst I don’t think I’d recommend anyone set out to learn about digital preservation techniques the hard way, with seriously obsolete media, if you do find yourself in the position of having to deal with 5.25 inch floppy disks or the like, Karen Ballingher’s presentation on students’ work at the University of Texas at Austin had some handy links, including the UT-iSchool Digital Archaeology Lab Manual and related documentation, and the open source forensics package Sleuth Kit.  Her conclusions were more generally applicable, and familiar: document everything you do, including failures; plan out your trials; and just do it – learn by doing a real digital preservation project.

Cal Lee was excellent (as ever) on Levels of Representation in Digital Collections, outlining a framework of digital information constructed of eight layers of representation, from the bit- (or byte-)stream up to aggregations of digital objects, and noting that archival description already supports description at multiple levels but has not yet evolved to address these multiple representation layers.

Eugenia Kim’s paper on her ChoreoSave project, to determine the metadata elements required for digital dance preservation, reminded me of several UK and European initiatives: Siobhan Davies Replay, which Eugenia herself referenced and talked about at some length; the University of the Arts London’s John Latham Archive, which I’ve blogged about previously – Eugenia commented that choreographers had found the task of entering data into the numerous metadata fields onerous, and once again it seems to me there is a tension between the (dance, in this case) event and the assumption that text offers the only or best means of describing and accessing that event; and the CASPAR research on the preservation of interactive multimedia performances at the University of Leeds.

For my current research work on user participation in archives, the following papers were particularly relevant.  Helice Koffler reported on the RLG Social Metadata Working Group’s project evaluating the impact of social media on museums, libraries and archives.  A three-part report is to be issued; part one is due for publication in September 2011, and I understand it will include some useful and much-needed definitions of ‘user interaction’ terminology.  Part 1 has moderation as its theme – Helice commented that a strict moderation policy can act as a barrier to participation (something I agree with up to a point, and will explore further in my own paper on Thursday).  Part 2 will be an analysis of the survey of social media use undertaken by the Working Group (four U.K. organisations were involved in this, although none were archives).  As my own interviews with archivists would also suggest, the survey found little evidence of serious problems with spam or abusive behaviour on MLA contributory platforms.  Ixchel Faniel reported on University of Michigan research into whether trust matters for re-use decisions.

With my UKAD hat on, the blue sky (sorry, I hate that term, but I think it’s appropriate in this instance) thinking on archival description methods which emerged from the Radcliffe Workshop on Technology and Archival Processing was particularly inspiring.  The workshop was a two-day event which brought together invited technologists (many of whom had not previously encountered archives at all) and archivists to brainstorm new ways to tackle cataloguing backlogs, streamline cataloguing workflows and improve access to archives.  A collections exhibition was used to spark discussion, together with specially written use cases and scenarios to guide each day’s discussion.  Suggestions included the use of foot-pedal-operated overhead cameras to enable archival material to be digitised either at the point of accessioning or during arrangement and description, and experimenting with ‘trusted crowdsourcing’ – asking archivists to check documents for sensitivity – as a first step towards automating the redaction of confidential information.  These last two suggestions reminded me of two recent projects at The National Archives in the U.K. – John Sheridan’s work to promote expert input into legislation.gov.uk (does anyone have a better link?) and the proposal to use text mining on closed record series which was presented to DSG in 2009.  Adam Kreisberg presented on the development of a toolkit for running focus groups by the Archival Metrics Project.  The toolkit will be tested with a sample session based upon archives’ use of social media, which I think could be very valuable for U.K. archivists.

Finally – only because I couldn’t fit this one into any of the categories above – I found Heather Soyka and Eliot Wilczek’s questions on how modern counter-insurgency warfare can be documented intriguing and thought-provoking.


A round-up of a few pieces of digital goodness to cheer up a damp and dark start to October:

What looks like a bumper new issue of the Journal of the Society of Archivists (shouldn’t it be getting a new name?) is published today.  It has an oral history theme, but actually it was the two articles that don’t fit the theme which caught my eye for this blog.  Firstly, Viv Cothey’s final report on the Digital Curation project, GAip and SCAT, at Gloucestershire Archives, with which I had a minor involvement as part of the steering group for the Society of Archivists’-funded part of the work.  The demonstration software developed by the project is now available for download via the project website.  Secondly, Candida Fenton’s dissertation research on the Use of Controlled Vocabulary and Thesauri in UK Online Finding Aids will be of interest to my colleagues in the UKAD network.  The issue also carries a review, by Alan Bell, of Philip Bantin’s book Understanding Data and Information Systems for Recordkeeping, which I’ve also found a helpful way in to some of the more technical electronic records issues.  If you do not have access via the authentication delights of Shibboleth, no doubt the paper copies will be plopping through ARA members’ letterboxes shortly.

Last night, by way of supporting the UCL home team (read: total failure to achieve self-imposed writing targets), I had my first go at transcribing a page of Jeremy Bentham’s scrawled notes on Transcribe Bentham.  I found it surprisingly difficult, even on the ‘easy’ pages!  Admittedly, my palaeographical skills are probably a bit rusty, and Bentham’s handwriting and neatness leave a little to be desired – he seems to have been a man in a hurry – but what I found most tricky was not being able to glance at the page as a whole and get the gist of the sentence ahead at the same time as attempting to decipher particular words; in particular, not being able to search down the whole page looking for similar letter shapes.  The navigation tools do allow you to pan, scroll, and zoom in and out, but when you’ve got the editing page up on the screen as well as the document, you’re a bit squished for space.  Perhaps it would be easier if I had a larger monitor.  Anyway, it struck me that this type of transcription task is definitely a challenge for people who want to get their teeth into something, not the type of thing you might dip in and out of in a spare moment (like indicommons on iPhone and iPad, for instance).

I’m interested in reward and recognition systems at the moment, and how crowdsourcing projects seek to motivate participants to contribute.  Actually, it’s surprising how many projects seem not to think about this at all – the ‘build it and wait for them to come’ attitude.  Quite often, it seems, the result is that ‘they’ don’t come, so it’s interesting to see Transcribe Bentham experiment with a number of tricks for monitoring progress and encouraging people to keep on transcribing.  So, there’s the Benthamometer for checking on overall progress; you can set up a watchlist to keep an eye on pages you’ve contributed to; individual registered contributors can set up a user profile to state their credentials and chat to fellow transcribers on the discussion forum; and there’s a points system, based on how active you are on the site, with a leader board of top transcribers.  The leader board seems to be fuelling a bit of healthy transatlantic competition right at the moment, but given the ‘expert’ wanting-to-crack-a-puzzle nature of the task here, I wonder whether the more social / community-building facilities might prove more effective over the longer term than the quantitative approaches.  One to watch.

Finally, anyone with the techie skills to mash up data ought to be welcoming The National Archives’ work on designing the Open Government Licence (OGL) for public sector information in the U.K.  I haven’t (got the technical skills), but I’m welcoming it anyway in case anyone who has hasn’t yet seen the publicity about it, and because I am keen to be associated with angels.


Last Thursday I was delighted to attend the culminating workshop for the Society of Archivists‘ (SoA) funded digital curation project at Gloucestershire Archives.  As Viv Cothey, the developer employed by Gloucestershire Archives, has noted, “Local authority archivists may well be fully aware of the very many exhortations to do digital curation and to get involved but are frustrated by not knowing where to start”.  Building upon previous work on a prototype desktop ingest packager (GAip), the SoA project set out to create a proof of concept demonstration of a ‘trusted digital store’ suitable for use by a local government record office.  The workshop was an important outreach element of the project, aiming to build up understanding and experience of digital curation principles and workflow amongst archivists in the UK.  I have been involved with the management board for the SoA project, so I was eager to see how the demonstration tools which have been developed would be received by the wider digital preservation and archivist professional communities.

Others are much better qualified than me to evaluate the technical approach that the project has taken, and indeed Susan Thomas has already blogged her impressions over at futureArch.  For me, what was especially pleasing was to see a good crowd of ‘ordinary’ archivists getting stuck in with the demonstration tools – despite the unfamiliarity of the Linux operating system – and teasing out the purpose and process of each of the digital curation tools provided.  I hope that nobody objects to my calling them ‘ordinary’ – I think they will know what I mean, and it is how I would describe myself in this digital preservation context.

Digital preservation research has hitherto clustered around opposite ends of a spectrum.  At one end are the high-level conceptual frameworks: OAIS and the like.  At the other end are the practical developments in repository and curation workflow tools in the higher education, national repository, and scientific research communities.  The problem here is the technological jargon, which is frankly incomprehensible to your average archivist.  Gloucestershire’s project therefore attempts to fill an important gap in current provision, by providing a set of training tools to promote experimentation and discourse at practitioner level.

I’ll be interested to see the feedback from the workshop, and it’d be good to see some attendee comments here…


Chris Prom’s talk yesterday at the Society of Archivists’ Data Standards Group meeting, on his Fulbright research ‘Tools for implementing Digital Preservation Standards’ for the ‘under-resourced’ archive (presentation slides should be available here shortly), has finally spurred me into posting a round-up of projects I’ve encountered over the last couple of months which are specifically relevant to digital preservation in a small archives repository.

When I embarked upon my Churchill Fellowship in 2008, practical implementations of digital preservation research were only occurring in large repositories, usually at a national or sometimes state level.  With the notable exception of the Paradigm project and related work at Oxford University, there had been few attempts to scale down the large programmes, or to package up the various tools available with the products of the digital library/repository world, as envisaged by the 2007 UNESCO report Towards an Open Source Archival Repository and Preservation System.  The smaller programmes I did visit were generally concentrating on a niche subset of digital archives (for example, email or web archives).

Dedicated followers of digital preservation issues are probably already aware of the RODA repository created on a Fedora base by the Portuguese National Archives, and may have read this review of the demo site from another UK local archivist.  Chris Prom is now embarking on a more formal assessment, and his blog postings on RODA (and the evaluation criteria he is using) make for worthwhile reading.  RODA is likely to be of particular interest to UK-based archivists who use the collections management software package CALM, since this is also in use at the Portuguese National Archives, although there doesn’t seem to have been any attempt to date to link the two together.  The obvious question is: what happens with a hybrid accession?

Chris also introduced yesterday’s audience to a new project, Archivematica, which is packaging already-available open source preservation tools into an Ubuntu Linux-based virtual appliance.  As the project’s wiki explains, ‘This means an entire suite of digital preservation tools is now available to the average archivist from one simple installation’.  This is a really exciting development and I am looking forward to seeing the results of Chris’s evaluation.  Archivematica is developed by the same Canadian team, Artefactual Systems, who are behind the ICA-AtoM archival description software commissioned by the International Council on Archives.

Closer to home, since I am involved on the board for one of the projects, it is remiss of me not to have mentioned before on this blog the digital curation work going on at Gloucestershire Archives, although the website itself has only been made available relatively recently.  This work is the first real attempt to develop a practical digital curation architecture in a UK local authority archives setting (as opposed to the piecemeal re-use of existing tools).  Plenty to explore here.

And finally, on a less technical level, but nevertheless, I think, an important development.  At the sixth of the Society of Archivists’ roadshows in December 2009, I was delighted to hear of Kevin Bolton‘s work in drawing up simple accessioning checklists for digital archives at Manchester Archives and Local Studies, and – most importantly – how these are being developed regionally for the North West, in conjunction with Cheshire Archives and Local Studies.  Particularly at this time of economic recession (or are we supposed to be out of that now?) I believe it is vital that smaller archives pool their resources and work in partnership to find solutions to digital archives issues, and it is good to see a framework for the future being mapped out here in the North West.


Presentations from the successful open consultation day held at TNA on 12 November on digital preservation for local authority archivists are now available on the DPC website – including my report on my Churchill Fellowship research in the US and Australia.  Also featured were colleagues from other local authority services already active in practical digital preservation initiatives – Heather Needham on ingest work at Hampshire, Viv Cothey reporting on his GAip tool developed for Gloucestershire Archives, and Kevin Bolton on web archiving work at Manchester City Council.

Heather and I also reported back on the results of the digital preservation survey of local authorities, and a copy of the interim report is now available on the DPC site.  A paper incorporating the discussion arising from the survey, from the afternoon sessions of the consultation event, will be published in Ariadne in January 2009.


Lots of interesting work going on at North Carolina State Archives – plenty to read on their electronic records page. One project I’d particularly like to highlight is their work on the preservation of e-mail.

E-mail seems to be one of those types of electronic record which attracts lots and lots of discussion about how difficult it is to preserve, but not so much (at least that I knew of) in the way of practical advice on how you might actually go about keeping it.

As well as the very practical guidelines for users, and suggested retention periods for e-mail, staff in the North Carolina State Archives Government Records Branch have been working on a collaborative project to transform e-mail from its native format into XML for preservation. The catalyst for this project was the deposit of e-mail messages from a former North Carolina governor and his staff. The website for the e-mail project has a full set of documentation, and links to other e-mail preservation initiatives. More recently, North Carolina has been working with the Collaborative Electronic Records Project (CERP) at the Smithsonian Institution Archives and the Rockefeller Archive Center, and an XML schema for a single e-mail account has now been published.
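
(To give a flavour of what ‘transforming e-mail into XML’ involves – and I should stress this is my own minimal sketch, not the North Carolina or CERP schema, with filenames and element names made up – a single mbox account can be mapped to simple XML using only the Python standard library.)

    # A minimal sketch only - not the North Carolina/CERP schema. Maps the
    # headers and plain-text bodies of an mbox file into simple XML using
    # the Python standard library. Filenames and element names are made up.
    import mailbox
    import xml.etree.ElementTree as ET

    mbox = mailbox.mbox("governor.mbox")
    account = ET.Element("account")

    for message in mbox:
        msg_el = ET.SubElement(account, "message")
        for header in ("From", "To", "Date", "Subject", "Message-ID"):
            value = message.get(header)
            if value:
                ET.SubElement(msg_el,
                              header.lower().replace("-", "_")).text = value
        # Attachments and multipart messages - a large part of the real
        # preservation problem - are simply skipped in this sketch.
        if not message.is_multipart():
            ET.SubElement(msg_el, "body").text = str(message.get_payload())

    ET.ElementTree(account).write("governor.xml", encoding="utf-8",
                                  xml_declaration=True)
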

I also visited the Smithsonian Institution Archives, who have developed some automated tools to help with the processing of e-mail archives, which they hope to make available on their website in due course.  The CERP project will be of particular interest to UK local archives, since this work has been achieved with an emphasis on low-cost solutions suitable for small and medium-sized organisations.


Probably the best posting I can make on the Internet Archive, based in San Francisco, is to encourage colleagues to have a look at their Archive-It subscription service, and perhaps attend a free webinar about the tool (details on the site), or at least have a look through some of the collections from partner institutions such as the US state archives.

Although not yet listed on the site, some UK colleagues are already experimenting with the tool.  The Internet Archive offers full hosting and storage, or can ship the results of the web crawl back to the partner institution – as they will be doing for the major full Australian domain web crawl for the National Library of Australia, which had just been completed at the time of my visit.  The IA is also working with LOCKSS for storage of harvested websites, and hoping to work with the digital repository software platforms DSpace and Fedora.  Tools to enable more sophisticated pre-crawl scoping, and to bookmark potential sites of interest before harvesting, are also due for release soon.

