The National Archives and The Royal Historical Society
Gerald Aylmer Seminar, 21st April 2010

There was no doubting an enthusiasm for collaboration amongst the (appropriately diverse – archivists, academic historians, community activists, outreach professionals) audience at the stimulating ‘Diverse Histories – One Archive’ seminar (#1arc) organised jointly by The National Archives and the Royal Historical Society, and held at University College London last week.  Early in the day, Dr Tony Murray identified what he characterised as the mutually beneficial relationship which exists (or should exist) when community knowledge is shared in exchange for capacity-building support from ‘official’ archival organisations.

But I am not sure that anyone was quite expecting another ‘C’, consensus, which emerged from the discussions surrounding the kind of language used in archival description and indexing.  Of course, it may be – as Dr Matt Houlbrook pointed out – that there is a greater tolerance for potentially offensive terminology used in archives contexts amongst academics than in the wider community.  Nevertheless, participants at the seminar seemed clear that diversity will not be served by sanitizing the prejudices of the past.  Or, as the LGBT staff group at The National Archives have commented on Your Archives,  “the documents…whilst showing obvious [intolerant attitudes] also reveal to us the vibrancy and diversity of social life… Many valuable resources…still need to be identified and surfaced…and when we find them, make sure people record them in cataloguing projects, with accurate terminology that doesn’t change the meaning of the document, but that doesn’t reiterate [bigotry] found within them”.  S.I.Martin commented that the archivist’s role is to clarify, not censor; whilst Beth Brooke from TNA neatly concluded the day’s discussion by calling for two-way learning: the archivist to clarify the language, and the user to ‘read’ the archive.

Pondering further, I wondered how far the day’s outcomes reflected:

  1. The specific types of communities represented at the seminar (for ‘diverse’ read ‘traditionally marginalised’ or ‘persecuted’) and/or
  2. An assumption of archival/historical context

To elucidate, firstly, does the campaigning sense of social justice which often underpins historical research into marginalised or persecuted communities make for more willing user-archives collaborators, with greater resilience against potential controversy?  In a twist upon conventional archives’ outreach wisdom that equates increased archives access with greater user-empowerment, Dr Jeevan Deol suggested that increased access to historical sources would expose the prejudices of the past to a wider audience, and consequently to greater public dispute.  He commented that it is important that archives consider in advance how to deal with the fall-out of such wrangling.  This would also seem to require the archivist to relinquish some control over the archives in his or her care, something which (if the delegates’ tales of over-zealous archival gatekeeping are to be believed) the researchers of ‘diversity’ histories may feel to be somewhat overdue.  These communities are not likely to be mourning the demise of archival (or indeed any other traditional form of) authority (sorry, an ‘A’!).  Consequently, I suspect, such researchers might be more disposed to help enrich archival description with new perspectives and alternative readings – aka user generated content – than the researchers of more mainstream histories.

This said, however, I was struck by how the audience appeared to assume that the user’s introduction to archival sources would come solely (or at least primarily) via the catalogue – see, for example, the quote above, “make sure people record them in cataloguing projects”. Of course, this was a seminar sponsored by The National Archives, which specifically invited delegates to critique cataloguing and indexing in the archive.  But I wondered how many of the audience had considered how easily historical records can become decoupled from their archival descriptive context in an online world, surfacing again on facebook, blogs, flickr and web mashups.  Surely it is not the language of the catalogue and other archival finding aids which is significant here, so much as conventions of citation which tie the historical record back to its archival context.  In the analogue world, citation has not been something that has concerned archives professionals too much¹.  A gentleman’s agreement over publication permission has helped to preserve some illusion of control over the contexts in which archival materials are re-used, but on the whole we have been happy to let authors and publishers set their own conventions for referencing our collections.  The archivist could concentrate on the catalogue; citation was the historian’s responsibility.

It should be self-evident that collaboration is not a one-way street.  User engagement methodologies such as Revisiting Archives invite the user into our descriptive world; social media applications may help to boost the audience for archives and permit greater transparency for our documentation processes.  But perhaps as archive professionals, we have not yet begun to look beyond our own established roles to see how we need to adapt our functions to support our users’ worlds.

¹ See http://archives30.wikidot.com/citations-in-the-wild-a-collection-of-preferred-formats which begins to illustrate some of the problems with archival citations by gathering together the myriad ways in which different Archives suggest their collections should be referenced.  See also Tim Sherratt, ‘Emerging Technologies for the Provision of Access to Archives: Issues, Challenges and Ideas’, 2009, pp.24-26. A thank you to Tim for pointing me again towards this report, which pre-empts many of my thoughts here on collaboration, control and context.

In conversation with the very excellent RunCoCo project at Oxford University last Friday, I revisited a question which will, I think, prove central to my current research – establishing trust in an online archival environment.  This is an important issue both for community archives, such as Oxford’s Great War Archive, and for conventional Archive Services which are taking steps to open up their data to user input in some way – whether this be (for example) by enabling user comments on the catalogue, or establishing a wiki, or perhaps making digitised images available on flickr.

A simple, practical scenario to surface some of the issues:

An image posted to flickr with minimal description.  Two flickr users, one clearly a member of staff at the Archives concerned, have posted suggested identifications.  Since they both in fact offer the same name (“Britannia Mill”), it is not immediately clear whether they both refer to the same location, or whether the second comment contradicts the first.

Which comment (if either) correctly identifies the image?  Would you be inclined to trust an identification from a member of staff more readily than you’d accept “Arkwright”’s comment?  If so, why? Clicking on “Arkwright”’s profile, we learn that he is a pensioner who lives locally.  Does this alter your view of the relative trustworthiness of the two comments (for all we know, the member of staff might have moved into the area just last week)? How could you test the veracity of the comments?  Whose responsibility is this? If you feel it’s the responsibility of the Archive Service in question, what resources might be available for this work? If you worked for the Archive Service, would you feel happy to incorporate information derived from these comments into the organisation’s finding aids?  Bear in mind that any would-be user searching for images of “Britannia Mills” – wherever the location – would not find this image using the organisation’s standard online catalogue: is potentially unreliable information better than no information at all? What would you consider an ‘acceptable’ quality and/or quantity level for catalogue metadata for public presentation? You might think this photograph should never have been uploaded to flickr in its current state – but even this meagre level of description has been sufficient to start an interesting – potentially useful? – discussion.  In the same way, a relatively poor quality scan has been ‘good enough’ to enable public access outside of the repository, although it would certainly not suffice for print publication.

Such ambivalence and uncertainty about accepting user contributions is one reason that The National Archives wiki Your Archives was initially designed “to be ‘complementary’ to the organisation’s existing material” rather than fully integrated into TNA’s website.

In our discussion on Friday, we identified four ways in which online archives might try to establish trust in user contributions:

  • User Profiles: enabling users to provide background information on their expertise.  The Polar Bear Expedition Archives at the University of Michigan have experimented with this approach for registered users of the site, with apparently ambiguous results.  Similar features are available on the Your Archives wiki, although similarly, few users appear to use them, except for staff of TNA.  Surfacing the organisational allegiance of staff is of course important, but would not inherently make their comments more trustworthy (as discussed above), unless more in-depth information about their qualifications and areas of expert knowledge is also provided.  A related debate about whether or not to allow anonymous comments, and the reliability of online anonymous contributions, extends well beyond the archival domain.
  • Shifting the burden of proof to the user: offering to make corrections to organisational finding aids upon receipt of appropriate documentation.  This is another technique pioneered on the Polar Bear Expedition Archives site, but might become burdensome given a particularly active user community.
  • Providing user statistics and/or manipulating the presentation of user contributions on the basis of user statistics: i.e. giving more weight to contributions from users whose previous comments have proved to be reliable.  Such techniques are used on Wikipedia (users can earn enhanced editing rights by gaining the trust of other editors), and user information is available from Your Archives, although somewhat cumbersome to extract – in its current form, I think it is unlikely anybody would use this information to form reliability judgements.  This technique is sometimes also combined with…
  • Rating systems: these can be either organisation-defined ratings (as in, for instance, the Brooklyn Museum Collection Online – I do not know of an archives example) or user-defined (the familiar Amazon or eBay ranking system – but, again, I can’t think of an instance where such a system has been implemented in an archives context, although it is often talked about – can you?). Flickr implements a similar principle, whereby registered users can ‘favourite’ images.
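
To make the last two techniques a little more concrete, here is a minimal sketch in Python of how contributions might be ordered by a contributor's track record, nudged by ratings from other users.  Everything in it is a hypothetical illustration: the `Contributor` and `Comment` names, the smoothing rule and the sample figures are my own assumptions, and it does not describe how Wikipedia, Your Archives or flickr actually work.

```python
from dataclasses import dataclass


@dataclass
class Contributor:
    name: str
    verified: int = 0  # past contributions later confirmed as correct
    rejected: int = 0  # past contributions later shown to be wrong

    def reliability(self) -> float:
        """Smoothed share of confirmed contributions.

        Adding 1 to the confirmed count and 2 to the total means a
        contributor with no history scores 0.5: neither trusted nor
        distrusted. This smoothing rule is an assumption of the sketch.
        """
        return (self.verified + 1) / (self.verified + self.rejected + 2)


@dataclass
class Comment:
    author: Contributor
    text: str
    favourites: int = 0  # a simple user-defined rating

    def weight(self) -> float:
        # Author track record, scaled up by ratings from other users.
        return self.author.reliability() * (1 + self.favourites)


def rank_comments(comments: list[Comment]) -> list[Comment]:
    """Return comments ordered from most to least heavily weighted."""
    return sorted(comments, key=lambda c: c.weight(), reverse=True)


if __name__ == "__main__":
    # Invented figures, loosely echoing the flickr scenario above.
    staff = Contributor("archives_staff", verified=3, rejected=1)
    local = Contributor("Arkwright", verified=8, rejected=0)
    comments = [
        Comment(staff, "Britannia Mill"),
        Comment(local, "Britannia Mill", favourites=2),
    ]
    for c in rank_comments(comments):
        print(f"{c.author.name}: {c.weight():.2f}")
```

Note that the sketch still depends on somebody, somewhere, deciding which past contributions count as ‘verified’, so it relocates the human judgement call rather than removing it.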

A quick scan of Google Scholar reveals much research into establishing trust in the online marketplace, and of trust-building in the digital environment as a customer relationship management activity.  But are these commercial models necessarily directly applicable to information exchange in the archives environment, where the issue at stake is not so much the customer’s trust in the organisation or project concerned (although this clearly has an impact on other forms of trust) so much as the veracity and reliability of the historical information presented?

Do you have any other suggestions for techniques which could be (or are) used to establish trust in online archives, or further good examples of the four techniques outlined in archival practice?  It strikes me that all four options above rely heavily upon human interpretation and judgement calls, therefore scalability will become an issue with very large datasets (particularly those held outside of an organisational website) which the Archives may want to manipulate machine-to-machine (see this recent blog post and comments from the Brooklyn Museum).

Today I am tired.  Last night I watched live proceedings from the House of Commons with increasing disillusionment, as the Digital Economy Bill was ‘washed-up’ with unseemly haste before the dissolution of parliament.  There are many reasons – political, ideological, personal, professional – why I am so dismayed by the passing of this deeply flawed piece of legislation. Archivally, the biggest disappointment is actually the government’s withdrawal of Clause 43, which would have permitted re-use of ‘orphan’ works (where no author or copyright owner can be traced).  But the potential repercussions upon collaborative creativity are even wider.

Clearly there will be some specific implications for user collaboration in archives contexts, but I need more time and a clearer head to consider them, away from the enraged, polarised rhetoric which has characterised many reactions to yesterday’s Commons debacle.  Commentaries which draw deliberate parallels with China are not helpful, but I am grateful to this blog post about the bill for drawing my attention to a great lecture by Larry Lessig about user generated content, the potential for the revival of what Lessig characterises as ‘read-write culture’, and the need to develop a new consensus over business models which will support such a culture of creativity.  Enjoy!

On 27th March (yes, I know, Easter got in the way) I attended the Rewired Culture unconference at The Guardian in London.  I’d not been to an unconference before, let alone one associated with a hackday, but I’d followed similar initiatives, such as the THATCamp series, at a distance via twitter and blog postings.  So I was intrigued – if a little nervous – to find out from the inside how such an event worked. [Coincidentally, there has been the most extraordinary flame today on the UK Records Management listserv about the concept of an unconference, which is obviously unfamiliar (excuse pun) to many records professionals in the UK.  I hope this blog post goes a little way towards demonstrating the potential value of this type of event to the archives and records sector.]

The day’s events were organised jointly by DCMS and Rewired State, a not-for-profit company whose mission is neatly summed up in their tagline ‘geeks meet government’.  Rewired Culture, which also masqueraded under the twitter hashtag #rsrc, aimed to bring together cultural ‘data owners’ (such as Museums, Libraries and Archives) with Britain’s “vibrant developer community” and “growing and active entrepreneurial base”.  The half day unconference strand (which was free, incidentally – thank you) offered an opportunity to discuss how cultural creators (ie record creators in an archive context), curators (read archivists), developers (IT professionals) and entrepreneurs can collaborate to exploit the potential of cultural content and promote innovation in a participatory web2.0 world:

How do we ensure that the exciting work already underway in a number of organizations is shared more generally, so even smaller bodies and SMEs can learn from best practice and find workable routes to market? What are the cultural content business models for the 21st century? …for data owners, entrepreneurs, data users and communities to discuss business models, funding mechanisms and challenges.

Encouraged by the promise that at an unconference, “everybody’s voice is as valid as everyone else’s”, I went along nevertheless expecting to be the only archivist in a room full of people from the big national museums.  I was pleasantly surprised, therefore, to find that fellow participants included a bunch of colleagues from The National Archives, as well as a number of other people who for a variety of reasons had an interest in smaller cultural organisations.

My own attendance was also prompted by a somewhat vaguely thought-through idea that techie/geek mashups making use of cultural content could be viewed as one extreme of a user-collaboration continuum (disclaimer: these are very much thoughts-in-progress, and need a lot more mashing!):

During Rewired Culture, I was pointed towards the work of one of the current Clore Fellows, Claire Antrobus, who is researching user-led innovation in art galleries.  There are some interesting parallels and contrasts with the archives domain here, and I like the ‘user-led innovation’ concept.

Each unconference session lasted for an hour (possibly a little too long – at times I felt the discussions would benefit from more focus, but this perhaps depends on the participants in each group and anyway, you are at liberty to ‘vote with your feet’ and join another session if you wish, something which is not usually possible in a formal conference setting).  The first session I attended discussed institutional barriers to opening up cultural data.  Some familiar themes emerged, including language barriers between ‘techies’ and ‘curators’, business drivers for engaging in new, potentially risky, areas of work at a time of significant budget cuts in the public-sector, and identifying external funding streams for technological innovation (I wondered specifically whether the regional structure of the principal archives-sector grant funder, the Heritage Lottery Fund, and the emphasis they place upon localised community outcomes for projects they support, inhibits innovation in the re-use of archival content on the internet, which is by definition global in its reach).  The session also surfaced what I felt was a misunderstanding of the positivist, Jenkinsonian theory of the archivist as passive custodian (as opposed to active interpreter) of archival content, which one museum professional present had taken as a particular reluctance amongst archivists to open up archival data.  My former employer, West Yorkshire Archive Service, has had its full electronic catalogue freely available on the internet for over ten years, which is more than can be said, even now, of many local museum services.  Admittedly there is plenty of work still to be done in making this catalogue data available in re-usable, developer-friendly formats, and there is a definite need for better data aggregators in the archives sector – the UK Archives Discovery Network may have an important role to play here.  
But it would be wrong to fail to recognise the achievements of the sector in making archival catalogue data available, and consequently to miss out on opportunities for its re-use (particularly where it is even now held as easily harvested and re-purposed Encoded Archival Description, as with the ArchivesHub and A2A federated collections).  Equally, there is perhaps a need to bring postmodernist trends in archival theory to greater prominence within the UK archives practitioner community, and to explore how such concepts might support the kind of technology- and user-mediated innovation under discussion at the Rewired Culture unconference.
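
As a rough illustration of what ‘easily harvested and re-purposed’ might look like in practice, the following Python sketch flattens a small hierarchical description into one record per component, each carrying the titles of its parent levels as context.  The sample fragment is invented for the purpose (it is not taken from the ArchivesHub, A2A or any real catalogue), and it assumes un-namespaced EAD 2002-style element names (`archdesc`, `did`, `unitid`, `unittitle`, `dsc`, `c01`, `c02`); real files may use XML namespaces or unnumbered `c` elements and would need more careful handling.

```python
import re
import xml.etree.ElementTree as ET

# Invented sample: a fonds containing one series containing one item.
SAMPLE_EAD = """\
<ead>
  <archdesc level="fonds">
    <did>
      <unitid>EX/1</unitid>
      <unittitle>Example Mill Company records</unittitle>
    </did>
    <dsc>
      <c01 level="series">
        <did>
          <unitid>EX/1/1</unitid>
          <unittitle>Photographs</unittitle>
        </did>
        <c02 level="item">
          <did>
            <unitid>EX/1/1/1</unitid>
            <unittitle>View of mill from the canal</unittitle>
          </did>
        </c02>
      </c01>
    </dsc>
  </archdesc>
</ead>
"""

# Matches 'c' and the numbered components 'c01' to 'c12'.
COMPONENT = re.compile(r"c(0[1-9]|1[0-2])?")


def walk(node, ancestors, out):
    """Collect one record per described level, keeping parent titles."""
    did = node.find("did")
    if did is not None:
        title = did.findtext("unittitle", default="").strip()
        out.append({
            "ref": did.findtext("unitid", default="").strip(),
            "title": title,
            "level": node.get("level", ""),
            "context": list(ancestors),
        })
        ancestors = ancestors + [title]
    for child in node:
        # Descend only into the description wrapper and its components.
        if child.tag == "dsc" or COMPONENT.fullmatch(child.tag):
            walk(child, ancestors, out)


def flatten(ead_xml):
    """Return a flat list of records from an EAD-style XML string."""
    root = ET.fromstring(ead_xml)
    records = []
    archdesc = root.find("archdesc")
    if archdesc is not None:
        walk(archdesc, [], records)
    return records


if __name__ == "__main__":
    for rec in flatten(SAMPLE_EAD):
        print(rec["ref"], "|", " > ".join(rec["context"] + [rec["title"]]))
```

The design choice worth noticing is the `context` list: an item-level title such as ‘View of mill from the canal’ means very little once separated from the series and fonds above it, so each flattened record keeps its hierarchy with it.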

Following on from this, the second session I attended considered what would make the ideal API for a cultural organisation.  Here we seemed to be back in ‘If we build it, will they come?’ territory, or to be more precise, ‘If we release open data, what do we expect developers to do with it?’.  Indeed, I agree, it would be very useful to know what use has been made of existing cultural sector APIs and datasets, such as that provided by the V&A Museum, or, to give an archives example, what use has been made of the NARA catalogue data that has been made available for download.  As a non-geek archivist (albeit with geek-like tendencies), I also freely admit I do not altogether understand what data formats are optimal to maximise potential for re-use, nor does the developer community seem to articulate clearly what ‘open data’ might mean in practical terms.

Finally, at the end of the afternoon, we came to the hack presentations.  I was slightly disappointed that only two of the creations (HMRC Artworks and LandingZone) made any use of actual cultural content (as opposed to information about special events or the geographical locations of cultural organisations).  Nor, as far as I know, was any use made of archives sector data (although I do not know what data was provided, and it may be that there was no suitable archive data to hand).  So the hackers had maybe breathed new life into the discoverability of collections, whereas the real promise of user-led innovation in the cultural sector, it seems to me, is to enhance meaning and understanding of collections.  However, I left thinking that a hackday with archival data could prove an interesting experiment – and something of a technical challenge, presumably, given the contextual richness and complexity of archival catalogue data, in comparison to the discrete object record of the typical museum or library catalogue.

Incidentally, for an alternative view of the same sessions, Brian Kelly has written up his impressions of the day here and here (I have similar thoughts about Saturday events!).

