It’s been a busy summer for me – lots of stimulating conferences and events.  Here’s my (eclectic) roundup of highlights:

No.1 spot has to go to the fabulous VeleHanden project, a collaborative digitisation and crowdsourcing project initiated by Amsterdam City Archives, with numerous archival partners from all over the Netherlands. I was lucky enough to be invited to the inaugural meeting of the user test panel for the pilot project, militieregisters (militia registers), in Amsterdam at the end of June.

Why do I never get to work in buildings like this?

The testing phase of the project is now well underway, and the project is due to go live in October.  VeleHanden interests me for a number of reasons: Firstly, it has an interesting and innovative private-public partnership funding model and project structure.  Participating archives have to pay to have their registers scanned by a commercial digitisation company, but the sheer size of the consortium has enabled the negotiation of a very low price per page digitised.  Research users of the militieregisters site will pay a small fee to download a digitised image (similar to Ancestry), thus providing an ongoing revenue stream for the project.  The crowdsourcing interface is being developed by a private company; in future the consortium (or individual members of the consortium) will hire the platform for new projects, and the developers will be free to sell their product to other crowdsourcing markets.  Secondly, I’m interested in the project’s (still evolving) approach to opening up archival metadata.  Thirdly, I’m interested in the way the project is going about recruiting and motivating volunteers to undertake the indexing of the registers – targeting the popular family history community; offering extrinsic quasi-financial rewards for participants in the shape of discounted access to digitised content; and promoting and celebrating competition between participants.

In fact, I think one of VeleHanden’s great strengths is the project’s user-focused approach to design and testing, the importance of which was highlighted by Claire Warwick in a ‘How To’ session on Studying Users at Interface 2011, “a new international forum to learn, share and network between the fields of Humanities and Technology”.  Slides from the keynote and workshop sessions at this event are available on the Interface 2011 website; all are worth a look.  I particularly enjoyed the workshop on Thinking Through Networks, and the practical tips on How to Get Funded should resonate with a much wider audience than just the academic community.  All the delegates had to give a lightning talk about their research.  Here is mine:

I also spoke at the Bloomsbury Conference on e-Publishing and e-Publications, and attended a couple of conferences I also went to last year – Research2, the Loughborough University student-organised conference on data analysis for information science, and AERI, the Archival Education and Research Institute, this year at Simmons College in Boston, MA.  It was interesting to note an increased interest in online participation and Internet-based methods at both events.  Podcasts of the AERI plenary sessions are available at the link above.

  • Digital Impacts: How to Measure and Understand the Usage and Impact of Digital Content, Oxford Internet Institute/JISC, Oxford, 20th May 2011 (#oiiimpacts)
  • Beyond Collections: Crowdsourcing for public engagement, RunCoCo Conference, Oxford, 26th May 2011 (#beyond2011)
  • Professor Sherry Turkle, Alone Together RSA Lecture, RSA, London, 1st June 2011 (#rsaonline)

I’m getting a bit behind with blog postings (again), so here, in the interests of ticking another thing off my to-do list, are a few highlights from various events I’ve attended recently…

It was good to see a couple of fellow archivists at the showcase conference for JISC’s Impact and Embedding of Digitised Resources programme. As searchroom visitor figures continue to fall, it is more important than ever that archivists understand how to measure and demonstrate the usage and impact of their online resources. The number of unique visitors to the archive service’s website (currently the only metric available in the CIPFA questionnaire for Archive Services, for instance) is no longer (if it ever was) adequate as a measure of online usage.  As Dr Eric Meyer pointed out in his introduction, one of the central lessons arising from the development of the Toolkit for the Impact of Digitised Scholarly Resources has been that no single metric will ever tell the whole story – a range of qualitative and quantitative methods is necessary to provide a full picture.  The word ‘scholarly’ in the toolkit’s name may be rather off-putting to some archivists working in local government repositories. That would be a shame, because this free online resource is full of very practical and useful advice and guidance. Like the historians caricatured by Sharon Howard of the Old Bailey Online project, archivists are not good at “studying people who can answer back” – the professional archival literature is full of laments about how poor we are at user studies. The synthesis report from the Impact programme, Splashes and Ripples: Synthesizing the Evidence on the Impacts of Digital Resources, is recommended reading; detailed evaluation reports from each of the projects which took part in the programme are also available (at http://www.jisc.ac.uk/whatwedo/programmes/digitisation/impactembedding.aspx).  Many of the recommendations made by the report would be relatively straightforward to implement, yet could potentially transform archive services’ online presence – and the TIDSR toolkit contains the resources to help evaluate the change.
Simple suggestions include picking non-word acronyms to improve project visibility online (like TIDSR – at last I have understood the Internet’s curious aversion to vowels: flickr, lanyrd, tumblr and so on!) and providing simple, automatic citations that are easy to copy or download (although I rather fear that archives are missing the boat on this one). Jane Winters was also excellent on the subject of sustaining digital impact, an important subject for archives whose online resources are perhaps more likely than most to have a long shelf-life. Twitter coverage of the event is available on Summarizr (another one!).

One gap in the existing digital measurement landscape which occurred to me during the Impacts event was the need for metrics which take account, not just of the passive audience of digital resources, but of those who contribute to them and participate in a more active way.  The problem is easily illustrated by the difficulties encountered when using standard quantitative measurement tools with Web 2.0-type sites.  Attempting to collate statistics on sites such as Your Archives or Transcribe Bentham through the likes of Google Scholar or Yahoo’s Site Explorer is handicapped by the very flexibility of a wiki site structure, compounded again, I suspect, by the want of a uniquely traceable identity.  Google Scholar, in particular, seems averse to searches on URLs (curiously, I discovered that whereas a search for yourarchives.nationalarchives.gov.uk produces 0 hits, yourarchives.nationalarchives.gov.* comes back with 26), whilst sites which invite user contributions are perhaps particularly susceptible to false-positive site inlink hits where they are highlighted as a general resource in blogrolls and the like.

This need to be clearer about what we mean by user engagement and how to measure when we’ve successfully achieved it was also my main take-away from the following week’s RunCoCo Conference – Beyond Collections: Crowdsourcing for Public Engagement.  Like Arfon Smith of the Zooniverse team, I am not very comfortable with the term ‘crowdsourcing’, and indeed many of the projects showcased at the Beyond conference seemed to me to be more technologically-enhanced outreach events or volunteer projects than true attempts to engage the ‘crowd’ (not that there is anything wrong with traditional approaches, but I just don’t think they’re crowdsourcing).  Even where large numbers of people are involved, are they truly ‘engaged’ by receiving a rubber stamp (in the case of the Erster Weltkrieg in Alltagsdokumenten project) to mark their attendance at an open day type event?  Understanding the social dynamics behind even large scale online collaborations is important – the Zooniverse ethical contract bears repeating:

  1. Contributors are collaborators, not users
  2. Contributors are motivated and engaged by real research
  3. Don’t waste people’s time

Podcasts of all the Beyond presentations and a series of short, reflective blog posts on the day’s proceedings are available.

Finally, Professor Sherry Turkle’s RSA lecture to celebrate the launch of her new book, Alone Together, about the social impact of the Internet, was rather too brief to give more than a glimpse of her current thinking on our technology-saturated society, but nevertheless there were some intriguing ideas which have potentially wide-ranging implications for the future of archives. One was the sense that the Internet is not currently serving our human needs.  She also spoke about the tension between willingness to share and privacy.  Turkle asked: what is democracy, and what is intimacy, without privacy? In response to questions from the audience, Turkle also claimed that people don’t like to say negative things online because it leaves a trace of things that went wrong. If that is true, it might have important implications for what we can expect people to contribute in archival contexts, and the nature of the debate which might take place in contested spaces of memory. Audio of the event is available from the RSA website.

Today I have a guest post about my research on UKOLN’s Cultural Heritage Blog.

I had a day at the Society of Archivists’ Conference 2010 in Manchester last Thursday; rather a mixed bag. I wasn’t there in time for the first couple of papers, but caught the main strand on digital preservation after the coffee break. It’s really good to see digital preservation issues get such a prominent billing (especially as I understand there were few sessions on digital preservation at the much larger Society of American Archivists’ Conference this year), although I was slightly disappointed that the papers were essentially show and tell rehearsals of how various organisations are tackling the digital challenge. I have given exactly this type of presentation at the Society’s Digital Preservation Roadshows and at various other beginners/introductory digital preservation events over the past year.  Sometimes of course this is precisely what is needed to get the nervous to engage with the practical realities of digital preservation, but all the same, it’s a pity that one or more of the papers at the main UK professional conference of the year did not develop the theme a little more and stimulate some discussion on the wider implications of digital archives.  However, it was interesting to see how the speakers assumed familiarity with OAIS and digital preservation concepts such as emulation. I suspect some of the audience were left rather bewildered by this, but the fact that speakers at an archives conference feel they can make such assumptions about audience understanding does at least suggest that some awareness of digital preservation theory and frameworks is at last crawling into the professional mainstream.

I was interested in Meena Gautam’s description of the National Archives of India’s preparations for receiving digital content, which included a strategy for recruiting staff with relevant expertise. Given India’s riches in terms of qualified IT professionals, I would have expected a large pool of skilled people from which to recruit. But the direction of her talk seemed to suggest that, in actual fact, NAI is finding it difficult to attract the experts it requires. [There was one particular comment – that the NAI considers conversion to microfilm to be the current best solution for preserving born-digital content – which seemed particularly extraordinary, although I have since discovered the website of the Indian National Digital Preservation Programme, which does suggest that the Indian Government is thinking beyond this analogue paradigm.]  Anyway, NAI is not alone in encountering difficulties in attracting technically skilled staff to work in the archives sector.  I assume that the reason for this is principally economic, in that people with IT qualifications can earn considerably more working in the private sector.

It was a shame that there was not an opportunity for questions at the end of the session, as I would have liked to ask Dr Gautam how archives could or should try to motivate computer scientists and technicians to work in the area of digital preservation.  Later in the same session, Sharon McMeekin from the Royal Commission on the Ancient and Historical Monuments of Scotland advocated that archives organisations should collaborate to build digital repositories, and I and several others amongst the Conference Twitter audience agreed.  But from observation of the real archives world, I would suggest that, although most people agree in principle that collaboration is the way forward, there is very little evidence – as yet at least – of partnership in practice. I wonder just how likely it is that joint repositories will emerge in this era of recession and budget cuts (which might be when we need collaboration most, but when in reality most organisations’ operations become increasingly internally focused).  Since it seems archives are unable to compete in attracting skilled staff in the open market, and – for a variety of reasons – it seems that the establishment of joint digital repositories is hindered by traditional organisational boundaries, I pondered whether a potential solution to both issues might lie in Yochai Benkler’s third organisational form of commons-based peer-production: as the means both to motivate a community of appropriately skilled experts to contribute their knowledge to the archives sector, and to build sustainable digital archives repositories in common.  There are already of course examples of open source development in the digital archives world (Archivematica is a good example, and many other tools, such as the National Archives of Australia’s Xena and The (UK) National Archives’ DROID, are available under open source licences), since the use of open standards fits well with the preservation objective.
Could the archives profession build on these individual beginnings in order to stimulate or become the wider peer community needed to underpin sustainable digital preservation?

After lunch, we heard from Dr Elizabeth Shepherd and Dr Andrew Flinn on the work of the ICARUS research group at UCL’s Department of Information Studies, of which my user participation research is a small part.  It was good to see the Twitter discussion really pick up during the paper, and a good question and answer session afterwards.  Sarah Wickham has a good summary of this presentation.

Finally, at the end of the day, I helped out with the session to raise awareness of the UK Archives Discovery Network, and to gather input from the profession on how they would like UKAD to develop.  We asked for comments on post-it notes on a series of ‘impertinent questions’.  I was particularly interested in the outcome of the question based upon UKAD’s Objective 4: “In reality, there will always be backlogs of uncatalogued archives.” Are volunteers the answer?  From the responses we gathered, there does appear to be increasing professional acceptance of the use of volunteers in description activities, although I suspect our use of the word ‘volunteer’ may be holding back appreciation of an important difference between the role of ‘expert’ volunteers in archives and user participation by the crowd.

A write-up of the second Archival Education Research Institute which I attended at  from 21st to 25th June.

The scheduled programme (or program, I suppose!) was a mixture of plenary sessions on the subject of interdisciplinarity in archival research, methods and mentoring workshops, curriculum discussion sessions, and research papers given by both doctoral students and faculty members.  We also experienced two fascinating and engaging, if slightly US-centric, theatrical performances by the University of Michigan’s Center for Research on Learning Theatre Program (ok, now I’m confused – why would it be ‘center’ but not ‘theater’?).

Most valuable to me personally were the methods workshops on Information Retrieval and User Studies.  IR research is largely new to me, although I was aware that current development work at The National Archives [TNA] includes a research strand being carried out at the University of Sheffield’s Information Studies Department which uses IR techniques to investigate information-seeking behaviour across TNA’s web domain and catalogue knowledge base.  I was interested to see whether these methods could be adapted for my research interests in user participation.  User Studies turned out to be more familiar territory, not least because of many years’ responsibility coordinating and analysing the Public Services Quality Group [PSQG] Survey of Visitors to UK Archives across the West Yorkshire Archive Service’s five offices.  I hadn’t previously appreciated that the PSQG survey is unique in the archival world in providing over a decade’s worth of longitudinal data on UK archive users (despite what it says on the NCA website, the survey was first run in 1998), and it seems a shame that only occasional annual reports of the survey results have been formally published.

Of the paper sessions, I was particularly interested in several examples of participatory archive projects.  The examples given in the Digital Cultural Communities session – in particular Donghee Sinn’s outline of the No Gun Ri massacre digital archives and Vivian Wong’s film-making work with the Chinese American community in Los Angeles, together with Michelle Caswell’s description of the Cambodian Human Rights Tribunal in the session on Renegotiating Principles and Practice – reinforced my earlier conviction that past trauma or marginalisation may help to promote user-archives collaboration, and provide greater resilience against (or perhaps more sophisticated mechanisms for resolving) controversy.  However, Sue McKemmish and Shannon Faulkhead, in their presentations about another previously persecuted grouping, Australian Aboriginal natives (the Koorie and Gunditjmara communities specifically), gave me hope that the participatory attitudes of the Indigenous communities are just an early precursor to a much wider social movement which puts a high value upon co-creation and co-responsibility for records and record-keeping.  [Incidentally, if you have access, I see that Sue and Shannon’s Monash colleague Livia Iacovino has just published an article in Archival Science entitled Rethinking archival, ethical and legal frameworks for records of Indigenous Australian communities: a participant relationship model of rights and responsibilities, which looks highly pertinent – it’s currently in the ‘online first’ section.]  I was also interested in Shannon’s comments about developing a framework to incorporate or authenticate traditional oral knowledge as an integral part of the overall community ‘archive’ (I’m not sure I’ve got this quite right, and would like to chat to her further about it).
William Uricchio has remarked of contemporary digital networks that “Decentralized, networked, collaborative, accretive, ephemeral and dynamic… these developments and others like them bear a closer resemblance to oral cultures than to the more stable regimes of print (writing and the printing press) and the trace (photography, film, recorded sound)”¹.  What can we learn from oral culture to inform our development of participatory practice in the digital domain?

Carlos Ovalle gave a useful paper on Copyright Challenges with Public Access to Digital Materials in Cultural Institutions in the Challenges/Problems in Use, Re-use, and Sharing session, which was interesting in the light of the UK Digital Economy Act and recent amendments to UK Copyright legislation, and some of my own current concerns about digitisation practices and business models in UK archives.

I cannot say I particularly enjoyed the plenary sessions and ensuing discussions.  I found the whole dispute about whether archival ‘science’ could, or should, be considered inter-disciplinary or multi-disciplinary, and which disciplines are core or which are peripheral, somewhat sterile and frankly rather futile.  Some of the arguments seemed to stand as witness to a kind of professional identity crisis, undermining any claim that archival research might have to a wider relevance in the modern world.  I was particularly surprised at how controversial ‘collaboration’ seemed to be in a US research context – a striking contrast, I felt, to the pervasive ‘partnership’ ethos that is accepted best practice in fields with which I am familiar in the UK.  This is not just, I think, because I worked for what is in many ways a pioneering partnership of local authorities at West Yorkshire Joint Services; the current government policy on archives, Archives for the 21st Century, similarly emphasises the benefits and indeed necessity (in the current economic climate) of partnership working in a specific archives context.

Sadly, there doesn’t seem to have been much blogging about AERI, but you can read one of the Australian participants’ Lessons from AERI Part I (is there a part II coming soon, Leisa?!).  I’ll link to any further blog posts I notice in the comments.

Finally, nothing to do with AERI, but I’ve finally got round to registering this blog with technorati and need to include the claim code in a post, so here goes: CF2RCBCUPWQC.

¹Uricchio, W. ‘Moving Beyond the Artifact: Lessons from Participatory Culture’ in Preserving the Digital Heritage, Netherlands National Commission for UNESCO, 2007.  <http://www.knaw.nl/ecpa/publ/pdf/2735.pdf>

The buzzword ‘crowdsourcing’ is beginning to make itself heard in archives circles. Yesterday (via @tondelooijer on Twitter) I came across this Dutch partnership project to index militia records [the website’s in Dutch, but you can still spot the word ‘crowdsourcing’!]; the Archivist of the United States has traced crowdsourcing back to Aristotle; and the ArchivesNext Best Archives on the Web Awards have introduced a new category for ‘Best use of crowdsourcing for description’.

Two’s company, three’s a crowd, goes the saying. But my mental picture of a crowd is something more like this:

Crowd by James Cridland on flickr

What do we really mean by ‘crowd’-sourcing, and can we legitimately claim any examples in the archives world? Your Archives has nearly 29,000 registered users (as of 1 May 2010) – is this a crowd? It’s certainly a lot of people – or at least it sounds like it is until you consider the quarter of a million individuals involved with Galaxy Zoo.

Galaxy Zoo is an interesting comparison, because taking part turns out not to require very much knowledge of galaxies at all. As the tutorial says, the job is very simple – just match slightly fuzzy pictures of galaxies against the most likely option in a finite list of feature categories. The task reminds me of those shape games for children:

Compare the Galaxy Zoo task to the knowledge required to contribute to Your Archives, or even to add a substantive comment or tag an image in Flickr Commons.

Don’t get me wrong, I don’t want to knock the archives examples – I think they’re great (I particularly wish we could get going on a UK version of the Dutch project). But I just wonder whether they can truly be described as ‘crowdsourcing’, or whether the archives examples merely represent a slight extension of the definition of ‘expert’: conceding perhaps that the professional archivist is not the fount of all authoritative knowledge, but nevertheless adopting a fashionable term to describe a technology-facilitated adaptation of informal knowledge sharing between archivist and historian.

Of all the projects in the archives domain, the one that actually seems closest to my picture of the crowd is Ancestry’s World Archives Project – fairly trivial pattern-matching short tasks, with the answers from several contributors computationally cross-matched.
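For the technically curious, here is a rough idea of what ‘computationally cross-matched’ could mean in practice. This is only a minimal sketch of one common approach (a simple majority vote over normalised answers); I don’t know how Ancestry actually does it, and the function name `consensus`, the `min_agreement` threshold and the sample surnames are all my own hypothetical choices for illustration.

```python
from collections import Counter


def consensus(answers, min_agreement=2):
    """Return the most common answer if enough contributors agree, else None.

    Hypothetical helper: a plain majority vote, not any real project's method.
    """
    # Normalise so that trivial differences (case, stray spaces) still match.
    normalised = [a.strip().lower() for a in answers if a.strip()]
    if not normalised:
        return None
    value, votes = Counter(normalised).most_common(1)[0]
    # Accept the value only when at least `min_agreement` contributors gave it.
    return value if votes >= min_agreement else None


# Made-up transcriptions of a single surname field by three volunteers.
print(consensus(["Jansen", "jansen ", "Janssen"]))   # two of three agree
print(consensus(["Jansen", "Janssen", "Jansson"]))   # nobody agrees
```

In a scheme like this, the first record would be accepted as ‘jansen’, while the second would have no agreed value and could be passed to another volunteer or to an arbitrator. No single contributor needs to be an expert; the agreement between them does the work.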

How many people does it take to make a crowd?

