The buzzword ‘crowdsourcing’ is beginning to make itself heard in archives circles. Yesterday (via @tondelooijer on Twitter) I came across this Dutch partnership project to index militia records [the website's in Dutch, but you can still spot the word 'crowdsourcing'!]; the Archivist of the United States has traced crowdsourcing back to Aristotle; and the ArchivesNext Best Archives on the Web Awards have introduced a new category for ‘Best use of crowdsourcing for description’.
Two’s company, three’s a crowd, goes the saying. But my mental picture of a crowd is something more like this:
What do we really mean by ‘crowd’-sourcing, and can we legitimately claim any examples in the archives world? Your Archives has nearly 29,000 registered users (as of 1 May 2010) – is this a crowd? It’s certainly a lot of people – or at least it sounds like it is until you consider the quarter of a million individuals involved with Galaxy Zoo.
Galaxy Zoo is an interesting comparison, because taking part turns out not to require very much knowledge of galaxies at all. As the tutorial says, the job is very simple – just match slightly fuzzy pictures of galaxies against the most likely option in a finite list of feature categories. The task reminds me of those shape games for children:
Don’t get me wrong, I don’t want to knock the archives examples – I think they’re great (I particularly wish we could get going on a UK version of the Dutch project). But I just wonder whether they can truly be described as ‘crowdsourcing’, or whether they merely represent a slight extension of the definition of ‘expert’: conceding, perhaps, that the professional archivist is not the fount of all authoritative knowledge, but nevertheless adopting a fashionable term to describe a technology-facilitated adaptation of informal knowledge sharing between archivist and historian.
Of all the projects in the archives domain, the one that actually seems closest to my picture of the crowd is Ancestry’s World Archives Project – fairly trivial pattern-matching short tasks, with the answers from several contributors computationally cross-matched.
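Out of curiosity, here’s a minimal sketch of what that cross-matching might look like: gather several independent transcriptions of the same field and accept a value only when enough contributors agree. The function name, the normalisation, and the agreement threshold are my own assumptions for illustration, not Ancestry’s actual method.

```python
from collections import Counter

def cross_match(transcriptions, min_agreement=2):
    """Return the consensus value for one field, or None if the
    contributors disagree too much. `transcriptions` is a list of
    strings keyed in independently by different volunteers.

    A hypothetical sketch of consensus checking, not Ancestry's
    actual algorithm.
    """
    # Normalise trivially so 'Smith ' and 'smith' count as a match.
    normalised = [t.strip().lower() for t in transcriptions if t]
    if not normalised:
        return None
    value, votes = Counter(normalised).most_common(1)[0]
    # Accept only if enough independent contributors agree;
    # otherwise leave the field for review.
    return value if votes >= min_agreement else None

# Example: three volunteers key the same surname from a scanned record.
print(cross_match(["Smith", "smith ", "Smyth"]))  # 'smith' (2 of 3 agree)
print(cross_match(["Smith", "Smyth", "Smit"]))    # None – no consensus
```

The appeal of this kind of task is exactly that it needs numbers rather than expertise: any one contributor’s answer matters less than the agreement between several of them.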
How many people does it take to make a crowd?