Posts Tagged ‘ECDL’

Day 3 of ECDL started for me with the Query Log Analysis session.  I thought perhaps that, now the papers were getting heavily into IR technicalities, I might not understand what was being presented or that it would be less relevant to archives.  How wrong can you be!  Well, ok, IR metrics are complex, especially for someone new to the field, but when the first presentation was based upon a usability study of the EAD finding aids at the Nationaal Archief (the National Archives of the Netherlands), it wasn’t too difficult to spot the relevance.  In fact, it was interesting to see how you notice things when the test data is presented in a foreign language, that you wouldn’t necessarily observe if they were in your mother tongue.  In the case of the Nationaal Archief, I was horrified at how many clicks were required to reach an item description.  Most archives have this problem with web-based finding aids (unless they merely replicate a traditional format, for instance, a PDF copy of a paper list), but somehow it was so much more obvious when I wasn’t quite sure exactly what was being presented to me at each stage of the results.  This is what it must be like to be an archival novice.  No wonder they give up.

The second paper of the morning, Determining Time of Queries for Re-ranking Search Results, was also very pertinent to searching in an archival context.  It discussed ‘temporal documents’ where either the terminology itself has changed over time or time is highly relevant to the query.  This temporal intent may be either implicit or explicit in the query.  For example, ‘tsunami + Thailand’ is likely to refer to the 2004 tsunami.  These kinds of issues are obviously very important for historians, and for archivists making temporal collections available in a web environment, such as web archives and online archival finding aids.
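To make the idea of re-ranking by time a little more concrete, here is a minimal sketch. It is not the method from the paper: it simply assumes the year a query refers to has already been worked out somehow, and nudges documents dated near that year up the list. The `Doc` class, the `rerank` function, the `weight` parameter and the sample records are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    title: str
    year: int          # date of the document (hypothetical metadata field)
    text_score: float  # relevance from an ordinary keyword search, 0..1


def rerank(docs, query_year, weight=0.5):
    """Sort docs so that those dated close to query_year are boosted.

    `weight` balances keyword relevance against closeness in time;
    both the formula and the default value are assumptions.
    """
    def score(doc):
        closeness = 1.0 / (1 + abs(doc.year - query_year))
        return (1 - weight) * doc.text_score + weight * closeness

    return sorted(docs, key=score, reverse=True)


docs = [
    Doc("Thailand tsunami warning system review", 2009, 0.80),
    Doc("Indian Ocean tsunami: Thailand relief report", 2004, 0.75),
]

# Assume 'tsunami + Thailand' has been resolved to the year 2004.
ranked = rerank(docs, query_year=2004)
print([d.title for d in ranked])
```

With these made-up scores the 2004 report moves above the 2009 review, even though the review matched the keywords slightly better.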

Later in the morning, I was down to attend the stream on Domain-specific Digital Libraries.  One of these specific domains turned out to be archives, with an (appropriately) very philosophical paper presented by Pierre-Edouard Portier about DINAH [in French].  This is “a philological platform for the construction of multi-structured documents”, created to enable the transcription and annotation of the papers of the French philosopher, Jean-Toussaint Desanti, and to facilitate the visualization of the trace of user activities.  My tweeting of this paper (limited on account of both the presentation’s intellectual and technical complexity and the fact that I’d got to bed at around 3am that morning!) seemed to catch the attention of both the archival profession and the Linked Data community;  it certainly deserves some further coverage in the English-speaking archival professional literature.

In the same session, I was also interested in the visualization techniques presented for time-oriented scientific data by Jürgen Bernard, which reminded me of The Visible Archive research project funded by the National Archives of Australia.  The principle – that visual presentations are a useful, possibly preferable, alternative to text-based descriptions of huge series of data – is the same in both cases.  Similarly, the PROBADO project has investigated the development of tools to store and retrieve complex, non-textual data and objects, such as 3D CAD drawings and music.  There were important implications from all of these papers for the future development of archival finding aids.

In the afternoon, I found myself helping out at the Networked Knowledge Organization Systems/Services (NKOS) workshop.  I wasn’t really sure what this entailed, but it turned out to involve things like thesauri construction and semantic mapping between systems, all of which is very relevant to the UK Archives Discovery (UKAD) Network objectives.  I was particularly sorry I was unable to make the Friday session of the workshop, which was to be all about user-centred knowledge system design and Linked Data; however, the slides are all available with the programme for the workshop.

Once again, my sincere thanks to the conference organisers for my opportunity to participate in ECDL2010.  The conference proceedings are available from Springer, for those who want to follow up further, and presentation slides are gradually appearing on the conference website.

Read Full Post »

Since it seems a few people read my post about day one of ECDL2010, I guess I’d better continue with day two!

Liina Munari’s keynote about digital libraries from the European Commission’s perspective provided delegates with an early morning shower of acronyms.  Amongst the funder-speak, however, there were a number of proposals from the forthcoming FP7 Call 6 funding round which are interesting from an archives and records perspective, including projects investigating cloud storage and the preservation of context, and on appraisal and selection using the ‘wisdom of crowds’. Also, the ‘Digital Single Market’ will include work on copyright, specifically the orphan works problem, which promises to be useful to the archives sector – Liina pointed out that the total size of the European Public Domain is smaller than the US equivalent because of the extended period of copyright protection available to works whose current copyright owners are unknown. But I do wish people would not use the ‘black hole’ description; it’s alarmist and inaccurate.  If we combine this twentieth century black hole (digitised orphan works) with the oft-quoted born-digital black hole, it seems a wonder we have any cultural heritage left in Europe at all.

After the opening keynote, I attended the stream on the Social Web/Web 2.0, where we were treated to three excellent papers on privacy-aware folksonomies, seamless web editing, and the automatic classification of social tags. The seamless web editor, seaweed, is of interest to me in a personal capacity, because of its WordPress plugin, which would essentially enable the user to add new posts or edit existing ones directly in a web browser, without recourse to the cumbersome WordPress dashboard and without absent-mindedly adding new pages instead of new posts (which is what I generally manage to do by mistake). I’m sure there are archives applications too, possibly for instance in terms of the user interface design for encouraging participation in archival description.  Privacy-aware folksonomies, a system to enable greater user control over tagging (with three levels: user only, friends, and tag provider), might have application in respect of some of the more sensitive archive content, such as mental health records perhaps.  The paper on the automatic classification of social tags will be of particular interest to records managers interested in the searchability and re-usability of folksonomies in record-keeping systems, as well as to archivists implementing tagging systems in online catalogue or digital archives interfaces.

After lunch we had a poster and demo session.  Those which particularly caught my attention included a poster from the University of Oregon entitled ‘Creating a Flexible Preservation Infrastructure for Electronic Records’ and described as the ‘do-it’ solution to digital preservation in a small repository without any money.  Sounded familiar!  The authors, digital library expert Karen Estlund and University Archivist Heather Briston, described how they have made best use of existing infrastructure, such as share drives (for deposit) and the software package Archivists’ Toolkit for description.  Their approach is similar to the workflow I put in place for West Yorkshire Archive Service, except that the University are fortunate to be in a position to train staff to carry out some self-appraisal before deposit, which simplifies the process.  I was also interested (as someone who is never really sure why tagging is useful) in a poster ‘Exploring the Influence of Tagging Motivation on Tagging Behaviour’ which classified taggers into two groups, describers and categorisers, and in the demonstration of the OCRopodium project at King’s College London, exploring the use of optical character recognition (OCR) with typescript texts.

In the final session of the day, I was assigned to the stream on search in digital libraries, where papers explored the impact of the search interface on search tasks, relevance judgements, and search interface design.

Then there was the conference dinner…

Read Full Post »

I am extremely lucky to have been offered a student place helping out at ECDL 2010, the European Conference on Research and Advanced Technology for Digital Libraries. The following are the highlights from day 1 of the conference for this archivist let loose in the virtual stacks:

Susan Dumais’ keynote presented recent Microsoft research into the temporal dynamics of the web, analysing both changes to content and how people revisit web pages, checking for new content or looking for previously found information. She argued that the current generation of web browsers offers only a static, snapshot view, and went on to illustrate a browser plugin called DiffIE which highlights what has changed on a web page since the user’s last visit. She also presented some initial evaluation of this tool, which indicated that although perceptions of revisitation frequency remained constant, in practice users of the plugin increased their revisitation rate. There are lots of potential applications for this kind of tool for archives – from the presentation of web archives to the user interactions/annotations/ratings examples that Dumais herself gave. She also spoke about the implications of her research for the ranking and presentation of search results, illustrating how the pertinency and hence relevancy of certain terms can decline over time – for example, a user searching for ‘US Open’ this week is more likely to be looking for information on the tennis grand slam than the golf tournament. Again, there are some interesting implications here for archival catalogue and document search systems.
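For anyone wondering what ‘highlighting what has changed since the last visit’ might look like underneath, here is a rough sketch using Python’s standard `difflib` module. It is only an illustration of the general idea, not how DiffIE itself works: it compares two plain-text snapshots line by line, and the `changed_lines` function and the sample page text are made up for the example.

```python
import difflib


def changed_lines(previous: str, current: str) -> list[str]:
    """Return the lines that appear in `current` but not in `previous`."""
    diff = difflib.ndiff(previous.splitlines(), current.splitlines())
    # ndiff prefixes added lines with "+ "; strip the prefix before returning.
    return [line[2:] for line in diff if line.startswith("+ ")]


# Two hypothetical snapshots of the same page, taken on different visits.
last_visit = "Opening hours: 9-5\nNew accessions: none"
this_visit = "Opening hours: 9-5\nNew accessions: parish registers"

for line in changed_lines(last_visit, this_visit):
    print("CHANGED:", line)
```

A real tool would need to work on the structure of the web page rather than on lines of text, but the principle of keeping a copy from the previous visit and comparing it with the current one is the same.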

Christos Papatheodorou from the Ionian University on Corfu spoke about the mapping of disparate cultural heritage (archives, museums, libraries) XML-based metadata schema to the CIDOC CRM ontology, and went on to describe the transformation of XPath queries submitted to a local (XML) data source into equivalent queries suitable to be submitted to other data sources, via the CIDOC CRM ontology. Having travelled up to Glasgow on the sleeper, arriving at 7 in the morning, I confess I got a bit lost in the technicalities from this point onwards, but the basic idea is to use CIDOC CRM as a mediator between disparate cultural heritage sources marked up in different XML schema. There was an extended worked example using EAD, which was nice to hear. In general, it has been interesting to observe a large number of papers at this conference which report experiments based upon data from cultural heritage rather than scientific domains. All of which tends to reinforce my thoughts after the Society of Archivists’ Conference about attracting technology experts to work in the archives sector: cultural heritage data is complex and thus, it seems, fascinating and intrinsically motivating to work with. We should be more proactive about promoting archival data to this kind of digital research community.

I’d been particularly looking forward to the paper on User-Contributed Metadata for Libraries and Cultural Institutions, although this turned out to be a Drexel University re-working of the Library of Congress flickr Commons experience, albeit concentrating more on user comments and less upon tagging. I was not quite comfortable anyway with the a priori categorization of comments described in the paper (into 1. personal and historical 2. link out (eg to wikipedia) 3. corrections and translations 4. link in (eg adding images to flickr groups) – seems to me that category 1. includes a particularly wide range of possible comment types), plus all the things I wanted to ask about seemed to be listed as ‘future research’. These include a fuller categorization, exploring motivations for adding comments, the presentation of comments in the user interface, and librarians’ (or archivists’) role in moderating user interaction.

I also enjoyed a couple of papers which presented ideas to do with improving information visualisation and user judgement using colours, layout and social navigation, all of which have some potential relevancy to the question of how best to present user-generated content.

Research and Advanced Technology for Digital Libraries, Proceedings of the 14th European Conference, ECDL 2010, Glasgow, UK, September 2010 is published as Lecture Notes in Computer Science 6273, available via SpringerLink, for those of you who have access.

And I have travelled twice on Glasgow’s baby underground train 🙂

Read Full Post »