Data report challenges IR managers

Nobody has asked the question – perhaps it doesn’t need asking after KeepIt course 1 – but why did a course on digital preservation start with organisational issues? The Digital Curation Centre recently published a new report, and I will pick out one section that answers that question:

Data Dimensions: Disciplinary Differences in Research Data Sharing, Reuse and Long-term Viability: A comparative review based on sixteen case studies

“In the UK the majority of higher education institutions now have repositories in place, though their use for digital research data curation is very limited at present. The benefit of this route to data curation is that the risk of datasets disappearing over time through lack of care or resources is spread across a whole institution and is, therefore, diminished. Looking after the data outputs of their research community is a key strategic challenge for institutional managers and repository administrators, one that is likely to involve changes in organisational structure and culture. Whether the will, skills and resources necessary exist to meet this considerable challenge within individual institutions remains to be seen, but two initiatives are helping to light the path ahead: JISC’s Digital Preservation and Records Management Programme is funding research projects in this field”

It was noted in discussion during module 1 that some repositories have scoped and defined coverage. So this is not about prescription, but about keeping an open mind about the relationship between the repository and the institution, and how to understand, manage and anticipate that complex and changing axis.

I should also point to Dorothea Salo’s blog on this report, which noted: “It also contains throwaway gems like ‘It is worth noting that researchers expected their own institutions to be able to provide affordable managed storage, technical support and a preservation facility – but few institutions appear to be able to offer such services at this point.’ (p. 10)”

We will be coming to those issues later in the course.

Posted in Uncategorized.



NECTAR and the Data Asset Framework – first thoughts

KeepIt course module 1, Southampton, 19 January 2010
Tools this module: DAF, AIDA
Tags Find out more about: this module KeepIt course 1, the full KeepIt course
Presentations referred to in this blog entry Using DAF as a Data Scoping Tool, The DAF at Southampton (Slideshare)
Presentations and tutorial exercises course 1 (source files)

Three years ago, when we first conceived the principles which would underpin NECTAR, it was agreed that datasets were a bridge too far.  Instead, we would focus on the research outputs that were already in the public domain.  This was a good criterion to use – it satisfied the research community’s desire for a ‘quality’ benchmark without putting NECTAR administrators or fellow researchers in the difficult position of having to sit in judgement on colleagues’ work. So NECTAR was able to contain details of papers, presentations, books, exhibitions and artworks, but not data files.

But times have moved on.

The research environment now is different from even three years ago.  With many of the major UK funders now mandating open access to data, and with the REF looming, with its focus on research environment and impact, the possibility of using NECTAR to store research data clearly needs to be considered.  This raises two questions: how do researchers currently manage their data at Northampton, and what might their data management needs be in the future?

So I attended the first KeepIt training day with high expectations and a keen interest in what Sarah Jones and Harry Gibbs had to say about the design and implementation of the Data Asset Framework (DAF).  I had already had a brief exposure to DAF through the DCC’s Digital Curation 101 Lite training a few weeks previously, but Harry’s first-hand experience of implementing the tool was new and eagerly anticipated.

As already described by others in this blog, DAF is essentially a framework which enables universities to audit departmental data collections, awareness, policies and practice for data curation and preservation.  The DAF methodology comprises 4 steps:

  1. Planning the audit
  2. Identifying and classifying assets
  3. Assessing management of data assets
  4. Reporting and recommendations

where a ‘data asset’ is

numerical data, statistics, output from experimental equipment, survey results, interview transcripts, databases, images or audiovisual files (from the DAF implementation guide)

With the benefit of having been tested in a number of previous pilot studies, these steps appear to offer a straightforward and (importantly) achievable procedure for gathering information about institutional data.
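The four steps and the definition of a ‘data asset’ above can be sketched as a toy inventory. This is purely illustrative – the class and field names are my own invention, not part of the DAF specification:

```python
from dataclasses import dataclass

# Hypothetical record for one data asset, loosely following the DAF
# implementation guide's definition (survey results, interview
# transcripts, databases, images, audiovisual files, and so on).
@dataclass
class DataAsset:
    name: str
    asset_type: str        # e.g. "survey results", "audiovisual files"
    owner: str             # researcher or group responsible for the asset
    managed: bool = False  # is the asset under active curation?

# Step 2 of the audit: identifying and classifying assets.
inventory = [
    DataAsset("Household survey 2009", "survey results", "Social Sciences"),
    DataAsset("Focus group recordings", "audiovisual files", "Social Sciences"),
]

# Step 3: assessing management of the data assets -
# here simply flagging anything not yet under active curation.
unmanaged = [asset.name for asset in inventory if not asset.managed]
print(unmanaged)
```

Even a minimal inventory like this makes Step 4 (reporting and recommendations) concrete: the list of unmanaged assets is the raw material for the recommendations.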

Harry described the aims of the Southampton DAF project as:

  • To get an overview of research data holdings
  • To find out about data management practices (eg sharing; version control)
  • To identify data sets for the institutional repository

Using the School of Social Sciences as an exemplar, Harry and her team first developed and promoted an online questionnaire and then conducted face to face interviews with researchers. The results were reported internally and not shared with the KeepIt course members, but the outcomes were positive:

  • An encouraging response from the School which led to an even better library/School relationship
  • Raised awareness of data management
  • Development of data support web pages on the library website
  • Further plans for DAF work.

From this it is clear to see why Debra Morris’s evaluation of the DAF tool focused on advocacy and engagement.  Indeed, previous DAF studies have emphasised that the value of the tool lies as much in the process as in the final results.  Any mechanism which enables the repository manager (or liaison librarian) to engage meaningfully and usefully with their user community is to be encouraged.

So with this in mind, I took the DAF model to our NECTAR Steering Group.  Both Hilary Johnson (Director of Information Services) and Professor Hugh Matthews (Director of Research and Knowledge Transfer and Dean of the Graduate School) are members of this group, so support from these key people was essential to get a DAF project off the ground.  The challenge was to demonstrate why Northampton needed to undertake a DAF project, so the justification that I offered was as follows:

  • Little is known centrally about university researchers’ data storage requirements, or indeed the research workflow that incorporates the creation and management of data.
  • No university wide data storage policy or procedure currently exists.
  • Research funders are beginning to demand that data as well as published research outputs are made openly available.
  • In NECTAR, we have available the infrastructure to store and preserve digital data.
  • Previous studies have noted that the process of undertaking DAF has been valuable in itself, even if the resulting inventory of data is only partial.
  • There have been a number of previous implementations of DAF; these could be consulted or adapted to meet Northampton’s needs (saving time and ensuring the best possible outcome from the project).

Interestingly, but unsurprisingly, the Director of Research immediately saw the potential for the DAF in contributing to the university’s ‘research environment’.  He recognised the value of appropriately informed data management practices and policies being supported and promulgated throughout the research community.  The Director of Information Services, on the other hand, took a broader view, and stressed the need to consider all data assets – not just those produced by academic researchers.

So it looks as if, resources permitting, we will be using the DAF methodology at Northampton.  At the time of writing this post I am investigating possible sources of research support (Southampton used a research assistant to good effect), and I imagine the scope of the project will encompass both academic and support departments… but the details are for Step 1 of the project to decide.

Watch this space.



KeepIt Course module 1 for EdShare

KeepIt course module 1, Southampton, 19 January 2010
Tools this module: DAF, AIDA
Tags Find out more about: this module KeepIt course 1, the full KeepIt course
Presentation referred to in this blog entry The AIDA toolkit: Assessing Institutional Digital Assets (Slideshare)
Presentations and tutorial exercises course 1 (source files)

As Manager of EdShare, the University of Southampton share for learning and teaching, I am attending this 5-module training programme both to review the relevance of the modules to the work of EdShare and to assess their potential relevance to managers of educational repositories in other institutions.

EdShare, a repository based on the EPrints software, is one of the exemplar repositories in the KeepIt Project. The focus of EdShare is the educational (learning and teaching) content produced by the University of Southampton, rather than the research outputs of the institution.

Work for EdShare began back in October 2007, and we launched our service gradually during the summer and autumn of 2008. So EdShare has some recent experience to draw on, both in working across an institution to achieve organisational commitment to the implementation of a “repository” and in investigating the range and limits of “eligible” content for an institutional, educational “repository”.

You can see I use inverted commas around the “repository” word – we try not to use the word at all in connection with EdShare, because of many of the assumptions which have developed around the term, as well as in order to break away from the “passive”/”lodged” aspects of the term. We have always intended that EdShare be an active space on the web in which collaboration, re-use, sharing and creative processes will be supported.

From my EdShare perspective, Sarah Jones’ and Harry Gibbs’ presentations on The Data Asset Framework captured the investment that EdShare had put into the crucial activities of advocacy and engagement. Indeed, “advocacy and engagement” was the title of one specific work package within the Project to build EdShare. This decision was informed by the approach already taken by members of Library staff during the development of the University’s institutional research repository. Indeed, two existing members of University Library staff were employed on the work for EdShare, to lead advocacy and engagement, since the Library’s reputation in this area has historically been very strong. So, the DAF approach has actually been a cornerstone of work for EdShare right from the outset. The success of EdShare has been significantly determined by effective collaboration between the Project Team and (other) academics involved in education. We have only been able to achieve this by having a sound understanding of what the academic curriculum and syllabus is about, and by developing an appreciation of how the everyday learning resources created by the people who teach in the University support the curriculum.

The AIDA toolkit: Assessing Institutional Digital Assets, Ed Pinsent, University of London Computer Centre – This toolkit struck me as very reminiscent, in approach, of Stephen Marshall’s eLearning Maturity Model (eMM). The eMM was the model used at the University of Southampton to undertake the eLearning Benchmarking Project supported by the HE Academy during 2007-2008. This was a significant piece of work that we undertook as an aspect of our institutional Learning and Teaching Enhancement Strategy. The Benchmarking Project was led by Hugh Davis, as University Director of Education responsible for eLearning. The process gave the institution a way to understand the range of capabilities across the University – Schools and disciplines as well as specialist service groups – to provide eLearning operationally, managerially and strategically. It provided a snapshot of where the University of Southampton was in terms of delivery, planning, definition, management and optimisation of eLearning at a specific point in time, as well as providing a benchmark for comparison with other Russell Group institutions which participated in the Project.

Hugh Davis then became the Director of the EdSpace Project, building EdShare. The EdShare team included other people who had also participated in the eLearning Benchmarking work – Dr. Su White (academic in the School of Electronics and Computer Science); and myself, Debra Morris (University Library eLearning lead). Our collective experience in this eLearning Benchmarking work meant that we were able to draw on the understanding, insights and foundations of the benchmarking work for the identification of relevant, exemplar Schools and subject disciplines to work with during the early stages of developing EdShare and identifying relevant content and partner teachers. The one activity created the foundations for the other so that we derived benefit from continued collaboration with specific academic groups as well as developing continuity in a thread of work that we built consistently over more than 2 years. From this starting point, my view is that we established early engagement with the concepts and approach supported by AIDA.

Debra Morris
EdShare Manager
University of Southampton



AIDA and Institutional wobbliness

Cornell University's three-legged stool model of digital asset management

KeepIt course module 1, Southampton, 19 January 2010
Tools this module: DAF, AIDA
Tags Find out more about: this module KeepIt course 1, the full KeepIt course
Presentation referred to in this blog entry The AIDA toolkit: Assessing Institutional Digital Assets (Slideshare)
Presentations and tutorial exercises course 1 (source files)

Steve Hitchcock has organised a series of modules on Digital preservation tools for repository managers, and I was invited from ULCC to present something on Assessing Institutional Digital Assets (AIDA) at the very first module, on 19th January. It was very good to meet with such a warm reception from the intelligent and lively audience, many of whom were repository managers themselves, or involved with building and developing repositories. It was clear from the start they were all very engaged with the work, and understood the issues well.

I thought it would be interesting to set them an exercise that explored two data-management activities within the AIDA toolkit, namely ‘Metadata Management’ and ‘Access and Sharing’. The AIDA self-assessment toolkit is intended simply to offer a snapshot of an Institution’s readiness to carry out management of its digital assets, assessing that capability across three ‘legs’ – Organisation, Technology and Resources – while applying the assessment at the level of the entire Institution, and of a single repository. AIDA’s proposition is simple – the assessment will almost always result in a wobbly three-legged stool, quite often showing that the technology leg is the most advanced of the three. Steve pointed out that a result like this need not surprise us, and this is especially true given the advances being made with a tool such as EPrints.

Concentrating on two activities from the larger and more complex AIDA framework was instructive – for me too, as the manager who must work on the toolkit to improve it. The keen minds of the Southampton audience exposed two ambiguities in the metadata strand: did AIDA refer to discovery metadata, technological metadata, or preservation metadata? More importantly, the written exemplars in AIDA seem to assume that automation of metadata is a commonly desired goal, and that the results of automation are always good. But some repository owners were proud of the quality of their “hand-crafted” metadata.

However, in the closing minutes of the day, one team fed back to me on their discussions of the Access and Sharing strand. Their AIDA-based deliberations showed them quite clearly that their test repository scored highly in the organisational leg (managed shared storage, centralised management, and agreements about cross-department sharing) and the technological leg (the capacity and infrastructure to support those policies, and well-aligned strategies), but there was an imbalance in the resources leg: resources allocated to technological development were not quite at the level needed to match the strategies and policies. This isn’t simply a matter of lacking money and staff (who doesn’t!?); it is a simple graphic demonstration of where the stool is wobbliest, backed by documented evidence, and one which might enable the repository to take steps to achieve stability on all three legs. Many heads in the room nodded instantly as they recognised themselves, and one comment was “I think we could say a lot of repositories fit that model”. A result like that is pure gold to me as the AIDA owner.



KeepIt course 1: Assessing institutional digital assets

KeepIt course module 1, Southampton, 19 January 2010
Tools this module: DAF, AIDA
Tags Find out more about: this module KeepIt course 1, the full KeepIt course
Presentations and tutorial exercises course 1 (source files)

Using the Data Asset Framework (DAF) will help you to discover hidden digital content produced in your institution that might be served by your repository, but how committed is your institution towards supporting a growing repository? We try AIDA, Assessing Institutional Digital Assets, a tool to help you find out.

By using the DAF in this course module we are seeking to expand the content horizons of your institutional repository. In this session we will discover the possible constraints on your repository, typically imposed, formally or by default, by the host institution. Two obvious examples are policy and costs. An institutional repository is bound by the policy the institution devises for it, so that it can be seen to serve the institution’s needs, and it cannot do much more or less than it is funded to do. You might expect most repositories to have both policy and funding commitments set down in their documentation, but many do not. We will consider repository and preservation costs in more detail in KeepIt course 2.

It turns out there are many more factors like these that need to be taken into account in assessing an institution’s support for managing its digital assets, say in a repository. Ed Pinsent of University of London Computer Centre has documented the known factors in creating AIDA. Modelled on the three-legged stool used by Cornell University to represent the three principal supports for digital asset management and preservation, AIDA documents a series of elements for each leg – the Organisational leg (11 elements), the Technology leg (11 elements), and the Resources leg (9 elements). In its organisation and tabulation of each element, AIDA serves as a tool for a qualitative assessment of the institution’s support, and with an associated scoring method it can also act as a quantitative tool.
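AIDA’s quantitative use can be illustrated with a toy calculation. The element scores and shortened element lists below are invented for illustration – the real toolkit has 11, 11 and 9 elements per leg and its own scoring tables:

```python
# Toy AIDA-style scoring: each leg holds a list of element scores
# (say 1-5 per element). Illustrative values only, not real AIDA data.
legs = {
    "Organisation": [3, 4, 2, 3],
    "Technology":   [4, 4, 5, 4],
    "Resources":    [2, 2, 3, 1],
}

# The average score per leg shows how level the three-legged stool is;
# the lowest-scoring leg is where the stool wobbles.
averages = {leg: sum(scores) / len(scores) for leg, scores in legs.items()}
wobbliest = min(averages, key=averages.get)

print(averages)
print("Wobbliest leg:", wobbliest)
```

With these invented scores the Resources leg comes out lowest, matching the pattern Ed describes in his account of the group exercise.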

[slideshare id=3002493&doc=aidakeepit2-100127054902-phpapp02]

In this presentation Ed introduces AIDA and describes its scope and methodology. We discover how AIDA builds on established tools from the digital preservation community, notably tools for evaluating trusted repositories, and other audit tools such as DRAMBORA and DAF, as well as the Cornell model.

By slide 17 we are ready to start using AIDA, and a group exercise begins on slide 19, with questions to drive feedback on the next slide.

AIDA can be used in conjunction with a lengthy document, the AIDA self-assessment toolkit (mark II was released in May 2009), which tells you all you need to know and includes tables for each leg and element. This toolkit is too long for a group exercise lasting 45 minutes, so groups were given one of two handouts, each containing one element from each of the three legs – so three tables! – and a scoresheet. Each handout had a theme: one covered Asset Sharing, Re-Use and Access; the other, Metadata Creation.

In another blog on this session on AIDA, Ed describes feedback from the group exercise. Find out why for Ed some of the results were ‘pure gold’.

Tools this module: DAF, AIDA



KeepIt course 1: Using the Data Asset Framework

KeepIt course module 1, Southampton, 19 January 2010
Tools this module: DAF, AIDA
Tags Find out more about: this module KeepIt course 1, the full KeepIt course
Presentations and tutorial exercises course 1 (source files)

How much do we know about the range of digital content produced in an institution? How much of that content could be managed and in scope of the IR?

It is likely we are facing a digital data explosion in institutions, although this may not yet be evident from many IRs today. If we are considering repository preservation we are already looking a few years ahead, but it’s likely we are thinking more of the content of today than of the repository as it will be then. At that point we will be asking the same preservation questions of new content, and it would be as well if we could apply the same answers then too; but if the range and type of repository content has changed from our current assumptions, then those answers may no longer apply. That’s why scoping the repository now and for the future is as important for preservation as preserving the content itself.

Repositories ought to be a dynamic reflection of the institutions they serve. What will your repository look like in five years, say? We will try and answer this question using the Data Asset Framework (DAF) from the University of Glasgow and the Digital Curation Centre.

Using DAF as a data scoping tool for institutional repositories, Sarah Jones

Sarah outlines the background and motivations for the development of the DAF, and introduces its fundamental methodology.

[slideshare id=2996958&doc=dafkeepit190110-100126111344-phpapp02]

The DAF at Southampton, by Harry Gibbs

The DAF is probably one of the most evaluated tools to have been produced by a JISC project. A series of pilot studies, listed in Sarah’s opening presentation, has reported on the application of the DAF in a range of institutional contexts. In one of those reports, Harry Gibbs described how the DAF was used to scope data types in the School of Social Sciences at the University of Southampton. Recalled in this presentation, the candid and numerous lessons learned and outcomes reveal how to get the most out of the DAF methodology.

[slideshare id=2997360&doc=dafpresentationjan10keepit-100126120238-phpapp02]

Based on the institutional pilot studies, the DAF team have produced an extremely useful implementation guide that distils all the findings and presents a range of practical examples from which to select and follow.

Group exercise: scoping data and curation requirements

With copies of the implementation guide to hand, course participants were split into groups of 3-4 to work for 45 mins on an exercise set around their chosen repository. The task was to identify the data types and scope curation requirements for the repository, with a view to answering a series of set questions (slide 3 below) and reporting back to the full group with some answers.

[slideshare id=2997439&doc=dafexercisekeepit190110-100126121412-phpapp01]

DAF will help you to discover hidden digital content produced in your institution that might be served by your repository, but how committed is your institution towards supporting a growing repository? We try AIDA, a tool to help you find out.

Tools this module: DAF, AIDA



KeepIt course 1: digital preservation, repositories and institutions

Welcome to the first module of the KeepIt course that between now and the end of March 2010 is introducing repository managers to a range of digital preservation tools.

This is a digital preservation course with a difference. It is aimed at institutional repositories, and can therefore make assumptions about the working environment and select appropriate tools. In each module two or more tools will be investigated in depth, based on presentations and group work guided by expert tutors, many of whom designed the tools themselves. These tools will be presented in a structured way as the course progresses; it is not expected or required that repositories adopt them all.

KeepIt course module 1, Southampton, 19 January 2010
Tools this module: DAF, AIDA
Tags Find out more about: this module KeepIt course 1, the full KeepIt course
Presentations and tutorial exercises course 1 (source files)

[slideshare id=2996279&doc=keepit-course1-100126092357-phpapp02]

In these simple opening slides I briefly introduce today’s two tools, and outline the underlying aim of the project to produce exemplar repositories covering all data types that we might find in the future institutional repository: research, data and teaching materials across all disciplines including arts as well as sciences. We are working with a select group of repositories as our core exemplars, but all repositories participating in this course and selecting appropriate tools can exemplify good preservation practice. We can all be exemplars.

The course is structured to begin, in this module, with the institutional framework within which preservation strategies and decisions will have to be made. We feature a specific and influential factor in preservation planning – costs – in module 2. By module 3 we may be on more familiar ground for many participants: metadata, in particular metadata to inform preservation. It’s not until module 4 that we get to what strictly and technically might be called digital preservation, looking at a joined-up workflow to manage file formats and storage using interfaces provided in familiar repository software. We end the course by looking at the issue of trust – a question often raised prematurely, though not for preservation repositories that have implemented the elements of the course to this point.

At this point participants in this module introduced themselves. They come from universities and colleges across southern England, London and the Midlands, with one also from the north-west and two from Wales. We heard Twitter-length descriptions of their repositories and interest in digital preservation. For some, 140 characters proved a malleable target!

Before we begin in earnest it is necessary to unpack the terms that underpin the course: digital preservation and institutional repositories. Neither is well defined or understood in this context, particularly when the two are joined together.

What is preservation? At best it is a set of achievable objectives allied to a set of processes. At worst it is wishful and idealised thinking.

Recently the Scholarly Publishing Roundtable in the USA produced a report with a series of recommendations aimed at research funders, including this one on preservation: “Policies should address the need to resolve the challenges of long-term digital preservation”. Broadly, such statements can be helpful in this high-level context, but it is quite a typical line, and I’ll leave the reader to decide where this form of words falls between the extremes of preservation by considering whether the verbs represent doing or hoping, and by noting the ratchet effect of coupling ‘long-term’ with preservation.

Typically, advocacy for digital preservation relies on worthy surveys (for example, “Long-term preservation is an issue which urgently needs to be addressed within the industry”: 91% of respondents either agreed or strongly agreed with the statement, and no-one disagreed. Who could disagree – but where’s the action?) and scare stories about an impending black hole in which all digital content will disappear without effective preservation. This may be OK when you don’t have answers to current practical questions, but digital preservation has made remarkable progress in recent years and has a better story to tell. That story is the tools that this course will work with.

What is a repository? Essentially it is a set of interfaces designed to accomplish a range of tasks – primarily content deposit, management and access – using some underlying services. It’s software that runs on a server, and may be best known by the type of software used: DSpace, EPrints and Fedora are the most widely used for institutional repositories. But even taking this narrow technical view, the infrastructure of repositories is changing, from a local implementation towards one based on a managed network of services, perhaps in the ‘cloud’. How this affects preservation we will see in module 4.
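The view of a repository as a set of interfaces – deposit, management and access – over underlying services can be sketched minimally in code. The class below is a purely illustrative abstraction of my own, not the API of DSpace, EPrints or Fedora:

```python
# Minimal sketch of the three primary repository interfaces.
# Illustrative only: a real repository delegates storage, identifiers
# and access control to underlying services.
class Repository:
    def __init__(self):
        self._items = {}   # stand-in for an underlying storage service
        self._next_id = 1

    def deposit(self, metadata, content):
        """Deposit interface: accept content plus descriptive metadata."""
        item_id = self._next_id
        self._next_id += 1
        self._items[item_id] = {"metadata": dict(metadata), "content": content}
        return item_id

    def update_metadata(self, item_id, metadata):
        """Management interface: curate items after deposit."""
        self._items[item_id]["metadata"].update(metadata)

    def access(self, item_id):
        """Access interface: retrieve an item for end users."""
        return self._items[item_id]

repo = Repository()
pid = repo.deposit({"title": "Sample paper"}, b"...")
repo.update_metadata(pid, {"creator": "A. Author"})
print(repo.access(pid)["metadata"])
```

The point of the sketch is that the interfaces stay the same even if the underlying services move from a local server to a managed network or the ‘cloud’.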

It’s when the repository tries to become institutional that things get trickier. That’s why the focus of much of this first module is the institution. Institutions of higher education, which are served by IRs, are large and complex organisations. The repository must serve both top-down requirements driven by senior managers (policy, objectives, etc.) and a bottom-up role for a potentially large and diverse group of authors and users (measured by growing content deposit and usage). This inevitably throws up opportunities and constraints. How effectively is the institution able to support the repository? We will explore this question using AIDA, Assessing Institutional Digital Assets, produced by the University of London Computer Centre.

We also want to consider the institution as a content generator. How much do we know about the range of digital content produced in an institution? How much of that content could be managed and in scope of the IR? We will try and answer these questions using the Data Asset Framework (DAF).

Tools this module: DAF, AIDA



Preservation file formats report progresses the field

File formats are a critical feature in digital preservation. This is hardly news to specialists, but in their objectives the repository managers of our exemplars also expressed an interest in file formats, so I was interested to discover what the recent report on file formats from the Digital Preservation Coalition (DPC) might offer them.

Malcolm Todd (The National Archives), File Formats for Preservation, DPC Technology Watch Report Series, Report 09-02, 2 December 2009

This is a well-researched, wide-ranging and deeply considered critical review of recent work on file formats. The report is intended ‘to assist repository managers and the preservation community’. My impression is it may work better for the latter than the former. ‘Repositories’ is used as a generic term; institutional repositories are mentioned by reference only.

In KeepIt our approach aims to be practical, joining up a series of tools for the workflow of file format management. This is based on an approach elaborated as active preservation at the National Archives, but not directly mentioned in the report. “The development and use of tools developed within the digital preservation community has a mostly separate literature from that of defining and implementing selection criteria.” (section 5)

The report’s summary says “At the time of writing, there is apparent consensus on five main criteria for file format selection.” It goes on to list the criteria, but work here has already progressed beyond this. The P2 registry that this project described at iPres 2009 links hundreds of criteria for different formats.

The report continues: “The main finding of this report is to support the proposal by Rog and van Wijk of the National Library of the Netherlands (2008) that such criteria should be used as a tool to work out the detailed implementation of a clear preservation strategy according to a prioritisation appropriate to the repository. This is essential to make sense of an otherwise bewildering array of considerations and provides key governance to ensure a preservation institution is managing the risk of obsolescence to its holdings.”

In other words, there has to be a way of connecting the format information with the repository requirements. This is being done in KeepIt by integrating the P2 registry with a planning tool, Plato, developed by Andreas Rauber and colleagues at Vienna University of Technology for the Planets project. It’s this joined-up approach that has so far been conspicuously lacking from the preservation file format management workflow – “Some of the current literature appears to minimise this (interdependence)”. Although a work-in-progress, this potentially ground-breaking approach will be the focal point of our ongoing KeepIt course on digital preservation tools (see module 4).

“Integrating the ability of formats to represent information content into scoring criteria seems some way off except for very simple digital objects”, but not as far off as it may seem (see slides 23, 24).
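Before any scoring or planning can happen, a repository needs to know which formats it actually holds. Here is a minimal sketch of signature-based format identification, the kind of check that tools such as DROID (from The National Archives) perform at scale; the signature table is a tiny illustrative subset, not a real registry.

```python
# Minimal sketch of signature-based format identification.
# The magic-byte signatures below are standard, but a real tool such as
# DROID consults a full registry (PRONOM) rather than this tiny table.

SIGNATURES = {
    b"%PDF": "application/pdf",
    b"\x89PNG\r\n\x1a\n": "image/png",
    b"PK\x03\x04": "application/zip",  # also OOXML and ODF containers
}

def identify(data):
    """Return a media-type guess from the leading magic bytes."""
    for magic, mime in SIGNATURES.items():
        if data.startswith(magic):
            return mime
    return "unknown"

print(identify(b"%PDF-1.4"))  # prints "application/pdf"
```

In a joined-up workflow this identification step feeds the risk assessment: once a format is known, its selection criteria and obsolescence risks can be looked up and passed on to a planning tool.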

My initial thought on the report was that it would be useful only to the extent that the work reviewed might be considered useful, but on detailed reading I have reappraised that view. The report’s effect is to progress the work it reports on, bringing new insights and making little-noted connections explicit. This is probably the most complex aspect of the investigation, but it is worth the effort.

Effectively it’s describing the foundations for work that has already moved forward significantly: “this topic has progressed rapidly in the last decade. This research has improved considerably our understanding of effective format management strategies – even if the proliferation of initiatives and tools seems at first to render it less accessible.” I think it is fair to say this work is even further advanced than this report recognised.



Digital preservation tools for repository managers

A practical, tools-based course designed for repository managers and presented by expert tutors, in five parts between January and March 2010. This distinctive new course starts on Tuesday 19 January 2010 in Southampton, UK.

Update (12 January 2010). This course is now full. Places for individual modules may be available on request.

Includes a ground-breaking tutorial on joined-up tools for preservation management workflow for repositories.

This free course is presented by the JISC KeepIt project in association with Digital Curation Centre, and the European-wide Planets project.

Update (11 February 2010): venues added for modules 3 and 5.

Update (30 September 2010): a full record is available for this course.

Why Digital Preservation?

Digital preservation, the ability to manage content effectively and ensure continued access over time, is an implicit commitment made by an institution to those who deposit their important and valuable content in its repository. It’s also an asset for a repository to be able to demonstrate to its users good quality content management processes embracing preservation.

About the course

Digital preservation is currently well served with training courses that are strong on the foundations of the topic, and are aimed at a general audience. This KeepIt course, designed to create repository preservation exemplars in the UK, makes more specific assumptions about the working environment – digital institutional repositories – and will specialise in working with tools optimised for this environment. The course is applicable to all popular repository types, although one part on preservation workflow is currently implemented and optimised for EPrints.

The course is structured to place repositories and their preservation needs within an organisational and financial framework, culminating in hands-on work with a ground-breaking series of tools to manage a repository preservation workflow. A tutorial on these tools was first presented at a major European conference (ECDL) and is now brought to the UK for the first time. With repositories ready to become preservation exemplars, the course concludes by considering issues of trust between repository, users and services.

Each module will give extensive coverage of chosen topics and tools through presentation, practical exercises, group work and feedback.

The first and fourth modules will be held in Southampton, home of the KeepIt project; other venues are to be announced. The modules each last a single day, apart from module 4 on the preservation workflow, which lasts two days and includes an evening social event.

Course outline

Five modules formed around: organisational issues, costs, description, tools, and trust.

Tues. 19 January 2010

  • Module 1. Organisational issues, Southampton. We will seek to connect institutional repository objectives with an emerging preservation architecture. Topics will include audit, selection and appraisal.

Fri. 5th February

  • Module 2. Costs, Southampton. The focus will be lifecycle costs for managing digital objects, based on the LIFE approach, and institutional costs, extending to policy implications.

Tues. 2nd March

  • Module 3. Description, London. Describing content for preservation. This has two parts (not mutually exclusive):
    • 3.1 User description: provenance, and significant properties
    • 3.2 Service-based description: formats, preservation metadata

Thurs. 18th-Fri. 19th March

  • Module 4. Tools, Southampton. This will be based on the earlier ECDL tutorial, and will cover the tools available in EPrints for format management, risk assessment and storage, and linked to the Plato planning tool from Planets.

Tues. 30th March

  • Module 5. Trust, Northampton. There are two angles here: trust (by others) of the repository’s approach to preservation; trust (by the repository) of the tools and services it chooses. This will connect us with users and services.

Expert tutors include:

  • Neil Beagrie (consultant, Charles Beagrie Ltd)
  • Joy Davidson (Digital Curation Centre)
  • Brian Hole (British Library)
  • Sarah Jones (DCC)
  • Ed Pinsent (University of London Computer Centre, and DPTP)
  • Andreas Rauber (Vienna University of Technology, and Planets project)
  • David Tarrant (University of Southampton)
  • Stephen Grace, Gareth Knight (King’s College London)

More tutors to be announced.

Join us to make your repository an exemplar preservation repository.

Contact for signing up: If you are interested in participating in this course, in the first instance contact Steve Hitchcock, KeepIt project manager, to register. Please advise of your role with the repository. Places are limited for practical training purposes.

Information for delegates

The course is aimed at institutional repository managers and other members of repository management teams, and does not require prior knowledge of digital preservation or specific technical expertise. Only a working knowledge of repository content management is assumed.

To get the full benefit of the course it is recommended that repositories sign up for all five modules, and preference will be given to repositories that can do so. Since the course emphasises a structured, joined-up approach, it will benefit repositories to be represented by the same person throughout, but this is not essential.

Following the course there will be optional follow-up sessions to assist and evaluate uptake of the tools within the repositories and to brief other members of repository teams.

The course is largely independent of any particular repository software, but one section of the course (part of module 4) will use EPrints as the available tools will have been integrated with this platform. At all stages the course will use available tools to emphasise practical developments and to assist subsequent application and uptake.



Acting on repository preservation objectives

Recently each of the KeepIt project’s four exemplar repositories detailed objectives for their part in this work. Admittedly, when they were first asked to do this the reaction of the repository managers was one of surprise and perhaps consternation. They were probably thinking, as you are: why now, some way into the project, are we being asked to define objectives? In this post we will answer that question and summarise the findings of the exercise.

The original project proposal elaborated a series of objectives, and since the exemplar repositories were co-signatories of that proposal it could reasonably be assumed that they supported those objectives. No doubt the repository managers were looking to the project for expertise on digital preservation – one of the reasons they had joined – so they hardly expected to have the agenda turned back on them. There is a difference, however, between the project objectives and the objectives of each repository within the project. It became clear that we had not really considered the latter when defining the former.

To simplify the project’s objective, it is, through a mixture of training and development, to transform the repositories to exemplify good digital preservation practice that others can follow. The original proposal mapped out how we would do this, and included an outline training plan. Meanwhile, I had a Damascus-like conversion about designing your own repository preservation training course. Without some input from the repositories, how could we be sure we would be serving their needs? These repositories, as we discovered from the original profiles provided in this blog, are each quite different in terms of content and approach. That’s one attraction of working together.

So why now? We have been constructing a more detailed training and development programme for the exemplars, and others, with the course structure and schedule to be announced soon on this blog and elsewhere. We met with the repository managers in October, when they were first asked to think about objectives, which were then consolidated in the blog. That meeting, the repository objectives, and consultations with experts in the field, have since shaped the design of our training course. We believe the course will be distinctive because we now know and can target the needs of a specific group of users – repository managers – needs that are not being fully met for this group by digital preservation training courses elsewhere.

So what did the repository objectives tell us? They surprise and challenge, although we shouldn’t really be surprised given the first reactions. There are three common themes:

  1. Tools, in particular for managing file formats and preservation workflow. This appears in the objectives of all repositories, which is surprising because file formats, perhaps the most technical of preservation practices, are a perennial focus for specialists and so might be expected to be of less interest to others.
  2. Costs of preservation. Two repositories are concerned about costs, one for the purposes of business planning, another to enable researchers to include preservation costs in grant applications.
  3. Organisational issues. There’s something on this from all repositories, but to be honest this is my catch-all term, and it covers a variety of issues, from institutional and user concerns, within the repository support team and beyond. This covers advocacy at an institutional level, extended team training, and published guides and documentation.

Item 3 is the most challenging, and in its direct extension to the wider academy is probably beyond the scope of this single project, but it reminds us of the need to adapt the outputs, especially from training, for multiple purposes and audiences as far as we can. It’s not just about a training course, full stop, but about extending that training to other repository stakeholders, probably through the repository managers and their support teams.

There was only one mention in the various objectives of a core foundation of preservation: policy. There must be ways of tackling policy, but perhaps not in training, especially in a course such as the one planned. Preservation policy has to be part of repository policy, and for an institutional repository it therefore has to be part of institutional policy. On this basis training for policy would embrace influencing skills and management. That is for others. What we can do is document examples of preservation policy that are part of wider policy constructs, and not simply examples of wishful thinking.

It should be noted that the repository objectives are a work-in-progress: the repository managers can choose to update their objectives at any time in response to their work with the project. It’s one way we have of measuring how effectively the project is serving its own objectives.
