Conditions for digital preservation

(or how to anticipate turnout at your preservation workshop)

DIGITAL PRESERVATION is

NOT so DIFFICULT

if you WANT to DO IT

You will want to do digital preservation if…


View the conditionals now, see slide 2 in the slideshow below, or read on.

Thus began the introduction to my brief interlude linking two practical sessions in Dave Tarrant’s 90-minute EPrints Preservation workshop at the EPrints User Group meeting, at the Open Repositories 2010 Conference in Madrid. The workshop aimed to connect preservation planning with tools provided for use with EPrints repository software. My role was to say something interesting, preferably on the theme of digital repository preservation, and, if we were lucky, to link seamlessly back to the second part of the workshop, so that participants would be refreshed, ready for the next challenges, and would know and understand a little more about what was to come, better prepared for it. In this talk we also considered the role of file formats and the essentials of preservation workflow and preservation planning.

The primary resources for this workshop include, specifically, the File Formats exercise (short version) preceding this presentation, the Action and Provenance exercise following it, and the main Presentation. The workshop was scheduled to last 90 minutes, so independent users can expect to gain something from these materials in similar or less time.

[slideshare id=4770011&doc=or10workshop-sh-final-100716032612-phpapp02]

Notes are provided with these presentation slides and can be found by using the View on Slideshare button (bottom right-hand corner on the slide viewer above, or try here). Slideshare seems not to have reproduced the table in slide 12; the original can be found as Table 2 in this earlier blog entry.

@jisckeepit For the few at #or10 not at the EPrints preservation workshop (cough), a quick summary follows in three tweets
10:57 AM Jul 9th

Preservation institutions have identified a workflow for managing file formats and built tools, but no joinup and common interface
10:58 AM Jul 9th

So build preservation tools such as DROID, Plato, etc, into EPrints and access through common repository interface
10:58 AM Jul 9th

The clever part, which our workshop participants now know, is the button for importing a preservation plan from Plato to EPrints
10:59 AM Jul 9th

More on digital preservation conditionals

Originally I had not included ‘so’ in my opening statement. Instead it read: DIGITAL PRESERVATION is NOT DIFFICULT. When I arrived at OR10 the first item I fished out of my delegate pack was a colourful cardboard flyer for DuraSpace, and on the front it clearly said: “Preserving the world’s intellectual, cultural and scientific heritage isn’t easy”. I had to admit they had a point, and I was merely sloganising as the simplest means of reassurance. So I modified my claim, but the key part really is in the qualifiers.

We have to understand why digital content managers and repository managers are concerned about digital preservation, yet why this concern translates so little into practice outside specialist preservation institutions. (And to understand the turnout at events like this: @jisckeepit Maybe ‘last day-preservation’ not a perfect fit. So thanks to those who turned up. We hope you found preservation works with EPrints #or10 10:06 AM Jul 9th.) To gauge at what point digital content and repository managers might expect a natural transition from interest and concern to practice, we produced this rough rule-of-thumb metric. If one or more of these criteria apply, then the application of digital preservation is likely to become magically less onerous and more beneficial for your content. So here are the conditionals:

DIGITAL PRESERVATION is

NOT so DIFFICULT

if you WANT to DO IT

You will want to do digital preservation if you have

  1. a lot of digital content
  2. collected over years
  3. a specified responsibility and resources for that content
  4. an understanding of how that content is used now
  5. how it will be needed in future
  6. and how the type of content you collect may change going forward

Can you add more?

Among our KeepIt exemplar repositories, I would say at least three can apply point 1, while perhaps only one might say that point 2 applies, so far. Given their status as ‘institutional’ repositories, all four exemplars would have understood point 3, but as a result of participating in the project they are likely to understand better its connection with preservation, and we hope that all four are making progress on points 4-6 as a result of the KeepIt course. So the conditions apply broadly, to include different types of repository at different stages of development. They do not exclude any repository, and they can act as a rough indicator of when preservation should become a higher priority that ought to be properly resourced.

If one or more of these six points describes your repository, you are ready to act now. There are tools available to help. Real tools for real content. Now those tools can be applied directly to your repository – for preservation planning and workflow, and for storage – all accessed and controlled through your repository interface.

Whether your content includes text documents, images, sound, vision, science data, or is used for teaching or research in science, arts or any other field, there are strategies to help you.

If that doesn’t describe your repository now, focus on how you plan to get there. Getting to digital preservation will be your success story.

For some repositories there are preservation tools built into your workflow. For others there are just tools. You can work it out.



Digital Collections Risk Assessment at LSE: Using DRAMBORA


At LSE we are undertaking a programme of infrastructure development to build on our capacity to store, manage, preserve and provide access to our growing digital collections. We have a mature IR service holding our research outputs, but are seeking to expand our capacity to handle other digital collections such as the outputs from digitisation projects and born-digital archives.

Key to this development is making the case for investment, a task made even harder by the current economic outlook in higher education. DRAMBORA, a tool introduced on the KeepIt course in which LSE participated, takes a risk-assessment approach to auditing repository contents (where a repository can be taken to mean any collection of digital material), highlighting areas where a repository needs to develop its practices. “Practices” can mean developments to technology, skills or organisational attitude.

The tool itself is both a paper-based exercise and an online assessment centre where you can go through the process of self-auditing digital collections. The process has already been described in other posts on this blog. One point to note is that the tool advises you will need access to various supporting information, such as institutional strategies, policy documents, mandates and staffing structures. However, once this domain modelling is in place, the core of the assessment is analysing the risks to your collections, and the process and outputs of that analysis are not greatly enhanced by this supporting information.


DRAMBORA is based on various tools such as OAIS and TRAC (see the full list) and comes with a set of predefined risks which have been developed and refined by the project partners, and which can provide the starting point for adaptation to local circumstances. In our case, we took a fairly lightweight approach and worked through the provided list to select a small number of risks that crossed functional domains (organisational, technical, and so on) and were representative of concerns in each area. This included high-level, intangible risks such as threats to institutional reputation should we fail to preserve our digital collections, through to low-level, technical risks such as failing to preserve a significant characteristic of a file format or make suitable backups. Our purpose was to demonstrate the general state of our collections and to highlight areas where we need to focus our efforts, rather than to capture every detail in all or even one functional area. It may be that we will use the tool in more detail further down the line, producing, for instance, a comprehensive list identifying technical risks.

We ended up with a set of about 10 risks, by no means exhaustive, which we refined with local expressions of each risk by taking examples from our own circumstances and collections. This gave us a good overview of our current situation and also provided the basis on which to sketch out a roadmap for development. We found the ability to scale our use of the tool to this fairly lightweight implementation to be a huge benefit of the bottom-up, or self-audit, approach.

The outputs, in the form of a risk register, gave us a succinct, well-reasoned and clearly explained summary of the risks to our collections, given an air of authority firstly by the nature of the process—a thorough and transparent audit—and secondly by the provenance of the tool—its development by a group of experts in digital preservation and association with a range of best practice in the field.

We took the risk register produced by the tool and combined it with parallel work of the infrastructure development programme which was investigating our functional requirements and possible software solutions to support our curatorial activities. This resulted in a report and set of recommendations for our senior management team which identified the risks our collections face right now, and proposed solutions. In each case we were able to demonstrate that our proposed developments addressed specific risks to our collections, users and institution generally.

DRAMBORA formed part of our investigations of our next steps in building preservation capacity, and was invaluable in the role it played: making the case for investment clear by pointing out the risks to our current position. However, our use of the tool formed part of an ongoing programme of work by an established team which included other strands of exploration, and the importance of this wider context cannot be overstated. Nevertheless, the self-auditing or bottom-up approach to understanding necessary developments in terms of risk has proved very beneficial.



KeepIt exemplars reveal seven steps to digital preservation readiness

The following edited Twitterstream from Wed. 7th July 2010 is taken from the conference record for OR 2010 on Twapper Keeper (hashtag #or10).

@jisckeepit KeepIt @ OR10 in Madrid. Waiting for Miggie to begin. Funky music playing. Auditorium huge. Audience building

@jisckeepit Miggie speaking as repository manager, not as ‘specialist’ in preservation. Want to show preservation is a ‘realistic aim’ for repositories

[slideshare id=4745059&doc=or10preservingrepositorycontent-100713090052-phpapp02]

@jisckeepit KeepIt @ #or10: Miggie talking about exemplar surveys, objectives and training. Everything is blogged http://blog.soton.ac.uk/keepit/

@jisckeepit KeepIt @ #or10 Miggie: DAF has been a cracking project, and will inform development of NECTAR repository at Northampton for years to come

Miggie Pickton anime at the International Conference on Open Repositories 2010

Reproduced with permission of the OR10 conference organiser, Fundación Española para la Ciencia y la Tecnología (FECYT), Madrid

@RepoSupport Miggie Picton – 7 steps to preservation readiness

@gmcmahon #or10 Preserving repository content: practical steps for repository managers – paper
 http://ff.im/-nktqO

@jisckeepit KeepIt @ #or10: good question on linking format actions and costs. As Miggie said, need to link LIFE3 on costs with risk analysis from Plato

@jisckeepit KeepIt @ #or10 Excellent talk, good story and well delivered, perfect timing. Well done Miggie

@jisckeepit Preservation @ #or10 session ends. Kudos #or10 organisers for putting this in the main auditorium. Good size audience for this topic

@jisckeepit Miggie says there were 140 people at her KeepIt talk. She counted from the podium. You thought the audience were watching the speaker?

Appended tweet, 14 July 2010:
@jisckeepit Lady in green-terrific stream of #or10 photos for Wed 7 July includes many of Miggie giving KeepIt talk http://bit.ly/amUmlV. Sense of scale

Miggie in pictures

Miggie: the essence of KeepIt

Photographs added to this blog 16 July 2010, reproduced with permission of the OR10 conference organiser, Fundación Española para la Ciencia y la Tecnología (FECYT), Madrid.

The final tweet above links to the OR10 slideshow. Well worth a view. The individual photographs are available on Flickr. Here are some other photos featuring Miggie.

A copy of the related short paper is available from the conference site.



Exemplars driving JISC’s digital preservation directions: an update and recap on KeepIt

JISC logo

By funding projects like KeepIt, JISC has for many years sought to develop and promote best practice in digital preservation. This is consistent with JISC’s role as an advisor on technology and digital strategies to UK higher and further education, through the Higher Education Funding Councils. At the invitation of Neil Grindley, JISC digital preservation programme manager, a small number of current digital preservation projects, all working on developing preservation exemplars, met recently to discuss progress with a view to identifying future directions.

These projects included Biophysical Repositories in the Lab (BRIL), Embedding Institutional Data Curation Services in Research (EIDCSR), PEKin (Preservation Exemplar at King’s) – all of which were linked from my previous blog on DP-related JISC projects – and Significant Properties in the Lab (SPIL), another King’s College London project which for some reason was missed from my earlier list.

We will learn in due course what future directions JISC is to adopt. This blog post reports on the KeepIt presentation to the meeting and acts as an update on the project. Remember, this was intended to be a short presentation (20 mins) for an informal small-group discussion, so we did not over-elaborate the slides. A few slides may be familiar from earlier presentations, but they complete the story here.

[slideshare id=4534220&doc=jisc-exemplars-meet-jun10-100618051354-phpapp02]

Broadly, our interest is in the preservation of digital repositories, particularly institutional repositories, which have sprung up across higher education institutions in the UK largely thanks to JISC.

What we have found is that instead of sticking strictly to the original open access agenda – providing access to published research papers – repositories have been diversifying, storing and providing access to different types of content (Slide 2). This recognition, that repository content was changing and hence content management practices would need to change, framed KeepIt’s repository preservation agenda.

When we began in 2004 with our first JISC preservation project, Preserv, the holy grail was to provide preservation services so we could say to repositories: “you don’t have to worry about preservation because there are other experts who can do this for you”.

What we have, instead of organisations providing these services, is a set of tools that have effectively abstracted this preservation expertise for application by others. We may have witnessed a golden age in the development of preservation tools, many produced by JISC projects, some by the behemoth Planets project, and some by other organisations.

With an array of preservation tools comes an array of interfaces, which have begun to be subjected to evaluation by users such as Prom (Practical E-Records blog). An ongoing development is the integration of tools within packages, accessed through single, unified interfaces. One example is from KeepIt, which has continued work from Preserv to integrate tools within an EPrints repository interface (more below).

We can simplify this picture (Slide 3).

Tools are one side of the story. The grail is not achievable without those responsible for digital content management, in our case the repository managers and administrators, having sufficient knowledge and confidence to identify and set out the parameters for preservation, taking account of policy, cost and risk, and then to select and apply appropriate tools and services.

So KeepIt has two strands – People and Technology – and it seeks to connect the two.

Strand 1: People. Early in the project we surveyed our exemplar repositories, selected to represent a range of content types, then asked the managers to set out their preservation objectives and work with us to design what turned out to be a five-part KeepIt course focussed on preservation tools (Slide 4). The course covered: organisations, costs, content description, format management and storage, and trust. It was presented by the experts who had developed the tools, was opened up to other repositories beyond our exemplars, and between 15 and 18 people attended each module, about the right number for a practical, hands-on approach.

The course is now complete and has been evaluated by participants. The biggest test was whether the number of participants would hold up over all five modules. The course was free, so they had plenty of scope to vote with their feet between modules. Yet the numbers remained consistent throughout.

It was clear we must be doing something right. Our course evaluations confirmed this. Participants liked the course structure, mixing presentation with practical (Slide 5). We set clear objectives for the course, outlined in the original course notice. We achieved those objectives and those of the participants (Slide 6). You can find out more on the outcomes of the KeepIt course from our presentation at the European Conference on Digital Archiving (April 2010).

The culmination of the course was a two-day module on preservation planning using Plato and the EPrints preservation tools.

Strand 2: Technology. At this point in the presentation Dave Tarrant gave a live demonstration of the EPrints preservation tools, which work with the latest version of the repository software (v3.2). You can recreate this demo by watching a series of three short videos, with sound commentary, prepared by Dave.

Although preservation planning can initially appear to be complex, persistence pays off, as evaluation of this course module showed (Slide 7).

Back to the people and the exemplar repositories. Following the course we asked the managers of our exemplar repositories to re-examine and prioritise their original objectives, and we are working with them to achieve their primary objectives within the project (by end of September), and to set them up to achieve those objectives with a longer time horizon.

All want to upgrade to EPrints 3.2 and apply the preservation tools. The speed with which this can happen depends on local IT support and repository service providers (Slide 8).

The two type-specific repositories want to specialise the EPrints preservation tools (Slide 9):

  • Edshare: to identify a typical format profile for teaching and learning repositories and assess the preservation implications
  • eCrystals: to add the two main formats used in storing crystallography data (Crystallographic Information File, CIF; Chemical Markup Language, CML) to the tools, and to seek to coordinate this with the organisations who maintain the formats

No two exemplars are the same, and the institution-wide exemplars are taking different approaches (Slide 10).

This is not to ignore the critical role of costs in managing digital preservation (Slide 11). eCrystals contributed to the KRDS2 survey, and the KeepIt exemplars and all course participants have been invited to evaluate the LIFE3 beta tool, which made such an impact in the KeepIt course, for assessing the costs of managing digital content over its lifecycle.

The KeepIt course provided a number of lessons for the project, its participants and the wider community (Slide 12).

A critical issue for the project, as it approaches its conclusion at the end of September, is how the repository exemplars will exemplify preservation practice. What does it mean to exemplify? (Slide 13) It is not enough for our exemplar preservation repositories themselves to be preservation-ready, according to the parameters they set. They have had the opportunity through this project to dedicate more time and energy to the problem than would have been possible otherwise. I want the repositories to present to, and influence, their peers, in their communities, in their fields. I believe that other repositories want to hear from their peers what has been achieved, and that they are more likely to emulate that experience than to emulate experts. Will there be opportunities in the time available?

Miggie Pickton will report in the general session of Open Repositories 2010 in Madrid in July. I hope other presentations and publications from the exemplars will follow.

As with the KeepIt course, perhaps we will need to create our own forum, again sponsored by JISC, DCC and Planets, on the theme: what have we done with the preservation tools? As we have reported on this blog and on the project Twitterstream, we know there has been application of the course tools among participants – e.g. Kingston, the ESRC Restore project – and uptake of the EPrints preservation tools among other repositories – e.g. at EDINA and Siena.

Of course, we will continue to blog developments and progress here.



Digital Preservation, Risk Management, and UAL Research Online

KeepIt course module 5, Northampton, 30 March 2010
Tools this module: TRAC, DRAMBORA
Tags Find out more about: this module KeepIt course 5, the full KeepIt course
Presentation referred to in this blog entry DRAMBORA: Risk and trust and Data management (Slideshare)
Presentations and tutorial exercises course 5 (source files)

UAL Research Online is a specialist repository of research outputs in arts, design, and media, operating on a version of EPrints that has been customised to hold, manage and showcase our mainly practice-based research. The research outputs of our university (University of the Arts London, which consists of London College of Fashion, Central Saint Martins College of Art and Design, London College of Communication, Chelsea College of Arts and Design, and Camberwell College of Art) are rarely text documents. They are exhibitions, paintings, textile designs, events, stage designs, films, costume designs, sound art, industrial designs, photography, sculpture, installations, etc. This means that our institutional repository is rather different from any other.

Our file formats include:

    Images: jpeg, png, bmp, tiff, gif, pdf
    Audio: avi, mp3, mpeg4, wav, ac3, flac, ogg
    Video: mov, mpeg, quick time, flash, avi, theora/ogg

We are also beginning to include archived websites.

Because of this diversity, our preservation issues are a little more complicated. It will be important for us to use the EPrints extensions (developed by Dave Tarrant of the University of Southampton) that incorporate format recognition, and we will upgrade to the version of EPrints (v3.2) which these tools require, before the end of the summer of 2010.

In addition to implementing the tools developed within the KeepIt project, and drawing on the KeepIt course modules covering the various preservation tools available, I have chosen to work through the online preservation tool DRAMBORA, as it best suits the needs of UAL Research Online at this point in its evolution.

I chose this tool from among the many we discussed during the KeepIt course for the following reasons:

    it is designed for repositories rather than all digital assets of an organisation;
    it can be applied to very new repositories;
    it is a self-assessment exercise;
    it does not require advanced technical knowledge

DRAMBORA stands for “Digital Repository Audit Method Based on Risk Assessment”. It is sponsored by JISC and managed by the DCC, the Digital Curation Centre in the UK.

DRAMBORA defines digital curation as the management of risk. The repository manager establishes the objectives, activities, and assets of the repository, and then assesses the areas of risk – identifying weaknesses and strengths, and then managing the areas of risk.

Essential to DRAMBORA’s approach is the belief that “the job of digital curator is to rationalise the uncertainties and threats that inhibit efforts to maintain digital object authenticity and understandability, transforming them into manageable risks.” (DPE Newsletter, Issue 2, September 2007, p. 9)

DRAMBORA includes the following steps:

  • Defining the mandate and scope of functions of the repository
  • Identifying the activities and assets of the repository
  • Identifying the risks and vulnerabilities associated with the mandate, activities and assets
  • Assessing and calculating the risks
  • Defining risk management measures
  • Reporting on the self-audit

After the DRAMBORA exercise is completed, UAL Research Online should have:

– a ‘comprehensive and documented awareness of mission, aims, objectives, activities and assets.’

– a ‘catalogue of pertinent risks, categorised according to type and relationships, which have been described in terms of ownership, probability and impact’

– ‘internal understanding of shortcomings of the organisation – so that resources can be allocated or redistributed to pressing areas’

We should also be prepared for an external audit, if needed. Compatible external audits are said to include:

– Trustworthy Repositories Audit & Certification (TRAC) – an accreditation of the US National Archives and Records Administration,

– Nestor Catalogue of Criteria for Trusted Repositories, or

– Consultative Committee for Space Data Systems (CCSDS) digital repository audit assessment criteria

One of the objectives of UAL’s participation in the KeepIt project (as defined by my predecessor as manager of UAL Research Online) was to write a series of guides for digital preservation, meant to advise staff about it and to impress the need for it on our senior management. I hope that the DRAMBORA results will feed into this document as well.

So I explored the DRAMBORA site and signed up for the process. I have completed the first stage, in which I defined the functions and scope of the repository. Already I have found much food for thought, and I have several questions to ask the senior managers about the specifics of my repository mandate. I can see that DRAMBORA will require me to think through more than just preservation risks, and that it will be helpful in defining other aspects of our repository more specifically.

After I’d done this, we were fortunate to have a visit from Martin Donnelly of the Digital Curation Centre at the fifth module of the KeepIt course, held on 30 March at the University of Northampton. Martin gave us a thorough grounding in DRAMBORA and we were able to complete some practice exercises. Interestingly, at the end of the course we were polled for our reactions, and all 15 respondents indicated that DRAMBORA could be useful, that they intended to use it, or that they had used it; no one said they were unlikely to use it.

Martin advised us that our audit scope and purpose must be decided ahead of time, and that we must make it clear at which stage of repository development the audit is being performed. It’s important to realise that no repository exists in a vacuum: we are embedded in our institutional management structures and policies, the limitations and possibilities of our IT support provision, the climate of research we function in, and the wider world of UK higher education and funding. We need to be clear on the repository’s goals: what do we do, and what will we do?

Another of the important preliminary steps Martin highlighted was the need to ascribe selected “functional classes” to the repository – for example, metadata management.

We had a workshop session in which we filled out a sample section of the assessment (for reference, this was Stages 4-5-6 on Form T8/T9/T10). Our group looked specifically at T10, entitled Manage Risks. The form asked us to name and describe a possible risk, and then explain its manifestation. Then we classified the nature of the risk, identified the risk owner and stakeholders, and listed the risk relationships, probability and potential impact, from which we calculated its severity. Then we devised a risk management strategy and a risk management activity, and identified the owners of these two. This was a lot of work, and much group discussion ensued. It was a bit difficult to do as a group, because we found our repositories were all quite different, even in terms of the sorts of risks we each thought we’d be likely to face. But it’s clearly a very thorough process.
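
For a feel of the arithmetic behind the form, here is a minimal sketch of a risk-register entry in code. The 1-6 scales and the severity-as-probability-times-impact rule are assumptions borrowed from common risk-matrix practice rather than DRAMBORA’s own definitions, and every field name here is hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Risk:
    """One row of a hypothetical T10-style risk register."""
    name: str
    description: str
    owner: str
    stakeholders: list[str]
    probability: int  # assumed 1 (rare) to 6 (frequent) scale
    impact: int       # assumed 1 (negligible) to 6 (catastrophic) scale
    management_strategy: str = ""
    management_activity: str = ""

    @property
    def severity(self) -> int:
        # Severity as probability x impact is a common risk-matrix
        # convention, used here as an assumption for illustration.
        return self.probability * self.impact

format_risk = Risk(
    name="Format obsolescence",
    description="A deposited file format loses rendering support",
    owner="Repository manager",
    stakeholders=["depositors", "readers", "IT support"],
    probability=3,
    impact=5,
    management_strategy="Monitor format registries; plan migrations",
)
print(format_risk.severity)  # 15: rank register entries by this score
```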

After trying T10, I  was apprehensive about my ability to think up all the possible risks that the repository faces, but was glad to learn that the DRAMBORA pdf guide includes lots of examples of risks repositories may face. The DRAMBORA website claims: “DRAMBORA Interactive provides a host of real-world risk exemplars which you can use or modify for your own repository’s circumstances.” I think this is a crucial part of the process, and I’ll certainly need to refer to these examples.

A minor concern that emerged during the hands-on experience with T10 is that filling out a lot of these forms will be tedious; I envisage a lot of repetition, e.g. stakeholders will be the same for many risks. Also, although DRAMBORA’s status as a self-assessment is one of its good points, I do wonder whether I am qualified to assess all these areas. I don’t see how to independently test my decisions, so how will I know the probability of x happening? DRAMBORA is meant to ‘provide peace of mind’, but if it is based only on my judgement, I wonder how reassuring it will be.

Martin advised that we should allot 5 full working days to the self-audit, and I am not sure where I will find this much time, despite my best intentions and awareness of my responsibility as a KeepIt project partner and exemplar. I will have to put it together in bits and pieces rather than get immersed in the task for a block of time; the latter would be far preferable. It was suggested that there might be a possibility of a ‘DRAMBORA Light’ that I could put together for myself and report on, for the use of other repository managers as busy as I seem to be. There are lots of exciting things going on for UAL Research Online in the coming months, including our EPrints software upgrade, the complete restructuring of the university’s research office, the adoption of the repository for all research reporting functions in the university, and my involvement in three additional projects with their own sets of deadlines, meetings and papers to write. It’s easy to keep putting off getting properly stuck into DRAMBORA, and it’s not just about my own time management; I think this illustrates a common problem for digital preservation generally. We all know that we very much need to assess, manage and minimise risk, but preservation tasks tend to fall into the ‘Important’ category, not the ‘Urgent’ one. It’s easy to spend six months attending to ‘Urgent’ work and never get to any of the ‘Important’ bits.

Over the last few months the need for good risk management has been very dramatically demonstrated in the news. I wonder if it would help to post this photo in a prominent place in my workspace?


Deepwater Horizon oil rig fire
Photo courtesy of the U.S. Coast Guard Eighth District External Affairs http://www.flickr.com/photos/uscgd8/4542934710/
Licensed under Creative Commons Attribution/Share Alike 2.0.

Stephanie Meece
Institutional Repository Manager, UAL Research Online
University of the Arts London



Getting down to the nitty-gritty: preservation workflow tools

KeepIt course module 4, Southampton, 18-19 March 2010
Tools this module: EPrints, Plato
Tags Find out more about: this module KeepIt course 4, the full KeepIt course
Presentation referred to in this blog entry Preservation planning using Plato (Slideshare)
Presentations and tutorial exercises course 4 (source files)

So far in the KeepIt training course we have been introduced to a series of tools that will help us, the repository managers, to prepare our repositories for the long-term preservation of their content. These tools have covered aspects of organisational preparedness through strategy and policy (DAF and AIDA); issues around costing (KRDS and LIFE3); and description for preservation using significant properties, metadata and provenance (InSPECT and PREMIS).

In session 4 of the course we finally reached what I believe will be the core tools for repository managers: the EPrints preservation apps (including the storage controller) and the Planets tool, Plato. Although I’ll be concentrating on Plato in this post, it will really be the interaction between EPrints and Plato that I hope will allow me to preserve my repository content in a manageable and cost-effective way.

About Plato

First released in November 2007, Plato is described as a ‘preservation planning tool’. It defines a consistent workflow which will lead to a complete preservation plan for a given set of objects:

“A preservation plan defines a series of preservation actions to be taken by a responsible institution due to an identified risk for a given set of digital objects or records (called collection). The preservation plan takes into account the preservation policies, legal obligations, organisational and technical constraints, user requirements and preservation goals and describes the preservation context, the evaluated preservation strategies and the resulting decision for one strategy, including the reasoning for the decision. It also specifies a series of steps or actions (called preservation action plan) along with responsibilities and rules and conditions for execution on the collection. Provided that the actions and their deployment as well as the technical environment allow it, this action plan is an executable workflow definition.”

http://www.ifs.tuwien.ac.at/dp/plato/intro_documentation.html

Our tutors, Hannes Kulovits and Andreas Rauber of the Vienna University of Technology, took us through the various steps in the development of the preservation plan:

Preservation planning workflow with Plato

Starting with defining requirements, we considered the context of the preservation plan, including what triggered the preservation planning activity. Institutional constraints, legal obligations and the target community all play a part here. If the organization has a preservation mandate or mission statement then that is relevant at this stage too.

For the purposes of the plan, a selection of records needs to be considered. The sample should be representative of the features and characteristics of all objects in the collection. Stratification by file type, size, content and time of creation may be appropriate. We were advised to use the DROID and JHOVE tools to identify file formats.
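
As a rough illustration of that stratification step, the sketch below groups files by extension and size band and keeps one representative per stratum. It is a toy under stated assumptions: a real plan would stratify on formats identified by DROID or JHOVE, not file extensions, and the size thresholds here are arbitrary.

```python
import os

def size_band(n_bytes: int) -> str:
    # Coarse size strata; the thresholds are arbitrary assumptions.
    if n_bytes < 100_000:
        return "small"
    if n_bytes < 10_000_000:
        return "medium"
    return "large"

def sample_collection(root: str) -> dict[tuple[str, str], str]:
    """Pick one representative file per (extension, size band) stratum."""
    strata: dict[tuple[str, str], str] = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            ext = os.path.splitext(name)[1].lower() or "(none)"
            key = (ext, size_band(os.path.getsize(path)))
            strata.setdefault(key, path)  # keep the first file seen per stratum
    return strata

for (ext, band), path in sample_collection("/path/to/collection").items():
    print(ext, band, path)
```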

To identify requirements it was stressed that we would need input from a wide range of colleagues, including content producers, managers, lawyers, technical specialists and others. The purpose of this step is to define all the relevant goals and characteristics of the plan. Four groups of characteristics were suggested: object characteristics, record characteristics, process characteristics and costs. In the practical exercise for this section we used the Freemind mind-mapping tool to describe the linkages between these characteristics and then we used the in-built facility to import the requirements into the Plato tree editor.

Requirements tree

Rather than try to describe the full set of requirements, KeepIt course members each tackled a small part of the requirements tree. Even this was sufficient to cause much discussion among our small groups – especially when it came to assigning measurable units to each ‘leaf’ of the tree. In our group we decided that a set of templates showing the ‘normal’ requirements for a range of object types would be a useful addition to the Plato tool. This would give new Plato users a benchmark against which they could consider the specific requirements of their own institution.

Moving on to the evaluation of alternatives, we considered the suitability and feasibility of different preservation strategies and tools for each object in the sample. Migration and emulation are the most obvious contenders. Examples of alternative strategies might be conversion from DOC to RTF or PDF format, or migration from one version of PDF to another. For each alternative it is necessary to define which tool to use, which functions and parameters of the tool, and what resources would be required. This leads on to the Go/No-Go decision for each alternative: whether to continue the preservation procedure or not.

If continuing, the next tasks are to develop, run and evaluate standardized experiments on the object. The Planets testbed was used for this. The value of the experimental approach is that objects can be moved through a consistent set of steps, producing results that are comparable and repeatable. By conducting experiments, different tools can be evaluated and the outputs assessed against the requirements previously defined. The most appropriate tool may then be chosen for the eventual transformation of the objects in the collection.

In our group exercise we experimented with using different tools to convert image files from .gif to other image formats. We examined criteria such as the availability and ease of use of the tool, the change in file size and image quality and the time taken to perform the transformation. In real life, each of these criteria would be weighted according to its relative importance to the organization and these weights would be allowed for in the analysis of the results.
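
The arithmetic behind that weighting is a simple weighted sum. The sketch below shows it for a hypothetical pair of conversion tools; the criteria, the 0-5 scores and the weights are all invented for illustration, not results from our exercise.

```python
# Hypothetical evaluation of two conversion tools against weighted criteria.
# Scores use an assumed 0-5 utility scale; the weights sum to 1.0.
weights = {"availability": 0.2, "ease_of_use": 0.2,
           "image_quality": 0.4, "speed": 0.2}

scores = {
    "tool_a": {"availability": 5, "ease_of_use": 4, "image_quality": 3, "speed": 5},
    "tool_b": {"availability": 3, "ease_of_use": 3, "image_quality": 5, "speed": 2},
}

def weighted_score(tool: str) -> float:
    """Sum each criterion's score multiplied by its weight."""
    return sum(weights[c] * scores[tool][c] for c in weights)

for tool in scores:
    print(tool, round(weighted_score(tool), 2))
print("preferred alternative:", max(scores, key=weighted_score))
```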

Having followed the Plato process, it is easy to have confidence in the resulting preservation plan. The methodology is both thorough and sound, and decisions based on this will be fully accountable.

Using Plato in the repository

Like most tools designed to support digital preservation, the Plato tool was not originally intended for use in repositories. Repositories are often complex digital collections containing files in a multitude of formats, with metadata that may be inconsistent and/or incomplete. So the Plato concept of ‘collection’ is potentially very helpful. It enables the repository manager to address the preservation needs of a subset of repository content, a ‘collection’ defined by a set of common characteristics. For each collection a preservation plan can be created and then implemented as needed.

Recognising the point of need is where the new tools in EPrints come in. EPrints is now able to perform an analysis of file formats and identify those at risk of no longer being accessible or editable. This information can be used to trigger action according to the preservation plan created in Plato. The transformed objects can then be re-imported into EPrints, now at a much lower risk of loss.
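
A minimal sketch of that trigger logic, under assumptions: scan the holdings’ formats, flag anything whose risk score crosses a threshold, and hand it to the matching preservation plan. None of these names, values or shapes come from the EPrints tools themselves; they are invented to show the idea.

```python
# Hypothetical shapes only; this is not the EPrints API.
HIGH_RISK = 0.7  # invented threshold

def format_risk(puid: str) -> float:
    """Look up a 0-1 risk score for a format (stub for a risk registry)."""
    risk_registry = {"fmt/example-safe": 0.1, "fmt/example-risky": 0.9}
    return risk_registry.get(puid, 0.5)  # unknown formats score as middling

def check_holdings(holdings, plans):
    """holdings: [(object_id, puid)]; plans: {puid: migration callable}."""
    for object_id, puid in holdings:
        if format_risk(puid) >= HIGH_RISK and puid in plans:
            plans[puid](object_id)  # migrate, then re-import the result

# Example run with one invented plan handling the risky format.
check_holdings(
    holdings=[("eprint:1", "fmt/example-safe"), ("eprint:2", "fmt/example-risky")],
    plans={"fmt/example-risky": lambda oid: print("migrating", oid)},
)
```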

Of course none of this obviates the need for vigilance by the repository manager. In the discussion which followed the presentations on EPrints and Plato, Dave Tarrant reminded us to be proactive about preservation: to identify potential risks and make repository users aware of them. For some repository content there may be no migration solution (e.g. for scientific datasets); the repository manager may have to make the risk explicit (e.g. by documenting it) and allow others to develop a preservation solution.

Nor do the tools provide all the answers. Plato and the new EPrints tools are both at a relatively early stage of development. As Andreas said, showing that a prototype works is quite different from widespread deployment. These solutions need to be turned into a preservation infrastructure, supported by robust digital preservation standards.



KeepIt course 3: Primer on preservation workflow, formats and characterisation

KeepIt course module 3, London, 2 March 2010
Tools this module: Significant properties, PREMIS, Open Provenance Model (OPM)
Tags Find out more about: this module KeepIt course 3, the full KeepIt course
Presentations and tutorial exercises course 3 (source files)

Typical repository format profile, from Registry of Open Access Repositories (ROAR)

This post was updated on 9 April 2010.

In this module we really engaged with issues right at the core of preservation – developing an appreciation of the real properties, functions, behaviours, structures, content and contexts which together provide a thorough understanding of a digital object. It is critical to have this understanding and clarity before we can be confident that we are working to preserve what truly is important about an “object”.

The module focussed on:

  1. Preservation workflow and format risks
  2. Significant properties
  3. Preservation metadata and provenance

This was designed to act as a primer for the following KeepIt course 4, which put this into practice using a preservation planning tool and repository applications.

1 Preservation workflow and format risks

Steve Hitchcock, KeepIt project, University of Southampton

We started our day with a short, informal and enjoyable game for sharing significant data and characteristics among small groups. Different groups were given simple numeric data, or four playing cards, and asked to transmit the data around the group using a given frequency of transcription. Sometimes the data was preserved (numbers or value of playing cards), sometimes the format (suit and/or colour) and sequence, sometimes both; and, as we discovered, sometimes data is lost or errors are introduced, and sometimes these can be corrected. This gave us a useful way to focus our attention on significant properties and the actions involved in preserving data.

First, some important background terms necessary to understand the concept of significant properties were introduced.

Open Archival Information System (OAIS) reference model:
Data object interpreted via Representation Information yields Information Object
National Archives of Australia (NAA) Performance model:
Source interpreted via Process yields Performance

OAIS, as a reference model, provides a way to compare digital preservation systems. This provides us with a mechanism to establish trust between different approaches. It also provides a model that can be aligned with repository processes, such as deposit, content management and access. From an OAIS and content management perspective this can be divided further into preservation-related processes, and among these we find three to describe our preservation workflow for file format management (Table 1).

| Check | Analyse | Action |
| --- | --- | --- |
| Format: version, verification (tools available: JHOVE, DROID) | Preservation planning: significant properties, provenance, technical characteristics, risk factors (Plato, PRONOM, Inform) | Migration, emulation, storage selection |

Table 1. Preservation workflow for file format management
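
Read as code, Table 1 is a three-stage pipeline. The sketch below shows only the control flow, with stubs standing in for the real tools named above; the format identifiers and the planning rule are invented for illustration.

```python
# A minimal control-flow sketch of Table 1; each stage is a stub standing
# in for the real tools (DROID/JHOVE, Plato, migration/emulation/storage).
def check(obj: str) -> dict:
    """Identify and verify the file format (the Check stage)."""
    # Invented PUID-style identifiers, for illustration only.
    return {"object": obj,
            "format": "fmt/risky" if obj.endswith(".old") else "fmt/safe"}

def analyse(report: dict) -> dict:
    """Preservation planning over properties, provenance, risk (Analyse)."""
    report["action_needed"] = report["format"] == "fmt/risky"  # stub rule
    return report

def act(report: dict) -> None:
    """Migration, emulation or storage selection, if planning says so (Action)."""
    if report["action_needed"]:
        print("migrate:", report["object"])

for obj in ["paper.pdf", "dataset.old"]:
    act(analyse(check(obj)))
```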

We were given a practical exercise to compare one format with another, using the PRONOM inherent properties of file formats, and decide which performs better. Groups were given a free choice of which two formats to compare, and each group chose a different comparison. No prior knowledge of the properties was assumed beyond the simple descriptions provided, which were taken to be self-descriptive. Two properties were excluded as these are less easy to evaluate without some prior knowledge.

| PRONOM Inherent Property | Word/PDF | TIFF/JPEG | PDF/XML |
| --- | --- | --- | --- |
| 1000 Ubiquity | PDF | JPEG | PDF |
| 1001 Support | PDF | = | XML |
| 1002 Disclosure | excluded | excluded | excluded |
| 1003 Document | excluded | excluded | excluded |
| 1004 Stability | = | TIFF | XML |
| 1005 Ease of identification | = (or marginally PDF) | = | = |
| 1006 Ease of validation | PDF (internal mechanisms) | = | XML |
| 1007 Lossiness | = | TIFF | XML |
| 1008 IP | = | = | XML |
| 1009 Complexity | Word | TIFF | XML |

Table 2. Comparing popular formats with reference to file format properties (= indicates no clear winner; 1002 Disclosure and 1003 Document were the two excluded properties)

As can be seen from Table 2, in each case a clear format winner was identified based on the analysis provided.

We were then asked to consider why we might choose NOT to use the format that performed better for these criteria:
• PDF/Word – Why not PDF? PDF is essentially a conversion format, not a source authoring format.
• TIFF/JPEG – Why not TIFF? JPEG is compressed and would take up less space in storage. This factor may be crucial. Archival-quality copy or a derivative?
• XML/PDF – Why not XML? Many repository resources are deposited in PDF. Do people understand what they need to do with XML?

Some thoughts about formats [1, 2]:
• Free vs open source vs open standard?
• MS Office – XML – open standard (a Word doc can be saved as XML)
• Open Office – free – XML – open standard
• PDF – page representation
• XML – generic web format, computational

We recognised that it is crucial to work with content creators for repositories: authors typically cannot be expected to check whether they are required to provide a converted copy of a resource [3].

Steve Hitchcock summarised by observing that the issue is essentially one of risk assessment: if we had identified a risk in our personal lives, we would wish to have some way to moderate or manage it, by means of an insurance policy, for instance, or smoke detectors, alarm systems, etc. For repository content there may be very specific risks for which we need to undertake detailed analysis and provide specific solutions.

References:
1. Repositories Support Project briefing document on Preservation & Storage Formats for Repositories (May 2008) http://www.rsp.ac.uk/pubs/briefingpapers-docs/technical-preservformats.pdf

2. Rosenthal, D., dshr’s blog, accessed 24 March 2010; various posts on file formats, e.g. Are format specifications important for preservation? (4 January 2009), Format Obsolescence: the Prostate Cancer of Preservation (7 May 2007), Format Obsolescence: Scenarios (29 April 2007) http://blog.dshr.org/

3. Ashby, S., Summary of responses to IR questionnaire. JISC-Repositories, 18 February 2010 [online]. Available from: JISC-REPOSITORIES@JISCMAIL.AC.UK [Accessed 24 March 2010] http://bit.ly/8Zqdjl

2 Introduction to Significant Properties

Stephen Grace and Gareth Knight from Kings College London.
http://www.significantproperties.org.uk

Stephen Grace started off this detailed section of the day with an introduction to the understanding and definition of significant properties and their relevance to the work of preservation of digital resources. The InSPECT project “adapted the Function-Behaviour-Structure (FBS) framework, a framework developed by John Gero to assist engineers and designers with the process of creating and re-engineering systems.”

Sequencing the analysis of significant properties of digital objects

In essence, the InSPECT framework suggests that the purposes of a digital object are determined by the uses required by specific stakeholders. These purposes determine functionality and, in turn, the properties that are needed over a period of time. By concentrating on these aspects, the suggestion is that an institution may develop a speedier, simpler and cheaper strategy for preserving resources over time.

In InSPECT, FBS (Function, Behaviour, Structure) becomes CCRSB:

  • Content – conveys information (human or machine readable)
  • Context – information from the broader environment in which the objects exist
  • Rendering – how content of an object appears or is re-created
  • Structure – components of the object and how they inter-relate
  • Behaviour – intrinsic functional properties of, or within, an object

Four distinct stages for identifying the Significant Properties of objects were identified:

  1. Documentation of technical properties
  2. Description of specific intellectual entities
  3. Determination of priorities for preservation
  4. Measurement of the success of the transformation process

As in other KeepIt training modules, we undertook practical exercises to enable a clearer grasp of the issues we were beginning to address with respect to significant properties. In the first of these, focussing on object analysis, we analysed an email. We explored its Content (structure and technical properties); Context (sender, recipient, bcc, cc, etc.); Rendering (issues for re-creating an email); Structure (links with other emails in a thread, attachments, etc.); and Behaviour (content interactions, e.g. embedded hyperlinks).

This exercise was detailed and enabled us to reflect on the inter-connectedness of the properties of an object, the behaviours it might support and the relationship of such behaviours to functions – all of these elements being core to the preservation motivation for maintaining the authenticity, integrity and viability of any resources.
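
One way to write down the outcome of such an object analysis is simply to group the observed properties under the five categories. The properties listed in this sketch are illustrative examples drawn from the email discussion above, not InSPECT’s canonical set.

```python
# Illustrative CCRSB breakdown for one email; property names are examples only.
email_properties = {
    "content":   ["subject", "body text", "character encoding"],
    "context":   ["sender", "recipients", "cc", "bcc", "date sent"],
    "rendering": ["line wrapping", "HTML vs plain-text display"],
    "structure": ["thread references", "attachments", "MIME parts"],
    "behaviour": ["embedded hyperlinks", "inline images that load on open"],
}

for category, props in email_properties.items():
    print(f"{category}: {', '.join(props)}")
```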

As a complement to the first exercise, we proceeded to a second, which focused on the stakeholders for an email (sender, recipient and custodian), where the perspective was that of a research student interested in understanding research lifecycles using real-life examples. By understanding stakeholder relationships to an object, we can derive functions for the object as well. Logically, it would then be possible to re-develop any object with different functions and in support of different behaviours. The concept of significant properties can therefore be “fluid”, supporting a pragmatic approach, although equally there may be “must have” features which it is imperative to identify.

In the next section of the day, we explored the software tools which are available for file format identification and analysis:

Digital Record Object Identification (DROID) is a software tool developed by The National Archives to perform automated batch identification of file formats. It allows files and folders to be selected from a file system for identification. After the identification process has been run, the results can be output in XML, CSV or printer-friendly formats.
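
DROID’s CSV export lends itself to building the kind of repository format profile shown at the top of this post. The sketch below tallies identifications by format; the PUID and FORMAT_NAME column names match what DROID exports are generally understood to contain, but treat them as assumptions and check the header row of your own output.

```python
import csv
from collections import Counter

def format_profile(droid_csv_path: str) -> Counter:
    """Tally DROID identifications by format.

    Assumes the export has PUID and FORMAT_NAME columns; verify these
    against the header row of your own DROID CSV before relying on them.
    """
    counts: Counter = Counter()
    with open(droid_csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            if row.get("PUID"):  # skip folders and unidentified entries
                counts[(row["PUID"], row.get("FORMAT_NAME", ""))] += 1
    return counts

# Usage: print the ten most common formats in the profile.
for (puid, name), n in format_profile("droid_export.csv").most_common(10):
    print(f"{n:6d}  {puid}  {name}")
```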

The InSPECT project has used a variety of tools to establish the significant properties of various types of file, e.g. Aperture, readpst and XENA for understanding emails.

JSTOR/Harvard Object Validation Environment (JHOVE) provides functions to perform format-specific identification, validation and characterization of digital objects (the latest version is JHOVE2).
JHOVE can address the following:
1. Identification
a. “I have an object; what format is it?”
2. Validation
a. “I have an object that purports to be format F; is it?”
b. “I have an object of format F; does it meet profile P of F?”
c. “I have an object of format F and external metadata about F in schema S; are they consistent?”
3. Characterization
a. “I have an object of format F; what are its salient properties (given in schema S)?”

Extensible Characterisation Language (XCL): every file format specification uses a different vocabulary for the properties of each file and stores these properties in its own structures in the byte stream. The Planets team is developing ways to describe these file formats to enable comparisons of the information contained within files in different formats. This is done with two formal languages, the Extensible Characterisation Definition Language (XCDL) and the Extensible Characterisation Extraction Language (XCEL), which describe formats and the information contained within individual files.

In exploring some practical file-type analysis using some of these tools, we noted that although, in principle, the development of these analysis tools is useful, there are problems: they provide only limited format support; they require variable access methods; they report inconsistently; and they may use different metrics, with variations even between them. The practical usability of these tools will rest crucially on the capability of specific repository platforms to integrate them into the range of services they offer. No repository manager, administrator or editor will wish to encumber themselves with additional overheads relating to multiple interfaces, metrics and the resolution of internal inconsistencies.
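
To picture the integration problem just described, here is a hypothetical wrapper that maps each tool’s idiosyncratic report into one common record, so a repository platform would only ever handle a single schema. The raw-side field names are guesses for illustration, not the tools’ documented output formats.

```python
from typing import Callable

# One common record shape, so the repository sees a single schema.
CommonReport = dict  # keys: format_id (str), valid (bool or None), tool (str)

def from_droid(raw: dict) -> CommonReport:
    # DROID identifies formats but does not validate them, hence valid=None.
    return {"format_id": raw.get("PUID", ""), "valid": None, "tool": "DROID"}

def from_jhove(raw: dict) -> CommonReport:
    # Guessed raw keys: JHOVE reports a status alongside a MIME type.
    return {"format_id": raw.get("mimeType", ""),
            "valid": raw.get("status") == "Well-Formed and valid",
            "tool": "JHOVE"}

ADAPTERS: dict[str, Callable[[dict], CommonReport]] = {
    "droid": from_droid,
    "jhove": from_jhove,
}

def normalise(tool: str, raw: dict) -> CommonReport:
    """Route a raw tool report through the matching adapter."""
    return ADAPTERS[tool](raw)

print(normalise("jhove", {"mimeType": "application/pdf",
                          "status": "Well-Formed and valid"}))
```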

3 Preservation Metadata and Provenance

In the penultimate section of the day’s work, Steve Hitchcock led a presentation and short practical on the means of describing and recording changes to content over time. Preservation metadata “supports activities intended to ensure the long-term usability of a digital resource.”

Steve’s optimistic message here was: “You are probably doing more preservation than you think.”

Repositories are already taking actions that affect preservation and contribute towards preservation results. Migration and emulation strategies require metadata about the original file formats and the hardware and software environments which support them.
The Library of Congress hosts PREMIS (Preservation Metadata Implementation Strategies); when people refer to PREMIS, they are usually referring to its data dictionary of preservation metadata: http://www.loc.gov/standards/premis.

The data dictionary describes and defines over 100 semantic units (i.e. items of metadata).

PREMIS documents four types of entity:

  • Objects – things the repository stores
  • Events – things that happen to the objects
  • Agents – people or organizations or software that act on objects
  • Rights – expression of rights applying to objects

(Note: Significant Properties are only a small part of PREMIS, currently.)
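
To make the entity types concrete, here is a sketch of a minimal PREMIS-flavoured event record for a format migration, linking an Object and an Agent. The field names echo PREMIS semantic units but are simplified; consult the data dictionary for the real structure and required units.

```python
# A simplified, PREMIS-flavoured event record for one format migration.
# Field names echo PREMIS semantic units (eventType, eventDateTime,
# linking identifiers) but this is a sketch, not the schema.
migration_event = {
    "eventIdentifier": "event:2010-07-01:0001",           # invented id scheme
    "eventType": "migration",
    "eventDateTime": "2010-07-01T10:15:00Z",
    "eventDetail": "Converted master copy from GIF to PNG",
    "eventOutcome": "success",
    "linkingObjectIdentifier": "oai:repo:1234",            # the Object acted on
    "linkingAgentIdentifier": "software:convert-tool-1.0", # the Agent (software)
}
```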

PREMIS data may come from: repository software; content creator; repository administrators; repository policy (describing what info needs to be recorded); preservation tools (e.g. format ID may be generated and validated by tools); preservation services.

It is possible to use PREMIS as a reference model and a starting point and add to it to suit the requirements of an individual repository. Some PREMIS fields will already be present in repository metadata. In the future, it is likely that developments for PREMIS will work with the Significant Properties model developed by the PLANETS Project.

Finally, our day concluded with Steve Hitchcock’s brief introduction to provenance, linking closely to the work led by Luc Moreau on developing the Open Provenance Model (OPM). The provenance of a piece of data is the process that led to that data: provenance describes and records the results of processes on objects over time. The aspiration for OPM (in development) is to support a digital representation for “stuff”, whether produced by computer systems or humans.

Dublin Core to Open Provenance Model, courtesy OPM
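
In OPM terms a provenance record is a small graph: artifacts and processes as nodes, joined by edges such as used and wasGeneratedBy. The sketch below encodes a format migration in that style; the edge labels are OPM vocabulary, everything else is invented for illustration.

```python
# Tiny OPM-style provenance graph for a format migration.
# Node kinds (artifact, process) and edge labels (used, wasGeneratedBy,
# wasDerivedFrom) are OPM vocabulary; the identifiers are invented.
nodes = {
    "a1": ("artifact", "report.gif (original deposit)"),
    "a2": ("artifact", "report.png (migrated copy)"),
    "p1": ("process", "migration run on 2010-07-01"),
}
edges = [
    ("p1", "used", "a1"),            # the process consumed the original
    ("a2", "wasGeneratedBy", "p1"),  # the copy was produced by the process
    ("a2", "wasDerivedFrom", "a1"),  # artifact-to-artifact derivation
]
for src, rel, dst in edges:
    print(f"{nodes[src][1]} --{rel}--> {nodes[dst][1]}")
```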

KeepIt course module 3 was a rich and packed session. It is clear, however, that for practical purposes some substantial work is needed to integrate resources and applications, to support content creators as well as repository managers in developing policies and practices for preservation. In KeepIt we need to work to support this integration.



Tweets from the repository cloud

From the desktop to the cloud, via a repository

Twitter RT @llordllama: The challenge is not cloud computing but cloud thinking

I didn’t go to the JISC-Eduserv meeting on Repositories and the Cloud (23 February 2010), but here is my take anyway, based on selected entries from the tagged Twitterstream for this event (#repcloud), presented in chronological order (and with a nod to my EPrints affiliation). Slideshare presentations by the main speakers on the day are available from the event Web site. There is also a Twapperkeeper archive of the Twitter record using the same tag. Thanks to all the contributors for adding the colour to this report.

Repository and cloud storage are a topic for module 4 of our ongoing KeepIt course on digital preservation tools for repository managers.

BTW, did anyone mention Nicholas Carr’s book The Big Switch?

neilstewart big science and research data seem to be the coming things for repositories

KavuBob Key providers: Amazon, EMC, Rackspace, Microsoft but can’t offer type of SLA, long-term reliability, or security academia wants

mcguthrie M Kimpton: Trusting 3rd Party, data security, long term reliability top 3 challenges in recent survey.

adrianstevenson M Kimpton: 47.7% institutions said they were likely to use cloud in next 12 months

KavuBob ‘Duracloud designed to be hosted by DuraSpace, ‘local’ install, or consortial service’ [for clarity DuraCloud =interface layer]

onothimagen extensive use of present tense in Duracloud talk. this is all production standard ready-to-roll – just like DSpace 2, right?

paddymcc “Standing on the shoulders of giants, or on the stepladders of their data”

tadpole99 RT @llordllama: The challenge is not cloud computing but cloud thinking

janestevenson Les Carr: Repositories need to be agile: to utilize and be able to migrate to new platforms.

andypowe11 but “cloud can blow away” – hybrid storage controllers from eprints.org make decisions about where data goes based on policy file

llordllama This has gone too techie for me.

andypowe11 wondering where the drivers for cloud support in eprints duraspace and zentity really come from – is it just “because we can”?

davetaz @llordllama Integration with Plato, take the workflow from plato load it into EPrints, EPrints performs actions. Find me for more.

kevingashley For Eprints, the App Store becomes the Bazaar. Looking forward to haggling over prices, then 🙂

andypowe11 cloud APIs currently in flux and not well documented – very weak SLAs currently – performance unpredictable – bandwidth ditto

kevingashley Very good analysis of security issues in cloud from Terry Harmer. Some good people have worked on this identifying risks

onothimagen great demo by @davetaz of EP3.2 selective storage control: master file in cloud, dissemination copies local. no vapourware here 😉

RT: @llordllama: Is it too early to go to the cloud considering academics’ worries about even locally hosted repositories

adrianstevenson @paulmiller Policy feedback -“Lots of fear, uncertainty and doubt about the cloud. These need to be addressed “
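The storage-control idea in @andypowe11’s and @onothimagen’s tweets is easy to sketch. Below is a toy illustration in Python of policy-driven routing, where a policy decides whether each copy of a file goes to cloud or local storage. This is not the EPrints 3.2 plugin API; the policy format and names are invented for clarity:

    # Toy sketch of policy-driven hybrid storage in the spirit of the
    # EPrints 3.2 demo tweeted above. Policy and backend names are invented.
    POLICY = {
        "master": "cloud",         # archival masters go to cloud storage
        "dissemination": "local",  # access copies stay on local disk
    }

    BACKENDS = {
        "cloud": lambda name: print(f"upload {name} to s3://repo-bucket/{name}"),
        "local": lambda name: print(f"write {name} to /var/repo/disk0/{name}"),
    }

    def store(name, role):
        """Route a copy to a storage backend according to the policy."""
        target = POLICY.get(role, "local")  # unknown roles default to local
        BACKENDS[target](name)

    store("thesis.pdf", role="master")                 # -> cloud
    store("thesis-preview.png", role="dissemination")  # -> local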



Keeping data safe in an educational repository: cost-benefits analysis

[EdShare logo] KeepIt course module 2, Southampton, 5 February 2010
Tools this module: KRDS, LIFE
Tags: find out more about this module (KeepIt course 2) and the full KeepIt course
Presentation referred to in this blog entry: Costs, Policy, and Benefits in Long-term Digital Preservation (Slideshare)
Presentations and tutorial exercises: course 2 (source files)

Eighteen stalwart repository friends gathered in the School of Electronics & Computer Science, University of Southampton. Most had attended Module 1 a couple of weeks before.

Keeping Research Data Safe (KRDS): an activity model

Module 2, entitled “Institutional and lifecycle preservation costs”, was launched by Neil Beagrie, on KRDS2 – Keeping Research Data Safe – an activity model for identifying the benefits of a repository and the corresponding allocation of costs to those benefits.

We followed the same pattern set out in Module 1 of the course: our guest gave a presentation which set out, in a very clear, structured and helpful way, the rationale for the tool. As training course pilot participants, we were interested in exploring the applicability of the tool and the benefits of following the processes set out by KRDS. We then undertook two practical exercises which enabled us to focus on specific aspects of the tool and to understand in more detail the processes involved.

Since EdShare is an educational repository, we were interested in understanding how applicable the KRDS processes would be to the domain of learning and teaching. I agree with Neil Beagrie’s reply to this question: not everything would be applicable, but some aspects may be relevant for the domain.

In the practical exercise, working in four separate groups, we were asked to categorise, according to the KRDS2 Benefits Taxonomy, which benefits could be costed. We were then tasked with selecting three of these benefits and identifying what information would be required in order to assign specific costs to them. Time was allocated for each of the four groups to report back to the whole group with a summary of their discussion and any specific questions raised.

The KRDS2 Taxonomy structures costs along three dimensions:

  • Dimension 1. Benefits – direct and indirect
  • Dimension 2. Time – near term, longer term
  • Dimension 3. Private/public – benefits

Early in the group activity, one of our members identified the complementary relationship between benefits identified within each of the dimensions: within Dimension 1, for each of the Direct Benefits identified there appeared to be a corresponding Indirect Benefit. To some extent this could create redundancy within the task we were set; alternatively, it could be treated as a verification step within the process, providing additional support for the work undertaken.

For some of the specific categories identified within the Taxonomy, some of us struggled to understand exactly what was intended by their use. “Skills Base”, for example, was intended to refer to the specific skills of the researcher(s) within a project or linked to the production of a dataset; we, on the other hand, felt that it could refer to the skills of those involved in the repository or preservation activity, or indeed the skills of a wide range of stakeholders identified within the research process at many levels. From this discussion, we suggested that clearer articulation, preferably with specific examples, would be helpful throughout the Taxonomy.

Our practical session provided a series of feedback comments to Neil and his team, which were broadly positive and enthusiastic about the possibility of assigning complex costing models to summarised benefits. Issues were raised, however, for cases in which confidentiality is required or mandated, for instance by a funder, or where national security or commercial sensitivity is involved. There were also many examples of specific benefits offered by preservation and repository work to which economic costs could not readily be assigned, but where we would wish to indicate reputational, peer-recognition and general community benefits for the work undertaken.

Applying KRDS to EdShare

In the context of EdShare’s educational interests, it would be appropriate to edit the Taxonomy’s references to “research” (although perhaps not always: there could be specific pedagogical research linked to the development of educational resource sharing, especially as it develops over several years).

The Taxonomy might then look more like this:

Dimension 1

Direct Benefits:

  • New educational opportunities
  • Communication between educators
  • Re-purposing and re-use of resources
  • Increasing return on investment for education
  • Stimulating new networks/collaborations
  • Improved skills for developing resources
  • Increasing applicability of resources
  • Verification of educational approaches

Indirect Benefits:

  • No re-creation of educational resources
  • No loss of future educational opportunities
  • Lower future preservation costs
  • Re-purposing resources for new audiences
  • Re-purposing methodologies
  • Use by different audiences
  • Protecting returns on earlier investments
  • Fulfilling funding/institutional requirements

Dimension 2

Near Term Benefits:

  • Value to teachers and students
  • Continuity of access during staff turnover
  • Short-term re-use of well curated resources
  • Secure storage for educational resources

Long Term Benefits:

  • Secures value to future teachers and students
  • Availability of resources underpinning educational programmes
  • Adds value over time as collection expands and acquires critical mass

Dimension 3

Private Benefits:

  • Benefits to teacher
  • Fulfil funding obligations
  • Increased visibility and sharing
  • Commercialising education

Public Benefits:

  • Benefits to public/funder/institutions/repository
  • Input for future educational programmes
  • Motivating new educational programmes/learners
  • Catalysing new learners and resources
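One way to make the three dimensions operational is to treat each benefit as a record tagged along all three at once, so that benefits can be filtered and matched to cost lines. Here is a minimal sketch in Python; the field names and example tags are invented for illustration, since KRDS defines the taxonomy rather than a data format:

    from dataclasses import dataclass

    # Each benefit carries a tag for each KRDS dimension. The tags on the
    # examples below are illustrative assignments, not taken from KRDS.
    @dataclass
    class Benefit:
        description: str
        kind: str     # Dimension 1: "direct" or "indirect"
        horizon: str  # Dimension 2: "near-term" or "long-term"
        accrual: str  # Dimension 3: "private" or "public"

    benefits = [
        Benefit("No re-creation of educational resources",
                kind="indirect", horizon="near-term", accrual="private"),
        Benefit("Input for future educational programmes",
                kind="direct", horizon="long-term", accrual="public"),
    ]

    # e.g. the direct benefits are the natural first candidates for costing
    costable_first = [b for b in benefits if b.kind == "direct"]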

We will need to work through this in more detail in the EdShare case and develop a better understanding of these costing activities. As educational repositories develop nationally, within institutions as well as across subject disciplines, we will have more opportunities to explore what form the appropriate taxonomy takes. Different aspects may need to be addressed depending on whether resources are shared openly or restricted to specific communities.

In all, this session on KRDS was both interesting and stimulating. There are undoubted benefits in this approach for better understanding the opportunities provided by all the repository and preservation activities our community is involved in.



Planets Way, London, highlights

[Planets project logo]

Twitter: jisckeepit Planets Way, London, day 1: colour, depth, and joined-up thinking (in parts) (11:24 AM Feb 10th)

I thought it might help to fill in a little detail for this tweet. This is a highlights package, so if you want a comprehensive report of proceedings at this event, I suggest you see this blog report on day 1 of Digital Preservation the Planets Way.

A couple of caveats. I wasn’t feeling great that day, so after a slow start I arrived by mid-morning break. The following seems to be the highlight of what I missed.

Twitter kevingashley Ross King – great 30-minute intro to whys and hows of dig preservation, incentives, markets, risks (9:59 AM Feb 9th)

Post-break seemed distinctly black-and-white, literally, when it came to the slides. BTW, that’s the second caveat: no slides are available on the Planets site yet.

UPDATE (February 17, 2010): within hours of posting this, the presentations became available (totally unconnected, I’m sure), but it’s not ideal. Instead of being able to link individually to each highlighted talk, there is simply a downloadable zip file that, presumably, gives you everything from the day. (The location of this zip file was sent in an email to delegates, and I can’t see it on the Web site for this event, so grab it now in case it is withdrawn from all-comers!)

So to the colour: Manfred Thaller showed how a simple migration of an abstract colour image can result in a different image.

Twitter WilliamKilbride Manfred ‘comparison is simple: get people to look at every file’. 1,000,000 files=10,416 working days. Perhaps automate? (12:32 PM Feb 9th)

This is the technical, practical, hard end of digital preservation, concerned with identifying and preserving the characteristics of digital objects, and this was a graphic account of the framework, tools and means to manage this problem for large volumes of content.
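As an aside, the arithmetic behind that tweet appears to assume about five minutes of manual inspection per file: at 5 minutes per file, an 8-hour working day covers 96 files, and 1,000,000 ÷ 96 ≈ 10,416 working days, the figure quoted. Hence the case for automated characterisation and comparison.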

Depth: Hannes Kulovits could not have packed more into a half hour presentation on preservation planning with Plato.

Twitter kevingashley PLATO uses freemind mind map tool as part of process; @cardcc would approve! (2:55 PM Feb 9th)

Hannes reminds me of KeepIt colleague Dave Tarrant, a great thinker, implementer, enthusiast, and speed presenter (1, 2). There was enough content here for two days, so it’s fortunate we have that time in our KeepIt course module 4 (18-19 March, Southampton, one-off places may be available), where Hannes, Dave and Andi Rauber will link tools for preservation workflow and planning in a real repository context.
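For those who have not seen Plato, its planning approach rests on weighted multi-criteria evaluation: candidate preservation actions are scored against requirements, and the weighted scores are compared. Here is a minimal sketch of that style of evaluation in Python, with invented criteria, weights and scores rather than Plato’s actual data model:

    # Sketch of weighted multi-criteria scoring in the style of Plato's
    # utility analysis. Criteria, weights and scores are invented.
    WEIGHTS = {"fidelity": 0.5, "openness": 0.3, "cost": 0.2}

    alternatives = {
        "migrate to PDF/A": {"fidelity": 4, "openness": 5, "cost": 3},
        "keep original and emulate": {"fidelity": 5, "openness": 2, "cost": 2},
    }

    def utility(scores):
        """Weighted sum of per-criterion scores (here on a 1-5 scale)."""
        return sum(WEIGHTS[c] * s for c, s in scores.items())

    best = max(alternatives, key=lambda a: utility(alternatives[a]))
    print(best)  # the plan with the highest weighted utility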

Joined-up thinking: this was provided by the returning Ross King.

Twitter kevingashley Ross King has removed his tie because ‘this is a more technical presentation.’ Quite right too 🙂 (3:46 PM Feb 9th)

Planets is a big pan-European project of many years that completes its tour of duty in May. It has produced a lot of outputs, as this event attests. Surely the critical feature of an event that seeks to make overall sense of all this is how it joins up. In my journalist way of thinking, which puts the point of the story at the top, this is where the day should have started. I asked people and I was pretty sure I hadn’t missed this. So by the time Ross spoke for a second time, towards the end of the day, I thought the time for join-up had passed, but it was quickly clear that here it was. At last! My impression was that the slides captured most of this, even without the erudite commentary. I won’t attempt to summarise Planets join-up here, but if you can find the slides for this event, this is where I recommend you start, even if for some reason Planets Way, London, chose not to.
