KeepIt course module 4, Southampton, 18-19 March 2010
Tools this module: Plato, EPrints preservation apps
Presentations and tutorial exercises course 4 (source files)
Preservation planning provides a workflow leading to a preservation plan. One of the problems with some approaches to digital preservation is that they are too quick to act where file formats are concerned. For example: take a file format, decide the consensus is against it (Microsoft Office formats, say), and migrate. There is little formal justification for this process beyond the fact that it is possible and not very difficult. The longer-term consequences of the action are unknown, and what was done with good intent may turn out to be detrimental.
Preservation planning seeks to give a more formal basis to such decisions, and in the process will help to automate and record the consequent actions.
For this session in KeepIt course 4 we welcome back Andreas Rauber and Hannes Kulovits from the Vienna University of Technology to provide an extensive presentation on preservation planning using a tool, Plato, which they developed as part of the PLANETS project. This will be followed by further practical work.
As a formal process, preservation planning requires, unsurprisingly but perhaps disconcertingly, a lot of preparation as it takes account of preservation policies, legal obligations, organisational and technical constraints, and user requirements (slide 8). Fortunately, participants in this course are well positioned to do this, since we have covered many of these issues already in KeepIt courses 1-3.
The preservation planning approach supported by Plato can be overlaid on the OAIS reference model (slide 10), and is shown in more detail in slide 12. Preservation planning with Plato involves four stages:
- Define requirements (slides 12-33)
- Evaluate alternatives (slides 34-45)
- Consider results (slides 46-58)
- Build preservation plan (an exercise)
The reader can explore the Plato workflow using the slides. Here we will highlight some of the critical stages.
In the KeepIt course, and indeed throughout the KeepIt project, we have begun our preservation approach with format identification, but now we go further. We have to relate our identification and other information about our digital objects to our requirements for those objects. This is where our understanding of the significant characteristics of digital objects from KeepIt course 3 becomes useful. Slide 25 shows a mindmap of the sort familiar, again, from course 3, and in slide 26 we encounter the Plato interface, the tree editor, for the first time. Flipping between mindmap and Plato, the following slides show how the requirements are elaborated and values added, illustrating how this information might be mapped to the Plato editor.
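To make the idea of a requirements tree concrete, here is a minimal sketch in Python. It is not Plato's data model or file format: the `Requirement` class, the branch names and the measurement units are all hypothetical, chosen only to show how a mindmap of requirements narrows down to leaf criteria that can each be measured.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class Requirement:
    """One node of a requirements tree (hypothetical, not Plato's own model)."""
    name: str
    unit: Optional[str] = None  # only leaves carry a unit of measurement
    children: List["Requirement"] = field(default_factory=list)

    def leaves(self, path: str = "") -> Iterator[str]:
        """Yield every measurable leaf as 'branch/.../leaf [unit]'."""
        here = f"{path}/{self.name}" if path else self.name
        if not self.children:
            yield f"{here} [{self.unit}]"
        for child in self.children:
            yield from child.leaves(here)


# Illustrative requirements for a collection of GIF images (made-up values).
tree = Requirement("GIF images", children=[
    Requirement("Object characteristics", children=[
        Requirement("Image width preserved", unit="yes/no"),
        Requirement("Colour depth", unit="bits"),
    ]),
    Requirement("Process characteristics", children=[
        Requirement("Time per file", unit="ms"),
    ]),
    Requirement("Costs", children=[
        Requirement("Licence cost", unit="EUR per year"),
    ]),
])

for leaf in tree.leaves():
    print(leaf)
```

The point of the sketch is that every branch has to end in something that can be measured or judged, which is what makes the later evaluation of alternatives possible.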
Once we have described our objects we want to know what we might do to preserve them. There is typically more than one choice, not just in terms of a preservation action (e.g. format migration) but also in how that action might be performed, and these alternatives need to be evaluated against our requirements. You may recall that at the end of the previous session in this course module we deposited some GIF files in a test repository and then downloaded those files in readiness for this session. In slide 37 we see these files appearing in Plato for the first time. Plato now shows us what alternative actions are available for these files.
Now there is a decision to be made: go/no-go/deferred-go. To make an informed decision we need to run some experiments, that is, to run the alternatives and compare the results before we commit to any plan. Plato helps us to run and evaluate the experiments, in this case on our image files.
Having begun to get some results we could perhaps begin to think we have done the hard work, but there is still a tricky stage to negotiate. Before we can analyse the results we have to transform and weight the measured values from the experiments. That is, we map the measured values onto a common scale so that measurements of different kinds can be compared (slides 48-49), and we set the level of importance for each of the factors in our requirements tree (slides 51-52).
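The transform-and-weight step can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Plato's implementation: the 0-5 target scale, the criteria names, the weights, the worst/best bounds and the measured values for the two alternatives are all invented for the example.

```python
from typing import Dict, Union

Measured = Dict[str, Union[str, float]]


def transform_numeric(value: float, worst: float, best: float) -> float:
    """Linearly map a measurement onto an assumed 0-5 scale, clamped.

    Works for 'lower is better' criteria too: pass worst > best.
    """
    score = 5.0 * (value - worst) / (best - worst)
    return max(0.0, min(5.0, score))


# Assumed mapping for a yes/no criterion.
ORDINAL = {"yes": 5.0, "no": 0.0}

# Assumed relative importance of each leaf criterion; the weights sum to 1.
WEIGHTS = {"width_preserved": 0.5, "time_ms": 0.3, "size_kb": 0.2}


def utility(measured: Measured) -> float:
    """Transform each measurement, then combine them as a weighted sum."""
    scores = {
        "width_preserved": ORDINAL[measured["width_preserved"]],
        "time_ms": transform_numeric(measured["time_ms"], worst=1000, best=0),
        "size_kb": transform_numeric(measured["size_kb"], worst=200, best=0),
    }
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Made-up experiment results for two hypothetical alternatives.
alternatives = {
    "alternative A": {"width_preserved": "yes", "time_ms": 200, "size_kb": 120},
    "alternative B": {"width_preserved": "yes", "time_ms": 800, "size_kb": 60},
}

ranking = sorted(alternatives, key=lambda n: utility(alternatives[n]), reverse=True)
for name in ranking:
    print(name, round(utility(alternatives[name]), 2))
```

Without the transformation, milliseconds, kilobytes and yes/no answers could not be added together at all; without the weights, a trivial criterion would count as much as an essential one. A weighted sum is only one possible way of aggregating the scores.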
Finally, Plato presents our results (slides 55, 57) and we can see the benefit of using this tool.
Summary and exercise
Before we start the exercise, slides 60-64 summarise the presentation so far. If there is one conclusion that I would highlight above the others, it is that
- preservation planning is a basis for well-informed, accountable decisions (slide 64)
It is no longer necessary or acceptable to make ad hoc preservation action decisions. This has been a detailed and involved process, but the benefit for a large repository is that the resulting plans can be used across all content in the analysed formats, now and in the future.
Two exercises are set out in slides 66-70. Again, we use our imported GIF files. My impression from observing participants on the course was that this may have been the hardest exercise in the whole KeepIt course, especially exercise 1, which confronts the requirements and is the first encounter with the tools, including the FreeMind mindmap tool. There is nothing like a steep learning curve to get the best out of people, and by the end of this session you could hear the sound of pennies dropping.
We now have a preservation plan, and in the next session we will put that plan to work in a repository.