Digital preservation planning may be technical, but you have to get to grips with your organisational policy and planning first, as a new paper by our KeepIt project advisor Andreas Rauber and coauthors in the latest issue of D-Lib Magazine reveals.
Although the paper is ostensibly about two particular image file formats, the real point of interest for us is how it serves as an example of preservation planning. This is a topic that we will encounter later in our project training, to be presented by Andi, and based on the preservation planning tool Plato, also described in the paper.
Preservation planning connects the processes of managing file formats within your collection. Computer applications change, so some file formats are at risk of becoming obsolete, and when this happens the content may become inaccessible. To prevent this, preemptive action can be taken, but you have to know where (for which files or content) and when to take it.
The first step in this process is to identify the formats of all content files within a collection. Then you have to know the preservation implications of each format, and decide on appropriate actions, if any. This is where it gets tricky, because the number of risk factors to take into account for each format is large – status of applications and viewers, etc. – and so decisions on action become more complex. Bear in mind as well that every preservation action incurs a cost, and costs can rise rapidly for large, diverse digital collections. So good judgement is critical. By connecting these steps, preservation planning can help produce good judgement.
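To make those steps concrete, here is a minimal Python sketch of the sequence: inventory the formats, look up a risk for each, and estimate what acting would cost. It is an illustration under loud assumptions, not how Plato or any repository does it. The file extension stands in for proper format identification, and the risk ratings, the cost per file, and the names `FORMAT_RISK`, `COST_PER_FILE`, `inventory` and `plan` are all invented for the example.

```python
from collections import Counter
from pathlib import PurePath

# Hypothetical risk ratings per format (0 = low risk, 1 = high risk).
# These numbers are invented for illustration, not real assessments.
FORMAT_RISK = {".pdf": 0.2, ".doc": 0.6, ".tif": 0.3}

# Hypothetical flat cost of one preservation action on one file.
COST_PER_FILE = 0.05


def inventory(filenames):
    """Count files per format, using the lowercased extension as a crude label."""
    return Counter(PurePath(name).suffix.lower() or "(none)" for name in filenames)


def plan(filenames, risk_threshold=0.5):
    """List formats that need attention: risk at or above the threshold, or unknown.

    Returns tuples of (format, file count, risk or None, estimated cost).
    """
    rows = []
    for fmt, count in sorted(inventory(filenames).items()):
        risk = FORMAT_RISK.get(fmt)  # None means we have no rating for this format
        if risk is None or risk >= risk_threshold:
            rows.append((fmt, count, risk, round(count * COST_PER_FILE, 2)))
    return rows


if __name__ == "__main__":
    files = ["a.pdf", "b.DOC", "c.doc", "d.xyz", "e.tif"]
    for row in plan(files):
        print(row)
```

Even in this toy version the awkward parts show up: formats with no rating at all have to be flagged for a human to look at, and the cost column grows with every file and every format you add.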
The D-Lib paper provides a practical example, but note the difference in context. The paper evaluates a single preservation action, migration between two formats, for a large-volume archive, whereas the exemplar repositories in KeepIt and other institutional repositories typically switch these factors around, i.e. smaller collections but with many more formats and consequently multiple possible conversions. I’ll leave you to judge the effect on complexity caused by each factor.
Now the really sobering part. The result of a preservation planning process, say Rauber et al., is dependent on the institutional and repository context: “the design of one plan can differ considerably from the plan of another institution, even one with a similar collection.” Considerations include “the institution’s preservation policies, legal obligations, organizational and technical constraints, requirements, and preservation goals, as well as the capabilities of the tested tools. Preservation Planning is a process that depends very much on an institution’s individual policies and requirements in its day-to-day work.” In reality, there is no magic wand.
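A small Python sketch can show why two institutions with similar collections might still reach different plans. It assumes a simple weighted-sum scoring of alternatives, which is a stand-in chosen for illustration and not a description of how Plato works. The criteria, the scores, the weights, and the names `SCORES`, `rank`, `institution_a` and `institution_b` are all hypothetical.

```python
# Hypothetical scores (0 to 5) for two alternatives against three criteria.
# All values are invented for illustration.
SCORES = {
    "migrate": {"long_term_access": 5, "cost": 2, "tool_maturity": 2},
    "keep_as_is": {"long_term_access": 3, "cost": 5, "tool_maturity": 5},
}


def rank(weights):
    """Rank alternatives by weighted average score, best first.

    `weights` maps each criterion to its importance for one institution.
    """
    total = sum(weights.values())
    totals = {
        alternative: sum(weights[c] * s for c, s in criteria.items()) / total
        for alternative, criteria in SCORES.items()
    }
    return sorted(totals, key=totals.get, reverse=True)


# Two hypothetical institutions with the same collection but different priorities.
institution_a = {"long_term_access": 0.8, "cost": 0.1, "tool_maturity": 0.1}
institution_b = {"long_term_access": 0.2, "cost": 0.4, "tool_maturity": 0.4}

if __name__ == "__main__":
    print("A prefers:", rank(institution_a)[0])
    print("B prefers:", rank(institution_b)[0])
```

The scores for the alternatives are identical in both cases; only the weights, which stand in for policy, change the outcome.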
This is where Plato comes in, and this is why we are trying to integrate it with EPrints, so that at least you might be able to invoke the planning process within a familiar repository environment. It’s also why we are not starting the training programme with preservation planning.
Finally, the conclusion of the paper highlights another factor: time dependence. The current recommendation for the image example, despite positive reports of the target format JPEG 2000 and all the accompanying analysis, is not to migrate. But this could change: “in one year we’ll look at this plan again to see if there are more tools available and whether or not the ones we considered in this year’s evaluation have been improved.” In other words, planning is an ongoing process. There is no single, final result: any result can change over time, and all of it depends on your repository, of course.