As part of our participation in the KeepIt project, the EPrints Formats/Risks plugin was installed on our newly-upgraded repository in order to allow us to identify format risks based on DROID (Digital Record Object Identification) [this blog post includes an explanation of how the DROID file format tool works]. We performed the identification task in late September 2010. The process was very quick (it took about 13 seconds for our repository of about 2000 items), did not affect repository function, and as it required a single click of a button, couldn’t be easier.
A comparison of the different project members’ results from this task will be discussed in an upcoming blog post. For the UAL Research Online repository, we immediately identified a potential preservation risk in the fact that over 200 objects were unidentifiable, and returned as ‘unknown’ file formats. The extensions for the unknown formats are generally recognisable, such as .swf or .mov. Possibly the objects do not contain digital signatures that DROID was able to recognise; if this is so, we will need to investigate why these signatures are absent or unreadable. We will also check each of the unknown objects to make sure they are valid files and have not become corrupted (a good chance to do a little general housekeeping).
In response to our feedback about these results, the developer of the EPrints plugin is looking at providing a way for files unrecognised by DROID, but with identifiable extensions, to be manually classified. Using the Formats/Risks plugin on our diverse collection of research outputs in arts, design and media has therefore been immediately useful for learning about and managing our collection.
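To make the idea of "signature first, extension as a fallback" concrete, here is a minimal sketch in Python. It is not the plugin's code and it does not use DROID's signature file: the handful of leading-byte patterns, the extension hints and the `classify` function are all assumptions chosen for illustration. A file whose first bytes match a known pattern is reported as identified; a file that fails that test but has a familiar extension is flagged for manual review rather than trusted automatically.

```python
from pathlib import Path

# Illustrative leading-byte patterns only (an assumption for this sketch);
# DROID's real signature file is far more detailed than this.
SIGNATURES = {
    b"%PDF": "PDF",
    b"\x89PNG\r\n\x1a\n": "PNG",
    b"\xff\xd8\xff": "JPEG",
    b"FWS": "SWF (uncompressed)",
    b"CWS": "SWF (compressed)",
}

# Hypothetical extension hints, used only when no signature matches.
EXTENSION_HINTS = {
    ".swf": "SWF",
    ".mov": "QuickTime",
    ".pdf": "PDF",
}


def classify(filename, head):
    """Return (status, format_name) for a file name and its first bytes."""
    # 1. Trust the content: look for a known pattern at the start of the file.
    for magic, name in SIGNATURES.items():
        if head.startswith(magic):
            return ("identified", name)

    # 2. No signature matched: the extension is only a hint, so the file
    #    is queued for a person to check rather than classified outright.
    hint = EXTENSION_HINTS.get(Path(filename).suffix.lower())
    if hint is not None:
        return ("needs manual review", hint)

    # 3. Neither the content nor the name tells us anything.
    return ("unknown", None)
```

In use, `head` would be the first few bytes read from each deposited file, for example `open(path, "rb").read(16)`. Keeping the extension-based result in a separate "needs manual review" state reflects the concern above: an object with a plausible extension but no readable signature may be a corrupted or mislabelled file.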
We also registered for a self-audit of our risks using the DRAMBORA tool. A previous blog post provided a general introduction to DRAMBORA, and our own approach to using this tool has also been explained.
As mentioned in that earlier blog post, I was unsure that I would find the time to follow the whole DRAMBORA programme (see this schematic of the DRAMBORA method). The user manual suggests that four to five days of 6 hours each, or 24 to 30 hours in total, would be required to carry out the full self-audit. My wariness was justified; I estimate that I have found no more than four hours to spend with DRAMBORA since registering UAL Research Online for the audit many weeks ago. The preliminary stage (the ‘Preparation Centre’) and the first elements of the next stage (the ‘Assessment Centre’) that I have so far encountered are still concerned with copying and pasting general descriptions and policies, such as the wording of our mandate, and listing our objectives. Currently I am detailing our constraints; a constraint is glossed as “any factor that compels or influences the repository to operate in a particular fashion”, and as such the list is becoming rather lengthy. I’m not sure that these larger policies will directly impact our preservation activities, though I can see that copying them over to one place is a thorough approach to documenting a repository.
Perhaps what I have learned about the DRAMBORA process is that it isn’t realistic to expect a small repository team to be able to complete the full process in and around their daily management activities. With one full-time manager and one part-time administrator, our staffing arrangements are typical for UK repositories, and I know of several repositories managed by single part-time librarians.
Miggie Pickton at the University of Northampton, another KeepIt exemplar repository, completed her scoping project with the DAF tool, structuring the work as a separate project lasting 8 weeks and employing two graduate interns to carry out the research. The interns were found through the Graduate Boost programme, which receives funds from HEFCE and the European Social Fund. A DRAMBORA self-audit was completed for the repository at LSE, where it was found to be beneficial. Unfortunately, no estimate was provided of the time and staff that LSE needed to complete the process.
Our initial scoping of DRAMBORA did not indicate that we needed to set up a separate project similar to that at Northampton; however, it seems clear that undertaking DRAMBORA needs to be planned and funded, with additional staff required either to carry out the audit or to release existing staff to do so. The latter option is probably preferable, as DRAMBORA requires the auditor to have an in-depth understanding of high-level university policies as well as the institution’s IT procedures, and to have access to policy documents across university departments, including library, legal, research management and IT. The repository manager, or this person’s line manager, is best placed to be able to access and interpret these documents.
Steve Hitchcock, KeepIt project manager, suggested that a ‘DRAMBORA Light’ might be the solution; a scaled-down version, sacrificing some thoroughness in favour of rapid results, would be more realistic for institutional repository use. Rather than building up a comprehensive profile of the repository, the scaled-down version would aim directly at the most common and relevant risks encountered in repository management, and ideally could be completed in no more than half a day. If the repository manager judged it necessary, the risk register generated by the Light version could be used as the basis for a request to senior management for time or funds to carry out the full audit.
In sum, the EPrints Formats/Risks plugin provided the most tangible value for us. Although it only dealt with one aspect of digital preservation risks, it was quick and efficient and identified areas we can start investigating and improving right away.