One view of digital preservation is that it is the management of risk. A component of preservation is storage, and there is an increasing range of storage options. This is why EPrints repository software, following work in the Preserv 2 and KeepIt projects, has developed a repository storage controller and a series of plugins to enable this controller to work with emerging storage services.
One new option is ‘cloud’ storage, which for the user removes the problems of building technology and infrastructure and reduces the issue to economics (cost of space x usage). EPrints has a plugin for the new Sun Cloud Storage Service as well as for Amazon S3.
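As a rough illustration of that economic view, the sketch below estimates a monthly bill from the space held and the data moved. It is only one reading of "cost of space x usage": the function name, the two-part pricing model and every figure are hypothetical, and no provider's actual price list is implied.

```python
def monthly_storage_cost(stored_gb, price_per_gb_month,
                         transferred_gb=0.0, price_per_gb_transfer=0.0):
    """Estimate one month's bill: space held plus data moved.

    All prices are assumptions supplied by the caller, not real tariffs.
    """
    values = (stored_gb, price_per_gb_month, transferred_gb, price_per_gb_transfer)
    if min(values) < 0:
        raise ValueError("quantities and prices must be non-negative")
    space_cost = stored_gb * price_per_gb_month
    usage_cost = transferred_gb * price_per_gb_transfer
    return space_cost + usage_cost


if __name__ == "__main__":
    # Hypothetical repository: 500 GB held, 50 GB downloaded in the month.
    cost = monthly_storage_cost(500, 0.10, transferred_gb=50,
                                price_per_gb_transfer=0.20)
    print(f"Estimated monthly cost: {cost:.2f}")
```

Running the same function against each provider's published rates gives a like-for-like comparison of cost, though it says nothing about the risks discussed next.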
But cloud services do not remove risk; they just alter the risk profile. Which cloud service do you trust?
Questions need to be asked of each cloud service provider. One way of identifying the issues and questions, and a good way to prompt thinking about preservation, is to profile the services, just as we have been doing for our repository exemplars in KeepIt. Helpfully, ZDNet has profiled the ‘big five’ cloud service providers (‘plus one to watch’, but not yet including Sun), and the profiles are not just about storage.
No doubt we will discover more questions to ask, and the ZDNet profile certainly doesn’t provide all the answers. Asked what back-end cloud infrastructure the provider has in place, for example, “Amazon declined to provide any details” and “Google declined to provide details on the number of its datacentres or their locations”. But it is a start, and it is pitched at a good level for inquisitive repository managers.
Update (12 June 2009)
Lightning takes down Amazon cloud: just one example. Judge the risks for yourself. “While most instances were unaffected, a set of racks does not currently have power, so the instances on those racks are down. The disruption lasted around four hours. … A series of outages have hit other online or cloud computing services over recent months … Google services were hit by an outage which apparently affected one in 10 of its users. … Salesforce.com experienced an outage that disrupted all its customers for approximately an hour.”