RedFeather at Open Repositories
Last week I headed up to Edinburgh for Open Repositories 2012 and took RedFeather along with me. While I didn’t have my own session or PechaKucha, my contacts in the EPrints community got me a short presentation slot as part of the EPrints user groups. This allowed me to talk about RedFeather as a way to introduce new users to the concept of repositories and let them discover the value of such systems, the eventual idea being to set them up with a full-scale repository afterwards (I think of it as a gateway drug into repository platform addiction!). While the presentation was only short, it did give me a chance to discuss RedFeather afterwards with the user group members, and I got some interesting feedback about how it might fit into the repository ecosystem.
My main mission, however, was to gratuitously talk about RedFeather as much as possible to anyone who would listen throughout the entire duration of the conference. I thought the best way to do this would be to enter the dev challenge and integrate RedFeather into my entry somehow. After bouncing some ideas off the other developers I eventually came up with a novel idea, Splinter Repositories, which utilised RedFeather without including any of the use cases RedFeather was originally intended for (it’s generally bad form to enter an existing project into the dev challenge).
A Splinter Repository is an offshoot of a larger repository that acts on its behalf, while still being an independent entity. Like its namesake, a Splinter Cell, it can later be reabsorbed into its progenitor or – if it was unsuccessful – simply disposed of. This makes it ideal for situations such as a conference or workshop, where there are a large number of unregistered, inexperienced or untrusted users who wish to contribute to a repository. Instead of unleashing them on your precious main repository, you can simply spawn a Splinter Repository to cater for this group. This reduces the administration overhead from supporting new users as well as acting as insulation between them and your existing content.
All a user has to do to join the Splinter Repository is visit a special page on the main site which allows them to spawn their own private instance of RedFeather. This microrepository instance (thanks to Mahendra Mahey for coining the term) will be automatically registered as part of the Splinter Repository and can be accessed via a special index. This allows it to act as a pseudopublic workspace: externally visible, but not part of the archive content. After the Splinter Repository has reached the end of its lifecycle, the repository admin has the option to selectively absorb any of the content in the microrepository swarm into the main archive using SWORD. The entire Splinter Repository can then be trivially discarded or deleted without affecting the main repo.
Sadly, I didn’t win any prizes this year, but I feel it did introduce some interesting new concepts that might not have been considered before. Maybe next time…
Embedding Previews
One of the biggest challenges with RedFeather is providing users with the capability to preview OERs within their browser without requiring any specific browser extensions, plugins or server-side conversion tools. When we were first scoping out the project we were aiming to support only simple media types such as PDF or images and embed them in a standards-compliant fashion. Then we discovered the fantastic Google Docs viewer, which allows us to embed a much greater number of media types, including the all-important Word and PowerPoint.
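To give a feel for how little is involved, here is a minimal sketch of building the viewer embed. RedFeather itself is PHP; Python is used here purely for illustration. The function name and the default dimensions are made up for this example, and the endpoint is the viewer's public embed URL as I understand it rather than anything quoted from the RedFeather source.

```python
from html import escape
from urllib.parse import quote

# Assumption: the public embed endpoint of the Google Docs viewer.
GOOGLE_VIEWER = "https://docs.google.com/viewer"


def google_viewer_iframe(file_url: str, width: int = 600, height: int = 500) -> str:
    """Return iframe markup asking the Google viewer to render file_url.

    Hypothetical helper for illustration only.
    """
    # Percent-encode the whole document URL so that any query string it
    # carries is not confused with the viewer's own parameters.
    src = f"{GOOGLE_VIEWER}?url={quote(file_url, safe='')}&embedded=true"
    # escape() turns & into &amp; so the attribute value is valid HTML.
    return (
        f'<iframe src="{escape(src)}" width="{width}" height="{height}" '
        f'style="border: none;"></iframe>'
    )


if __name__ == "__main__":
    print(google_viewer_iframe("http://example.org/slides.pptx"))
```

Note that the file has to be reachable from the public internet, since it is Google's servers that fetch and convert it, not the visitor's browser.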
This is actually quite a big deal, especially coming from an EdShare background where these document types cause huge problems. Going from a docx file to an embeddable HTML element involves first converting to PDF using OpenOffice and then converting the resulting PDF to images using ImageMagick. This two-stage process performs quite well with simple documents but can fail quite badly if presented with complex content (e.g. unusual character sets or intricate vector-based images).
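For anyone who hasn't seen this kind of toolchain, a rough sketch of the two stages follows. It is not the EdShare code: the function name, the 150 dpi setting and the output naming are assumptions, and it uses LibreOffice's soffice command line as a stand-in for OpenOffice, since that is the headless conversion invocation I can vouch for. The function only builds the commands; running them needs both tools installed on the server, which is exactly the dependency problem described above.

```python
from pathlib import Path


def conversion_commands(source: Path, workdir: Path, dpi: int = 150) -> list[list[str]]:
    """Build the two commands: document -> PDF, then PDF -> one PNG per page.

    Hypothetical helper; the real EdShare toolchain may differ.
    """
    pdf = workdir / (source.stem + ".pdf")
    pages = workdir / (source.stem + "-%03d.png")  # ImageMagick fills in %03d per page
    return [
        # Stage 1: headless office conversion writes <stem>.pdf into workdir.
        ["soffice", "--headless", "--convert-to", "pdf",
         "--outdir", str(workdir), str(source)],
        # Stage 2: ImageMagick rasterises each PDF page at the given density.
        ["convert", "-density", str(dpi), str(pdf), str(pages)],
    ]


if __name__ == "__main__":
    for command in conversion_commands(Path("lecture.docx"), Path("previews")):
        print(" ".join(command))
```

Each command would then be handed to something like subprocess.run(command, check=True), and a failure in either stage leaves you with no preview at all.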
The Google Docs viewer seems to produce much more accurate conversions for typical content, but slightly worse results with very complicated content (occasionally it fails to produce a preview at all). However, the main appeal of the Google converter is that, unlike the EdShare toolchain, it can run without requiring any additional dependencies on the server.
One downside of the viewer is that the conversion has to take place every single time you access the resource and it doesn’t seem possible to cache the result in any way. This isn’t really a huge problem but after accessing HumBox or LanguageBox it does seem slightly sluggish. While there is no way to speed up viewing Word or PowerPoint documents, certain other media types could be rendered without using Google. Images are the most obvious, and implementing an image viewer was trivial. The next candidate was PDF but it was at this point that we started experiencing difficulties.
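The selection logic this implies can be sketched in a few lines. Again this is an illustration rather than RedFeather's actual PHP: the function name, the returned labels and the list of image extensions are all assumptions.

```python
from pathlib import PurePosixPath

# Assumption: formats browsers can display natively in an <img> element.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif"}
PDF_EXTENSIONS = {".pdf"}


def choose_previewer(filename: str) -> str:
    """Pick a preview strategy from the file extension (hypothetical helper)."""
    ext = PurePosixPath(filename).suffix.lower()
    if ext in IMAGE_EXTENSIONS:
        return "image"   # plain <img>, no conversion needed
    if ext in PDF_EXTENSIONS:
        return "pdf"     # <object> handled by the browser's PDF viewer
    return "google"      # everything else goes through the Google viewer


if __name__ == "__main__":
    for name in ("photo.JPG", "notes.pdf", "slides.pptx"):
        print(name, "->", choose_previewer(name))
```

Only the last branch pays the cost of a fresh conversion on every page view.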
In order to embed a PDF in a webpage you need to use the object element and then delegate rendering to a specific browser plugin. You can also define some open parameters, which allow you to automatically skip to specific pages or scale the document to fit the element you place it in (which is an important usability factor in our case). Ironically, it was our previous saviour Google who thwarted our efforts. The Chrome browser has its own PDF viewer built in, which helps them avoid the many security problems associated with the standard. However, they haven’t implemented these critical open parameters, which means that every single PDF we embedded loaded very fast, but at the incorrect size! We searched for a solution but sadly, there seems to be no workaround.
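For reference, this is roughly what such an embed looks like. It is a sketch under assumptions: the function name and dimensions are invented, and page and view=FitH are the open parameters as documented by Adobe, appended to the URL after a # character. Whether a given viewer honours them is up to that viewer, which is precisely the problem we ran into.

```python
from html import escape


def pdf_object(pdf_url: str, page: int = 1, width: int = 600, height: int = 500) -> str:
    """Return <object> markup for a PDF with open parameters (hypothetical helper)."""
    # Open parameters live in the URL fragment: jump to a page and ask the
    # viewer to fit the page width to the element.
    data = f"{pdf_url}#page={page}&view=FitH"
    return (
        f'<object data="{escape(data)}" type="application/pdf" '
        f'width="{width}" height="{height}">'
        # Fallback content shown when no PDF viewer is available.
        f'<a href="{escape(pdf_url)}">Download the PDF</a>'
        f'</object>'
    )


if __name__ == "__main__":
    print(pdf_object("http://example.org/paper.pdf", page=3))
```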
The other problem we experienced with the Google Docs viewer was a bug that caused the player not to appear in certain circumstances. After investigating the problem it appears to be a bug with the API itself, which occurs if your Google browser session has timed out. If you are currently logged into Google or were never logged in, the service works fine. However, if you have a valid but expired Google session ID, the embed code just fails without notification and you can’t use the previewer until you log in again. This meant you would just get empty space on the RedFeather preview screen for every single resource. Since we can’t fix any problems with the Google API, the best we can do is tell users how to fix the problem if it occurs. This proved surprisingly hard.
The main complication is that it is completely impossible to detect whether or not the bug has even occurred because the Google viewer is embedded using a cross-domain iframe. This means that the contents are totally invisible to scripts running on our page, a restriction browsers enforce to prevent cross-site scripting attacks. No amount of JavaScript trickery will be able to detect whether or not the viewer has loaded since it will ALWAYS appear empty from our side. Furthermore, it’s not possible to detect on the PHP server side since the failure happens entirely in the client browser.
We couldn’t detect when the error had occurred but at the same time we couldn’t ignore this fringe case, since it is absolutely catastrophic from a usability perspective. One solution would be to just warn the user that this might happen by putting a message somewhere on the site. However, this is unsatisfactory since it clutters the interface and makes it seem like the software is unreliable.
It turned out that the solution had been right in front of us the whole time: hide the message underneath the viewer. That way, if it fails to load you’ll be able to see the message through the empty iframe. If it succeeds, then the text will simply be obscured by the content! This allowed us to give the impression that the system was responding to the error without actually being aware of it at all.
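The layering trick is easy to sketch. The markup below is an assumption about how one might do it, not a copy of RedFeather's templates, and the function name, sizes and wording of the message are invented. It relies on an iframe that has rendered nothing being transparent, so whatever sits beneath it shows through.

```python
from html import escape


def preview_with_fallback(viewer_src: str, message: str,
                          width: int = 600, height: int = 500) -> str:
    """Stack the viewer iframe on top of a help message (hypothetical helper)."""
    box = f"position: relative; width: {width}px; height: {height}px;"
    # Both layers fill the box; only their stacking order differs.
    layer = "position: absolute; top: 0; left: 0; width: 100%; height: 100%;"
    return (
        f'<div style="{box}">'
        # Bottom layer: always in the page, visible only if nothing covers it.
        f'<p style="{layer} z-index: 0; margin: 0;">{escape(message)}</p>'
        # Top layer: once the viewer renders, it paints over the message.
        f'<iframe src="{escape(viewer_src)}" '
        f'style="{layer} z-index: 1; border: none;"></iframe>'
        f'</div>'
    )


if __name__ == "__main__":
    print(preview_with_fallback(
        "https://docs.google.com/viewer?url=x&embedded=true",
        "Preview missing? Try logging in to Google again.",
    ))
```

No script runs and nothing is detected; the page simply looks right in both cases.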

