Lessons in dependency management

The approach I had to take when developing RedFeather was very different from my usual methodology due to the unusual motivations behind the system. As I mentioned in my earlier blog posts, the primary goal behind RedFeather is to provide a system with installation and usage barriers as low as possible. In this blog post I’ll be talking about how this affected the way I designed and implemented the system.

Limiting the number of dependencies required to install and operate RedFeather was probably the biggest challenge during development. It is a well-established principle that you shouldn’t constantly reinvent the wheel: if somebody has already built a component, it is usually better to use it than to redevelop it yourself. This is why things like databases and software libraries exist; they have been developed to solve a very specific task and are usually mature, well-documented, and proven technologies. For example, no sensible developer would program their own video codecs from scratch if something like ffmpeg would be adequate for their needs. However, every additional dependency included within RedFeather increases the overhead involved with installation and makes the system less simple and less self-contained.

The most important rule I imposed on RedFeather was to include zero server-side dependencies, thus eliminating any overhead associated with configuring the server environment. In practice, simple overheads (such as access to a database server) are usually acceptable, but eliminating them entirely allows users to install RedFeather quickly without having to involve system administrators. However, programmers are accustomed to having access to these luxuries and often take them for granted. Developing without them requires us to take a step back and make the most of the facilities that are available. In the case of RedFeather, the most obvious alternative was to use the filesystem of the web server itself, which provides a convenient location to store both the resource metadata and the files.

Contrast this with a full-scale repository platform and you can see clear advantages. In EPrints, for example, resource metadata is stored in the database using a complex multi-table schema, while the uploaded files are placed in a special data structure hidden on the server. Downloading a file involves first looking up the resource in the database, determining the file_id and then serving it using a custom file handler. In RedFeather, the resource metadata for the entire repository is stored in a single file in a schemaless, human-readable format. The associated files are stored directly in the public webspace of the server and are therefore immediately accessible.
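
To make the flat-file idea concrete, here is a rough sketch in Python. It is not RedFeather’s actual code: the use of JSON, the function names and the example.org URL are all assumptions of mine, chosen purely for illustration.

```python
import json
from pathlib import Path
from urllib.parse import quote


def load_resources(metadata_path: Path) -> dict:
    """Return all resource metadata, or an empty dict if nothing is stored yet."""
    if not metadata_path.exists():
        return {}
    return json.loads(metadata_path.read_text(encoding="utf-8"))


def save_resource(metadata_path: Path, filename: str, fields: dict) -> None:
    """Add or update one resource's metadata, keyed by its filename."""
    resources = load_resources(metadata_path)
    resources[filename] = fields  # schemaless: any set of fields is accepted
    metadata_path.write_text(json.dumps(resources, indent=2), encoding="utf-8")


def public_url(base_url: str, filename: str) -> str:
    """The file lives in the public webspace, so its URL is just base + name."""
    return base_url.rstrip("/") + "/" + quote(filename)
```

With something like this, “downloading a file” needs no database lookup or custom handler: the web server serves whatever `public_url("http://example.org/redfeather", "slides.pdf")` points at, and the single metadata file stays readable in any text editor.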

While the method used in EPrints is obviously far more flexible than RedFeather’s, I believe it is unnecessarily complex for 90% of cases and certainly over-specified for someone who just wants to share a handful of simple resources. Notable limitations of RedFeather in this respect are the lack of explicit collections and multi-file resources. However, neither of these features is precluded by this method of storage, and both could be added at a later date.

A harder lesson to learn was that certain features and functionality simply can’t be supported without external tools. This includes complex file processing or conversion, and media rendering (video and audio are particularly problematic since HTML5 only supports a small number of formats). Exceptions had to be made for jQuery (which is indispensable for cross-browser JavaScript) and the Google Docs Viewer (without which there would be no embedded previews). Fortunately, both of these dependencies are entirely web-based and therefore add no overhead to RedFeather’s installation process.
