April 11, 2013
by Ash Smith
Just recently I’ve been looking for data we can publish as RDF with minimal effort, without needing access to restricted services or taking up people’s time. I came across the University’s jobs site, jobs.soton.ac.uk. It uses a pretty cool system which exports all the vacancies as easily parsable RSS feeds, grouped into sensible categories. There is a feed for each campus and a feed for each organisational unit of the University, so if a job appears in both the feed for Highfield Campus and the feed for Finance, it is a finance-based job on the Highfield Campus. Because of this, it’s trivial to write a script that parses all the RSS feeds on the jobs site and produces RDF. So that’s what I did, and you can see the results in our new Vacancies dataset.
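To give a flavour of the feed-to-RDF idea, here is a minimal sketch in Python rather than the actual script behind the dataset. It assumes plain RSS 2.0 feeds, uses each item's link as the vacancy URI, and writes N-Triples. The example.org vocabulary terms, the sample feeds and the function names are all made up for illustration; the real dataset's vocabulary may well differ.

```python
import xml.etree.ElementTree as ET

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
FOAF_BASED_NEAR = "http://xmlns.com/foaf/0.1/based_near"
# Hypothetical terms, NOT the vocabulary used by the real dataset.
VACANCY_CLASS = "http://example.org/vocab/Vacancy"
ORG_UNIT = "http://example.org/vocab/organisationalUnit"


def escape_literal(text):
    """Escape a string for use inside an N-Triples literal."""
    return (text.replace("\\", "\\\\").replace('"', '\\"')
                .replace("\n", "\\n").replace("\r", "\\r"))


def feed_to_triples(rss_xml, predicate, target_uri):
    """Yield N-Triples lines for every <item> in one RSS 2.0 feed.

    predicate and target_uri say what appearing in this feed means,
    e.g. (FOAF_BASED_NEAR, <campus URI>) for a campus feed.
    """
    root = ET.fromstring(rss_xml)
    for item in root.iter("item"):
        link = (item.findtext("link") or "").strip()
        title = (item.findtext("title") or "").strip()
        if not link:
            continue  # no link means no usable URI for the vacancy
        yield f"<{link}> <{RDF_TYPE}> <{VACANCY_CLASS}> ."
        yield f'<{link}> <{RDFS_LABEL}> "{escape_literal(title)}" .'
        yield f"<{link}> <{predicate}> <{target_uri}> ."


def merge_feeds(feeds):
    """Combine several feeds; a set drops triples repeated across feeds."""
    triples = set()
    for rss_xml, predicate, target_uri in feeds:
        triples.update(feed_to_triples(rss_xml, predicate, target_uri))
    return sorted(triples)


# Invented sample data: the same job listed in a campus feed and a unit feed.
SAMPLE_FEED = """<rss version="2.0"><channel><title>Sample</title>
<item><title>Finance Officer</title><link>http://example.org/jobs/123</link></item>
</channel></rss>"""
HIGHFIELD = "http://example.org/place/highfield"
FINANCE = "http://example.org/unit/finance"

if __name__ == "__main__":
    for line in merge_feeds([(SAMPLE_FEED, FOAF_BASED_NEAR, HIGHFIELD),
                             (SAMPLE_FEED, ORG_UNIT, FINANCE)]):
        print(line)
```

Because the job shows up in both feeds, it ends up with both a location triple and an organisational unit triple, which is exactly the "finance job on Highfield Campus" inference described above.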
Normally when I produce a new dataset I like to provide a clever web tool or search engine to make use of the data, but this time I haven’t, because the jobs site already does this very well. So why republish the data at all? There are two reasons. Firstly, our colleague at Oxford University, Alexander Dutton, has already done this with Oxford’s vacancies. If we do the same, using the same data format, we’ve effectively got a standard. If other organisations begin to do the same thing, suddenly the magic of linked open data can happen. The second reason is that SPARQL queries are now possible. They’re a bit advanced for the layman, but if you were looking, for example, for a job at Southampton General Hospital paying £25K or more, you can write a SPARQL query that does all the hard work for you. The same query will work with Oxford’s data, although obviously you’ll need to replace the location URI with one of theirs.
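As a rough illustration of the kind of query I mean, here is a small Python sketch that builds the SPARQL text with the location URI and minimum salary as parameters. The ex: vocabulary, the ex:salaryLower property (assumed to hold a plain number) and the location URIs are placeholders I have invented for the example, so check the dataset for the real terms before running anything like this against an endpoint.

```python
def build_vacancy_query(location_uri, min_salary):
    """Return a SPARQL SELECT for vacancies at a location above a salary.

    The ex: terms are hypothetical placeholders, not the dataset's
    actual vocabulary.
    """
    if any(ch in location_uri for ch in '<>" \n'):
        raise ValueError("location_uri does not look like a plain URI")
    return f"""PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX ex: <http://example.org/vocab/>

SELECT ?job ?title ?salary WHERE {{
  ?job a ex:Vacancy ;
       rdfs:label ?title ;
       foaf:based_near <{location_uri}> ;
       ex:salaryLower ?salary .
  FILTER (?salary >= {int(min_salary)})
}}
ORDER BY DESC(?salary)"""


if __name__ == "__main__":
    # Made-up URI standing in for Southampton General Hospital.
    print(build_vacancy_query("http://example.org/place/general-hospital", 25000))
```

Only the location URI changes between institutions, which is the whole point of sharing a data format: pass in one of Oxford's location URIs and the rest of the query stays the same.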
Feel free to have a poke around in the data and, as always, if you manage to come up with a cool use for it, even just an idea, then please let me know.