Southampton Open Data Blog

New Data

Quick Sunday Update

July 17, 2011
by Christopher Gutteridge

First of all, I’ve just added a new post over at the web team blog which might interest readers of the data blog, if you’ve ever been confused about the relationship between Open Data, Linked Data and RDF Data.

Secondly, I’ve just added the sameAs links between our bus-stop data and data.gov.uk. I should have done this months ago, but kept forgetting. It’s up now, and I imagine Hugh Glaser will import them into the sameAs.org service, which will let you discover our data on Southampton bus-stops by resolving the government ID for a bus-stop at sameAs.org (maybe we’ll link to a demo, as I don’t think I explained that very well!)

** UPDATE **: Turns out my sameAs links were wrong, but Colin has created a full set which also links our codes for train stations and I’ve added in the airport. I’ve published it as a separate linkset.
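If you’ve not seen a sameAs linkset before, it’s just a set of triples pairing each of our URIs with the corresponding government one. The identifiers below are illustrative made-up examples, not the actual published links:

```turtle
@prefix owl: <http://www.w3.org/2002/07/owl#> .

# Illustrative only: pairing one of our bus-stop URIs with a
# corresponding stop-point identifier on data.gov.uk.
<http://id.southampton.ac.uk/bus-stop/SNA99999>
    owl:sameAs <http://transport.data.gov.uk/id/stop-point/1980SNA99999> .
```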

Lastly, I asked a few key staff for comments about the value of Open Data, and here’s a great one:

The Open Day map, based on open data, amazed so many of our visitors. It was a great example of how our leading-edge research has translated into a very real and practical application, second only to Soton Bus!

— University of Southampton Pro Vice-Chancellor Education, Professor Debra Humphris

The open days pages aren’t actually linked from the data.southampton homepage, but they aren’t secret; they were just only valuable for the period of the now-passed event.

Bus Route Updates

July 13, 2011
by Christopher Gutteridge

The Southampton ROMANSE project has given us the go-ahead to put the Southampton bus times data under the OGL (Open Government Licence).

In celebration, I’ve added a new bus routes page to better navigate this data.

If you look deep in the data, sometimes the data identifies the exact vehicle which is coming.

I admit the RDF is shonky; if anybody is working on an ontology in this area, they should get in touch!

Grasping the nettle and changing some URIs

March 24, 2011
by Christopher Gutteridge

We’ve realised that using UPPER CASE in some URIs looked fine in a spreadsheet but makes for ugly URLs, and if we’re stuck with them, we want them to look nice.

Hence I’ve taken an executive decision and renamed the URIs for all the Points of Service from looking like this

http://id.southampton.ac.uk/point-of-service/38-LATTES

to this

http://id.southampton.ac.uk/point-of-service/38-lattes

meaning the URL is now

http://data.southampton.ac.uk/point-of-service/38-lattes.html

This actually matters, as these are going to become the long term web pages for the catering points of service, so aesthetics are important, and “If t’were to be done, t’were best done quickly”.
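For anyone holding links to the old URIs, the rename is simply a lowercasing of the final path segment. Here’s a quick sketch; the redirect rule is my assumption about how one might keep old links working, not a description of the live server config:

```python
# Sketch: derive the new lowercase URI from an old one and emit a
# permanent-redirect rule for it. The URIs are the real examples from
# the post; the rewrite logic itself is illustrative.

def lowercase_uri(old_uri: str) -> str:
    """Lowercase only the final path segment (the point-of-service slug)."""
    base, _, slug = old_uri.rpartition("/")
    return base + "/" + slug.lower()

old = "http://id.southampton.ac.uk/point-of-service/38-LATTES"
new = lowercase_uri(old)
print(new)  # http://id.southampton.ac.uk/point-of-service/38-lattes

# An HTTP 301 would keep any bookmarks to the old form working:
print(f"Redirect permanent /point-of-service/38-LATTES {new}")
```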

We’ve seen lots of visitors as a result of the Register article (a 10x increase), which is nice.

I’ve just added in the lunchtime menu for the Nuffield. They are not yet quite taking ownership of their data, but that’s just a case of getting them some training. I’ve also talked today to the manager of the on-campus book shop to see if they want to list some prices and products. I’m thinking they could do well to list the oddball stuff they sell like memory sticks & backpacks.

Mostly I’m preparing to tidy up the back-end code — it needs to be a bit more slick and logical, more on this later.

Also today our very own Nigel Shadbolt is featured in the first ever edition of the Google Magazine. (It’s a PDF!)

Jargon File

March 15, 2011
by Christopher Gutteridge

I’ve added a new dataset: a university jargon file.

It’s semi-crowdsourced; I’ll give any member of iSolutions, or other professional services, the ability to edit it. It could use a search tool similar to the phonebook, but we’ll get to that at some point.

Back to Reality (almost)

March 9, 2011
by Christopher Gutteridge

We’re spending this week catching up on little jobs we put on hold to get data.southampton ready on time, but there are people who keep offering me data!

Unilink Happy

The Unilink office have offered to take over looking after the bike shed location data, as they look after that service. That’s great! Our goal is for the ColinSourced data to be slowly handed over to parts of the university administration, with his dataset left as the odds and ends which nobody is specifically responsible for. We’ve also been discussing how to advertise the bus-times related features to students and Unilink users. I don’t want to dive in with both feet for a week or two, in case there are issues that haven’t come to light yet. I’m dead proud of the fact that their receptionist told her boss “yes, I can look after that data, it’s just a Google spreadsheet, it’s easy!”. That’s our goal!

Muster Points

I got a great suggestion from Mike, our facilities manager, that we could add public safety information about buildings. We don’t need every detail of every fire escape (there’s signs in buildings for that), but we could usefully add a list of first aiders and a map polygon of the muster point (which we can render on the building page). We’ve created a mini dataset for the ECS buildings muster points, but I’ve not yet had time to import it into the site.

Google data a bit Shonky

My contact for university buildings data pointed out (rightly) that we had the incorrect location on the Highfield Site page for a few things. That’s not my data!! It’s the labels added by Google, based on whatever they’ve found on the web. In this case the data would be accurate enough if you were planning how to drive to the Gallery, but useless if you were plotting a site map.

I fixed it by using “t=k” instead of “t=h”, which turns off the labels from Google.
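For reference, that’s just the map-type parameter in the map URL; the URL and coordinates below are illustrative, not the exact embed we use:

```
# t=h : hybrid view (satellite imagery plus Google's own labels)
# t=k : plain satellite imagery, no labels
https://maps.google.com/?ll=50.9352,-1.3958&z=16&t=k
```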

Of course the best way to get better data into Google is to publish it on the web! Anybody got any advice on how to get Google to read the locations of our stuff?

How does it all work, then?

Sometime soon I’ll write an explanation on the Web Team Blog of how the systems on the data site work, but for now here’s the diagram.

The first thing you’ll notice about this diagram is that I’ve had a hair cut. I really hated how I look in the last picture I posted here and was sick of trying to look after it. Fresh new look for Spring!

OK, the diagram has 3 colours of arrows but I couldn’t find 3 pens so the dotted arrows at the bottom are the 3rd colour. Doing more with less!

The black arrows represent manual processes, like people emailing me data and me copying it into the correct directory.

The red arrows are all the stuff which happens when I run “publish dataset” (warning: Perl!).

The dotted lines are triggered when a request is made to a URL. It’s not very complicated and the script does most of the heavy lifting.

The long term goal is for something to download copies of spreadsheets (etc.) once an hour and MD5 them. If the checksum has changed since the last publication date, then it’ll republish automatically. That way I can leave it polling the spreadsheet describing the location of our cycle sheds, and if it ever changes it’ll republish it without bothering me. I work hard to get to be lazy later!
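That poll-and-republish loop is simple enough to sketch. This is a minimal illustration of the idea, not our actual code (which is Perl); the state file, source URLs and the publish command are all assumptions:

```python
# Sketch: fetch each source spreadsheet, checksum it, and republish
# only the datasets whose checksum has changed since last time.
import hashlib
import json
import os
import subprocess
import urllib.request

STATE_FILE = "checksums.json"
SOURCES = {
    # dataset name -> where to fetch its source spreadsheet (illustrative URL)
    "cycle-sheds": "http://example.org/cycle-sheds.csv",
}

def md5_of(data: bytes) -> str:
    """Checksum used to detect whether a source file has changed."""
    return hashlib.md5(data).hexdigest()

def poll() -> None:
    state = {}
    if os.path.exists(STATE_FILE):
        with open(STATE_FILE) as f:
            state = json.load(f)
    for dataset, url in SOURCES.items():
        data = urllib.request.urlopen(url).read()
        checksum = md5_of(data)
        if state.get(dataset) != checksum:
            # Standing in for the real "publish dataset" script.
            subprocess.run(["publish-dataset", dataset], check=True)
            state[dataset] = checksum
    with open(STATE_FILE, "w") as f:
        json.dump(state, f)
```

Run `poll()` from cron once an hour and unchanged spreadsheets cost nothing but a download and a checksum.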

Dave Challis would like to point out that I’ve skipped all his clever stuff around the 4store and ARC section so he’ll write a post later to explain that in more detail.

Recognition Back Home

There have been lots of tweets but few blog posts about this site so far, so it’s nice to be mentioned on the hyper-local blog from my home town!