Southampton Open Data Blog


Visit from UCL

April 26, 2016
by Ash Smith

Last Thursday (21st April), we had a visit from some staff and students representing TechSoc at University College London. TechSoc, as the name suggests, is the technology society, and they have been investigating the feasibility of building an API for members of UCL to access data about the institution.

Despite our complete lack of entertainment budget (!) they were happy to get on a train and come and visit us to find out about our experiences in creating and maintaining the Open Data service. Wilhelm Klopp, one of their number, has written a nice blog post about the experience.

http://techsoc.io/blog/2016/04/22/southampton-trip/

 

Unplanned Outage

July 8, 2013
by Christopher Gutteridge

We had an unplanned outage over part of this weekend. This was due to a logfile unexpectedly growing huge. We do have monitors in place but this time we didn’t catch the alert emails in time. The log normally takes 3.2MB per week, and last week it used 25GB!

Ash was working very hard all of Friday & Saturday, at the Open Days, showing off the Open Data Service to potential students and their parents.

 

The Vacancies Dataset

April 11, 2013
by Ash Smith

Just recently I’ve been looking for data we can publish as RDF with minimal effort, and without requiring any access to restricted services or taking up people’s time. I came across the University’s jobs site, jobs.soton.ac.uk. It uses a pretty cool system which exports all the vacancies as easily parsable RSS feeds, grouped into sensible categories. We have a feed for each campus, and a feed for each organisational unit of the University, so if a job appears in, for example, the feed for Highfield Campus as well as the feed for Finance, the job is a finance-based job on the Highfield Campus. Because of this, it’s trivial to write a script that parses all the RSS feeds on the jobs site and produces RDF. So that’s what I did, and you can see the results in our new Vacancies dataset.
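As a rough sketch of the idea (not the actual script), the Python below parses two tiny made-up RSS feeds, one for a campus and one for an organisational unit, and emits N-Triples. The sample feeds, the jobs.example links and the two example.org predicate URIs are all assumptions invented for illustration; the real dataset uses its own vocabulary.

```python
import xml.etree.ElementTree as ET

# Made-up sample feeds standing in for a campus feed and a unit feed.
CAMPUS_FEED = """<rss version="2.0"><channel><title>Highfield Campus</title>
<item><title>Finance Officer</title><link>http://jobs.example/vacancy/1</link></item>
<item><title>Lab Technician</title><link>http://jobs.example/vacancy/2</link></item>
</channel></rss>"""

UNIT_FEED = """<rss version="2.0"><channel><title>Finance</title>
<item><title>Finance Officer</title><link>http://jobs.example/vacancy/1</link></item>
</channel></rss>"""

LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
# Hypothetical predicate URIs, for illustration only.
CAMPUS = "http://example.org/vocab/campus"
UNIT = "http://example.org/vocab/organisationalUnit"


def parse_feed(xml_text):
    """Return (feed title, {job link: job title}) for one RSS 2.0 feed."""
    channel = ET.fromstring(xml_text).find("channel")
    jobs = {
        item.findtext("link"): item.findtext("title")
        for item in channel.findall("item")
    }
    return channel.findtext("title"), jobs


def literal(text):
    """Quote a string as an N-Triples literal (minimal escaping only)."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def feeds_to_ntriples(feeds):
    """feeds: list of (predicate URI, RSS text). Returns sorted N-Triples lines."""
    triples = set()
    for predicate, xml_text in feeds:
        feed_title, jobs = parse_feed(xml_text)
        for link, title in jobs.items():
            triples.add(f"<{link}> <{LABEL}> {literal(title)} .")
            triples.add(f"<{link}> <{predicate}> {literal(feed_title)} .")
    return sorted(triples)


triples = feeds_to_ntriples([(CAMPUS, CAMPUS_FEED), (UNIT, UNIT_FEED)])
print("\n".join(triples))
```

Because the job link is used as the subject, a vacancy that appears in both feeds ends up with both a campus and a unit attached to the same resource, which is the "finance job on Highfield Campus" inference described above.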

Normally when I produce a new dataset I like to provide a clever web tool or search engine to make use of the data, but this time I haven’t, because the jobs site already does this very well. So why republish the data at all? There are two reasons. Firstly, our colleague at Oxford University, Alexander Dutton, has already done this with Oxford’s vacancies. If we do the same, using the same data format, we’ve effectively got a standard. If other organisations begin to do the same thing, suddenly the magic of linked open data can happen. The second reason is that SPARQL queries are now possible. They’re a bit advanced for the layman, but if you were looking, for example, for a job at Southampton General Hospital paying £25K or higher, you can write a SPARQL query that does all the hard work for you, and the same query will work with Oxford’s data, although obviously you’ll need to replace the location URI with one of theirs.
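To give a flavour of what such a query might look like, here is a minimal Python sketch that builds one from a location URI and a minimum salary. The `ex:` predicates and both location URIs are placeholders invented for this example, not the terms or identifiers used by the Southampton or Oxford datasets, and it assumes salaries are stored as plain numbers.

```python
# Hypothetical vocabulary: these predicate URIs are placeholders, not the
# terms actually used by the Southampton or Oxford vacancy datasets.
QUERY_TEMPLATE = """\
PREFIX ex: <http://example.org/vocab/>
SELECT ?job ?title ?salary WHERE {{
  ?job ex:location <{location}> ;
       ex:title ?title ;
       ex:minSalary ?salary .
  FILTER (?salary >= {min_salary})
}}
ORDER BY DESC(?salary)
"""


def vacancy_query(location_uri, min_salary):
    """Build a SPARQL query for jobs at one location paying at least min_salary."""
    # Refuse characters that could break out of the <...> URI in the query.
    if any(ch in location_uri for ch in '<>" {}|\\^`'):
        raise ValueError("not a safe URI for a SPARQL query")
    return QUERY_TEMPLATE.format(location=location_uri, min_salary=int(min_salary))


# Placeholder location URIs, for illustration only.
southampton = vacancy_query("http://example.org/site/southampton-general-hospital", 25000)
oxford = vacancy_query("http://example.org/site/some-oxford-location", 25000)
print(southampton)
```

The two generated queries differ only in the location URI, which is the point: once two institutions share a data format, the query itself is reusable.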

Feel free to have a poke around at the data and, as always, if you manage to come up with a cool use for this data – even just an idea – then please let me know.

FOI Man Visit

January 18, 2013
by Christopher Gutteridge

Last week we had a visit from Paul Gibbons aka “FOI Man”. He works at SOAS and came down to Southampton to see what we’ve been up to with open data.

He’s written it up in his blog.

At Southampton the FOI-handling stuff and open data have only a nod-in-the-corridor relationship, but there are some obvious wins in working together.

In other news, we’ve got more data in the pipes, and are writing importers for it in the next few days. We’ve also had a meeting about moving some core critical parts of the open data service into “BAU” (business as usual), so that there are people who know how to maintain it outside our team, and the core is (change) managed more formally. This is essential if we want open data to be part of the long term IT strategy and not a glued-on-bit on the edge.

I’m also thinking about the fact we have very spotty data on research group building occupation, and so forth. By rights this data probably belongs to the “Faculty Operating Office”, but they are busy and don’t answer my questions very often. A cunning plan has entered my mind… Make a ‘report’ URL for each faculty which provides a spreadsheet with what we know about their faculty and let them download it and send it back to us. I think they could ‘colour in’ the missing information in a few minutes, and it will better express the problem to the management/administrator mindset if I show them a spreadsheet with blank cells in. To me, it’s just data, but then I’m a data nerd, and we’re learning you have to have the data owner work with data in a way that makes sense to them.
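A minimal sketch of how such a per-faculty report could be generated, assuming the known facts are already available as simple records. The faculty names, research groups, column names and building number here are all made up for illustration.

```python
import csv
import io

# Made-up records; None marks a fact we do not know yet.
KNOWN = [
    {"faculty": "Faculty A", "research_group": "Group 1", "building": "32"},
    {"faculty": "Faculty A", "research_group": "Group 2", "building": None},
    {"faculty": "Faculty B", "research_group": "Group 3", "building": None},
]
COLUMNS = ["research_group", "building"]


def faculty_report(records, faculty):
    """Return CSV text for one faculty, leaving unknown values as empty cells."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(COLUMNS)
    for rec in records:
        if rec["faculty"] == faculty:
            writer.writerow(["" if rec[c] is None else rec[c] for c in COLUMNS])
    return out.getvalue()


def count_blanks(csv_text):
    """Count empty cells, i.e. the questions still open for the faculty."""
    rows = list(csv.reader(io.StringIO(csv_text)))[1:]  # skip the header row
    return sum(cell == "" for row in rows for cell in row)


report = faculty_report(KNOWN, "Faculty A")
print(report)
print("blank cells to fill in:", count_blanks(report))
```

The same `count_blanks` check could be run on the spreadsheet when it comes back, to see how much of the gap has been coloured in.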

Times Higher Education Award

December 3, 2012
by Christopher Gutteridge

The University of Southampton won the award for “Outstanding ICT Initiative of the Year” for the open data service.

Personally I feel rather smug about this, as you can imagine, but while I may have worked my socks off, there’s a hell of a lot of people who made it possible.

Obviously first of all is Professor Nigel Shadbolt & Dame Professor Wendy Hall for convincing the University it should have an open data service.

Next up is the team who created the original ECS open data service: Marcus Cobden, Alastair Cummings, Dr Nicholas Gibbins and Dr Colin Williams (who got his PhD last week).

Lots of general support from the Web and Internet Science research group and the members of the Enakting Project in particular.

There’s the project board, who’ve been very enthusiastic from the start: Malcolm Ace (Chief Operating Officer), Wendy Hall, Nigel Shadbolt, Debra Humphis (now sadly left the Uni to work for some place called “Imperial College”, sounds nice), Simon Peatfield (our head of Communications), Hugh Davis (head of eLearning) and Pete Hancock (our head of IT). The first meeting with this bunch had me really bloody scared but it went well, and they were all keen to see if we could prove this technology/approach in our day to day operations.

Dr Su White deserves special mention, as whenever I talked to any of the heads of services, it seemed she’d been chatting to them only a few days previous, talking up the benefits of open data.

Thanks to Paul Seabrooke in Buildings & Estates for help with navigating the subtleties of our list of buildings, and lots of other people in that department; Jodie Barker and the energy team; Neil Smith and the sustainability and recycling people; and Adam Tewkesbury in the transport office (who was also part of a team shortlisted for a different THE Award).

A special thanks to James Leeming and his team in retail catering for being helpful, enthusiastic and patient when we’ve not yet delivered everything we promised.

In my own department, Tim Boardman who has now gone to some place up the road called “Oxford”, but was really helpful helping us learn to navigate the politics of databases in our University, Graham Robinson who did the cool feed which enables us to have workstations-in-use data. Lots of people who’ve given help, or had more work generated as a result of this project.

Nic Burns at the council, and both the previous and new real-time bus information contractors. We’re hoping to have that all up and running soon!

The Equipment sharing project team; Adrian Cox, Louise Payne, (and recently Adam Field has joined that mix), Don Spalinger, Hilary Smith, Pete Hancock (again), some helpful people from Finance whose names elude me right now but are helping get things hooked up to their data.

The other open data projects around the UK have been a source of inspiration (and occasionally the only other people who understand the weird new challenges these projects bring). Mathieu D’Aquin (data.open.ac.uk) who I’ve not always agreed with but have learned lots from in our discussions, Alex Bilbie and Joss Winn at http://data.lincoln.ac.uk/. And a big thank-you to Dave Flanders for creating the UK community of developers that has meant we’ve started sharing ideas and solutions rather than staying buried in our institutional silos.

(I knew this was a long list, but wow! We’re down to the last few now…)

Dave Challis who kept the triplestores up and happy and worried about details I wouldn’t have had time for.

The company Garlik has a number of ex-Southampton staff who’ve been very helpful with advice on good practice. I’ll be gracious and still thank them even though they went and hired Dave Challis away from us. (he seemed happy when we had lunch on Saturday, so maybe the real world isn’t so bad).

Gavin Costigan actually put together our entry, and evidently did a good job – we won!

Charles Elder is the member of Communications who accompanied us to the awards, and was reassuring when we were rather out of our depth.

Naomi & Caroline, my and Dave’s girlfriends, who have been “RDF Widows” on a number of occasions when we were working silly hours to get everything working.

Colin Williams. What can I say about Colin? I think he’s the reason we won the award, thanks to all the stuff he built on top of the open data, plus the events calendar. He’s had an amazing week with both the awards show and then successfully defending his PhD the day after. I’m gutted he’s leaving, but I’m sure we’ll see each other at the occasional hack day.

A wave to my new immediate team mates Patrick McSweeney and Ash Smith who both joined the team this year, Ash as full time Open Data Service development and Patrick as a “replacement” for Dave, although his facial hair is different enough to avoid people getting confused.

I think my biggest thanks goes to Alex Dutton at data.ox.ac.uk for being the sounding board, friend, and rival that we needed to make data.southampton.ac.uk what it is. It’s fair to say that I can see aspects of my designs in data.ox.ac.uk, and of Alex’s in our service.

I’ve not included everyone who’s been a help, but this post is already nearly a thousand words, and past the TL;DR point, so I’m going to call it to a halt. Thanks to everybody. As a child I read science fiction. Now I implement it.

Christopher Gutteridge, 2012.

 

Shortlisted for a Times Higher Education Award

September 6, 2012
by Christopher Gutteridge

Some very exciting news. I’m proud to say that Southampton have been short-listed for the Times Higher Education awards for “Outstanding ICT Initiative of the year” and the submission was for our work with data.southampton.ac.uk!

This may involve me having to wear a dinner jacket, which may get a chuckle from people who know my usual, er, style.

While I’ve worked very hard on the open data service, none of it would be possible without the help of dozens of people from all around the University, so it really is an award to the whole university. That said, I’m hoping I’m the one who gets the tasty dinner!

 

We are hiring!

August 4, 2012
by Christopher Gutteridge

This is very exciting news.

The university has created a full time position (initially 2 year fixed term) for data.southampton! This will involve taking the system towards maturity and “business as usual”. It’ll involve working closely with myself and Patrick.

I’m hoping to get someone enthusiastic about the technology and the way it can improve how we all work, but with different skills to Patrick and me. My ideal candidate is the type of person who enjoys doing all the fiddling required to build a really good software package release. Part of the goal is to make open data not only practical for other organisations but actually easy.

I’m really chuffed that our university thinks it’s worth investing in Linked Data as infrastructure, not just as a research area.

Location: Highfield Campus
Salary: £27,578 to £33,884
Full Time Fixed Term
Closing Date: Sunday 19 August 2012
Interview Date: To be confirmed
Reference: 146112JF

More Information

Job Description and Person Specification [Word Document]: we will use the person specification to determine who gets the job. I anticipate we may know, or even be friends with, some of the applicants, so judging everybody by the person spec. helps keep it fair.

data.ac.uk and some things to read

April 24, 2012
by Christopher Gutteridge

The really exciting news is that we’ve just registered data.ac.uk to act as a home for UK-wide data projects. If you want to contribute ideas, join the data.ac.uk mailing list.

This week is also the last chance to respond to the UK Government Consultation on Open Standards. Large companies have reportedly been pushing their agenda in this consultation, but anybody is allowed to voice their support, objections or suggestions to the proposals.

I’ve published a couple of relevant blog posts on the webteam blog:

There’s also been two recent blog posts from peer-projects I’d like to recommend:

Response to the Public Data Consultation

October 17, 2011
by Christopher Gutteridge

Here is a draft of what I’m planning to send to the Public Data Corporation Consultation. I’m posting it here first to see if anyone suggests any good improvements. I’ll send it in at the end of the week. The consultation closes on October 27th. If you care about public data in the UK maybe you should respond too!

*    *    *    *    *

Update: I went to hand this in today, and it seems that actually the consultation is a bunch of questions which rather threw me. I’ve done my best with it but it feels like answering “how would you like to pay to access the data?”. I am not very happy, it feels like there’s already a plan and they are just rubber stamping it. Sigh.

My basic take is that if data is required by government/council/public sector to run the country, then it’s required by the citizens to live in that country.

*    *    *    *    *

In brief: I believe the UK government should provide all public data with a license which allows free reuse (OGL), in formats which make it easy to work with, and identifiers which allow diverse data to be joined together in new ways. This will increase the wealth of all citizens and visitors to the UK. It will enable people in the UK to make better choices, and live better lives. The work begun by data.gov.uk makes me proud to be British and enables new kinds of benefits unprecedented in human history.

My name is Christopher Gutteridge. I am the Linked Data Architect for the University of Southampton. If it wasn’t for data.gov.uk this job title wouldn’t even exist!

I run the Open Data Service for the University of Southampton <http://data.southampton.ac.uk/>. This service was inspired by the UK government project, and has proved beneficial to our organisation with very positive support and feedback from sources as diverse as the Dept. V.C. for Education (Debra Humphis) and the head of our Catering services! By providing easy, open and joined-up access to information from the diverse parts of our organisation, we improve the experience for our staff, students and visitors. The most beneficial tool using our data to date, a map of the university amenities <http://opendatamap.ecs.soton.ac.uk/>, was not produced by our paid staff but by one of our research students, who was keen to find a way to contribute.

I believe similar benefits and opportunities exist at the national and international level.

There are two real benefits to the nation. The first is transparency. Allowing anybody to write tools using government data on things like crime, health, education, and other factors about quality of life or services is great: it helps people make informed decisions.

What is also a huge national asset is the fact that we’ve begun to publish catalogs of things in the UK, like postcodes, transport stops, postboxes, schools, parks, roads, etc. I see this as the digital equivalent of standardising UK plug sockets and domestic mains electricity.

If there’s a central way to identify, say, a road, then any organisation, from Google & Apple down to someone collecting a list of potholes, can use the same code to identify the road. This allows organisations, or citizens, to later join up information from diverse sources to provide new value from existing databases. This is amazing and has so much potential. Not doing this is like allowing every train company to use a different train track gauge.

To use my own work as an example of how powerful this is: I collate information from University Catering on their coffee shops, from the timetable office on what teaching rooms we have, from our buildings and estates dept. on the ID Number, name, architect & year of construction, from the disability office on the disabled access, and so forth. None of these departments need to talk to each other, but because all the other departments publish data using the building code number defined by buildings and estates I’m able to, very easily, join these up to create a far more useful resource: <http://data.southampton.ac.uk/building/85.html>
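The join described here needs nothing more than a shared key. As a minimal Python sketch, with made-up records standing in for what estates, catering and timetabling might each publish (the names, room numbers and year are invented, not real data):

```python
from collections import defaultdict

# Made-up records from three independent sources, each keyed by building code.
ESTATES = [{"building": "85", "name": "Example Building", "year_built": 2010}]
CATERING = [{"building": "85", "coffee_shop": "Example Cafe"}]
TIMETABLING = [
    {"building": "85", "teaching_room": "2207"},
    {"building": "85", "teaching_room": "2209"},
    {"building": "99", "teaching_room": "1001"},
]


def join_by_building(*sources):
    """Merge records from every source into one entry per building code."""
    buildings = defaultdict(lambda: defaultdict(list))
    for source in sources:
        for record in source:
            code = record["building"]
            for field, value in record.items():
                if field != "building":
                    # A building may have several values for a field,
                    # e.g. many teaching rooms, so collect them in a list.
                    buildings[code][field].append(value)
    return {code: dict(fields) for code, fields in buildings.items()}


merged = join_by_building(ESTATES, CATERING, TIMETABLING)
print(merged["85"])
```

None of the three sources refers to any other; they only agree on the building code, and that agreement is what makes the combined view possible.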

I’m very concerned that people might start having to pay to access this data. This will exclude the growing community who create computer and phone applications out of interest and enthusiasm and desire to make something to help people. It will exclude small companies, who can’t afford the risk.

I believe that unrestricted access, under the Open Government License, to all government and council data will make this a better country to live in for everyone.

Charging for government data would be like starting to charge people to produce devices which use standard UK mains voltage: a regressive step which I believe would do more harm than could ever be matched by the income generated.

I have used Linked Open Data to make the University of Southampton a better place. Please continue to do the same for the UK, Europe and the World.

Christopher Gutteridge – University of Southampton – http://users.ecs.soton.ac.uk/cjg/ – cjg@ecs.soton.ac.uk

*  *  *  *  *

2nd update; I just remembered this idea which I think is a sane compromise to kick start unfettered open data.

Hi, I already made a return to the PDC consultation but forgot to include something.
Christopher Gutteridge. University of Southampton Linked Open Data Architect but speaking as a private citizen.

Nobody has yet seriously investigated corporate sponsorship of open data. For example, getting every public loo in the country into a single database could be sponsored by Andrex. Anybody wanting to use this data would be able to freely use it under the OGL, but would have to credit “Andrex” in addition to the government.

This gives any individual or company unrestricted use of the data. Very large companies may choose to pay a substantial fee to waive the requirement to credit the sponsor. I believe some companies call this a “white label” fee.

Admittedly some datasets would be controversial or hard to find sponsorship and it would be important to maintain public trust in the data, but there’s large amounts that this *might* work for and it ends with a better situation for the UK people.

The second any ‘hoops’ are required to get to data (including API keys, license click-throughs etc.) then the power of it being open is massively reduced.
A very simple example of it being done right; The OS postcode interface to data allows you to view the data as an HTML web page:
http://data.ordnancesurvey.co.uk/doc/postcodeunit/SO171BJ
but because any tool can access and allow you to explore this data, you can view it in my own data viewer:
http://graphite.ecs.soton.ac.uk/browser/?uri=http%3A%2F%2Fdata.ordnancesurvey.co.uk%2Fid%2Fpostcodeunit%2FSO171BJ
or someone else’s
http://linksailor.com/nav?uri=http%3A%2F%2Fdata.ordnancesurvey.co.uk%2Fid%2Fpostcodeunit%2FSO171BJ&go.x=0&go.y=0

These examples are trivial but show software gaining direct access to the data. Imagine the frustration if most pages on the web made you sign an End User License Agreement before you could view them. It wouldn’t be as good, and nobody would really benefit.
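Both viewer links above work the same way: they carry the OS identifier for the postcode as a percent-encoded `uri` query parameter. A small Python sketch of building such a link and recovering the identifier from it (the `viewer_link` helper is a name made up for this example):

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# The OS identifier for the postcode, as it appears inside the viewer links.
POSTCODE_URI = "http://data.ordnancesurvey.co.uk/id/postcodeunit/SO171BJ"


def viewer_link(viewer_base, data_uri):
    """Build a viewer URL that carries the data URI as a 'uri' query parameter."""
    return viewer_base + "?" + urlencode({"uri": data_uri})


graphite = viewer_link("http://graphite.ecs.soton.ac.uk/browser/", POSTCODE_URI)
print(graphite)

# Going the other way: any tool can recover the identifier from the link.
recovered = parse_qs(urlsplit(graphite).query)["uri"][0]
print(recovered)
```

No key, login or click-through appears anywhere in this round trip, which is exactly why a second, unrelated viewer can offer the same data.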

I’m hoping that in the next year or two, things like arts festivals and museums may consider this sponsor-for-attribution model of financing so people could, for example, use the data from Edinburgh Fringe in any phone or web app, but the festival has already made its income by gaining sponsorship of the data, and the sponsors get their return as all users of the data have to attribute them in their apps. This allows an ecology of apps to be created rather than each data source only having a very few. This ecology will lead to innovation and improvements to the user experience. The more the application-creator has to pay for data, the less innovation we’ll see.

Ask the SPARQL Monkey

July 26, 2011
by Christopher Gutteridge

We had a neat idea the other day: to offer an “ask the SPARQL Monkey” service, through which you can request that we create you a custom query over the Southampton data service, e.g. lat+long and building name of all buildings containing a vending machine(!). Please make requests as comments on this post for now. We can’t promise we can address your requests, but we’ll have a go. One of the output options is CSV which loads right into Microsoft Excel.
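For anyone curious how query results turn into something a spreadsheet can open, here is a minimal Python sketch that converts a response in the standard SPARQL JSON results format into CSV. The sample response is made up (the building names and coordinates are not real data), and this is only an illustration, not how the service itself produces its CSV.

```python
import csv
import io
import json

# A made-up response in the W3C SPARQL JSON results format.
SAMPLE_RESULTS = json.loads("""
{
  "head": {"vars": ["name", "lat", "long"]},
  "results": {"bindings": [
    {"name": {"type": "literal", "value": "Example Building"},
     "lat":  {"type": "literal", "value": "50.93"},
     "long": {"type": "literal", "value": "-1.39"}},
    {"name": {"type": "literal", "value": "Another, Building"}}
  ]}
}
""")


def results_to_csv(results):
    """Turn SPARQL JSON results into CSV text, one row per result binding."""
    columns = results["head"]["vars"]
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(columns)
    for binding in results["results"]["bindings"]:
        # Variables left unbound in a row become empty cells.
        writer.writerow([binding.get(col, {}).get("value", "") for col in columns])
    return out.getvalue()


csv_text = results_to_csv(SAMPLE_RESULTS)
print(csv_text)
```

The `csv` module takes care of quoting values that contain commas, so a building name like the second one still lands in a single cell.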

Also, if anyone feels artistic, we’re looking for an illustration of the SPARQL Monkey which we can put on a corner of the data homepage to advertise this service.

ps. This offer is open to anyone (not just Southampton people) who’s interested in working with our data and would like an XML, CSV or JSON version of some of it.