July 12, 2016
by Ash Smith
It was recently pointed out to us here in Open Data HQ that we don’t blog enough. We do a lot of things but most of them are little wins and not worth their own blog post. However, we’ve had enough little wins recently that it’s worth writing a paragraph or two about all of them.
Public People IDs
We have always had a few datasets containing information about people. The ePrints dataset, the public phonebook and, more recently, the facilities data all contain people triples. Sadly none of them linked up. This obviously troubled me.
We’ve always been a little timid around personal data. It’s a tough area – one wrong move and you’ll never live it down. Some businesses have been forced to close down due to poor management of personal data. Also, what some might consider acceptable some may not. I notice trends around the University – for example, most people in WAIS already have a public ‘people’ page, containing their email address, their degree and maybe a few other things, and see no wrong in it at all. Researchers actually need a public identity on their institutional website, it’s the way they communicate with other researchers and increase their reputation. But if I were to tell the groundskeeper or the staff in catering that their email addresses and photos are all going to be made public, they would probably be quite strongly opposed to the idea, and quite rightly so. I have no right to publish their personal information without their permission, and certainly not as open data.
We believe we have a compromise. We’ve assigned everyone in the University a URI, but that’s it. Occasionally these URIs resolve to something other than a blank page, but this only currently happens if (a) they have explicitly agreed to be in the public phonebook, (b) are the contact for a piece of equipment in our equipment database, or (c) I’ve added a bunch of information just for laughs (see http://data.southampton.ac.uk/person/D5mY.html for an example.) Even these pages are not visible to Google, and they are not centrally linked from anywhere so you can’t just scrape all the public people (unless you use the SPARQL endpoint I guess). But the point is that even people who have given us permission to publish their information are quite well hidden from Bad People™, and nobody else is public at all, even though we can refer to them in other datasets if we need to. I’m working with the team responsible for the recent main website re-brand, so I’ll link all this in with actual departmental people pages when I can. The main point is that everyone has a URI now, and these URIs are completely obfuscated and non-sequential, and (externally, at least) there is absolutely nothing you can do with them unless the person to which it refers is OK with their data being public. But we can use them to refer to people, and you can be sure the person referred to within one dataset is the same person being referred to by another.
An interesting quirk we encountered (hopefully before it caused any problems!) was what we lovingly refer to as the ‘four letter word problem’. The URIs referring to people were created by generating random numbers starting at over a million (so nobody gets to be ‘number 1’) and then base64-encoding them. The result is a four-digit alphanumeric string which means nothing, and this gets added to the URI. Someone pointed out to us that we run the risk of accidentally generating something along the lines of ‘D0rk’, and the poor person who gets assigned this ID won’t be too happy. Realising there are other (often far worse) four-letter words that might be accidentally generated, we found a digital copy of the dictionary and cross-checked every ID we’d generated and re-generated it if it was a dictionary word. Now whenever we generate a new ID, for a new member of staff, we check against a dictionary before ‘releasing’ the URI into the wild.
This is an embeddable map of all the waste points on our many campuses. It’s particularly useful for people living in halls of residence, but is also used by Estates and Facilities when working with external contractors.
Nightingale Foyer Report
In the Nightingale building foyer there is a little touch-screen display from which anyone can look up the people who work in that building. The data comes from the Open Data Service, but we had to build a custom version of the data because of the embedded staff from professional services who also work in the building alongside the Nursing and Midwifery staff.
Room Shapes and Divisions
You will almost certainly have seen our lovely digital map, for which I am eternally grateful to Christopher Baines. It uses data from Open Street Map and our own open data, but in many cases where data does not exist, Chris went and created it anyway. The best example of this are the room shapes. Zoom in as far as you can on the map and the building shapes will open up into partial floor plans. We are actually not allowed to publish full floor plans for security reasons, but publicly accessible rooms are fair game. I’m currently working on a more maintainable version of the map, so I’ve taken the opportunity to convert all of Chris’ lovely polygons into an RDF description and published them as a dataset.
Additionally, I was asked a long time ago by one of the lab managers in Building 32 if the bays had URIs. The whole of levels 3 and 4 of Building 32 make up a big open plan lab area for Web Scientists, and this is split into bays of four to six people. At the time the bays didn’t have any URIs but I figured that because I’m the University’s only Linked Open Data specialist, it’s probably my job to assign them! So I did, and built a dataset describing their physical position within the lab.
We’ve always had a list of ATMs on campus as part of the Amenities dataset, but that’s as far as it goes. A colleague of ours, Martin Chivers, suggested we include the location of all the ATMs in Southampton, and that the information is all available from various websites already. A few hours of googling added to my own personal knowledge and we now have a spreadsheet of ATM points for the central Southampton and Highfield areas. I’ll try to keep it up to date but the amenities dataset is manually maintained so I apologise if I miss one or two. Looking to the future, I believe we can keep this up to date automatically using Open Street Map, although this is basically crowd-sourced data and we’d like something better. The Link website actually contains a search engine for ATMs in Southampton, but this is almost certainly not open data and we have no right to republish it. So this remains an ongoing project.
Open Day Events
We’ve had lists of open day events before, but this year someone has already gone through the hassle of entering them all into our mobile app, MySouthampton. One Android phone and a copy of WireShark later and I’ve worked out where it gets its data and mercilessly stolen it! This of course means that the open data is up-to-date as long as MySouthampton is. I’ve described the events in the same way as we currently do seminars, etc, so you can always tell when it’s open day because suddenly all the buildings pages have enormous lists of events.
June 27, 2013
by Patrick McSweeney
As you know from Ash’s previous post ran an Open Data Open Day for people in the University to come and learn about open data. As part of the event we invited the University top hackers to come and do some open data hacking. The template is one familiar to many of you who have attended our events or JISC hack days before. Simply get hackers to sit down together, form into teams of 2-5 people, give them a blank slate, float some ideas, keep coffee on tap and periodically wheel food in and out. At the end of the day go to the bar so that each team can present what they have been working on over a hard earned beer.
One of the key aims for the day was to link up people who had data with people who could do something cool with it. The hack day is a good way to do this because you have a room full of people who can do something cool. Then over the course of the day people with data drop by and talk to the hackers, tell them about there data and get ideas to do cool stuff. It is a friendly and informal environment to work in and people come out with some really good ideas. What always surprises me is, even though I have participated in at least 30 and run at least 5, the outputs from the day are always so amazing good. This hack day was no exception and our teams came out with a combination of the awesomely cool and mind-bogglingly useful stuff.
The outputs were:
Collin Williams (CISCO Systems) , Rikki Prince, Biscuits Newton and Andreas Galazis Decided the usability of data.southampton.ac.uk was not good enough for non-technical people. They performed a usability study of the site and identified key areas of weakness to feed back to us. The biggest problem they found was that the search on data.southampton was nearly unusable. To combat the problem they created LOD Search, a SPARQL smart search indexing tool which can generate a usual search engine on any SPARQL end point. The demo they presented was very very impressive and prompted a lot of questions from the audience. Attempts to trip up the system by asking for difficult things gave it no trouble at all and the interface was surprisingly good given the short time available to work on it.
Adam Field (iSolutions), Matt Smith (iSolutions) and Lisha Chen-Wilson (iSolutions) met Adam Tewksbury from the University transport office. He was looking for a cool map and video of cycle routes which he can embed in the transport website and attach to the open data pages and maps. The team took helmet cam videos and GPS data about the route and combined them to make cool video which moves the pointer on an open street map as the the video plays. The technique is very powerful and reproducible for any combination of video and GPS data in KML or CSV format. Hopefully this will result in more students and staff getting on to their bikes to cycles the safe routes of Southampton.
Exchange Calendar to iCal
Martin Chivers (iSolutions) spent a chunk of the morning in talks learning about the nuts and bolts of open data. In the afternoon he grabbed a laptop and cracked really big data.southampton.ac.uk walnut. He created a commandline tool which exports a exchange calendar as iCal. One of our big bug bears in open data has been getting data out of exchange and this tool hit the problem squarely on the head. In the past we have had to ask users with temporal data to set up an account on Google calendar since we can get data out of it but not from our own exchange server. Now users will be able to work in their normal workflow without having to use a tool outside of the University to do a fair ordinary task. As a demo he was kind enough to give us the iSolution change management calendar as open data.
Southampton Blackout the real story
Tyler Ward (ECS) was more of a victim of circumstance than a hacking volunteer at our Hack Day. He was collocated with us to work on ECS’s media sensation Erica the Rhino but since he is keen got caught up in the open data hackery. The University of Southampton ran a media campaign called the Southampton Blackout to promote efficient power use at the University. The write up from our Comms team had some interesting mathematical inaccuracies which made the integrity of the finds questionable. Tylers aim was to use open energy usage data to tell the real story of the Southampton blackout. What he found is that some buildings during the blackout were using more energy not less to a level which almost eclipses the savings made in other buildings. He found some interesting trends in the data particularly in the figure below for ECS Mountbatten Silcon Fab lab. His close analysis of data was able to deduce where future campaigns should be targeting next an where savings can be made most easily.
All in all the day was a great success and lots of fun for the hackers involved. To quote Adam Field:
I had fun. It’s not often that I can take a problem and spend a day solving it.
March 19, 2013
by Ash Smith
We now have a tool that allows anyone in the University to find a suitable room for their event. We call it the Room Finder and I for one am rather proud of it. The tool pulls data from the places dataset, the room features dataset and the new room bookings dataset, and is a really simple way of finding a room at the University of Southampton. Let’s say, for example, that you need a room for a lunchtime meeting on Friday somewhere on Highfield Campus – and by the way, the room must contain a data projector and a piano. Using the Room Finder, you can check to see if such a room is available at the time you need and, if so, click through to the room description pages to find out more. The tool doesn’t currently allow you to actually book the room, but it’s hoped that many phone calls to Estates and/or the central booking service can now be avoided as we continue our ongoing mission to get all the University’s useful data onto the web.
The Room Finder is still under development, so things will change in the coming days. Specifically, I’m not completely happy with the way it displays the features list, it’s still a little bit more technical than it needs to be. We’re also hoping to get a mobile version out soon, it’s a bit fiddly trying to use it on a small screen. But as with everything on this site, I hope it shows just how useful open data can be. If you do find a problem with it, have a request for an additional feature or just find it useful and want to let me know, then feel free to drop me an email at firstname.lastname@example.org.
November 25, 2011
by Christopher Gutteridge
This is something I’ve been banging on about all year. We all tell people that open data is a Good Thing, but we don’t really provide any rewards for it. Search engines do — if you mark up your pages maybe they will do something with your vocab.org content, maybe not.
RSS feeds also provide an immediate and understandable ROI, maybe less than they once did, but people see their data getting used, and can play with it in tools from the moment it goes live.
I actually thing the demonstrations below are a really big deal. It’s one tool which can provide a useful service on top of the RDF from three different organisations. This is what I want the future to look like! These can be embedded in your own pages in an iframe, but it’s a bit icky loading 4 google map iframes in one page.
- Southampton [raw data]
- Lincoln [raw data]
- Oxford [raw data]
- Tsinghau (China!) [raw data]
- Your university? (let me know when you publish the data and I’ll add you!)
Also get in touch, maybe via the comments, if you would like to publish open RDF data about your organisation’s places. It’s dead easy and all you need is a UNIXish (Linux or OSX) computer, a free tool (Grinder), and my example configuration files for places in grinder. Given that you can edit your list of buildings (and sites, and rooms…) in a spreadsheet and turn it into RDF data in minutes. Cheap, easy, and with an imediate Return On Investment. This is how Linked Data should be.
I’m even thinking about making a Grinder web service, so you just publish a speadsheet, and a configuration file and we spit out RDF for you…
August 11, 2011
by Christopher Gutteridge
So I’m off on my holidays, I’m going to be using the things I’ve been learning on data.southampton to use RDF data to power the events programme for a nice little arts festival in the town I grew up in.
I really like festival and conference data, there’s so many useful things you can do with it, and people just throw it into PDF or cruel abuses.
All the code will be reusable for future events…
July 28, 2011
by Christopher Gutteridge
IWMW: Dave & I have been away at IWMW 2011, learning and promoting the Southampton Open Data Service. A key message we’ve been pushing is to make sure the data providers get a return on their investment. If they don’t get a value to themselves or their users, why should they care? Very positive response from many people. I think the tipping point is upon us, finally!
Google Map: A few people have commented how the road on the Google Map goes right through building 85. I’ve put a report into teleatlas who provide that data and apparently a fix is in the works!
Bus Stops: I’ve added some neat new things to the sidebar [example]. A link to a fullscreen version of the bus times info, suitable for display screens (I think this would be nice in the NOC lobby), and also cut-and-paste code to add the bus times to your own pages, using an iframe.