Privacy Controls for Linked Data

I’ve been considering how we might let our users give selected third parties access to data we hold about them. This includes timetables, module selections and handin deadlines. I’m very wary of anything more sensitive, such as grades and feedback, but more about that in a minute.

The only URIs affected are our /person/ URIs, as these will be the only source of personal information. The idea is that our users may wish to grant limited access to some of their information, or even make it public.

Why allow users to make their information public?

I’ve got a few use cases for this. A key one comes from a student-built iPhone app called iSoton, which takes your university username and password and uses them to navigate several webpages to obtain your timetable and present it in a friendly format. There is no good reason to allow that app access to anything more than it actually needs. With the username and password it could also read and send your email! I don’t currently have access to student timetable data, but it remains a use case for people following the patterns we’re working out here.

Another is letting students see their coursework handin timetables in a more useful format, and load them into Google Calendar or whatnot. Anything which helps the students hand coursework in on time is a bonus, in my book. This means maybe adding a new ical export mode to our system.
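As a rough illustration of what an iCal export of handin deadlines might produce, here is a minimal sketch in Python. The function names, the PRODID string, the module code and the example UID are all made up for illustration and are not part of our system. It also skips the line folding that RFC 5545 asks for on long lines.

```python
from datetime import datetime, timezone


def escape_text(value: str) -> str:
    """Escape characters that are special in iCalendar TEXT values."""
    return (value.replace("\\", "\\\\")
                 .replace(";", "\\;")
                 .replace(",", "\\,")
                 .replace("\n", "\\n"))


def deadlines_to_ical(deadlines, stamp):
    """Render (uid, summary, due) tuples as a minimal VCALENDAR string.

    `due` and `stamp` must be timezone-aware datetimes; both are written
    out in UTC.
    """
    fmt = "%Y%m%dT%H%M%SZ"
    lines = [
        "BEGIN:VCALENDAR",
        "VERSION:2.0",
        "PRODID:-//example//handin-deadlines//EN",  # hypothetical identifier
    ]
    for uid, summary, due in deadlines:
        lines += [
            "BEGIN:VEVENT",
            f"UID:{uid}",
            f"DTSTAMP:{stamp.astimezone(timezone.utc).strftime(fmt)}",
            f"DTSTART:{due.astimezone(timezone.utc).strftime(fmt)}",
            f"SUMMARY:{escape_text(summary)}",
            "END:VEVENT",
        ]
    lines.append("END:VCALENDAR")
    # iCalendar uses CRLF line endings.
    return "\r\n".join(lines) + "\r\n"


if __name__ == "__main__":
    stamp = datetime(2010, 8, 31, 12, 0, tzinfo=timezone.utc)
    due = datetime(2010, 10, 15, 16, 0, tzinfo=timezone.utc)
    # "COMP1234" is an invented module code.
    print(deadlines_to_ical([("cw1@example.org", "COMP1234, coursework 1", due)], stamp))
```

Each deadline becomes a single event at the due time, which is about the least a calendar client needs in order to show it.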

Annoyingly, we adopted the URI format FILEFORMAT.ecs.soton.ac.uk/person/123, so that may mean one new HTTPS cert per format. Fortunately they no longer cost a fortune. If we are accepting passwords or passing out password-protected data, there’s no excuse not to use HTTPS.

Initial Design of our Closed Linked Data

My idea is that we’ll keep the same URIs for people, but we’ll shift the RDF to https://rdf.ecs.soton.ac.uk/ just for people, and redirect http://rdf.ecs.soton.ac.uk to it.
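The redirect rule could look something like the sketch below. The function name and the choice of a 301 status are assumptions for illustration; the actual redirect would more likely live in the web server configuration than in application code.

```python
from urllib.parse import urlsplit, urlunsplit

# Hosts whose /person/ documents should only be served over HTTPS.
SECURE_HOSTS = {"rdf.ecs.soton.ac.uk"}


def https_redirect(url):
    """Return (status, location) if the URL should be redirected, else None.

    Only plain-http requests for /person/ documents on the listed hosts
    are redirected; everything else is left alone.
    """
    parts = urlsplit(url)
    if (parts.scheme == "http"
            and parts.hostname in SECURE_HOSTS
            and parts.path.startswith("/person/")):
        location = urlunsplit(
            ("https", parts.netloc, parts.path, parts.query, parts.fragment)
        )
        return 301, location  # 301 is an assumed choice of status code
    return None


if __name__ == "__main__":
    print(https_redirect("http://rdf.ecs.soton.ac.uk/person/123"))
```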

We’ll also add https://ical.ecs.soton.ac.uk/ (or maybe ics.ecs — whichever)

Students will then be able to set their module selections and coursework information to be either private, public, or available via a username/password. If they choose to make it available via a password, then WE will generate a username:password pair for them (cjg-1:9wernhi3ewrfjio), and they will be able to set what that pair can access and give it an expiry date. (Maybe we’ll insist on an expiry of 12 months or less.) We won’t let them change the random password, as it’s intended for one-shot cut-and-paste into a phone or web app, not for remembering.
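To make the idea concrete, here is a minimal sketch of issuing and checking such a pair. Everything in it is hypothetical: the scope names, the AccessToken class, the serial-number scheme for usernames and the 365-day cap are illustrative guesses, not a description of anything we have built. A real system would also store a hash of the password rather than the password itself.

```python
import secrets
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

MAX_LIFETIME = timedelta(days=365)           # the "12 months or less" idea
ALLOWED_SCOPES = {"modules", "coursework"}   # hypothetical scope names


@dataclass(frozen=True)
class AccessToken:
    username: str       # e.g. "cjg-1": real username plus a serial number
    password: str       # random, generated by us, never chosen by the user
    scopes: frozenset   # which parts of the profile this pair may read
    expires: datetime


def issue_token(user, serial, scopes, lifetime, now):
    """Generate a limited-access username:password pair for `user`."""
    unknown = set(scopes) - ALLOWED_SCOPES
    if unknown:
        raise ValueError(f"unknown scopes: {sorted(unknown)}")
    if lifetime <= timedelta(0) or lifetime > MAX_LIFETIME:
        raise ValueError("lifetime must be positive and at most 12 months")
    return AccessToken(
        username=f"{user}-{serial}",
        password=secrets.token_urlsafe(16),
        scopes=frozenset(scopes),
        expires=now + lifetime,
    )


def may_access(token, username, password, scope, now):
    """Check supplied credentials against a token for one scope."""
    same_user = secrets.compare_digest(token.username.encode(), username.encode())
    same_pass = secrets.compare_digest(token.password.encode(), password.encode())
    return same_user and same_pass and scope in token.scopes and now < token.expires


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    token = issue_token("cjg", 1, {"coursework"}, timedelta(days=90), now)
    print(f"{token.username}:{token.password} valid until {token.expires:%Y-%m-%d}")
```

The user never picks the password, so a leaked pair exposes only the scopes attached to it, and only until it expires.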

In this way a student can provide access to some of their personal data to calendar and semantic web tools while retaining control, and not compromising their real password. This should also allow them to try to build some toys on top of this for 3rd year projects.

As I mentioned in a previous post, it’s not acceptable to give out personal information about OTHERS to a 3rd party site or phone app, so we won’t provide a facility for that.

ePortfolios

I had a stimulating discussion the other day with an MSc student (Niha Shaikh) looking into ePortfolios. The idea is that a student gets a transcript of their university involvement as a digital document (could be data and human-readable), this file being signed using the private key of the University of Southampton. This way they could prove to prospective employers that they got certain marks in certain modules, or show feedback from courseworks in a way that can be easily verified as sourced from the University of Southampton. They could even upload these documents to recruitment sites to help them get a job.

This all ties in with the closed linked data ideas I’ve been kicking around — you could provide live access to the same data, and limit who can access it, or choose to make it public.

ePortfolio Risks

I could see an immediate danger with this, were it to become standard practice: Students with externals very interested in their ongoing performance, be it parents, sponsoring organisation or home government, might be required to hand over access to this information. Can you imagine the pressure if someone was checking up on you for handing in a coursework a day late? Part of the value of university, to many students, is the chance to start becoming independent adults, and this level of monitoring would rather kill that.

A passing colleague had another interesting take on this, which even affects the digitally signed document the student theoretically gets at the end of the course! If every piece of coursework feedback and every late handin penalty becomes part of what you show to a prospective employer, each one suddenly becomes worth appealing because it now matters, even small penalties on modules which don’t impact your final mark. The costs to the appeals system would be astronomical.

Posted in RDF, web management.

Tagged with 303, eportfolios, https, ical, RDF, risk, timetables.


By Christopher Gutteridge – August 31, 2010

8 Responses


Kingsley Idehen says

Are you going to look into WebID Protocols and ACLs here?

Links:

1. http://esw.w3.org/Foaf%2Bssl
2. http://www.mail-archive.com/public-lod@w3.org/msg05665.html (old post about WebID, née FOAF+SSL, with an ACL example)

August 31, 2010, 5:15 pm
- Christopher Gutteridge says
  
  Well… no. Not without a damn good usecase.
  
  I want our users to be able to provide access to 3rd party sites to some simple data about themselves. Their coursework timetable is the only really useful thing I can offer to play with.
  
  My interest is in allowing our users to give limited access to parts of their information to a 3rd party, human but generally a phone or web application. I could be wrong, but I suspect more 3rd party tools are likely to consume basic authentication than FOAF+SSL.
  
  BTW, the example on the w3.org page confused me. It would be better to split the list into Authentication (steps 1-5) and Authorisation (example) (steps 6-7).
  
  If I understand correctly, to use this system in my model, I would need the user to generate a certificate and upload it to the 3rd party site, or hope their phone could cope with it…? Hmm.
  
  I don’t hate this system, but it’s not the bit I want to experiment on and would seriously cut the number of people using it. Maybe we can add it as an alternative method later.
  
  August 31, 2010, 5:53 pm
Kingsley Idehen says

WebID is about ACLs without the conventional tedium of PKI. You share a resource for an individual or a group using HTTPS URIs. The protocol simply adds a Public Key lookup against a FOAF based Profile Document (e.g. via SPARQL).

Your use case is quite typical. Let people share resources with their own Web of Trust.

Did you watch any of the screencast demos?

August 31, 2010, 8:51 pm
Kingsley Idehen says

Platforms that currently implement the WebID protocol have all invested in making the X.509 Cert generation “one click” or “wizard driven”.

Do take a look, it isn’t PKI of yore 🙂

August 31, 2010, 8:53 pm
- Christopher Gutteridge says
  
  I’ve certainly done a quick spin around the tech as a result of your comments today. I think that the key difference is that it’s well suited to a user controlling their identity, but that’s not quite what I’m after.
  
  My interest is in a user being able to give limited access to parts of (for the sake of simplicity, let’s call it) their profile to other agents.
  
  They could do this by authorising the URI of the external agent, at which point their privacy depends on the external agent keeping its private key secret.
  
  I can’t see much wrong with giving the external agent a secret key instead. That is also subject to being leaked, but it’s much simpler, which wins in my book. The certificate solution requires the user to learn a new concept AND the developer to run some fairly complex libraries and stuff.
  
  August 31, 2010, 10:13 pm
Marcus Cobden says

On reflection, it shouldn’t be too hard to change away from the FORMAT.ecs.soton.ac.uk convention.

The URIs which matter (and thus the ones you’re stuck with) are the ones which identify things, not documents.
Although if you were to change to something like data.ecs.soton.ac.uk/format/… a simple permanent redirect would probably be wise.

September 1, 2010, 2:58 pm
- Christopher Gutteridge says
  
  But no getting away from id.ecs.soton.ac.uk
  
  September 1, 2010, 5:33 pm
Marcus Cobden says

Permanent redirects ought to do the same trick for URIs for things, allowing you to move away from id.ecs.soton.ac.uk.

However, I don’t think anyone’s published definitive best practices on this sort of thing yet, so support would probably be patchy at best.
I did hear that another of Nick Gibbins’ students was looking at the meaning of such HTTP codes, when used in conjunction with each other.

Including an owl:sameAs triple in the updated documents would also be wise, if you ever were to change the identifiers.

September 2, 2010, 12:46 pm
