

What would an independent Scotland mean for UK open data?

 

British Isles Euler diagram [CC0 by TWCarlson, via Wikipedia]

I’m posing the question because it looks like a reality we may all be dealing with very soon.

My preference would be for Scotland to vote no to leaving the UK [views my own, not my employer's], but that’s not my decision to make.

So, I’m just trying to get my thoughts in order on what the implications might be. Please feel free to chip in if I’ve missed things, or you think I’ve been overly pessimistic or optimistic.

Mergers and splits are always a time-consuming data management job. For example, when we reorganised the research groups in the Electronics & Computer Science department there were some tricky decisions to make. We list the publications of a research group on its website. When groups split and merge, we either have to start from scratch or decide which group to assign each paper to. Last time we waited until all the new group memberships were settled, then wrote a script to reassign papers to groups based on the memberships of the authors. That’s a storm in a teacup compared with splitting a country into two.
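
For the curious, a script like that can be quite small. This is a minimal sketch in Python, not the script we actually ran: the function name, the data shapes and the majority-vote rule are all assumptions made for illustration. It gives each paper to the new group that most of its authors ended up in, and leaves a paper unassigned if none of its authors has a new group.

```python
from collections import Counter


def reassign_papers(papers, memberships):
    """Assign each paper to the new group that most of its authors belong to.

    papers:      dict of paper id -> list of author ids (assumed shape)
    memberships: dict of author id -> new group name (assumed shape)
    Returns a dict of paper id -> group name, or None when no author
    of the paper has a known new group.
    """
    assignments = {}
    for paper_id, authors in papers.items():
        # Count one vote per author who has a known new group.
        votes = Counter(
            memberships[author] for author in authors if author in memberships
        )
        if votes:
            # Majority vote (an assumed rule); ties go to the group seen first.
            assignments[paper_id] = votes.most_common(1)[0][0]
        else:
            # Nobody left to vouch for this paper: flag it for a human.
            assignments[paper_id] = None
    return assignments


papers = {"p1": ["alice", "bob", "carol"], "p2": ["dave"], "p3": ["erin"]}
memberships = {"alice": "web", "bob": "web", "carol": "agents", "dave": "agents"}
result = reassign_papers(papers, memberships)
```

The papers that come back as None are the interesting ones: they are the cases where someone still has to make a decision by hand.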

Domains: When .uk is no longer accurate.

My best guess is that, if Scotland leaves the union, it will have a new top-level domain, but that many cross-border businesses and organisations will use .uk for years into the future. There does appear to be a .scot domain, but I suspect it will take some time to move sites like www.ed.ac.uk to www.ed.ac.scot, and it seems likely they’d just redirect their .ac.uk domains.

Not many Scottish universities have done much with linked data yet, so the issue of organisation-assigned URIs isn’t so likely to raise its ugly head this time, but it does lead to questions about data.ac.uk. What should we do with http://id.learning-provider.data.ac.uk/ukprn/10007790? Should a UK organisation be defining .uk URIs for a .scot institution?
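
To make the redirect idea concrete, here is a rough Python sketch of how an old .ac.uk address could be mapped to its new home. Everything in it is an assumption: nobody has said that an .ac.scot namespace will exist, and the function name and the set of moved hosts are made up for the example. It only works out the target URL; a web server would then send that back as a permanent redirect.

```python
from urllib.parse import urlsplit, urlunsplit

OLD_SUFFIX = ".ac.uk"
NEW_SUFFIX = ".ac.scot"  # hypothetical: no such namespace has been announced


def redirect_target(url, moved_hosts):
    """Return the new URL for an old .ac.uk URL, or None if it has not moved.

    moved_hosts is an assumed set of hostnames that have migrated.
    The path, query string and fragment are carried over unchanged,
    so old deep links keep working. Any port number is dropped.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    if host not in moved_hosts or not host.endswith(OLD_SUFFIX):
        return None
    new_host = host[: -len(OLD_SUFFIX)] + NEW_SUFFIX
    return urlunsplit(
        (parts.scheme, new_host, parts.path, parts.query, parts.fragment)
    )


moved = {"www.ed.ac.uk"}
new_url = redirect_target("http://www.ed.ac.uk/schools?x=1", moved)
unmoved = redirect_target("http://www.soton.ac.uk/", moved)
```

Keeping the path and query intact matters more than the hostname: it is the deep links that other people have already published that break when a site moves.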

HESA & UCAS

HESA are the UK Higher Education Statistics Agency and UCAS are the organisation through which students apply to UK universities.

One of the big advantages of HESA is that they allow us to compare data (open and otherwise) between UK universities. Data returns to HESA are mandatory, but sadly they are not funded in a way that allows them to make the majority of their data open and available to all. I would hate to see HESA and UCAS fragmented; it wouldn’t benefit the students in any way I can see.

Russell Group, Universities UK etc.

The UK has a number of university consortia, which is to say clubs to which universities belong. The University of Southampton is part of the Russell Group, which includes Glasgow & Edinburgh. In the short term, there’s little value in breaking up these consortia along state lines, but one of the reasons they exist is to provide a collective voice to government and to coordinate strategic responses to government policy. As research & education policies diverge, these consortia may become less meaningful.

Ordnance Survey

The Ordnance Survey is more-or-less the UK mapping organisation. It’s based in Southampton so I’ve a number of friends who work there. Said friends are rightly proud of the work OS does.

The more-or-less I mentioned is that the OS handles Great Britain, so Scotland, England and Wales but not Northern Ireland. NI is covered by a separate mapping organisation, so things are already not restricted to state lines.

The OS does have open data and URIs: http://data.ordnancesurvey.co.uk/id/postcodeunit/SO173AG and that could be tricky for Scottish data with .uk URIs.

data.gov.uk

I think that the UK government open data site is one of the more likely to be split early. As it’s primarily a repository rather than a service, I think you could do this with less trauma than with services or APIs.

That said, there’s no rush, and it would be helpful for agencies on both sides of the border to collect data using the same approaches, so that the data can still be meaningfully compared and used by apps which work on either side of the border.

What to do?

Maybe minting URIs by state is not going to work in the long term. States do occasionally merge and split. I have no idea what a better solution might be. Cool URIs don’t change, but what are you to do when your URI is suddenly unpatriotic?

I think that if Scotland votes to leave the UK, then there’s no hurry to address the issues of cross-border data and cross-border agencies. Public sector open data is something for which the UK has been a world-leader. Obviously there are many bigger questions about what the future might hold for the British Isles, but whatever happens, please don’t rush to split services, websites and datasets. Slow and steady.

What did I miss? What’s the best plan for national open data when a nation divides?

Posted in Best Practice.



7 Responses


  1. ww says

    This is not something that linked data does well — handling change. RDF doesn’t do time or annotations well. sameAs doesn’t capture “this entity used to be X and now it is Y”. TimBL’s “cool URIs don’t change” is, sadly, a bit naive because things do change, for all kinds of reasons. The only way that could work long term is if *no* semantic information was embedded in the UR{I|L} string. But that breaks the ability to dereference, which is one of the nicest features. The I-L equivocation is a fundamental flaw in linked data inherited from DNS.

    Is there a way out? An intriguing possibility is the way that magnet links work. There is a pretty good explanation in the new Wikileaks book, actually, where Julian Assange explains it to Eric Schmidt. That strategy addresses both problems: it removes the equivocation, and it doesn’t change because the identifier is a hash of the thing itself. But it introduces new ones: how do you hash a real-world object, and what value do you use? Or indeed an empty object that is only meant to have an identifier but is distinct from another empty object?

  2. Hugh Glaser says

    Nice.
    Of course, you know what I am going to say.
    As domains split, put any sameAs links you want into sameAs.org services (and probably sameAs.org itself so they can be found).
    Then, as long as your data consumption is aware of the services, and which ones it wants to use, you can use any of the URIs, until the world settles down again.
    We did this, for example, when ECS ePrints were moved into Southampton ePrints – none of the code or RDF changed – we just added all the links (thanks for some of them to your team!) to the appropriate sameAs stores.

  3. Hugh Glaser says

    WW:
    Magnet links don’t make much difference to these issues – they allow you to do “Linked Data” while using non-http IDs for the resources, which just means that you can no longer verify where the content is coming from.
    That is, of course, useful in some circumstances – it means that I can publish your Linked Data as magnet IDs; and it does tend to give more resilience and persistence.
    There have been Linked Data people talking about this for a while – I put my RDF (and Tim’s) into Bit Torrent quite a few years ago now – but it has never been something with a big need (yet?).

  4. ww says

    Hugh, something hash-based like magnet links addresses the I-L equivocation for documents, and for that case they do much better than sameAs. How much that matters depends on how many of the things of interest are documents and how many are other things.

    I agree it doesn’t address time though, and neither do any of the linked data things. More generally, it’s context — time can be a part of context. Is ed.ac.uk sameAs ed.ac.scot? For some purposes yes, for some purposes no, it depends on the framing of the question.

    • Hugh Glaser says

      I don’t think the hash addresses the I/R thing (I assume you mean NIR v IR in RDF terms). You can’t distinguish the document as a location from the document as a resource any more than you can for http, and the hash v. URI is the same too.

      sameAs is always context dependent – there is nothing absolute in time or anything else (consumer, publisher, application) – hence you need multiple sameAs views, as I said. I might consider them the same, and so might the people at ed.ac.scot, but the people at ed.ac.uk might not:- that’s fine, and the infrastructure has to support all those views at once.
      Cheers

  5. Robin Rice says

    Are you sure there’s a difference between British Isles and British Islands?

    • Christopher Gutteridge says

      To be honest, I have no idea. I just cribbed the diagram from Wikipedia as I figured it would be useful for non-UK/GB/BI/BI people, and nicking someone else’s diagram saved half-an-hour’s work.


