Linked data vs Open Data

You can have data which is in a nice “open” format (eg. RDF/XML rather than HTML) but is not public.

Mild example; a FOAF profile for a member of staff should not be made public on the web without their permission, but could reasonably be made available to all members of the university.

Extreme example; exam transcripts should be made available in an electronic & machine readable form if a student (or tutor) wants them, but should be locked down.

You never want your students & staff giving their username/password to 3rd party apps willy nilly… although I bet they type it into SSH/Firefox/VPN in cybercafes so that ship has pretty much sailed; the only difference is that a smart phone app can be more targeted.

Here’s what I’m thinking a first draft of private linked data policy should be:

There is a web page on the Intranet, requiring authentication, which allows members to generate pass keys for their data. These keys will have a text description of what they are used for and a mandatory expiry date. Members can then add “privs” to a key, like “profile data”, “timetable” or “assignments” (with carefully worded warnings about the implications). They then cut and paste that key into the app they want to have access to the data.

The key consists of a username & password for the app to pass to the RDF server to get results. For example cjg-key2/8347e084309bc20 (where the username is a combo of my username and the key ID)

28, 14 and 0 days before the key expires the person gets an email telling them it’s expiring/has expired with a URL to click to add 12 months to the key. Maybe less time if it’s got very sensitive access.
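To make the draft policy a bit more concrete, here is a minimal Python sketch of such a key. Everything in it is an assumption for illustration: the class name AccessKey, the field names, the rule that a key is still valid on its expiry date, and the use of 365 days to stand in for "12 months". It is not an existing system.

```python
import secrets
from dataclasses import dataclass, field
from datetime import date, timedelta

# Days before expiry on which a reminder email goes out (from the policy above).
REMINDER_DAYS = (28, 14, 0)


@dataclass
class AccessKey:
    owner: str          # the member's own username, e.g. "cjg"
    key_id: int         # per-member key number, e.g. 2
    description: str    # what the member says the key is for
    expires: date       # mandatory expiry date
    privs: set = field(default_factory=set)  # e.g. {"timetable"}
    # Hypothetical choice: a random hex string as the key's password.
    secret: str = field(default_factory=lambda: secrets.token_hex(8))

    @property
    def username(self) -> str:
        # Combination of the owner's username and the key ID, like "cjg-key2".
        return f"{self.owner}-key{self.key_id}"

    def allows(self, priv: str, today: date) -> bool:
        # Assumption: the key still works on the expiry date itself.
        return today <= self.expires and priv in self.privs

    def reminder_due(self, today: date) -> bool:
        return (self.expires - today).days in REMINDER_DAYS

    def renew(self, days: int = 365) -> None:
        # Approximates "add 12 months"; sensitive keys could pass fewer days.
        self.expires += timedelta(days=days)


key = AccessKey("cjg", 2, "timetable app on my phone",
                date(2010, 6, 30), {"timetable"})
print(key.username, key.allows("timetable", date(2010, 6, 1)))
```

The RDF server would then check allows() for the kind of data being requested before returning anything, and a nightly job would email everyone whose key has reminder_due() set.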

RDF documents created with the username & password would contain a triple

<> generatedFor <...etc.../account/cjg#access-key2> .

And maybe an rdfs:comment with a deliberately long code in it, which the org can search Google for to catch any data that is leaking.

Unexpected emergent behaviour

I can see one really interesting potential issue with this… If it became standard I can imagine high-pressure parents, or even the government of the country an overseas student came from, wanting to keep a check on their grades! This is an interesting privacy issue, where a student can’t be *forced* to tell the truth to 3rd parties about their grades until their final graduation (or lack thereof).

Linked data makes this easier, but the problem exists already. You could just as easily insist the student gave over their normal username/password for parental or government monitoring.

Personal data vs Data I can view

Toby, one of our student coursework helpdesk guys, made a good point to me; there’s an important distinction between data I can view and data about me. A good policy may be to allow people to freely generate keys to any information about themselves (with some advice on the screen), but a key which can access raw data about other people, such as your tutees’ marks, or the internal university phonebook (with all those DPA-restricted names & numbers in), should require some more formal process, even if it’s data you can view via a portal in HTML.

That more formal process could be a signature, training or just a bigger EULA style page for you to fail to read.

Posted in Best Practice, Intranet, RDF.


Graphite Updates

I’ve done some more work on my Graphite PHP library. It now works a bit more like jQuery, which is nice. The general design philosophy is to try to make it as easy as possible to start doing something interesting.

I really like the following bit of code:

print $graph->allOfType( "foaf:Person" )->get( "foaf:name" )->join( ", " ).".\n";

Which prints out the list of all the people’s names. No loops! I even wrote documentation for it!

Graphite Browser

On Friday I needed a quick and easy linked-data browser and most of the ones out there I find bothersome and complicated, and I wanted something for a teaching aid, so I wrote this:

Which is very quick & dirty but I’m quite liking it to just look inside RDF & RDFa documents. It just does a regexp so all the hyperlinks go to http://graphite.ecs.soton.ac.uk/browser/?uri=XXXX instead of XXXX.
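For anyone curious what that kind of link rewriting looks like, here is a rough Python sketch (the browser itself is PHP, and this is not its actual code). The function name rewrite_links, the exact pattern, and the decision to URL-encode the original URI are all assumptions.

```python
import re
from urllib.parse import quote

BROWSER = "http://graphite.ecs.soton.ac.uk/browser/?uri="


def rewrite_links(html: str) -> str:
    """Point every absolute http(s) href at the browser instead of the URI."""
    def repl(match):
        # Assumption: the original URI is percent-encoded into the query string.
        return match.group(1) + BROWSER + quote(match.group(2), safe="") + match.group(3)

    # Quick & dirty: only handles double-quoted href attributes.
    return re.sub(r'(href=")(https?://[^"]+)(")', repl, html)


print(rewrite_links('<a href="http://example.org/doc.rdf">a document</a>'))
```

A regexp like this will miss single-quoted or unquoted attributes, which is fine for a teaching aid but not something to rely on for arbitrary HTML.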

Posted in Graphite, PHP.


SIOC

This blog now supports SIOC RDF: http://blog.soton.ac.uk/webteam/?sioc_type=site

That is all.

Posted in RDF, Uncategorized.


Corkboards and Mind Maps

This is a cute tool to allow you to create and share virtual corkboards. I’m not sure how we could use it, but I’d like to find an excuse.

Also this week I’ve been using a mindmap to manage my scary todo list. Doing it in a webpage means it’s available via home, work laptop etc.

Posted in Uncategorized.


Turning alerts into stories

It is of interest when someone else writes something about the topic of one of our websites. Even more relevant when they write about the site itself. For example; http://webscience.org/ or http://users.ecs.soton.ac.uk/wh/

We use Google Alerts to keep our web/communications staff informed about anything being said about us, good or bad. Now and then one of the things is valuable to record and point people at. Currently this is done by hand, which is easy enough, but any time someone is doing a repetitive job which takes a few more minutes every week, you gotta stop and see if a small script can help. There’s a tipping point where it’s worth doing a few hours’ work to save a few minutes of regular work.

Google Alerts provides an RSS feed, so what we need is a tool which will allow our communications manager to see the latest few matches for a given search: “Wendy Hall”, “Web Science”, “(ECS or (Electronics Computer Science)) and Southampton” etc.

Each feed item would have a “capture” icon which would save the title/link/summary into a database of “in the news/blogs/web” for that subject, but allow us to edit it afterwards for clarity.
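A first stab at the two halves of that tool might look like the Python sketch below. It assumes the feed is plain RSS 2.0 (item/title/link/description), and uses an in-memory dict as a stand-in for the database; latest_items and capture are made-up names.

```python
import xml.etree.ElementTree as ET


def latest_items(rss_xml: str, limit: int = 5):
    """Return the first few items of an RSS 2.0 feed as plain dicts."""
    root = ET.fromstring(rss_xml)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "summary": item.findtext("description", default=""),
        })
    return items[:limit]


def capture(store: dict, subject: str, item: dict) -> None:
    """Save a copy of a feed item under a subject so it can be edited later."""
    store.setdefault(subject, []).append(dict(item))


feed = """<rss version="2.0"><channel><title>alerts</title>
<item><title>A post</title><link>http://example.org/a</link>
<description>Something said about us</description></item>
</channel></rss>"""

in_the_news = {}
for entry in latest_items(feed):
    capture(in_the_news, "Web Science", entry)
print(in_the_news)
```

Because capture() stores a copy, the saved title and summary can be tidied up for clarity without touching what came from the feed.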

Another interesting tool is http://purifyr.com/ which removes all but the “meat” of a webpage. A de-templater tool, if you will.

Posted in Templates, web management.


7 Degrees of Attention

In twitter, you don’t have friends. You have followers and people you follow. These are very much one-way arrows, compared with the facebook approach. Just because I see your tweets does not mean you choose to read mine (@cgutteridge should you wish to).

Today someone named @stuartbrown tweeted “anyone out there got info / point to info on RDF support in EPrints 3.2.1?”. I might have met him in the past, and I certainly don’t follow him, but my friend @psychemedia does, and so retweeted “RT @stuartbrown: anyone out there got info / point to info on RDF support in EPrints 3.2.1? #dev8d @cgutteridge” which drew it to my attention and got him an answer in less than 40 minutes.

There’s often been talk of 6 degrees of separation and Bacon numbers and such, but these are very flimsy connections. foaf:knows style connections. It just indicates some basic connection between the two people. What today’s communication required was a chain of attention. @stuartbrown < @psychemedia < @cgutteridge. I’m wondering how hard it would be to work out the shortest chain of attention between two twitter users. What’s the shortest number of RT’s required to get my text to be read by decision maker X?

It’s not as painful as it sounds as “follows” lists are always much shorter. @nathanfillion may have 0.5 MegaFollowers, but only follows 91 people. However 2 hops is still around 10,000 users so would probably start to hit the API limits.
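Ignoring the API limits for a moment, the search itself is just a breadth-first search over “follows” lists, working backwards from the person you want to reach. Here is a small Python sketch over an in-memory dict; the function name attention_chain and the data layout are assumptions, and a real version would have to fetch each follows list from twitter.

```python
from collections import deque


def attention_chain(follows, author, reader):
    """Shortest chain [author, ..., reader] in which each person is followed
    by the next one along, or None if the reader can't be reached.

    follows maps a username to the set of usernames they follow.
    """
    # Search backwards from the reader, since "follows" lists are short.
    previous = {reader: None}
    queue = deque([reader])
    while queue:
        person = queue.popleft()
        if person == author:
            chain = []
            while person is not None:
                chain.append(person)
                person = previous[person]
            return chain
        for followed in follows.get(person, ()):
            if followed not in previous:
                previous[followed] = person
                queue.append(followed)
    return None


follows = {
    "cgutteridge": {"psychemedia"},
    "psychemedia": {"stuartbrown"},
    "stuartbrown": set(),
}
chain = attention_chain(follows, "stuartbrown", "cgutteridge")
print(" < ".join("@" + name for name in chain))
```

The number of retweets needed is the number of people in the middle of the chain, so one for today’s example.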

Of course, you could just mention them and they (may) see it anyway, but I was thinking more about how far your voice is from influencing decision makers via a channel they pay attention to.

Posted in twitter.


FutureStory

This isn’t strictly Web Team related, but this is the best place to post it.

So, I got asked to go to a thinktank event as a local blogger, which I guess I am. The event is about globalisation, which is a swear word in some of the circles I move (drink) in. Before agreeing I had to work out if I was comfortable with this, and I came to the conclusion that my attitude to globalisation is that it’s a tool. Like, say, electricity. I think we can agree that electricity is, on balance, pretty useful (if not, how are you reading this?). However, too much in the wrong place is a really Bad Thing.

This event is called FutureStory. This initially gave me the idea that it was going to be about trying to come up with ideas about the future of Southampton. Looking at the PDF files which came out of events in other cities, I was a bit disappointed, as it was all about what local companies and projects were currently doing. My housemate pointed out that what I’d missed was that all of these companies are looking to the future, which should be the launching point of ideas, but they are already established. What happens next? (edit: I’ve just noticed the Southampton FutureStory book is on the table, so they’re *not* writing that today… which means that it *is* the jumping off point. That’s hopeful)

Southampton is not generally very proud of itself, and it is quite cool seeing a bunch of younger people being told that this is somewhere where lots of globally important and interesting stuff happens.

The event is largely about getting young people thinking about the fact that business *is* going to change in the future. Companies which can adapt will survive. Many won’t. Looking to support the UK economy 5 years from now means getting people learning the skills now. Jonathan Shaw MP, the regional minister for the South East, gave an introduction about the importance of getting people with the right skills, and helping people into smaller business as well as the big business track most people imagine themselves in.

The catch with what he’s saying here is: how the hell do you train people for the new jobs? Imagine a kid doing their GCSEs this year. If they do a degree then they’ll be graduating around 2015. Twitter only appeared in 2006. Now there’s people who need to know how to use it for part of their job. This GCSE’r may well find in their first job they’ll need to be an expert in something that’s not going to be *invented* until next year. How do they pick which A Levels to take? IMO you can’t, but learning maths, especially boolean logic, won’t hurt. Nor will people skills. The ability to do a boolean search (using and, or and not effectively) is the difference between hours of slog and seconds of thinking to satisfy your curiosity.

I don’t like networking. Not my thing. At the “just introduce yourself to someone for 90 seconds” bit I managed to randomly end up talking to the ProVC of Solent. Used the opportunity to ask for a suggestion of how to find people at Solent to get involved in “Southampton Developers”. Dammit, I appear to be someone who benefits from networking! At least I’m still not wearing a suit, yet.

The video they showed about Southampton businesses looking to the future started with lots of shots of the docks and the cargo containers. In my head the whole thing got overlaid with the theme music from season 2 of The Wire but I don’t want to suggest that the similarity goes beyond cranes and containers. And that’s the thing that stuck with me from the bit about the docks. Using containers speeds up the turnaround by a factor of 4. That’s amazing and just requires getting everyone in the world on the same page. It’s like a cube root. It’s really easy to check you got a valid answer, but getting the answer in the first place is really hard. The best ideas (like Open Access to research and Linked government data, or the scientific method) seem obvious once they are established. And many great and *obvious* ideas have not yet been expressed, or at least not reached the tipping point where they become obvious.

Yikes; good core skills (according to the Minister) are reading, writing, adding up and ICT. The three R’s have changed since I was at school. It’s a bit of a shock to hear it expressed. Jonathan Shaw is also talking about government contracts needing to include a requirement to take on apprenticeships and be engaged with bringing skills to the community. Sounds like a good long term idea. The example; a recent government loan to construction companies to get them to start building despite the recession. The loans required them to take on apprentices on the builds.

I was going to ask the panel a difficult question, but it’s become apparent that it’s not appropriate. This is useful for the students here and they want to ask questions about the future. My question was going to be how to ensure we have “firebreaks” in globalisation? There’s a danger of building a super efficient house of cards where one natural disaster can collapse the entire global economy. The global just-in-time economy scares the hell out of me. This article about the collapse of the USSR got me thinking about these things last year. There is a really exciting answer to this which is that globalisation could also mean creating more peer to peer distribution networks so local business delivers locally. Rather than all farms supplying the big supermarkets, I could have a web service more like Amazon, where I order my food and it’s delivered by the most local source. If the economy collapses, we still have the basic means of production, for essential services, in each region. “Food Miles” are going to be an increasingly big deal. As is the miles travelled for any resource. I can imagine services like lulu.com and cafepress operating globally but with points of production all over the world. When 3D printing gets off the ground, that’ll be interesting too.

In the Q&A phase most of the questions are about how to get the opportunities. Schools have a focus on getting exams passed, not getting their students work experience and other activities. Some schools are great at it. But it sucks if your school doesn’t bother with that. Advice to students; badger your school teachers! Getting the first bit of experience to get the next job is an utter arse for people. Many good people are lost to minimum wage jobs because they can’t get that first rung on the ladder. It puts me in mind of a job advert I saw in 1998 which wanted a Java programmer with 5 years of experience. As Java was released in 1995 it’s possible a few people at Sun had that much, but the rules were changing too quickly for the HR department. How do you hire someone for a job where you’re not sure what the job description will be by the time their probation period is over?

The key message for students (it seems to me) is to take the initiative. You have access to Google & can use it better than your teachers. If you don’t investigate what’s on offer to create your own future you’ll just fall into something. You might get lucky, but I’ve noticed that people who do their research and leg-work tend to be luckier. Probably just a co-incidence.

Posted in Uncategorized.


Dev8D twitter network, part 2

A continuation of an earlier post:

Having (vaguely) got to grips with Graphviz last night, I’ve produced a few diagrams which attempt to better visualise the changes in twitter network of dev8D attendees.

In the diagrams below, I’m looking at how the network between 113 dev8D attendees changes over the course of the event (and a few days surrounding it).

Some details on the data I’m looking at:

  • The list of 113 was obtained by looking at all people with twitter accounts who signed up for the dev8D wiki
  • I’ve ignored all connections which existed before 22nd Feb 2010
  • I’m looking at the change in friends (i.e. follows) of each user, ignoring any changes in who they’re followed by
  • The changes in network are cumulative, so the very last diagram includes all the changes starting from 22nd Feb 2010
  • I’m only looking at new connections made between those 113 dev8d attendees, I’ve ignored new follows to people who aren’t in that list of 113
  • Any nodes on the diagrams which are unconnected are fresh signups to the wiki
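The rules in the list above boil down to “diff each day’s snapshot against the 22nd Feb baseline, keep only edges between attendees, and hand the result to Graphviz”. A rough Python sketch of that is below; the function names and the dict-of-sets layout are assumptions, not the script actually used.

```python
def new_edges(baseline, snapshot, attendees):
    """Follows present in snapshot but not in baseline, between attendees only.

    baseline and snapshot map a username to the set of users they follow.
    """
    edges = set()
    for user in attendees:
        before = baseline.get(user, set())
        for followed in snapshot.get(user, set()) - before:
            if followed in attendees and followed != user:
                edges.add((user, followed))
    return edges


def to_dot(edges, attendees):
    """Render the edges as a Graphviz digraph, listing every attendee as a node
    so that people with no new connections still appear."""
    lines = ["digraph dev8d {"]
    for user in sorted(attendees):
        lines.append(f'  "{user}";')
    for follower, followed in sorted(edges):
        lines.append(f'  "{follower}" -> "{followed}";')
    lines.append("}")
    return "\n".join(lines)


attendees = {"alice", "bob", "carol"}
baseline = {"alice": {"bob"}, "bob": set()}
day_three = {"alice": {"bob", "carol", "someone_else"}, "bob": {"alice"}, "carol": set()}
print(to_dot(new_edges(baseline, day_three, attendees), attendees))
```

Always comparing against the same baseline is what makes each day’s diagram cumulative. The output can be fed to dot or neato to produce the images.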

So what does this mean?

The diagrams themselves show how the network of a small number of dev8D attendees increased amongst themselves over a period of time surrounding dev8D. This alone is good to see, but what’s far more exciting is something that arises when considering how small an area this looks at:

  • These diagrams represent only the connections formed between fewer than half of the 250 attendees
  • They represent only the twitter connections formed between those 113 people

What’s exciting is knowing that this is only a small subset of the true amount of networking that took place:

  • Face to face meetings
  • Email address exchanges
  • Connecting to non-dev8D attendees
  • Awareness of dev8D and the developer community
  • Community built through DevCSI
  • Connections on social networks outside of twitter

All these things are hard to measure and quantify, but I’m confident that anyone who attended dev8D can tell you that these things happened non-stop throughout the event.

And finally….the diagrams

The diagrams represent the connections made on a day by day basis, starting on Feb 22nd 2010 to Mar 3rd 2010.

Each image is a thumbnail; I’ve included a link to the full size image underneath each one (which should be zoomable enough to view individual node names).


2010-02-22

Graph of cumulative twitter following network for dev8D
2010-02-22.png full size


2010-02-23

Graph of cumulative twitter following network for dev8D
2010-02-23.png full size


2010-02-24

Graph of cumulative twitter following network for dev8D
2010-02-24.png full size


2010-02-25

Graph of cumulative twitter following network for dev8D
2010-02-25.png full size


2010-02-26

Graph of cumulative twitter following network for dev8D
2010-02-26.png full size


2010-02-27

Graph of cumulative twitter following network for dev8D
2010-02-27.png full size


2010-02-28

Graph of cumulative twitter following network for dev8D
2010-02-28.png full size


2010-03-01

Graph of cumulative twitter following network for dev8D
2010-03-01.png full size


2010-03-02

Graph of cumulative twitter following network for dev8D
2010-03-02.png full size


2010-03-03

Graph of cumulative twitter following network for dev8D
2010-03-03.png full size

Posted in Uncategorized.


Dev8D Mobile Tools

So having met @samscam it turns out that he yings-my-yang when it comes to building tools for events. He’s good with the mobile stuff and wants data to consume (yay!)

Now he’d given me lat+long for each location, it was easy to use my Graphite library (built last week because I figured I’d need it to hack at Dev8D) to convert the programme data into location & time based data. The first idea was to show the current thing on in a location, which was easy enough to do as a KML file, but is now meaningless as it’s over. So here’s some links to it running as if it were still in the event:

Given that, plus the fact the conference had a twitter ID for each location, described in the programme data, it was easy to make a KML of the last tweet from each room:

So… afterwards I realised that it would be much more useful to show what’s on next as well, but we’ll work on that for next time.
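As a rough idea of what the “current thing on in each location” KML involves, here is a Python sketch (the real version used Graphite in PHP, and this is not that code). The function name, the dict layouts and the sample room are all made up for illustration.

```python
from datetime import datetime
from xml.sax.saxutils import escape


def current_sessions_kml(locations, sessions, now):
    """Build a KML document with one placemark per location that has a
    session running at the time `now`.

    locations maps a room name to (lat, long); sessions is a list of dicts
    with "title", "location", "start" and "end" keys.
    """
    placemarks = []
    for name, (lat, lon) in sorted(locations.items()):
        current = [s["title"] for s in sessions
                   if s["location"] == name and s["start"] <= now < s["end"]]
        if not current:
            continue
        placemarks.append(
            "<Placemark>"
            f"<name>{escape(name)}</name>"
            f"<description>{escape(', '.join(current))}</description>"
            # KML wants longitude first, then latitude.
            f"<Point><coordinates>{lon},{lat}</coordinates></Point>"
            "</Placemark>"
        )
    return ('<?xml version="1.0" encoding="UTF-8"?>\n'
            '<kml xmlns="http://www.opengis.net/kml/2.2"><Document>'
            + "".join(placemarks) + "</Document></kml>")


locations = {"Example Room": (51.5, -0.12)}
sessions = [{"title": "Linked Data & RDF", "location": "Example Room",
             "start": datetime(2010, 2, 25, 10, 0),
             "end": datetime(2010, 2, 25, 11, 0)}]
print(current_sessions_kml(locations, sessions, datetime(2010, 2, 25, 10, 30)))
```

Showing what’s on next as well would just mean also picking the earliest session in each location whose start is after now.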

The really cool bit is (soon) going to be described by Sam (I’ll link it here when he’s done)

Posted in Conference Website, dev8d, PHP, twitter.


A first look at the dev8d twitter network

A couple of days before setting off for dev8D, I set a script running to log the changes in friends/followers of twitter accounts related to dev8D.

In looking at the data, I’ll talk about three sets of twitter users:

  1. Wiki users – this is anyone who has registered on the dev8D wiki with a twitter account. Anyone in this category is assumed to have been an attendee at dev8D.
  2. Dev8D community users – this is anyone who has mentioned ‘dev8D’ in a twitter post or is a wiki user
  3. Total users – someone who has been followed by or follows someone in the two categories above

So far I’ve just looked at the smallest possible category: those who registered on the dev8D wiki with their twitter accounts, numbering 113 in total.

I captured 9 days of useful data, from 22nd Feb 2010 to 3rd March 2010.

The Numbers

In summary:

Those 113 attendees followed:

  • 158 other attendees (i.e. wiki users)
  • 250 dev8D community users (including wiki users)
  • 565 total users (including wiki/community)

and were followed by:

  • 73 other attendees (i.e. wiki users)
  • 149 dev8D community users (including wiki users)
  • 644 total users (including wiki/community)

Putting those figures into bar charts:
Changes in twitter friends of dev8d wiki users

Changes in twitter followers of dev8D wiki users

Source Data

Raw data is available in a sqlite3 database at:
dev8d_twitter_network_2010-03-03.db (or view one of the images above, and get the data passed into the google charts query string)

Observations and Comments

Data oddities:

  • There’s some discrepancy between wiki follows and wiki followed by numbers, the higher of the two is actually the true value. This is due to the fact that I’m only looking at diffs in a person’s network, i.e. connections between dev8D attendees that were made before the event or before signing up to the wiki are ignored.
  • The numbers for 2010-02-22 are higher than expected, due to the fact that data collection failed for the two days preceding it, causing all the figures to be lumped together (dividing that day’s numbers by 3 would give a more realistic estimate of changes for that day). I’m not sure why data collection failed, network problems? PC crash? Temporary twitter rate limit ban? Script bug?

On average, each and every dev8d attendee on twitter gained 6 followers over dev8D (and a couple of days surrounding it). I’d be interested in seeing how this compares to the average over the rest of the year.

Also of interest is the fact that dev8D was attended by 200 or so people per day (450 on the first day, which included a linked data meetup), but was mentioned by around 500 different people on twitter. Hopefully this is some evidence that news/outcomes/interest from the event reaches far beyond its immediate participants.

Next step is to look at using GraphViz to produce some diagrams of the changes in network over time. Suggestions on visualisation for this are more than welcome – my network diagrams so far look like squashed spiders…

(continued in my next post)

Posted in dev8d, twitter.