

URIs in the page, please!

I’m trying to build a website using best linked-data practice.

This means providing RDF info about each resource described on the site (easy with a SQL DB backend), plus providing sameAs information to join our information to third-party sites like DBpedia.
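In RDF terms the join itself is trivial; one owl:sameAs triple per resource does it. A minimal sketch (the URIs here are made-up placeholders, not our real identifiers):

@prefix owl: <http://www.w3.org/2002/07/owl#> .

<http://id.example.ac.uk/person/1234> owl:sameAs <http://dbpedia.org/resource/Some_Person> .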

Problem is finding a person's URI. I'm having to do web searches, which wastes my time and makes the job less likely to happen. Pages should make it easy to get to the URI of a person, event, conference or other resource you want to be linked up. If you don't, we won't bother sameAs-ing you!

ECS has provided an RDF link in the page which makes it easy to get to both the RDF & the URIs. Without this, how the hell do you expect people to find the URI to link to in the first place!?

On our database-driven people pages for ECS, the RDF button on the left side of http://www.ecs.soton.ac.uk/people/cjg takes you to http://www.ecs.soton.ac.uk/people/cjg/rdf which tells you my URI, the URI of my role in the school and the links to the RDF.
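For machines (and impatient humans reading the source), a page can also advertise its RDF with a standard autodiscovery link in the head. Something along these lines is all it takes (the title text is just an example):

<link rel="alternate" type="application/rdf+xml" title="RDF data for this page" href="http://www.ecs.soton.ac.uk/people/cjg/rdf" />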

We also put the URI & RDF links on project pages: http://www.ecs.soton.ac.uk/research/projects/42

As a result of this annoyance, I've asked our webmaster to add the URIs directly to the list of contact details along with email, homepage, phone etc.!

UPDATE: @das05r suggests "semantic radar" for Firefox and http://sig.ma/ which is all very well, but I want to do this with minimum work. I want the URI right there on the conference homepage, not to be forced to use special tools to find it!

Posted in Uncategorized.



Spelling and credibility

I've noticed that I tend to dismiss the opinions in poorly spelled or punctuated comments on blogs and forums.

Initially I assumed this was because if they could not be bothered to show the basic respect of putting in the effort to use a capital "I", the correct "there"/"their" and so forth, then they were less worthy of my respect.

However, I now have a new theory. There's a bit of my brain which filters spam. Most spam has grammatical and spelling errors in it, which means I've come to hit delete automatically on anything that has more than a few spelling, punctuation and grammar mistakes and isn't directly addressed to me.

Dave has extended this theory; he thinks that someone in Nigeria deeply cares about proper use of the English language, and is sending out thousands of poorly constructed junk messages for the sole purpose of forcing the world to use correct spelling, and so forth, if they want to be taken seriously. Agents provocateurs of the Grammar Nazis?

Posted in Uncategorized.


Avoiding web design by committee

We design and build lots of websites for our school. Normally it's fine, but there's a project coming up which wanted us to build its website, and that had me worried (or at least had me planning how to cope with it).

The reason being, all the people with the power to make decisions are too busy to do so. This can be a real pain as they’ll expect the site to be all things to all people, but not give any guidance on what they want. Which is understandable, but makes an otherwise reasonable job into an utter pain in the arse.

The good news is it looks like the website responsibility will be delegated to a non-professor member of staff who can set our priorities and make decisions (we’ll offer advice, but there needs to be an executive).

I'm thinking that this might be a very good model for any design work we do: insist that we have a single liaison for a project. We can talk with lots of people, but only one has the responsibility to make decisions. That way we can avoid design by committee, crossed wires etc. When someone wants something "out of scope", or something we think is a poor choice, we can ask them to go via the liaison. The liaison would also have responsibility for getting us any information we require to progress. My experience is that Profs just don't have the brainwidth to deal with the niggling details, which is to be expected but does mean they absolutely must delegate them.

The worst case of this was a really big project which got stalled for 2 YEARS because a Prof. wouldn't let it go live until they signed off on it, but never had time to look at it. I'm still bitter about this one; although he wasn't a bad bloke, it was utterly demoralising.

There's a good tip for academics here; if you ask our team to get something done ASAP we will. Maybe even in an evening or weekend if we're online and it's not too much bother. However, if the site/wiki/blog/database then sits unused for six months, we'll probably not go the extra mile for you again. On the flip side, if you immediately start heavily using the thing we put a rush on, then mucho respect.

Better still, don't ask for rush jobs. It's not fair to know that you need a website but only ask for it 2 hours before it's due to go live. We do our best, but sometimes people mistake our normal excellent turnaround for a guaranteed service level. More notice is just polite, and lets us do a batch of similar tasks together, reducing our workload.

Posted in web management.


Bulk Moving or Deleting “locked” files in OSX

A bit off topic this one, but Google didn't turn up the right answer, so it's worth putting it somewhere on the web.

I was trying to move a bunch of files in OSX and it seemed that some of them were "locked". This appears to be a system flag and can be removed on a single file via the file-info tool. But I've got hundreds. There's a bunch of bad suggestions out there. The information you really need is:

chflags -R nouchg ...path...

via the command line. This will remove the “uchg” (unchangeable) flag from the path, and recurse down. To view the flags on a file or directory use:

ls -lO ..path...

You could use “uchg” if you wanted to bulk-set a whole bunch of files to locked, too.
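If you only want to touch the locked files rather than everything under a path, BSD find can select on flags. A sketch, assuming your version of find supports the -flags primary:

find ...path... -flags -uchg -exec chflags nouchg {} +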

UPDATE: While this worked on the files, I still could not delete the directories, even from the Trash. How weird. As it's an external drive, I think I'll just mount it on Linux and hose them there.

UPDATE 2: Upgraded to Snow Leopard and the problem went away. Yay.

Posted in OSX.



There’s always one

All of our modules fit nicely into academic sessions… except two: the MSc projects, which last into the first semester of the year AFTER they start.

In several places I need to work with all "current" modules. The student homepage needs to show students all the modules they are currently on, and up to now I've just searched the current session. I also need to show students their project examiners and vice versa, but by default my system was showing these as last year's relationships rather than current ones, as we're now in 0910 rather than 0809.

My rather hacky solution is to maintain a list of modules for each session which should be shown in semester 1 of the next year. This is a cheap and dirty hack but got done in an hour.

What I should really do is give every module a start and end date and test which are current; however, I can just see the start/end date data getting out of sync. A middle road might be to keep an "exceptions" list, then build start & end dates via a script. Then there's the question of the end date, as we need to keep modules open until after referrals etc.

The thing is, this only impacts 1% of our modules, so, sadly, I think the hack’s the right solution. I just don’t have the resources to re-engineer the whole system for one special case. If we ever rebuild the system we’ll know better next time.

UPDATE: after thinking it over and discussing with other staff, I think the best long-term solution is to add a "module expiry date" table with columns module_id, session_id and date. This is only used for modules which last into the next session, and the canonical list of active modules is then all those whose session_id is the current academic session PLUS all those with an expiry date still in the future.
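A sketch of how that might look in SQL. The modules table and column names are guesses at our schema rather than the real thing (I've called the date column expiry_date here), and date syntax varies slightly between databases:

CREATE TABLE module_expiry (
    module_id   VARCHAR(16) NOT NULL,
    session_id  VARCHAR(8)  NOT NULL,
    expiry_date DATE        NOT NULL,
    PRIMARY KEY (module_id, session_id)
);

-- "current" modules: everything in this session, plus any overrunning
-- module whose expiry date is still in the future
SELECT m.module_id
  FROM modules m
 WHERE m.session_id = '0910'
UNION
SELECT e.module_id
  FROM module_expiry e
 WHERE e.expiry_date > CURRENT_DATE;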

Posted in Database, Intranet.


Keeping track of DNS

Right now we have a whole heap of domains registered for our department. About 30% of websites on core servers are not *.soton.ac.uk, though some share a registration, like roar.eprints.org and demoprints.eprints.org.

We've never kept track of these registrations in any really organised way. I build a daily list of the primary DNS for every virtualhost on our infrastructure kit, but not for the 100 or so research webservers. People may have registered DNS in any old place, not just our own DNS server! Here's the current breakdown by nameserver (number of our sites in brackets):

  • ECS (280)
  • auth-ns0.csail.mit.edu., auth-ns1.csail.mit.edu., auth-ns2.csail.mit.edu. (3)
  • dns0.brad.ac.uk., dns0.soton.ac.uk., dns1.brad.ac.uk., dns1.soton.ac.uk. (4)
  • ns.hosteurope.COM., ns2.hosteurope.COM. (2)
  • ns.newdream.net., ns2.newdream.net. (1)
  • ns1.dreamhost.com., ns2.dreamhost.com., ns3.dreamhost.com. (1)
  • ns1.ecs.soton.ac.uk., ns2.ecs.soton.ac.uk. (1)
  • ns29.domaincontrol.com., ns30.domaincontrol.com. (1)
  • ns41.domaincontrol.com., ns42.domaincontrol.com. (1)
  • raven.ecs.soton.ac.uk. (1)
  • unknown (16)

My system regards "ns0.ecs.soton.ac.uk" as ECS. It's odd that we've got a site on ns1 & ns2 only, and also one with raven.ecs listed as the DNS (this is the true name of ns0, but it should be using the alias in case we ever wish to move ns0 to another server).

My script gets the data by parsing the output of /usr/bin/dig ns0.ecs.soton.ac.uk -t any domain, and also checks an external DNS to ensure they tally.
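Per domain the check is nothing clever; it boils down to something like this (the domain and the external resolver are placeholders):

internal=$(dig +short @ns0.ecs.soton.ac.uk example.org NS | sort)
external=$(dig +short @an.external.resolver.example example.org NS | sort)
[ "$internal" = "$external" ] || echo "example.org: NS records do not tally"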

The newdream and dreamhost entries are for my own personal projects, so I’m hardly blameless.

I've not got any solutions to this one yet, and it's not really an urgent problem, just one I'm mulling over. In an ideal world we'd force everybody to use only .ecs.soton.ac.uk domains, but there are plenty of good reasons not to do that.

One compromise would be to insist that we handle all DNS registrations and keep careful track of both the person and the project which pays for each, forcing people to pay upfront for years of registration to put off the problem of what to do when a registration runs out while the site still holds valuable content (research output, or worse, URIs which mean something and which someone else could usurp!). But this solution is also too draconian for our users.

A more reasonable solution will be to:

  • Find out about the existence of all non-*.ecs.soton.ac.uk entries referring to our IP range, using some clever web-fu or what-not.
  • Keep track of one or two “owners” for each, linked with a website db entry if appropriate.
  • Gently encourage anyone whose site is registered with an external DNS provider to move it to our DNS.
  • Make letting us register DNS for people, for many years past the end of the project, the path of least resistance.
  • Collect all relevant info when the domain is registered.
  • Discourage people from creating non-ecs.soton.ac.uk domains when there's no good reason, especially when such domains are used to create URIs.

But, like I said, this is a gentle plan of attack. What we’re already doing works well enough, the above ideas would just streamline what we already do, and move knowledge into a database rather than a few people’s heads.

Posted in web management.


Changing user agent in Microsoft Search Server 2008

As part of evaluating Microsoft Search Server 2008 as an intranet search tool, I wanted to modify its crawler’s user agent string in order to trivially monitor it in our web logs.

I couldn’t find any documentation on this, but eventually came across a registry entry which controls it:

Computer\HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server\12.0\Search\Global\Gathering Manager\UserAgent

Modifying this (using Start->Run->regedit) and restarting the search service (Start->Administrative Tools->Services->Office SharePoint Server Search) resulted in the crawler using the text I'd entered as its user agent.
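If you'd rather script it than click through regedit, something along these lines from an (elevated) command prompt should do the same job; the user agent string here is just an example, and I've only done it by hand:

reg add "HKLM\SOFTWARE\Microsoft\Office Server\12.0\Search\Global\Gathering Manager" /v UserAgent /t REG_SZ /d "ECS-MSS-Crawler/1.0" /f
net stop "Office SharePoint Server Search"
net start "Office SharePoint Server Search"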

Posted in Uncategorized.



ack

My top tip right now for any command-line cowboys (like me) is to install "ack", as it's saving me time and irritation. http://betterthangrep.com/

It searches for exact terms or regexps in a directory tree, just like grep -r, but with all the .svn directories and other crap filtered out. Many people I know have written their own hacky tools to do this, but "ack" is a better solution.
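Typical use, hunting a term through a checkout, is just (the pattern and path are placeholders):

ack some_function_name lib/

versus the old habit of grep -r some_function_name lib/ and then scrolling past all the .svn hits.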

Posted in Uncategorized.



Welcome to the ECS Web Team Blog

This blog is intended to share some thoughts and tips from our team.

Our team runs the web systems for the School of Electronics and Computer Science at the University of Southampton. There are three of us: myself (web developer & manager), Dave (webmaster) and Joe (teaching and learning webmaster + web developer). We work with Sarah (graphic designer and wielder of Dreamweaver) and Joyce (communications manager).

ECS is heavily involved in the web, so it's a challenge to keep up with new technologies while keeping the basic stuff working well. Currently we have around 8 infrastructure webservers running 311 websites, plus, judging from the firewall configuration, another 100 or so research webservers.

This blog is aimed at people doing similar jobs to ours, and to members of our school so they can see a bit of what we do.

Posted in Uncategorized.