May 13

Power of Information

We use data every day to find out if transport systems are running on time, look for doctors and dentists, and find out about house prices and crime rates in different areas. Linking different types of data together makes it all the more powerful, and this can only be done through sharing information on the World Wide Web.

Last year the Prime Minister, Gordon Brown, asked professors Nigel Shadbolt and Sir Tim Berners-Lee, of the School of Electronics and Computer Science, to help transform public access to government information. As world-leading experts in the field of computer science, they have the know-how to make this work: Sir Tim is the inventor of the World Wide Web and Nigel a leading authority on the Semantic Web. They are co-founders of the Web Science Trust and Directors of the Web Foundation, both organisations devoted to promoting our understanding of the Web and its impact on society. The two professors are also leading the new £30m Institute for Web Science, a collaboration between the universities of Southampton and Oxford set up in March with funding from the UK government’s Department of Business, Innovation and Skills. In addition, Sir Tim is a Professor at Massachusetts Institute of Technology in the USA, where he also directs the World Wide Web Consortium that establishes the standards enabling the Web to run free of proprietary interests.

Public empowerment
Nigel and Sir Tim had a key role in the development of data.gov.uk, a website that allows people to view public information, with the ability to combine different threads of data and analyse them in innovative ways. Launched in January 2010, the data.gov.uk site was developed in just six months, with Nigel and Sir Tim working closely with a small group of technical and delivery experts, with leadership from the Cabinet Office. An important part of the site’s development was the release, after only a few months, of a test site that computer programmers and developers could use, comment on and help improve. Data.gov.uk was then launched to the public in January as a work in progress. It was built with very modest resources and uses open-source software.

Data.gov.uk contains almost 3,000 sets of data from across government about all aspects of our lives, ranging from information about education and traffic, to tax and crime. All the data is anonymous. It has all been collected, and has been paid for by the taxpayer already. The data has been released on data.gov.uk in a format that can be reused by any individual or business to create innovative new applications. These include giving information on house prices, local schools, amenities and services, or access to local hospitals. There is even a mobile phone application that lets you know the current anti-social behaviour order (ASBO) rate for your area.

The data.gov.uk site has been favourably compared to the US version, data.gov, which was introduced last year by the Obama administration and also offers open-access data to the public.

Mashing up the data
The ethos of data.gov.uk is to encourage people to be inventive and combine the different types of government information in innovative ways. This enables the creation of useful practical applications that go beyond what we could do with single, isolated sets of data.

With blogs, forums and other ways to share ideas, visitors to the website are actively encouraged to get involved and add their ideas for ‘data mashups’. These are novel ways of combining different sets of data, and great examples of collaborative working in action. For example, a mashup combining addresses of local schools and their league table results could form a useful mapping tool for parents and school-age children who want to find out where the high-achieving schools are in their area. Other ideas posted on the website include a map showing where all the CCTV cameras are in the UK and an application that generates shopping lists based on foods ranked by the Food Standards Agency’s ‘traffic light’ system.
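As a rough illustration of the schools mashup described above, the following Python sketch joins two small data sets on a shared school identifier and keeps only the high scorers. Everything in it is an assumption made for illustration: the school names, postcodes, scores, field names and the `mash_up` helper are invented and do not come from data.gov.uk.

```python
# Hypothetical sample data: names, postcodes and scores are invented.
schools = [
    {"id": "S1", "name": "Hilltop Primary", "postcode": "SO17 1AA"},
    {"id": "S2", "name": "Riverside School", "postcode": "SO16 2BB"},
    {"id": "S3", "name": "Parkside Academy", "postcode": "SO15 3CC"},
]
league_table = [
    {"school_id": "S1", "score": 82},
    {"school_id": "S2", "score": 67},
    {"school_id": "S3", "score": 91},
]


def mash_up(schools, league_table, min_score):
    """Join the two data sets on the school identifier and keep high scorers."""
    # Index the league table by school identifier for quick lookup.
    scores = {row["school_id"]: row["score"] for row in league_table}
    combined = [
        {**school, "score": scores[school["id"]]}
        for school in schools
        if school["id"] in scores and scores[school["id"]] >= min_score
    ]
    # Best-performing schools first.
    return sorted(combined, key=lambda s: s["score"], reverse=True)


high_achievers = mash_up(schools, league_table, min_score=80)
for school in high_achievers:
    print(school["name"], school["postcode"], school["score"])
```

A real mapping tool would go on to turn each postcode into map coordinates. The point here is only that neither data set answers the parents' question alone, while the combination does.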

Sir Tim has long been an advocate of the release of data from public sources and for inspiring people to share their data in this way to promote innovation. “Government data should be a public resource. By releasing it, we can unlock new ideas for delivering public services, help communities and society work better, and let talented entrepreneurs and engineers create new businesses and services,” he says.

The website aims to change the culture of Whitehall and town halls so that data is seen as public property. Nigel comments: “Making more public sector information and data available is crucial if we are to exploit the innovative talent available to us in this country to produce really outstanding applications that have social and economic value.

“The vision is that citizens, consumers and government can create, reuse and distribute public information in ways that add value, support transparency, facilitate new services and increase efficiency. It is a job that is never going to be entirely finished; governments are always collecting data.”

Creating links to unlock innovation
The ethos of discovery of data.gov.uk is underpinned by the ideas behind the Semantic Web, which is evolving from the World Wide Web. The brainchild of Sir Tim and the major area of Nigel’s research, the Semantic Web enables data to be linked in imaginative new ways. Just as people add value to the familiar web of documents by creating links between pages, it is also possible to add value between data sets using a range of semantic technologies. The ability to link (or mash up) different data sets provides new information and new ways to access it.

Sharing raw data allows scientists to ask questions no one has asked before. “It’s essential that data is ‘unlocked’ and put onto the web so that the world’s biggest challenges can be addressed by scientists working together across different disciplines,” says Sir Tim. “A lot of the knowledge of the human race is currently sitting on private databases and not shared. We urgently need to move away from this ‘siloed’ thinking.”

Democracy in action
Nigel is now leading a panel of experts, which includes local government chief executives, information technology experts and entrepreneurs. They will work closely with key and relevant organisations to help improve local public services and empower citizens. Over a period of two years, the panel will aim to advance understanding of why the release of local public data is also important and how it can be used for the benefit of the public. This work will also feed into the continued development of the data.gov.uk site for all public data.

The website has been warmly received. Referring to the achievement of data.gov.uk, the Prime Minister, Gordon Brown, has commented: “Already as a result of the Berners-Lee–Shadbolt initiative a transformation is at work. A myriad of applications are being developed on the web by citizens for citizens – new websites on health, education, crime and local communities – that inform, enrich and enliven our democracy. It is truly direct democracy in action.”

For more information about the project, visit www.data.gov.uk

[This article originally appeared in the University of Southampton “New Boundaries” magazine, issue 10, May 2010.]

Mar 16

One of the poster issues for Web Science is the Web’s impact on intellectual property, especially copyright. The Web emerged from an open environment characterised by government-funded research teams participating in large-scale international collaborations, and it inevitably promoted a considerably more open stance than our society had previously experienced. The Web itself is built from open source, interoperable software designed for open knowledge exchange and has triggered the formation of many “open content” initiatives: open access, open data, open educational resources, creative commons, scientific commons etc.

However, this attitude to open content is at odds with the market-based knowledge-trading paradigm (book and DVD purchases, journal, magazine and TV subscriptions) that has been the dominant model for information transfer in society. So how do content owners thrive in a world where content distribution is free? What role is there for copyright (which allows only the holder to make copies) when copying is fundamental to every digital activity?

Lawrence Lessig recently wrote about the need to come to a new balance between “open” and “traded” material, or at least open and traded uses of material.

The law of copyright is shot through with balances struck to protect markets and to limit markets. Two hundred years of legislation shows a constant effort to identify and to secure the places where commercial values should reign and the places where they should be constrained.

We need a renewed effort to strike this balance through interests that recognize the good in both sides. It would be a mistake to destroy new markets by eliminating copyright protection where it would do good. It would also be a mistake to assume that all access to culture should be governed by markets, regardless of the effect it has on access to our past. In the most abstract sense, we need to decide what kinds of access should be free. And we need to craft the law to assure that freedom.

I have no clear view. I only know that the two extremes that are before us would, each of them, if operating alone, be awful for our culture. The one extreme, pushed by copyright abolitionists, that forces free access on every form of culture, would shrink the range and the diversity of culture. I am against abolitionism. And I see no reason to support the other extreme either–pushed by the content industry–that seeks to license every single use of culture, in whatever context. That extreme would radically shrink access to our past.

Instead we need an approach that recognizes the errors in both extremes, and that crafts the balance that any culture needs: incentives to support a diverse range of creativity, with an assurance that the creativity inspired remains for generations to access and understand.

Lawrence Lessig, For the Love of Culture: Google, copyright, and our future. The New Republic Magazine. Jan 2010

Lawrence restricts his discussion to culture in the sense of “the past”, but we are interested in a broader definition of cultural knowledge and cultural artefacts. How do private property and public dissemination go hand in hand?

Jan 04

Here’s a presentation that I have been working on to explain Web Science to potential doctoral students. It links to our Doctoral Training Centre page, which gives details about our fully funded studentships in Web Science.


Jan 01

Tom Standage’s book “The Victorian Internet” describes the development of the telegraph – the use of then barely-understood scientific phenomena, applied by nineteenth-century chancers and opportunists to achieve the unthinkable goal of instantaneous trans-global communication. Perhaps this is not unlike the story of the Web’s development and takeup?

The telegraph network at its height really did qualify as an early internet – an interconnected set of regional and national communications lines using a variety of technologies. But the capability that this internet provided – instantaneous communication between two places hundreds or thousands of miles apart – was so incomprehensible and inexplicable to the contemporary audience that it took years for the possibilities to be realised. Government backers were skeptical and failed to understand the science or the application of proposed electronic telegraph systems. In a world in which the speed of communication and travel was limited to the speed of a galloping horse (e.g. the Pony Express), messages between business partners or even between military commanders and their armies might take weeks or months. So ingrained was this natural limitation to the operation of society that the revolutionary advantage of instantaneous communication afforded by this technology was grasped only by a few private individuals – the crank enthusiasts and inventors who built the first telegraph lines and established the early (initially unprofitable) telegraph companies.

Initially the technology was seen as a novelty, but very quickly the telegraph became a success as the public, businesses and government began to realise the advantages to be had. “The rapid supply of information changed the way that business was done… Suddenly the price of goods and the speed at which they could be delivered became more important than their geographic location…. Direct transactions between producers and customers were made possible… manufacturers found that they could offer more competitive prices…”

The weakness of the telegraph was that the digital (morse code) messages had to be routed by increasingly overloaded hubs of human operators. The electric communication technology only allowed a message to be sent on a single leg of its journey; to travel multiple hops (from Reading to London on to Edinburgh) or between networks (across a national boundary to the French telegraph) a message had to be written down by a human operator, passed by hand to another office, and then retransmitted by another operator. As the number of telegraph lines and networks burgeoned and the volume of telegraphic traffic increased, the necessity for increasingly complex human systems and organisations to perform the manual routing slowed the whole system down alarmingly.
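The cost of manual relaying can be sketched with a toy model in Python. The function name and all of the timings below are assumptions chosen for illustration, not figures from Standage’s book: each leg is taken to need a fixed time to key the message, and each intermediate office adds a fixed delay for writing it down, carrying it across and queueing it for retransmission.

```python
def relay_time(legs, transmit_minutes, handling_minutes):
    """Total delivery time for a message relayed by hand between legs.

    legs: number of telegraph lines the message crosses (at least 1).
    transmit_minutes: time to key the message over one leg (assumed).
    handling_minutes: delay at each intermediate office (assumed).
    """
    if legs < 1:
        raise ValueError("a message must travel over at least one leg")
    # Only the offices between legs add manual handling.
    intermediate_offices = legs - 1
    return legs * transmit_minutes + intermediate_offices * handling_minutes


# Reading -> London -> Edinburgh: two legs, one intermediate office.
print(relay_time(legs=2, transmit_minutes=5, handling_minutes=30))  # 40
# A direct line has no manual relay at all.
print(relay_time(legs=1, transmit_minutes=5, handling_minutes=30))  # 5
```

Under these assumed numbers the handling delay, rather than the transmission itself, dominates as soon as a message needs more than one leg. The model holds the handling delay fixed; in the situation described above it would also grow as traffic overloaded the hubs.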

The Web, of course, is a network of stored and shared documents built upon the technology of the Internet. No such analogue exists for the telegraph (the Victorian internet); it is only a system for delivering messages. Without electronic storage and programming, nothing similar could be produced without human effort every step of the way. In theory, it would be possible to telegraph the contents of a book or a newspaper between two remote parties, but the costs would be prohibitive, and even though the speed of transmission may be instantaneous, the delays involved in translating into and out of Morse code by human operators would be intolerable.

A further distinction to be drawn between the Victorian Internet and our contemporary internet is that the telegraph only enabled communication between telegraph operators, not between members of the public. All messages had to be sent from and received at a telegraph office, not only because the wired infrastructure did not extend to customers’ offices and houses, but because the Morse code skills necessary to use the equipment were not available to the general public. In fact, the telegraph led a generation later to the invention of the telephone, which both extended the network to subscribers’ houses and simultaneously ended “the heyday of the telegrapher as a highly paid, highly skilled information worker”.

Nov 22

Is the Web More Pro-human or Antisocial?

This week saw the launch of the Web Foundation, Tim Berners-Lee’s organisation devoted to empowering the 75% of the world who currently do not have the advantages of Web access. The aim of the Web Foundation is to ensure that the Web is a benefit to human society across the world. The aim of Web Science is to analyse the Web’s impact on society and inform the future development of the Web to ensure that aim is realised.

Sep 24

One of the chief aims of Web Science is to take the “mystery” out of the web and to answer questions like “how did the web change the world?”, “why is Wikipedia so popular?” and “who could have predicted Twitter?” In fact, the Web was just one of a whole family of similar technologies with similar aims; of them all, it happened to be in the right place, with the right people, at the right time to achieve the initial take-up that triggered massive popular adoption and gave it the significant role in society that we now take for granted.

The contributors to this blog are the leaders and lecturers on the EPSRC’s Doctoral Training Centre in Web Science at the University of Southampton.