I often notice people get confused between Open Data, Linked Data and RDF. Here’s a quick overview to get you straight:
These three things are all slightly different: “Open Data” is a policy, “Linked Data” is an approach and “RDF” is a data structure.
Open Data: Open data is data which you can use more or less freely. It’s generally available on the web, and uses non-proprietary formats like XML and CSV. An extremist definition is data with a clear copyright and an open license (one which allows commercial reuse), available from a URL or a well-documented API without any restrictions, in formats which are completely open (i.e. no patent concerns etc.). A milder definition is “available as data on the web in a form people can do stuff with”. Some Open Data is also Linked Data and RDF, but probably less than half.
Linked Data: Linked data is data which contains links to other datasets. Generally these will use URIs which are resolvable to discover more facts. It’s not essential for the URIs to be resolvable; it’s still really useful to have two different datasets which use the same identifiers, because URIs are unambiguous. However, some data doesn’t make much sense to link up, or the costs are too high and put people off. Linked data is often open, but doesn’t have to be: for example, you can have internal confidential data which links up with other data sources. A good example is a lecture timetable, which is confidential to the student but links to data about rooms and modules which are open. Almost all Linked Data is currently expressed in RDF, but you could have links in XML, KML, CSV etc.; it’s just that RDF is designed with linking in mind.
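To make the timetable example concrete, here is a minimal sketch in Python of linking without any RDF at all: two CSV datasets that happen to use the same URIs for rooms. Everything in it is made up for illustration, including the example.org URIs, the column names and the `link_timetable` helper.

```python
import csv
import io

# Hypothetical open dataset: facts about rooms, keyed by a room URI.
ROOMS_CSV = """room_uri,name,capacity
http://example.org/room/32-1015,Building 32 Room 1015,60
http://example.org/room/59-1257,Building 59 Room 1257,25
"""

# Hypothetical confidential dataset: one student's timetable.
# It links out to the open data simply by reusing the same room URIs.
TIMETABLE_CSV = """day,time,module_uri,room_uri
Monday,09:00,http://example.org/module/COMP1001,http://example.org/room/32-1015
Tuesday,14:00,http://example.org/module/COMP1002,http://example.org/room/59-1257
"""


def read_rows(text):
    """Parse CSV text into a list of dicts, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def link_timetable(timetable_rows, room_rows):
    """Attach open room facts to each timetable event via the shared URI."""
    rooms_by_uri = {row["room_uri"]: row for row in room_rows}
    linked = []
    for event in timetable_rows:
        room = rooms_by_uri.get(event["room_uri"])  # None if the URI is unknown
        linked.append({
            **event,
            "room_name": room["name"] if room else None,
            "room_capacity": int(room["capacity"]) if room else None,
        })
    return linked


linked_events = link_timetable(read_rows(TIMETABLE_CSV), read_rows(ROOMS_CSV))
for event in linked_events:
    print(event["day"], event["time"], event["room_name"])
```

The join works only because both datasets agreed on the identifier. The URIs never need to be fetched for that to be useful, although resolvable ones would let you discover the room facts without being handed the rooms file.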
RDF: RDF is a useful data structure for creating interoperable data. It has a number of file formats for exchanging this data. The most common is RDF/XML. The nicest (in my opinion) is Turtle. The simplest is N-Triples, where you just write out the data one fact per line. You can also embed RDF data in HTML as “RDFa”. The structure of RDF makes it trivial to merge data from multiple sources, because it’s all triples (subject, predicate, object). It also assumes that either you will want to link the data yourself, or other people will want to link into your data. You can publish RDF data which becomes linked data as other people link to it, just like publishing pages on the web. RDF is just a way of structuring data and as such is not always open and not always linked.
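As a rough illustration of why merging is trivial, the sketch below models triples as plain Python tuples, merges two sources with a set union, and writes the result out one fact per line in the style of N-Triples. It is a simplification under stated assumptions: the `Literal` class and helper functions are hypothetical names invented here (not a real library's API), the example.org URIs are made up, and blank nodes, datatypes and language tags are not handled.

```python
from dataclasses import dataclass

RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
EX = "http://example.org/"  # made-up namespace for this sketch


@dataclass(frozen=True)
class Literal:
    """A plain string value, as opposed to a URI (hypothetical helper)."""
    value: str


def escape_literal(text):
    """Escape backslashes, quotes and line breaks for an N-Triples string."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
                .replace("\r", "\\r"))


def term_to_ntriples(term):
    """URIs go in angle brackets, literals in double quotes."""
    if isinstance(term, Literal):
        return '"' + escape_literal(term.value) + '"'
    return "<" + term + ">"


def to_ntriples(triples):
    """Serialise a set of (subject, predicate, object) tuples, one per line."""
    lines = [
        " ".join(term_to_ntriples(term) for term in triple) + " ."
        for triple in triples
    ]
    return "\n".join(sorted(lines))  # sorted only to make output stable


# Source 1: open data about a room.
rooms = {
    (EX + "room/32-1015", RDFS_LABEL, Literal("Building 32 Room 1015")),
}

# Source 2: a timetable event which links to that room by its URI.
timetable = {
    (EX + "event/1", RDFS_LABEL, Literal("Intro lecture")),
    (EX + "event/1", EX + "vocab/room", EX + "room/32-1015"),
}

# Merging is just a set union: every fact is an independent triple.
merged = rooms | timetable
print(to_ntriples(merged))
```

There is no schema to reconcile and no table to reshape before merging, which is the point being made above. In practice you would more likely reach for an existing RDF library than hand-roll a serialiser like this.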
Linked Open Data: (aka LOD) is a common term, and as you can see is usually going to be in RDF too. The key thing is not to get put off by the linking. Add links when they provide value to your data and will help people using your data (yourself included) do more with it.