Southampton Web and Data Innovation Team

Ideas and Tips from the Team

What you need to know about RDF+XML

RDF+XML is a much loathed format.

It is a way of writing RDF data (triples of subject,predicate,object) in XML.

RDF+XML is not RDF. It’s a way of encoding RDF. There are better ones, such as N3, but it’s the one everyone expects you to provide, so you’d better learn the basics.

RDF+XML is way too big. You can do everything lots of ways. That makes things confusing, so I figured I’d write a guide to the bare minimum you need to know to create valid RDF+XML.

The basics

The subject & predicate are always a URI.

The object is a URI /or/ a literal value. If it’s a literal it may have an associated data type URI or a language code, but not both.

predicate is just a fancy word for “relation”. It relates the subject to the object, eg. Bob hasFriend Jill. (Note that you can’t assume Jill has a friend Bob, it’s a one way thing. Sorry Bob)
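
As a rough illustration, the rules above could be modelled in Python like this. The class names and the hasFriend URI are made up for the example; nothing here is a standard API, it is only a sketch of the shape of a triple.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Literal:
    """A literal object: a value plus at most one of datatype or language."""
    value: str
    datatype: Optional[str] = None  # a datatype URI
    lang: Optional[str] = None      # a language code such as "en"

    def __post_init__(self):
        # A literal may carry a datatype or a language code, but not both.
        if self.datatype is not None and self.lang is not None:
            raise ValueError("a literal cannot have both a datatype and a language")


@dataclass(frozen=True)
class Triple:
    subject: str              # always a URI
    predicate: str            # always a URI
    obj: Union[str, Literal]  # a URI (plain str) or a Literal


# Hypothetical URIs, for illustration only.
bob_has_friend_jill = Triple(
    "http://example.com/bob#me",
    "http://example.com/ns/hasFriend",
    "http://example.com/jill#me",
)
# Swapping subject and object gives a different triple: the relation is one way.
jill_has_friend_bob = Triple(
    bob_has_friend_jill.obj,
    bob_has_friend_jill.predicate,
    bob_has_friend_jill.subject,
)
print(bob_has_friend_jill == jill_has_friend_bob)
```

The two triples compare as unequal, which is the "Sorry Bob" point: stating one does not state the other.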

The correct mimetype is “application/rdf+xml”

How to write RDF+XML

This is going to cover the smallest learning curve approach. There’s lots more to RDF+XML but it’s all optional sugar. Don’t worry about it.

I’m assuming you already know what actual triples you want to write. If not, this isn’t the right tutorial for you yet.

An RDF document is an XML document and so always starts with

<?xml version="1.0" encoding="utf-8" ?>

If in doubt, always encode it as utf-8.

Here is a minimum RDF document defining no actual data:

  <?xml version="1.0" encoding="utf-8" ?>
  <rdf:RDF
     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  >
  </rdf:RDF>

See that bit which says “xmlns:rdf”? That defines that any tag starting with “rdf:” is in the “namespace” http://www.w3.org/1999/02/22-rdf-syntax-ns#

That means that the unique identifier for that element is http://www.w3.org/1999/02/22-rdf-syntax-ns#RDF

If you want to use any predicates, and you do, you’ll need to define namespaces for them in the opening tag. Most common namespaces have a widely accepted prefix.

To find the standard prefix for a namespace you can look it up on prefix.cc, which is handy. Don’t use a different prefix without a good reason. If you can’t find the namespace on prefix.cc then pick something sensible. If you are writing a document with a bunch of namespaces, prefix.cc has a very funky shortcut… try this link:

http://prefix.cc/foaf,skos,gr.xml

Neat, huh? You can just cut and paste it. This saves time and typos. You might not notice missing a “#” from the end, but a computer will treat it as a completely different namespace!
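
To see how much that one character matters, here is a small sketch using Python's standard xml.etree.ElementTree, which reports every tag as "{namespace}localname". The helper name root_identifier is invented for this example.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def root_identifier(xml_text):
    """Return namespace + local name for the document's root element."""
    tag = ET.fromstring(xml_text).tag  # looks like "{namespace}local"
    if not tag.startswith("{"):
        return tag                     # element is not in any namespace
    namespace, local = tag[1:].split("}", 1)
    return namespace + local


good = '<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"></rdf:RDF>'
# The same document with the trailing "#" dropped from the namespace.
typo = '<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns"></rdf:RDF>'

print(root_identifier(good))  # http://www.w3.org/1999/02/22-rdf-syntax-ns#RDF
print(root_identifier(typo))  # http://www.w3.org/1999/02/22-rdf-syntax-nsRDF
```

Both documents are perfectly good XML, but only the first has a root element that an RDF parser will recognise as rdf:RDF.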

OK. Now to encode some data. Here’s my data. I’m going to use the prefixes to keep it readable:

My name is Marvin
- http://example.com/marvin#me
- foaf:name
- “Marvin Fenderson”
I am a Person
- http://example.com/marvin#me
- rdf:type
- foaf:Person
My hat size is 10
- http://example.com/marvin#me
- myprefix:hatSize
- 10 ( type is http://www.w3.org/2001/XMLSchema#int )
The big head club is an organization
- http://example.com/bigheadsclub#org
- rdf:type
- foaf:Organization
The big head club has a member who is me!
- http://example.com/bigheadsclub#org
- foaf:member
- http://example.com/marvin#me
The big head club is called “The Big Head Club” in English.
- http://example.com/bigheadsclub#org
- foaf:name
- “The Big Head Club” (in English)

OK, that’s enough data. Note that because predicates are one way, you sometimes have to say things backwards. I wanted to say “I’m a member of the club”, but because I’m using a predicate that relates organizations to members, I have to do it that way around.

Note that many things (like Organization in FOAF) have the US spelling. Don’t correct it, computers want an exact string. If you feel annoyed, add a label to stuff with an en-gb language version of the label!

Here’s how to encode the above: For each distinct subject (the #me and the #org are the subjects in the above data; 3 triples start with each), create a sub-element of the top level rdf:RDF element. Call these sub-elements <rdf:Description> and give them an rdf:about attribute which is the URI of the subject:

<?xml version="1.0" encoding="utf-8"?>
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="http://example.com/marvin#me">
  </rdf:Description>
  <rdf:Description rdf:about="http://example.com/bigheadsclub#org">
  </rdf:Description>
</rdf:RDF>

OK! That’s still valid RDF (assuming it’s inside the <rdf:RDF> element), but it still contains no data. We need to relate Marvin to the Big Head Club.

For triples where the object is a URI (which indicates they relate the subject resource to another resource, not just a number or string), add them as a tag matching the predicate. The namespace must have been correctly aliased with an xmlns:xxxx="yyyy" attribute. The element should close itself at once and contain the attribute rdf:resource="URI", where URI is the object of the triple. Note that you don’t use the short version of the namespace in rdf:resource or rdf:about, just in the predicates relating subjects to objects.

<?xml version="1.0" encoding="utf-8"?>
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:hats="http://example.com/hats/ns/"
>
  <rdf:Description rdf:about="http://example.com/marvin#me">
    <rdf:type rdf:resource="http://xmlns.com/foaf/0.1/Person" />
  </rdf:Description>
  <rdf:Description rdf:about="http://example.com/bigheadsclub#org">
    <rdf:type rdf:resource="http://xmlns.com/foaf/0.1/Organization" />
    <foaf:member rdf:resource="http://example.com/marvin#me" />
  </rdf:Description>
</rdf:RDF>

OK. The last bit is to add in the literals: the strings and the number. Create a tag of the same name as you would for linking to a resource, but this time don’t close it at once; wrap it around the value instead. If there is a language to express for a string, add an xml:lang="xx" attribute, where xx is the language code. Alternatively, if you need to express a datatype, use rdf:datatype="xxx".

<?xml version="1.0" encoding="utf-8"?>
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:hats="http://example.com/hats/ns/"
>
  <rdf:Description rdf:about="http://example.com/marvin#me">
    <rdf:type rdf:resource="http://xmlns.com/foaf/0.1/Person" />
    <foaf:name>Marvin Fenderson</foaf:name>
    <hats:hatSize rdf:datatype="http://www.w3.org/2001/XMLSchema#int">10</hats:hatSize>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.com/bigheadsclub#org">
    <rdf:type rdf:resource="http://xmlns.com/foaf/0.1/Organization" />
    <foaf:name xml:lang="en">The Big Head Club</foaf:name>
    <foaf:member rdf:resource="http://example.com/marvin#me" />
  </rdf:Description>
</rdf:RDF>
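
If you would rather generate a document like this than type it, the same recipe can be sketched with Python's standard xml.etree.ElementTree. This is only an illustration: the row layout of TRIPLES and the "kind" labels are conventions invented for the example, and the output is not pretty-printed.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FOAF = "http://xmlns.com/foaf/0.1/"
HATS = "http://example.com/hats/ns/"
XSD_INT = "http://www.w3.org/2001/XMLSchema#int"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

ME = "http://example.com/marvin#me"
CLUB = "http://example.com/bigheadsclub#org"

# Register the usual prefixes so the output uses rdf:, foaf: and hats:
# rather than auto-generated ones like ns0:.
for prefix, uri in (("rdf", RDF), ("foaf", FOAF), ("hats", HATS)):
    ET.register_namespace(prefix, uri)


def qname(namespace, local):
    """ElementTree spells a namespaced name as "{namespace}local"."""
    return "{%s}%s" % (namespace, local)


# Each row: subject, predicate namespace, predicate local name, object, kind, extra.
# kind is "uri", "plain", "lang" or "datatype" (a convention made up for this sketch).
TRIPLES = [
    (ME, RDF, "type", FOAF + "Person", "uri", None),
    (ME, FOAF, "name", "Marvin Fenderson", "plain", None),
    (ME, HATS, "hatSize", "10", "datatype", XSD_INT),
    (CLUB, RDF, "type", FOAF + "Organization", "uri", None),
    (CLUB, FOAF, "name", "The Big Head Club", "lang", "en"),
    (CLUB, FOAF, "member", ME, "uri", None),
]


def to_rdfxml(triples):
    root = ET.Element(qname(RDF, "RDF"))
    descriptions = {}  # subject URI -> its rdf:Description element
    for subject, ns, local, obj, kind, extra in triples:
        if subject not in descriptions:
            descriptions[subject] = ET.SubElement(
                root, qname(RDF, "Description"), {qname(RDF, "about"): subject})
        prop = ET.SubElement(descriptions[subject], qname(ns, local))
        if kind == "uri":
            prop.set(qname(RDF, "resource"), obj)  # object is a URI: empty element
        else:
            prop.text = obj                        # object is a literal: wrap the value
            if kind == "lang":
                prop.set(XML_LANG, extra)
            elif kind == "datatype":
                prop.set(qname(RDF, "datatype"), extra)
    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0" encoding="utf-8"?>\n' + body


document = to_rdfxml(TRIPLES)
print(document)
```

Letting a library do the writing means the escaping and the namespace declarations are handled for you, which removes a whole class of typos. Remember to save the result as utf-8, to match the declaration on the first line.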

The order of relations inside a description, and the order of the descriptions does not matter. I think it’s nice to put ‘types’ and ‘labels’ near the top of each description. Relations can be repeated.
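
The claim that order does not matter can be checked with a short Python sketch. It pulls triples out of two differently ordered documents and compares them as sets. The triples function is invented for this example and only understands the rdf:Description subset described in this post, so treat it as a demonstration, not as a parser for RDF+XML in general.

```python
import xml.etree.ElementTree as ET

RDF = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def triples(xml_text):
    """Collect triples from the rdf:Description-only subset used in this post."""
    found = set()
    for desc in ET.fromstring(xml_text).findall(RDF + "Description"):
        subject = desc.get(RDF + "about")
        for prop in desc:
            # "{namespace}local" -> "namespacelocal", the full predicate URI
            predicate = prop.tag[1:].replace("}", "", 1)
            resource = prop.get(RDF + "resource")
            if resource is not None:
                obj = ("uri", resource)
            else:
                obj = ("literal", prop.text or "",
                       prop.get(RDF + "datatype"), prop.get(XML_LANG))
            found.add((subject, predicate, obj))
    return found


HEAD = ('<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" '
        'xmlns:foaf="http://xmlns.com/foaf/0.1/">')
TYPE = '<rdf:type rdf:resource="http://xmlns.com/foaf/0.1/Person" />'
NAME = '<foaf:name>Marvin Fenderson</foaf:name>'
MARVIN = '<rdf:Description rdf:about="http://example.com/marvin#me">%s%s</rdf:Description>'
CLUB = ('<rdf:Description rdf:about="http://example.com/bigheadsclub#org">'
        '<rdf:type rdf:resource="http://xmlns.com/foaf/0.1/Organization" />'
        '<foaf:name xml:lang="en">The Big Head Club</foaf:name>'
        '<foaf:member rdf:resource="http://example.com/marvin#me" />'
        '</rdf:Description>')

# Same data, with the descriptions and Marvin's relations in a different order.
doc_a = HEAD + (MARVIN % (TYPE, NAME)) + CLUB + "</rdf:RDF>"
doc_b = HEAD + CLUB + (MARVIN % (NAME, TYPE)) + "</rdf:RDF>"

print(triples(doc_a) == triples(doc_b))
```

Because the triples are held in a set, neither ordering nor exact duplicates affect the comparison.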

At this point you could add an additional rdf:Description, the rdf:about of which is the URL of the RDF document. This allows you to make statements about the document as a whole, such as who wrote it, what license it is under, what it’s called etc. There’s still no agreement on what is useful, but a title and license are handy. Use rdfs:label to label it.

While it’s not strictly required, it’s helpful to add an rdfs:label and rdf:type to describe every URI that is a subject or object in the document, not counting the objects of rdf:type. Some people say this is overkill, but it does help debugging.

Checking your RDF

Don’t skip checking it. I keep running into broken RDF produced by people who never sanity checked it.

The best way to check your RDF is to put it on a URL and poke things at it. The first thing I usually do is load it in Firefox and check that it’s valid XML. If it’s not, that’s a dealbreaker before we start. Here’s a link to an online copy of our file:

Marvin Hat Data

If you load it in Firefox it’ll tell you about any XML errors; other browsers are not so helpful.

Once you’ve done that, you should load it into an RDF aware viewer. I use the Graphite Quick & Dirty RDF Browser which I wrote. Here’s what the data looks like if you view it in the browser.

The rdf:types have been spotted and are shown in the top right-hand corner of each box. The foaf:names have also been highlighted. This helps you spot obvious mistakes. Also, because we’ve got a valid label for Marvin, the list showing members of the organisation is showing his name rather than the URI (hover the mouse to see the URI). This is also handy in spotting obvious mistakes.

If the Graphite Browser can’t parse your RDF it’ll link you to the W3C RDF Validator which is sometimes helpful. Also double check your xmlns definitions. Missing a character off the end will cause lots of problems!

Better than using my generic RDF viewer: if one is available, also check your data in a viewer that is designed to understand the namespaces you’re working with. There aren’t many around yet, but that will change. Personally I find the existing ones quite confusing.

If you don’t check your RDF+XML it’s bound to be buggy.
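
The very first check in this section, that the file is well-formed XML at all, is also easy to script. Here is a minimal sketch using Python's standard library; the function name xml_problem is made up for the example, and it only tests XML well-formedness, not whether the triples make sense.

```python
import xml.etree.ElementTree as ET


def xml_problem(xml_data):
    """Return None if xml_data is well-formed XML, otherwise the parser's complaint."""
    try:
        ET.fromstring(xml_data)
    except ET.ParseError as err:
        return str(err)  # includes the line and column of the first error
    return None


good = (b'<?xml version="1.0" encoding="utf-8"?>'
        b'<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">'
        b'<rdf:Description rdf:about="http://example.com/marvin#me">'
        b'</rdf:Description>'
        b'</rdf:RDF>')
# The same document with the closing </rdf:Description> left out.
broken = good.replace(b'</rdf:Description>', b'')

print(xml_problem(good))
print(xml_problem(broken))
```

A file that passes this is only known to be XML. It still needs loading into something RDF aware, as described above.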

What I’ve skipped

Almost everything! But the only *useful* thing I’ve skipped is how to write bNodes (resources without an associated URI). That’s to keep this simple, and because I have a dislike for them.

RDF+XML offers huge numbers of short cuts, but you don’t need any of them. They just make it easier to make mistakes. Sod ’em.

How to read RDF+XML

Simple. Use a library. Most programming languages now have a good library for parsing all the crazy crap in RDF+XML. Don’t bother trying to do it yourself, it’s a waste of time and there’s more important work to be done!

In PHP, I use ARC2. It handles lots of other RDF formats as well, and so I have one less problem to worry about. I just point it at web addresses and it sucks down triples. Who cares how they were encoded?

N3

This is another way to encode the same data. It also has some shortcuts, but is much more elegant. Check out the same data:

     @prefix foaf: <http://xmlns.com/foaf/0.1/> .
     @prefix hats: <http://example.com/hats/ns/> .
     @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

    <http://example.com/bigheadsclub#org>     a foaf:Organization;
         foaf:member <http://example.com/marvin#me>;
         foaf:name "The Big Head Club"@en .

    <http://example.com/marvin#me>     a foaf:Person;
         hats:hatSize "10"^^<http://www.w3.org/2001/XMLSchema#int>;
         foaf:name "Marvin Fenderson" .
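
Without the prefixes and the ";" shortcuts, the same syntax boils down to one triple per line with every URI written out in full (this plain form is known as N-Triples). As a rough sketch of how little there is to it, the Python below formats triples that way. The function names and the "kind" labels are invented for the example, and it only escapes backslashes, double quotes and line breaks.

```python
def nt_term(value, kind="uri", extra=None):
    """Format one term. kind is "uri", "plain", "lang" or "datatype"."""
    if kind == "uri":
        return "<%s>" % value
    escaped = (value.replace("\\", "\\\\").replace('"', '\\"')
                    .replace("\n", "\\n").replace("\r", "\\r"))
    quoted = '"%s"' % escaped
    if kind == "lang":
        return quoted + "@" + extra           # e.g. "The Big Head Club"@en
    if kind == "datatype":
        return quoted + "^^<%s>" % extra      # e.g. "10"^^<...#int>
    return quoted


def nt_line(subject, predicate, obj):
    """obj is a tuple of (value, kind) or (value, kind, extra)."""
    return "%s %s %s ." % (nt_term(subject), nt_term(predicate), nt_term(*obj))


ME = "http://example.com/marvin#me"
CLUB = "http://example.com/bigheadsclub#org"
FOAF = "http://xmlns.com/foaf/0.1/"

print(nt_line(CLUB, FOAF + "member", (ME, "uri")))
print(nt_line(CLUB, FOAF + "name", ("The Big Head Club", "lang", "en")))
print(nt_line(ME, "http://example.com/hats/ns/hatSize",
              ("10", "datatype", "http://www.w3.org/2001/XMLSchema#int")))
```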

End

I’m sure that I’ve made a mistake or two myself. Suggestions on how to improve the above would be welcome.

Posted in RDF.

Tagged with Tutorial.

By Christopher Gutteridge – November 8, 2010

3 Responses

Jodi Schneider says

Thanks for the http://prefix.cc/foaf,skos,gr.xml trick. That is genius!

For readability: use colo(u)rs to highlight the info just added.

For completeness: link to an N3 tutorial, link to a language codes reference, and possibly to more info on common datatypes.

Random thoughts:
*In “We need to relate these things to other things.” s/these things/Marvin and the Big Heads Club

*Watch the open quotes (there are some right smart-quotes where you want left ones)

November 9, 2010, 12:05 am

Ian Millard says

There is often confusion around N3 and Turtle (both of which are a much easier format to author by hand, and almost readable too 🙂

You can actually do wacky things in N3 that are outside of RDF, Turtle is a subset of N3 that is the useful stuff.

Full syntax etc is a little scary to read, but the examples are useful. http://www.w3.org/TeamSubmission/turtle/

If you’re on a linux based platform the Raptor library (http://librdf.org/raptor/) comes with an extremely useful tool called “rapper” which can both check for parse/syntax errors and convert between a variety of RDF formats (eg rdf+xml, turtle, RDFa (parser))

sudo apt-get install raptor-utils

November 10, 2010, 10:42 am

mikele says

Hey thanks for the write-up, clean and straight to the point! I’ll definitely pass it on to less rdf-aware colleagues!

February 8, 2011, 3:25 pm
