Southampton Web and Data Innovation Team

Ideas and Tips from the Team

Using a triplestore instead of MySQL as a backend

I’m still looking at the barriers to using an RDF triplestore as the back-end for a website. I discussed some of this back in February, but the problems remain unsolved.

Our usual pattern, when designing a website, is to identify the various types of entity that will be described by pages on the site. For an academic site these are typically some of: people, groups, projects, publications, events and articles. We then create a database table or tables for each of these, and PHP wrapper functions to get individual records and lists of records, plus methods to create and update records of each type. In PHP, we have an object representing the set of items (e.g. Events) and an object representing each item. The SQL is kept abstracted away as much as possible.
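A minimal sketch of that pattern, written in Python with the standard sqlite3 module rather than PHP and MySQL so that it runs on its own. The events table, its columns, and the Event and EventSet names are all hypothetical; they only illustrate one object per item and one object per set of items, with the SQL kept inside the set class.

```python
import sqlite3
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Event:
    """One item: a single row of the (hypothetical) events table."""
    id: int
    title: str
    date: str


class EventSet:
    """The set of items: the only place that knows any SQL."""

    def __init__(self, db: sqlite3.Connection):
        self.db = db
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS events "
            "(id INTEGER PRIMARY KEY, title TEXT, date TEXT)"
        )

    def create(self, title: str, date: str) -> Event:
        cur = self.db.execute(
            "INSERT INTO events (title, date) VALUES (?, ?)", (title, date)
        )
        return Event(cur.lastrowid, title, date)

    def get(self, event_id: int) -> Optional[Event]:
        row = self.db.execute(
            "SELECT id, title, date FROM events WHERE id = ?", (event_id,)
        ).fetchone()
        return Event(*row) if row else None

    def update(self, event: Event) -> None:
        self.db.execute(
            "UPDATE events SET title = ?, date = ? WHERE id = ?",
            (event.title, event.date, event.id),
        )

    def all(self) -> List[Event]:
        rows = self.db.execute(
            "SELECT id, title, date FROM events ORDER BY date"
        )
        return [Event(*row) for row in rows]


# Callers work with objects and never see a query string.
events = EventSet(sqlite3.connect(":memory:"))
seminar = events.create("RDF seminar", "2010-05-01")
seminar.title = "RDF and SPARQL seminar"
events.update(seminar)
```

Because every query lives in EventSet, swapping the storage layer would in principle mean rewriting that one class and leaving the page code alone.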

The PHP classes which represent an item or a list of items have methods for mapping the data into various formats: a short HTML summary, an HTML page, RDF, XML, .ics, RSS, Atom etc. Occasionally some fields may not be shown to the public, for example if we use the same database for some internal administration.
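As an illustration of the format-mapping idea, here is a small Python sketch (again standing in for PHP). The EventItem class, the field names and the PRIVATE_FIELDS set are assumptions made up for the example. The .ics output is only a single VEVENT fragment with no text escaping, not a complete calendar file.

```python
from html import escape

# Hypothetical field names; "internal_notes" stands in for an admin-only field.
PRIVATE_FIELDS = {"internal_notes"}


class EventItem:
    def __init__(self, fields: dict):
        self.fields = fields

    def public_fields(self) -> dict:
        """Drop anything that must not be shown to the public."""
        return {k: v for k, v in self.fields.items() if k not in PRIVATE_FIELDS}

    def to_html_summary(self) -> str:
        f = self.public_fields()
        return "<li>{} ({})</li>".format(escape(f["title"]), escape(f["date"]))

    def to_ics(self) -> str:
        # Simplified: a real exporter would escape commas, semicolons and
        # newlines in SUMMARY and wrap the event in a VCALENDAR.
        f = self.public_fields()
        return "\r\n".join([
            "BEGIN:VEVENT",
            "SUMMARY:" + f["title"],
            "DTSTART;VALUE=DATE:" + f["date"].replace("-", ""),
            "END:VEVENT",
        ])


item = EventItem({
    "title": "Talks & demos",
    "date": "2010-05-01",
    "internal_notes": "room not yet confirmed",
})
```

Every output method starts from public_fields(), so a new format cannot leak an internal field by accident.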

On some sites, we have a table which stores all revisions of each item, and a table which maps each primary_item_id to its revision_id. Previous versions should never, ever be shown to the public as they may have contained errors or information we actively do not want to be public.
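The two-table revision layout might look something like the following sketch, which uses Python and sqlite3 in place of PHP and MySQL. The table names, the title column and the helper functions are hypothetical; only primary_item_id and revision_id come from the description above.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    -- Every saved version of every item, including ones later corrected.
    CREATE TABLE revisions (
        revision_id     INTEGER PRIMARY KEY,
        primary_item_id INTEGER NOT NULL,
        title           TEXT NOT NULL
    );
    -- Exactly one current revision per item.
    CREATE TABLE current_revision (
        primary_item_id INTEGER PRIMARY KEY,
        revision_id     INTEGER NOT NULL REFERENCES revisions(revision_id)
    );
""")


def save_revision(item_id: int, title: str) -> int:
    """Store a new revision and make it the current one."""
    cur = db.execute(
        "INSERT INTO revisions (primary_item_id, title) VALUES (?, ?)",
        (item_id, title),
    )
    db.execute(
        "INSERT OR REPLACE INTO current_revision VALUES (?, ?)",
        (item_id, cur.lastrowid),
    )
    return cur.lastrowid


def public_title(item_id: int):
    """Public pages only ever read through the current_revision mapping."""
    row = db.execute(
        "SELECT r.title FROM current_revision c "
        "JOIN revisions r ON r.revision_id = c.revision_id "
        "WHERE c.primary_item_id = ?",
        (item_id,),
    ).fetchone()
    return row[0] if row else None


save_revision(1, "Seminar, draft with an error")
save_revision(1, "Seminar")
```

The old revision is still in the database for administrators, but no public query can reach it without going through the mapping table. It is exactly this guarantee that is harder to see how to reproduce when all the triples sit behind one SPARQL endpoint.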

What I’m interested in is how normal web developers, rather than researchers, can achieve this.

I am still imagining a system with “classes” of things, like people and events, where the PHP is configured so that it can create/retrieve/update/delete individual “records”, where each triple belongs to only one record, and where we have PHP functions which retrieve data from a set of records (by abstracting SPARQL instead of SQL).
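One possible way to get "each triple belongs to only one record" is to give every record its own named graph, so that deleting or replacing a record is a single graph operation. The Python sketch below (standing in for PHP) only builds query strings and does not talk to a store. It assumes a triplestore that supports named graphs and SPARQL 1.1 Update, which is an assumption and not something every store offers. The RecordClass name, the example.org URIs and the field-to-predicate table are hypothetical.

```python
def literal(value: str) -> str:
    """Quote a plain string as a SPARQL literal (escapes \\, " and newlines)."""
    escaped = (value.replace("\\", "\\\\")
                    .replace('"', '\\"')
                    .replace("\n", "\\n"))
    return '"' + escaped + '"'


class RecordClass:
    """A "class" of things, e.g. people; builds SPARQL, does not send it."""

    def __init__(self, base_uri: str, predicates: dict):
        self.base_uri = base_uri          # e.g. http://example.org/person/
        self.predicates = predicates      # field name -> predicate URI

    def uri(self, record_id: int) -> str:
        return "{}{}".format(self.base_uri, record_id)

    def _triples(self, record_id: int, fields: dict) -> str:
        subject = "<{}>".format(self.uri(record_id))
        return " ".join(
            "{} <{}> {} .".format(subject, self.predicates[name], literal(value))
            for name, value in fields.items()
        )

    def create(self, record_id: int, fields: dict) -> str:
        # The record's URI doubles as the name of the graph holding its triples.
        return "INSERT DATA {{ GRAPH <{}> {{ {} }} }}".format(
            self.uri(record_id), self._triples(record_id, fields))

    def retrieve(self, record_id: int) -> str:
        return "SELECT ?p ?o WHERE {{ GRAPH <{}> {{ ?s ?p ?o }} }}".format(
            self.uri(record_id))

    def delete(self, record_id: int) -> str:
        return "DROP SILENT GRAPH <{}>".format(self.uri(record_id))

    def update(self, record_id: int, fields: dict) -> str:
        # Replace the whole record: drop its graph, then insert it again.
        return self.delete(record_id) + " ;\n" + self.create(record_id, fields)


people = RecordClass(
    "http://example.org/person/",
    {"name": "http://xmlns.com/foaf/0.1/name"},
)
query = people.create(6, {"name": "Ada"})
```

The calling code deals in record ids and field names, much as it would with the SQL wrappers, and the graph-per-record convention stays hidden inside the class.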

Unanswered questions:

Internally, do we use our own namespace for the predicates, established namespaces (FOAF, SIOC etc.), or a mixture?
If we use our own namespace, do we map into common schemas (FOAF, SIOC…) for the public view of .rdf data? Do we map it on demand, or when a record is updated? Do we expose our internal namespace predicates? I don’t believe just providing a mapping and letting people map it themselves is a reasonable option.
Do we expose all of the triples? (What about ones used for administration? Do we just make sure we have no secrets in the triplestore?) If so, how do we handle revisions? Have two triplestores, one for the public and one for admin? Or can triplestore SPARQL endpoints be configured in fancy ways?
How do we generate brief, unique URIs for items when they are created? In my experience, URIs built from any of the meaningful data in an item, e.g. surnames, are a mistake. UUIDs are not an option: they are ugly. http://webscience.org/person/6 is better. My previous post suggested some solutions, and Talis have a weird solution using a pool of available IDs, but I don’t regard it as a solved problem. Then again, there’s no standard solution in SQL databases either.
If using tools to add value by importing/generating additional triples, how do we manage these? For example, do we need to erase any of these if the records they refer to are removed or updated?
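To make two of these questions concrete, here is a rough Python sketch of one possible answer to each: short sequential URIs, and an on-demand mapping from an internal namespace to FOAF that also drops admin-only predicates. It is not a recommendation. The internal namespace, the predicate names and the in-memory counter are all hypothetical. A real site would have to persist the counter somewhere safe against concurrent writers.

```python
import itertools

FOAF = "http://xmlns.com/foaf/0.1/"
INTERNAL = "http://example.org/ns/internal#"   # hypothetical in-house namespace

# Internal predicate -> public predicate. None means "never publish".
PUBLIC_MAPPING = {
    INTERNAL + "fullName": FOAF + "name",
    INTERNAL + "homepage": FOAF + "homepage",
    INTERNAL + "adminNote": None,
}


def make_minter(base_uri: str, start: int = 1):
    """Return a function that mints base_uri + 1, 2, 3, ... in order."""
    counter = itertools.count(start)
    return lambda: "{}{}".format(base_uri, next(counter))


def public_view(triples):
    """Map internal triples for publication, dropping unmapped or private ones."""
    out = []
    for s, p, o in triples:
        mapped = PUBLIC_MAPPING.get(p)
        if mapped is not None:
            out.append((s, mapped, o))
    return out


mint_person = make_minter("http://example.org/person/")
person = mint_person()
stored = [
    (person, INTERNAL + "fullName", "Ada Lovelace"),
    (person, INTERNAL + "adminNote", "contract ends in June"),
]
published = public_view(stored)
```

The mapping is a whitelist: a predicate that nobody has listed is withheld rather than published. That seems the safer default if internal and public data share a store.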

I think there are probably answers to all of these, but they need to be moved from ‘research’ to ‘development’. I’ll post updates if people solve any of these for me.

Posted in Best Practice, RDF.

By Christopher Gutteridge – April 16, 2010
